metascope

v0.6.0

A CLI tool and TypeScript library to easily extract metadata from all kinds of software repositories.

Note: Metascope is under development. Expect breaking changes until a 1.0 release.

Overview

Metascope aggregates metadata from a local code repository into a single monolithic JSON object. Given a project directory, it checks multiple sources in parallel — local git history, package manifests, the GitHub API, the NPM registry, lines of code analysis, and more — and returns a JSON object containing everything it could find.

From there, an (optional) template system lets you refine and transform the output to reflect exactly which fields you need, useful for archival purposes, populating dashboards, or feeding data into other tools. The template system also provides a spec-compliant implementation of the CodeMeta vocabulary, allowing easy generation of codemeta.json files for a semantically normalized view of a variety of project types.

Highlights:

  • A wide net
    Metascope pulls project metadata from many available sources: package.json, pyproject.toml, NPM, PyPI, GitHub, git, filesystem stats, and more.

  • Graceful degradation
    Each source checks its own availability before extraction. Missing tools, unavailable APIs, or absent credentials are silently skipped — you always get back whatever data is available within the constraints of the calling context.

  • Parallel extraction
    After an initial codemeta pass for discovery hints (package name, repository URL, keywords), all remaining sources are checked and extracted concurrently.

  • Typed templates
    The defineTemplate() helper provides full autocomplete on available fields. TypeScript infers the return type from your template function, so getMetadata() returns exactly the shape you need.

  • CLI and library
    Use it as a command-line tool for quick inspection or pipe-friendly JSON output, or import it as a library for programmatic access with full type safety.
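
The graceful degradation and parallel extraction behavior described above could be sketched roughly as follows. This is a minimal illustration, not Metascope's internals: the `Source` interface, `extractAll`, and the demo sources are hypothetical names, and the initial codemeta discovery pass is left out.

```typescript
// Hypothetical shape of a metadata source; not Metascope's real interface.
interface Source<T> {
  key: string
  isAvailable: () => Promise<boolean>
  extract: () => Promise<T>
}

// Run every source concurrently, silently skipping any that are
// unavailable or that fail during extraction.
async function extractAll(sources: Array<Source<unknown>>): Promise<Record<string, unknown>> {
  const settled = await Promise.allSettled(
    sources.map(async (source) => {
      if (!(await source.isAvailable())) return undefined
      return [source.key, await source.extract()] as const
    }),
  )

  const result: Record<string, unknown> = {}
  for (const outcome of settled) {
    // Rejected promises and unavailable sources contribute nothing.
    if (outcome.status === 'fulfilled' && outcome.value !== undefined) {
      const [key, data] = outcome.value
      result[key] = data
    }
  }
  return result
}

// Illustrative sources: one works, one is unavailable, one throws.
const demoSources: Array<Source<unknown>> = [
  { key: 'gitStats', isAvailable: async () => true, extract: async () => ({ commitCount: 42 }) },
  { key: 'github', isAvailable: async () => false, extract: async () => ({ stargazerCount: 1 }) },
  {
    key: 'nodeNpmRegistry',
    isAvailable: async () => true,
    extract: async () => {
      throw new Error('network unreachable')
    },
  },
]
```

Calling `extractAll(demoSources)` resolves with only the `gitStats` entry. `Promise.allSettled` is used here so that a single failing source cannot reject the whole batch.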

Getting started

Dependencies

Metascope requires Node.js 22.17+. It is implemented in TypeScript, ships as ESM, and bundles complete type definitions.

Metascope also requires a recent version of git on your path for quickly identifying ignored files and aggregating repository statistics.

Optional external tools:

  • GitHub CLI
    Used as a fallback for GitHub API authentication if no token is provided via --github-token or $GITHUB_TOKEN. It's trivially installed from Homebrew: brew install gh.

Installation

Invoke directly on the current directory:

npx metascope

...or install locally:

npm install metascope

...or install globally:

npm install --global metascope

If you're using PNPM, you can safely ignore the build scripts for the tree-sitter dependencies, since we're only interested in their bundled WASM implementations.

In your pnpm-workspace.yaml:

ignoredBuiltDependencies:
  - tree-sitter-python
  - tree-sitter-ruby

Usage

CLI

Command: metascope

Extract metadata from a code repository.

Usage:

metascope [path]

| Positional Argument | Description | Type | Default |
| ------------------- | ---------------------- | -------- | ------- |
| path | Project directory path | string | "." |

| Option | Description | Type | Default |
| ------ | ----------- | ---- | ------- |
| --template, -t | Built-in template name (codemeta, codemetaJson, frontmatter, metadata, project) or path to a custom template file | string | |
| --github-token | GitHub API token (or set $GITHUB_TOKEN) | string | |
| --author-name | Optional author name(s) for ownership checks in templates | array | |
| --github-account | Optional GitHub account name(s) for ownership checks in templates | array | |
| --absolute | Output absolute paths. Use --no-absolute for relative paths. | boolean | true |
| --offline | Skip sources requiring network requests | boolean | false |
| --sources, -s | Only run specific metadata sources (defaults to all) | array | |
| --no-ignore | Include files ignored by .gitignore in the file tree | boolean | false |
| --recursive, -r | Search for metadata files recursively in subdirectories | boolean | false |
| --workspaces, -w | Include workspace-specific metadata in monorepos; pass a boolean to enable or disable auto-detection, or pass one or more strings to explicitly define workspace paths | | true |
| --verbose | Run with verbose logging | boolean | false |
| --help, -h | Show help | boolean | |
| --version, -v | Show version number | boolean | |

Examples

Basic metadata extraction

Extract all available metadata from the current directory:

metascope

Output is pretty-printed JSON when writing to a terminal, compact JSON when piped.
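
The terminal-versus-pipe behavior can be approximated with a check like the one below. This is a sketch under assumptions: `formatJson` is a hypothetical helper, and the two-space indent is a guess rather than Metascope's documented setting.

```typescript
// Pretty-print for humans at a terminal, emit compact JSON for pipes.
function formatJson(data: unknown, isTerminal: boolean): string {
  return isTerminal ? JSON.stringify(data, null, 2) : JSON.stringify(data)
}

const sample = { name: 'metascope', version: '0.6.0' }

// In Node.js, `process.stdout.isTTY` is true only when stdout is a terminal,
// so it would be the natural value to pass as the second argument.
const pretty = formatJson(sample, true)
const compact = formatJson(sample, false)
```

Passing the flag as a parameter keeps the function easy to test without a real terminal attached.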

Scan a specific directory

metascope /path/to/project

Use a built-in template

metascope --template project

Pass template data for ownership checks

Some preset templates return information based on the (relative) ownership status of a repo. This requires additional context data, which can be passed in via additional CLI flags:

metascope --template project --author-name "Jane Doe" --github-account janedoe

Multiple values are supported:

metascope --template project --author-name "Jane Doe" "John Doe" --github-account janedoe johndoe

Use a custom template file

metascope --template ./my-template.ts

Where my-template.ts might look like:

import { defineTemplate, helpers } from 'metascope'

export default defineTemplate(({ codemetaJson, github, gitStats }) => {
  const codemeta = helpers.firstOf(codemetaJson)
  const git = helpers.firstOf(gitStats)
  const gh = helpers.firstOf(github)
  return {
    commits: git?.data.commitCount,
    name: codemeta?.data.name,
    stars: gh?.data.stargazerCount,
    version: codemeta?.data.version,
  }
})

Run only specific sources

Extract metadata from only the sources you need, skipping everything else for faster results:

metascope --sources nodePackageJson gitStats

Pipe compact JSON to another tool

metascope | jq '.github.stargazerCount'

Provide a GitHub token

An optional GitHub token can allow access to metadata about private repositories, and raises the request limit if you're operating on a large collection of repositories:

metascope --github-token ghp_xxxxxxxxxxxx

Or set the GITHUB_TOKEN environment variable, or authenticate via gh auth login. Metascope will attempt to find a credential without bothering you.

Verbose logging

metascope --verbose

Logs source availability checks, extraction durations, and other diagnostics to stderr.

API

The metascope library exports getMetadata as its primary function, defineTemplate for type-safe template authoring, and a helpers namespace with utility functions for working with metadata in templates.

getMetadata

// Without a template — returns full MetadataContext
function getMetadata(options: GetMetadataOptions): Promise<MetadataContext>

// With a template — returns the template's return type
function getMetadata<T>(options: GetMetadataTemplateOptions<T>): Promise<T>

The function accepts a project directory path, optional credentials, and an optional template (a built-in name or a template function). It returns a promise resolving to either the full MetadataContext or the shaped output of your template.

All undefined values and empty source objects are deep-stripped from the output before returning.
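
For a sense of what deep-stripping means in practice, here is a minimal sketch over plain JSON-like data. `deepStrip` is a hypothetical name, and how it treats arrays and `null` is an assumption rather than Metascope's documented behavior.

```typescript
// Recursively remove undefined values and any objects left empty.
// A sketch only: the handling of arrays and null here is an assumption.
function deepStrip(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => deepStrip(item)).filter((item) => item !== undefined)
  }
  if (value !== null && typeof value === 'object') {
    const entries = Object.entries(value as Record<string, unknown>)
      .map(([key, child]) => [key, deepStrip(child)] as const)
      .filter(([, child]) => child !== undefined)
    // An object with no surviving keys is itself dropped.
    return entries.length > 0 ? Object.fromEntries(entries) : undefined
  }
  return value
}

// Illustrative input: an empty source, an undefined field, and a nested
// object that becomes empty once its only field is removed.
const stripped = deepStrip({
  github: {},
  gitStats: { data: { commitCount: 12, tagCount: undefined } },
  nodePackageJson: { data: { name: 'example', funding: { url: undefined } } },
})
```

Here `stripped` keeps only `gitStats.data.commitCount` and `nodePackageJson.data.name`; the empty `github` object and the emptied `funding` object are gone.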

To run only a subset of sources, pass a sources array with the desired source key names. When omitted, all sources run (the default). This is useful for faster extraction when you only need specific data:

const result = await getMetadata({
  path: '.',
  sources: ['nodePackageJson', 'gitStats'],
})

Templates can be combined with the sources option, but note that built-in templates that rely on sources you have excluded may return incomplete data.

defineTemplate

function defineTemplate<T>(
  transform: (context: MetadataContext, templateData: TemplateData) => T,
): Template<T>

An identity wrapper that provides autocomplete and type inference when authoring templates. The optional second templateData argument provides user-supplied values (like author names or GitHub accounts) for parameterized ownership checks. Templates that don't need it can simply ignore the argument. Template developers can pass additional values as needed.
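
To show why an identity wrapper is enough to get type inference, here is a simplified sketch. The `Sketch*` types and `defineTemplateSketch` are hypothetical stand-ins, far smaller than Metascope's real `MetadataContext` and `TemplateData`.

```typescript
// Hypothetical, heavily simplified stand-ins for Metascope's real types.
interface SketchContext {
  gitStats?: { data: { commitCount?: number } }
}
interface SketchTemplateData {
  authorName?: string
}

type SketchTemplate<T> = (context: SketchContext, templateData: SketchTemplateData) => T

// Returns its argument unchanged; the only job is to pin the parameter
// types so editors can autocomplete fields and infer the return type T.
function defineTemplateSketch<T>(transform: SketchTemplate<T>): SketchTemplate<T> {
  return transform
}

// Parameter types are inferred from the wrapper, no annotations needed.
const commitTemplate = defineTemplateSketch(({ gitStats }) => ({
  commits: gitStats?.data.commitCount ?? 0,
}))

// Inferred as { commits: number }
const shaped = commitTemplate({ gitStats: { data: { commitCount: 7 } } }, {})
```

The wrapper does nothing at runtime, so a template that ignores the second argument is still a valid template.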

Examples

Get all metadata

import { getMetadata, helpers } from 'metascope'

const metadata = await getMetadata({ path: '.' })
console.log(helpers.firstOf(metadata.codemetaJson)?.data.name)
console.log(helpers.firstOf(metadata.github)?.data.stargazerCount)
console.log(helpers.firstOf(metadata.gitStats)?.data.commitCount)

See output sample for this repository.

Get metadata from specific sources only

import { getMetadata, helpers } from 'metascope'

const metadata = await getMetadata({
  path: '.',
  sources: ['nodePackageJson', 'licenseFile'],
})

// Only the requested sources are populated
console.log(helpers.firstOf(metadata.nodePackageJson)?.data.name)
console.log(helpers.firstOf(metadata.licenseFile)?.data.spdxId)
// Other sources are undefined
console.log(metadata.github) // Undefined

Get shaped metadata via a template

import { defineTemplate, getMetadata, helpers } from 'metascope'

const template = defineTemplate(({ codemetaJson, github }) => ({
  name: helpers.firstOf(codemetaJson)?.data.name,
  stars: helpers.firstOf(github)?.data.stargazerCount,
}))

// Result is typed as { name: ..., stars: ... }
const result = await getMetadata({ path: '.', template })

Provide credentials

import { getMetadata } from 'metascope'

const metadata = await getMetadata({
  credentials: { githubToken: 'ghp_xxxxxxxxxxxx' },
  path: '.',
})

Credential resolution follows a precedence chain: explicit options > environment variables > CLI tool fallbacks (e.g. gh auth token). This makes metascope work in both CI environments and local development without configuration.
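
The precedence chain can be pictured as an ordered list of lookups where the first non-empty result wins. The sketch below uses hypothetical names (`resolveToken`, `TokenLookup`) and stubbed inputs; it is not Metascope's actual resolver.

```typescript
type TokenLookup = () => string | undefined

// Try each lookup in priority order and return the first non-empty token.
function resolveToken(lookups: TokenLookup[]): string | undefined {
  for (const lookup of lookups) {
    try {
      const token = lookup()?.trim()
      if (token) return token
    } catch {
      // A failing lookup (e.g. a missing CLI tool) falls through to the next.
    }
  }
  return undefined
}

// Hypothetical inputs standing in for real option, environment, and CLI values.
const explicitOption: string | undefined = undefined
const environment: Record<string, string | undefined> = { GITHUB_TOKEN: 'token-from-env' }
const ghCliFallback: TokenLookup = () => {
  // In Node.js this could shell out to `gh auth token`; here it simulates
  // a machine where the GitHub CLI is not installed.
  throw new Error('gh is not installed')
}

const token = resolveToken([
  () => explicitOption, // 1. explicit option
  () => environment.GITHUB_TOKEN, // 2. environment variable
  ghCliFallback, // 3. CLI tool fallback
])
```

With no explicit option set, `token` resolves from the environment stub, and the CLI fallback is never consulted.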

Pass template data

import { defineTemplate, getMetadata, helpers } from 'metascope'

const template = defineTemplate(({ codemetaJson }, { authorName }) => {
  const codemeta = helpers.firstOf(codemetaJson)
  return {
    isAuthoredByMe: codemeta?.data.author?.some((a) => a.name === authorName),
    name: codemeta?.data.name,
  }
})

const result = await getMetadata({
  path: '.',
  template,
  templateData: { authorName: 'Jane Doe' },
})

Use a built-in template

import { getMetadata } from 'metascope'

const result = await getMetadata({ path: '.', template: 'frontmatter' })

Sources

Metascope extracts data from a wide range of data sources:

Local Files

| Ecosystem | Organization | Metascope Key | Source Specifications |
| ---------- | ------------ | ------------- | --------------------- |
| Agnostic | | readmeFile | README.md (and variants) |
| Agnostic | CodeMeta (v1) | codemetaJson | codemeta.json |
| Agnostic | CodeMeta (v2) | codemetaJson | codemeta.json |
| Agnostic | CodeMeta (v3.1) | codemetaJson | codemeta.json |
| Agnostic | CodeMeta (v3) | codemetaJson | codemeta.json |
| Agnostic | Documented below | metadataFile | metadata.json (and .yaml / .yml variants) |
| Agnostic | Git | gitConfig | .git/config |
| Agnostic | Public Code | publiccodeYaml | publiccode.yml (Also matches .yaml) |
| Agnostic | SPDX | licenseFile | LICENSE, LICENCE, COPYING, UNLICENSE (and .md/.txt variants) |
| Apple | Apple Info.plist | xcodeInfoPlist | Info.plist |
| Apple | Xcode Project | xcodeProjectPbxproj | *.xcodeproj/project.pbxproj |
| C++ | Arduino Library | arduinoLibraryProperties | library.properties |
| C++ | Cinder CinderBlock | cinderCinderblockXml | cinderblock.xml |
| C++ | openFrameworks Addon (Legacy) | openframeworksInstallXml | install.xml (Legacy format, replaced by addon_config.mk) |
| C++ | openFrameworks Addon | openframeworksAddonConfigMk | addon_config.mk |
| Go | Go Modules | goGoMod | go.mod |
| Go | GoReleaser | goGoreleaserYaml | .goreleaser.yaml (Also matches .yml) |
| Java | Maven | javaPomXml | pom.xml |
| Java | Processing Library | processingLibraryProperties | library.properties |
| Java | Processing Sketch | processingSketchProperties | sketch.properties (Not really specified...) |
| JavaScript | NPM | nodePackageJson | package.json |
| Obsidian | Obsidian | obsidianPluginManifestJson | manifest.json |
| Python | PyPi (Distutils) | pythonSetupCfg | setup.cfg |
| Python | PyPi (Distutils) | pythonSetupPy | setup.py |
| Python | PyPi (pep-0621) | pythonPyprojectToml | pyproject.toml |
| Python | PyPi (PKG-INFO) | pythonPkgInfo | .egg-info/PKG-INFO |
| Ruby | Ruby Gems | rubyGemspec | *.gemspec |
| Rust | Crates | rustCargoToml | Cargo.toml |

Local Tools

| Ecosystem | Organization | Metascope Key | Source Specifications |
| --------- | ------------ | ------------- | --------------------- |
| Agnostic | | dependencyUpdates | Dependency freshness (outdated packages, libyears) |
| Agnostic | | fileStats | Filesystem metadata (file counts, directory counts, total size) |
| Agnostic | Git | gitStats | Git CLI statistics (commits, branches, tags, contributors) |
| Agnostic | None | codeStats | Lines of code analysis from tokei via bundled native bindings |

Remote Sources

You can skip network calls by passing --offline to the CLI.

| Ecosystem | Organization | Metascope Key | Source Specifications |
| ---------- | ------------ | ------------- | --------------------- |
| Agnostic | GitHub Repository Metadata | github | GitHub GraphQL metadata |
| JavaScript | NPM Registry | nodeNpmRegistry | NPM registry API (download counts, publish dates, latest version) |
| Obsidian | Obsidian Community Plugins | obsidianPluginRegistry | Obsidian community plugin stats (download counts) |
| Python | PyPI Registry | pythonPypiRegistry | PyPI registry API (download counts, publish dates, latest version) |

About metadata.json

Metascope supports a minimalist metadata.json (or .yaml) file, which captures the minimal metadata required to populate the description, homepage, and topics on a GitHub repository page.

This is a non-standard format that exists primarily for use in combination with github-action-repo-sync.

| Key | Key Aliases | CodeMeta Property | Notes |
| ----------- | ----------- | ----------------- | ----- |
| description | None | description | String description of project |
| homepage | url, repository, website | url | For repository values, git+ prefix and .git suffix are automatically stripped |
| keywords | tags, topics | keywords | Array of strings, or a single comma-delimited string |

If multiple key aliases are present in the object, priority for populating the associated codemeta.json goes to the key, then falls through to key aliases in the order shown above. (E.g. homepage takes priority over url.)
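
The alias fall-through and value normalization described above might look something like this. All function names here are hypothetical, and the sketch applies the `git+` / `.git` stripping to any string value for simplicity, whereas the table only promises it for repository values.

```typescript
// Return the first defined value among a canonical key and its aliases,
// checked in priority order.
function firstDefined(source: Record<string, unknown>, keys: string[]): unknown {
  for (const key of keys) {
    if (source[key] !== undefined) return source[key]
  }
  return undefined
}

// Strip the `git+` prefix and `.git` suffix from repository-style URLs.
function normalizeUrl(value: unknown): string | undefined {
  if (typeof value !== 'string') return undefined
  return value.replace(/^git\+/, '').replace(/\.git$/, '')
}

// Accept either an array of strings or a single comma-delimited string.
function normalizeKeywords(value: unknown): string[] {
  const items = Array.isArray(value) ? value : typeof value === 'string' ? value.split(',') : []
  return items.map((item) => String(item).trim()).filter((item) => item.length > 0)
}

// Illustrative metadata.json content using only aliases.
const metadataFile = {
  repository: 'git+https://github.com/example/project.git',
  tags: 'cli, metadata',
  topics: ['ignored'],
}

const normalized = {
  description: firstDefined(metadataFile, ['description']),
  homepage: normalizeUrl(firstDefined(metadataFile, ['homepage', 'url', 'repository', 'website'])),
  keywords: normalizeKeywords(firstDefined(metadataFile, ['keywords', 'tags', 'topics'])),
}
```

In this example `tags` wins over `topics` because it appears earlier in the alias order, and the repository URL is cleaned up before being used as the homepage.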

If you have more metadata to define but your project lacks a canonical package specification format, then creating a codemeta.json file is recommended over the non-standard metadata.json.

Templates

Metascope provides basic templating and output transformation functionality for composing its output into more compact and focused representations.

Built-in templates

Five built-in templates are available by name. Pass the name as the template option on the CLI or in the API.

codemeta

The CodeMeta template provides a standard way to describe software using JSON-LD and schema.org terms. Most software projects already have rich metadata in manifests and other files (e.g. package.json, Cargo.toml, pyproject.toml, LICENSE, etc.), but the name and structure of semantically equivalent metadata is often inconsistent across ecosystems.

It leverages the crosswalk data generously compiled by CodeMeta contributors to assist in automating the mapping of various metadata formats to the CodeMeta standard. Where crosswalk data is unavailable or incomplete, heuristics are used instead.

This tool always outputs CodeMeta v3.1 files. When ingesting codemeta.json files defined in the older CodeMeta v1 and v2 contexts, all simple key re-mappings defined in the crosswalk table are applied. However, some more nuanced conditional transformations (like the reassignment of copyright holding agents in v1) are not implemented.

More mature Python-based tools like codemetapy and codemeta-harvester perform a similar task, and either is recommended if you need codemeta.json output and aren't limited to a Node.js runtime.

Note that Metascope and its author are not affiliated with the CodeMeta project or its governing bodies.

metascope --template codemeta

See an output sample from the codemeta template run against this repository.

codemetaJson

A JSON-friendly derivation of the codemeta template. Produces the same aggregated metadata but parses it through a strict schema, stripping JSON-LD artifacts (like @context and @type) to yield plain JSON suitable for consumption by tools that don't understand JSON-LD.

metascope --template codemetaJson

See an output sample from the codemetaJson template run against this repository.
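
As a rough illustration of what stripping JSON-LD artifacts involves, the sketch below recursively drops every key beginning with `@`. This is a simpler technique than the strict schema parsing the template actually uses, `stripJsonLd` is a hypothetical name, and the sample document is invented for the example.

```typescript
// Recursively drop JSON-LD keywords (keys beginning with "@") from a value.
function stripJsonLd(value: unknown): unknown {
  if (Array.isArray(value)) return value.map((item) => stripJsonLd(item))
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>)
        .filter(([key]) => !key.startsWith('@'))
        .map(([key, child]) => [key, stripJsonLd(child)] as const),
    )
  }
  return value
}

// Invented sample resembling a small CodeMeta document.
const jsonLd = {
  '@context': 'https://w3id.org/codemeta/3.1',
  '@type': 'SoftwareSourceCode',
  name: 'example',
  author: [{ '@type': 'Person', givenName: 'Jane', familyName: 'Doe' }],
}

const plain = stripJsonLd(jsonLd)
```

Unlike a schema-based approach, this blanket filter also discards `@id` values, which may carry information worth keeping under a plain key.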

frontmatter

A compact, non-nested, polyglot overview of the project. Designed for Obsidian frontmatter — flat keys with natural language names, blending all available sources into a single trackable snapshot. Uses null for missing values to ensure stable keys.

metascope --template frontmatter

See an output sample from the frontmatter template run against this repository.

metadata

A minimal template that outputs the three fields used by metadata.json / metadata.yaml: description, homepage, and topics. Designed for use with github-action-repo-sync to populate a GitHub repository's description, homepage, and topics. Values from a metadata.json source file override what the codemeta template would otherwise produce.

metascope --template metadata

See an output sample from the metadata template run against this repository.

project

I needed this one for a legacy internal dashboard application. Includes ownership checks via authorName and githubAccount template data.

metascope --template project --author-name "Jane Doe" --github-account janedoe

See an output sample from the project template run against this repository.

Defining a custom template

Templates are pure functions that receive the full MetadataContext and an optional TemplateData object, and return whatever shape you like. They are applied after all sources have been extracted, so all available data is accessible.

Yes, you can just pipe output to jq and filter / transform as you please, but for complex templates with a lot of logic, TypeScript can be nicer to work with.

Use defineTemplate() for type inference and autocomplete.

Many helper functions for working with template data are also available under the helpers namespace:

// In e.g. "metascope-template.ts":
import { defineTemplate, helpers } from 'metascope'

export default defineTemplate(({ codemetaJson, codeStats, github, gitStats }) => {
  const codemeta = helpers.firstOf(codemetaJson)
  const git = helpers.firstOf(gitStats)
  const gh = helpers.firstOf(github)
  const loc = helpers.firstOf(codeStats)
  return {
    commits: git?.data.commitCount,
    forks: gh?.data.forkCount,
    linesOfCode: loc?.data.total?.code,
    name: codemeta?.data.name,
    stars: gh?.data.stargazerCount,
    version: codemeta?.data.version,
  }
})

Passing template data

The second argument to a template function is a TemplateData object with optional authorName and githubAccount fields. This lets templates parameterize ownership checks instead of hardcoding author names:

import { defineTemplate, helpers } from 'metascope'

export default defineTemplate(({ codemetaJson }, { authorName, githubAccount }) => {
  const codemeta = helpers.firstOf(codemetaJson)
  const authors = codemeta?.data.author?.map((a) => a.name) ?? []
  const repo = codemeta?.data.codeRepository?.toLowerCase() ?? ''
  return {
    isMyProject: authors.includes(authorName),
    isOnMyGitHub: typeof githubAccount === 'string' && repo.includes(`/${githubAccount}/`),
    name: codemeta?.data.name,
  }
})

Values for the built-in templates are provided via the --author-name and --github-account CLI flags, or via the templateData option in the API. Templates that don't need this data can simply omit the second argument.

Using a custom template via the CLI

metascope --template ./metascope-template.ts

Template files are loaded via jiti, so TypeScript works out of the box without a build step.

Background

Metascope was built to support automated generation of project dashboards, badges, and documentation where a single source of truth for project metadata is useful. Rather than querying each API individually, metascope handles the discovery, authentication, and aggregation in one pass for a wide variety of project types.

Related projects

  • codemeta
    Standard shared metadata vocabulary (JSON-LD)
  • codemetapy
    Translate software metadata into the CodeMeta vocabulary (Python)
  • codemeta-harvester
    Aggregate software metadata into the CodeMeta vocabulary from source repositories and service endpoints (Python)
  • bibliothecary
    Manifest discovery and parsing for libraries.io (Ruby)
  • diggity
    Generates SBOMs for container images, filesystems, archives, and more (Go)
  • SOMEF
    Software Metadata Extraction Framework (Python)
  • Upstream Ontologist
    A common interface for finding metadata about upstream software projects (Rust)
  • GrimoireLab
    Platform for software development analytics and insights (Python)
  • OSS Review Toolkit
    A suite of CLI tools to automate software compliance checks (Kotlin)
  • Git Truck
    Repository visualization (TypeScript)
  • Onefetch
    Offline command-line Git information tool (Rust)
  • Sokrates
    Polyglot source code examination tool (Java)

Slop factor

Medium.

The architecture and non-boilerplate parts of the documentation were human-driven, but sizable chunks of the implementation were mostly Claude Code's doing and have been subject to only moderate post-facto human scrutiny.

Maintainers

@kitschpatrol

Acknowledgments

Thank you to the CodeMeta Project Management Committee and contributors for their development and stewardship of the standard.

Jacob Peddicord's askalono project inspired the Dice-Sørensen scoring strategy used for classifying arbitrary license text.

Contributing

Issues and pull requests are welcome.

License

MIT © Eric Mika