@zzzhizhia/ingest

v1.7.0

Interactive CLI for ingesting raw files into an org-mode LLM wiki

ingest

Interactive CLI for ingesting raw source files into an org-mode LLM wiki via claude -p. Supports standalone knowledge bases and subwiki knowledge bases with independent digestion.

Quick Start

# Install
npm install -g @zzzhizhia/ingest

# Scaffold a new wiki
ingest init ./wiki
cd wiki

# Drop a file into raw/ and ingest it
cp ~/notes/article.md raw/
ingest

Or run directly without installing:

npx @zzzhizhia/ingest init ./wiki

Requirements

  • Node >= 20
  • claude CLI in PATH (install guide)
  • rg (ripgrep) for ingest grep
  • LibreOffice (optional, for Office file conversion)
  • glow (optional, for rendered query output)

Optional dependencies can be installed via Homebrew:

brew install ripgrep glow libreoffice

Usage

# Interactive checkbox -- select which pending files to ingest
ingest

# Ingest all pending files without prompting
ingest --all

# Ingest specific files directly (skips pending scan)
ingest raw/clips/article.org

# Show pending files, subwiki grouping, and config
ingest status

# Scaffold a blank wiki (+ pre-commit hook if git repo)
ingest init
ingest init ./path/to/new-wiki

# Remove a file from lock (makes it pending again for re-ingestion)
ingest forget raw/clips/article.org

# Validate wiki files (format, links, ID uniqueness)
ingest lint

# Validate + apply safe deterministic auto-fixes
ingest lint --fix

# Ask a question against the wiki via Claude
ingest query "What do we know about X?"

# Search wiki pages by title and print full org content
ingest grep "Alice"
ingest grep "^Claude$"

# Export a wiki page and its linked neighborhood as HTML
ingest export <id> [--depth N] [--backlinks] [--output PATH] [--open]
ingest export --list

# List past ingest runs (state in $XDG_STATE_HOME/ingest/runs.json)
ingest history
ingest history --last 5
ingest history --status interrupted,completed
ingest history 01HXYZW...

# Resume the most recent interrupted run (reuses the original claude session)
ingest resume
ingest resume 01HXYZW...

Options

| Option | Description |
|--------|-------------|
| -a, --all | Ingest all pending files without prompting |
| --no-pull | Skip git pull and subwiki sync before ingesting |
| -V, --version | Show version and exit |
| --verbose | Stream Claude output in real-time (default: spinner with elapsed time) |
| --depth N | BFS hops for export (default 1) |
| --backlinks | Include reverse links during BFS for export |
| --output P | Output HTML path for export |
| --output-root D | Directory for export with auto Denote-style filename |
| --open | Open exported HTML in browser |
| --fix | Apply safe auto-fixes (used with lint) |
| --last N | Show only the last N runs (used with history) |
| --status S | Filter by status: in-progress, completed, interrupted (used with history) |
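The --depth and --backlinks options describe a breadth-first walk over the [[id:...]] link graph. A minimal sketch of such a walk, assuming the pages have already been parsed into a dict from page ID to outgoing link IDs. The `neighborhood` function and the `links` structure are hypothetical and are not taken from the CLI's source:

```python
from collections import deque


def neighborhood(links, start, depth=1, backlinks=False):
    """Return the set of page IDs within `depth` BFS hops of `start`.

    `links` maps a page ID to the IDs it links to (hypothetical structure).
    With backlinks=True, edges are also followed in reverse.
    """
    reverse = {}
    if backlinks:
        for src, targets in links.items():
            for dst in targets:
                reverse.setdefault(dst, []).append(src)

    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, hops = queue.popleft()
        if hops == depth:
            continue  # do not expand past the requested depth
        for nxt in list(links.get(page, [])) + reverse.get(page, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return seen
```

With depth 1 and no backlinks, only the start page and the pages it links to directly are returned; raising the depth or enabling backlinks widens the set.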

Knowledge Base Structure

An ingest knowledge base is a git repository with this layout:

wiki/
├── entities.org          ← People, organizations, products, places
├── concepts.org          ← Ideas, theories, frameworks, methods
├── sources.org           ← One summary per ingested source file
├── analyses.org          ← Comparisons, syntheses, deep dives
├── summary.org           ← Dashboard + activity log (optional, main repo only)
├── raw/                  ← Immutable source material
│   ├── clips/            ←   Web clippings (.org, .md)
│   ├── books/            ←   Book notes
│   ├── papers/           ←   Academic papers (.pdf)
│   ├── plaud/            ←   Audio transcripts
│   └── assets/           ←   Images, diagrams
├── subs/                 ← Subwiki knowledge bases
│   ├── team-wiki/        ←   Each with its own entities/concepts/sources/analyses.org
│   └── project-wiki/
├── ingest-lock.json      ← Digestion state (path → content hash + timestamp)
├── ingest.json           ← CLI config (model, effort, allowedTools)
├── CLAUDE.md             ← Schema instructions for Claude
├── .gitignore
└── .gitattributes

Category files are org-mode files where each top-level heading is a wiki "page":

* Claude Code                                                        :entity:
:PROPERTIES:
:ID:       20260503T120000
:DATE:     [2026-05-03]
:SOURCES:  raw/clips/20260503T115200--claude-code-announcement__dev.org
:END:

** Overview

Anthropic's CLI tool for AI-assisted software development.

** Content

Claude Code provides terminal-based access to Claude...
  [source: raw/clips/20260503T115200--claude-code-announcement__dev.org § Features | HIGH]

** Cross-references

- [[id:20260501T090000][Anthropic]] — developer

Key properties:

  • :ID: -- Timestamp identifier (YYYYMMDDTHHMMSS), unique across all files
  • :DATE: -- Creation date in [YYYY-MM-DD] format
  • :SOURCES: -- Path to the raw source file that contributed to this page
  • Tag -- One of :entity:, :concept:, :source:, :analysis:; it must match the file
  • Cross-references -- Bidirectional [[id:...][Title]] links between pages

Source files under raw/ are immutable -- ingest never modifies them. They use Denote naming: {YYYYMMDDTHHMMSS}--{title}__{tags}.ext.
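The naming pattern can be checked with a regular expression. The sketch below is illustrative only: `parse_denote` is a hypothetical helper, and splitting tags on "_" follows the general Denote convention rather than anything stated in this README.

```python
import re
from datetime import datetime

# Pattern for {YYYYMMDDTHHMMSS}--{title}__{tags}.ext as described above.
# Treating the tag part as "_"-separated is an assumption.
DENOTE_RE = re.compile(
    r"^(?P<id>\d{8}T\d{6})--(?P<title>[^_.]+?)(?:__(?P<tags>[^.]+))?\.(?P<ext>\w+)$"
)


def parse_denote(filename):
    """Split a Denote-style file name into its parts, or return None."""
    m = DENOTE_RE.match(filename)
    if m is None:
        return None
    try:
        created = datetime.strptime(m.group("id"), "%Y%m%dT%H%M%S")
    except ValueError:
        return None  # right shape, but not a real date/time
    return {
        "id": m.group("id"),
        "created": created,
        "title": m.group("title"),
        "tags": m.group("tags").split("_") if m.group("tags") else [],
        "ext": m.group("ext"),
    }
```

Applied to the source file name used in the page example, this yields the ID 20260503T115200, the title claude-code-announcement, the single tag dev, and the extension org.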

Subwiki knowledge bases under subs/ are fully independent: own category files, own raw/, own git history. Useful for team/project wikis with different access permissions.

History & Resume

Every ingest run is recorded in $XDG_STATE_HOME/ingest/runs.json (machine-local, not version-controlled):

  • ingest history — list all runs, most recent first
  • ingest history <id> — show a run's session id, wiki, and timing
  • ingest history --last N / --status interrupted,completed — narrow the list

If a run is interrupted (Ctrl+C, crash, network drop), ingest resume re-invokes Claude with --resume <session-id> "continue". Claude's session already knows which sources it had been processing, which headings it had created, and which ones were still pending — so a one-word prompt is enough to pick up where it left off. The wiki does not need to be re-scanned and files are not re-listed.

ingest resume accepts an explicit run id; with no id it picks the most recent in-progress or interrupted run for the current wiki. The session id is only retained while Claude's own session log is alive (~30 days) — older runs fall back to a fresh ingest run.
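A rough sketch of how the state file could be located and a resumable run chosen. The record keys (id, wiki, status, started_at) are hypothetical, since the layout of runs.json is not documented here. The ~/.local/state fallback comes from the XDG Base Directory specification; whether the CLI applies the same fallback is an assumption.

```python
import os
from pathlib import Path


def runs_file(env=None):
    """Locate runs.json, using the XDG default of ~/.local/state when
    XDG_STATE_HOME is unset or empty."""
    env = os.environ if env is None else env
    base = env.get("XDG_STATE_HOME") or str(Path.home() / ".local" / "state")
    return Path(base) / "ingest" / "runs.json"


def pick_resumable(runs, wiki):
    """Return the most recent in-progress or interrupted run for `wiki`.

    `runs` is a list of dicts with hypothetical keys: id, wiki, status,
    started_at (an ISO 8601 string, so string order matches time order).
    Returns None when nothing can be resumed.
    """
    candidates = [
        r for r in runs
        if r["wiki"] == wiki and r["status"] in ("in-progress", "interrupted")
    ]
    return max(candidates, key=lambda r: r["started_at"], default=None)
```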

Full Flow

git pull --ff-only (auto stash/pop)
  ↓
git submodule update --remote --init
  ↓
Scan raw/ + subs/ vs ingest-lock.json → find new + updated files
  ↓
Pre-convert: Office → PDF (LibreOffice)
  ↓
Interactive checkbox (skipped with --all or explicit paths)
  ↓
Group files by subwiki
  ↓
Claude sessions (main repo sequential, subwikis parallel)
  ↓
Write lock entries to ingest-lock.json (batch)
  ↓
Commit subwikis first + push
  ↓
Commit main repo (wiki files + lock + subwiki pointers) + push
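The scan step compares files on disk with the lock file. A minimal sketch of that comparison, limited to raw/ for brevity. SHA-256 and the lock entry key "hash" are assumptions; the README only says that the lock maps a path to a content hash and a timestamp.

```python
import hashlib
from pathlib import Path


def file_hash(path):
    """SHA-256 of the file contents (the real hash algorithm is not documented)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_pending(root, lock):
    """Return (new, updated) lists of paths under raw/, relative to `root`.

    `lock` maps a relative path to {"hash": ..., "timestamp": ...};
    the key names are hypothetical.
    """
    new, updated = [], []
    raw = Path(root) / "raw"
    for path in sorted(p for p in raw.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        entry = lock.get(rel)
        if entry is None:
            new.append(rel)  # never ingested
        elif entry["hash"] != file_hash(path):
            updated.append(rel)  # content changed since the last ingest
    return new, updated
```

Under this model, removing an entry from the lock makes the file show up as new again, which matches what ingest forget is described as doing.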

Config

Place ingest.json at the org root to override defaults:

{
  "model": "sonnet",
  "effort": "medium",
  "allowedTools": ["Read", "Edit", "Bash(date *)", "Bash(date)", "Bash(grep *)", "Bash(git status)", "Bash(git log *)"],
  "prompt": {
    "systemAppend": "Additional instructions appended to the system prompt",
    "userPrefix": "Text prepended to the user prompt"
  }
}

All fields are optional. Missing fields use the defaults shown above. ingest init generates a starter config with model and effort.
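One way to picture "missing fields use the defaults" is a merge of ingest.json over a defaults table. This is a sketch, not the CLI's loader: `load_config` is a hypothetical name, and the empty prompt strings are an assumption because the example shows placeholder text rather than real default values.

```python
import json
from pathlib import Path

# model, effort and allowedTools are copied from the example above.
# The empty prompt strings are an assumption.
DEFAULTS = {
    "model": "sonnet",
    "effort": "medium",
    "allowedTools": [
        "Read", "Edit", "Bash(date *)", "Bash(date)", "Bash(grep *)",
        "Bash(git status)", "Bash(git log *)",
    ],
    "prompt": {"systemAppend": "", "userPrefix": ""},
}


def load_config(org_root):
    """Merge <org_root>/ingest.json over DEFAULTS; a missing file means defaults."""
    path = Path(org_root) / "ingest.json"
    user = json.loads(path.read_text()) if path.exists() else {}
    merged = {**DEFAULTS, **user}
    # Merge the nested "prompt" object key by key, so overriding one prompt
    # field keeps the other.
    merged["prompt"] = {**DEFAULTS["prompt"], **user.get("prompt", {})}
    return merged
```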

Supported File Types

| Type | Extensions | Processing |
|------|-----------|------------|
| Text | .org, .md, .txt | Direct read |
| HTML | .html | Direct read |
| PDF | .pdf | Direct read |
| Office | .doc, .docx, .ppt, .pptx, .xls, .xlsx | Pre-converted to PDF via LibreOffice |
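The table above amounts to a lookup by file extension. A small sketch follows, with two assumptions flagged: matching extensions case-insensitively, and the exact LibreOffice invocation. `soffice --headless --convert-to pdf` is a standard LibreOffice command line, but the README does not say which command the CLI runs.

```python
from pathlib import Path

DIRECT = {".org", ".md", ".txt", ".html", ".pdf"}
OFFICE = {".doc", ".docx", ".ppt", ".pptx", ".xls", ".xlsx"}


def processing_for(path):
    """Return "direct", "convert", or None for an unsupported extension."""
    ext = Path(path).suffix.lower()  # case-insensitive match is an assumption
    if ext in DIRECT:
        return "direct"
    if ext in OFFICE:
        return "convert"
    return None


def libreoffice_command(path, outdir):
    """Build a headless LibreOffice PDF conversion command (not executed here)."""
    return ["soffice", "--headless", "--convert-to", "pdf",
            "--outdir", str(outdir), str(path)]
```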

Subwiki Knowledge Bases

Subwikis under subs/ are treated as independent knowledge bases. Each subwiki has its own entities.org, concepts.org, sources.org, analyses.org, and raw/.

When ingest finds source files inside a subwiki, it:

  1. Invokes Claude with the subwiki root as working directory
  2. Claude writes wiki pages to the subwiki's own category files
  3. Commits inside the subwiki, then pushes
  4. Main repo commits the subwiki pointer update + lock

Subwiki Claude sessions run in parallel. Wiki files at a subwiki's root level are skipped during scanning.
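Before sessions start, pending files have to be grouped by the wiki that owns them. A minimal sketch of that grouping, assuming repo-relative POSIX paths and using the empty string as a hypothetical key for the main repo:

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_subwiki(paths):
    """Group repo-relative paths by owning wiki; "" stands for the main repo."""
    groups = defaultdict(list)
    for p in paths:
        parts = PurePosixPath(p).parts
        # subs/<name>/... belongs to the subwiki <name>; everything else
        # belongs to the main repo.
        key = parts[1] if len(parts) > 2 and parts[0] == "subs" else ""
        groups[key].append(p)
    return dict(groups)
```

Each non-empty key would then map to one Claude session run with that subwiki as its working directory.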

Scaffolding

ingest init [path] creates a blank wiki template:

entities.org
concepts.org
sources.org
analyses.org
ingest-lock.json
ingest.json
raw/
  example-ingest-readme.md
subs/
.gitignore
.gitattributes
CLAUDE.md

If the target directory is a git repository, a pre-commit hook is also installed.

Wiki Validation

ingest lint checks all four category files:

  • Every top-level heading has a tag, :ID:, and :DATE:
  • Tag matches the file (entities.org → :entity:, etc.)
  • :ID: format is YYYYMMDDTHHMMSS
  • All [[id:...]] links resolve to existing :ID: values
  • No duplicate :ID: across category files
  • :PROPERTIES: / :END: drawers are balanced
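Several of these checks can be expressed in a few lines. The sketch below covers only tag matching, ID format, duplicate IDs, and link resolution. It is an illustration of the rules, not the CLI's validator, and the `lint` function and its message strings are hypothetical.

```python
import re

EXPECTED_TAG = {
    "entities.org": "entity",
    "concepts.org": "concept",
    "sources.org": "source",
    "analyses.org": "analysis",
}
TAG_RE = re.compile(r":(\w+):\s*$")
ID_RE = re.compile(r"^:ID:\s+(\S+)")
LINK_RE = re.compile(r"\[\[id:([^\]]+)\]")
ID_FORMAT = re.compile(r"^\d{8}T\d{6}$")


def lint(files):
    """`files` maps a category file name to its org text.

    Returns a list of problem strings; an empty list means no findings.
    """
    problems, owner = [], {}
    for name, text in files.items():
        for line in text.splitlines():
            if line.startswith("* "):  # top-level heading = wiki page
                m = TAG_RE.search(line)
                tag = m.group(1) if m else None
                if tag != EXPECTED_TAG.get(name):
                    problems.append(f"{name}: tag {tag!r} does not match file")
            m = ID_RE.match(line)
            if m:
                page_id = m.group(1)
                if not ID_FORMAT.match(page_id):
                    problems.append(f"{name}: bad ID format {page_id}")
                if page_id in owner:
                    problems.append(f"{name}: duplicate ID {page_id}")
                owner.setdefault(page_id, name)
    # Links are checked after all IDs are known, since they may cross files.
    for name, text in files.items():
        for target in LINK_RE.findall(text):
            if target not in owner:
                problems.append(f"{name}: broken link id:{target}")
    return problems
```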

ingest lint --fix applies safe deterministic fixes before reporting:

  • Tag mismatch: replaces wrong tag with expected tag for the file
  • Broken link with unique title match: replaces invalid ID with the correct one
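Both fixes are deterministic text rewrites. A sketch under stated assumptions: `fix_tags` and `fix_links` are hypothetical helpers, `ids_by_title` is a hypothetical index from page title to the IDs carrying that title, and headings with no tag at all are left untouched because the README does not say how those are handled.

```python
import re

EXPECTED_TAG = {
    "entities.org": "entity",
    "concepts.org": "concept",
    "sources.org": "source",
    "analyses.org": "analysis",
}
TAG_AT_END = re.compile(r":\w+:\s*$")
LINK_RE = re.compile(r"\[\[id:([^\]]+)\]\[([^\]]+)\]\]")


def fix_tags(name, text):
    """Rewrite the trailing tag of each top-level heading to the file's tag."""
    expected = f":{EXPECTED_TAG[name]}:"
    out = []
    for line in text.splitlines():
        if line.startswith("* "):
            line = TAG_AT_END.sub(expected, line)
        out.append(line)
    return "\n".join(out) + ("\n" if text.endswith("\n") else "")


def fix_links(text, ids_by_title):
    """Repair links whose ID is unknown but whose title matches exactly one page."""
    known = {i for ids in ids_by_title.values() for i in ids}

    def repair(m):
        target, title = m.group(1), m.group(2)
        matches = ids_by_title.get(title, [])
        if target not in known and len(matches) == 1:
            return f"[[id:{matches[0]}][{title}]]"
        return m.group(0)  # valid link, or ambiguous title: leave as-is

    return LINK_RE.sub(repair, text)
```

Leaving ambiguous titles alone is what keeps the fix safe: a link is rewritten only when there is exactly one candidate.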

The same validation runs as a pre-commit hook. If a commit is rejected during ingestion, the CLI retries with safe fixes first, then an LLM fix pass.

Query

ingest query "question" invokes Claude in read-only mode against the wiki. Output is rendered with glow in interactive terminals, plain text when piped.
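The choice between rendered and plain output can be made by checking whether standard output is a terminal. A minimal sketch with hypothetical function names. `glow -` reads Markdown from standard input; falling back to plain text when glow is not installed is an assumption based on glow being listed as optional.

```python
import shutil
import subprocess
import sys


def choose_renderer(is_tty, glow_path):
    """Return "glow" only for an interactive terminal with glow installed."""
    return "glow" if is_tty and glow_path else "plain"


def show(markdown):
    """Print an answer, rendered through glow when that is appropriate."""
    renderer = choose_renderer(sys.stdout.isatty(), shutil.which("glow"))
    if renderer == "glow":
        # "glow -" reads Markdown from standard input.
        subprocess.run(["glow", "-"], input=markdown, text=True, check=False)
    else:
        sys.stdout.write(markdown)
```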

What Claude Does

Claude runs as claude -p with model and effort from ingest.json. Its instructions are embedded in the CLI -- it does not read CLAUDE.md.

For each source file, Claude:

  1. Reads the file (Office files via pre-conversion to PDF)
  2. Checks for existing wiki pages (grep SOURCES: in wiki files)
  3. Extracts entities, concepts, and key arguments with section-level attribution
  4. Matches existing headings (fuzzy) or appends new ones
  5. Writes pages following the org-mode wiki template, with source citations and confidence levels (HIGH / MED / LOW)
  6. Flags contradictions between new content and existing wiki pages
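The citations written in step 5 follow the shape shown in the page example: [source: path § section | confidence]. A sketch of reading them back out of a page, assuming every citation carries all three parts. `parse_citations` is a hypothetical helper.

```python
import re

# Matches "[source: <path> § <section> | HIGH|MED|LOW]" as in the page example.
CITATION_RE = re.compile(
    r"\[source:\s*(?P<path>.+?)\s*§\s*(?P<section>.+?)\s*\|\s*"
    r"(?P<confidence>HIGH|MED|LOW)\]"
)


def parse_citations(text):
    """Return one dict (path, section, confidence) per citation found."""
    return [m.groupdict() for m in CITATION_RE.finditer(text)]
```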

After all files are processed, Claude appends an entry to the summary.org log (main repo only; subwikis skip this).

Claude is not responsible for git commits or lock updates -- the CLI handles both.

Allowed Tools

| Tool | Purpose |
|------|---------|
| Read | Read source files, wiki files, and images |
| Edit | Write to wiki files |
| Bash(date *) | Generate YYYYMMDDTHHMMSS IDs |
| Bash(grep *) | Search for existing headings |
| Bash(git status) | Check repo state |
| Bash(git log *) | Read recent commit history |

Configurable via allowedTools in ingest.json.

Org Root Detection

The CLI walks up from the current directory looking for ingest-lock.json. Run ingest init to scaffold a new wiki, or run from anywhere inside an existing one.
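The upward walk can be sketched in a few lines. `find_org_root` is a hypothetical name, and returning None when no lock file is found is an assumption about the failure case.

```python
from pathlib import Path


def find_org_root(start="."):
    """Return the nearest ancestor of `start` (inclusive) that contains
    ingest-lock.json, or None if there is none."""
    current = Path(start).resolve()
    for candidate in (current, *current.parents):
        if (candidate / "ingest-lock.json").is_file():
            return candidate
    return None
```

Because the search starts at the current directory and moves outward, running from raw/clips/ inside a wiki resolves to the same root as running from the wiki's top level.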

License

MIT