anydocs
v0.0.3
Documentation search CLI using SQLite FTS5 full-text search with Porter stemming and BM25 ranking
anydocs
Markdown documentation search CLI using SQLite FTS5 full-text search.
Features
- Full-text search with SQLite FTS5 and Porter stemming
- Fast local indexing with better-sqlite3
- Simple CLI with docs and search commands
- WAL mode enabled for better concurrency
- BM25 ranking for search results
Installation
pnpm install
pnpm run build
Quick Start
# 1. Install globally
npm install -g anydocs
# or
pnpm install -g anydocs
# 2. Initialize anydocs
anydocs init
# 3. Edit config file at ~/.config/anydocs/anydocs.json
cat > ~/.config/anydocs/anydocs.json << 'EOF'
{
"projects": [
{ "repo": "vercel/next.js" }
]
}
EOF
# 4. Install (clone and index) projects
anydocs install
# 5. Search for content
anydocs search "routing" -n 5
# 6. Retrieve a specific document
anydocs docs /docs/app/getting-started.md --project next.js
Usage
Configure Projects
Edit ~/.config/anydocs/anydocs.json to define projects:
{
"projects": [
{ "repo": "vercel/next.js" },
{ "repo": "facebook/react" },
{
"repo": "github.com/vuejs/core",
"name": "vue3",
"ref": "v3.4.0",
"path": "packages/*/README.md"
}
]
}
Configuration fields:
- repo (required): Repository in owner/repo or host/owner/repo format
- name (optional): Project name, defaults to repo name (e.g., "next.js")
- ref (optional): Git branch or tag, defaults to repository's default branch
- path (optional): Glob pattern for indexing, defaults to **/*.{md,mdx}
- sparse-checkout (optional): Array of paths for sparse checkout
- options (optional): Additional CLI options
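The defaults described above can be sketched as a small resolver. This is a minimal illustration, not the actual anydocs implementation: the names ProjectConfig, ResolvedProject, and resolveProject are hypothetical, the type of options is left open because the README does not specify it, and a null ref is used here to mean "use the repository's default branch".

```typescript
// Hypothetical shape of one entry in anydocs.json, based on the documented fields.
interface ProjectConfig {
  repo: string;
  name?: string;
  ref?: string;
  path?: string;
  "sparse-checkout"?: string[];
  options?: unknown; // type not specified in the README
}

// Hypothetical result after applying the documented defaults.
interface ResolvedProject {
  repo: string;
  name: string;
  ref: string | null; // null: fall back to the repository's default branch
  path: string;
}

const DEFAULT_GLOB = "**/*.{md,mdx}";

function resolveProject(config: ProjectConfig): ResolvedProject {
  const segments = config.repo.split("/").filter((s) => s.length > 0);
  // The last segment is the repo name; at least an owner must precede it.
  const repoName = segments.pop();
  if (repoName === undefined || segments.length === 0) {
    throw new Error(`Invalid repo "${config.repo}": expected owner/repo or host/owner/repo`);
  }
  return {
    repo: config.repo,
    name: config.name ?? repoName,
    ref: config.ref ?? null,
    path: config.path ?? DEFAULT_GLOB,
  };
}
```

With this sketch, `{ "repo": "vercel/next.js" }` resolves to the name next.js and the default glob, while the vue3 entry keeps its explicit name, ref, and path.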
Install Projects
Install (clone and index) all configured projects:
# Install all projects
anydocs install
# Install specific project only
anydocs install --project next.js
What install does:
- Clones repositories to ~/.local/share/anydocs/repos/host/owner/repo
- Creates symlinks under ~/.local/share/anydocs/docs/
- Indexes Markdown files matching the glob pattern
- Updates lockfile at ~/.local/share/anydocs/anydocs-lock.yaml
- Idempotent: re-running updates existing installations
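To make the directory layout concrete, here is a rough sketch of how a repo string could map to the paths listed above. The function names (parseRepo, dataDir, installPaths) are hypothetical, and the fallback from XDG_DATA_HOME to ~/.local/share follows the XDG convention; it is an assumption that anydocs resolves the directory this way.

```typescript
interface RepoId {
  host: string;
  owner: string;
  repo: string;
}

// owner/repo implies GitHub; host/owner/repo is taken as-is.
function parseRepo(spec: string): RepoId {
  const parts = spec.split("/");
  const [a = "", b = "", c = ""] = parts;
  if (parts.length === 2 && a && b) return { host: "github.com", owner: a, repo: b };
  if (parts.length === 3 && a && b && c) return { host: a, owner: b, repo: c };
  throw new Error(`Invalid repo "${spec}": expected owner/repo or host/owner/repo`);
}

// Assumption: XDG_DATA_HOME wins when set, otherwise $HOME/.local/share is used.
function dataDir(env: Record<string, string | undefined>): string {
  const xdg = env["XDG_DATA_HOME"];
  if (xdg) return `${xdg}/anydocs`;
  return `${env["HOME"] ?? "~"}/.local/share/anydocs`;
}

function installPaths(spec: string, name: string, env: Record<string, string | undefined>) {
  const id = parseRepo(spec);
  const base = dataDir(env);
  return {
    clone: `${base}/repos/${id.host}/${id.owner}/${id.repo}`, // cloned repository
    link: `${base}/docs/${name}`, // symlink named after the project
    lockfile: `${base}/anydocs-lock.yaml`,
  };
}
```

For example, `installPaths("vercel/next.js", "next.js", { HOME: "/home/me" })` yields a clone path ending in repos/github.com/vercel/next.js.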
Search Documents
# Basic search
anydocs search "hello"
# Limit number of results
anydocs search "world" -n 5
# Search with FTS5 query syntax
anydocs search "hello AND world"
anydocs search '"exact phrase"'
anydocs search "run*" # Prefix search
anydocs search "hello OR world"
anydocs search "hello NOT world"
Search output (JSON):
[
{
"path": "/guide/intro.md",
"title": "Getting Started",
"snippet": "Welcome to the <b>documentation</b> system!",
"score": -0.0000015
}
]
Retrieve Document
Retrieve the raw Markdown content:
anydocs docs /guide/intro.md --project next.js
Output is the original Markdown with front-matter removed.
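As an illustration of what "front-matter removed" means, the sketch below strips a leading YAML block delimited by --- lines. The function name stripFrontMatter is hypothetical and the regular expression is an assumption about the format; the actual anydocs parser may behave differently on edge cases.

```typescript
// Matches a front-matter block only when it starts on the very first line.
const FRONT_MATTER = /^---[ \t]*\r?\n(?:[\s\S]*?\r?\n)?---[ \t]*(?:\r?\n|$)/;

function stripFrontMatter(markdown: string): string {
  // The pattern is anchored at the start, so at most one block is removed.
  return markdown.replace(FRONT_MATTER, "");
}

const sample = "---\ntitle: Intro\n---\n# Getting Started\n";
console.log(JSON.stringify(stripFrontMatter(sample)));
```

Documents without a leading --- line pass through unchanged.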
Complete Example
# Initialize anydocs
anydocs init
# Configure projects
cat > ~/.config/anydocs/anydocs.json << 'EOF'
{
"projects": [
{ "repo": "vercel/next.js" },
{ "repo": "facebook/react" }
]
}
EOF
# Install all projects
anydocs install
# Search across all projects
anydocs search "hooks" -n 5
# Search specific project
anydocs search "routing" --project next.js
# Get specific document
anydocs docs /docs/app/routing.md --project next.js
# Re-install (updates existing entries)
anydocs install
Architecture
- Config: ~/.config/anydocs/anydocs.json (user-editable project list)
- Lockfile: ~/.local/share/anydocs/anydocs-lock.yaml (auto-generated)
- Repositories: ~/.local/share/anydocs/repos/host/owner/repo
- Symlinks: ~/.local/share/anydocs/docs/project-name
- Database: ~/.local/share/anydocs/db/default.db (SQLite FTS5)
- Schema: pages(path UNINDEXED, project UNINDEXED, title, body) USING fts5(tokenize='porter')
- Output:
  - docs: Raw Markdown to stdout
  - search: JSON array with {path, project, title, snippet, score}
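Because search prints a JSON array, its output is easy to consume from a script. The following is a minimal sketch under a few assumptions: the SearchResult interface mirrors the keys listed above, the helper names are hypothetical, and score is assumed to be the raw SQLite bm25() value, where numerically lower means a better match.

```typescript
interface SearchResult {
  path: string;
  project: string;
  title: string;
  snippet: string;
  score: number;
}

// Parse the stdout of `anydocs search`; an empty array means no results.
function parseSearchOutput(stdout: string): SearchResult[] {
  const data: unknown = JSON.parse(stdout);
  if (!Array.isArray(data)) throw new Error("expected a JSON array");
  return data as SearchResult[];
}

// Remove the <b>...</b> highlight tags from a snippet.
function plainSnippet(snippet: string): string {
  return snippet.replace(/<\/?b>/g, "");
}

// Assumption: lower (more negative) bm25 scores rank first.
function sortByRelevance(results: SearchResult[]): SearchResult[] {
  return results.slice().sort((x, y) => x.score - y.score);
}
```

sortByRelevance copies the array before sorting, so the parsed results keep their original order.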
Search Features
- Porter stemming: run matches running
- FTS5 syntax: AND/OR/NOT, phrases, NEAR, prefix search with *
- Highlighting: Search snippets with <b>...</b> tags
- BM25 scoring: Relevance-ranked results
TODO
- [x] Implement index <root> [pattern] command to recursively index Markdown files
- [x] Parse and extract front-matter
- [x] Extract first heading as title
- [x] Normalize paths (relative from root, starting with /)
- [x] Make indexing idempotent (replace existing paths)
- [x] Add transactional batch indexing
- [x] Add init command for directory setup
- [x] Add install command with anydocs.json config
- [x] Support ghq-style repository paths (owner/repo or host/owner/repo)
- [x] Auto-detect repository default branch
- [x] Generate lockfile (anydocs-lock.yaml)
- [ ] Support differential re-indexing with mtime tracking
- [ ] Implement export-llms command for llms.txt generation
- [ ] Add CLI package installation (npm/pnpm global install)
- [ ] Add progress indicator for large indexing jobs
Specification
- Configuration: ~/.config/anydocs/anydocs.json defines projects with minimal required fields
- Repository format: Supports owner/repo (implies GitHub) or host/owner/repo
- Default values: Only repo required; name, ref, path have smart defaults
- Lockfile: Auto-generated anydocs-lock.yaml tracks cloned refs and timestamps
- Storage: Repositories at $XDG_DATA_HOME/anydocs/repos/host/owner/repo
- Database: Single FTS5 database at $XDG_DATA_HOME/anydocs/db/default.db
- Schema: pages(path UNINDEXED, project UNINDEXED, title, body) USING fts5(tokenize='porter')
- Commands:
  - init: Create directory structure and empty config
  - install [--project name]: Clone repos and index docs
  - search <query> [-n N] [--project name]: Full-text search
  - docs <path> [--project name]: Retrieve document
- Indexing: Strip front-matter, extract first # ... as title
- Idempotent: Re-running install updates existing installations
- Path normalization: Relative from repo root, /-separated, starts with /
- Search: BM25 ranking, Porter stemming, FTS5 syntax (AND/OR/NOT/NEAR/*)
- Output: JSON with fixed key order, snippets with <b>...</b> highlighting
- Error handling: Exit non-zero on errors, empty array for no results
- Runtime: Node.js with better-sqlite3, TypeScript, works with pnpm
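The indexing rules in this list (path normalization and title extraction) can be sketched as two small pure functions. Both names, normalizePath and extractTitle, are hypothetical, and the sketch makes simplifying assumptions: paths are compared as plain strings, and a # line inside a fenced code block would also be picked up as a title.

```typescript
// Turn an absolute file path into a repo-relative, /-separated path starting with "/".
function normalizePath(root: string, file: string): string {
  const toSlash = (p: string): string => p.replace(/\\/g, "/");
  const r = toSlash(root).replace(/\/+$/, ""); // drop trailing slashes
  const f = toSlash(file);
  if (f.slice(0, r.length + 1) !== r + "/") {
    throw new Error(`"${file}" is not inside "${root}"`);
  }
  return "/" + f.slice(r.length).replace(/^\/+/, "");
}

// Return the text of the first level-1 heading ("# ..."), or null if none exists.
function extractTitle(body: string): string | null {
  for (const line of body.split(/\r?\n/)) {
    // "#" followed by whitespace; "##" and deeper headings do not match.
    if (/^#[ \t]+\S/.test(line)) {
      return line.replace(/^#[ \t]+/, "").trim();
    }
  }
  return null;
}
```

With these helpers, a file at /repos/next.js/docs/app/routing.md under the root /repos/next.js is indexed as /docs/app/routing.md.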
Development
# Install dependencies
pnpm install
# Build TypeScript to JavaScript
pnpm run build
# Run tests
pnpm test:run # Run all tests once
pnpm test # Run tests in watch mode
pnpm test:ui # Open Vitest UI
# Linting and formatting
pnpm run lint # Check code
pnpm run lint:fix # Auto-fix issues
pnpm run format # Format code
# Development with tsx (no build needed)
pnpm run dev init
pnpm run dev install
pnpm run dev search "query"
# Run built CLI
node dist/index.js init
node dist/index.js search "query"
Database Schema
CREATE VIRTUAL TABLE pages USING fts5(
path UNINDEXED,
project UNINDEXED,
title,
body,
tokenize='porter'
);
FTS5 creates auxiliary tables: pages_content, pages_data, pages_idx, pages_docsize, pages_config.
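Queries against this table are written in FTS5 query syntax, which is the same syntax the search command accepts. As a rough sketch, the helpers below build such query strings safely. All function names are hypothetical and not part of anydocs; the escaping rule (doubling embedded double quotes inside a quoted phrase) and the NEAR(..., N) form follow the SQLite FTS5 documentation.

```typescript
// Quote text as an FTS5 phrase; embedded double quotes are doubled.
function phrase(text: string): string {
  return `"${text.replace(/"/g, '""')}"`;
}

// Prefix query on a bare token, e.g. run* also matches "running".
function prefix(term: string): string {
  if (!/^[A-Za-z0-9_]+$/.test(term)) {
    throw new Error(`prefix() expects a bare token, got "${term}"`);
  }
  return `${term}*`;
}

// NEAR group: all phrases must occur within `distance` tokens of each other.
function near(terms: string[], distance = 10): string {
  return `NEAR(${terms.map(phrase).join(" ")}, ${distance})`;
}

function allOf(...parts: string[]): string {
  return parts.join(" AND ");
}

function anyOf(...parts: string[]): string {
  return parts.join(" OR ");
}

// Builds: "app router" AND rout*
console.log(allOf(phrase("app router"), prefix("rout")));
```

The resulting string can be passed as the single query argument to anydocs search.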
