@vpxa/kb

v0.1.15

Local-first AI developer toolkit — knowledge base, code analysis, context management, and developer tools for LLM agents

Features

  • 64 MCP tools for AI agents to search, analyze, and manipulate codebases
  • 47 CLI commands for shell-based interaction
  • Local-first — ONNX embeddings, LanceDB vector store, no cloud dependencies
  • Three interfaces: MCP (for agent IDEs), CLI (for terminal agents), and programmatic API

What This Is

This is an MCP (Model Context Protocol) server that gives AI agents:

  1. Hybrid search — combines semantic vector search with full-text keyword search (BM25) using Reciprocal Rank Fusion (RRF) for best results. Supports hybrid (default), semantic, and keyword modes.
  2. Persistent curated memory — agents can remember, update, read, list, and forget knowledge entries that survive across sessions
  3. Codebase analysis — structural analysis, dependency graphs with confidence scoring, symbol extraction, pattern detection, and Mermaid diagram generation. Analysis results are auto-persisted into the vector store for future search.
  4. Knowledge production — automated analysis pipelines that produce structured baselines for the agent to synthesize and store
  5. Next-step workflow hints — every tool response includes contextual _Next: suggestions guiding the agent to logical follow-up actions
  6. Tree-sitter code chunking — AST-based chunking for TS/JS/Python/Go/Rust/Java preserves function/class boundaries for higher-quality code search
  7. Knowledge graph — auto-populated SQLite graph of modules, symbols, and import relationships with traversal, neighbor queries, and change impact tracking
  8. Graph auto-population — during indexing, a lightweight regex extractor builds the knowledge graph automatically from TS/JS source files (functions, classes, interfaces, types, imports)
  9. Optimized reindex — full reindex skips redundant hash checks, batches graph writes (flush every 50 files), and skips per-file graph deletes when bulk-cleared

The KB auto-indexes configured source directories on startup, stores embeddings in a local LanceDB vector store, and exposes everything through 64 MCP tools, 45 CLI commands, and 2 resources.


Quick Start

# Install
pnpm add -D @vpxa/kb

# Initialize in your project
npx @vpxa/kb init

# After upgrading, overwrite all scaffold/skill files
npx @vpxa/kb init --force

# Check which files are outdated (JSON report for LLM consumption)
npx @vpxa/kb init --guide

# Index your codebase
npx @vpxa/kb reindex

# Search
npx @vpxa/kb search "authentication middleware"

# Start MCP server for AI agents
npx @vpxa/kb serve

Note: Once @vpxa/kb is installed locally, you can use the short kb command (e.g. kb search, kb serve) since the local binary takes precedence.

Tools by Category

Search & Discovery

| Tool | CLI | Description |
|------|-----|-------------|
| kb_search | kb search | Hybrid vector + keyword search |
| kb_find | kb find | Federated search (vector, FTS, glob, regex). Use mode: 'examples' to find usage examples. |
| kb_symbol | kb symbol | Resolve symbol definition, imports, references |
| kb_lookup | kb lookup | Look up indexed chunks by file path |
| kb_trace | kb trace | Forward/backward flow tracing |
| kb_scope_map | kb scope-map | Generate task-scoped reading plan |
| kb_dead_symbols | kb dead-symbols | Find unused exported symbols (source vs docs) |
| kb_file_summary | kb summarize | Structural file overview |

Code Analysis

| Tool | CLI | Description |
|------|-----|-------------|
| kb_analyze_structure | kb analyze structure | Project structure analysis |
| kb_analyze_dependencies | kb analyze deps | Dependency graph |
| kb_analyze_symbols | kb analyze symbols | Symbol extraction |
| kb_analyze_patterns | kb analyze patterns | Code pattern detection |
| kb_analyze_entry_points | kb analyze entry-points | Entry point discovery: handlers, CDK constructs, test suites, package exports (walks monorepo workspaces) |
| kb_analyze_diagram | kb analyze diagram | Mermaid diagram generation |
| kb_blast_radius | kb analyze blast-radius | Change impact analysis |

Context Management

| Tool | CLI | Description |
|------|-----|-------------|
| kb_compact | kb compact | Compress text/file to relevant sections (accepts path) |
| kb_workset | kb workset | Named file set management |
| kb_stash | kb stash | Named key-value store |
| kb_checkpoint | kb checkpoint | Session checkpoint save/restore |
| kb_parse_output | kb parse-output | Parse build tool output (tsc, vitest, biome) |

FORGE & Context Compression

| Tool | CLI | Description |
|------|-----|-------------|
| kb_forge_ground | — | Complete FORGE Ground phase in one call (chains classify→scope→summarize→constraints→evidence) |
| kb_forge_classify | — | Classify FORGE tier (Floor/Standard/Critical) from files and task |
| kb_evidence_map | — | Track critical-path claims as V/A/U with receipts + deterministic Gate |
| kb_digest | — | Compress multiple text sources into token-budgeted digest |
| kb_stratum_card | — | Generate STRATUM T1/T2 context cards (10-100x token reduction) |

Code Manipulation

| Tool | CLI | Description |
|------|-----|-------------|
| kb_rename | kb rename | Smart symbol rename across files |
| kb_codemod | kb codemod | Regex-based code transformations |
| kb_diff_parse | kb diff | Parse unified diff into structured changes |
| kb_data_transform | kb transform | JQ-like JSON transformations |

Execution & Validation

| Tool | CLI | Description |
|------|-----|-------------|
| kb_eval | kb eval | Sandboxed JavaScript/TypeScript execution |
| kb_check | kb check | Incremental typecheck + lint (detail: summary/errors/full) |
| kb_test_run | kb test | Run tests with structured results |
| kb_batch | kb batch | Parallel operation execution |
| kb_audit | kb audit | Unified project audit: runs structure, deps, patterns, health, dead symbols, check, entry points in one call. Returns score, recommendations, and next steps. |

Knowledge Management

| Tool | CLI | Description |
|------|-----|-------------|
| kb_remember | kb remember | Store curated knowledge |
| kb_update | kb update | Update existing entries |
| kb_forget | kb forget | Remove entries |
| kb_read | kb read | Read entry content |
| kb_list | kb list | List entries |
| kb_produce_knowledge | — | Auto-generate knowledge from analysis |

Git & Environment

| Tool | CLI | Description |
|------|-----|-------------|
| kb_git_context | kb git | Branch, status, recent commits |
| kb_process | kb proc | Process supervisor |
| kb_watch | kb watch | Filesystem watcher |
| kb_delegate | kb delegate | Delegate subtask to local Ollama model |

Web & Network

| Tool | CLI | Description |
|------|-----|-------------|
| kb_web_fetch | — | Fetch web page → markdown/raw/links/outline |
| kb_web_search | — | Search the web via DuckDuckGo (no API key) |
| kb_http | — | Make HTTP requests for API testing/debugging |

Developer Utilities

| Tool | CLI | Description |
|------|-----|-------------|
| kb_regex_test | — | Test regex patterns (match/replace/split modes) |
| kb_encode | — | Base64, URL, SHA-256, MD5, hex, JWT decode |
| kb_measure | — | Code complexity and line-count metrics |
| kb_changelog | — | Generate changelog from git history |
| kb_schema_validate | — | Validate JSON data against JSON Schema |
| kb_snippet | — | Persistent code snippet/template storage |
| kb_env | — | System and runtime environment info |
| kb_time | — | Date parsing, timezone conversion, duration math |

Verified Lanes

| Tool | CLI | Description |
|------|-----|-------------|
| kb_lane (create) | kb lane create | Create isolated file copy for parallel exploration |
| kb_lane (list) | kb lane list | List active lanes |
| kb_lane (status) | kb lane status | Show modified/added/deleted files in a lane |
| kb_lane (diff) | kb lane diff | Generate diff between lane and originals |
| kb_lane (merge) | kb lane merge | Merge lane files back to originals |
| kb_lane (discard) | kb lane discard | Discard a lane entirely |

System

| Tool | CLI | Description |
|------|-----|-------------|
| kb_status | kb status | Index statistics + tree-sitter availability |
| kb_reindex | kb reindex | Rebuild index |
| kb_health | kb health | Project health checks (package.json, tsconfig, lockfile, circular deps) |
| kb_guide | kb guide | Tool discovery — given a goal, recommends tools and workflow order |
| kb_queue | kb queue | Task queue for sequential agent operations |
| kb_graph | kb graph | Query and manage the knowledge graph (8 actions: find_nodes, find_edges, neighbors, traverse, stats, add, delete, clear) |
| kb_onboard | kb onboard | First-time codebase onboarding — runs all analysis tools in one command, auto-persists results |
| kb_replay | kb replay | View or clear the audit trail of tool invocations (action: list/clear) |

TUI Dashboard

kb tui          # Launch interactive Ink terminal dashboard

The TUI is a human monitoring dashboard with 4 panels: Status, Search, Curated, and Activity Log. It shows real-time tool activity via the replay audit trail, letting you observe what an AI agent is doing with the KB.

MCP Integration

After kb init, your .vscode/mcp.json is configured automatically:

{
  "servers": {
    "knowledge-base": {
      "type": "stdio",
      "command": "npx",
      "args": ["@vpxa/kb", "serve"]
    }
  }
}

CLI Usage

kb <command> [options]

# Search & Discovery
kb search <query> [--limit N] [--mode hybrid|semantic|keyword]
kb find [query] [--glob pattern] [--pattern regex] [--limit N]
kb symbol <name>
kb scope-map <task> [--max-files N]
kb trace <symbol> [--direction forward|backward|both] [--depth N]
kb examples <query> [--limit N]
kb summarize <file>
kb dead-symbols [--limit N]

# Analysis
kb analyze <type> <path>
kb lookup <path>

# Context
kb compact <query> [--path <file>] [--max-chars N]
kb workset <action> [name] [--files f1,f2]
kb stash <action> [key] [value]
kb checkpoint <action> [label]
kb parse-output [--tool tsc|vitest|biome|git-status]

# Code Manipulation
kb rename <old> <new> <path> [--dry-run]
kb codemod <path> --rules <file.json> [--dry-run]
kb diff
kb transform <expression>

# Execution
kb eval <code> [--lang js|ts] [--timeout N]
kb test [files...] [--grep pattern]
kb check [--skip-types] [--skip-lint] [--detail summary|errors|full]
kb batch

# Knowledge
kb remember <title> --category <cat> [--tags t1,t2]
kb forget <path> --reason <reason>
kb read <path>
kb list [--category cat] [--tag tag]
kb update <path> --reason <reason>

# Git & Environment
kb git [--commits N] [--diff]
kb proc <action> [id] [--command cmd]
kb watch <action> [--path dir]

# Verified Lanes
kb lane create <name> --files file1.ts,file2.ts
kb lane list
kb lane status <name>
kb lane diff <name>
kb lane merge <name>
kb lane discard <name>

# Graph
kb graph stats
kb graph find-nodes [--type module|function|class] [--query pattern]
kb graph find-edges [--type imports|defines] [--source path]
kb graph neighbors <nodeId> [--direction in|out|both]
kb graph traverse <startId> [--direction forward|backward] [--depth N]

# System
kb status
kb reindex [--full]
kb onboard <path> [--generate] [--out-dir <dir>]
kb serve [--transport stdio|http] [--port N]
kb init [--force] [--guide]

Configuration

kb.config.json:

{
  "sources": [{ "path": ".", "name": "project" }],
  "embedding": {
    "model": "Xenova/mxbai-embed-large-v1",
    "dimensions": 1024
  },
  "store": {
    "backend": "lancedb",
    "path": ".kb-data/store"
  },
  "curated": {
    "path": "curated"
  }
}

Architecture

@vpxa/kb
├── packages/
│   ├── core/         — types, config, logger, constants
│   ├── store/        — LanceDB vector store
│   ├── embeddings/   — ONNX local embeddings
│   ├── chunker/      — tree-sitter + regex code chunking
│   ├── indexer/      — incremental file indexer
│   ├── analyzers/    — blast-radius, deps, symbols, patterns, diagrams
│   ├── tools/        — 52 tool modules (64 MCP tools)
│   ├── server/       — MCP protocol server
│   └── cli/          — command-line interface
├── bin/kb.mjs        — CLI entry point
└── skills/           — LLM skill files

License

MIT

Development

pnpm install       # Install dependencies
pnpm build         # Build all packages
pnpm test          # Run tests (Vitest)
pnpm lint          # Lint (Biome)

Detailed MCP Tools Reference

Search & Retrieval

kb_search — Hybrid search across the knowledge base

Find relevant code, docs, patterns, and curated knowledge using hybrid (vector + keyword), semantic, or keyword-only search.

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| query | string | yes | — | Natural language search query |
| limit | number (1–20) | no | 5 | Maximum results to return |
| search_mode | enum | no | hybrid | Search strategy: hybrid (vector + FTS + RRF fusion), semantic (vector only), keyword (FTS only) |
| content_type | enum | no | — | Filter: markdown, code-typescript, code-javascript, code-python, config-json, config-yaml, config-toml, config-dotenv, infrastructure, documentation, test, script, curated-knowledge, produced-knowledge, other |
| origin | enum | no | — | Filter: indexed (from files), curated (agent memory), produced (auto-generated) |
| category | string | no | — | Filter by curated category (e.g., decisions, patterns) |
| tags | string[] | no | — | Filter by tags (OR matching) |
| min_score | number (0–1) | no | 0.25 | Minimum similarity score threshold |

Returns: Ranked results with score, source path, content type, line range, heading path, origin, tags, and full content text. Each response includes a _Next: hint suggesting logical follow-up tools.

Search modes:

  • hybrid (default) — Runs vector similarity and full-text keyword search in parallel, then merges rankings using Reciprocal Rank Fusion (k=60). Best for most queries.
  • semantic — Pure vector cosine similarity. Best when searching by meaning/concept rather than exact terms.
  • keyword — Full-text search using LanceDB's built-in FTS index. Best when searching for exact identifiers, function names, or specific strings.
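As a rough illustration of the hybrid mode, here is a minimal Reciprocal Rank Fusion sketch with k=60. The function name `fuseRankings` and the plain string IDs are assumptions made for this example; the package's actual implementation and result shape may differ.

```typescript
// Reciprocal Rank Fusion: each list contributes 1 / (k + rank) to a document's
// score, where rank starts at 1. Names and shapes here are illustrative only.
function fuseRankings(
  rankings: string[][],
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical result lists from the vector and keyword searches.
const vectorHits = ["auth.ts", "session.ts", "router.ts"];
const keywordHits = ["session.ts", "middleware.ts", "auth.ts"];
const fused = fuseRankings([vectorHits, keywordHits]);
```

Documents that rank well in both lists rise to the top, while a document found by only one strategy still appears with a lower score.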

Best practices for query:

  • Use natural language describing what you're looking for: "how does the notification dispatcher route messages"
  • Include domain terms that would appear in the code: "DynamoDB single-table GSI pattern for notifications"
  • Use search_mode: "keyword" for exact function/class names: "reciprocalRankFusion"
  • Use search_mode: "semantic" for conceptual queries: "retry strategy with exponential backoff"
  • For curated knowledge, combine with origin: "curated": query: "architecture decision", origin: "curated"

kb_lookup — Get all chunks for a specific file

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| path | string | yes | Relative file path (e.g., src/index.ts) |

Returns: All chunks for that file, sorted by position, with line ranges and content.

kb_status — View index statistics

No parameters. Returns total records, total files, content type breakdown, last indexed timestamp, and list of indexed files.

Curated Knowledge (Persistent Memory)

These tools give the agent persistent, version-tracked memory that survives across sessions. Knowledge is stored as markdown files with YAML frontmatter in the curated/ directory and simultaneously indexed into the vector store for semantic search.

kb_remember — Store new knowledge

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| title | string (3–120 chars) | yes | Short descriptive title |
| content | string (≥10 chars) | yes | Markdown content to store |
| category | string (kebab-case) | yes | Category slug: decisions, patterns, conventions, troubleshooting, or any custom kebab-case name |
| tags | string[] | no | Tags for filtering |

What to remember: Architecture decisions, coding conventions, recurring patterns, API contracts, deployment procedures, debugging solutions, team agreements, review findings.

kb_update — Update existing knowledge

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| path | string | yes | Path from kb_list (e.g., decisions/use-lancedb.md) |
| content | string (≥10 chars) | yes | New markdown content (replaces existing) |
| reason | string (≥3 chars) | yes | Why this update is being made (recorded in changelog) |

Increments version number and appends to the entry's changelog.
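The update behavior can be pictured with a small sketch. The `CuratedEntry` and `ChangelogItem` shapes below are assumptions for illustration (the README only names title, version, tags, created, and updated as metadata); the length checks mirror the parameter table above.

```typescript
// Hypothetical in-memory shape of a curated entry; the on-disk frontmatter
// layout used by the package is not documented here.
interface ChangelogItem {
  version: number;
  date: string;
  reason: string;
}

interface CuratedEntry {
  title: string;
  tags: string[];
  version: number;
  created: string;
  updated: string;
  changelog: ChangelogItem[];
  content: string;
}

// Replaces the content, bumps the version, and appends a changelog item.
function applyUpdate(
  entry: CuratedEntry,
  content: string,
  reason: string,
  now: string,
): CuratedEntry {
  if (content.length < 10) throw new Error("content must be at least 10 characters");
  if (reason.length < 3) throw new Error("reason must be at least 3 characters");
  const version = entry.version + 1;
  return {
    ...entry,
    content,
    version,
    updated: now,
    changelog: [...entry.changelog, { version, date: now, reason }],
  };
}
```

Returning a new object instead of mutating the entry keeps the previous version available to the caller until the write succeeds.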

kb_read — Read a curated entry

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| path | string | yes | Path from kb_list |

Returns: Full metadata (title, version, tags, created, updated) and content.

kb_list — List curated entries

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| category | string | no | Filter by category |
| tag | string | no | Filter by tag |

Returns: All entries with title, version, tags, path, and 80-char content preview.

kb_forget — Remove a curated entry

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| path | string | yes | Path from kb_list |
| reason | string (≥3 chars) | yes | Why this entry is being removed |

Deletes from disk and vector store.

Indexing

kb_reindex — Trigger re-indexing

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| full | boolean | no | false | If true, force full re-index ignoring file hashes |

Incremental mode (default) only re-indexes files whose content hash has changed. Full mode drops the table and rebuilds from scratch.
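The difference between the two modes can be sketched as a hash comparison. Everything below is an assumption for illustration: the `FileSnapshot` shape, the `selectFilesToReindex` name, and the FNV-1a-style hash, which stands in for whatever content hash the indexer actually uses.

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units. A stand-in only; the
// package's real hashing algorithm is not specified in this README.
function hashContent(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

interface FileSnapshot {
  path: string;
  content: string;
}

// Returns paths that are new or whose hash changed, and records the latest
// hashes. With full = true every file is selected regardless of its hash.
function selectFilesToReindex(
  files: FileSnapshot[],
  storedHashes: Map<string, string>,
  full = false,
): string[] {
  const selected: string[] = [];
  for (const file of files) {
    const hash = hashContent(file.content);
    if (full || storedHashes.get(file.path) !== hash) {
      selected.push(file.path);
    }
    storedHashes.set(file.path, hash);
  }
  return selected;
}
```

On a second run with unchanged files the incremental call returns an empty list, which is what makes the default mode cheap.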

Codebase Analysis

kb_produce_knowledge — Automated analysis + synthesis instructions

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| scope | string | no | . (root) | Root path to analyze |
| aspects | enum[] | no | ["all"] | all, structure, dependencies, symbols, patterns, entry-points, diagrams |

Runs deterministic analyzers, then returns structured instructions for the agent to synthesize and store findings using kb_remember.

kb_analyze_structure — File/directory tree with language stats

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| path | string | yes | — | Root path to analyze |
| max_depth | number (1–10) | no | 6 | Maximum directory depth |
| format | enum | no | markdown | json or markdown |

kb_analyze_dependencies — Import/require dependency graph (with confidence)

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| path | string | yes | — | Root path |
| format | enum | no | markdown | json, markdown, or mermaid |

Dependency results include a confidence level per import: high (ES static imports), medium (dynamic imports, require()), low (inferred). The markdown format shows a Confidence column in dependency tables.
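To make the three confidence levels concrete, here is a line-based classifier sketch. The regular expressions and the `classifyImport` name are assumptions, and treating every other line as "inferred" is a simplification; the real analyzer may well work on a parsed AST.

```typescript
type Confidence = "high" | "medium" | "low";

// Classifies a single source line by how the dependency is expressed.
function classifyImport(line: string): Confidence {
  const trimmed = line.trim();
  // Static ES forms: `import x from "y"`, `import "y"`, `export ... from "y"`.
  if (/^import\s+[\w{*"']/.test(trimmed) || /^export\s.+\sfrom\s+["']/.test(trimmed)) {
    return "high";
  }
  // Dynamic import() and CommonJS require() calls.
  if (/\bimport\s*\(/.test(trimmed) || /\brequire\s*\(/.test(trimmed)) {
    return "medium";
  }
  // Anything else is treated as an inferred dependency in this sketch.
  return "low";
}
```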

kb_analyze_symbols — Exported & local symbols

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| path | string | yes | — | Root path |
| filter | string | no | — | Filter symbols by name substring |
| format | enum | no | markdown | json or markdown |

kb_analyze_patterns — Detect architectural patterns & frameworks

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| path | string | yes | Root path |

kb_analyze_entry_points — Find Lambda handlers, CDK constructs, test suites, package exports

Walks monorepo workspace packages (pnpm-workspace.yaml / package.json#workspaces), parses exports fields, detects CDK constructs and test suites.

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| path | string | yes | Root path |

kb_analyze_diagram — Generate Mermaid architecture/dependency diagrams

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| path | string | yes | — | Root path |
| diagram_type | enum | no | architecture | architecture or dependencies |

Context Management

kb_compact — Compress text to query-relevant sections

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| text | string | no* | — | The text to compress (provide text or path) |
| path | string | no* | — | File path to read server-side (avoids read_file round-trip) |
| query | string | yes | — | Focus query — what are you trying to understand? |
| max_chars | number (100–50000) | no | 3000 | Target output size in characters |
| segmentation | enum | no | paragraph | How to split: paragraph, sentence, line |

* At least one of text or path must be provided. Using path is preferred: it eliminates the read_file → kb_compact two-call chain and prevents token doubling.

kb_workset — Manage named file sets

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| action | enum | yes | — | save, get, list, delete, add, remove |
| name | string | varies | — | Workset name (required for all except list) |
| files | string[] | varies | — | File paths (required for save, add, remove) |
| description | string | no | — | Description (for save) |

Worksets persist across sessions in .kb-state/worksets.json.

kb_stash — Persist named key-value pairs

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| action | enum | yes | — | set, get, list, delete, clear |
| key | string | varies | — | Entry key (for set, get, delete) |
| value | string | varies | — | String or JSON value (for set) |

Stores intermediate results between tool calls in .kb-state/stash.json.

kb_checkpoint — Save/restore session checkpoints

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| action | enum | yes | — | save, load, list, latest |
| label | string | varies | — | Checkpoint label (for save), or checkpoint ID (for load) |
| data | string | varies | — | JSON object string (for save) |
| notes | string | no | — | Optional notes (for save) |

Lightweight checkpoints stored in .kb-state/checkpoints/ for cross-session continuity.

kb_parse_output — Parse build tool output

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| output | string | yes | — | Raw output text from a build tool |
| tool | enum | no | auto-detect | tsc, vitest, biome, git-status |

Returns JSON structured parse result with errors, warnings, and file references. Auto-detects the tool format when not specified.
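For a sense of what such parsing involves, the sketch below handles only the non-pretty tsc line format (`path(line,col): error TS1234: message`). The `Diagnostic` shape and the `parseTscOutput` name are hypothetical; the tool's real JSON result is not specified in this README.

```typescript
type Severity = "error" | "warning";

// Hypothetical result shape for this example.
interface Diagnostic {
  file: string;
  line: number;
  column: number;
  severity: Severity;
  code: string;
  message: string;
}

// Matches lines such as: src/index.ts(12,5): error TS2322: Some message.
const TSC_LINE = /^(.+?)\((\d+),(\d+)\):\s+(error|warning)\s+(TS\d+):\s+(.*)$/;

function parseTscOutput(output: string): Diagnostic[] {
  const diagnostics: Diagnostic[] = [];
  for (const raw of output.split(/\r?\n/)) {
    const match = TSC_LINE.exec(raw.trim());
    if (!match) continue; // skips blank lines and the trailing summary line
    const [, file = "", line = "0", column = "0", severity = "error", code = "", message = ""] = match;
    diagnostics.push({
      file,
      line: Number(line),
      column: Number(column),
      severity: severity === "warning" ? "warning" : "error",
      code,
      message,
    });
  }
  return diagnostics;
}
```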

FORGE & Context Compression

kb_forge_ground — Complete FORGE Ground phase

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| task | string | yes | — | Task description |
| files | string[] | yes | — | Target files being modified (absolute paths) |
| root_path | string | yes | — | Root path of the codebase |
| max_constraints | number (0–10) | no | 3 | Max constraint entries to load from KB |
| force_tier | enum | no | auto | Force tier: floor, standard, critical (skips auto-classification) |
| task_id | string | no | auto-generated | Custom task ID for evidence map |

Chains: tier classification → scope map → file summaries → constraint loading → typed unknown seeds → evidence map creation. Replaces 5-15 manual tool calls. Floor tasks get a minimal shortcut (no scope map / constraints / evidence map).

kb_forge_classify — Classify FORGE tier

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| files | string[] | yes | Files being modified (paths) |
| task | string | yes | Task description |
| root_path | string | yes | Root path of the codebase |

Checks blast radius, cross-package boundaries, schema/contract patterns, and security signals. Returns tier, triggers, typed unknown seeds, packages crossed, and ceremony guidance.

kb_evidence_map — FORGE Evidence Map CRUD + Gate

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| action | enum | yes | — | create, add, update, get, gate, list, delete |
| task_id | string | varies | — | Task identifier (all except list) |
| tier | enum | varies | — | FORGE tier (for create): floor, standard, critical |
| claim | string | varies | — | Critical-path claim text (for add) |
| status | enum | varies | — | V (Verified), A (Assumed), U (Unresolved) |
| receipt | string | no | — | Evidence: tool→ref for V, reasoning for A, attempts for U |
| id | number | varies | — | Entry ID (for update) |
| critical_path | boolean | no | true | Whether claim is on the critical path |
| unknown_type | enum | no | — | contract, convention, freshness, runtime, data-flow, impact |
| retry_count | number | no | 0 | Retry count for gate evaluation |

Gate decision logic: HARD_BLOCK (contract U on critical path) → HOLD (non-contract U, retry 0) → FORCED_DELIVERY (non-contract U, retry ≥ 1) → YIELD. Persists in .kb-state/evidence-maps.json.
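The decision order above can be read as a small function. This is a sketch under assumptions: the `EvidenceEntry` shape and `evaluateGate` name are invented here, and the sketch assumes that only unresolved entries on the critical path take part in the decision, which the README does not state explicitly for the non-contract cases.

```typescript
type Status = "V" | "A" | "U";
type UnknownType = "contract" | "convention" | "freshness" | "runtime" | "data-flow" | "impact";
type GateDecision = "HARD_BLOCK" | "HOLD" | "FORCED_DELIVERY" | "YIELD";

// Hypothetical entry shape, loosely following the parameter table above.
interface EvidenceEntry {
  claim: string;
  status: Status;
  criticalPath: boolean;
  unknownType?: UnknownType;
}

function evaluateGate(entries: EvidenceEntry[], retryCount = 0): GateDecision {
  // Assumption: entries off the critical path never block the gate.
  const unresolved = entries.filter((e) => e.status === "U" && e.criticalPath);
  if (unresolved.some((e) => e.unknownType === "contract")) return "HARD_BLOCK";
  if (unresolved.length > 0) return retryCount === 0 ? "HOLD" : "FORCED_DELIVERY";
  return "YIELD";
}
```

Because the checks run in a fixed order with no randomness, the same evidence map and retry count always produce the same decision.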

kb_digest — Token-budgeted multi-source compression

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| sources | object[] | yes | — | Sources: { id, text, weight } (weight = priority, higher = more budget) |
| query | string | yes | — | Focus query — what matters for the next step? |
| max_chars | number (100–50000) | no | 4000 | Target budget in characters |
| pin_fields | string[] | no | status, files, decisions, blockers, next | Key fields to always extract |
| segmentation | enum | no | paragraph | paragraph, sentence, line |

Jointly ranks across all sources using embedding similarity. Pins structured fields (key:value patterns) and allocates remaining budget proportionally by weight. Returns compressed text + extracted fields + per-source stats.
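The proportional split of the remaining budget could look like the sketch below. It covers only the arithmetic of the allocation step; the `allocateBudget` name, the `pinnedChars` parameter, and rounding down with `Math.floor` are assumptions, and the embedding-based ranking is not modeled.

```typescript
interface WeightedSource {
  id: string;
  weight: number;
}

// Splits what is left after pinned fields across sources in proportion to weight.
function allocateBudget(
  sources: WeightedSource[],
  maxChars: number,
  pinnedChars = 0,
): Map<string, number> {
  const remaining = Math.max(0, maxChars - pinnedChars);
  const totalWeight = sources.reduce((sum, source) => sum + source.weight, 0);
  const budgets = new Map<string, number>();
  for (const source of sources) {
    const share = totalWeight > 0 ? Math.floor((remaining * source.weight) / totalWeight) : 0;
    budgets.set(source.id, share);
  }
  return budgets;
}

// 4000 chars total, 400 already used by pinned fields, weights 3:1.
const budgets = allocateBudget(
  [{ id: "design-doc", weight: 3 }, { id: "chat-log", weight: 1 }],
  4000,
  400,
);
```

With these inputs the 3600 remaining characters are split 2700 to 900.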

kb_stratum_card — STRATUM context cards

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| files | string[] | yes | — | Absolute file paths to generate cards for |
| query | string | yes | — | Current task query — guides relevance scoring |
| tier | enum | no | T1 | T1 = structural only (~100 tok/file), T2 = T1 + compressed content (~300 tok/file) |
| max_content_chars | number (100–5000) | no | 800 | For T2: max chars for compressed content section |

T1 cards contain: ROLE, DEPS, EXPORTS, UNKNOWNS, RISK. T2 adds a CONTEXT section with query-compressed content. Uses kb_file_summary + embedder similarity. No generative LLM needed.

Search & Discovery

kb_find — Federated multi-strategy search

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| query | string | no | — | Semantic/keyword search query |
| glob | string | no | — | File glob pattern |
| pattern | string | no | — | Regex pattern to match in content |
| limit | number (1–50) | no | 10 | Max results |
| content_type | string | no | — | Filter by content type |

Combines vector similarity, keyword (FTS), file glob, and regex strategies. Deduplicates and returns unified results with source, path, line range, score %, and preview.

kb_symbol — Resolve a symbol across the codebase

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| name | string | yes | — | Symbol name (function, class, type, etc.) |
| limit | number (1–50) | no | 20 | Max results per category |

Finds where a symbol is defined, who imports it, and where it is referenced. Works on TypeScript and JavaScript.

kb_scope_map — Generate a task-scoped reading plan

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| task | string | yes | — | Description of the task to scope |
| max_files | number (1–50) | no | 15 | Maximum files to include |
| content_type | string | no | — | Filter by content type |

Returns a ranked file list with estimated token counts, relevance %, focus line ranges, and a suggested reading order.

kb_trace — Trace data flow through imports/references

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| start | string | yes | — | Starting point — symbol name or file:line reference |
| direction | enum | yes | — | forward, backward, both |
| max_depth | number (1–10) | no | 3 | Maximum trace depth |

Follows imports, call sites, and references to build a relationship graph from a starting symbol or file location.
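A depth-limited traversal of this kind can be sketched as a breadth-first search over directed edges. The edge list, the `traceGraph` name, and the returned depth map are assumptions for illustration; the real tool resolves symbols and call sites rather than taking edges as input.

```typescript
type Direction = "forward" | "backward" | "both";
type Edge = [from: string, to: string];

// Breadth-first walk that records the depth at which each node is first reached.
function traceGraph(
  edges: Edge[],
  start: string,
  direction: Direction,
  maxDepth = 3,
): Map<string, number> {
  const neighborsOf = (node: string): string[] => {
    const result: string[] = [];
    for (const [from, to] of edges) {
      if (direction !== "backward" && from === node) result.push(to);
      if (direction !== "forward" && to === node) result.push(from);
    }
    return result;
  };

  const depths = new Map<string, number>([[start, 0]]);
  let frontier = [start];
  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const neighbor of neighborsOf(node)) {
        if (!depths.has(neighbor)) {
          depths.set(neighbor, depth);
          next.push(neighbor);
        }
      }
    }
    frontier = next;
  }
  return depths;
}
```

The visited check on `depths` also keeps the walk from looping forever on circular imports.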

kb_dead_symbols — Find unused exported symbols

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| limit | number (1–500) | no | 100 | Maximum exported symbols to scan |

Finds exported symbols that are never imported or re-exported anywhere in the project.

kb_file_summary — Structural summary of a source file

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| path | string | yes | Absolute path to the file |

Returns imports, exports, functions, classes, interfaces, and types found in the file.

Code Manipulation

kb_rename — Rename a symbol across files

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| old_name | string | yes | — | Existing symbol name |
| new_name | string | yes | — | New symbol name |
| root_path | string | yes | — | Root directory to search within |
| extensions | string[] | no | — | File extensions to include (e.g., .ts, .tsx) |
| dry_run | boolean | no | true | Preview changes without writing files |

Uses whole-word regex matching for exports, imports, and general usage references. Defaults to dry run.
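Whole-word matching is what keeps a rename of `getUser` from touching `getUserById`. The sketch below shows the idea on a single string; `renameInSource` and `escapeRegExp` are illustrative names, and the function never writes files, so it behaves like a dry run.

```typescript
// Escapes characters that have special meaning in a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replaces whole-word occurrences of oldName and reports how many were found.
function renameInSource(
  source: string,
  oldName: string,
  newName: string,
): { output: string; count: number } {
  const pattern = new RegExp(`\\b${escapeRegExp(oldName)}\\b`, "g");
  let count = 0;
  // A replacer function avoids `$` sequences in newName being interpreted.
  const output = source.replace(pattern, () => {
    count++;
    return newName;
  });
  return { output, count };
}
```

Note that `\b` treats `$` as a non-word character, so identifiers containing `$` would need a stricter boundary check than this sketch provides.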

kb_codemod — Apply regex-based codemod rules

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| root_path | string | yes | — | Root directory to transform within |
| rules | object[] (min 1) | yes | — | Codemod rules: description, pattern (regex), replacement (with capture groups) |
| dry_run | boolean | no | true | Preview changes without writing files |

Returns structured before/after changes for each affected line. Defaults to dry run.
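
To make the rule shape concrete, here is a minimal sketch of a dry-run pass over a single source string. The rule fields (description, pattern, replacement) come from the table above; the `LineChange` shape and the `previewCodemod` function are assumptions made for illustration, not the tool's real output format.

```typescript
// Rule fields follow the parameter table; everything else is a hypothetical sketch.
interface CodemodRule {
  description: string;
  pattern: string; // regex source, without delimiters
  replacement: string; // may reference capture groups as $1, $2, ...
}

interface LineChange {
  line: number; // 1-based line number
  rule: string;
  before: string;
  after: string;
}

// Apply each rule line by line and collect before/after pairs without writing anything.
function previewCodemod(source: string, rules: CodemodRule[]): LineChange[] {
  const changes: LineChange[] = [];
  source.split("\n").forEach((text, index) => {
    let current = text;
    for (const rule of rules) {
      const next = current.replace(new RegExp(rule.pattern, "g"), rule.replacement);
      if (next !== current) {
        changes.push({ line: index + 1, rule: rule.description, before: current, after: next });
        current = next;
      }
    }
  });
  return changes;
}

const preview = previewCodemod('const a = require("fs");\nconsole.log(a);', [
  {
    description: "require to import",
    pattern: 'const (\\w+) = require\\("([^"]+)"\\);',
    replacement: 'import $1 from "$2";',
  },
]);
// preview holds one change for line 1, whose `after` is: import a from "fs";
```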

kb_diff_parse — Parse unified diff text

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| diff | string | yes | Raw unified diff text |

Parses into file-level and hunk-level structural changes.

kb_data_transform — jq-like JSON transforms

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| input | string | yes | Input JSON string |
| expression | string | yes | Transform expression (filtering, projection, grouping, path extraction) |

Execution & Validation

kb_eval — Execute code in a sandboxed VM

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| code | string | yes | — | Code snippet to execute |
| lang | enum | no | js | js (direct) or ts (strips type syntax first) |
| timeout | number (1–60000) | no | 5000 | Execution timeout in milliseconds |

Captures console output and return values. Code runs in a constrained VM with a timeout.

kb_check — Run typecheck and lint

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| files | string[] | no | — | Specific files (omit to check all) |
| cwd | string | no | — | Working directory |
| skip_types | boolean | no | false | Skip TypeScript typecheck |
| skip_lint | boolean | no | false | Skip Biome lint |

Runs incremental tsc and biome and returns structured error/warning lists.

kb_test_run — Run Vitest tests

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| files | string[] | no | — | Specific test files or patterns |
| grep | string | no | — | Only run tests matching this pattern |
| cwd | string | no | — | Working directory |

Returns a structured pass/fail summary. isError is set if any tests failed.

kb_batch — Execute multiple operations in parallel

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| operations | object[] (min 1) | yes | — | Operations: id (string), type (search/find/check), args (record) |
| concurrency | number (1–20) | no | 4 | Max concurrent operations |

Returns per-operation outcomes (success/failure with results or errors).

kb_audit — Unified project audit

Runs multiple analysis checks in a single call and returns a synthesized report with a composite score (0–100), per-check summaries, and prioritized recommendations.

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| path | string | no | . | Root path to audit |
| checks | string[] | no | all | Subset of checks: structure, dependencies, patterns, health, dead_symbols, check, entry_points |
| detail | enum | no | summary | summary (markdown overview) or full (structured JSON data) |

Returns KBResponse<AuditData> with next[] hints for follow-up actions.

Git & Environment

kb_git_context — Summarize Git repository state

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| cwd | string | no | — | Repository root |
| commit_count | number (1–50) | no | 5 | Recent commits to include |
| include_diff | boolean | no | false | Include diff stat for working tree |

Returns branch, working tree status, recent commits, and optionally diff stats.

kb_process — Manage child processes

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| action | enum | yes | — | start, stop, status, list, logs |
| id | string | varies | — | Managed process ID |
| command | string | varies | — | Executable (for start) |
| args | string[] | no | — | Arguments (for start) |
| tail | number (1–500) | no | — | Log lines (for logs) |

kb_watch — Filesystem watchers

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| action | enum | yes | — | start, stop, list |
| path | string | varies | — | Directory path (for start) |
| id | string | varies | — | Watcher ID (for stop) |

kb_delegate — Delegate subtask to local Ollama model

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| prompt | string | yes | — | Task or question to send |
| model | string | no | first available | Ollama model name |
| system | string | no | — | System prompt |
| context | string | no | — | Context text (e.g., file contents) |
| temperature | number (0–2) | no | 0.3 | Sampling temperature |
| timeout | number (1000–600000) | no | 120000 | Timeout in ms |
| action | enum | no | generate | generate or list_models |

Fails fast if Ollama is not running. Returns model name, response text, duration, and token count.

Web & Network

kb_web_fetch — Fetch web page for LLM consumption

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| url | string | yes | — | URL to fetch (http/https only) |
| mode | enum | no | markdown | markdown, raw, links, outline |
| selector | string | no | — | CSS selector to extract a specific element |
| max_length | number (500–100000) | no | 15000 | Max output characters |
| include_metadata | boolean | no | true | Include page title/description header |
| include_links | boolean | no | false | Append extracted links |
| include_images | boolean | no | false | Include image alt texts |
| timeout | number (1000–60000) | no | 15000 | Request timeout in ms |

Strips scripts, styles, and boilerplate. Output longer than max_length is truncated at a paragraph boundary rather than mid-sentence.
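
One plausible way to truncate at paragraph boundaries is sketched below. The function name `truncateAtParagraph` and the use of a blank line as the paragraph separator are assumptions; the package's actual truncation logic is not documented here.

```typescript
// Hypothetical sketch: cut text at the last blank line that fits within maxLength.
function truncateAtParagraph(text: string, maxLength: number): string {
  if (text.length <= maxLength) return text;
  const slice = text.slice(0, maxLength);
  const lastBreak = slice.lastIndexOf("\n\n");
  // Fall back to a hard cut when the first paragraph alone exceeds the limit.
  return lastBreak > 0 ? slice.slice(0, lastBreak) : slice;
}

const pageText = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph.";
const clipped = truncateAtParagraph(pageText, 40);
// clipped keeps the first two paragraphs and drops the partial third one.
```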

kb_web_search — Search the web

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| query | string | yes | — | Search query |
| limit | number (1–20) | no | 5 | Max results |
| site | string | no | — | Restrict to domain (e.g., docs.aws.amazon.com) |

Uses DuckDuckGo HTML search — no API key required. Returns title, URL, and snippet for each result.

kb_http — Make HTTP requests

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| url | string | yes | — | Request URL (http/https only) |
| method | enum | no | GET | GET, POST, PUT, PATCH, DELETE, HEAD |
| headers | record | no | — | Request headers |
| body | string | no | — | Request body |
| timeout | number (1000–60000) | no | 15000 | Timeout in ms |

Returns status, headers, formatted body (auto-pretty-prints JSON), timing, and size. Truncates responses over 50K chars.

Developer Utilities

kb_regex_test — Test regex patterns

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| pattern | string | yes | — | Regex pattern (without delimiters) |
| flags | string | no | "" | Regex flags (g, i, m, s, etc.) |
| test_strings | string[] | yes | — | Strings to test against |
| mode | enum | no | match | match, replace, split |
| replacement | string | no | — | Replacement string (for replace mode) |

Returns match details with groups and indices for each test string.

kb_encode — Encoding, decoding, and hashing

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| operation | enum | yes | base64_encode, base64_decode, url_encode, url_decode, sha256, md5, jwt_decode, hex_encode, hex_decode |
| input | string | yes | Input text |

JWT decode shows header and payload without signature verification.

kb_measure — Code complexity metrics

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| path | string | yes | — | File or directory to measure |
| extensions | string[] | no | .ts,.tsx,.js,.jsx | File extensions to include |

Returns per-file metrics (cyclomatic complexity, line counts, function counts, imports/exports) sorted by complexity, plus aggregate summary.

kb_changelog — Generate changelog from git history

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| from | string | yes | — | Start ref (tag, SHA, HEAD~N) |
| to | string | no | HEAD | End ref |
| format | enum | no | grouped | grouped, chronological, per-scope |
| include_breaking | boolean | no | true | Highlight breaking changes |

Parses conventional commit format (type(scope): subject). Groups by feat/fix/refactor/etc.
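
For readers unfamiliar with the format, the following sketch parses a `type(scope): subject` header and groups subjects by type. The function names and the `ParsedCommit` shape are hypothetical; the `!` breaking-change marker comes from the Conventional Commits convention and is assumed, not confirmed, to be what the tool checks.

```typescript
// Hypothetical sketch of conventional-commit parsing; not the package's code.
interface ParsedCommit {
  type: string;
  scope: string | null;
  breaking: boolean;
  subject: string;
}

function parseConventionalCommit(header: string): ParsedCommit | null {
  // type, optional (scope), optional "!" breaking marker, then ": subject"
  const match = /^(\w+)(?:\(([^)]+)\))?(!)?: (.+)$/.exec(header);
  if (!match) return null;
  const [, type = "", scope, bang, subject = ""] = match;
  return { type, scope: scope ?? null, breaking: bang === "!", subject };
}

// Group subjects by commit type, skipping headers that do not follow the format.
function groupByType(headers: string[]): Record<string, string[]> {
  const groups: Record<string, string[]> = {};
  for (const header of headers) {
    const commit = parseConventionalCommit(header);
    if (!commit) continue;
    const bucket = groups[commit.type] ?? [];
    bucket.push(commit.subject);
    groups[commit.type] = bucket;
  }
  return groups;
}

const grouped = groupByType([
  "feat(search): add hybrid mode",
  "fix: handle empty index",
  "docs update", // not a conventional commit, so it is skipped
]);
// grouped: { feat: ["add hybrid mode"], fix: ["handle empty index"] }
```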

kb_schema_validate — JSON Schema validation

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| data | string | yes | JSON data to validate (as string) |
| schema | string | yes | JSON Schema to validate against (as string) |

Supports: type, required, properties, additionalProperties, items, enum, const, pattern, minimum/maximum, minLength/maxLength, minItems/maxItems.

kb_snippet — Persistent code snippet storage

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| action | enum | yes | save, get, list, search, delete |
| name | string | varies | Snippet name |
| language | string | no | Language tag |
| code | string | varies | Code content (for save) |
| tags | string[] | no | Categorization tags |
| query | string | varies | Search query (for search) |

Stored as JSON in .kb-state/snippets/. Searchable by name, tags, language, and content.

kb_env — System environment info

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| include_env | boolean | no | false | Include environment variables |
| filter_env | string | no | — | Filter env vars by name substring |
| show_sensitive | boolean | no | false | Show sensitive values (redacted by default) |

Returns platform, arch, OS, CPU count, memory, Node version, CWD. Sensitive env var values (keys matching key/secret/token/password) are redacted unless explicitly requested.
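
The redaction rule can be pictured roughly as follows. The four key fragments come from the description above; case-insensitive matching, the `[REDACTED]` placeholder, and the `redactEnv` name are assumptions for this sketch.

```typescript
// Key fragments are from the description; case-insensitivity is an assumption.
const SENSITIVE_NAME = /key|secret|token|password/i;

// Hypothetical sketch: mask values whose variable name looks sensitive.
function redactEnv(
  env: Record<string, string | undefined>,
  showSensitive = false,
): Record<string, string> {
  const result: Record<string, string> = {};
  for (const [name, value] of Object.entries(env)) {
    if (value === undefined) continue;
    // "[REDACTED]" is a placeholder chosen for this example.
    result[name] = !showSensitive && SENSITIVE_NAME.test(name) ? "[REDACTED]" : value;
  }
  return result;
}

const sampleEnv = redactEnv({ HOME: "/home/dev", API_TOKEN: "abc123", DB_PASSWORD: "hunter2" });
// sampleEnv keeps HOME as-is and masks API_TOKEN and DB_PASSWORD.
```

Matching on name fragments is a heuristic: it will also mask harmless variables such as one named KEYBOARD_LAYOUT, and it will miss secrets stored under unrelated names.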

kb_time — Date/time utilities

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| operation | enum | yes | now, parse, convert, diff, add |
| input | string | varies | Date input (ISO, unix timestamp, parseable string). For diff: two comma-separated dates |
| timezone | string | no | Target timezone (e.g., America/New_York) |
| duration | string | no | Duration to add (e.g., 2h30m, 1d) — for add operation |

Auto-detects unix seconds vs milliseconds. Supports human-readable duration format (2h30m, 1d12h).
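
The two behaviors above could be implemented along these lines. This is a sketch under assumptions: the `1e11` threshold for telling seconds from milliseconds, the inclusion of an `s` unit, and both function names are illustrative and not taken from the package.

```typescript
// Milliseconds per unit; d, h and m appear in the examples above, s is an assumption.
const UNIT_MS: Record<string, number> = { d: 86400000, h: 3600000, m: 60000, s: 1000 };

// Parse durations such as "2h30m" or "1d12h" into milliseconds.
function parseDuration(text: string): number {
  if (!/^(\d+[dhms])+$/.test(text)) throw new Error(`Invalid duration: ${text}`);
  const part = /(\d+)([dhms])/g;
  let total = 0;
  let match: RegExpExecArray | null;
  while ((match = part.exec(text)) !== null) {
    const [, amount = "0", unit = "s"] = match;
    total += Number(amount) * (UNIT_MS[unit] ?? 0);
  }
  return total;
}

// Heuristic: treat magnitudes below 1e11 as seconds, anything larger as milliseconds.
// Millisecond timestamps from before 1973 would be misread under this assumption.
function unixToMillis(value: number): number {
  return Math.abs(value) < 1e11 ? value * 1000 : value;
}

const twoAndAHalfHours = parseDuration("2h30m"); // 9000000
const fromSeconds = unixToMillis(1700000000); // 1700000000000
const fromMillis = unixToMillis(1700000000000); // unchanged
```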

Verified Lanes

kb_lane — Manage verified lanes

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| action | enum | yes | — | create, list, status, diff, merge, discard |
| name | string | varies | — | Lane name |
| files | string[] | varies | — | File paths to copy into lane (for create) |

Isolated file copies for parallel exploration. Create a lane, make changes, diff against originals, merge back, or discard.

System

kb_guide — Tool discovery

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| goal | string | yes | — | What you want to accomplish |
| max_recommendations | number | no | 5 | Max tools to recommend (1–10) |

Given a goal description, recommends which KB tools to use and in what order. Matches against 10 predefined workflows: onboard, audit, bugfix, implement, refactor, search, context, memory, validate, analyze.

kb_health — Run project health checks

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| path | string | no | cwd | Root directory to check |

Checks for package.json, tsconfig, scripts, lockfile, README, LICENSE, and .gitignore, and detects circular dependencies.

kb_queue — Manage task queues

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| action | enum | yes | — | create, push, next, done, fail, get, list, clear, delete |
| name | string | varies | — | Queue name |
| title | string | varies | — | Item title (for push) |
| id | string | varies | — | Item ID (for done/fail) |
| data | any | no | — | Arbitrary data to attach |
| error | string | varies | — | Error message (for fail) |

Sequential task queues for agent operations.

kb_replay — View or clear audit trail

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| action | enum | no | list | list to view entries, clear to wipe the log |
| last | number | no | 20 | Number of entries to return (list only) |
| tool | string | no | — | Filter by tool name (list only) |
| source | enum | no | — | Filter by source: mcp or cli (list only) |
| since | string | no | — | ISO timestamp — only show entries after this time (list only) |

Shows the audit trail of recent tool invocations. Each entry includes tool name, duration, input/output summaries, and status. Useful for debugging agent behavior.


MCP Resources

| URI | Name | Description |
|-----|------|-------------|
| kb://status | kb-status | Quick status: record count, file count, last indexed time |
| kb://file-tree | kb-file-tree | Sorted list of all indexed source file paths |


Curated Knowledge System

The curated system is the agent's persistent memory layer. Files are stored as markdown with YAML frontmatter in curated/:

curated/
├── conventions/   # 214 entries — coding conventions, style rules, naming patterns
├── decisions/     # 295 entries — architecture decisions, ADRs, design rationale
├── patterns/      # 165 entries — recurring patterns, templates, code idioms
└── troubleshooting/  # 62 entries — known issues, fixes, debugging guides

Frontmatter Format

---
title: "Use LanceDB for local vector storage"
category: decisions
tags: ["vector-store", "architecture", "lancedb"]
created: 2026-01-15T10:30:00.000Z
updated: 2026-02-20T14:22:00.000Z
version: 3
origin: curated
changelog:
  - version: 1
    date: 2026-01-15T10:30:00.000Z
    reason: "Initial creation"
  - version: 2
    date: 2026-02-01T09:00:00.000Z
    reason: "Added performance benchmarks"
  - version: 3
    date: 2026-02-20T14:22:00.000Z
    reason: "Updated with hybrid search findings"
---

Markdown content here...

Category Guidelines

| Category | Use For | Examples |
|----------|---------|---------|
| conventions | Rules the team follows | Naming patterns, import ordering, error handling style, test structure |
| decisions | Why something was chosen | ADRs, technology choices, pattern selections with trade-offs |
| patterns | How to do recurring things | CDK stack templates, API endpoint patterns, state machine shapes |
| troubleshooting | Known problems and fixes | Build failures, deployment gotchas, dependency conflicts |
| Custom (api-contracts, runbooks, etc.) | Any kebab-case slug | Domain-specific categories as needed |


How to Write Agent Instructions for Using This KB

When writing instructions for an AI agent (e.g., in .copilot-instructions.md, AGENTS.md, CLAUDE.md, or system prompts), include the following guidance:

Recommended Agent Instruction Template

## Knowledge Base Usage

You have access to a persistent knowledge base via MCP tools. Use it proactively.

### Before Starting Any Task
1. **Search first**: Use `kb_search` with a natural language query describing your task to find relevant context, prior decisions, conventions, and patterns before writing code.
2. **Check conventions**: `kb_search` with `origin: "curated"` and `category: "conventions"` for coding standards that apply to your work.
3. **Check decisions**: `kb_search` with `category: "decisions"` to understand why things were built a certain way.

### Search Strategies
- **Broad hybrid** (default, best for most queries): `kb_search({ query: "notification routing architecture" })`
- **Exact identifier lookup**: `kb_search({ query: "handleNotification", search_mode: "keyword" })`
- **Conceptual similarity**: `kb_search({ query: "event-driven message fanout design", search_mode: "semantic" })`
- **Specific convention**: `kb_search({ query: "error handling", origin: "curated", category: "conventions" })`
- **Code patterns**: `kb_search({ query: "DynamoDB batch write pattern", content_type: "code-typescript" })`
- **Troubleshooting**: `kb_search({ query: "deployment failure CDK", category: "troubleshooting" })`
- **View a full file**: `kb_lookup({ path: "src/services/dispatcher.ts" })`

### Follow the Hints
Every tool response includes a `_Next:` suggestion at the bottom. Follow these hints for efficient workflows — they guide you to logical next actions (e.g., after `kb_search`, the hint suggests `kb_lookup` or `kb_analyze_structure`).

### After Completing a Task
1. **Remember decisions**: If you made an architecture or design decision, store it:

kb_remember({ title: "Use event-driven pattern for notification fanout", content: "## Decision\n\n...\n\n## Rationale\n\n...\n\n## Alternatives Considered\n\n...", category: "decisions", tags: ["notifications", "architecture", "event-driven"] })

2. **Remember patterns**: If you established a reusable code pattern:

kb_remember({ title: "CDK construct pattern for SQS-Lambda integration", content: "## Pattern\n\n```typescript\n...\n```\n\n## When to Use\n\n...", category: "patterns", tags: ["cdk", "sqs", "lambda"] })

3. **Remember troubleshooting**: If you solved a tricky bug:

kb_remember({ title: "LanceDB dimension mismatch after model change", content: "## Problem\n\n...\n\n## Solution\n\nRun kb_reindex({ full: true })...", category: "troubleshooting", tags: ["lancedb", "embeddings"] })

4. **Update existing knowledge**: If you're revising a prior decision:

kb_update({ path: "decisions/use-event-driven-fanout.md", content: "updated markdown...", reason: "Added retry strategy after production incident" })


### Knowledge Quality Standards
- **Decisions** should include: Context, Decision, Rationale, Alternatives Considered, Consequences
- **Patterns** should include: Pattern description, Code example, When to Use, When Not to Use
- **Conventions** should include: The Rule, Why, Examples (good vs bad)
- **Troubleshooting** should include: Problem, Symptoms, Root Cause, Solution, Prevention

### Codebase Analysis (for onboarding or deep understanding)
- Run `kb_produce_knowledge({ aspects: ["all"] })` to get analysis baselines and follow the synthesis instructions to populate the KB
- Use `kb_analyze_structure({ path: "." })` for project layout overview
- Use `kb_analyze_dependencies({ path: ".", format: "mermaid" })` for dependency visualization
- Use `kb_analyze_patterns({ path: "." })` to detect frameworks and conventions

### Index Maintenance
- Run `kb_reindex()` after significant code changes (incremental, fast)
- Run `kb_reindex({ full: true })` after model changes or corruption
- Check `kb_status()` to verify index health

Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| KB_TRANSPORT | stdio | Transport mode: stdio or http |
| KB_PORT | 3210 | HTTP server port (when --transport http) |
| KB_AUTO_INDEX | true | Set to false to skip initial auto-indexing |
| KB_CORS_ORIGIN | * | CORS allowed origin for HTTP mode |
| KB_WORKSPACE_DIR | — | Override workspace root path |
| KB_DATA_DIR | — | Override LanceDB data directory |
| KB_MODEL_DIR | — | Override ONNX model cache directory |


Tech Stack

  • Embeddings: @huggingface/transformers with mixedbread-ai/mxbai-embed-large-v1 (1024 dimensions, ONNX, query-prefixed with "Represent this sentence for searching relevant passages: ")
  • Vector Store: LanceDB (local disk, L2 distance → similarity score, built-in FTS index for keyword search)
  • Search: Hybrid (vector + FTS + RRF fusion), semantic-only, or keyword-only modes
  • Chunking: Markdown-aware (heading hierarchy), tree-sitter AST-based (TS/JS/Python/Go/Rust/Java — preserves function/class boundaries), with overlap. Falls back to regex-based generic chunking when tree-sitter grammars are unavailable.
  • Dependency Analysis: Confidence-scored imports (high/medium/low per import pattern)
  • Auto-Persist: Analysis tool results are automatically indexed as origin: 'produced' entries for future search
  • Workflow Hints: Every tool response includes _Next: suggestions for logical follow-up actions
  • MCP SDK: @modelcontextprotocol/sdk (stdio + StreamableHTTP transports)
  • Runtime: Node.js ≥ 24, TypeScript, ESM, pnpm workspaces
  • Build: tsdown with integrated dts generation
  • Lint: Biome
  • Test: Vitest
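
The hybrid search mode listed above fuses vector and FTS results with RRF (reciprocal rank fusion). As a rough sketch of the general technique, each result contributes 1 / (k + rank) for every list it appears in, and the summed scores decide the final order. The function name and the constant k = 60 (a commonly used default for RRF) are assumptions; the package's actual constant and tie-breaking are not documented here.

```typescript
// Hypothetical sketch of reciprocal rank fusion over ranked lists of result IDs.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60, // commonly used default; the package's value is not documented
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      // Ranks are 1-based, so the top hit in a list contributes 1 / (k + 1).
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score })).sort((a, b) => b.score - a.score);
}

const vectorHits = ["a.ts", "b.ts", "c.ts"]; // e.g. semantic ranking
const keywordHits = ["b.ts", "d.ts", "a.ts"]; // e.g. FTS ranking
const fused = reciprocalRankFusion([vectorHits, keywordHits]);
// fused order: b.ts, a.ts, d.ts, c.ts
```

Results that appear in both lists rise to the top even when neither list ranks them first, which is the main appeal of rank-based fusion: it needs no score normalization between the vector and keyword backends.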