# @swarmvaultai/cli

v0.7.26
@swarmvaultai/cli is the global command-line entry point for SwarmVault.
It gives you the swarmvault command for building a local-first knowledge vault from files, audio transcripts, YouTube URLs, reStructuredText and DOCX documents, browser clips, saved query outputs, and guided exploration runs.
## Install

SwarmVault requires Node >=24.

```
npm install -g @swarmvaultai/cli
```

Installed commands:

- `swarmvault`
- `vault` as a compatibility alias
## First Run

```
mkdir my-vault
cd my-vault
swarmvault init --obsidian --profile personal-research
swarmvault init --obsidian --profile reader,timeline
swarmvault demo
swarmvault source add https://github.com/karpathy/micrograd
swarmvault source add https://example.com/docs/getting-started
swarmvault source add ./exports/customer-call.srt --guide
swarmvault source session file-customer-call-srt-12345678
swarmvault source list
swarmvault source reload --all
sed -n '1,120p' swarmvault.schema.md
swarmvault ingest ./notes.md
swarmvault ingest ./customer-call.mp3
swarmvault ingest https://www.youtube.com/watch?v=dQw4w9WgXcQ
swarmvault ingest ./repo
swarmvault add https://arxiv.org/abs/2401.12345
swarmvault compile --max-tokens 120000
swarmvault diff
swarmvault benchmark
swarmvault query "What keeps recurring?" --commit
swarmvault query "Turn this into slides" --format slides
swarmvault explore "What should I research next?" --steps 3
swarmvault lint --deep
swarmvault graph blast ./src/index.ts
swarmvault graph query "Which nodes bridge the biggest clusters?"
swarmvault graph explain "concept:drift"
swarmvault watch status
swarmvault watch --repo --once
swarmvault hook install
swarmvault graph serve
swarmvault graph export --report ./exports/report.html
swarmvault graph export --html ./exports/graph.html
swarmvault graph export --cypher ./exports/graph.cypher
swarmvault graph push neo4j --dry-run
```

## Commands
### `swarmvault init [--obsidian] [--profile <alias-or-presets>]`

Create a workspace with:

- `inbox/`
- `raw/`
- `wiki/`
- `wiki/insights/`
- `state/`
- `state/sessions/`
- `agent/`
- `swarmvault.config.json`
- `swarmvault.schema.md`
- optional `.obsidian/` workspace files when `--obsidian` is passed
The schema file is the vault-specific instruction layer. Edit it to define naming rules, categories, grounding expectations, and exclusions before a serious compile.
--profile accepts default, personal-research, or a comma-separated preset list such as reader,timeline. For fully custom vault behavior, edit the profile block in swarmvault.config.json; that deterministic profile layer works alongside the human-written swarmvault.schema.md. The personal-research preset also sets profile.guidedIngestDefault: true and profile.deepLintDefault: true, so guided ingest/source and lint flows are on by default until you override them with --no-guide or --no-deep.
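As a sketch, a customized profile block in `swarmvault.config.json` might look like the following. The `guidedIngestDefault`, `deepLintDefault`, and `guidedSessionMode` keys are the ones documented above; the `presets` key name is an assumption about how the preset list is spelled in config.

```json
{
  "profile": {
    "presets": ["reader", "timeline"],
    "guidedIngestDefault": true,
    "deepLintDefault": true,
    "guidedSessionMode": "canonical_review"
  }
}
```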
### `swarmvault scan <directory> [--port <port>] [--no-serve]`
Quick-start a scratch vault from a local directory in one command.
- initializes the current directory as a SwarmVault workspace
- ingests the supplied directory as local sources
- compiles the vault immediately
- starts `graph serve` unless you pass `--no-serve`
- respects `--port` when you want a specific viewer port
Use this when you want the fastest repo or docs-tree walkthrough without first deciding on managed-source registration.
### `swarmvault demo [--port <port>] [--no-serve]`
Create a temporary sample vault with bundled sources, compile it immediately, and launch the graph viewer unless you pass --no-serve.
- writes the demo vault under the system temp directory
- requires no API keys or extra setup
- is the fastest way to inspect the full init + ingest + compile + graph workflow on a clean machine
- respects `--port` when you want a specific viewer port
### `swarmvault diff`
Compare the current state/graph.json against the last committed graph in git.
- when a prior committed graph exists, prints added and removed nodes, pages, and edges
- when no git baseline exists, falls back to a summary of the current graph state
- supports `--json` for structured automation output
### `swarmvault source add|list|reload|review|guide|session|delete`

Manage recurring source roots through a registry-backed workflow.

- `source add <input>` supports local files, local directories, public GitHub repo root URLs such as `https://github.com/karpathy/micrograd`, and docs/wiki/help/reference/tutorial hubs
- by default `source add` registers the source, syncs it into the vault, runs one compile, and writes a source brief to `wiki/outputs/source-briefs/<source-id>.md`
- add `--guide` when you want a resumable source session, source brief, source review, source guide, and approval-bundled canonical page edits when `profile.guidedSessionMode` is `canonical_review`, with `wiki/insights/` fallback for `insights_only`
- set `profile.guidedIngestDefault: true` when guided mode should be the default for `source add` and `source reload`, and use `--no-guide` for individual light-path runs
- `source list` shows every managed source with its kind, status, and current brief path
- `source reload [id]` re-syncs one source, or use `--all` to refresh everything in the registry and compile once
- `source review <id>` stages a lighter source-scoped review artifact
- `source guide <id>` remains a compatibility alias for the guided session flow
- `source session <id>` resumes the latest guided session for a managed source id, raw source id, source scope id, or session id
- `source delete <id>` unregisters the source and removes transient sync state under `state/sources/<id>/`, but leaves canonical `raw/`, `wiki/`, and saved output artifacts intact
Useful flags:
`--all`, `--guide`, `--no-guide`, `--answers-file <path>`, `--no-compile`, `--no-brief`, `--max-pages <n>`, `--max-depth <n>`
Managed sources write registry state to state/sources.json. Guided sessions write durable anchors to wiki/outputs/source-sessions/ and session state to state/source-sessions/. In an interactive TTY, --guide can ask the session questions immediately; otherwise use source session <id> or --answers-file <path> to resume and stage the approval bundle later. Local directory entries remain compatible with watch --repo; remote GitHub and docs-crawl sources are manual source reload sources in this release.
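A minimal guided round trip looks like this, reusing the exact commands from First Run (the session id shown there is an example; yours will differ):

```
# register a local transcript and start a guided session
swarmvault source add ./exports/customer-call.srt --guide

# resume the latest guided session for that source later
swarmvault source session file-customer-call-srt-12345678
```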
Source-scoped artifacts are intentionally split by role:
| Artifact | Created by | Purpose |
|----------|-----------|---------|
| Source brief | source add, ingest (always) | Auto summary written to wiki/outputs/source-briefs/ |
| Source review | source review, source add --guide, ingest --review, ingest --guide | Lighter staged assessment in wiki/outputs/source-reviews/ |
| Source guide | source guide, source add --guide, ingest --guide | Guided walkthrough with approval-bundled updates in wiki/outputs/source-guides/ |
| Source session | source session, source add --guide, ingest --guide | Resumable workflow state in wiki/outputs/source-sessions/ and state/source-sessions/ |
### `swarmvault ingest <path-or-url> [--commit]`
Ingest a local file path, directory path, or URL into immutable source storage and write manifests to state/manifests/.
- local directories recurse by default
- directory ingest respects `.gitignore` unless you pass `--no-gitignore`
- repo-aware directory ingest records `repoRelativePath` and later compile writes `state/code-index.json`
- use `source add` instead when the same local directory, public GitHub repo root, or docs hub should stay registered and reloadable
- URL ingest still localizes remote image references by default
- YouTube URLs short-circuit to direct transcript capture instead of generic HTML fetch
- local file and archive ingest supports markdown, text, reStructuredText, HTML, PDF, Word, RTF, OpenDocument, EPUB, CSV/TSV, Excel, PowerPoint, Jupyter notebooks, BibTeX, Org-mode, AsciiDoc, transcripts, Slack exports, email, calendar, audio, structured config/data, developer manifests, images, and code
- add `--guide` when you want a resumable source session, source brief, source review, source guide, and approval-bundled canonical page edits when `profile.guidedSessionMode` is `canonical_review`, with `wiki/insights/` fallback for `insights_only`
- set `profile.guidedIngestDefault: true` when guided mode should be the default for `ingest`, and use `--no-guide` to force a plain ingest for one run
- code-aware directory ingest currently covers JavaScript, JSX, TypeScript, TSX, Bash/shell scripts, Python, Go, Rust, Java, Kotlin, Scala, Dart, Lua, Zig, C#, C, C++, PHP, Ruby, PowerShell, Elixir, OCaml, Objective-C, ReScript, Solidity, HTML, CSS, and Vue single-file components
Useful flags:
`--repo-root <path>`, `--answers-file <path>`, `--no-guide`, `--include <glob...>`, `--exclude <glob...>`, `--max-files <n>`, `--include-third-party`, `--include-resources`, `--include-generated`, `--no-gitignore`, `--no-include-assets`, `--max-asset-size <bytes>`, `--commit`
Repo ingest defaults to first_party material. The extra --include-* flags opt dependency trees, resource bundles, and generated output back in when you actually want them in the vault.
Large repo ingest now emits low-noise progress on materially large batches, and parser compatibility failures stay local to the affected source instead of aborting unrelated analysis.
Audio files use tasks.audioProvider when you configure a provider with audio capability. When no such provider is configured, SwarmVault still ingests the source and records an explicit extraction warning instead of failing. YouTube transcript ingest does not require a model provider.
When --commit is set, SwarmVault stages wiki/ and state/ changes and creates a git commit when the vault root is inside a git worktree. Outside git, it becomes a no-op instead of failing.
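Putting those flags together, a first-party repo ingest might look like this (the glob pattern and file cap are illustrative values, not defaults):

```
# ingest a repo tree, honoring .gitignore, skipping build output
swarmvault ingest ./repo --exclude "dist/**" --max-files 2000

# opt dependency trees back in and commit the result
swarmvault ingest ./repo --include-third-party --commit
```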
### `swarmvault add <url>`
Capture supported URLs through a normalized markdown layer before ingesting them into the vault.
- arXiv abstract URLs and bare arXiv ids become durable markdown captures
- DOI URLs and bare DOI strings normalize into article-style research captures
- generic article URLs use a readability-style capture path with normalized research frontmatter
- X/Twitter URLs use a graceful public capture path
- unsupported URLs fall back to generic URL ingest instead of failing
- optional metadata: `--author <name>` and `--contributor <name>`
- normalized captures record fields such as `source_type`, `source_url`, `canonical_url`, `title`, `authors`, `published_at`, `updated_at`, `doi`, and `tags` when available
- use `source add` instead when the URL is a public GitHub repo root or a docs hub that should stay synced over time
### `swarmvault inbox import [dir]`
Import supported files from the configured inbox directory. This is meant for browser-clipper style markdown bundles, HTML clip bundles, and other capture workflows. Local image and asset references are preserved and copied into canonical storage under raw/assets/.
### `swarmvault compile [--approve] [--commit] [--max-tokens <n>]`
Compile the current manifests into:
- generated markdown in `wiki/`
- structured graph data in `state/graph.json`
- local search data in `state/search.sqlite`
The compiler also reads swarmvault.schema.md and records a schema_hash plus lifecycle metadata such as status, created_at, updated_at, compiled_from, and managed_by in generated pages so schema edits can mark pages stale without losing lifecycle state.
For ingested code trees, compile also writes state/code-index.json so local imports and module aliases can resolve across the repo-aware code graph.
New concept and entity pages are staged into wiki/candidates/ first. A later matching compile promotes them into wiki/concepts/ or wiki/entities/.
With --approve, compile writes a staged review bundle into state/approvals/ without applying active wiki changes.
Useful flags:
`--approve`, `--commit`, `--max-tokens <n>`
--max-tokens <n> keeps the generated wiki inside a bounded token budget by dropping lower-priority pages from final wiki/ output and reporting token-budget stats in the compile result. --commit immediately commits wiki/ and state/ changes when the vault lives in a git repo.
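A review-gated, budgeted compile can chain these flags with the `review` subcommands (the approval id is a placeholder you read from `review list` output):

```
# stage a budgeted compile for review instead of applying it
swarmvault compile --approve --max-tokens 120000

# inspect and apply the staged bundle
swarmvault review show <approvalId> --diff
swarmvault review accept <approvalId>
```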
### `swarmvault benchmark [--question "<text>" ...]`
Measure graph-guided context reduction against a naive full-corpus read.
- writes the latest result to `state/benchmark.json`
- updates `wiki/graph/report.md` and `wiki/graph/report.json` with the current benchmark summary
- accepts repeatable `--question` inputs for vault-specific benchmarks
- compile and repo-aware refresh runs also keep the benchmark/report artifacts up to date by default
### `swarmvault review list|show|accept|reject`

Inspect and resolve staged approval bundles created by `swarmvault compile --approve`.

- `review list` shows pending, accepted, and rejected entry counts per bundle
- `review show <approvalId>` shows each staged entry plus its current and staged content, including a section-level change summary when available
- `review show <approvalId> --diff` adds a unified diff between current and staged content
- `review accept <approvalId> [targets...]` applies pending entries to the live wiki
- `review reject <approvalId> [targets...]` marks pending entries as rejected without mutating active wiki paths
Targets can be page ids such as concept:approval-concept or relative wiki paths such as concepts/approval-concept.md.
### `swarmvault candidate list|promote|archive`

Inspect and resolve staged concept and entity candidates.

- `candidate list` shows every current candidate plus its active destination path
- `candidate promote <target>` promotes a candidate immediately into `wiki/concepts/` or `wiki/entities/`
- `candidate archive <target>` removes a candidate from the staged set
Targets can be page ids or relative paths under wiki/candidates/.
### `swarmvault query "<question>" [--no-save] [--commit] [--format markdown|report|slides|chart|image]`

Query the compiled vault. The query layer also reads `swarmvault.schema.md`, so answers follow the vault's own structure and grounding rules.

By default, the answer is written into `wiki/outputs/` and immediately registered in:

- `wiki/index.md`
- `wiki/outputs/index.md`
- `state/graph.json`
- `state/search.sqlite`
Saved outputs also carry related page, node, and source metadata so SwarmVault can refresh related source, concept, and entity pages immediately.
Human-authored pages in wiki/insights/ are also indexed into search and query context, but SwarmVault does not rewrite them after initialization.
By default, query uses the local SQLite search index. When an embedding-capable provider is available and search.hybrid is not disabled, semantic page matches are fused into the same candidate set before answer generation. tasks.embeddingProvider is the explicit way to choose that backend, but SwarmVault can also fall back to a queryProvider with embeddings support. Set search.rerank: true when you want the configured queryProvider to rerank the merged top hits. --commit immediately commits saved wiki/ and state/ changes when the vault root is inside a git repo.
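A hybrid-search setup under those rules might look like this in `swarmvault.config.json`. The provider name reuses the `ollama-local` example from Provider Configuration below; treat the exact combination as a sketch, not the only valid shape.

```json
{
  "tasks": {
    "queryProvider": "ollama-local",
    "embeddingProvider": "ollama-local"
  },
  "search": {
    "hybrid": true,
    "rerank": true
  }
}
```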
### `swarmvault explore "<question>" [--steps <n>] [--format markdown|report|slides|chart|image]`
Run a save-first multi-step research loop.
Each step:
- queries the vault
- saves the answer into `wiki/outputs/`
- generates follow-up questions
- chooses the next follow-up deterministically
The command also writes a hub page linking the root question, saved step pages, and generated follow-up questions.
### `swarmvault lint [--deep] [--no-deep] [--web] [--conflicts]`

Run anti-drift and vault health checks such as stale pages, missing graph artifacts, contradiction findings, and other structural issues.

`--deep` adds an LLM-powered advisory pass that can report:

- `coverage_gap`
- `contradiction`
- `contradiction_candidate`
- `missing_citation`
- `candidate_page`
- `follow_up_question`
Set profile.deepLintDefault: true when deep lint should be the default for swarmvault lint, and use --no-deep when one run should stay structural only.
--web can only be used when deep lint is enabled, either explicitly with --deep or through profile.deepLintDefault. It enriches deep-lint findings with external evidence snippets and URLs from a configured web-search provider. Web search is currently scoped to deep lint; other commands (compile, query, explore) use only local vault state.
--conflicts filters the results down to contradiction-focused findings so you can audit conflicting claims without the rest of the lint output.
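Combining those flags:

```
# structural checks only
swarmvault lint

# advisory deep pass with web evidence, filtered to conflicting claims
swarmvault lint --deep --web --conflicts
```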
### `swarmvault watch [--lint] [--repo] [--once] [--code-only] [--debounce <ms>]`
Watch the inbox directory and trigger import and compile cycles when files change. With --repo, each cycle also refreshes tracked repo roots that were previously ingested through directory ingest. With --once, SwarmVault runs one refresh cycle immediately instead of starting a long-running watcher. With --code-only, SwarmVault forces the narrower AST-only refresh path and skips non-code semantic re-analysis until you run a normal compile. With --lint, each cycle also runs linting. Each cycle writes a canonical session artifact to state/sessions/, and compatibility run metadata is still appended to state/jobs.ndjson.
When --repo sees non-code changes under tracked repo roots, SwarmVault records those files under state/watch/pending-semantic-refresh.json, marks affected compiled pages stale, and exposes the pending set through watch status and the local graph workspace instead of silently re-ingesting them.
When --repo sees only code-file changes under tracked repo roots, SwarmVault takes the faster code-only path: it refreshes code pages and graph structure without re-running non-code semantic analysis for unchanged sources.
### `swarmvault watch status`
Show watched repo roots, the latest watch run, and any pending semantic refresh entries for tracked non-code repo changes.
### `swarmvault hook install|uninstall|status`

Manage SwarmVault's local git hook blocks for the nearest git repository.

- `hook install` writes marker-based `post-commit` and `post-checkout` hooks
- `hook uninstall` removes only the SwarmVault-managed hook block
- `hook status` reports whether those managed hook blocks are installed
The installed hooks run swarmvault watch --repo --once --code-only from the vault root so commit and checkout refreshes update code pages and graph structure quickly. Run a normal swarmvault compile when you also want non-code semantic re-analysis.
### `swarmvault mcp`

Run SwarmVault as a local MCP server over stdio. This exposes the vault to compatible clients and agents through tools and resources such as `workspace_info`, `search_pages`, `read_page`, `list_sources`, `query_vault`, `ingest_input`, `compile_vault`, `lint_vault`, and `blast_radius`.
compile_vault also accepts maxTokens for bounded wiki output, and blast_radius traces reverse import impact for a file or module target.
The MCP surface also exposes swarmvault://schema, swarmvault://sessions, swarmvault://sessions/{path}, and includes schemaPath in workspace_info.
### `swarmvault graph serve`
Start the local graph workspace backed by state/graph.json, /api/search, /api/page, and local graph query/path/explain endpoints.
It also exposes /api/bookmarklet and /api/clip, so a running local viewer can ingest the current browser page through a bookmarklet without leaving the browser.
### `swarmvault graph query "<question>" [--dfs] [--budget <n>]`
Run a deterministic local graph traversal seeded from local search, graph labels, and matching group patterns.
### `swarmvault graph path <from> <to>`
Return the shortest high-confidence path between two graph targets.
### `swarmvault graph explain <target>`
Inspect graph metadata, community membership, neighbors, provenance, and group-pattern membership for a node or page.
### `swarmvault graph god-nodes [--limit <n>]`
List the most connected bridge-heavy nodes in the current graph.
### `swarmvault graph blast <target> [--depth <n>]`

Trace the reverse-import blast radius of changing a file or module.

- accepts a file path, module label, or module id
- follows reverse `imports` edges through the compiled graph
- reports affected modules by depth so you can estimate downstream impact before editing
### `swarmvault graph export --html|--html-standalone|--report|--svg|--graphml|--cypher|--json|--obsidian|--canvas <output>`

Export the current graph as one or more shareable formats:

- `--html` for the full self-contained read-only graph workspace
- `--html-standalone` for a lighter vis.js export with node search, legend, and sidebar inspection
- `--report` for a self-contained HTML graph report with stats, key nodes, communities, and warnings
- `--svg` for a static shareable diagram
- `--graphml` for graph-tool interoperability
- `--cypher` for Neo4j-style import scripts
- `--json` for a deterministic machine-readable graph package
- `--obsidian` for an Obsidian-friendly markdown vault that preserves wiki folders, appends graph connections, emits orphan-node stubs and community notes, copies assets, and writes a minimal `.obsidian/` config
- `--canvas` for an Obsidian canvas grouped by community
You can combine multiple flags in one run to write several exports at once.
Set graph.communityResolution in swarmvault.config.json when you want to pin the Louvain clustering resolution used by graph reports and Obsidian community output instead of relying on the adaptive default.
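For example (the resolution value here is illustrative; with Louvain clustering, higher resolutions generally produce more, smaller communities):

```json
{
  "graph": {
    "communityResolution": 1.2
  }
}
```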
### `swarmvault graph push neo4j`

Push the compiled graph directly into Neo4j over Bolt/Aura instead of writing an intermediate file.

Useful flags:

`--uri <bolt-uri>`, `--username <user>`, `--password-env <env-var>`, `--database <name>`, `--vault-id <id>`, `--include-third-party`, `--include-resources`, `--include-generated`, `--dry-run`
Defaults:
- reads `graphSinks.neo4j` from `swarmvault.config.json` when present
- includes only `first_party` graph material unless you opt into more source classes
- namespaces every remote record by `vaultId` so multiple vaults can safely share one Neo4j database
- upserts current graph records and does not prune stale remote data yet
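A guarded first push might look like this; the URI, username, environment variable, and vault id are placeholders, and `--dry-run` keeps the remote database untouched:

```
swarmvault graph push neo4j \
  --uri bolt://localhost:7687 \
  --username neo4j \
  --password-env NEO4J_PASSWORD \
  --vault-id my-vault \
  --dry-run
```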
### `swarmvault install --agent <codex|claude|cursor|goose|pi|gemini|opencode|aider|copilot|trae|claw|droid>`
Install agent-specific rules into the current project so an agent understands the SwarmVault workspace contract and workflow.
Hook-capable installs:

```
swarmvault install --agent claude --hook
swarmvault install --agent gemini --hook
swarmvault install --agent opencode --hook
swarmvault install --agent copilot --hook
```

Agent target mapping:

- `codex`, `goose`, `pi`, and `opencode` share `AGENTS.md`
- `claude` writes `CLAUDE.md`
- `gemini` writes `GEMINI.md`
- `aider` writes `CONVENTIONS.md` and merges `.aider.conf.yml`
- `copilot` writes `.github/copilot-instructions.md` plus `AGENTS.md`
- `cursor` writes `.cursor/rules/swarmvault.mdc`
- `trae` writes `.trae/rules/swarmvault.md`
- `claw` writes `.claw/skills/swarmvault/SKILL.md`
- `droid` writes `.factory/rules/swarmvault.md`
Hook semantics:
- `claude --hook` writes `.claude/settings.json` plus `.claude/hooks/swarmvault-graph-first.js` and adds model-visible advisory context through structured hook JSON
- `gemini --hook` writes `.gemini/settings.json` plus `.gemini/hooks/swarmvault-graph-first.js` and stays advisory/model-visible
- `opencode --hook` writes `.opencode/plugins/swarmvault-graph-first.js` and stays advisory/log-only
- `copilot --hook` writes `.github/hooks/swarmvault-graph-first.json` plus `.github/hooks/swarmvault-graph-first.js` and remains decision-based rather than advisory
aider is intentionally file/config-based in this release rather than hook-based.
## OpenClaw / ClawHub Skill
If you use OpenClaw through ClawHub, install the packaged skill:
```
clawhub install swarmvault
```

That published bundle includes SKILL.md, a ClawHub README, examples, references, troubleshooting notes, and release-validation prompts. The CLI binary still comes from npm:

```
npm install -g @swarmvaultai/cli
```

## Provider Configuration
SwarmVault defaults to a local heuristic provider so the CLI works without API keys, but real vaults will usually point at an actual model provider.
Example:
```json
{
  "providers": {
    "ollama-local": {
      "type": "ollama",
      "model": "qwen3:latest",
      "baseUrl": "http://127.0.0.1:11434/v1",
      "capabilities": ["chat", "structured", "vision", "local"]
    }
  },
  "tasks": {
    "compileProvider": "ollama-local",
    "queryProvider": "ollama-local",
    "lintProvider": "ollama-local",
    "visionProvider": "ollama-local"
  }
}
```

Generic OpenAI-compatible APIs are supported through config when the provider follows the OpenAI request shape closely enough.
Deep lint web augmentation uses a separate webSearch config block. Example:
```json
{
  "webSearch": {
    "providers": {
      "evidence": {
        "type": "http-json",
        "endpoint": "https://search.example/api/search",
        "method": "GET",
        "apiKeyEnv": "SEARCH_API_KEY",
        "apiKeyHeader": "Authorization",
        "apiKeyPrefix": "Bearer ",
        "queryParam": "q",
        "limitParam": "limit",
        "resultsPath": "results",
        "titleField": "title",
        "urlField": "url",
        "snippetField": "snippet"
      }
    },
    "tasks": {
      "deepLintProvider": "evidence"
    }
  }
}
```

Search behavior is configurable separately from provider routing:
```json
{
  "search": {
    "hybrid": true,
    "rerank": false
  }
}
```

- `search.hybrid` defaults to enabled and merges full-text hits with semantic page matches when an embedding-capable provider is available
- `search.rerank` optionally asks the current `queryProvider` to rerank the merged top hits before query answers are generated
## Troubleshooting

- If you are running from a source checkout and `graph serve` says the viewer build is missing, run `pnpm build` in the repository first
- If a provider claims OpenAI compatibility but fails structured generation, declare only the capabilities it actually supports
- If `lint --deep --web` fails immediately, make sure a `webSearch` provider is configured and mapped to `tasks.deepLintProvider`
- If you still see a `node:sqlite` experimental warning on Node 24, upgrade to the latest CLI; current releases suppress that upstream warning during normal runs
## Links
- Website: https://www.swarmvault.ai
- Docs: https://www.swarmvault.ai/docs
- GitHub: https://github.com/swarmclawai/swarmvault
