pi-llm-wiki
v0.1.0
A Pi package for persistent markdown wikis with source capture, generated metadata, linting, and an LLM wiki-maintainer skill.
Inspired by Andrej Karpathy’s “LLM Wiki” gist: https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
This package is a Pi-native implementation of that idea.
Build a persistent, LLM-maintained markdown wiki inside pi with immutable source capture, interlinked knowledge pages, generated navigation metadata, and a bundled wiki-maintainer skill.
pi-llm-wiki implements the “LLM wiki” pattern as a Pi-native package:
- a Pi extension for deterministic operations, guardrails, and generated metadata
- a bundled llm-wiki skill that teaches the model how to maintain the wiki
- a markdown vault structure that accumulates knowledge over time instead of re-deriving it from raw files on every query
Why this package exists
Most file-based LLM workflows behave like one-shot RAG: the model searches raw documents every time you ask a question. That works, but the synthesis is ephemeral.
pi-llm-wiki creates a middle layer:
- raw source packets preserve source-of-truth inputs
- source pages summarize what each source says
- canonical wiki pages track what the wiki currently believes
- generated metadata keeps the whole vault searchable and navigable
The result is a wiki that compounds as you capture sources, ask questions, and file durable analyses.
Features
- Wiki bootstrap: initialize a new vault with config, templates, schema, and metadata files
- Immutable source capture: capture URLs, local files, PDFs, or pasted text into packets under raw/
- Source-page boundary: every source becomes a source page before it influences canonical knowledge
- Canonical page management: safely resolve or create concept, entity, synthesis, and analysis pages
- Generated metadata: rebuilds meta/registry.json, meta/backlinks.json, meta/index.md, and meta/log.md
- Mechanical linting: broken links, orphan pages, duplicate aliases/titles, frontmatter issues, coverage gaps, stale captures
- Operational guardrails: blocks direct edits to raw sources and generated metadata files
- Bundled skill: teaches the model how to capture, integrate, query, and audit the wiki
- Obsidian-friendly links: folder-qualified wikilinks plus stable source-ID citations
Architecture
The vault has four logical layers:
- Raw capture: immutable source packets in raw/
- Wiki pages: source pages and canonical pages in wiki/
- Meta: generated registry, backlinks, index, event log, and lint report in meta/
- Schema: human/model operating rules in WIKI_SCHEMA.md and .wiki/config.json
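As a rough illustration of how the generated backlinks layer could be derived from the wiki layer, the sketch below scans page bodies for wikilinks and inverts them into a target-to-sources map. The in-memory `pages` dict, the function name, and the output shape are assumptions for illustration only; the actual format of meta/backlinks.json is not documented here.

```python
import re
from collections import defaultdict

# Matches [[target]] and [[target|label]]; captures only the target part.
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")


def build_backlinks(pages: dict[str, str]) -> dict[str, list[str]]:
    """Map each link target to the sorted list of pages that link to it.

    `pages` maps a page path without extension (e.g. "concepts/rag")
    to its markdown body. This shape is hypothetical.
    """
    backlinks: dict[str, set[str]] = defaultdict(set)
    for page, body in pages.items():
        for target in WIKILINK.findall(body):
            target = target.strip()
            if target != page:  # ignore self-links
                backlinks[target].add(page)
    return {target: sorted(sources) for target, sources in sorted(backlinks.items())}


pages = {
    "concepts/retrieval-augmented-generation": "Compared in [[syntheses/long-context-vs-rag]].",
    "syntheses/long-context-vs-rag": (
        "See [[concepts/retrieval-augmented-generation]] and [[entities/openai|OpenAI]]."
    ),
}
backlinks = build_backlinks(pages)
```

Because the map is computed purely from page content, it can be thrown away and rebuilt at any time, which is consistent with treating meta/ as machine-owned.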
Ownership model
| Path | Owner | Rule |
|------|-------|------|
| raw/** | extension tools | immutable after capture |
| wiki/** | model + user | editable knowledge pages |
| meta/registry.json | extension | generated |
| meta/backlinks.json | extension | generated |
| meta/index.md | extension | generated |
| meta/events.jsonl | extension/tool | append-only |
| meta/log.md | extension | generated from events |
| meta/lint-report.md | extension | generated |
| WIKI_SCHEMA.md | human + explicit request | operating manual |
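The split between an append-only meta/events.jsonl and a generated meta/log.md can be sketched as follows. This is a minimal sketch, not the extension's code: the event fields (`ts`, `kind`, `summary`), the `capture` kind, and the log layout are hypothetical. Only the file names and the `integrate` kind come from this README.

```python
import json
import tempfile
from pathlib import Path


def append_event(events_path: Path, event: dict) -> None:
    """Append one event as a single JSON line; existing lines are never rewritten."""
    with events_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event, sort_keys=True) + "\n")


def render_log(events_path: Path) -> str:
    """Regenerate the human-readable log from the full event history."""
    lines = ["# Log", ""]
    for raw in events_path.read_text(encoding="utf-8").splitlines():
        if not raw.strip():
            continue
        event = json.loads(raw)
        lines.append(f"- {event['ts']} [{event['kind']}] {event['summary']}")
    return "\n".join(lines) + "\n"


with tempfile.TemporaryDirectory() as tmp:
    meta = Path(tmp) / "meta"
    meta.mkdir()
    events = meta / "events.jsonl"
    # Hypothetical event records; field names are assumptions.
    append_event(events, {"ts": "2026-04-04T10:00:00Z", "kind": "capture", "summary": "SRC-2026-04-04-001"})
    append_event(events, {"ts": "2026-04-04T10:05:00Z", "kind": "integrate", "summary": "SRC-2026-04-04-001"})
    log_md = render_log(events)
    (meta / "log.md").write_text(log_md, encoding="utf-8")
```

Keeping the event file as the only source of truth means the log can always be regenerated, which is why direct edits to the log would be lost.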
Install
From npm:
pi install npm:pi-llm-wiki
From GitHub:
pi install https://github.com/Kausik-A/pi-llm-wiki
Try it without installing:
pi -e https://github.com/Kausik-A/pi-llm-wiki
Quick start
1) Create a new wiki repo or folder
mkdir my-wiki
cd my-wiki
pi
2) Bootstrap the vault
Ask pi:
Initialize an llm wiki here for AI research.
That should call wiki_bootstrap and create:
raw/
wiki/
meta/
.wiki/
WIKI_SCHEMA.md
3) Capture a source
Examples:
Capture this article into the wiki: https://example.com/some-article
Capture this PDF into the wiki: ./papers/context-windows.pdf
Capture these notes into the wiki: ...pasted text...
4) Integrate the source
A good integration flow is:
- capture the source
- read wiki/sources/SRC-*.md
- update that source page
- search for impacted canonical pages with wiki_search
- create missing pages with wiki_ensure_page
- update concept/entity/synthesis pages with citations
- mark the integration with wiki_log_event kind=integrate
5) Query the wiki
Based on the wiki, what are the main tradeoffs between long-context models and RAG?
By default, the bundled skill treats query mode as read-only.
If you want a durable answer filed back into the vault:
Answer the question and file the result as an analysis page.
Vault layout
my-wiki/
├─ raw/
│  └─ sources/
│     └─ SRC-2026-04-04-001/
│        ├─ manifest.json
│        ├─ original/
│        ├─ extracted.md
│        └─ attachments/
├─ wiki/
│  ├─ sources/
│  ├─ concepts/
│  ├─ entities/
│  ├─ syntheses/
│  └─ analyses/
├─ meta/
│  ├─ registry.json
│  ├─ backlinks.json
│  ├─ index.md
│  ├─ events.jsonl
│  ├─ log.md
│  └─ lint-report.md
├─ .wiki/
│  ├─ config.json
│  └─ templates/
└─ WIKI_SCHEMA.md
Linking and citation style
Internal navigation
Use folder-qualified wikilinks:
[[concepts/retrieval-augmented-generation]]
[[entities/openai|OpenAI]]
[[syntheses/long-context-vs-rag]]
Factual citations
Use stable source page ID links:
[[sources/SRC-2026-04-04-001|SRC-2026-04-04-001]]
This keeps provenance stable even if titles or page summaries change.
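To make the two link styles concrete, here is a small sketch that parses folder-qualified wikilinks and pulls out the source IDs cited in a passage. The `WikiLink` type and both function names are hypothetical, and the three-digit sequence in the ID pattern is an assumption read off the SRC-2026-04-04-001 example.

```python
import re
from typing import NamedTuple, Optional


class WikiLink(NamedTuple):
    folder: str
    slug: str
    label: Optional[str]


# [[folder/slug]] or [[folder/slug|Label]]
LINK = re.compile(r"\[\[([a-z]+)/([^\]|]+)(?:\|([^\]]+))?\]\]")
# Assumed shape of a source ID: SRC-YYYY-MM-DD-NNN
SOURCE_ID = re.compile(r"^SRC-\d{4}-\d{2}-\d{2}-\d{3}$")


def parse_links(text: str) -> list[WikiLink]:
    """Return every folder-qualified wikilink in order of appearance."""
    return [WikiLink(m.group(1), m.group(2), m.group(3)) for m in LINK.finditer(text)]


def cited_sources(text: str) -> list[str]:
    """Return source IDs cited via [[sources/SRC-...]] links."""
    return [
        link.slug
        for link in parse_links(text)
        if link.folder == "sources" and SOURCE_ID.match(link.slug)
    ]


text = (
    "A cited claim [[sources/SRC-2026-04-04-001|SRC-2026-04-04-001]]; "
    "see [[entities/openai|OpenAI]] and [[syntheses/long-context-vs-rag]]."
)
links = parse_links(text)
```

Since the citation target is the source ID rather than a title, renaming a source page's title would not change what `cited_sources` returns.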
Tools
| Tool | Description |
|------|-------------|
| wiki_bootstrap | Initialize the vault structure, config, templates, schema, and metadata files |
| wiki_capture_source | Capture a URL, file, or pasted text into an immutable source packet and create a source page |
| wiki_search | Search the generated wiki registry |
| wiki_ensure_page | Resolve or safely create canonical concept/entity/synthesis/analysis pages |
| wiki_lint | Run deterministic health checks over the wiki |
| wiki_status | Show counts, source states, and recent activity |
| wiki_log_event | Append structured events and regenerate meta/log.md |
| wiki_rebuild_meta | Force a full metadata rebuild |
Commands
| Command | Description |
|---------|-------------|
| /wiki-status | Show a concise operational summary |
| /wiki-lint [mode] | Run mechanical lint (all, links, orphans, frontmatter, duplicates, coverage, staleness) |
| /wiki-rebuild | Force a full metadata rebuild |
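Two of the mechanical checks named above, broken links and orphan pages, can be sketched over an in-memory set of pages. This is an illustrative approximation: the `pages` shape, the report format, and the definition of an orphan as "no inbound link from another page" are assumptions, not the behavior of wiki_lint itself.

```python
import re

# Matches [[target]] and [[target|label]]; captures only the target part.
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")


def lint_links(pages: dict[str, str]) -> dict[str, list]:
    """Report broken links and orphan pages for a dict of page path -> body."""
    broken: list[tuple[str, str]] = []
    linked: set[str] = set()
    for page, body in sorted(pages.items()):
        for target in WIKILINK.findall(body):
            target = target.strip()
            if target in pages:
                if target != page:  # a self-link does not rescue an orphan
                    linked.add(target)
            else:
                broken.append((page, target))
    orphans = sorted(p for p in pages if p not in linked)
    return {"broken": broken, "orphans": orphans}


pages = {
    "concepts/rag": "See [[entities/openai|OpenAI]] and [[concepts/missing-page]].",
    "entities/openai": "Mentioned in [[concepts/rag]].",
    "analyses/unlinked-note": "No page links here.",
}
report = lint_links(pages)
```

Both checks are deterministic functions of the page set, so running them twice on an unchanged vault should give the same report.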
Guardrails
The extension blocks direct edits to:
- raw/**
- meta/registry.json
- meta/backlinks.json
- meta/events.jsonl
- meta/index.md
- meta/log.md
- meta/lint-report.md
If the model directly edits wiki/** using Pi’s built-in write or edit tools, pi-llm-wiki automatically rebuilds generated metadata at the end of the agent turn.
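A guardrail of this kind boils down to a path check before a write is allowed. The sketch below is one possible way to express it, assuming paths are given relative to the vault root. The function name and the normalization step are illustrative and not taken from the extension.

```python
import posixpath

# Protected paths as listed above, relative to the vault root.
PROTECTED_FILES = {
    "meta/registry.json",
    "meta/backlinks.json",
    "meta/events.jsonl",
    "meta/index.md",
    "meta/log.md",
    "meta/lint-report.md",
}
PROTECTED_DIRS = ("raw",)  # corresponds to raw/**


def is_protected(path: str) -> bool:
    """Return True if a direct edit to `path` should be blocked."""
    # Normalize so "./meta/log.md" and "wiki/../meta/log.md" cannot slip through.
    normalized = posixpath.normpath(path.replace("\\", "/"))
    if normalized in PROTECTED_FILES:
        return True
    return any(
        normalized == d or normalized.startswith(d + "/") for d in PROTECTED_DIRS
    )
```

For example, `is_protected("wiki/concepts/rag.md")` is false, so such an edit would go through and only trigger the metadata rebuild described above.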
Source packet format
Each captured source is stored as a packet:
raw/sources/SRC-YYYY-MM-DD-NNN/
├─ manifest.json
├─ original/
├─ extracted.md
└─ attachments/
This lets you preserve:
- the original artifact
- normalized extracted text for reading
- capture metadata
- future attachment downloads
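The packet naming scheme can be illustrated with a short sketch that picks the next ID for a given day and lists the paths a packet would contain. It assumes that NNN is a per-day counter zero-padded to three digits, which is inferred from the SRC-YYYY-MM-DD-NNN pattern rather than stated. Both helper names are hypothetical.

```python
import re
from datetime import date

ID_PATTERN = re.compile(r"^SRC-(\d{4}-\d{2}-\d{2})-(\d{3})$")


def next_source_id(existing: list[str], today: date) -> str:
    """Return the next SRC-YYYY-MM-DD-NNN id for `today`, given existing ids."""
    day = today.isoformat()
    used = []
    for source_id in existing:
        m = ID_PATTERN.match(source_id)
        if m and m.group(1) == day:
            used.append(int(m.group(2)))
    sequence = max(used, default=0) + 1
    return f"SRC-{day}-{sequence:03d}"


def packet_paths(source_id: str) -> list[str]:
    """List the entries of a packet, following the layout shown above."""
    base = f"raw/sources/{source_id}"
    return [
        f"{base}/manifest.json",
        f"{base}/original/",
        f"{base}/extracted.md",
        f"{base}/attachments/",
    ]


existing_ids = ["SRC-2026-04-04-001", "SRC-2026-04-04-002", "SRC-2026-04-03-007"]
new_id = next_source_id(existing_ids, date(2026, 4, 4))
```

IDs built this way sort chronologically as plain strings, which is convenient for listing packets in capture order.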
Page model
Source pages
wiki/sources/SRC-*.md
These answer: what does this specific source say?
Canonical pages
- wiki/concepts/: concepts and recurring ideas
- wiki/entities/: people, orgs, products, papers, systems
- wiki/syntheses/: cross-source theses and tensions
- wiki/analyses/: durable filed answers from queries
These answer: what does the wiki currently believe?
Skill behavior
The bundled llm-wiki skill teaches Pi to:
- never edit raw sources directly
- treat generated metadata as machine-owned
- capture first, integrate second
- search before creating new canonical pages
- cite facts using source-page IDs
- keep query mode read-only by default
- use Tensions / caveats and Open questions when evidence is mixed
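The "search before creating" rule can be sketched as a resolve-or-create step against a registry of known pages. Everything here is a simplified assumption: the `Page` record, the slug rule, and the case-insensitive title/alias match are illustrative and do not describe how wiki_ensure_page is actually implemented.

```python
import re
from dataclasses import dataclass, field

CANONICAL_KINDS = {"concepts", "entities", "syntheses", "analyses"}


@dataclass
class Page:
    path: str  # e.g. "concepts/retrieval-augmented-generation"
    title: str
    aliases: list[str] = field(default_factory=list)


def slugify(title: str) -> str:
    """Lowercase the title and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def ensure_page(registry: list[Page], kind: str, title: str) -> tuple[Page, bool]:
    """Return (page, created): reuse a title/alias match, else create a new page."""
    if kind not in CANONICAL_KINDS:
        raise ValueError(f"unknown page kind: {kind}")
    wanted = title.casefold()
    for page in registry:
        if any(name.casefold() == wanted for name in [page.title, *page.aliases]):
            return page, False
    page = Page(path=f"{kind}/{slugify(title)}", title=title)
    registry.append(page)
    return page, True


registry = [
    Page(
        "concepts/retrieval-augmented-generation",
        "Retrieval-Augmented Generation",
        aliases=["RAG"],
    )
]
existing, created_existing = ensure_page(registry, "concepts", "rag")
fresh, created_fresh = ensure_page(registry, "entities", "OpenAI")
```

Resolving by alias first is what keeps "RAG" and "Retrieval-Augmented Generation" from turning into two competing canonical pages.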
Versioning and releases
This package uses Semantic Versioning and includes a release/tag flow built for repeatable publishes.
Release flow
- Add notes under ## [Unreleased] in CHANGELOG.md
- Run one of:
  npm run release:patch
  npm run release:minor
  npm run release:major
  This will:
  - verify the git working tree is clean
  - verify you are on main
  - run npm run check
  - bump the package version
  - move Unreleased notes into a dated version section in CHANGELOG.md
  - create a release commit
  - create a matching git tag like v0.1.1
- Push the release commit and tag:
  npm run release:push
- GitHub Actions publishes the tagged version to npm and creates a GitHub Release.
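For readers unfamiliar with how the three release scripts differ, the version bump follows the usual Semantic Versioning rule: bumping a part resets every part to its right. The sketch below shows that rule for plain MAJOR.MINOR.PATCH versions only. It is not the release script itself, and the function names are hypothetical.

```python
def bump(version: str, part: str) -> str:
    """Bump a plain MAJOR.MINOR.PATCH version; pre-release tags are not handled."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part}")


def tag_for(version: str) -> str:
    """Git tag matching a version, in the v-prefixed style used above."""
    return f"v{version}"


next_patch_tag = tag_for(bump("0.1.0", "patch"))
```

Starting from 0.1.0, a patch release yields the v0.1.1 tag mentioned in the list above.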
GitHub Actions
This repo includes:
- CI on push and pull request: runs npm ci, npm run check, and npm pack --dry-run
- Release on v* tags: runs checks, publishes to npm, and creates a GitHub release with generated notes
For a fuller walkthrough, see RELEASING.md.
First-time npm publishing checklist
Before the first automated release, do this once:
- Ensure the npm package name is available:
  npm view pi-llm-wiki version
- Log in locally if you want to do a manual first publish:
  npm login
- Create an npm automation token in npm:
  - npmjs.com → Account Settings → Access Tokens
  - Create a token with publish permission for pi-llm-wiki
- Add the token as a GitHub repository secret:
  gh secret set NPM_TOKEN --repo Kausik-A/pi-llm-wiki
- Optionally do the first release manually:
  npm run check
  npm publish --access public
- After that, use the release/tag flow for future versions.
Required repository secret
To enable npm publishing from GitHub Actions, add this repository secret:
- NPM_TOKEN: an npm access token with publish permissions for pi-llm-wiki
Manual fallback publish
If you ever need to publish manually:
npm run check
npm publish --access public
Then users can update with:
pi update
Local development
Install locally for testing:
pi install ./pi-llm-wiki
Load only the extension for one-off testing:
pi -e ./pi-llm-wiki/extensions/llm-wiki/index.ts
Sanity-check the package:
cd pi-llm-wiki
npm run check
Notes
- For PDFs and some binary formats, the extension tries uvx --from 'markitdown[pdf]' markitdown ... when available.
- If markitdown is unavailable, capture falls back to simpler text or placeholder extraction.
- v1 intentionally avoids embeddings and vector databases; the wiki itself is the main retrieval layer.
License
MIT
