ragit
v1.1.0
RAG for git-based AI agent workflows
RAGit
RAGit is a zvec + git bound RAG CLI that runs inside your project repository.
It collects, analyzes, and retrieves documents produced during AI agent workflows, then version-controls snapshots bound to commit SHAs.
Product Purpose
RAGit is a local-first RAG CLI that turns AI agent project documents and context into commit-bound, reusable knowledge inside the repository.
RAGit is not a giant transcript archive. It is an agent-first collaboration memory system that preserves the smallest reusable state needed to resume work at a given commit: goal, constraints, stable decisions, open loops, and next actions. By separating active working memory from durable searchable memory, it helps the next agent recover momentum without replaying the entire past.
Runtime Structure
The runtime structure below shows how ragit connects the CLI, command layer, core services, git-bound snapshots, and local storage.
┌────────────┐
│User / Agent│
├────────────┤
└────────────┘
|
|
┌─────────┐
│ragit CLI│
├─────────┤
└─────────┘
|
┌────────────────────────────┐
│Command Layer │
├────────────────────────────┤
│init │
│ingest │
│query │
│context pack │
│memory │
│session / artifact / harness│
└────────────────────────────┘
|
┌───────────────────┐
│Core Services │
┌─────────────────┐ ├───────────────────┤
│Git commit / HEAD│ │doc authority │
├─────────────────┤ │manifest │
│snapshot binding │ │retrieval │
└─────────────────┘ │memory │
| │artifacts / harness│
| └───────────────────┘
|
┌────────────────────┐ ┌─────────────┐
│.ragit control plane│ ┌────────────┐ │Outputs │
├────────────────────┤ ┌───────┐ │.ragit/store│ ├─────────────┤
│config │ │docs/**│ ├────────────┤ │query hits │
│manifest │ ├───────┤ │documents │ │context pack │
│memory │ └───────┘ │chunks │ │recall packet│
│artifacts │ └────────────┘ └─────────────┘
└────────────────────┘
- `ragit CLI` is the single entrypoint. Every user or agent workflow starts by dispatching a command through the command layer.
- `Git commit / HEAD` binds manifest selection, so retrieval and recall stay reproducible at a specific repository state.
- `.ragit control plane` stores configuration and tracked knowledge state, while `.ragit/store` holds the local vector index for `documents` and `chunks`.
- User-facing outputs are produced from the same runtime core: `query hits`, `context pack`, and `recall packet`.
Git vs RAGit
Git version-controls source code states. RAGit version-controls AI-working knowledge states bound to the same commit history.
sequenceDiagram
participant Developer
participant Git
participant Repository
participant RAGit
participant Store as ".ragit Store"
participant Agent
Developer->>Git: stage and commit code/docs
Git->>Repository: write commit snapshot
Note over Git,Repository: Git manages code and file history
Git-->>RAGit: trigger post-commit / post-merge hook
RAGit->>Repository: detect changed documents since SHA
RAGit->>Store: chunk, index, and write manifest bound to commit SHA
Note over RAGit,Store: RAGit manages document knowledge and agent context history
Agent->>RAGit: query or context pack at HEAD / specific SHA
RAGit->>Store: load snapshot + retrieval data
RAGit-->>Agent: return commit-bound knowledge/context
- Git answers: "What did the repository look like at this commit?"
- RAGit answers: "What knowledge and context should an agent use at this commit?"
- Together they make code state and AI context state reproducible.
Core Value
- Preserve project context across AI agent work
- Reproduce knowledge at a specific commit state
- Turn structured docs into agent-ready inputs
- Automate indexing without adding workflow friction
Security Model
RAGit protects knowledge state, not just files.
- Write paths sanitize before persistence, so transcripts, memory state, artifacts, harness runs, and durable docs do not keep raw-looking secrets by default.
- Admission control runs before persistence on knowledge-writing paths. In `security.admission_mode=enforce`, high-risk payloads are blocked or replaced with a sentinel before they can become persisted knowledge state; legacy repos without this key fall back to `report-only`.
- Retrieval-facing commands re-mask again before printing or JSON projection, so `query`, `context pack`, `memory recall`, `log`, `timeline`, and `harness pack` do not echo raw secret material back to the user.
- Remote embedding egress is policy-controlled. `security.remote_embedding_policy=allow-sanitized` allows only sanitized query text and durable-doc ingest text to leave the repository; `local-only` blocks remote egress entirely.
- `ragit security audit` inspects control-plane/store/docs/provider posture and admission findings, while `ragit security purge` sanitizes or clears local state without rewriting repo-tracked documents.
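As a rough illustration of the two policy keys above, the TypeScript sketch below resolves the admission mode with the documented `report-only` fallback and gates remote embedding egress. The names `SecurityConfig`, `resolveAdmissionMode`, and `mayEgress` are hypothetical and are not taken from RAGit's source.

```typescript
type AdmissionMode = "enforce" | "report-only";
type RemoteEmbeddingPolicy = "allow-sanitized" | "local-only";

// Hypothetical view of the `security.*` keys; the real config schema may differ.
interface SecurityConfig {
  admission_mode?: AdmissionMode;
  remote_embedding_policy?: RemoteEmbeddingPolicy;
}

// Legacy repos without `security.admission_mode` fall back to report-only.
function resolveAdmissionMode(sec: SecurityConfig): AdmissionMode {
  return sec.admission_mode ?? "report-only";
}

// local-only blocks all remote egress; allow-sanitized lets only sanitized text leave.
function mayEgress(policy: RemoteEmbeddingPolicy, textIsSanitized: boolean): boolean {
  if (policy === "local-only") return false;
  return textIsSanitized;
}
```

The policy is passed explicitly to `mayEgress` because this README does not state a default for `security.remote_embedding_policy`.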
MVP Document Types (v0.1)
- Architecture Decision (ADR): durable decision record with rationale and consequences
- Product Requirement (PRD): product problem, users, goals, and success criteria
- Software Requirements (SRS): system-level functional and non-functional requirements
- Implementation Specification (SPEC): implementation-level functional requirements and interface contracts
- Plan: execution sequencing, milestones, and work breakdown
- Domain-Driven Design (DDD): bounded contexts, aggregates, and domain structure
- Glossary: shared vocabulary for stable project terms
- Phase and Binding Documents (PBD): phase and binding topology for understanding implementation structure and coupling
SAD/HLD/LLD Compatibility Layer
RAGit does not add SAD, HLD, or LLD as new canonical document types.
Instead, it treats them as external architecture views layered on top of the existing document system.
- `SAD`: repository or system-wide architecture explanation, usually read across architecture overviews plus related `ADR` documents
- `HLD`: higher-level module boundaries, data flow, and topology, usually expressed with `SRS`, `DDD`, and `PBD`
- `LLD`: implementation-unit contracts, interfaces, and state details, usually expressed with `SPEC`
When authors want to make that view explicit, they can add an optional frontmatter hint:
---
type: spec
architecture_view: lld
---
`architecture_view` is advisory only.
RAGit still classifies, validates, ingests, and retrieves documents by canonical type.
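To make the "advisory only" rule concrete, here is a minimal TypeScript sketch in which classification depends solely on `type` and the view hint is just carried along. The lowercase type identifiers follow the `type: spec` example above but are otherwise assumed, and `classifyDoc` and `TYPICAL_TYPES` are hypothetical names.

```typescript
// Canonical type identifiers, assumed lowercase as in `type: spec`.
type CanonicalType = "adr" | "prd" | "srs" | "spec" | "plan" | "ddd" | "glossary" | "pbd";
type ArchitectureView = "sad" | "hld" | "lld";

// Typical pairings from the list above; advisory, never enforced.
const TYPICAL_TYPES: Record<ArchitectureView, CanonicalType[]> = {
  sad: ["adr"],
  hld: ["srs", "ddd", "pbd"],
  lld: ["spec"],
};

interface Frontmatter {
  type: CanonicalType;
  architecture_view?: ArchitectureView;
}

interface Classified {
  canonical: CanonicalType;
  viewHint: ArchitectureView | null;
  typicalPairing: boolean | null; // null when no hint is present
}

// The canonical type alone decides classification; the hint never overrides it.
function classifyDoc(fm: Frontmatter): Classified {
  const view = fm.architecture_view;
  if (view === undefined) {
    return { canonical: fm.type, viewHint: null, typicalPairing: null };
  }
  const typical = TYPICAL_TYPES[view].indexOf(fm.type) !== -1;
  return { canonical: fm.type, viewHint: view, typicalPairing: typical };
}
```

A `plan` document tagged `architecture_view: hld` is still classified as `plan`; the sketch only reports that the pairing is unusual.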
Installation
Requirements:
- Node.js `20.19.0` or newer
- pnpm `10.13.1` or newer
For repository-local development:
pnpm install
pnpm ragit --help
Inside this repository checkout, run CLI commands with `pnpm ragit <command>`.
For the published CLI:
npm install -g ragit
pnpm add -g ragit
bun add -g ragit
npx ragit --help
When the package is installed globally, use `ragit <command>`.
pnpm build is optional for repository-local usage.
Run it only when you need to generate dist/ artifacts or verify the packaged CLI entrypoint.
pnpm build
Documentation (Fumadocs + GitHub Pages)
- Primary URL (English): https://rhiokim.github.io/ragit/en/
- Korean URL: https://rhiokim.github.io/ragit/ko/
- English is the source of truth, and Korean is provided in the same structure.
Run locally:
pnpm docs:dev
Build static output and preview:
pnpm docs:check:i18n
pnpm docs:build
pnpm docs:serve
Deployment:
- GitHub Actions deploys automatically to `gh-pages` when `main` is pushed.
- For manual redeploy, run `docs-gh-pages` via `workflow_dispatch`.
- In Repository Settings > Pages, set Source to `GitHub Actions`.
Package Publishing
- `publish.yml` validates tags against `package.json.version` and publishes only on `vX.Y.Z` tag pushes.
- `workflow_dispatch` runs the same release checks without publishing, so you can rehearse the pipeline before the first release.
- Before enabling automatic publish, configure npm Trusted Publishing for `rhiokim/ragit` and the GitHub Actions workflow.
Release validation flow:
pnpm release:check
VERSION=$(node -p 'require("./package.json").version')
git tag "v${VERSION}"
git push origin --tags
Core Commands
pnpm ragit describe query --format json
pnpm ragit init
pnpm ragit init --yes --output json
pnpm ragit init --yes --git-init
pnpm ragit log --max-count 5 --view default --format both
pnpm ragit narrative --format both
pnpm ragit narrative --emit-model .ragit/reports/narrative/current.model.json
pnpm ragit drift --scope all --view default --format both
pnpm ragit repair --scope all --format json
pnpm ragit security audit --format json
pnpm ragit security purge --target control-plane --dry-run --format json
pnpm ragit config set retrieval.top_k 8
pnpm ragit hooks install --dry-run --format json
pnpm ragit ingest --all --dry-run --format json
pnpm ragit query "DDD bounded context principles" --view minimal --format both
pnpm ragit context pack "Implementation plan for this sprint" --budget 1200 --view minimal --format both
pnpm ragit memory wrap --input session-wrap.json --dry-run --format json
pnpm ragit memory recall "resume auth flow" --view minimal --format both
pnpm ragit memory promote --input promotion-batch.json --dry-run --format json
pnpm ragit migrate from-json-store --dry-run
pnpm ragit migrate from-sqlitevss --dry-run
pnpm ragit status --format json
pnpm ragit doctor --format json
`ragit narrative` usage boundary:
- Canonical artifact: self-contained HTML report
- Optional viewer input: viewer-safe model export at `.ragit/reports/narrative/current.model.json`
- Experimental local explorer: isolated OpenTUI viewer under `tools/narrative-tui/`
- Freshness axis: the narrative model and report also carry `fresh|suspect|stale` freshness state derived from `ragit drift`; this is a separate axis from `trust`, `sensitivity`, and `lineage`
- Validation axis: the narrative model and report also carry `verified|attention|unverified` validation posture derived from harness/drift assets; this is a separate axis from `freshness`, `trust`, `sensitivity`, and `lineage`
- User-facing IA: the HTML report includes a dedicated Validation Panel, and the OpenTUI right rail is `Intent | Validation | Timeline`
pnpm ragit narrative --emit-model .ragit/reports/narrative/current.model.json
cd tools/narrative-tui
bun run start -- --model ../../.ragit/reports/narrative/current.model.json
How Ingest Works
The flow below shows how ragit ingest turns repository documents and bound artifacts into a searchable snapshot.
┌─┐
║"│
└┬┘
┌┼┐ ┌─────────┐
│ ┌─────┐ ┌──────┐ ┌────┐ │Session /│ ┌───────┐
┌┴┐ │ragit│ │run │ │Repo│ │Harness │ │.ragit/│ ┌────────┐
User / │CLI │ │Ingest│ │docs│ │artifacts│ │store │ │Manifest│
Agent └──┬──┘ └──┬───┘ └─┬──┘ └────┬────┘ └───┬───┘ └───┬────┘
│ ragit ingest ... │ │ │ │ │ │
│ ───────────────────────────> │ │ │ │ │
│ │ │ │ │ │ │
│ │ parse mode │ │ │ │ │
│ │ + source options │ │ │ │ │
│ │ ──────────────────────> │ │ │ │
│ │ │ │ │ │ │
│ │ │────┐ │ │ │ │
│ │ │ │ ensure .ragit │ │ │ │
│ │ │<───┘ load config │ │ │ │
│ │ │ check HEAD │ │ │ │
│ │ │ │ │ │ │
│ │ │ │ │ │ │
│ │ │ resolve candidates │ │ │ │
│ │ │ ──────────────────────────> │ │ │
│ │ │ │ │ │ │
│ │ │ │ │ │ │
│ │ ╔═══════╤════╪═══════════════════════════╪════════════╗ │ │ │
│ │ ║ LOOP │ each supported doc │ ║ │ │ │
│ │ ╟───────┘ │ │ ║ │ │ │
│ │ ║ │ hash -> mask -> detect │ ║ │ │ │
│ │ ║ │ validate -> chunk -> embed│ ║ │ │ │
│ │ ║ │ ──────────────────────────> ║ │ │ │
│ │ ╚════════════╪═══════════════════════════╪════════════╝ │ │ │
│ │ │ │ │ │ │
│ │ │ │ │ │ │
│ ╔══════╤═════╪═══════════════════════╪═══════════════════════════╪══════════════════╪═══════════════════╪══════════════════╪═════════════╗
│ ║ ALT │ --dry-run │ │ │ │ │ ║
│ ╟──────┘ │ │ │ │ │ │ ║
│ ║ │ return planned summary│ │ │ │ │ ║
│ ║ │ <─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ │ │ ║
│ ╠════════════╪═══════════════════════╪═══════════════════════════╪══════════════════╪═══════════════════╪══════════════════╪═════════════╣
│ ║ [apply] │ │ │ │ │ │ ║
│ ║ │ │ bind pending artifacts │ │ │ ║
│ ║ │ │ + build artifact chunks │ │ │ ║
│ ║ │ │ ────────────────────────────────────────────>│ │ │ ║
│ ║ │ │ │ │ │ │ ║
│ ║ │ │ write docs + chunks │ │ │ ║
│ ║ │ │ ────────────────────────────────────────────────────────────────>│ │ ║
│ ║ │ │ │ │ │ │ ║
│ ║ │ │ │ build + write snapshot │ │ ║
│ ║ │ │ ────────────────────────────────────────────────────────────────────────────────────> ║
│ ║ │ │ │ │ │ │ ║
│ ║ │ │ ingest summary │ │ │ ║
│ ║ │ <─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ║
│ ╚════════════╪═══════════════════════╪═══════════════════════════╪══════════════════╪═══════════════════╪══════════════════╪═════════════╝
│ │ │ │ │ │ │
│ searchable snapshot summary│ │ │ │ │ │
│ <─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ │ │ │
│ │ │ │ │ │ │
│ │ │ │ │ │ │
- Candidate resolution changes by mode: explicit `--path`, glob-style `--files`, incremental `--since`, or the default full-snapshot scan.
- `--dry-run` stops before writing `.ragit/store` or a new manifest and only returns the planned ingest summary.
- The apply path is where pending artifact binding, artifact chunk construction, and store/manifest writes actually happen.
- The final searchable truth comes from the manifest snapshot, not from raw files or chunks alone.
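The notes above can be sketched in TypeScript as two small decisions: which candidate mode applies, and whether a run writes anything. This is only an illustration. The option names mirror the flags mentioned above, but the precedence among `--path`, `--files`, and `--since` is an assumption of this sketch, and `resolveCandidateMode` and `plannedWrites` are hypothetical helpers.

```typescript
type CandidateMode = "path" | "files" | "since" | "full";

interface IngestOptions {
  path?: string;    // --path
  files?: string;   // --files (glob-style)
  since?: string;   // --since
  dryRun?: boolean; // --dry-run
}

// ASSUMPTION: flag precedence is path > files > since; the README does not specify it.
function resolveCandidateMode(o: IngestOptions): CandidateMode {
  if (o.path !== undefined) return "path";
  if (o.files !== undefined) return "files";
  if (o.since !== undefined) return "since";
  return "full"; // default full-snapshot scan
}

// A dry run only plans; the apply path binds artifacts and writes store and manifest.
function plannedWrites(o: IngestOptions): string[] {
  return o.dryRun ? [] : ["artifact bindings", ".ragit/store", "manifest snapshot"];
}
```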
Storage Layout
.ragit/
config.toml
guide/guide-index.json
guide/templates/
manifest/<commit-sha>.json
memory/sessions/
memory/working/
store/meta.json
store/documents/
store/chunks/
cache/
hooks/
docs/
memory/
decisions/
glossary/
plans/
- Recommended for Git tracking: `.ragit/config.toml`, `.ragit/manifest/**`, `.ragit/memory/**`, `docs/memory/**`
- Local-only (default `.gitignore`): `.ragit/store/**`, `.ragit/cache/**`
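For tooling that needs to reason about this layout, a minimal TypeScript sketch might look like the following. It derives a manifest path from a commit SHA and flags local-only paths, using only the locations listed above; `manifestPath` and `isLocalOnly` are hypothetical helper names.

```typescript
// Local-only prefixes, taken from the default .gitignore entries listed above.
const LOCAL_ONLY_PREFIXES = [".ragit/store/", ".ragit/cache/"];

// Manifest snapshots are stored per commit as .ragit/manifest/<commit-sha>.json.
function manifestPath(commitSha: string): string {
  return `.ragit/manifest/${commitSha}.json`;
}

// True when a repo-relative path falls under a local-only directory.
function isLocalOnly(repoRelativePath: string): boolean {
  for (const prefix of LOCAL_ONLY_PREFIXES) {
    if (repoRelativePath.indexOf(prefix) === 0) return true;
  }
  return false;
}
```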
Memory OS MVP
- `memory wrap`: save a session summary into `.ragit/memory/sessions/` and refresh working state in `.ragit/memory/working/`
- `memory recall`: combine working state and snapshot-scoped retrieval into an agent-ready recall packet
- `memory promote`: crystallize promotion candidates into searchable long-term docs under `docs/memory/**` and ingest them immediately when `HEAD` exists
This split is intentional:
- `.ragit/memory/**` is the tracked control plane for working state and session history
- `docs/memory/**` is the searchable long-term memory corpus that participates in normal ingest/query flows
Agent CLI Contract
- Prefer `--format json` for machine consumers.
- Use `ragit describe <command> --format json` before integrating a command for the first time.
- Prefer `--view minimal` for `query`, `context pack`, and `memory recall`.
- Prefer `--input <path|->` for structured agent payloads.
- Run mutating commands with `--dry-run` first: `ingest`, `hooks install`, `hooks uninstall`, `memory wrap`, `memory promote`.
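An agent wrapper could encode these preferences when it assembles arguments. The TypeScript sketch below is one possible shape, not part of RAGit: `agentArgs` is a hypothetical helper, and it only uses flags that appear in this README.

```typescript
// Commands the contract lists as mutating; they get --dry-run unless apply is requested.
const MUTATING_COMMANDS = ["ingest", "hooks install", "hooks uninstall", "memory wrap", "memory promote"];
// Commands for which the contract prefers --view minimal.
const MINIMAL_VIEW_COMMANDS = ["query", "context pack", "memory recall"];

// Builds the argument list that would follow `ragit` (or `pnpm ragit`).
function agentArgs(command: string, rest: string[] = [], apply: boolean = false): string[] {
  const args = command.split(" ").concat(rest, ["--format", "json"]);
  if (MINIMAL_VIEW_COMMANDS.indexOf(command) !== -1) args.push("--view", "minimal");
  if (MUTATING_COMMANDS.indexOf(command) !== -1 && !apply) args.push("--dry-run");
  return args;
}
```

For example, `agentArgs("ingest", ["--all"])` yields the arguments for a dry run, and passing `true` as the third argument drops `--dry-run` once the plan has been reviewed.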
Canonical Agent Skill
- Repository-managed source: `skills/use-ragit`
- Codex install target: `${CODEX_HOME:-$HOME/.codex}/skills/use-ragit` via copy or symlink
- Shared agent-neutral references for Claude and Gemini: `skills/use-ragit/references/`
Discover-First init
pnpm ragit init is now a discover-first bootstrap command.
It still prepares .ragit/**, AGENTS.md, guide assets, and the local zvec store, but it does that only after it inspects the repository and decides what knowledge already exists.
Default flow:
- Check Git environment (and optionally run `git init`)
- Scan repository code/docs/build files
- Select `empty`, `existing`, `docs-heavy`, or `monorepo`
- Compute documentation coverage, maturity, and knowledge-slot mapping
- Reuse existing repository docs first and plan missing foundational docs
- Write stage-1 draft docs plus `.ragit/**`
- Bootstrap the zvec canonical store
- Print the final summary and next actions
What init prepares:
- Git-aware repository normalization
- Existing-doc discovery and coverage evaluation
- Stage-1 foundational drafts when missing:
  - `RAGIT.md`
  - `docs/workspace-map.md`
  - `docs/ragit/ingestion-policy.md`
  - `docs/known-gaps.md`
  - `docs/adr/README.md`
- `.ragit/config.toml`, `.ragit/guide/templates/*`, and `.ragit/guide/guide-index.json`
- Empty zvec collections under `.ragit/store/`
- Next-action guidance for `hooks install` and `ingest`
What init does not prepare:
- No searchable corpus, chunk records, or manifests
- No zvec document/chunk upsert
- No query-ready knowledge state during `init`
In other words, init makes the repository diagnosed, foundation-ready, and zvec-store-ready, not search-ready.
storage.backend = "zvec" still means the canonical backend, and searchable knowledge still begins only after pnpm ragit ingest ... runs.
Supported options:
pnpm ragit init --mode auto --strategy balanced --merge-existing
pnpm ragit init --yes # non-interactive with defaults
pnpm ragit init --non-interactive # alias of --yes
pnpm ragit init --git-init # allow git init in non-interactive mode
pnpm ragit init --dry-run --output json
pnpm ragit init --output json # JSON summary output
- `--cwd` may point to the repository root or any nested path inside the worktree; `init` normalizes to the Git root before writing `.ragit` or `AGENTS.md`.
- `--mode` overrides repository-mode detection.
- `--strategy` controls how aggressively stage-1 draft docs are generated.
- `--dry-run` computes the full analysis report without writing files or bootstrapping storage.
- zvec bootstrap currently supports `darwin/arm64`, `linux/arm64`, and `linux/x64`.
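A script that wraps `init` may want to check the platform list before attempting the zvec bootstrap. The TypeScript sketch below is an assumption about how such a check could be written; `zvecSupported` is a hypothetical helper, and the platform and architecture strings follow Node.js naming.

```typescript
// Targets listed above for zvec bootstrap.
const ZVEC_TARGETS = ["darwin/arm64", "linux/arm64", "linux/x64"];

// Accepts Node.js-style values, for example process.platform and process.arch.
function zvecSupported(platform: string, arch: string): boolean {
  return ZVEC_TARGETS.indexOf(`${platform}/${arch}`) !== -1;
}
```

In a Node.js script this would typically be called as `zvecSupported(process.platform, process.arch)`.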
Recommended flow after init:
pnpm ragit migrate from-json-store # only if summary says migrationRequired=true
pnpm ragit hooks install
pnpm ragit ingest --all
Hook Strategy
- `post-commit`: automatically indexes changes from `HEAD~1..HEAD`
- `post-merge`: automatically indexes changes from `${ORIG_HEAD:-HEAD~1}..HEAD`
- Failures are warning-only and do not block commit/merge flows.
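The two revision ranges above can be expressed as a small TypeScript function. This is a sketch of the range selection only, not the installed hook script; `hookRange` is a hypothetical name, and `origHead` stands in for the `ORIG_HEAD` value visible to the hook.

```typescript
type HookName = "post-commit" | "post-merge";

// Returns the revision range a hook would index.
function hookRange(hook: HookName, origHead?: string): string {
  if (hook === "post-commit") return "HEAD~1..HEAD";
  // Mirrors the shell expansion ${ORIG_HEAD:-HEAD~1}: fall back when unset or empty.
  const base = origHead !== undefined && origHead.length > 0 ? origHead : "HEAD~1";
  return `${base}..HEAD`;
}
```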
Retrieval Strategy
- 1st pass: zvec vector search scoped to the snapshot manifest
- 2nd pass: keyword score
- Final score: `alpha * vector + (1-alpha) * keyword` (default `alpha=0.7`)
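The final-score formula can be illustrated with a short TypeScript sketch. The `Candidate` shape and function names are hypothetical, and the sketch assumes both scores are already normalized to a comparable range, which this README does not state.

```typescript
// Hypothetical candidate shape; field names are assumptions for illustration.
interface Candidate {
  chunkId: string;
  vector: number;  // vector similarity from the first pass
  keyword: number; // keyword score from the second pass
}

// Final score as given above: alpha * vector + (1 - alpha) * keyword.
function hybridScore(c: Candidate, alpha: number = 0.7): number {
  return alpha * c.vector + (1 - alpha) * c.keyword;
}

// Returns a new array sorted by descending final score; the input is not mutated.
function rankCandidates(cs: Candidate[], alpha: number = 0.7): Candidate[] {
  return cs.slice().sort((a, b) => hybridScore(b, alpha) - hybridScore(a, alpha));
}
```

With the default `alpha`, a chunk with a strong vector match can outrank one with a perfect keyword match; lowering `alpha` shifts weight toward keyword evidence.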
Security Defaults
- Secret masking is enabled by default during ingestion (`security.secret_masking=true`)
- OpenAI/GitHub/AWS keys and `api_key`/`token`/`secret` patterns are masked.
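To give a feel for what pattern-based masking involves, here is a minimal TypeScript sketch. The regular expressions and the `[MASKED]` sentinel are illustrative assumptions based on commonly seen key formats; they are not RAGit's actual rules, and `maskSecrets` is a hypothetical name.

```typescript
// Illustrative patterns only; RAGit's real rule set is not shown in this README.
const SECRET_PATTERNS: RegExp[] = [
  /sk-[A-Za-z0-9_-]{20,}/g,      // OpenAI-style key (assumed shape)
  /gh[pousr]_[A-Za-z0-9]{36,}/g, // GitHub token prefixes such as ghp_ and gho_
  /AKIA[0-9A-Z]{16}/g,           // AWS access key ID
];

// Matches assignments such as api_key=..., token: "...", secret = '...'.
const ASSIGNMENT = /\b(api_key|token|secret)(\s*[:=]\s*)(["']?)[^\s"']+\3/gi;

// Replaces secret-looking values with a sentinel, keeping the key name readable.
function maskSecrets(text: string, sentinel: string = "[MASKED]"): string {
  // Assignments first, so the value is replaced as a whole (surrounding quotes are dropped).
  let out = text.replace(ASSIGNMENT, (_m: string, key: string, sep: string) => `${key}${sep}${sentinel}`);
  for (const pattern of SECRET_PATTERNS) {
    out = out.replace(pattern, sentinel);
  }
  return out;
}
```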
License
RAGit is licensed under Apache-2.0. The root LICENSE file is the single source of truth for license terms across this repository.
Test
pnpm test