@jchaffin/gh-rag
v0.3.7
Hybrid RAG over GitHub repos: BM25 + Pinecone + cited answers.
import { createGhRag } from "@jchaffin/gh-rag";
const rag = await createGhRag({
openaiApiKey: process.env.OPENAI_API_KEY!,
pinecone: { apiKey: process.env.PINECONE_API_KEY!, index: "repo-chunks" },
});
await rag.ingest({ gitUrl: "https://github.com/owner/repo.git" });
const { text } = await rag.answer({
repo: "repo",
question: "Tell me about the payments project"
});
console.log(text);

Realtime ask (low-latency retrieval)
For voice or streaming clients, pull context snippets fast without generating a full answer:
const rag = await createGhRag({
openaiApiKey: process.env.OPENAI_API_KEY!,
pine: { index: /* Pinecone index handle */ } as any,
});
const snippets = await rag.ask({ repo: "repo", query: "auth flow", limit: 6 });
// Each snippet: { path, start, end, text }

Server endpoint (Fastify):
- Start: `npm run build && npm run start`
- POST http://localhost:3000/ask with JSON `{ "repo": "repo", "query": "auth flow", "limit": 6 }`
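A minimal sketch of calling the endpoint above from a TypeScript client, assuming Node 18+ (global `fetch`) and a server listening on the default port shown. `buildAskRequest` and `askServer` are hypothetical helper names, not part of the package, and the response shape is left untyped because this README does not document it.

```typescript
// Hypothetical client helpers for the POST /ask endpoint described above.
// Only the request body fields (repo, query, limit) come from the README.

interface AskBody {
  repo: string;
  query: string;
  limit?: number;
}

interface AskRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build the URL and fetch options separately so they are easy to inspect.
function buildAskRequest(body: AskBody, baseUrl: string = "http://localhost:3000"): AskRequest {
  return {
    url: `${baseUrl}/ask`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// Send the request and return the parsed JSON as-is.
// Assumption: the server replies with JSON; its exact shape is not specified here.
async function askServer(body: AskBody, baseUrl?: string): Promise<unknown> {
  const { url, init } = buildAskRequest(body, baseUrl);
  const res = await fetch(url, init);
  if (!res.ok) {
    throw new Error(`POST ${url} failed with status ${res.status}`);
  }
  return res.json();
}
```

Usage would look like `await askServer({ repo: "repo", query: "auth flow", limit: 6 })`.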
Notes:
- Set `OPENAI_EMBED_MODEL` to match your ingested index (e.g., `text-embedding-3-small` for speed). Ingestion also respects this.
- In-memory caching smooths identical queries for ~10s; embeddings cache for ~60s.
- Local BM25 index is optional. By default, ingest does not write any local files. To enable BM25 text ranking (used by `ask` when available), either pass `writeBm25: true` to `ingestRepo`/`rag.ingest`, or set `GH_RAG_WRITE_BM25=1` and provide a `workdir` if you don't want `.`.
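To illustrate the short-lived caching mentioned in the notes, here is a rough sketch of a time-to-live (TTL) cache. It is not taken from the gh-rag source; `TtlCache` and the two cache instances are hypothetical names, and the lifetimes simply mirror the approximate ~10s and ~60s figures above.

```typescript
// Illustrative TTL cache; an assumption about how such caching could work,
// not the library's actual implementation.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so expiry can be exercised without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (this.now() >= hit.expiresAt) {
      this.entries.delete(key); // expired: drop the entry and report a miss
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Two caches with lifetimes matching the notes above.
const queryCache = new TtlCache<string[]>(10_000);     // identical queries, ~10s
const embeddingCache = new TtlCache<number[]>(60_000); // embeddings, ~60s
```

A caller would check `queryCache.get(key)` before running retrieval and call `queryCache.set(key, result)` afterwards, so a repeated query within the window skips the round trip.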
CLI
Ask questions from the command line after ingesting a repo into Pinecone.
- Env: set `OPENAI_API_KEY`, `PINECONE_API_KEY`, optional `PINECONE_INDEX` (default `repo-chunks`), optional `GITHUB_TOKEN`.
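If you wrap the CLI in your own script, a small check like the following can fail early when a required variable is missing. This is a sketch under the assumptions listed above (two required keys, `PINECONE_INDEX` defaulting to `repo-chunks`, optional `GITHUB_TOKEN`); `readCliEnv` and `CliEnv` are hypothetical names, not part of the package.

```typescript
// Hypothetical helper that validates the environment variables listed above.
interface CliEnv {
  openaiApiKey: string;
  pineconeApiKey: string;
  pineconeIndex: string;
  githubToken?: string;
}

function readCliEnv(env: Record<string, string | undefined>): CliEnv {
  // Collect every missing required variable so the error names all of them.
  const required = ["OPENAI_API_KEY", "PINECONE_API_KEY"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required env: ${missing.join(", ")}`);
  }
  return {
    openaiApiKey: env["OPENAI_API_KEY"] as string,
    pineconeApiKey: env["PINECONE_API_KEY"] as string,
    // Fall back to the documented default index name.
    pineconeIndex: env["PINECONE_INDEX"] || "repo-chunks",
    githubToken: env["GITHUB_TOKEN"] || undefined,
  };
}
```

In a Node script this would typically be called as `readCliEnv(process.env)`.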
Examples:
# Build once
npm run build
# Ask (uses env REPO and QUESTION if set)
npm run ask -- --repo my-repo --question "What does the auth flow look like?"
# With JSON output
npm run ask -- -r my-repo -q "Key modules?" --json
# If installed globally (after publish or npm link)
gh-rag-ask -r my-repo "How do I run this?"
# Ingest a repo (GitHub URL or local path)
npm run ask -- --repo-url https://github.com/owner/repo.git --repo repo
# Ingest then immediately ask in one command
npm run ask -- --repo-url https://github.com/owner/repo.git -q "What are the core services?"
# Ingest ALL your GitHub repos (requires GITHUB_TOKEN)
# Default filters: excludes forks and archived repos
npm run ingest:all -- --affiliation owner --visibility all --concurrency 2
# Or if installed globally
gh-rag-ingest-all --affiliation owner --visibility all
# Flags:
# --include-forks Include forked repos
# --include-archived Include archived repos
# --dry-run List what would be ingested
# --index <name> Override Pinecone index