@jchaffin/gh-rag

v0.3.9

Published

14 days ago

Hybrid RAG over GitHub repos: BM25 + Pinecone + cited answers.

Vulnerabilities: 0 high, 0 medium, 0 low

import { createGhRag } from "@jchaffin/gh-rag";

const rag = await createGhRag({
  openaiApiKey: process.env.OPENAI_API_KEY!,
  pinecone: { apiKey: process.env.PINECONE_API_KEY!, index: "repo-chunks" },
});

await rag.ingest({ gitUrl: "https://github.com/owner/repo.git" });

const { text } = await rag.answer({
  repo: "repo",
  question: "Tell me about the payments project"
});
console.log(text);

Realtime ask (low-latency retrieval)

For voice or streaming clients, pull context snippets fast without generating a full answer:

const rag = await createGhRag({
  openaiApiKey: process.env.OPENAI_API_KEY!,
  pine: { index: /* Pinecone index handle */ } as any,
});

const snippets = await rag.ask({ repo: "repo", query: "auth flow", limit: 6 });
// Each snippet: { path, start, end, text }
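For a voice or streaming client, the snippets usually need to be turned into a compact, citable context string before they go to a model. The sketch below shows one way to do that. The Snippet interface is inferred from the comment above and may not match the package's real type; formatContext and its maxChars budget are hypothetical names, and start/end are treated as opaque positions because the README does not say whether they are lines or character offsets.

```typescript
// Assumed shape, copied from the "{ path, start, end, text }" comment above.
interface Snippet {
  path: string;
  start: number;
  end: number;
  text: string;
}

// Hypothetical helper (not part of gh-rag): render snippets as numbered,
// citable blocks and stop once the character budget would be exceeded.
function formatContext(snippets: Snippet[], maxChars: number = 4000): string {
  const blocks: string[] = [];
  let used = 0;
  for (let i = 0; i < snippets.length; i++) {
    const s = snippets[i];
    const block = `[${i + 1}] ${s.path}:${s.start}-${s.end}\n${s.text.trim()}`;
    if (used + block.length > maxChars) break; // keep the context small
    blocks.push(block);
    used += block.length;
  }
  return blocks.join("\n\n");
}
```

The numbered prefix lets a downstream answer refer back to a source as [1], [2], and so on.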

Server endpoint (Fastify):

Start: npm run build && npm run start
POST http://localhost:3000/ask with JSON { "repo": "repo", "query": "auth flow", "limit": 6 }
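A client for this endpoint might look like the sketch below. It assumes only what is stated above (POST /ask on localhost:3000 with a JSON body of repo, query, and limit) and a runtime with a global fetch, such as Node 18+. buildAskRequest and askServer are hypothetical helper names, and the response is left as unknown because its shape is not documented here.

```typescript
interface AskBody {
  repo: string;
  query: string;
  limit: number;
}

// Hypothetical helper: build the URL and fetch options for POST /ask.
function buildAskRequest(body: AskBody, baseUrl: string = "http://localhost:3000") {
  return {
    url: `${baseUrl}/ask`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// Send the request; the caller decides how to interpret the JSON payload.
async function askServer(body: AskBody): Promise<unknown> {
  const { url, init } = buildAskRequest(body);
  const res = await fetch(url, init);
  if (!res.ok) {
    throw new Error(`POST /ask failed with status ${res.status}`);
  }
  return res.json();
}
```

Usage, with the server started as described above: await askServer({ repo: "repo", query: "auth flow", limit: 6 }).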

Notes:

Set OPENAI_EMBED_MODEL to the model that was used when the index was ingested (e.g., text-embedding-3-small for speed). Ingestion reads the same variable.
Identical queries are served from an in-memory cache for ~10s; embeddings are cached for ~60s.
The local BM25 index is optional. By default, ingest does not write any local files. To enable BM25 text ranking (used by ask when available), either pass writeBm25: true to ingestRepo/rag.ingest, or set GH_RAG_WRITE_BM25=1 and provide a workdir if you don't want ..
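To illustrate the caching note, here is a minimal time-to-live cache sketch. It is not the package's implementation, only one common way to get the described behavior; TtlCache and its injectable clock are hypothetical, and the 10s and 60s values are taken from the note above.

```typescript
// Hypothetical TTL cache: entries expire ttlMs after they are written.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (this.now() >= hit.expiresAt) {
      this.entries.delete(key); // drop stale entries lazily on read
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Mirrors the windows from the note: ~10s for queries, ~60s for embeddings.
const queryCache = new TtlCache<string[]>(10000);
const embeddingCache = new TtlCache<number[]>(60000);
```

Keying the query cache on repo plus query text would make two identical requests within the window share one retrieval.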

CLI

Ask questions from the command line after ingesting a repo into Pinecone.

Env: set OPENAI_API_KEY, PINECONE_API_KEY, optional PINECONE_INDEX (default repo-chunks), optional GITHUB_TOKEN.
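The environment contract above can be checked up front so a missing key fails fast. The sketch below assumes only the variable names and the repo-chunks default listed in that line; readEnv and the GhRagEnv shape are hypothetical and are not exported by the package.

```typescript
interface GhRagEnv {
  openaiApiKey: string;
  pineconeApiKey: string;
  pineconeIndex: string;
  githubToken?: string;
}

// Hypothetical helper: validate required variables and apply the default index.
function readEnv(env: Record<string, string | undefined>): GhRagEnv {
  const required = ["OPENAI_API_KEY", "PINECONE_API_KEY"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required env vars: ${missing.join(", ")}`);
  }
  return {
    openaiApiKey: env["OPENAI_API_KEY"] as string,
    pineconeApiKey: env["PINECONE_API_KEY"] as string,
    pineconeIndex: env["PINECONE_INDEX"] || "repo-chunks", // documented default
    githubToken: env["GITHUB_TOKEN"], // optional
  };
}
```

In a Node script this would typically be called as readEnv(process.env) before any ingest or ask call.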

Examples:

# Build once
npm run build

# Ask (uses env REPO and QUESTION if set)
npm run ask -- --repo my-repo --question "What does the auth flow look like?"

# With JSON output
npm run ask -- -r my-repo -q "Key modules?" --json

# If installed globally (after publish or npm link)
gh-rag-ask -r my-repo "How do I run this?"

# Ingest a repo (GitHub URL or local path)
npm run ask -- --repo-url https://github.com/owner/repo.git --repo repo

# Ingest then immediately ask in one command
npm run ask -- --repo-url https://github.com/owner/repo.git -q "What are the core services?"

# Ingest ALL your GitHub repos (requires GITHUB_TOKEN)
# Default filters: excludes forks and archived repos
npm run ingest:all -- --affiliation owner --visibility all --concurrency 2

# Or if installed globally
gh-rag-ingest-all --affiliation owner --visibility all

# Flags:
#   --include-forks        Include forked repos
#   --include-archived     Include archived repos
#   --dry-run              List what would be ingested
#   --index <name>         Override Pinecone index
