congraph-rag

v0.1.1

Published

11 days ago

thewilsonglobal

CongraphRAG

CongraphRAG is a unified, modular graph-based Retrieval-Augmented Generation (RAG) framework that decouples retrieval logic into atomic operators. Built for performance and engine diversity, it supports multiple graph structures and specialized retrieval engines, with storage standardized on ConGraphDB v0.1.10.

🚀 Key Features

Operator-Based Retrieval: 16+ reusable atomic units (e.g., PPR, VDB, one-hop expansion) for complex graph traversals.
Unified Schema: Standardized storage backend using ConGraphDB for Entity, Chunk, Fact, and Community hierarchies.
Engine Diversity: Native support for multiple RAG methodologies (PathRAG, LightRAG, HippoRAG, MS-GraphRAG).
Benchmark Driven: Integrated evaluation suite for Fact Retrieval, Complex Reasoning, and Creative Generation.
Multi-LLM Support: Built-in support for OpenAI, Anthropic, Azure, and local Ollama deployments.
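
To make the operator idea concrete, here is a minimal sketch of one operator named above, PPR, assuming it stands for Personalized PageRank (the README does not spell it out). The `Graph` type and the `personalizedPageRank` function are hypothetical and are not CongraphRAG's API; the sketch uses plain power iteration with a restart probability `alpha`.

```typescript
// Hypothetical types for illustration; CongraphRAG's real operator signatures may differ.
type Graph = Record<string, string[]>; // node id -> outgoing neighbor ids

/**
 * Personalized PageRank by power iteration.
 * With probability `alpha` the walk restarts at one of the `seeds`;
 * otherwise it follows a random outgoing edge.
 */
function personalizedPageRank(
  graph: Graph,
  seeds: string[],
  alpha = 0.15,
  iterations = 50,
): Record<string, number> {
  if (seeds.length === 0) throw new Error('at least one seed node is required');

  // Collect every node that appears as a key, a neighbor, or a seed.
  const nodeSet: Record<string, true> = {};
  for (const id of Object.keys(graph)) {
    nodeSet[id] = true;
    for (const nb of graph[id]) nodeSet[nb] = true;
  }
  for (const s of seeds) nodeSet[s] = true;
  const nodes = Object.keys(nodeSet);

  // Restart distribution: uniform over the seed nodes.
  const restart: Record<string, number> = {};
  for (const n of nodes) restart[n] = 0;
  for (const s of seeds) restart[s] += 1 / seeds.length;

  let rank: Record<string, number> = { ...restart };
  for (let i = 0; i < iterations; i++) {
    const next: Record<string, number> = {};
    for (const n of nodes) next[n] = 0;

    let dangling = 0; // mass sitting on nodes without outgoing edges
    for (const n of nodes) {
      const out = graph[n] ?? [];
      if (out.length === 0) {
        dangling += rank[n];
        continue;
      }
      const share = rank[n] / out.length;
      for (const m of out) next[m] += share;
    }

    // Dangling mass is sent back to the seeds so the scores keep summing to 1.
    for (const n of nodes) {
      next[n] = alpha * restart[n] + (1 - alpha) * (next[n] + dangling * restart[n]);
    }
    rank = next;
  }
  return rank;
}
```

Seeding the walk with entities matched by the query biases the scores toward their neighborhood, which is why an operator like this is typically paired with a lookup step that picks the seeds.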

📁 Repository Structure

congraph-rag/
├── src/
│   ├── core/           # Standardized interfaces and types
│   ├── llm/            # Multi-provider LLM & Embedding integration
│   ├── storage/        # ConGraphDB schema & storage logic
│   ├── operators/      # Atomic retrieval building blocks
│   ├── benchmark/      # Evaluation & benchmarking tools
│   ├── engines/        # Engine implementations (PathRAG, HippoRAG, etc.)
│   ├── orchestrator/   # Retrieval pipeline composition
│   ├── server/         # Fastify-based API server
│   ├── dashboard/      # React-based visual debugger
│   └── cli/            # Command-line interface
├── docs/               # Detailed documentation
└── scripts/            # Build and utility scripts
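
The split between operators/ and orchestrator/ suggests small retrieval steps that are chained into pipelines. The following is a rough sketch of that pattern, not the project's actual code: `RetrievalState`, `Operator`, `oneHopExpand`, `limit`, and `pipeline` are all hypothetical names, and the real interfaces in src/core may look different.

```typescript
// Hypothetical shapes for illustration only.
interface RetrievalState {
  query: string;
  nodeIds: string[];               // candidate nodes selected so far
  edges: Record<string, string[]>; // adjacency list of the graph
}

// An operator is a pure function from one retrieval state to the next.
type Operator = (state: RetrievalState) => RetrievalState;

// One-hop expansion: keep the current candidates and add their direct neighbors.
const oneHopExpand: Operator = (state) => {
  const seen: Record<string, true> = {};
  const out: string[] = [];
  const add = (id: string): void => {
    if (!seen[id]) {
      seen[id] = true;
      out.push(id);
    }
  };
  for (const id of state.nodeIds) add(id);
  for (const id of state.nodeIds) {
    for (const nb of state.edges[id] ?? []) add(nb);
  }
  return { ...state, nodeIds: out };
};

// Truncation: keep only the first k candidates.
const limit = (k: number): Operator => (state) => ({
  ...state,
  nodeIds: state.nodeIds.slice(0, k),
});

// Orchestration: run the operators left to right.
const pipeline = (...ops: Operator[]): Operator => (state) =>
  ops.reduce((s, op) => op(s), state);

// Usage: expand two hops from a seed node, then cap the candidate set.
const twoHopTop3 = pipeline(oneHopExpand, oneHopExpand, limit(3));
```

Because every step has the same input and output type, steps can be reordered or swapped without touching the others, which is the main appeal of decoupling retrieval logic into operators.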

📖 Documentation

Architecture Overview - Deep dive into the layers and the operator model.
Operators Reference - Detailed guide to atomic retrieval units.
Engines Guide - Breaking down PathRAG, HippoRAG, MS-GraphRAG, etc.
Evaluation & Benchmarking - Measuring Faithfulness and Logic.
Visual Dashboard - Using the Cytoscape-based debugger.

🛠 Prerequisites

Node.js (v20+)
pnpm (v9+) or npm (v10+)
ConGraphDB (v0.1.10)

📦 Installation

# Clone the repository
git clone https://github.com/your-repo/congraph-rag.git
cd congraph-rag

# Install dependencies
pnpm install

# Build the project (compiles TypeScript to the dist folder)
pnpm run build

Note: The build uses tsup and is configured to treat self-referencing package imports (e.g. congraph-rag/core) as external dependencies, so the codebase builds cleanly from scratch even when the dist directory is empty.

🚦 Quick Start

Configuration

Create a .env file or a configuration object that follows ConfigSchema:

const config = {
  storage: {
    type: 'congraphdb',
    connectionString: './data/congraph.db',
  },
  llm: {
    provider: 'openai',
    model: 'gpt-4-turbo',
    apiKey: process.env.OPENAI_API_KEY,
  },
};
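
Since `process.env.OPENAI_API_KEY` is undefined when the variable is not set, it can help to check the object before handing it to the framework. The sketch below mirrors the field names of the config above; `RagConfig` and `checkConfig` are hypothetical, the provider identifiers other than 'openai' are guesses, and the real ConfigSchema may validate differently.

```typescript
// Assumed provider identifiers; only 'openai' appears in the README.
type Provider = 'openai' | 'anthropic' | 'azure' | 'ollama';

interface RagConfig {
  storage: { type: 'congraphdb'; connectionString: string };
  llm: { provider: Provider; model: string; apiKey?: string };
}

const PROVIDERS: Provider[] = ['openai', 'anthropic', 'azure', 'ollama'];

// Returns a list of human-readable problems; an empty list means the config looks usable.
function checkConfig(config: RagConfig): string[] {
  const problems: string[] = [];

  if (config.storage.connectionString.trim() === '') {
    problems.push('storage.connectionString is empty');
  }
  // Runtime check, since values loaded from .env or JSON bypass the type system.
  if (PROVIDERS.indexOf(config.llm.provider) === -1) {
    problems.push('llm.provider is not a known provider');
  }
  if (config.llm.model.trim() === '') {
    problems.push('llm.model is empty');
  }
  // Assumption: hosted providers need a key, a local Ollama deployment does not.
  if (config.llm.provider !== 'ollama' && !config.llm.apiKey) {
    problems.push('llm.apiKey is missing (is the environment variable set?)');
  }
  return problems;
}
```

Calling `checkConfig(config)` at startup and failing fast when the list is non-empty surfaces a missing key before the first LLM call.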

Running a Query

Using the CLI:

# Example query using PathRAG engine
npx tsx src/cli/bin.ts query "How are entity X and entity Y related?" --engine path
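
A question like the one above is, at its core, a search for paths connecting two entities. As a conceptual illustration only, the sketch below finds one shortest path with breadth-first search. It is not the PathRAG engine: `Adjacency` and `shortestPath` are hypothetical names, and a real path-based engine would typically score and prune many candidate paths rather than return the first one.

```typescript
type Adjacency = Record<string, string[]>; // entity id -> connected entity ids

// Breadth-first search; returns the entities along one shortest path, or null if none exists.
function shortestPath(graph: Adjacency, source: string, target: string): string[] | null {
  if (source === target) return [source];

  const parent: Record<string, string> = {};
  const visited: Record<string, boolean> = { [source]: true };
  const queue: string[] = [source];

  for (let head = 0; head < queue.length; head++) {
    const current = queue[head];
    for (const next of graph[current] ?? []) {
      if (visited[next]) continue;
      visited[next] = true;
      parent[next] = current;
      if (next === target) {
        // Walk the parent links back to the source, then reverse.
        const path = [target];
        let node = target;
        while (node !== source) {
          node = parent[node];
          path.push(node);
        }
        return path.reverse();
      }
      queue.push(next);
    }
  }
  return null;
}

// Usage: X and Y are linked through an intermediate entity A.
const demoPath = shortestPath({ X: ['A'], A: ['X', 'Y'], Y: ['A'] }, 'X', 'Y');
```

The intermediate entities on the returned path (here `A`) are the kind of evidence a path-oriented engine would pass to the LLM to explain the relationship.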

Roadmap

[x] Operator-Based Model - Atomic retrieval units (VDB, PPR, Louvain, Link, WeightedPath)
[x] Engine Diversity - Support for PathRAG, HippoRAG, LightRAG, MS-GraphRAG, RAPTOR, ToG
[x] ConGraphDB Native Storage - Direct integration with v0.1.10 schemas (Entity, Chunk, Fact, Community)
[x] Logic Score Validation - Graph-based hallucination detection via GraphValidator
[x] Benchmark Rotation - Multi-dataset rotation (Da Vinci, Newton, Tesla) for accuracy testing
[x] Interactive Dashboard - Cytoscape-based visualization of retrieved subgraphs
[x] Multi-LLM Provider Support - Native integrations for OpenAI, Anthropic, Azure, and local Ollama
[ ] Streaming Operator Pipeline - Parallelized execution of retrieval operators for lower latency
[ ] Dynamic Strategy Routing - Agentic engine selection based on query intent analysis
[ ] Incremental Indexing - Delta-indexing for HippoRAG and RAPTOR without full graph rebuilds
[ ] Cross-Graph Federated Search - Retrieving from multiple distributed graph databases simultaneously
[ ] Operator Auto-Tuning - Bayesian optimization of operator hyperparameters such as pruning and expansion limits
[ ] Graph-Enriched Fine-Tuning - Pipelines for training small models on operator execution traces
[ ] WebAssembly Operator Core - Running heavy graph operators in browser and edge environments

🤝 Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

⚖️ License & Disclaimer

This project is licensed under the ISC License. Please read DISCLAIMER.md regarding AI-generated content and professional advice.