@nehloo/graphnosis
v0.2.2
AI-native dual-graph knowledge representation — build, query, and persist typed knowledge graphs in-process.
Graphnosis — Dual-Graph Private AI Memory & Knowledge Framework
Instead of feeding AI raw files that humans can read (md, txt, pdf, doc, etc.), could we feed it binary context files that LLMs understand better than any human could read them?
Graphnosis transforms raw files into AI-optimized directed and undirected graph representations. Instead of feeding AI models flat text chunks (the standard RAG approach), Graphnosis builds a structured knowledge graph with typed relationships — then serializes relevant subgraphs into a format designed for machine comprehension, not human readability.
The name is a compound of graph and gnosis (Greek for knowledge) — literally "graph knowledge". The `.gai` file extension stands for Graphnosis AI, the AI-native knowledge format at the heart of the system.
The result: faster retrieval, richer reasoning, and answers that trace back through explicit relationship chains.
The Question That Started This
That question — asked casually in a conversation with an AI — unlocked something that had been sitting dormant for decades.
I first wrote code in 1990, in BASIC, on a Sintez ZX Spectrum clone connected to a TV set and a tape cassette, in Romania. A few years later, at an Informatics high school, I learned about directed and undirected graphs — oriented and non-oriented, as we called them. They were elegant. They made sense in a way that linear data structures didn't. But at the time, there wasn't much you could do with them beyond textbook exercises.
In the late 1990s, I wrote a C++ class built around machine learning concepts — a neural network training loop that would take a sketchy, hand-drawn letter as input (drawn using a pixel editor I'd built — the EditIcon project preserves that early tool), run it through repeated training cycles, and process the output until the system recognized what the letter was supposed to be. It worked. It felt like the future. But the future wasn't ready yet.
In the early 2000s, I explored treating electrical harnesses as undirected graphs — modeling the physical wiring of circuits as graph structures to enable faster comprehension and 3D routing of complex harness designs. The concept showed promise, but I never pursued it further. Other things took priority.
Over the years, I explored many startup ideas and concepts across different domains — software, music, events, nonprofits, research. Each one taught something. None of them brought all the threads together.
Then came that question about AI and graphs. And suddenly, everything connected.
The graphs from high school. The neural network from C++. The harness routing from engineering. The startup instinct from years of building things. The realization that AI models might process knowledge more effectively through the same structures I'd been thinking about since I was a teenager — not as human-readable text, but as typed, weighted, traversable graphs.
The insight: human-readable formats are lossy for AI consumption. Prose contains redundant phrasing, implicit relationships, linear structure that hides non-linear connections, and ambiguity that humans resolve with world knowledge but AI must guess at. A purpose-built AI-native format could be dramatically more efficient.
Graphnosis is what happens when three decades of scattered ideas finally find their moment.
How Graphnosis Works for AI
Graphnosis ingests the raw files humans would otherwise hand to an AI as context (md, txt, pdf, doc, etc.) and generates dual-graph knowledge-memory binary files that humans can't read — but that AI can consume efficiently.
The Pipeline
```
RAW FILES (any format) ──> DETERMINISTIC PIPELINE ($0) ──> DUAL GRAPH
                                                               |
CONVERSATIONS ──────────────────────────> TEMPORAL + IDENTITY LAYER
                                                               |
                                                LLM ENRICHMENT (optional)
                                                               |
                                                  ENRICHED GRAPH (.gai)
                                                   /       |       \
                                                QUERY    GIKI     AUDIT
                                                  |     (pages) (reports)
                                               ANSWER
                                                  ^
                                          HUMAN CORRECTIONS
```

Deterministic pipeline ($0): Parsing, chunking, entity extraction, TF-IDF similarity, graph construction — all pure JS, zero API calls.
LLM enrichment (optional): Adds synthesis (one-sentence insight), contextual explanation, and source quality annotation per node. Costs ~$0.50-2 per dataset.
Human corrections: Add facts, edit nodes, supersede outdated info, or bulk-import markdown. Human-corrected nodes get maximum confidence (1.0).
Forgetting policy: Forget by topic ("forget everything about my old job"), by time window ("forget everything before March"), or cascade from a source node. All forgetting is soft-delete — nodes remain in the graph for audit but score 0.3x in queries. Nothing is ever permanently destroyed.
Giki pages: Human-readable topic pages auto-generated from the graph, with citations back to specific graph nodes.
Audit reports: Entity breakdowns, contradiction detection, cross-domain discoveries, health dashboard, markdown export.
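The soft-delete mechanics above can be sketched in a few lines of TypeScript (the types and the 0.3x factor mirror this section; the engine's real implementation differs):

```typescript
// Illustrative soft-delete: forgotten nodes stay in the graph for audit
// but are down-weighted at query time. Types are hypothetical.
interface MemNode {
  id: string;
  content: string;
  createdAt: number; // epoch ms
  deleted: boolean;
}

// "Forget everything before <cutoff>" is a soft-delete, never a removal.
function forgetBefore(nodes: MemNode[], cutoff: number): void {
  for (const n of nodes) {
    if (n.createdAt < cutoff) n.deleted = true;
  }
}

// Queries still see the node, at 0.3x its normal score.
function queryWeight(n: MemNode, baseScore: number): number {
  return n.deleted ? baseScore * 0.3 : baseScore;
}
```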
The Dual-Graph Model
Every piece of knowledge exists as a node. Nodes are connected by two types of edges:
Directed edges (arrows — A -> B) represent:
- `contains` — a section contains a paragraph
- `precedes` — one fact follows another in sequence
- `cites` — one source references another
- `defines` — a definition explains a concept used elsewhere
- `causes`, `supports`, `contradicts` — causal and logical relationships
- `supersedes` — new information replaces old (with provenance)
- `discussed-in` — knowledge traced back to conversation origin
- `knows`, `works-with`, `reports-to` — person relationships
Undirected edges (lines — A <-> B) represent:
- `similar-to` — two facts share vocabulary (measured by TF-IDF cosine similarity)
- `shares-entity` — two facts mention the same person, place, or concept
- `co-occurs` — two facts appear in the same section
- `same-person` — two mentions of the same person across sources
- `related-to` — general association between people or concepts
Both edge types exist over the same node set. This dual structure gives AI models richer reasoning paths than either graph type alone.
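In TypeScript terms, the dual-graph model amounts to one node set with two edge lists, so a traversal can hop along either kind. The shapes below are illustrative, not the published SDK types:

```typescript
// Illustrative dual-graph shapes — not the actual SDK types.
type NodeId = string;

interface KnowledgeNode {
  id: NodeId;
  content: string;
  confidence: number; // 0–1
}

type DirectedType = 'contains' | 'precedes' | 'cites' | 'defines' | 'supersedes';
type UndirectedType = 'similar-to' | 'shares-entity' | 'co-occurs';

interface DirectedEdge { from: NodeId; to: NodeId; type: DirectedType; weight: number; }
interface UndirectedEdge { a: NodeId; b: NodeId; type: UndirectedType; weight: number; }

interface DualGraph {
  nodes: Map<NodeId, KnowledgeNode>;
  directed: DirectedEdge[];
  undirected: UndirectedEdge[];
}

// Because both edge sets cover the same node set, a traversal can mix them:
function neighbors(g: DualGraph, id: NodeId): NodeId[] {
  const out = g.directed.filter(e => e.from === id).map(e => e.to);
  const inc = g.directed.filter(e => e.to === id).map(e => e.from);
  const und = g.undirected
    .filter(e => e.a === id || e.b === id)
    .map(e => (e.a === id ? e.b : e.a));
  return [...new Set([...out, ...inc, ...und])];
}
```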
Temporal Awareness
Every node tracks:
- `createdAt` — when the knowledge was first ingested
- `lastAccessedAt` — when it was last retrieved in a query
- `accessCount` — how many times it's been used
- `validUntil` — optional expiration (for superseded information)
- `confidence` — 0–1 score that decays over time if knowledge isn't reinforced
The query engine applies temporal scoring: recently accessed nodes score higher, frequently used nodes score higher, expired nodes score 0.3x. Knowledge that isn't accessed for 7+ days begins to decay.
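A minimal sketch of that scoring policy (the 0.3x expiry factor and the 7-day decay window come from this section; the exact boost and decay curves are made up for illustration):

```typescript
// Illustrative temporal scorer — the engine's actual formula may differ.
interface TemporalNode {
  createdAt: number;
  lastAccessedAt: number;
  accessCount: number;
  validUntil?: number; // expired nodes score 0.3x
  confidence: number;  // 0–1
}

const DAY = 24 * 60 * 60 * 1000;

function temporalScore(n: TemporalNode, base: number, now = Date.now()): number {
  let s = base * n.confidence;
  s *= 1 + Math.log1p(n.accessCount) * 0.1;           // frequently used → higher
  const idleDays = (now - n.lastAccessedAt) / DAY;
  if (idleDays <= 7) {
    s *= 1.2;                                          // recently accessed → higher
  } else {
    s *= Math.max(0.5, 1 - (idleDays - 7) * 0.02);     // decay after 7 idle days
  }
  if (n.validUntil !== undefined && n.validUntil < now) s *= 0.3; // expired
  return s;
}
```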
Conversation Memory
Graphnosis ingests conversations (Claude, ChatGPT, Slack, raw text) into the same graph as domain knowledge. Each conversation turn becomes a node with discussed-in edges linking to the knowledge it references. This means the system remembers what you discussed alongside what it knows.
Identity Layer
Person entities mentioned 2+ times across sources automatically get dedicated person nodes with:
- Inferred attributes (role, organization) from surrounding content
- Relationship edges between co-mentioned persons
- User profile inference from conversation patterns
The .gai Format
Instead of storing knowledge as human-readable markdown, Graphnosis uses a binary format (.gai — short for Graphnosis AI) built on MessagePack:
```
[4-byte magic: "GAI" + version]
[4-byte header length]
[MessagePack header: node count, edge count, levels, metadata]
[MessagePack body: nodes, directed edges, undirected edges, hierarchy]
[4-byte checksum]
```

This isn't designed for humans to read. It's designed for AI to consume efficiently — fewer tokens, explicit structure, typed relationships.
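That container layout round-trips cleanly with Node buffers. In the sketch below, JSON stands in for MessagePack and a simple byte-sum stands in for the real checksum, so this illustrates the framing only, not an actual .gai reader:

```typescript
// Byte-layout sketch of the .gai container. JSON stands in for MessagePack
// and a byte-sum stands in for the real checksum — illustration only.
function checksum(buf: Buffer): number {
  let c = 0;
  for (const b of buf) c = (c + b) >>> 0;
  return c;
}

function writeGaiLike(header: object, body: object): Buffer {
  const h = Buffer.from(JSON.stringify(header));
  const b = Buffer.from(JSON.stringify(body));
  const magic = Buffer.from('GAI1');            // 4-byte magic + version
  const hlen = Buffer.alloc(4);
  hlen.writeUInt32BE(h.length);                 // 4-byte header length
  const payload = Buffer.concat([magic, hlen, h, b]);
  const sum = Buffer.alloc(4);
  sum.writeUInt32BE(checksum(payload));
  return Buffer.concat([payload, sum]);         // trailing 4-byte checksum
}

function readGaiLike(buf: Buffer): { header: unknown; body: unknown } {
  const payload = buf.subarray(0, buf.length - 4);
  if (buf.readUInt32BE(buf.length - 4) !== checksum(payload)) {
    throw new Error('checksum mismatch');       // fail closed on corruption
  }
  if (payload.subarray(0, 4).toString() !== 'GAI1') throw new Error('bad magic');
  const hlen = payload.readUInt32BE(4);
  const header = JSON.parse(payload.subarray(8, 8 + hlen).toString());
  const body = JSON.parse(payload.subarray(8 + hlen).toString());
  return { header, body };
}
```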
How Queries Work
When you ask a question:
- Query decomposition — Complex questions are split into sub-queries; each is expanded with synonyms derived from the graph itself
- Seed finding — TF-IDF matching across all query variants identifies the most relevant nodes
- Graph traversal — BFS from seed nodes with temporal scoring (recency + frequency + confidence)
- Subgraph extraction — Top 20 nodes + connecting edges, including enriched synthesis when available
- Serialization — Structured format with explicit edges for LLM reasoning:
```
=== KNOWLEDGE SUBGRAPH (20 nodes, 58 edges) ===
--- NODES ---
[n1|event|0.53] The Turing machine was invented in 1936 by Alan Turing...
[n2|fact|0.38] A universal Turing machine can simulate any other Turing machine...
--- DIRECTED ---
n1 -[defines:0.9]-> n2
--- UNDIRECTED ---
n1 ~[similar-to:0.7]~ n2
--- ENRICHED INSIGHTS ---
[event|0.53] SYNTHESIS: Turing's 1936 paper laid the theoretical foundation for all computation
CONTEXT: This event preceded physical computers by a decade and connects to Church's lambda calculus
```

Prior Art & What's Different
Graph-based RAG is an active research area. Microsoft's GraphRAG (2025) pioneered community detection and hierarchical summaries on knowledge graphs. LightRAG (EMNLP 2025) introduced dual-level retrieval combining entity extraction with abstract reasoning. LazyGraphRAG achieved 700x query cost reduction vs GraphRAG.
Graphnosis's contribution is a specific combination that hasn't been published as a unified system:
- Dual-graph (directed + undirected edges over the same node set) — most systems use one graph type
- AI-native binary format (.gai) optimized for machine consumption, not human readability
- Zero-API graph construction (TF-IDF, no embeddings required) — $0 to build the graph
- Human audit trail — giki pages with node citations, contradiction detection, correction API
- Temporal awareness — confidence decay, supersedes edges, access tracking per node
- Identity layer — automatic person extraction, relationship edges, user profile inference
- Reflection engine — automated contradiction detection, cross-domain discovery, transitive edge inference
No single technique here is new. The novelty is the combination into a unified, open-source system.
Landscape Comparison
Graphnosis exists alongside other approaches to persistent AI knowledge. Each makes different tradeoffs:
| | Graphnosis | GBrain (Garry Tan) | MemPalace (Milla Jovovich) | Karpathy Wiki |
|---|---|---|---|---|
| Representation | Dual-graph (.gai binary) | Markdown files in git | Spatial hierarchy + ChromaDB | Markdown wiki pages |
| Conversation memory | Yes (Claude/ChatGPT/Slack) | No | Yes (core feature) | No |
| Identity tracking | Auto-extracted person nodes | Manual (people/ dir) | No | Partial (entity pages) |
| Contradiction detection | Automated | No | No | LLM lint (manual) |
| LLM cost to build | $0 + optional enrichment | ~$5-20/dataset | $0 | ~$10-50/dataset |
| Human auditability | Giki pages + audit export | Native (markdown) | Partial | Native (wiki) |
| Relationships | Explicit typed edges | Implicit links | Tunnels | Implicit cross-refs |
| Persistence | SQLite + .gai files | Git repo | ChromaDB + SQLite | Filesystem |
Where Graphnosis wins: Relationship-aware reasoning, multi-source knowledge fusion, token efficiency, automated contradiction detection.
Where others win: GBrain has native git version control. MemPalace achieves 96.6% retrieval recall R@5 on LongMemEval (different metric than end-to-end QA — see benchmarks.md). Karpathy's pattern produces richer narrative synthesis.
They complement each other: MemPalace for conversation memory, GBrain for personal knowledge management, Graphnosis for structured domain knowledge with explicit relationships.
Why This Matters (vs. Standard RAG)
| Aspect | Standard RAG | Graphnosis |
|--------|-------------|------------|
| Context format | Flat text chunks | Structured subgraph with typed edges |
| Relationships | Implicit (AI must infer) | Explicit (edges with types and weights) |
| Retrieval | Vector similarity on chunks | Graph traversal + synonym expansion + query decomposition |
| Resolution | Fixed chunk size | Hierarchical (zoom in/out via compression levels) |
| Dependencies | Requires embedding API | TF-IDF (pure JS, zero API calls for graph construction) |
| Memory | Stateless per session | Temporal nodes + conversation ingestion + SQLite persistence |
| Corrections | Re-ingest from scratch | In-place edit, supersede, or soft-delete individual nodes |
| Auditability | None | Giki pages, audit reports, contradiction detection |
Proof-of-Concept Datasets
All datasets use freely-licensed public content:
| Dataset | Source | License | Result |
|---------|--------|---------|--------|
| History of Computing | Wikipedia (51 articles) | CC BY-SA 3.0 | 12,199 nodes, 67,578 edges |
| Transformer Architecture | arXiv (25 papers) | Open Access | Paper abstracts + metadata |
| Next.js Documentation | GitHub (30 pages) | MIT | Markdown docs + code examples |
| NASA Mars Missions | api.nasa.gov | Public Domain | Rover data + mission facts |
Performance
Graph Engine
Benchmarked on the Wikipedia dataset (12,199 nodes, 67,578 edges):
- Avg query time: 75ms (seed finding + graph traversal + serialization)
- Avg nodes retrieved: 20 per query
- Avg token estimate: ~2,138 tokens per subgraph context
- Graph construction: ~15 seconds for 51 Wikipedia articles
LongMemEval — Official Benchmark
76.40% end-to-end QA accuracy on the official LongMemEval benchmark (500 questions, gpt-4o answer + gpt-4o judge, hybrid retrieval).
| Category | Score |
|---|---|
| single-session-user | 95.31% (61/64) |
| knowledge-update | 87.50% (63/72) |
| single-session-assistant | 87.50% (49/56) |
| temporal-reasoning | 71.65% (91/127) |
| multi-session | 63.64% (77/121) |
| single-session-preference | 43.33% (13/30) |
What got us here:
- Hybrid retrieval: TF-IDF graph traversal + semantic embeddings (text-embedding-3-small)
- Question-type router with category-specific retrieval strategies and prompt blocks
- Session summary nodes (gpt-4o-mini at ingest) for multi-session / temporal / knowledge-update questions
- Query-time preference extraction (gpt-4o-mini) for single-session-preference questions
- Multi-session aggregation routing: strong count-signal captured before temporal/KU patterns
- Aggregation prompt distinguishes additions vs. superseded totals (sum vs. supersede logic)
- Temporal grounding: date normalization, wider BFS subgraph for time-sensitive questions
- Session-diverse seed selection to improve cross-session recall
- Sibling-turn expansion to include conversational context around relevant turns
- Upgraded answer model from gpt-4o-mini to gpt-4o
Leaderboard context (end-to-end QA with official GPT-4 judge):
| System | Score |
|---|---|
| Agentmemory V4 | 96.20% |
| PwC Chronos | 95.60% |
| OMEGA | 95.40% |
| Mastra | 94.87% |
| Supermemory | 85.86% |
| Graphnosis | 76.40% |
| Zep | 71.20% |
MemPalace's 96.6%/100% figures measure retrieval recall R@5 (is the correct conversation session in the top 5 results?) — a different metric than end-to-end QA with a GPT-4 judge. Both are valid; they measure different things. This runner uses the verbatim official judge prompts from xiaowu0162/LongMemEval.
For the full benchmark progression story — every iteration from first run to this result — see benchmarks.md.
Graphnosis as AI Middleware (MCP Server)
Graphnosis ships as a portable MCP (Model Context Protocol) server — drop-in knowledge-graph middleware for any LLM. Load a .gai file, ask a question, and receive a ~2K-token plain-text subgraph snippet ready to inject into any LLM's system prompt.
Two deployment modes
Mode 1 — Local / Claude Desktop (stdio transport)
Add to claude_desktop_config.json:
```json
{
  "mcpServers": {
    "graphnosis": {
      "command": "node",
      "args": ["/path/to/Graphnosis/node_modules/.bin/tsx", "src/mcp/server.ts"],
      "cwd": "/path/to/Graphnosis",
      "env": { "OPENAI_API_KEY": "sk-..." }
    }
  }
}
```

Or run directly:

```bash
npm run mcp
```

Mode 2 — Enterprise On-Premises (HTTP transport, Docker)

```bash
docker compose up
# MCP endpoint: http://internal-host:3001/mcp
```

Point any MCP-compatible client at your internal host. The .gai file stays on your mounted volume inside the enterprise perimeter. See enterprise/enterprise.md for the full security and privacy architecture.
MCP tools
| Tool | What it does |
|------|-------------|
| load_graph | Load a .gai file into session memory |
| ingest_files | Parse raw files → build graph → store in session |
| update_graph | Add new documents to an existing session graph |
| query | Ask a question → returns a ~2K plain-text subgraph snippet |
| export | Write the session graph back to a .gai file |
Privacy guarantee: query returns only the serialized subgraph text and a node count — never the full graph, raw node list, or binary file. Only the few hundred tokens relevant to your question ever leave the enterprise perimeter.
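Under the hood, an MCP client invokes these tools with JSON-RPC `tools/call` requests over the transport. A sketch of the payload for the `query` tool (the `question` argument name is an assumption, not taken from the server's schema; use a real MCP client library in practice):

```typescript
// Illustrative JSON-RPC envelope for calling the `query` tool.
// The argument name `question` is assumed — check the server's tool schema.
function buildQueryCall(question: string, id = 1) {
  return {
    jsonrpc: '2.0' as const,
    id,
    method: 'tools/call',
    params: {
      name: 'query',
      arguments: { question },
    },
  };
}
// POSTed by the client to the HTTP transport, e.g. http://internal-host:3001/mcp
```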
Why not just use Anthropic/OpenAI memory?
Cloud-provider memory stores your knowledge on their infrastructure. Graphnosis keeps the graph on your machine or your servers — always.
| | Graphnosis | Cloud-provider memory |
|---|---|---|
| Graph location | Your machine / enterprise servers | Provider infrastructure |
| LLM compatibility | Any (Claude, GPT-4, Gemini, Ollama…) | Provider-locked |
| Privacy | Full control | Data leaves perimeter |
| Format | Open .gai (portable) | Proprietary |
| Self-hostable | Yes | No |
Getting Started
```bash
# Install dependencies
npm install

# Set up environment (required for chat/LLM features)
cp .env.example .env.local
# Add your OPENAI_API_KEY to .env.local

# Run the development server
npm run dev
```

Open http://localhost:3000 and use the navigation:
| Page | Purpose |
|------|---------|
| Dashboard | Graph stats, node/edge type breakdowns |
| Examples | Load proof-of-concept datasets (Wikipedia, arXiv, Next.js, NASA) |
| Graph | Force-directed visualization with node inspector |
| Chat | Query the graph with optional subgraph context panel |
| Correct | Add facts, edit nodes, supersede info, bulk-import markdown |
| Giki | Browse auto-generated topic pages with node citations |
| Audit | Entity reports, contradictions, health dashboard, markdown export |
| Benchmarks | Query performance metrics across 10 test queries |
Using Graphnosis as an NPM Dependency
Graphnosis ships a Node SDK so you can embed the graph engine inside your own
service without running the Next.js app. The SDK is in-process and
performs zero network I/O — the core query path never calls OpenAI or
any other remote service. See enterprise/enterprise.md
for the full enterprise security posture.
Install
```bash
npm install @nehloo/graphnosis

# Optional — only needed for embedding-based retrieval (semantic search):
npm install ai @ai-sdk/openai

# Optional — only needed for SQLite persistence:
npm install better-sqlite3
```

Next.js users: if you install `better-sqlite3`, add it to `serverExternalPackages` in your `next.config.ts` to prevent webpack from bundling the native module:

```typescript
// next.config.ts
const nextConfig = { serverExternalPackages: ['better-sqlite3'] };
export default nextConfig;
```

`ai` and `@ai-sdk/openai` are peer dependencies — install them only if you call `g.buildEmbeddings()` / `queryHybrid()` / `promptHybrid()` for semantic retrieval. The core graph build/query pipeline (TF-IDF) works without them.
Quick example
import { readFileSync } from 'node:fs';
import { Graphnosis } from '@nehloo/graphnosis';
const g = new Graphnosis({ name: 'docs' });
g.addMarkdown(readFileSync('README.md', 'utf8'), 'README.md');
g.addText('Chunking splits documents into 3-sentence units.', 'notes.txt');
g.build();
// Retrieve a subgraph for a question (no LLM call)
const result = g.query('how does chunking work?');
console.log(result.subgraph.serialized);
// Or build a system prompt ready for any LLM
const prompt = g.prompt('how does chunking work?');
// pass `prompt` to Claude / GPT-4 / Ollama / Bedrock — your choice of client

Appending new files to an existing graph
After calling build() you can add more documents without rebuilding from scratch. New nodes are chunk-deduplicated (by content hash) and edges are wired in incrementally.
Every append method returns an AppendResult that includes any contradictions detected between the new content and existing nodes. The graph is not automatically modified — you decide how to resolve each conflict.
Supported formats: .md .txt .html .htm .csv .json .pdf
import { readFileSync } from 'node:fs';
import { Graphnosis } from '@nehloo/graphnosis';
const g = new Graphnosis({ name: 'kb' });
g.addMarkdown(initialContent, 'base.md');
g.build();
// Single file by content string
const r1 = g.appendMarkdown(moreContent, 'update.md');
const r2 = g.appendText('A quick fact to add.', 'note.txt');
// PDF — pass a Buffer, returns Promise
const r3 = await g.appendPdf(readFileSync('report.pdf'), 'report.pdf');
// Auto-detect format from file extension
const r4 = await g.appendFile('/uploads/research.pdf');
// Walk an entire folder (recursive by default)
const r5 = await g.appendFolder('/docs', { recursive: true });
console.log(`Added ${r5.newNodes} nodes, skipped ${r5.skipped?.length} files`);
// Handle contradictions — user decides what to do with each one
for (const c of r5.contradictions) {
console.warn('Conflict detected:', c.description);
console.warn(' Shared entities:', c.sharedEntities.join(', '));
console.warn(' Node A:', c.nodeA, ' Node B:', c.nodeB);
// Option 1: supersede the old node with the new content
g.supersede(c.nodeB, resolvedContent, 'user approved update');
// Option 2: discard the new node
g.deleteNode(c.nodeA, 'user dismissed conflict');
// Option 3: do nothing — both nodes coexist (resolved flag stays false)
}
// Load a saved graph and continue appending to it:
g.loadGai('knowledge.gai', { hmacKey });
await g.appendFolder('/new-docs');
g.saveGai('knowledge.gai', { hmacKey });

Full-graph consistency check — run after a batch of appends for a comprehensive audit:
const { contradictions, discoveries, decayed } = g.reflect();
// contradictions: conflicting claims across the whole graph
// discoveries: surprising cross-domain connections
// decayed: nodes whose confidence dropped (not accessed recently)

Querying multiple graphs (federation)
You can maintain separate, independent knowledge graphs — per domain, per user, per data source — and query across all of them at once. Results are merged, deduplicated by content hash, and ranked into a single LLM-ready prompt.
import { Graphnosis, queryGraphs } from '@nehloo/graphnosis';
const productGraph = new Graphnosis({ name: 'product' });
productGraph.addMarkdown(productDocs, 'product.md').build();
const supportGraph = new Graphnosis({ name: 'support' });
supportGraph.addMarkdown(supportTickets, 'tickets.md').build();
const policyGraph = new Graphnosis({ name: 'policy' });
policyGraph.addMarkdown(policies, 'policy.md').build();
// Single call — queries all three graphs, merges top-20 nodes by relevance
const prompt = queryGraphs(
[productGraph, supportGraph, policyGraph],
'how do I cancel my subscription?',
{}, // QueryOptions (same as g.prompt)
20 // maxNodes across all graphs (default 20)
);
// pass prompt to your LLM

Each graph stays isolated — different TTLs, persistence backends, and access controls. Federation happens only at query time, in-process, with no network egress.
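The merge step can be sketched as: flatten the per-graph results, dedup by content hash, rank by score, and cut at maxNodes. This is an illustration, not the actual queryGraphs internals:

```typescript
import { createHash } from 'node:crypto';

// Illustrative federation merge — not the SDK's queryGraphs implementation.
interface ScoredNode { content: string; score: number; graph: string; }

function mergeFederated(results: ScoredNode[][], maxNodes = 20): ScoredNode[] {
  const seen = new Set<string>();
  const merged: ScoredNode[] = [];
  for (const n of results.flat()) {
    const h = createHash('sha256').update(n.content).digest('hex');
    if (seen.has(h)) continue; // dedup by content hash across graphs
    seen.add(h);
    merged.push(n);
  }
  // Rank into a single list, keep the top maxNodes
  return merged.sort((a, b) => b.score - a.score).slice(0, maxNodes);
}
```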
Corrections
// Edit a node's content
g.edit(nodeId, 'corrected content', 'fixing factual error');
// Soft-delete (node stays for audit, drops from queries)
g.deleteNode(nodeId, 'outdated');
// Supersede — replaces node and links old→new via directed edge
g.supersede(nodeId, 'updated content', 'new version published');
// Bulk-import a markdown document as new nodes
g.importMarkdown(markdownPatch, 'patch-2024-01.md');
// GDPR / data retention
g.forgetBefore(Date.now() - 90 * 24 * 60 * 60 * 1000, 'retention-policy');
g.forgetTopic('John Smith', 'user-deletion-request');

Non-English / non-ASCII corpora — pluggable analyzer (v0.2)
Graphnosis ships two built-in analyzers. Pick the one that matches your corpus:
| Analyzer | id | What it does | Use for |
|---|---|---|---|
| asciiFoldAnalyzer (default) | ascii-fold | NFD-normalize + strip diacritics + English stopwords. café → cafe, cusătura → cusatura. | English; English with foreign proper names (Beyoncé, São Paulo); Romanian, French, Spanish, Polish — anywhere folding to ASCII is acceptable retrieval. |
| unicodeAnalyzer | unicode | Unicode-aware split, preserves diacritics, no stopwords. cusătura stays cusătura. | Languages where diacritics are phonemic: Turkish (ı ≠ i), Hungarian, Finnish, German (where ü ≠ ue distinction matters). |
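The fold itself is essentially Unicode NFD normalization followed by stripping combining marks. A sketch (the built-in analyzer additionally lowercases, tokenizes, and removes English stopwords):

```typescript
// Illustrative diacritic folding, the core of the ascii-fold analyzer.
// NFD decomposes "é" into "e" + combining acute; the regex drops the mark.
function foldAscii(s: string): string {
  return s.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}
```

Note that Turkish dotless ı (U+0131) has no decomposition, so folding alone can't collapse it to i — one reason the unicode analyzer exists for such languages.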
import { Graphnosis, unicodeAnalyzer } from '@nehloo/graphnosis';
// Default (asciiFoldAnalyzer) — English with diacritic robustness:
const g1 = new Graphnosis({ name: 'docs' });
g1.addMarkdown('The café opened in São Paulo.', 'note.md');
g1.build();
g1.query('cafe sao paulo'); // ✓ matches
// Unicode-preserving — Turkish (where folding would lose meaning):
const g2 = new Graphnosis({ name: 'docs-tr', analyzer: unicodeAnalyzer });
g2.addMarkdown('Bu kız çok şık.', 'note.md');
g2.build();
g2.query('kız'); // ✓ matches; would also match 'kiz' separately

The analyzer's id is persisted on the graph metadata. Loading a graph saved
with one analyzer against a runtime configured with another throws a typed
AnalyzerMismatchError — token streams are not interchangeable.
For language-specific stemming (Snowball, Zemberek, Hunspell, …) implement
the TextAnalyzer interface yourself, or watch for a future
@nehloo/graphnosis-langs companion package.
Hybrid retrieval (opt-in)
By default g.query() and g.prompt() are sync and fully offline — pure
TF-IDF, no network calls. For semantic retrieval that catches paraphrases TF-IDF
misses ("previous job" ↔ "Acme Corp"), opt into the hybrid path with an
embedding adapter:
import { Graphnosis } from '@nehloo/graphnosis';
import { openaiEmbedAdapter } from '@nehloo/graphnosis/adapters/openai';
const g = new Graphnosis({
name: 'docs',
embed: openaiEmbedAdapter({ model: 'text-embedding-3-small' }),
});
// … addMarkdown / build … //
// One-time: embed every content node (uses the configured adapter)
await g.buildEmbeddings({
batchSize: 256,
onProgress: ({ done, total }) => console.log(`embed ${done}/${total}`),
signal: abortController.signal, // optional cancellation
});
// Hybrid: merges TF-IDF + embedding seed pools (one network call per query)
const ctx = await g.queryHybrid('previous job', { similarity: 'hybrid' });
const prompt = await g.promptHybrid('previous job');
// Pure semantic — skips the TF-IDF pool
const sem = await g.queryHybrid('previous job', { similarity: 'embeddings' });
// Keep the index in sync as you append new content
await g.appendWithEmbeddings(parseMarkdown(newDoc, 'note.md'));

Pluggable provider — not just OpenAI. Adapters live behind sub-paths:
| Provider | Import | Notes |
|----------|--------|-------|
| OpenAI | @nehloo/graphnosis/adapters/openai → openaiEmbedAdapter({ model }) | Symmetric model — intent ignored |
| Static (tests) | @nehloo/graphnosis/adapters/static → staticEmbedAdapter({ vectors }) | No network, no peer deps |
| Voyage / Cohere / custom | implement EmbeddingAdapter directly | MUST honor intent: 'document' \| 'query' |
See src/sdk/adapters/README.md for the full
adapter contract, the id naming convention, and a Voyage example.
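A custom adapter might look like the sketch below. The interface shape is inferred from this README, not copied from the package, so treat the field names as assumptions and check the adapter README for the authoritative contract:

```typescript
// Assumed adapter shape — see src/sdk/adapters/README.md for the real one.
interface EmbeddingAdapter {
  id: string; // persisted with the graph; mismatches fail closed
  embed(texts: string[], intent: 'document' | 'query'): Promise<number[][]>;
}

// Toy asymmetric adapter: prefixes inputs per intent before "embedding".
// Real providers (Voyage, Cohere) pass intent as an API parameter instead.
const toyAdapter: EmbeddingAdapter = {
  id: 'toy-asymmetric-v1',
  async embed(texts, intent) {
    const prefix = intent === 'query' ? 'Q: ' : 'D: ';
    return texts.map(t =>
      [...(prefix + t)].map(c => c.charCodeAt(0) / 255) // fake vector
    );
  },
};
```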
Persistence caveat — embeddings are not saved. `saveGai()` / `saveSqlite()` / `toBuffer()` / `toSqliteBuffer()` only persist the graph and TF-IDF index. The `EmbeddingIndex` is in-memory only — after `loadGai()` / `fromBuffer()` / `loadSqlite*()` you must call `await g.buildEmbeddings()` again before using the hybrid methods. Persisting vectors to disk is on the roadmap; for now, treat the embedding index as a per-process cache.
Adapter mismatch is a fail-closed error. Loading a graph that was embedded with one adapter and trying to query it with a different adapter id throws
EmbeddingAdapterMismatchError. Vector spaces are not interchangeable across providers / models / dimensions / intents.
The sync query() / prompt() methods continue to work without an embedding
index — the hybrid path is purely additive.
Buffer-based persistence (serverless-friendly)
Vercel, Lambda, Cloudflare Workers, and Fly Machines have no persistent
local volume. Use the buffer-based methods to round-trip via blob storage
without /tmp gymnastics:
// .gai (binary, signed with HMAC if you pass a key)
const buf = g.toBuffer({ hmacKey: process.env.GAI_HMAC_KEY });
await blob.put('graphs/myorg/kg.gai', buf);
// later, in a cold serverless invocation
const fresh = await blob.get('graphs/myorg/kg.gai');
const g2 = new Graphnosis();
g2.fromBuffer(Buffer.from(fresh), { hmacKey: process.env.GAI_HMAC_KEY });
// SQLite (writes a transient file under os.tmpdir() — must be writable)
const sqlBuf = g.toSqliteBuffer();
await blob.put('graphs/myorg/kg.sqlite', sqlBuf);
g2.fromSqliteBuffer(sqlBuf, 'myorg-graph-name');

saveGai() / loadGai() / saveSqlite() / loadSqlite*() continue to
work — they're now thin wrappers over the buffer methods.
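The fail-closed HMAC behavior can be sketched with Node's crypto module (illustrative framing only; the real writer embeds the tag inside the .gai container rather than appending it):

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Illustrative HMAC-SHA256 sign/verify over a serialized graph buffer.
function sign(buf: Buffer, key: string): Buffer {
  const tag = createHmac('sha256', key).update(buf).digest(); // 32 bytes
  return Buffer.concat([buf, tag]);
}

function verify(signed: Buffer, key: string): Buffer {
  const buf = signed.subarray(0, signed.length - 32);
  const tag = signed.subarray(signed.length - 32);
  const expect = createHmac('sha256', key).update(buf).digest();
  if (!timingSafeEqual(tag, expect)) {
    throw new Error('HMAC mismatch — refusing to load'); // fail closed
  }
  return buf;
}
```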
Reason conventions for soft-delete (v0.2)
The corrections engine takes a freeform reason: string on every
soft-delete / edit / supersede / forget. By documented convention, prefix
the reason to indicate the lifecycle event class — the audit exporter uses
the prefix to filter speculative or rolled-back UX events out of compliance
exports by default.
| Prefix | Meaning | Audit visibility |
|----------------|-----------------------------------------------------------------|----------------------------|
| (no prefix) | Real lifecycle event — load-bearing for audit | Always shown |
| user: | Explicit human action (corrections, GDPR deletions) | Always shown |
| system: | Automated platform action (cascade, retention, decay) | Shown by default, filterable |
| preview: | Speculative / rolled-back UX (preview-rejected, preview-expired)| Hidden by default |
// Recommended: prefix preview-only soft-deletes so they don't pollute audit
g.deleteNode(nodeId, 'preview:user-rejected');
// system: prefix is what Graphnosis itself uses internally
g.forgetBefore(cutoff, 'system:retention-policy'); // default
// Pass an empty filter to show everything in an audit export
import { generateAuditReport } from '@nehloo/graphnosis';
generateAuditReport(graph, tfidfIndex, { hideReasonPrefixes: [] });

This is convention, not enforcement. Skipping prefixes gives v0.1 behavior
(every soft-delete shown). The convention is dogfooded internally — Graphnosis
itself prefixes system: on cascade-delete, retention, and topic-forget.
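The exporter's prefix filtering reduces to a startsWith check. A sketch (the default hidden prefix follows the table above; the type and option names here are hypothetical):

```typescript
// Illustrative reason-prefix filtering for audit exports.
interface AuditEvent { nodeId: string; reason: string; }

// preview: events are hidden by default; pass [] to show everything.
function visibleEvents(
  events: AuditEvent[],
  hideReasonPrefixes: string[] = ['preview:'],
): AuditEvent[] {
  return events.filter(
    e => !hideReasonPrefixes.some(p => e.reason.startsWith(p)),
  );
}
```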
Persistence
// Signed .gai — use whenever the file crosses a trust boundary
const hmacKey = process.env.GAI_HMAC_KEY!; // 32+ random bytes
g.saveGai('knowledge.gai', { hmacKey });
g.loadGai('knowledge.gai', { hmacKey }); // fails closed on any tampering
// SQLite (requires the optional better-sqlite3 dependency)
g.saveSqlite('./data/graphnosis.db');

Only the graph + TF-IDF index are written. Embedding vectors built via `buildEmbeddings()` are in-memory only — re-embed after loading.
Public API surface
import {
// Core facade
Graphnosis, // class — ingest, build, query, append, correct, persist
queryGraphs, // federated query across multiple Graphnosis instances
// Graphnosis class methods (reference)
// g.addMarkdown / addHtml / addCsv / addJson / addText / addDocument
// g.build()
// g.append() — append ParsedDocument[], returns AppendResult
// g.appendMarkdown / appendText / appendHtml / appendCsv / appendJson
// g.appendPdf(buffer) — async, returns Promise<AppendResult>
// g.appendFile(path) — async, auto-detects format, returns Promise<AppendResult>
// g.appendFolder(path, opts?) — async, walks directory, returns Promise<AppendResult>
// g.query / g.prompt — sync, fully offline (TF-IDF only)
// g.buildEmbeddings({ adapter? }) — async, embeds nodes via adapter
// g.hasEmbeddings() — sync, true after buildEmbeddings
// g.queryHybrid / g.promptHybrid — async, hybrid TF-IDF + embeddings
// g.appendWithEmbeddings(...) — async, append + keep index in sync
// g.reflect() — full-graph contradiction + decay + discovery audit
// g.edit / deleteNode / supersede / correct / importMarkdown
// g.forgetBefore / forgetTopic
// g.saveGai / loadGai / saveSqlite / loadSqlite / loadSqliteByName
// g.toBuffer / fromBuffer — serverless-friendly .gai I/O
// g.toSqliteBuffer / fromSqliteBuffer — serverless-friendly SQLite I/O
// Built-in analyzers + types
asciiFoldAnalyzer, unicodeAnalyzer, // pass to constructor.analyzer
// Typed errors
AnalyzerMismatchError, EmbeddingAdapterMismatchError,
// Lower-level primitives
buildGraph, // build a graph from ParsedDocument[]
queryGraph, // subgraph retrieval given a graph + tfidf index
buildGraphPrompt, // wrap serialized subgraph into an LLM system prompt
addDocumentsToGraph, // incremental append to a live graph
reflect, // full-graph reflection engine
// Parsers
parseMarkdown, parseHtml, parseCsv, parseJson,
// Corrections
applyCorrection, importCorrections,
forgetByTimeWindow, forgetByTopic, cascadeSoftDelete,
// Persistence
writeGai, readGai, // .gai binary format (with optional HMAC-SHA256)
openSqliteStore, // path-scoped SQLite store
toSerializable, fromSerializable,
} from '@nehloo/graphnosis';

The facade intentionally does not re-export anything from
src/core/enrichment/* or src/core/query/answer.ts — those modules call
OpenAI and are reserved for the app + MCP server. This keeps the library's
"no-egress" guarantee verifiable by auditing a single file (src/sdk/index.ts).
Project Structure
src/
core/
types.ts # All TypeScript interfaces (40+ types)
constants.ts # Thresholds, magic bytes, stopwords
ingestion/parsers/ # Markdown, PDF, HTML, CSV/JSON, conversation parsers
extraction/ # Chunker, entity extractor, identity extractor
similarity/ # TF-IDF, cosine, Jaccard (pure JS)
graph/ # Graph builder, directed/undirected edges, incremental updates
optimization/ # Deduplicator, pruner, hierarchical compressor, reflection engine
format/ # .gai binary writer/reader (MessagePack)
query/ # Seed finder, BFS traverser, subgraph serializer,
# synonym expander, query decomposer
enrichment/ # LLM-powered node synthesis + context
corrections/ # Human correction engine (add/edit/supersede/delete)
giki/ # Graph-to-wiki page generator with citations
audit/ # Audit report generator + markdown exporter
persistence/ # SQLite store (better-sqlite3, WAL mode)
examples/ # Wikipedia, arXiv, Next.js docs, NASA Mars fetchers
app/ # Next.js App Router — 8 pages + 10 API routes
tests/
longmemeval/ # LongMemEval benchmark suite (12 tests, 4 categories)

Tech Stack
- Next.js 16 (App Router, TypeScript)
- Vercel AI SDK v6 (chat interface, streaming)
- MessagePack (`msgpackr`) for the .gai binary format
- TF-IDF + cosine similarity (pure JS, no embedding APIs)
- better-sqlite3 for persistent graph storage (WAL mode)
- react-force-graph-2d for graph visualization
- Tailwind CSS for UI
All dependencies are MIT or Apache-2.0 licensed. No GPL/LGPL/AGPL.
Live Demo
Explore the working prototype: graphnosis.vercel.app
References & Attribution
For the full list see REFERENCES.md.
Benchmark used: Wu et al. (2025). LongMemEval. ICLR 2025. — 500-question evaluation dataset; GPT-4 judge prompts used verbatim.
Related work (independent development — listed for comparison context): Edge et al. GraphRAG · Guo et al. LightRAG · Lewis et al. RAG
License
MIT
Contributing
This is an active research project exploring AI-native knowledge representation. Contributions welcome — especially around:
- New parser types (DOCX, PPTX, audio transcripts)
- Improved relation extraction (NLP-based `causes` / `contradicts` detection)
- Embedding-based similarity as an optional upgrade to TF-IDF
- Benchmark comparisons against standard RAG pipelines (GraphRAG, LightRAG)
- Multi-graph merge (combine multiple .gai files)
- Giki page quality improvements (LLM-assisted narrative generation)
