@londer/cortex · v0.2.1 · High-performance AI memory system with MCP integration
# Cortex
A high-performance, locally-hosted AI memory system that stores, indexes, and retrieves contextual knowledge via MCP (Model Context Protocol). Designed for Claude Code, but works with any MCP client.
## Installation
```bash
npm install -g @londer/cortex
```

Or from source:

```bash
git clone https://github.com/londer/cortex.git
cd cortex
npm install
npm run build
```

## Quick Start
```bash
# 1. Set up storage backends (Qdrant + Neo4j via Docker)
cortex setup

# 2. Start the MCP server
cortex serve
```

## Claude Code Integration
```bash
# Add Cortex to Claude Code
claude mcp add cortex -- cortex serve

# Generate Claude Code instructions for your project
cortex init

# Or set up global instructions (applies to all projects)
cortex init --global
```

Or manually add to your MCP config:
```json
{
  "servers": {
    "cortex": {
      "command": "cortex",
      "args": ["serve"]
    }
  }
}
```

## LLM Setup (Optional)
Cortex works fully without an API key. Adding one unlocks higher-quality extraction and smart consolidation.
```bash
# Via CLI
cortex config set anthropic_api_key sk-ant-your-key-here

# Or via environment variable
export ANTHROPIC_API_KEY=sk-ant-your-key-here
```

### Ollama (Local LLM)
For privacy-first LLM extraction without an API key, install Ollama:
```bash
# Check Ollama status
cortex ollama status

# Pull the default model
cortex ollama pull

# Or configure a different model
cortex config set ollama_model mistral
```

## Extraction Tiers
Cortex uses a tiered entity extraction system:
| Tier | Method | Quality | Latency | Cost | Requirements |
|------|--------|---------|---------|------|--------------|
| 1 | Regex + Heuristics | Basic | < 1ms | Free | Always available |
| 2 | NLP (compromise.js) | Good | < 50ms | Free | Always available |
| 2.5 | Ollama (local LLM) | Good+ | 1-5s | Free | Ollama running locally |
| 3 | LLM (Claude API) | Best | 500ms-2s | ~$0.002/call | API key required |
The system automatically selects the best available tier (3 → 2.5 → 2 → 1) and falls back gracefully.
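The fallback order above can be sketched as a small selection function. This is an illustrative sketch, not Cortex's actual code; the function and field names are assumptions. Tiers 1-2 need no external services, so tier 2 is the floor of automatic selection.

```typescript
type Tier = 1 | 2 | 2.5 | 3;

interface Availability {
  anthropicApiKey: boolean; // unlocks Tier 3 (Claude API)
  ollamaRunning: boolean;   // unlocks Tier 2.5 (local LLM)
}

// Pick the best available tier, falling back 3 → 2.5 → 2.
// Regex + NLP (tiers 1-2) are always available, so tier 2
// is returned when no LLM backend is reachable.
function selectTier(avail: Availability): Tier {
  if (avail.anthropicApiKey) return 3;
  if (avail.ollamaRunning) return 2.5;
  return 2;
}
```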
## Capability Matrix
| Feature | No API Key | With Ollama | With API Key |
|---------|------------|-------------|--------------|
| Entity extraction | Tier 1-2 (regex + NLP) | Tier 2.5 (local LLM) | Tier 3 (Claude) |
| Relationship detection | Basic verb patterns | LLM-powered | Full semantic understanding |
| Auto-extraction | Local heuristics | Ollama-powered | LLM-powered analysis |
| Consolidation: near-identical dedup | Yes (> 0.95 similarity) | Yes | Yes |
| Consolidation: smart merge | Flagged for review | Yes | Yes |
| Consolidation: contradiction resolution | No | Yes | Yes |
| memory_ingest | Tier 1-2 extraction | Tier 2.5 extraction | Tier 3 extraction |
| Cross-project sharing | Full access | Full access | Full access |
| Access tracking & stats | Full access | Full access | Full access |
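The near-identical dedup check works in every mode because it only compares embedding vectors: two memories count as duplicates when their cosine similarity exceeds 0.95. A minimal sketch of that check (helper names are illustrative assumptions, not Cortex internals):

```typescript
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Dedup threshold from the capability matrix: > 0.95 similarity.
const isNearIdentical = (a: number[], b: number[]): boolean =>
  cosineSimilarity(a, b) > 0.95;
```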
## Available Tools
| Tool | Description |
|------|-------------|
| memory_store | Store a memory. Auto-embeds content, stores metadata, links entities. |
| memory_search | Semantic search with scope boosting. |
| memory_relate | Create entity relationships in the knowledge graph. |
| memory_graph_query | Traverse the knowledge graph from an entity. |
| memory_context | Smart context retrieval combining semantic + graph + recency. |
| memory_forget | Delete a memory from all stores (requires confirmation). |
| memory_ingest | Ingest raw text and extract memories/entities/relationships. |
| memory_consolidate | Merge redundant memories. Supports dry_run preview. |
| memory_config | View/modify runtime configuration (including API key). |
| memory_stats | Aggregate statistics: totals, breakdowns, access patterns, staleness. |
| memory_share | Cross-project sharing: promote visibility, link/unlink projects. |
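memory_context blends several signals into one ranking. A hypothetical sketch of such blended scoring, combining semantic similarity with an exponential recency decay; the weights, half-life, and field names are invented for illustration and are not Cortex's actual formula:

```typescript
interface ScoredMemory {
  id: string;
  semanticScore: number; // 0..1 from vector search
  ageDays: number;       // days since last access
}

const HALF_LIFE_DAYS = 30; // assumed decay half-life

// Blend semantic relevance with recency (weights are illustrative).
function contextScore(m: ScoredMemory): number {
  const recency = Math.pow(0.5, m.ageDays / HALF_LIFE_DAYS);
  return 0.7 * m.semanticScore + 0.3 * recency;
}

function rankForContext(memories: ScoredMemory[]): ScoredMemory[] {
  return [...memories].sort((a, b) => contextScore(b) - contextScore(a));
}
```

With equal semantic scores, a memory accessed today outranks one untouched for 90 days, which is the behavior "semantic + graph + recency" implies.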
## CLI Reference
```
cortex serve                     Start the MCP server (stdio transport)
cortex setup                     Start production Qdrant + Neo4j (ports 16333/17687)
cortex setup --dev               Start dev Qdrant + Neo4j (ports 26333/27687)
cortex setup --stop              Stop containers
cortex init                      Generate Cortex instructions for project CLAUDE.md
cortex init --global             Generate global instructions (~/.claude/CLAUDE.md)
cortex config get                Show current configuration
cortex config set <key> <value>  Set a runtime config value
cortex consolidate               Run manual consolidation
cortex stats                     Show memory statistics and staleness info
cortex ollama status             Check Ollama availability and installed models
cortex ollama pull [model]       Pull an Ollama model
cortex version                   Show version
cortex help                      Show help
```

## Configuration
Configuration is loaded from environment variables with sensible defaults. See .env.example for all options.
Key settings:
| Variable | Default | Description |
|----------|---------|-------------|
| QDRANT_URL | http://localhost:16333 | Qdrant server URL |
| NEO4J_URI | bolt://localhost:17687 | Neo4j bolt URI |
| SQLITE_PATH | ~/.cortex/cortex.db | SQLite database path |
| ANTHROPIC_API_KEY | (empty) | Anthropic API key (optional) |
| CORTEX_EXTRACTION_TIER | auto | auto, local-only, or llm-preferred |
| CORTEX_AUTO_EXTRACT | true | Enable auto-extraction from conversations |
| CORTEX_CONSOLIDATION_ENABLED | true | Enable periodic memory consolidation |
| CORTEX_STALENESS_THRESHOLD_DAYS | 90 | Days before a memory is considered stale |
| CORTEX_OLLAMA_ENABLED | true | Enable Ollama local LLM |
| CORTEX_OLLAMA_URL | http://localhost:11434 | Ollama server URL |
| CORTEX_OLLAMA_MODEL | llama3.2 | Ollama model name |
Runtime overrides persist to ~/.cortex/runtime-config.json.
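The precedence described above (runtime overrides beat environment variables, which beat built-in defaults) can be sketched as a small merge. The resolver function is an illustrative assumption, not Cortex's actual loader; the keys and defaults come from the table above:

```typescript
const DEFAULTS: Record<string, string> = {
  QDRANT_URL: "http://localhost:16333",
  NEO4J_URI: "bolt://localhost:17687",
  CORTEX_OLLAMA_MODEL: "llama3.2",
};

// Later layers win: defaults < environment < runtime-config.json.
function resolveConfig(
  env: Record<string, string | undefined>,
  runtimeOverrides: Record<string, string>,
): Record<string, string> {
  const merged: Record<string, string> = { ...DEFAULTS };
  for (const key of Object.keys(DEFAULTS)) {
    const value = env[key];
    if (value !== undefined) merged[key] = value;
  }
  return { ...merged, ...runtimeOverrides };
}
```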
## Development
```bash
# Run with hot reload
npm run dev

# Run tests (requires Docker backends running)
npm test

# Build
npm run build

# Release (bump version, update changelog, create tag)
npm run release
```

## Architecture
```
              Claude Code
                   │ MCP (stdio)
                   ▼
┌──────────────────────────────────────┐
│          Cortex MCP Server           │
│            (TypeScript)              │
├──────────────────────────────────────┤
│  Extraction    │  Consolidation      │
│  Tier 1-3+     │  LLM / Ollama / Local│
│  Auto-extract  │  Cluster + Merge    │
├──────────────────────────────────────┤
│         Orchestration Layer          │
│   Scope inference, ranking, dedup    │
├─────────┬──────────┬─────────────────┤
│ Qdrant  │  Neo4j   │     SQLite      │
│ vectors │  graph   │    metadata     │
└─────────┴──────────┴─────────────────┘
     ▲                      ▲
 HuggingFace           Anthropic API
 Transformers.js        (optional)
 Ollama
 (optional)
```

## License
MIT - Arthur Lonfils / Londer
