context-engine-ai
v0.3.0
A lightweight context engine for AI agents. Ingest events, build semantic context, query with natural language. Zero-config by default with SQLite + local TF-IDF embeddings.
context-engine-ai
Give your AI agent a memory of what just happened.
Ingest events from any source. Query with natural language. Get back ranked, time-decayed results — no vector database, no API keys, no config.
Live Demo · Try the CLI · Install · Quick Start · Use Cases · API Reference · Examples
Try It in 10 Seconds
npx context-engine-ai demo
No API keys. No database. No config. Runs a simulated developer workflow and shows how context-engine answers natural language questions about what's happening.
context-engine demo
Simulating developer workflow...
[editor] app: VS Code, file: src/auth.ts, project: backend
[test] command: npm test, result: 47 passed, 2 failed
[message] from: Alice, via: Slack, text: auth token bug is back
[browser] url: oauth.net/2, title: OAuth 2.0 docs
[meeting] title: Sprint Review, starts_in: 25 minutes
[editor] app: VS Code, file: src/auth.ts, change: fix token refresh
[test] command: npm test, result: 49 passed, 0 failed
[commit] message: fix: token refresh race condition, files: 3
8 events ingested. Querying...
Q: "messages from slack?"
A: [message] from: Alice, via: Slack, text: auth token bug is back
Q: "next meeting?"
A: [meeting] title: Sprint Review, starts_in: 25 minutes
Q: "test results?"
A: [test] command: npm test, result: 47 passed, 2 failed | [test] 49 passed, 0 failed
Q: "latest commit?"
A: [commit] message: fix: token refresh race condition, files: 3
Zero config. Zero API keys. Just context.
The Problem
You're building an AI agent. It needs to know what's going on — the user just switched to VS Code, a Slack message came in, there's a meeting in 15 minutes, and three tests are failing.
Your options today:
- Vector database + embedding API — Set up Pinecone/Weaviate, get an OpenAI key, write the retrieval pipeline, handle rate limits. Works, but it's infrastructure for what should be a function call.
- Stuff everything into the prompt — Append raw events to the system prompt. Hits token limits fast. No relevance ranking. Old events drown out new ones.
- Build it yourself — Roll your own event store, embedding logic, similarity search, temporal decay, deduplication. Easily a week of work before you write any agent logic.
context-engine-ai is option 4: a single import that handles all of this.
import { ContextEngine } from 'context-engine-ai'
const ctx = new ContextEngine() // SQLite + local embeddings, zero config
await ctx.ingest({ type: 'app_switch', data: { app: 'VS Code', file: 'main.ts' } })
await ctx.ingest({ type: 'calendar', data: { event: 'Standup', in: '15min' } })
await ctx.ingest({ type: 'message', data: { from: 'Alice', text: 'PR ready for review' } })
const result = await ctx.query('what is the user doing right now?')
Returns:
{
summary: '[app_switch] app: VS Code, file: main.ts | [calendar] event: Standup, in: 15min | [message] from: Alice, text: PR ready for review',
events: [
{ type: 'app_switch', data: { app: 'VS Code', file: 'main.ts' }, relevance: 0.94, ... },
{ type: 'calendar', data: { event: 'Standup', in: '15min' }, relevance: 0.87, ... },
{ type: 'message', data: { from: 'Alice', text: 'PR ready for review' }, relevance: 0.82, ... },
],
query: 'what is the user doing right now?',
timestamp: 1709312400000
}
Events are ranked by similarity to your query and weighted by recency — a 5-minute-old event scores higher than an identical one from yesterday. Local TF-IDF handles keyword matching out of the box; upgrade to OpenAI embeddings for true semantic search. The summary string is formatted for direct injection into LLM system prompts — drop it into your agent's context and it just works.
Features
| Feature | Description |
|---------|-------------|
| Zero config | SQLite + local TF-IDF embeddings. No API keys, no cloud, instant startup. |
| Natural language querying | Ask "test results?" instead of writing SQL or filtering by type. Local TF-IDF for keyword matching; upgrade to OpenAI embeddings for true semantic search. |
| Temporal decay | Recent events automatically rank higher. Configurable half-life (default: 24h). |
| Auto-deduplication | Switching between two apps 50 times doesn't create 50 events — duplicates merge within a configurable time window. |
| Auto-pruning | When event count exceeds your limit, the lowest-relevance oldest events are removed. No cron jobs. |
| SQLite or PostgreSQL | In-memory for dev, SQLite file for persistence, pgvector for production scale. |
| Local or OpenAI embeddings | Local TF-IDF (128-dim, free, no network) or OpenAI text-embedding-3-small (1536-dim) for higher semantic quality. |
| HTTP server + CLI | npx context-engine-ai serve — REST API in one command. |
| Full TypeScript | Types for every interface. Works great with @ts-check in JS files too. |
| ~64KB unpacked | Tiny footprint. Ships only what's needed. |
| Sub-millisecond | ~0.1ms per ingest, ~0.1ms per query with local embeddings (SQLite, 1000 events). |
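The deduplication rule in the table above (cosine threshold plus time window) can be sketched in a few lines. This is an illustrative reimplementation, not the library's internals; cosineSimilarity is written inline here, though the package also exports one.

```javascript
// Illustrative sketch of the dedup rule: merge when a stored event is at least
// `threshold` cosine-similar AND arrived within the time window. Not library code.
const cosineSimilarity = (a, b) => {
  let dot = 0, na = 0, nb = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    na += a[i] * a[i]
    nb += b[i] * b[i]
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1)
}

const isDuplicate = (incoming, stored, now, threshold = 0.95, windowMs = 60_000) =>
  now - stored.timestamp <= windowMs &&
  cosineSimilarity(incoming.embedding, stored.embedding) >= threshold

// Identical embeddings 10s apart merge; the same pair 5 minutes apart does not.
const e = { embedding: [1, 0, 1], timestamp: 0 }
console.log(isDuplicate({ embedding: [1, 0, 1] }, e, 10_000))   // true
console.log(isDuplicate({ embedding: [1, 0, 1] }, e, 300_000))  // false
```

This is why rapid app switching collapses into a handful of events: each repeat lands inside the window and above the threshold, so it updates the existing row instead of inserting a new one.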
When to Use This
Good fit:
- Your AI agent needs real-time awareness of what's happening (user activity, system events, messages)
- You want semantic search over a stream of structured events
- You need something working in minutes, not days
- You're building a prototype and don't want to set up infrastructure
- You want temporal decay and deduplication handled for you
Not the right tool:
- You need to search over large documents or PDFs (use a RAG framework like LangChain or LlamaIndex)
- You need persistent long-term memory across months of history (use a proper vector database)
- You're indexing millions of documents (use pgvector or a dedicated vector DB directly)
How It Compares
| | context-engine-ai | RAG frameworks | Custom implementation |
|---|---|---|---|
| Setup | npm install, done | Vector DB + embedding API + retrieval chain | Days of plumbing |
| API keys | No (local TF-IDF default) | Yes (OpenAI/Cohere/etc) | Depends |
| Temporal decay | Built-in, configurable | Manual implementation | Build it yourself |
| Deduplication | Built-in (cosine threshold + time window) | Manual | Build it yourself |
| Data model | Event-oriented {type, data} | Document chunks | Your schema |
| Query interface | Natural language | Natural language | SQL / custom |
| Storage | SQLite (zero-config) → PostgreSQL | Pinecone/Weaviate/Chroma | Your choice |
| HTTP server | ctx.serve(3334) — one line | Build it | Build it |
| Size | ~64KB | 10-100MB+ with dependencies | Varies |
Install
npm install context-engine-ai
Quick Start
As a Library
import { ContextEngine } from 'context-engine-ai'
const ctx = new ContextEngine()
await ctx.ingest({ type: 'task', data: { title: 'Review PR #42', priority: 'high' } })
await ctx.ingest({ type: 'message', data: { from: 'Alice', text: 'deploy is broken' } })
const result = await ctx.query('any issues right now?')
console.log(result.summary)
// => "[message] from: Alice, text: deploy is broken | [task] title: Review PR #42, priority: high"
console.log(result.events)
// => StoredEvent[] sorted by relevance × recency
await ctx.close()
With Persistence
const ctx = new ContextEngine({ dbPath: './my-context.db' })
// Events survive restarts. Standard SQLite file — inspect with any SQLite tool.
Production (PostgreSQL + pgvector)
const ctx = new ContextEngine({
storage: 'postgres',
pgConnectionString: 'postgresql://user:pass@localhost:5432/mydb',
embeddingProvider: 'openai',
openaiApiKey: process.env.OPENAI_API_KEY,
})
As an HTTP Server
const ctx = new ContextEngine({ dbPath: './context.db' })
ctx.serve(3334)
Or via CLI:
npx context-engine-ai serve --port 3334
As an MCP Server (Claude Desktop, Cursor, Windsurf)
Use context-engine as a Model Context Protocol tool server. Any MCP-compatible client (Claude Desktop, Cursor, Windsurf, VS Code) can then call ingest_event, query_context, get_recent, and clear_context as native tools.
npm install @modelcontextprotocol/sdk zod
node examples/mcp-server.js
Add to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json):
{
"mcpServers": {
"context-engine": {
"command": "node",
"args": ["/absolute/path/to/examples/mcp-server.js"]
}
}
}
The agent can then call:
- ingest_event — store any event (type + data)
- query_context — semantic search: "any errors in the last hour?"
- get_recent — latest N events by timestamp
- clear_context — wipe the store
See examples/mcp-server.js for the full implementation.
REST Endpoints
| Method | Path | Description |
|--------|------|-------------|
| POST | /ingest | Ingest an event { type, data } |
| GET | /context?q=...&limit=10 | Semantic query |
| GET | /recent?limit=20 | Recent events by timestamp |
| GET | /count | Number of stored events |
| DELETE | /events | Clear all stored events |
| GET | /health | Health check |
# Ingest an event
curl -X POST http://localhost:3334/ingest \
-H 'Content-Type: application/json' \
-d '{"type": "deploy", "data": {"service": "api", "version": "2.1.0", "env": "production"}}'
# Query with natural language
curl "http://localhost:3334/context?q=recent%20deployments&limit=5"
# Get recent events
curl "http://localhost:3334/recent?limit=10"
Use Cases
1. Give your AI agent situational awareness
The core use case. Ingest events as they happen — user actions, system alerts, messages, calendar entries — and query for relevant context when the agent responds.
import Anthropic from '@anthropic-ai/sdk'
import { ContextEngine } from 'context-engine-ai'
const ctx = new ContextEngine({ dbPath: './agent-context.db' })
const claude = new Anthropic()
// Events stream in throughout the day
await ctx.ingest({ type: 'terminal', data: { command: 'npm test', output: '3 failed, 12 passed' } })
await ctx.ingest({ type: 'slack', data: { from: 'Sarah', text: 'Auth service throwing 401s in staging' } })
await ctx.ingest({ type: 'error', data: { service: 'auth', error: 'TokenExpiredError', count: 47 } })
await ctx.ingest({ type: 'pr', data: { repo: 'backend', title: 'Fix OAuth token refresh', status: 'review_requested' } })
// Agent gets relevant context for its response
const context = await ctx.query('what needs attention?', 5)
const response = await claude.messages.create({
model: 'claude-sonnet-4-6',
max_tokens: 1024,
system: `You are a developer assistant. Current context:\n${context.summary}`,
messages: [{ role: 'user', content: 'What should I focus on?' }]
})
// Claude sees the auth errors, failing tests, and related PR — responds with specific advice
2. Desktop activity tracker
Track what you're doing across apps. Deduplication means switching between two windows 100 times creates 2 events, not 100.
import { ContextEngine } from 'context-engine-ai'
import { execSync } from 'child_process'
const ctx = new ContextEngine({ dbPath: './desktop.db', decayHours: 8 })
// Poll active window every 5 seconds
setInterval(async () => {
const app = execSync(
`osascript -e 'tell app "System Events" to get name of first process whose frontmost is true'`,
{ encoding: 'utf-8' }
).trim()
await ctx.ingest({ type: 'window_focus', data: { app } })
}, 5000)
// Later: "what was I doing this afternoon?"
const result = await ctx.query('what was I working on?')
console.log(result.summary)
// => "[window_focus] app: VS Code | [window_focus] app: Firefox | [window_focus] app: Slack"
3. Webhook aggregation
Receive events from GitHub, Slack, PagerDuty, or any webhook source. Query the combined stream in natural language instead of checking each service individually.
import express from 'express'
import { ContextEngine } from 'context-engine-ai'
const ctx = new ContextEngine({ dbPath: './ops.db', maxEvents: 5000, decayHours: 48 })
const app = express()
app.use(express.json())
app.post('/webhook/github', async (req, res) => {
const { action, pull_request, repository } = req.body
await ctx.ingest({
type: 'github_pr',
data: { action, title: pull_request?.title, repo: repository?.full_name }
})
res.sendStatus(200)
})
app.post('/webhook/pagerduty', async (req, res) => {
const { event } = req.body
await ctx.ingest({
type: 'alert',
data: { severity: event?.severity, summary: event?.summary?.slice(0, 200) }
})
res.sendStatus(200)
})
// One query across all sources
app.get('/context', async (req, res) => {
const result = await ctx.query(req.query.q, parseInt(req.query.limit) || 10)
res.json(result)
})
app.listen(4000)
4. Other ideas
- Smart notifications — Check what the user is doing before interrupting them
- Meeting prep — Combine calendar + recent work + messages for automated briefings
- Log analysis — Ingest structured logs, query them with plain English
- IoT / sensor fusion — Unify events from multiple devices into one queryable stream
- Chat context — Feed conversation history + user activity into LLM system prompts
How It Works
Events In Embed Store Query
─────────────┐ ┌──────┐ ┌────────────┐ ┌──────────────┐
app_switch │───>│TF-IDF│───>│ SQLite / │<───│ "what is the │
calendar │ │ or │ │ pgvector │ │ user doing?"│
message │ │OpenAI │ │ │ └──────┬───────┘
terminal │ └──────┘ └────────────┘ │
git_commit │ dedup + prune cosine similarity
─────────────┘ + temporal decay
│
┌────▼────┐
│ Ranked │
│ Context │
└─────────┘
Step 1: Ingest
Events arrive as {type, data}. The engine serializes them to searchable text:
{ type: 'message', data: { from: 'Alice', text: 'PR ready' } }
→ "event:message from:Alice text:PR ready"
This text is embedded into a 128-dimensional vector (local TF-IDF) or a 1536-dimensional one (OpenAI). The engine then checks for near-duplicates — if a >95% similar event was ingested in the last 60 seconds, it merges instead of storing a duplicate.
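The serialization step can be sketched as follows. The package exports an eventToText helper; this standalone version mirrors the documented format rather than the exact implementation.

```javascript
// Sketch of the {type, data} → text serialization shown above.
// Mirrors the documented output format; not necessarily the library's exact code.
const eventToText = (type, data) =>
  `event:${type} ` + Object.entries(data).map(([k, v]) => `${k}:${v}`).join(' ')

console.log(eventToText('message', { from: 'Alice', text: 'PR ready' }))
// → "event:message from:Alice text:PR ready"
```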
Step 2: Query
Your natural language question is embedded and compared against every stored event using cosine similarity. Each result is then weighted by temporal decay:
finalScore = cosineSimilarity(query, event) × relevance × 0.5^(age / halfLife)
Concrete example — you query "any errors?" with decayHours: 24:
| Event | Cosine sim | Age | Decay | Final score |
|-------|-----------|-----|-------|-------------|
| [error] service: auth, count: 47 | 0.92 | 5 min | 0.9998 | 0.92 |
| [error] service: api, count: 3 | 0.89 | 6 hours | 0.84 | 0.75 |
| [test] result: 2 failed | 0.41 | 2 min | 0.9999 | 0.41 |
| [error] service: auth, count: 12 | 0.91 | 3 days | 0.125 | 0.11 |
The 5-minute-old auth error wins. The 3-day-old error — identical content — scores 8x lower. The summary string is pre-formatted for direct injection into LLM system prompts.
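The scoring formula can be reproduced in a few lines to sanity-check the table above. The package exports its own computeDecay; this standalone version assumes the formula exactly as written, with relevance taken as 1.0, consistent with the table rows.

```javascript
// Standalone version of the scoring formula above: score × 0.5^(age / halfLife).
// Assumes relevance = 1.0, matching the worked table.
const HOUR = 3_600_000
const computeDecay = (ageMs, halfLifeHours = 24) =>
  Math.pow(0.5, ageMs / (halfLifeHours * HOUR))

const finalScore = (cosineSim, relevance, ageMs, halfLifeHours = 24) =>
  cosineSim * relevance * computeDecay(ageMs, halfLifeHours)

console.log(computeDecay(72 * HOUR).toFixed(3))        // "0.125" — the 3-day-old row
console.log(finalScore(0.89, 1, 6 * HOUR).toFixed(2))  // "0.75"  — the 6-hour-old row
```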
Step 3: Prune
When event count exceeds maxEvents, the lowest-scoring oldest events are automatically removed. No cron jobs, no maintenance.
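A plausible sketch of that pruning policy, keeping the maxEvents highest-scoring events where score combines stored relevance with temporal decay. This is illustrative only, not the library's implementation.

```javascript
// Illustrative pruning policy: rank by relevance × decay, keep the top maxEvents.
const score = (e, now, halfLifeHours = 24) =>
  e.relevance * Math.pow(0.5, (now - e.timestamp) / (halfLifeHours * 3_600_000))

const prune = (events, maxEvents, now) =>
  events.length <= maxEvents
    ? events
    : [...events].sort((a, b) => score(b, now) - score(a, now)).slice(0, maxEvents)

const now = Date.now()
const events = [
  { id: 'fresh', relevance: 1.0, timestamp: now - 60_000 },          // 1 minute old
  { id: 'stale', relevance: 1.0, timestamp: now - 72 * 3_600_000 },  // 3 days old
  { id: 'recent', relevance: 0.9, timestamp: now - 3_600_000 },      // 1 hour old
]
console.log(prune(events, 2, now).map((e) => e.id))  // [ 'fresh', 'recent' ]
```

Under this policy an old event survives pruning only if its relevance outweighs its decay, which is why stale, low-signal events are the first to go.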
Local Embeddings
The default embedding provider uses TF-IDF with locality-sensitive hashing projected into 128 dimensions. No network calls, deterministic, sub-millisecond. It works well for structured event data where the vocabulary is predictable (event types, field names, common terms). Swap to OpenAI embeddings with one config change when you need true semantic search over ambiguous natural language.
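A toy hashed embedder in the spirit of that description: deterministic, fixed-dimension, no network calls. The real provider's hash function and TF-IDF weighting will differ; this only illustrates why such embeddings are instant and reproducible.

```javascript
// Toy hashing-trick embedder: 128 dims, deterministic, no network.
// Illustrative only — the library's hashing and TF-IDF weighting differ.
const DIMS = 128

const hashToken = (token) => {
  let h = 2166136261 // FNV-1a offset basis
  for (const ch of token) {
    h ^= ch.charCodeAt(0)
    h = Math.imul(h, 16777619)
  }
  return h >>> 0
}

const embed = (text) => {
  const vec = new Array(DIMS).fill(0)
  for (const token of text.toLowerCase().split(/\s+/).filter(Boolean)) {
    const h = hashToken(token)
    vec[h % DIMS] += h & 1 ? 1 : -1 // signed buckets soften collision bias
  }
  const norm = Math.hypot(...vec) || 1
  return vec.map((v) => v / norm) // unit length: dot product == cosine similarity
}

// Deterministic: the same text always embeds identically.
const a = embed('event:test result: 2 failed')
const b = embed('event:test result: 2 failed')
console.log(a.every((v, i) => v === b[i]))  // true
```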
Performance
Benchmarked on Apple Silicon with local TF-IDF + SQLite (in-memory):
| Operation | Latency | Notes |
|-----------|---------|-------|
| Ingest | ~0.1ms/event | Including embed + dedup check + store |
| Query | ~0.1ms/query | Across 1000 stored events |
| Memory | ~20MB heap | With 1000 events loaded |
Configuration
const ctx = new ContextEngine({
// Storage
storage: 'sqlite', // 'sqlite' (default) or 'postgres'
dbPath: './context.db', // SQLite file path (default: in-memory)
pgConnectionString: '...', // PostgreSQL connection string
// Embeddings
embeddingProvider: 'local', // 'local' (TF-IDF, default) or 'openai'
openaiApiKey: '...', // Required for OpenAI (or set OPENAI_API_KEY env var)
// Tuning
maxEvents: 1000, // Max stored events before pruning (default: 1000)
decayHours: 24, // Relevance half-life in hours (default: 24)
deduplicationWindow: 60000, // Dedup time window in ms (default: 60s)
deduplicationThreshold: 0.95, // Cosine similarity threshold for dedup (default: 0.95)
})
Storage: SQLite (default)
Zero config. Uses better-sqlite3. Stores embeddings as JSON arrays. Good for single-process use, prototyping, and edge deployments.
Pass dbPath to persist across restarts. Without it, uses in-memory storage (events lost on restart).
Storage: PostgreSQL + pgvector
Uses pgvector for native vector similarity search. Multi-process safe, production-ready, handles millions of events. Requires the vector extension to be installed.
const ctx = new ContextEngine({
storage: 'postgres',
pgConnectionString: 'postgresql://user:pass@localhost:5432/mydb',
})
Embeddings: Local (default)
TF-IDF with locality-sensitive hashing. 128-dimensional vectors. No external calls, instant, deterministic. Performs well for matching structured event data to natural language queries.
Embeddings: OpenAI
Uses text-embedding-3-small (1536-dimensional). Higher semantic quality for complex or ambiguous queries. Requires an API key.
const ctx = new ContextEngine({
embeddingProvider: 'openai',
openaiApiKey: 'sk-...', // or set OPENAI_API_KEY env var
})
API Reference
new ContextEngine(options?)
Create a new engine instance. See Configuration for all options.
ctx.ingest(event): Promise<StoredEvent>
Ingest an event. Embeds the event text, checks for duplicates, stores it, and prunes if over the limit. If a near-duplicate exists within the deduplication window, the existing event is updated instead of creating a new one.
interface EventInput {
type: string // Event category (e.g. 'app_switch', 'message', 'error')
data: Record<string, unknown> // Event payload — any key/value pairs
}
interface StoredEvent {
id: string
type: string
data: Record<string, unknown>
timestamp: number
embedding: number[]
relevance: number // 0.0 - 1.0
}
ctx.query(question, limit?): Promise<ContextResult>
Semantic search across stored events. Returns events ranked by cosine similarity to the query, weighted by temporal decay. Includes a pre-formatted summary string suitable for injecting into LLM prompts.
interface ContextResult {
summary: string // Human-readable: "[type] key: val | [type] key: val"
events: StoredEvent[] // Ranked by relevance × decay
query: string // The original query
timestamp: number // When the query was executed
}
ctx.recent(limit?): Promise<StoredEvent[]>
Get the most recent events ordered by timestamp. Default limit: 20.
ctx.count(): Promise<number>
Returns the number of events currently stored.
const n = await ctx.count()
console.log(`${n} events in context`)
ctx.clear(): Promise<void>
Remove all stored events.
ctx.serve(port?): Server
Start an Express HTTP server. Default port: 3334. Returns a Node.js http.Server.
ctx.close(): Promise<void>
Clean shutdown. Closes database connections and HTTP server.
Utility Exports
For advanced use — build custom storage backends or embedding providers:
import {
SQLiteStorage, // StorageAdapter implementation for SQLite
PostgresStorage, // StorageAdapter implementation for PostgreSQL + pgvector
LocalEmbeddingProvider, // TF-IDF embeddings (128-dim, no network)
OpenAIEmbeddingProvider, // OpenAI text-embedding-3-small (1536-dim)
createServer, // Express app factory
cosineSimilarity, // (a: number[], b: number[]) => number
computeDecay, // (timestamp, now, halfLifeHours) => number
eventToText, // (type, data) => string
} from 'context-engine-ai'
TypeScript
Full type definitions included:
import type {
StoredEvent,
EventInput,
ContextResult,
StorageAdapter, // Implement this to add custom storage backends
EmbeddingProvider, // Implement this to add custom embedding providers
EngineOptions,
} from 'context-engine-ai'
Examples
See the examples/ directory for runnable code:
| Example | Description |
|---------|-------------|
| basic.js | Event ingestion and semantic querying |
| server.js | Running as an HTTP service |
| ai-agent.js | Feeding context into Claude |
| agent-context.js | Building a structured context block for agent system prompts |
| webhook-server.js | Multi-source webhook aggregation |
| mcp-server.js | MCP tool server for Claude Desktop, Cursor, Windsurf |
| custom-storage.js | Implementing a custom storage adapter |
npx context-engine-ai demo # Interactive demo — no setup needed
node examples/basic.js # Library usage
node examples/ai-agent.js # Agent integration (needs ANTHROPIC_API_KEY)
Pricing
The npm package is free and open source (MIT) — every feature, no limits, no API keys required.
For managed infrastructure, we offer a cloud API:
| | Open Source | Pro | Team | Enterprise |
|---|---|---|---|---|
| Price | Free | $29/mo | $99/mo | Custom |
| Full library + CLI + MCP | Yes | Yes | Yes | Yes |
| Self-hosted (SQLite / PostgreSQL) | Yes | Yes | Yes | Yes |
| Local + OpenAI embeddings | BYOK | Included | Included | Included |
| Managed Cloud API | — | Yes | Yes | Yes |
| Events/month | Unlimited | 50,000 | 500,000 | Unlimited |
| Support | GitHub Issues | Email | Priority | Dedicated + SLA |
Early access: email [email protected] with subject "Cloud API Access" — first adopters get 3 months free.
Documentation
- Quick Start Guide — running in under 2 minutes
- Architecture Overview
- Custom Adapters
- Deployment Guide
- Pricing
Requirements
- Node.js >= 18
- No external services required (default configuration)
- Optional: PostgreSQL with pgvector extension (for production scale)
- Optional: OpenAI API key (for higher-quality embeddings)
Development
git clone https://github.com/Quinnod345/context-engine.git
cd context-engine
npm install
npm run build # Compile TypeScript
npm test # Run test suite
npm run dev # Watch mode
Contributing
See CONTRIBUTING.md. Some ideas:
- New storage adapters (Redis, DuckDB, Turso)
- New embedding providers (Cohere, local ONNX models)
- Browser extension for automatic context capture
- Streaming ingestion via WebSocket
Star History
If context-engine-ai saves you time, a ⭐ on GitHub helps others find it.