@cortexa/core

v1.1.2

Published 4 months ago

The intelligence layer that turns any database into a self-analyzing, self-explaining system.

Cortexa

Documentation · CLI Reference · API Reference · Bug Reports

What is Cortexa?

Databases hold your application's truth, but they can't tell you what's happening inside them. You find out about problems after users complain, and understanding why something went wrong means digging through logs manually.

Cortexa changes that. It connects to your existing database as a read-only observer, with no migrations, no ORMs, and no schema changes. It discovers your schema automatically, watches for changes in real time, learns what "normal" looks like, and alerts you when something is off. Then it explains why in plain English.

Your Database ──(read-only)──> Cortexa ──> Intelligence

Key principles

Read-only by design: Cortexa never writes to your database. All intelligence is stored locally in .cortexa/cortexa.db (SQLite).
Zero configuration schema: Point it at your database and it introspects everything automatically. No models to define, no schemas to maintain.
Database-agnostic: Works with PostgreSQL, MySQL, SQLite, MariaDB, CockroachDB, MongoDB, and SQL Server. Drivers for MongoDB and SQL Server are optional installs.
LLM-powered reasoning: Uses AI to classify entities, explain anomalies, trace causal chains, and answer natural language questions about your data.

What it does

| Capability | What it does |
|---|---|
| Schema Discovery | Introspects tables, columns, foreign keys. LLM classifies entity types (transaction, user, config, etc.) and maps relationships. |
| Change Detection | Polls or streams (CDC) for INSERTs, UPDATEs, DELETEs. Tracks per-table operation counts over time. |
| Behavioral Baselines | Learns normal rates (inserts/min, updates/min) per entity using rolling statistics. Adapts as your application evolves. |
| Anomaly Detection | Flags rate spikes, rate drops, and stuck records by comparing live activity against learned baselines. |
| State Reasoning | Tracks state machine transitions (e.g. pending → confirmed → shipped). Detects skipped states and stuck workflows. |
| Cross-Entity Analytics | Correlates activity across related entities (e.g. orders and payments). Tracks value distributions and temporal patterns. |
| Knowledge Graph | Connects entities, events, anomalies, and insights into a traversable causal graph. Find root causes by following edges. |
| Autonomous Actions | Rule-based recommendations with configurable governance: advisory, autonomous, or manual. |
| Explain | AI-powered root cause analysis. Ask "why did this anomaly happen?" and get a structured explanation. |
| Ask | Natural language interface. Query your entire intelligence stack in plain English. |
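
The Behavioral Baselines and Anomaly Detection rows describe comparing live rates against rolling statistics. A minimal sketch of that idea, assuming a fixed-size sample window and a z-score threshold, might look like the following. The class name, window size, warm-up length, and threshold are illustrative assumptions, not Cortexa's actual algorithm or API.

```typescript
// Hypothetical sketch: NOT Cortexa's implementation.
// Keeps a rolling window of per-minute operation counts and flags new
// values that sit far from the window's mean.

type RateAnomaly = {
  kind: "rate_spike" | "rate_drop";
  value: number;
  mean: number;
  zScore: number;
};

class RollingBaseline {
  private samples: number[] = [];
  private windowSize: number;
  private threshold: number;
  private minSamples: number;

  // Defaults are assumptions chosen for illustration only.
  constructor(windowSize = 60, threshold = 3, minSamples = 10) {
    this.windowSize = windowSize;
    this.threshold = threshold;
    this.minSamples = minSamples;
  }

  /** Mean of the samples currently in the window (0 when empty). */
  mean(): number {
    if (this.samples.length === 0) return 0;
    return this.samples.reduce((a, b) => a + b, 0) / this.samples.length;
  }

  /** Population standard deviation of the window (0 when empty). */
  stdDev(): number {
    if (this.samples.length === 0) return 0;
    const m = this.mean();
    const variance =
      this.samples.reduce((acc, x) => acc + (x - m) * (x - m), 0) / this.samples.length;
    return Math.sqrt(variance);
  }

  /** Compare a new sample with the baseline, then fold it into the window. */
  observe(value: number): RateAnomaly | null {
    let anomaly: RateAnomaly | null = null;
    const m = this.mean();
    const sd = this.stdDev();
    // Judge only once there is enough history and some variance to compare against.
    if (this.samples.length >= this.minSamples && sd > 0) {
      const zScore = (value - m) / sd;
      if (Math.abs(zScore) > this.threshold) {
        anomaly = { kind: zScore > 0 ? "rate_spike" : "rate_drop", value, mean: m, zScore };
      }
    }
    // Dropping the oldest sample lets the baseline adapt as traffic changes.
    this.samples.push(value);
    if (this.samples.length > this.windowSize) this.samples.shift();
    return anomaly;
  }
}
```

One instance per entity and operation type (for example, inserts/min on orders) would receive one `observe()` call per interval. Because the flagged sample is still folded into the window, a sustained shift in traffic eventually becomes the new normal.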

Quick Start

1. Install

npm install @cortexa/core

2. Initialize configuration

npx cortexa init

This generates cortexa.config.ts in your project root:

import { defineConfig } from '@cortexa/core';

export default defineConfig({
  connection: {
    type: 'postgres',   // 'mysql' | 'sqlite' | 'mariadb' | 'cockroachdb' | 'mongodb' | 'mssql'
    url: process.env.DATABASE_URL,
  },
  llm: {
    provider: 'openai', // 'anthropic' | 'deepseek'
    apiKey: process.env.OPENAI_API_KEY,
  },
});

3. Test connection

npx cortexa status

Connected to mydb (PostgreSQL)
Tables: 24
Storage: .cortexa/cortexa.db (initialized)

4. Discover your schema

npx cortexa discover

Cortexa introspects every table, sends the schema to your LLM for entity classification, and maps relationships from foreign keys. Results are stored locally.
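
To illustrate how relationships can be derived from foreign keys, here is a minimal sketch. It assumes a hypothetical `ForeignKey` shape for introspected metadata and treats a foreign key as many-to-one unless its column is known to be unique. The `Relationship` shape follows the `{ from, to, type }` objects that `discover()` returns in the Programmatic API section; the `'one-to-one'` type and the function itself are assumptions, not Cortexa's internals.

```typescript
// Hypothetical shape for introspected foreign-key metadata (an assumption).
interface ForeignKey {
  table: string;
  column: string;
  referencesTable: string;
  referencesColumn: string;
}

// Mirrors the relationship objects shown in this README; 'one-to-one' is assumed.
interface Relationship {
  from: string;
  to: string;
  type: "many-to-one" | "one-to-one";
}

/**
 * Derive relationships from foreign keys.
 * uniqueColumns holds "table.column" strings for columns with a unique constraint.
 */
function mapRelationships(
  foreignKeys: ForeignKey[],
  uniqueColumns: Set<string> = new Set<string>(),
): Relationship[] {
  const seen = new Set<string>();
  const result: Relationship[] = [];
  for (const fk of foreignKeys) {
    // A unique FK column means at most one child row per parent row.
    const unique = uniqueColumns.has(`${fk.table}.${fk.column}`);
    const rel: Relationship = {
      from: fk.table,
      to: fk.referencesTable,
      type: unique ? "one-to-one" : "many-to-one",
    };
    // Skip duplicates, e.g. composite keys reported once per column.
    const key = `${rel.from}->${rel.to}:${rel.type}`;
    if (!seen.has(key)) {
      seen.add(key);
      result.push(rel);
    }
  }
  return result;
}
```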

5. Watch for changes

npx cortexa watch

Starts the intelligence pipeline: it polls for changes, builds baselines, detects anomalies, tracks state transitions, and generates insights, all in real time.

6. Ask questions

npx cortexa ask "What is the overall health of the database?"
npx cortexa ask "Are orders and payments correlated?"
npx cortexa explain anomaly 1

Programmatic API

Everything available through the CLI is also available as a TypeScript API:

import { Cortexa } from '@cortexa/core';

const cortexa = new Cortexa({
  connection: {
    type: 'postgres',
    host: 'localhost',
    port: 5432,
    database: 'myapp',
    user: 'readonly_user',
    password: process.env.DB_PASSWORD,
  },
  llm: {
    provider: 'openai',
    apiKey: process.env.OPENAI_API_KEY,
  },
  knowledge: { enabled: true },
});

await cortexa.connect();

Schema Discovery

const { entities, relationships } = await cortexa.discover();
// entities: [{ name: 'orders', type: 'transaction', columns: [...] }, ...]
// relationships: [{ from: 'orders', to: 'users', type: 'many-to-one' }, ...]

Real-time Monitoring

cortexa.on('event', (event) => console.log('Change:', event));
cortexa.on('anomaly', (anomaly) => console.log('Anomaly:', anomaly));
cortexa.on('insight', (insight) => console.log('Insight:', insight));

await cortexa.watch();

Querying Intelligence

const events     = cortexa.getEvents({ entity: 'orders', last: 100 });
const anomalies  = cortexa.getAnomalies({ severity: 'high' });
const baselines  = cortexa.getBaselines();
const transitions = cortexa.getTransitions('orders');

Knowledge Graph

const graph   = cortexa.graph();
const summary = graph.getSummary();           // node/edge counts, top entities
const causes  = graph.causesOf(nodeId);       // BFS traversal of causal chain
const impact  = graph.impactOf(nodeId);       // downstream effects
const intel   = graph.entity('orders').intelligence(); // per-entity aggregation
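
The comment on `causesOf` mentions a BFS traversal of the causal chain. A minimal, self-contained sketch of that kind of traversal is shown below. It assumes the graph is a flat list of cause/effect edges; the `CausalEdge` type and `findCauses` function are hypothetical and do not reflect how Cortexa stores its graph.

```typescript
// Hypothetical edge shape: "cause" led to "effect".
type CausalEdge = { cause: string; effect: string };

/**
 * Breadth-first walk backwards along causal edges.
 * Returns all upstream causes of nodeId, nearest causes first.
 */
function findCauses(edges: CausalEdge[], nodeId: string): string[] {
  // Index: effect -> its direct causes.
  const incoming = new Map<string, string[]>();
  for (const edge of edges) {
    const list = incoming.get(edge.effect) ?? [];
    list.push(edge.cause);
    incoming.set(edge.effect, list);
  }

  const visited = new Set<string>([nodeId]); // guards against cycles
  const queue: string[] = [nodeId];
  const order: string[] = [];

  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const cause of incoming.get(current) ?? []) {
      if (!visited.has(cause)) {
        visited.add(cause);
        order.push(cause);
        queue.push(cause);
      }
    }
  }
  return order;
}
```

Swapping the index to map each cause to its effects gives the downstream direction that `impactOf` describes.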

AI-Powered Analysis

const explanation = await cortexa.explain({ type: 'anomaly', id: 1 });
const answer = await cortexa.ask('Why did order activity spike today?');

await cortexa.disconnect();

Architecture

Cortexa builds intelligence through a layered pipeline. Each layer feeds into the next, producing progressively higher-level understanding of your database.

┌─────────────────────────────────────────────────────────┐
│                     Your Database                       │
│              (read-only connection)                     │
└───────────────────────┬─────────────────────────────────┘
                        │
                        v
┌───────────────────────────────────────────────────────┐
│              Schema Discovery                         │
│  Tables, columns, FKs, indexes                        │
│  LLM classifies entity types & maps relationships     │
└───────────────────────┬───────────────────────────────┘
                        │
                        v
┌───────────────────────────────────────────────────────┐
│              Change Detection                         │
│  Polling (hash-based diffing) or CDC streaming        │
│  INSERT / UPDATE / DELETE per table                   │
└───────────┬───────────┬───────────┬───────────────────┘
            │           │           │
            v           v           v
     ┌──────────┐ ┌──────────┐ ┌──────────┐
     │ Baselines│ │  State   │ │ Analytics│
     │ rolling  │ │ Machines │ │ correlate│
     │ stats    │ │ workflow │ │ distribs │
     │ per-op   │ │ tracking │ │ temporal │
     └────┬─────┘ └────┬─────┘ └────┬─────┘
          │             │            │
          v             v            v
┌───────────────────────────────────────────────────────┐
│              Anomaly Detection                        │
│  Rate spikes, rate drops, stuck records               │
│  Skipped states, unexpected transitions               │
└───────────────────────┬───────────────────────────────┘
                        │
                        v
┌───────────────────────────────────────────────────────┐
│              Knowledge Graph                          │
│  Entities, events, anomalies, insights                │
│  Causal chains, traversal, per-entity intelligence    │
└───────────┬───────────────────────┬───────────────────┘
            │                       │
            v                       v
     ┌──────────────┐       ┌──────────────┐
     │   Actions    │       │ Explain / Ask│
     │  rule-based  │       │  AI-powered  │
     │  governance  │       │  natural     │
     │  pipeline    │       │  language    │
     └──────────────┘       └──────────────┘

Data flow: Your database is never modified. Cortexa reads schema metadata and change data, processes it through each layer, and stores all derived intelligence in a local SQLite file (.cortexa/cortexa.db). The LLM is called only for schema classification, explain, and ask — all other intelligence is computed locally.
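
The diagram mentions polling with hash-based diffing. As a rough sketch of that technique, the code below fingerprints each row, keeps a snapshot keyed by primary key, and compares two snapshots to derive INSERT, UPDATE, and DELETE events. The hash function, the snapshot layout, and all names are assumptions for illustration; Cortexa's actual implementation may differ.

```typescript
type Row = Record<string, unknown>;
type Change = { op: "INSERT" | "UPDATE" | "DELETE"; key: string };

/** FNV-1a style 32-bit hash over UTF-16 code units, used as a cheap fingerprint. */
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

/** Fingerprint a row; keys are sorted so column order does not matter. */
function fingerprint(row: Row): number {
  const stable = Object.keys(row).sort().map((k) => [k, row[k]]);
  return fnv1a(JSON.stringify(stable));
}

/** Build a snapshot: primary key (as string) -> row fingerprint. */
function snapshot(rows: Row[], keyColumn: string): Map<string, number> {
  const snap = new Map<string, number>();
  for (const row of rows) {
    snap.set(String(row[keyColumn]), fingerprint(row));
  }
  return snap;
}

/** Compare two snapshots and report what changed between polls. */
function diffSnapshots(prev: Map<string, number>, next: Map<string, number>): Change[] {
  const changes: Change[] = [];
  next.forEach((hash, key) => {
    const old = prev.get(key);
    if (old === undefined) changes.push({ op: "INSERT", key });
    else if (old !== hash) changes.push({ op: "UPDATE", key });
  });
  prev.forEach((_hash, key) => {
    if (!next.has(key)) changes.push({ op: "DELETE", key });
  });
  return changes;
}
```

Storing only fingerprints keeps the snapshot small, at the cost of a small chance that a 32-bit collision hides an update. A wider hash would reduce that risk.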

Supported Databases

| Database | Type | Polling | Streaming (CDC) | Driver |
|----------|------|:-------:|:---------------:|--------|
| PostgreSQL | Relational | Yes | Yes (Logical Replication) | pg (included) |
| MySQL | Relational | Yes | Yes (Binlog) | mysql2 (included) |
| SQLite | Embedded | Yes | No | better-sqlite3 (included) |
| MariaDB | Relational | Yes | Yes (Binlog) | mysql2 (included) |
| CockroachDB | Distributed | Yes | No | pg (included) |
| MongoDB | Document | Yes | Yes (Change Streams) | mongodb (optional) |
| SQL Server | Relational | Yes | No | mssql (optional) |

Included drivers ship with Cortexa, so no extra install is needed. Optional drivers and the CDC streaming packages require a separate install:

# MongoDB
npm install mongodb

# SQL Server
npm install mssql

# PostgreSQL CDC streaming
npm install pg-logical-replication

# MySQL / MariaDB CDC streaming
npm install @powersync/mysql-zongji

CLI Reference

| Command | Description |
|---------|-------------|
| cortexa init | Generate config file (--demo for full example) |
| cortexa status | Test connection and show table count |
| cortexa discover | Discover and classify schema |
| cortexa entities | List classified entities |
| cortexa relationships | List entity relationships |
| cortexa watch | Start the intelligence pipeline (--once for single poll) |
| cortexa events | List recent change events |
| cortexa baselines | Show learned rate baselines |
| cortexa anomalies | List detected anomalies |
| cortexa insights | List insights from state analysis |
| cortexa transitions | Show state transition stats |
| cortexa correlations | Show cross-entity correlations |
| cortexa distributions | Show column value distributions |
| cortexa graph | Knowledge graph summary and traversal |
| cortexa actions | View and manage action recommendations |
| cortexa explain <type> <id> | AI explanation of anomaly, insight, or event |
| cortexa ask "<question>" | Ask a natural language question |
| cortexa serve | Start REST API server (--port, --host) |

REST API

Start the HTTP server to access Cortexa from any language (Python, Go, Ruby, etc.):

npx cortexa serve
# Cortexa API running at http://127.0.0.1:3210

Options: --port <port> (default: 3210), --host <host> (default: 127.0.0.1), --no-cors.

Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/status | Connection status |
| GET | /api/entities | List classified entities |
| GET | /api/relationships | List entity relationships |
| GET | /api/events | List change events (?entity=, ?last=) |
| GET | /api/baselines | Learned rate baselines |
| GET | /api/anomalies | Detected anomalies (?severity=, ?entity=) |
| GET | /api/insights | State analysis insights (?entity=, ?severity=) |
| GET | /api/transitions | Transition statistics (?entity=) |
| GET | /api/correlations | Cross-entity correlations |
| GET | /api/distributions | Value distributions (?entity=) |
| GET | /api/graph | Knowledge graph summary |
| GET | /api/graph/export | Full graph as JSON |
| GET | /api/graph/entity/:name | Entity intelligence |
| GET | /api/actions | Recommendations (?status=, ?action=) |
| POST | /api/discover | Trigger schema discovery |
| POST | /api/explain | AI explanation ({ type, id }) |
| POST | /api/ask | Natural language question ({ question }) |
| POST | /api/watch | Start watching ({ interval, once }) |
| POST | /api/unwatch | Stop watching |
| POST | /api/actions/:id/approve | Approve recommendation |
| POST | /api/actions/:id/reject | Reject recommendation |

All responses return { ok: boolean, data?: ..., error?: string }.
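
For TypeScript callers, the response envelope can be modeled as a generic type with a small helper that turns failed responses into exceptions. This is a sketch on the client side only: `ApiResponse` and `unwrap` are hypothetical names, not exports of @cortexa/core, and whether every successful response carries `data` is not stated here, so the helper allows `undefined`.

```typescript
// Models the documented envelope: { ok: boolean, data?: ..., error?: string }.
type ApiResponse<T> = { ok: boolean; data?: T; error?: string };

/**
 * Return the payload of a successful response, or throw with the
 * server-provided error message when ok is false.
 */
function unwrap<T>(response: ApiResponse<T>): T | undefined {
  if (!response.ok) {
    throw new Error(response.error ?? "Unknown Cortexa API error");
  }
  return response.data;
}
```

On Node.js 18 or later, the parsed JSON body of a `fetch` call to any endpoint above can be passed straight to `unwrap`.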

Examples

# Get anomalies
curl "http://localhost:3210/api/anomalies?severity=high"

# Ask a question
curl -X POST http://localhost:3210/api/ask \
  -H "Content-Type: application/json" \
  -d '{"question": "Why did order activity spike today?"}'

# Explain an anomaly
curl -X POST http://localhost:3210/api/explain \
  -H "Content-Type: application/json" \
  -d '{"type": "anomaly", "id": 1}'

# Python example
import requests

r = requests.get("http://localhost:3210/api/anomalies", params={"severity": "high"})
print(r.json()["data"])

r = requests.post("http://localhost:3210/api/ask", json={"question": "Are orders healthy?"})
print(r.json()["data"]["answer"])

Programmatic usage

import { CortexaServer } from '@cortexa/core';

const server = new CortexaServer(config, { port: 3210, cors: true });
await server.start();
// ... later
await server.stop();

Configuration

The config file supports fine-grained control over every layer of the pipeline:

import { defineConfig } from '@cortexa/core';

export default defineConfig({
  // ── Database Connection ──────────────────────────────────
  connection: {
    type: 'postgres',              // 'mysql' | 'sqlite' | 'mariadb' | 'cockroachdb' | 'mongodb' | 'mssql'
    url: process.env.DATABASE_URL, // or use host/port/database/user/password
  },

  // ── LLM Provider ────────────────────────────────────────
  llm: {
    provider: 'openai',           // 'anthropic' | 'deepseek'
    apiKey: process.env.OPENAI_API_KEY,
    model: 'gpt-4o-mini',         // model to use for classification and analysis
    retry: {
      maxRetries: 3,
      initialDelayMs: 1000,
    },
  },

  // Number of tables to classify per LLM batch
  batchSize: 5,

  // ── State Machine Tracking ──────────────────────────────
  reasoning: {
    workflows: {
      orders: {
        stateColumn: 'status',
        expectedTransitions: [
          'pending -> confirmed',
          'confirmed -> shipped',
          'shipped -> delivered',
        ],
        stuckThreshold: '24h',   // flag records stuck in a state
      },
    },
  },

  // ── Cross-Entity Analytics ──────────────────────────────
  analytics: {
    correlations: {
      'order-payment': {
        entities: ['orders', 'payments'],
        timeWindow: '5m',        // correlate events within this window
      },
    },
    distributions: {
      orders: {
        columns: ['total'],      // track value distributions
        bucketCount: 10,
      },
    },
  },

  // ── Knowledge Graph ─────────────────────────────────────
  knowledge: {
    enabled: true,               // build causal graph from events + anomalies
  },

  // ── Autonomous Actions ──────────────────────────────────
  actions: {
    governance: 'advisory',      // 'autonomous' | 'manual'
    rules: [{
      trigger: 'anomaly',
      condition: { severity: ['critical', 'high'] },
      action: 'notify_team',
    }],
  },

  // ── Webhook Notifications ───────────────────────────────
  notifications: {
    enabled: true,
    rules: [{
      triggers: ['anomaly', 'insight'],
      targets: [{ url: process.env.SLACK_WEBHOOK_URL, type: 'slack' }],
      filter: { severity: ['critical', 'high'] },
    }],
  },
});
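
To make the `reasoning.workflows` options more concrete, here is a minimal sketch of how `expectedTransitions` strings and a `stuckThreshold` duration could be interpreted. It assumes the `'from -> to'` format shown above and duration suffixes `s`, `m`, `h`, and `d`. All function names are hypothetical, and the set of duration units Cortexa actually accepts is not documented here.

```typescript
/** Parse durations such as '5m' or '24h' into milliseconds (units assumed: s, m, h, d). */
function parseDuration(text: string): number {
  const match = /^(\d+)([smhd])$/.exec(text.trim());
  if (!match) throw new Error(`Unsupported duration: ${text}`);
  const amount = Number(match[1]);
  const unit = match[2];
  const factor =
    unit === "s" ? 1000 : unit === "m" ? 60000 : unit === "h" ? 3600000 : 86400000;
  return amount * factor;
}

/** Turn ['pending -> confirmed', ...] into a set of normalized 'from->to' keys. */
function buildTransitionSet(expected: string[]): Set<string> {
  const allowed = new Set<string>();
  for (const entry of expected) {
    const parts = entry.split("->").map((s) => s.trim());
    const from = parts[0];
    const to = parts[1];
    if (parts.length !== 2 || !from || !to) {
      throw new Error(`Malformed transition: ${entry}`);
    }
    allowed.add(`${from}->${to}`);
  }
  return allowed;
}

/** True when an observed state change is one of the expected transitions. */
function isExpectedTransition(allowed: Set<string>, from: string, to: string): boolean {
  return allowed.has(`${from}->${to}`);
}

/** True when a record has sat in its current state longer than the threshold. */
function isStuck(enteredStateAt: Date, now: Date, threshold: string): boolean {
  return now.getTime() - enteredStateAt.getTime() > parseDuration(threshold);
}
```

With the `orders` workflow above, a jump from `pending` straight to `shipped` would fail `isExpectedTransition`, which corresponds to a skipped state.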

Requirements

Node.js >= 18
A supported database
An LLM API key (OpenAI, Anthropic, or DeepSeek) — required for schema classification, explain, and ask commands

Contributing

Contributions are welcome. Please open an issue first to discuss what you'd like to change.

git clone https://github.com/Mohammed3MG/cortexa.git
cd cortexa
npm install
npm test          # run unit tests
npm run build     # build the project
npm run typecheck # verify types

License

Apache 2.0