cognitive-core
v0.1.2
TypeScript-native cognitive core for adaptive learning and abstraction
A TypeScript learning system for AI agents. Records how agents solve problems, extracts reusable playbooks and factual knowledge from trajectories, and injects relevant guidance into future tasks.
Table of Contents
- Motivation
- Installation
- Quick Start
- How It Works
- Knowledge Bank
- CLI
- Agent Backends
- Skill Library
- Configuration
- Core Types
- Research References
- Limitations
- Contributing
- License
Motivation
cognitive-core gives agents persistent, structured memory:
- Trajectories record what the agent did (ReAct-style thought/action/observation steps)
- Playbooks distill trajectories into reusable guidance (strategy, tactics, verification criteria)
- Knowledge Bank extracts and organizes factual knowledge — what agents learn about tools, libraries, and patterns
- Routing matches incoming tasks to relevant playbooks before the agent starts working
- Meta-learning tracks which playbooks helped and adjusts routing over time
The result: agents that get measurably better at recurring problem types without fine-tuning or prompt engineering.
Installation
npm install cognitive-core
Requires Node.js 18+.
Quick Start
Solve a task with memory-augmented agents
import { createAtlasWithAgents, createTask, createMockBackend } from 'cognitive-core';
const atlas = createAtlasWithAgents(
[createMockBackend()],
{ storage: { baseDir: '.cognitive-core' } }
);
await atlas.init();
const result = await atlas.solve(createTask({
domain: 'code',
description: 'Fix the TypeScript compilation error in auth.ts',
}));
console.log(result.trajectory.outcome.success); // true
console.log(result.routing?.strategy); // 'direct' | 'adapt' | 'explore' | 'fallback'
console.log(result.injectedPlaybooks?.length); // number of playbooks injected
await atlas.close();
Feed trajectories from external agents
cognitive-core can learn from trajectories produced by external agents. Record what happened elsewhere and feed it in.
import { createAtlas, createTrajectory, createTask, createStep, successOutcome } from 'cognitive-core';
const atlas = createAtlas({ storage: { baseDir: '.cognitive-core' } });
await atlas.init();
const trajectory = createTrajectory({
task: createTask({
domain: 'code',
description: 'Fix the null pointer exception in user service',
}),
steps: [
createStep({
thought: 'Check where the null value originates',
action: 'Read src/services/user.ts',
observation: 'getUserById returns undefined when user not found',
}),
createStep({
thought: 'Add a guard clause before accessing user properties',
action: 'Edit src/services/user.ts to add null check',
observation: 'Added: if (!user) return null',
}),
],
outcome: successOutcome('Fixed by adding null check in getUserById'),
agentId: 'claude-code',
});
const result = await atlas.processTrajectory(trajectory);
console.log(result.abstractable); // true - system can extract a playbook from this
await atlas.close();
Query memory directly
const context = await atlas.queryMemory('typescript import resolution error', {
domains: ['code'],
includePlaybooks: true,
});
for (const { playbook, score } of context.playbooks) {
console.log(`${playbook.name} (${Math.round(score * 100)}% match)`);
console.log(` Strategy: ${playbook.guidance.strategy}`);
console.log(` Tactics: ${playbook.guidance.tactics.join(', ')}`);
}
How It Works
Solve flow
Every call to atlas.solve(task) runs through this pipeline:
flowchart TD
Task[Task arrives] --> Router[TaskRouter]
Router -->|queries| Memory[MemorySystem]
Memory --> Exp[ExperienceMemory]
Memory --> PB[PlaybookLibrary]
Memory --> KB[KnowledgeBank]
Memory --> Meta[MetaMemory]
Router --> Decision{RoutingDecision}
Decision --> Agent[AgentManager]
Agent -->|injects playbooks + knowledge| Backend[Backend]
Backend -->|returns trajectory| Session[AgentSession]
Session --> Check{Succeeded?}
Check -->|Yes| Learn[LearningPipeline]
Check -->|No| Refine{Retry?}
Refine -->|Yes, max 3x| Agent
Refine -->|No| Learn
Learn --> Analyze[TrajectoryAnalyzer]
Learn --> Extract[PlaybookExtractor]
Learn --> KnowExtract[KnowledgeExtractor]
Learn --> Usage[UsageInference]
Learn --> MetaLearn[MetaLearner]
Analyze --> Updated[Memory updated]
Extract --> Updated
KnowExtract --> Updated
Usage --> Updated
MetaLearn --> Updated
Updated -.->|next task| Router
Playbook lifecycle
A playbook starts with low confidence (0.3) after extraction from a cluster of similar trajectories. Each time it's injected into an agent and the task succeeds, confidence grows. After enough successful uses (default: 5 successes, 80%+ success rate), it gets promoted to a core skill that's always in the system prompt. If it starts failing in specific contexts, the system records refinements rather than discarding the playbook entirely.
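The promotion and demotion rules above can be sketched as a pair of predicates. This is an illustrative reading of the documented thresholds (0.85 confidence, 5 successes, 80% success rate, 3 consecutive failures), not the library's internal code; the `PlaybookStats` shape is an assumption.

```typescript
// Sketch of the promotion/demotion thresholds described above.
// `PlaybookStats` is an illustrative shape, not cognitive-core's API.
interface PlaybookStats {
  confidence: number;          // 0-1, starts at 0.3 after extraction
  successCount: number;
  failureCount: number;
  consecutiveFailures: number;
}

function shouldPromoteToCore(s: PlaybookStats): boolean {
  const total = s.successCount + s.failureCount;
  const successRate = total > 0 ? s.successCount / total : 0;
  // Core tier: 0.85+ confidence, 5+ successes, 80%+ success rate.
  return s.confidence >= 0.85 && s.successCount >= 5 && successRate >= 0.8;
}

function shouldDemoteFromCore(s: PlaybookStats): boolean {
  // Three consecutive failures trigger demotion review.
  return s.consecutiveFailures >= 3;
}
```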
stateDiagram-v2
[*] --> Extracted : pattern found across trajectories
Extracted --> Contextual : confidence above 0.3
Contextual --> Domain : tagged to domain
Contextual --> Contextual : success
Domain --> Domain : success
Domain --> Core : 0.85+ confidence, 5+ successes, 80%+ rate
Core --> Core : success
Core --> Domain : 3 consecutive failures
Domain --> Contextual : confidence drops
Contextual --> Refined : failure in specific context
Refined --> Contextual : refinement recorded
Memory architecture
The four memory stores serve different retrieval patterns:
graph LR
subgraph MemorySystem
E[ExperienceMemory]
P[PlaybookLibrary]
K[KnowledgeBank]
M[MetaMemory]
end
subgraph Search
BM[BM25 Index]
VS[sqlite-vec]
TS[Text Similarity]
end
subgraph Providers
OAI[OpenAI]
VOY[Voyage]
HF[HuggingFace local]
MM[minimem optional]
end
E --> BM
E --> VS
P --> BM
P --> VS
K --> TS
K -.-> MM
VS -.-> OAI
VS -.-> VOY
VS -.-> HF
subgraph Storage
JSON[JSON files]
SQL[SQLite]
MD[Markdown + YAML]
end
E --> JSON
P --> JSON
M --> JSON
K --> MD
VS --> SQL
| Store | Purpose | Format |
|-------|---------|--------|
| ExperienceMemory | Episodic — what happened | JSON |
| PlaybookLibrary | Procedural — how to do things | JSON |
| KnowledgeBank | Semantic — facts, concepts, relationships | Markdown + YAML frontmatter |
| MetaMemory | Meta-learning — what worked | JSON |
Search and matching
Memory search uses BM25 text matching by default. For higher-quality retrieval, plug in an embedding provider (OpenAI, Voyage, or local via @huggingface/transformers) and cognitive-core switches to hybrid BM25 + vector search backed by sqlite-vec.
Knowledge bank search uses text similarity by default, with optional delegation to minimem for hybrid vector + BM25 search when available.
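A minimal sketch of what hybrid BM25 + vector ranking can look like: BM25 scores are normalized so they are comparable to cosine similarity, then the two are blended. The fusion weight and normalization here are illustrative assumptions, not cognitive-core's actual implementation.

```typescript
// Illustrative hybrid-retrieval score fusion; how cognitive-core
// actually combines BM25 and vector scores is an internal detail.
interface ScoredDoc { id: string; bm25: number; cosine: number }

function hybridRank(docs: ScoredDoc[], vectorWeight = 0.5): ScoredDoc[] {
  // Normalize BM25 to [0, 1] so it is comparable to cosine similarity.
  const maxBm25 = Math.max(...docs.map((d) => d.bm25), 1e-9);
  const score = (d: ScoredDoc) =>
    (1 - vectorWeight) * (d.bm25 / maxBm25) + vectorWeight * d.cosine;
  return [...docs].sort((a, b) => score(b) - score(a));
}
```

With `vectorWeight = 0` this degrades to pure BM25 ordering, which mirrors the default (no embedding provider) behavior.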
Knowledge Bank
The knowledge bank is a semantic memory store for facts, concepts, and relationships that agents learn from experience. While playbooks capture how to do things, the knowledge bank captures what agents know — facts about tools, libraries, version-specific behavior, and causal relationships.
Knowledge note types
| Type | Purpose | Example |
|------|---------|---------|
| Observation | Atomic fact learned from experience | "Prisma requires db push before migrate dev when schema has drifted" |
| Entity | Living document about a tool/library/pattern | "Everything I know about Prisma" |
| Domain summary | High-level overview of a knowledge domain | "Database knowledge overview" |
How knowledge is stored
Knowledge notes are plain Markdown files with structured YAML frontmatter, organized on the filesystem:
knowledge/
├── observations/ # Atomic facts (k-*.md)
├── entities/ # Living entity docs (prisma.md, vitest.md)
└── domains/       # Domain summaries (database.md, testing.md)
Any agent with filesystem access can browse, grep, and read knowledge directly — no special API required.
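To make the layout concrete, here is a hypothetical knowledge note and a minimal frontmatter parser. The frontmatter field names are examples only; the exact schema cognitive-core writes may differ.

```typescript
// A hypothetical knowledge note in the Markdown + YAML frontmatter
// layout. Field names (type, entity, confidence) are illustrative.
const note = `---
type: observation
entity: prisma
confidence: 0.5
---
Prisma requires \`db push\` before \`migrate dev\` when the schema has drifted.`;

// Minimal parser: split the leading --- block into key/value pairs.
function parseFrontmatter(md: string): { meta: Record<string, string>; body: string } {
  const match = md.match(/^---\n([\s\S]*?)\n---\n?/);
  if (!match) return { meta: {}, body: md };
  const meta: Record<string, string> = {};
  for (const line of match[1].split('\n')) {
    const idx = line.indexOf(':');
    if (idx > 0) meta[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return { meta, body: md.slice(match[0].length) };
}
```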
Multi-layer knowledge graph
Relationships between knowledge notes are tracked in a multi-layer graph overlay (inspired by MAGMA):
| Layer | Captures | Example |
|-------|----------|---------|
| Semantic | Conceptual relationships | "Prisma depends-on PostgreSQL" |
| Temporal | When knowledge was learned | "Observation A supersedes B" |
| Causal | Cause-effect chains | "Upgrading TS 5.4 broke enum const exports" |
| Entity | Tool/component interactions | "Next.js uses React" |
Knowledge lifecycle
Trajectory → KnowledgeExtractor → Observations (confidence: 0.3)
│
reinforced by more trajectories
│
Observations (confidence: 0.5-0.8)
│
consolidated into entity notes
│
Entity Notes (living docs)
│
domain summaries regenerated
│
Domain Summaries
Knowledge evolves through:
- Reinforcement — same fact observed again increases confidence
- Contradiction detection — conflicting facts are flagged and resolved
- Consolidation — observations about the same entity merge into entity notes
- Decay — unvalidated knowledge gradually loses confidence
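The reinforcement and decay dynamics above can be sketched as simple update rules. The formulas and rates here are illustrative assumptions, not the library's actual math; they only show the direction of each update.

```typescript
// Sketch of the confidence dynamics described above (assumed formulas).
function reinforce(confidence: number, step = 0.1): number {
  // Observing the same fact again nudges confidence toward 1.
  return Math.min(1, confidence + step * (1 - confidence));
}

function decay(confidence: number, daysSinceValidated: number, rate = 0.01): number {
  // Unvalidated knowledge gradually loses confidence over time.
  return Math.max(0, confidence * Math.pow(1 - rate, daysSinceValidated));
}
```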
Knowledge surfacing
During atlas.solve(), relevant knowledge is surfaced alongside playbooks as an independent context section:
const result = await atlas.solve(task);
console.log(result.surfacedKnowledge?.length); // knowledge notes injected
Three-tier retrieval:
- Domain match — if the task domain matches a knowledge domain, include the domain summary
- Entity match — if the task mentions known entities, include their notes
- Semantic match — text similarity search for contextually relevant observations
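The three tiers above can be sketched as a surfacing function that fills a note budget in priority order. The `KnowledgeStore` interface and the substring entity matching are simplifications for illustration, not cognitive-core's actual API.

```typescript
// Sketch of three-tier knowledge surfacing (types are assumptions).
interface KnowledgeStore {
  domainSummary(domain: string): string | undefined;
  entityNotes(names: string[]): string[];
  searchObservations(query: string, limit: number): string[];
}

function surfaceKnowledge(
  store: KnowledgeStore,
  task: { domain: string; description: string },
  knownEntities: string[],
  maxNotes = 5,
): string[] {
  const notes: string[] = [];
  // 1. Domain match: include the domain summary when one exists.
  const summary = store.domainSummary(task.domain);
  if (summary) notes.push(summary);
  // 2. Entity match: include notes for entities the task mentions.
  const mentioned = knownEntities.filter((e) =>
    task.description.toLowerCase().includes(e.toLowerCase()));
  notes.push(...store.entityNotes(mentioned));
  // 3. Semantic match: fill the remaining budget by similarity search.
  notes.push(...store.searchObservations(task.description, maxNotes - notes.length));
  return notes.slice(0, maxNotes);
}
```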
minimem integration
When minimem is available, the knowledge bank can delegate search to minimem's hybrid search (vector + BM25) for higher-quality retrieval. Set minimemAware: true in config to enable. The two systems communicate via file conventions only — no cross-package imports.
See Design Doc for the full architecture.
CLI
cognitive-core ships a CLI for querying playbooks and storing trajectories without writing code.
# Initialize storage
cognitive-core init --dir .cognitive-core
# Store a trajectory from a JSON file
cognitive-core store ./trajectory.json
# Search for relevant playbooks
cognitive-core search "fix typescript import errors" --domain code
# Found 2 playbook(s) matching "fix typescript import errors":
#
# Match: 92% (trigger)
# ## typescript-import-resolution
# Confidence: 85%
# Strategy: Check tsconfig.json paths configuration
# Tactics:
# - Verify moduleResolution setting
# - Check baseUrl and paths mapping
# - Inspect file extensions (.js vs .ts)
# Get full playbook details
cognitive-core get playbook-abc123
# List domains with playbooks
cognitive-core domains
# View memory statistics
cognitive-core stats
# Experiences: 47
# Playbooks: 12
# Meta-observations: 31
All commands support --json for structured output:
cognitive-core search "debug async errors" --json | jq '.results[0].strategy'
Agent Backends
cognitive-core delegates execution to backends that handle spawning, message passing, and trajectory extraction.
Subprocess backend
Spawns agents as child processes. Works with any CLI agent.
import { createAtlasWithAgents, createSubprocessBackend, claudeCodeConfig } from 'cognitive-core';
const backend = createSubprocessBackend({
'claude-code': claudeCodeConfig, // Pre-configured for Claude Code CLI
});
const atlas = createAtlasWithAgents([backend], {
execution: {
defaultAgentType: 'claude-code',
maxExecutionTime: 300,
captureToolCalls: true,
},
});
ACP backend
Uses the Agent Communication Protocol for richer interaction with ACP-compatible agents.
import { createACPBackend, claudeCodeACPConfig } from 'cognitive-core';
const backend = createACPBackend({
'claude-code': claudeCodeACPConfig, // Uses npx claude-code-acp
});
Custom backends
Implement the AgentBackend interface:
import type { AgentBackend, AgentSpawnConfig, AgentSession } from 'cognitive-core';
class MyBackend implements AgentBackend {
readonly name = 'my-agent';
readonly supportedTypes = ['my-agent'];
async isAvailable(): Promise<boolean> { return true; }
async spawn(config: AgentSpawnConfig): Promise<AgentSession> {
// config.task - the task to solve
// config.systemPromptAdditions - playbook context to inject
// config.timeout - max execution time
const session = await launchMyAgent(config);
return session;
}
async getSession(id: string): Promise<AgentSession | undefined> { /* ... */ }
async terminate(id: string): Promise<void> { /* ... */ }
}
Observing agent execution
const manager = atlas.getAgentManager();
manager.addObserver({
onSessionStart: (session) => console.log('Started:', session.id),
onToolCall: (session, toolCall) => console.log('Tool:', toolCall.name),
onSessionEnd: (session, trajectory) => {
console.log('Done:', trajectory.outcome.success);
console.log('Steps:', trajectory.steps.length);
},
});
Skill Library
The skill library manages how playbooks are surfaced to agents across four tiers:
| Tier | When loaded | Criteria |
|------|-------------|----------|
| Core | Always in system prompt | 85%+ confidence, 5+ successes, 80%+ success rate |
| Domain | When task domain matches | Domain-tagged playbooks |
| Contextual | When task query matches | Semantic/trigger match |
| On-demand | Agent explicitly requests | Available via CLI |
Playbooks are automatically promoted and demoted based on usage outcomes. Three consecutive failures trigger demotion review.
const skills = await atlas.getSkillLibrary()?.getSkillsForAgent(task);
// skills.core - always-on playbooks
// skills.domain - relevant to task domain
// skills.contextual - matched to this specific task
Configuration
const atlas = createAtlasWithAgents([backend], {
storage: {
baseDir: '.cognitive-core',
persistenceEnabled: true,
},
learning: {
creditStrategy: 'simple', // 'simple' | 'contribution'
minTrajectories: 10, // batch learning threshold
deduplicationThreshold: 0.9, // prevent duplicate playbooks
},
router: {
similarityThreshold: 0.85, // match confidence threshold
useDomainRouting: true,
},
memory: {
maxExperiences: 4, // k for experience retrieval
maxContextTokens: 4000,
capacity: {
maxExperiences: 1000, // total stored experiences
maxPlaybooks: 200,
autoPrune: true,
preserveDomainCoverage: true,
},
},
knowledgeBank: {
enabled: true, // enable semantic memory (default: false)
memoryDir: 'memory',
extraction: {
enabled: true, // auto-extract from trajectories
useLlmExtraction: false, // use heuristic extraction
},
graph: {
enabledLayers: ['semantic'], // semantic | temporal | causal | entity
},
surfacing: {
maxNotesPerTask: 5,
maxTokensForKnowledge: 2000,
},
minimemAware: false, // set true if using minimem
},
execution: {
defaultAgentType: 'claude-code',
maxExecutionTime: 300,
captureToolCalls: true,
},
refinement: {
useAgentEvaluation: true,
maxIterations: 3,
acceptableScore: 0.7,
},
});
Core Types
Playbook
The central learning unit. Combines when to apply, what to do, and how to verify results.
interface Playbook {
id: string;
name: string;
applicability: {
situations: string[]; // "debugging async code"
triggers: string[]; // "Promise rejection", "TS2307"
antiPatterns: string[]; // when NOT to use
domains: string[]; // "typescript", "react", "testing"
};
guidance: {
strategy: string; // high-level approach
tactics: string[]; // mid-level steps
steps?: string[]; // concrete commands
codeExample?: string;
};
verification: {
successIndicators: string[]; // "Tests pass", "No errors"
failureIndicators: string[]; // "Same error persists"
rollbackStrategy?: string;
};
evolution: {
version: string;
successCount: number;
failureCount: number;
refinements: Refinement[]; // context-specific adaptations
failures: FailureRecord[];
};
confidence: number; // 0-1, grows with successful use
complexity: 'simple' | 'moderate' | 'complex';
}
Trajectory
A recorded problem-solving session in ReAct format.
interface Trajectory {
id: string;
task: Task;
steps: Step[]; // thought -> action -> observation
outcome: Outcome; // success/failure + solution
agentId: string;
llmCalls: number;
totalTokens: number;
wallTimeSeconds: number;
}
RoutingDecision
How the system decides to approach a task.
interface RoutingDecision {
strategy: 'direct' | 'adapt' | 'explore' | 'fallback';
confidence: number;
memoryContext: MemoryQueryResultV2;
estimatedBudget: number;
reasoning: string;
}
Research References
cognitive-core draws from several lines of research on agent memory and learning:
- ArcMemo - Concept-level memory outperforms instance-level at all compute scales
- ReMem - Experience retrieval with k=4, iterative refinement loops
- Stitch - Library learning through compression
- LILO - AutoDoc for concept naming
- Voyager - Ever-growing skill libraries with verification
- Claudeception - Lightweight skill persistence for code agents
- A-MEM - Zettelkasten-inspired structured notes with dynamic linking (knowledge note format)
- MAGMA - Multi-graph decomposition into semantic, temporal, causal, entity layers (graph overlay design)
- Zep - Entity extraction, community detection, temporal tracking
- Memory Survey - Comprehensive survey and taxonomy of factual/experiential/procedural memory
Limitations
- No cross-domain transfer. A playbook learned in the code domain won't surface for testing tasks even if the underlying pattern is the same. Domain tags are string-matched, not semantically compared.
- Cold start. The system needs ~10 trajectories before batch learning kicks in and playbooks start appearing. Until then, agents run without memory augmentation. Knowledge extraction works per-trajectory, so knowledge is available sooner.
- Text-based matching by default. BM25 works but misses semantic similarity. Vector search requires configuring an embedding provider and adds latency. Knowledge bank search uses text similarity unless minimem is available.
- No trajectory quality filtering. The system stores all trajectories, including ones from poorly-performing agents. Low-quality trajectories can produce low-quality playbooks and knowledge. The deduplication threshold and confidence model help, but don't solve the garbage-in problem.
- Single-machine storage. Persistence is JSON file-based (experiences, playbooks), Markdown file-based (knowledge), and SQLite (vector store). There's no built-in replication, multi-agent concurrency, or cloud storage.
- Extraction is heuristic by default. Without an LLM extractor configured, both playbook and knowledge extraction use text pattern matching, which produces lower-quality results. LLM-assisted extraction is available via workspace templates.
- Knowledge extraction is not auto-wired into the learning pipeline. Knowledge must be extracted explicitly via KnowledgeBank.extractFromTrajectory(). This is by design — it allows callers to control when extraction runs.
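One way callers can mitigate the garbage-in problem today is to screen trajectories before feeding them in. The heuristics and the minimal trajectory shape below are illustrative, not part of cognitive-core.

```typescript
// Minimal trajectory shape for this sketch; mirrors only the fields
// used below, not the full cognitive-core Trajectory type.
interface TrajectoryLike {
  outcome: { success: boolean };
  steps: unknown[];
}

// Hypothetical pre-filter: only feed in successful, reasonably-sized runs.
function worthLearningFrom(t: TrajectoryLike, maxSteps = 50): boolean {
  if (!t.outcome.success) return false;        // only learn from wins
  if (t.steps.length === 0) return false;      // empty sessions carry no signal
  if (t.steps.length > maxSteps) return false; // long, meandering runs are noisy
  return true;
}
```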
Contributing
Contributions welcome. The test suite uses Vitest:
npm install
npm run test:run # run tests once
npm run test # watch mode
npm run typecheck # type checking without emit
License
MIT
