cognitive-core (v0.2.5)

A TypeScript learning system for AI agents. Records how agents solve problems, extracts reusable playbooks and factual knowledge from trajectories, and injects relevant guidance into future tasks.

Motivation

cognitive-core gives agents persistent, structured memory:

  1. Trajectories record what the agent did (ReAct-style thought/action/observation steps)
  2. Playbooks distill trajectories into reusable guidance (strategy, tactics, verification criteria)
  3. Knowledge Bank extracts and organizes factual knowledge — what agents learn about tools, libraries, and patterns
  4. Routing matches incoming tasks to relevant playbooks before the agent starts working, using a learned MoE gating function
  5. Meta-learning tracks which playbooks helped and adjusts routing over time
  6. Three-speed learning — immediate per-trajectory updates (<200ms), energy-triggered batch extraction, and circadian-gated maintenance — all managed by a single UnifiedLearningPipeline
  7. Temporal compression — experiences flow through Hot/Warm/Cold/Evicted tiers based on access frequency, keeping memory bounded
  8. Unified persistence — all system state lives in a single SQLite database with WAL mode, including learned MoE routing weights, experience clusters, and playbook version history that survive restarts

The result: agents that get measurably better at recurring problem types without fine-tuning or prompt engineering.

Installation

npm install cognitive-core

Requires Node.js 22+.

Quick Start

Solve a task with memory-augmented agents

import { createAtlasWithAgents, createTask, createMockBackend } from 'cognitive-core';

const atlas = createAtlasWithAgents(
  [createMockBackend()],
  { storage: { baseDir: '.cognitive-core' } }
);

await atlas.init();

const result = await atlas.solve(createTask({
  domain: 'code',
  description: 'Fix the TypeScript compilation error in auth.ts',
}));

console.log(result.trajectory.outcome.success);  // true
console.log(result.routing?.strategy);            // 'direct' | 'adapt' | 'explore' | 'fallback'
console.log(result.injectedPlaybooks?.length);    // number of playbooks injected

await atlas.close();

Feed trajectories from external agents

cognitive-core can learn from trajectories produced by external agents. Record what happened elsewhere and feed it in.

import { createAtlas, createTrajectory, createTask, createStep, successOutcome } from 'cognitive-core';

const atlas = createAtlas({ storage: { baseDir: '.cognitive-core' } });
await atlas.init();

const trajectory = createTrajectory({
  task: createTask({
    domain: 'code',
    description: 'Fix the null pointer exception in user service',
  }),
  steps: [
    createStep({
      thought: 'Check where the null value originates',
      action: 'Read src/services/user.ts',
      observation: 'getUserById returns undefined when user not found',
    }),
    createStep({
      thought: 'Add a guard clause before accessing user properties',
      action: 'Edit src/services/user.ts to add null check',
      observation: 'Added: if (!user) return null',
    }),
  ],
  outcome: successOutcome('Fixed by adding null check in getUserById'),
  agentId: 'claude-code',
});

const result = await atlas.processTrajectory(trajectory);
// result.instantLoop.experienceId  — stored experience
// result.batchTriggered             — whether batch learning was auto-triggered
// result.maintenanceTriggered       — whether maintenance cycle ran

await atlas.close();

Query memory directly

const context = await atlas.queryMemory('typescript import resolution error', {
  domains: ['code'],
  includePlaybooks: true,
});

for (const { playbook, score } of context.playbooks) {
  console.log(`${playbook.name} (${Math.round(score * 100)}% match)`);
  console.log(`  Strategy: ${playbook.guidance.strategy}`);
  console.log(`  Tactics: ${playbook.guidance.tactics.join(', ')}`);
}

How It Works

Solve flow

Every call to atlas.solve(task) runs through this pipeline:

flowchart TD
    Task[Task arrives] --> Router[TaskRouter + MoEGate]

    Router -->|queries| Memory[MemorySystem]
    Memory --> Exp[ExperienceMemory]
    Memory --> PB[PlaybookLibrary]
    Memory --> KB[KnowledgeBank]
    Memory --> Meta[MetaMemory]
    Memory --> RB[ReasoningBank]
    Memory --> CS[CausalStore]

    Router --> Decision{RoutingDecision}

    Decision --> Skills[SkillLibrary + KnowledgeBank surfacing]
    Skills -->|injects playbooks + knowledge| Agent[AgentManager]
    Agent --> Backend[Backend]
    Backend -->|returns trajectory| Session[AgentSession]

    Session --> Check{Succeeded?}
    Check -->|Yes| PostExec[Post-execution]
    Check -->|No| Refine{Retry?}
    Refine -->|Yes, max 3x| Agent
    Refine -->|No| PostExec

    PostExec --> Usage[PlaybookUsageInference]
    PostExec --> MetaReflect[MetaLearner reflection]
    PostExec --> Effective[EffectivenessTracker annotate]

    PostExec --> Pipeline[UnifiedLearningPipeline]

    Pipeline --> Immediate[Speed 1: Immediate]
    Immediate --> Store[Store experience]
    Immediate --> Bump[Bump playbook confidence]
    Immediate --> QuickK[Extract knowledge notes]
    Immediate --> Causal[Extract causal edges]
    Immediate --> Reflexion[Generate reflexion episode]

    Pipeline --> Energy{EnergyEvaluator}
    Energy -->|threshold reached| Batch[Speed 2: Batch]
    Energy -->|below threshold| Done[Done]
    Batch --> Extract[PlaybookExtractor]
    Batch --> DeepK[Batch knowledge extraction]
    Batch --> Compress[TemporalCompressor]
    Batch --> Cluster[ReasoningBank re-cluster]
    Batch --> Prune[ExperienceMemory prune]

    Batch --> MaintCheck{Maintenance due?}
    MaintCheck -->|Yes| Maint[Speed 3: Maintenance]
    MaintCheck -->|No| Done
    Maint --> Heal[HealingOrchestrator]
    Maint --> Defrag[Knowledge defrag]
    Maint --> MetaStrat[Meta-strategy generation]
    Maint --> Done

    Done -.->|next task| Router

Post-execution subsystems

After every solve() call, three subsystems run synchronously on the trajectory before it enters the learning pipeline. These require the RoutingDecision and injected playbook context, so they only run via solve() — not via processTrajectory().

| Subsystem | What it does |
|-----------|--------------|
| PlaybookUsageInference | Infers which injected playbooks were actually used by the agent. Records outcomes to SkillLibrary for tier management. |
| MetaLearner | Generates a meta-reflection on routing/retrieval effectiveness. Stores observations and periodically generates meta-strategies. |
| EffectivenessTracker | Annotates the trajectory with which playbooks were surfaced vs. applied and which experiences were retrieved. |

All three use workspace templates (LLM-assisted) when an AgenticTaskRunner is set, with heuristic fallback otherwise.

Three-speed learning pipeline

All learning is managed by the UnifiedLearningPipeline, which consolidates InstantLoop, batch extraction, energy-based triggering, maintenance scheduling, and healing into a single orchestrator.

Speed 1: Immediate (<200ms, no LLM)

Fires on every trajectory. The InstantLoop performs lightweight, synchronous updates:

| Operation | Target |
|-----------|--------|
| Store experience | ExperienceMemory |
| Bump matched playbook confidence | PlaybookLibrary |
| Extract lightweight knowledge notes | KnowledgeBank |
| Extract causal edges | CausalStore |
| Generate reflexion episode | ReflexionMemory |

After the instant loop, the trajectory is analyzed (heuristic or workspace template) and accumulated for batch.

Speed 2: Batch (energy-triggered)

Triggered by the EnergyEvaluator — not just a count threshold. The evaluator computes an energy score from signals:

| Signal | Weight | Trigger condition |
|--------|--------|-------------------|
| Count threshold | — | pendingCount >= countThreshold (default: minTrajectories) |
| Contradiction detected | 0.4 | Coherence checker flagged a conflict |
| Novel domain | 0.3 | First trajectory in an unseen domain |
| High error rate | 0.2 | Recent failure rate exceeds 60% |
| Pattern shift | 0.1 | External signal (e.g., from caller) |

Batch triggers when pendingCount >= countThreshold or energy >= energyThreshold (0.7), subject to a debounce interval (minIntervalMs, default: 30s).
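The trigger logic can be sketched as follows — a minimal illustration using the weights and defaults above; the `EnergySignals` shape and function names are hypothetical, not the package's API:

```typescript
// Hypothetical sketch of the EnergyEvaluator's batch trigger.
interface EnergySignals {
  contradictionDetected: boolean; // weight 0.4
  novelDomain: boolean;           // weight 0.3
  recentFailureRate: number;      // contributes 0.2 when > 0.6
  patternShift: boolean;          // weight 0.1
}

function energyScore(s: EnergySignals): number {
  let energy = 0;
  if (s.contradictionDetected) energy += 0.4;
  if (s.novelDomain) energy += 0.3;
  if (s.recentFailureRate > 0.6) energy += 0.2;
  if (s.patternShift) energy += 0.1;
  return energy;
}

function shouldTriggerBatch(
  pendingCount: number,
  signals: EnergySignals,
  lastBatchAtMs: number,
  nowMs: number,
  countThreshold = 10,    // default: minTrajectories
  energyThreshold = 0.7,
  minIntervalMs = 30_000, // debounce
): boolean {
  if (nowMs - lastBatchAtMs < minIntervalMs) return false; // debounced
  return pendingCount >= countThreshold || energyScore(signals) >= energyThreshold;
}
```

Note that a contradiction plus a novel domain (0.4 + 0.3) already clears the 0.7 energy threshold on its own, regardless of the pending count.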

When batch runs:

  1. Playbook extraction — extract new playbooks from trajectory clusters, refine existing ones
  2. Batch knowledge extraction — richer than instant-loop extraction, covers all accumulated trajectories
  3. Temporal compression — promote/demote/evict experiences across Hot/Warm/Cold/Evicted tiers
  4. Re-clustering — ReasoningBank rebuilds K-means++ clusters from current experiences
  5. Pruning — ExperienceMemory prunes beyond maxExperiences
  6. Meta-learning — MetaLearner generates routing strategies

After batch completes, accumulated trajectories are cleared and the energy evaluator resets.

Speed 3: Maintenance (circadian-gated)

Runs after batch when the MaintenanceScheduler signals readiness. Three scheduling modes:

| Mode | When maintenance runs |
|------|----------------------|
| manual | Only via explicit runMaintenance({ force: true }) |
| afterNBatches | After every N batch cycles (e.g., batchInterval: 2) |
| periodic | After a time interval since last maintenance |

Built-in maintenance tasks:

| Task | What it does |
|------|--------------|
| healing-cycle | HealingOrchestrator runs anomaly detectors (PlaybookDriftDetector, MemoryBloatDetector) and applies repair strategies |
| knowledge-defrag | KnowledgeBank merges duplicate/overlapping notes |
| meta-strategy-generation | MetaLearner generates new routing strategies from accumulated observations |

Custom maintenance tasks and anomaly detectors can be registered via registerMaintenanceTask() and addAnomalyDetector().

solve() vs processTrajectory()

| | atlas.solve(task) | atlas.processTrajectory(trajectory) |
|---|---|---|
| Routing | Yes — TaskRouter + MoEGate | No |
| Agent execution | Yes — AgentManager + refinement | No (trajectory provided externally) |
| Post-execution | Yes — UsageInference, MetaLearner, EffectivenessTracker | No (requires routing context) |
| Three-speed pipeline | Yes | Yes |
| Returns | SolveResult (trajectory, routing, playbook usage) | ImmediateResult (instant loop, energy eval, batch/maintenance results) |

Use solve() when Atlas owns execution. Use processTrajectory() to feed in external trajectories — the full three-speed pipeline still runs, but post-execution subsystems that require routing context are skipped.

Playbook lifecycle

A playbook starts with low confidence (0.3) after extraction from a cluster of similar trajectories. Each time it's injected into an agent and the task succeeds, confidence grows. After enough successful uses (default: 5 successes, 80%+ success rate), it gets promoted to a core skill that's always in the system prompt. If it starts failing in specific contexts, the system records refinements rather than discarding the playbook entirely.

stateDiagram-v2
    [*] --> Extracted : pattern found across trajectories

    Extracted --> Contextual : confidence above 0.3
    Contextual --> Domain : tagged to domain

    Contextual --> Contextual : success
    Domain --> Domain : success

    Domain --> Core : 0.85+ confidence, 5+ successes, 80%+ rate
    Core --> Core : success

    Core --> Domain : 3 consecutive failures
    Domain --> Contextual : confidence drops

    Contextual --> Refined : failure in specific context
    Refined --> Contextual : refinement recorded
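The promotion and demotion rules in the diagram can be expressed as small predicates. This is an illustrative sketch of the criteria above, not the library's internals; `PlaybookStats` is a hypothetical shape:

```typescript
// Sketch of the tier-transition criteria described above.
interface PlaybookStats {
  confidence: number;          // 0-1, grows with successful use
  successCount: number;
  failureCount: number;
  consecutiveFailures: number;
}

// Domain -> Core: 0.85+ confidence, 5+ successes, 80%+ success rate.
function shouldPromoteToCore(p: PlaybookStats): boolean {
  const total = p.successCount + p.failureCount;
  const successRate = total === 0 ? 0 : p.successCount / total;
  return p.confidence >= 0.85 && p.successCount >= 5 && successRate >= 0.8;
}

// Core -> Domain: 3 consecutive failures trigger demotion.
function shouldDemoteFromCore(p: PlaybookStats): boolean {
  return p.consecutiveFailures >= 3;
}
```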

Memory architecture

Seven memory stores serve different retrieval patterns:

graph LR
    subgraph MemorySystem
        E[ExperienceMemory]
        P[PlaybookLibrary]
        K[KnowledgeBank]
        M[MetaMemory]
        RB[ReasoningBank]
        CS[CausalStore]
        RF[ReflexionMemory]
    end

    subgraph Search
        II[InvertedIndex]
        BM[BM25 Index]
        VS[sqlite-vec]
        TS[Text Similarity]
    end

    subgraph Providers
        OAI[OpenAI]
        VOY[Voyage]
        HF[HuggingFace local]
        MM[minimem optional]
    end

    E --> II
    E --> BM
    E --> VS
    P --> II
    P --> BM
    P --> VS
    K --> TS
    K -.-> MM
    CS --> II
    CS --> TS
    RB --> TS
    VS -.-> OAI
    VS -.-> VOY
    VS -.-> HF

    subgraph Storage
        SQL[SQLite — cognitive-core.db]
        MD[Markdown + YAML]
        SKILL[SQLite — skills.db]
    end

    E --> SQL
    P --> SQL
    M --> SQL
    CS --> SQL
    RF --> SQL
    RB -.->|system_state| SQL
    VS --> SQL
    K --> MD

| Store | Purpose | Format |
|-------|---------|--------|
| ExperienceMemory | Episodic — what happened | SQLite |
| PlaybookLibrary | Procedural — how to do things | SQLite (with version history) |
| KnowledgeBank | Semantic — facts, concepts, relationships | Markdown + YAML frontmatter |
| MetaMemory | Meta-learning — what worked | SQLite |
| ReasoningBank | Clustered experience retrieval (K-means++) | SQLite (system_state) |
| CausalStore | Cause-effect relationships | SQLite |
| ReflexionMemory | Structured self-reflections | SQLite |

Candidate retrieval

All memory stores use a shared two-phase retrieval pattern:

  1. InvertedIndex narrowing — tokenize the query, find candidate IDs via posting lists (fast, O(query tokens))
  2. Fallback to full scan — if the index returns fewer candidates than needed, fall back to the full store or domain partition

This is implemented once in memory/candidate-retrieval.ts and shared by ExperienceMemory, PlaybookLibrary, and CausalStore.
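The two-phase pattern can be sketched as follows; the types and names here are illustrative, not the actual memory/candidate-retrieval.ts exports:

```typescript
// Phase 1: narrow via inverted-index posting lists; Phase 2: fall back to a
// full scan when the index yields too few candidates.
type PostingLists = Map<string, Set<string>>; // token -> ids containing it

function tokenize(text: string): string[] {
  return text.toLowerCase().split(/\W+/).filter(Boolean);
}

function getCandidates(
  query: string,
  index: PostingLists,
  allIds: string[],
  minCandidates: number,
): string[] {
  // Union of posting lists for query tokens — O(query tokens)
  const hits = new Set<string>();
  for (const token of tokenize(query)) {
    for (const id of index.get(token) ?? []) hits.add(id);
  }
  // Narrowing too aggressive? Fall back to the full store.
  return hits.size >= minCandidates ? [...hits] : allIds;
}
```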

Search and matching

Memory search uses BM25 text matching by default. For higher-quality retrieval, plug in an embedding provider (OpenAI, Voyage, or local via @huggingface/transformers) and cognitive-core switches to hybrid BM25 + vector search backed by sqlite-vec.

Knowledge bank search uses text similarity by default, with optional delegation to minimem for hybrid vector + BM25 search when available.
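For reference, the default text matching is standard Okapi BM25, which can be sketched as follows (illustrative only; the package's index implementation is internal, and k1/b here are the usual textbook defaults):

```typescript
// Minimal Okapi BM25 scorer: sum of per-term idf * saturated tf, with
// document-length normalization.
function bm25Score(
  queryTokens: string[],
  docTokens: string[],
  docFreq: Map<string, number>, // token -> number of docs containing it
  totalDocs: number,
  avgDocLen: number,
  k1 = 1.2,
  b = 0.75,
): number {
  let score = 0;
  for (const term of queryTokens) {
    const tf = docTokens.filter((t) => t === term).length;
    if (tf === 0) continue;
    const df = docFreq.get(term) ?? 0;
    const idf = Math.log(1 + (totalDocs - df + 0.5) / (df + 0.5));
    const norm = 1 - b + (b * docTokens.length) / avgDocLen;
    score += (idf * tf * (k1 + 1)) / (tf + k1 * norm);
  }
  return score;
}
```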

MoE routing gate

The TaskRouter uses a Mixture-of-Experts inspired gating function to select routing strategies. Instead of fixed threshold rules, a learned linear gate computes softmax(W · features) over the four strategies (direct, adapt, explore, fallback). Weights are updated online from outcome feedback, with per-domain specialization. Cold start falls back to uniform weights (equivalent to the original threshold logic).
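The gating computation can be sketched as a linear layer followed by softmax. Feature extraction and the online update rule are omitted, and all names here are illustrative rather than the package's API:

```typescript
// softmax(W · features) over the four routing strategies.
const STRATEGIES = ["direct", "adapt", "explore", "fallback"] as const;

function softmax(logits: number[]): number[] {
  const max = Math.max(...logits); // subtract max for numerical stability
  const exps = logits.map((l) => Math.exp(l - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// W has one weight row per strategy; features might be e.g.
// [playbookSimilarity, domainHitRate, recentSuccessRate] (hypothetical).
function gate(W: number[][], features: number[]): number[] {
  const logits = W.map((row) =>
    row.reduce((acc, w, i) => acc + w * features[i], 0),
  );
  return softmax(logits);
}

// Cold start: uniform (zero) weights yield a uniform distribution, which is
// where the learned gate begins before outcome feedback arrives.
const coldW = STRATEGIES.map(() => [0, 0, 0]);
```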

Temporal compression

Experiences flow through four tiers based on their access frequency:

| Tier | Fidelity | Threshold |
|------|----------|-----------|
| Hot | Full data preserved | accessScore > 0.5 |
| Warm | Key steps only (solutionOutput stripped) | accessScore > 0.2 |
| Cold | Single-paragraph summary | accessScore > 0.05 |
| Evicted | Removed (pattern captured in ReasoningBank) | accessScore < 0.05 |

Hysteresis prevents oscillation: an experience must stay below a threshold for 2 consecutive compression passes before demotion.
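Tier assignment with hysteresis can be sketched as follows, using the thresholds from the table above; the TemporalCompressor's real bookkeeping may differ:

```typescript
// Demotion requires 2 consecutive passes below threshold; promotion is immediate.
type Tier = "hot" | "warm" | "cold" | "evicted";

function tierFor(accessScore: number): Tier {
  if (accessScore > 0.5) return "hot";
  if (accessScore > 0.2) return "warm";
  if (accessScore > 0.05) return "cold";
  return "evicted";
}

interface TierState {
  tier: Tier;
  belowCount: number; // consecutive passes spent below the current tier
}

function compressPass(state: TierState, accessScore: number): TierState {
  const target = tierFor(accessScore);
  const order: Tier[] = ["evicted", "cold", "warm", "hot"];
  if (order.indexOf(target) >= order.indexOf(state.tier)) {
    return { tier: target, belowCount: 0 }; // promotion or no change
  }
  const belowCount = state.belowCount + 1;
  return belowCount >= 2
    ? { tier: target, belowCount: 0 } // demote after two low passes
    : { tier: state.tier, belowCount }; // hysteresis: hold tier for now
}
```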

Coherence checking

Before inserting new knowledge notes, the CoherenceChecker detects contradictions using a simplified sheaf-inspired approach:

  1. Filter existing notes to the same entity/domain
  2. Compute text similarity between new and candidate notes
  3. Detect sentiment conflicts via negation heuristics (e.g., "always" vs "never", "use" vs "avoid")
  4. Flag contradictions where residualEnergy = similarity × sentimentConflict exceeds threshold
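A minimal sketch of this check, where the negation word list and the Jaccard-overlap stand-in for text similarity are illustrative:

```typescript
// residualEnergy = similarity * sentimentConflict; flag when above threshold.
const OPPOSITES: Array<[string, string]> = [
  ["always", "never"],
  ["use", "avoid"],
];

function sentimentConflict(a: string, b: string): number {
  const la = a.toLowerCase();
  const lb = b.toLowerCase();
  for (const [x, y] of OPPOSITES) {
    if ((la.includes(x) && lb.includes(y)) || (la.includes(y) && lb.includes(x))) {
      return 1;
    }
  }
  return 0;
}

// Jaccard token overlap as a stand-in for the checker's similarity measure.
function similarity(a: string, b: string): number {
  const ta = new Set(a.toLowerCase().split(/\W+/).filter(Boolean));
  const tb = new Set(b.toLowerCase().split(/\W+/).filter(Boolean));
  const inter = [...ta].filter((t) => tb.has(t)).length;
  const union = new Set([...ta, ...tb]).size;
  return union === 0 ? 0 : inter / union;
}

function isContradiction(a: string, b: string, threshold = 0.3): boolean {
  const residualEnergy = similarity(a, b) * sentimentConflict(a, b);
  return residualEnergy > threshold;
}
```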

Knowledge Bank

The knowledge bank is a semantic memory store for facts, concepts, and relationships that agents learn from experience. While playbooks capture how to do things, the knowledge bank captures what agents know — facts about tools, libraries, version-specific behavior, and causal relationships.

Knowledge note types

| Type | Purpose | Example |
|------|---------|---------|
| Observation | Atomic fact learned from experience | "Prisma requires db push before migrate dev when schema has drifted" |
| Entity | Living document about a tool/library/pattern | "Everything I know about Prisma" |
| Domain summary | High-level overview of a knowledge domain | "Database knowledge overview" |

How knowledge is stored

Knowledge notes are plain Markdown files with structured YAML frontmatter, organized on the filesystem:

knowledge/
├── observations/    # Atomic facts (k-*.md)
├── entities/        # Living entity docs (prisma.md, vitest.md)
└── domains/         # Domain summaries (database.md, testing.md)

Any agent with filesystem access can browse, grep, and read knowledge directly — no special API required.
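As an illustration, an observation note might look like this (the frontmatter fields shown are assumptions for illustration, not a documented schema):

```markdown
---
id: k-prisma-db-push
type: observation
entity: prisma
domain: database
confidence: 0.5
---

Prisma requires `db push` before `migrate dev` when the schema has drifted.
```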

Multi-layer knowledge graph

Relationships between knowledge notes are tracked in a multi-layer graph overlay (inspired by MAGMA):

| Layer | Captures | Example |
|-------|----------|---------|
| Semantic | Conceptual relationships | "Prisma depends-on PostgreSQL" |
| Temporal | When knowledge was learned | "Observation A supersedes B" |
| Causal | Cause-effect chains | "Upgrading TS 5.4 broke enum const exports" |
| Entity | Tool/component interactions | "Next.js uses React" |

Knowledge lifecycle

Trajectory → KnowledgeExtractor → Observations (confidence: 0.3)
                                        │
                              reinforced by more trajectories
                                        │
                                  Observations (confidence: 0.5-0.8)
                                        │
                              consolidated into entity notes
                                        │
                                  Entity Notes (living docs)
                                        │
                              domain summaries regenerated
                                        │
                                  Domain Summaries

Knowledge evolves through:

  • Reinforcement — same fact observed again increases confidence
  • Contradiction detection — conflicting facts are flagged by the CoherenceChecker and resolved
  • Consolidation — observations about the same entity merge into entity notes
  • Decay — unvalidated knowledge gradually loses confidence
  • Defragmentation — the KnowledgeDefragmenter merges duplicate/overlapping notes

Knowledge surfacing

During atlas.solve(), relevant knowledge is surfaced alongside playbooks as an independent context section:

const result = await atlas.solve(task);
console.log(result.surfacedKnowledge?.length);  // knowledge notes injected

Three-tier retrieval:

  1. Domain match — if the task domain matches a knowledge domain, include the domain summary
  2. Entity match — if the task mentions known entities, include their notes
  3. Semantic match — text similarity search for contextually relevant observations
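The three tiers can be sketched as an ordered lookup; the store shape and names here are illustrative, not the KnowledgeBank's actual interface:

```typescript
// Domain summary first, then entity notes, then semantic search fills the budget.
interface KnowledgeStore {
  domainSummaries: Map<string, string>; // domain -> note id
  entityNotes: Map<string, string>;     // entity name -> note id
  searchObservations(query: string, limit: number): string[];
}

function surfaceKnowledge(
  taskDomain: string,
  taskText: string,
  store: KnowledgeStore,
  maxNotes = 5, // cf. surfacing.maxNotesPerTask
): string[] {
  const notes: string[] = [];
  // 1. Domain match
  const summary = store.domainSummaries.get(taskDomain);
  if (summary) notes.push(summary);
  // 2. Entity match — mentioned entities pull in their living docs
  for (const [entity, noteId] of store.entityNotes) {
    if (taskText.toLowerCase().includes(entity)) notes.push(noteId);
  }
  // 3. Semantic match fills the remaining budget
  notes.push(...store.searchObservations(taskText, maxNotes - notes.length));
  return notes.slice(0, maxNotes);
}
```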

minimem integration

When minimem is available, the knowledge bank can delegate search to minimem's hybrid search (vector + BM25) for higher-quality retrieval. Set minimemAware: true in config to enable. The two systems communicate via file conventions only — no cross-package imports.

See Design Doc for the full architecture.

Learning Pipeline

Knowledge extraction

The KnowledgeExtractor runs four heuristic strategies per trajectory:

  1. Error patterns — extracts observations from error signatures and suggested fixes
  2. Config facts — detects reads/writes of config files (package.json, tsconfig, etc.) and captures versions
  3. Causal chains — identifies failure→recovery step sequences and creates causal links
  4. Entity identification — matches known entities in observations and produces entity updates

Reflexion generation

The ReflexionGenerator creates structured self-reflection episodes from completed trajectories. Each episode includes:

  • Outcome classification (success/partial/failure)
  • Self-critique (wasted effort, thrashing detection)
  • Key insights (tools used, file paths, error patterns)
  • Strategy assessment against matched playbooks
  • Suggested playbook updates

Shared utilities

Common analysis patterns are consolidated into shared utilities to avoid duplication:

| Utility | Used by |
|---------|---------|
| classifyError() / classifyErrorType() | TrajectoryAnalyzer, ReflexionGenerator |
| detectRepeatedActions() | TrajectoryAnalyzer, ReflexionGenerator, MetaLearner |
| extractToolNames() | ReflexionGenerator, MetaLearner |
| getCandidates() | ExperienceMemory, PlaybookLibrary, CausalStore |

CLI

cognitive-core ships a CLI for operating the learning system — ingesting trajectories, querying memory, and running maintenance. Available as both cogcore (short) and cognitive-core.

npm install -g cognitive-core
cogcore help

Daemon mode

The primary operational command. Polls SessionLog for new coding sessions and feeds them through the learning pipeline automatically.

# Continuous mode — watch for new sessions and process them
cogcore run --repo /path/to/repo --interval 30

# One-shot mode — process all unprocessed sessions and exit
cogcore run --once --repo /path/to/repo

The flow: SessionBank.discover() → filter unprocessed → SessionTrajectorySource.synthesize() → atlas.processTrajectory() → bank.markProcessed(). Supports graceful shutdown via SIGINT/SIGTERM.

Playbook commands

# Initialize storage
cogcore init

# Store a trajectory from a JSON file
cogcore store ./trajectory.json

# Search for relevant playbooks
cogcore search "fix typescript import errors" --domain code
# Found 2 playbook(s) matching "fix typescript import errors":
#
# Match: 92% (trigger)
# ## typescript-import-resolution
# Confidence: 85%
# Strategy: Check tsconfig.json paths configuration

# Get full playbook details
cogcore get playbook-abc123

# List domains with playbook counts
cogcore domains

# View system statistics
cogcore stats

Knowledge bank

Query and manage the semantic memory store (observations, entities, domain summaries).

cogcore kb search "prisma migration"
cogcore kb list                        # all notes
cogcore kb list --type entities        # entity notes only
cogcore kb list --type domains         # domain summaries only
cogcore kb get prisma                  # by ID, entity name, or domain name
cogcore kb stats                       # observation/entity/domain counts, graph stats
cogcore kb defrag                      # deduplicate, process inbox, run decay

Sessions

Inspect SessionLog checkpoint data from git orphan branches. SessionBank is standalone — it reads directly from the git repo without Atlas.

cogcore sessions list --repo .
cogcore sessions get 2026-02-15-aabbccdd-1122-3344-5566-778899aabbcc
cogcore sessions query --agent "Claude Code" --since 2026-01-01 --limit 10

Learning pipeline

Operator controls for the three-speed learning pipeline.

cogcore learn stats                    # processed count, pending, batch/maintenance cycles
cogcore learn batch                    # force batch learning on accumulated trajectories
cogcore learn maintenance              # force maintenance cycle (defrag, healing, meta-strategies)

Skill library

Inspect and manage the tiered skill system.

cogcore skills list                    # all skills with tier, confidence, success rate
cogcore skills list --tier core        # core skills only (always in system prompt)
cogcore skills list --tier domain      # domain skills only
cogcore skills stats                   # core/domain counts, last refresh
cogcore skills refresh                 # trigger tier re-evaluation (promotions/demotions)

Global options

All commands support these flags:

| Flag | Description |
|------|-------------|
| --dir <path> | Storage directory (default: .cognitive-core) |
| --json | Machine-readable JSON output |
| --limit <n> | Maximum results |
| --domain <d> | Filter by domain |
| --repo <path> | Git repository path (for sessions/run) |

# Pipe JSON output to jq
cogcore search "debug async errors" --json | jq '.[0].strategy'
cogcore kb stats --json | jq '.observationCount'

Agent Backends

cognitive-core delegates execution to backends that handle spawning, message passing, and trajectory extraction.

Subprocess backend

Spawns agents as child processes. Works with any CLI agent.

import { createAtlasWithAgents, createSubprocessBackend, claudeCodeConfig } from 'cognitive-core';

const backend = createSubprocessBackend({
  'claude-code': claudeCodeConfig,  // Pre-configured for Claude Code CLI
});

const atlas = createAtlasWithAgents([backend], {
  execution: {
    defaultAgentType: 'claude-code',
    maxExecutionTime: 300,
    captureToolCalls: true,
  },
});

ACP backend

Uses the Agent Communication Protocol for richer interaction with ACP-compatible agents.

import { createACPBackend, claudeCodeACPConfig } from 'cognitive-core';

const backend = createACPBackend({
  'claude-code': claudeCodeACPConfig,  // Uses npx claude-code-acp
});

Custom backends

Implement the AgentBackend interface:

import type { AgentBackend, AgentSpawnConfig, AgentSession } from 'cognitive-core';

class MyBackend implements AgentBackend {
  readonly name = 'my-agent';
  readonly supportedTypes = ['my-agent'];

  async isAvailable(): Promise<boolean> { return true; }

  async spawn(config: AgentSpawnConfig): Promise<AgentSession> {
    // config.task - the task to solve
    // config.systemPromptAdditions - playbook context to inject
    // config.timeout - max execution time
    const session = await launchMyAgent(config);
    return session;
  }

  async getSession(id: string): Promise<AgentSession | undefined> { /* ... */ }
  async terminate(id: string): Promise<void> { /* ... */ }
}

Observing agent execution

const manager = atlas.getAgentManager();

manager.addObserver({
  onSessionStart: (session) => console.log('Started:', session.id),
  onToolCall: (session, toolCall) => console.log('Tool:', toolCall.name),
  onSessionEnd: (session, trajectory) => {
    console.log('Done:', trajectory.outcome.success);
    console.log('Steps:', trajectory.steps.length);
  },
});

Skill Library

The skill library manages how playbooks are surfaced to agents across four tiers:

| Tier | When loaded | Criteria |
|------|-------------|----------|
| Core | Always in system prompt | 85%+ confidence, 5+ successes, 80%+ success rate |
| Domain | When task domain matches | Domain-tagged playbooks |
| Contextual | When task query matches | Semantic/trigger match |
| On-demand | Agent explicitly requests | Available via CLI |

Playbooks are automatically promoted and demoted based on usage outcomes. Three consecutive failures trigger demotion review.

const skills = await atlas.getSkillLibrary()?.getSkillsForAgent(task);
// skills.core       - always-on playbooks
// skills.domain     - relevant to task domain
// skills.contextual - matched to this specific task

skill-tree integration

When skillTree.enabled is true (default), playbooks are published to a skill-tree SQLite database via SqliteStorageAdapter, making them available to external consumers.

Team skill library

For multi-agent systems, the TeamSkillLibrary surfaces team-level and role-specific playbooks with ELO-based ranking from team outcome feedback.

Session Bank

The session bank provides integration with the Claude Code CLI, enabling cognitive-core to learn from real coding sessions.

import { SessionBank, EntireGitReader, EntireTranscriptParser } from 'cognitive-core';

Components:

  • EntireGitReader — reads session transcripts from Claude Code's git-based storage
  • EntireTranscriptParser — parses raw transcripts into structured SessionRecord objects
  • SessionBank — manages session records with querying by date range, domain, and outcome

The EntireTrajectorySource bridges session bank records into trajectories for the learning pipeline.

Workspace Templates

The workspace system provides LLM-assisted analysis via structured templates. Each template defines inputs, outputs, and a processing prompt. The AgenticTaskRunner executes templates against a workspace.

Available templates:

| Template | Purpose |
|----------|---------|
| trajectory-analysis | Deep trajectory analysis with LLM |
| playbook-extraction | Extract playbooks from trajectory clusters |
| knowledge-extraction | LLM-assisted knowledge extraction |
| usage-inference | Infer which playbooks were actually used |
| meta-reflection | Generate meta-learning observations |
| solution-evaluation | Evaluate solution quality |
| refinement-analysis | Analyze refinement opportunities |
| knowledge-defrag | Merge overlapping knowledge notes |
| team-playbook-extraction | Extract team-level playbooks |
| team-trajectory-analysis | Analyze multi-agent trajectories |

Templates can run in heuristic mode (no LLM, pattern matching only) or agentic mode (LLM-assisted, higher quality).

Configuration

```typescript
const atlas = createAtlasWithAgents([backend], {
  storage: {
    baseDir: '.cognitive-core',
    dbName: 'cognitive-core.db',   // SQLite database filename
    persistenceEnabled: true,
  },
  learning: {
    creditStrategy: 'simple',        // 'simple' | 'contribution'
    minTrajectories: 10,             // batch learning threshold
    deduplicationThreshold: 0.9,     // prevent duplicate playbooks
  },
  router: {
    similarityThreshold: 0.85,       // match confidence threshold
    useDomainRouting: true,
  },
  memory: {
    maxExperiences: 4,               // k for experience retrieval
    maxContextTokens: 4000,
    capacity: {
      maxExperiences: 1000,          // total stored experiences
      maxPlaybooks: 200,
      autoPrune: true,
      preserveDomainCoverage: true,
    },
  },
  knowledgeBank: {
    enabled: true,                   // enable semantic memory (default: false)
    memoryDir: 'memory',
    extraction: {
      enabled: true,                 // auto-extract from trajectories
      useLlmExtraction: false,       // use heuristic extraction
    },
    graph: {
      enabledLayers: ['semantic'],   // semantic | temporal | causal | entity
    },
    surfacing: {
      maxNotesPerTask: 5,
      maxTokensForKnowledge: 2000,
    },
    minimemAware: false,             // set true if using minimem
  },
  execution: {
    defaultAgentType: 'claude-code',
    maxExecutionTime: 300,
    captureToolCalls: true,
  },
  refinement: {
    useAgentEvaluation: true,
    maxIterations: 3,
    acceptableScore: 0.7,
  },
});
```

Core Types

Playbook

The central learning unit. Combines when to apply, what to do, and how to verify results.

```typescript
interface Playbook {
  id: string;
  name: string;
  applicability: {
    situations: string[];    // "debugging async code"
    triggers: string[];      // "Promise rejection", "TS2307"
    antiPatterns: string[];  // when NOT to use
    domains: string[];       // "typescript", "react", "testing"
  };
  guidance: {
    strategy: string;        // high-level approach
    tactics: string[];       // mid-level steps
    steps?: string[];        // concrete commands
    codeExample?: string;
  };
  verification: {
    successIndicators: string[];  // "Tests pass", "No errors"
    failureIndicators: string[];  // "Same error persists"
    rollbackStrategy?: string;
  };
  evolution: {
    version: string;
    successCount: number;
    failureCount: number;
    refinements: Refinement[];    // context-specific adaptations
    failures: FailureRecord[];
  };
  confidence: number;             // 0-1, grows with successful use
  complexity: 'simple' | 'moderate' | 'complex';
}
```
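The `confidence` field grows with successful use. One minimal way to derive it from the evolution counters — my own sketch, not necessarily the package's actual credit model — is a Laplace-smoothed success ratio:

```typescript
// Sketch only: a Laplace-smoothed success ratio as the confidence score.
// cognitive-core's real confidence model may differ.
interface EvolutionCounts {
  successCount: number;
  failureCount: number;
}

function playbookConfidence({ successCount, failureCount }: EvolutionCounts): number {
  // Add-one smoothing keeps a brand-new playbook near 0.5
  // instead of jumping to 0 or 1 after a single outcome.
  return (successCount + 1) / (successCount + failureCount + 2);
}

playbookConfidence({ successCount: 0, failureCount: 0 }); // → 0.5
playbookConfidence({ successCount: 9, failureCount: 1 }); // → ~0.83
```

The smoothing matters because an unproven playbook with one lucky success should not immediately outrank a playbook with a long track record.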

Trajectory

A recorded problem-solving session in ReAct format.

```typescript
interface Trajectory {
  id: string;
  task: Task;
  steps: Step[];              // thought -> action -> observation
  outcome: Outcome;           // success/failure + solution
  agentId: string;
  llmCalls: number;
  totalTokens: number;
  wallTimeSeconds: number;
}
```
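As a concrete illustration of the ReAct shape, here is what a short recorded session might look like. The `Step` and outcome fields below are assumptions for illustration, since the README does not spell out those types:

```typescript
// Illustrative only: the Step and outcome shapes are assumed, not taken
// from cognitive-core's actual type definitions.
interface Step {
  thought: string;
  action: string;
  observation: string;
}

const steps: Step[] = [
  {
    thought: 'The build fails with TS2307; a module path is probably wrong.',
    action: 'grep -rn "cannot find module" src/',
    observation: 'src/router.ts imports "./memry" (typo).',
  },
  {
    thought: 'Fix the import path and re-run the type checker.',
    action: 'npm run typecheck',
    observation: 'Type check passed with 0 errors.',
  },
];

// A success outcome would then record the solution alongside the steps.
const outcome = { success: true, solution: 'Corrected import path in src/router.ts' };
```

Each step pairs the agent's reasoning with the action it took and what it saw, which is what later makes playbook and knowledge extraction possible.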

RoutingDecision

How the system decides to approach a task.

```typescript
interface RoutingDecision {
  strategy: 'direct' | 'adapt' | 'explore' | 'fallback';
  confidence: number;
  memoryContext: MemoryQueryResultV2;
  estimatedBudget: number;
  reasoning: string;
}
```
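One plausible reading of the four strategies, sketched under assumed thresholds (only the 0.85 cutoff comes from the configuration's `router.similarityThreshold` default; the adapt floor and fallback condition are my assumptions, not the package's actual routing logic):

```typescript
type Strategy = 'direct' | 'adapt' | 'explore' | 'fallback';

// Sketch: pick a strategy from the best memory-match score. 0.85 mirrors the
// router.similarityThreshold default shown in the configuration section; the
// 0.5 adapt floor and the budget-based fallback are illustrative assumptions.
function chooseStrategy(bestMatchScore: number | null, budgetExhausted = false): Strategy {
  if (budgetExhausted) return 'fallback';
  if (bestMatchScore === null) return 'explore'; // no relevant memory at all
  if (bestMatchScore >= 0.85) return 'direct';   // reuse a playbook as-is
  if (bestMatchScore >= 0.5) return 'adapt';     // partial match: adapt guidance
  return 'explore';                              // weak match: treat as novel
}
```

Raising `similarityThreshold` therefore trades fewer-but-safer direct reuses for more adaptation and exploration.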

Research References

cognitive-core draws from several lines of research on agent memory and learning:

  • ArcMemo - Concept-level memory outperforms instance-level at all compute scales
  • ReMem - Experience retrieval with k=4, iterative refinement loops
  • Stitch - Library learning through compression
  • LILO - AutoDoc for concept naming
  • Voyager - Ever-growing skill libraries with verification
  • Claudeception - Lightweight skill persistence for code agents
  • A-MEM - Zettelkasten-inspired structured notes with dynamic linking (knowledge note format)
  • MAGMA - Multi-graph decomposition into semantic, temporal, causal, entity layers (graph overlay design)
  • Zep - Entity extraction, community detection, temporal tracking
  • Memory Survey - Comprehensive survey and taxonomy of factual/experiential/procedural memory
  • ruvector - Inspiration for infrastructure improvements: ReasoningBank (K-means++ clustering), MoE routing gate, CoherenceChecker (sheaf-inspired), TemporalCompressor (tiered memory), InstantLoop (hot-path learning)

Limitations

  • No cross-domain transfer. A playbook learned in the code domain won't surface for testing tasks even if the underlying pattern is the same. Domain tags are string-matched, not semantically compared.
  • Cold start. The system needs ~10 trajectories before batch learning kicks in and playbooks start appearing. The instant loop provides per-trajectory updates immediately, and knowledge extraction works per-trajectory, so some learning is available sooner.
  • Text-based matching by default. BM25 and inverted index candidate narrowing work but miss semantic similarity. Vector search requires configuring an embedding provider and adds latency. Knowledge bank search uses text similarity unless minimem is available.
  • No trajectory quality filtering. The system stores all trajectories, including ones from poorly-performing agents. Low-quality trajectories can produce low-quality playbooks and knowledge. The deduplication threshold, confidence model, and temporal compression help, but don't solve the garbage-in problem.
  • Single-machine storage. All system-internal data lives in a single SQLite database (cognitive-core.db with WAL mode), knowledge notes are Markdown files, and skills use a separate skills.db. There's no built-in replication, multi-agent concurrency, or cloud storage.
  • Extraction is heuristic by default. Without an LLM extractor configured, both playbook and knowledge extraction use text pattern matching, which produces lower-quality results. LLM-assisted extraction is available via workspace templates.
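The cross-domain limitation is easy to see in a sketch: string-matched domain tags find no overlap between related domains, where a semantic comparison would.

```typescript
// Sketch of the cross-domain gap: domain tags compared as strings, so
// 'code' and 'testing' never match even when the underlying pattern is shared.
function domainsOverlap(a: string[], b: string[]): boolean {
  const setB = new Set(b.map((d) => d.toLowerCase()));
  return a.some((d) => setB.has(d.toLowerCase()));
}

domainsOverlap(['code', 'typescript'], ['code']);    // → true
domainsOverlap(['code', 'typescript'], ['testing']); // → false, so a TypeScript
// debugging playbook never surfaces for a related testing task
```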

Contributing

Contributions welcome. The test suite uses Vitest:

```shell
npm install
npm run test:run    # run tests once
npm run test        # watch mode
npm run typecheck   # type checking without emit
```

License

MIT