cognitive-core
v0.2.5
TypeScript-native cognitive core for adaptive learning and abstraction
cognitive-core
A TypeScript learning system for AI agents. Records how agents solve problems, extracts reusable playbooks and factual knowledge from trajectories, and injects relevant guidance into future tasks.
Table of Contents
- Motivation
- Installation
- Quick Start
- How It Works
- Knowledge Bank
- Learning Pipeline
- Three-Speed Pipeline Detail
- CLI
- Agent Backends
- Skill Library
- Session Bank
- Workspace Templates
- Configuration
- Core Types
- Research References
- Limitations
- Contributing
- License
Motivation
cognitive-core gives agents persistent, structured memory:
- Trajectories record what the agent did (ReAct-style thought/action/observation steps)
- Playbooks distill trajectories into reusable guidance (strategy, tactics, verification criteria)
- Knowledge Bank extracts and organizes factual knowledge — what agents learn about tools, libraries, and patterns
- Routing matches incoming tasks to relevant playbooks before the agent starts working, using a learned MoE gating function
- Meta-learning tracks which playbooks helped and adjusts routing over time
- Three-speed learning — immediate per-trajectory updates (<200ms), energy-triggered batch extraction, and circadian-gated maintenance — all managed by a single UnifiedLearningPipeline
- Temporal compression — experiences flow through Hot/Warm/Cold/Evicted tiers based on access frequency, keeping memory bounded
- Unified persistence — all system state lives in a single SQLite database with WAL mode, including learned MoE routing weights, experience clusters, and playbook version history that survive restarts
The result: agents that get measurably better at recurring problem types without fine-tuning or prompt engineering.
Installation
npm install cognitive-core
Requires Node.js 22+.
Quick Start
Solve a task with memory-augmented agents
import { createAtlasWithAgents, createTask, createMockBackend } from 'cognitive-core';
const atlas = createAtlasWithAgents(
[createMockBackend()],
{ storage: { baseDir: '.cognitive-core' } }
);
await atlas.init();
const result = await atlas.solve(createTask({
domain: 'code',
description: 'Fix the TypeScript compilation error in auth.ts',
}));
console.log(result.trajectory.outcome.success); // true
console.log(result.routing?.strategy); // 'direct' | 'adapt' | 'explore' | 'fallback'
console.log(result.injectedPlaybooks?.length); // number of playbooks injected
await atlas.close();
Feed trajectories from external agents
cognitive-core can learn from trajectories produced by external agents. Record what happened elsewhere and feed it in.
import { createAtlas, createTrajectory, createTask, createStep, successOutcome } from 'cognitive-core';
const atlas = createAtlas({ storage: { baseDir: '.cognitive-core' } });
await atlas.init();
const trajectory = createTrajectory({
task: createTask({
domain: 'code',
description: 'Fix the null pointer exception in user service',
}),
steps: [
createStep({
thought: 'Check where the null value originates',
action: 'Read src/services/user.ts',
observation: 'getUserById returns undefined when user not found',
}),
createStep({
thought: 'Add a guard clause before accessing user properties',
action: 'Edit src/services/user.ts to add null check',
observation: 'Added: if (!user) return null',
}),
],
outcome: successOutcome('Fixed by adding null check in getUserById'),
agentId: 'claude-code',
});
const result = await atlas.processTrajectory(trajectory);
// result.instantLoop.experienceId — stored experience
// result.batchTriggered — whether batch learning was auto-triggered
// result.maintenanceTriggered — whether maintenance cycle ran
await atlas.close();
Query memory directly
const context = await atlas.queryMemory('typescript import resolution error', {
domains: ['code'],
includePlaybooks: true,
});
for (const { playbook, score } of context.playbooks) {
console.log(`${playbook.name} (${Math.round(score * 100)}% match)`);
console.log(` Strategy: ${playbook.guidance.strategy}`);
console.log(` Tactics: ${playbook.guidance.tactics.join(', ')}`);
}
How It Works
Solve flow
Every call to atlas.solve(task) runs through this pipeline:
flowchart TD
Task[Task arrives] --> Router[TaskRouter + MoEGate]
Router -->|queries| Memory[MemorySystem]
Memory --> Exp[ExperienceMemory]
Memory --> PB[PlaybookLibrary]
Memory --> KB[KnowledgeBank]
Memory --> Meta[MetaMemory]
Memory --> RB[ReasoningBank]
Memory --> CS[CausalStore]
Router --> Decision{RoutingDecision}
Decision --> Skills[SkillLibrary + KnowledgeBank surfacing]
Skills -->|injects playbooks + knowledge| Agent[AgentManager]
Agent --> Backend[Backend]
Backend -->|returns trajectory| Session[AgentSession]
Session --> Check{Succeeded?}
Check -->|Yes| PostExec[Post-execution]
Check -->|No| Refine{Retry?}
Refine -->|Yes, max 3x| Agent
Refine -->|No| PostExec
PostExec --> Usage[PlaybookUsageInference]
PostExec --> MetaReflect[MetaLearner reflection]
PostExec --> Effective[EffectivenessTracker annotate]
PostExec --> Pipeline[UnifiedLearningPipeline]
Pipeline --> Immediate[Speed 1: Immediate]
Immediate --> Store[Store experience]
Immediate --> Bump[Bump playbook confidence]
Immediate --> QuickK[Extract knowledge notes]
Immediate --> Causal[Extract causal edges]
Immediate --> Reflexion[Generate reflexion episode]
Pipeline --> Energy{EnergyEvaluator}
Energy -->|threshold reached| Batch[Speed 2: Batch]
Energy -->|below threshold| Done[Done]
Batch --> Extract[PlaybookExtractor]
Batch --> DeepK[Batch knowledge extraction]
Batch --> Compress[TemporalCompressor]
Batch --> Cluster[ReasoningBank re-cluster]
Batch --> Prune[ExperienceMemory prune]
Batch --> MaintCheck{Maintenance due?}
MaintCheck -->|Yes| Maint[Speed 3: Maintenance]
MaintCheck -->|No| Done
Maint --> Heal[HealingOrchestrator]
Maint --> Defrag[Knowledge defrag]
Maint --> MetaStrat[Meta-strategy generation]
Maint --> Done
Done -.->|next task| Router
Post-execution subsystems
After every solve() call, three subsystems run synchronously on the trajectory before it enters the learning pipeline. These require the RoutingDecision and injected playbook context, so they only run via solve() — not via processTrajectory().
| Subsystem | What it does |
|-----------|--------------|
| PlaybookUsageInference | Infers which injected playbooks were actually used by the agent. Records outcomes to SkillLibrary for tier management. |
| MetaLearner | Generates a meta-reflection on routing/retrieval effectiveness. Stores observations and periodically generates meta-strategies. |
| EffectivenessTracker | Annotates the trajectory with which playbooks were surfaced vs. applied and which experiences were retrieved. |
All three use workspace templates (LLM-assisted) when an AgenticTaskRunner is set, with heuristic fallback otherwise.
Three-speed learning pipeline
All learning is managed by the UnifiedLearningPipeline, which consolidates InstantLoop, batch extraction, energy-based triggering, maintenance scheduling, and healing into a single orchestrator.
Speed 1: Immediate (<200ms, no LLM)
Fires on every trajectory. The InstantLoop performs lightweight, synchronous updates:
| Operation | Target |
|-----------|--------|
| Store experience | ExperienceMemory |
| Bump matched playbook confidence | PlaybookLibrary |
| Extract lightweight knowledge notes | KnowledgeBank |
| Extract causal edges | CausalStore |
| Generate reflexion episode | ReflexionMemory |
After the instant loop, the trajectory is analyzed (heuristic or workspace template) and accumulated for batch.
Speed 2: Batch (energy-triggered)
Triggered by the EnergyEvaluator — not just a count threshold. The evaluator computes an energy score from signals:
| Signal | Weight | Trigger condition |
|--------|--------|-------------------|
| Count threshold | — | pendingCount >= countThreshold (default: minTrajectories) |
| Contradiction detected | 0.4 | Coherence checker flagged a conflict |
| Novel domain | 0.3 | First trajectory in an unseen domain |
| High error rate | 0.2 | Recent failure rate exceeds 60% |
| Pattern shift | 0.1 | External signal (e.g., from caller) |
Batch triggers when pendingCount >= countThreshold or energy >= energyThreshold (0.7), subject to a debounce interval (minIntervalMs, default: 30s).
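The trigger logic above can be sketched as follows. This is a conceptual illustration, not the library's actual EnergyEvaluator API: the signal names and weights come from the table, but the combination rule (a capped weighted sum) is an assumption.

```typescript
// Illustrative energy-based batch trigger (weights from the table above).
interface EnergySignals {
  contradictionDetected: boolean; // weight 0.4
  novelDomain: boolean;           // weight 0.3
  highErrorRate: boolean;         // weight 0.2
  patternShift: boolean;          // weight 0.1
}

// Assumption: signals combine as a weighted sum, capped at 1.
function computeEnergy(s: EnergySignals): number {
  const e =
    (s.contradictionDetected ? 0.4 : 0) +
    (s.novelDomain ? 0.3 : 0) +
    (s.highErrorRate ? 0.2 : 0) +
    (s.patternShift ? 0.1 : 0);
  return Math.min(e, 1);
}

function shouldTriggerBatch(
  pendingCount: number,
  countThreshold: number,
  energy: number,
  msSinceLastBatch: number,
  energyThreshold = 0.7,
  minIntervalMs = 30_000,
): boolean {
  if (msSinceLastBatch < minIntervalMs) return false; // debounce
  return pendingCount >= countThreshold || energy >= energyThreshold;
}
```

Either condition alone suffices: a backlog of trajectories, or a high-energy signal such as a detected contradiction plus a novel domain.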
When batch runs:
- Playbook extraction — extract new playbooks from trajectory clusters, refine existing ones
- Batch knowledge extraction — richer than instant-loop extraction, covers all accumulated trajectories
- Temporal compression — promote/demote/evict experiences across Hot/Warm/Cold/Evicted tiers
- Re-clustering — ReasoningBank rebuilds K-means++ clusters from current experiences
- Pruning — ExperienceMemory prunes beyond maxExperiences
- Meta-learning — MetaLearner generates routing strategies
After batch completes, accumulated trajectories are cleared and the energy evaluator resets.
Speed 3: Maintenance (circadian-gated)
Runs after batch when the MaintenanceScheduler signals readiness. Three scheduling modes:
| Mode | When maintenance runs |
|------|----------------------|
| manual | Only via explicit runMaintenance({ force: true }) |
| afterNBatches | After every N batch cycles (e.g., batchInterval: 2) |
| periodic | After a time interval since last maintenance |
Built-in maintenance tasks:
| Task | What it does |
|------|--------------|
| healing-cycle | HealingOrchestrator runs anomaly detectors (PlaybookDriftDetector, MemoryBloatDetector) and applies repair strategies |
| knowledge-defrag | KnowledgeBank merges duplicate/overlapping notes |
| meta-strategy-generation | MetaLearner generates new routing strategies from accumulated observations |
Custom maintenance tasks and anomaly detectors can be registered via registerMaintenanceTask() and addAnomalyDetector().
solve() vs processTrajectory()
| | atlas.solve(task) | atlas.processTrajectory(trajectory) |
|---|---|---|
| Routing | Yes — TaskRouter + MoEGate | No |
| Agent execution | Yes — AgentManager + refinement | No (trajectory provided externally) |
| Post-execution | Yes — UsageInference, MetaLearner, EffectivenessTracker | No (requires routing context) |
| Three-speed pipeline | Yes | Yes |
| Returns | SolveResult (trajectory, routing, playbook usage) | ImmediateResult (instant loop, energy eval, batch/maintenance results) |
Use solve() when Atlas owns execution. Use processTrajectory() to feed in external trajectories — the full three-speed pipeline still runs, but post-execution subsystems that require routing context are skipped.
Playbook lifecycle
A playbook starts with low confidence (0.3) after extraction from a cluster of similar trajectories. Each time it's injected into an agent and the task succeeds, confidence grows. After enough successful uses (default: 5 successes, 80%+ success rate), it gets promoted to a core skill that's always in the system prompt. If it starts failing in specific contexts, the system records refinements rather than discarding the playbook entirely.
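Using the thresholds stated above (0.85+ confidence, 5+ successes, 80%+ success rate), the core-promotion check amounts to something like the following sketch. The PlaybookStats shape is illustrative, not the package's internal type:

```typescript
// Hypothetical stats shape for illustration only.
interface PlaybookStats {
  confidence: number;    // 0-1, grows with successful use
  successCount: number;
  failureCount: number;
}

// Promotion criteria as stated in the text: 0.85+ confidence,
// 5+ successes, 80%+ success rate.
function eligibleForCore(p: PlaybookStats): boolean {
  const uses = p.successCount + p.failureCount;
  const successRate = uses === 0 ? 0 : p.successCount / uses;
  return p.confidence >= 0.85 && p.successCount >= 5 && successRate >= 0.8;
}
```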
stateDiagram-v2
[*] --> Extracted : pattern found across trajectories
Extracted --> Contextual : confidence above 0.3
Contextual --> Domain : tagged to domain
Contextual --> Contextual : success
Domain --> Domain : success
Domain --> Core : 0.85+ confidence, 5+ successes, 80%+ rate
Core --> Core : success
Core --> Domain : 3 consecutive failures
Domain --> Contextual : confidence drops
Contextual --> Refined : failure in specific context
Refined --> Contextual : refinement recorded
Memory architecture
Seven memory stores serve different retrieval patterns:
graph LR
subgraph MemorySystem
E[ExperienceMemory]
P[PlaybookLibrary]
K[KnowledgeBank]
M[MetaMemory]
RB[ReasoningBank]
CS[CausalStore]
RF[ReflexionMemory]
end
subgraph Search
II[InvertedIndex]
BM[BM25 Index]
VS[sqlite-vec]
TS[Text Similarity]
end
subgraph Providers
OAI[OpenAI]
VOY[Voyage]
HF[HuggingFace local]
MM[minimem optional]
end
E --> II
E --> BM
E --> VS
P --> II
P --> BM
P --> VS
K --> TS
K -.-> MM
CS --> II
CS --> TS
RB --> TS
VS -.-> OAI
VS -.-> VOY
VS -.-> HF
subgraph Storage
SQL[SQLite — cognitive-core.db]
MD[Markdown + YAML]
SKILL[SQLite — skills.db]
end
E --> SQL
P --> SQL
M --> SQL
CS --> SQL
RF --> SQL
RB -.->|system_state| SQL
VS --> SQL
K --> MD
| Store | Purpose | Format |
|-------|---------|--------|
| ExperienceMemory | Episodic — what happened | SQLite |
| PlaybookLibrary | Procedural — how to do things | SQLite (with version history) |
| KnowledgeBank | Semantic — facts, concepts, relationships | Markdown + YAML frontmatter |
| MetaMemory | Meta-learning — what worked | SQLite |
| ReasoningBank | Clustered experience retrieval (K-means++) | SQLite (system_state) |
| CausalStore | Cause-effect relationships | SQLite |
| ReflexionMemory | Structured self-reflections | SQLite |
Candidate retrieval
All memory stores use a shared two-phase retrieval pattern:
- InvertedIndex narrowing — tokenize the query, find candidate IDs via posting lists (fast, O(query tokens))
- Fallback to full scan — if the index returns fewer candidates than needed, fall back to the full store or domain partition
This is implemented once in memory/candidate-retrieval.ts and shared by ExperienceMemory, PlaybookLibrary, and CausalStore.
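The two-phase pattern can be sketched like this. The Map-based posting lists and the tokenizer are stand-ins for illustration, not the actual implementation in memory/candidate-retrieval.ts:

```typescript
type Id = string;

// Simplistic tokenizer stand-in.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/\W+/).filter(Boolean);
}

// Phase 1: narrow via inverted-index posting lists.
// Phase 2: fall back to the full store if narrowing was too aggressive.
function getCandidates(
  query: string,
  index: Map<string, Set<Id>>, // token -> posting list
  allIds: Id[],                // full store (fallback)
  minCandidates: number,
): Id[] {
  const hits = new Set<Id>();
  for (const token of tokenize(query)) {
    for (const id of index.get(token) ?? []) hits.add(id);
  }
  return hits.size >= minCandidates ? [...hits] : allIds;
}
```

The fallback preserves recall: an unusual query that misses the index still gets scored against the whole store (or domain partition) rather than returning nothing.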
Search and matching
Memory search uses BM25 text matching by default. For higher-quality retrieval, plug in an embedding provider (OpenAI, Voyage, or local via @huggingface/transformers) and cognitive-core switches to hybrid BM25 + vector search backed by sqlite-vec.
Knowledge bank search uses text similarity by default, with optional delegation to minimem for hybrid vector + BM25 search when available.
MoE routing gate
The TaskRouter uses a Mixture-of-Experts inspired gating function to select routing strategies. Instead of fixed threshold rules, a learned linear gate computes softmax(W · features) over the four strategies (direct, adapt, explore, fallback). Weights are updated online from outcome feedback, with per-domain specialization. Cold start falls back to uniform weights (equivalent to the original threshold logic).
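A minimal sketch of the gating computation follows. Only the softmax(W · features) selection over the four strategies matches the description above; the feature layout and weight shapes are assumptions, and the online update rule is omitted:

```typescript
const STRATEGIES = ['direct', 'adapt', 'explore', 'fallback'] as const;
type Strategy = (typeof STRATEGIES)[number];

function softmax(logits: number[]): number[] {
  const max = Math.max(...logits); // subtract max for numerical stability
  const exps = logits.map((l) => Math.exp(l - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// One weight row per strategy; features might encode match similarity,
// domain familiarity, etc. (illustrative assumption).
function gate(
  W: number[][],
  features: number[],
): { strategy: Strategy; probs: number[] } {
  const logits = W.map((row) =>
    row.reduce((acc, w, i) => acc + w * features[i], 0),
  );
  const probs = softmax(logits);
  const best = probs.indexOf(Math.max(...probs));
  return { strategy: STRATEGIES[best], probs };
}
```

With uniform (cold-start) weights every strategy gets equal probability, which is where the original threshold logic takes over.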
Temporal compression
Experiences flow through four tiers based on their access frequency:
| Tier | Fidelity | Threshold |
|------|----------|-----------|
| Hot | Full data preserved | accessScore > 0.5 |
| Warm | Key steps only (solutionOutput stripped) | accessScore > 0.2 |
| Cold | Single-paragraph summary | accessScore > 0.05 |
| Evicted | Removed (pattern captured in ReasoningBank) | accessScore < 0.05 |
Hysteresis prevents oscillation: an experience must stay below a threshold for 2 consecutive compression passes before demotion.
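The tier thresholds and the hysteresis rule can be sketched together. The thresholds come from the table above and the 2-pass rule from the text; the data shapes and the assumption that promotions apply immediately are illustrative:

```typescript
type Tier = 'hot' | 'warm' | 'cold' | 'evicted';

// Thresholds from the tier table above.
function tierFor(accessScore: number): Tier {
  if (accessScore > 0.5) return 'hot';
  if (accessScore > 0.2) return 'warm';
  if (accessScore > 0.05) return 'cold';
  return 'evicted';
}

const TIER_ORDER: Tier[] = ['hot', 'warm', 'cold', 'evicted'];

// Demote only after 2 consecutive compression passes below the threshold;
// promote immediately (assumption).
function nextTier(current: Tier, accessScore: number, passesBelow: number): Tier {
  const target = tierFor(accessScore);
  const isDemotion = TIER_ORDER.indexOf(target) > TIER_ORDER.indexOf(current);
  if (isDemotion && passesBelow < 2) return current;
  return target;
}
```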
Coherence checking
Before inserting new knowledge notes, the CoherenceChecker detects contradictions using a simplified sheaf-inspired approach:
- Filter existing notes to the same entity/domain
- Compute text similarity between new and candidate notes
- Detect sentiment conflicts via negation heuristics (e.g., "always" vs "never", "use" vs "avoid")
- Flag contradictions where residualEnergy = similarity × sentimentConflict exceeds a threshold
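The steps above can be sketched as follows. The negation-pair list, the binary conflict score, and the 0.5 threshold are illustrative assumptions; only the residualEnergy = similarity × sentimentConflict rule comes from the text:

```typescript
// Illustrative negation heuristics (pairs from the examples above).
const NEGATION_PAIRS: [string, string][] = [
  ['always', 'never'],
  ['use', 'avoid'],
];

// Assumption: conflict is binary — 1 if any negation pair spans the two notes.
function sentimentConflict(a: string, b: string): number {
  const la = a.toLowerCase();
  const lb = b.toLowerCase();
  for (const [pos, neg] of NEGATION_PAIRS) {
    if ((la.includes(pos) && lb.includes(neg)) || (la.includes(neg) && lb.includes(pos))) {
      return 1;
    }
  }
  return 0;
}

// Threshold value is an assumption for illustration.
function isContradiction(
  similarity: number,
  noteA: string,
  noteB: string,
  threshold = 0.5,
): boolean {
  const residualEnergy = similarity * sentimentConflict(noteA, noteB);
  return residualEnergy > threshold;
}
```

Multiplying by similarity means a negation pair across unrelated notes ("never deploy on Fridays" vs. "always pin versions") produces low residual energy and is not flagged.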
Knowledge Bank
The knowledge bank is a semantic memory store for facts, concepts, and relationships that agents learn from experience. While playbooks capture how to do things, the knowledge bank captures what agents know — facts about tools, libraries, version-specific behavior, and causal relationships.
Knowledge note types
| Type | Purpose | Example |
|------|---------|---------|
| Observation | Atomic fact learned from experience | "Prisma requires db push before migrate dev when schema has drifted" |
| Entity | Living document about a tool/library/pattern | "Everything I know about Prisma" |
| Domain summary | High-level overview of a knowledge domain | "Database knowledge overview" |
How knowledge is stored
Knowledge notes are plain Markdown files with structured YAML frontmatter, organized on the filesystem:
knowledge/
├── observations/ # Atomic facts (k-*.md)
├── entities/ # Living entity docs (prisma.md, vitest.md)
└── domains/ # Domain summaries (database.md, testing.md)
Any agent with filesystem access can browse, grep, and read knowledge directly — no special API required.
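For illustration, an observation note might look like the following. The specific frontmatter fields shown here are assumptions, not the package's documented schema; the fact itself is the Prisma example from the table above:

```markdown
---
id: k-prisma-db-push
type: observation
entity: prisma
domain: database
confidence: 0.3
---

Prisma requires `db push` before `migrate dev` when the schema has drifted.
```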
Multi-layer knowledge graph
Relationships between knowledge notes are tracked in a multi-layer graph overlay (inspired by MAGMA):
| Layer | Captures | Example |
|-------|----------|---------|
| Semantic | Conceptual relationships | "Prisma depends-on PostgreSQL" |
| Temporal | When knowledge was learned | "Observation A supersedes B" |
| Causal | Cause-effect chains | "Upgrading TS 5.4 broke enum const exports" |
| Entity | Tool/component interactions | "Next.js uses React" |
Knowledge lifecycle
Trajectory → KnowledgeExtractor → Observations (confidence: 0.3)
│
reinforced by more trajectories
│
Observations (confidence: 0.5-0.8)
│
consolidated into entity notes
│
Entity Notes (living docs)
│
domain summaries regenerated
│
Domain Summaries
Knowledge evolves through:
- Reinforcement — same fact observed again increases confidence
- Contradiction detection — conflicting facts are flagged by the CoherenceChecker and resolved
- Consolidation — observations about the same entity merge into entity notes
- Decay — unvalidated knowledge gradually loses confidence
- Defragmentation — the KnowledgeDefragmenter merges duplicate/overlapping notes
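The decay step above can be illustrated with a toy model. The exponential form and the half-life value are assumptions; only the behavior — unvalidated knowledge gradually loses confidence — comes from the text:

```typescript
// Toy confidence decay: halve confidence every `halfLifeDays` without
// revalidation (the 90-day half-life is an illustrative assumption).
function decayedConfidence(
  confidence: number,
  daysSinceValidation: number,
  halfLifeDays = 90,
): number {
  return confidence * Math.pow(0.5, daysSinceValidation / halfLifeDays);
}
```

Reinforcement works against this: each fresh observation of the same fact resets the clock and bumps confidence back up.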
Knowledge surfacing
During atlas.solve(), relevant knowledge is surfaced alongside playbooks as an independent context section:
const result = await atlas.solve(task);
console.log(result.surfacedKnowledge?.length); // knowledge notes injected
Three-tier retrieval:
- Domain match — if the task domain matches a knowledge domain, include the domain summary
- Entity match — if the task mentions known entities, include their notes
- Semantic match — text similarity search for contextually relevant observations
minimem integration
When minimem is available, the knowledge bank can delegate search to minimem's hybrid search (vector + BM25) for higher-quality retrieval. Set minimemAware: true in config to enable. The two systems communicate via file conventions only — no cross-package imports.
See Design Doc for the full architecture.
Learning Pipeline
Knowledge extraction
The KnowledgeExtractor runs four heuristic strategies per trajectory:
- Error patterns — extracts observations from error signatures and suggested fixes
- Config facts — detects reads/writes of config files (package.json, tsconfig, etc.) and captures versions
- Causal chains — identifies failure→recovery step sequences and creates causal links
- Entity identification — matches known entities in observations and produces entity updates
Reflexion generation
The ReflexionGenerator creates structured self-reflection episodes from completed trajectories. Each episode includes:
- Outcome classification (success/partial/failure)
- Self-critique (wasted effort, thrashing detection)
- Key insights (tools used, file paths, error patterns)
- Strategy assessment against matched playbooks
- Suggested playbook updates
Shared utilities
Common analysis patterns are consolidated into shared utilities to avoid duplication:
| Utility | Used by |
|---------|---------|
| classifyError() / classifyErrorType() | TrajectoryAnalyzer, ReflexionGenerator |
| detectRepeatedActions() | TrajectoryAnalyzer, ReflexionGenerator, MetaLearner |
| extractToolNames() | ReflexionGenerator, MetaLearner |
| getCandidates() | ExperienceMemory, PlaybookLibrary, CausalStore |
CLI
cognitive-core ships a CLI for operating the learning system — ingesting trajectories, querying memory, and running maintenance. Available as both cogcore (short) and cognitive-core.
npm install -g cognitive-core
cogcore help
Daemon mode
The primary operational command. Polls SessionLog for new coding sessions and feeds them through the learning pipeline automatically.
# Continuous mode — watch for new sessions and process them
cogcore run --repo /path/to/repo --interval 30
# One-shot mode — process all unprocessed sessions and exit
cogcore run --once --repo /path/to/repo
The flow: SessionBank.discover() → filter unprocessed → SessionTrajectorySource.synthesize() → atlas.processTrajectory() → bank.markProcessed(). Supports graceful shutdown via SIGINT/SIGTERM.
Playbook commands
# Initialize storage
cogcore init
# Store a trajectory from a JSON file
cogcore store ./trajectory.json
# Search for relevant playbooks
cogcore search "fix typescript import errors" --domain code
# Found 2 playbook(s) matching "fix typescript import errors":
#
# Match: 92% (trigger)
# ## typescript-import-resolution
# Confidence: 85%
# Strategy: Check tsconfig.json paths configuration
# Get full playbook details
cogcore get playbook-abc123
# List domains with playbook counts
cogcore domains
# View system statistics
cogcore stats
Knowledge bank
Query and manage the semantic memory store (observations, entities, domain summaries).
cogcore kb search "prisma migration"
cogcore kb list # all notes
cogcore kb list --type entities # entity notes only
cogcore kb list --type domains # domain summaries only
cogcore kb get prisma # by ID, entity name, or domain name
cogcore kb stats # observation/entity/domain counts, graph stats
cogcore kb defrag # deduplicate, process inbox, run decay
Sessions
Inspect SessionLog checkpoint data from git orphan branches. SessionBank is standalone — it reads directly from the git repo without Atlas.
cogcore sessions list --repo .
cogcore sessions get 2026-02-15-aabbccdd-1122-3344-5566-778899aabbcc
cogcore sessions query --agent "Claude Code" --since 2026-01-01 --limit 10
Learning pipeline
Operator controls for the three-speed learning pipeline.
cogcore learn stats # processed count, pending, batch/maintenance cycles
cogcore learn batch # force batch learning on accumulated trajectories
cogcore learn maintenance # force maintenance cycle (defrag, healing, meta-strategies)
Skill library
Inspect and manage the tiered skill system.
cogcore skills list # all skills with tier, confidence, success rate
cogcore skills list --tier core # core skills only (always in system prompt)
cogcore skills list --tier domain # domain skills only
cogcore skills stats # core/domain counts, last refresh
cogcore skills refresh # trigger tier re-evaluation (promotions/demotions)
Global options
All commands support these flags:
| Flag | Description |
|------|-------------|
| --dir <path> | Storage directory (default: .cognitive-core) |
| --json | Machine-readable JSON output |
| --limit <n> | Maximum results |
| --domain <d> | Filter by domain |
| --repo <path> | Git repository path (for sessions/run) |
# Pipe JSON output to jq
cogcore search "debug async errors" --json | jq '.[0].strategy'
cogcore kb stats --json | jq '.observationCount'
Agent Backends
cognitive-core delegates execution to backends that handle spawning, message passing, and trajectory extraction.
Subprocess backend
Spawns agents as child processes. Works with any CLI agent.
import { createAtlasWithAgents, createSubprocessBackend, claudeCodeConfig } from 'cognitive-core';
const backend = createSubprocessBackend({
'claude-code': claudeCodeConfig, // Pre-configured for Claude Code CLI
});
const atlas = createAtlasWithAgents([backend], {
execution: {
defaultAgentType: 'claude-code',
maxExecutionTime: 300,
captureToolCalls: true,
},
});
ACP backend
Uses the Agent Communication Protocol for richer interaction with ACP-compatible agents.
import { createACPBackend, claudeCodeACPConfig } from 'cognitive-core';
const backend = createACPBackend({
'claude-code': claudeCodeACPConfig, // Uses npx claude-code-acp
});
Custom backends
Implement the AgentBackend interface:
import type { AgentBackend, AgentSpawnConfig, AgentSession } from 'cognitive-core';
class MyBackend implements AgentBackend {
readonly name = 'my-agent';
readonly supportedTypes = ['my-agent'];
async isAvailable(): Promise<boolean> { return true; }
async spawn(config: AgentSpawnConfig): Promise<AgentSession> {
// config.task - the task to solve
// config.systemPromptAdditions - playbook context to inject
// config.timeout - max execution time
const session = await launchMyAgent(config);
return session;
}
async getSession(id: string): Promise<AgentSession | undefined> { /* ... */ }
async terminate(id: string): Promise<void> { /* ... */ }
}
Observing agent execution
const manager = atlas.getAgentManager();
manager.addObserver({
onSessionStart: (session) => console.log('Started:', session.id),
onToolCall: (session, toolCall) => console.log('Tool:', toolCall.name),
onSessionEnd: (session, trajectory) => {
console.log('Done:', trajectory.outcome.success);
console.log('Steps:', trajectory.steps.length);
},
});
Skill Library
The skill library manages how playbooks are surfaced to agents across four tiers:
| Tier | When loaded | Criteria |
|------|-------------|----------|
| Core | Always in system prompt | 85%+ confidence, 5+ successes, 80%+ success rate |
| Domain | When task domain matches | Domain-tagged playbooks |
| Contextual | When task query matches | Semantic/trigger match |
| On-demand | Agent explicitly requests | Available via CLI |
Playbooks are automatically promoted and demoted based on usage outcomes. Three consecutive failures trigger demotion review.
const skills = await atlas.getSkillLibrary()?.getSkillsForAgent(task);
// skills.core - always-on playbooks
// skills.domain - relevant to task domain
// skills.contextual - matched to this specific task
skill-tree integration
When skillTree.enabled is true (default), playbooks are published to a skill-tree SQLite database via SqliteStorageAdapter, making them available to external consumers.
Team skill library
For multi-agent systems, the TeamSkillLibrary surfaces team-level and role-specific playbooks with ELO-based ranking from team outcome feedback.
Session Bank
The session bank provides integration with the Claude Code CLI, enabling cognitive-core to learn from real coding sessions.
import { SessionBank, EntireGitReader, EntireTranscriptParser } from 'cognitive-core';
Components:
- EntireGitReader — reads session transcripts from Claude Code's git-based storage
- EntireTranscriptParser — parses raw transcripts into structured SessionRecord objects
- SessionBank — manages session records with querying by date range, domain, and outcome
The EntireTrajectorySource bridges session bank records into trajectories for the learning pipeline.
Workspace Templates
The workspace system provides LLM-assisted analysis via structured templates. Each template defines inputs, outputs, and a processing prompt. The AgenticTaskRunner executes templates against a workspace.
Available templates:
| Template | Purpose |
|----------|---------|
| trajectory-analysis | Deep trajectory analysis with LLM |
| playbook-extraction | Extract playbooks from trajectory clusters |
| knowledge-extraction | LLM-assisted knowledge extraction |
| usage-inference | Infer which playbooks were actually used |
| meta-reflection | Generate meta-learning observations |
| solution-evaluation | Evaluate solution quality |
| refinement-analysis | Analyze refinement opportunities |
| knowledge-defrag | Merge overlapping knowledge notes |
| team-playbook-extraction | Extract team-level playbooks |
| team-trajectory-analysis | Analyze multi-agent trajectories |
Templates can run in heuristic mode (no LLM, pattern matching only) or agentic mode (LLM-assisted, higher quality).
Configuration
const atlas = createAtlasWithAgents([backend], {
storage: {
baseDir: '.cognitive-core',
dbName: 'cognitive-core.db', // SQLite database filename
persistenceEnabled: true,
},
learning: {
creditStrategy: 'simple', // 'simple' | 'contribution'
minTrajectories: 10, // batch learning threshold
deduplicationThreshold: 0.9, // prevent duplicate playbooks
},
router: {
similarityThreshold: 0.85, // match confidence threshold
useDomainRouting: true,
},
memory: {
maxExperiences: 4, // k for experience retrieval
maxContextTokens: 4000,
capacity: {
maxExperiences: 1000, // total stored experiences
maxPlaybooks: 200,
autoPrune: true,
preserveDomainCoverage: true,
},
},
knowledgeBank: {
enabled: true, // enable semantic memory (default: false)
memoryDir: 'memory',
extraction: {
enabled: true, // auto-extract from trajectories
useLlmExtraction: false, // use heuristic extraction
},
graph: {
enabledLayers: ['semantic'], // semantic | temporal | causal | entity
},
surfacing: {
maxNotesPerTask: 5,
maxTokensForKnowledge: 2000,
},
minimemAware: false, // set true if using minimem
},
execution: {
defaultAgentType: 'claude-code',
maxExecutionTime: 300,
captureToolCalls: true,
},
refinement: {
useAgentEvaluation: true,
maxIterations: 3,
acceptableScore: 0.7,
},
});
Core Types
Playbook
The central learning unit. Combines when to apply, what to do, and how to verify results.
interface Playbook {
id: string;
name: string;
applicability: {
situations: string[]; // "debugging async code"
triggers: string[]; // "Promise rejection", "TS2307"
antiPatterns: string[]; // when NOT to use
domains: string[]; // "typescript", "react", "testing"
};
guidance: {
strategy: string; // high-level approach
tactics: string[]; // mid-level steps
steps?: string[]; // concrete commands
codeExample?: string;
};
verification: {
successIndicators: string[]; // "Tests pass", "No errors"
failureIndicators: string[]; // "Same error persists"
rollbackStrategy?: string;
};
evolution: {
version: string;
successCount: number;
failureCount: number;
refinements: Refinement[]; // context-specific adaptations
failures: FailureRecord[];
};
confidence: number; // 0-1, grows with successful use
complexity: 'simple' | 'moderate' | 'complex';
}
Trajectory
A recorded problem-solving session in ReAct format.
interface Trajectory {
id: string;
task: Task;
steps: Step[]; // thought -> action -> observation
outcome: Outcome; // success/failure + solution
agentId: string;
llmCalls: number;
totalTokens: number;
wallTimeSeconds: number;
}
RoutingDecision
How the system decides to approach a task.
interface RoutingDecision {
strategy: 'direct' | 'adapt' | 'explore' | 'fallback';
confidence: number;
memoryContext: MemoryQueryResultV2;
estimatedBudget: number;
reasoning: string;
}Research References
cognitive-core draws from several lines of research on agent memory and learning:
- ArcMemo - Concept-level memory outperforms instance-level at all compute scales
- ReMem - Experience retrieval with k=4, iterative refinement loops
- Stitch - Library learning through compression
- LILO - AutoDoc for concept naming
- Voyager - Ever-growing skill libraries with verification
- Claudeception - Lightweight skill persistence for code agents
- A-MEM - Zettelkasten-inspired structured notes with dynamic linking (knowledge note format)
- MAGMA - Multi-graph decomposition into semantic, temporal, causal, entity layers (graph overlay design)
- Zep - Entity extraction, community detection, temporal tracking
- Memory Survey - Comprehensive survey and taxonomy of factual/experiential/procedural memory
- ruvector - Inspiration for infrastructure improvements: ReasoningBank (K-means++ clustering), MoE routing gate, CoherenceChecker (sheaf-inspired), TemporalCompressor (tiered memory), InstantLoop (hot-path learning)
Limitations
- No cross-domain transfer. A playbook learned in the code domain won't surface for testing tasks even if the underlying pattern is the same. Domain tags are string-matched, not semantically compared.
- Cold start. The system needs ~10 trajectories before batch learning kicks in and playbooks start appearing. The instant loop provides per-trajectory updates immediately, and knowledge extraction works per-trajectory, so some learning is available sooner.
- Text-based matching by default. BM25 and inverted index candidate narrowing work but miss semantic similarity. Vector search requires configuring an embedding provider and adds latency. Knowledge bank search uses text similarity unless minimem is available.
- No trajectory quality filtering. The system stores all trajectories, including ones from poorly-performing agents. Low-quality trajectories can produce low-quality playbooks and knowledge. The deduplication threshold, confidence model, and temporal compression help, but don't solve the garbage-in problem.
- Single-machine storage. All system-internal data lives in a single SQLite database (cognitive-core.db with WAL mode), knowledge notes are Markdown files, and skills use a separate skills.db. There's no built-in replication, multi-agent concurrency, or cloud storage.
- Extraction is heuristic by default. Without an LLM extractor configured, both playbook and knowledge extraction use text pattern matching, which produces lower-quality results. LLM-assisted extraction is available via workspace templates.
Contributing
Contributions welcome. The test suite uses Vitest:
npm install
npm run test:run # run tests once
npm run test # watch mode
npm run typecheck # type checking without emit
License
MIT
