cognitive-core
v0.1.2
TypeScript-native cognitive core for adaptive learning and abstraction
A TypeScript learning system for AI agents. Records how agents solve problems, extracts reusable playbooks and factual knowledge from trajectories, and injects relevant guidance into future tasks.
Table of Contents
- Motivation
- Installation
- Quick Start
- How It Works
- Knowledge Bank
- CLI
- Agent Backends
- Skill Library
- Configuration
- Core Types
- Research References
- Limitations
- Contributing
- License
Motivation
cognitive-core gives agents persistent, structured memory:
- Trajectories record what the agent did (ReAct-style thought/action/observation steps)
- Playbooks distill trajectories into reusable guidance (strategy, tactics, verification criteria)
- Knowledge Bank extracts and organizes factual knowledge — what agents learn about tools, libraries, and patterns
- Routing matches incoming tasks to relevant playbooks before the agent starts working
- Meta-learning tracks which playbooks helped and adjusts routing over time
The result: agents that get measurably better at recurring problem types without fine-tuning or prompt engineering.
Installation
npm install cognitive-core
Requires Node.js 18+.
Quick Start
Solve a task with memory-augmented agents
import { createAtlasWithAgents, createTask, createMockBackend } from 'cognitive-core';
const atlas = createAtlasWithAgents(
[createMockBackend()],
{ storage: { baseDir: '.cognitive-core' } }
);
await atlas.init();
const result = await atlas.solve(createTask({
domain: 'code',
description: 'Fix the TypeScript compilation error in auth.ts',
}));
console.log(result.trajectory.outcome.success); // true
console.log(result.routing?.strategy); // 'direct' | 'adapt' | 'explore' | 'fallback'
console.log(result.injectedPlaybooks?.length); // number of playbooks injected
await atlas.close();
Feed trajectories from external agents
cognitive-core can learn from trajectories produced by external agents. Record what happened elsewhere and feed it in.
import { createAtlas, createTrajectory, createTask, createStep, successOutcome } from 'cognitive-core';
const atlas = createAtlas({ storage: { baseDir: '.cognitive-core' } });
await atlas.init();
const trajectory = createTrajectory({
task: createTask({
domain: 'code',
description: 'Fix the null pointer exception in user service',
}),
steps: [
createStep({
thought: 'Check where the null value originates',
action: 'Read src/services/user.ts',
observation: 'getUserById returns undefined when user not found',
}),
createStep({
thought: 'Add a guard clause before accessing user properties',
action: 'Edit src/services/user.ts to add null check',
observation: 'Added: if (!user) return null',
}),
],
outcome: successOutcome('Fixed by adding null check in getUserById'),
agentId: 'claude-code',
});
const result = await atlas.processTrajectory(trajectory);
console.log(result.abstractable); // true - system can extract a playbook from this
await atlas.close();
Query memory directly
const context = await atlas.queryMemory('typescript import resolution error', {
domains: ['code'],
includePlaybooks: true,
});
for (const { playbook, score } of context.playbooks) {
console.log(`${playbook.name} (${Math.round(score * 100)}% match)`);
console.log(` Strategy: ${playbook.guidance.strategy}`);
console.log(` Tactics: ${playbook.guidance.tactics.join(', ')}`);
}
How It Works
Solve flow
Every call to atlas.solve(task) runs through this pipeline:
flowchart TD
Task[Task arrives] --> Router[TaskRouter]
Router -->|queries| Memory[MemorySystem]
Memory --> Exp[ExperienceMemory]
Memory --> PB[PlaybookLibrary]
Memory --> KB[KnowledgeBank]
Memory --> Meta[MetaMemory]
Router --> Decision{RoutingDecision}
Decision --> Agent[AgentManager]
Agent -->|injects playbooks + knowledge| Backend[Backend]
Backend -->|returns trajectory| Session[AgentSession]
Session --> Check{Succeeded?}
Check -->|Yes| Learn[LearningPipeline]
Check -->|No| Refine{Retry?}
Refine -->|Yes, max 3x| Agent
Refine -->|No| Learn
Learn --> Analyze[TrajectoryAnalyzer]
Learn --> Extract[PlaybookExtractor]
Learn --> KnowExtract[KnowledgeExtractor]
Learn --> Usage[UsageInference]
Learn --> MetaLearn[MetaLearner]
Analyze --> Updated[Memory updated]
Extract --> Updated
KnowExtract --> Updated
Usage --> Updated
MetaLearn --> Updated
Updated -.->|next task| Router
Playbook lifecycle
A playbook starts with low confidence (0.3) after extraction from a cluster of similar trajectories. Each time it's injected into an agent and the task succeeds, confidence grows. After enough successful uses (default: 5 successes, 80%+ success rate), it gets promoted to a core skill that's always in the system prompt. If it starts failing in specific contexts, the system records refinements rather than discarding the playbook entirely.
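The promotion and demotion rules above can be sketched as a pair of predicates. This is an illustrative reading of the documented thresholds (0.85 confidence, 5 successes, 80% success rate, 3 consecutive failures), not the library's internal code; the `PlaybookStats` shape is an assumption.

```typescript
// Sketch of the promotion/demotion thresholds described above.
// `PlaybookStats` is an illustrative shape, not cognitive-core's API.
interface PlaybookStats {
  confidence: number;          // 0-1, starts at 0.3 after extraction
  successCount: number;
  failureCount: number;
  consecutiveFailures: number;
}

function shouldPromoteToCore(s: PlaybookStats): boolean {
  const total = s.successCount + s.failureCount;
  const successRate = total > 0 ? s.successCount / total : 0;
  // Core tier: 0.85+ confidence, 5+ successes, 80%+ success rate.
  return s.confidence >= 0.85 && s.successCount >= 5 && successRate >= 0.8;
}

function shouldDemoteFromCore(s: PlaybookStats): boolean {
  // Three consecutive failures trigger demotion review.
  return s.consecutiveFailures >= 3;
}
```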
stateDiagram-v2
[*] --> Extracted : pattern found across trajectories
Extracted --> Contextual : confidence above 0.3
Contextual --> Domain : tagged to domain
Contextual --> Contextual : success
Domain --> Domain : success
Domain --> Core : 0.85+ confidence, 5+ successes, 80%+ rate
Core --> Core : success
Core --> Domain : 3 consecutive failures
Domain --> Contextual : confidence drops
Contextual --> Refined : failure in specific context
Refined --> Contextual : refinement recorded
Memory architecture
The four memory stores serve different retrieval patterns:
graph LR
subgraph MemorySystem
E[ExperienceMemory]
P[PlaybookLibrary]
K[KnowledgeBank]
M[MetaMemory]
end
subgraph Search
BM[BM25 Index]
VS[sqlite-vec]
TS[Text Similarity]
end
subgraph Providers
OAI[OpenAI]
VOY[Voyage]
HF[HuggingFace local]
MM[minimem optional]
end
E --> BM
E --> VS
P --> BM
P --> VS
K --> TS
K -.-> MM
VS -.-> OAI
VS -.-> VOY
VS -.-> HF
subgraph Storage
JSON[JSON files]
SQL[SQLite]
MD[Markdown + YAML]
end
E --> JSON
P --> JSON
M --> JSON
K --> MD
VS --> SQL
| Store | Purpose | Format |
|-------|---------|--------|
| ExperienceMemory | Episodic — what happened | JSON |
| PlaybookLibrary | Procedural — how to do things | JSON |
| KnowledgeBank | Semantic — facts, concepts, relationships | Markdown + YAML frontmatter |
| MetaMemory | Meta-learning — what worked | JSON |
Search and matching
Memory search uses BM25 text matching by default. For higher-quality retrieval, plug in an embedding provider (OpenAI, Voyage, or local via @huggingface/transformers) and cognitive-core switches to hybrid BM25 + vector search backed by sqlite-vec.
Knowledge bank search uses text similarity by default, with optional delegation to minimem for hybrid vector + BM25 search when available.
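A minimal sketch of what hybrid BM25 + vector ranking can look like: BM25 scores are normalized so they are comparable to cosine similarity, then the two are blended. The fusion weight and normalization here are illustrative assumptions, not cognitive-core's actual implementation.

```typescript
// Illustrative hybrid-retrieval score fusion; how cognitive-core
// actually combines BM25 and vector scores is an internal detail.
interface ScoredDoc { id: string; bm25: number; cosine: number }

function hybridRank(docs: ScoredDoc[], vectorWeight = 0.5): ScoredDoc[] {
  // Normalize BM25 to [0, 1] so it is comparable to cosine similarity.
  const maxBm25 = Math.max(...docs.map((d) => d.bm25), 1e-9);
  const score = (d: ScoredDoc) =>
    (1 - vectorWeight) * (d.bm25 / maxBm25) + vectorWeight * d.cosine;
  return [...docs].sort((a, b) => score(b) - score(a));
}
```

With `vectorWeight = 0` this degrades to pure BM25 ordering, which mirrors the default (no embedding provider) behavior.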
Knowledge Bank
The knowledge bank is a semantic memory store for facts, concepts, and relationships that agents learn from experience. While playbooks capture how to do things, the knowledge bank captures what agents know — facts about tools, libraries, version-specific behavior, and causal relationships.
Knowledge note types
| Type | Purpose | Example |
|------|---------|---------|
| Observation | Atomic fact learned from experience | "Prisma requires db push before migrate dev when schema has drifted" |
| Entity | Living document about a tool/library/pattern | "Everything I know about Prisma" |
| Domain summary | High-level overview of a knowledge domain | "Database knowledge overview" |
How knowledge is stored
Knowledge notes are plain Markdown files with structured YAML frontmatter, organized on the filesystem:
knowledge/
├── observations/ # Atomic facts (k-*.md)
├── entities/ # Living entity docs (prisma.md, vitest.md)
└── domains/       # Domain summaries (database.md, testing.md)
Any agent with filesystem access can browse, grep, and read knowledge directly — no special API required.
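To make the layout concrete, here is a hypothetical knowledge note and a minimal frontmatter parser. The frontmatter field names are examples only; the exact schema cognitive-core writes may differ.

```typescript
// A hypothetical knowledge note in the Markdown + YAML frontmatter
// layout. Field names (type, entity, confidence) are illustrative.
const note = `---
type: observation
entity: prisma
confidence: 0.5
---
Prisma requires \`db push\` before \`migrate dev\` when the schema has drifted.`;

// Minimal parser: split the leading --- block into key/value pairs.
function parseFrontmatter(md: string): { meta: Record<string, string>; body: string } {
  const match = md.match(/^---\n([\s\S]*?)\n---\n?/);
  if (!match) return { meta: {}, body: md };
  const meta: Record<string, string> = {};
  for (const line of match[1].split('\n')) {
    const idx = line.indexOf(':');
    if (idx > 0) meta[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return { meta, body: md.slice(match[0].length) };
}
```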
Multi-layer knowledge graph
Relationships between knowledge notes are tracked in a multi-layer graph overlay (inspired by MAGMA):
| Layer | Captures | Example |
|-------|----------|---------|
| Semantic | Conceptual relationships | "Prisma depends-on PostgreSQL" |
| Temporal | When knowledge was learned | "Observation A supersedes B" |
| Causal | Cause-effect chains | "Upgrading TS 5.4 broke enum const exports" |
| Entity | Tool/component interactions | "Next.js uses React" |
Knowledge lifecycle
Trajectory → KnowledgeExtractor → Observations (confidence: 0.3)
│
reinforced by more trajectories
│
Observations (confidence: 0.5-0.8)
│
consolidated into entity notes
│
Entity Notes (living docs)
│
domain summaries regenerated
│
Domain Summaries
Knowledge evolves through:
- Reinforcement — same fact observed again increases confidence
- Contradiction detection — conflicting facts are flagged and resolved
- Consolidation — observations about the same entity merge into entity notes
- Decay — unvalidated knowledge gradually loses confidence
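The reinforcement and decay dynamics above can be sketched as simple update rules. The formulas and rates here are illustrative assumptions, not the library's actual math; they only show the direction of each update.

```typescript
// Sketch of the confidence dynamics described above (assumed formulas).
function reinforce(confidence: number, step = 0.1): number {
  // Observing the same fact again nudges confidence toward 1.
  return Math.min(1, confidence + step * (1 - confidence));
}

function decay(confidence: number, daysSinceValidated: number, rate = 0.01): number {
  // Unvalidated knowledge gradually loses confidence over time.
  return Math.max(0, confidence * Math.pow(1 - rate, daysSinceValidated));
}
```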
Knowledge surfacing
During atlas.solve(), relevant knowledge is surfaced alongside playbooks as an independent context section:
const result = await atlas.solve(task);
console.log(result.surfacedKnowledge?.length); // knowledge notes injected
Three-tier retrieval:
- Domain match — if the task domain matches a knowledge domain, include the domain summary
- Entity match — if the task mentions known entities, include their notes
- Semantic match — text similarity search for contextually relevant observations
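The three tiers above can be sketched as a surfacing function that fills a note budget in priority order. The `KnowledgeStore` interface and the substring entity matching are simplifications for illustration, not cognitive-core's actual API.

```typescript
// Sketch of three-tier knowledge surfacing (types are assumptions).
interface KnowledgeStore {
  domainSummary(domain: string): string | undefined;
  entityNotes(names: string[]): string[];
  searchObservations(query: string, limit: number): string[];
}

function surfaceKnowledge(
  store: KnowledgeStore,
  task: { domain: string; description: string },
  knownEntities: string[],
  maxNotes = 5,
): string[] {
  const notes: string[] = [];
  // 1. Domain match: include the domain summary when one exists.
  const summary = store.domainSummary(task.domain);
  if (summary) notes.push(summary);
  // 2. Entity match: include notes for entities the task mentions.
  const mentioned = knownEntities.filter((e) =>
    task.description.toLowerCase().includes(e.toLowerCase()));
  notes.push(...store.entityNotes(mentioned));
  // 3. Semantic match: fill the remaining budget by similarity search.
  notes.push(...store.searchObservations(task.description, maxNotes - notes.length));
  return notes.slice(0, maxNotes);
}
```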
minimem integration
When minimem is available, the knowledge bank can delegate search to minimem's hybrid search (vector + BM25) for higher-quality retrieval. Set minimemAware: true in config to enable. The two systems communicate via file conventions only — no cross-package imports.
See Design Doc for the full architecture.
CLI
cognitive-core ships a CLI for querying playbooks and storing trajectories without writing code.
# Initialize storage
cognitive-core init --dir .cognitive-core
# Store a trajectory from a JSON file
cognitive-core store ./trajectory.json
# Search for relevant playbooks
cognitive-core search "fix typescript import errors" --domain code
# Found 2 playbook(s) matching "fix typescript import errors":
#
# Match: 92% (trigger)
# ## typescript-import-resolution
# Confidence: 85%
# Strategy: Check tsconfig.json paths configuration
# Tactics:
# - Verify moduleResolution setting
# - Check baseUrl and paths mapping
# - Inspect file extensions (.js vs .ts)
# Get full playbook details
cognitive-core get playbook-abc123
# List domains with playbooks
cognitive-core domains
# View memory statistics
cognitive-core stats
# Experiences: 47
# Playbooks: 12
# Meta-observations: 31
All commands support --json for structured output:
cognitive-core search "debug async errors" --json | jq '.results[0].strategy'
Agent Backends
cognitive-core delegates execution to backends that handle spawning, message passing, and trajectory extraction.
Subprocess backend
Spawns agents as child processes. Works with any CLI agent.
import { createAtlasWithAgents, createSubprocessBackend, claudeCodeConfig } from 'cognitive-core';
const backend = createSubprocessBackend({
'claude-code': claudeCodeConfig, // Pre-configured for Claude Code CLI
});
const atlas = createAtlasWithAgents([backend], {
execution: {
defaultAgentType: 'claude-code',
maxExecutionTime: 300,
captureToolCalls: true,
},
});
ACP backend
Uses the Agent Communication Protocol for richer interaction with ACP-compatible agents.
import { createACPBackend, claudeCodeACPConfig } from 'cognitive-core';
const backend = createACPBackend({
'claude-code': claudeCodeACPConfig, // Uses npx claude-code-acp
});
Custom backends
Implement the AgentBackend interface:
import type { AgentBackend, AgentSpawnConfig, AgentSession } from 'cognitive-core';
class MyBackend implements AgentBackend {
readonly name = 'my-agent';
readonly supportedTypes = ['my-agent'];
async isAvailable(): Promise<boolean> { return true; }
async spawn(config: AgentSpawnConfig): Promise<AgentSession> {
// config.task - the task to solve
// config.systemPromptAdditions - playbook context to inject
// config.timeout - max execution time
const session = await launchMyAgent(config);
return session;
}
async getSession(id: string): Promise<AgentSession | undefined> { /* ... */ }
async terminate(id: string): Promise<void> { /* ... */ }
}
Observing agent execution
const manager = atlas.getAgentManager();
manager.addObserver({
onSessionStart: (session) => console.log('Started:', session.id),
onToolCall: (session, toolCall) => console.log('Tool:', toolCall.name),
onSessionEnd: (session, trajectory) => {
console.log('Done:', trajectory.outcome.success);
console.log('Steps:', trajectory.steps.length);
},
});
Skill Library
The skill library manages how playbooks are surfaced to agents across four tiers:
| Tier | When loaded | Criteria |
|------|-------------|----------|
| Core | Always in system prompt | 85%+ confidence, 5+ successes, 80%+ success rate |
| Domain | When task domain matches | Domain-tagged playbooks |
| Contextual | When task query matches | Semantic/trigger match |
| On-demand | Agent explicitly requests | Available via CLI |
Playbooks are automatically promoted and demoted based on usage outcomes. Three consecutive failures trigger demotion review.
const skills = await atlas.getSkillLibrary()?.getSkillsForAgent(task);
// skills.core - always-on playbooks
// skills.domain - relevant to task domain
// skills.contextual - matched to this specific task
Configuration
const atlas = createAtlasWithAgents([backend], {
storage: {
baseDir: '.cognitive-core',
persistenceEnabled: true,
},
learning: {
creditStrategy: 'simple', // 'simple' | 'contribution'
minTrajectories: 10, // batch learning threshold
deduplicationThreshold: 0.9, // prevent duplicate playbooks
},
router: {
similarityThreshold: 0.85, // match confidence threshold
useDomainRouting: true,
},
memory: {
maxExperiences: 4, // k for experience retrieval
maxContextTokens: 4000,
capacity: {
maxExperiences: 1000, // total stored experiences
maxPlaybooks: 200,
autoPrune: true,
preserveDomainCoverage: true,
},
},
knowledgeBank: {
enabled: true, // enable semantic memory (default: false)
memoryDir: 'memory',
extraction: {
enabled: true, // auto-extract from trajectories
useLlmExtraction: false, // use heuristic extraction
},
graph: {
enabledLayers: ['semantic'], // semantic | temporal | causal | entity
},
surfacing: {
maxNotesPerTask: 5,
maxTokensForKnowledge: 2000,
},
minimemAware: false, // set true if using minimem
},
execution: {
defaultAgentType: 'claude-code',
maxExecutionTime: 300,
captureToolCalls: true,
},
refinement: {
useAgentEvaluation: true,
maxIterations: 3,
acceptableScore: 0.7,
},
});
Core Types
Playbook
The central learning unit. Combines when to apply, what to do, and how to verify results.
interface Playbook {
id: string;
name: string;
applicability: {
situations: string[]; // "debugging async code"
triggers: string[]; // "Promise rejection", "TS2307"
antiPatterns: string[]; // when NOT to use
domains: string[]; // "typescript", "react", "testing"
};
guidance: {
strategy: string; // high-level approach
tactics: string[]; // mid-level steps
steps?: string[]; // concrete commands
codeExample?: string;
};
verification: {
successIndicators: string[]; // "Tests pass", "No errors"
failureIndicators: string[]; // "Same error persists"
rollbackStrategy?: string;
};
evolution: {
version: string;
successCount: number;
failureCount: number;
refinements: Refinement[]; // context-specific adaptations
failures: FailureRecord[];
};
confidence: number; // 0-1, grows with successful use
complexity: 'simple' | 'moderate' | 'complex';
}
Trajectory
A recorded problem-solving session in ReAct format.
interface Trajectory {
id: string;
task: Task;
steps: Step[]; // thought -> action -> observation
outcome: Outcome; // success/failure + solution
agentId: string;
llmCalls: number;
totalTokens: number;
wallTimeSeconds: number;
}
RoutingDecision
How the system decides to approach a task.
interface RoutingDecision {
strategy: 'direct' | 'adapt' | 'explore' | 'fallback';
confidence: number;
memoryContext: MemoryQueryResultV2;
estimatedBudget: number;
reasoning: string;
}
Research References
cognitive-core draws from several lines of research on agent memory and learning:
- ArcMemo - Concept-level memory outperforms instance-level at all compute scales
- ReMem - Experience retrieval with k=4, iterative refinement loops
- Stitch - Library learning through compression
- LILO - AutoDoc for concept naming
- Voyager - Ever-growing skill libraries with verification
- Claudeception - Lightweight skill persistence for code agents
- A-MEM - Zettelkasten-inspired structured notes with dynamic linking (knowledge note format)
- MAGMA - Multi-graph decomposition into semantic, temporal, causal, entity layers (graph overlay design)
- Zep - Entity extraction, community detection, temporal tracking
- Memory Survey - Comprehensive survey and taxonomy of factual/experiential/procedural memory
Limitations
- No cross-domain transfer. A playbook learned in the code domain won't surface for testing tasks even if the underlying pattern is the same. Domain tags are string-matched, not semantically compared.
- Cold start. The system needs ~10 trajectories before batch learning kicks in and playbooks start appearing. Until then, agents run without memory augmentation. Knowledge extraction works per-trajectory, so knowledge is available sooner.
- Text-based matching by default. BM25 works but misses semantic similarity. Vector search requires configuring an embedding provider and adds latency. Knowledge bank search uses text similarity unless minimem is available.
- No trajectory quality filtering. The system stores all trajectories, including ones from poorly-performing agents. Low-quality trajectories can produce low-quality playbooks and knowledge. The deduplication threshold and confidence model help, but don't solve the garbage-in problem.
- Single-machine storage. Persistence is JSON file-based (experiences, playbooks), Markdown file-based (knowledge), and SQLite (vector store). There's no built-in replication, multi-agent concurrency, or cloud storage.
- Extraction is heuristic by default. Without an LLM extractor configured, both playbook and knowledge extraction use text pattern matching, which produces lower-quality results. LLM-assisted extraction is available via workspace templates.
- Knowledge extraction is not auto-wired into the learning pipeline. Knowledge must be extracted explicitly via KnowledgeBank.extractFromTrajectory(). This is by design — it allows callers to control when extraction runs.
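One way callers can mitigate the garbage-in problem today is to screen trajectories before feeding them in. The heuristics and the minimal trajectory shape below are illustrative, not part of cognitive-core.

```typescript
// Minimal trajectory shape for this sketch; mirrors only the fields
// used below, not the full cognitive-core Trajectory type.
interface TrajectoryLike {
  outcome: { success: boolean };
  steps: unknown[];
}

// Hypothetical pre-filter: only feed in successful, reasonably-sized runs.
function worthLearningFrom(t: TrajectoryLike, maxSteps = 50): boolean {
  if (!t.outcome.success) return false;        // only learn from wins
  if (t.steps.length === 0) return false;      // empty sessions carry no signal
  if (t.steps.length > maxSteps) return false; // long, meandering runs are noisy
  return true;
}
```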
Contributing
Contributions welcome. The test suite uses Vitest:
npm install
npm run test:run # run tests once
npm run test # watch mode
npm run typecheck # type checking without emit
License
MIT
