@foreveragent/core

v0.2.0

Published

4 months ago

Persistent memory framework for AI agents — fact extraction, hybrid retrieval, context management, and persona identity

Forever Agent

Forever Agent gives your AI agents memory that persists across sessions, context pruning, and model switches. It runs on Node.js built-in modules with zero npm dependencies.

Features

🧠 Persistent Memory — Atomic facts survive session restarts, model switches, and context window limits
🔍 Hybrid Retrieval — Four-strategy retrieval (keyword FTS5, semantic similarity, knowledge graph, temporal) merged with Reciprocal Rank Fusion
📊 Fact Lifecycle — HOT/WARM/COLD tiering, access-pattern promotion/demotion, supersession
🎭 Persona Management — Agent identity as a first-class concern, hot-reload, change tracking
🗂 Knowledge Graph — Entity-relationship store with traversal
⚡ Context Management — Budget-aware context window with adaptive pruning
🔄 Context Compaction — Summarizes old conversation history instead of dropping it, preserving information while freeing token budget
🔒 Quality Gate — Filters low-signal extractions before storage, improving precision
🏗 Pluggable Backends — Swap storage, vector store, LLM, and embeddings independently
📦 Zero npm Dependencies — Uses Node.js 22.6+ built-ins (SQLite, crypto, fs) only. Requires an external LLM and embedding service.

Quick Start

npm install @foreveragent/core

Basic Usage

import { ForeverAgent } from '@foreveragent/core';

// Create an agent that stores its databases under ./data
const agent = await ForeverAgent.create({
  dataDir: './data',
  llm: {
    type: 'openai-compatible',
    endpoint: 'http://localhost:1234/v1',
    model: 'your-model',
    maxTokens: 2048,
  },
  embedding: {
    type: 'openai-compatible',
    endpoint: 'http://localhost:1234/v1',
    model: 'your-embedding-model',
    dimensions: 1536,
  },
});

// Start a session
const session = await agent.startSession();

// Conversation history shared across turns
const conversationHistory: Array<{ role: 'user' | 'assistant'; content: string }> = [];

// Your conversation loop (getUserInput and llm are your own helpers):
while (true) {
  const userMessage = await getUserInput();
  if (!userMessage) break; // stop on empty input so the session can be closed

  // Recall relevant memories before each LLM call
  const memories = await agent.recall(userMessage);

  // Prepare context with persona + memories injected
  const { messages } = await agent.prepareContext(conversationHistory, memories);

  // Call your LLM
  const response = await llm.chat(messages);

  // Extract facts from this exchange
  await agent.extract(userMessage, response);

  conversationHistory.push({ role: 'user', content: userMessage });
  conversationHistory.push({ role: 'assistant', content: response });
}

// End session when done
await agent.endSession();
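
The loop above leaves `llm.chat(messages)` to you. As a rough sketch, one way to fill it in against an OpenAI-compatible endpoint is shown below. The names `buildChatRequest`, `chat`, and `ChatMessage` are hypothetical helpers, not part of `@foreveragent/core`, and hosted providers usually also need an `Authorization` header that this sketch omits.

```typescript
// Hypothetical helpers for illustration; not part of @foreveragent/core.
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

// Builds the HTTP request for an OpenAI-compatible /chat/completions endpoint.
function buildChatRequest(endpoint: string, model: string, messages: ChatMessage[]) {
  return {
    url: `${endpoint.replace(/\/$/, '')}/chat/completions`, // tolerate a trailing slash
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Sends the request with the global fetch (available in Node.js 18+) and
// returns the text of the first choice, or an empty string if none came back.
async function chat(endpoint: string, model: string, messages: ChatMessage[]): Promise<string> {
  const { url, init } = buildChatRequest(endpoint, model, messages);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`LLM request failed with status ${res.status}`);
  const data = (await res.json()) as { choices: Array<{ message: { content: string } }> };
  return data.choices[0]?.message.content ?? '';
}
```

With these in place, `await chat('http://localhost:1234/v1', 'your-model', messages)` could stand in for `llm.chat(messages)` in the loop.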

With OpenAI

import { ForeverAgent } from '@foreveragent/core';

const agent = await ForeverAgent.create({
  dataDir: './data',
  llm: {
    type: 'openai-compatible',
    endpoint: 'https://api.openai.com/v1',
    model: 'gpt-4o',
    maxTokens: 4096,
  },
  embedding: {
    type: 'openai-compatible',
    endpoint: 'https://api.openai.com/v1',
    model: 'text-embedding-3-small',
    dimensions: 1536,
  },
  // Optional: wire up a persona file
  persona: {
    filePath: './persona.md',
    hotReload: true,
    priority: 'critical',
  },
});

With Local Models (LMStudio / Ollama)

import { ForeverAgent } from '@foreveragent/core';

const agent = await ForeverAgent.create({
  dataDir: './.agent-data',
  llm: {
    type: 'openai-compatible',
    endpoint: 'http://localhost:11434/v1',  // Ollama
    model: 'qwen3:14b',
    maxTokens: 2048,
    temperature: 0.3,
  },
  embedding: {
    type: 'openai-compatible',
    endpoint: 'http://localhost:11434/v1',
    model: 'nomic-embed-text',
    dimensions: 768,
  },
});

API Reference

`ForeverAgent.create(config)`

Creates and initializes a new ForeverAgent instance.

const agent = await ForeverAgent.create({
  // Required
  dataDir: string,           // Where to store databases
  llm: LLMBackendConfig,     // LLM configuration
  embedding: EmbeddingBackendConfig,  // Embedding configuration

  // Optional
  persona?: {
    filePath?: string,       // Path to persona.md file
    content?: string,        // Inline persona content
    hotReload?: boolean,     // Watch for file changes
    priority?: 'critical' | 'high' | 'normal',
  },
  memory?: {
    maxFacts?: number,       // Max facts to store (0 = unlimited)
    autoTier?: boolean,      // Enable automatic tiering
    hotThreshold?: number,   // Sessions before HOT→WARM (default: 5)
    coldThreshold?: number,  // Sessions before WARM→COLD (default: 50)
  },
  context?: {
    maxTokens?: number,      // Context window size (default: 32768)
    pruneThreshold?: number, // % utilization before pruning (default: 80)
  },
  retrieval?: {
    keywordWeight?: number,  // FTS5 weight (default: 0.3)
    semanticWeight?: number, // Semantic similarity weight (default: 0.4)
    graphWeight?: number,    // Knowledge graph weight (default: 0.2)
    temporalWeight?: number, // Temporal recency weight (default: 0.1)
    limit?: number,          // Max results (default: 15)
  },
  curation?: {
    enabled?: boolean,       // Enable automatic curation (default: true)
    intervalTurns?: number,  // Turns between curation runs (default: 10)
    dedup?: boolean,         // Enable deduplication (default: true)
    synthesis?: boolean,     // Enable LLM synthesis (default: true)
  },
  logLevel?: 'debug' | 'info' | 'warn' | 'error' | 'silent',
});

Session Management

// Start a session (creates new session ID)
const session = await agent.startSession();

// Or start a session with a specific ID
const namedSession = await agent.startSession({ id: 'my-session-id' });

// End the current session
await agent.endSession();

// Get current session info
const currentSession = agent.getCurrentSession();

Memory Operations

// Recall facts matching a query (hybrid retrieval)
const results = await agent.recall('deployment process', {
  limit: 10,
  method: 'hybrid',          // 'keyword' | 'semantic' | 'graph' | 'temporal' | 'hybrid'
  timeRange: { period: 'last_week' },
});
// results: Array<{ fact: Fact, method: string, score: number }>

// Extract facts from a conversation exchange
const facts = await agent.extract(userMessage, assistantResponse, {
  toolResults: [{ name: 'bash', content: '...' }],
  turnNumber: 42,
});

// Store a fact directly (bypasses LLM extraction)
const fact = await agent.storeFact('The server IP is 192.168.1.100', {
  importance: 0.8,
  tier: 'HOT',
});

// Search facts by text and filters
const matchingFacts = await agent.searchFacts({
  text: 'deployment',
  tier: 'HOT',
  minImportance: 0.5,
  limit: 20,
});

// Get all facts in a session
const sessionFacts = await agent.getSessionFacts(sessionId);

Context Preparation

// Prepare context with persona + memories injected
const { messages, snapshot } = await agent.prepareContext(
  conversationHistory,  // Message[]
  memories,             // RetrievalResult[]
);

// Check context budget without preparing a full context
const budget = await agent.getContextSnapshot(conversationHistory);
console.log(`Context: ${budget.utilization}% full, ${budget.tokensUsed} tokens`);
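
To make the budget numbers concrete, here is a minimal sketch of how a utilization percentage and the `pruneThreshold` option could relate. It assumes utilization is simply `tokensUsed / maxTokens * 100` and that pruning starts once the threshold is reached; `utilizationPercent` and `shouldPrune` are hypothetical names, and the library's actual pruning logic may differ.

```typescript
// Hypothetical helpers; defaults mirror the documented config defaults
// (context.maxTokens = 32768, context.pruneThreshold = 80).
function utilizationPercent(tokensUsed: number, maxTokens = 32768): number {
  return (tokensUsed / maxTokens) * 100;
}

// Assumption: pruning is triggered when utilization reaches the threshold.
function shouldPrune(tokensUsed: number, maxTokens = 32768, pruneThreshold = 80): boolean {
  return utilizationPercent(tokensUsed, maxTokens) >= pruneThreshold;
}

console.log(utilizationPercent(16384)); // 50
console.log(shouldPrune(16384));        // false
console.log(shouldPrune(30000));        // true
```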

Metrics & Health

// Get performance metrics
const metrics = await agent.getMetrics();
console.log(metrics);
// {
//   totalFacts: 1547,
//   factsByTier: { HOT: 234, WARM: 891, COLD: 422 },
//   sessionCount: 12,
//   extractionLatencyMs: 245,
//   retrievalLatencyMs: 18,
//   ...
// }

// Health check all backends
const healthy = await agent.healthCheck();

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    Your Agent / Application                      │
└───────────────────────────┬─────────────────────────────────────┘
                            │  ForeverAgent API
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Forever Agent Engine                          │
│                                                                 │
│  ┌─────────────┐  ┌──────────────┐  ┌────────────────────────┐ │
│  │    Fact     │  │   Hybrid     │  │     Context Manager    │ │
│  │  Extractor  │  │  Retriever   │  │                        │ │
│  │             │  │              │  │  • Token budget        │ │
│  │ • LLM-based │  │ • FTS5       │  │  • Pruning             │ │
│  │ • Heuristic │  │ • Semantic   │  │  • Persona injection   │ │
│  │ • Quality   │  │ • Graph      │  │  • Memory injection    │ │
│  │   gate      │  │ • Temporal   │  │                        │ │
│  │             │  │ • RRF fusion │  │                        │ │
│  └─────────────┘  └──────────────┘  └────────────────────────┘ │
│                                                                 │
│  ┌─────────────┐  ┌──────────────┐  ┌────────────────────────┐ │
│  │  Persona    │  │  Curation    │  │   Knowledge Graph      │ │
│  │  Manager   │  │   Engine     │  │                        │ │
│  │             │  │              │  │  • Entity extraction   │ │
│  │ • Hot-reload│  │ • Dedup      │  │  • Relationship edges  │ │
│  │ • Priority  │  │ • Tiering    │  │  • Graph traversal     │ │
│  │ • Change ∆  │  │ • Synthesis  │  │                        │ │
│  └─────────────┘  └──────────────┘  └────────────────────────┘ │
│                                                                 │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │                     Pluggable Backends                    │  │
│  │                                                          │  │
│  │  Storage (SQLite)   Vector (sqlite-vec)   Graph (SQLite)  │  │
│  │  LLM (OpenAI-compat) Embedding (OpenAI-compat)           │  │
│  └──────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

Retrieval Strategy

Forever Agent uses four retrieval strategies merged with Reciprocal Rank Fusion (RRF):

| Strategy | Weight | How It Works |
|---|---|---|
| Keyword (FTS5) | 0.3 | BM25 full-text search, porter stemming |
| Semantic | 0.4 | Cosine similarity via embedding vectors |
| Graph | 0.2 | Knowledge graph entity traversal |
| Temporal | 0.1 | Recency bias, time-range filtering |

RRF merges ranked lists without score normalization — a fact ranked #1 by two strategies scores higher than one ranked #1 by one strategy.
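
As a rough illustration of the fusion step, the sketch below implements a weighted form of Reciprocal Rank Fusion over ranked lists of fact IDs. The function name `fuseRankings`, the `RankedList` type, and the smoothing constant `k = 60` (a commonly used RRF value) are assumptions for illustration; how the library applies the strategy weights internally may differ.

```typescript
// Hypothetical types and names; not part of the @foreveragent/core API.
type RankedList = { weight: number; ids: string[] }; // ids are ordered best-first

// Weighted Reciprocal Rank Fusion: every list adds weight / (k + rank) to the
// score of each item it contains, where rank starts at 1. Only ranks are used,
// so the strategies' raw scores never need to be normalized.
function fuseRankings(lists: RankedList[], k = 60): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const { weight, ids } of lists) {
    ids.forEach((id, index) => {
      const contribution = weight / (k + index + 1);
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  }
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

const fused = fuseRankings([
  { weight: 0.3, ids: ['fact-a', 'fact-b'] }, // keyword
  { weight: 0.4, ids: ['fact-a', 'fact-c'] }, // semantic
  { weight: 0.2, ids: ['fact-c'] },           // graph
  { weight: 0.1, ids: ['fact-b', 'fact-a'] }, // temporal
]);
console.log(fused.map((r) => r.id)); // order: fact-a, fact-c, fact-b
```

Here `fact-a` wins because it is ranked first by both the keyword and semantic lists, which matches the behavior described above.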

Memory Tiers

| Tier | Access Pattern | Typical Age |
|---|---|---|
| HOT | Accessed in last 5 sessions | Recent |
| WARM | Accessed in last 50 sessions | Active |
| COLD | Not accessed in 50+ sessions | Historical |
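
The tier boundaries can be read as a simple rule on the number of sessions since a fact was last accessed. The sketch below is one possible reading, using the documented defaults for `hotThreshold` and `coldThreshold`; the helper name `tierFor` is hypothetical, and the handling of exact boundary values is an assumption, since the table does not specify it.

```typescript
type Tier = 'HOT' | 'WARM' | 'COLD';

// Hypothetical helper: maps "sessions since last access" to a tier.
// Defaults follow the documented config (hotThreshold = 5, coldThreshold = 50).
// Assumption: a fact sitting exactly on a threshold stays in the warmer tier.
function tierFor(sessionsSinceAccess: number, hotThreshold = 5, coldThreshold = 50): Tier {
  if (sessionsSinceAccess <= hotThreshold) return 'HOT';
  if (sessionsSinceAccess <= coldThreshold) return 'WARM';
  return 'COLD';
}

console.log(tierFor(2));  // HOT
console.log(tierFor(20)); // WARM
console.log(tierFor(75)); // COLD
```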

Custom Backends

You can replace any backend by implementing its interface:

import type { IStorageBackend } from '@foreveragent/core';

class PostgresStorage implements IStorageBackend {
  readonly name = 'postgres';
  // ... implement all methods
}

const agent = await ForeverAgent.create({
  // ...
  storage: new PostgresStorage({ connectionString: process.env.DATABASE_URL }),
});

Available backend interfaces:

IStorageBackend — Fact storage (CRUD, FTS, tier management)
IVectorStoreBackend — Embedding storage and similarity search
IEmbeddingBackend — Text-to-vector generation
ILLMBackend — Text generation (for extraction and synthesis)
IGraphBackend — Entity-relationship storage and traversal

Supported Frameworks

Forever Agent works with any AI agent that supports MCP or can call a library API. Detailed setup instructions for each framework are in docs/INTEGRATION-GUIDE.md.

MCP-Native (plug and play)

| Framework | Config Location | Difficulty |
|-----------|----------------|------------|
| Claude Code | ~/.claude/settings.json | Easy |
| Cursor | ~/.cursor/mcp.json | Easy |
| Windsurf | ~/.codeium/windsurf/mcp_config.json | Easy |
| Cline | VS Code → Cline Settings → MCP | Easy |
| Continue | ~/.continue/config.yaml | Easy |
| Zed | ~/.config/zed/settings.json | Easy |
| OpenAI Codex CLI | ~/.codex/config.toml | Easy |
| OpenCode | ~/.opencode/config.json | Easy |

All MCP integrations use the same pattern:

{
  "mcpServers": {
    "forever-agent": {
      "command": "node",
      "args": ["--experimental-strip-types", "/path/to/Forever-Agent-Clean/mcp-server/mcp-server.ts"],
      "env": {
        "FA_DATA_DIR": "./.agent-memory",
        "FA_LLM_ENDPOINT": "http://localhost:11434/v1",
        "FA_LLM_MODEL": "qwen3:14b",
        "FA_EMBED_MODEL": "nomic-embed-text",
        "FA_EMBED_DIMS": "768"
      }
    }
  }
}

Deep Integrations

| Framework | Method | Difficulty |
|-----------|--------|------------|
| OpenClaw | Native memory plugin (kind: "memory") | Medium |
| Pi Coding Agent | MCP server or custom extension | Medium |
| aider | Wrapper script | Medium |
| LangChain / LangGraph | Direct library API | Easy |

OpenClaw

OpenClaw (320K+ ⭐) has a native memory plugin slot system. Forever Agent can replace OpenClaw's built-in memory (memory-core or memory-lancedb) with full hybrid retrieval, knowledge graph, fact lifecycle, and persona management. See the OpenClaw section of docs/INTEGRATION-GUIDE.md for the complete plugin implementation.

Claude Code (quick start)

Add to ~/.claude/settings.json:

{
  "mcpServers": {
    "forever-agent": {
      "command": "node",
      "args": ["--experimental-strip-types", "/path/to/Forever-Agent-Clean/mcp-server/mcp-server.ts"],
      "env": {
        "FA_DATA_DIR": "/path/to/.agent-memory",
        "FA_LLM_ENDPOINT": "http://localhost:11434/v1",
        "FA_LLM_MODEL": "qwen3:14b",
        "FA_EMBED_MODEL": "nomic-embed-text"
      }
    }
  }
}

Then add to your CLAUDE.md:

## Memory

You have persistent memory via the forever-agent MCP server.

- Before complex questions, use `memory_recall` to check for context
- After significant work, use `memory_extract` to capture key facts
- Use `memory_store` for important decisions and preferences

LangChain

import { ForeverAgent } from '@foreveragent/core';
import { ChatOpenAI } from '@langchain/openai';

const memory = await ForeverAgent.create({ /* ... */ });
const llm = new ChatOpenAI({ model: 'gpt-4o' });

// In your chain:
const memories = await memory.recall(userInput);
const systemPrompt = buildSystemPrompt(memories);
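
`buildSystemPrompt` is not defined in the snippet above. A minimal sketch of what it could look like follows, assuming each recall result exposes the fact text as `fact.content`. That field name is an assumption: this README only documents the `{ fact, method, score }` shape, so check the exported `Fact` type before relying on it.

```typescript
// Assumed minimal shapes; the real types exported by the library may differ.
interface FactLike { content: string }
interface RecallResultLike { fact: FactLike; method: string; score: number }

// Hypothetical helper: renders the highest-scoring memories as a bullet list
// appended to a base system prompt.
function buildSystemPrompt(memories: RecallResultLike[], maxFacts = 10): string {
  const header = 'You are a helpful assistant.';
  if (memories.length === 0) return header;
  const lines = memories
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.score - a.score)
    .slice(0, maxFacts)
    .map((m) => `- ${m.fact.content}`);
  return `${header}\n\nRelevant memories:\n${lines.join('\n')}`;
}
```

Capping the list with `maxFacts` keeps the injected memory section from crowding out the rest of the prompt.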

📖 Full integration guide with all 12 frameworks: docs/INTEGRATION-GUIDE.md

Development

# Run unit tests (no network, in-memory backends)
npm test

# Run integration tests (requires local SQLite)
npm run test:integration

# Run all tests
npm run test:all

# Type check without building
npm run typecheck

# Build for distribution
npm run build

Requirements

Node.js >= 22.6.0 (for node:sqlite built-in)
No npm dependencies in production — only devDependencies for TypeScript types

License

Free to use, modify, and distribute. See LICENSE for full terms.

Made with ❤️ by Feature Collective Investments, LLC