@revealui/ai
v0.6.2
AI runtime for agent-driven products — agents, memory, LLM providers (Inference Snaps, Ollama, OpenAI-compatible), tools, and orchestration. Anthropic-SDK-free.
Commercial package - requires a RevealUI Pro license. Free to install and evaluate; a license key is required for production use.
AI system for RevealUI - memory, LLM, orchestration, and tools.
Features
- Memory System: CRDT-based persistent memory (Working, Episodic, Semantic)
- LLM Integration: Provider abstractions for Anthropic, GROQ, Ollama, Canonical Inference Snaps, and more
- Agent Orchestration: Runtime and execution engine for AI agents
- Tool Calling: Tool registry + standard-MCP-client integration (Stage 5.1a)
- Vector Search: Semantic search with pgvector
- Type-safe: Full TypeScript support
- Performant: Optimized for low-latency operations
Reference stack (Ubuntu)
The recommended on-prem / on-device stack pairs @revealui/ai with a
Canonical Inference Snap for local LLM inference:
@revealui/ai (this package)
│
├── agent runtime + tool-calling loop
│
├── MCP client → @revealui/mcp/client → tools/resources/prompts from MCP servers
│
└── LLM provider → OpenAI-compatible HTTP → localhost:<port>/v1
▲
│
Canonical Inference Snap
(DeepSeek R1, Qwen 2.5 VL,
Gemma 3, Nemotron Nano —
silicon-optimized on Intel/
Ampere/NVIDIA/NPU)
Quick setup:
sudo snap install gemma3 # or deepseek-r1, qwen-vl, etc.
gemma3 set http.port=9090
gemma3 status # confirms base URL
import { InferenceSnapsProvider, LLMClient } from '@revealui/ai'
const provider = new InferenceSnapsProvider({
baseURL: 'http://localhost:9090/v1',
model: 'gemma3',
})
const client = new LLMClient({ provider })
Cloud providers (Anthropic, OpenAI, GROQ) remain supported; the local inference path is the documented default for self-hosted deployments.
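For orientation, the request that travels over the OpenAI-compatible path can be pictured roughly as follows. This is a sketch under assumptions: `buildChatRequest` is a hypothetical helper rather than part of @revealui/ai, and it assumes the snap serves the conventional `/chat/completions` route under the `/v1` base URL.

```typescript
// Hypothetical helper (not part of @revealui/ai): builds the request an
// OpenAI-compatible client would send to a local endpoint.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant'
  content: string
}

interface ChatRequest {
  url: string
  init: { method: 'POST'; headers: Record<string, string>; body: string }
}

function buildChatRequest(baseURL: string, model: string, messages: ChatMessage[]): ChatRequest {
  // Strip trailing slashes so the path joins cleanly.
  const root = baseURL.replace(/\/+$/, '')
  return {
    url: `${root}/chat/completions`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model, messages }),
    },
  }
}

const chatRequest = buildChatRequest('http://localhost:9090/v1', 'gemma3', [
  { role: 'user', content: 'Hello!' },
])
// chatRequest.url and chatRequest.init can be passed to fetch() to send it.
```

In normal use the provider handles this for you; the sketch is only meant to show that nothing snap-specific is needed on the wire.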
MCP tool integration
Standard MCP servers plug into the agent runtime as tool sources. Construct
an McpClient (from @revealui/mcp/client), connect it, and pass it to
AgentRuntime:
import { McpClient } from '@revealui/mcp/client'
import { AgentRuntime } from '@revealui/ai'
const contentClient = new McpClient({
clientInfo: { name: 'my-agent', version: '1.0.0' },
transport: { kind: 'streamable-http', url: 'https://admin.example.com/api/mcp/content' },
})
await contentClient.connect()
const runtime = new AgentRuntime({
mcpClients: [{ name: 'content', client: contentClient }],
})
Tools from each client are namespaced as mcp_<name>__<toolName> so multiple
clients coexist without collisions.
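The naming convention can be illustrated with a pair of small helpers. Both functions are hypothetical (the package does not necessarily export anything like them), and the parser assumes client names never contain a double underscore.

```typescript
// Hypothetical helpers illustrating the mcp_<name>__<toolName> convention.
const MCP_PREFIX = 'mcp_'
const SEPARATOR = '__'

function namespacedToolName(clientName: string, toolName: string): string {
  return `${MCP_PREFIX}${clientName}${SEPARATOR}${toolName}`
}

function parseNamespacedToolName(
  qualified: string,
): { clientName: string; toolName: string } | null {
  if (!qualified.startsWith(MCP_PREFIX)) return null
  const rest = qualified.slice(MCP_PREFIX.length)
  // Split on the first double underscore; assumes client names contain none.
  const index = rest.indexOf(SEPARATOR)
  if (index <= 0) return null
  return {
    clientName: rest.slice(0, index),
    toolName: rest.slice(index + SEPARATOR.length),
  }
}

// Two clients exposing a tool with the same name no longer collide.
const contentSearch = namespacedToolName('content', 'search')
const billingSearch = namespacedToolName('billing', 'search')
```

Here `billing` is an invented second client, used only to show that identical tool names stay distinct once prefixed.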
Resources + prompts (Stage 5.1b)
When an MCP client advertises resources and/or prompts, the adapter additionally emits per-server meta-tools so the agent can read them on demand without any bespoke wiring:
| Meta-tool | Calls | Returns |
|---|---|---|
| mcp_<ns>__list_resources | client.listResources() | [{ uri, name?, description?, mimeType? }, …] |
| mcp_<ns>__read_resource({ uri }) | client.readResource(uri) | resource contents; text parts flattened into the content field for token-efficient LLM context |
| mcp_<ns>__list_prompts | client.listPrompts() | [{ name, description?, arguments? }, …] |
| mcp_<ns>__get_prompt({ name, args? }) | client.getPrompt(name, args) | { description?, messages }; text messages flattened to <role>: <text> |
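To make the `<role>: <text>` flattening in the last row concrete, here is a minimal sketch. The types and `flattenPromptMessages` are hypothetical stand-ins; only the output format comes from the table, and skipping non-text parts is an assumption, since the adapter's treatment of them is not described here.

```typescript
// Hypothetical shapes; only the "<role>: <text>" output format comes from the table.
type PromptContent =
  | { type: 'text'; text: string }
  | { type: 'image'; data: string; mimeType: string }

interface PromptMessage {
  role: 'user' | 'assistant'
  content: PromptContent
}

function flattenPromptMessages(messages: PromptMessage[]): string {
  const lines: string[] = []
  for (const message of messages) {
    // Assumption: non-text parts are skipped in this sketch.
    if (message.content.type === 'text') {
      lines.push(`${message.role}: ${message.content.text}`)
    }
  }
  return lines.join('\n')
}

const flattened = flattenPromptMessages([
  { role: 'user', content: { type: 'text', text: 'Summarize the latest post.' } },
  { role: 'assistant', content: { type: 'text', text: 'Which site?' } },
])
```

Flattening to plain lines keeps the prompt compact in the model's context compared with passing the full message objects.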
All four are opt-out via include on createToolsFromMcpClient(). Meta-tools
are silently skipped when the client doesn't implement the underlying
method (e.g. servers that don't advertise the resources capability).
// Include every primitive (default)
const tools = await createToolsFromMcpClient(client, { namespace: 'content' })
// Tools only, skip the resources/prompts meta-tools
const toolsOnly = await createToolsFromMcpClient(client, {
namespace: 'content',
include: { tools: true, resources: false, prompts: false },
})
Elicitation, progress, cancellation (Stage 5.3)
MCP servers can ask the user for structured input mid-flow
(elicitation/create), report progress on long-running tool calls, and
respect cancellation. @revealui/ai wires all three through consumer
callbacks — the agent package stays UI-agnostic:
import { McpClient } from '@revealui/mcp/client'
import { createElicitationHandler, createToolsFromMcpClient } from '@revealui/ai'
const controller = new AbortController() // wire to a UI cancel button
const client = new McpClient({
clientInfo: { name: 'my-agent', version: '1.0.0' },
transport: { kind: 'streamable-http', url: '…' },
elicitationHandler: createElicitationHandler({
onElicit: async ({ message, requestedSchema }) => {
const form = await showFormDialog({ title: message, schema: requestedSchema })
if (!form) return { action: 'cancel' }
return { action: 'accept', content: form.values }
},
timeoutMs: 60_000, // auto-cancel if no response
// allowUrlMode: false // default — URL-mode auto-declines
}),
})
await client.connect()
const tools = await createToolsFromMcpClient(client, {
namespace: 'content',
signal: controller.signal, // cancel all MCP RPC when aborted
onProgress: (event) => { // per-tool-call progress events
progressBar.update(event.toolName, event.progress.progress, event.progress.total)
},
})
Elicitation safety:
- mode: 'url' (out-of-band consent) is auto-declined unless allowUrlMode: true. URL-mode is a phishing vector when users can't easily verify the target domain.
- timeoutMs auto-cancels (not declines) when no response arrives. Useful in headless automation.
- Errors thrown inside onElicit map to { action: 'cancel' }, so the server sees a clean non-response, not a crashed protocol.
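The three safety rules can be sketched as a small wrapper around a consumer callback. This is not the `createElicitationHandler` implementation; `safeElicit` and its types are hypothetical and exist only to show how the rules could compose.

```typescript
// Hypothetical shapes and wrapper; not the createElicitationHandler implementation.
type ElicitResult =
  | { action: 'accept'; content: Record<string, unknown> }
  | { action: 'decline' }
  | { action: 'cancel' }

interface ElicitRequest {
  mode?: 'form' | 'url'
  message: string
}

interface SafeElicitOptions {
  onElicit: (request: ElicitRequest) => Promise<ElicitResult>
  timeoutMs: number
  allowUrlMode?: boolean
}

function safeElicit(options: SafeElicitOptions) {
  return async (request: ElicitRequest): Promise<ElicitResult> => {
    // Rule 1: URL-mode is declined unless explicitly allowed.
    if (request.mode === 'url' && !options.allowUrlMode) return { action: 'decline' }

    let timer: ReturnType<typeof setTimeout> | undefined
    const timeout = new Promise<ElicitResult>((resolve) => {
      // Rule 2: no answer in time resolves to cancel, not decline.
      timer = setTimeout(() => resolve({ action: 'cancel' }), options.timeoutMs)
    })
    try {
      return await Promise.race([options.onElicit(request), timeout])
    } catch {
      // Rule 3: a throwing callback becomes a clean cancel.
      return { action: 'cancel' }
    } finally {
      // Always clear the timer so it cannot keep the process alive.
      clearTimeout(timer)
    }
  }
}
```

Racing the callback against a timer keeps the decision in one place, so the caller always receives a well-formed result, whichever rule fires.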
Progress events are forwarded from the server's
notifications/progress stream with the namespace + tool name stamped
for multi-client attribution. Resource + prompt meta-tools also emit
(via their own notifications/progress flows).
Cancellation uses the MCP spec's notifications/cancelled — aborting
the AbortSignal propagates to in-flight RPC calls on the wire, so
long-running operations stop instead of just being ignored.
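As a rough picture of how an abort could turn into a wire-level notification, the sketch below records what would be sent. The in-memory `sent` array and `trackRequest` are hypothetical; a real client would write to its MCP connection, and the package's internal wiring may differ. The `notifications/cancelled` method name and its `requestId` and `reason` params follow the MCP specification.

```typescript
// Hypothetical in-memory transport; a real client would write to its MCP connection.
interface JsonRpcNotification {
  jsonrpc: '2.0'
  method: string
  params: Record<string, unknown>
}

const sent: JsonRpcNotification[] = []

function trackRequest(requestId: number, signal: AbortSignal): void {
  // Already aborted: the request would not be sent, so there is nothing to cancel.
  if (signal.aborted) return
  const notifyCancelled = () => {
    sent.push({
      jsonrpc: '2.0',
      method: 'notifications/cancelled',
      params: { requestId, reason: 'aborted by caller' },
    })
  }
  // Notify at most once, when the caller aborts.
  signal.addEventListener('abort', notifyCancelled, { once: true })
}

const abortController = new AbortController()
trackRequest(1, abortController.signal)
trackRequest(2, abortController.signal)
abortController.abort() // e.g. the user pressed a cancel button
```

One signal shared across several in-flight requests yields one cancellation notice per request, which is why a single UI cancel button can stop every pending call.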
Recursive sampling (Stage 5.2)
Some MCP servers need LLM capabilities without bundling a provider. The
spec lets servers issue sampling/createMessage requests — the client
runs the inference, keeps cost + context control, and returns the result.
On the Ubuntu reference stack, that means servers get LLM access via the
developer's local Canonical Inference Snap — no cloud round-trip required.
import { McpClient } from '@revealui/mcp/client'
import { InferenceSnapsProvider, createSamplingHandler } from '@revealui/ai'
const llm = new InferenceSnapsProvider({
baseURL: 'http://localhost:9090/v1',
model: 'gemma3',
})
const client = new McpClient({
clientInfo: { name: 'my-agent', version: '1.0.0' },
transport: { kind: 'streamable-http', url: 'https://example.com/mcp' },
samplingHandler: createSamplingHandler({
llm,
defaultModel: 'gemma3',
allowedModels: ['gemma3', 'deepseek-r1'], // strongly recommended
onSamplingRequest: (info) => {
// metering / audit trail
metrics.incr('mcp.sampling', { model: info.model })
},
}),
})
await client.connect()
allowedModels filters modelPreferences.hints from the server: hints
outside the list are ignored so a malicious server can't escalate costs.
The handler reports the resolved model back in result.model.
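The allowlist behavior can be sketched as a pure function. `resolveModel` is hypothetical, and the exact-match comparison is an assumption; the real handler may match hints more loosely.

```typescript
// Hypothetical resolver; exact-match semantics are an assumption.
interface ModelHint {
  name?: string
}

function resolveModel(
  hints: ModelHint[] | undefined,
  allowedModels: string[],
  defaultModel: string,
): string {
  for (const hint of hints ?? []) {
    // Take the first hint that is on the allowlist, in the server's preference order.
    if (hint.name !== undefined && allowedModels.includes(hint.name)) return hint.name
  }
  // No acceptable hint: fall back to the configured default.
  return defaultModel
}

const allowed = ['gemma3', 'deepseek-r1']
// The first hint is not on the allowlist, so the second one is chosen.
const picked = resolveModel(
  [{ name: 'expensive-cloud-model' }, { name: 'deepseek-r1' }],
  allowed,
  'gemma3',
)
// No hint is acceptable, so the default model is used.
const fallback = resolveModel([{ name: 'expensive-cloud-model' }], allowed, 'gemma3')
```

Ignoring a disallowed hint rather than rejecting the whole request keeps well-behaved servers working while still capping what any server can make the client spend.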
Scope in 5.2: text-only messages (non-text content throws with a clear
error). Multimodal sampling lands with the provider interface's content
parts extension. stopSequences aren't forwarded to the LLM yet (the
current LLMChatOptions doesn't expose that field); this is advisory per
spec, so omission is compliant.
The legacy mcpToolSource path (hypervisor-backed) still works and is
kept for backwards compatibility, but new integrations should prefer the
mcpClients path.
Installation
pnpm add @revealui/ai
Quick Start
import { EpisodicMemory } from '@revealui/ai/memory/stores'
import { NodeIdService } from '@revealui/ai/memory/services'
import { createClient } from '@revealui/db/client'
const db = createClient({ connectionString: process.env.POSTGRES_URL! })
const nodeIdService = new NodeIdService(db)
const nodeId = await nodeIdService.getNodeId('user', 'user-123')
const memory = new EpisodicMemory('user-123', nodeId, db)
Testing
⚠️ Important: This package has known testing limitations. See TESTING.md for details.
Quick Commands
# Unit tests (always work)
pnpm --filter @revealui/ai test
# Integration tests (require Neon instance)
POSTGRES_URL="postgresql://..." pnpm --filter @revealui/ai test __tests__/integration
# Production validation
POSTGRES_URL="postgresql://..." ./scripts/validate-production.sh
Testing Limitations
- ❌ Local PostgreSQL testing not possible (Neon HTTP driver limitation)
- ⚠️ Mock database tests may fail (known limitation, not a bug)
Full documentation: See TESTING.md
Documentation
- TESTING.md: Complete testing guide, limitations, and validation plan
- OBSERVABILITY.md: Observability and monitoring guide
- Source Code: packages/ai/src/memory/
- Helper Functions: packages/ai/src/memory/utils/sql-helpers.ts
API Reference
Memory System
EpisodicMemory
Long-term memory for conversation history and agent memories.
import { EpisodicMemory } from '@revealui/ai/memory/stores'
const memory = new EpisodicMemory(userId, nodeId, db)
await memory.add(agentMemory)
await memory.save()
const memories = await memory.getAll()
NodeIdService
Deterministic node IDs for CRDT operations.
import { NodeIdService } from '@revealui/ai/memory/services'
const nodeIdService = new NodeIdService(db)
const nodeId = await nodeIdService.getNodeId('user', 'user-123')
CRDTPersistence
Generic adapter for saving/loading CRDT state.
import { CRDTPersistence } from '@revealui/ai/memory/persistence'
const persistence = new CRDTPersistence(db)
await persistence.saveCRDTState(crdtId, 'lww_register', data)
const state = await persistence.loadCRDTState(crdtId, 'lww_register')
LLM Integration
Provider abstractions and unified client for Anthropic, GROQ, and Ollama.
import { LLMClient, createLLMClientFromEnv } from '@revealui/ai/llm/client'
const client = createLLMClientFromEnv()
const response = await client.chat([
{ role: 'user', content: 'Hello!' }
])
Agent Orchestration
Agent runtime and execution engine for autonomous agents.
import { AgentRuntime } from '@revealui/ai/orchestration/runtime'
const runtime = new AgentRuntime()
Tools
Tool registry and execution system with MCP integration.
import { ToolRegistry } from '@revealui/ai/tools/registry'
const registry = new ToolRegistry()
Performance Considerations
The memory system uses deep cloning to ensure immutability and prevent data corruption. Understanding the cloning strategy is important for performance optimization.
Cloning Layers
LWWRegister Level (Core CRDT)
- get(): Clones object/array values on every call
- set(): Clones values when storing
- merge(): Clones winning values during merge
- toData(): Clones values when serializing
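A minimal last-writer-wins register shows where those four clones sit. This is an illustration, not the package's LWWRegister: `SketchLWWRegister`, the use of `structuredClone`, and the tie-break on node ID are all assumptions.

```typescript
// Minimal last-writer-wins register; an illustration, not the package's LWWRegister.
interface LWWData<T> {
  value: T
  timestamp: number
  nodeId: string
}

class SketchLWWRegister<T> {
  private state: LWWData<T>

  constructor(value: T, timestamp: number, nodeId: string) {
    this.state = { value: structuredClone(value), timestamp, nodeId }
  }

  // Clone on read so callers cannot mutate internal state.
  get(): T {
    return structuredClone(this.state.value)
  }

  // Clone on write so later mutation of the argument has no effect.
  set(value: T, timestamp: number): void {
    this.state = { value: structuredClone(value), timestamp, nodeId: this.state.nodeId }
  }

  // Higher timestamp wins; ties break on nodeId so every replica agrees.
  merge(other: LWWData<T>): void {
    const wins =
      other.timestamp > this.state.timestamp ||
      (other.timestamp === this.state.timestamp && other.nodeId > this.state.nodeId)
    if (wins) this.state = structuredClone(other)
  }

  // Clone on serialize so the snapshot is detached from the register.
  toData(): LWWData<T> {
    return structuredClone(this.state)
  }
}

const register = new SketchLWWRegister({ theme: 'dark' }, 1, 'node-a')
const snapshot = register.get()
snapshot.theme = 'light' // mutates only the clone
const themeAfterMutation = register.get().theme // still 'dark'
register.merge({ value: { theme: 'blue' }, timestamp: 2, nodeId: 'node-b' })
const themeAfterMerge = register.get().theme // 'blue': the newer write won
```

The cost is visible in the sketch: every `get()` pays for a full deep copy, which is what the guidance on large contexts later in this section is about.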
WorkingMemory Level
- getContext(): Returns cloned context from LWWRegister
- getContextValue(): Returns value from cloned context (no additional cloning)
- setContext(): Clones entire context object
- updateContext(): Clones context once for multiple updates
AgentContextManager Level
- getContext(): Returns value from cloned context (no additional cloning)
- getAllContext(): Returns cloned context from WorkingMemory
- setContext(): Validates then sets (cloning happens in WorkingMemory)
- updateContext(): Validates then updates (cloning happens in WorkingMemory)
Performance Implications
✅ Efficient Operations
- Single key access: getContext(key) - No double cloning
- Multiple updates: updateContext({ k1: v1, k2: v2 }) - Single clone
- Primitive values: No cloning overhead
⚠️ Performance Considerations
- Large contexts: Every getContext() clones the entire context
- Frequent updates: Each update clones the context
- Deep nesting: Deep cloning is recursive and can be slow for very deep objects
Best Practices
1. Batch Updates: Use updateContext() for multiple changes instead of multiple setContext() calls
// ❌ Bad: Clones context 3 times
manager.setContext('key1', 'value1')
manager.setContext('key2', 'value2')
manager.setContext('key3', 'value3')
// ✅ Good: Clones context once
manager.updateContext({
key1: 'value1',
key2: 'value2',
key3: 'value3',
})
2. Cache Context: If you need to access multiple values, get the full context once
// ❌ Bad: Clones context multiple times
const value1 = manager.getContext('key1')
const value2 = manager.getContext('key2')
const value3 = manager.getContext('key3')
// ✅ Good: Clone once, access multiple times
const context = manager.getAllContext()
const value1 = context.key1
const value2 = context.key2
const value3 = context.key3
3. Avoid Deep Nesting: Keep context structure relatively flat
// ❌ Bad: Very deep nesting
context: {
user: {
profile: {
settings: {
theme: {
color: 'dark'
}
}
}
}
}
// ✅ Good: Flatter structure
context: {
'user.profile.settings.theme.color': 'dark'
}
4. Use Primitives When Possible: Primitives don't require cloning
// ✅ Good: Primitives are fast
manager.setContext('count', 42)
manager.setContext('name', 'John')
// ⚠️ Consider: Objects require cloning
manager.setContext('user', { name: 'John', age: 30 })
Size Limits
The system enforces limits to prevent performance issues:
- Max Context Keys: 10,000 keys
- Max Context Size: ~10MB (approximate)
- Max Object Depth: 100 levels
These limits prevent:
- Memory exhaustion
- Stack overflow from deep recursion
- Performance degradation from huge objects
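If you want to check a context against these limits before handing it to the memory system, a pre-flight check could look like the sketch below. `checkContextLimits` and `depthOf` are hypothetical; the three constants are the values listed above, the size is approximated by serialized UTF-8 length, and counting the top-level context as depth 1 is an assumption.

```typescript
// Hypothetical validator; the three limits are the values listed above.
const MAX_KEYS = 10_000
const MAX_BYTES = 10 * 1024 * 1024
const MAX_DEPTH = 100

// Nesting depth of objects/arrays; primitives count as 0. The budget stops the
// recursion shortly after the limit is exceeded, so the result is capped at
// (initial budget + 1).
function depthOf(value: unknown, budget: number): number {
  if (value === null || typeof value !== 'object') return 0
  if (budget <= 0) return 1
  let deepest = 0
  for (const child of Object.values(value as Record<string, unknown>)) {
    deepest = Math.max(deepest, depthOf(child, budget - 1))
  }
  return deepest + 1
}

function checkContextLimits(context: Record<string, unknown>): string[] {
  const problems: string[] = []
  const keyCount = Object.keys(context).length
  if (keyCount > MAX_KEYS) problems.push(`too many keys: ${keyCount}`)
  if (depthOf(context, MAX_DEPTH) > MAX_DEPTH) {
    problems.push(`nesting deeper than ${MAX_DEPTH} levels`)
  }
  // Serialized UTF-8 length is only an approximation of in-memory size.
  const bytes = new TextEncoder().encode(JSON.stringify(context)).length
  if (bytes > MAX_BYTES) problems.push(`context too large: ${bytes} bytes`)
  return problems
}

const smallContextProblems = checkContextLimits({ count: 42, name: 'John' })
```

An empty array means the context is within all three limits; otherwise each entry names the limit that was exceeded.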
Monitoring Performance
If you experience performance issues:
- Profile Context Size: Check how large your contexts are
- Monitor Clone Operations: Count how many times contexts are cloned
- Check Depth: Ensure objects aren't too deeply nested
- Review Update Patterns: Look for opportunities to batch updates
Future Optimizations
Potential optimizations (not yet implemented):
- Lazy Cloning: Only clone when values are actually accessed
- Structural Sharing: Share unchanged parts of objects
- Caching: Cache cloned values for frequently accessed keys
- Incremental Updates: Only clone changed parts of context
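To show what structural sharing means in practice, here is a small illustration. It is not how the package works today (the list above is explicitly unimplemented), and `withUpdate` is a hypothetical name.

```typescript
// Illustration of structural sharing only; the package does not implement this today.
type SharedContext = Readonly<Record<string, unknown>>

function withUpdate(previous: SharedContext, key: string, value: unknown): SharedContext {
  // New top-level object; every untouched entry keeps its existing reference.
  return { ...previous, [key]: value }
}

const previousContext: SharedContext = { user: { name: 'John' }, count: 1 }
const nextContext = withUpdate(previousContext, 'count', 2)
// nextContext.user is the very same object as previousContext.user: nothing was deep-cloned.
```

The trade-off is that sharing is only safe when stored values are treated as immutable, which is the guarantee deep cloning currently provides.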
Requirements
- Node.js 24.13.0 or higher
- PostgreSQL with pgvector extension
- Neon Postgres (for production) or compatible database
License
FSL-1.1-MIT (Fair Source — converts to MIT after 2 years). See LICENSE.
Last Updated: 2026-03-04
Consolidated: 2026-01-31 (Merged PERFORMANCE.md into this README)
