@t2a/core
v0.6.2
Talk-to-Action SDK core: single-session three-role (user/assistant/system_event) conversation kernel with zero runtime deps.
@t2a/core · Talk-to-Action Conversation Kernel
What is t2a-core?
A TypeScript SDK that models LLM conversations as a group chat between Human, AI, and Systems.
Traditional chat: User asks → AI answers (maybe calls a tool). t2a-core: One user, one AI, N systems — all first-class participants in a shared session timeline.
User (operator)
↕
LLM (coordinator)
↕ ↕ ↕
System A System B System C
 (ERP)      (CRM)      (WMS)

The operator speaks naturally. The AI translates intent into system operations. Each system pushes results back as events. The AI synthesizes and responds. No tab-switching, no form-filling, no polling.
Core design:
- Three-role model — user / assistant / system_event as first-class message roles
- Async-by-event — systems push events into the session anytime; AI reacts without waiting for user input
- Stream interruption — user interrupts AI mid-sentence; partial output is persisted, context preserved
- Self-managing context — built-in /compact, overflow detection, and sanity interludes
- Zero runtime dependencies — pure interfaces for Storage, LLM, and Transport
Target scenario:
Single user + Multiple systems + Single LLM — the next interaction paradigm for complex systems.
Users shouldn't need to know how many systems exist behind an app, or learn each system's UI. They express intent in one conversation; the AI coordinates everything.
- Consumer user: "Book tomorrow's train to Shanghai, then call a car to the hotel" → train API + ride-hailing + hotel system
- Operations staff: "Refund order #12345, adjust inventory, notify the customer" → ERP + WMS + CRM
- Enterprise user: "Summarize this quarter's sales, compare with last year, draft a report" → BI + docs + email
Same model: one human, one AI, N systems — one session.
Why a "group chat" kernel?
Existing frameworks already handle tool use well — user speaks, AI calls tools, tools return results. That's a solved problem.
What they can't do: systems pushing events to the AI without user initiation.
━━ Tool Use (every framework does this) ━━━━━━━━━━━━━━━━━━━━━━━━━
User: "Check my delivery status" ← user initiates
AI → logistics API → result → AI replies ← tool is passive
━━ System Event (only t2a-core) ━━━━━━━━━━━━━━━━━━━━━━━━━━━
Logistics system: "Package delivered" ← system initiates
AI → User: "Your package just arrived! ← AI reacts without user input
             Want me to schedule a pickup?"

This is the group chat paradigm: systems are active participants, not passive tools. They can speak whenever they have something to say.
Real-world examples of system-initiated events:
- 📦 Delivery arrived → AI proactively notifies user
- 💰 Payment confirmed → AI starts order processing
- ⚠️ Inventory alert → AI warns operations staff
- 🖼️ Image generation done → AI delivers the result
- 📅 Calendar reminder → AI nudges user about upcoming event
None of these require the user to say anything first.
| | t2a-core | Typical frameworks |
|---|---|---|
| Mental model | Group chat (user + AI + N systems) | 1:1 chat (user ↔ AI, tools are sub-calls) |
| External events | pushSystemEvent() — any system injects events anytime, AI reacts | Must poll, or user sends another message |
| Three-role model | user / assistant / system_event as first-class roles | Events crammed into user or system |
| Async task writeback | Tool fires async task → result flows back as system_event → AI responds | Synchronous tool calls block the conversation |
| Stream interruption | Partial output persisted with interrupted=true, LLM sees where it stopped | Abort = lost context |
| Built-in compression | /compact + session.compact(), LLM summarizes history automatically | DIY |
| Sanity events | long_wait / overflow_warning / overflow_hit with friendly interludes | No UX for slow tools or token limits |
| Runtime deps | 0 | Transitive dependency trees |
| Vendor lock-in | All interfaces — swap Storage/LLM/Transport freely | Often tied to specific providers |
How is this different from LangGraph?
LangGraph is a workflow orchestration framework (DAG + checkpoints). It's the closest project in terms of ambition:
| | t2a-core | LangGraph |
|---|---|---|
| Mental model | Group chat timeline | Workflow graph (nodes + edges) |
| Interrupt | session.interrupt() — cuts LLM stream, partial output persisted, next turn continues naturally | interrupt(value) — throws exception, pauses graph; resume replays the node function from the start |
| Resume semantics | True conversation continuation — LLM sees its partial output | Function replay + value injection — same code re-runs, interrupt() returns the resume value |
| External events | pushSystemEvent() — first-class, any system pushes anytime | update_state() + manual resume — essentially "mutate state then kick" |
| Token management | Built-in /compact + overflow detection | None built-in |
| Language | TypeScript, 0 deps | Python-first (JS version exists but weaker ecosystem) |
| Best for | Conversational agents coordinating multiple systems | Multi-step approval workflows, DAG orchestration |
In one line: LangGraph orchestrates agents as workflow graphs. t2a-core drives agents as group chat participants.
How is this different from pi-agent-core?
pi-agent-core is a lightweight single-session agent driver (prompt → tool calls → response). It's excellent at what it does, but:
| | t2a-core | pi-agent-core |
|---|---|---|
| Persistence | Storage interface built-in — messages survive restarts | Purely in-memory, session dies when process ends |
| Async events | system_event role + pushSystemEvent() — external systems inject events that trigger agent responses | No mechanism for external event injection |
| Interruption | session.interrupt() — partial content saved, context preserved | steer() / followUp() but no partial persistence |
| Token management | /compact + overflow detection + automatic interludes | transformContext (manual pruning only) |
| Message model | Three roles with degradation strategy (system_event → user prefix at LLM boundary) | Standard two-role (user/assistant) |
| Multi-session coordination | Designed for it — structured artifacts, event-driven handoffs | Single session only, no inter-session protocol |
In short: pi-agent-core is a conversation driver. t2a-core is a conversation kernel with state, events, and lifecycle management.
Core Capabilities
1. Three-Role Message Model
user → what the human says
assistant → what the AI says
system_event → what the world tells the agent

System events (task completion, webhook triggers, sensor readings) are stored with their own role and source metadata. They only get "downgraded" to user-role messages at the LLM API boundary.
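A rough sketch of the message shapes this implies (the field names here are illustrative assumptions, not the SDK's exact types):

// Illustrative shapes only; consult the package typings for the real ones.
type T2AMessage =
  | { role: 'user'; content: string }
  | { role: 'assistant'; content: string; interrupted?: boolean }
  | { role: 'system_event'; source: string; content: string };

// At the LLM API boundary, a system_event is degraded to a user-role message
// (for example with a source prefix) so any chat-completion API can consume it:
// { role: 'user', content: '[logistics] Package delivered' }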
2. Async-by-Event Pattern
Tools don't have to block until completion. A tool handler can:
- Start an async operation (image generation, API call, long computation)
- Return immediately with an acknowledgment
- When done, call session.pushSystemEvent() — the agent picks up and responds
The user doesn't have to send another message for the agent to react to completed work.
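A minimal sketch of the pattern, reusing the ToolRegistry and pushSystemEvent() calls shown elsewhere in this README. It assumes the tools and session objects from the Quick Start; startImageJob and onImageJobDone are placeholders for your own async backend:

tools.register({
  schema: {
    name: 'generate_image',
    description: 'Start an asynchronous image generation job',
    parameters: {
      type: 'object',
      properties: { prompt: { type: 'string' } },
      required: ['prompt'],
    },
  },
  handler: async (args) => {
    // Kick off the slow work (placeholder for your own job runner).
    const taskId = await startImageJob(args.prompt);

    // When the job finishes, push the result into the session as a system_event;
    // the agent picks it up and responds without a new user message.
    onImageJobDone(taskId, (url) => {
      session.pushSystemEvent({
        source: 'image_generator',
        payload: { taskId, url },
        triggerAgent: true,
      });
    });

    // Return immediately with an acknowledgment.
    return { ok: true, data: { taskId, status: 'started' } };
  },
});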
3. Stream Interruption & Resume
When session.interrupt() is called:
- Current LLM stream aborts
- Partial content is persisted with interrupted: true
- Next turn, the LLM sees its own partial output and naturally continues or pivots
No context is lost. No awkward "sorry, where was I?" behavior.
4. AgentLoop
Full tool-calling loop with:
- Streaming token delivery
- Parallel and serial tool execution
- Configurable max iterations
- Interruptible at any point
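For example, combining the session events from the Quick Start below with interrupt() (the 10-second cutoff is purely illustrative):

session.on('text', ({ delta }) => process.stdout.write(delta));
session.on('tool_start', ({ name }) => console.log(`\n[tool] ${name}`));
session.on('done', () => console.log('\n[done]'));

// Kick off a turn that may involve several tool calls...
const turn = session.sendUserMessage('Reconcile all open orders and notify the owners');

// ...and cut the loop off at any point; partial output is persisted (see section 3).
setTimeout(() => session.interrupt(), 10_000);
await turn;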
5. History Compression
await session.compact({ keepLastN: 10 });
// or user types: /compact

The SDK calls the LLM to summarize older messages and replaces them with a compact_summary system event. The context window stays healthy.
How compact actually works
- Load full history for the session from storage.
- Split: messages older than keepLastN → toCompact; the latest keepLastN → kept.
  - If history.length <= keepLastN, emits system_notice { code: 'compact_nothing_to_do' } and returns.
- Summarize: build a single text blob from toCompact (user / assistant / tool / system_event lines), send it to the LLM with compact.summarizerSystemPrompt (overridable via config).
- Soft-delete the old range via Storage.replaceRange():
  - Sets deleted_at = now on every message in toCompact (no rows are physically deleted).
  - Inserts one new message of role system_event with source = 'compact_summary' and the LLM-produced summary as content.
- Persist a notice: a notice { type: 'compact_done', payload: { compactedCount } } is also written to storage so admins/UI can see when each compaction happened.
- Emit events on the bus:
  - compact_start { messageCount } before the LLM call
  - compact_done { ... } after success
  - system_notice { code: 'compact_failed' } on summarizer failure (original history is left intact)
  - maybeInterlude('on_compact_start' | 'on_compact_done') runs so users see something while the summarizer thinks.
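A small sketch of driving and observing a compaction through the events above (payload fields beyond the ones listed are assumptions):

session.on('compact_start', ({ messageCount }) =>
  console.log(`[compact] summarizing ${messageCount} messages...`));
session.on('compact_done', (info) => console.log('[compact] done', info));
session.on('system_notice', ({ code }) => {
  if (code === 'compact_failed') console.warn('[compact] summarizer failed, history left intact');
  if (code === 'compact_nothing_to_do') console.log('[compact] nothing to do');
});

await session.compact({ keepLastN: 10 });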
Implications for downstream consumers
- Auditability: Old messages are retained with deleted_at set — history is reversible/inspectable, never lost.
- Active-history queries (the ones the LLM sees, and any "current" UI views) should always filter WHERE deleted_at IS NULL. The bundled SQLiteStorage.loadMessages() does this for you.
- Message counts shrink after compact. Example: 34 messages, compact({ keepLastN: 20 }) →
  - 14 messages get deleted_at = now
  - 1 new compact_summary message is inserted
  - Active count becomes 20 + 1 = 21
  - Total row count in the table is still 35 (deleted rows are kept).
- Admin / analytics dashboards that want to show active conversation length must use WHERE deleted_at IS NULL. To show lifetime message volume, drop that filter.
- Storage requirement: any custom Storage implementation MUST provide replaceRange() semantics that soft-delete a range and atomically append the summary message. session.compact() throws if Storage.replaceRange is missing.
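As an illustration, a dashboard that respects the soft-delete rule might query the bundled SQLite storage like this (table and column names are assumptions based on the description above; see SCHEMA.md for the authoritative DDL):

import Database from 'better-sqlite3';

const db = new Database('./t2a.db');

// Active conversation length (what the LLM sees): exclude soft-deleted rows.
const active = db
  .prepare('SELECT COUNT(*) AS n FROM messages WHERE session_id = ? AND deleted_at IS NULL')
  .get('demo-001') as { n: number };

// Lifetime message volume: keep soft-deleted rows in the count.
const total = db
  .prepare('SELECT COUNT(*) AS n FROM messages WHERE session_id = ?')
  .get('demo-001') as { n: number };

console.log(`active=${active.n}, total=${total.n}`); // e.g. active=21, total=35 after the compact above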
When compact runs automatically
- User types the compactCommand (default /compact) — intercepted before LLM dispatch.
- onOverflow: 'summarize' is configured and the context window is hit — Session calls compact() automatically before resuming the turn (see onOverflow modes: truncate / summarize / reject).
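A configuration sketch using the option names mentioned above; the exact nesting of the summarizer prompt override is an assumption, and myStorage / myLLMClient / tools are the placeholders from the Quick Start:

import { Session } from '@t2a/core';

const session = new Session({
  sessionId: 'ops-007',
  storage: myStorage,
  llm: myLLMClient,
  tools,
  systemPrompt: 'You are an operations copilot.',
  compactCommand: '/compact',   // slash command intercepted before LLM dispatch (default shown)
  onOverflow: 'summarize',      // 'truncate' | 'summarize' | 'reject'
  compact: {
    // Assumed location of the overridable summarizer prompt.
    summarizerSystemPrompt: 'Summarize the conversation, keeping decisions and open tasks.',
  },
});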
6. Sanity Events & Interludes
- long_wait — tool running too long? SDK emits event, default interludes give the user friendly feedback
- overflow_warning — approaching token limit
- overflow_hit — hard limit reached, triggers compact or rejection
Built-in "interludes" (human-friendly messages) with 7 tone buckets. Override with your own.
7. Multi-LLM Fallback (v0.4)
Provide multiple LLM clients — if the primary times out or errors, the SDK automatically switches to the next one:
import { OpenAILLMClient } from '@t2a/core';
const session = new Session({
sessionId: 'demo-001',
storage: myStorage,
tools,
systemPrompt: 'You are a helpful assistant.',
// Multiple providers — SDK tries in order
llm: [
new OpenAILLMClient({ baseUrl: 'https://mimo-api.com/v1', apiKey: 'key1' }),
new OpenAILLMClient({ baseUrl: 'https://api.deepseek.com/v1', apiKey: 'key2' }),
new OpenAILLMClient({ baseUrl: 'https://api.openai.com/v1', apiKey: 'key3' }),
],
model: ['mimo-v2.5-pro', 'deepseek-v4', 'gpt-4o'],
// Fallback behavior
llmFallback: {
timeoutMs: 15000, // 15s per client before switching
maxRetries: 1, // no retry, switch immediately
},
});
// Know when fallback happens
session.on('llm_fallback', ({ fromIndex, toIndex, model, error }) => {
console.log(`Provider ${fromIndex} failed: ${error.message}, switching to ${model}`);
});
// Know when all providers are down
session.on('llm_exhausted', ({ errors }) => {
console.error('All LLM providers failed:', errors.map(e => e.message));
});

Timeout behavior: The timer starts when the request is sent. Once the first chunk arrives (stream started), the timeout is cancelled — a slow but streaming response won't be killed.
Single client still works: llm: singleClient (no array) behaves exactly as before. Fallback config is simply ignored.
Quick Start
import { Session, ToolRegistry } from '@t2a/core';
const tools = new ToolRegistry();
tools.register({
schema: {
name: 'get_weather',
description: 'Get weather for a city',
parameters: {
type: 'object',
properties: { city: { type: 'string' } },
required: ['city'],
},
},
handler: async (args) => ({ ok: true, data: { city: args.city, temp: 22 } }),
});
const session = new Session({
sessionId: 'demo-001',
storage: myStorage, // implement Storage interface
llm: myLLMClient, // implement LLMClient interface
tools,
systemPrompt: 'You are a helpful assistant.',
});
// Listen to events
session.on('text', ({ delta }) => process.stdout.write(delta));
session.on('tool_start', ({ name }) => console.log(`[tool] ${name}`));
session.on('done', () => console.log('\n[done]'));
await session.sendUserMessage('What is the weather in Tokyo?');

Async Event Injection
// An external system completed a task — push it into the conversation
session.pushSystemEvent({
source: 'image_generator',
payload: { taskId: 42, url: 'https://example.com/result.png' },
triggerAgent: true, // agent will respond to this event
});

Interruption
// User sends a new message while AI is still responding
session.interrupt();
// partial output is saved, next sendUserMessage() continues naturally
await session.sendUserMessage('Actually, never mind. Tell me about...');

Reference Implementations (v0.3.0)
The SDK ships with optional reference implementations:
- OpenAILLMClient — works with any OpenAI-compatible API (GPT, DeepSeek, Kimi, GLM, MiMo, etc.), with optional parseReasoning for o1/o3 reasoning tokens
- ClaudeLLMClient — Anthropic Messages API native, with extended thinking and multi-modal normalizer
- GeminiLLMClient — Google Gemini REST API native, with thinking support and cumulative SSE delta extraction
- SQLiteStorage — better-sqlite3 based, configurable table names
These are provided as starting points. For production, implement the interfaces to match your stack.
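For example, wiring the bundled reference implementations into a Session might look like this (the SQLiteStorage constructor options shown are assumptions; check the package typings for the exact shape):

import { Session, SQLiteStorage, OpenAILLMClient } from '@t2a/core';

// Constructor options here (database path, table name) are illustrative.
const storage = new SQLiteStorage({ path: './t2a.db', tableName: 'messages' });

const llm = new OpenAILLMClient({
  baseUrl: 'https://api.openai.com/v1',
  apiKey: process.env.OPENAI_API_KEY ?? '',
  model: 'gpt-4o',
});

const session = new Session({
  sessionId: 'prod-001',
  storage,
  llm,
  tools, // ToolRegistry from the Quick Start above
  systemPrompt: 'You are a helpful assistant.',
});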
Multi-Vendor LLM Clients (v0.5.0)
import { ClaudeLLMClient, GeminiLLMClient, OpenAILLMClient } from '@t2a/core';
// OpenAI / compatible vendors
const openai = new OpenAILLMClient({
baseUrl: 'https://api.openai.com/v1',
apiKey: 'sk-...',
model: 'gpt-4o',
parseReasoning: true, // parse o1/o3 reasoning tokens
});
// Claude native
const claude = new ClaudeLLMClient({
baseUrl: 'https://api.anthropic.com/v1',
apiKey: 'sk-ant-...',
model: 'claude-sonnet-4-20250514',
thinking: { type: 'enabled', budgetTokens: 10000 },
});
// Gemini native
const gemini = new GeminiLLMClient({
baseUrl: 'https://generativelanguage.googleapis.com/v1beta',
apiKey: 'AIza...',
model: 'gemini-2.5-pro',
thinking: { includeThoughts: true, thinkingBudget: 8000 },
});

Architecture
┌─────────────────────────────────────────────────┐
│ Session │
│ ┌─────────────┐ ┌───────────┐ ┌──────────┐ │
│ │ AgentLoop │ │ EventBus │ │ Interlude│ │
│ └──────┬──────┘ └───────────┘ └──────────┘ │
│ │ │
│ ┌──────▼──────┐ ┌───────────┐ ┌──────────┐ │
│ │ToolRegistry │ │ Storage* │ │LLMClient*│ │
│ └─────────────┘ └───────────┘ └──────────┘ │
└─────────────────────────────────────────────────┘
* = you provide these

Use Cases
Primary: Conversational Interface to Complex Systems
Replace "learn N different UIs" with one natural-language session:
User: "帮我订明天去上海的高铁,到了之后叫个车去酒店"
AI → Train API: search & book → [system_event: ticket_booked]
AI → Ride-hailing: schedule pickup → [system_event: ride_scheduled]
AI → Hotel: confirm reservation → [system_event: hotel_confirmed]
AI: "已订明早 8:30 G1234 次高铁,12:05 到上海虹桥站。
已预约接站专车,送往和平饶店(已确认入住)。"Each system reports back asynchronously. The AI synthesizes all results. The user never leaves the chat.
Works for everyone:
- Consumer users — book travel, manage subscriptions, cross-app workflows through one conversation
- Operations staff — refunds + inventory + notifications across ERP/CRM/WMS in one command
- Enterprise users — pull data from BI, draft reports, send emails — all in one session
- AI generation tools — image/video generation takes 60-120s; results push back when ready
- IoT / device control — sensor events interrupt or inform ongoing conversations
Documentation
- DESIGN.md — Full design document (architecture, decisions, trade-offs)
- SCHEMA.md — Database schema (MySQL / SQLite DDL)
- CHANGELOG.md — Version history
- ROADMAP.md — Version roadmap
Status
- 154 tests across 16 test files
- Line coverage ≥ 92%, branch coverage ≥ 78%
- tsc --noEmit + tsup build (ESM + CJS + .d.ts) passing
- Zero breaking changes across v0.1 → v0.5
- 🧠 Multi-vendor native LLM clients — OpenAI, Claude, Gemini with thinking/reasoning support
Roadmap
| Version | Content | Status |
|---|---|---|
| v0.1.0 | Core SDK — Session, AgentLoop, ToolRegistry, EventBus, Interlude | ✅ |
| v0.2.0 | Stream interruption, /compact, long_wait, overflow sanity | ✅ |
| v0.3.0 | OpenAILLMClient + SQLiteStorage reference impls, buildLLMMessages enhancements | ✅ |
| v0.4.0 | Overflow strategies (truncate/summarize), Transport interface, Multi-LLM fallback | ✅ |
| v0.5.0 | Claude/Gemini native LLMClient, multi-modal normalizer, thinking support | ✅ |
Install
npm install @t2a/core

License
MIT © 2026 kyo
