@archships/dim-agent-sdk

Version: v0.0.75
Published: 2 months ago

An agent-first TypeScript SDK with provider adapters, sessions, hooks, plugins, and runtime gateways.

Vulnerabilities: 0 high, 0 medium, 0 low
Maintainers: eric8810, zerob13, cluademini
Keywords: agent, ai, llm, plugins, sdk, tools

An agent-first TypeScript SDK with canonical multi-provider contracts, session/tool loop, hook-based plugins, runtime gateways, and builtin local coding tools.

Install

Recommended runtime: Node.js >=18.

npm install @archships/dim-agent-sdk

Quick start

import { createAgent, createModel, createOpenAIAdapter } from '@archships/dim-agent-sdk'

const model = createModel(
  createOpenAIAdapter({
    apiKey: process.env.OPENAI_API_KEY,
    baseUrl: 'https://api.openai.com/v1',
    defaultModel: 'gpt-4o-mini',
  }),
)

const agent = createAgent({
  model,
  cwd: process.cwd(),
})

const session = await agent.createSession({
  systemPrompt: 'You are a coding agent. Use tools when needed.',
})

const itemId = session.send(
  'Create hello.txt in the current directory and write hello world into it.',
)

for await (const event of session.receive()) {
  if (event.itemId !== itemId) continue
  if (event.type === 'text_delta') process.stdout.write(event.delta)
  if (event.type === 'done') console.log(event.message.content)
}

When a session has tools available, the runtime also injects an internal tool-call JSON contract into the effective system prompt. Models are told to emit one complete JSON object per tool call. If a streamed tool call still arrives with malformed pre-execution JSON, the SDK now surfaces it as a failed recoverable tool_result, stores that failed result in session history when the tool name is known, and keeps the run alive instead of failing the whole item immediately.

The SDK also owns transcript tool-call consistency. Snapshot restore normalizes legacy/OpenAI-shaped assistant.tool_calls and tool.tool_call_id / call_id / callId fields, preserves assistant messages whose only meaningful content is toolCalls, repairs contiguous orphan tool results by backfilling synthetic assistant tool calls, and compensates missing tool results with recovered failed tool messages. These recovered results mark unknown historical failures; the real tool execution state remains unknown.

Before provider dispatch, unrepairable tool history raises InvalidToolMessageHistoryError locally with message index, ids, and request/session/turn metadata, so OpenAI-compatible providers do not receive malformed messages.
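The repair rules above can be pictured with a small standalone sketch. It is not the SDK's implementation: the message shapes, the function name repairToolHistory, and the recovered flag are hypothetical simplifications, and the real restore path also normalizes legacy field names, which is omitted here.

```typescript
// Hypothetical, simplified message shapes for illustration only.
type ToolCall = { id: string; name: string }
type ToolMessage = {
  role: 'tool'
  toolCallId: string
  name?: string
  content: string
  recovered?: boolean
}
type ChatMessage =
  | { role: 'user'; content: string }
  | { role: 'assistant'; content: string; toolCalls?: ToolCall[] }
  | ToolMessage

function repairToolHistory(messages: ChatMessage[]): ChatMessage[] {
  const out: ChatMessage[] = []
  // Calls announced by the latest assistant message that still lack a result.
  const pending = new Map<string, ToolCall>()

  // Compensate every unanswered call with a recovered failed tool message.
  const flushPending = (): void => {
    pending.forEach((call) => {
      out.push({
        role: 'tool',
        toolCallId: call.id,
        name: call.name,
        content: 'Recovered: tool result missing; real execution state unknown.',
        recovered: true,
      })
    })
    pending.clear()
  }

  let i = 0
  while (i < messages.length) {
    const msg = messages[i]
    if (msg.role === 'tool' && pending.has(msg.toolCallId)) {
      // Normal case: the result answers a pending call.
      pending.delete(msg.toolCallId)
      out.push(msg)
      i++
    } else if (msg.role === 'tool') {
      // Orphan run: contiguous results that no assistant message announced.
      flushPending()
      const orphans: ToolMessage[] = []
      while (i < messages.length) {
        const next = messages[i]
        if (next.role !== 'tool') break
        orphans.push(next)
        i++
      }
      // Backfill one synthetic assistant message that "calls" every orphan.
      out.push({
        role: 'assistant',
        content: '',
        toolCalls: orphans.map((o) => ({ id: o.toolCallId, name: o.name ?? 'unknown' })),
      })
      out.push(...orphans)
    } else {
      flushPending()
      out.push(msg)
      if (msg.role === 'assistant' && msg.toolCalls) {
        for (const call of msg.toolCalls) pending.set(call.id, call)
      }
      i++
    }
  }
  flushPending()
  return out
}
```

In this sketch every tool message ends up preceded by an assistant message that references its id, and every announced call ends up with a result, which is the invariant OpenAI-compatible providers generally expect.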

Usage ledger

Session.usage remains the compatibility mirror of total session usage. The canonical source now lives in usageLedger, with request-, turn-, and session-level views.

const itemId = session.send('Summarize the latest tool results.')

for await (const event of session.receive()) {
  if (event.itemId !== itemId) continue
  if (event.type === 'turn_usage') {
    console.log(event.summary.totalUsage.totalTokens)
  }
}

console.log(session.getUsageSummary().totalUsage.totalTokens)
console.log(session.getTurnUsage(itemId)?.breakdown)
console.log(session.listRequestUsages({ kind: 'auto_compaction' }))

Direct gateway calls can also opt into the same ledger:

for await (const event of agent.services.model.stream(request, {
  usage: {
    sessionId: session.id,
    kind: 'manual_compaction',
  },
})) {
  console.log(event.type)
}
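To make the three views concrete, here is a minimal standalone model of a ledger that derives turn- and session-level totals from request-level records. It is a sketch under assumptions: the RequestUsage shape, the MiniUsageLedger class, and the 'chat' kind are hypothetical, and only the method names and the 'auto_compaction' kind mirror what this README shows.

```typescript
// Hypothetical request-level record; the SDK's real ledger entries carry more detail.
type RequestUsage = {
  itemId: string // the queued item (turn) this request belongs to
  kind: string // 'auto_compaction' appears in this README; 'chat' is made up here
  inputTokens: number
  outputTokens: number
}

type UsageTotals = { inputTokens: number; outputTokens: number; totalTokens: number }

// Fold a list of request records into one totals object.
function sumUsage(records: RequestUsage[]): UsageTotals {
  let inputTokens = 0
  let outputTokens = 0
  for (const record of records) {
    inputTokens += record.inputTokens
    outputTokens += record.outputTokens
  }
  return { inputTokens, outputTokens, totalTokens: inputTokens + outputTokens }
}

class MiniUsageLedger {
  private readonly records: RequestUsage[] = []

  record(usage: RequestUsage): void {
    this.records.push(usage)
  }

  // Request-level view, optionally filtered by kind.
  listRequestUsages(filter: { kind?: string } = {}): RequestUsage[] {
    return this.records.filter((r) => filter.kind === undefined || r.kind === filter.kind)
  }

  // Turn-level view: every request recorded for one item.
  getTurnUsage(itemId: string): UsageTotals | undefined {
    const forTurn = this.records.filter((r) => r.itemId === itemId)
    return forTurn.length > 0 ? sumUsage(forTurn) : undefined
  }

  // Session-level view: everything, including compaction requests.
  getUsageSummary(): UsageTotals {
    return sumUsage(this.records)
  }
}
```

Keeping request records as the single source and computing the other views from them is one way to avoid reconstructing totals from per-message usage fields.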

Persistence

SessionSnapshot stays as the full export/restore shape, including usageLedger. Built-in persistence now stores main session state and usage ledger separately:

FileStateStore writes <encodeURIComponent(sessionId)>.json for the main snapshot
FileStateStore writes <encodeURIComponent(sessionId)>.usage.json for the session usage ledger
transformSnapshot only rewrites the main snapshot path
load() hydrates the sidecar ledger back into the returned SessionSnapshot

Custom stores now implement the split persistence contract:

import type { StateStoreSaveInput } from '@archships/dim-agent-sdk'

async function save(input: StateStoreSaveInput) {
  console.log(input.snapshot.sessionId)
  console.log(input.usageLedger.summary.totalUsage.totalTokens)
}
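As an illustration of the split layout, the sketch below writes a main snapshot file and a usage sidecar with the file names described above, then hydrates the sidecar on load. The types and the saveSplit / loadSplit helpers are hypothetical stand-ins, not the SDK's store interface, and synchronous file APIs are used only to keep the example short.

```typescript
import { existsSync, mkdirSync, mkdtempSync, readFileSync, writeFileSync } from 'node:fs'
import { tmpdir } from 'node:os'
import { join } from 'node:path'

// Hypothetical minimal shapes; the real SessionSnapshot and ledger are richer.
type SnapshotLike = { sessionId: string; messages: unknown[] }
type UsageLedgerLike = { summary: { totalUsage: { totalTokens: number } } }
type SaveInputLike = { snapshot: SnapshotLike; usageLedger: UsageLedgerLike }
type RestoredSnapshot = SnapshotLike & { usageLedger?: UsageLedgerLike }

// File names follow the <encodeURIComponent(sessionId)> convention from the text.
const snapshotPath = (dir: string, sessionId: string): string =>
  join(dir, `${encodeURIComponent(sessionId)}.json`)
const usagePath = (dir: string, sessionId: string): string =>
  join(dir, `${encodeURIComponent(sessionId)}.usage.json`)

function saveSplit(dir: string, input: SaveInputLike): void {
  mkdirSync(dir, { recursive: true })
  const id = input.snapshot.sessionId
  writeFileSync(snapshotPath(dir, id), JSON.stringify(input.snapshot, null, 2))
  writeFileSync(usagePath(dir, id), JSON.stringify(input.usageLedger, null, 2))
}

function loadSplit(dir: string, sessionId: string): RestoredSnapshot | undefined {
  const main = snapshotPath(dir, sessionId)
  if (!existsSync(main)) return undefined
  const snapshot = JSON.parse(readFileSync(main, 'utf8')) as SnapshotLike
  const sidecar = usagePath(dir, sessionId)
  if (!existsSync(sidecar)) return snapshot
  // Hydrate the sidecar ledger back into the returned snapshot.
  const usageLedger = JSON.parse(readFileSync(sidecar, 'utf8')) as UsageLedgerLike
  return { ...snapshot, usageLedger }
}

// Usage: write to a throwaway directory and read the snapshot back.
const dir = mkdtempSync(join(tmpdir(), 'dim-store-'))
saveSplit(dir, {
  snapshot: { sessionId: 'user/42', messages: [] },
  usageLedger: { summary: { totalUsage: { totalTokens: 1234 } } },
})
const restored = loadSplit(dir, 'user/42')
console.log(restored?.usageLedger?.summary.totalUsage.totalTokens)
```

Encoding the session id keeps characters such as the slash in user/42 from being interpreted as a directory separator.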

Observability JSONL logging

Agent-level observability records SDK public APIs, state transitions, hooks, model requests, provider and plugin network requests, tool calls, usage finalization, persistence, approvals, notifications, services, and process subagent IPC as JSONL. Each queued user item carries traceId and turnId, and operation rows use paired start plus success or error phases.

import path from 'node:path'
import { createAgent, createModel, createOpenAIAdapter } from '@archships/dim-agent-sdk'

const model = createModel(
  createOpenAIAdapter({
    apiKey: process.env.OPENAI_API_KEY,
    baseUrl: 'https://api.openai.com/v1',
    defaultModel: 'gpt-4o-mini',
  }),
)

const agent = createAgent({
  model,
  cwd: process.cwd(),
  observability: {
    logFilePath: path.join(process.cwd(), 'logs/dim-sdk.jsonl'),
    sanitizer: {
      maxStringLength: 4000,
      maxArrayItems: 128,
    },
    streamBodies: {
      enabled: true,
      directory: path.join(process.cwd(), 'logs/streams'),
    },
  },
})

agent.setObservabilityEnabled(false, { reason: 'maintenance window' })
agent.setObservabilityEnabled(true, { reason: 'resume diagnostics' })

The first row is sdk.meta, including SDK/package, Node/process, OS, and allowlisted environment metadata. Model streams emit two rows by default: model.request start with stream metadata, then model.request success / error with stream metadata plus a summary.

api.session.receive.event keeps streaming delta event boundaries while replacing delta text with deltaLength by default. Set streamEvents.detail: true to add sanitized model.event rows and keep receive-event delta payloads.

Adapter-level debug.logFilePath remains available for provider-only compatibility logs; new host integrations should prefer createAgent({ observability }).
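Because operation rows come in start plus success / error pairs, a host can scan a log for operations that never finished. The following is a rough sketch, not a documented schema: the key names name and phase are assumptions made for illustration, and only traceId, the sdk.meta and model.request row names, and the three phases come from the description above.

```typescript
// Assumed row shape: `name` and `phase` are hypothetical key names.
type LogRow = {
  name: string
  phase?: 'start' | 'success' | 'error'
  traceId?: string
}

// Parse JSONL text: one JSON object per non-empty line.
function parseJsonl(text: string): LogRow[] {
  return text
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as LogRow)
}

// Returns, per "traceId:name" key, how many operations started without
// a matching success or error row.
function findUnfinished(rows: LogRow[]): Map<string, number> {
  const open = new Map<string, number>()
  for (const row of rows) {
    if (!row.phase) continue // rows such as sdk.meta have no phase
    const key = `${row.traceId ?? '-'}:${row.name}`
    const next = (open.get(key) ?? 0) + (row.phase === 'start' ? 1 : -1)
    if (next <= 0) open.delete(key)
    else open.set(key, next)
  }
  return open
}
```

A host could run this over logs/dim-sdk.jsonl after a crash to see which operation was in flight, provided the real rows expose comparable fields.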

Runtime inspection

The SDK exposes a read-only inspection service at agent.services.inspect for host diagnostics.

Manual

Developer manual: docs/manual/README.md
Chinese guide: docs/manual/zh/README.md
English guide: docs/manual/en/README.md
SDK API reference (zh): docs/manual/zh/sdk-api-reference.md
Plugin API reference (zh): docs/manual/zh/plugin-api-reference.md
SDK API reference (en): docs/manual/en/sdk-api-reference.md
Plugin API reference (en): docs/manual/en/plugin-api-reference.md
Host integration cookbook: start in docs/manual/en/sdk.md or docs/manual/zh/sdk.md

Included core capabilities

Canonical content / message / tool / model / state contracts
Canonical message content aligned to AI SDK V3 prompt parts: text and file
createAgent() -> Agent -> Session
Queue-first session flow: send(), sendBatch(), steer(), receive(), getQueueStatus()
Long-lived host hygiene: Session.dispose() releases session-scoped runtime registrations and controller-owned resources when a session is no longer needed
Session events: text_delta, optional thinking_delta, tool_call_start, tool_call_args_delta, tool_call_end, tool_call, plugin_event, recoverable and regular tool_result, usage_recorded, turn_usage, done, error
Provider adapters: openai-compatible, openai-responses, anthropic, gemini, zenmux, aihubmix, aihubmix-responses, moonshotai, deepseek, xai, xai-responses
Builtin tools: read, write, edit, exec
Builtin read returns UTF-8 text as line-numbered windows with full-file revision tracking, locally paginates oversized text windows with continuation notices instead of using the SDK-global <persisted-output> path, and can also project common raster images into synthetic user file context when the active model advertises image input support
Hook-first plugin integration
Host-layer ordered subagents with SubagentOrchestrator, InProcessSessionSubagentExecutor, ProcessSessionSubagentExecutor, and SubagentProcessRegistry
Runtime gateways: file system, git, exec, network, model
Host inspection: agent.services.inspect for agent/session/persistence snapshots
Namespaced plugin session state with snapshot restore support
In-memory and file-based persistence

Hook support

Supported public hooks in the current runtime:

run.start
tool.beforeExecute
tool.afterExecute
context.compact.before
notify.message
run.stop
run.end
session.error

Reserved / experimental hook names that are typed but not wired into the runtime yet:

subagent.stop

Current failure policy:

Sync middleware is blocking and fail-fast
Observers are best-effort
mode: 'async' is observer-only
timeoutMs applies per hook handler
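The policy above can be read as the following standalone sketch. It is an interpretation, not the runtime's dispatcher: dispatchHook, withTimeout, and the Handler type are hypothetical, and running observers concurrently is an assumption this README does not confirm.

```typescript
type Handler<T> = (payload: T) => void | Promise<void>

// Runs one handler with its own timeout, so timeoutMs applies per handler.
async function withTimeout(
  run: () => void | Promise<void>,
  timeoutMs: number,
  label: string,
): Promise<void> {
  let timer: ReturnType<typeof setTimeout> | undefined
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${timeoutMs}ms`)), timeoutMs)
  })
  try {
    // Promise.resolve().then(run) also turns synchronous throws into rejections.
    await Promise.race([Promise.resolve().then(run), timeout])
  } finally {
    clearTimeout(timer)
  }
}

// Resolves with the list of observer failure messages.
// Rejects as soon as any middleware handler fails.
async function dispatchHook<T>(
  payload: T,
  middleware: Handler<T>[],
  observers: Handler<T>[],
  timeoutMs: number,
): Promise<string[]> {
  // Middleware: sequential, blocking, fail-fast.
  for (let i = 0; i < middleware.length; i++) {
    await withTimeout(() => middleware[i](payload), timeoutMs, `middleware[${i}]`)
  }
  // Observers: best-effort; failures are collected, never rethrown.
  const outcomes = await Promise.all(
    observers.map((observer, i) =>
      withTimeout(() => observer(payload), timeoutMs, `observer[${i}]`).then(
        () => null,
        (error: unknown) => String(error),
      ),
    ),
  )
  return outcomes.filter((message): message is string => message !== null)
}
```

The practical consequence is that a failing middleware handler can veto an operation, while a failing or slow observer can only produce a diagnostic.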

Official plugin packages

| Package | Support level | Notes |
| --- | --- | --- |
| @archships/dim-plugin-auto-compact | supported | Official auto compaction plugin; requires compaction.compactorPluginId: 'auto-compact' |
| @archships/dim-plugin-grep-glob | supported | Registers grep and glob filesystem tools with relative and absolute path overrides |
| @archships/dim-plugin-mcp-client | supported | Controller-driven MCP client for session-scoped server connections and tool injection; use @archships/dim-agent-sdk >= 0.0.23 and @archships/dim-plugin-api >= 0.0.9; host-only prompt/context/tool injection remains compatible |
| @archships/dim-plugin-skills | supported | Metadata-first file-backed skills plugin with <skills_instructions> catalog metadata, skill: name trigger guidance, a model-callable skill loader tool, and a catalog session controller |
| @archships/dim-plugin-plan-mode | supported | Session-scoped planning guardrail with host-data-backed drafts, restricted exec, and plan_read / plan_write; use @archships/dim-agent-sdk >= 0.0.23 and @archships/dim-plugin-api >= 0.0.9 |
| @archships/dim-plugin-memory | experimental | Placeholder package; not part of the current supported surface yet |
| @archships/dim-plugin-web | experimental | Placeholder package; not part of the current supported surface yet |
| @archships/dim-plugin-scheduler | experimental | Placeholder package; not part of the current supported surface yet |
| @archships/dim-plugin-research-mode | experimental | Evidence-first research guardrail with a session controller, staged prompt injection, research state, and delegate_tasks guidance |

Compaction and state model

session.messages always keeps the full original history for UI and restore
systemPrompt is text-only; user messages can carry mixed text + file content
Canonical request projection is controlled by SDK compaction state: cursor, systemSegments, checkpoints
Once compaction is active, the request projection realigns cursor to a real user boundary, injects the retained anchor user first, and replays systemSegments as a synthetic summary user message instead of a tail system message
SubagentProcessRuntimeProfile.compaction configures compaction for child-owned process sessions; profiles that set compactorPluginId should also install the matching compactor plugin
process subagents inherit parent runtime model capabilities only when the child process factory returns the same provider / model id and leaves that capability field unset; different-model profiles should expose image input and context-window support from their own model factory options
Session.getStatus() returns the canonical read-only session status snapshot used by hook runtime context
Session.getUsageSummary(), getUsageLedger(), getTurnUsage(), listTurnUsages(), and listRequestUsages() expose semantic usage queries without reconstructing totals from done.usage
Session.getPlugin(pluginId) returns a session-scoped plugin controller when the plugin exposes one
agent.services.inspect exposes read-only agent config, live session state, and persisted snapshot summaries without leaking mutable runtime internals
Session.dispose() unregisters the session from compaction and plugin-state services, and also disposes session-scoped plugin controllers such as mcp-client
Plugins can persist their own namespaced session state through pluginState
If compaction.compactorPluginId is configured, only that plugin can write canonical compaction through plugin services
Session.compact() remains available as an app-level override
run.end keeps final message rewrite support; usage rewrite is retired and hooks now observe usageSummary
buildCompactionBudget(options, estimatedInputTokens, { contextWindow, plannedOutput }) and calculateThresholdTokens(options, targets) are exported for hosts that want to compute the same threshold = contextWindow − max(effectiveO, contextWindow × safetyRatio) the SDK uses (safetyRatio defaults to DEFAULT_COMPACTION_SAFETY_RATIO = 0.2; effectiveO caps plannedOutput and falls back to the safety floor when it claims more than 60% of the window)
Session.getCurrentContextSize(), Session.ensureCurrentContextSize(), Session.countTokens(request), and Session.getCompactionBudget({ plannedOutput? }) let hosts observe the SDK's own post-hook budgeting without rebuilding the math
isContextWindowExceededError(error) is a provider-agnostic classifier for context-window-exceeded errors surfaced by ModelGateway.stream
If threshold compaction still cannot fit the next request, the runtime can do one bounded recovery pass that rewrites only the newest oversized result payloads (tool outputs and subagent parent-commit packages) into short overflow summaries before finally raising context_compaction_required
Hook handlers receive the same canonical status through context.status, without exposing full message history or other plugins' state
Session.getStatus().capabilities mirrors runtime model input capabilities, and builtin read uses that signal to decide whether a local image should stay metadata-only or be injected as synthetic user image context; same-model process subagents inherit missing capabilities for this path
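The threshold formula quoted in the list above can be worked through with a few numbers. This is a standalone re-derivation, not the exported buildCompactionBudget or calculateThresholdTokens; the function name thresholdTokens is hypothetical, and the handling of an oversized plannedOutput is one reading of "falls back to the safety floor".

```typescript
const DEFAULT_SAFETY_RATIO = 0.2 // same value as DEFAULT_COMPACTION_SAFETY_RATIO in the text
const OUTPUT_SHARE_CAP = 0.6 // the "more than 60% of the window" rule in the text

function thresholdTokens(
  contextWindow: number,
  plannedOutput: number,
  safetyRatio: number = DEFAULT_SAFETY_RATIO,
): number {
  const safetyFloor = contextWindow * safetyRatio
  // Assumption: a plannedOutput above 60% of the window is replaced by the safety floor.
  const effectiveO =
    plannedOutput > contextWindow * OUTPUT_SHARE_CAP ? safetyFloor : plannedOutput
  // threshold = contextWindow - max(effectiveO, contextWindow * safetyRatio)
  return contextWindow - Math.max(effectiveO, safetyFloor)
}

// Small planned output: the 20% safety floor dominates.
console.log(thresholdTokens(128000, 4000)) // 102400
// Planned output above the floor but under the cap: it reserves its own room.
console.log(thresholdTokens(128000, 40000)) // 88000
// Planned output above 60% of the window: treated as the safety floor.
console.log(thresholdTokens(128000, 100000)) // 102400
```

With the default ratio, compaction in this model therefore triggers no later than 80% of the context window.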

Provider notes

createOpenAIAdapter(): OpenAI-compatible Chat Completions style; maps reasoning_content / reasoning into thinking_delta and replays assistant thinking back upstream
createOpenAIResponsesAdapter(): official OpenAI Responses API; maps reasoning summaries into thinking_delta, sends full canonical history by default, only reuses previousResponseId when usePreviousResponseId: true, and accepts request-level providerOptions.openai transport fields for prompt_cache_key and custom headers
createAnthropicAdapter(): maps Claude thinking blocks into thinking_delta; when overriding baseUrl, the adapter appends /v1 if it is missing; Anthropic-compatible routes now auto-inject explicit prompt caching unless cache.mode is set to 'off'
createGeminiAdapter(): maps Gemini thought parts into thinking_delta
createZenMuxAdapter(): ZenMux adapter; routes anthropic/* models through https://zenmux.ai/api/anthropic/v1/messages and sends every other model through the official ZenMux OpenAI-compatible endpoint at https://zenmux.ai/api/v1/chat/completions. baseUrl is kept for compatibility but ignored at runtime.
createAihubmixAdapter() / createAihubmixResponsesAdapter(): AIHubMix chat + responses adapters; chat forwards reasoning effort into AIHubMix-compatible request bodies, and the responses variant sends full canonical history by default, only reuses previousResponseId when usePreviousResponseId: true, and honors custom fetch overrides for transport/auth wiring. If you omit apiKey, you must inject the real upstream authentication inside custom fetch.
createMoonshotAIAdapter(): MoonshotAI language model adapter with thinking budget mapping
createDeepSeekAdapter(): DeepSeek chat adapter with reasoning -> thinking_delta
createXaiAdapter() / createXaiResponsesAdapter(): xAI chat + responses adapters; the responses variant now sends full canonical history by default and only reuses previousResponseId when usePreviousResponseId: true
Builtin exec now publishes a flat top-level object schema and rejects action-incompatible fields as request errors. Anthropic-compatible adapters remove top-level oneOf / anyOf / allOf during transport projection; the OpenAI-compatible and Responses families (OpenAI-compatible, OpenAI Responses, ZenMux non-Anthropic, AIHubMix chat / responses, xAI chat / responses, DeepSeek, and MoonshotAI) also remove top-level enum / not; Gemini keeps the original schema unchanged.
Builtin exec foreground commands wait up to 5 seconds before moving still-running commands to background managed-task messaging and completion notifications.
Official adapters use realtime upstream streaming by default. Set adapter streamMode: 'buffered' or request streamMode: 'buffered' to replay a completed response instead.
Session runtime fills missing maxOutputTokens with 4000 after model.request middleware runs. Direct model.stream() calls still pass through whatever the caller provides.
The SDK no longer injects implicit workspace context into model requests; if the model needs file or Git state, let it call tools explicitly.
Builtin read returns line-numbered, revision-tracked text windows for UTF-8 files, while png / jpg / jpeg / gif / webp use fileSystem.readBytes() and, when the active model advertises capabilities.modalities.input: ['image'], append a synthetic user file message after the tool receipt so providers receive a normal image input instead of tool-result media. In process subagents, identical child provider / model ids inherit missing parent capabilities before this decision runs.
Builtin providers are also available from public subpaths such as @archships/dim-agent-sdk/providers/openai and @archships/dim-agent-sdk/providers/xai-responses
Published packages ship these public entrypoints directly; consumers should not import dist/src/* or patch node_modules after install
Other deep internal imports are not part of the public API, even though package-local tests use path aliases against src/*
Custom providers can follow the same factory pattern via @archships/dim-agent-sdk/providers/core; implement stream() for realtime deltas, generate() for buffered results, or both
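The per-family schema projection described in the notes above can be sketched as a top-level key filter. This is an illustrative reading rather than adapter code: projectToolSchema and the ProviderFamily names are hypothetical, and the sketch assumes the OpenAI-style families strip enum / not in addition to the three combinators.

```typescript
type JsonSchema = Record<string, unknown>
type ProviderFamily = 'anthropic' | 'openai' | 'gemini'

// Top-level keywords removed per family, as read from the provider notes.
const STRIPPED_KEYS: Record<ProviderFamily, string[]> = {
  anthropic: ['oneOf', 'anyOf', 'allOf'],
  openai: ['oneOf', 'anyOf', 'allOf', 'enum', 'not'],
  gemini: [], // schema passes through unchanged
}

function projectToolSchema(schema: JsonSchema, family: ProviderFamily): JsonSchema {
  // Shallow copy: only top-level keys are touched, nested schemas stay as-is.
  const projected: JsonSchema = { ...schema }
  for (const key of STRIPPED_KEYS[family]) {
    delete projected[key]
  }
  return projected
}

// Usage with a made-up tool schema.
const toolSchema: JsonSchema = {
  type: 'object',
  properties: { action: { type: 'string', enum: ['run', 'stop'] } },
  oneOf: [{ required: ['action'] }],
  not: { required: ['legacy'] },
}
console.log(Object.keys(projectToolSchema(toolSchema, 'anthropic'))) // [ 'type', 'properties', 'not' ]
console.log(Object.keys(projectToolSchema(toolSchema, 'openai'))) // [ 'type', 'properties' ]
```

Nested keywords, such as the enum inside properties.action, survive the projection because only the top level is filtered.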

OpenAI Responses request transport:

providerOptions.openai.promptCacheKey maps to the /responses body field prompt_cache_key
providerOptions.openai.requestHeaders merge into the outgoing HTTP headers for that request, so gateways can receive values such as session_id

import {
  createModel,
  createOpenAIResponsesAdapter,
  type OpenAIResponsesRequestProviderOptions,
} from '@archships/dim-agent-sdk'

const model = createModel(
  createOpenAIResponsesAdapter({
    apiKey: process.env.OPENAI_API_KEY,
    defaultModel: 'o4-mini',
  }),
)

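// runtimePromptCacheKey and runtimeSessionId are host-supplied strings, and
// messages (used below) is the canonical message array for this request.
const requestProviderOptions = {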
  promptCacheKey: runtimePromptCacheKey,
  requestHeaders: {
    session_id: runtimeSessionId,
  },
} satisfies OpenAIResponsesRequestProviderOptions

for await (const event of model.stream({
  messages,
  providerOptions: {
    openai: requestProviderOptions,
  },
})) {
  console.log(event.type)
}

Anthropic-compatible prompt caching defaults:

cache.mode defaults to 'auto'
cache.ttl defaults to '5m'
cache: { mode: 'off' } disables SDK-managed cache_control injection
explicit providerOptions.anthropic.cacheControl on a message, content block, or tool definition is forwarded unchanged and disables auto injection for that request
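These defaults amount to a small decision rule, sketched below. The function resolveCachePlan and its types are hypothetical and only model the two modes and the default TTL named above; how the adapter actually detects explicit cacheControl in a request is not shown.

```typescript
type CacheConfig = { mode?: 'auto' | 'off'; ttl?: string }
type CachePlan = { autoInject: boolean; ttl: string }

function resolveCachePlan(
  cache: CacheConfig | undefined,
  requestHasExplicitCacheControl: boolean,
): CachePlan {
  const mode = cache?.mode ?? 'auto' // cache.mode defaults to 'auto'
  const ttl = cache?.ttl ?? '5m' // cache.ttl defaults to '5m'
  // Explicit cacheControl anywhere in the request disables auto injection for that request.
  return { autoInject: mode === 'auto' && !requestHasExplicitCacheControl, ttl }
}

console.log(resolveCachePlan(undefined, false)) // { autoInject: true, ttl: '5m' }
console.log(resolveCachePlan({ mode: 'off' }, false)) // { autoInject: false, ttl: '5m' }
console.log(resolveCachePlan({ ttl: '1h' }, true)) // { autoInject: false, ttl: '1h' }
```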

Custom providers start from the providers/core entrypoint:

import { createProviderFactory } from '@archships/dim-agent-sdk/providers/core'

Demo

Provider-free demos:

pnpm run demo:host: approval + notification host control-plane walkthrough
pnpm run demo:persistence: FileStateStore + session.save() + restoreSession() walkthrough
pnpm run demo:plan-mode: official session-scoped plugin controller + hostDataDir walkthrough
pnpm run demo:compaction: scripted compaction runtime walkthrough
pnpm run demo:auto-compact: scripted official auto compact plugin demo
pnpm run demo:subagents: scripted ordered subagent walkthrough for in-process and child-owned process modes

Provider-backed demos:

pnpm run demo:openai: builtin tools smoke demo
pnpm run demo:hooks: Hook v2 scenario runner

Repo demo files:

packages/dim-agent-sdk/demo/host-control-plane-scripted.ts
packages/dim-agent-sdk/demo/persistence-scripted.ts
packages/dim-agent-sdk/demo/plan-mode-plugin.ts
packages/dim-agent-sdk/demo/compaction-scripted.ts
packages/dim-agent-sdk/demo/auto-compact-plugin.ts
packages/dim-agent-sdk/demo/subagents-scripted.ts
packages/dim-agent-sdk/demo/openai-tools.ts
packages/dim-agent-sdk/demo/openai-hooks.ts

demo:hooks currently runs these provider-backed scenarios:

lifecycle
approval-deny
synthetic-result
notification-control
stop-finalize

Provider-backed demos use createOpenAIAdapter() against the OpenAI-compatible Chat API and require DIM_TEST_API_KEY, DIM_TEST_BASE_URL, and optional DIM_TEST_MODEL_ID.

Plan mode v2 notes:

plan drafts live under <hostDataDir>/plans/<sessionId>/plan.md
host applications drive plan mode through session.getPlugin('plan-mode')
plan_write is the only writable artifact exposed to the model while plan mode is active
enable() / disable() changes affect the next run, not the current in-flight run
agent.deleteSession(sessionId) removes both the persisted snapshot and the matching plan draft directory

Testing

Local workspace development expects Node.js 20.19+ or 22.12+ because repository verification uses official Vite 7 + Vitest + Oxlint.

Local repository verification is split into three layers:

pnpm run test: full local regression, including deterministic test/e2e/*.e2e.test.ts
pnpm run test:e2e: deterministic end-to-end workflows for plan-mode, code-agent tool loops, auto-compact restore, and approval / permission boundaries
pnpm run test:plugins: focused plugin contract and integration tests
pnpm run test:smoke: env-gated provider smoke for provider-tools, provider-hooks, provider-plan-mode, and provider-subagent-process; excludes long-task
pnpm run test:smoke:providers: focused multi-provider builtin-tool smoke in test/smoke/provider-tools.smoke.ts
pnpm run test:smoke:long-task: default manual OpenAI-compatible long-task smoke; it currently aliases the guides case without subagents
pnpm run test:smoke:long-task:subagents: the same default guides case with host-owned delegation plus child-owned process subagents enabled through DIM_TEST_LONG_TASK_SUBAGENTS=1
pnpm run test:smoke:long-task:guides / pnpm run test:smoke:long-task:guides:subagents: explicit beginner-guide generation case for modelinfo-cli
pnpm run test:smoke:long-task:rust-port / pnpm run test:smoke:long-task:rust-port:subagents: review-oriented Rust port case for the same reference repo

The provider smoke layer reuses:

DIM_TEST_API_KEY
DIM_TEST_BASE_URL
DIM_TEST_BASE_URL_ANTHROPIC (optional; enables the Anthropic smoke lane)
DIM_TEST_MODEL_ID (optional)
DIM_TEST_LONG_TASK_CASE (optional; guides by default, also supports rust-port)
DIM_TEST_LONG_TASK_SUBAGENTS (optional; only 1 enables the subagent mode for the long-task smoke)
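The variables listed above could be read by a smoke harness roughly as follows. This is a hypothetical helper, not code from the repository: readSmokeEnv, the SmokeEnv shape, and the choice to throw on an unknown case name are all assumptions.

```typescript
type LongTaskCase = 'guides' | 'rust-port'

type SmokeEnv = {
  apiKey: string
  baseUrl: string
  anthropicBaseUrl?: string
  modelId?: string
  longTaskCase: LongTaskCase
  subagents: boolean
}

// guides is the default; rust-port is the only other supported case.
function parseLongTaskCase(raw: string | undefined): LongTaskCase {
  if (!raw || raw === 'guides') return 'guides'
  if (raw === 'rust-port') return 'rust-port'
  throw new Error(`Unsupported DIM_TEST_LONG_TASK_CASE: ${raw}`)
}

// Returns undefined when the required pair is missing, so env-gated tests can skip.
function readSmokeEnv(env: Record<string, string | undefined>): SmokeEnv | undefined {
  const apiKey = env['DIM_TEST_API_KEY']
  const baseUrl = env['DIM_TEST_BASE_URL']
  if (!apiKey || !baseUrl) return undefined
  const config: SmokeEnv = {
    apiKey,
    baseUrl,
    longTaskCase: parseLongTaskCase(env['DIM_TEST_LONG_TASK_CASE']),
    // Only the exact value 1 enables subagent mode.
    subagents: env['DIM_TEST_LONG_TASK_SUBAGENTS'] === '1',
  }
  const anthropicBaseUrl = env['DIM_TEST_BASE_URL_ANTHROPIC']
  if (anthropicBaseUrl) config.anthropicBaseUrl = anthropicBaseUrl // enables the Anthropic lane
  const modelId = env['DIM_TEST_MODEL_ID']
  if (modelId) config.modelId = modelId
  return config
}
```

A test file would typically call readSmokeEnv(process.env) once and skip the provider lanes when it returns undefined.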

The focused multi-provider tools smoke loads repo-root .env.smoke automatically. Start from .env.smoke.example, fill only the providers you want to run, and use per-provider keys such as DIM_TEST_OPENAI_API_KEY, DIM_TEST_OPENAI_MODEL_ID, and optional DIM_TEST_OPENAI_BASE_URL. ZenMux is the exception: DIM_TEST_ZENMUX_BASE_URL is ignored because createZenMuxAdapter() now pins the official ZenMux endpoints, and you can optionally add DIM_TEST_ZENMUX_ANTHROPIC_MODEL_ID to create a second ZenMux smoke lane that exercises the Anthropic-compatible route. That smoke keeps its temporary workdirs under os.tmpdir()/dim-sdk-provider-smoke/<provider-id>/run-* and removes each run directory after the test finishes.

Smoke tests assert stable invariants such as tool calls, plugin_event, notifications, and on-disk side effects. They intentionally avoid exact natural-language output matching unless the smoke is explicitly about artifact structure.

The long-task smoke is tuned for diagnosis as much as pass/fail: it runs only through the OpenAI-compatible adapter, reuses a generic harness plus named cases, and preserves both the final workspace and debug logs for inspection. The default guides case generates beginner-friendly Markdown guides for every tracked file; the rust-port case asks the model to translate the same reference repo into a reviewable Rust cargo project.

Every run now prints and writes a unified summary with host rounds, model turns, tool-call count, total tokens, cache/token detail breakdowns when the provider exposes them, delegation/process-batch counts, and case-specific stats. It also writes a full-fidelity session timeline array to logs/long-task-session.json, alongside the staged debug logs in logs/long-task-smoke.jsonl, so parent and delegated child context can be replayed during debugging. Those staged diagnostics now include explicit compaction_notification and compaction_state_after_hook entries, which makes it much easier to see whether auto compact triggered and why it still failed to bring the request back under budget.

Ordered subagents remain opt-in through pnpm run test:smoke:long-task:subagents or DIM_TEST_LONG_TASK_SUBAGENTS=1.
