@xeno-corporation/xeno-agent-sdk

v0.6.4

Published

5 days ago

XENO Agent SDK built from the same engine as XENO CODE


XENO Agent SDK

Core runtime for XENO agent-enabled products. It provides the agent loop, tool execution, permissions, sessions, memory, audit logging, and provider integration used across the XENO platform.

Vision

Every XENO creative app (Pixel, Motion, Sound) has an AI agent embedded directly into the interface. Users open a sidebar, type what they want ("remove the background from this layer", "cut the silence from this podcast", "match-cut these two clips"), and the agent translates that into tool calls against the app's engine. One request can span multiple apps: "Create a product video from these photos with background music" triggers coordinated work across Pixel, Motion, and Sound simultaneously.

This is what makes XENO fundamentally different from Adobe. No other creative suite has embedded agent orchestration.

Without agent SDK:  User manually opens Pixel, edits, opens Motion, edits, opens Sound, edits
With agent SDK:     User says "Create a product video" -> agents in all 3 apps coordinate automatically

Current State

7,394 lines of TypeScript, v0.1.0
10 subsystems: core loop, tools, security, session, identity, memory, audit, config, utils, types
6 built-in tools: Read, Write, Edit, Glob, Grep, Bash
Provider-agnostic LLM integration (Xeno API, Ollama, any OpenAI-compatible endpoint)
4 permission modes: default, acceptEdits, bypassPermissions, plan
4-level identity hierarchy: global, project, role, session
4-level memory system with auto-capture (conversation, session, project, global)
Session persistence with checkpoints, transcript writing, and crash recovery
Audit logging in JSON-lines format (who, what, when, result)
Delegation system: planner/executor/reviewer sub-agent workflows
ESM-only, TypeScript 5.7.3, tsup build
Dependencies: chalk, glob, gray-matter

Automation

The SDK repo now carries the same production automation discipline as the CLI:

.github/workflows/ci.yml -- cross-platform build verification plus full Linux runtime checks
.github/workflows/certification.yml -- scheduled production/stress verification and large-repo coding benchmarks
.github/workflows/release.yml -- npm Trusted Publishing with provenance, checksums, and GitHub release artifacts

Architecture

xeno-agent-sdk/
src/
  core/             Agent loop engine (provider-agnostic LLM interface, delegation, reducer)
  tools/            Tool registry with permission checks and built-in tools (6)
  security/         Permission engine (4 modes), path safety, approval flows
  session/          Session persistence, checkpoints, transcript, lock management
  identity/         4-level identity hierarchy (global/project/role/session)
  memory/           Memory manager (conversation -> session -> project -> global)
  audit/            JSON-lines audit logger (every tool call, every decision)
  config/           Model configuration, API endpoints, defaults
  utils/            Shared utilities
  types.ts          All shared TypeScript types (exported for consumers)
  create-agent.ts   Main entry point: createXenoAgent()
  permission-engine.ts   Audit-backed permission engine factory
  prompt-context.ts      System prompt and context assembly
  session-runtime.ts     Session lifecycle management
  delegated-agent.ts     Sub-agent creation for delegation workflows
  delegated-turn.ts      Single-turn delegated execution

10 Subsystems

| Subsystem | Purpose | Key exports |
|-----------|---------|-------------|
| Core Loop | Agentic turn loop with tool dispatch, streaming, reducer | AgentLoop, AgentLoopConfig |
| Tools | Registry pattern for tool registration + 6 built-in tools | ToolRegistry, registry |
| Security | Permission engine with 4 modes, path sandboxing | PermissionEngine, PermissionConfig |
| Session | Persistence, checkpoints, transcripts, locks, recovery | SessionManager, CheckpointManager |
| Identity | Persona loading and resolution across 4 levels | IdentityLoader, IdentityResolver |
| Memory | Hierarchical memory with budget management | MemoryManager, MemoryBudget |
| Audit | JSON-lines logging for every action | AuditLogger |
| Config | Model settings, API endpoints, defaults | XENO_API_BASE, DEFAULT_MODEL |
| Delegation | Planner/executor/reviewer sub-agent workflows | DelegatedAgent, DelegatedTurn |
| Types | All shared TypeScript types | ExecutionMode, ResolvedIdentity, etc. |

How Apps Integrate

The primary entry point is createXenoAgent(). Each app registers its own operations as tools, and the SDK handles everything else: the agentic loop, permission checks, audit logging, memory, sessions.

import { createXenoAgent } from '@xeno-corporation/xeno-agent-sdk'

// Each app defines its domain-specific tools.
// xenoLib, engine, and exporter below are the host app's own modules, not SDK exports.
const pixelTools = {
  'layer.remove-bg': {
    description: 'Remove background from the active layer',
    execute: async (params) => {
      const result = await xenoLib.rmbg(params.layerId)   // xeno-lib AI model
      await engine.applyMask(params.layerId, result.mask)  // app engine
      return { success: true, layerId: params.layerId }
    },
    confirm: true,  // requires user approval
  },
  'brush.draw': {
    description: 'Draw a stroke on the active layer',
    execute: async (params) => engine.drawStroke(params),
    confirm: false,  // safe, no confirmation needed
  },
  'file.export': {
    description: 'Export the current document',
    execute: async (params) => exporter.save(params),
    confirm: true,
    destructive: false,
  },
}

const agent = await createXenoAgent({
  toolRegistry: pixelTools,
  model: 'claude-sonnet-4-20250514',
  permissionConfig: { mode: 'default' },
  // identity, memory, session all auto-configured
})

End-to-End Example

User types in Pixel's agent sidebar: "Remove the background from this layer"

1. Agent receives natural language input
2. LLM decides to call tool: layer.remove-bg({ layerId: 'layer-3' })
3. Permission engine checks: confirm=true -> prompts user for approval
4. User approves -> tool executes:
     a. Calls xeno-lib's RMBG model (Rust, ONNX, GPU-accelerated)
     b. Receives alpha mask
     c. Applies mask to layer in Pixel's rendering engine
5. Audit logger records: who=user, what=layer.remove-bg, when=timestamp, result=success
6. Agent responds: "Done. Background removed from Layer 3."
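
The approval, execution, and audit steps above (3 to 5) can be sketched as a single function. This is an illustration under stated assumptions, not the SDK's actual loop: `ToolDef`, `AuditRecord`, `runToolCall`, and the `approve` callback are hypothetical names, and the LLM's decision from step 2 is simply passed in as `name` and `params`.

```typescript
// Hypothetical shapes for illustration; the SDK's real types may differ.
interface ToolDef {
  description: string
  confirm: boolean
  execute: (params: Record<string, unknown>) => Promise<unknown>
}

interface AuditRecord {
  who: string
  what: string
  when: string
  result: 'success' | 'denied' | 'error'
}

// Runs one tool call: approval gate (step 3), execution (step 4), audit (step 5).
async function runToolCall(
  tools: Record<string, ToolDef>,
  name: string,
  params: Record<string, unknown>,
  approve: (toolName: string) => Promise<boolean>,
  auditLog: string[],
): Promise<AuditRecord> {
  const tool = tools[name]
  let result: AuditRecord['result']
  if (tool === undefined) {
    result = 'error' // the model asked for a tool that is not registered
  } else if (tool.confirm && !(await approve(name))) {
    result = 'denied' // user rejected the approval prompt
  } else {
    try {
      await tool.execute(params)
      result = 'success'
    } catch {
      result = 'error'
    }
  }
  const record: AuditRecord = {
    who: 'user',
    what: name,
    when: new Date().toISOString(),
    result,
  }
  auditLog.push(JSON.stringify(record)) // one JSON object per line (JSON-lines)
  return record
}
```

In this sketch a denied approval is still audited, so the log records what the agent attempted as well as what it did.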

The Tool Registry Pattern

Apps extend the SDK by registering their operations as tools. The SDK provides 6 built-in tools for file operations (Read, Write, Edit, Glob, Grep, Bash). Apps add domain-specific tools:

| App | Example tools |
|-----|---------------|
| Pixel | brush.draw, layer.remove-bg, selection.expand, filter.blur, file.export |
| Motion | timeline.cut, clip.speed, transition.add, keyframe.set, render.export |
| Sound | track.eq, region.normalize, master.lufs, effect.reverb, bounce.export |
| Hub | workspace.create, app.launch, agent.dispatch |

Each tool declares: description (for the LLM), execute (the implementation), confirm (whether to ask the user), and destructive (whether it's irreversible).
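
As a rough guide, the four fields could be typed as follows. `ToolDeclaration`, `ToolMap`, and `describeTools` are hypothetical names used only for this sketch; the SDK's exported types may be shaped differently. `destructive` is marked optional here only because the integration example above omits it on some tools.

```typescript
// Hypothetical type for a tool entry; not the SDK's exported type.
interface ToolDeclaration {
  description: string // shown to the LLM
  execute: (params: Record<string, unknown>) => Promise<unknown> // the implementation
  confirm: boolean // ask the user before running
  destructive?: boolean // whether the action is irreversible
}

type ToolMap = Record<string, ToolDeclaration>

// Builds a plain-text tool listing of the kind a system prompt might include.
function describeTools(tools: ToolMap): string {
  return Object.entries(tools)
    .map(([name, tool]) => `${name}: ${tool.description}`)
    .join('\n')
}

// Example declarations using tool names from the table above;
// the bodies are stubs standing in for real engine calls.
const soundTools: ToolMap = {
  'track.eq': {
    description: 'Apply an EQ curve to a track',
    execute: async (params) => ({ applied: true, params }),
    confirm: false,
  },
  'bounce.export': {
    description: 'Export the final mix',
    execute: async (params) => ({ exported: true, params }),
    confirm: true,
    destructive: false,
  },
}
```

Keeping the description separate from the implementation means the model only ever sees names and descriptions, never the app's engine objects.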

Cross-App Orchestration

One agent request can trigger coordinated work across multiple apps. The Hub acts as an orchestrator, dispatching tasks to per-app agents via a mailbox system.

User: "Create a product video from these photos with background music"

xeno-hub (orchestrator)
    |-- mailbox.send('pixel-agent', { task: 'export hero images as PNGs' })
    |-- mailbox.send('motion-agent', { task: 'create timeline from exported images' })
    |-- mailbox.send('sound-agent', { task: 'add background track, master to -14 LUFS' })

Each agent receives via:
    mailbox.onMessage((msg) => agent.execute(msg.task))

This system is planned and not yet implemented in the current codebase. In the planned design, messages are JSON-serializable so they can cross process boundaries over IPC, and the mailbox handles timeout and retry.
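
Because the mailbox does not exist yet, the following is only a sketch of how such a dispatcher might behave in a single process. The `Mailbox` class, its method signatures, and the default timeout are assumptions. Unlike the per-agent `mailbox.onMessage(...)` shown above, this version registers handlers on one shared mailbox by agent ID, and it omits retry.

```typescript
interface MailboxMessage {
  task: string
}

type MessageHandler = (msg: MailboxMessage) => Promise<unknown>

// Hypothetical in-process mailbox; a real one would sit on top of IPC.
class Mailbox {
  private handlers = new Map<string, MessageHandler>()

  onMessage(agentId: string, handler: MessageHandler): void {
    this.handlers.set(agentId, handler)
  }

  async send(agentId: string, msg: MailboxMessage, timeoutMs = 5000): Promise<unknown> {
    const handler = this.handlers.get(agentId)
    if (handler === undefined) {
      throw new Error(`No agent registered as ${agentId}`)
    }
    // Round-trip through JSON to mimic crossing a process boundary.
    const wire: MailboxMessage = JSON.parse(JSON.stringify(msg))
    let timer: ReturnType<typeof setTimeout> | undefined
    const timeout = new Promise<never>((_, reject) => {
      timer = setTimeout(() => reject(new Error(`Timed out after ${timeoutMs} ms`)), timeoutMs)
    })
    try {
      // Whichever settles first wins: the agent's reply or the timeout.
      return await Promise.race([handler(wire), timeout])
    } finally {
      clearTimeout(timer)
    }
  }
}
```

The JSON round-trip is there to surface non-serializable payloads (functions, class instances) early, before a real IPC transport would reject them.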

Security Model

Permission Modes

| Mode | Behavior |
|------|----------|
| default | Ask user before writes and shell commands |
| acceptEdits | Auto-approve file edits, ask for shell commands |
| bypassPermissions | Auto-approve everything (development/testing only) |
| plan | Read-only mode, agent can only plan and suggest |
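
One way to read the four modes is as a small decision function. The sketch below assumes every tool call can be reduced to one of three action kinds (`read`, `write`, `shell`); that classification, and the names `ActionKind`, `Decision`, and `decide`, are illustrative and not part of the SDK.

```typescript
type PermissionMode = 'default' | 'acceptEdits' | 'bypassPermissions' | 'plan'
type ActionKind = 'read' | 'write' | 'shell' // assumed classification
type Decision = 'allow' | 'ask' | 'deny'

// Maps a mode and an action kind to a decision, following the mode descriptions.
function decide(mode: PermissionMode, action: ActionKind): Decision {
  if (action === 'read') return 'allow' // no mode restricts reads
  switch (mode) {
    case 'default':
      return 'ask' // writes and shell commands both prompt
    case 'acceptEdits':
      return action === 'write' ? 'allow' : 'ask'
    case 'bypassPermissions':
      return 'allow'
    case 'plan':
      return 'deny' // read-only: the agent may only plan and suggest
  }
}
```

Per-tool and per-path override rules would be applied before or after this base decision; they are left out to keep the sketch small.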

Safety Features

Path sandboxing: agents cannot access files outside configured boundaries
Destructive action gates: delete, overwrite, and shell commands require explicit approval
Audit trail: every tool call logged in JSON-lines format with trace IDs
Risk classification: each tool call classified as low/medium/high risk
Permission rules: per-tool, per-path override rules
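
Path sandboxing usually comes down to resolving a requested path and checking that it stays under an allowed root. The following is a minimal sketch using Node's `path` module; `isInsideSandbox` is a hypothetical helper, and the SDK's own path-safety checks may be stricter (for example, this version does not resolve symlinks).

```typescript
import * as path from 'node:path'

// Returns true when `candidate`, resolved against `root`, stays inside `root`.
function isInsideSandbox(root: string, candidate: string): boolean {
  const resolvedRoot = path.resolve(root)
  const resolved = path.resolve(resolvedRoot, candidate)
  const relative = path.relative(resolvedRoot, resolved)
  if (relative === '') return true // the root itself
  // A relative path that climbs out starts with '..'; on Windows a different
  // drive yields an absolute path instead.
  const escapes =
    relative === '..' ||
    relative.startsWith('..' + path.sep) ||
    path.isAbsolute(relative)
  return !escapes
}
```

Comparing via `path.relative` rather than a string prefix avoids treating `/project-other` as if it were inside `/project`.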

Session Persistence

Sessions have unique IDs with embedded timestamps
Full conversation transcripts written to disk
Checkpoint system for long-running workflows
Lock management prevents concurrent access to the same session
Turn restore for crash recovery (resume mid-conversation)
Session registry tracks all sessions per project
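
To make two of these points concrete, here is a sketch of a session ID that embeds its creation time and of a lock that rejects a second holder. The ID format (`sess-<base36 milliseconds>-<suffix>`), the function names, and the in-memory lock are all assumptions; the SDK's real format is not documented here, and a real lock would need to work across processes.

```typescript
// Builds an ID such as "sess-<timestamp in base 36>-<random suffix>".
function createSessionId(
  now: Date = new Date(),
  suffix: string = Math.random().toString(36).slice(2, 10),
): string {
  return `sess-${now.getTime().toString(36)}-${suffix}`
}

// Recovers the creation time embedded in an ID built by createSessionId.
function sessionCreatedAt(sessionId: string): Date {
  const parts = sessionId.split('-')
  if (parts.length !== 3 || parts[0] !== 'sess') {
    throw new Error(`Not a session ID: ${sessionId}`)
  }
  const millis = parseInt(parts[1], 36)
  if (Number.isNaN(millis)) {
    throw new Error(`Invalid timestamp in session ID: ${sessionId}`)
  }
  return new Date(millis)
}

// In-memory stand-in for session locks: one holder per session at a time.
class SessionLocks {
  private held = new Set<string>()

  acquire(sessionId: string): boolean {
    if (this.held.has(sessionId)) return false
    this.held.add(sessionId)
    return true
  }

  release(sessionId: string): void {
    this.held.delete(sessionId)
  }
}
```

Embedding the timestamp lets a registry sort or prune sessions by age without opening each transcript.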

Memory System

4-level hierarchy with automatic capture and budget management:

| Level | Scope | Persists | Example |
|-------|-------|----------|---------|
| Conversation | Current turn | No | "The user just asked about Layer 3" |
| Session | Current session | Until session ends | "We're working on the hero image" |
| Project | Current project | Yes | "This project uses 300 DPI, CMYK" |
| Global | All projects | Yes | "User prefers dark theme, metric units" |

Memory is injected into the system prompt with configurable token budgets to avoid context overflow.
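
A budgeted injection step might look roughly like this. The sketch assumes one token budget per level and a crude estimate of four characters per token; `MemoryEntry`, `estimateTokens`, and `buildMemoryBlock` are hypothetical names, and the SDK's MemoryManager may select and order entries differently.

```typescript
type MemoryLevel = 'conversation' | 'session' | 'project' | 'global'

interface MemoryEntry {
  level: MemoryLevel
  text: string
}

// Rough estimate (assumption): about four characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4)
}

// Assembles the memory section of a system prompt, widest scope first,
// skipping any entry that would exceed its level's remaining budget.
function buildMemoryBlock(
  entries: MemoryEntry[],
  budgets: Record<MemoryLevel, number>,
): string {
  const order: MemoryLevel[] = ['global', 'project', 'session', 'conversation']
  const lines: string[] = []
  for (const level of order) {
    let remaining = budgets[level]
    for (const entry of entries) {
      if (entry.level !== level) continue
      const cost = estimateTokens(entry.text)
      if (cost > remaining) continue // does not fit; try the next entry
      remaining -= cost
      lines.push(`[${level}] ${entry.text}`)
    }
  }
  return lines.join('\n')
}
```

Budgeting per level, rather than across the whole block, keeps a verbose project memory from crowding out global preferences.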

LLM Providers

| Provider | How | When |
|----------|-----|------|
| Xeno API (cloud) | api.xenostudio.ai proxy to hosted models | Online, highest capability |
| xeno-rt (local) | OpenAI-compatible API on localhost | Offline, privacy, no cost |
| Ollama (local) | OpenAI-compatible API | Alternative local provider |

The SDK is provider-agnostic. Any OpenAI-compatible endpoint works. Provider selection and fallback logic are handled by the config subsystem.
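
Fallback between providers can be pictured as trying an ordered list until one responds. The sketch below is not the config subsystem's actual logic: `Provider`, `selectProvider`, and the `isReachable` probe are hypothetical, the `https://` scheme on the Xeno API host is assumed, and the xeno-rt port is a placeholder. The Ollama URL uses that project's default port and OpenAI-compatible path.

```typescript
interface Provider {
  name: string
  baseUrl: string
}

// Ordered by preference: cloud first, then local fallbacks.
const providers: Provider[] = [
  { name: 'xeno-api', baseUrl: 'https://api.xenostudio.ai' }, // scheme assumed
  { name: 'xeno-rt', baseUrl: 'http://localhost:8080/v1' }, // port is a placeholder
  { name: 'ollama', baseUrl: 'http://localhost:11434/v1' }, // Ollama's default port
]

// Returns the first provider whose probe succeeds. The probe is injected so
// the caller decides what "reachable" means (a health check, a models call).
async function selectProvider(
  candidates: Provider[],
  isReachable: (provider: Provider) => Promise<boolean>,
): Promise<Provider> {
  for (const candidate of candidates) {
    if (await isReachable(candidate)) return candidate
  }
  throw new Error('No LLM provider is reachable')
}
```

Probing in sequence keeps the preference order strict; a caller that cares more about startup latency could probe in parallel and pick the best responder instead.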

Consumers

| App | How it uses the SDK | Example agent task |
|-----|--------------------|--------------------|
| xeno-agent-cli | Terminal agent (reference implementation) | "Refactor this codebase" |
| xeno-pixel | Image editing agent sidebar | "Remove backgrounds from 50 images" |
| xeno-motion | Video editing agent sidebar | "Cut this interview into a highlight reel" |
| xeno-sound | Audio editing agent sidebar | "Master this podcast to -16 LUFS" |
| xeno-hub | Orchestrator routing tasks between apps | "Create marketing materials" dispatches to Pixel + Motion + Sound |

Planned: React Component Library

A separate /react or /ui export will provide pre-built React components for embedding the agent interface in Electron-based XENO apps:

ChatPanel -- full agent sidebar with message history
MessageBubble -- individual messages (user, agent, system)
ToolCallIndicator -- real-time display of tool execution with status
ThinkingState -- streaming thinking/reasoning indicator
PermissionPrompt -- inline approval UI for destructive actions
SessionSwitcher -- switch between conversation sessions

These components will NOT live in the core SDK (no DOM dependencies in core). They will be a separate package entry point that apps can import independently.

Ecosystem Position

LAYER 5 -- APPS (Pixel, Motion, Sound, Hub)
    | embed xeno-agent-sdk for AI automation
    | agent sidebar in every app
LAYER 3 -- THIS REPO (xeno-agent-sdk)
    | uses LLM providers + invokes AI models
LAYER 2 -- COMPUTE (xeno-rt for LLM, xeno-lib for 17 AI models)
    | runs on
LAYER 1 -- PLATFORM (servers, auth, credits)

See Full Ecosystem Report for complete context.

Development

npm install
npm run build      # tsup -> ESM output
npm run typecheck   # TypeScript 5.7.3 strict mode