zoe-agent

v0.3.2

Published

9 days ago

Zoe — your AI person. A skill-driven AI agent with CLI, SDK, and server adapters; multi-provider LLM support and Docker-native deployment. Coding is one of the things she does.

Zoe Agent 🦞

The Engineering-First Headless Agent Framework: CLI, SDK, and Server. Stable, Scalable Automation for the Post-Vision Era.

🔗 GitHub Repository: https://github.com/hashangit/zoe

Zoe Agent is a high-stability, open-source automation framework specifically engineered for headless systems.

Unlike "screen-seeing" agents (such as OpenClaw) that rely on visual interpretation, Zoe Agent is built on a foundation of precise command-driven execution. This makes it significantly more stable, robust from an engineering perspective, and easier to scale across complex environments—whether it's a local server, a CI/CD pipeline, or thousands of containerized nodes.

Why Zoe Agent?

🐳 Docker Native: Built to run safely inside containers. Minimal footprint (Node.js/Alpine friendly).
🚀 Better Engineering: Operates via precise system APIs and shell commands rather than unstable visual recognition, ensuring deterministic outcomes.
🛡️ Superior Stability: Immune to issues like UI rendering, screen resolution, or network lag that plague vision-based agents.
📈 Massive Scalability: Low resource consumption allows orchestrating thousands of instances (e.g., in K8s) for true automation swarms.
🔌 Swarm Ready: Stateless design allows for easy orchestration via K8s, Docker Swarm, or simple shell loops.
🧩 Extensible Integrations: Built-in support for Web Search (Tavily), Email (SMTP), and Notification Webhooks (Feishu, DingTalk, WeCom).
📦 SDK & Server: TypeScript SDK for programmatic use, standalone HTTP/WebSocket server for remote access.
🛠 Skills System: Loadable skill packs with file references, custom tool registration, and extensible workflows.

Features

🤖 Multi-Provider Support: Switch between OpenAI, Anthropic Claude, GLM, or any OpenAI-compatible endpoint
🔄 Runtime Provider Switching: Change AI providers mid-conversation with /models command
📜 Headless Execution: No browsers, no GUIs. Pure terminal efficiency.
🚀 Non-Interactive Mode: Intelligent flag handling (-y, --no-interactive) for zero-touch automation.
📂 Universal Control: From simple file I/O to complex system administration.
🧠 Context Aware: Detects container environments and provides accurate system time for relative date queries.
🌐 Web Search: Integrated with Tavily for real-time information retrieval.
🕒 Time Accuracy: Built-in tool to get precise system date and time for correct temporal context.
📧 Communication: Send emails and push notifications to chat groups automatically.
📦 TypeScript SDK: Programmatic access via createAgent, streamText, generateText.
🖥 Server Mode: Standalone HTTP/WebSocket server with API key auth and session management.
🛠 Skills System: Loadable skill packs from directories with @path file references and custom tool creation.
🐚 Shell Approval: Interactive prompts or non-interactive modes via ZOE_SHELL_APPROVE env var.
🖥️ Interactive TUI: In a TTY, a full-screen Ink/React UI — bordered always-on input, streaming feed, session manager, message queue//steer, and inline write_file diffs (atomic, crash-safe writes).

Tech Stack

Runtime: Node.js
Language: TypeScript
Architecture: Modular multi-adapter (core, CLI, SDK, server)
Framework: Commander.js
UI: Inquirer (interactivity), Chalk (styling), Ora (spinners)
AI: Multi-Provider (OpenAI, Anthropic Claude, GLM, OpenAI-Compatible)

Installation

npm

npm install -g zoe-agent

pnpm

pnpm add -g zoe-agent

Homebrew (macOS & Linux)

brew tap hashangit/zoe
brew install zoe-agent

Note: Requires Node.js 20 or later.

Server Binary

The zoe-server binary is included for running the standalone HTTP/WebSocket server:

zoe-server --port 7337 --generate-api-key

SDK Usage

Import the SDK in your TypeScript/JavaScript project:

npm install zoe-agent

// Main exports
import { createAgent, streamText, generateText } from 'zoe-agent';
// Server utilities
import { createServer } from 'zoe-agent/server';

Development Installation

Clone the repository:

git clone https://github.com/hashangit/zoe.git
cd zoe

Install dependencies:
```
pnpm install
```
Build the project:
```
pnpm run build
```
Link globally (optional):
```
pnpm link
```

Quick Start

Setup: Run the interactive setup wizard to configure your API keys and integrations.
```
zoe setup
```
The wizard now supports configuring multiple providers (OpenAI, Anthropic, GLM) in a single session.
Run: Start the agent in interactive mode.
```
zoe
```

Usage

Interactive Mode

Simply run zoe to enter the chat loop.

zoe
> List all TypeScript files in the src folder.

Interactive TUI

In a terminal (TTY), zoe launches a full-screen Ink/React TUI instead of the readline loop: a bordered always-visible input, a streaming message feed, a persistent task panel (manage_todos), session list/resume/export, and inline diffs for write_file. File writes are atomic (same-dir temp + fs.rename, so a crash never corrupts the file) and render as a green/red unified diff in the tool block; Ctrl+O expands collapsed blocks, and /clear starts a fresh session. It falls back to the readline REPL when piped or run with --no-interactive.

Headless Mode (One-Shot)

Run a single command and exit.

zoe "Check disk usage and save the report to usage.txt" --no-interactive

Auto-Confirm (CI/CD)

Automatically approve all tool executions (dangerous, use with caution or in sandboxes).

zoe "Refactor src/index.ts to use ES modules" -y

Provider Selection

Use a specific provider for a single command:

zoe -p anthropic "Analyze this code for security issues"

Switch Providers Mid-Conversation

In interactive mode, type /models to switch between configured providers:

zoe
> /models  # Select Anthropic from the list
> Now analyze this with Claude...

CLI Options

-m, --model <model>: Specify the LLM model (default: gpt-4o).
-p, --provider <provider>: Specify the LLM provider (openai-compatible, openai, anthropic, glm).
-n, --no-interactive: Exit after processing the initial query (Headless mode).
-y, --yes: Auto-confirm all tool executions (e.g., shell commands).
--docker: Run in Docker-optimized non-interactive mode (auto-detected in containers).
--generate-api-key: Generate an API key for server mode (use with zoe-server).

Interactive Commands

/models: Switch between configured providers during a conversation.
/exit or /quit: End the session.

Configuration

Zoe Agent uses a hierarchical configuration system.

Priority Order (Highest to Lowest):

CLI Arguments: (e.g., -m gpt-4o)
Environment Variables: (OPENAI_API_KEY, .env file)
Project Config: (./.zoe/setting.json in current directory)
Global Config: (~/.zoe/setting.json)

Supported Configuration Keys (JSON)

Environment Variables:

ZOE_SHELL_APPROVE: Shell command approval mode (auto, deny, or unset for interactive)
ZOE_SKILLS_PATH: Colon-separated list of additional skill directories
OPENAI_API_KEY, ANTHROPIC_API_KEY, GLM_API_KEY: Provider API keys
OPENAI_COMPAT_API_KEY, OPENAI_COMPAT_BASE_URL, OPENAI_COMPAT_MODEL: OpenAI-compatible provider settings

Multi-Provider Configuration:

provider: Active provider type (openai-compatible, openai, anthropic, glm)

models: Object containing per-provider configurations:

{
  "models": {
    "openai-compatible": { "apiKey": "...", "baseUrl": "...", "model": "gpt-4o" },
    "openai": { "apiKey": "...", "model": "gpt-4o" },
    "anthropic": { "apiKey": "...", "model": "claude-sonnet-4-5-20250929" },
    "glm": { "apiKey": "...", "model": "sonnet" }
  }
}

Legacy Keys (Backward Compatible):

apiKey: Your OpenAI API Key (legacy, treated as openai-compatible).
baseUrl: Custom Base URL (e.g., for DeepSeek or LocalLLM).
model: Default model to use.
tavilyApiKey: API Key for Tavily Web Search.
smtpHost, smtpPort, smtpUser, smtpPass, smtpFrom: SMTP Email settings.
feishuWebhook, dingtalkWebhook, wecomWebhook: Notification webhooks.

Project-Level Config Example

Multi-Provider Configuration:

{
  "provider": "anthropic",
  "models": {
    "openai": { "apiKey": "sk-...", "model": "gpt-4o" },
    "anthropic": { "apiKey": "sk-ant-...", "model": "claude-sonnet-4-5-20250929" }
  }
}

Legacy Configuration (Still Supported): Create a file at .zoe/setting.json:

{
  "model": "gpt-3.5-turbo",
  "baseUrl": "https://api.deepseek.com/v1"
}

⚠️ Security Warning: If you store your apiKey or secrets in .zoe/setting.json, make sure to add .zoe/ to your .gitignore file to prevent leaking secrets!

Integrations

Gateway (MCP Client + REST Proxy + OpenAPI Adapter)

Zoe Agent v0.3.0 introduces a universal API gateway that connects to downstream MCP servers and REST APIs:

Semantic Injection: Middleware scores your message against all discovered tools and injects the top-K most relevant directly into the agent's tool context. Zero context pollution.
Proxy Pattern: Generic tools (gateway_route, gateway_call_tool, etc.) let the agent navigate targets when semantic injection finds no match.
OpenAPI Import: Import any OpenAPI spec (JSON/YAML) and auto-register all operations as a REST target.
Credential Trust Guard: Admin-registered targets can resolve stored credentials; agent-registered targets cannot — preventing credential exfiltration.
Audit Logging: Ring-buffer audit logs with per-target usage summaries for debugging and self-healing.

Configuration (~/.zoe/setting.json or env vars):

{
  "gatewayEnabled": true,
  "gatewaySemanticTopK": 3,
  "gatewayRateLimit": 60,
  "gatewayMaxAuditLogs": 1000
}

REST API: GET/POST/PATCH/DELETE /v1/gateway/* (admin scope required for mutations)

SDK:

import { gateway } from 'zoe-agent';
const gw = await gateway.createGateway({ enabled: true, semanticTopK: 3, defaultRateLimitPerMin: 60, maxAuditLogsInMemory: 1000 });

Multi-Provider LLM Support

Zoe Agent supports multiple AI providers with seamless switching:

OpenAI: GPT-4, GPT-3.5-turbo, and latest models
Anthropic: Claude Sonnet, Haiku, Opus models
GLM: Z.ai GLM-4.5, GLM-4.7, GLM-5.1 models
OpenAI-Compatible: DeepSeek, LocalLLM, Ollama, LM Studio, and any OpenAI-compatible endpoint

Configure multiple providers during setup and switch between them using /models command or -p flag.

Web Search (Tavily)

Zoe Agent can search the web if you provide a Tavily API Key during setup or in config.

Usage: "Search for the latest Node.js release notes."

Email (SMTP)

Configure SMTP settings to let the agent send emails.

Usage: "Send an email to [email protected] with the summary of the log file."

Notifications (Feishu/DingTalk/WeCom)

Configure webhooks to receive alerts or reports in your team chat apps.

Usage: "Notify the team on Feishu that the build has finished."

Date & Time

Built-in utility to provide the agent with the current system time, ensuring accurate handling of relative time requests.

Usage: "What's the date today?" or "Remind me to check the logs next Monday."

SDK & Programmatic Usage

Zoe Agent provides a TypeScript SDK for building agent-powered applications.

Basic Agent

import { createAgent } from 'zoe-agent';

const agent = await createAgent({
  provider: 'anthropic',
  model: 'claude-sonnet-4-5-20250929',
});

const result = await agent.chat('List all running Docker containers');
console.log(result.text);

Streaming

import { streamText } from 'zoe-agent';

const stream = await streamText('Analyze the logs for errors', {
  provider: 'openai',
});

for await (const chunk of stream.textStream) {
  process.stdout.write(chunk);
}

Structured Output

import { generateText } from 'zoe-agent';

const result = await generateText('Extract the top 3 issues from these logs', {
  provider: 'anthropic',
});
console.log(result.text);

Getting Started (SDK)

Before first use, configure providers from your CLI config (or env vars):

import { configureProviders, loadProviderConfig } from 'zoe-agent';

const config = loadProviderConfig(); // reads ~/.zoe/setting.json + .zoe/setting.json + env
if (config) {
  configureProviders(config);
}

// Now safe to call
import { generateText } from 'zoe-agent';
const result = await generateText('Hello, world!');

Without this step, generateText defaults to OpenAI using OPENAI_API_KEY if present, or throws a "no provider configured" error otherwise.

Custom Tools

import { createAgent, tool } from 'zoe-agent';

const agent = await createAgent({
  provider: 'openai',
  tools: [
    tool({
      name: 'check_disk',
      description: 'Check disk usage',
      parameters: {},
      execute: async () => {
        const usage = await getDiskUsage();
        return JSON.stringify(usage);
      },
    }),
  ],
});

Session Persistence

const agent = await createAgent({
  provider: 'anthropic',
  persist: 'my-session',          // Resume a previous session
});

Server Mode

Run Zoe Agent as a standalone HTTP/WebSocket server for remote agent access.

Starting the Server

# Start with default settings
zoe-server

# Generate an API key
zoe-server --generate-api-key

# Custom port
zoe-server --port 8080

REST API

# Send a prompt
curl -X POST http://localhost:7337/api/chat \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"message": "Check disk usage", "provider": "openai"}'

# List sessions
curl http://localhost:7337/api/sessions \
  -H "Authorization: Bearer YOUR_API_KEY"

WebSocket Streaming

const ws = new WebSocket('ws://localhost:7337/ws?token=YOUR_API_KEY');

ws.onopen = () => {
  ws.send(JSON.stringify({
    type: 'chat',
    message: 'Analyze the error logs',
    provider: 'anthropic',
  }));
};

ws.onmessage = (event) => {
  const chunk = JSON.parse(event.data);
  process.stdout.write(chunk.text);
};

Skills System

Skills are single SKILL.md files with YAML frontmatter that extend the agent with domain-specific prompts and tool restrictions. See docs/sdk/skills.md for full documentation.

Skill Format

---
name: docker-ops
description: Docker operations assistant
tags: [docker, devops]
allowedTools: [execute_shell_command, read_file]
args: [environment, service]
model:
  provider: anthropic
  model: claude-sonnet-4-20250514
---

Skill instructions and templates here.
Use {{environment}} and {{service}} for argument substitution.
Reference files with @k8s/deployment.yaml.

Key Features

Argument substitution: {{argName}}, $1/$2/$ALL for positional args
File references: @path/to/file.yaml injects file contents into the prompt
Tool restrictions: allowedTools limits which tools the skill can use
Model selection: model overrides provider/model per skill
Lazy loading: skill body is only read when invoked, keeping startup fast
Body limits: bodies over 8k chars warn, over 32k chars truncate (~8k tokens)

Discovery Locations

Skills are discovered in priority order (last wins):

Built-in bundled skills
~/.zoe/skills/
.zoe/skills/ (project-level)
ZOE_SKILLS_PATH directories

# Add custom skill directories
export ZOE_SKILLS_PATH=/path/to/skills:/another/path

Docker Support

Zoe Agent includes a production-ready Dockerfile (Node 20 Alpine) and docker-compose.yml for containerized deployment.

Quick Start with Docker

# Clone and build
git clone https://github.com/hashangit/zoe.git
cd zoe
docker build -t zoe-server .

# Run the server
docker run -d -p 7337:7337 \
  -e OPENAI_API_KEY=sk-... \
  zoe-server

# Or use Docker Compose
docker compose up -d

Docker-Optimized CLI Mode

Use --docker for non-interactive execution inside containers:

docker run --rm \
  -e OPENAI_API_KEY=sk-... \
  -e ZOE_SHELL_APPROVE=auto \
  zoe-server zoe "Check disk usage" --docker

Zoe Agent auto-detects Docker and non-interactive environments. When running in a container, it adjusts behavior accordingly (no interactive prompts, streamlined output).

Shell Approval in Containers

Set ZOE_SHELL_APPROVE to control how shell commands are approved without interactive prompts:

auto: Automatically approve all commands (use in trusted/sandboxed environments)
deny: Deny all shell command execution
(unset): Interactive prompt (default, requires a TTY)

# docker-compose.yml example
services:
  zoe:
    build: .
    ports:
      - "7337:7337"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - ZOE_SHELL_APPROVE=auto
      - ZOE_SKILLS_PATH=/app/skills

Non-Latin Font Issues in Screenshots

When running Zoe Agent inside a Docker container (especially Alpine or Debian Slim), screenshots of websites with non-Latin text (e.g., CJK characters) may display text as square boxes ("tofu") due to missing fonts. Emojis (e.g., 🔥) may also appear as squares.

Solution: Install CJK (Chinese/Japanese/Korean) and Emoji fonts in your container.

For Debian/Ubuntu:

apt-get update && apt-get install -y fonts-noto-cjk fonts-wqy-zenhei fonts-noto-color-emoji

For Alpine Linux: apk add font-noto-cjk font-noto-emoji

License

Zoe is distributed under the Business Source License 1.1 (BSL) — a source-available license.

Free for personal, non-commercial use (including students, educators, and research), and for short evaluation/internal trial use.
Commercial / production use (including offering Zoe as a managed service) requires a commercial license from the Licensor.
On the Change Date (2028-01-01) the Licensed Work automatically becomes available under the Apache License 2.0.

See the full terms in LICENSE. To obtain a commercial license, contact the maintainer.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Fork the Project
Create your Feature Branch (git checkout -b feature/AmazingFeature)
Commit your Changes (git commit -m 'Add some AmazingFeature')
Push to the Branch (git push origin feature/AmazingFeature)
Open a Pull Request

GitHub: https://github.com/hashangit/zoe