omnius

v1.0.561

Published

3 hours ago

AI coding agent powered by open-source models (Ollama/vLLM) — interactive TUI with agentic tool-calling loop

Omnius

Omnius is a local-first agentic coding runtime: terminal UI, autonomous coding loop, REST daemon, model router, memory layer, media tools, Telegram bridge, and peer-to-peer inference mesh in one CLI.

It is designed for open-weight and user-controlled models first, while still routing cleanly through Ollama, vLLM, OpenAI-compatible endpoints, OpenRouter, Groq, Chutes, sponsor peers, COHERE peers, and other configured providers.

Install

npm install -g omnius
omnius

Requirements:

Node.js 22 or newer
npm 10 or newer for published CLI use
pnpm 9 or newer for workspace development
A local model or configured remote endpoint

Start the REST daemon:

omnius serve

The daemon defaults to http://127.0.0.1:11435. Open the interactive API docs at http://127.0.0.1:11435/docs.

What Omnius Does

Runs autonomous coding tasks, edits files, executes tools, tests changes, and iterates on failures.
Provides a dense terminal UI for model selection, endpoint routing, task control, shell output, voice, sponsors, Telegram, and system telemetry.
Exposes a REST daemon with OpenAI/Ollama-compatible inference, agentic task execution, memory, skills, tools, MCP, events, voice, projects, and governance endpoints.
Routes models through local, cloud, sponsor, and peer-to-peer endpoints without assuming local Ollama is the only source.
Supports realtime spoken conversation for ASR/TTS clients through /realtime and REST realtime: true.
Supports image, video, sound, music, TTS, ASR, voice clone references, Telegram media workflows, and sponsor-provided media generation.
Keeps project runtime state in .omnius/, which is intentionally ignored by git.

Common Workflows

omnius "inspect this repo and summarize the main entrypoints"
omnius serve

/help                 command help
/model                select or inspect the active model
/endpoint             select or configure local, cloud, sponsor, or peer endpoints
/realtime             toggle short ASR/TTS-oriented conversation mode
/broker               inspect model broker, RAM/VRAM thresholds, and loaded models
/sponsor              expose local or upstream capacity to peers
/cohere               participate in distributed COHERE inference
/telegram             configure or toggle the Telegram bridge
/skills               list explorable skills and docs memories
/pause                pause after the current turn boundary
/stop                 interrupt the active run
/resume               resume saved state

Current Feature Areas

| Area | What to read |
| --- | --- |
| Install and setup | Install, First run, Model providers |
| Terminal workflows | TUI workflows, Slash commands |
| REST daemon | REST reference, REST quickref, OpenAPI source |
| Realtime voice chat | Realtime guide |
| Sponsor and COHERE mesh | Sponsor and COHERE guide |
| Telegram bridge | Telegram guide |
| Media generation | Media guide |
| Operations | Runtime hygiene, Security and remote access |
| Architecture | Architecture overview |
| Agent-explorable docs | Agent memory docs index |

Shared Media Dependencies

Image, video, audio, and music generation share a single, system-wide dependency store instead of duplicating heavy runtimes per project or per Telegram group.

Earlier builds wrote a private Python venv plus Hugging Face / Torch / pip caches under every scoped working directory (for example …/telegram-creative/<group-id>/.omnius/image-gen/.venv). On a busy machine the same multi-gigabyte diffusers stack and model weights were re-downloaded once per group — tens of gigabytes of pure duplication.

Everything now resolves to one source of truth under ~/.omnius (override with OMNIUS_HOME):

| Location | Holds |
| --- | --- |
| ~/.omnius/runtimes/<kind>/.venv-<backend> | One shared Python venv per kind+backend (image/video/audio) |
| ~/.omnius/models/huggingface/{hub,transformers,diffusers} | Shared model weights, downloaded once and reused everywhere |
| ~/.omnius/models/{torch,cache,pip-cache} | Shared Torch hub, XDG, and pip caches |
| ~/.omnius/models/_meta.json | LRU usage index for automatic disk-pressure eviction |
| ~/.omnius/media/{images,videos,audio,music} | Global generated-media gallery (project-independent) |

Project directories keep only lightweight session artifacts; no venvs or model weights are written per project.
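
The layout above can be sketched as a small path resolver. This is an illustration only, not Omnius's actual resolver: the function name `resolveStore`, the `StorePaths` shape, and the backend name `cuda` in the usage note are hypothetical, and it assumes `OMNIUS_HOME` replaces the `~/.omnius` root itself.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Kinds that get a shared venv, per the Location table.
type StoreKind = "image" | "video" | "audio";

// Hypothetical shape; field names are illustrative, paths follow the table.
interface StorePaths {
  root: string;
  venv: string;
  huggingface: string;
  metaIndex: string;
  gallery: string;
}

function resolveStore(
  kind: StoreKind,
  backend: string,
  env: Record<string, string | undefined> = process.env,
): StorePaths {
  // Assumption: OMNIUS_HOME, when set and non-empty, replaces ~/.omnius.
  const root = env["OMNIUS_HOME"] || path.join(os.homedir(), ".omnius");
  return {
    root,
    venv: path.join(root, "runtimes", kind, `.venv-${backend}`),
    huggingface: path.join(root, "models", "huggingface"),
    metaIndex: path.join(root, "models", "_meta.json"),
    gallery: path.join(root, "media"),
  };
}
```

For example, `resolveStore("image", "cuda", { OMNIUS_HOME: "/data/omnius" }).venv` resolves to `/data/omnius/runtimes/image/.venv-cuda` on a POSIX system.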

Migrate and dedup existing machines. A one-time cleanup consolidates any legacy per-group caches into the unified store — unique weights are moved (never re-downloaded), duplicates and stale venvs are reclaimed:

# TUI — current project only
/models cleanup
# TUI — every project + nested scoped group on this machine (dry-run first)
/models cleanup --all --dry-run
/models cleanup --all

# REST — preview, then apply
curl -s -X POST localhost:11435/v1/media/migrate -H 'content-type: application/json' -d '{"dryRun":true}'
curl -s -X POST localhost:11435/v1/media/migrate -H 'content-type: application/json' -d '{}'
# Inspect store + reclaimable legacy caches
curl -s localhost:11435/v1/media/store

Generate over REST. The daemon (default 127.0.0.1:11435, an unprivileged port in the IANA registered range, 1024 to 49151, so it stays clear of the well-known system ports below 1024) exposes the local generators so any user on the machine can list models, generate, and browse the global gallery without the CLI:

curl -s localhost:11435/v1/media/models
curl -s -X POST localhost:11435/v1/media/image -H 'content-type: application/json' -d '{"prompt":"a compact robot painter"}'
curl -s -X POST localhost:11435/v1/media/music -H 'content-type: application/json' -d '{"prompt":"warm lo-fi piano loop"}'
curl -s localhost:11435/v1/media/gallery
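
The same calls can be made from Node. The sketch below is a hypothetical client, not part of Omnius: only the paths and the `{"prompt": ...}` body come from the curl examples above, and because the response schema is not documented here, the body is returned as raw text.

```typescript
type MediaKind = "image" | "video" | "audio" | "music";

interface MediaRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Default daemon address from this README.
const MEDIA_BASE_URL = "http://127.0.0.1:11435";

// Pure helper: builds the request without touching the network.
function buildGenerateRequest(kind: MediaKind, prompt: string): MediaRequest {
  return {
    url: `${MEDIA_BASE_URL}/v1/media/${kind}`,
    init: {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ prompt }),
    },
  };
}

// Sends the request; requires a running daemon (Node 22 ships a global fetch).
async function generateMedia(kind: MediaKind, prompt: string): Promise<string> {
  const { url, init } = buildGenerateRequest(kind, prompt);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`media ${kind} failed: HTTP ${res.status}`);
  // Response shape is undocumented here, so return the raw body.
  return res.text();
}
```

Keeping request construction separate from the network call makes the payload easy to inspect or unit-test before anything is generated.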

The same surface drives the Generate tab in the web UI (http://127.0.0.1:11435) — pick a kind (image/video/audio/music), choose a model loaded from the system, generate, and review every previously generated file in one global gallery.

Recent Highlights

/realtime and REST realtime: true provide short, natural, SOUL.md-aware conversation for ASR/TTS clients.
Endpoint setup and sponsor setup aggregate models from all enabled endpoints, including external OpenAI-compatible routers.
/sponsor can expose text inference and media generation for image, video, sound, and music with per-modality limits.
Sponsor and COHERE status surfaces now use shared telemetry concepts: concurrency, request rate, daily tokens, peer usage, model usage, and remote system metrics.
The TUI reports token production rate as t/s, supports Shift+Enter multiline input, and renders dynamic shell output inside bounded Unicode cards.
Telegram state is scoped by user and group, supports durable reply preferences, and feeds raw platform/tool failures back into the agent loop.
Ollama pool cleanup now accounts for process groups and orphan runner processes that can keep VRAM pinned.
REST documentation is available both as human docs and as Omnius-discoverable docs skills.

REST API

Start the daemon (default http://127.0.0.1:11435; interactive docs at /docs, machine spec at /openapi.json):

omnius serve

For shared deployments, gate access with scoped bearer keys (read < run < admin):

OMNIUS_REST_API_KEYS="read-key:read:grafana,run-key:run:ci:60:100000:3,admin-key:admin:ops" omnius serve
# then: Authorization: Bearer <key>
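
To make the scope ordering concrete, here is a minimal sketch of how a client-side tool might parse that variable and compare scopes. It is not Omnius's parser: the names `parseKeySpecs` and `scopeAllows` are hypothetical, the third field is treated as a label only as a guess from the example, and any trailing fields (such as `60:100000:3`) are kept verbatim because this README does not document their meaning.

```typescript
// Scope order from the text above: read < run < admin.
const SCOPES = ["read", "run", "admin"] as const;
type Scope = (typeof SCOPES)[number];

interface KeySpec {
  key: string;
  scope: Scope;
  label: string; // assumption: third field is a human-readable label
  extra: string[]; // undocumented trailing fields, preserved as-is
}

function parseKeySpecs(raw: string): KeySpec[] {
  return raw
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0)
    .map((entry) => {
      const [key, scope, label = "", ...extra] = entry.split(":");
      if (!key || !SCOPES.includes(scope as Scope)) {
        throw new Error(`invalid key entry: ${entry}`);
      }
      return { key, scope: scope as Scope, label, extra };
    });
}

// True when the granted scope is at least as strong as the required one.
function scopeAllows(granted: Scope, required: Scope): boolean {
  return SCOPES.indexOf(granted) >= SCOPES.indexOf(required);
}
```

With the example value above, the `run-key` entry parses to scope `run`, so `scopeAllows("run", "read")` holds while `scopeAllows("run", "admin")` does not.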

The complete endpoint inventory follows. It is kept in lockstep with the served OpenAPI spec by pnpm docs:check; the canonical machine contract is generated from packages/cli/src/api/openapi.ts and mirrored in docs/reference/rest-api.md.

Docs and compatibility aliases

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /docs · /api/docs · /swagger-ui | Swagger UI |
| GET | /openapi.json · /openapi.yaml · /v3/api-docs · /swagger.json · /api-docs | OpenAPI spec (JSON/YAML + aliases) |
| GET | /redoc | ReDoc renderer |

Health and observability

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /health · /health/ready · /health/startup | Liveness, backend readiness, startup probes |
| GET | /version | Package version and platform |
| GET | /metrics | Prometheus metrics |
| GET | /v1/events | Server-sent event stream |
| GET | /v1/usage | Token usage and rate limits |
| GET | /v1/audit | Audit log query |
| GET | /v1/cost | Cost tracker |
| GET | /v1/system | CPU, RAM, GPU, and system snapshot |
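
A script that starts the daemon and then talks to it may want to wait on the readiness probe first. The sketch below is one way to do that, assuming `/health/ready` answers with a 2xx status once the backend is ready and a non-2xx status (or a refused connection) before then; that status behavior, the function name `waitForReady`, and the retry defaults are assumptions, not documented behavior.

```typescript
// Minimal fetch-like signature so the loop can be exercised without a daemon.
type Probe = (url: string) => Promise<{ ok: boolean }>;

async function waitForReady(
  baseUrl: string,
  probe: Probe = (url) => fetch(url),
  attempts = 10,
  delayMs = 500,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      const res = await probe(`${baseUrl}/health/ready`);
      if (res.ok) return true;
    } catch {
      // Connection errors are expected while the daemon is still starting.
    }
    // Sleep between attempts, but not after the final one.
    if (i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}
```

Calling `await waitForReady("http://127.0.0.1:11435")` resolves to `true` once the probe succeeds, or `false` after the attempts run out. Injecting the probe keeps the retry logic testable in isolation.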

Inference and chat

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /v1/models · /api/tags | Aggregated model list (OpenAI + Ollama tags) |
| POST | /v1/chat/completions | OpenAI-compatible chat completion |
| POST | /v1/chat | Stateful Omnius chat |
| POST | /api/chat | Ollama-compatible chat alias |
| POST | /v1/generate · /api/generate | One-shot generation (Ollama-compatible) |
| POST | /v1/embeddings · /api/embed | Embeddings (OpenAI + Ollama aliases) |
| GET | /v1/chat/sessions | Active chat sessions |
| POST | /v1/chat/check-in | Steering check-in for active chat |
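
Because `/v1/chat/completions` is described as OpenAI-compatible, a client can plausibly send the standard OpenAI chat payload. The sketch below assumes exactly that: the request body (`model`, `messages`) and the response path (`choices[0].message.content`) follow the OpenAI convention rather than anything confirmed in this README, and the helper names and any model name you pass in are hypothetical.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Pure helper: builds the request; the bearer key is only needed on gated deployments.
function buildChatRequest(model: string, messages: ChatMessage[], apiKey?: string) {
  const headers: Record<string, string> = { "content-type": "application/json" };
  if (apiKey) headers["authorization"] = `Bearer ${apiKey}`;
  return {
    url: "http://127.0.0.1:11435/v1/chat/completions",
    init: { method: "POST", headers, body: JSON.stringify({ model, messages }) },
  };
}

// Sends one user prompt; requires a running daemon and a configured model.
async function chatOnce(model: string, prompt: string, apiKey?: string): Promise<string> {
  const { url, init } = buildChatRequest(model, [{ role: "user", content: prompt }], apiKey);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`chat failed: HTTP ${res.status}`);
  // Assumed OpenAI-style response shape.
  const data = (await res.json()) as { choices: { message: ChatMessage }[] };
  return data.choices[0]?.message.content ?? "";
}
```

Use a model id returned by `GET /v1/models` for the `model` argument.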

Agentic runs

| Method | Path | Purpose |
| --- | --- | --- |
| POST | /v1/run | Submit agentic task |
| GET | /v1/runs · /v1/runs/{id} | List runs · get run details |
| DELETE | /v1/runs/{id} | Abort run |
| POST/GET | /v1/todos | Create/update · list sessions with todos |
| GET/DELETE | /v1/todos/{session_id} | Get · delete session todos |
| POST | /v1/evaluate | Evaluate a run |
| POST | /v1/index | Trigger repository indexing |

Configuration, keys, profiles, projects

| Method | Path | Purpose |
| --- | --- | --- |
| GET/PATCH | /v1/config | Read · update daemon config |
| GET/PUT | /v1/config/model | Current model · switch model |
| POST | /v1/config/model/check | Probe model readiness |
| GET/PUT | /v1/config/endpoint | Current endpoint · switch endpoint |
| POST | /v1/config/endpoint/test | Probe endpoint |
| GET/DELETE | /v1/config/endpoint/history | Endpoint history · remove item |
| POST | /v1/share/generate | Generate remote-access share URL |
| GET/POST | /v1/keys | List · mint runtime API keys |
| DELETE | /v1/keys/{prefix} | Revoke runtime keys by prefix |
| GET/POST | /v1/profiles | List · create tool profiles |
| GET/DELETE | /v1/profiles/{name} | Get · delete profile |
| GET/DELETE | /v1/projects | List · unregister projects |
| GET | /v1/projects/current | Current project |
| POST | /v1/projects/switch · /v1/projects/register · /v1/projects/rename | Switch · register · rename project |
| GET/PUT/DELETE | /v1/projects/preferences | Read · patch · reset project preferences |

Skills, commands, tools, MCP

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /v1/skills · /v1/skills/{name} | List · load skill content |
| GET | /v1/commands | List slash commands |
| POST | /v1/commands/{cmd} | Execute slash command |
| GET | /v1/tools · /v1/tools/{name} | List (built-in + external) · tool metadata |
| POST | /v1/tools/register | Register an application-specific external tool |
| DELETE | /v1/tools/{name} | Unregister an external tool |
| POST | /v1/tools/{name}/call | Call tool |
| POST | /v1/tools/{name}/eval | Evaluate an external tool against test cases |
| GET | /v1/mcps · /v1/mcps/{name} | List · MCP server details |
| POST | /v1/mcps/{name}/call | Call MCP tool |
| GET | /v1/hooks · /v1/agents | Hook registry · agent type registry |
| GET | /v1/codegraph/snapshot · /v1/codegraph/events | Code graph snapshot · SSE |

Registering application-specific tools

Agents integrating Omnius into their own stack can register tools at runtime so the Omnius agent loop can call them alongside built-ins. Registration is a single unified contract — transport.type selects how Omnius reaches the implementation:

http — Omnius POSTs {name, args, session_id} to a callback_url your app hosts and relays the response.
mcp — the tool proxies to a named tool on an MCP server (auto-connected when you pass connect).

Registered tools are persisted per working directory (.omnius/external-tools.json), surface in GET /v1/tools, and respect the same scope/off-device security gate as built-ins. Registration needs run scope (remote callers need admin).
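
On the application side, the http transport needs an endpoint that accepts the `{name, args, session_id}` POST. The sketch below shows one possible receiver using Node's built-in `http` module. The request fields come from the description above; everything else is an assumption: the `{success, output}` response shape is only inferred from the `expect` fields used in the eval example, the order data is a stub, and the names `handleToolCall` and `startCallbackServer` are hypothetical.

```typescript
import { createServer } from "node:http";

// Fields Omnius POSTs to the callback_url, per the transport description.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
  session_id: string;
}

// Assumed response shape; the actual relay contract is not documented here.
interface ToolResult {
  success: boolean;
  output: string;
}

// Pure handler so the tool logic can be tested without a socket.
function handleToolCall(call: ToolCall): ToolResult {
  if (call.name !== "lookup_order") {
    return { success: false, output: `unknown tool: ${call.name}` };
  }
  const id = String(call.args["id"] ?? "");
  // Stub lookup: a real implementation would query the billing system.
  if (!/^A-\d+$/.test(id)) {
    return { success: false, output: `order not found: ${id}` };
  }
  return { success: true, output: `order ${id} found` };
}

function startCallbackServer(port: number) {
  return createServer((req, res) => {
    let raw = "";
    req.setEncoding("utf8");
    req.on("data", (chunk) => (raw += chunk));
    req.on("end", () => {
      let result: ToolResult;
      try {
        result = handleToolCall(JSON.parse(raw) as ToolCall);
      } catch {
        result = { success: false, output: "invalid JSON body" };
      }
      res.writeHead(200, { "content-type": "application/json" });
      res.end(JSON.stringify(result));
    });
  }).listen(port);
}
```

A production receiver would also verify the `auth_header` value sent with each call before doing any work.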

# Register an HTTP-backed tool
curl -s -X POST localhost:11435/v1/tools/register -H 'content-type: application/json' -d '{
  "name": "lookup_order",
  "description": "Look up an order by id in the billing system",
  "parameters": {"type":"object","properties":{"id":{"type":"string"}},"required":["id"]},
  "security": {"requires_scope":"run","risk":"low"},
  "transport": {"type":"http","callback_url":"https://app.internal/tools/lookup_order","auth_header":"Bearer …"}
}'

# It now appears in the registry and is directly callable
curl -s localhost:11435/v1/tools/lookup_order
curl -s -X POST localhost:11435/v1/tools/lookup_order/call -H 'content-type: application/json' -d '{"args":{"id":"A-1001"}}'

# Evaluate it against cases during development (pass/fail + metrics)
curl -s -X POST localhost:11435/v1/tools/lookup_order/eval -H 'content-type: application/json' -d '{
  "cases": [
    {"name":"known order","args":{"id":"A-1001"},"expect":{"success":true,"output_contains":"A-1001"}},
    {"name":"missing order","args":{"id":"nope"},"expect":{"success":false}}
  ]
}'

# Unregister when done
curl -s -X DELETE localhost:11435/v1/tools/lookup_order

The same registration accepts an MCP transport, e.g. "transport":{"type":"mcp","server":"acme","tool":"search","connect":{"url":"https://app.internal/mcp","transport":"streamable-http"}}.

AIWG

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /v1/aiwg | AIWG root and control map |
| GET | /v1/aiwg/frameworks · /v1/aiwg/frameworks/{name} · /v1/aiwg/frameworks/{name}/content | List · details · tier-aware content |
| GET | /v1/aiwg/skills · /v1/aiwg/skills/{name} | List · load AIWG skill |
| GET | /v1/aiwg/agents · /v1/aiwg/agents/{name} | List · load AIWG agent |
| GET | /v1/aiwg/addons | List AIWG addons |
| POST | /v1/aiwg/use · /v1/aiwg/expand | Activation bundle · expand item |

Memory, sessions, context

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /v1/memory | Memory backend summary |
| POST | /v1/memory/search · /v1/memory/write | Search · write memory |
| GET | /v1/memory/episodes · /v1/memory/failures | List episodes · failures |
| GET | /v1/sessions · /v1/sessions/{id} | List task sessions · get history |
| GET | /v1/context | Current context snapshot |
| GET | /v1/context/window-dumps · /v1/context/window-dumps/{id} | List/fetch full outbound model context-window dumps |
| POST | /v1/context/save · /v1/context/compact | Save entry · request compaction |
| GET | /v1/context/restore | Build restore prompt |

Context-window dumps are written for main agents, sub-agents, internal runners, and adversary audits before backend inference. Use GET /v1/context/window-dumps?agent_type=main for summaries with signal/noise metrics, or GET /v1/context/window-dumps/latest for the full request payload. Dumps include focus-supervisor state when the runner is enforcing a next-action contract. Set OMNIUS_CONTEXT_WINDOW_DUMP_DIR to relocate dumps, OMNIUS_DISABLE_CONTEXT_WINDOW_DUMPS=1 to disable them, or OMNIUS_FOCUS_SUPERVISOR=off|auto|strict to tune small-model focus enforcement.

Files, nexus, ollama pool

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /v1/files | List workspace directory |
| POST | /v1/files/read | Read workspace file |
| GET | /v1/nexus/status | Nexus peer state |
| GET | /v1/sponsors | Sponsor directory cache |
| GET | /v1/ollama/pool/processes | Ollama process inventory |
| POST | /v1/ollama/pool/cleanup | Cleanup stale Ollama pool processes |

Voice, audio, vision

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /v1/voice/state | Voice runtime status |
| GET/POST | /v1/voice/models · /v1/voice/models/switch | List · switch TTS model |
| GET/POST | /v1/voice/supertonic-settings | Read · update voice tuning |
| GET/POST | /v1/voice/asr-models · /v1/voice/asr-models/switch | List · switch ASR model |
| POST | /v1/voice/tts · /v1/audio/speech | Synthesize speech (+ OpenAI alias) |
| POST | /v1/voice/transcribe · /v1/audio/transcriptions · /v1/voice/transcribe/stream | Transcribe (+ alias + streaming) |
| GET/POST | /v1/voice/clone-refs | List · upload clone reference |
| POST | /v1/voice/clone-refs/upload · /v1/voice/clone-refs/from-url | Upload · fetch clone reference |
| POST | /v1/voice/clone-refs/{filename}/activate · /v1/voice/clone-refs/{filename}/rename | Activate · rename clone reference |
| DELETE | /v1/voice/clone-refs/{filename} | Delete clone reference |
| POST | /v1/voice/speak | Broadcast speech to voicechat clients |
| GET | /v1/voicechat/ws | WebSocket upgrade for full-duplex voicechat |
| POST | /v1/vision/describe | Vision describe placeholder |

Generative media

Backed by the unified ~/.omnius store and shared venvs (see Shared Media Dependencies). Outputs land in the global gallery at ~/.omnius/media/{images,videos,audio,music}.

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /v1/media/models | List available image/video/audio/music models |
| GET | /v1/media/store | Unified store disk usage + reclaimable legacy caches |
| POST | /v1/media/migrate | Dedup + migrate legacy per-group caches into the unified store |
| POST | /v1/media/image · /v1/media/video · /v1/media/audio · /v1/media/music | Generate media (run scope) |
| GET | /v1/media/gallery | List previously generated media (global, newest first) |
| GET | /v1/media/file | Stream one generated media file |

Engines and scheduled jobs

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /v1/engines | Long-running engine status |
| GET | /v1/scheduled · /v1/scheduled/all · /v1/scheduled/status | List · list all · scheduler status |
| POST | /v1/scheduled/kill · /v1/scheduled/fixup · /v1/scheduled/reconcile | Kill · reconcile · force reconcile |
| GET | /v1/services/systemd | Systemd service status |
| GET | /v1/update | Self-update status |

AIMS governance (ISO/IEC 42001:2023)

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /v1/aims | AIMS root and endpoint index |
| GET/PUT | /v1/aims/policies | Policy register · replace |
| GET | /v1/aims/roles · /v1/aims/resources | Roles · resource inventory |
| GET/POST | /v1/aims/impact-assessments | List · file impact assessment |
| GET | /v1/aims/lifecycle · /v1/aims/data-quality · /v1/aims/transparency · /v1/aims/usage · /v1/aims/suppliers | Lifecycle, data quality, transparency, usage, suppliers |
| GET/POST | /v1/aims/incidents | List · file incident |
| GET | /v1/aims/oversight · /v1/aims/decisions · /v1/aims/config-history | Oversight gates · decision log · config history |

For per-endpoint schemas, parameters, and response shapes, see the served /openapi.json and the maintained inventory in docs/reference/rest-api.md.

Agent-Explorable Documentation

Omnius discovers project-local docs skills from .aiwg/addons/*/skills. The docs bundles in this repo expose high-signal entrypoints for agents:

/skills omnius docs
skill_execute name="omnius-docs"
skill_execute name="omnius-rest-docs"
skill_extract name="omnius-realtime-docs" query="How does realtime REST mode work?"

The intended pattern is index first, targeted document second, rather than loading the whole manual into the active context.

Development

pnpm install
pnpm -r build
pnpm docs:check

Focused checks used for the docs skill surface:

pnpm --filter @omnius/execution exec vitest run tests/skill-discovery.test.ts
pnpm --filter omnius exec vitest run tests/realtime-mode.test.ts tests/command-registry.test.ts

Publishing

Publish only from publish/.

cd omnius
pnpm -r clean || true
find . -name 'tsconfig.tsbuildinfo' -not -path '*/node_modules/*' -delete
pnpm -r build
node scripts/build-publish.mjs
cd publish
mkdir -p .npm-cache
NPM_CONFIG_CACHE=$(pwd)/.npm-cache npm pack --prefer-online --cache-min=0 --registry https://registry.npmjs.org/
NPM_CONFIG_CACHE=$(pwd)/.npm-cache npm publish --access public --prefer-online --cache-min=0 --registry https://registry.npmjs.org/

Before publishing, verify that README.md, package.json, dist/index.js, and dist/launcher.cjs are in the tarball, and that package.json includes readmeFilename: "README.md" plus a readme field whose value is a string.

License

Omnius is released under CC-BY-NC-4.0 for non-commercial use. Commercial use, redistribution, hosted services, and enterprise deployment require a commercial license.