omnishapeagent
v1.0.12
Local AI agent with chat, tool use, persistent memory, and vision
OmniShapeAgent
OmniShapeAgent is a local-first autonomous agent runtime with a Next.js web app, persistent memory, OLR geometry reasoning, computer-use tools, local model routing, and an optional neural orchestrator. The repository is designed to run on your machine, keep its learned state local, and expose enough operational surface to take the project from development to a launchable local product.
What ships in this tree
- A web chat UI and autonomous agent runtime.
- A tool-executing agent loop with filesystem, terminal, git, browser, vision, messaging, physics, and bot-management tooling.
- A geometry-first long-term memory system backed by the OmniShape Linguistic Resonator.
- A learned memory policy, knowledge graph, user profile, memory lattice, and maintenance cycle.
- An OLR API and visual workbench for rendering and comparing text as algebraic geometry on the unit circle.
- An optional Python neural orchestrator service for user-intent learning and directive injection.
- A minimal CLI that talks to the single shared OmniShapeAgent runtime instead of spawning a second agent instance.
Runtime state is intentionally local and excluded from normal source flow. Memory stores, learned profiles, policy state, OLR resonator state, saved chats, queues, screenshots, and generated artifacts are created in workspace runtime directories.
Main components
App and runtime
- src/app/page.tsx boots the main shell.
- src/components/HomeShell.tsx loads chat immediately and defers heavier panels until idle.
- src/lib/agent.ts is the main runtime loop, tool dispatcher, retrieval coordinator, and memory-feedback engine.
Path and storage layer
- src/lib/paths-core.ts defines canonical workspace paths.
- src/lib/paths-bootstrap.ts creates required directories.
- src/lib/paths-migrations.ts migrates legacy root-level data into data/.
- src/lib/paths.ts remains the compatibility wrapper.
Persistent runtime state lives in:
- data/ for memory, graph, profiles, policies, queues, and logs.
- weights/ for learned weight artifacts.
- saved_chats/ for exported conversations.
- screenshots/ and screenshots/generated/ for captures and rendered images.
- workspace/ for agent-generated work products.
Memory architecture
The memory system is geometry-first. Embeddings still exist, but they are no longer the primary identity or similarity layer.
Memory record model
Each record stored by src/lib/vector-store.ts contains:
- Original content.
- Embedding vector.
- OLR geometry signature.
- Lifecycle counters for injection, acknowledgement, rejection, and stale streaks.
- Lattice metadata for neighborhood, degree, cluster strength, and centrality.
- Consolidation state for support, volatility, abstraction level, and long-term maturation.
The geometry signature generated by src/lib/memory-geometry.ts includes:
- shapeKey as a quantized geometric identity.
- fingerprint, harmonics, and vibration for shape comparison.
- coherence, virtue, entropy, and closure from OLR analysis.
- repetitionScore and repetitionCount for repeated-shape detection.
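As a rough sketch, the record and signature described above could be modeled like this. All field and type names here are illustrative assumptions, not the actual src/lib/vector-store.ts or src/lib/memory-geometry.ts definitions.

```typescript
// Hypothetical shape of a geometry-first memory record. Field names mirror
// the concepts listed above but are assumptions, not the real schema.
interface GeometrySignature {
  shapeKey: string;        // quantized geometric identity
  fingerprint: number[];   // shape-comparison descriptor
  harmonics: number[];
  vibration: number[];
  coherence: number;
  virtue: number;
  entropy: number;
  closure: number;
  repetitionScore: number;
  repetitionCount: number;
}

interface MemoryRecord {
  id: string;
  content: string;               // original text
  embedding: number[];           // kept for compatibility, no longer primary
  geometry: GeometrySignature;   // OLR-derived identity layer
  lifecycle: {
    injections: number;
    acknowledgements: number;
    rejections: number;
    staleStreak: number;
  };
  lattice: {
    neighborhood: string[];
    degree: number;
    clusterStrength: number;
    centrality: number;
  };
  consolidation: {
    support: number;
    volatility: number;
    abstractionLevel: number;
    matured: boolean;
  };
}
```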
Retrieval pipeline
Query-time retrieval works like this:
- The query is embedded for compatibility.
- The same query text is analyzed by OLR to produce a geometry signature.
- Search blends geometry similarity, embedding similarity, and lexical overlap.
- Candidate ranking applies freshness, lifecycle quality, lattice support, repetition, and OLR virtue.
- Maximum marginal relevance removes redundant recalls.
This behavior lives mainly in:
- src/lib/vector-store.ts
- src/lib/memory-policy.ts
- src/app/api/memory/route.ts
- src/lib/agent.ts
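The blend-and-deduplicate pipeline above can be sketched as follows. The weights, the blended-score formula, and the helper names are illustrative assumptions; the actual vector-store implementation is richer.

```typescript
// A retrieval candidate with the three similarity channels described above.
type Candidate = { id: string; text: string; geomSim: number; embSim: number; lexSim: number };

// Blend geometry, embedding, and lexical similarity. Weights are assumed,
// not the real configuration: geometry leads, the other channels refine.
function blendedScore(c: Candidate): number {
  return 0.5 * c.geomSim + 0.3 * c.embSim + 0.2 * c.lexSim;
}

// Maximum marginal relevance: greedily pick high-scoring candidates that are
// not too similar to anything already selected, removing redundant recalls.
function mmr(
  cands: Candidate[],
  sim: (a: Candidate, b: Candidate) => number,
  k: number,
  lambda = 0.7,
): Candidate[] {
  const pool = [...cands];
  const picked: Candidate[] = [];
  while (picked.length < k && pool.length > 0) {
    let bestIdx = 0;
    let bestVal = -Infinity;
    for (let i = 0; i < pool.length; i++) {
      const redundancy = picked.length ? Math.max(...picked.map(p => sim(pool[i], p))) : 0;
      const val = lambda * blendedScore(pool[i]) - (1 - lambda) * redundancy;
      if (val > bestVal) { bestVal = val; bestIdx = i; }
    }
    picked.push(pool.splice(bestIdx, 1)[0]);
  }
  return picked;
}
```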
Why OLR is the main memory substrate
OLR gives the system a stable shape representation for words, phrases, and paragraphs. That simplifies several problems at once:
- Memory similarity becomes geometry comparison instead of only vector distance.
- Repetition detection becomes repeated-shape detection.
- Lattice formation becomes a topology problem rather than just an embedding graph.
- Injection quality can be learned on geometry-aware features.
- Consolidation can cluster by shared shape identity and topological synonymy.
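The "repetition becomes repeated-shape detection" point can be illustrated with a toy quantizer. The real shapeKey derivation in src/lib/memory-geometry.ts is an assumption here; this only shows why a quantized identity makes repetition a lookup rather than a scan.

```typescript
// Quantize a continuous shape descriptor into a coarse key so near-identical
// shapes collide on the same key. Bin count is an illustrative assumption.
function shapeKey(fingerprint: number[], bins = 8): string {
  return fingerprint
    .map(v => Math.min(bins - 1, Math.max(0, Math.floor(v * bins))))
    .join("-");
}

// Two slightly different renderings of the same phrase land on one key,
// so repetition counting becomes a hash-map increment.
const seen = new Map<string, number>();
function observe(fp: number[]): number {
  const key = shapeKey(fp);
  const count = (seen.get(key) ?? 0) + 1;
  seen.set(key, count);
  return count; // repetition count for this shape
}
```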
Injection policy
src/lib/memory-policy.ts learns which memories should actually be injected. It scores:
- Geometry similarity.
- Embedding similarity.
- Base retrieval score.
- Importance.
- Text-hit coverage.
- Acknowledgement and rejection ratios.
- Lattice support and centrality.
- Geometry virtue.
- Repetition score and repetition count.
- Freshness and unacknowledged streaks.
This means the system is not simply doing nearest-neighbor retrieval. It is learning which geometrically relevant memories are useful enough to spend context on.
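One minimal way to realize such a learned gate is an online logistic scorer over the features listed above. This is a sketch under assumed feature names and a plain stochastic-gradient update, not the actual src/lib/memory-policy.ts logic.

```typescript
type Features = Record<string, number>;

// Minimal learned injection gate: a logistic score over retrieval features,
// nudged toward observed acknowledge (1) / reject (0) outcomes.
class InjectionPolicy {
  private w: Record<string, number> = {};
  private bias = 0;

  // Probability that injecting this memory is worth the context budget.
  score(f: Features): number {
    let z = this.bias;
    for (const [k, v] of Object.entries(f)) z += (this.w[k] ?? 0) * v;
    return 1 / (1 + Math.exp(-z));
  }

  // Online logistic-regression step on one observed outcome.
  train(f: Features, label: 0 | 1, lr = 0.1): void {
    const err = label - this.score(f);
    this.bias += lr * err;
    for (const [k, v] of Object.entries(f)) {
      this.w[k] = (this.w[k] ?? 0) + lr * err * v;
    }
  }
}
```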
The current memory stack also models a more human-like maturation path:
- Fresh episodic traces can stay volatile and high-novelty.
- Repeatedly acknowledged traces gain consolidation support.
- Stable, rehearsed memories can be promoted into semantic or procedural knowledge.
- Intrusive or repeatedly rejected recalls are suppressed before they can fixate the agent.
Feedback and maintenance loop
After a turn completes, the agent:
- Infers which injected memories were actually acknowledged using lexical overlap plus geometry similarity.
- Updates lifecycle counters in the vector store.
- Adjusts memory importance.
- Trains the memory policy using the observed outcome.
- Triggers maintenance and lattice rebuilds when enough feedback accumulates.
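The acknowledgement-inference step can be sketched as a simple two-channel check. The thresholds and helper names are illustrative assumptions, not the agent's actual heuristics.

```typescript
// Fraction of the memory's tokens that reappear in the reply.
function lexicalOverlap(a: string, b: string): number {
  const toks = (s: string) => new Set(s.toLowerCase().split(/\W+/).filter(Boolean));
  const ta = toks(a);
  const tb = toks(b);
  if (ta.size === 0) return 0;
  let hits = 0;
  for (const t of ta) if (tb.has(t)) hits++;
  return hits / ta.size;
}

// A memory counts as acknowledged if the reply overlaps it lexically
// or its OLR geometry is close enough to the reply's. Thresholds assumed.
function wasAcknowledged(memory: string, reply: string, geomSim: number): boolean {
  return lexicalOverlap(memory, reply) >= 0.4 || geomSim >= 0.85;
}
```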
The maintenance surface supports:
- Pruning decayed memories.
- Forgetting stale, repeatedly unacknowledged memories.
- Rebuilding the memory lattice.
- Resetting or inspecting learned policy state.
Consolidation
src/lib/memory-consolidator.ts consolidates clusters of similar memories into higher-value synthesized memories. Consolidation is now geometry-first, so clusters can form around shared shape identity and topological similarity rather than only embedding cosine. Consolidated memories preserve OLR context such as scripts, audits, and shape keys in the synthesized summary.
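The geometry-first clustering idea can be shown with a toy pass that groups records by shared shape identity. The real consolidator uses richer topological similarity; this record type, the cluster threshold, and the summary format are assumptions.

```typescript
type Rec = { shapeKey: string; content: string; support: number };

// Group records by shared shape identity and fold each large-enough cluster
// into one synthesized memory carrying the combined support.
function consolidate(records: Rec[], minCluster = 2): Rec[] {
  const clusters = new Map<string, Rec[]>();
  for (const r of records) {
    const members = clusters.get(r.shapeKey) ?? [];
    members.push(r);
    clusters.set(r.shapeKey, members);
  }
  const out: Rec[] = [];
  for (const [key, members] of clusters) {
    if (members.length < minCluster) {
      out.push(...members); // too small to consolidate, keep as-is
      continue;
    }
    out.push({
      shapeKey: key,
      content: `[consolidated] ${members.map(m => m.content).join(" | ")}`,
      support: members.reduce((s, m) => s + m.support, 0),
    });
  }
  return out;
}
```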
Supporting memory layers
- src/lib/knowledge-graph.ts stores entity and relation memory.
- src/lib/user-profile.ts stores learned user facts and goals.
- src/lib/meta-learner.ts stores agent-level outcome patterns.
- src/components/MemoryPanel.tsx exposes stats, policy summary, maintenance, and lattice rebuild controls.
- src/app/api/memory/route.ts exposes search, stats, maintenance, lattice, clear, ack, reject, and policy reset operations.
OLR system
The OmniShape Linguistic Resonator in src/lib/olr.ts treats text as geometry on a unit circle.
What OLR computes
- Grapheme segmentation and script detection.
- Glyph placement on a circle.
- Traversal paths through the glyph field.
- Radial vibration bins.
- Harmonic descriptors.
- A geometric fingerprint.
- Structural metrics such as smoothness, symmetry, regular polygon tendency, spiral tendency, coherence, entropy, closure, collision ratio, and virtue.
- Cross-text topological similarity.
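A toy version of the unit-circle placement and traversal can make the idea concrete. The angle mapping, slot count, and the path-length descriptor here are illustrative assumptions; the real src/lib/olr.ts analysis is far richer.

```typescript
// Place each grapheme of the text on the unit circle at a quantized angle
// derived from its code point. Purely illustrative hashing.
function placeOnCircle(text: string): Array<{ x: number; y: number }> {
  return [...text].map(ch => {
    const angle = (ch.codePointAt(0)! % 64) * (2 * Math.PI / 64);
    return { x: Math.cos(angle), y: Math.sin(angle) };
  });
}

// Length of the traversal path through the glyph field: one crude structural
// descriptor (repeated glyphs collapse to a point, so the path shortens).
function traversalLength(points: Array<{ x: number; y: number }>): number {
  let len = 0;
  for (let i = 1; i < points.length; i++) {
    len += Math.hypot(points[i].x - points[i - 1].x, points[i].y - points[i - 1].y);
  }
  return len;
}
```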
OLR features in this repo
- src/app/api/olr/route.ts exposes analyze, compare, stats, gate updates, and reset operations.
- src/app/olr/page.tsx and src/components/OLRWorkbench.tsx provide a visual workbench.
- scripts/olr_render.py renders high-fidelity mandalas through matplotlib when Python is available.
- The fallback renderer emits SVG when Python or matplotlib is unavailable.
Agent features
Tooling
The agent loop in src/lib/agent.ts includes tooling for:
- Filesystem editing and search.
- Shell and terminal control.
- Git inspection and mutation.
- Browser and network access.
- Vision, screenshots, and pixel analysis.
- Physics simulation and window orchestration.
- Voice history and voice memory.
- Weight registry maintenance.
- Hall-of-fame, bot, and model utilities.
- Memory maintenance and OLR analysis from inside the agent loop.
Physics and windows
- src/lib/physics-types.ts centralizes the physics command contract.
- src/lib/physics-state-store.ts supports waiting for fresh physics state after commands.
- src/components/WindowManager.tsx and src/components/PhysicsSimulator.tsx share the same typed physics protocol.
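One plausible shape for "waiting for fresh physics state after commands" is a tick-stamped store: callers record the tick at which they issued a command and await any state newer than it. The class and method names are assumptions, not the actual src/lib/physics-state-store.ts API.

```typescript
// Tick-stamped state store: publish() bumps a monotonic tick, and
// waitForFresh(after) resolves only with state newer than that tick.
class PhysicsStateStore<S> {
  private tick = 0;
  private state: S | null = null;
  private waiters: Array<{ after: number; resolve: (s: S) => void }> = [];

  publish(state: S): void {
    this.tick++;
    this.state = state;
    // Wake every waiter whose cutoff tick has been passed.
    this.waiters = this.waiters.filter(w => {
      if (this.tick > w.after) {
        w.resolve(state);
        return false;
      }
      return true;
    });
  }

  currentTick(): number {
    return this.tick;
  }

  waitForFresh(after: number): Promise<S> {
    if (this.tick > after && this.state !== null) return Promise.resolve(this.state);
    return new Promise(resolve => this.waiters.push({ after, resolve }));
  }
}
```

A caller would snapshot currentTick() before sending a physics command, then await waitForFresh(snapshot) so it never acts on stale simulation state.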
Voice and vision integration
- src/lib/tools/voice-tools.ts stores voice interactions into the same geometry-aware memory system.
- Vision tooling can persist observations into long-term memory.
Neural orchestrator
The optional Python service under orchestrator/ provides:
- User observation ingestion.
- Directive generation.
- Online training and checkpointing.
- A lightweight FastAPI service.
It is not required to run the web app, but it is part of the shipped architecture.
npm package usage
The package is designed to support both local development and npm-installed runtime usage.
Warning:
- OmniShapeAgent gives the running agent broad access to your local system.
- It can execute terminal commands, read and write files, inspect network resources, and operate enabled integrations.
- Only install and run it if you trust the package and explicitly want to grant that access.
- The CLI will ask for confirmation before first runtime startup unless you pass --yes.
Quick start
Install from npm and start the packaged runtime:

npm install -g omnishapeagent
omnishapeagent serve --yes

Or run it without a global install:

npx omnishapeagent serve --yes

Open http://localhost:3000 unless you selected a different port.

The CLI serve command will:
- Create the local runtime directories it needs.
- Reuse an existing production build when available.
- Run next build automatically on first launch, or when you pass --rebuild.
- Start the shared runtime on a configurable host and port.
Useful variants:
omnishapeagent serve --port 4123 --yes
omnishapeagent serve --host 127.0.0.1 --port 3000 --yes
omnishapeagent serve --dev --yes
omnishapeagent serve --rebuild --yes

CLI commands such as omnishapeagent chat and omnishapeagent status target http://127.0.0.1:3000 by default. Override that with any of:
- --server http://127.0.0.1:4123
- OMNISHAPEAGENT_URL=http://127.0.0.1:4123
- OMNISHAPEAGENT_PORT=4123
Production launch
If you are working from a source checkout instead of the published npm package, build and start the app with:
npm run build
npm run start

Equivalent packaged runtime command:

npm run serve

The default dev script uses webpack for stability:

npm run dev

If you explicitly want Turbopack dev mode:

npm run dev:turbo

Optional orchestrator launch
cd orchestrator
pip install -r requirements.txt
python launcher.py

Launch checklist
- Install Node.js 20 or newer.
- Run npm install.
- Set any model and integration environment variables you need.
- Run npm run build and confirm a clean compile.
- Start the app with npm run start.
- If using the orchestrator, start orchestrator/service.py or orchestrator/launcher.py.
- If using Python OLR rendering, ensure the local Python environment has matplotlib available.
Environment and integrations
Common optional environment variables include:
- OLLAMA_URL for local Ollama routing.
- OPENROUTER_API_KEY for OpenRouter-compatible models.
- DISCORD_BOT_TOKEN and DISCORD_APPLICATION_ID for Discord integration.
- TELEGRAM_BOT_TOKEN for Telegram bot control.
- PORT or OMNISHAPEAGENT_PORT to choose the runtime port.
- OMNISHAPEAGENT_HOST to choose the bind host for omnishapeagent serve.
- OMNISHAPEAGENT_URL to tell the CLI where the shared runtime is listening.
- TELEGRAM_CHAT_ID for the single authorized Telegram control chat.
- TELEGRAM_TRANSPORT as polling or webhook.
- TELEGRAM_WEBHOOK_URL when webhook mode is active.
- Telegram, email, and other provider credentials depending on which tools you enable.
Telegram setup
Telegram now supports a full shared-runtime bootstrap flow. The goal is to keep one persistent OmniShapeAgent instance in charge of the ecosystem, while Telegram acts as another control surface into that same instance.
Shared-runtime behavior
- Telegram does not launch a second agent runtime.
- The web runtime remains the source of truth for memory, tools, and state.
- The CLI is only a thin client that talks to that runtime.
- Telegram can run through polling or webhook mode.
- If no chat ID is configured yet, setup can leave the runtime in capture mode so the first private message to the bot becomes the authorized control chat.
Setup from the running app
Use the agent tool directly:
telegram_setup("BOT_TOKEN", "polling")
telegram_setup("BOT_TOKEN", "webhook", "https://your-domain.com")

Or call the dedicated setup endpoint:
curl -X POST http://127.0.0.1:3000/api/telegram/setup \
-H "Content-Type: application/json" \
-d '{"token":"BOT_TOKEN","mode":"polling"}'

If the authorized chat is not yet known, send /start to the bot from the Telegram chat you want OmniShapeAgent to own. The runtime will capture that chat ID and lock future control to that chat.
Webhook mode
For webhook mode, provide a public domain that resolves to the running app:
telegram_setup("BOT_TOKEN", "webhook", "https://your-domain.com")

The runtime will register https://your-domain.com/api/telegram as the Telegram webhook and persist the shared configuration in .env.local.
CLI
The package now ships a deliberately small CLI. It does not host a separate runtime. It only talks to the already-running OmniShapeAgent server.
Examples:
omnishapeagent status
omnishapeagent chat "diagnose the system"
omnishapeagent telegram setup --token BOT_TOKEN --mode polling
omnishapeagent telegram setup --token BOT_TOKEN --mode webhook --domain https://your-domain.com

You can point the CLI at another runtime with --server http://host:port.
See TAILSCALE.md for remote-access guidance.
Validation
Useful commands before release:
npm run build
npm run lint
npm pack --dry-run
npm audit --omit=dev

Notes
- src/app/api/discord/route.ts reports Discord bot status and invite configuration.
- src/app/api/weights/route.ts supports cleanup, update, delete, and import operations for learned weight entries.
- src/app/api/memory/route.ts is the operational surface for the geometry-first memory layer.
This tree is no longer just a prototype chat frontend. It now includes a documented launch surface, geometry-native memory, OLR analysis and rendering, policy learning, maintenance tooling, and operational APIs that match the current implementation.
