nova-agent
v1.3.0
Personal AI Agent with Browser Automation
What is NOVA?
NOVA is an open-source personal AI agent that goes beyond chat. It remembers you, browses the web, controls your desktop, manages your goals, writes and executes code, sends emails, schedules tasks, and improves itself — all from a single interface.
It runs as a web app, a CLI, or a desktop app (via Tauri), and can also connect to WhatsApp, Discord, Telegram, and Slack.
You: "Research the top 5 AI startups, compare their funding, and save a report"
NOVA: [plans 4 sub-tasks] → [web_search x5] → [deep_research] → [file_write]
→ Saved ai-startups-report.md (2,847 words, 5 companies analyzed)
Why NOVA?
| Feature | ChatGPT | OpenClaw | Manus | NOVA |
|---------|---------|----------|-------|------|
| Multi-model (GPT-4o, Claude, Groq, Ollama) | Single | Single | Single | 5+ providers |
| Persistent memory across sessions | Limited | No | No | Full vector memory |
| Browser automation (Playwright) | No | Basic | Yes | 16 browser tools |
| Desktop control (mouse, keyboard, screenshots) | No | No | Partial | Full PC control |
| Self-improvement loop | No | No | No | Autonomous fix cycle |
| Multi-agent orchestration (Arena) | No | No | Basic | 6-phase debate system |
| CLI + Web + Desktop + Mobile channels | Web only | CLI only | Web only | 7 channels |
| Goal tracking with deadlines | No | No | No | Full goal system |
| Scheduled tasks (cron) | No | No | No | Built-in cron |
| Custom skill generation | No | No | No | LLM generates new skills |
| Voice I/O (TTS + STT) | Voice chat | No | No | edge-tts + Whisper |
| WhatsApp / Discord / Telegram | No | No | No | Native integrations |
| Deep research with sub-tasks | Limited | No | Yes | Parallel research engine |
| Proactive suggestions | No | No | No | Proactive engine |
| Task decomposition & planning | No | No | Basic | AI task planner |
| Webhooks & API | API | No | API | Webhooks + full API |
| Open source | No | Yes | No | 100% open source |
Features
Core Intelligence
- Multi-Agent Arena — 6-phase orchestration: Triage, Assemble, Explore, Debate, Verify, Deliver. Six specialist agents (planner, researcher, writer, coder, creative, critic) collaborate or debate depending on complexity.
- Intent Classification — Hybrid regex + LLM classifier. Simple messages get 0ms regex routing; ambiguous messages go through an AI classifier for accurate tool selection.
- Task Planning — Automatically decomposes complex requests into ordered sub-tasks with dependencies.
- Deep Thinking — 5-phase cognitive pipeline for complex reasoning.
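The hybrid routing described under Intent Classification can be pictured roughly as follows. This is a minimal sketch, not NOVA's actual classifier: the `Intent` values, the rules in `FAST_RULES`, and the function names `matchFastPath` and `classifyIntent` are all hypothetical, and the LLM step is injected as a plain function so no particular provider API is assumed.

```typescript
// Hypothetical sketch: none of these names come from NOVA's codebase.
type Intent = "datetime" | "calculator" | "web_search";

interface FastRule {
  pattern: RegExp;
  intent: Intent;
}

// Illustrative rules only; a real rule set would be much larger.
const FAST_RULES: FastRule[] = [
  { pattern: /\bwhat time is it\b/i, intent: "datetime" },
  { pattern: /^\s*[\d\s+\-*\/().]+\s*$/, intent: "calculator" },
  { pattern: /^\s*search (for )?/i, intent: "web_search" },
];

// Fast path: returns an intent when a regex matches, otherwise null.
function matchFastPath(message: string): Intent | null {
  for (const rule of FAST_RULES) {
    if (rule.pattern.test(message)) return rule.intent;
  }
  return null;
}

// Slow path: any LLM-backed classifier can be plugged in here.
type LlmClassifier = (message: string) => Promise<Intent>;

// Try the regex rules first and only call the LLM for ambiguous messages.
async function classifyIntent(
  message: string,
  llm: LlmClassifier,
): Promise<Intent> {
  const fast = matchFastPath(message);
  return fast ?? llm(message);
}
```

The point of the split is that unambiguous messages never pay for a model call; only messages that match no rule reach the injected classifier.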
Memory & Learning
- Persistent Memory — Vector-based memory with embeddings. Remembers your name, preferences, past conversations.
- Workspace Files — SOUL.md (personality), USER.md (user profile), TOOLS.md (tool docs) — all editable.
- Continuous Learning — Learns from conversations, tracks tool failures, adapts over time.
- Proactive Engine — Checks goal deadlines, detects failures, suggests actions before you ask.
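To make the idea of vector-based memory concrete, here is a small sketch of similarity search over stored embeddings. It assumes cosine similarity and an in-memory array; the `MemoryEntry` shape and the `searchMemory` function are illustrative names, not NOVA's real storage layer, and the embeddings themselves would come from whichever embedding model is configured.

```typescript
// Hypothetical sketch: a toy in-memory vector store, not NOVA's implementation.
interface MemoryEntry {
  text: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the topK entries most similar to the query embedding.
function searchMemory(
  query: number[],
  entries: MemoryEntry[],
  topK: number,
): MemoryEntry[] {
  return entries
    .map((entry) => ({ entry, score: cosineSimilarity(query, entry.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((scored) => scored.entry);
}
```

A linear scan like this is fine for a few thousand memories; a larger store would normally use an approximate nearest-neighbor index instead.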
Tools (120+)
- Web — web_search, web_read_page, browser_navigate, browser_click, browser_type, browser_extract, browser_screenshot, and 9 more browser tools
- Memory — memory_save, memory_search, memory_delete
- Goals — create_goal, list_goals, update_goal, delete_goal
- Files — file_read, file_write, file_list, file_delete
- Shell — shell_execute (sandboxed command execution)
- Desktop — desktop_screenshot, desktop_click, desktop_type, desktop_open_app
- Voice — speak_text (TTS via edge-tts), listen (STT via Whisper)
- Research — deep_research (parallel multi-source research with synthesis)
- Schedule — cron_create, cron_list, cron_delete, cron_toggle
- Email — send_email, read_inbox, search_email, draft_email, reply_email, forward_email
- Calendar — calendar_list, calendar_create, calendar_update, calendar_delete, calendar_availability, calendar_upcoming
- Skills — skill_generate, skill_list, skill_run, skill_toggle, skill_delete, skill_export
- Utility — calculator, datetime, weather, translate, think
- Meta — self_improve, analyze_performance, suggest_improvement, and policy tools
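One way to think about a tool list like this is as a registry that maps tool names to handlers. The sketch below is an assumption about the general pattern, not a description of NOVA's tool registry: `ToolRegistry`, `ToolHandler`, and the toy `calculator` handler are hypothetical, and handlers are kept synchronous for brevity even though real tools such as browser or email actions would be asynchronous.

```typescript
// Hypothetical sketch of a name-to-handler tool registry.
type ToolInput = Record<string, unknown>;
type ToolHandler = (input: ToolInput) => unknown;

class ToolRegistry {
  private readonly tools = new Map<string, ToolHandler>();

  // Register a handler under a unique tool name.
  register(name: string, handler: ToolHandler): void {
    if (this.tools.has(name)) {
      throw new Error(`Tool already registered: ${name}`);
    }
    this.tools.set(name, handler);
  }

  // Look up a tool by name and run it with the given input.
  execute(name: string, input: ToolInput): unknown {
    const handler = this.tools.get(name);
    if (!handler) throw new Error(`Unknown tool: ${name}`);
    return handler(input);
  }

  // Sorted list of registered tool names.
  list(): string[] {
    return Array.from(this.tools.keys()).sort();
  }
}

// Toy handler: adds two numbers. The real calculator tool is not shown here.
const registry = new ToolRegistry();
registry.register("calculator", (input) => {
  return Number(input["a"]) + Number(input["b"]);
});
```

With this shape, `registry.execute("calculator", { a: 2, b: 3 })` dispatches by name, which mirrors the `tool` and `input` fields used by the direct tool execution request in the API section.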
Channels
- Web UI — Next.js chat interface at localhost:3000
- CLI — nova "your question" (one-shot), nova (REPL), pipe mode
- WhatsApp — nova whatsapp → QR code scan → auto-responds via Baileys
- Discord — Full bot integration with discord.js
- Telegram — Bot integration with grammy
- Slack — Workspace app integration
- Tauri Desktop — Native desktop app (Windows, macOS, Linux)
Self-Improvement
NOVA includes a recursive self-improvement loop powered by Claude Code:
Test → Fix → Pass → Add New Tests → Fix → Pass → Repeat
It runs smoke tests, identifies failures, spawns Claude Code to fix them, verifies the build, and commits, all autonomously. Safety guards revert changes if tests regress.
Quick Start
Option 1: npm (recommended)
npm install -g nova-agent
nova
Option 2: From source
git clone https://github.com/realisnice1-star/nova-im.git
cd nova-im
npm install
Create .env.local with at least one LLM provider:
OPENAI_API_KEY=sk-...
# or
ANTHROPIC_API_KEY=sk-ant-...
# or
GROQ_API_KEY=gsk_...
# or any combination — NOVA auto-routes to the best available model
Start:
npm run dev # Web UI at http://localhost:3000
# or
nova # CLI mode (interactive REPL)
# or
nova "what time is it in Tokyo?" # One-shot mode
Option 3: Desktop app (Tauri)
npm run tauri:build
Architecture
              ┌─────────────────────────────────────┐
              │            User Message             │
              └──────────────┬──────────────────────┘
                             │
              ┌──────────────▼──────────────────────┐
              │          Intent Detection           │
              │ (Regex fast-path + LLM classifier)  │
              └──────────────┬──────────────────────┘
                             │
         ┌───────────────────┼───────────────────┐
         │                   │                   │
┌────────▼───────┐  ┌────────▼───────┐  ┌────────▼───────┐
│   Tool Flow    │  │ Arena (Multi-  │  │  Task Planner  │
│  (Direct tool  │  │ Agent Debate)  │  │  (Decompose &  │
│   execution)   │  │ 6 specialists  │  │   schedule)    │
└────────┬───────┘  └────────┬───────┘  └────────┬───────┘
         │                   │                   │
         └───────────────────┼───────────────────┘
                             │
              ┌──────────────▼──────────────────────┐
              │             LLM Router              │
              │    (OpenAI / Anthropic / Groq /     │
              │        OpenRouter / Ollama)         │
              └──────────────┬──────────────────────┘
                             │
              ┌──────────────▼──────────────────────┐
              │        Tool Execution Layer         │
              │    120+ tools: browser, desktop,    │
              │  memory, files, email, calendar...  │
              └──────────────┬──────────────────────┘
                             │
         ┌───────────────────┼───────────────────┐
         │                   │                   │
┌────────▼───────┐  ┌────────▼───────┐  ┌────────▼───────┐
│     Memory     │  │    Learning    │  │   Proactive    │
│  (Persistent   │  │    (Outcome    │  │     Engine     │
│  + workspace)  │  │   tracking)    │  │  (Goal watch)  │
└────────────────┘  └────────────────┘  └────────────────┘
Model Routing
NOVA automatically selects the best available model based on your API keys:
| Tier | Purpose | Models |
|------|---------|--------|
| fast | Intent classification, planning, extraction | GPT-4o-mini, Haiku, Groq Llama |
| primary | Main conversations and tool use | GPT-4o, Claude Sonnet |
| premium | Complex reasoning and debate | GPT-4o, Claude Opus |
Configure in ~/.nova/config.json or via environment variables. If one provider is down, NOVA automatically falls back to the next available one.
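The tier selection and fallback behavior might look something like the sketch below. It is an illustration under stated assumptions, not NOVA's router: the preference order inside each tier, the `selectModel` and `candidatesFor` names, and the idea of passing in a set of providers known to be down are all hypothetical. Model labels are copied from the tier table and are not real API model identifiers.

```typescript
// Hypothetical sketch of key-aware model selection with fallback.
type Tier = "fast" | "primary" | "premium";
type Provider = "openai" | "anthropic" | "groq";

interface ModelChoice {
  provider: Provider;
  model: string; // display label from the tier table, not an API model ID
}

// Assumed preference order per tier (first entry is tried first).
const TIER_PREFERENCES: Record<Tier, ModelChoice[]> = {
  fast: [
    { provider: "openai", model: "GPT-4o-mini" },
    { provider: "anthropic", model: "Haiku" },
    { provider: "groq", model: "Groq Llama" },
  ],
  primary: [
    { provider: "openai", model: "GPT-4o" },
    { provider: "anthropic", model: "Claude Sonnet" },
  ],
  premium: [
    { provider: "openai", model: "GPT-4o" },
    { provider: "anthropic", model: "Claude Opus" },
  ],
};

// Environment variable that enables each provider.
const KEY_FOR_PROVIDER: Record<Provider, string> = {
  openai: "OPENAI_API_KEY",
  anthropic: "ANTHROPIC_API_KEY",
  groq: "GROQ_API_KEY",
};

// Candidates for a tier, keeping only providers that have a key and are up.
function candidatesFor(
  tier: Tier,
  env: Record<string, string | undefined>,
  down: Set<Provider> = new Set<Provider>(),
): ModelChoice[] {
  return TIER_PREFERENCES[tier].filter(
    (choice) =>
      Boolean(env[KEY_FOR_PROVIDER[choice.provider]]) &&
      !down.has(choice.provider),
  );
}

// First usable candidate, or null when no provider is available.
function selectModel(
  tier: Tier,
  env: Record<string, string | undefined>,
  down: Set<Provider> = new Set<Provider>(),
): ModelChoice | null {
  const candidates = candidatesFor(tier, env, down);
  return candidates.length > 0 ? candidates[0] : null;
}
```

Passing the environment in as a parameter, rather than reading it globally, keeps the selection logic easy to test with different key combinations.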
CLI Usage
# Interactive REPL
nova
# One-shot
nova "search for the latest AI news and summarize it"
# Pipe mode
cat error.log | nova "explain this error"
# With options
nova -m gpt-4o "write a haiku about coding"
nova -a researcher "deep dive into quantum computing"
nova -t premium "review this architecture" -c design.md
# WhatsApp mode
nova whatsapp
# System health check
npm run doctor
Smoke Tests
NOVA includes 21 integration tests that verify every major feature:
npm run dev # Start the server first
npm run test:smoke # Run all 21 tests
NOVA Smoke Test Suite
Server: http://localhost:3000
Tests: 21
Health PASS (9ms)
Chat basic PASS (27ms)
Web search PASS → web_search
Calculator PASS → calculator
Goal create PASS → create_goal
Goal list PASS → list_goals
Memory save PASS → memory_save
Memory recall PASS
DateTime PASS → datetime
Cron create PASS → cron_create
Skill generate PASS → skill_generate
Deep research PASS → deep_research
Shell execute PASS → shell_execute
Weather PASS → weather
Translate PASS → translate
Intent classifier PASS → web_search
Browser navigate PASS → browser_navigate
Task planner PASS → deep_research
Desktop screenshot PASS
Desktop open app PASS → desktop_open_app
Voice speak PASS
Results: 21/21 passed
All tests passed!
Self-Improvement
# Standard: test → auto-fix → commit
npm run self-improve
# Full cycle: test → fix → add new tests → fix → repeat
npm run self-improve:full
# Single round
npm run self-improve:once
The self-improvement loop:
- Builds the project
- Runs all smoke tests
- For each failure: spawns Claude Code CLI to analyze and fix
- Rebuilds and re-tests
- If tests improved → commits. If worse → reverts.
- Repeats until all passing or max rounds reached.
Safety guards: build failures revert all changes, test regressions revert, max 5 rounds, scoped fixes only.
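The control flow of the loop and its safety guards can be sketched as below. This is a simplified model under assumptions: the `Workspace` interface and `selfImprove` function are hypothetical, the operations are synchronous stand-ins for what would really be builds, test runs, and Claude Code invocations, and reverting when the failure count is unchanged is a guess, since only the improved and worse cases are spelled out above.

```typescript
// Hypothetical sketch of the test → fix → verify → commit/revert cycle.
interface Workspace {
  build(): boolean; // true when the build succeeds
  countFailures(): number; // runs the smoke tests, returns failing count
  attemptFix(): void; // stand-in for spawning a fixer on the failures
  commit(): void;
  revert(): void;
}

interface LoopResult {
  rounds: number;
  failures: number;
}

function selfImprove(ws: Workspace, maxRounds = 5): LoopResult {
  let failures = ws.countFailures();
  let rounds = 0;

  while (failures > 0 && rounds < maxRounds) {
    rounds++;
    ws.attemptFix();

    // Guard 1: a broken build discards the attempted fix.
    if (!ws.build()) {
      ws.revert();
      continue;
    }

    // Guard 2: only keep changes that reduce the number of failures.
    const after = ws.countFailures();
    if (after < failures) {
      ws.commit();
      failures = after;
    } else {
      ws.revert(); // assumption: "no change" is treated like a regression
    }
  }

  return { rounds, failures };
}
```

Because every round either commits a strict improvement or reverts, the recorded failure count never increases, and the round cap bounds the total work even when no fix succeeds.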
Project Structure
src/
├── app/api/ # Next.js API routes
│ ├── agentic-step/ # Main LLM endpoint (1,400+ lines)
│ ├── execute-step/ # Tool execution (120+ tools)
│ └── chat/ # Chat API wrapper
├── core/ # Core engine (80+ modules)
│ ├── init.ts # Boot sequence (27-step initialization)
│ ├── agent/ # Tool registry, agent loop, budgets
│ ├── orchestrator/ # Arena (6-phase multi-agent debate)
│ ├── models/ # LLM abstraction (5 providers)
│ ├── memory/ # Vector memory, workspace files
│ ├── browser/ # Playwright automation (16 tools)
│ ├── desktop/ # PC control (screenshot, click, type)
│ ├── voice/ # TTS (edge-tts) + STT (Whisper)
│ ├── channels/ # WhatsApp, Discord, Telegram, Slack
│ ├── skills/ # Skill generator + registry
│ ├── research/ # Deep research engine
│ ├── planning/ # Task decomposition
│ ├── learning/ # Continuous learning pipeline
│ ├── proactive/ # Proactive suggestion engine
│ ├── cron/ # Scheduled tasks
│ ├── email/ # Email (IMAP + SMTP)
│ ├── calendar/ # CalDAV calendar integration
│ ├── mind/ # Self-improvement policies
│ └── recovery/ # Watchdog + graceful shutdown
├── components/ # React UI
├── cli/ # CLI entry point + modes
└── lib/ # Shared utilities
Configuration
Environment Variables
# LLM Providers (at least one required)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GROQ_API_KEY=gsk_...
OPENROUTER_API_KEY=sk-or-...
OLLAMA_BASE_URL=http://localhost:11434
# Optional services
SUPABASE_URL=https://...
SUPABASE_ANON_KEY=eyJ...
STRIPE_SECRET_KEY=sk_...
# Channel tokens
DISCORD_BOT_TOKEN=...
TELEGRAM_BOT_TOKEN=...
SLACK_BOT_TOKEN=xoxb-...
User Config (~/.nova/config.json)
{
  "model": "gpt-4o",
  "tier": "primary",
  "memory": true,
  "skills": true,
  "channels": {
    "whatsapp": { "enabled": true },
    "discord": { "enabled": true, "token": "..." }
  }
}
API
All tools are accessible via the REST API:
# Chat (agentic, with tools)
curl -X POST http://localhost:3000/api/agentic-step \
-H 'Content-Type: application/json' \
-d '{
"messages": [{"role": "user", "content": "Search for AI news"}],
"stream": false,
"mode": "cloud"
}'
# Direct tool execution
curl -X POST http://localhost:3000/api/execute-step \
-H 'Content-Type: application/json' \
-d '{"tool": "web_search", "input": {"query": "AI news 2026"}}'
# Health check
curl http://localhost:3000/api/health
Tech Stack
| Layer | Technology |
|-------|-----------|
| Framework | Next.js 16, React 19 |
| Language | TypeScript 5 (strict mode) |
| Styling | Tailwind CSS 4 |
| LLMs | OpenAI, Anthropic, Groq, OpenRouter, Ollama |
| Browser | Playwright |
| Desktop | Tauri 2 |
| Database | Supabase (cloud), local JSON (offline) |
| Testing | Vitest + custom smoke tests |
| Voice | edge-tts (TTS), OpenAI Whisper (STT) |
| Messaging | Baileys (WhatsApp), discord.js, grammy, Slack SDK |
| Email | IMAP (imapflow) + SMTP (nodemailer) |
| Calendar | CalDAV (tsdav) |
Contributing
git clone https://github.com/realisnice1-star/nova-im.git
cd nova-im
npm install
npm run dev # Start dev server
npm run test # Run unit tests
npm run test:smoke # Run integration tests (server must be running)
npm run build # Verify build passes
Read CLAUDE.md for full architecture documentation, coding conventions, and how to add new tools or skills.
License
MIT
