kairolite

v0.1.3

Published

20 days ago

Lightweight local AI coding agent CLI — an autonomous terminal agent running Qwen2.5-Coder 3B fully offline on CPU (Linux + Windows, no GPU). OpenAI-compatible, no API keys.

kairolite

⚠️ First install downloads a ~2 GB AI model (Qwen2.5-Coder-3B) into the package. Runs on CPU — Linux or Windows x64, no GPU required.

A lightweight, CPU-only local coding-agent CLI — the cross-platform little sibling of agentkairo (the GPU/CUDA build). Same agent, same tools, no GPU required.

It runs Qwen/Qwen2.5-Coder-3B-Instruct-GGUF:Q4_K_M (Apache-2.0) through the bundled tinyq4 engine. The ~2 GB model is downloaded once at install time.

Requirements

Linux or Windows on x86-64 (macOS isn't supported yet).
~3 GB of free RAM for the 3B model. No GPU needed.
Python 3 on your PATH.

The engine runs on any 64-bit CPU. If your CPU has AVX2 + FMA (most CPUs since ~2013) it automatically uses a faster code path; otherwise it falls back to a portable scalar path — it never crashes on older hardware.

Got an NVIDIA GPU? Run kairolite gpu — if it finds a capable card it'll point you to the GPU build (agentkairo), which is ~20× faster.

Install

npm install -g kairolite

npm install downloads the ~2 GB model once. If the download is interrupted, the install still succeeds — it's fetched on first run, or on demand with kairolite model.

Run

cd /path/to/project
kairolite

Running kairolite auto-starts the local engine, waits for it to load, and drops you into the agent. The server is shut down when you quit, freeing memory.

A note on speed: kairolite runs on CPU, so the first response waits while the prompt is read (you'll see a live "reading prompt" progress indicator — it's working, not frozen). Generation speed depends on your CPU; more cores + AVX2 = faster. For heavy use on an NVIDIA machine, prefer the GPU build (/gpu).

One-shot / scripting

The model's answer goes to stdout; all status chrome goes to stderr, so redirects stay clean:

kairolite -p "summarize what this project does"
echo "explain src/main.py" | kairolite -p -
kairolite -p "add a docstring to utils.py" --yolo > result.txt

Without --yolo, write/bash tools are auto-declined in one-shot mode, so -p is read-only by default.

Commands

kairolite               # Start the agent (auto-starts the local engine)
kairolite -p "<text>"   # Run one prompt non-interactively, then exit
kairolite model         # Download the model on demand, or show its status
kairolite gpu           # Scan for an NVIDIA GPU and link the faster GPU build
kairolite stop          # Stop the local engine and free memory
kairolite serve         # Run only the engine in the foreground
kairolite doctor        # Show setup status and local server detection
kairolite --url <url>   # Use an already-running OpenAI-compatible server
kairolite --yolo        # Run write/bash tools without confirmation

Inside Kairo:

/yolo — toggle write/bash confirmations.
/tools — list available tools.
/model — show model status / re-download if needed.
/gpu — scan for an NVIDIA GPU and link the faster GPU build.
/save [file] — save the conversation to a markdown transcript.
/cwd — show the workspace directory.
/stats — toggle the per-reply token/speed line.
/clear — reset the chat.
/help — list commands.
/exit — quit.

⚠️ Disclaimer

Kairo is an autonomous coding agent: it can run shell commands, and read, write, and overwrite files in your workspace. It is experimental software, provided "AS IS" with no warranty. It may execute commands you did not intend, make confident-but-wrong assumptions, or proceed after ambiguous input — especially with --yolo, which disables all confirmations.

You are responsible for everything it does on your system. By using it you accept the full terms in DISCLAIMER.md. You pressed enter; the machine listened.

License

Apache-2.0. The bundled model is licensed separately — see MODEL_LICENSES.md.

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

kairolite

Requirements

Install

Run

One-shot / scripting

Commands

⚠️ Disclaimer

License