@naelstudio/nlpilot
v0.0.2
A CLI clone of GitHub Copilot CLI powered by the Vercel AI SDK
nlpilot CLI
A GitHub Copilot-style AI coding assistant for the terminal, powered by the Vercel AI SDK. Supports OpenAI, Anthropic, Google, and any custom endpoint (including Azure AI Foundry).
Requirements
- Bun v1.x or later
Installation
# Clone and install dependencies
git clone <repo>
cd nlpilot-cli
bun install
# Run from source
bun run src/index.ts
# Or build a single-file binary and link it globally
bun run build
bun link
After linking, the nlpilot command is available system-wide.
Authentication
Standard providers (OpenAI / Anthropic / Google)
nlpilot login
The interactive wizard will ask you to:
- Select a provider (openai, anthropic, or google)
- Enter your API key (hidden input)
- Optionally enter a custom endpoint base URL (see Azure Foundry below)
- Pick a default model from the curated catalog, or enter a custom model ID
Credentials are saved to ~/.nlpilot/credentials with 0600 permissions.
Azure AI Foundry (custom endpoint)
When prompted "Use custom endpoint?", choose Yes and enter your Foundry endpoint, e.g.:
https://<your-instance>.services.ai.azure.com/anthropic/v1
The API key and base URL are stored together in the credentials file. When a baseUrl is present the @ai-sdk/anthropic provider is used directly (bypassing the AI Gateway).
Environment variable overrides
| Variable | Description |
|---|---|
| NLPILOT_API_KEY | Override stored API key |
| NLPILOT_PROVIDER | Override stored provider |
| NLPILOT_MODEL | Override stored model for this run |
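The override order in the table can be sketched as follows. This is only an illustration: the `Credentials` shape and the `applyEnvOverrides` helper are hypothetical names, and treating an empty variable as "not set" is an assumption rather than documented behavior.

```typescript
// Hypothetical shape of the stored credentials; field names are assumptions.
interface Credentials {
  provider: string;
  apiKey: string;
  model: string;
  baseUrl?: string;
}

type Env = Record<string, string | undefined>;

// Environment variables win over stored values when set and non-empty (assumed).
function applyEnvOverrides(stored: Credentials, env: Env): Credentials {
  return {
    ...stored,
    apiKey: env["NLPILOT_API_KEY"] || stored.apiKey,
    provider: env["NLPILOT_PROVIDER"] || stored.provider,
    model: env["NLPILOT_MODEL"] || stored.model,
  };
}

const stored: Credentials = {
  provider: "anthropic",
  apiKey: "stored-key",
  model: "claude-sonnet-4.6",
};
const effective = applyEnvOverrides(stored, { NLPILOT_MODEL: "claude-haiku-4.5" });
console.log(effective.model); // claude-haiku-4.5
```

Only the model changes in this example; the provider and API key fall back to the stored values.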
Usage
Interactive REPL
nlpilot # start a fresh session
nlpilot --continue # resume the most recent session
nlpilot --model claude-sonnet-4.6
One-shot (non-interactive)
nlpilot -p "Explain this codebase"
nlpilot --prompt "Write unit tests for src/utils.ts" --output-format json
Global Flags
| Flag | Short | Description |
|---|---|---|
| --model <id> | -m | Override the default model for this run |
| --prompt <text> | -p | Run one prompt non-interactively, then exit |
| --silent | -s | Suppress banners and tool logs; print only the response |
| --output-format <fmt> | | text (default) or json — controls stdout format in one-shot mode |
| --allow-all-tools | | Skip tool approval prompts (autopilot mode) |
| --allow-all | | Alias for --allow-all-tools |
| --no-ask-user | | GitHub Copilot CLI alias — same as --allow-all-tools |
| --enable-reasoning-summaries | | Show reasoning/thinking summaries from models that support extended thinking |
| --allow-tool <name> | | Always allow a specific tool (repeatable, comma-separated) |
| --deny-tool <name> | | Always deny a specific tool (repeatable, comma-separated) |
| --additional-mcp-config <path> | | Load extra MCP servers from a config file |
| --max-steps <n> | | Maximum model/tool loop steps per prompt |
| --max-output-tokens <n> | | Maximum assistant output tokens |
| --no-mcp | | Disable all MCP servers for this run |
| --no-model-routing | | Disable automatic cheap/balanced/reasoning model routing |
| --no-auto-compact | | Disable rolling REPL compaction after each turn |
| --compact-threshold <pct> | | Auto-compact REPL context when estimated usage exceeds this percent |
| --continue | | Resume the most recent session in the current directory |
| --version | -v | Print version |
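Because --allow-tool and --deny-tool are both repeatable and comma-separated, their values have to be flattened into one list of tool names. A minimal sketch of that step, assuming the flag values arrive as an array of raw strings (`collectToolNames` is a hypothetical helper, not part of nlpilot):

```typescript
// Collect tool names from repeated flag values, each possibly comma-separated.
function collectToolNames(values: string[]): string[] {
  const names = values
    .flatMap((value) => value.split(","))
    .map((name) => name.trim())
    .filter((name) => name.length > 0);
  // De-duplicate while keeping first-seen order.
  return Array.from(new Set(names));
}

// Roughly equivalent to: --allow-tool view,grep --allow-tool bash
const allowed = collectToolNames(["view,grep", "bash"]);
console.log(allowed.join(", ")); // view, grep, bash
```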
REPL Slash Commands
Inside an interactive session, type / to access built-in commands:
| Command | Aliases | Description |
|---|---|---|
| /help | | Show all available commands |
| /exit | /quit | Exit nlpilot |
| /clear | /new, /reset | Start a new conversation (clears history) |
| /model [id] | /models | List available models, or switch to a specific model |
| /mode [ask\|plan\|autopilot] | | Show or change the tool approval mode |
| /compact | | Summarize conversation history to save context tokens (resets cumulative token count) |
| /save [name] | | Save the current session with an optional name |
| /sessions | /history | List all saved sessions |
| /load <id> | | Load a session by ID |
| /delete <id> | /rm | Delete a session |
| /undo | | Revert the most recent file change made by the agent |
| /diff | | Show a diff of all file changes in this session |
| /theme [name] | | Change the display theme (default, dim, high-contrast) |
| /init | | Scaffold .nlpilot/ customization directory |
| /version | | Print version |
Built-in Tools
The agent has access to these built-in tools. Each tool that modifies the filesystem asks for approval unless --allow-all-tools / --no-ask-user is set.
| Tool | Description |
|---|---|
| bash | Execute shell commands via /bin/sh -c |
| view | Read line ranges from a file |
| edit | Apply a targeted string replacement to an existing file |
| create | Create a new file (tracks changes for /undo) |
| grep | Search file contents, or list files with filenamesOnly:true |
| web_fetch | Fetch public HTTP/HTTPS URLs with approval and SSRF protections |
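How the approval flags might combine for a single tool call is sketched below. Several points here are assumptions rather than documented behavior: that a deny entry beats an allow entry, that read-only tools never prompt, and the exact set of tools that prompt by default. The names `decide` and `PROMPTING_TOOLS` are hypothetical.

```typescript
type Decision = "allow" | "deny" | "ask";

interface ApprovalOptions {
  allowAll: boolean;    // --allow-all-tools / --allow-all / --no-ask-user
  allowTools: string[]; // --allow-tool
  denyTools: string[];  // --deny-tool
}

// Assumption: these are the tools that prompt by default.
const PROMPTING_TOOLS = new Set(["bash", "edit", "create", "web_fetch"]);

function decide(tool: string, opts: ApprovalOptions): Decision {
  if (opts.denyTools.includes(tool)) return "deny"; // assumption: deny wins
  if (opts.allowAll || opts.allowTools.includes(tool)) return "allow";
  return PROMPTING_TOOLS.has(tool) ? "ask" : "allow";
}

const defaults: ApprovalOptions = { allowAll: false, allowTools: [], denyTools: [] };
console.log(decide("view", defaults)); // allow
console.log(decide("edit", defaults)); // ask
console.log(decide("bash", { ...defaults, allowAll: true, denyTools: ["bash"] })); // deny
```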
MCP Servers
nlpilot supports the Model Context Protocol for extending the agent with external tools.
Config file locations
| Priority | Path | Description |
|---|---|---|
| Lowest | ~/.nlpilot/mcp.json | Global user config |
| Middle | .mcp.json in project root | Project-level config (overrides global) |
| Highest | --additional-mcp-config <path> | Per-run extra config file (overrides both) |
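The priority order in the table amounts to a merge where later sources override earlier ones. The sketch below assumes that servers are keyed by name and that a higher-priority entry replaces a lower-priority one wholesale rather than being deep-merged; `McpServer` and `mergeMcpConfigs` are illustrative names only.

```typescript
// Hypothetical normalized server entry, keyed by server name.
interface McpServer {
  command?: string;
  args?: string[];
  url?: string;
}
type McpConfig = Record<string, McpServer>;

// Sources are passed lowest priority first; later entries replace earlier ones.
function mergeMcpConfigs(...sourcesLowToHigh: McpConfig[]): McpConfig {
  return sourcesLowToHigh.reduce<McpConfig>(
    (merged, source) => ({ ...merged, ...source }),
    {},
  );
}

const globalConfig: McpConfig = {
  docs: { command: "node", args: ["./global.js"] },
  search: { url: "https://my-mcp.example.com" },
};
const projectConfig: McpConfig = {
  docs: { command: "node", args: ["./project.js"] },
};
const extraConfig: McpConfig = {}; // nothing passed via --additional-mcp-config

const merged = mergeMcpConfigs(globalConfig, projectConfig, extraConfig);
console.log(merged["docs"]?.args?.[0]); // ./project.js
console.log(Object.keys(merged).join(", ")); // docs, search
```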
Config file format
Both Claude Code / Cursor style and nlpilot internal style are supported:
mcpServers style (Claude Code / Cursor):
{
"mcpServers": {
"my-server": {
"command": "node",
"args": ["./mcp-server.js"],
"type": "stdio",
"env": { "MY_KEY": "value" }
},
"remote-server": {
"url": "https://my-mcp.example.com",
"type": "http"
}
}
}
servers style (nlpilot):
{
"servers": [
{
"name": "my-server",
"transport": "stdio",
"command": "node",
"args": ["./mcp-server.js"]
}
]
}
Managing MCP servers via CLI
nlpilot mcp list # list all configured servers
nlpilot mcp get <name> # show a server's config
nlpilot mcp add [name] # interactively add a server
nlpilot mcp remove <name> # remove a server
Session Management
Sessions are stored under ~/.nlpilot/sessions/<cwdHash>/ and are shared between the CLI and the VS Code extension. Each session records:
- Conversation messages
- Model and provider used
- File changes made by the agent during the session
- Cumulative token counts (tracked across session resumption)
nlpilot --continue # resume the last session in the current directory
Context Optimization
nlpilot automatically manages context window usage through several techniques:
Message Trimming
Large tool outputs (like verbose file reads or bash results) are automatically compressed in the conversation history:
- Kept in full: Last assistant turn (to preserve recent context)
- Tool results: Older outputs are replaced with structured summaries such as view(src/session.ts:1-160) -> read 160 lines
- Duplicate tool results: Repeated large outputs are replaced with reference stubs
- Older text turns: Long user/assistant messages are compacted and repeated attachment/project-context blocks are removed
- Benefit: Preserves context window space while retaining paths, counts, statuses, and errors
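The trimming rules above can be approximated as follows. This is a simplified sketch, not nlpilot's implementation: the `Message` shape, the 200-character cutoff, and the summary wording are all assumptions, and duplicate detection and text-turn compaction are left out.

```typescript
// Hypothetical message shape; the real session format may differ.
interface Message {
  role: "user" | "assistant" | "tool";
  content: string;
  toolName?: string;   // e.g. "view"
  toolTarget?: string; // e.g. "src/session.ts:1-160"
}

const MAX_KEPT_CHARS = 200; // assumption: older tool outputs above this size get summarized

function summarize(msg: Message): string {
  const lines = msg.content.split("\n").length;
  return `${msg.toolName ?? "tool"}(${msg.toolTarget ?? ""}) -> ${lines} lines`;
}

function trimHistory(history: Message[]): Message[] {
  // Find the last assistant turn; it and everything after it stay in full.
  let protectedFrom = history.length;
  for (let i = history.length - 1; i >= 0; i--) {
    if (history[i]?.role === "assistant") {
      protectedFrom = i;
      break;
    }
  }
  return history.map((msg, i) => {
    if (i >= protectedFrom) return msg;
    if (msg.role === "tool" && msg.content.length > MAX_KEPT_CHARS) {
      return { ...msg, content: summarize(msg) };
    }
    return msg;
  });
}

const bigOutput = Array(160).fill("x".repeat(10)).join("\n");
const trimmed = trimHistory([
  { role: "user", content: "Show me the session code" },
  { role: "tool", content: bigOutput, toolName: "view", toolTarget: "src/session.ts:1-160" },
  { role: "assistant", content: "Here is what the file does..." },
]);
console.log(trimmed[1]?.content); // view(src/session.ts:1-160) -> 160 lines
```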
Output Limits
To prevent single tool calls from consuming too much context:
- Bash output limited to 4,000 characters
- Grep output limited to 8,000 characters
- File viewing limited to 160 lines per call, with an 80-line minimum window
- Web fetch body reads limited to 100,000 bytes before final text truncation
- Automatic abbreviation: Tool outputs exceeding limits are truncated with a note
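A character cap with a trailing note could look like the sketch below. The limits come from the list above; the `truncateOutput` name and the wording of the note are assumptions.

```typescript
// Character caps taken from the list above.
const LIMITS = { bash: 4000, grep: 8000 } as const;

// Cut the output at `limit` characters and append a note saying how much was dropped.
function truncateOutput(output: string, limit: number): string {
  if (output.length <= limit) return output;
  const omitted = output.length - limit;
  return `${output.slice(0, limit)}\n[output truncated: ${omitted} more characters]`;
}

const long = "a".repeat(5000);
const capped = truncateOutput(long, LIMITS.bash);
console.log(capped.slice(-40)); // [output truncated: 1000 more characters]
```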
Prompt Context Selection
Repeated system prompt context is kept small by default:
- Project instructions and file snapshots are included only for project/code-related prompts
- Skills and agents are listed only when the prompt asks about them
- Source files are injected as a directory summary plus key files, not a full tree
- File snapshots appear on the first project-related turn or when the user asks about repo structure/files
Auto-Compaction
The REPL now performs rolling compaction after each turn:
- Older history is summarized into a compact working-memory note
- The latest exchange stays intact for immediate continuity
- The --compact-threshold <pct> guard remains as a backstop for unusually large turns
- Use --no-auto-compact to keep full persisted history until /compact or the threshold guard runs
Manual /compact still summarizes the full conversation and resets cumulative token tracking to the compacted baseline.
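The threshold guard boils down to comparing estimated usage against a percentage of the model's context size. The sketch below assumes a rough four-characters-per-token estimate, which is a common heuristic and not necessarily what nlpilot uses; `estimateTokens` and `shouldCompact` are hypothetical names.

```typescript
// Rough token estimate (assumption: about 4 characters per token).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// True when estimated usage exceeds thresholdPct percent of the context window.
function shouldCompact(messages: string[], contextSize: number, thresholdPct: number): boolean {
  const used = messages.reduce((sum, message) => sum + estimateTokens(message), 0);
  const usedPct = (used / contextSize) * 100;
  return usedPct > thresholdPct;
}

// 3,600 characters is about 900 tokens, i.e. 90% of a 1,000-token window.
console.log(shouldCompact(["x".repeat(3600)], 1000, 80)); // true
// 2,000 characters is about 500 tokens, i.e. 50%.
console.log(shouldCompact(["x".repeat(2000)], 1000, 80)); // false
```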
Cumulative Token Tracking
Sessions track cumulative input and output tokens across all turns:
- cumulativeInputTokens: Total tokens sent to the model
- cumulativeOutputTokens: Total tokens received from the model
- Reset only after /compact to reflect the summarization
- Accessible in session metadata for cost estimation
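For cost estimation, the cumulative counters can be combined with the inputCost and outputCost fields of a catalog entry. The sketch below takes those costs as "per 1M tokens", as the catalog schema in this README states; the `estimateCost` helper itself is hypothetical.

```typescript
interface Usage {
  cumulativeInputTokens: number;
  cumulativeOutputTokens: number;
}

interface Pricing {
  inputCost: number;  // cost per 1M input tokens, per the catalog schema
  outputCost: number; // cost per 1M output tokens, per the catalog schema
}

function estimateCost(usage: Usage, pricing: Pricing): number {
  const inputMillions = usage.cumulativeInputTokens / 1e6;
  const outputMillions = usage.cumulativeOutputTokens / 1e6;
  return inputMillions * pricing.inputCost + outputMillions * pricing.outputCost;
}

const cost = estimateCost(
  { cumulativeInputTokens: 2_000_000, cumulativeOutputTokens: 500_000 },
  { inputCost: 0.003, outputCost: 0.015 },
);
console.log(cost.toFixed(4)); // 0.0135
```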
Customization
Run nlpilot init to scaffold a .nlpilot/ directory in your project with examples for all customization points.
Instruction files
Loaded automatically from (in order):
- .nlpilot/instructions.md in the current directory
- AGENTS.md in the current directory
Supports YAML front-matter for metadata:
---
name: Project Guidelines
---
Always use TypeScript strict mode. Prefer `bun` over `npm`.
Custom agents
Place .md files in .nlpilot/agents/. Each file defines a specialized agent persona with its own system prompt, model override, and tool allowlist:
---
name: Code Reviewer
description: Strict code review with security focus
model: claude-opus-4-6
tools: view,grep
---
You are a strict code reviewer. Focus on security, performance, and maintainability...
Activate with /agent <name> in the REPL.
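To illustrate how an agent file splits into metadata and a system prompt, here is a minimal front-matter parser. It is a sketch that handles only flat `key: value` lines, not full YAML, and `parseAgentFile` is a hypothetical name rather than nlpilot's loader.

```typescript
interface AgentFile {
  meta: Record<string, string>;
  body: string;
}

// Split a leading `---` block from the body; flat `key: value` lines only.
function parseAgentFile(source: string): AgentFile {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(source);
  if (!match) return { meta: {}, body: source };
  const meta: Record<string, string> = {};
  for (const line of (match[1] ?? "").split(/\r?\n/)) {
    const sep = line.indexOf(":");
    if (sep === -1) continue; // skip lines without a key/value separator
    meta[line.slice(0, sep).trim()] = line.slice(sep + 1).trim();
  }
  return { meta, body: (match[2] ?? "").trim() };
}

const source = [
  "---",
  "name: Code Reviewer",
  "model: claude-opus-4-6",
  "tools: view,grep",
  "---",
  "You are a strict code reviewer.",
].join("\n");

const agent = parseAgentFile(source);
console.log(agent.meta["name"]); // Code Reviewer
console.log((agent.meta["tools"] ?? "").split(",").join(" + ")); // view + grep
console.log(agent.body); // You are a strict code reviewer.
```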
Skills
Place .md files in .nlpilot/skills/. Skills inject additional context into the system prompt for specialized workflows (similar to GitHub Copilot skills).
Hooks
Define lifecycle hooks in .nlpilot/hooks.json to run shell scripts before/after turns:
{
"preTurn": ["./scripts/lint.sh"],
"postTurn": ["./scripts/format.sh"]
}
Model Catalog Configuration
nlpilot uses a flexible model catalog system that combines built-in defaults with user customization.
Catalog Location
The model catalog is loaded from (in priority order):
- User catalog: ~/.nlpilot/models.json (if exists)
- Fallback: Built-in embedded catalog (automatically used if file missing or invalid)
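The fallback rule can be sketched as a pure function over the file's text. Two things here are assumptions: that a valid user file is used in place of the built-in catalog rather than merged with it, and that "invalid" means unparsable JSON or a non-object top level. `BUILT_IN` is a stand-in for the embedded catalog, and `resolveCatalog` is a hypothetical name.

```typescript
type Catalog = Record<string, { id: string }[]>;

// Stand-in for the embedded catalog; the real one is larger.
const BUILT_IN: Catalog = { anthropic: [{ id: "claude-sonnet-4.6" }] };

// `raw` is the text of ~/.nlpilot/models.json, or undefined when the file is missing.
function resolveCatalog(raw: string | undefined): Catalog {
  if (raw === undefined) return BUILT_IN;
  try {
    const parsed: unknown = JSON.parse(raw);
    if (parsed !== null && typeof parsed === "object" && !Array.isArray(parsed)) {
      return parsed as Catalog;
    }
  } catch {
    // Invalid JSON: fall through to the built-in catalog.
  }
  return BUILT_IN;
}

console.log(resolveCatalog(undefined) === BUILT_IN); // true
console.log(resolveCatalog("{not json") === BUILT_IN); // true
console.log(resolveCatalog('{"openai":[{"id":"gpt-5.4"}]}')["openai"]?.[0]?.id); // gpt-5.4
```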
Catalog File Format
Create or edit ~/.nlpilot/models.json to customize available models:
{
"anthropic": [
{
"id": "claude-sonnet-4.6",
"name": "Claude Sonnet 4.6",
"description": "Balanced flagship model",
"contextSize": 1000000,
"inputCost": 0.003,
"outputCost": 0.015
}
],
"openai": [
{
"id": "gpt-5.4",
"name": "GPT-5.4",
"description": "Balanced model",
"contextSize": 1100000,
"inputCost": 0.005,
"outputCost": 0.020
}
],
"google": [
{
"id": "gemini-2.5-pro",
"name": "Gemini 2.5 Pro",
"description": "Advanced reasoning model",
"contextSize": 1000000,
"inputCost": 0.0025,
"outputCost": 0.01
}
]
}
Catalog Schema
Each model entry supports:
- id (required): Unique model identifier
- name: Display name
- description: Brief description
- contextSize: Maximum context window in tokens (used for planning and message trimming)
- inputCost: Cost per 1M input tokens (optional, for future billing features)
- outputCost: Cost per 1M output tokens (optional, for future billing features)
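Expressed as a TypeScript type with a runtime check, the schema might look like this. The field names come from the list above; the `ModelEntry` and `isModelEntry` names and the strictness of the check are assumptions.

```typescript
interface ModelEntry {
  id: string;
  name?: string;
  description?: string;
  contextSize?: number;
  inputCost?: number;
  outputCost?: number;
}

// Runtime check: `id` must be a non-empty string, optional fields must have the right type.
function isModelEntry(value: unknown): value is ModelEntry {
  if (typeof value !== "object" || value === null) return false;
  const entry = value as Record<string, unknown>;
  const id = entry["id"];
  if (typeof id !== "string" || id.length === 0) return false;
  const optionalStrings = ["name", "description"];
  const optionalNumbers = ["contextSize", "inputCost", "outputCost"];
  return (
    optionalStrings.every((key) => entry[key] === undefined || typeof entry[key] === "string") &&
    optionalNumbers.every((key) => entry[key] === undefined || typeof entry[key] === "number")
  );
}

console.log(isModelEntry({ id: "gemini-2.5-pro", contextSize: 1000000 })); // true
console.log(isModelEntry({ name: "Missing id" })); // false
console.log(isModelEntry({ id: "x", contextSize: "big" })); // false
```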
Default Model Selection
During nlpilot login, you'll select a default model from the available catalog. This default is used whenever no --model flag is provided.
To change your default model later:
nlpilot login # Re-run the wizard to select a different model
nlpilot --model <id> # Override for a single run
Cost-Aware Model Routing
When no explicit model is provided, nlpilot classifies each prompt and routes to a cheaper capable model where possible:
| Task class | Typical prompts | Routing behavior |
|---|---|---|
| cheap | help-like questions, listings, repo structure, simple search/explain prompts | cheapest catalog model for the provider |
| balanced | normal coding edits and implementation work | balanced coding/general model |
| reasoning | debugging, security, architecture, complex analysis, plan mode | configured default model |
--model <id>, NLPILOT_MODEL, /model <id>, and agent model overrides pin the chosen model and bypass automatic routing. Custom baseUrl credentials also bypass routing because those endpoints may not support the built-in catalog names. Use --no-model-routing to disable this behavior for a run.
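The bypass rules and the three task classes can be sketched together. The keyword heuristics in `classify` are purely illustrative and are not nlpilot's classifier; the `RoutingInput` shape and the choice to fall back to the default model when routing is bypassed are also assumptions.

```typescript
type TaskClass = "cheap" | "balanced" | "reasoning";

interface RoutingInput {
  prompt: string;
  pinnedModel?: string;     // --model, NLPILOT_MODEL, /model <id>, or an agent override
  hasCustomBaseUrl: boolean;
  routingDisabled: boolean; // --no-model-routing
  defaultModel: string;     // configured default, used for the reasoning class
  cheapModel: string;
  balancedModel: string;
}

// Illustrative keyword heuristics only.
function classify(prompt: string): TaskClass {
  const text = prompt.toLowerCase();
  if (/(debug|security|architecture|analy[sz]e)/.test(text)) return "reasoning";
  if (/(implement|refactor|fix|write|edit)/.test(text)) return "balanced";
  return "cheap";
}

function pickModel(input: RoutingInput): string {
  if (input.pinnedModel) return input.pinnedModel; // pinned models bypass routing
  if (input.hasCustomBaseUrl || input.routingDisabled) return input.defaultModel;
  const taskClass = classify(input.prompt);
  if (taskClass === "cheap") return input.cheapModel;
  if (taskClass === "balanced") return input.balancedModel;
  return input.defaultModel;
}

const base = {
  hasCustomBaseUrl: false,
  routingDisabled: false,
  defaultModel: "claude-sonnet-4.6",
  cheapModel: "claude-3-haiku",
  balancedModel: "claude-haiku-4.5",
};
console.log(pickModel({ ...base, prompt: "List the files in src" })); // claude-3-haiku
console.log(pickModel({ ...base, prompt: "Debug this race condition" })); // claude-sonnet-4.6
console.log(pickModel({ ...base, prompt: "List the files", pinnedModel: "gpt-5.4" })); // gpt-5.4
```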
Supported Models
Anthropic
| Model ID | Description |
|---|---|
| claude-sonnet-4.6 | Balanced flagship · 1M ctx |
| claude-opus-4.7 | Top quality · 1M ctx |
| claude-haiku-4.5 | Fast & cheap · 200K ctx |
| claude-opus-4-6 | High quality · 1M ctx |
| claude-sonnet-4.5 | Balanced · 1M ctx |
| claude-3-haiku | Cheapest · 200K ctx |
OpenAI
| Model ID | Description |
|---|---|
| gpt-5.5 | Flagship · 1M ctx |
| gpt-5.5-pro | Heavy reasoning |
| gpt-5.4 | Balanced · 1.1M ctx |
| gpt-5.4-mini | Fast · 400K ctx |
| gpt-5.4-nano | Cheapest · 400K ctx |
Google
| Model ID | Description |
|---|---|
| gemini-2.5-pro | Flagship reasoning |
| gemini-2.5-flash | Fast multimodal |
| gemini-2.5-flash-lite | Most affordable |
Any model ID not in the catalog can be entered as a custom ID during nlpilot login or by passing --model <id>.
Project Structure
src/
├── index.ts # CLI entry point, command definitions
├── config.ts # Credentials load/save, env overrides
├── models.ts # Model catalog per provider
├── model-router.ts # Cheap/balanced/reasoning model selection
├── providers.ts # AI SDK provider factory (gateway + custom endpoint)
├── session.ts # Session interface, system prompt builder
├── persistence.ts # Session file I/O (~/.nlpilot/sessions/)
├── customization.ts # Agents, skills, instruction file loader
├── mcp.ts # MCP config load/save/merge
├── hooks.ts # Pre/post-turn hook runner
├── commands/
│ ├── repl.ts # Interactive REPL loop
│ ├── oneshot.ts # Non-interactive single-prompt runner
│ ├── slash.ts # /command handler registry
│ ├── compact.ts # Context compaction (/compact)
│ ├── login.ts # nlpilot login wizard
│ ├── logout.ts # nlpilot logout
│ ├── models.ts # nlpilot models list command
│ ├── mcp.ts # nlpilot mcp subcommand
│ ├── init.ts # nlpilot init scaffolding
│ └── help.ts # nlpilot help
└── tools/
├── index.ts # Built-in tool definitions (bash, file ops, glob, grep)
├── approval.ts # Tool approval state machine
└── mcp.ts # MCP runtime (connect + expose as AI SDK tools)
Changelog
v0.x (Latest)
Context Management & Optimization
- Automatic message trimming: Large tool outputs in conversation history are now automatically compressed to preserve context window. Recent assistant turns retain full context, while older verbose outputs are replaced with [output trimmed — X chars] notation.
- Output limits: Reduced default output sizes to optimize context usage:
  - Bash command output capped at 4,000 characters
  - Grep output capped at 8,000 characters
  - File viewing limited to 160 lines per call
  - Web fetch body reads capped at 100,000 bytes before final text truncation
- Cumulative token tracking: Sessions now track total input/output tokens across all turns and resumptions
- Auto-compaction token reset: When /compact summarizes a conversation, cumulative token counts reset to reflect the new baseline
Model Catalog System
- Dynamic model loading: Model catalog now loads from ~/.nlpilot/models.json with automatic fallback to embedded defaults
- User customization: Create or modify ~/.nlpilot/models.json to add custom models, adjust costs, and override model configurations
- Context size awareness: Models now include contextSize property for better context planning and message trimming decisions
- Dynamic default selection: Default models are determined from the loaded catalog rather than hardcoded
Tool & Approval System
- Fixed readline deadlock: Improved REPL approval prompt handling with separate readline instances and mutex-based serialization
- Better tool logging: New toolInputSummary() function provides cleaner display of tool execution details (bash commands, file paths with line ranges, grep patterns)
- Concurrent prompt protection: Mutex-based _promptLock prevents concurrent approval dialogs from interfering
Configuration & Startup
- Auto-initialization: Config directory is now automatically initialized on first run
- Early catalog loading: Model catalog pre-loaded during app startup for better initialization flow
- Separation of concerns: Config file handles credentials; model catalog handles model definitions
Developer Documentation
- Architecture Overview — how the codebase fits together
- Contributing Guide — setup, style, and PR workflow
- docs/ — detailed guides for extending commands, providers, MCP, skills, and hooks
License
Licensed under the Apache License 2.0.
