agent-terminal

v1.0.1, published 25 days ago

Headless terminal automation for AI agents - capture CLI output as ASCII text, send keyboard input, automate any terminal application

Vulnerabilities: 0 High, 0 Medium, 0 Low

Maintainer: jasonkneen

Keywords: mcp, terminal, automation, pty, cli, headless, ascii, ai-agent, agent, pseudo-terminal, node-pty, screenshot, capture, terminal-emulator

agent-terminal

Headless terminal automation for AI agents

agent-terminal lets AI agents interact with any CLI application - capture output as ASCII text, send keyboard input, and automate terminal applications without a display.

Built on node-pty, the same technology powering VS Code's integrated terminal.

Features

✅ Completely Headless - No terminal or display needed
✅ Pure ASCII Output - Looks exactly like terminal output
✅ Interactive Control - Send keyboard input, capture any state
✅ Any CLI App - Works with git, npm, python, node, vim, etc.
✅ MCP Server - Model Context Protocol integration
✅ AI Agent Ready - Perfect for LLM-powered automation

Installation

npm install agent-terminal

Quick Start

As a Library

import { CLISession } from 'agent-terminal'

// Launch a CLI app
const session = new CLISession('node', [], { cols: 80, rows: 24 })
await session.wait(500)

// Capture output as ASCII text
console.log(session.getBuffer(true))
// Output: Welcome to Node.js v22.14.0
//         Type ".help" for more information.
//         >

// Send keyboard input
session.sendKeys('2 + 2\r')
await session.wait(200)

console.log(session.getBuffer(true))
// Output: > 2 + 2
//         4
//         >

session.close()

As MCP Server

Add the server to the settings.json of Claude Code or another MCP client:

{
  "mcpServers": {
    "agent-terminal": {
      "command": "npx",
      "args": ["-y", "agent-terminal"]
    }
  }
}

Or with local installation:

{
  "mcpServers": {
    "agent-terminal": {
      "command": "node",
      "args": ["node_modules/agent-terminal/index.mjs"]
    }
  }
}

MCP Tools

When running as an MCP server, these tools are available:

`cli_launch`

Launch a CLI application in a pseudo-terminal.

{
  "command": "node",
  "args": [],
  "cols": 80,
  "rows": 24
}

Returns:

{
  "session_id": "abc123...",
  "command": "node",
  "initial_output": "Welcome to Node.js..."
}
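
For orientation, an MCP client invokes a tool such as `cli_launch` by sending a JSON-RPC 2.0 `tools/call` request whose `params.arguments` carries the input object shown above. The sketch below only builds that request. `buildToolCall` is a hypothetical helper, not part of agent-terminal, and most MCP client SDKs do this wrapping for you.

```javascript
// Hypothetical helper: wraps tool arguments in an MCP `tools/call` request.
// The envelope follows JSON-RPC 2.0 as used by the Model Context Protocol.
let nextId = 1

function buildToolCall(name, args = {}) {
  return {
    jsonrpc: '2.0',
    id: nextId++,
    method: 'tools/call',
    params: { name, arguments: args }
  }
}

// Request that launches a Node.js REPL in an 80x24 pseudo-terminal
const request = buildToolCall('cli_launch', {
  command: 'node',
  args: [],
  cols: 80,
  rows: 24
})

console.log(JSON.stringify(request, null, 2))
```

The same helper would cover the other tools by swapping the name and arguments, for example `buildToolCall('cli_close', { session_id })`.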

`cli_screenshot`

Capture current terminal state as ASCII text.

{
  "session_id": "abc123",
  "strip_ansi": true
}

Returns the terminal screen as plain ASCII text.
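
To illustrate what `strip_ansi` refers to, here is a minimal sketch that removes ANSI CSI escape sequences (colors, cursor movement) from a string. `stripCsi` is a hypothetical helper and not the package's implementation, which may handle more cases.

```javascript
// Matches CSI sequences: ESC [ , parameter bytes, intermediate bytes, final byte
const CSI_PATTERN = /\x1b\[[0-?]*[ -\/]*[@-~]/g

function stripCsi(text) {
  return text.replace(CSI_PATTERN, '')
}

const colored = '\x1b[1;32mPASS\x1b[0m 12 tests'
console.log(stripCsi(colored)) // PASS 12 tests
```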

`cli_send_keys`

Send keyboard input to the application.

{
  "session_id": "abc123",
  "keys": "console.log('Hello')\r"
}

Special keys:

\r - Enter
\t - Tab
\x1b - Escape
\n - Newline
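
The escape strings above can be wrapped in a small lookup table so agent code can refer to keys by name. In this sketch the `KEYS` table and the `press` helper are hypothetical. The arrow-key and Ctrl-C entries are the conventional sequences most terminal applications accept, not values documented by agent-terminal.

```javascript
// Named keys mapped to the byte sequences a terminal would send
const KEYS = {
  Enter: '\r',
  Tab: '\t',
  Escape: '\x1b',
  Newline: '\n',
  // Assumed conventional sequences (not listed in this README):
  Up: '\x1b[A',
  Down: '\x1b[B',
  Right: '\x1b[C',
  Left: '\x1b[D',
  CtrlC: '\x03'
}

// Expands {Name} placeholders in a template into key sequences,
// leaving all other text untouched.
function press(template) {
  return template.replace(/\{(\w+)\}/g, (match, name) => {
    if (!Object.hasOwn(KEYS, name)) throw new Error(`Unknown key: ${name}`)
    return KEYS[name]
  })
}

console.log(JSON.stringify(press('ls -la{Enter}')))
```

The result can be passed as the `keys` argument of `cli_send_keys` or to `session.sendKeys(...)`. Note that any literal `{word}` in the text is treated as a key name, so this form suits short commands better than pasted source code.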

`cli_wait`

Wait for output to stabilize.

{
  "session_id": "abc123",
  "timeout": 1000
}
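
One plausible reading of "stabilize" is that the screen text stops changing for a short quiet period. The sketch below shows that idea in isolation. `waitUntilStable`, `quietMs`, and `pollMs` are hypothetical names, and the package's own logic may differ.

```javascript
// Resolves once readBuffer() has returned the same text for `quietMs`,
// or when `timeout` elapses, whichever comes first.
async function waitUntilStable(readBuffer, { timeout = 1000, quietMs = 100, pollMs = 20 } = {}) {
  const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))
  const start = Date.now()
  let last = readBuffer()
  let lastChange = start

  while (Date.now() - start < timeout) {
    await sleep(pollMs)
    const current = readBuffer()
    const now = Date.now()
    if (current !== last) {
      // Output is still changing: restart the quiet-period clock
      last = current
      lastChange = now
    } else if (now - lastChange >= quietMs) {
      return { stable: true, text: current }
    }
  }
  return { stable: false, text: last }
}
```

With a `CLISession`, this could be called as `await waitUntilStable(() => session.getBuffer(true))`.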

`cli_close`

Close a CLI session.

{
  "session_id": "abc123"
}

`cli_list_sessions`

List all active sessions.

{}

API Reference

`CLISession`

Create a new terminal session.

const session = new CLISession(command, args, options)

Parameters:

command (string) - Command to execute
args (array) - Command arguments
options (object)
- cols (number) - Terminal width (default: 80)
- rows (number) - Terminal height (default: 24)
- cwd (string) - Working directory

Methods:

`getBuffer(stripAnsi = true)`

Get current terminal buffer as text.

const text = session.getBuffer(true)

`sendKeys(keys)`

Send keyboard input.

session.sendKeys('ls -la\r')

`wait(timeout = 1000)`

Wait for output to stabilize.

await session.wait(1000)
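
Fixed delays can be fragile when a command is slow to start. One option is to poll `getBuffer()` until the expected text appears. The sketch below relies only on the `getBuffer` and `wait` methods documented here and assumes `wait(ms)` resolves within roughly `ms` milliseconds. `waitForText` is a hypothetical helper, and the stub session exists only so the snippet runs without spawning a process.

```javascript
// Polls session.getBuffer() until `pattern` matches or `timeout` ms elapse.
// `pattern` should be a RegExp without the global flag.
async function waitForText(session, pattern, { timeout = 5000, interval = 50 } = {}) {
  const deadline = Date.now() + timeout
  while (true) {
    const text = session.getBuffer(true)
    if (pattern.test(text)) return text
    if (Date.now() >= deadline) {
      throw new Error(`Timed out after ${timeout} ms waiting for ${pattern}`)
    }
    await session.wait(interval)
  }
}

// Stub standing in for a CLISession: the prompt shows up on the third read
function createStubSession() {
  let reads = 0
  return {
    getBuffer: () => (++reads < 3 ? 'Starting...' : 'Starting...\n> '),
    wait: (ms) => new Promise((resolve) => setTimeout(resolve, ms))
  }
}

waitForText(createStubSession(), />\s*$/, { timeout: 1000 })
  .then((text) => console.log('Prompt ready:', JSON.stringify(text)))
```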

`resize(cols, rows)`

Resize terminal.

session.resize(120, 40)

`close()`

Close the session.

session.close()

Examples

Automate Git Commands

import { CLISession } from 'agent-terminal'

const git = new CLISession('git', ['log', '--oneline', '--graph', '-10'], {
  cols: 100,
  rows: 30
})

await git.wait(1000)

console.log(git.getBuffer(true))
// * abc123 Latest commit
// * def456 Merge branch
// |\
// | * ghi789 Feature

git.close()

Interactive Python Session

const python = new CLISession('python3', [], { cols: 80, rows: 24 })
await python.wait(500)

python.sendKeys('x = [1, 2, 3, 4, 5]\r')
await python.wait(200)

python.sendKeys('[i * 2 for i in x]\r')
await python.wait(200)

console.log(python.getBuffer(true))
// >>> x = [1, 2, 3, 4, 5]
// >>> [i * 2 for i in x]
// [2, 4, 6, 8, 10]
// >>>

python.close()

Monitor System Commands

// `-l 1` is the macOS flag for a single sample; on Linux use ['-b', '-n', '1']
const top = new CLISession('top', ['-l', '1'], { cols: 120, rows: 50 })
await top.wait(2000)

const stats = top.getBuffer(true)
console.log(stats)
// Processes: 450 total, 3 running, 447 sleeping...

top.close()

Navigate Interactive Apps

const vim = new CLISession('vim', ['file.txt'], { cols: 80, rows: 24 })
await vim.wait(500)

// Enter insert mode
vim.sendKeys('i')
await vim.wait(100)

// Type text
vim.sendKeys('Hello from agent-terminal!')
await vim.wait(100)

// Exit and save
vim.sendKeys('\x1b:wq\r')
await vim.wait(500)

vim.close()
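
Sequences like the one above can also be expressed as data and replayed by a small loop, which keeps agent scripts short. This is a sketch that assumes only the `sendKeys` and `wait` methods. `runSteps` and the step shape `{ keys, waitMs }` are hypothetical, and the recording stub replaces a real session so the snippet runs without vim installed.

```javascript
// Sends each step's keys, then waits before moving to the next step
async function runSteps(session, steps) {
  for (const { keys, waitMs = 100 } of steps) {
    session.sendKeys(keys)
    await session.wait(waitMs)
  }
}

// The vim edit expressed as a list of steps
const vimSteps = [
  { keys: 'i' },                          // enter insert mode
  { keys: 'Hello from agent-terminal!' }, // type text
  { keys: '\x1b:wq\r', waitMs: 500 }      // leave insert mode, save, quit
]

// Recording stub so the sketch runs without launching vim
const sent = []
const stubSession = {
  sendKeys: (keys) => { sent.push(keys) },
  wait: (ms) => new Promise((resolve) => setTimeout(resolve, ms))
}

runSteps(stubSession, vimSteps).then(() => console.log(`${sent.length} steps sent`))
```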

Use Cases

🤖 AI Agent Automation - Let LLMs interact with terminal apps
🧪 CLI Testing - Automated testing of command-line tools
📊 System Monitoring - Capture and analyze system command output
🔄 CI/CD Pipelines - Headless automation in build systems
📝 Documentation - Generate CLI screenshots as text
🎓 Educational - Teaching terminal concepts programmatically

How It Works

agent-terminal uses node-pty to create pseudo-terminals (PTY). CLI applications think they're running in a real terminal, but all output is captured as text.

┌─────────────────────────────────────┐
│     Your Application/Agent          │
└─────────────┬───────────────────────┘
              │
              ├─ CLISession API
              │
┌─────────────▼───────────────────────┐
│        agent-terminal               │
│     (Pseudo-terminal wrapper)       │
└─────────────┬───────────────────────┘
              │
              ├─ node-pty
              │
┌─────────────▼───────────────────────┐
│         CLI Application             │
│    (vim, git, python, etc.)         │
└─────────────────────────────────────┘

Key Points:

Runs completely headless (no display needed)
Applications behave as if in a real terminal
Output captured as plain ASCII text
Full keyboard input control
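
The layering in the diagram can be sketched with node-pty directly. The code below is an illustration under assumptions, not agent-terminal's source. `createRawSession` is a hypothetical name, and the node-pty module is passed in as a parameter so the snippet itself has no install-time dependency. The raw text it accumulates still contains escape sequences.

```javascript
// `pty` is expected to be the node-pty module, e.g.
//   import * as pty from 'node-pty'
function createRawSession(pty, command, args = [], { cols = 80, rows = 24, cwd } = {}) {
  // Spawn the child attached to a pseudo-terminal so it behaves as if interactive
  const child = pty.spawn(command, args, { name: 'xterm-256color', cols, rows, cwd })

  // Collect everything the application prints
  let raw = ''
  child.onData((chunk) => { raw += chunk })

  return {
    getRaw: () => raw,
    sendKeys: (keys) => child.write(keys),
    resize: (newCols, newRows) => child.resize(newCols, newRows),
    close: () => child.kill()
  }
}
```

Passing the module in also makes the wrapper easy to exercise with a fake `pty` object in tests.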

Security

agent-terminal includes security protections for safe operation:

Command Whitelist: Only approved commands can be executed (vim, nano, htop, node, python, git, etc.)
Path Validation: Working directory restricted to workspace to prevent path traversal
Environment Sanitization: Only safe environment variables passed to processes (secrets protected)
Resource Limits: Maximum 50 concurrent sessions, automatic cleanup
Input Validation: All MCP tool parameters validated with Zod schemas

See SECURITY.md for details and configuration options.
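
Two of the protections listed above, the command whitelist and environment sanitization, can be sketched as plain functions. Everything here is an assumption for illustration: the allowlist contains only the commands named in this section, the set of "safe" variables is a guess, and the helper names are hypothetical. SECURITY.md remains the reference for the actual rules.

```javascript
// Assumed allowlist, taken from the commands named in this section
const ALLOWED_COMMANDS = new Set(['vim', 'nano', 'htop', 'node', 'python', 'git'])

// Assumed set of environment variables considered safe to forward
const SAFE_ENV_KEYS = ['PATH', 'HOME', 'LANG', 'TERM']

function isAllowedCommand(command) {
  // Reject anything containing a path separator so "/tmp/evil/git" cannot pass as "git"
  if (/[\\\/]/.test(command)) return false
  return ALLOWED_COMMANDS.has(command)
}

function sanitizeEnv(env) {
  // Copy only allowlisted variables; everything else (tokens, keys) is dropped
  const clean = {}
  for (const key of SAFE_ENV_KEYS) {
    if (typeof env[key] === 'string') clean[key] = env[key]
  }
  return clean
}

console.log(isAllowedCommand('git')) // true
console.log(isAllowedCommand('rm'))  // false
console.log(sanitizeEnv({ PATH: '/usr/bin', API_TOKEN: 'secret' })) // { PATH: '/usr/bin' }
```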

Requirements

Node.js 18 or higher
Works on Linux, macOS, and Windows

Contributing

Contributions welcome!

git clone https://github.com/jkneen/agent-terminal.git
cd agent-terminal
npm install
npm test

License

MIT © jkneen

Related Projects

node-pty - Pseudoterminal bindings
Model Context Protocol - MCP specification

Links

Made for AI agents 🤖 Built with node-pty 🚀
