agentic-browser-cli
v1.0.0
AI-powered browser automation CLI — automate the web with natural language using Ollama, Anthropic, OpenAI, Azure, AWS Bedrock, Google Vertex AI, or Groq
AI Browser CLI
An AI-powered browser automation CLI that accepts natural-language queries and executes web tasks autonomously. With Ollama it runs fully locally, with no cloud API required; the other providers need their own credentials.
How it works
User Query (CLI)
│
▼
Provider Selection ← ollama | anthropic | openai | azure | bedrock | vertexai | groq
│
▼
LangChain ReAct Agent
│
├── LLM Layer (chosen provider)
│ ├── Ollama (local, no API key)
│ ├── Anthropic Claude
│ ├── OpenAI
│ ├── Azure OpenAI
│ ├── AWS Bedrock
│ ├── Google Vertex AI
│ └── Groq
│
├── Tools Layer
│ ├── Primary : @playwright/mcp (MCP subprocess)
│ └── Fallback : Direct Playwright (in-process)
│
└── Session Memory ← optional context carry-over
Prerequisites
| Requirement | Version | Notes |
|-------------|---------|-------|
| Node.js | ≥ 18 | nodejs.org |
| At least one LLM provider — choose any: | | |
| Ollama (local) | any | ollama serve + ollama pull llama3 |
| Anthropic Claude | — | ANTHROPIC_API_KEY in .env |
| OpenAI | — | OPENAI_API_KEY in .env |
| Azure OpenAI | — | AZURE_OPENAI_API_KEY + AZURE_OPENAI_ENDPOINT |
| AWS Bedrock | — | AWS credentials in .env or IAM role |
| Google Vertex AI | — | GOOGLE_CLOUD_PROJECT + gcloud auth |
| Groq | — | GROQ_API_KEY in .env |
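The Node.js requirement in the table can be checked before anything else runs. The following is a minimal sketch, not code from this project: `meetsMinimumNode` and `MIN_NODE_MAJOR` are hypothetical names, and only the threshold of 18 comes from the table above.

```javascript
// Hypothetical preflight helper; not part of the CLI described in this README.
const MIN_NODE_MAJOR = 18; // from the Prerequisites table

// Returns true when the major part of a version string such as "20.11.1"
// is at least minMajor.
function meetsMinimumNode(version = process.versions.node, minMajor = MIN_NODE_MAJOR) {
  const major = Number.parseInt(String(version).split('.')[0], 10);
  return Number.isInteger(major) && major >= minMajor;
}

if (!meetsMinimumNode()) {
  console.error(`Node.js ${MIN_NODE_MAJOR} or newer is required; found ${process.versions.node}`);
}
```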
Installation
# 1. Install npm dependencies
npm install
# 2. Install the Chromium browser binary used by Playwright
npm run install:browsers
# 3. Copy and edit environment variables
cp .env.example .env
Optional: global install
npm link
# then call: ai-browser "…"
Configuration (.env)
Copy .env.example to .env and fill in the values for the provider(s) you want to use.
Common settings
| Variable | Default | Description |
|----------|---------|-------------|
| DEFAULT_PROVIDER | ollama | Provider used when --provider is omitted |
| HEADLESS | false | Set true to run the browser without a visible window |
| BROWSER_TIMEOUT | 30000 | Timeout (ms) for each browser action |
| MAX_ITERATIONS | 50 | Maximum agent reasoning steps per query |
| AGENT_TIMEOUT | 300000 | Hard timeout (ms) for the full agent run |
| MEMORY_DIR | .memory | Folder where session data is stored |
| DEBUG | false | Set true to enable verbose output |
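To make the defaults in this table concrete, here is a minimal sketch of how the common settings could be read from the environment. It is an illustration only: `loadCommonSettings`, `parseBool`, and `parseInteger` are hypothetical names, and the rule that only the string `true` (case-insensitive) enables a boolean is an assumption, not something this README states about the real parser.

```javascript
// Hypothetical helpers; the CLI's actual config code is not shown in this README.

// Assumption: only the string "true" (any case) enables a boolean setting.
function parseBool(value, fallback) {
  if (value === undefined || value === '') return fallback;
  return String(value).trim().toLowerCase() === 'true';
}

// Falls back to the default when the value is missing or not a number.
function parseInteger(value, fallback) {
  const n = Number.parseInt(value, 10);
  return Number.isNaN(n) ? fallback : n;
}

// Variable names and defaults are taken from the "Common settings" table.
function loadCommonSettings(env = process.env) {
  return {
    provider: env.DEFAULT_PROVIDER || 'ollama',
    headless: parseBool(env.HEADLESS, false),
    browserTimeoutMs: parseInteger(env.BROWSER_TIMEOUT, 30000),
    maxIterations: parseInteger(env.MAX_ITERATIONS, 50),
    agentTimeoutMs: parseInteger(env.AGENT_TIMEOUT, 300000),
    memoryDir: env.MEMORY_DIR || '.memory',
    debug: parseBool(env.DEBUG, false),
  };
}

// Example: override two values and keep the remaining defaults.
console.log(loadCommonSettings({ HEADLESS: 'true', MAX_ITERATIONS: '20' }));
```

Passing the environment as an argument keeps the function easy to test without touching `process.env`.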
Ollama (local)
| Variable | Default | Description |
|----------|---------|-------------|
| OLLAMA_BASE_URL | http://localhost:11434 | Ollama server endpoint |
| DEFAULT_MODEL | llama3 | Default model when --model is omitted |
Anthropic Claude
| Variable | Description |
|----------|-------------|
| ANTHROPIC_API_KEY | API key from console.anthropic.com |
OpenAI
| Variable | Description |
|----------|-------------|
| OPENAI_API_KEY | API key from platform.openai.com |
Azure OpenAI
| Variable | Description |
|----------|-------------|
| AZURE_OPENAI_API_KEY | Azure resource API key |
| AZURE_OPENAI_ENDPOINT | https://<resource>.openai.azure.com/ |
| AZURE_OPENAI_DEPLOYMENT | Deployment / model name |
| AZURE_OPENAI_API_VERSION | API version (default 2024-10-21) |
AWS Bedrock
| Variable | Description |
|----------|-------------|
| AWS_ACCESS_KEY_ID | AWS access key (or use IAM role) |
| AWS_SECRET_ACCESS_KEY | AWS secret key |
| AWS_SESSION_TOKEN | Optional session token |
| AWS_REGION | Region (default us-east-1) |
Google Vertex AI
| Variable | Description |
|----------|-------------|
| GOOGLE_CLOUD_PROJECT | GCP project ID |
| GOOGLE_CLOUD_LOCATION | Region (default us-central1) |
Run gcloud auth application-default login before using Vertex AI.
Groq
| Variable | Description |
|----------|-------------|
| GROQ_API_KEY | API key from console.groq.com |
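The per-provider tables above can be summarised as a lookup from provider name to the variables it needs. The sketch below is a rough illustration under stated assumptions: `REQUIRED_VARS` and `missingVariables` are hypothetical names, the required sets follow the Prerequisites table, and Bedrock is treated as having no mandatory variable because credentials may come from an IAM role. The real `status` command may check more than this.

```javascript
// Hypothetical lookup; the real CLI may validate providers differently.
const REQUIRED_VARS = {
  ollama: [], // local, no API key
  anthropic: ['ANTHROPIC_API_KEY'],
  openai: ['OPENAI_API_KEY'],
  azure: ['AZURE_OPENAI_API_KEY', 'AZURE_OPENAI_ENDPOINT'],
  bedrock: [], // assumption: credentials may come from an IAM role instead
  vertexai: ['GOOGLE_CLOUD_PROJECT'],
  groq: ['GROQ_API_KEY'],
};

// Returns the names of required variables that are unset or empty.
function missingVariables(provider, env = process.env) {
  if (!Object.hasOwn(REQUIRED_VARS, provider)) {
    throw new Error(`Unknown provider: ${provider}`);
  }
  return REQUIRED_VARS[provider].filter((name) => !env[name]);
}

// Example: an Azure setup with the key set but the endpoint missing.
console.log(missingVariables('azure', { AZURE_OPENAI_API_KEY: 'example-key' }));
```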
Usage
node src/cli.js [options] <query>
Options:
-P, --provider <provider> LLM provider to use (default: ollama)
ollama | anthropic | openai | azure | bedrock | vertexai | groq
-m, --model <model> Model / deployment name (skips the picker)
-H, --headless Run browser headlessly
-v, --verbose Print tool calls and debug info
-s, --screenshot Auto-save screenshots
--no-memory Disable session memory for this run
--max-iterations <number> Cap agent reasoning steps (default: 50)
--timeout <ms> Agent timeout (default: 300000)
-V, --version Show version
-h, --help Show help
Built-in subcommands
# Check connection status for ALL configured providers
node src/cli.js status
# List models for a specific provider
node src/cli.js models # Ollama (default)
node src/cli.js models --provider anthropic
node src/cli.js models --provider groq
Examples
Using Ollama (local)
node src/cli.js --provider ollama "Search best JavaScript frameworks in 2025"
node src/cli.js --provider ollama --model mistral "Find iPhone 16 price on Amazon"
Using Anthropic Claude
node src/cli.js --provider anthropic "Go to news.ycombinator.com and list the top 5 stories"
node src/cli.js --provider anthropic --model claude-3-opus-20240229 "Extract the main headline from bbc.com"
Using OpenAI
node src/cli.js --provider openai "Fill the contact form on example.com with name 'Jane Doe'"
node src/cli.js --provider openai --model gpt-4-turbo --verbose "Go to MDN and summarise the Fetch API page"
Using Azure OpenAI
node src/cli.js --provider azure "Go to github.com/trending and take a screenshot"
Using AWS Bedrock
node src/cli.js --provider bedrock --model anthropic.claude-3-5-sonnet-20241022-v2:0 "Search for Node.js tutorials"
Using Google Vertex AI
node src/cli.js --provider vertexai "Go to google.com/maps and search for coffee near me"
Using Groq
node src/cli.js --provider groq --model llama3-70b-8192 "Summarise the front page of reuters.com"
Interactive mode (no --provider flag)
node src/cli.js
# → shows a provider picker, then a model picker, then a task prompt
Headless + screenshot
node src/cli.js --provider openai --headless --screenshot "Go to github.com/trending"
Project structure
ai-browser-cli/
│
├── src/
│ ├── cli.js ← Entry point (Commander CLI + provider selection)
│ │
│ ├── agent/
│ │ ├── agent.js ← LangGraph ReAct agent + streaming
│ │ ├── tools.js ← MCP tools (primary) + direct Playwright (fallback)
│ │ └── prompts.js ← System prompt + task-planning template
│ │
│ ├── llm/
│ │ ├── providers.js ← Multi-provider LLM factory (all 7 providers)
│ │ └── ollama.js ← Ollama-specific helpers (health-check, pull)
│ │
│ ├── browser/
│ │ └── playwright.js ← BrowserController (chromium singleton)
│ │
│ ├── memory/
│ │ └── memory.js ← SessionMemory + LongTermMemory
│ │
│ └── utils/
│ ├── logger.js ← Coloured logger factory
│ └── retry.js ← withRetry / withTimeout helpers
│
├── .env.example
├── .gitignore
├── package.json
└── README.md
Tool system
MCP tools (via @playwright/mcp)
When available, the agent runs @playwright/mcp as a subprocess and loads its tools through the MCP protocol. These include:
browser_navigate · browser_click · browser_fill · browser_snapshot · browser_screenshot · browser_press_key · browser_scroll · browser_wait_for
Direct Playwright tools (fallback)
If the MCP server cannot start, the agent falls back to Playwright running in-process:
| Tool | Description |
|------|-------------|
| open_url | Navigate to a URL |
| click_element | Click a CSS-selected element |
| type_text | Fill an input field |
| press_key | Send a keyboard key |
| extract_content | Read page text |
| get_page_info | Current URL + title |
| scroll_page | Scroll the viewport |
| wait_for_element | Wait for an element |
| wait | Fixed-time pause |
| take_screenshot | Capture a screenshot |
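The primary/fallback arrangement described in this section can be sketched as a small try-then-fall-back function. This is an illustration of the pattern only, not the project's `tools.js`: `loadTools`, `loadMcpTools`, and `loadDirectTools` are hypothetical names, and the loaders here are stubs rather than real MCP or Playwright calls.

```javascript
// Hypothetical sketch of the primary/fallback pattern; loaders are injected
// so that no real MCP subprocess or browser is needed to run it.
async function loadTools({ loadMcpTools, loadDirectTools, log = () => {} }) {
  try {
    const tools = await loadMcpTools(); // primary: tools from the MCP subprocess
    log('tool mode: mcp');
    return { mode: 'mcp', tools };
  } catch (err) {
    // Fallback: in-process Playwright tools when the MCP server cannot start.
    log(`MCP unavailable (${err.message}); falling back to direct Playwright`);
    const tools = await loadDirectTools();
    return { mode: 'direct', tools };
  }
}

// Demo with stub loaders: the first one fails, so the fallback is used.
loadTools({
  loadMcpTools: async () => { throw new Error('spawn failed'); },
  loadDirectTools: async () => ['open_url', 'click_element'],
  log: console.log,
}).then((result) => console.log(result.mode)); // prints "direct"
```

Returning the active mode alongside the tools makes it straightforward to report which path was taken, for example in verbose output.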
Memory system
Session memory (default on)
- Stores the last 10 query/answer pairs in .memory/session.json
- Injected as context into the next run's system prompt
- Disable for a single run: --no-memory
Long-term memory (optional)
- Key/value JSON store in .memory/long-term.json
- Enable via ENABLE_LONG_TERM_MEMORY=true in .env
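A capped session history like the one described above could be kept with a few lines of file I/O. The sketch below is an assumption-laden illustration, not the project's `memory.js`: the function names, the `{ query, answer }` record shape, and the context format are all hypothetical; only the limit of 10 pairs and the JSON file come from this README.

```javascript
const fs = require('node:fs');
const path = require('node:path');

const MAX_PAIRS = 10; // "last 10 query/answer pairs"

// Reads the stored history; a missing or corrupt file yields an empty list.
function loadSession(file) {
  try {
    const data = JSON.parse(fs.readFileSync(file, 'utf8'));
    return Array.isArray(data) ? data : [];
  } catch {
    return [];
  }
}

// Appends one exchange, keeps only the newest MAX_PAIRS, and writes it back.
function rememberExchange(file, query, answer) {
  const trimmed = [...loadSession(file), { query, answer }].slice(-MAX_PAIRS);
  fs.mkdirSync(path.dirname(file), { recursive: true });
  fs.writeFileSync(file, JSON.stringify(trimmed, null, 2));
  return trimmed;
}

// Formats the history as text that could be added to a system prompt.
function buildContext(history) {
  return history.map((h, i) => `${i + 1}. Q: ${h.query} | A: ${h.answer}`).join('\n');
}
```

A caller would typically run `rememberExchange('.memory/session.json', query, answer)` after each task and pass `buildContext(loadSession('.memory/session.json'))` into the next prompt.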
Troubleshooting
Ollama not connecting
ollama serve # start the server
curl http://localhost:11434/api/tags # verify it responds
Model not installed (Ollama)
ollama pull llama3
node src/cli.js models --provider ollama # confirm it appears
Cloud provider credentials missing
The CLI prints a setup hint with the exact variables to add. Copy them into your .env file and re-run. You can also verify all providers at once:
node src/cli.js status
Vertex AI authentication error
gcloud auth application-default login
AWS Bedrock access denied
Ensure the IAM policy attached to your credentials includes bedrock:InvokeModel for the target model ARN.
Playwright / browser not found
npm run install:browsers # installs Chromium
npx playwright install # installs all browsers
MCP server fails to start
The agent automatically falls back to direct Playwright. Use --verbose to confirm which mode is active.
Security notes
- Never pass real passwords as CLI arguments (they appear in shell history).
- Store credentials in .env (excluded from version control via .gitignore).
- The agent will not retry login more than twice to avoid account lockouts.
- No browser cookies or credentials are persisted between runs.
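This README does not show how the login retry cap is enforced. One way to express such a cap in code is sketched below, reading "not more than twice" as one initial attempt plus at most two retries. `loginWithCap` and `MAX_LOGIN_RETRIES` are hypothetical names, and `attemptLogin` stands in for whatever function performs the login.

```javascript
// Hypothetical sketch; the agent's real mechanism is not documented here.
const MAX_LOGIN_RETRIES = 2; // assumption: 1 initial attempt + 2 retries

// Calls attemptLogin until it succeeds or the retry budget is used up.
async function loginWithCap(attemptLogin, maxRetries = MAX_LOGIN_RETRIES) {
  let lastError;
  for (let attempt = 0; attempt <= maxRetries; attempt += 1) {
    try {
      return await attemptLogin(attempt);
    } catch (err) {
      lastError = err; // remember the failure and try again if budget remains
    }
  }
  throw new Error(`Login failed after ${maxRetries + 1} attempts: ${lastError.message}`);
}
```

With the default cap, a login function that always fails is called three times and then the error is surfaced instead of trying again.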
License
MIT
