dipclaw
v0.5.17
Browser automation agent with scheduling, skills, and LLM integration
Browser automation agent powered by an LLM. Control Chrome via Telegram or the terminal, with scheduled tasks, persistent memory, and auto-generated skills.
Features
Lightweight by design
Inspired by OpenClaw but stripped down to the essentials. No complex plugin system, no heavyweight dependencies — just a single agent that connects an LLM to your real Chrome browser via CDP (Chrome DevTools Protocol). The entire codebase stays small and hackable.
Self-evolving skills (Hermes-style)
Inspired by Hermes. When the agent completes a task via LLM-driven tool calls, it automatically generates a reusable skill (a structured script) from the execution log. On subsequent runs of the same task, the agent replays the skill directly — faster and more reliable. If a skill step fails, the LLM handles it as a fallback and then improves the skill based on what it learned. Tasks get better over time without manual intervention.
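The replay-then-fallback loop described above could be sketched roughly as follows. The `Skill` and `Step` shapes and the `runStep` and `llmFallback` callbacks are hypothetical names chosen for illustration, not dipclaw's actual internals, and the sketch is kept synchronous for brevity where a real agent would be async.

```typescript
// Hypothetical shapes; dipclaw's real skill format may differ.
type Step = { action: string; target: string };
type Skill = { task: string; steps: Step[] };

type StepRunner = (step: Step) => boolean; // true when the step succeeds
type Fallback = (failed: Step) => Step[];  // stand-in for the LLM proposing new steps

// Replay a stored skill. When a step fails, ask the fallback for
// replacement steps, run them, and record them in the improved skill.
function replaySkill(skill: Skill, runStep: StepRunner, llmFallback: Fallback): Skill {
  const improved: Step[] = [];
  for (const step of skill.steps) {
    if (runStep(step)) {
      improved.push(step);
      continue;
    }
    for (const replacement of llmFallback(step)) {
      if (!runStep(replacement)) {
        throw new Error(`fallback step failed: ${replacement.action} ${replacement.target}`);
      }
      improved.push(replacement);
    }
  }
  // The returned skill keeps working steps and swaps in the repaired ones.
  return { task: skill.task, steps: improved };
}
```

The returned skill would be saved in place of the old one, so the next run replays the repaired steps without calling the LLM.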
Human behavior simulation
Browser actions simulate real human behavior to reduce detection by anti-bot systems:
- Mouse movement: Bezier curve trajectories with sub-pixel jitter and slow-fast-slow speed profiles, instead of instant teleportation to element centers
- Clicking: Random click position within the element (not always dead center), with realistic mousedown-to-mouseup intervals (50-130ms)
- Typing: Per-character input with variable delays (30-150ms) and occasional longer pauses simulating natural thinking rhythm
- Scrolling: Chunked into small increments (50-150px) with random intervals, mimicking real scroll wheel behavior
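As a rough illustration of the mouse-movement and timing ideas above, the sketch below builds a cubic Bezier path with a slow-fast-slow profile and sub-pixel jitter, plus a helper for random delays. All function names, the bend factor, and the jitter size are assumptions made for this example; dipclaw's own implementation may look quite different.

```typescript
type Point = { x: number; y: number };

// Smoothstep easing: slow at both ends, fast in the middle.
function easeInOut(t: number): number {
  return t * t * (3 - 2 * t);
}

// Evaluate a cubic Bezier curve at parameter t in [0, 1].
function cubicBezier(p0: Point, p1: Point, p2: Point, p3: Point, t: number): Point {
  const u = 1 - t;
  const a = u * u * u;
  const b = 3 * u * u * t;
  const c = 3 * u * t * t;
  const d = t * t * t;
  return {
    x: a * p0.x + b * p1.x + c * p2.x + d * p3.x,
    y: a * p0.y + b * p1.y + c * p2.y + d * p3.y,
  };
}

// Build a curved path from `from` to `to`. `rand` returns values in [0, 1)
// and is injectable so the path can be made deterministic in tests.
function mousePath(from: Point, to: Point, steps: number, rand: () => number = Math.random): Point[] {
  const dx = to.x - from.x;
  const dy = to.y - from.y;
  // Shift both control points perpendicular to the straight line to bend the path.
  const bend = (rand() - 0.5) * 0.4;
  const c1 = { x: from.x + dx * 0.3 - dy * bend, y: from.y + dy * 0.3 + dx * bend };
  const c2 = { x: from.x + dx * 0.7 - dy * bend, y: from.y + dy * 0.7 + dx * bend };
  const path: Point[] = [];
  for (let i = 0; i <= steps; i++) {
    const p = cubicBezier(from, c1, c2, to, easeInOut(i / steps));
    // Sub-pixel jitter on intermediate points only, so the endpoints stay exact.
    const jitter = i === 0 || i === steps ? 0 : 1;
    path.push({ x: p.x + (rand() - 0.5) * jitter, y: p.y + (rand() - 0.5) * jitter });
  }
  return path;
}

// Uniform random integer delay in [min, max] milliseconds, e.g. 30-150 for typing.
function randomDelay(min: number, max: number, rand: () => number = Math.random): number {
  return Math.floor(min + rand() * (max - min + 1));
}
```

Each point in the path would be sent as a separate mouse-move event, with `randomDelay` supplying the pauses between keystrokes, click phases, or scroll chunks.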
Real Chrome, real sessions
Connects to your system Chrome (not a bundled browser), preserving your login sessions, cookies, extensions, and profiles across restarts. Each agent gets its own user data directory under the workspace.
Scheduled tasks with cron
Define recurring tasks with standard cron expressions. The agent runs them automatically, sends results (or errors) to Telegram, and tracks success/failure stats over time.
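To make the cron format concrete, here is a minimal sketch of how a five-field expression (minute, hour, day of month, month, day of week) can be checked against a date. It is an illustration only, not dipclaw's scheduler: it handles `*`, single numbers, ranges, lists, and `/n` steps, requires every field to match, and does not accept names such as `MON` or `7` for Sunday.

```typescript
// Check whether one cron field (e.g. "*/15", "1-5", "0,30") matches a value.
// `min` is the lowest legal value for the field, used as the origin for "*/n".
function fieldMatches(field: string, value: number, min: number): boolean {
  return field.split(",").some((part) => {
    const pieces = part.split("/");
    const range = pieces[0] ?? "*";
    const step = pieces.length > 1 ? Number(pieces[1]) : 1;
    if (!Number.isInteger(step) || step < 1) return false;
    let lo = min;
    let hi = Infinity;
    if (range !== "*") {
      const bounds = range.split("-").map(Number);
      lo = bounds[0] ?? NaN;
      hi = bounds.length > 1 ? (bounds[1] ?? NaN) : lo;
    }
    // Comparisons with NaN are false, so malformed numbers never match.
    return value >= lo && value <= hi && (value - lo) % step === 0;
  });
}

// Check a full 5-field expression against a date, using local time.
function cronMatches(expr: string, date: Date): boolean {
  const fields = expr.trim().split(/\s+/);
  if (fields.length !== 5) {
    throw new Error(`expected 5 cron fields, got ${fields.length}`);
  }
  const values = [
    date.getMinutes(),
    date.getHours(),
    date.getDate(),
    date.getMonth() + 1, // cron months are 1-12
    date.getDay(),       // 0 = Sunday
  ];
  const mins = [0, 0, 1, 1, 0];
  return fields.every((f, i) => fieldMatches(f, values[i] ?? NaN, mins[i] ?? 0));
}
```

For example, `30 9 * * 1-5` matches 09:30 on weekdays, and `*/15 * * * *` matches every quarter hour.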
Persistent memory
The agent can save and recall information across sessions — website quirks, login flows, discovered patterns — stored as files in the workspace.
Configurable iteration limits
Set the max number of tool-calling rounds per conversation. When the limit is reached, the agent produces a final summary instead of silently stopping.
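The behavior at the limit can be pictured with a small loop like the one below. This is a sketch under assumptions: `runWithLimit`, `step`, and `summarize` are hypothetical names, with `step` standing in for one tool-calling round and `summarize` for the final LLM summary.

```typescript
type Turn = { done: boolean; text: string };

// Run up to `maxIterations` tool-calling rounds. `step` performs one round;
// `summarize` produces the final summary when the budget is exhausted.
function runWithLimit(
  maxIterations: number,
  step: (round: number) => Turn,
  summarize: (history: string[]) => string,
): { text: string; rounds: number; limitReached: boolean } {
  const history: string[] = [];
  for (let round = 1; round <= maxIterations; round++) {
    const turn = step(round);
    history.push(turn.text);
    if (turn.done) {
      return { text: turn.text, rounds: round, limitReached: false };
    }
  }
  // Budget spent without finishing: report what was done instead of stopping silently.
  return { text: summarize(history), rounds: maxIterations, limitReached: true };
}
```

With the default of 90, a conversation that never finishes on its own would end after 90 rounds with a summary of the work so far.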
Proxy and timezone spoofing
Built-in support for SOCKS5/HTTP proxies with authentication, and browser timezone override via CDP emulation — useful for geo-specific automation.
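For orientation, the sketch below shows the two standard building blocks such a feature usually rests on: Chrome's `--proxy-server` command-line flag and the CDP method `Emulation.setTimezoneOverride`. The helper names `proxyFlag` and `timezoneCommand` are made up for this example, and how dipclaw wires these together (including how it supplies proxy credentials) is not shown here.

```typescript
// Mirrors the `browser.proxy` object from the example config.
type ProxyConfig = {
  protocol: string;
  host: string;
  port: number;
  username?: string;
  password?: string;
};

// Build the Chrome command-line flag for a proxy. Credentials are left out
// on purpose: the --proxy-server flag does not carry a username or password,
// so authentication has to be handled separately.
function proxyFlag(proxy: ProxyConfig): string {
  return `--proxy-server=${proxy.protocol}://${proxy.host}:${proxy.port}`;
}

// Build the CDP message that overrides the browser timezone.
// `id` is the usual CDP request id chosen by the client.
function timezoneCommand(id: number, timezoneId: string): string {
  return JSON.stringify({
    id,
    method: "Emulation.setTimezoneOverride",
    params: { timezoneId },
  });
}
```

The flag would be passed when Chrome is launched, while the timezone message would be sent over the CDP WebSocket once a page target is attached.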
Requirements
- Node.js >= 18
- Google Chrome installed
Install
Option 1: npm global install

    npm install -g dipclaw

Option 2: From source

    git clone https://github.com/dipcoinlab/dipclaw.git
    cd dipclaw
    npm install

Configuration
Copy the example config and edit it:

    cp config.example.json config.json

Example config.json:

    {
      "name": "my-agent",
      "workspace": "./workspace",
      "llm": {
        "protocol": "openai",
        "baseUrl": "https://api.openai.com",
        "apiKey": "sk-...",
        "model": "gpt-4o"
      },
      "telegram": {
        "botToken": "123456:ABC...",
        "allowedUsers": [123456789]
      },
      "maxIterations": 90,
      "browser": {
        "headless": false,
        "proxy": {
          "protocol": "socks5",
          "host": "127.0.0.1",
          "port": 1080,
          "username": "user",
          "password": "pass"
        }
      }
    }

Config fields
| Field | Required | Description |
|-------|----------|-------------|
| name | Yes | Agent name (also used as Chrome profile name) |
| workspace | Yes | Directory for browser data, logs, skills, and memory |
| llm.protocol | Yes | openai or claude |
| llm.baseUrl | Yes | LLM API base URL |
| llm.apiKey | Yes | API key |
| llm.model | Yes | Model name |
| telegram.botToken | No | Telegram bot token from @BotFather |
| telegram.allowedUsers | No | Array of allowed Telegram user IDs |
| maxIterations | No | Max tool-calling iterations per chat (default: 90) |
| browser.headless | No | Run Chrome headless (default: false) |
| browser.proxy | No | Browser proxy config |
| browser.viewport | No | { width, height } - omit for maximized |
| browser.timezone | No | Browser timezone (e.g. America/New_York) |
| debug | No | Log LLM requests/responses (default: false) |
Environment variables can be used in config values with ${VAR} or $VAR syntax.
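A minimal sketch of this kind of substitution is shown below. The function name `expandEnv` is hypothetical, and leaving unknown variables untouched is an assumption of this example; dipclaw may instead substitute an empty string or report an error.

```typescript
// Expand ${VAR} and $VAR references using the given environment map.
// Assumption: unknown variables are left as-is so mistakes are easy to spot.
function expandEnv(value: string, env: Record<string, string | undefined>): string {
  const pattern = /\$\{([A-Za-z_][A-Za-z0-9_]*)\}|\$([A-Za-z_][A-Za-z0-9_]*)/g;
  return value.replace(
    pattern,
    (match: string, braced: string | undefined, bare: string | undefined) => {
      const name = braced ?? bare ?? "";
      const found = env[name];
      return found === undefined ? match : found;
    },
  );
}
```

In practice this lets secrets stay out of config.json, for example by setting `llm.apiKey` to `${OPENAI_API_KEY}` (a variable name picked here for illustration) and passing `process.env` as the map.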
Usage
Global install

    dipclaw --config config.json

From source

    # Development (with hot reload)
    npm run dev -- --config config.json

    # Production
    npm run build
    npm start -- --config config.json

CLI options

    --config <path>        Path to config file
    --max-iterations <n>   Override max iterations
    --headless             Run browser headless
    --proxy <url>          Browser proxy (e.g. socks5://user:pass@host:port)
    --viewport <WxH>       Browser viewport size
    --timezone <tz>        Browser timezone (e.g. America/New_York)
    --debug                Enable debug logging
    --tui                  Enable interactive terminal UI

Telegram commands
| Command | Description |
|---------|-------------|
| /new | Start a new session (clear chat history) |
| /tasks | List scheduled tasks |
| /run <id> | Run a task immediately |
| /remove <id> | Remove a task |
| /enable <id> | Enable a task |
| /disable <id> | Disable a task |
| /logs | View execution logs |
| /skills | List available skills |
| /memory | View saved memories |
License
MIT
