dipclaw

v0.5.17

Published

2 months ago

Browser automation agent with scheduling, skills, and LLM integration

star_dipcoin

agent browser-automation playwright llm telegram chrome automation

Browser automation agent powered by an LLM. Control Chrome via Telegram or the terminal, with scheduled tasks, persistent memory, and auto-generated skills.

Features

Lightweight by design

Inspired by OpenClaw but stripped down to the essentials. No complex plugin system, no heavyweight dependencies — just a single agent that connects an LLM to your real Chrome browser via CDP (Chrome DevTools Protocol). The entire codebase stays small and hackable.

Self-evolving skills (Hermes-style)

Inspired by Hermes. When the agent completes a task via LLM-driven tool calls, it automatically generates a reusable skill (a structured script) from the execution log. On subsequent runs of the same task, the agent replays the skill directly — faster and more reliable. If a skill step fails, the LLM handles it as a fallback and then improves the skill based on what it learned. Tasks get better over time without manual intervention.
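The replay-then-fallback loop described above can be sketched roughly as follows. This is a minimal illustration, not dipclaw's actual code: the `Step` and `Skill` shapes and the `runStep` and `llmRecover` callbacks are hypothetical names introduced only for this example.

```typescript
// Hypothetical shapes: the real dipclaw skill format is not documented here.
type Step = { action: string; args: Record<string, unknown> };
type Skill = { task: string; steps: Step[] };

// Injected dependencies keep the sketch independent of any browser or LLM client.
type Deps = {
  // Replays one recorded step; throws if the step fails.
  runStep: (step: Step) => Promise<void>;
  // Asks the LLM for a replacement step after a failure.
  llmRecover: (failed: Step, error: Error) => Promise<Step>;
};

// Replays a skill step by step. When a step fails, the LLM supplies a
// replacement, and the returned skill records the replacement instead.
async function replaySkill(skill: Skill, deps: Deps): Promise<Skill> {
  const improved: Step[] = [];
  for (const step of skill.steps) {
    try {
      await deps.runStep(step);
      improved.push(step);
    } catch (err) {
      const error = err instanceof Error ? err : new Error(String(err));
      const replacement = await deps.llmRecover(step, error);
      await deps.runStep(replacement); // a second failure propagates to the caller
      improved.push(replacement);
    }
  }
  return { task: skill.task, steps: improved };
}
```

Saving the returned skill in place of the old one is what would let the next run skip the LLM for the step that was repaired.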

Human behavior simulation

Browser actions simulate real human behavior to reduce detection by anti-bot systems:

Mouse movement: Bezier curve trajectories with sub-pixel jitter and slow-fast-slow speed profiles, instead of instant teleportation to element centers
Clicking: Random click position within the element (not always dead center), with realistic mousedown-to-mouseup intervals (50-130ms)
Typing: Per-character input with variable delays (30-150ms) and occasional longer pauses simulating natural thinking rhythm
Scrolling: Chunked into small increments (50-150px) with random intervals, mimicking real scroll wheel behavior
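As an illustration of the techniques listed above, the sketch below generates a cubic Bezier mouse path with smoothstep easing (slow-fast-slow) and sub-pixel jitter, plus randomized delays and scroll chunks using the ranges quoted in the list. The function names, the number of path steps, and the control-point placement are assumptions made for this example; dipclaw's real implementation may differ.

```typescript
type Point = { x: number; y: number };

// Smoothstep easing: slow start, fast middle, slow end.
const ease = (t: number): number => t * t * (3 - 2 * t);

// Evaluates a cubic Bezier curve at parameter t in [0, 1].
function bezier(p0: Point, p1: Point, p2: Point, p3: Point, t: number): Point {
  const u = 1 - t;
  const a = u * u * u, b = 3 * u * u * t, c = 3 * u * t * t, d = t * t * t;
  return {
    x: a * p0.x + b * p1.x + c * p2.x + d * p3.x,
    y: a * p0.y + b * p1.y + c * p2.y + d * p3.y,
  };
}

// Builds a curved path from `from` to `to` with sub-pixel jitter on interior points.
function mousePath(from: Point, to: Point, steps = 25, rand: () => number = Math.random): Point[] {
  const dx = to.x - from.x, dy = to.y - from.y;
  const bend = (rand() - 0.5) * 0.4; // sideways bow, up to 20% of the distance
  const c1 = { x: from.x + dx * 0.3 - dy * bend, y: from.y + dy * 0.3 + dx * bend };
  const c2 = { x: from.x + dx * 0.7 - dy * bend, y: from.y + dy * 0.7 + dx * bend };
  const points: Point[] = [];
  for (let i = 0; i <= steps; i++) {
    const p = bezier(from, c1, c2, to, ease(i / steps));
    const exact = i === 0 || i === steps; // endpoints stay exact
    points.push({
      x: p.x + (exact ? 0 : (rand() - 0.5) * 0.8),
      y: p.y + (exact ? 0 : (rand() - 0.5) * 0.8),
    });
  }
  return points;
}

// Uniform random integer in [min, max].
const randInt = (min: number, max: number, rand: () => number): number =>
  Math.floor(min + rand() * (max - min + 1));

// Delay ranges taken from the list above.
const clickHoldMs = (rand: () => number = Math.random): number => randInt(50, 130, rand);
const keyDelayMs = (rand: () => number = Math.random): number => randInt(30, 150, rand);

// Splits a scroll distance (pixels, signed) into increments of 50-150 px.
// The final chunk may be smaller so the total is preserved.
function scrollChunks(total: number, rand: () => number = Math.random): number[] {
  const chunks: number[] = [];
  const sign = Math.sign(total);
  let remaining = Math.abs(total);
  while (remaining > 0) {
    const step = Math.min(remaining, randInt(50, 150, rand));
    chunks.push(sign * step);
    remaining -= step;
  }
  return chunks;
}
```

Passing the random source as a parameter is a design choice for this sketch only: it makes the output reproducible in tests while defaulting to `Math.random` in normal use.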

Real Chrome, real sessions

Connects to your system Chrome (not a bundled browser), preserving your login sessions, cookies, extensions, and profiles across restarts. Each agent gets its own user data directory under the workspace.
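One way to picture the per-agent profile is as a set of Chrome launch flags. The sketch below assembles them; `--remote-debugging-port`, `--user-data-dir`, and `--no-first-run` are real Chrome flags, but the `chrome-profiles` subdirectory, the default port 9222, and the `chromeArgs` helper itself are assumptions for illustration and are not taken from dipclaw's source.

```typescript
import * as path from "node:path";

// Builds launch flags for a system Chrome that exposes CDP and keeps its
// profile in a directory owned by the agent.
// Assumed layout: <workspace>/chrome-profiles/<agentName>.
function chromeArgs(workspace: string, agentName: string, debugPort = 9222): string[] {
  const userDataDir = path.resolve(workspace, "chrome-profiles", agentName);
  return [
    `--remote-debugging-port=${debugPort}`, // CDP endpoint clients attach to
    `--user-data-dir=${userDataDir}`, // cookies, logins, and extensions persist here
    "--no-first-run", // skip the first-run welcome flow
  ];
}
```

A client such as Playwright could then attach to the running browser with `chromium.connectOverCDP("http://127.0.0.1:9222")`. Because the profile directory is reused, sessions survive restarts.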

Scheduled tasks with cron

Define recurring tasks with standard cron expressions. The agent runs them automatically, sends results (or errors) to Telegram, and tracks success/failure stats over time.
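For readers unfamiliar with cron syntax, the sketch below shows how a standard five-field expression (minute, hour, day of month, month, day of week) can be matched against a point in time. It is a simplified matcher written for this example, supporting only `*`, single values, ranges, lists, and `/step`; it is not dipclaw's scheduler, and names such as `cronMatches` are hypothetical.

```typescript
// Expands one cron field ("*", "*/15", "1-5", "1,15", "0-30/10")
// into the set of values it allows within [min, max].
function expandField(field: string, min: number, max: number): Set<number> {
  const allowed = new Set<number>();
  for (const part of field.split(",")) {
    const m = /^(\*|(\d+)(?:-(\d+))?)(?:\/(\d+))?$/.exec(part);
    if (!m) throw new Error(`Invalid cron field: ${field}`);
    const step = m[4] ? Number(m[4]) : 1;
    let lo = min;
    let hi = max;
    if (m[1] !== "*") {
      lo = Number(m[2]);
      // "5" means just 5; "5/10" means from 5 to max in steps of 10.
      hi = m[3] ? Number(m[3]) : m[4] ? max : lo;
    }
    if (step < 1 || lo < min || hi > max || lo > hi) {
      throw new Error(`Cron field out of range: ${field}`);
    }
    for (let v = lo; v <= hi; v += step) allowed.add(v);
  }
  return allowed;
}

// Returns true when `date` (interpreted in local time) matches the expression.
function cronMatches(expr: string, date: Date): boolean {
  const fields = expr.trim().split(/\s+/);
  if (fields.length !== 5) throw new Error("Expected 5 cron fields");
  const [minute = "", hour = "", dom = "", month = "", dow = ""] = fields;

  const domOk = expandField(dom, 1, 31).has(date.getDate());
  const dowSet = expandField(dow, 0, 7); // both 0 and 7 mean Sunday
  const dowOk = dowSet.has(date.getDay()) || (date.getDay() === 0 && dowSet.has(7));
  // Classic cron rule: if both day fields are restricted, either may match.
  const dayOk = dom === "*" || dow === "*" ? domOk && dowOk : domOk || dowOk;

  return (
    expandField(minute, 0, 59).has(date.getMinutes()) &&
    expandField(hour, 0, 23).has(date.getHours()) &&
    expandField(month, 1, 12).has(date.getMonth() + 1) &&
    dayOk
  );
}
```

For example, `30 9 * * 1-5` matches 09:30 on weekdays, and `*/15 * * * *` matches every quarter hour.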

Persistent memory

The agent can save and recall information across sessions — website quirks, login flows, discovered patterns — stored as files in the workspace.
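A file-backed memory store along these lines could look like the sketch below. The `memory` subdirectory, the one-file-per-key layout, the `.md` extension, and the function names are all assumptions for illustration; the source only states that memories are stored as files in the workspace.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Maps a memory key to a file path. Assumed layout: <workspace>/memory/<key>.md
function memoryFile(workspace: string, key: string): string {
  // Replace anything outside a safe character set so keys cannot escape the directory.
  const safeKey = key.replace(/[^a-zA-Z0-9._-]/g, "_");
  return path.join(workspace, "memory", `${safeKey}.md`);
}

// Writes (or overwrites) one memory entry.
function saveMemory(workspace: string, key: string, content: string): void {
  const file = memoryFile(workspace, key);
  fs.mkdirSync(path.dirname(file), { recursive: true });
  fs.writeFileSync(file, content, "utf8");
}

// Reads one memory entry, or returns null when nothing was saved under that key.
function recallMemory(workspace: string, key: string): string | null {
  const file = memoryFile(workspace, key);
  return fs.existsSync(file) ? fs.readFileSync(file, "utf8") : null;
}
```

Plain files keep memories easy to inspect and edit by hand, and they survive process restarts without a database.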

Configurable iteration limits

Set the max number of tool-calling rounds per conversation. When the limit is reached, the agent produces a final summary instead of silently stopping.
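The limit behaves like a bounded loop with a guaranteed final answer. The sketch below shows the idea under stated assumptions: `step` stands in for one LLM round (including any tool calls) and `summarize` for the final-summary request; both are hypothetical callbacks, and only the default of 90 comes from the configuration described in this README.

```typescript
// Result of one tool-calling round.
type Turn = { done: boolean; text: string };

// Runs tool-calling rounds until the model finishes or the limit is reached.
// When the limit is hit, a summary is requested rather than returning nothing.
async function runWithLimit(
  step: (iteration: number) => Promise<Turn>,
  summarize: () => Promise<string>,
  maxIterations = 90,
): Promise<string> {
  for (let i = 1; i <= maxIterations; i++) {
    const turn = await step(i);
    if (turn.done) return turn.text; // the model finished on its own
  }
  return summarize(); // limit reached: produce a final summary
}
```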

Proxy and timezone spoofing

Built-in support for SOCKS5/HTTP proxies with authentication, and browser timezone override via CDP emulation — useful for geo-specific automation.
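To make the two settings concrete, the sketch below parses a proxy URL of the form `socks5://user:pass@host:port` into the `protocol`/`host`/`port`/`username`/`password` fields used in the config, and checks that a timezone string is a valid IANA name before it would be handed to CDP (the `Emulation.setTimezoneOverride` method takes a `timezoneId`). Both helper functions are hypothetical and written for this example; whether dipclaw validates input this way is an assumption.

```typescript
type ProxyConfig = {
  protocol: string;
  host: string;
  port: number;
  username?: string;
  password?: string;
};

// Parses "socks5://user:pass@127.0.0.1:1080" or "http://host:8080".
function parseProxyUrl(raw: string): ProxyConfig {
  const url = new URL(raw);
  const protocol = url.protocol.replace(/:$/, "");
  if (protocol !== "socks5" && protocol !== "http") {
    throw new Error(`Unsupported proxy protocol: ${protocol}`);
  }
  // The URL parser drops a port that equals the scheme default (80 for http).
  const port = url.port ? Number(url.port) : protocol === "http" ? 80 : NaN;
  if (!url.hostname || !Number.isInteger(port)) {
    throw new Error(`Proxy URL needs a host and port: ${raw}`);
  }
  const config: ProxyConfig = { protocol, host: url.hostname, port };
  if (url.username) config.username = decodeURIComponent(url.username);
  if (url.password) config.password = decodeURIComponent(url.password);
  return config;
}

// Intl.DateTimeFormat throws a RangeError for unknown timezone names.
function isValidTimezone(timezone: string): boolean {
  try {
    new Intl.DateTimeFormat("en-US", { timeZone: timezone });
    return true;
  } catch {
    return false;
  }
}
```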

Requirements

Node.js >= 18
Google Chrome installed

Install

Option 1: npm global install

npm install -g dipclaw

Option 2: From source

git clone https://github.com/dipcoinlab/dipclaw.git
cd dipclaw
npm install

Configuration

Copy the example config and edit it:

cp config.example.json config.json

{
  "name": "my-agent",
  "workspace": "./workspace",
  "llm": {
    "protocol": "openai",
    "baseUrl": "https://api.openai.com",
    "apiKey": "sk-...",
    "model": "gpt-4o"
  },
  "telegram": {
    "botToken": "123456:ABC...",
    "allowedUsers": [123456789]
  },
  "maxIterations": 90,
  "browser": {
    "headless": false,
    "proxy": {
      "protocol": "socks5",
      "host": "127.0.0.1",
      "port": 1080,
      "username": "user",
      "password": "pass"
    }
  }
}

Config fields

| Field | Required | Description |
|-------|----------|-------------|
| name | Yes | Agent name (also used as Chrome profile name) |
| workspace | Yes | Directory for browser data, logs, skills, and memory |
| llm.protocol | Yes | openai or claude |
| llm.baseUrl | Yes | LLM API base URL |
| llm.apiKey | Yes | API key |
| llm.model | Yes | Model name |
| telegram.botToken | No | Telegram bot token from @BotFather |
| telegram.allowedUsers | No | Array of allowed Telegram user IDs |
| maxIterations | No | Max tool-calling iterations per chat (default: 90) |
| browser.headless | No | Run Chrome headless (default: false) |
| browser.proxy | No | Browser proxy config |
| browser.viewport | No | { width, height } - omit for maximized |
| browser.timezone | No | Browser timezone (e.g. America/New_York) |
| debug | No | Log LLM requests/responses (default: false) |
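The field list can also be read as a type. The sketch below mirrors the fields and defaults documented above as a TypeScript interface, with a small check for missing required fields. The interface and helper names are hypothetical and written for this example; they are not exported by dipclaw.

```typescript
interface DipclawConfig {
  name: string;
  workspace: string;
  llm: { protocol: "openai" | "claude"; baseUrl: string; apiKey: string; model: string };
  telegram?: { botToken?: string; allowedUsers?: number[] };
  maxIterations?: number;
  browser?: {
    headless?: boolean;
    proxy?: { protocol: string; host: string; port: number; username?: string; password?: string };
    viewport?: { width: number; height: number };
    timezone?: string;
  };
  debug?: boolean;
}

// Fields marked "Yes" in the Required column.
const REQUIRED_FIELDS = [
  "name", "workspace", "llm.protocol", "llm.baseUrl", "llm.apiKey", "llm.model",
];

// Looks up a dotted path such as "llm.model" in an untyped object.
function getPath(obj: unknown, dotted: string): unknown {
  return dotted.split(".").reduce<unknown>(
    (cur, key) =>
      cur !== null && typeof cur === "object" ? (cur as Record<string, unknown>)[key] : undefined,
    obj,
  );
}

// Returns the required fields that are absent or empty in a parsed config.
function missingFields(config: unknown): string[] {
  return REQUIRED_FIELDS.filter((field) => {
    const value = getPath(config, field);
    return value === undefined || value === null || value === "";
  });
}

// Fills in the documented defaults for optional fields.
function withDefaults(config: DipclawConfig): DipclawConfig {
  return {
    ...config,
    maxIterations: config.maxIterations ?? 90,
    debug: config.debug ?? false,
    browser: { ...config.browser, headless: config.browser?.headless ?? false },
  };
}
```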

Environment variables can be used in config values with ${VAR} or $VAR syntax.
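A substitution pass of this kind can be sketched as follows. This is an illustrative implementation, not dipclaw's: the function names are hypothetical, and leaving unset variables untouched is an assumption (the real tool might substitute an empty string or report an error instead).

```typescript
type Env = Record<string, string | undefined>;

// Replaces ${VAR} and $VAR with values from `env`.
// Assumption: references to unset variables are left as written.
function expandEnv(value: string, env: Env): string {
  const pattern = /\$\{([A-Za-z_][A-Za-z0-9_]*)\}|\$([A-Za-z_][A-Za-z0-9_]*)/g;
  return value.replace(pattern, (match: string, braced?: string, bare?: string) => {
    const name = braced ?? bare ?? "";
    const resolved = env[name];
    return resolved === undefined ? match : resolved;
  });
}

// Applies expandEnv to every string inside a parsed JSON config.
function expandConfig(node: unknown, env: Env): unknown {
  if (typeof node === "string") return expandEnv(node, env);
  if (Array.isArray(node)) return node.map((item) => expandConfig(item, env));
  if (node !== null && typeof node === "object") {
    return Object.fromEntries(
      Object.entries(node as Record<string, unknown>).map(
        ([key, value]): [string, unknown] => [key, expandConfig(value, env)],
      ),
    );
  }
  return node; // numbers, booleans, and null pass through unchanged
}
```

With this approach a config could hold `"apiKey": "${OPENAI_API_KEY}"` and be expanded with `expandConfig(parsedJson, process.env)`, which keeps secrets out of the file. The variable name `OPENAI_API_KEY` is only an example.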

Usage

Global install

dipclaw --config config.json

From source

# Development (with hot reload)
npm run dev -- --config config.json

# Production
npm run build
npm start -- --config config.json

CLI options

--config <path>       Path to config file
--max-iterations <n>  Override max iterations
--headless            Run browser headless
--proxy <url>         Browser proxy (e.g. socks5://user:pass@host:port)
--viewport <WxH>      Browser viewport size
--timezone <tz>       Browser timezone (e.g. America/New_York)
--debug               Enable debug logging
--tui                 Enable interactive terminal UI

Telegram commands

| Command | Description |
|---------|-------------|
| /new | Start a new session (clear chat history) |
| /tasks | List scheduled tasks |
| /run <id> | Run a task immediately |
| /remove <id> | Remove a task |
| /enable <id> | Enable a task |
| /disable <id> | Disable a task |
| /logs | View execution logs |
| /skills | List available skills |
| /memory | View saved memories |
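The command set above, together with the `telegram.allowedUsers` allow-list, could be handled by a parser like the sketch below. It is a hypothetical illustration rather than dipclaw's handler; the only outside fact it relies on is that Telegram may append `@botname` to a command in group chats.

```typescript
const COMMANDS_WITH_ID = new Set(["run", "remove", "enable", "disable"]);
const COMMANDS_WITHOUT_ID = new Set(["new", "tasks", "logs", "skills", "memory"]);

type ParsedCommand = { command: string; id?: string } | { error: string };

// Parses messages such as "/tasks", "/run 3", or "/run@my_bot 3".
function parseCommand(text: string): ParsedCommand {
  const m = /^\/([a-z]+)(?:@\w+)?(?:\s+(\S+))?\s*$/.exec(text.trim());
  if (!m) return { error: "Not a command" };
  const command = m[1] ?? "";
  const id = m[2];
  if (COMMANDS_WITH_ID.has(command)) {
    return id === undefined ? { error: `/${command} requires a task id` } : { command, id };
  }
  if (COMMANDS_WITHOUT_ID.has(command)) return { command };
  return { error: `Unknown command: /${command}` };
}

// Allow-list check. Assumption: a missing or empty list rejects everyone.
function isAllowedUser(userId: number, allowedUsers: number[] = []): boolean {
  return allowedUsers.includes(userId);
}
```

Checking `isAllowedUser` before `parseCommand` would keep unauthorized users from triggering any browser action.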

License

MIT
