npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

claude-web-client

v1.0.0

Published

Browser automation for AI assistants using Playwright's accessibility tree - giving Claude local browsing capabilities without vision models

Readme

Claude Web Client

A professional browser automation tool that gives Claude AI local web browsing capabilities through code, not vision. Built on Playwright and Puppeteer, it provides both a CLI interface for AI assistants and a programmatic API for JavaScript/TypeScript applications.

Key Innovation: Uses Playwright's accessibility tree to bridge the gap between visual UIs and text-based AI - giving Claude a semantic understanding of web pages without needing vision capabilities!

⚠️ Important: What This Is (And Isn't)

✅ What This IS:

  • Cost-effective web interaction - Uses local compute + text-based APIs instead of expensive vision models
  • Code-native browsing - Playwright returns semantic page structure as code/JSON that Claude natively understands
  • Semantic layout understanding - Claude can understand element roles, hierarchy, and interactions through accessibility tree
  • Text and structure extraction - Perfect for scraping, form filling, navigation, data extraction
  • Professional automation - Built on battle-tested tools (Playwright/Puppeteer) used by thousands of companies

❌ What This ISN'T:

  • NOT multimodal vision browsing - Claude cannot "see" images, charts, or visual layouts
  • NOT for visual content - Cannot describe what images look like or understand visual design
  • NOT pixel-perfect understanding - Works with semantic structure, not visual appearance
  • NOT a replacement for vision models - For visual tasks, you still need Claude with vision capabilities

🎯 Best Use Cases:

  • Web scraping and data extraction
  • Form automation and testing
  • Navigation and interaction with web apps
  • Reading text content from pages
  • Monitoring websites for changes
  • E2E testing without visual validation

💡 Cost Reduction Strategy:

Instead of: Screenshot → Vision Model → $$$ We use: Accessibility Tree → Text Tokens → 10-100x cheaper

This is about making web automation cost-effective by using professional tools that return code, Claude's native language.

Features

  • 🤖 CLI for AI assistants - Simple commands Claude can execute
  • 📦 Programmatic API - Import and use in your own scripts
  • 🧠 Accessibility Tree - Playwright's semantic page representation for LLM understanding
  • 🎭 Dual Engine Support - Use Playwright (recommended) or Puppeteer
  • 🔒 Type-safe - Full TypeScript support with type definitions
  • 🚀 Easy to use - Works with both JavaScript and TypeScript
  • 🌐 Full browser control - Navigate, interact, execute JS, take screenshots
  • 📸 Screenshot support - Capture full pages or viewports
  • 🎯 Element interaction - Click, type, and wait for elements
  • 📄 Content extraction - Get HTML, text, or structured accessibility data

Quick Start

New to using this with Claude?Read the Claude Usage Guide

Installation

Local Installation

npm install

Global Installation

npm install -g .

After global installation, you can use the claude-web-client command from anywhere:

claude-web-client launch
claude-web-client navigate https://example.com
claude-web-client text

Usage

claude-web-client can be used in two ways:

  1. CLI Mode - For AI assistants like Claude to execute browser commands
  2. Programmatic Mode - Import as a library in your JavaScript/TypeScript projects

🤖 CLI Usage (For Claude AI)

The CLI provides various commands to interact with the browser:

Launch Browser

node bin/cli.js launch          # Launch in headless mode
node bin/cli.js launch --no-headless  # Launch with visible browser

Navigate to URL

node bin/cli.js navigate https://example.com

Take Screenshot

node bin/cli.js screenshot screenshot.png
node bin/cli.js screenshot --type jpeg output.jpg

Get Page Content

node bin/cli.js content   # Get HTML content
node bin/cli.js text      # Get text content only

Execute JavaScript

node bin/cli.js execute "document.title"
node bin/cli.js execute "document.querySelectorAll('a').length"

Interact with Elements

node bin/cli.js click "#submit-button"
node bin/cli.js type "#search-input" "search query"
node bin/cli.js wait ".loading-complete"

Get Page Info

node bin/cli.js info

Close Browser

node bin/cli.js close

Example: Claude Browsing the Web

When I (Claude) need to browse the web, I can use these commands:

# Launch browser
claude-web-client launch

# Navigate to a page
claude-web-client navigate https://news.ycombinator.com

# Get the page text to read it
claude-web-client text

# Execute JavaScript to extract specific data
claude-web-client execute "Array.from(document.querySelectorAll('.titleline > a')).slice(0,5).map(a => a.textContent)"

# Take a screenshot
claude-web-client screenshot hn-screenshot.png

# Close when done
claude-web-client close

Or run the demo script:

bash examples/claude-demo.sh

📦 Programmatic Usage (For Scripts)

Import BrowserService into your JavaScript or TypeScript projects:

JavaScript Example (Recommended: Playwright)

import { UnifiedBrowser } from 'claude-web-client';

const browser = new UnifiedBrowser('playwright'); // or 'puppeteer'

// Launch and navigate
await browser.launch({ headless: true });
await browser.navigate('https://example.com');

// Get accessibility tree - semantic page representation for AI!
const { tree } = await browser.getAccessibilityTree();
console.log('Page structure:', tree);

// Get page content
const { text } = await browser.getText();
console.log(text);

// Execute JavaScript
const result = await browser.execute('document.title');
console.log(result.result);

// Take screenshot
await browser.screenshot('./screenshot.png');

// Clean up
await browser.close();

Legacy Puppeteer Example

import { BrowserService } from 'claude-web-client';

const browser = new BrowserService();
await browser.launch();
// ... same API as above (but no accessibility tree)
await browser.close();

TypeScript Example

import { BrowserService, NavigateResult } from 'claude-web-client';

async function scrapeWebsite(url: string): Promise<string> {
  const browser = new BrowserService();

  try {
    await browser.launch({ headless: true });

    const navResult: NavigateResult = await browser.navigate(url);

    if (navResult.success) {
      const { text } = await browser.getText();
      return text || '';
    }

    throw new Error(navResult.error);
  } finally {
    await browser.close();
  }
}

Run the Examples

# Basic usage example (Puppeteer)
node examples/basic-usage.js

# Form interaction example (Puppeteer)
node examples/form-interaction.js "Claude AI"

# 🌟 Playwright with accessibility tree - THE GAME CHANGER
node examples/playwright-accessibility.js

# TypeScript example (requires ts-node)
npx ts-node examples/typescript-example.ts

🧠 Why Accessibility Tree Matters for AI

Traditional browser automation requires AI to either:

  1. See screenshots (requires vision models, expensive, slow)
  2. Parse raw HTML (messy, includes styles/scripts, hard to understand)
  3. Rely on selectors (brittle, requires knowing page structure)

Playwright's Accessibility Tree provides a better way:

  • Semantic structure - Clean representation of what's actually on the page
  • Text-based - Perfect for LLMs like Claude that work with text/code
  • Fast - No image processing needed
  • Reliable - Based on ARIA and semantic HTML
  • Actionable - Includes roles, names, and values of interactive elements

What Claude CAN do with the accessibility tree:

  • ✅ Understand semantic structure (headings, buttons, links, forms)
  • ✅ Read all text content and labels
  • ✅ Interpret element roles and relationships
  • ✅ Find interactive elements by purpose
  • ✅ Navigate and interact with forms
  • ✅ Understand general page layout hierarchy

What Claude CANNOT do:

  • ❌ See images or describe visual content
  • ❌ Understand colors, fonts, or visual styling
  • ❌ Detect visual layouts or spatial positioning
  • ❌ Read text embedded in images
  • ❌ Understand charts, graphs, or diagrams visually

The key insight: Claude understands the semantic meaning of page elements, not their visual appearance.

Example accessibility tree output:

{
  "role": "WebArea",
  "name": "Example Page",
  "children": [
    {
      "role": "heading",
      "name": "Welcome",
      "level": 1
    },
    {
      "role": "button",
      "name": "Click me",
      "focused": false
    }
  ]
}

Claude can read this and understand "there's a heading that says Welcome and a button I can click"!


📚 API Reference

BrowserService Methods

All methods return promises with result objects:

| Method | Parameters | Returns | Description | |--------|-----------|---------|-------------| | launch(options?) | LaunchOptions | ActionResult | Launch browser instance | | navigate(url) | string | NavigateResult | Navigate to URL | | screenshot(path?, options?) | string?, ScreenshotOptions? | ScreenshotResult | Take screenshot | | getContent() | - | ContentResult | Get page HTML | | getText() | - | TextResult | Get page text | | execute(script) | string | ExecuteResult | Execute JavaScript | | click(selector) | string | ActionResult | Click element | | type(selector, text) | string, string | ActionResult | Type into input | | waitForSelector(selector, timeout?) | string, number? | ActionResult | Wait for element | | getPageInfo() | - | PageInfo | Get page title/URL | | close() | - | ActionResult | Close browser |

See src/browser-service.d.ts for full TypeScript definitions.


🏗 Architecture

claude-web-client/
├── src/
│   ├── browser-service.js      # Core Puppeteer wrapper
│   ├── browser-service.d.ts    # TypeScript definitions
│   └── index.js                # Public API exports
├── bin/
│   └── cli.js                  # CLI interface
└── examples/
    ├── basic-usage.js          # Basic JS example
    ├── form-interaction.js     # Form interaction example
    ├── typescript-example.ts   # TypeScript example
    └── claude-demo.sh          # CLI demo for Claude

💰 Cost Analysis: Why This Matters

Running Costs Comparison

| Solution | Where It Runs | Cost per 1000 Interactions | Speed | |----------|--------------|---------------------------|-------| | Anthropic Web Search | Cloud | ~$5-10 (vision model calls) | Moderate | | Vision-based browsing | Cloud | ~$10-20 (screenshot uploads) | Slow | | This Tool (Local) | Your machine | $0.10-0.50 (text tokens only) | Fast | | This Tool (EC2 t3.micro) | AWS | ~$0.01/hour + tokens | Fast |

The Economics

Traditional approach:

  • Screenshot: 1000+ image tokens (~$0.01-0.02 each)
  • Vision processing: Slow (upload + process)
  • Data leaves your machine

Our approach:

  • Accessibility tree: ~100 text tokens (~$0.0001)
  • Text processing: Fast (local compute)
  • Data stays local

Real-world example:

  • 100 web interactions/day
  • Traditional: ~$500-1000/month
  • This tool: $5-10/month (100x cheaper!)

Deployment Options

  1. Local (Best for development)

    • Install on your machine
    • Zero compute cost
    • Full privacy
  2. EC2 Instance (Best for production)

    • t3.micro (~$7/month)
    • Run 24/7 for your team
    • Still 10-50x cheaper than cloud browsing
  3. Docker Container

    • Deploy anywhere
    • Scale as needed
    • Predictable costs

Community-Driven Battle Testing

This is open source because:

  • Community contributions make it better
  • Battle-tested by real users
  • Faster bug fixes and features
  • Cost savings benefit everyone

As more people use and contribute:

  • More edge cases handled
  • Better reliability
  • More examples and use cases
  • Becomes the standard for AI web automation

The Vision: Make web browsing for AI assistants accessible and affordable for everyone, not just those who can afford expensive cloud solutions.


🚀 Future Enhancements

  • [ ] Multiple page/tab support
  • [ ] Session persistence
  • [ ] Cookie management
  • [ ] Network request interception
  • [ ] PDF generation
  • [ ] Mobile device emulation
  • [ ] Performance metrics collection
  • [ ] Proxy support
  • [ ] Custom headers and user agents

License

MIT