npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2025 – Pkg Stats / Ryan Hefner

browser-agentkit

v0.0.2

Published

A powerful SDK for Chrome extensions to interact with browsers and web pages

Readme

Browser AgentKit SDK

A powerful SDK for Chrome extensions to interact with browsers and web pages. Simplifies browser automation, web scraping, and page interaction tasks.

Features

  • 🎯 CDP-Powered - Direct Chrome DevTools Protocol integration for fine-grained browser control
  • 🔒 Stable & Reliable - CDP provides deterministic, low-level operations without DOM injection
  • 🚀 Easy to Use - Simple, intuitive API design
  • 🌐 Comprehensive - Browser control, page interactions, content extraction
  • 🎨 Flexible - Support for keyboard, mouse, and complex interactions
  • 📦 Lightweight - Minimal dependencies, no external browser binaries required

Why Browser AgentKit?

Built for Chrome Extensions

Unlike traditional browser automation tools (Puppeteer, Playwright, Selenium), Browser AgentKit is specifically designed for Chrome extension development:

| Feature | Browser AgentKit | Puppeteer/Playwright | |---------|------------------|---------------------| | Runtime | Inside Chrome extension | External Node.js process | | Browser Instance | Uses user's existing browser | Launches separate browser | | User Session | Access to logged-in sessions & cookies | Isolated browser context | | Installation | npm package only | Requires browser binary download | | Use Case | Extension-based automation | Testing & scraping servers |

CDP: The Foundation of Reliability

Browser AgentKit leverages the Chrome DevTools Protocol (CDP) directly, the same protocol that powers Chrome DevTools, Puppeteer, and Playwright internally:

  • Fine-grained Control - Direct access to browser internals: DOM, network, input, runtime
  • No Script Injection - Input simulation happens at the browser level, not via JavaScript injection
  • Deterministic Operations - CDP commands are executed synchronously by the browser engine
  • Anti-Detection Friendly - Native browser events are indistinguishable from real user actions
  • Full Debugging Capabilities - Access to the same powerful APIs used by Chrome DevTools

When to Use Browser AgentKit

Use Browser AgentKit when you need:

  • Build Chrome extensions with automation capabilities
  • Access user's authenticated sessions (no re-login required)
  • Operate within user's existing browser environment
  • Lightweight SDK without bundled browser binaries

Use Puppeteer/Playwright instead when you need:

  • Server-side web scraping or testing
  • Headless browser automation in CI/CD pipelines
  • Cross-browser testing (Firefox, Safari, etc.)
  • Isolated browser contexts for parallel execution

Installation

npm install browser-agentkit

Quick Start

import { browser, Page } from 'browser-agentkit';

// Open a new tab
const tab = await browser.openTab('https://example.com');

// Create a page instance
const page = new Page(tab.id);
await page.initialize();

// Interact with the page
await page.fill('#search', 'browser automation');
await page.click('#submit');

// Extract content
const extractor = page.getContentExtractor();
const html = await extractor.getHTML();
const metadata = await extractor.getMetadata();

console.log('Page title:', metadata.title);

// Clean up
await page.close();
await browser.closeTab(tab.id);

Core Modules

Browser Module

Manage browser tabs and windows:

import { browser } from 'browser-agentkit';

// Open and manage tabs
const tab = await browser.openTab('https://example.com');
const currentTab = await browser.getCurrentTab();
const allTabs = await browser.queryTabs({ currentWindow: true });

// Navigation
await browser.navigate(tab.id, 'https://google.com');
await browser.goBack(tab.id);
await browser.reload(tab.id);

// Screenshots
const screenshot = await browser.captureVisibleTab(tab.id);

Page Module

Interact with web pages:

import { Page, ScrollDirection } from 'browser-agentkit';

const page = new Page(tabId);
await page.initialize();

// Click elements
await page.click('#button');

// Fill forms
await page.fill('#email', '[email protected]');
await page.fill('#password', 'secret');

// Scroll
await page.scroll(ScrollDirection.DOWN);

// Wait for elements
await page.waitForSelector('#result');

// Advanced interactions
const keyboard = await page.getKeyboard();
await keyboard.press('Enter');

const mouse = await page.getMouse();
await mouse.click(100, 200);

Content Extraction

Extract data from web pages:

import { ContentExtractor } from 'browser-agentkit';

const extractor = new ContentExtractor(tabId);

// Get page content
const html = await extractor.getHTML();
const text = await extractor.getText();
const metadata = await extractor.getMetadata();

// Get element content
const result = await extractor.getElementContent('#article');
console.log(result.text);

// Check if PDF
const isPdf = await extractor.isPDF();

Input Simulation

Simulate keyboard and mouse input:

import { createKeyboard, createMouse } from 'browser-agentkit';

const keyboard = await createKeyboard({ tabId });

// Type text
await keyboard.type('Hello World');

// Press keys
await keyboard.press('Enter');
await keyboard.press('ControlOrMeta+KeyA'); // Ctrl/Cmd + A

// Mouse operations
const mouse = createMouse({ tabId }, keyboard);
await mouse.move(100, 200);
await mouse.click(100, 200);
await mouse.dblclick(150, 250);

Event Management

Listen to browser events:

import { events } from 'browser-agentkit';

// Tab events
events.onTabCreated((tab) => {
  console.log('New tab created:', tab.url);
});

events.onTabUpdated((tabId, changeInfo, tab) => {
  if (changeInfo.status === 'complete') {
    console.log('Tab loaded:', tab.url);
  }
});

// Network events
events.onBeforeRequest((details) => {
  console.log('Request:', details.url);
});

events.onCompleted((details) => {
  console.log('Response:', details.url);
});

Advanced Examples

Form Automation

import { browser, Page } from 'browser-agentkit';

async function fillLoginForm() {
  const tab = await browser.openTab('https://example.com/login');
  const page = new Page(tab.id);
  await page.initialize();

  await page.fill('#username', 'myuser');
  await page.fill('#password', 'mypass');
  await page.click('#login-button');

  await page.waitForNavigation();
  console.log('Login successful!');

  await page.close();
}

Web Scraping

import { browser, Page } from 'browser-agentkit';

async function scrapeData() {
  const tab = await browser.openTab('https://example.com/data');
  const page = new Page(tab.id);
  await page.initialize();

  const extractor = page.getContentExtractor();
  const metadata = await extractor.getMetadata();
  const content = await extractor.getElementContent('.data-container');

  const data = {
    title: metadata.title,
    description: metadata.description,
    content: content.text
  };

  await page.close();
  await browser.closeTab(tab.id);

  return data;
}

Drag and Drop

import { Page } from 'browser-agentkit';

async function dragAndDrop(tabId: number) {
  const page = new Page(tabId);
  await page.initialize();

  const mouse = await page.getMouse();

  // Drag from (100, 100) to (300, 300)
  await mouse.move(100, 100);
  await mouse.down();
  await mouse.move(300, 300, { steps: 10 }); // Smooth movement
  await mouse.up();

  await page.close();
}

API Reference

Classes

Browser

Main class for browser-level operations.

| Method | Description | |--------|-------------| | openTab(url, options?) | Opens a new tab | | getTab(tabId) | Gets tab information | | getCurrentTab(windowId?) | Gets the active tab | | queryTabs(queryInfo) | Queries tabs | | searchHistory(query) | Searches browser history | | closeTab(tabId) | Closes a tab | | navigate(tabId, url) | Navigates to URL | | goBack(tabId) | Goes back in history | | goForward(tabId) | Goes forward in history | | reload(tabId, bypassCache?) | Reloads a tab | | captureVisibleTab(tabId, options?) | Captures visible area screenshot | | captureFullPage(tabId) | Captures full page screenshot | | createHiddenTab(url) | Creates a hidden tab for background processing | | captureVisibleTab(tabId, options?) | Captures visible area |

Page

Main class for page-level operations.

| Method | Description | |--------|-------------| | initialize() | Initializes the page (must call before other methods) | | close() | Cleans up resources | | click(selector) | Clicks an element, The param selector can be any CSS selector or a string in the format node=id, e.g. "node=123". | | fill(selector, text) | Fills an input field, The param selector can be any CSS selector or a string in the format node=id, e.g. "node=123". | | scroll(direction, amount?) | Scrolls the page | | navigate(url) | Navigates to URL | | waitForNavigation(options?) | Waits for navigation to complete | | waitForSelector(selector, options?) | Waits for element to appear | | element(selector) | Creates PageElement instance | | getKeyboard() | Gets Keyboard instance | | getMouse() | Gets Mouse instance | | getContentExtractor() | Gets ContentExtractor instance | | evaluate(fn, ...args) | Executes script in page context | | captureVisible(tabId, options?) | Captures using debugger API |

Actions

Standalone class for page actions (used internally by Page).

| Method | Description | |--------|-------------| | click(selector) | Clicks an element | | fill(selector, value) | Fills an input field | | search(selector, value) | Fills input and presses Enter | | scroll(direction, amount?) | Scrolls the page | | waitForSelector(selector, options?) | Waits for element to appear |

Navigation

Handles page navigation.

| Method | Description | |--------|-------------| | navigate(url) | Navigates to URL | | waitForNavigation(options?) | Waits for navigation to complete | | waitForCondition(fn, options?) | Waits for a condition to be true |

PageElement

Represents a DOM element for interaction.

| Method | Description | |--------|-------------| | waitForExist(options?) | Waits for element to exist | | exists() | Checks if element exists | | getBoundingBox() | Gets element's bounding box | | getText() | Gets element's text content | | getAttributeValue(name) | Gets attribute value | | scrollIntoView() | Scrolls element into view | | findElementNodeIds() | Finds element and returns node IDs |

ContentExtractor

Extracts content from web pages.

| Method | Description | |--------|-------------| | getPageSnapshot(options?) | Gets page snapshot with HTML and metadata | | getHTML(options?) | Gets Structured HTML Content (options: { viewportOnly?: boolean }). Every interactive DOM element has a unique "node" attribute, e.g. ...<button onclick="..." node="1234"></button>.... | | getText() | Gets plain text content | | getMetadata() | Gets page metadata (title, description, etc.) | | getElementContent(selector) | Gets element's HTML and text content | | isPDF() | Checks if current page is a PDF |

Keyboard

Simulates keyboard input.

| Method | Description | |--------|-------------| | down(key) | Presses a key down | | up(key) | Releases a key | | press(key, options?) | Presses and releases a key | | type(text, options?) | Types a sequence of characters | | insertText(text) | Inserts text directly |

Mouse

Simulates mouse input.

| Method | Description | |--------|-------------| | move(x, y, options?) | Moves mouse to position | | down(options?) | Presses mouse button down | | up(options?) | Releases mouse button | | click(x, y, options?) | Clicks at position | | dblclick(x, y, options?) | Double-clicks at position | | wheel(deltaX, deltaY) | Scrolls using mouse wheel |

EventManager

Manages browser event listeners.

| Method | Description | |--------|-------------| | on(eventName, callback) | Registers custom event listener | | off(eventName, callback) | Removes custom event listener | | emit(eventName, ...args) | Emits custom event | | onTabCreated(callback) | Listens for tab created events | | onTabRemoved(callback) | Listens for tab removed events | | onTabUpdated(callback) | Listens for tab updated events | | onTabActivated(callback) | Listens for tab activated events | | onBeforeRequest(callback, filter?) | Listens for network requests | | onCompleted(callback, filter?) | Listens for completed requests | | cleanup() | Removes all listeners |

TabManager

Low-level tab management (used internally by Browser).

| Method | Description | |--------|-------------| | create(url, options?) | Creates a new tab | | get(tabId) | Gets tab by ID | | getActive(windowId?) | Gets active tab | | query(queryInfo) | Queries tabs | | close(tabId) | Closes a tab | | update(tabId, properties) | Updates tab properties | | navigate(tabId, url) | Navigates to URL | | goBack(tabId) | Goes back in history | | goForward(tabId) | Goes forward in history | | reload(tabId, bypassCache?) | Reloads tab | | createHidden(url) | Creates hidden tab |

Enums

ScrollDirection

enum ScrollDirection {
  UP = 'UP',
  DOWN = 'DOWN',
}

Types

// Tab information
type TabInfo = chrome.tabs.Tab

// Tab context for operations
interface TabContext {
  tabId: number
  mainTabId?: number
  url?: string
  title?: string
}

// Page content snapshot
interface PageContent {
  html: string
  markdown?: string
  meta?: PageMetadata
  isPdf?: boolean
  resources?: unknown[]
}

// Page metadata
interface PageMetadata {
  title?: string
  description?: string
  url?: string
  image?: string
  [key: string]: string | undefined
}

// Screenshot options
interface ScreenshotOptions {
  format?: 'png' | 'jpeg'
  quality?: number  // 0-100, jpeg only
}

// Element bounding box
interface BoundingBox {
  x: number
  y: number
  width: number
  height: number
}

// Viewport information
interface ViewportInfo {
  width: number
  height: number
  x: number
  y: number
}

Factory Functions

| Function | Description | |----------|-------------| | createKeyboard(debuggee) | Creates Keyboard instance with platform detection | | createMouse(tabContext, keyboard) | Creates Mouse instance |

Default Instances

import { browser, events } from 'browser-agentkit';

// Pre-initialized Browser instance
browser.openTab('https://example.com');

// Pre-initialized EventManager instance
events.onTabCreated((tab) => console.log(tab));

Requirements

  • Chrome/Chromium browser
  • Chrome extension with appropriate permissions:
    • tabs
    • debugger
    • scripting
    • activeTab
    • history

License

MIT