sentienceapi

v0.90.19

TypeScript SDK for Sentience AI Agent Browser Automation

Sentience TypeScript SDK

Semantic geometry grounding for deterministic, debuggable AI web agents with time-travel traces.

📦 Installation

# Install from npm
npm install sentienceapi

# Install Playwright browsers (required)
npx playwright install chromium

For local development:

npm install
npm run build

🚀 Quick Start: Choose Your Abstraction Level

Sentience SDK offers 4 levels of abstraction - choose based on your needs:

Complete automation with natural conversation. Just describe what you want, and the agent plans and executes everything:

import { SentienceBrowser, ConversationalAgent, OpenAIProvider } from 'sentienceapi';

const browser = await SentienceBrowser.create({ apiKey: process.env.SENTIENCE_API_KEY });
const llm = new OpenAIProvider(process.env.OPENAI_API_KEY!, 'gpt-4o');
const agent = new ConversationalAgent({ llmProvider: llm, browser });

// Navigate to starting page
await browser.getPage().goto('https://amazon.com');

// ONE command does it all - automatic planning and execution!
const response = await agent.execute(
  "Search for 'wireless mouse' and tell me the price of the top result"
);
console.log(response); // "I found the top result for wireless mouse on Amazon. It's priced at $24.99..."

// Follow-up questions maintain context
const followUp = await agent.chat("Add it to cart");
console.log(followUp);

await browser.close();

  • When to use: Complex multi-step tasks, conversational interfaces, maximum convenience
  • Code reduction: 99% less code - describe goals in natural language
  • Requirements: OpenAI or Anthropic API key

Zero coding knowledge needed. Just write what you want in plain English:

import { SentienceBrowser, SentienceAgent, OpenAIProvider } from 'sentienceapi';

const browser = await SentienceBrowser.create({ apiKey: process.env.SENTIENCE_API_KEY });
const llm = new OpenAIProvider(process.env.OPENAI_API_KEY!, 'gpt-4o-mini');
const agent = new SentienceAgent(browser, llm);

await browser.getPage().goto('https://www.amazon.com');

// Just natural language commands - agent handles everything!
await agent.act('Click the search box');
await agent.act("Type 'wireless mouse' into the search field");
await agent.act('Press Enter key');
await agent.act('Click the first product result');

// Automatic token tracking
console.log(`Tokens used: ${agent.getTokenStats().totalTokens}`);
await browser.close();

  • When to use: Quick automation, non-technical users, rapid prototyping
  • Code reduction: 95-98% less code vs manual approach
  • Requirements: OpenAI API key (or Anthropic for Claude)

Full control with semantic selectors. For technical users who want precision:

import { SentienceBrowser, snapshot, find, click, typeText, press } from 'sentienceapi';

const browser = await SentienceBrowser.create({ apiKey: process.env.SENTIENCE_API_KEY });
await browser.getPage().goto('https://www.amazon.com');

// Get semantic snapshot
const snap = await snapshot(browser);

// Find elements using query DSL
const searchBox = find(snap, 'role=textbox text~"search"');
await click(browser, searchBox!.id);

// Type and submit
await typeText(browser, searchBox!.id, 'wireless mouse');
await press(browser, 'Enter');

await browser.close();

  • When to use: Need precise control, debugging, custom workflows
  • Code reduction: Still 80% less code vs raw Playwright
  • Requirements: Only Sentience API key

For when you need complete low-level control (rare):

import { chromium } from 'playwright';

const browser = await chromium.launch();
const page = await browser.newPage();
await page.goto('https://www.amazon.com');
await page.fill('#twotabsearchtextbox', 'wireless mouse');
await page.press('#twotabsearchtextbox', 'Enter');
await browser.close();

  • When to use: Very specific edge cases, custom browser configs
  • Tradeoffs: No semantic intelligence, brittle selectors, more code


Record complete agent execution traces for debugging, analysis, and replay. Traces capture every step, snapshot, LLM decision, and action in a structured JSONL format.

Quick Start: Agent with Tracing

import {
  SentienceBrowser,
  SentienceAgent,
  OpenAIProvider,
  Tracer,
  JsonlTraceSink
} from 'sentienceapi';
import { randomUUID } from 'crypto';

const browser = await SentienceBrowser.create({ apiKey: process.env.SENTIENCE_API_KEY });
const llm = new OpenAIProvider(process.env.OPENAI_API_KEY!, 'gpt-4o');

// Create a tracer
const runId = randomUUID();
const sink = new JsonlTraceSink(`traces/${runId}.jsonl`);
const tracer = new Tracer(runId, sink);

// Create agent with tracer
const agent = new SentienceAgent(browser, llm, 50, true, tracer);

// Emit run_start
tracer.emitRunStart('SentienceAgent', 'gpt-4o');

try {
  await browser.getPage().goto('https://google.com');

  // Every action is automatically traced!
  await agent.act('Click the search box');
  await agent.act("Type 'sentience ai' into the search field");
  await agent.act('Press Enter');

  tracer.emitRunEnd(3);
} finally {
  // Flush trace to disk
  await agent.closeTracer();
  await browser.close();
}

console.log(`✅ Trace saved to: traces/${runId}.jsonl`);

What Gets Traced

Each agent action generates multiple events:

  1. step_start - Before action execution (goal, URL, attempt)
  2. snapshot - Page state with all interactive elements
  3. llm_response - LLM decision (model, tokens, response)
  4. action - Executed action (type, element ID, success)
  5. error - Any failures (error message, retry attempt)
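Because traces are JSONL (one JSON object per line), a quick way to inspect a run is to count events per type. The sketch below is a minimal illustration, assuming each event carries a string `type` field holding one of the names listed above; the real trace schema may use different field names, and `countEventsByType` is a hypothetical helper, not part of the SDK.

```typescript
// Hypothetical event shape: the real trace schema may use different field names.
interface TraceEvent {
  type: string;
  [key: string]: unknown;
}

// Parse JSONL text (one JSON object per line) and count events by type.
function countEventsByType(jsonl: string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const line of jsonl.split('\n')) {
    const trimmed = line.trim();
    if (trimmed === '') continue; // skip blank lines
    const event = JSON.parse(trimmed) as TraceEvent;
    counts[event.type] = (counts[event.type] ?? 0) + 1;
  }
  return counts;
}

// Illustrative sample only, not real SDK output.
const sampleTrace = [
  '{"type":"step_start","goal":"Click the search box"}',
  '{"type":"snapshot"}',
  '{"type":"llm_response"}',
  '{"type":"action","success":true}',
  '{"type":"step_start","goal":"Press Enter"}',
].join('\n');

const traceCounts = countEventsByType(sampleTrace);
console.log(traceCounts);
```

In a real run the input string would come from reading the trace file written by the sink, for example `traces/<runId>.jsonl`.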

Schema Compatibility

Traces are 100% compatible with Python SDK traces - use the same tools to analyze traces from both TypeScript and Python agents!

See full example: examples/agent-with-tracing.ts


This example demonstrates navigating Amazon, finding products, and adding items to cart:

import { SentienceBrowser, snapshot, find, click } from './src';

async function main() {
  const browser = new SentienceBrowser(undefined, undefined, false);

  try {
    await browser.start();

    // Navigate to Amazon Best Sellers
    await browser.goto('https://www.amazon.com/gp/bestsellers/');
    await browser.getPage().waitForLoadState('networkidle');
    await new Promise(resolve => setTimeout(resolve, 2000));

    // Take snapshot and find products
    const snap = await snapshot(browser);
    console.log(`Found ${snap.elements.length} elements`);

    // Find first product in viewport using spatial filtering
    const products = snap.elements
      .filter(el =>
        el.role === 'link' &&
        el.visual_cues.is_clickable &&
        el.in_viewport &&
        !el.is_occluded &&
        el.bbox.y < 600  // First row
      );

    if (products.length > 0) {
      // Sort by position (left to right, top to bottom)
      products.sort((a, b) => a.bbox.y - b.bbox.y || a.bbox.x - b.bbox.x);
      const firstProduct = products[0];

      console.log(`Clicking: ${firstProduct.text}`);
      const result = await click(browser, firstProduct.id);

      // Wait for product page
      await browser.getPage().waitForLoadState('networkidle');
      await new Promise(resolve => setTimeout(resolve, 2000));

      // Find and click "Add to Cart" button
      const productSnap = await snapshot(browser);
      const addToCart = find(productSnap, 'role=button text~"add to cart"');

      if (addToCart) {
        const cartResult = await click(browser, addToCart.id);
        console.log(`Added to cart: ${cartResult.success}`);
      }
    }
  } finally {
    await browser.close();
  }
}

main();

📖 See the complete tutorial: Amazon Shopping Guide


📚 Core Features

  • SentienceBrowser - Playwright browser with Sentience extension pre-loaded
  • browser.goto(url) - Navigate with automatic extension readiness checks
  • Automatic bot evasion and stealth mode
  • Configurable headless/headed mode

snapshot(browser, options?) - Capture page state with AI-ranked elements

Features:

  • Returns semantic elements with roles, text, importance scores, and bounding boxes
  • Optional screenshot capture (PNG/JPEG)
  • Optional visual overlay to see what elements are detected
  • TypeScript types for type safety

Example:

const snap = await snapshot(browser, { screenshot: true, show_overlay: true });

// Access structured data
console.log(`URL: ${snap.url}`);
console.log(`Viewport: ${snap.viewport.width}x${snap.viewport.height}`);
console.log(`Elements: ${snap.elements.length}`);

// Iterate over elements
for (const element of snap.elements) {
  console.log(`${element.role}: ${element.text} (importance: ${element.importance})`);
}

  • query(snapshot, selector) - Find all matching elements
  • find(snapshot, selector) - Find single best match (by importance)
  • Powerful query DSL with multiple operators

Query Examples:

// Find by role and text
const button = find(snap, 'role=button text="Sign in"');

// Substring match (case-insensitive)
const link = find(snap, 'role=link text~"more info"');

// Spatial filtering
const topLeft = find(snap, 'bbox.x<=100 bbox.y<=200');

// Multiple conditions (AND logic)
const primaryBtn = find(snap, 'role=button clickable=true visible=true importance>800');

// Prefix/suffix matching
const startsWith = find(snap, 'text^="Add"');
const endsWith = find(snap, 'text$="Cart"');

// Numeric comparisons
const important = query(snap, 'importance>=700');
const firstRow = query(snap, 'bbox.y<600');

📖 Complete Query DSL Guide - All operators, fields, and advanced patterns

  • click(browser, elementId) - Click element by ID
  • clickRect(browser, rect) - Click at center of rectangle (coordinate-based)
  • typeText(browser, elementId, text) - Type into input fields
  • press(browser, key) - Press keyboard keys (Enter, Escape, Tab, etc.)
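Since clickRect is described as clicking at the center of a rectangle, the target point can be worked out by hand. The sketch below shows that arithmetic and a conversion from an element bounding box (`width`/`height`) to the `{ x, y, w, h }` shape used by the clickRect examples. `rectCenter` and `bboxToRect` are hypothetical helpers for illustration, not SDK functions.

```typescript
// Rectangle shape as passed to clickRect in this README's examples.
interface Rect { x: number; y: number; w: number; h: number; }

// Bounding box shape as found on snapshot elements (element.bbox).
interface BBox { x: number; y: number; width: number; height: number; }

// Center point of a rectangle: the point a coordinate-based click would target.
function rectCenter(rect: Rect): { x: number; y: number } {
  return { x: rect.x + rect.w / 2, y: rect.y + rect.h / 2 };
}

// Convert an element bounding box into the rectangle shape.
function bboxToRect(bbox: BBox): Rect {
  return { x: bbox.x, y: bbox.y, w: bbox.width, h: bbox.height };
}

const center = rectCenter({ x: 100, y: 200, w: 50, h: 30 });
console.log(center); // { x: 125, y: 215 }

const fromBbox = bboxToRect({ x: 10, y: 20, width: 40, height: 60 });
console.log(rectCenter(fromBbox)); // { x: 30, y: 50 }
```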

All actions return ActionResult with success status, timing, and outcome:

const result = await click(browser, element.id);

console.log(`Success: ${result.success}`);
console.log(`Outcome: ${result.outcome}`);  // "navigated", "dom_updated", "error"
console.log(`Duration: ${result.duration_ms}ms`);
console.log(`URL changed: ${result.url_changed}`);

Coordinate-based clicking:

import { clickRect } from './src';

// Click at center of rectangle (x, y, width, height)
await clickRect(browser, { x: 100, y: 200, w: 50, h: 30 });

// With visual highlight (default: red border for 2 seconds)
await clickRect(browser, { x: 100, y: 200, w: 50, h: 30 }, true, 2.0);

// Using element's bounding box
const snap = await snapshot(browser);
const element = find(snap, 'role=button');
if (element) {
  await clickRect(browser, {
    x: element.bbox.x,
    y: element.bbox.y,
    w: element.bbox.width,
    h: element.bbox.height
  });
}

  • waitFor(browser, selector, timeout?, interval?, useApi?) - Wait for element to appear
  • expect(browser, selector) - Assertion helper with fluent API
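The waitFor parameters (timeout, interval, useApi) suggest a polling loop with an interval that depends on whether the remote API is used. The sketch below illustrates that idea with the 250 ms (local) and 1500 ms (remote) values mentioned in this section's examples. The selection logic and the `pollUntil` helper are assumptions for illustration, not the SDK's implementation.

```typescript
// Interval defaults taken from this section's examples; the selection logic is an assumption.
function pickPollInterval(interval: number | undefined, useApi: boolean): number {
  if (interval !== undefined) return interval; // an explicit override wins
  return useApi ? 1500 : 250;
}

interface PollResult {
  found: boolean;
  duration_ms: number;
}

// Generic polling loop: call `check` until it returns true or the timeout elapses.
async function pollUntil(
  check: () => boolean | Promise<boolean>,
  timeoutMs: number,
  intervalMs: number
): Promise<PollResult> {
  const start = Date.now();
  for (;;) {
    if (await check()) {
      return { found: true, duration_ms: Date.now() - start };
    }
    if (Date.now() - start >= timeoutMs) {
      return { found: false, duration_ms: Date.now() - start };
    }
    // Sleep before the next attempt.
    await new Promise<void>(resolve => setTimeout(resolve, intervalMs));
  }
}

console.log(pickPollInterval(undefined, false)); // 250
console.log(pickPollInterval(undefined, true));  // 1500
console.log(pickPollInterval(500, false));       // 500
```

A longer interval trades responsiveness for fewer snapshot requests, which is presumably why the remote default is slower than the local one.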

Examples:

// Wait for element (auto-detects optimal interval based on API usage)
const result = await waitFor(browser, 'role=button text="Submit"', 10000);
if (result.found) {
  console.log(`Found after ${result.duration_ms}ms`);
}

// Use local extension with fast polling (250ms interval)
const localResult = await waitFor(browser, 'role=button', 5000, undefined, false);

// Use remote API with network-friendly polling (1500ms interval)
const remoteResult = await waitFor(browser, 'role=button', 5000, undefined, true);

// Custom interval override (poll every 500ms)
const customResult = await waitFor(browser, 'role=button', 5000, 500, false);

// Semantic wait conditions
await waitFor(browser, 'clickable=true', 5000);  // Wait for clickable element
await waitFor(browser, 'importance>100', 5000);  // Wait for important element
await waitFor(browser, 'role=link visible=true', 5000);  // Wait for visible link

// Assertions
await expect(browser, 'role=button text="Submit"').toExist(5000);
await expect(browser, 'role=heading').toBeVisible();
await expect(browser, 'role=button').toHaveText('Submit');
await expect(browser, 'role=link').toHaveCount(10);

  • showOverlay(browser, elements, targetElementId?) - Display visual overlay highlighting elements
  • clearOverlay(browser) - Clear overlay manually

Show color-coded borders around detected elements to debug, validate, and understand what Sentience sees:

import { snapshot, find, showOverlay, clearOverlay } from 'sentienceapi';

// Take snapshot once
const snap = await snapshot(browser);

// Show overlay anytime without re-snapshotting
await showOverlay(browser, snap);  // Auto-clears after 5 seconds

// Highlight specific target element in red (skip if no match was found)
const button = find(snap, 'role=button text~"Submit"');
if (button) {
  await showOverlay(browser, snap, button.id);
}

// Clear manually before 5 seconds
await new Promise(resolve => setTimeout(resolve, 2000));
await clearOverlay(browser);

Color Coding:

  • 🔴 Red: Target element
  • 🔵 Blue: Primary elements (is_primary=true)
  • 🟢 Green: Regular interactive elements
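The color coding above can be read as a simple precedence rule. The sketch below assumes the target check comes first, then `is_primary`, then everything else. Both that ordering and the numeric `id` type are assumptions, and `overlayColor` is a hypothetical helper rather than an SDK function.

```typescript
type OverlayColor = 'red' | 'blue' | 'green';

// Minimal stand-in for the element fields used by the color rule.
interface OverlayElement {
  id: number; // assumed numeric; the SDK's actual id type is not stated here
  visual_cues: { is_primary: boolean };
}

// Assumed precedence: target element first, then primary, then regular.
function overlayColor(el: OverlayElement, targetElementId?: number): OverlayColor {
  if (targetElementId !== undefined && el.id === targetElementId) return 'red';
  if (el.visual_cues.is_primary) return 'blue';
  return 'green';
}

const primaryEl: OverlayElement = { id: 1, visual_cues: { is_primary: true } };
const regularEl: OverlayElement = { id: 2, visual_cues: { is_primary: false } };

console.log(overlayColor(primaryEl));    // blue
console.log(overlayColor(regularEl));    // green
console.log(overlayColor(primaryEl, 1)); // red
```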

Visual Indicators:

  • Border thickness/opacity scales with importance
  • Semi-transparent fill
  • Importance badges
  • Star icons for primary elements
  • Auto-clear after 5 seconds

read(browser, options?) - Extract page content

  • format: "text" - Plain text extraction
  • format: "markdown" - High-quality markdown conversion (uses Turndown)
  • format: "raw" - Cleaned HTML (default)

Example:

import { read } from './src';

// Get markdown content
const result = await read(browser, { format: 'markdown' });
console.log(result.content);  // Markdown text

// Get plain text
const textResult = await read(browser, { format: 'text' });
console.log(textResult.content);  // Plain text

screenshot(browser, options?) - Standalone screenshot capture

  • Returns base64-encoded data URL
  • PNG or JPEG format
  • Quality control for JPEG (1-100)

Example:

import { screenshot } from './src';
import { writeFileSync } from 'fs';

// Capture PNG screenshot
const dataUrl = await screenshot(browser, { format: 'png' });

// Save to file
const base64Data = dataUrl.split(',')[1];
const imageData = Buffer.from(base64Data, 'base64');
writeFileSync('screenshot.png', imageData);

// JPEG with quality control (smaller file size)
const jpegDataUrl = await screenshot(browser, { format: 'jpeg', quality: 85 });

findTextRect(page, options) - Find text on page and get exact pixel coordinates

Find buttons, links, or any UI elements by their visible text without needing element IDs or CSS selectors. Returns exact pixel coordinates for each match.

Example:

import { SentienceBrowser, findTextRect, clickRect } from 'sentienceapi';

const browser = await SentienceBrowser.create();
await browser.getPage().goto('https://example.com');

// Find "Sign In" button (simple string syntax)
const result = await findTextRect(browser.getPage(), "Sign In");
if (result.status === "success" && result.results) {
  const firstMatch = result.results[0];
  console.log(`Found at: (${firstMatch.rect.x}, ${firstMatch.rect.y})`);
  console.log(`In viewport: ${firstMatch.in_viewport}`);

  // Click on the found text
  if (firstMatch.in_viewport) {
    await clickRect(browser, {
      x: firstMatch.rect.x,
      y: firstMatch.rect.y,
      w: firstMatch.rect.width,
      h: firstMatch.rect.height
    });
  }
}

Advanced Options:

// Case-sensitive search
const caseSensitiveResult = await findTextRect(browser.getPage(), {
  text: "LOGIN",
  caseSensitive: true
});

// Whole word only (won't match "log" inside "login" or "loginButton")
const wholeWordResult = await findTextRect(browser.getPage(), {
  text: "log",
  wholeWord: true
});

// Find multiple matches
const multiResult = await findTextRect(browser.getPage(), {
  text: "Buy",
  maxResults: 10
});
for (const match of multiResult.results || []) {
  if (match.in_viewport) {
    console.log(`Found '${match.text}' at (${match.rect.x}, ${match.rect.y})`);
    console.log(`Context: ...${match.context.before}[${match.text}]${match.context.after}...`);
  }
}

Returns: A Promise that resolves to an object with:

  • status: "success" or "error"
  • results: Array of TextMatch objects with:
    • text - The matched text
    • rect - Absolute coordinates (with scroll offset)
    • viewport_rect - Viewport-relative coordinates
    • context - Surrounding text (before/after)
    • in_viewport - Whether visible in current viewport
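The difference between `rect` (absolute, including scroll offset) and `viewport_rect` (viewport-relative) can be illustrated with a little arithmetic. The sketch below assumes the two differ only by the current scroll position, which is an assumption about the SDK's output. `toViewportRect` and `intersectsViewport` are hypothetical helpers.

```typescript
interface PixelRect { x: number; y: number; width: number; height: number; }
interface ScrollOffset { scrollX: number; scrollY: number; }
interface ViewportSize { width: number; height: number; }

// Convert absolute page coordinates to viewport-relative ones (assumed relationship).
function toViewportRect(rect: PixelRect, scroll: ScrollOffset): PixelRect {
  return { ...rect, x: rect.x - scroll.scrollX, y: rect.y - scroll.scrollY };
}

// True when any part of the viewport-relative rectangle lies inside the viewport.
function intersectsViewport(vr: PixelRect, viewport: ViewportSize): boolean {
  return (
    vr.x < viewport.width &&
    vr.x + vr.width > 0 &&
    vr.y < viewport.height &&
    vr.y + vr.height > 0
  );
}

const absoluteRect: PixelRect = { x: 100, y: 1500, width: 80, height: 24 };
const viewport: ViewportSize = { width: 1280, height: 800 };

// Page scrolled down by 1200px: the match sits at y=300 in the viewport.
const scrolled = toViewportRect(absoluteRect, { scrollX: 0, scrollY: 1200 });
console.log(scrolled.y, intersectsViewport(scrolled, viewport)); // 300 true

// Page not scrolled: the match is below the fold.
const unscrolled = toViewportRect(absoluteRect, { scrollX: 0, scrollY: 0 });
console.log(unscrolled.y, intersectsViewport(unscrolled, viewport)); // 1500 false
```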

Use Cases:

  • Find buttons/links by visible text without CSS selectors
  • Get exact pixel coordinates for click automation
  • Verify text visibility and position on page
  • Search dynamic content that changes frequently

Note: Does not consume API credits (runs locally in browser)

See example: examples/find-text-demo.ts


📋 Reference

Elements returned by snapshot() have the following properties:

element.id              // Unique identifier for interactions
element.role            // ARIA role (button, link, textbox, heading, etc.)
element.text            // Visible text content
element.importance      // AI importance score (0-1000)
element.bbox            // Bounding box (x, y, width, height)
element.visual_cues     // Visual analysis (is_primary, is_clickable, background_color)
element.in_viewport     // Is element visible in current viewport?
element.is_occluded     // Is element covered by other elements?
element.z_index         // CSS stacking order

Basic Operators

| Operator | Description | Example |
|----------|-------------|---------|
| = | Exact match | role=button |
| != | Exclusion | role!=link |
| ~ | Substring (case-insensitive) | text~"sign in" |
| ^= | Prefix match | text^="Add" |
| $= | Suffix match | text$="Cart" |
| >, >= | Greater than | importance>500 |
| <, <= | Less than | bbox.y<600 |

Supported Fields

  • Role: role=button|link|textbox|heading|...
  • Text: text, text~, text^=, text$=
  • Visibility: clickable=true|false, visible=true|false
  • Importance: importance, importance>=N, importance<N
  • Position: bbox.x, bbox.y, bbox.width, bbox.height
  • Layering: z_index
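To make the operator semantics concrete, the sketch below evaluates already-parsed conditions against a small local element type. It illustrates the semantics in the table under stated assumptions and is not the SDK's query engine. `SketchElement`, `Condition`, `queryAll`, and `findBest` are hypothetical names, and selector string parsing is left out.

```typescript
// Local stand-in for a few of the element fields listed in the reference.
interface SketchElement {
  role: string;
  text: string;
  importance: number;
}

type Operator = '=' | '!=' | '~' | '^=' | '$=' | '>' | '>=' | '<' | '<=';

interface Condition {
  field: keyof SketchElement;
  op: Operator;
  value: string | number;
}

// Evaluate one condition against one element.
function matches(el: SketchElement, cond: Condition): boolean {
  const actual = el[cond.field];
  const expected = cond.value;
  switch (cond.op) {
    case '=':  return String(actual) === String(expected);
    case '!=': return String(actual) !== String(expected);
    case '~':  return String(actual).toLowerCase().includes(String(expected).toLowerCase());
    case '^=': return String(actual).startsWith(String(expected));
    case '$=': return String(actual).endsWith(String(expected));
    case '>':  return Number(actual) > Number(expected);
    case '>=': return Number(actual) >= Number(expected);
    case '<':  return Number(actual) < Number(expected);
    case '<=': return Number(actual) <= Number(expected);
    default:   return false;
  }
}

// All elements satisfying every condition (AND logic).
function queryAll(elements: SketchElement[], conds: Condition[]): SketchElement[] {
  return elements.filter(el => conds.every(c => matches(el, c)));
}

// Single best match: the highest-importance element among the matches.
function findBest(elements: SketchElement[], conds: Condition[]): SketchElement | null {
  const hits = queryAll(elements, conds);
  if (hits.length === 0) return null;
  return hits.reduce((best, el) => (el.importance > best.importance ? el : best));
}

const sketchElements: SketchElement[] = [
  { role: 'button', text: 'Add to Cart', importance: 900 },
  { role: 'button', text: 'Add to List', importance: 400 },
  { role: 'link', text: 'More info', importance: 300 },
];

const addButtons = queryAll(sketchElements, [
  { field: 'role', op: '=', value: 'button' },
  { field: 'text', op: '^=', value: 'Add' },
]);
const bestButton = findBest(sketchElements, [{ field: 'role', op: '=', value: 'button' }]);
console.log(addButtons.length, bestButton?.text); // 2 Add to Cart
```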

⚙️ Configuration

Default viewport is 1280x800 pixels. You can customize it using Playwright's API:

const browser = new SentienceBrowser();
await browser.start();

// Set custom viewport before navigating
await browser.getPage().setViewportSize({ width: 1920, height: 1080 });

await browser.goto('https://example.com');

// Headed mode (shows browser window)
const browser = new SentienceBrowser(undefined, undefined, false);

// Headless mode
const browser = new SentienceBrowser(undefined, undefined, true);

// Auto-detect based on environment (default)
const browser = new SentienceBrowser();  // headless=true if CI=true, else false
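
The auto-detect rule in the last comment (headless when `CI=true`, headed otherwise) can be sketched as a small pure function. This is an illustration of the described behavior, not the SDK's code. `resolveHeadless` is a hypothetical helper, and the exact string comparison against `'true'` is an assumption.

```typescript
// Decide headless mode: an explicit setting wins, otherwise fall back to the CI variable.
function resolveHeadless(
  explicit: boolean | undefined,
  env: Record<string, string | undefined>
): boolean {
  if (explicit !== undefined) return explicit;
  return env['CI'] === 'true'; // assumed exact match on the string 'true'
}

console.log(resolveHeadless(undefined, { CI: 'true' })); // true
console.log(resolveHeadless(undefined, {}));             // false
console.log(resolveHeadless(false, { CI: 'true' }));     // false
```

With Node.js, `process.env` could be passed as the second argument.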

For users running from datacenters (AWS, DigitalOcean, etc.), you can configure a residential proxy to prevent IP-based detection by Cloudflare, Akamai, and other anti-bot services.

Supported Formats:

  • HTTP: http://username:password@host:port
  • HTTPS: https://username:password@host:port
  • SOCKS5: socks5://username:password@host:port
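
Before passing a proxy string to the browser, it can help to check that it matches one of the formats above. The sketch below uses a regular expression for the `scheme://username:password@host:port` shape, with credentials treated as optional (an assumption). `parseProxy` is a hypothetical helper, not the SDK's parser, and it does not handle IPv6 hosts or URL-encoded credentials.

```typescript
interface ProxyParts {
  scheme: 'http' | 'https' | 'socks5';
  username: string | undefined;
  password: string | undefined;
  host: string;
  port: number;
}

// scheme://[username:password@]host:port
const PROXY_PATTERN = /^(http|https|socks5):\/\/(?:([^:@\/]+):([^@\/]+)@)?([^:@\/]+):(\d+)$/;

// Return the parsed parts, or null when the string does not match a supported format.
function parseProxy(url: string): ProxyParts | null {
  const m = PROXY_PATTERN.exec(url);
  if (m === null) return null;
  return {
    scheme: m[1] as ProxyParts['scheme'],
    username: m[2],
    password: m[3],
    host: m[4] as string,
    port: Number(m[5]),
  };
}

console.log(parseProxy('http://username:password@host:8000'));
console.log(parseProxy('socks5://host:1080'));
console.log(parseProxy('ftp://host:21')); // null (unsupported scheme)
```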

Usage:

// Via constructor parameter
const browser = new SentienceBrowser(
  undefined,
  undefined,
  false,
  'http://username:password@host:8000'
);
await browser.start();

// Via environment variable
process.env.SENTIENCE_PROXY = 'http://username:password@host:8000';
const browser = new SentienceBrowser();
await browser.start();

// With agent
import { SentienceAgent, OpenAIProvider } from 'sentienceapi';

const browser = new SentienceBrowser(
  'your-api-key',
  undefined,
  false,
  'http://user:password@host:8000'
);
await browser.start();

const agent = new SentienceAgent(browser, new OpenAIProvider('openai-key'));
await agent.act('Navigate to example.com');

WebRTC Protection: The SDK automatically adds WebRTC leak protection flags when a proxy is configured, preventing your real datacenter IP from being exposed via WebRTC even when using proxies.

HTTPS Certificate Handling: The SDK automatically ignores HTTPS certificate errors when a proxy is configured, as residential proxies often use self-signed certificates for SSL interception.

Inject pre-recorded authentication sessions (cookies + localStorage) to start your agent already logged in, bypassing login screens, 2FA, and CAPTCHAs. This saves tokens and reduces costs by eliminating login steps.

// Workflow 1: Inject pre-recorded session from file
import { SentienceBrowser, saveStorageState } from 'sentienceapi';

// Save session after manual login
const browser = new SentienceBrowser();
await browser.start();
await browser.getPage().goto('https://example.com');
// ... log in manually ...
await saveStorageState(browser.getContext(), 'auth.json');

// Use saved session in future runs
const browser2 = new SentienceBrowser(
  undefined, // apiKey
  undefined, // apiUrl
  false,     // headless
  undefined,  // proxy
  undefined,  // userDataDir
  'auth.json' // storageState - inject saved session
);
await browser2.start();
// Agent starts already logged in!

// Workflow 2: Persistent sessions (cookies persist across runs)
const browser3 = new SentienceBrowser(
  undefined,      // apiKey
  undefined,      // apiUrl
  false,          // headless
  undefined,      // proxy
  './chrome_profile', // userDataDir - persist cookies
  undefined       // storageState
);
await browser3.start();
// First run: Log in
// Second run: Already logged in (cookies persist automatically)
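
A saved session is only useful while its cookies are still valid, so it can be worth checking expiry before reusing a file. The sketch below assumes `auth.json` follows the Playwright storage-state layout (a `cookies` array whose `expires` is a Unix timestamp in seconds, with `-1` for session cookies, plus an `origins` array for localStorage). Whether `saveStorageState` writes exactly this layout is an assumption, and `expiredCookies` is a hypothetical helper.

```typescript
// Assumed layout, modeled on Playwright's storage state; only the fields used here are typed.
interface SavedCookie {
  name: string;
  domain: string;
  expires: number; // Unix seconds; -1 means a session cookie with no expiry date
}

interface SavedStorageState {
  cookies: SavedCookie[];
  origins: { origin: string; localStorage: { name: string; value: string }[] }[];
}

// Names of cookies that have already expired at `nowSeconds`.
function expiredCookies(state: SavedStorageState, nowSeconds: number): string[] {
  return state.cookies
    .filter(c => c.expires !== -1 && c.expires <= nowSeconds)
    .map(c => c.name);
}

// Illustrative data only.
const savedState: SavedStorageState = {
  cookies: [
    { name: 'session_id', domain: 'example.com', expires: 2000 },
    { name: 'csrf', domain: 'example.com', expires: -1 },
    { name: 'old_token', domain: 'example.com', expires: 500 },
  ],
  origins: [],
};

console.log(expiredCookies(savedState, 1000)); // [ 'old_token' ]
```

In practice the state would be loaded with `JSON.parse` from the saved file and `nowSeconds` would be `Date.now() / 1000`. If any required cookie has expired, logging in again and re-saving is the safer path.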

Benefits:

  • Bypass login screens and CAPTCHAs with valid sessions
  • Save 5-10 agent steps and hundreds of tokens per run
  • Maintain stateful sessions for accessing authenticated pages
  • Act as authenticated users (e.g., "Go to my Orders page")

See examples/auth-injection-agent.ts for complete examples.


💡 Best Practices

1. Wait for Dynamic Content

await browser.goto('https://example.com');
await browser.getPage().waitForLoadState('networkidle');
await new Promise(resolve => setTimeout(resolve, 1000));  // Extra buffer

2. Use Multiple Strategies for Finding Elements

// Try exact match first
let btn = find(snap, 'role=button text="Add to Cart"');

// Fallback to fuzzy match
if (!btn) {
  btn = find(snap, 'role=button text~"cart"');
}
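
The exact-then-fuzzy pattern above generalizes to an ordered list of selectors. The sketch below takes the lookup as a callback so it stays independent of the SDK. `findWithFallback` is a hypothetical helper, and the stub finder stands in for a call such as `selector => find(snap, selector)`.

```typescript
// Try selectors in order and return the first one that produces a match.
function findWithFallback<T>(
  selectors: string[],
  finder: (selector: string) => T | null | undefined
): { selector: string; match: T } | null {
  for (const selector of selectors) {
    const match = finder(selector);
    if (match != null) return { selector, match };
  }
  return null;
}

// Stub finder for illustration: only the fuzzy selector "matches".
const stubFinder = (selector: string): { id: number } | null =>
  selector === 'role=button text~"cart"' ? { id: 42 } : null;

const fallbackHit = findWithFallback(
  ['role=button text="Add to Cart"', 'role=button text~"cart"'],
  stubFinder
);
console.log(fallbackHit?.selector, fallbackHit?.match.id);
```

Returning the selector that matched makes it easier to log when the exact match stops working and the fuzzy fallback takes over.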

3. Check Element Visibility Before Clicking

if (element.in_viewport && !element.is_occluded) {
  await click(browser, element.id);
}

4. Handle Navigation

const result = await click(browser, linkId);
if (result.url_changed) {
  await browser.getPage().waitForLoadState('networkidle');
}

5. Use Screenshots Sparingly

// Fast - no screenshot (only element data)
const snap = await snapshot(browser);

// Slower - with screenshot (for debugging/verification)
const debugSnap = await snapshot(browser, { screenshot: true });

6. Always Close Browser

const browser = new SentienceBrowser();

try {
  await browser.start();
  // ... your automation code
} finally {
  await browser.close();  // Always clean up
}

🛠️ Troubleshooting

"Extension failed to load"

Solution: Build the extension first:

cd sentience-chrome
./build.sh

"Cannot use import statement outside a module"

Solution: Don't use node directly. Use ts-node or npm scripts:

npx ts-node examples/hello.ts
# or
npm run example:hello

"Element not found"

Solutions:

  • Ensure page is loaded: await browser.getPage().waitForLoadState('networkidle')
  • Use waitFor(): await waitFor(browser, 'role=button', 10000)
  • Debug elements: console.log(snap.elements.map(el => el.text))

Button not clickable

Solutions:

  • Check visibility: element.in_viewport && !element.is_occluded
  • Scroll to element: await browser.getPage().evaluate(`window.sentience_registry[${element.id}].scrollIntoView()`)

💻 Examples & Testing

  • agent-google-search.ts - Google search automation with natural language commands
  • agent-amazon-shopping.ts - Amazon shopping bot (6 lines vs 350 lines manual code)
  • agent-with-anthropic.ts - Using Anthropic Claude instead of OpenAI GPT
  • agent-with-tracing.ts - Agent execution tracing for debugging and analysis
  • hello.ts - Extension bridge verification
  • basic-agent.ts - Basic snapshot and element inspection
  • query-demo.ts - Query engine demonstrations
  • wait-and-click.ts - Waiting for elements and performing actions
  • read-markdown.ts - Content extraction and markdown conversion

⚠️ Important: You cannot use node directly to run TypeScript files. Use one of these methods:

Option 1: Using npm scripts (recommended)

npm run example:hello
npm run example:basic
npm run example:query
npm run example:wait

Option 2: Using ts-node directly

npx ts-node examples/hello.ts
# or if ts-node is installed globally:
ts-node examples/hello.ts

Option 3: Compile then run

npm run build
# Then use compiled JavaScript from dist/

# Run all tests
npm test

# Run with coverage
npm run test:coverage

# Run specific test file
npm test -- snapshot.test.ts

📖 Documentation


📜 License

This project is licensed under either of:

at your option.