gemini-cua

v1.0.5

Model Context Protocol server for Gemini computer use functionality with browser automation using Playwright

Downloads

18

Gemini CUA - Computer Use Automation

A Model Context Protocol (MCP) server that provides computer use and browser automation capabilities based on Google's Gemini computer use preview functionality. This server enables AI assistants to control web browsers and interact with web pages through natural language commands.

Features

This MCP server provides tools for browser automation and computer interaction with support for both local and remote environments:

Supported Environments

  • Playwright (Local): Run browser automation locally with Chromium
  • Browserbase (Remote): Use Browserbase's cloud browser infrastructure

Capabilities

  • Browser Control: Open web browser, navigate to URLs, go back/forward
  • Mouse Interactions: Click, hover, drag and drop at specific coordinates
  • Keyboard Actions: Type text, execute key combinations
  • Page Navigation: Scroll, search, navigate to specific URLs
  • State Capture: Take screenshots and get the current page state (screenshots are base64-encoded)
  • Session Management: Automatic session creation and cleanup (Browserbase)
  • Waiting: Built-in delays for page loading
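
The State Capture item above mentions base64-encoded screenshots. A minimal sketch of how a client might inspect such a string is shown here. The helper names are hypothetical, and the image format the server actually returns is an assumption; the sketch only recognizes the well-known base64 prefixes of PNG and JPEG data.

```typescript
// Hypothetical helpers, not part of gemini-cua: inspect a base64 screenshot.
// The image format returned by the server is an assumption here.

type ImageKind = "png" | "jpeg" | "unknown";

function sniffImageKind(base64: string): ImageKind {
  // Base64 of the PNG signature bytes 89 50 4E 47 0D 0A starts with "iVBORw0K".
  if (base64.slice(0, 8) === "iVBORw0K") return "png";
  // Base64 of the JPEG start-of-image marker FF D8 FF is "/9j/".
  if (base64.slice(0, 4) === "/9j/") return "jpeg";
  return "unknown";
}

// Wrap the payload in a data URL, e.g. for display in an <img> element.
function toDataUrl(base64: string): string {
  const kind = sniffImageKind(base64);
  const mime = kind === "unknown" ? "application/octet-stream" : `image/${kind}`;
  return `data:${mime};base64,${base64}`;
}

// Decoded size in bytes for standard, padded base64 input.
function decodedByteLength(base64: string): number {
  const padding =
    base64.slice(-2) === "==" ? 2 : base64.slice(-1) === "=" ? 1 : 0;
  return (base64.length / 4) * 3 - padding;
}

// "iVBORw0KGgo=" encodes only the 8-byte PNG signature; it stands in for a
// real screenshot payload.
console.log(sniffImageKind("iVBORw0KGgo="), decodedByteLength("iVBORw0KGgo="));
```

Sniffing the prefix avoids decoding the whole payload just to pick a MIME type, which can matter when screenshots are large.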

Installation

From npm

npm install -g gemini-cua

From source

git clone https://github.com/snakecased/gemini-cua-mcp.git
cd gemini-cua-mcp
npm install
npm run build

Usage

As MCP Server (Recommended)

Add to your MCP client configuration (e.g., Claude Desktop, Cline):

{
  "mcpServers": {
    "gemini-computer-use": {
      "command": "gemini-cua"
    }
  }
}

Or, if the package is installed locally, launch it through npx:

{
  "mcpServers": {
    "gemini-computer-use": {
      "command": "npx",
      "args": ["gemini-cua"]
    }
  }
}
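
If you prefer to generate this configuration rather than write it by hand, a sketch along the following lines may help. It assumes your MCP client accepts an optional `env` map next to `command` and `args` (Claude Desktop does; check your own client's documentation). The helper name `buildMcpConfig` is hypothetical, and the credentials are placeholders.

```typescript
// Hypothetical helper, not part of gemini-cua: build an MCP client config entry.
// Assumes the client accepts an optional "env" map next to "command"/"args".

type ServerEntry = {
  command: string;
  args?: string[];
  env?: Record<string, string>;
};

type McpConfig = { mcpServers: Record<string, ServerEntry> };

function buildMcpConfig(
  useNpx: boolean,
  env: Record<string, string> = {},
): McpConfig {
  const entry: ServerEntry = useNpx
    ? { command: "npx", args: ["gemini-cua"] }
    : { command: "gemini-cua" };
  // Only attach "env" when there is something to pass through.
  if (Object.keys(env).length > 0) {
    entry.env = { ...env };
  }
  return { mcpServers: { "gemini-computer-use": entry } };
}

// Example: run through npx against Browserbase (placeholder credentials).
const browserbaseConfig = buildMcpConfig(true, {
  COMPUTER_USE_ENV: "browserbase",
  BROWSERBASE_API_KEY: "your_api_key_here",
  BROWSERBASE_PROJECT_ID: "your_project_id_here",
});
console.log(JSON.stringify(browserbaseConfig, null, 2));
```

Passing the variables through the client config keeps them scoped to the server process instead of your whole shell session.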

Direct Usage

# If installed globally
gemini-cua

# If installed locally
npx gemini-cua

# From source
npm run start

Requirements

  • Node.js 18+
  • Chromium browser (automatically installed via Playwright)
  • For Browserbase: API key and Project ID from browserbase.com
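
To check the Node.js requirement from a script before launching the server, something like the sketch below could be used. Both function names are hypothetical; in Node.js, `process.version` supplies a version string of the form `v18.17.0`.

```typescript
// Hypothetical pre-flight check for the "Node.js 18+" requirement.

function nodeMajorVersion(version: string): number {
  // Accepts strings such as "v18.17.0" or "20.11.1".
  const match = /^v?(\d+)\./.exec(version);
  if (!match) {
    throw new Error(`Unrecognized Node.js version string: ${version}`);
  }
  return Number(match[1]);
}

function meetsNodeRequirement(version: string, minimumMajor = 18): boolean {
  return nodeMajorVersion(version) >= minimumMajor;
}

// In a real script you would pass process.version instead of a literal.
console.log(meetsNodeRequirement("v18.17.0"));
```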

Environment Configuration

Environment Variables

Set COMPUTER_USE_ENV to choose your browser environment. Browserbase additionally requires an API key and a project ID:

# Use local Playwright (default)
export COMPUTER_USE_ENV=playwright

# Use Browserbase remote browsers
export COMPUTER_USE_ENV=browserbase
export BROWSERBASE_API_KEY=your_api_key_here
export BROWSERBASE_PROJECT_ID=your_project_id_here
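
The rules implied by these variables can be sketched as a small validation function. This is an illustration only, not the package's own code: the variable names come from this README, while the function name, the result type, and the error behavior for unknown values are assumptions.

```typescript
// Hypothetical sketch: validate the environment settings described above.
// Only the variable names come from the README; everything else is assumed.

type EnvMap = Record<string, string | undefined>;

type BrowserSettings =
  | { kind: "playwright" }
  | { kind: "browserbase"; apiKey: string; projectId: string };

function resolveBrowserSettings(env: EnvMap): BrowserSettings {
  // The README documents playwright as the default when the variable is unset.
  const mode = env["COMPUTER_USE_ENV"] ?? "playwright";
  if (mode === "playwright") {
    return { kind: "playwright" };
  }
  if (mode === "browserbase") {
    const apiKey = env["BROWSERBASE_API_KEY"];
    const projectId = env["BROWSERBASE_PROJECT_ID"];
    if (!apiKey || !projectId) {
      throw new Error(
        "browserbase mode needs BROWSERBASE_API_KEY and BROWSERBASE_PROJECT_ID",
      );
    }
    return { kind: "browserbase", apiKey, projectId };
  }
  // How the real server treats other values is not documented here.
  throw new Error(`Unsupported COMPUTER_USE_ENV value: ${mode}`);
}

// Usage with an explicit map; in Node.js you could pass process.env instead.
console.log(resolveBrowserSettings({}).kind);
```

Failing early on missing Browserbase credentials gives a clearer message than an authentication error surfacing later during session creation.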

Browserbase Setup

  1. Sign up at browserbase.com
  2. Create a project and get your API key and Project ID
  3. Set the environment variables as shown above

Available Tools

Browser Management

  • open_web_browser: Launch browser and navigate to URL
  • navigate: Go to specific URL
  • go_back: Navigate to previous page
  • go_forward: Navigate to next page
  • search: Open search engine with optional query

Mouse Interactions

  • click_at(x, y): Click at coordinates
  • hover_at(x, y): Hover at coordinates
  • drag_and_drop(from_x, from_y, to_x, to_y): Drag and drop

Keyboard Actions

  • type_text(text): Type text at cursor
  • key_combination(keys): Execute keyboard shortcuts

Page Control

  • scroll_document(direction, amount): Scroll up/down
  • screen_size(): Get viewport dimensions
  • current_state(): Get screenshot and URL
  • wait_5_seconds(): Wait for page loading
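
MCP clients invoke these tools with JSON-RPC 2.0 `tools/call` requests. The sketch below builds two such requests to show their shape. The argument names (`x`, `y`, `text`) are assumed to match the signatures listed above, and the helper names are hypothetical.

```typescript
// Sketch: build JSON-RPC 2.0 "tools/call" requests as used by MCP.
// Argument names (x, y, text) are assumed to match the signatures listed above.

type ToolArguments = Record<string, string | number>;

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: ToolArguments };
}

function makeToolCall(
  id: number,
  toolName: string,
  toolArgs: ToolArguments = {},
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: toolArgs },
  };
}

// The MCP stdio transport sends one JSON message per line, so serialize
// without pretty-printing and terminate with a newline.
function toStdioLine(request: ToolCallRequest): string {
  return JSON.stringify(request) + "\n";
}

const clickRequest = makeToolCall(1, "click_at", { x: 320, y: 240 });
const typeRequest = makeToolCall(2, "type_text", { text: "hello" });
console.log(toStdioLine(clickRequest) + toStdioLine(typeRequest));
```

A real session first performs the MCP `initialize` handshake, which an MCP client library normally handles for you, so hand-built requests like these are mainly useful for debugging and for understanding what the client sends.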

Dependencies

  • @modelcontextprotocol/sdk: MCP protocol implementation
  • playwright: Browser automation framework

Development

npm run dev     # Run in development mode
npm run watch   # Watch for changes
npm run build   # Build for production

License

Apache-2.0 (matching the license of Google's original Gemini computer use preview)