gemini-computer-use-mcp
v0.0.15
Gemini Computer Use MCP
An MCP (Model Context Protocol) server for building browser-control agents using Gemini Computer Use. This project enables agents to plan and perform UI actions in a browser.
✨ Features
- Computer Use (Browser Control): Provides an MCP tool (`run_browser_task`) that instructs a browser to perform a high-level task using the Gemini Computer Use model.
- Generative AI Integration: Uses `@google/genai` for planning and executing computer-use steps.
- stdio Transport: Communicates using the standard MCP stdio transport mechanism.
Learn more about Gemini Computer Use in the official docs: Gemini Computer Use
🚀 Usage
This project runs as an MCP server. It's typically invoked by an MCP client or controller.
Connecting an MCP Client
Point your MCP client to this server's executable. If your client supports a config file, use the following configs:
stdio Mode
`.mcp.json`:

    {
      "mcpServers": {
        "gemini-computer-use": {
          "type": "stdio",
          "timeout": 300,
          "command": "npx",
          "args": ["--yes", "gemini-computer-use-mcp@latest"],
          "env": {
            "VERTEX_PROJECT_KEY": "vertex-project-key"
          }
        }
      }
    }

`~/.codex/config.toml`:

    tool_timeout_sec = 300
    [mcp_servers.gemini-computer-use]
    command = "npx"
    args = ["--yes", "gemini-computer-use-mcp@latest"]
    [mcp_servers.gemini-computer-use.env]
    VERTEX_PROJECT_KEY = "vertex-project-key"

SSE Mode
Start the server with:

    VERTEX_PROJECT_KEY=vertex-project-key npx --yes gemini-computer-use-mcp@latest --server

Then add to `.mcp.json`:

    {
      "mcpServers": {
        "gemini-computer-use": {
          "type": "sse",
          "timeout": 300,
          "url": "http://localhost:8888/sse"
        }
      }
    }

Streamable HTTP Mode
Start the server with:

    VERTEX_PROJECT_KEY=vertex-project-key npx --yes gemini-computer-use-mcp@latest --server

Then add to `.mcp.json`:

    {
      "mcpServers": {
        "gemini-computer-use": {
          "type": "http",
          "timeout": 300,
          "url": "http://localhost:8888/mcp"
        }
      }
    }

`~/.codex/config.toml`:

    tool_timeout_sec = 300
    [mcp_servers.gemini-computer-use]
    url = "http://localhost:8888/mcp"

Environment Variables
| Variable | Description | Required | Default |
| --------------------- | -------------------------------------------------------------------------- | --------------------------------------- | ---------------------------------------- |
| VERTEX_PROJECT_KEY | Vertex AI project key (preferred over GEMINI_API_KEY) | Yes, unless GEMINI_API_KEY is set | |
| GEMINI_API_KEY | Your Gemini API key | Yes, unless VERTEX_PROJECT_KEY is set | |
| MODEL | The model ID to use | No | gemini-2.5-computer-use-preview-10-2025|
| PROJECT_PATH | Filesystem path used by some tools (defaults to current working directory) | No | (current working directory) |
| PORT | Server port to use (only for streamable HTTP) | No | 8888 |
Note: Either GEMINI_API_KEY or VERTEX_PROJECT_KEY must be provided (see src/helpers/config.ts).
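
The precedence and defaults in the table above can be sketched as a small resolver. This is an illustration only, not the contents of `src/helpers/config.ts`: the names `resolveConfig`, `ServerConfig`, and `Env` are hypothetical, and the port validation is an added assumption.

```typescript
// Hypothetical sketch of the environment rules in the table above.
// None of these names are taken from the project's source.
type Env = Record<string, string | undefined>;

interface ServerConfig {
  credential: { kind: "vertex" | "gemini"; key: string };
  model: string;
  projectPath: string;
  port: number;
}

const DEFAULT_MODEL = "gemini-2.5-computer-use-preview-10-2025";
const DEFAULT_PORT = 8888;

function resolveConfig(env: Env, cwd: string): ServerConfig {
  const vertexKey = env.VERTEX_PROJECT_KEY?.trim();
  const geminiKey = env.GEMINI_API_KEY?.trim();

  // VERTEX_PROJECT_KEY is preferred over GEMINI_API_KEY when both are set.
  let credential: ServerConfig["credential"];
  if (vertexKey) {
    credential = { kind: "vertex", key: vertexKey };
  } else if (geminiKey) {
    credential = { kind: "gemini", key: geminiKey };
  } else {
    throw new Error("Set either VERTEX_PROJECT_KEY or GEMINI_API_KEY");
  }

  // Assumption: reject ports that are not integers in the valid TCP range.
  const port = env.PORT ? Number.parseInt(env.PORT, 10) : DEFAULT_PORT;
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }

  return {
    credential,
    model: env.MODEL || DEFAULT_MODEL,
    projectPath: env.PROJECT_PATH || cwd,
    port,
  };
}
```

In a Node.js process this would be called as `resolveConfig(process.env, process.cwd())`. Passing the environment in as an argument keeps the function easy to test.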
Tools
Once connected, the client can invoke the tools provided by this server.
run_browser_task
| Argument | Description | Required | Default |
| ---------- | ------------------------------------------------ | -------- | -------------- |
| task | The high-level task to perform | Yes | |
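
For reference, MCP clients invoke a tool by sending a JSON-RPC `tools/call` request. The sketch below builds such a message for `run_browser_task`. Most clients and SDKs construct this for you, so it is shown only to make the wire format concrete. The helper name `buildRunBrowserTaskRequest` and the example task string are hypothetical.

```typescript
// Shape of an MCP "tools/call" request as sent over JSON-RPC 2.0.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a task description in a tools/call request.
function buildRunBrowserTaskRequest(id: number, task: string): ToolCallRequest {
  if (task.trim() === "") {
    throw new Error("task is required");
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "run_browser_task", arguments: { task } },
  };
}

// Example: the task text is made up for illustration.
const request = buildRunBrowserTaskRequest(1, "Open example.com and read the page title");
const wireMessage = JSON.stringify(request);
```
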
This tool leverages Gemini Computer Use to plan and perform UI actions to accomplish the provided task. It implements:
- Automatic browser management: Checks for an existing browser at `localhost:9222` or starts a new instance
- Agent loop: Continuously captures screenshots, sends them to Gemini, receives UI actions, and executes them
- All supported UI actions: mouse movement, clicks, keyboard input, scrolling, text extraction, and more
- Safety guidelines: Follows Gemini's safety best practices from the official documentation
See the official guidance for capabilities and safety considerations: Gemini Computer Use.
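
The agent loop described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not this project's implementation: the action shapes, the `BrowserDriver` and `Planner` interfaces, and the `maxSteps` limit are all hypothetical, and the real Gemini Computer Use action schema differs.

```typescript
// Hypothetical action shapes for illustration only.
type StepAction =
  | { type: "click"; x: number; y: number }
  | { type: "type_text"; text: string }
  | { type: "scroll"; deltaY: number };

type PlannedAction = StepAction | { type: "done"; summary: string };

// Hypothetical browser abstraction: capture the screen, apply one action.
interface BrowserDriver {
  screenshot(): Promise<string>; // e.g. a base64-encoded image
  execute(action: StepAction): Promise<void>;
}

// Hypothetical planner: stands in for the call to the model.
type Planner = (
  task: string,
  screenshot: string,
  history: readonly StepAction[],
) => Promise<PlannedAction>;

async function runAgentLoop(
  task: string,
  browser: BrowserDriver,
  plan: Planner,
  maxSteps = 20,
): Promise<string> {
  const history: StepAction[] = [];
  for (let step = 0; step < maxSteps; step++) {
    // 1. Capture the current state of the page.
    const screenshot = await browser.screenshot();
    // 2. Ask the planner for the next action.
    const action = await plan(task, screenshot, history);
    // 3. Stop when the planner reports completion.
    if (action.type === "done") {
      return action.summary;
    }
    // 4. Otherwise execute the action and loop again.
    await browser.execute(action);
    history.push(action);
  }
  throw new Error(`Task did not finish within ${maxSteps} steps`);
}
```

Injecting the browser and planner as parameters lets the loop be exercised with stubs, without a real browser or API key. The step limit is an added safeguard so that a task that never completes does not loop forever.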
⚙️ Development
Prerequisites
- Git
Steps
Install dependencies:

    npm install

Configuration:
- Set `GEMINI_API_KEY` or `VERTEX_PROJECT_KEY`. Optionally set `MODEL` and `PROJECT_PATH`.

Run:
- In IDEs: Reload the window and check that the MCP server is connected
- Manually: Run `./run` in your terminal
💻 Technology Stack
- Runtime: Node.js
- Language: TypeScript
- Core Libraries:
- @modelcontextprotocol/sdk: For MCP server implementation.
- @google/genai: For generative AI features.
- Zod: For schema validation.
- Development: @types/node, TypeScript
📜 License
This project is licensed under the MIT License - see the LICENSE file for details. Copyright (c) 2025 Khoa Nguyen
📧 Contact
- Khoa Nguyen @ [email protected]
