image-freshness-mcp
v1.0.2
MCP server for analyzing image freshness and relevance
Image Freshness MCP Server
A Model Context Protocol (MCP) server that helps verify whether images in documentation are up to date. It uses web automation to navigate to web pages, take screenshots of components, and compare them with the existing images to ensure documentation accuracy.
Features
This server provides a prompt and a set of tools to automate the process of checking image freshness in technical documentation.
Prompt
analyze-image-freshness: Orchestrates the entire workflow. It takes an image name or line number, navigates to the relevant page, captures a new screenshot, compares it with the original, and reports whether the image is out-of-date.
Tools
- find-product-start-page: Identifies the correct starting URL for a given product (e.g., "Azure Portal"), which is crucial for initiating the navigation workflow.
- compare-images: Compares two images (an original from the document and a newly captured screenshot) using a powerful multimodal Large Language Model (LLM). It returns a detailed description of the differences, categorized into important and trivial changes.
- get-screenshot-instructions: Analyzes an existing screenshot and provides clear instructions on what UI element should be captured. This is useful for resolving ambiguities when the initial comparison shows significant differences due to incorrect screenshot scope.
- get-default-account: Returns the demo account credentials used for sign-in workflows. The email address is set via image_scan_username, while the password is read from image_scan_secret_password so you can rotate it without code changes.
- get-account-otp-code: Generates the current 6-digit TOTP needed for multi-factor authentication. The underlying secret comes from image_scan_secret_otpKey.
- Automatic LLM fallback: When the connected MCP client does not advertise the sampling capability (or you force the fallback), the server automatically sends multimodal prompts to the OpenAI SDK, which can target OpenAI's public API or any OpenAI-compatible deployment via a configurable base URL.
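To make the get-account-otp-code description concrete, here is a minimal sketch of how a 6-digit TOTP can be derived from a base32-encoded secret, following the standard TOTP scheme (HMAC-SHA1, 30-second steps). The function names base32Decode and generateTotp are hypothetical, and the 30-second step and SHA-1 hash are assumptions; the README does not say how the server implements this tool.

```typescript
import { createHmac } from "node:crypto";

// Decode an RFC 4648 base32 string (padding and whitespace are ignored).
function base32Decode(input: string): Buffer {
  const alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";
  const clean = input.toUpperCase().replace(/[\s=]/g, "");
  const bytes: number[] = [];
  let bits = 0;
  let value = 0;
  for (let i = 0; i < clean.length; i++) {
    const idx = alphabet.indexOf(clean.charAt(i));
    if (idx === -1) {
      throw new Error("Invalid base32 character: " + clean.charAt(i));
    }
    value = (value << 5) | idx;
    bits += 5;
    if (bits >= 8) {
      bits -= 8;
      bytes.push((value >>> bits) & 0xff);
      value &= (1 << bits) - 1; // keep only the leftover bits
    }
  }
  return Buffer.from(bytes);
}

// Hypothetical helper: compute the TOTP for a given Unix time in seconds.
// Assumes HMAC-SHA1 and a 30-second time step (common defaults).
function generateTotp(
  base32Secret: string,
  unixSeconds: number,
  stepSeconds: number = 30,
  digits: number = 6
): string {
  const counter = Math.floor(unixSeconds / stepSeconds);

  // Encode the counter as an 8-byte big-endian integer.
  const counterBytes = Buffer.alloc(8);
  counterBytes.writeUInt32BE(Math.floor(counter / 0x100000000), 0);
  counterBytes.writeUInt32BE(counter >>> 0, 4);

  const hmac = createHmac("sha1", base32Decode(base32Secret))
    .update(counterBytes)
    .digest();

  // Dynamic truncation: the low nibble of the last byte selects a 4-byte window.
  const offset = hmac.readUInt8(hmac.length - 1) & 0x0f;
  const binary = hmac.readUInt32BE(offset) & 0x7fffffff;

  let code = String(binary % Math.pow(10, digits));
  while (code.length < digits) {
    code = "0" + code;
  }
  return code;
}
```

In a server like this one, the secret would presumably come from image_scan_secret_otpKey and the time from Date.now() / 1000.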
Installation
Install via npm (recommended for users)
You can install the server globally and run it from anywhere:
npm install -g image-freshness-mcp
image-freshness-mcp
Or use npx without a global install:
npx image-freshness-mcp@latest
Local development setup
# Install dependencies
npm install
# Build the project
npm run build
# Start the server
npm start
Required Environment Variables
Before running the server, set the credentials used by the get-default-account and get-account-otp-code tools:
export image_scan_secret_password="<your-demo-password>"
export image_scan_secret_otpKey="<base32-encoded-otp-secret>"
export image_scan_username="<demo-account-email>"
For the LLM fallback, you can also use the generic names AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_DEPLOYMENT, OPENAI_API_KEY, OPENAI_MODEL, and OPENAI_BASE_URL (or the legacy self_llm_* variables) if you already rely on those in your environment.
If any of these environment variables is missing, calls to the corresponding tool return an error instead of the secret value.
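The missing-variable behaviour described above could look roughly like the following sketch. The helper names requireEnv and getDefaultAccount, the DemoAccount shape, and the error message are all assumptions made for illustration; only the environment variable names come from this README.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: return the variable's value, or throw a descriptive
// error if it is unset or blank.
function requireEnv(name: string, env: Env = process.env): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error("Missing required environment variable: " + name);
  }
  return value;
}

// Assumed shape of the credentials returned by a tool like get-default-account.
interface DemoAccount {
  username: string;
  password: string;
}

// Reads the credentials at call time, so a rotated password is picked up
// on the next server start without any code change.
function getDefaultAccount(env: Env = process.env): DemoAccount {
  return {
    username: requireEnv("image_scan_username", env),
    password: requireEnv("image_scan_secret_password", env),
  };
}
```

Passing the environment as a parameter keeps the helper easy to test; a real tool handler would catch the thrown error and return it as the tool's error result.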
OpenAI fallback behaviour
When the MCP client does not support sampling, the server converts your prompts (text plus base64-encoded images) into Azure OpenAI Chat Completion messages and calls the SDK. Provide a custom image_scan_azure_openai_endpoint (or image_scan_openai_base_url) to target your Azure deployment or any OpenAI-compatible endpoint. Set image_scan_force_openai=true if you want to always use the SDK, even when the client advertises sampling support.
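As a rough illustration of the fallback decision and message conversion described above, the sketch below decides whether to use the SDK and builds a Chat Completions style user message from text plus base64-encoded images. The function names, the ImageInput type, and the exact parsing of image_scan_force_openai are assumptions; the actual server code may differ.

```typescript
type Env = Record<string, string | undefined>;

// Subset of the MCP client capabilities object; clients that support
// sampling advertise a "sampling" entry.
interface ClientCapabilities {
  sampling?: object;
}

// Hypothetical decision helper: use the OpenAI SDK when forced via the
// environment, or when the client does not advertise sampling.
function shouldUseOpenAiFallback(
  capabilities: ClientCapabilities | undefined,
  env: Env
): boolean {
  const forced = (env["image_scan_force_openai"] ?? "").toLowerCase() === "true";
  const clientSupportsSampling = capabilities?.sampling !== undefined;
  return forced || !clientSupportsSampling;
}

// Assumed input shape for an image attached to a prompt.
interface ImageInput {
  base64: string;
  mimeType: string;
}

type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface ChatMessage {
  role: "user";
  content: ContentPart[];
}

// Build one user message: the text part first, then each image as a data URL.
function toChatMessage(text: string, images: ImageInput[]): ChatMessage {
  const parts: ContentPart[] = [{ type: "text", text: text }];
  for (const image of images) {
    parts.push({
      type: "image_url",
      image_url: { url: "data:" + image.mimeType + ";base64," + image.base64 },
    });
  }
  return { role: "user", content: parts };
}
```

The resulting message can then be passed in the messages array of a Chat Completions request against whichever endpoint is configured.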
Development
# Watch mode for development
npm run dev
# Run linting
npm run lint
# Run tests
npm test
Usage with a Copilot Extension
Add this server to your Copilot extension's configuration file (e.g., mcp.json). This example also shows how to configure the Playwright MCP server to connect to your browser, allowing the assistant to interact with pages where you are already logged in.
{
  "servers": {
    "playwright-mcp": {
      "type": "stdio",
      "command": "npx",
      "args": [
        "@playwright/mcp@latest",
        "--browser",
        "msedge",
        "--user-data-dir",
        "%USERPROFILE%\\AppData\\Local\\Microsoft\\Edge\\User Data",
        "--extension"
      ],
      "env": {}
    },
    "image-freshness": {
      "type": "stdio",
      "command": "node",
      "args": [
        "D:\\src\\hackathon-image-freshness\\build\\index.js"
      ]
    }
  },
  "inputs": []
}
For more details on setting up the Playwright MCP extension, refer to its documentation. The key is to run the Playwright MCP server with the --extension flag.
Playwright Extension Setup
The Playwright MCP Chrome Extension allows you to connect to pages in your existing browser and leverage the state of your default user profile. This means the AI assistant can interact with websites where you're already logged in, using your existing cookies, sessions, and browser state.
Prerequisites
- Chrome/Edge/Chromium browser
Installation
- Download the Extension: Download the latest Chrome extension from the Playwright MCP releases page.
- Load the Extension:
  - Open your browser and navigate to chrome://extensions/.
  - Enable "Developer mode" (usually a toggle in the top right corner).
  - Click "Load unpacked" and select the directory where you unzipped the extension.
Usage
When the assistant needs to use the browser, it will prompt you to select a tab to connect to. This gives you control over which page the assistant can interact with.
Technical Details
- Built with TypeScript and the @modelcontextprotocol/sdk.
- Uses a standard stdio transport for communication.
- The compare-images and get-screenshot-instructions tools leverage a multimodal LLM for advanced image analysis.
- Includes robust error handling and input validation for file paths and image formats.
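The README does not say which checks the image format validation performs. As one possible approach, the sketch below identifies a file's format from its leading magic bytes rather than trusting the file extension. The function name detectImageFormat and the set of supported formats (PNG, JPEG, GIF) are assumptions for illustration only.

```typescript
type ImageFormat = "png" | "jpeg" | "gif";

// True when `bytes` begins with every byte in `prefix`.
function startsWithBytes(bytes: Uint8Array, prefix: number[]): boolean {
  if (bytes.length < prefix.length) {
    return false;
  }
  for (let i = 0; i < prefix.length; i++) {
    if (bytes[i] !== prefix[i]) {
      return false;
    }
  }
  return true;
}

// Hypothetical validator: returns the detected format, or undefined when the
// header matches none of the known signatures.
function detectImageFormat(bytes: Uint8Array): ImageFormat | undefined {
  // PNG signature: 89 50 4E 47 0D 0A 1A 0A
  if (startsWithBytes(bytes, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) {
    return "png";
  }
  // JPEG files start with FF D8 FF
  if (startsWithBytes(bytes, [0xff, 0xd8, 0xff])) {
    return "jpeg";
  }
  // GIF files start with the ASCII text "GIF8"
  if (startsWithBytes(bytes, [0x47, 0x49, 0x46, 0x38])) {
    return "gif";
  }
  return undefined;
}
```

A tool handler could read the first few bytes of the supplied path and reject the input with a clear error when this returns undefined.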
Future Enhancements
- Integration with more computer vision APIs for deeper analysis.
- Advanced algorithms for detecting subtle but important UI changes.
- Support for batch processing multiple images in a single run.
- Caching mechanisms for improved performance.
Publishing this package
To share the MCP server publicly, publish it to the npm registry:
# Build the TypeScript output
npm run build
# Log in (once per machine)
npm login
# Publish the package
npm publish --access public
The prepublishOnly script ensures the latest build is generated automatically during publish.
License
MIT License
