
@mcpcn/image-recognition-mcp

v1.0.4

Published

MCP server for image recognition and information extraction

Downloads

63

Readme

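General-Purpose Image Recognition MCP Server

A general-purpose image recognition MCP server built on the Qwen vision model from the Alibaba Cloud Bailian platform. It recognizes text and other information in images.

Features

  • General image recognition: recognizes all text content and key information in an image
  • Custom prompts: supports custom recognition prompts to suit different needs
  • Multiple image inputs: accepts images as a URL or as base64 data
  • Flexible configuration: supports custom image pixel parameters

Installation and Configuration

1. Install dependencies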

npm install

2. Configure environment variables

Set the Alibaba Cloud Bailian API key:

export DASHSCOPE_API_KEY="your-api-key-here"

Or set it for a single run when starting the server:

DASHSCOPE_API_KEY="your-api-key" npm start
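
Because the server depends on this variable, it can help to fail fast when it is missing. The following is a minimal sketch of such a check, not the package's actual startup code (which this README does not show); `requireApiKey` and the `Env` type are hypothetical names introduced for illustration.

```typescript
// Hypothetical helper: the package's real startup logic is not shown in the README.
type Env = Record<string, string | undefined>;

// Returns the trimmed API key, or throws a descriptive error when it is absent.
function requireApiKey(env: Env): string {
  const key = env["DASHSCOPE_API_KEY"]?.trim();
  if (!key) {
    throw new Error(
      "DASHSCOPE_API_KEY is not set; export it before starting the server"
    );
  }
  return key;
}
```

In a Node.js process this would presumably be called as `requireApiKey(process.env)` before any request is made.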

3. Build the project

npm run build

Usage

Start the server

npm start

Configure in Claude Desktop

Add the following to the Claude Desktop configuration file:

{
  "mcpServers": {
    "image-recognition": {
      "command": "node",
      "args": ["/path/to/image-recognition-mcp/dist/index.js"],
      "env": {
        "DASHSCOPE_API_KEY": "your-api-key-here"
      }
    }
  }
}

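Tool Reference

recognize_image - general image recognition tool

Recognizes the content of many kinds of images.

Parameters:

  • image_url: URL of the image
  • image_data: base64 image data (optional)
  • custom_prompt: custom recognition prompt (optional)
  • min_pixels/max_pixels: image pixel parameters (optional)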
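
The parameter list above can be sketched as a TypeScript type with a small validator. This is an illustration only: the `RecognizeImageArgs` interface and `validateArgs` function are hypothetical, and the rule that either `image_url` or `image_data` is sufficient is an assumption that the README implies but does not state.

```typescript
// Hypothetical argument shape, inferred from the documented parameter names.
interface RecognizeImageArgs {
  image_url?: string;
  image_data?: string;
  custom_prompt?: string;
  min_pixels?: number;
  max_pixels?: number;
}

// Returns a list of problems; an empty list means the arguments look usable.
function validateArgs(args: RecognizeImageArgs): string[] {
  const problems: string[] = [];

  // Assumption: at least one image source is required.
  if (!args.image_url && !args.image_data) {
    problems.push("provide image_url or image_data");
  }
  if (args.image_url && !/^https?:\/\//i.test(args.image_url)) {
    problems.push("image_url must be an http(s) URL");
  }

  // Pixel bounds, when given, should be positive integers in a sane order.
  for (const name of ["min_pixels", "max_pixels"] as const) {
    const value = args[name];
    if (value !== undefined && (!Number.isInteger(value) || value <= 0)) {
      problems.push(`${name} must be a positive integer`);
    }
  }
  if (
    args.min_pixels !== undefined &&
    args.max_pixels !== undefined &&
    args.min_pixels > args.max_pixels
  ) {
    problems.push("min_pixels must not exceed max_pixels");
  }
  return problems;
}
```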

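Usage Examples

📝 Basic text recognition

Please recognize the text in this image: https://example.com/image.jpg

🎯 Custom recognition request

Please recognize this business card and extract the contact information. Image: https://example.com/business-card.jpg

📋 Invoice information extraction

Please extract the details of this invoice, including the amount, tax, and issue date: https://example.com/invoice.jpg

📊 Table data recognition

Please recognize all the data in this table: https://example.com/table.jpg

🆔 ID document recognition

Please recognize the information on this ID card: https://example.com/id-card.jpg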
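
Behind prompts like these, an MCP client sends the server a JSON-RPC `tools/call` request. The sketch below shows roughly what such a request might look like for the business card example. The `buildToolCall` helper is hypothetical, and the exact arguments a client chooses for a given prompt are an assumption.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that targets the recognize_image tool.
function buildToolCall(
  id: number,
  imageUrl: string,
  customPrompt?: string
): ToolCallRequest {
  const args: Record<string, unknown> = { image_url: imageUrl };
  if (customPrompt) {
    args["custom_prompt"] = customPrompt;
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "recognize_image", arguments: args },
  };
}

const request = buildToolCall(
  1,
  "https://example.com/business-card.jpg",
  "Extract the contact information from this business card."
);
console.log(JSON.stringify(request, null, 2));
```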

Tech Stack

  • MCP SDK: Model Context Protocol framework
  • OpenAI SDK: compatible with the Alibaba Cloud Bailian API
  • TypeScript: type-safe development
  • Qwen vision model: qwen-vl-ocr-latest
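
To illustrate how these pieces might fit together, the sketch below builds an OpenAI-style chat completion payload for `qwen-vl-ocr-latest`. It is not the package's implementation. The `buildChatRequest` helper, the default prompt text, and the placement of `min_pixels`/`max_pixels` on the image part are all assumptions, and should be checked against the Bailian documentation.

```typescript
// Assumed request shape for an OpenAI-compatible chat completions call.
// The pixel fields on the image part are an assumption, not a confirmed API.
interface ImagePart {
  type: "image_url";
  image_url: { url: string };
  min_pixels?: number;
  max_pixels?: number;
}
interface TextPart {
  type: "text";
  text: string;
}
interface ChatRequest {
  model: string;
  messages: { role: "user"; content: (ImagePart | TextPart)[] }[];
}

// Hypothetical default prompt; the package's real default is not documented.
const DEFAULT_PROMPT = "Recognize all text and key information in this image.";

function buildChatRequest(
  image: string, // an image URL or a base64 data URL
  prompt: string = DEFAULT_PROMPT,
  pixels: { min?: number; max?: number } = {}
): ChatRequest {
  const imagePart: ImagePart = { type: "image_url", image_url: { url: image } };
  if (pixels.min !== undefined) imagePart.min_pixels = pixels.min;
  if (pixels.max !== undefined) imagePart.max_pixels = pixels.max;
  return {
    model: "qwen-vl-ocr-latest",
    messages: [
      { role: "user", content: [imagePart, { type: "text", text: prompt }] },
    ],
  };
}
```

A payload like this could then be passed to the OpenAI SDK's `chat.completions.create` on a client whose `baseURL` points at Bailian's OpenAI-compatible endpoint. A type cast may be needed for the non-standard pixel fields.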

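Development

Project structure

src/
├── index.ts              # main server file
├── tools/
│   └── image-recognition.ts  # image recognition tool implementation
└── types/
    └── index.ts          # type definitions

Development mode

npm run dev  # watch for file changes and rebuild automatically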

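Notes

  1. API key: a valid Alibaba Cloud Bailian API key is required
  2. Image access: image URLs must be publicly accessible
  3. Base64 format: base64 image data must include the data header (data:image/...)
  4. Image size: the model limits image size, so set the pixel parameters sensibly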
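
The base64 requirement in note 3 can be checked, or satisfied, before calling the tool. The helpers below are a minimal sketch with hypothetical names. The `image/png` default is only a placeholder, so callers should pass the real MIME type of their image.

```typescript
// Matches a data URL header such as "data:image/png;base64,".
const DATA_URL_PATTERN = /^data:image\/[a-z0-9.+-]+;base64,/i;

function hasDataHeader(imageData: string): boolean {
  return DATA_URL_PATTERN.test(imageData);
}

// Prepends a data header when it is missing; leaves complete data URLs as-is.
// The default MIME type is a placeholder, not something the README specifies.
function ensureDataHeader(imageData: string, mimeType = "image/png"): string {
  const trimmed = imageData.trim();
  return hasDataHeader(trimmed)
    ? trimmed
    : `data:${mimeType};base64,${trimmed}`;
}

console.log(ensureDataHeader("iVBORw0KGgo="));
// prints "data:image/png;base64,iVBORw0KGgo="
```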

Changelog

v1.0.0

  • ✨ General image recognition
  • 🛠️ Custom prompt support
  • 📤 Multiple image input methods
  • ⚙️ Flexible parameter configuration