@visionengine/video-recognize
v1.0.1
VisionEngine Video Recognize MCP Server - Async video understanding via backend proxy
VE Video Recognize MCP
Async MCP server for video understanding via ve-backend proxy.
Environment
- `API_URL`: backend proxy URL, default `https://api.visionengine-tech.com/api/v1/video`
- `API_KEY`: user API key from the VisionEngine backend (required for submit/query and remote upload)
- `MODEL`: platform model id, default `@preset/vec-1-0-video-recognize`
- `WORKDIR`: local workspace root
- `FILE_MODE`: local file handling mode, `local` or `remote`, default `remote`
- `REMOTION_WORK_DIR`: shared mount root used in `local` mode, default `/vec`
- `BASE_URL`: backend public base URL used for `/save` and `/shared` links, default `https://api.visionengine-tech.com`
- The remote upload path is hard-coded in the server: `public/videos`
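As a sketch of how these variables fit together, the server could resolve its configuration roughly like this (the `config` object shape is illustrative, not the package's actual code; `WORKDIR` has no documented default, so falling back to the current directory is an assumption):

```typescript
// Illustrative config resolution using the documented env vars and defaults.
const config = {
  apiUrl: process.env.API_URL ?? "https://api.visionengine-tech.com/api/v1/video",
  apiKey: process.env.API_KEY, // required for submit/query and remote upload
  model: process.env.MODEL ?? "@preset/vec-1-0-video-recognize",
  workdir: process.env.WORKDIR ?? process.cwd(), // default not documented; cwd assumed here
  fileMode: (process.env.FILE_MODE ?? "remote") as "local" | "remote",
  remotionWorkDir: process.env.REMOTION_WORK_DIR ?? "/vec",
  baseUrl: process.env.BASE_URL ?? "https://api.visionengine-tech.com",
};
```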
Tools
- `submit`
- `query`
submit
Submit an async video understanding task and receive a taskId for later polling.
Supported task types:
- `understand`
- `cut_effect_points`
- `emotion_analysis`
- `script_generate`
- `style_analyze`
Input uses a single parameter:
- `video`: either a public video URL or a local file path
When video is a local file path:
- `FILE_MODE=local`: after validating that the file is under `REMOTION_WORK_DIR`, MCP sends a path relative to `REMOTION_WORK_DIR`, and the backend resolves it as local file input internally
- `FILE_MODE=remote` (default): upload the local file to the backend `/save` endpoint, then convert the returned path to a `/shared/...?...download=true` URL
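The `local` branch can be sketched as follows; `toSubmittedVideo` is a hypothetical helper, not the package's actual function, and the `remote` upload step is deliberately left out since it depends on the backend's `/save` response shape:

```typescript
import * as path from "node:path";

const REMOTION_WORK_DIR = process.env.REMOTION_WORK_DIR ?? "/vec";

// In "local" mode, validate that the file sits under REMOTION_WORK_DIR and
// return the path relative to it; the backend resolves that path internally.
// In "remote" mode the file would instead be uploaded to /save and the
// returned path converted to a /shared download URL (not sketched here).
function toSubmittedVideo(localPath: string, fileMode: "local" | "remote"): string {
  if (fileMode === "local") {
    const abs = path.resolve(localPath);
    const rel = path.relative(REMOTION_WORK_DIR, abs);
    if (rel.startsWith("..") || path.isAbsolute(rel)) {
      throw new Error(`file must be under ${REMOTION_WORK_DIR}`);
    }
    return rel;
  }
  throw new Error("remote upload flow not sketched here");
}
```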
The first version always submits with `stream=false` and returns a stable task-oriented payload. Use `query` to retrieve final results.
Supported optional analysis range parameter:
- `analysisRange.type`: `time` or `frame`
- `analysisRange.startSec` / `analysisRange.endSec`: select a time range in seconds
- `analysisRange.startFrame` / `analysisRange.endFrame`: select a frame range
Optional source timeline mapping parameter:
- `sourceTimeRange.startSec` / `sourceTimeRange.endSec`: declare where the submitted clip sits on the original full-length video timeline
Use `sourceTimeRange` when the submitted file is already a trimmed segment of a larger source video and you want the backend to align all returned timestamps back to the original video timeline.
Rules:
- `type=time` only allows `startSec`/`endSec`
- `type=frame` only allows `startFrame`/`endFrame`
- at least one boundary is required
- single-sided ranges are supported, for example `{ type: "time", startSec: 30 }`
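The rules above can be expressed as a small validator; this is an illustrative check of the documented constraints, not the server's actual implementation:

```typescript
type AnalysisRange = {
  type: "time" | "frame";
  startSec?: number;
  endSec?: number;
  startFrame?: number;
  endFrame?: number;
};

// Enforce: type=time only allows startSec/endSec, type=frame only allows
// startFrame/endFrame, and at least one boundary must be present.
// Single-sided ranges (only a start or only an end) are valid.
function validateAnalysisRange(r: AnalysisRange): void {
  const hasTime = r.startSec !== undefined || r.endSec !== undefined;
  const hasFrame = r.startFrame !== undefined || r.endFrame !== undefined;
  if (r.type === "time" && hasFrame) throw new Error("type=time only allows startSec/endSec");
  if (r.type === "frame" && hasTime) throw new Error("type=frame only allows startFrame/endFrame");
  if (!hasTime && !hasFrame) throw new Error("at least one boundary is required");
}
```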
Example submit parameters:
```json
{
  "video": "https://example.com/demo.mp4",
  "sourceTimeRange": {
    "startSec": 30,
    "endSec": 45
  },
  "analysisRange": {
    "type": "time",
    "startSec": 5,
    "endSec": 20
  },
  "taskType": "understand",
  "responseFormat": "json_object"
}
```

In the example above, the uploaded clip itself corresponds to 30s to 45s of the original source video, while `analysisRange` further limits analysis to 5s to 20s inside the submitted video. The backend will align the final returned timestamps to the original video timeline.
query
Query a submitted task by taskId.
- If the task is still running, the tool returns the current status and asks the caller to try again later.
- If the task succeeds or partially succeeds, the tool automatically fetches `/task/{taskId}/result` and returns the final structured result.
- If the task failed or was canceled, the tool returns the status and the backend message/error.
Typical flow:
- Call `submit`
- Wait a short time
- Call `query` with the returned `taskId`
- Repeat `query` until the task finishes
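That flow can be sketched as a simple polling loop. Here `callTool` stands in for whatever MCP client invocation you use, and the `status` values (`"pending"`, `"running"`, `"succeeded"`) are assumptions about the payload shape, not documented fields:

```typescript
// Submit a task, then poll query until the backend reports a terminal status.
async function runTask(
  callTool: (name: string, args: object) => Promise<any>,
  pollMs = 5000, // how long to wait between query calls
) {
  const { taskId } = await callTool("submit", {
    video: "https://example.com/demo.mp4",
    taskType: "understand",
  });
  for (;;) {
    const res = await callTool("query", { taskId });
    if (res.status !== "pending" && res.status !== "running") return res;
    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
}
```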
Example MCP config
```json
{
  "mcpServers": {
    "ve-video-recognize": {
      "command": "npx",
      "args": ["-y", "@visionengine/video-recognize@latest"],
      "transport": "stdio",
      "env": {
        "API_KEY": "<YOUR_API_KEY>",
        "WORKDIR": "./",
        "FILE_MODE": "remote",
        "REMOTION_WORK_DIR": "/vec"
      }
    }
  }
}
```