json-batch-processor-mcp
v1.2.0
MCP Server for batch processing large JSON arrays in manageable batches with progress tracking
JSON Batch Processor MCP Server
A Model Context Protocol (MCP) server that enables AI IDEs to process large JSON arrays in manageable batches. This tool solves the context window limitation problem by breaking down large datasets into smaller chunks that can be processed incrementally.
Features
- JSON File Binding: Bind JSON files for batch processing with automatic structure analysis
- Flexible Task Generation: Create batch processing tasks with customizable batch sizes and field paths
- Batch Reading: Read specific batches of data from large JSON arrays
- Progress Tracking: Persistent progress tracking that survives system restarts
- Result Merging: Automatically merge processed batches back into a complete JSON file
- MCP Integration: Seamless integration with AI IDEs through the Model Context Protocol
Installation
Prerequisites
- Node.js v18 or higher
- npm or yarn package manager
Install from npm
npm install -g json-batch-processor-mcp
Install from source
git clone <repository-url>
cd json-batch-processor-mcp
npm install
npm run build
npm link
Configuration
MCP Server Configuration
Add the following configuration to your MCP settings file:
- Workspace level: .kiro/settings/mcp.json (project-specific)
- User level: ~/.kiro/settings/mcp.json (global, all projects)
Option 1: Development/Local Installation
If you cloned the repository or installed from source:
{
  "mcpServers": {
    "json-batch-processor": {
      "command": "node",
      "args": ["/absolute/path/to/json-batch-processor-mcp/dist/index.js"],
      "disabled": false,
      "autoApprove": []
    }
  }
}
Replace /absolute/path/to/json-batch-processor-mcp with the actual path to your installation.
Option 2: Global npm Installation
If you installed globally via npm install -g:
{
  "mcpServers": {
    "json-batch-processor": {
      "command": "json-batch-processor",
      "args": [],
      "disabled": false,
      "autoApprove": []
    }
  }
}
Configuration Options
- command: The executable command or path to the server
- args: Command-line arguments (empty for global install)
- disabled: Set to true to disable the server without removing the configuration
- autoApprove: List of tool names to auto-approve (e.g., ["bind_json_file", "read_batch"])
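As a rough illustration of how these options fit together, the sketch below builds the same configuration object in JavaScript and prints it as JSON. The helper name buildServerConfig is hypothetical, and which tools to auto-approve is your own decision; the two tool names are only the ones used in the example above.

```javascript
// Hypothetical helper: assembles the "mcpServers" entry described above.
// It is not part of the package; it only mirrors the documented options.
function buildServerConfig({ command, args = [], autoApprove = [], disabled = false }) {
  return {
    mcpServers: {
      "json-batch-processor": { command, args, disabled, autoApprove },
    },
  };
}

// Global install (Option 2), auto-approving the two tools from the example.
const config = buildServerConfig({
  command: "json-batch-processor",
  autoApprove: ["bind_json_file", "read_batch"],
});

// Paste the printed JSON into your mcp.json settings file.
console.log(JSON.stringify(config, null, 2));
```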
Restart MCP Server
After updating the configuration:
- Open the MCP Server view in your IDE
- Click "Reconnect" next to the json-batch-processor server
- Or restart your IDE
Storage Location
The server stores all data in:
~/.kiro/mcp-data/json-batch-processor/
Each binding creates a subdirectory with:
- binding.json - Binding metadata
- tasks.json - Task list
- progress.json - Progress tracking
- results/ - Processed batch results
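If you want to inspect or back up a binding's files from a script, the layout above can be turned into paths as in the following sketch. The helper bindingPaths is hypothetical; only the directory and file names come from this README.

```javascript
const os = require("os");
const path = require("path");

// Base directory named in this README.
const BASE_DIR = path.join(os.homedir(), ".kiro", "mcp-data", "json-batch-processor");

// Hypothetical helper: returns the documented file locations for one binding.
function bindingPaths(bindingId, baseDir = BASE_DIR) {
  const root = path.join(baseDir, bindingId);
  return {
    root,
    binding: path.join(root, "binding.json"),
    tasks: path.join(root, "tasks.json"),
    progress: path.join(root, "progress.json"),
    results: path.join(root, "results"),
  };
}

console.log(bindingPaths("example-binding-id"));
```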
Available MCP Tools
1. bind_json_file
Bind a JSON file for batch processing.
Parameters:
- filePath (string, required): Absolute path to the JSON file
Returns:
{
  "bindingId": "uuid-string",
  "structure": {
    "type": "object",
    "arrayPaths": ["$.data", "$.items"],
    "totalElements": 1000
  }
}
2. generate_task_list
Generate a list of batch processing tasks.
Parameters:
- bindingId (string, required): The binding ID from bind_json_file
- batchSize (number, required): Number of elements per batch
- fieldPath (string, optional): JSONPath to the array (e.g., "$.data.items")
Returns:
{
  "bindingId": "uuid-string",
  "totalTasks": 20,
  "batchSize": 50,
  "fieldPath": "$.data",
  "tasks": [
    {
      "id": "uuid-batch-0",
      "batchIndex": 0,
      "startIndex": 0,
      "endIndex": 49,
      "status": "pending"
    }
  ]
}
3. read_batch
Read data for a specific batch.
Parameters:
- bindingId (string, required): The binding ID
- taskId (string, required): The task ID from the task list
Returns:
{
  "taskId": "uuid-batch-0",
  "batchIndex": 0,
  "data": [...],
  "startIndex": 0,
  "endIndex": 49,
  "totalElements": 1000
}
4. update_task_status
Update the status of a task and optionally save processed results.
Parameters:
- bindingId (string, required): The binding ID
- taskId (string, required): The task ID
- status (string, required): Either "pending" or "completed"
- result (array, optional): The processed batch data
Returns:
{
  "success": true,
  "taskId": "uuid-batch-0",
  "status": "completed"
}
5. get_progress
Get the current processing progress.
Parameters:
- bindingId (string, required): The binding ID
Returns:
{
  "bindingId": "uuid-string",
  "totalTasks": 20,
  "completedTasks": 5,
  "pendingTasks": 15,
  "percentage": 25,
  "tasks": [
    {
      "taskId": "uuid-batch-0",
      "status": "completed",
      "completedAt": "2025-11-08T10:05:00Z",
      "hasResult": true
    }
  ]
}
6. merge_results
Merge all processed batches into a final JSON file.
Parameters:
- bindingId (string, required): The binding ID
- outputPath (string, required): Path for the output JSON file
Returns:
{
  "success": true,
  "outputPath": "/path/to/output.json",
  "totalElements": 1000,
  "message": "Successfully merged 20 batches"
}
Quick Start
Here's a minimal example to get started:
// 1. Bind a JSON file
const binding = await callTool("bind_json_file", {
  filePath: "/path/to/data.json"
});
// 2. Generate tasks (50 items per batch)
const tasks = await callTool("generate_task_list", {
  bindingId: binding.bindingId,
  batchSize: 50,
  fieldPath: "$.items" // Optional: specify array path
});
// 3. Process each batch
for (const task of tasks.tasks) {
  // Read batch
  const batch = await callTool("read_batch", {
    bindingId: binding.bindingId,
    taskId: task.id
  });
  // Process data (your custom logic)
  const processed = batch.data.map(item => ({
    ...item,
    processed: true
  }));
  // Save results
  await callTool("update_task_status", {
    bindingId: binding.bindingId,
    taskId: task.id,
    status: "completed",
    result: processed
  });
}
// 4. Merge all results
const result = await callTool("merge_results", {
  bindingId: binding.bindingId,
  outputPath: "/path/to/output.json"
});
For a complete walkthrough with detailed examples, see EXAMPLE.md.
Error Handling
The server returns structured error responses:
{
  "error": {
    "code": "FileNotFoundError",
    "message": "JSON file not found at path: /path/to/file.json",
    "details": {}
  }
}
Common error codes:
- FileNotFoundError: JSON file doesn't exist
- InvalidJSONError: Invalid JSON format
- BindingNotFoundError: Binding ID not found
- TaskNotFoundError: Task ID not found
- InvalidBatchIndexError: Batch index out of range
- IncompleteTasksError: Not all tasks completed
- InvalidFieldPathError: Invalid JSONPath or path doesn't point to an array
- StorageError: File system operation failed
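A client can branch on the error code rather than parsing the message. The sketch below maps each documented code to a suggested next step. The action names and the helper suggestAction are assumptions made for illustration; the server does not return them.

```javascript
// Documented error codes mapped to a suggested reaction.
// The action strings are illustrative only, not part of the server's API.
const SUGGESTED_ACTION = {
  FileNotFoundError: "check-path",
  InvalidJSONError: "fix-json",
  InvalidFieldPathError: "fix-field-path",
  InvalidBatchIndexError: "fix-batch-index",
  BindingNotFoundError: "rebind",
  TaskNotFoundError: "regenerate-tasks",
  IncompleteTasksError: "finish-pending-tasks",
  StorageError: "check-disk",
};

// Returns null for non-error responses, "unknown" for unlisted codes.
function suggestAction(response) {
  if (!response || typeof response !== "object" || !response.error) return null;
  const code = response.error.code;
  return Object.prototype.hasOwnProperty.call(SUGGESTED_ACTION, code)
    ? SUGGESTED_ACTION[code]
    : "unknown";
}

const sample = {
  error: {
    code: "FileNotFoundError",
    message: "JSON file not found at path: /path/to/file.json",
    details: {},
  },
};
console.log(suggestAction(sample)); // check-path
```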
Performance
- Binding operation: < 1 second (10MB file)
- Task generation: < 500ms (1000 tasks)
- Batch reading: < 100ms (50 element batch)
- Progress update: < 50ms
- Result merging: < 5 seconds (1000 batches)
Limitations
- Maximum file size: 100MB (configurable)
- Maximum storage per binding: 500MB (configurable)
- Supports JSON format only (CSV, XML support planned)
- Requires Node.js runtime (not browser-compatible)
Use Cases
- Data Enrichment: Add information from external APIs to large datasets
- Data Transformation: Convert data structures in manageable chunks
- Data Validation: Validate large datasets batch by batch
- Data Migration: Transform and migrate data between formats
- AI Processing: Process large datasets within AI context window limits
- Data Analysis: Analyze large datasets incrementally
Architecture
The server consists of six core components:
- Binding Manager: Validates and binds JSON files, scans structure
- Task Manager: Generates and manages batch processing tasks
- Batch Reader: Extracts specific batches from JSON arrays
- Progress Tracker: Tracks and persists task completion status
- Result Merger: Merges processed batches back into complete JSON
- MCP Server: Exposes functionality through MCP protocol tools
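To make the division of labor concrete, here is a minimal in-memory sketch of what the Task Manager, Batch Reader, and Result Merger each do. It is not the server's code: the function names and the "batch-N" IDs are hypothetical, and it assumes inclusive endIndex values, as the generate_task_list example suggests (batch size 50 gives indices 0 to 49).

```javascript
// Task Manager step: split `total` elements into inclusive index ranges.
function splitIntoTasks(total, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const tasks = [];
  for (let start = 0, i = 0; start < total; start += batchSize, i++) {
    tasks.push({
      id: `batch-${i}`, // placeholder; the real server assigns its own IDs
      batchIndex: i,
      startIndex: start,
      endIndex: Math.min(start + batchSize, total) - 1, // inclusive
      status: "pending",
    });
  }
  return tasks;
}

// Batch Reader step: slice one task's range out of the source array.
function readBatch(items, task) {
  return items.slice(task.startIndex, task.endIndex + 1);
}

// Result Merger step: refuse to merge if any result is missing,
// otherwise concatenate results in batchIndex order.
function mergeResults(tasks, resultsById) {
  const missing = tasks.filter((t) => !(t.id in resultsById));
  if (missing.length > 0) {
    throw new Error(`Incomplete tasks: ${missing.map((t) => t.id).join(", ")}`);
  }
  return [...tasks]
    .sort((a, b) => a.batchIndex - b.batchIndex)
    .flatMap((t) => resultsById[t.id]);
}

// Seven items in batches of three give the ranges 0-2, 3-5, and 6-6.
const items = Array.from({ length: 7 }, (_, n) => ({ n }));
const tasks = splitIntoTasks(items.length, 3);
const results = {};
for (const task of tasks) {
  results[task.id] = readBatch(items, task).map((item) => ({ ...item, processed: true }));
}
console.log(tasks.map((t) => [t.startIndex, t.endIndex]));
console.log(mergeResults(tasks, results).length); // 7
```

The last batch is shorter than the others whenever the array length is not a multiple of the batch size, which is why endIndex is clamped to the final element.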
Data is stored in ~/.kiro/mcp-data/json-batch-processor/{bindingId}/:
- binding.json - File binding metadata
- tasks.json - Complete task list
- progress.json - Current progress state
- results/ - Individual batch results
Development
Build
npm run build
Watch mode
npm run dev
Clean build
npm run build:clean
Project Structure
json-batch-processor-mcp/
├── src/
│   ├── index.ts              # MCP server entry point
│   ├── managers/             # Core business logic
│   │   ├── BindingManager.ts
│   │   ├── TaskManager.ts
│   │   ├── BatchReader.ts
│   │   ├── ProgressTracker.ts
│   │   └── ResultMerger.ts
│   ├── types/                # TypeScript type definitions
│   │   └── index.ts
│   └── utils/                # Utility functions
│       ├── storage.ts
│       └── errors.ts
├── examples/                 # Sample JSON files
├── dist/                     # Compiled JavaScript
└── package.json
License
MIT
Contributing
Contributions are welcome! Please open an issue or submit a pull request.
Recent Updates
v1.0.1 - Binding Persistence Fix (2025-11-08)
Fixed a critical bug where binding information was not persisted to disk, causing read_batch and subsequent operations to fail. The server now correctly saves binding.json files, ensuring reliable operation across server restarts.
What changed:
- Binding information (including JSON content) is now saved to disk
- read_batch operations work reliably even after server restart
- All batch processing workflows are now resumable
Migration: If you have existing bindings from before this fix, please re-bind your JSON files using bind_json_file.
Troubleshooting
Server Not Connecting
Problem: MCP server doesn't appear in your IDE
Solutions:
- Verify the configuration path in your mcp.json is correct
- Check that the server is not disabled ("disabled": false)
- Restart your IDE or reconnect the MCP server
- Check the MCP server logs for errors
File Not Found Errors
Problem: FileNotFoundError when binding
Solutions:
- Use absolute paths, not relative paths
- Verify the file exists: ls -la /path/to/file.json
- Check file permissions (must be readable)
- Ensure no typos in the file path
Invalid JSON Errors
Problem: InvalidJSONError when binding
Solutions:
- Validate your JSON: cat file.json | jq .
- Check for trailing commas (not allowed in JSON)
- Ensure proper quote escaping
- Verify UTF-8 encoding
Binding Not Found
Problem: BindingNotFoundError on subsequent operations
Solutions:
- Verify you're using the correct binding ID from the bind response
- Check if the binding data still exists in ~/.kiro/mcp-data/json-batch-processor/
- Re-bind the file if necessary
Merge Fails with Incomplete Tasks
Problem: IncompleteTasksError when merging
Solutions:
- Call get_progress to see which tasks are pending
- Complete all pending tasks before merging
- Check for failed tasks and retry them
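The check described above can be scripted against a get_progress response. This is a sketch under the assumption that every task appears in the tasks array with a status field, as in the example response; pendingTaskIds and readyToMerge are hypothetical helper names.

```javascript
// `progress` follows the get_progress response shape shown in this README.
function pendingTaskIds(progress) {
  return progress.tasks
    .filter((task) => task.status !== "completed")
    .map((task) => task.taskId);
}

// Merging is only worth attempting once no task is left pending.
function readyToMerge(progress) {
  return progress.tasks.length > 0 && pendingTaskIds(progress).length === 0;
}

// Sample data made up for this example.
const sampleProgress = {
  bindingId: "uuid-string",
  totalTasks: 3,
  tasks: [
    { taskId: "uuid-batch-0", status: "completed" },
    { taskId: "uuid-batch-1", status: "pending" },
    { taskId: "uuid-batch-2", status: "pending" },
  ],
};

console.log(pendingTaskIds(sampleProgress)); // uuid-batch-1 and uuid-batch-2
console.log(readyToMerge(sampleProgress)); // false
```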
FAQ
Q: Can I process multiple JSON files simultaneously?
A: Yes! Each binding has a unique ID, so you can bind multiple files and process them in parallel.
Q: What happens if my process crashes?
A: All progress is saved to disk. Simply call get_progress to see where you left off and continue.
Q: Can I change the batch size after generating tasks?
A: No. Bind the file again with bind_json_file and call generate_task_list with the new batch size.
Q: Does this work with nested arrays?
A: Yes! Use JSONPath syntax to specify the exact array path (e.g., $.data.users[*].orders).
Q: Can I process arrays in parallel?
A: Yes, tasks are independent. You can process multiple batches simultaneously if your logic allows.
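One way to do this on the client side is to run a limited number of batches at a time. The following is a sketch only: worker stands for a hypothetical async function of your own that reads, processes, and saves one batch (for example via read_batch and update_task_status), and the group size is an arbitrary choice.

```javascript
// Split tasks into groups of at most `concurrency` items.
function groupTasks(tasks, concurrency) {
  if (!Number.isInteger(concurrency) || concurrency <= 0) {
    throw new RangeError("concurrency must be a positive integer");
  }
  const groups = [];
  for (let i = 0; i < tasks.length; i += concurrency) {
    groups.push(tasks.slice(i, i + concurrency));
  }
  return groups;
}

// Run `worker` on every task of a group in parallel, one group after another.
// Results come back in task order because Promise.all preserves order.
async function processInGroups(tasks, concurrency, worker) {
  const done = [];
  for (const group of groupTasks(tasks, concurrency)) {
    done.push(...(await Promise.all(group.map((task) => worker(task)))));
  }
  return done;
}

// Usage with a stand-in worker that only echoes the task ID.
const demoTasks = [{ id: "t0" }, { id: "t1" }, { id: "t2" }];
processInGroups(demoTasks, 2, async (task) => task.id).then((ids) => {
  console.log(ids); // t0, t1, t2
});
```

If any worker rejects, Promise.all rejects for the whole group, so tasks in that group that did not finish should be retried after checking get_progress.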
Q: What if my JSON has multiple arrays?
A: Specify the exact array using the fieldPath parameter. If omitted, the first array is used.
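The sketch below illustrates the idea of array discovery and default selection on a small object. It is an approximation, not the server's implementation: it handles only simple dotted paths such as $.data.items (no wildcards, filters, or keys containing dots), and it assumes "first array" means first in key order. All function names are hypothetical.

```javascript
// Collect JSONPath-like strings for every array reachable through object keys.
// Arrays nested inside other arrays are not descended into in this sketch.
function findArrayPaths(value, prefix = "$") {
  if (Array.isArray(value)) return [prefix];
  if (value === null || typeof value !== "object") return [];
  return Object.entries(value).flatMap(([key, child]) =>
    findArrayPaths(child, `${prefix}.${key}`)
  );
}

// Resolve a simple dotted path such as "$.data.items".
function resolvePath(doc, fieldPath) {
  const keys = fieldPath.split(".").slice(1); // drop the leading "$"
  let current = doc;
  for (const key of keys) {
    if (current === null || typeof current !== "object") return undefined;
    current = current[key];
  }
  return current;
}

// Use the given fieldPath, or fall back to the first discovered array.
function selectArray(doc, fieldPath) {
  const chosen = fieldPath ?? findArrayPaths(doc)[0];
  const target = chosen === undefined ? undefined : resolvePath(doc, chosen);
  if (!Array.isArray(target)) {
    throw new Error(`Path does not point to an array: ${chosen}`);
  }
  return { fieldPath: chosen, length: target.length };
}

const doc = { meta: { version: 1 }, data: [1, 2, 3], items: [4, 5] };
console.log(findArrayPaths(doc)); // $.data and $.items
console.log(selectArray(doc)); // fieldPath $.data, length 3
console.log(selectArray(doc, "$.items")); // fieldPath $.items, length 2
```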
Q: Can I modify the original file?
A: No, the original file is never modified. Results are always written to a new output file.
Q: How do I clean up old bindings?
A: Delete the binding directory: rm -rf ~/.kiro/mcp-data/json-batch-processor/{bindingId}
Support
For issues and questions, please open an issue on the GitHub repository.
