movie-mcp

v1.0.0

Published

2 months ago

AI-powered Movie Organizer MCP Server - Open source, no API keys required

simpletoolsindiaorg

mcp movie media organizer ai claude

🎬 Movie Organizer MCP Server (OSS, No API Key)

1. Overview

Build a Model Context Protocol (MCP) Server that powers an AI Movie Organizer Bot.

This server will:

analyze messy media files
identify movies / TV shows
fetch metadata from the web (no API keys)
organize + rename files
learn over time using RAG
reduce LLM (Claude) token usage, with a target of 70–90%

💡 Fully:

open-source
self-hosted
privacy-first
no paid APIs

2. Core Capabilities

2.1 Media Identification

parse messy filenames
detect:
- title
- year
- resolution
- release group
- language
- season/episode
support:
- movies
- TV series
- anime
- documentaries

2.2 Pattern-Based Search

search using:
- raw filename
- cleaned title
- regex pattern
- wildcard
- folder names
fuzzy matching support
typo tolerance
multilingual handling
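
The fuzzy matching and typo tolerance listed above could be sketched with the standard library alone. This is a minimal sketch, not the project's implementation: `normalize`, `fuzzy_search`, and the 0.6 cutoff are hypothetical names and values chosen for illustration.

```python
import difflib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse punctuation so 'The.Matrix' compares like 'the matrix'."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def fuzzy_search(query: str, titles: list[str], cutoff: float = 0.6) -> list[tuple[str, float]]:
    """Return (title, similarity) pairs at or above the cutoff, best first."""
    q = normalize(query)
    scored = []
    for title in titles:
        ratio = difflib.SequenceMatcher(None, q, normalize(title)).ratio()
        if ratio >= cutoff:
            scored.append((title, round(ratio, 2)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A call such as `fuzzy_search("Incepton", ["Inception", "Interstellar"])` ranks "Inception" first despite the typo. Character-level similarity does not cover multilingual titles; an alias table would be needed for that.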

3. Filesystem Navigation (Shell-like)

The MCP server must support safe filesystem exploration.

Commands (tool-based, not raw shell)

ls(path) → list files/folders
cd(path) → change working directory
pwd() → current directory
tree(path) → recursive structure
stat(path) → file metadata
find(pattern) → search files
du(path) → disk usage

Safety

sandboxed root directory
no system-level destructive commands
no execution of arbitrary shell commands
read-only mode by default
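
One way to enforce the sandboxed root is to resolve every requested path and reject anything that lands outside the root. The sketch below assumes hypothetical helper names (`resolve_in_sandbox`, `safe_ls`, `SandboxError`); it only reads the filesystem, in line with the read-only default.

```python
from pathlib import Path


class SandboxError(PermissionError):
    """Raised when a requested path falls outside the sandbox root."""


def resolve_in_sandbox(root: Path, user_path: str) -> Path:
    """Resolve user_path against root and reject anything that escapes it."""
    root = root.resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = (root / user_path).resolve()
    if candidate != root and root not in candidate.parents:
        raise SandboxError(f"{user_path!r} escapes the sandbox root")
    return candidate


def safe_ls(root: Path, user_path: str = ".") -> list[str]:
    """List entries under a sandboxed path; directories get a trailing slash."""
    target = resolve_in_sandbox(root, user_path)
    return sorted(p.name + ("/" if p.is_dir() else "") for p in target.iterdir())
```

Absolute paths such as `/etc` and traversal attempts such as `../..` both fail the check, because the resolved path no longer has the root among its parents.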

4. Web Search & Crawling (No API)

Supported sources

IMDb (public pages)
Wikipedia
Letterboxd
Rotten Tomatoes
Google search (HTML parsing)
public blogs / articles

Tools

requests / httpx
BeautifulSoup
Playwright (JS fallback)
trafilatura
readability-lxml
SearxNG (optional self-hosted search)

5. Web Search Strategy

title + year
title + language
cleaned-filename query
series + S01E01
fallback: folder name
multiple query reformulation
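
The query order above can be expressed as a small generator of search strings. This is a sketch under assumptions: `build_queries` is a hypothetical helper, and the input dict uses the field names from the parsing examples in this document plus assumed `language`, `season`, and `episode` keys.

```python
from typing import Optional


def build_queries(parsed: dict, folder_name: Optional[str] = None) -> list[str]:
    """Build search queries in priority order, without duplicates."""
    title = parsed.get("title")
    queries: list[str] = []
    if title:
        if parsed.get("year"):
            queries.append(f"{title} {parsed['year']}")
        if parsed.get("language"):
            queries.append(f"{title} {parsed['language']}")
        queries.append(title)  # cleaned title on its own
        if parsed.get("season") is not None and parsed.get("episode") is not None:
            queries.append(f"{title} S{parsed['season']:02d}E{parsed['episode']:02d}")
    if folder_name:
        queries.append(folder_name)  # fallback
    # dict.fromkeys keeps the first occurrence of each query and preserves order.
    return list(dict.fromkeys(queries))
```

The caller would try each query in turn and stop at the first one that yields a confident match, which keeps the number of web requests low.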

6. Metadata Extraction

Core fields

title
original title
year
runtime
genres
language
country

People

director
cast
writers

Ratings

IMDb rating
votes (if available)

Series fields

season
episode
episode title
air date

7. Matching Engine

Compare:

filename vs web title
year match
language match
cast overlap
edition keywords

Detect:

remakes
director's cut
extended version
dubbed vs original

Output:

best match
confidence score (0–1)
explanation
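
A confidence score in the 0–1 range can be built as a weighted sum of the compared signals. The weights below are assumptions for illustration, not values from this project, and `score_candidate`, `best_match`, and `MatchResult` are hypothetical names. Cast overlap and edition keywords are left out to keep the sketch short.

```python
import difflib
from dataclasses import dataclass, field

# Assumed weights; they sum to 1.0 so the score stays within 0-1.
WEIGHTS = {"title": 0.6, "year": 0.3, "language": 0.1}


@dataclass
class MatchResult:
    candidate: dict
    confidence: float
    explanation: list[str] = field(default_factory=list)


def score_candidate(parsed: dict, candidate: dict) -> MatchResult:
    """Score one web candidate against the parsed filename."""
    title_sim = difflib.SequenceMatcher(
        None, parsed["title"].lower(), candidate["title"].lower()
    ).ratio()
    score = WEIGHTS["title"] * title_sim
    reasons = [f"title similarity {title_sim:.2f}"]
    if parsed.get("year") and parsed["year"] == candidate.get("year"):
        score += WEIGHTS["year"]
        reasons.append("year matched")
    if parsed.get("language") and parsed["language"] == candidate.get("language"):
        score += WEIGHTS["language"]
        reasons.append("language matched")
    return MatchResult(candidate, round(score, 2), reasons)


def best_match(parsed: dict, candidates: list[dict]) -> MatchResult:
    """Return the highest-scoring candidate together with its explanation."""
    return max((score_candidate(parsed, c) for c in candidates),
               key=lambda result: result.confidence)
```

Because each signal appends a line to `explanation`, the same pass produces both the score and the reasons behind it.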

8. Rename Engine

Output format examples

Movies:

Inception (2010) [1080p BluRay x264]

Series:

Breaking Bad - S01E01 - Pilot

Features

dry run preview
batch rename
undo support
customizable templates
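
The dry run, batch rename, undo, and template features fit together naturally if a rename is first computed as a plan and only then applied. A minimal sketch, assuming hypothetical names (`render_name`, `plan_renames`, `apply_renames`) and template strings that reproduce the two example formats above:

```python
from pathlib import Path

# Assumed templates matching the movie and series examples above.
MOVIE_TEMPLATE = "{title} ({year}) [{quality}]"
SERIES_TEMPLATE = "{title} - S{season:02d}E{episode:02d} - {episode_title}"


def render_name(meta: dict, template: str) -> str:
    """Fill a naming template from metadata fields."""
    return template.format(**meta)


def plan_renames(items: list[tuple[Path, dict]], template: str) -> list[tuple[Path, Path]]:
    """Dry run: compute (old, new) pairs without touching the disk."""
    return [(src, src.with_name(render_name(meta, template) + src.suffix))
            for src, meta in items]


def apply_renames(plan: list[tuple[Path, Path]]) -> list[tuple[Path, Path]]:
    """Apply a plan and return the inverse plan, which can be applied to undo."""
    undo = []
    for src, dst in plan:
        if dst.exists():
            raise FileExistsError(dst)  # never overwrite an existing file
        src.rename(dst)
        undo.append((dst, src))
    return undo[::-1]
```

Showing the output of `plan_renames` to the user is the preview; passing the list returned by `apply_renames` back into `apply_renames` restores the original names.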

9. RAG (Learning System)

Stores:

past matches
rename decisions
user corrections
aliases
release patterns
failures

Use cases:

improve matching accuracy
avoid repeated mistakes
learn naming preferences

Storage options:

SQLite + FTS
Chroma / FAISS / LanceDB
JSON knowledge base

10. RAG Retrieval Usage

find similar filenames
recall past corrections
reuse rename patterns
improve ranking confidence
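
Recalling past corrections for similar filenames can be sketched with a plain SQLite table and token overlap. Note the substitution: this uses a linear scan with Jaccard similarity rather than FTS or a vector store, which would be the better fit for large libraries. `CorrectionStore` and the 0.3 threshold are hypothetical.

```python
import re
import sqlite3


def tokens(name: str) -> set[str]:
    """Split a filename into lowercase alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", name.lower()))


class CorrectionStore:
    """Stores filename -> final name decisions and retrieves similar ones."""

    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS corrections "
            "(filename TEXT PRIMARY KEY, final_name TEXT NOT NULL)"
        )

    def store(self, filename: str, final_name: str) -> None:
        self.db.execute("INSERT OR REPLACE INTO corrections VALUES (?, ?)",
                        (filename, final_name))
        self.db.commit()

    def search(self, filename: str, limit: int = 3, min_score: float = 0.3):
        """Return (score, stored filename, final name) tuples, best first."""
        query = tokens(filename)
        scored = []
        rows = self.db.execute("SELECT filename, final_name FROM corrections")
        for stored, final_name in rows:
            other = tokens(stored)
            union = query | other
            score = len(query & other) / len(union) if union else 0.0
            if score >= min_score:
                scored.append((round(score, 2), stored, final_name))
        return sorted(scored, reverse=True)[:limit]
```

A hit from `search` can be offered as a candidate before any web search runs, so a correction made once is reused for later releases of the same title.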

11. Web Crawling Safety

Must:

rate limit
cache responses
retry with backoff
timeout handling

Must NOT:

require login
bypass protections illegally
spam websites
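
Rate limiting and retry with backoff can be kept independent of the HTTP client by wrapping whatever fetch function is in use. This is a sketch with hypothetical names (`RateLimiter`, `fetch_with_retry`); the `fetch` callable is assumed to apply its own timeout and to raise on failure.

```python
import time
from typing import Callable


class RateLimiter:
    """Enforces a minimum interval between consecutive requests."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep):
        self.min_interval = min_interval
        self._clock, self._sleep = clock, sleep
        self._last = None

    def wait(self) -> None:
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
        self._last = self._clock()


def fetch_with_retry(fetch: Callable[[str], str], url: str, retries: int = 3,
                     base_delay: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep) -> str:
    """Call fetch(url), retrying with exponential backoff on any exception."""
    for attempt in range(retries + 1):
        try:
            return fetch(url)
        except Exception:
            if attempt == retries:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ... with the default base
    raise ValueError("retries must be >= 0")
```

Passing `clock` and `sleep` as parameters keeps the helpers testable without real waiting. In a crawler, `RateLimiter.wait()` would be called once per host before each request.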

12. Caching Layer

Cache:

parsed filenames
search results
metadata summaries
match decisions

Storage:

SQLite
Redis (optional)
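
A single SQLite table is enough to cache all four kinds of entries if each key carries a namespace. The sketch below is an assumption about how that could look; the `Cache` class, its method names, and the `kind` labels are hypothetical.

```python
import hashlib
import json
import sqlite3
from typing import Callable


class Cache:
    """JSON value cache in SQLite, keyed by entry kind plus a hash of the input."""

    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )

    @staticmethod
    def _key(kind: str, raw: str) -> str:
        digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()
        return f"{kind}:{digest}"

    def get(self, kind: str, raw: str):
        row = self.db.execute("SELECT value FROM cache WHERE key = ?",
                              (self._key(kind, raw),)).fetchone()
        return json.loads(row[0]) if row else None

    def put(self, kind: str, raw: str, value) -> None:
        self.db.execute("INSERT OR REPLACE INTO cache VALUES (?, ?)",
                        (self._key(kind, raw), json.dumps(value)))
        self.db.commit()

    def get_or_compute(self, kind: str, raw: str, compute: Callable[[], object]):
        """Return the cached value, computing and storing it on a miss."""
        hit = self.get(kind, raw)
        if hit is not None:  # a stored null is treated as a miss in this sketch
            return hit
        value = compute()
        self.put(kind, raw, value)
        return value
```

Wrapping parsing, search, and match decisions in `get_or_compute` means a filename seen before never triggers a second web request or LLM call.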

13. MCP Tools

Expose tools:

parse_filename
search_media
search_web
crawl_imdb
extract_metadata
match_candidates
get_best_match
rename_preview
apply_rename
filesystem_ls
filesystem_cd
filesystem_find
rag_store
rag_search
explain_match

14. Explainability

Each decision must include:

why the match was chosen
which signals matched
confidence score
source references

Example:

title match strong
year matched
IMDb + Wikipedia agree
confidence: 0.93

15. Token Optimization (Claude)

Goal

Reduce token usage by 70–90%

15.1 Local-first Pipeline

parse filename
check cache
query RAG
web search (if needed)
rank locally
send minimal data to Claude
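
The pipeline order above can be sketched as one function that short-circuits as early as possible. Everything here is an assumption for illustration: the `identify` function, its injected stage callables, and the 0.85 threshold below which the LLM is consulted.

```python
from typing import Callable, Optional


def identify(filename: str,
             parse: Callable[[str], dict],
             cache_get: Callable[[str], Optional[dict]],
             rag_lookup: Callable[[dict], list],
             web_search: Callable[[dict], list],
             rank: Callable[[dict, list], list],
             ask_llm: Callable[[dict, list], dict],
             threshold: float = 0.85) -> dict:
    """Run the local stages in order and call the LLM only as a last resort."""
    parsed = parse(filename)
    cached = cache_get(filename)
    if cached is not None:
        return {"source": "cache", **cached}
    candidates = rag_lookup(parsed)
    if not candidates:
        candidates = web_search(parsed)  # only when local knowledge has nothing
    ranked = rank(parsed, candidates)
    if ranked and ranked[0]["score"] >= threshold:
        return {"source": "local", **ranked[0]}
    # Ambiguous case: send the parsed fields and the top three candidates only.
    return {"source": "llm", **ask_llm(parsed, ranked[:3])}
```

Injecting the stages as callables keeps the control flow testable on its own and lets each stage be swapped without touching the others.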

15.2 Compact Payloads

Send only:

{
  "title": "Inception",
  "year": 2010,
  "candidates": [
    {"title": "Inception", "year": 2010, "score": 0.95},
    {"title": "Inception: The Cobol Job", "year": 2010, "score": 0.52}
  ]
}
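
A payload of this shape can be produced by keeping only the top candidates and dropping every field the model does not need. This is a minimal sketch; `build_payload` and the `top_k` default are hypothetical.

```python
import json


def build_payload(parsed: dict, ranked: list[dict], top_k: int = 2) -> dict:
    """Keep the parsed title and year plus the top_k candidates, trimmed to three fields."""
    top = sorted(ranked, key=lambda c: c["score"], reverse=True)[:top_k]
    return {
        "title": parsed["title"],
        "year": parsed["year"],
        "candidates": [
            {"title": c["title"], "year": c["year"], "score": c["score"]}
            for c in top
        ],
    }


def to_compact_json(payload: dict) -> str:
    """Serialize without extra whitespace to save a few more tokens."""
    return json.dumps(payload, separators=(",", ":"), ensure_ascii=False)
```

Any scraped HTML, cast lists, or other bulky fields attached to a candidate stay on the server and never reach the prompt.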

15.3 Never Send

❌ full HTML pages
❌ full directory dumps
❌ raw logs
❌ repeated metadata

15.4 Preprocessing

Convert:

Movie.Name.2010.1080p.x264.mkv

Into:

{
  "title": "Movie Name",
  "year": 2010,
  "quality": "1080p",
  "type": "movie"
}
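
The conversion above can be approximated with a few regular expressions. This is a rough sketch, not a replacement for a dedicated parser such as guessit from the suggested stack; `parse_filename` and the patterns are assumptions, and the input is assumed to end in a file extension.

```python
import re
from pathlib import Path

YEAR_RE = re.compile(r"\b(19\d{2}|20\d{2})\b")
QUALITY_RE = re.compile(r"\b(480p|720p|1080p|2160p)\b", re.IGNORECASE)
EPISODE_RE = re.compile(r"\bS(\d{1,2})E(\d{1,3})\b", re.IGNORECASE)


def parse_filename(filename: str) -> dict:
    """Extract title, year, quality, and type from a release-style filename."""
    # Drop the extension, then treat dots and underscores as spaces.
    text = re.sub(r"[._]+", " ", Path(filename).stem)
    year = YEAR_RE.search(text)
    quality = QUALITY_RE.search(text)
    episode = EPISODE_RE.search(text)
    # The title is everything before the first recognized marker.
    markers = [m.start() for m in (year, quality, episode) if m]
    title = text[: min(markers)].strip(" -") if markers else text.strip()
    result = {
        "title": title,
        "year": int(year.group(1)) if year else None,
        "quality": quality.group(1).lower() if quality else None,
        "type": "series" if episode else "movie",
    }
    if episode:
        result["season"] = int(episode.group(1))
        result["episode"] = int(episode.group(2))
    return result
```

Titles that are themselves a year, or that contain one, will be cut short by this approach, which is one reason to fall back to a fuller parser when the result looks suspicious.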

15.5 Summarization

Return:

title
year
rating
2–3 evidence lines

NOT full documents.

15.6 Cache Everything

Avoid re-calling the LLM for:

same filename
same query
same decision

16. Architecture

Components

MCP Server (core brain)
Parser Engine
Matching Engine
Web Crawler
RAG Store
Cache Layer
Filesystem Adapter

17. Suggested Stack (OSS Only)

Backend

Python (FastAPI)
Node.js (optional)

Parsing

guessit
custom regex

Crawling

BeautifulSoup
Playwright
httpx

Storage

SQLite
Chroma / FAISS

Search

SearxNG

18. Future Extensions

subtitle auto-matching
torrent/indexer integration
Jellyfin metadata sync
poster downloader
duplicate detection
quality upgrade suggestions

19. Key Design Principles

local-first
zero API cost
explainable decisions
deterministic before LLM
token-efficient
modular MCP tools
safe filesystem access

20. Final Goal

A single intelligent MCP server that replaces:

manual file organization
messy naming
repeated metadata lookup
heavy LLM usage

And enables:

🤖 Fully automated AI-powered media organization