# videostil

v0.3.5 — Convert videos into LLM-friendly input by extracting and deduplicating frames.
Videos are large. LLM context windows are small. You need a way to "compress" videos so they fit. We've got you.
videostil extracts and deduplicates video frames, converting videos into LLM-friendly formats that fit within context windows.
## Features
- 🎬 Extract frames at configurable FPS (default: 25)
- 🔍 Smart deduplication (3 algorithms: greedy, dynamic programming, sliding window)
- 📦 Simple API and CLI
- ⚡ Fast processing with caching
- 🎯 Absolute frame indexing preserves video timeline
- 🖼️ Automatic scaling and padding to 1280x720
- 🌐 Built-in analysis viewer server
- 📊 Frame similarity analysis and visualization
- 🤖 Optional LLM-powered video analysis (supports Anthropic, Google, OpenAI)
## Installation

```shell
npm install videostil
```

Requirements: `ffmpeg` and `ffprobe` must be installed on your system.

```shell
# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt-get install ffmpeg

# Windows
# Download from https://ffmpeg.org/download.html
```

## Quick Start
### CLI Usage

```shell
# Extract frames from a video (automatically opens viewer)
npx videostil https://example.com/video.mp4

# With custom parameters
npx videostil video.mp4 --fps 20 --threshold 0.01 --algo dp

# Extract without opening viewer
npx videostil video.mp4 --no-serve

# Start the viewer server for existing analyses
npx videostil serve
npx videostil serve --port 8080 --no-open
```

### API Usage
```typescript
import { extractUniqueFrames, startAnalysisServer, analyseFrames } from 'videostil';

// Extract and deduplicate frames
const result = await extractUniqueFrames({
  videoUrl: 'https://example.com/video.mp4',
  fps: 25,
  threshold: 0.001,
  algo: 'gd' // greedy (default), or 'dp', 'sw'
});

console.log(`Extracted ${result.uniqueFrames.length} unique frames`);
console.log(`Video duration: ${result.videoDurationSeconds}s`);

// Access frame data
for (const frame of result.uniqueFrames) {
  console.log(`Frame ${frame.index} at ${frame.timestamp}`);
  console.log(`Path: ${frame.path}`);
  console.log(`Base64: ${frame.base64.substring(0, 50)}...`);
}

// Optional: Analyze frames with an LLM
const analysisResult = await analyseFrames({
  selectedModel: 'claude-sonnet-4-6',
  workingDirectory: result.uniqueFramesDir,
  frameBatch: result.uniqueFrames.map(frame => ({
    name: frame.fileName,
    contentType: 'image/png',
    base64Data: frame.base64
  })),
  systemPrompt: 'You are a video analysis assistant.',
  initialUserPrompt: 'Analyze these video frames and provide insights.',
  apiKeys: {
    ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY,
    GOOGLE_API_KEY: process.env.GOOGLE_API_KEY,
    OPENAI_API_KEY: process.env.OPENAI_API_KEY
  }
});

console.log(analysisResult.analysis);

// Start the analysis viewer server
const server = await startAnalysisServer({
  port: 63745,
  openBrowser: true
});

console.log(`Server running at ${server.url}`);
```

## API Reference
### extractUniqueFrames(options)

Extract and deduplicate frames from a video.

Options:

- `videoUrl` (string, required): Video URL or local file path
- `fps` (number, default: 25): Frames per second to extract
- `threshold` (number, default: 0.001): Deduplication similarity threshold (0.0-1.0)
- `startTime` (string, optional): Start extraction at the specified time in MM:SS format (e.g., "1:30")
- `duration` (number, optional): Extract only the specified duration in seconds
- `algo` (string, default: "gd"): Deduplication algorithm
  - `"gd"` - Greedy (fastest, compares with the previous frame only)
  - `"dp"` - Dynamic Programming (looks back at N frames, default 5)
  - `"sw"` - Sliding Window (maintains a rolling window, default 3 frames)
- `workingDir` (string, optional): Absolute path for the output directory (default: `~/.videostil/{hash}`)

Returns: `ExtractResult`

- `totalFramesCount` (number): Total frames before deduplication
- `uniqueFrames` (`FrameInfo[]`): Array of unique frames
- `videoDurationSeconds` (number): Video duration in seconds
- `uniqueFramesDir` (string): Path to the unique frames directory
### startAnalysisServer(options)

Start the analysis viewer server to browse and visualize extracted frames.

Options:

- `port` (number, default: 63745): Server port
- `host` (string, default: "127.0.0.1"): Server host
- `openBrowser` (boolean, default: true): Automatically open the browser
- `workingDir` (string, optional): Custom directory to serve analyses from (default: `~/.videostil/`)

Returns: `ServerHandle`

- `url` (string): Server URL
- `port` (number): Server port
- `host` (string): Server host
- `server` (`http.Server`): Node.js HTTP server instance
- `close` (`() => Promise`): Function to close the server
### analyseFrames(options)

Analyze video frames using AI models from Anthropic, Google, or OpenAI.

Options:

- `selectedModel` (string, required): Model to use (e.g., "claude-sonnet-4-6", "gpt-5-2025-08-07", "gemini-2.5-pro")
- `workingDirectory` (string, required): Directory containing the frames
- `frameBatch` (`Attachment[]`, required): Array of frame attachments with base64 data
- `systemPrompt` (string, required): System prompt for the AI
- `initialUserPrompt` (string, required): User prompt for the AI
- `apiKeys` (object, optional): API keys object with `ANTHROPIC_API_KEY`, `GOOGLE_API_KEY`, and/or `OPENAI_API_KEY`

Returns: `AnalyseFramesResult`

- `analysis` (string): Raw analysis text from the AI
- `allMessages` (`CanonicalMessage[]`): All messages in the conversation
- `parsedXml` (`VideoAnalysisSection[]`): Parsed XML sections from the analysis
- `keyFrameImagesMap` (Map): Map of key frame images

Note: At least one API key must be provided, either in the `apiKeys` parameter or as environment variables.
### Frame Information

Each frame in `uniqueFrames` contains:

- `index` (number): Absolute frame position in the original video
- `path` (string): Local file path to the frame image
- `fileName` (string): Normalized filename (e.g., "frame_000123.png")
- `base64` (string): Base64-encoded PNG image data
- `timestamp` (string): Human-readable timestamp (e.g., "1m23s")
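Since `index` is an absolute frame position, you can recover a frame's time offset from it and the extraction fps. A minimal sketch (the `frameTimestamp` helper is hypothetical, not part of the videostil API):

```typescript
// Derive a human-readable "1m23s"-style timestamp from an absolute
// frame index and the fps used for extraction (hypothetical helper).
function frameTimestamp(index: number, fps: number): string {
  const totalSeconds = Math.floor(index / fps);
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return minutes > 0 ? `${minutes}m${seconds}s` : `${seconds}s`;
}

console.log(frameTimestamp(2075, 25)); // frame 2075 at 25 fps -> "1m23s"
```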
## CLI Reference

### Extract Command

```shell
videostil <video-url> [options]
```

Options:

```
--fps <number>         Frames per second (default: 25)
--threshold <number>   Deduplication threshold (default: 0.01)
--algo <string>        Algorithm: gd|dp|sw (default: gd)
--start <MM:SS>        Start time in video (format: MM:SS, e.g., 1:30)
--duration <seconds>   Duration to extract in seconds
--output <dir>         Output directory
--no-serve             Don't start the viewer after extraction
--model <string>       LLM model for analysis (e.g., claude-sonnet-4-6)
--system-prompt <str>  System prompt for LLM analysis
--user-prompt <str>    User prompt for LLM analysis
--help                 Show help
--version              Show version
```

Note: LLM analysis requires API keys. Set the ANTHROPIC_API_KEY, GOOGLE_API_KEY, or OPENAI_API_KEY environment variable to enable automatic frame analysis.
### Serve Command

```shell
videostil serve [options]
```

Options:

```
--port <number>  Server port (default: 63745)
--host <string>  Server host (default: 127.0.0.1)
--no-open        Don't open the browser automatically
--help           Show help
```

## How It Works
1. Download/Copy Video: Fetches the video from a URL or copies a local file
2. Extract Frames: Uses ffmpeg to extract frames at the specified FPS
3. Scale & Pad: Normalizes all frames to 1280x720 (maintaining aspect ratio)
4. Deduplicate: Removes duplicate frames using the selected algorithm
5. Store Results: Saves unique frames and metadata
6. Analyze (Optional): If API keys are available, analyzes frames using an LLM
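The extract and scale-and-pad steps map onto a standard ffmpeg filter chain. This sketch shows how such an invocation could be assembled; it is illustrative only, and the exact flags videostil uses internally may differ:

```typescript
// Build an ffmpeg argument list that extracts frames at a given fps and
// normalizes them to 1280x720 (illustrative sketch, not videostil's code).
function buildExtractArgs(input: string, fps: number, outDir: string): string[] {
  const filter = [
    `fps=${fps}`,
    // Fit inside 1280x720 while preserving the aspect ratio...
    'scale=1280:720:force_original_aspect_ratio=decrease',
    // ...then pad the remainder with centered borders.
    'pad=1280:720:(ow-iw)/2:(oh-ih)/2',
  ].join(',');
  return ['-i', input, '-vf', filter, `${outDir}/frame_%06d.png`];
}

console.log(buildExtractArgs('video.mp4', 25, 'frames').join(' '));
```

The `%06d` pattern matches the zero-padded `frame_000000.png` filenames described below.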
## Deduplication Algorithms

### Greedy (gd)

- Compares each frame only with the previous unique frame
- Fastest, lowest memory usage
- Best for videos with gradual scene changes

### Dynamic Programming (dp)

- Compares each frame with the previous N unique frames (lookback window)
- Default lookback: 5 frames
- Better for detecting duplicates across brief scene interruptions

### Sliding Window (sw)

- Maintains a rolling window of recent unique frames
- Default window size: 3 frames
- Good balance between accuracy and performance
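To make the greedy strategy concrete: keep a frame only when it differs from the last kept frame by more than the threshold. The similarity metric below (normalized mean absolute pixel difference) is an assumption for illustration; videostil's internal metric may differ:

```typescript
type Pixels = Uint8Array; // raw grayscale or flattened RGB pixel data

// Normalized mean absolute difference between two equally-sized frames,
// in [0, 1]. 0 means identical; 1 means maximally different.
function frameDiff(a: Pixels, b: Pixels): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += Math.abs(a[i] - b[i]);
  return sum / (a.length * 255);
}

// Greedy ("gd") deduplication sketch: returns the indices of kept frames.
function greedyDedup(frames: Pixels[], threshold: number): number[] {
  const kept: number[] = [];
  let last: Pixels | null = null;
  for (let i = 0; i < frames.length; i++) {
    if (last === null || frameDiff(frames[i], last) > threshold) {
      kept.push(i);
      last = frames[i];
    }
  }
  return kept;
}
```

The `dp` and `sw` variants differ only in what they compare against: a lookback list of the last N kept frames, or a rolling window, instead of the single `last` frame.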
## Output Structure

```
~/.videostil/{hash}/
├── video_{hash}.webm        # Downloaded video
├── frames/                  # All extracted frames
│   └── frame_000000.png
├── unique_frames/           # Deduplicated frames only
│   └── frame_000000.png
├── analysis-result.json     # Analysis metadata
└── frame-diff-data.json     # Cached similarity data
```

## Examples
### Extract Specific Time Range

```typescript
import { extractUniqueFrames } from 'videostil';

// Extract 30 seconds starting at 1 minute
const result = await extractUniqueFrames({
  videoUrl: 'video.mp4',
  startTime: '1:00',
  duration: 30,
  fps: 30
});
```

### Use Different Algorithms
```typescript
import { extractUniqueFrames } from 'videostil';

// Try each algorithm and compare results
const algorithms = ['gd', 'dp', 'sw'] as const;

for (const algo of algorithms) {
  const result = await extractUniqueFrames({
    videoUrl: 'video.mp4',
    algo,
    threshold: 0.01
  });
  console.log(`${algo}: ${result.uniqueFrames.length} unique frames`);
}
```

### Custom Working Directory
```typescript
import { extractUniqueFrames } from 'videostil';
import path from 'path';

const result = await extractUniqueFrames({
  videoUrl: 'video.mp4',
  workingDir: path.join(process.cwd(), 'my-frames')
});
```

### Analysis Viewer Server
```typescript
import { startAnalysisServer } from 'videostil';

// Start the server with custom options
const server = await startAnalysisServer({
  port: 8080,
  host: '0.0.0.0',
  openBrowser: false
});

console.log(`Analysis viewer: ${server.url}`);

// Close the server when done
await server.close();
```

### LLM Frame Analysis
```typescript
import { extractUniqueFrames, analyseFrames } from 'videostil';

// First extract frames
const result = await extractUniqueFrames({
  videoUrl: 'video.mp4',
  fps: 25,
  threshold: 0.01
});

// Analyze with Claude
const analysis = await analyseFrames({
  selectedModel: 'claude-sonnet-4-6',
  workingDirectory: result.uniqueFramesDir,
  frameBatch: result.uniqueFrames.map(frame => ({
    name: frame.fileName,
    contentType: 'image/png',
    base64Data: frame.base64
  })),
  systemPrompt: 'Analyze these video frames for key events and transitions.',
  initialUserPrompt: 'What are the main scenes in this video?'
});

console.log(analysis.analysis);
```

## Contributing
Contributions are welcome! Please see the GitHub repository for guidelines.
## License
MIT
