
video-query

A video surveillance screenshot query service. Extract frames from videos, compose them into labeled grid mosaics, and use AI vision models to locate specific scenes or objects.

中文文档 (Chinese documentation)

Features

  • Extract frames from videos using FFmpeg WASM (no local FFmpeg installation required)
  • Compose multiple images into labeled grid mosaics with row/column annotations
  • Analyze images using OpenAI multimodal models and other vision AI providers
  • Automatically parse AI responses into structured results
  • Support for custom system prompts
  • Extensible custom model adapters

Installation

npm install video-query

To use OpenAI models, also install:

npm install openai

Quick Start

Basic Usage

import { QueryVideo } from 'video-query';
import OpenAI from 'openai';
import fs from 'fs';

// Initialize OpenAI client
const openai = new OpenAI({ apiKey: 'your-api-key' });

// Create query instance
const query = new QueryVideo({
  mosaic: {
    columns: 4,           // 4 columns per row
    size: { width: 2048, height: 2048 },  // Total mosaic size (rows auto-calculated)
    // cellAspectRatio: 1,  // Optional: cell aspect ratio, default 1 (square)
  },
  model: {
    sdk: openai,          // SDK type auto-detected
    defaultModel: 'gpt-4o',
  },
});

// Extract frames from video
await query.addVideo({
  source: fs.readFileSync('video.mp4'),  // Or pass file path directly
  interval: 5,                            // Extract one frame every 5 seconds
  startTime: 0,                           // Start from 0 seconds
});

// Execute query
const result = await query.query('Find all frames where someone is wearing red clothes');

if (result.success) {
  console.log('Matches found:', result.matches.length);
  for (const match of result.matches) {
    console.log(`- ${match.description}`);
    console.log(`  Time: ${match.item.metadata?.videoTime}s`);
  }
}

Using Images

import { QueryVideo } from 'video-query';
import OpenAI from 'openai';
import fs from 'fs';

// Initialize the OpenAI client used below
const openai = new OpenAI({ apiKey: 'your-api-key' });

const query = new QueryVideo({
  mosaic: {
    columns: 3,
    size: { width: 1280, height: 720 },  // Total mosaic size
  },
  model: {
    sdk: openai,
  },
});

// Add single image (Buffer)
query.addImage(fs.readFileSync('image1.png'), { source: 'camera-1' });

// Add Base64 image
const base64Image = 'data:image/png;base64,iVBORw0KGgo...';
query.addImage(base64Image, { source: 'camera-2' });

// Add multiple images
query.addImages([
  { data: fs.readFileSync('image2.png'), metadata: { source: 'camera-3' } },
  { data: 'iVBORw0KGgo...', metadata: { source: 'camera-4' } },  // Pure Base64 (without prefix)
]);

const result = await query.query('Are there any animals in the images?');

Generate Mosaics (for debugging)

import { QueryVideo } from 'video-query';
import fs from 'fs';

const query = new QueryVideo({ /* ... */ });

// Add images
query.addImage(/* ... */);

// Generate mosaics for debugging or saving
const mosaics = await query.generateMosaics();

for (const m of mosaics) {
  fs.writeFileSync(`mosaic-${m.index}.png`, m.buffer);
  console.log(`Saved mosaic ${m.index}: ${m.width}x${m.height}, contains ${m.imageCount} images`);
}

API

QueryVideo

Main class that combines all modules to provide query functionality.

Constructor

new QueryVideo(config: QueryVideoConfig)

QueryVideoConfig:

| Parameter | Type | Description |
|-----------|------|-------------|
| mosaic | MosaicConfig | Mosaic configuration |
| model | ModelAdapterConfig | Model adapter configuration |
| systemPromptTemplate | string? | Custom system prompt |
| debug | boolean? | Debug mode |

Methods

| Method | Description |
|--------|-------------|
| addImage(data, metadata?) | Add single image (supports Buffer or Base64 string) |
| addImages(items) | Add multiple images (supports Buffer or Base64 string) |
| addVideo(config) | Extract frames from video and add them |
| query(prompt) | Execute query |
| generateMosaics() | Generate mosaics (for debugging or saving) |
| getImages() | Get all added images |
| getImageCount() | Get image count |
| clearImages() | Clear all images |
| preloadFFmpeg() | Preload FFmpeg |
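
To see how these methods fit together, here is a minimal sketch of a typical call sequence; the file names and prompt are placeholders, the openai client is set up as in Quick Start, and preloadFFmpeg() is awaited on the assumption that it loads the FFmpeg WASM runtime asynchronously.

import { QueryVideo } from 'video-query';
import OpenAI from 'openai';
import fs from 'fs';

const openai = new OpenAI({ apiKey: 'your-api-key' });

const query = new QueryVideo({
  mosaic: { columns: 4, size: { width: 2048, height: 2048 } },
  model: { sdk: openai },
});

// Warm up the FFmpeg WASM runtime before the first addVideo() call
await query.preloadFFmpeg();

// Mix sources: extracted video frames plus a standalone snapshot
await query.addVideo({ source: 'entrance.mp4', interval: 5 });
query.addImage(fs.readFileSync('snapshot.png'), { source: 'camera-1' });

console.log('Images queued:', query.getImageCount());

const result = await query.query('Is anyone standing near the entrance?');
console.log(result.success ? `${result.matches.length} match(es)` : result.error);

// Reset before the next batch
query.clearImages();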

MosaicConfig

Mosaic configuration.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| columns | number | - | Number of columns per row |
| cellAspectRatio | number? | 1 | Cell aspect ratio (width/height), 1 for square |
| size | SizeConfig | - | Size configuration |
| labelFontSize | number? | 14 | Label font size (label area auto-calculated as fontSize * 1.5) |
| backgroundColor | string? | #000000 | Background color |
| labelColor | string? | #FFFFFF | Label color |
| gridColor | string? | #333333 | Grid line color |
| gridWidth | number? | 1 | Grid line width |

SizeConfig:

// Specify total mosaic size (including label area and grid lines)
// Cell size and row count are auto-calculated based on total size
// Empty slots display as solid black placeholders
{ width: 2048, height: 2048 }
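
For reference, here is a configuration sketch that spells out the optional MosaicConfig fields with their documented defaults (shown only for illustration; omitting them should give the same result). The openai client is assumed to be set up as in Quick Start.

const query = new QueryVideo({
  mosaic: {
    columns: 4,
    cellAspectRatio: 1,             // square cells (default)
    size: { width: 2048, height: 2048 },
    labelFontSize: 14,              // label area is auto-calculated as fontSize * 1.5
    backgroundColor: '#000000',
    labelColor: '#FFFFFF',
    gridColor: '#333333',
    gridWidth: 1,
  },
  model: { sdk: openai },
});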

VideoConfig

Video frame extraction configuration.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| source | Buffer \| string | - | Video data or file path |
| startTime | number? | 0 | Start time for extraction (seconds) |
| interval | number? | 1 | Extraction interval (seconds) |
| duration | number? | - | Extraction duration (seconds) |
| sourceId | string? | - | Video source identifier |
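
A sketch of addVideo() with every VideoConfig field filled in; the file name, timings, and sourceId are placeholder values.

// Extract one frame every 2 seconds from a 60-second window starting at 00:10
await query.addVideo({
  source: 'entrance-cam.mp4',   // or a Buffer, e.g. fs.readFileSync('entrance-cam.mp4')
  startTime: 10,                // start extraction at 10 seconds
  interval: 2,                  // one frame every 2 seconds
  duration: 60,                 // extract 60 seconds of footage
  sourceId: 'entrance-cam',     // identifier attached to the extracted frames
});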

QueryResult

Query result.

interface QueryResult {
  success: boolean;          // Whether successful
  matches: MatchedImage[];   // List of matched images
  error?: string;            // Error message
  rawResponse?: string;      // Raw model response
  duration?: number;         // Query duration (milliseconds)
}

interface MatchedImage {
  item: IImageItem;          // Original image item
  confidence?: number;       // Match confidence
  description?: string;      // Description from model
}
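
A handling sketch that also covers the failure branch; whether a failed call throws or returns success: false is not specified above, so this only illustrates reading the fields.

const result = await query.query('Find frames showing a delivery truck');

if (!result.success) {
  console.error('Query failed:', result.error);
  console.error('Raw model response:', result.rawResponse);
} else {
  console.log(`${result.matches.length} match(es) in ${result.duration}ms`);
  for (const { item, confidence, description } of result.matches) {
    console.log(description, confidence, item.metadata);
  }
}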

ImageData

Data parameter type for addImage() and addImages() methods.

type ImageData = Buffer | string | null;
  • Buffer: Binary image data
  • string: Base64 encoded image data, supports two formats:
    • With prefix: data:image/png;base64,iVBORw0KGgo...
    • Pure Base64: iVBORw0KGgo...
  • null: Solid black placeholder image
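
The accepted forms side by side; image paths and metadata values are placeholders.

import fs from 'fs';

query.addImages([
  { data: fs.readFileSync('frame.png'), metadata: { source: 'buffer' } },              // Buffer
  { data: 'data:image/png;base64,iVBORw0KGgo...', metadata: { source: 'data-url' } },  // Base64 with prefix
  { data: 'iVBORw0KGgo...', metadata: { source: 'raw-base64' } },                      // Pure Base64
  { data: null, metadata: { source: 'placeholder' } },                                 // Solid black placeholder
]);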

MosaicBuffer

Return type for generateMosaics() method.

interface MosaicBuffer {
  index: number;       // Mosaic index
  buffer: Buffer;      // Image Buffer
  base64: string;      // Base64 encoded image data
  width: number;       // Width
  height: number;      // Height
  imageCount: number;  // Number of images contained
}
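
The base64 field is handy when a mosaic is displayed rather than written to disk. A small sketch, assuming the mosaics are PNG-encoded (as the earlier example suggests) and that base64 holds the raw encoding without a data-URL prefix:

const mosaics = await query.generateMosaics();

for (const m of mosaics) {
  // Embed directly in an <img> tag or an API response without touching the filesystem
  const dataUrl = `data:image/png;base64,${m.base64}`;
  console.log(`Mosaic ${m.index}: ${m.width}x${m.height}, ${m.imageCount} images, data URL length ${dataUrl.length}`);
}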

Extending Model Adapters

The library includes a built-in OpenAI adapter. You can also write custom adapters to support other AI providers:

import { BaseModelAdapter, createModelAdapter } from 'video-query';
import type { VisionRequest, VisionResponse, IModelAdapter } from 'video-query';

// Method 1: Extend BaseModelAdapter
class MyCustomAdapter extends BaseModelAdapter {
  async callVision(request: VisionRequest): Promise<VisionResponse> {
    // Implement your provider call here; `myAIClient` is a placeholder for your AI provider's SDK client
    const response = await myAIClient.chat({
      systemPrompt: request.systemPrompt,
      userPrompt: request.userPrompt,
      images: request.images,
    });

    return { content: response.text };
  }
}

// Method 2: Implement IModelAdapter interface
const myAdapter: IModelAdapter = {
  validate: () => true,
  callVision: async (request) => {
    // ...
    return { content: '...' };
  },
};

// Use in QueryVideo
const query = new QueryVideo({
  mosaic: { /* ... */ },
  model: {
    sdk: openai,  // SDK type auto-detected
  },
});

How It Works

  1. Frame Extraction: Use FFmpeg WASM to extract frames from video at specified intervals
  2. Mosaic Generation: Compose multiple images into labeled grid mosaics with row/column annotations (A, B, C... / 1, 2, 3...)
  3. AI Analysis: Send mosaics to vision model, which returns matching positions based on coordinate labels
  4. Result Parsing: Parse AI responses into structured results, mapping grid labels back to the original images (see the sketch below)
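
The coordinate mapping in step 4 can be pictured with a small sketch; this illustrates the general idea of label-to-index mapping, not necessarily the library's internal implementation.

// With `columns` columns per mosaic, a label such as "B3" means row B (index 1)
// and column 3 (index 2), which maps to a flat position inside that mosaic.
function labelToIndex(label: string, columns: number): number {
  const row = label.charCodeAt(0) - 'A'.charCodeAt(0);  // 'A' -> 0, 'B' -> 1, ...
  const col = parseInt(label.slice(1), 10) - 1;         // '1' -> 0, '2' -> 1, ...
  return row * columns + col;
}

labelToIndex('B3', 4);  // 6, i.e. the 7th image placed in that mosaic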

Requirements

  • Node.js >= 18.0.0
  • ES Module support

License

MIT