moondream v0.2.0
# Moondream Node.js Client Library
Official Node.js client library for Moondream, a fast multi-function vision language model (VLM). This client can target either Moondream Cloud or Moondream Station.
## Capabilities
Moondream goes beyond the typical VLM "query" ability to include more visual functions:
| Method | Description |
|--------|-------------|
| caption | Generate descriptive captions for images |
| query | Ask questions about image content |
| detect | Find bounding boxes around objects in images |
| point | Identify the center location of specified objects |
| segment | Generate an SVG path segmentation mask for objects |
Try it out on Moondream's playground.
## Installation

```bash
npm install moondream
```

## Quick Start
Choose how you want to run Moondream:
- Moondream Cloud — Get an API key from the cloud console
- Moondream Station — Run locally by installing Moondream Station
```js
import { vl } from 'moondream';
import fs from 'fs';

// Initialize with Moondream Cloud
const model = new vl({ apiKey: '<your-api-key>' });

// Or initialize against a local Moondream Station instead:
// const model = new vl({ endpoint: 'http://localhost:2020/v1' });

// Load an image
const image = fs.readFileSync('path/to/image.jpg');

// Generate a caption
const caption = await model.caption({ image });
console.log('Caption:', caption.caption);

// Ask a question
const answer = await model.query({ image, question: "What's in this image?" });
console.log('Answer:', answer.answer);

// Stream the response
const stream = await model.caption({ image, stream: true });
for await (const chunk of stream.caption) {
  process.stdout.write(chunk);
}
```

## API Reference
### Constructor
```js
const model = new vl({ apiKey: '<your-api-key>' });             // Cloud
const model = new vl({ endpoint: 'http://localhost:2020/v1' }); // Local
```

### Methods
#### caption({ image, length?, stream? })
Generate a caption for an image.
Parameters:

- `image` — `Buffer` or `Base64EncodedImage`
- `length` — `"normal"`, `"short"`, or `"long"` (default: `"normal"`)
- `stream` — `boolean` (default: `false`)
Returns: `CaptionOutput` — `{ caption: string | AsyncGenerator }`
```js
const result = await model.caption({ image, length: 'short' });
console.log(result.caption);

// With streaming
const stream = await model.caption({ image, stream: true });
for await (const chunk of stream.caption) {
  process.stdout.write(chunk);
}
```

#### query({ image?, question, stream? })
Ask a question about an image.
Parameters:

- `image` — `Buffer` or `Base64EncodedImage` (optional)
- `question` — `string`
- `stream` — `boolean` (default: `false`)
Returns: `QueryOutput` — `{ answer: string | AsyncGenerator }`
```js
const result = await model.query({ image, question: "What's in this image?" });
console.log(result.answer);

// With streaming
const stream = await model.query({ image, question: "Describe this", stream: true });
for await (const chunk of stream.answer) {
  process.stdout.write(chunk);
}
```

#### detect({ image, object })
Detect specific objects in an image.
Parameters:

- `image` — `Buffer` or `Base64EncodedImage`
- `object` — `string`
Returns: `DetectOutput` — `{ objects: DetectedObject[] }`
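Each `DetectedObject` carries `x_min`, `y_min`, `x_max`, `y_max`, and a common follow-up step is mapping those onto the source image. A minimal sketch, assuming detection boxes use the same `[0, 1]` normalization the Types table notes for `SpatialRef`; `toPixelBox` is a hypothetical helper, not part of the client:

```js
// Hypothetical helper: scale a normalized bounding box to pixel coordinates.
// Assumption: x_min/y_min/x_max/y_max are normalized to [0, 1].
function toPixelBox(obj, imageWidth, imageHeight) {
  return {
    x: Math.round(obj.x_min * imageWidth),
    y: Math.round(obj.y_min * imageHeight),
    width: Math.round((obj.x_max - obj.x_min) * imageWidth),
    height: Math.round((obj.y_max - obj.y_min) * imageHeight),
  };
}

// e.g. a detection covering the left-middle of a 640x480 image
const box = toPixelBox({ x_min: 0, y_min: 0.25, x_max: 0.5, y_max: 0.75 }, 640, 480);
console.log(box); // { x: 0, y: 120, width: 320, height: 240 }
```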
```js
const result = await model.detect({ image, object: 'car' });
console.log(result.objects);
```

#### point({ image, object })
Get coordinates of specific objects in an image.
Parameters:

- `image` — `Buffer` or `Base64EncodedImage`
- `object` — `string`
Returns: `PointOutput` — `{ points: Point[] }`
```js
const result = await model.point({ image, object: 'person' });
console.log(result.points);
```

#### segment({ image, object, spatialRefs?, stream? })
Segment an object from an image and return an SVG path.
Parameters:

- `image` — `Buffer` or `Base64EncodedImage`
- `object` — `string`
- `spatialRefs` — `Array<[x, y] | [x1, y1, x2, y2]>` — optional spatial hints (normalized 0-1)
- `stream` — `boolean` (default: `false`)
Returns:

- Non-streaming: `SegmentOutput` — `{ path: string, bbox?: SegmentBbox }`
- Streaming: `SegmentStreamOutput` — `{ stream: AsyncGenerator<SegmentStreamChunk> }`
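Since the result is an SVG path string, it can be wrapped in an SVG element for display. A minimal sketch; `buildOverlay` is a hypothetical helper, and the assumption that the path is expressed in the image's pixel coordinate space should be verified against your own output:

```js
// Hypothetical helper: wrap a segmentation path in a standalone SVG string.
// Assumption: the path's coordinates line up with the image's pixel dimensions.
function buildOverlay(path, width, height) {
  return [
    `<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 ${width} ${height}">`,
    `  <path d="${path}" fill="rgba(255, 0, 0, 0.4)" />`,
    '</svg>',
  ].join('\n');
}

const svg = buildOverlay('M 10 10 L 90 10 L 90 90 Z', 100, 100);
console.log(svg.includes('<path d="M 10 10')); // true
```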
```js
const result = await model.segment({ image, object: 'cat' });
console.log(result.path); // SVG path string
console.log(result.bbox); // { x_min, y_min, x_max, y_max }

// With a spatial hint (point)
const hinted = await model.segment({ image, object: 'cat', spatialRefs: [[0.5, 0.5]] });

// With streaming
const stream = await model.segment({ image, object: 'cat', stream: true });
for await (const update of stream.stream) {
  if (update.bbox && !update.completed) {
    console.log('Bbox:', update.bbox); // Available immediately
  }
  if (update.chunk) {
    process.stdout.write(update.chunk); // Coarse path chunks
  }
  if (update.completed) {
    console.log('Final path:', update.path); // Refined path
  }
}
```

### Types
| Type | Description |
|------|-------------|
| `Buffer` | Raw binary image data |
| `Base64EncodedImage` | `{ imageUrl: string }` with a base64-encoded image |
| `DetectedObject` | Bounding box with `x_min`, `y_min`, `x_max`, `y_max` |
| `Point` | Coordinates with `x`, `y` indicating the object's center |
| `SegmentBbox` | Bounding box with `x_min`, `y_min`, `x_max`, `y_max` |
| `SpatialRef` | `[x, y]` point or `[x1, y1, x2, y2]` bbox, normalized to `[0, 1]` |
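For reference, a minimal sketch of building a `Base64EncodedImage` from raw bytes. Whether `imageUrl` expects a data URL (as used here) or a bare base64 string is an assumption to verify; the three-byte buffer is a stand-in for real JPEG data read with `fs.readFileSync`:

```js
// Stand-in for real JPEG bytes (e.g. fs.readFileSync('path/to/image.jpg')).
const imageBytes = Buffer.from([0xff, 0xd8, 0xff]);

// Assumption: imageUrl accepts a base64 data URL.
const base64Image = {
  imageUrl: `data:image/jpeg;base64,${imageBytes.toString('base64')}`,
};

console.log(base64Image.imageUrl); // data:image/jpeg;base64,/9j/
```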
