resona v0.2.0

Resona

Semantic embeddings and vector search - find concepts that resonate

Resona is a TypeScript library for generating, storing, and searching text embeddings. It supports multiple embedding providers and enables cross-source semantic search with hierarchical source identification.

Features

  • Multiple Embedding Providers: Ollama, OpenAI, Voyage AI, and Transformers.js (CPU-based)
  • Vector Storage: LanceDB for efficient embedded vector similarity search
  • Change Detection: Hash-based tracking to avoid re-embedding unchanged content
  • Batch Processing: Efficient batch embedding with progress callbacks and LanceDB write batching
  • Unified Search: Search across multiple sources (Tana, email, etc.) with source identification
  • Hierarchical Source IDs: Support for type/instance patterns (e.g., tana/main, email/work)
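
The change-detection feature above can be sketched as a stored content hash compared before re-embedding. This is an illustrative sketch, not Resona's actual internals; `hashText`, `needsEmbedding`, and the in-memory hash map are hypothetical names.

```typescript
import { createHash } from "node:crypto";

// Illustrative sketch of hash-based change detection: content is only
// re-embedded when its hash differs from the one stored last time.
export function hashText(text: string): string {
  return createHash("sha256").update(text).digest("hex");
}

export function needsEmbedding(
  id: string,
  text: string,
  storedHashes: Map<string, string>, // id -> hash persisted with the embedding
): boolean {
  const current = hashText(text);
  if (storedHashes.get(id) === current) return false; // unchanged, skip
  storedHashes.set(id, current); // record the new hash
  return true; // new or changed content
}
```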

Installation

bun add resona

No additional system dependencies are required; LanceDB ships prebuilt binaries for all platforms.

Quick Start

Basic Embedding Service

import { EmbeddingService, OllamaProvider } from "resona";

// Create an Ollama provider (requires Ollama running locally)
const provider = new OllamaProvider("nomic-embed-text");

// Create the embedding service with a database path
const service = new EmbeddingService(provider, "./embeddings.db");

// Embed items
await service.embed({
  id: "doc-1",
  text: "The quick brown fox jumps over the lazy dog",
});

await service.embedBatch([
  { id: "doc-2", text: "Machine learning fundamentals" },
  { id: "doc-3", text: "Natural language processing basics" },
]);

// Search for similar content
const results = await service.search("AI and ML concepts", 5);
console.log(results);
// [
//   { id: "doc-2", similarity: 0.87, contextText: "..." },
//   { id: "doc-3", similarity: 0.82, contextText: "..." },
//   ...
// ]
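
The similarity scores above are typically cosine similarities between the query vector and each stored vector. A minimal sketch of that ranking metric (illustrative only; LanceDB computes the distances internally):

```typescript
// Cosine similarity between two equal-length vectors:
// dot(a, b) / (|a| * |b|), giving 1 for identical directions and 0 for
// orthogonal ones.
export function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```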

Unified Search Across Sources

import { UnifiedSearchService, EmbeddingService, OllamaProvider } from "resona";

// Create a unified search service
const unifiedSearch = new UnifiedSearchService();

// Register multiple sources
const tanaService = new EmbeddingService(provider, "./tana.db");
const emailService = new EmbeddingService(provider, "./email.db");

// Sources implement the SearchSource interface
unifiedSearch.registerSource({
  sourceId: "tana/main",
  description: "Tana main workspace",
  search: (query, k) => tanaService.search(query, k),
});

unifiedSearch.registerSource({
  sourceId: "email/work",
  description: "Work email",
  search: (query, k) => emailService.search(query, k),
});

// Search across all sources
const results = await unifiedSearch.search("project planning", 10);
// [
//   { source: "tana/main", id: "node_abc", similarity: 0.92 },
//   { source: "email/work", id: "msg_123", similarity: 0.88 },
//   ...
// ]

// Filter by source type
const tanaOnly = await unifiedSearch.search("planning", 10, {
  sourceTypes: ["tana"],
});
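
Conceptually, a cross-source search like the one above is a fan-out to every registered source followed by a merge sorted by similarity. The `SearchSource` shape below follows the registration calls above, but the merge logic itself is an illustrative assumption, not Resona's implementation:

```typescript
interface Hit { id: string; similarity: number }
interface SearchSource {
  sourceId: string;
  description: string;
  search: (query: string, k: number) => Promise<Hit[]>;
}

// Fan the query out to every source, tag each hit with its source ID,
// then keep the top-k results overall by similarity.
export async function searchAll(
  sources: SearchSource[],
  query: string,
  k: number,
): Promise<Array<Hit & { source: string }>> {
  const perSource = await Promise.all(
    sources.map(async (s) =>
      (await s.search(query, k)).map((hit) => ({ ...hit, source: s.sourceId })),
    ),
  );
  return perSource.flat().sort((a, b) => b.similarity - a.similarity).slice(0, k);
}
```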

API Reference

Providers

OllamaProvider

import { OllamaProvider } from "resona";

// Default endpoint (http://localhost:11434)
const provider = new OllamaProvider("nomic-embed-text");

// Custom endpoint
const provider = new OllamaProvider("nomic-embed-text", "http://ollama:11434");

// Custom dimensions for unknown models
const provider = new OllamaProvider("custom-model", undefined, 512);

// Check if Ollama is available
const available = await provider.healthCheck();

Supported Ollama Models:

  • nomic-embed-text (768 dimensions)
  • mxbai-embed-large (1024 dimensions)
  • all-minilm (384 dimensions)
  • bge-m3 (1024 dimensions)
  • snowflake-arctic-embed (1024 dimensions)

OpenAIProvider

import { OpenAIProvider } from "resona";

// Default model (text-embedding-3-small)
const provider = new OpenAIProvider(process.env.OPENAI_API_KEY!);

// Specific model
const provider = new OpenAIProvider(apiKey, "text-embedding-3-large");

// Custom dimensions (for dimension reduction)
const provider = new OpenAIProvider(apiKey, "text-embedding-3-large", {
  dimensions: 1024,
});

// Azure OpenAI or custom endpoint
const provider = new OpenAIProvider(apiKey, "text-embedding-3-small", {
  endpoint: "https://your-resource.openai.azure.com/v1",
});

Supported OpenAI Models:

  • text-embedding-3-small (1536 dimensions, supports reduction to 512+)
  • text-embedding-3-large (3072 dimensions, supports reduction to 256+)
  • text-embedding-ada-002 (1536 dimensions, legacy)

VoyageProvider

import { VoyageProvider } from "resona";

// Default model (voyage-3)
const provider = new VoyageProvider(process.env.VOYAGE_API_KEY!);

// Specific model
const provider = new VoyageProvider(apiKey, "voyage-3-large");

// With input type (optimizes for queries vs documents)
const queryProvider = new VoyageProvider(apiKey, "voyage-3", {
  inputType: "query",
});
const docProvider = new VoyageProvider(apiKey, "voyage-3", {
  inputType: "document",
});

Supported Voyage Models:

  • voyage-3 (1024 dimensions) - Best general-purpose
  • voyage-3-large (1024 dimensions) - Higher quality
  • voyage-3-lite (512 dimensions) - Fast and cost-effective
  • voyage-code-3 (1024 dimensions) - Code retrieval

TransformersProvider (CPU-based)

import { TransformersProvider } from "resona";

// Default model (all-MiniLM-L6-v2)
const provider = new TransformersProvider();

// Specific model
const provider = new TransformersProvider("Xenova/bge-base-en-v1.5");

// Custom cache directory
const provider = new TransformersProvider("Xenova/all-MiniLM-L6-v2", {
  cacheDir: "/path/to/cache",
});

Supported Transformers Models:

  • Xenova/all-MiniLM-L6-v2 (384 dimensions) - Fast, good quality
  • Xenova/all-MiniLM-L12-v2 (384 dimensions) - Higher quality
  • Xenova/bge-small-en-v1.5 (384 dimensions)
  • Xenova/bge-base-en-v1.5 (768 dimensions)
  • Xenova/bge-large-en-v1.5 (1024 dimensions)
  • nomic-ai/nomic-embed-text-v1.5 (768 dimensions)

Note: TransformersProvider requires @huggingface/transformers as an optional dependency:

bun add @huggingface/transformers

EmbeddingService

import { EmbeddingService } from "resona";

const service = new EmbeddingService(provider, "./embeddings.db");

// Embed single item
await service.embed({
  id: "unique-id",
  text: "Content to embed",
  contextText: "Optional enriched context for embedding",
  metadata: { tags: ["example"] },
});

// Batch embed with progress
await service.embedBatch(items, {
  onProgress: (progress) => {
    console.log(`${progress.processed}/${progress.total} (stored: ${progress.stored})`);
  },
  progressInterval: 100,
  forceAll: false, // Set to true to re-embed unchanged items
  storeBatchSize: 5000, // Buffer this many before writing to LanceDB (default: 5000)
});

// Search
const results = await service.search("query text", 10);

// Get statistics (async)
const stats = await service.getStats();
// { totalEmbeddings: 1000, model: "nomic-embed-text", dimensions: 768 }

// Cleanup old embeddings (async)
const removed = await service.cleanup(["id1", "id2"]); // Keep only these IDs

// Get embedded IDs (async)
const ids = await service.getEmbeddedIds();

// Close connection
service.close();

UnifiedSearchService

import { UnifiedSearchService } from "resona";

const unified = new UnifiedSearchService();

// Register sources
unified.registerSource(source);

// List sources
const sources = unified.listSources();
// [{ sourceId: "tana/main", description: "..." }, ...]

// Search with filters
const results = await unified.search("query", 10, {
  sources: ["tana/main"],        // Specific source IDs
  sourceTypes: ["tana", "email"], // Source type prefixes
});

// Get item details
const item = await unified.getItem("tana/main", "node_abc");
// { preview: "...", url: "https://..." }

Source IDs

Resona uses hierarchical source IDs in the format type/instance:

import { parseSourceId, createSourceId } from "resona";

// Parse a source ID
const { type, instance } = parseSourceId("tana/main");
// { type: "tana", instance: "main" }

// Create a source ID
const sourceId = createSourceId("email", "work");
// "email/work"
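
Assuming these helpers do a simple split and join on the first slash, they can be sketched as follows (illustrative, not the exported implementation):

```typescript
// Join a type and instance into a "type/instance" source ID.
export function createSourceId(type: string, instance: string): string {
  return `${type}/${instance}`;
}

// Split a source ID at the first slash into its type and instance parts.
export function parseSourceId(sourceId: string): { type: string; instance: string } {
  const slash = sourceId.indexOf("/");
  if (slash === -1) throw new Error(`Invalid source ID: ${sourceId}`);
  return { type: sourceId.slice(0, slash), instance: sourceId.slice(slash + 1) };
}
```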

Storage

Resona uses LanceDB for vector storage. Database paths with a .db extension are automatically converted to .lance directories:

// This creates ./embeddings.lance/ directory
const service = new EmbeddingService(provider, "./embeddings.db");
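
The extension handling can be sketched as a simple path rewrite (assumed behavior inferred from the description above; `toLancePath` is a hypothetical name):

```typescript
// Normalize a ".db" path to the ".lance" directory name LanceDB uses;
// paths without the ".db" extension are left unchanged.
export function toLancePath(dbPath: string): string {
  return dbPath.endsWith(".db")
    ? dbPath.slice(0, -".db".length) + ".lance"
    : dbPath;
}
```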

LanceDB provides:

  • Fast vector similarity search
  • No external dependencies (prebuilt binaries included)
  • Efficient columnar storage
  • Works with Bun, Node.js, and Deno

Development

# Install dependencies
bun install

# Run tests
bun test

# Type check
bun run typecheck

Architecture

resona/
├── src/
│   ├── index.ts                    # Package exports
│   ├── types.ts                    # Core type definitions
│   ├── providers/
│   │   ├── ollama.ts               # Ollama provider (local GPU)
│   │   ├── openai.ts               # OpenAI provider (cloud)
│   │   ├── voyage.ts               # Voyage AI provider (cloud)
│   │   └── transformers.ts         # Transformers.js (local CPU)
│   └── service/
│       ├── embedding-service.ts    # Core embedding service (LanceDB)
│       └── unified-search-service.ts # Cross-source search
└── test/
    ├── providers/
    │   ├── ollama.test.ts
    │   ├── openai.test.ts
    │   ├── voyage.test.ts
    │   └── transformers.test.ts
    └── service/
        ├── embedding-service.test.ts
        └── unified-search-service.test.ts

Performance

LanceDB Write Batching

By default, Resona buffers embedding records in memory before writing to LanceDB. This dramatically improves performance for large embedding jobs:

// Default: buffer 5000 records before writing
await service.embedBatch(items);

// Custom buffer size for memory-constrained environments
await service.embedBatch(items, { storeBatchSize: 1000 });

// Monitor buffering with progress callback
await service.embedBatch(items, {
  onProgress: ({ processed, stored, bufferSize }) => {
    console.log(`Generated: ${processed}, Persisted: ${stored}, Buffer: ${bufferSize}`);
  },
});

| Dataset Size    | Without Batching | With Batching (default) |
|-----------------|------------------|-------------------------|
| 100k items      | ~2000 writes     | ~20 writes              |
| Memory overhead | 0                | ~20-40MB                |
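
The strategy behind those numbers can be sketched as accumulate-then-flush: slice the work into fixed-size batches and issue one write per batch. The `write` callback here is illustrative, standing in for a LanceDB write:

```typescript
// Write `items` to a store in chunks of `batchSize`, returning the
// number of write calls issued (e.g. 100k items / 5000 = 20 writes).
export function batchedWrites<T>(
  items: T[],
  batchSize: number,
  write: (batch: T[]) => void,
): number {
  let writes = 0;
  for (let i = 0; i < items.length; i += batchSize) {
    write(items.slice(i, i + batchSize));
    writes++;
  }
  return writes;
}
```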

Known Issues

LanceDB Large Result Set Bug

LanceDB 0.13.x has a bug where querying large result sets (1000+ rows) without pagination returns corrupted string data. Resona works around this by paginating getEmbeddedIds() queries in batches of 100 rows. The warnings about "Ran out of fragments" at the end of pagination are expected and harmless.
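
The workaround can be sketched as a limit/offset loop that stops when a short page signals the end of the result set. The `fetchPage` callback is illustrative; Resona applies this pattern internally:

```typescript
// Fetch IDs in fixed-size pages until a page comes back smaller than
// the page size, which marks the end of the result set.
export async function paginatedIds(
  fetchPage: (offset: number, limit: number) => Promise<string[]>,
  pageSize = 100,
): Promise<string[]> {
  const ids: string[] = [];
  let offset = 0;
  while (true) {
    const page = await fetchPage(offset, pageSize);
    ids.push(...page);
    if (page.length < pageSize) break; // last (possibly empty) page
    offset += pageSize;
  }
  return ids;
}
```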

License

MIT License - see LICENSE for details.

Contributing

Contributions welcome! Please follow the existing code style and add tests for new features.