

@recvector/adapters

Concrete adapter implementations for @recvector/sdk. Provides the SQL storage adapter (Knex.js), vector DB client (Chroma), and all supported embedding model providers (OpenAI, Gemini, HuggingFace).

These adapters implement the pluggable interfaces defined in @recvector/sdk — the core engine never imports from this package directly, keeping the SDK decoupled from any specific infrastructure.


Installation

pnpm add @recvector/adapters @recvector/sdk

Install the dependencies for the adapters you plan to use:

# Database driver (pick one)
pnpm add pg             # PostgreSQL
pnpm add mysql2         # MySQL
pnpm add better-sqlite3 # SQLite

# Embedding provider (pick one)
pnpm add openai         # OpenAI
pnpm add @google/genai  # Gemini

# Vector DB
# Chroma is the only supported vector DB in v1
# The chromadb client is included as a peer dependency

Quick start

In almost all cases you do not instantiate adapters directly — createRecEngine() reads recvector.config.ts and constructs them automatically. You only need to import from this package when:

  • Passing a pre-built adapter to createRecEngine() to override the default
  • Writing tests with mock or in-memory adapters
  • Using an adapter standalone outside of RecVector

import { createRecEngine } from '@recvector/sdk'

// Default: createRecEngine reads recvector.config.ts and builds all adapters
const rec = await createRecEngine()

KnexStorageAdapter

Implements the StorageAdapter interface using Knex.js. Supports PostgreSQL, MySQL, and SQLite.

Responsibilities

  • Auto-creates the two SDK-managed tables (rec_user_profiles, rec_entity_stats) on initialize()
  • Reads user interaction history from your existing interaction tables (defined in rec_schema.json)
  • Reads entity features from your existing entity table, applying column and join feature mappings
  • Reads and writes user profile embeddings and entity popularity stats

Constructor

import knex from 'knex'
import { KnexStorageAdapter } from '@recvector/adapters'
import type { RecVectorSchema } from '@recvector/sdk'

const db = knex({
  client: 'pg',
  connection: process.env.DATABASE_URL,
})

const schema: RecVectorSchema = { /* your schema */ }

const storage = new KnexStorageAdapter(db, schema)

// Creates rec_user_profiles and rec_entity_stats if they don't exist
await storage.initialize()

Feature mapping

KnexStorageAdapter reads entity features by applying the features array from rec_schema.json. Two source modes are supported:

"column" source — reads a column directly from the entity table:

{
  "name": "category",
  "type": "categorical",
  "source": { "type": "column", "column": "category" }
}

"join" source — joins through a linking table to collect multi-value features:

{
  "name": "tags",
  "type": "multi_categorical",
  "source": {
    "type": "join",
    "join_table": "product_tags",
    "join_fk": "product_id",
    "value_column": "tag_name"
  }
}

Join features are collected into an array (e.g. ["audio", "wireless"]) and serialised as comma-separated text during embedding.
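The serialisation helper itself isn't exported from the package, so the following is only an illustrative sketch of the behaviour described above (the function name and the exact separator are assumptions):

```typescript
// Hypothetical sketch: multi-value join features are collected into an
// array, then flattened to comma-separated text before being handed to
// the embedding model. The ", " separator is an assumption.
function serialiseMultiCategorical(values: string[]): string {
  return values.join(', ')
}

// e.g. tags collected from the product_tags join table
const text = serialiseMultiCategorical(['audio', 'wireless'])
// text is "audio, wireless"
```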

SDK-managed tables

initialize() creates these two tables if they don't already exist. They are safe to add to an existing production database — no existing tables are touched.

rec_user_profiles

| Column | Type | Description |
|--------|------|-------------|
| user_id | TEXT PRIMARY KEY | User identifier |
| embedding | TEXT | JSON-serialised profile vector (number[]) |
| last_updated | DATETIME | Timestamp of last profile recomputation |
| version | TEXT | Schema version at update time |
| interaction_count_since_update | INTEGER | Batch threshold counter |
| accumulated_weight | REAL | Total weight accumulated (incremental strategy only) |

rec_entity_stats

| Column | Type | Description |
|--------|------|-------------|
| entity_id | TEXT PRIMARY KEY | Entity identifier |
| feedback_counts | TEXT | JSON object { interactionType: count } |
| version | TEXT | Schema version |

StorageAdapter interface

KnexStorageAdapter implements the full StorageAdapter interface:

| Method | Description |
|--------|-------------|
| initialize() | Auto-creates SDK tables |
| fetchUserInteractions(userId, since?) | Returns all interactions for a user, optionally since a date |
| fetchEntityById(entityId) | Fetches a single entity with its resolved features |
| fetchEntitiesBatch(entityIds) | Fetches multiple entities efficiently |
| fetchEntityStats(entityId) | Reads popularity counts from rec_entity_stats |
| upsertUserProfile(profile) | Insert-or-update a user profile row |
| fetchUserProfile(userId) | Load a user profile including accumulated weight |
| incrementInteractionCounter(userId, type) | Atomically increment the batch counter (transactional) |


ChromaVectorClient

Implements the VectorDbClient interface against a Chroma vector database.

Constructor

import { ChromaVectorClient } from '@recvector/adapters'

const vectorDb = new ChromaVectorClient({
  type: 'chroma',
  url: 'http://localhost:8000',
  collection: 'my-app',
  index: {
    metric: 'cosine', // 'cosine' | 'dot' | 'l2'
  },
})

ChromaVectorDbConfig

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| type | 'chroma' | Yes | Discriminant |
| url | string | Yes | Chroma server URL (http://host:port) |
| collection | string | Yes | Collection name for entity vectors |
| index.metric | 'cosine' \| 'dot' \| 'l2' | No | Distance metric (default: cosine) |
| namespace | string | No | Logical namespace — encoded as {collection}__{namespace} in Chroma |

Namespace encoding

Chroma has no native namespace concept. ChromaVectorClient encodes namespaces into the collection name: my-app__tenant-a. This means deleteAll({ namespace: 'tenant-a' }) drops exactly the my-app__tenant-a collection, leaving my-app untouched. Future adapters (Pinecone, Weaviate, Milvus) can map namespace to their native equivalents without any core logic changes.
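The encoding rule is simple enough to sketch directly; encodeCollectionName here is a hypothetical helper illustrating the rule, not part of the public API:

```typescript
// Sketch of the namespace encoding rule described above: Chroma has no
// native namespaces, so a namespace is folded into the collection name
// as {collection}__{namespace}; without a namespace, the bare collection
// name is used.
function encodeCollectionName(collection: string, namespace?: string): string {
  return namespace ? `${collection}__${namespace}` : collection
}

encodeCollectionName('my-app', 'tenant-a') // 'my-app__tenant-a'
encodeCollectionName('my-app')             // 'my-app'
```

Because each namespace maps to its own Chroma collection, dropping a namespace is just dropping that one collection, which is why deleteAll on a namespace cannot touch other tenants' vectors.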

Collection caching

Collections are lazily created on first use and cached in memory to avoid repeated getOrCreateCollection calls across requests.

Score normalisation

Chroma returns distances, not similarities. ChromaVectorClient normalises to a [0, 1] similarity score:

| Metric | Conversion |
|--------|------------|
| cosine | score = 1 - distance |
| l2 | score = 1 - distance |
| dot (ip) | score = -distance (Chroma returns negative inner product) |
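A minimal sketch of that conversion (distanceToScore is an illustrative name, not an exported function):

```typescript
// Sketch of the distance-to-similarity conversion in the table above.
type Metric = 'cosine' | 'dot' | 'l2'

function distanceToScore(metric: Metric, distance: number): number {
  // Chroma reports 'dot' distances as negative inner products, so
  // negating recovers the raw similarity; cosine and l2 are inverted.
  return metric === 'dot' ? -distance : 1 - distance
}

distanceToScore('cosine', 0.25) // 0.75
distanceToScore('dot', -0.9)    // 0.9
```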

VectorDbClient interface

| Method | Description |
|--------|-------------|
| upsertVectors({ ids, vectors, metadata?, namespace? }) | Upsert entity embeddings into the collection |
| query({ vector, topK, filter?, namespace? }) | HNSW nearest-neighbour search; returns scored results |
| fetchByIds({ ids, namespace? }) | Retrieve stored vectors by ID (used during profile computation) |
| delete({ ids, namespace? }) | Delete specific entity vectors |
| deleteAll(args?) | Drop the entire collection — recreated fresh on next operation |

Running Chroma locally

docker run -p 8000:8000 chromadb/chroma

Or with persistent storage:

docker run -p 8000:8000 -v chroma-data:/chroma/.chroma/index chromadb/chroma

Embedding Models

All embedding models implement the EmbeddingModel interface:

interface EmbeddingModel {
  embed(text: string): Promise<number[]>
  embedBatch(texts: string[]): Promise<number[][]>
}

Use the createEmbeddingModel factory to construct the right model from a config object, or instantiate a class directly for more control.

createEmbeddingModel(config)

import { createEmbeddingModel } from '@recvector/adapters'

const model = createEmbeddingModel({
  provider: 'openai',
  model: 'text-embedding-3-small',
  dimensions: 1536,
  apiKey: process.env.OPENAI_API_KEY,
})

const vector = await model.embed('wireless noise-cancelling headphones')

Supported providers: 'openai', 'gemini', 'huggingface'. For 'custom', pass an EmbeddingModel instance directly to createRecEngine({ embeddingModel }) instead.


OpenAIEmbeddingModel

Uses the OpenAI Embeddings API.

import { OpenAIEmbeddingModel } from '@recvector/adapters'

const model = new OpenAIEmbeddingModel({
  provider: 'openai',
  model: 'text-embedding-3-small',
  dimensions: 1536,
  apiKey: process.env.OPENAI_API_KEY,
})

Recommended models:

| Model | Dimensions | Notes |
|-------|------------|-------|
| text-embedding-3-small | 1536 | Best cost/quality ratio for most use cases |
| text-embedding-3-large | 3072 | Higher accuracy, higher cost |
| text-embedding-ada-002 | 1536 | Legacy model |

Config fields:

| Field | Default | Description |
|-------|---------|-------------|
| apiKey | OPENAI_API_KEY env | API key (falls back to env var if omitted) |
| model | required | Model ID |
| dimensions | required | Must match the model's output dimensions |


GeminiEmbeddingModel

Uses the Google Gemini Embeddings API via @google/genai.

import { GeminiEmbeddingModel } from '@recvector/adapters'

const model = new GeminiEmbeddingModel({
  provider: 'gemini',
  model: 'gemini-embedding-exp-03-07',
  dimensions: 768,
  apiKey: process.env.GEMINI_API_KEY,
})

Recommended models:

| Model | Dimensions | Notes |
|-------|------------|-------|
| gemini-embedding-exp-03-07 | 768 | Latest experimental, high quality |
| text-embedding-004 | 768 | Stable production model |

Rate limit handling:

GeminiEmbeddingModel automatically retries on HTTP 429 (rate limit) with exponential backoff and respects Retry-After headers. The free tier allows 100 requests/minute — embedBatch runs texts sequentially within each batch so the concurrency throttle in syncEntities is the only parallelism knob.
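The retry behaviour can be sketched roughly as follows; withBackoff is a hypothetical stand-in for the adapter's internal logic, with the request injected so it can be simulated, and the Retry-After handling omitted for brevity:

```typescript
// Sketch of retry-with-exponential-backoff on HTTP 429, as described
// above. Delays grow as baseDelayMs * 2^attempt (500ms, 1s, 2s, ...).
// The real adapter additionally honours Retry-After headers.
async function withBackoff<T>(
  request: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 500,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await request()
    } catch (err: any) {
      // Only retry rate-limit errors, and only up to maxRetries times
      if (err?.status !== 429 || attempt >= maxRetries) throw err
      await new Promise(resolve => setTimeout(resolve, baseDelayMs * 2 ** attempt))
    }
  }
}
```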

Config fields:

| Field | Default | Description |
|-------|---------|-------------|
| apiKey | required | Gemini API key |
| model | required | Model ID |
| dimensions | required | Output vector dimensions |


HuggingFaceEmbeddingModel

Uses the Hugging Face Inference API. Works with any model hosted on HuggingFace that supports the feature-extraction pipeline.

import { HuggingFaceEmbeddingModel } from '@recvector/adapters'

const model = new HuggingFaceEmbeddingModel({
  provider: 'huggingface',
  model: 'sentence-transformers/all-MiniLM-L6-v2',
  dimensions: 384,
  apiKey: process.env.HF_API_KEY,
})

Popular models:

| Model | Dimensions | Notes |
|-------|------------|-------|
| sentence-transformers/all-MiniLM-L6-v2 | 384 | Fast, lightweight, good for most domains |
| sentence-transformers/all-mpnet-base-v2 | 768 | Higher accuracy, slower |
| BAAI/bge-small-en-v1.5 | 384 | Strong English-language retrieval quality |

Self-hosted inference:

Point at your own inference server by passing baseUrl:

import { HuggingFaceEmbeddingModel } from '@recvector/adapters'

// instantiate directly (not via createEmbeddingModel) to access baseUrl
const model = new HuggingFaceEmbeddingModel({
  provider: 'huggingface',
  model: 'sentence-transformers/all-MiniLM-L6-v2',
  dimensions: 384,
  apiKey: 'your-key',
  baseUrl: 'http://my-tgi-server:8080',
})

Config fields:

| Field | Default | Description |
|-------|---------|-------------|
| apiKey | required | Hugging Face API key (Bearer token) |
| model | required | Model ID on HuggingFace Hub |
| dimensions | required | Output vector dimensions |
| baseUrl | https://api-inference.huggingface.co | Override for self-hosted inference servers |


Custom Embedding Model

Implement EmbeddingModel to use any embedding source — local models, custom APIs, or caching wrappers.

import type { EmbeddingModel } from '@recvector/sdk'

class MyEmbeddingModel implements EmbeddingModel {
  async embed(text: string): Promise<number[]> {
    // call your API / run local model
    return myApi.embed(text)
  }

  async embedBatch(texts: string[]): Promise<number[][]> {
    return Promise.all(texts.map(t => this.embed(t)))
  }
}

const rec = await createRecEngine({
  embeddingModel: new MyEmbeddingModel(),
})

Exports

import {
  // Storage
  KnexStorageAdapter,

  // Vector DB
  ChromaVectorClient,

  // Embedding models
  OpenAIEmbeddingModel,
  GeminiEmbeddingModel,
  HuggingFaceEmbeddingModel,
  createEmbeddingModel,
} from '@recvector/adapters'

License

MIT