@hasanshoaib/ai-kit
v0.5.18
AI toolkit by Q9Labs with vector search, memory, and deep search utilities
ai-kit
Type-safe utilities for building AI features in TypeScript apps. This package provides vector search with both Redis and PostgreSQL backends.
- Works with Redis + RediSearch (KNN via HNSW)
- Works with PostgreSQL + pgvector (HNSW or IVFFlat) via PgVectorSearch
- First-class TypeScript types
- Simple presets with sensible defaults (RAG, memory)
- Flexible vector input (Buffer, Uint8Array, number[])
Installation
# Redis backend
npm i @hasanshoaib/ai-kit redis
# or
pnpm add @hasanshoaib/ai-kit redis
# PostgreSQL backend (choose one client and optionally Drizzle)
npm i @hasanshoaib/ai-kit pg
# or
npm i @hasanshoaib/ai-kit postgres
# optionally
npm i drizzle-orm drizzle-kit

Requirements
- For Redis backend: Redis with the RediSearch module enabled (for FT.SEARCH / KNN)
- For Postgres backend: PostgreSQL 15+ with the pgvector extension installed
- Node 18+
Table of contents
- Quick Start
- PostgreSQL Quick Start
- Typed API
- Presets and Schemas
- How search is executed (RediSearch)
- Vector Search Guide
- PostgreSQL + Drizzle full guide
- Beginner setup guide
- Practical recipes
- Provider guides
- Examples
- Release and publishing
- Contributing
- Changelog
- Troubleshooting
- License
Quick Start
import { createClient } from "redis";
import { VectorSearch } from "@hasanshoaib/ai-kit";
// (1) Connect Redis
const client = createClient({ url: process.env.REDIS_URL });
await client.connect();
// (2) Initialize VectorSearch (memory preset)
const vs = new VectorSearch({
client,
index: "memory:index",
prefix: "memory:",
vectorField: "memory",
preset: "memory",
});
// (3) Create index (idempotent)
await vs.createIndex();
// (4) Add documents (vectors can be Buffer | Uint8Array | number[])
await vs.addDocuments([
{
id: "mem-1",
doc: {
title: "User preference",
contents: "User prefers TypeScript over JavaScript",
createdAt: new Date().toISOString(),
},
vector: new Array(512).fill(0).map((_, i) => Math.sin(i)), // number[]
},
]);
// (5) Search: pass a query vector (same dimension as index)
const queryVector = new Array(512).fill(0).map((_, i) => Math.cos(i));
const { data, error } = await vs.search({
buffer: queryVector, // number[] | Uint8Array | Buffer
numberOfResults: 10,
scoreLimit: 0.8,
returnFields: ["title", "contents", "createdAt"],
});
if (error) console.error(error);
else console.table(data);
await client.disconnect();

From zero to search in 5 minutes
- Run Redis with RediSearch:
  - Docker: docker run -it --rm -p 6379:6379 redislabs/redisearch:latest
  - Or use Redis Cloud (see below).
- Install deps: pnpm add @hasanshoaib/ai-kit redis
- Connect: set REDIS_URL (e.g., redis://localhost:6379).
- Initialize: use the memory preset to get a sensible schema (dim=512, COSINE).
- Embed: use any embedding provider that outputs 512-d vectors.
- Index + add: call createIndex() and addDocuments().
- Search: embed the query text and call search({ buffer, numberOfResults, scoreLimit }).
That's it. The library handles vector input types (number[] | Uint8Array | Buffer) for you.
Scope templates (multi-tenant via uid or scope object)
You can configure human-friendly templates at construction and then pass only a uid (string) or a scope object per call. Templates expand placeholders like {uid} to derive the index and/or key prefix automatically.
// Configure templates once
const vs = new VectorSearch({
client,
index: "idx", // default fallback when template not used
prefix: "p:", // default fallback when template not used
vectorField: "vector",
indexTemplate: "idx:user:{uid}", // optional
prefixTemplate: "user:{uid}", // optional (string or string[])
});
// Create user-scoped index by passing only uid
await vs.createIndex("123");
// Add docs under user prefix with uid only
await vs.addDocuments([
{ id: "1", doc: { contents: "hello" }, vector: new Array(512).fill(0) },
], "123");
// Search using uid (or a full scope object if your template uses other keys)
await vs.search({
buffer: queryVector,
numberOfResults: 10,
scoreLimit: 0.8,
returnFields: ["contents"],
}, "123");
// Maintenance with uid
await vs.deleteDocuments(["1"], "123");
await vs.ensureIndex("123");
await vs.dropIndex("123");
// If your templates use other variables, pass an object
await vs.search({ buffer: queryVector, numberOfResults: 5, scoreLimit: 1 }, { tenantId: "acme" });

Notes:
- Templates are optional. When present, per-call scope takes precedence over the base index/prefix.
- Per-call explicit overrides (e.g., { index, prefix }) still take highest precedence over templates.
- prefixTemplate can be a string or string[]; internally it is normalized to string[].
Per-call overrides (multi-tenant/user-scoped)
You can override the RediSearch index and/or document key prefix per method call. This is useful for multi-tenant setups where each tenant/user gets a separate index and keyspace.
// Create a user-scoped index
await vs.createIndex({ index: `idx:user:${userId}`, prefix: `user:${userId}` });
// Add docs with a user prefix
await vs.addDocuments([
{ id: "1", doc: { contents: "hello" }, vector: new Array(512).fill(0) },
], { prefix: `user:${userId}` });
// Or when using an embed function
await vs.addDocuments([
{ id: "2", doc: { contents: "world" } },
], async (d) => embed(d.contents), { prefix: `user:${userId}` });
// Search a user-scoped index
await vs.search({
index: `idx:user:${userId}`,
buffer: queryVector,
numberOfResults: 10,
scoreLimit: 0.8,
returnFields: ["contents"],
});
// Maintenance helpers with overrides
await vs.deleteDocuments(["1", "2"], { prefix: `user:${userId}` });
await vs.ensureIndex({ index: `idx:user:${userId}` });
await vs.dropIndex({ index: `idx:user:${userId}` });

Notes:
- Pass index to createIndex, ensureIndex, or dropIndex to override the RediSearch index for that call.
- Pass prefix to createIndex, addDocuments, or deleteDocuments to override the key prefix for that call.
- Avoid trailing colons in prefixes (the library composes keys as ${prefix}:${id}).
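Since keys are composed as ${prefix}:${id}, a trailing colon in the prefix produces a double colon. A quick sketch of that composition rule (docKey is a hypothetical helper mirroring the behavior described above):

```typescript
// Mirrors the key-composition rule described above: `${prefix}:${id}`.
function docKey(prefix: string, id: string): string {
  return `${prefix}:${id}`;
}

docKey("user:42", "1");  // "user:42:1"  — correct
docKey("user:42:", "1"); // "user:42::1" — trailing colon yields a double colon
```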
Typed API
Result wrapper: FunctionResponse<T>
All public methods return a predictable, typed envelope:
interface FunctionResponse<T> {
success: boolean;
data: T | null | undefined;
error: Error | null;
message: string;
statusCode: number;
}

PostgreSQL Quick Start
import { PgVectorSearch } from "@hasanshoaib/ai-kit";
import { Pool } from "pg"; // or use postgres.js
const pool = new Pool({ connectionString: process.env.DATABASE_URL });
const query = (text: string, params?: unknown[]) => pool.query(text, params);
const vs = new PgVectorSearch({
query,
table: "public.documents",
idColumn: "id",
vectorColumn: "embedding",
distanceMetric: "COSINE",
indexUsing: "hnsw",
});
await vs.createIndex();
await vs.addDocuments([
{ id: 1, doc: { title: "Doc" }, vector: new Array(512).fill(0) },
]);
const { data } = await vs.search({ buffer: new Array(512).fill(0), numberOfResults: 5, scoreLimit: 0.9 });

See docs/vector-search.md and the full step-by-step docs/pgvector-drizzle-guide.md.
VectorSearch
class VectorSearch<
TDoc extends Record<string, unknown>,
TVectorField extends string,
TPreset extends "none" | "memory" | "rag" = "rag"
> {
constructor(cfg: VectorSearchConfig<TDoc, TVectorField, TPreset>);
constructor(cfg: VectorSearchConfigNone<TDoc> & { vectorField: TVectorField });
createIndex(): Promise<FunctionResponse<string>>;
addDocuments(items: Array<{
id: string;
doc: InputDoc<TDoc, TPreset, TVectorField>;
vector: Buffer | Uint8Array | number[];
}>): Promise<FunctionResponse<{ inserted: number }>>;
addDocuments(
items: Array<{
id: string;
doc: InputDoc<TDoc, TPreset, TVectorField>;
}>,
embed: (
doc: InputDoc<TDoc, TPreset, TVectorField>
) => Promise<Buffer>
): Promise<FunctionResponse<{ inserted: number }>>;
search(args: {
buffer: Buffer | Uint8Array | number[];
numberOfResults: number;
scoreLimit: number; // distance threshold (lower=better)
maxDistance?: number; // alias, preferred over scoreLimit when provided
offset?: number; // paging start offset
withScores?: boolean; // default true
scoring?: "distance" | "similarity"; // similarity adds `similarity = 1 - vector_score` (COSINE only)
}): Promise<FunctionResponse<SearchResultItem<Partial<TDoc>>[]>>;
search(args: {
buffer: Buffer | Uint8Array | number[];
numberOfResults: number;
scoreLimit: number;
maxDistance?: number;
offset?: number;
withScores?: boolean;
scoring?: "distance" | "similarity";
returnFields?: readonly string[];
}): Promise<FunctionResponse<SearchResultItem<Record<string, unknown>>[]>>;
// Maintenance helpers
deleteDocuments(ids: string[]): Promise<FunctionResponse<{ deleted: number }>>;
dropIndex(): Promise<FunctionResponse<string>>; // drops index + documents (DD)
ensureIndex(): Promise<FunctionResponse<{ exists: boolean }>>;
}

Config types
interface VectorSearchConfig<TDoc, TVectorField extends string, TPreset extends IndexPreset> {
client: RedisClientType;
index: string; // RediSearch index name
prefix: string | string[]; // Key prefix(es) for documents
// Optional templates for scope-based expansion
indexTemplate?: string; // e.g. "idx:user:{uid}"
prefixTemplate?: string | string[]; // e.g. "user:{uid}"
vectorField: TVectorField; // Name of vector field (used across all presets)
schema?: TPreset extends "none" ? FtCreateSchemaField[] : never;
preset?: TPreset; // "rag" (default), "memory", or "none"
on?: "HASH" | "JSON"; // Currently only HASH is supported for addDocuments
dim?: number; // Optional override for preset schemas
distanceMetric?: "COSINE" | "L2" | "IP"; // Optional override for preset schemas
}
interface VectorSearchConfigNone<TDoc> {
client: RedisClientType;
index: string;
prefix: string | string[];
// Optional templates for scope-based expansion
indexTemplate?: string;
prefixTemplate?: string | string[];
vectorField: string;
preset: "none";
schema: FtCreateSchemaField[]; // required when preset is "none"
on?: "HASH" | "JSON";
}

Document types
// Base of every item returned from search
interface BaseSearchResultItem {
key: string; // Redis key
vector_score: number; // distance score (lower is better)
similarity?: number; // present when scoring === "similarity" and metric is COSINE
}
// Result item shape
export type SearchResultItem<T = Record<string, unknown>> = BaseSearchResultItem & T;
// InputDoc ensures you can write your own doc type while the vector field is controlled by the preset/schema
export type InputDoc<
TDoc extends Record<string, unknown>,
TPreset extends IndexPreset,
VField extends string
> = Omit<
TDoc & PresetFields<TPreset, VField>,
VectorKey<TPreset, VField> & string
>;

Presets and Schemas
- preset: "memory" uses memorySchema (vectorField = "memory", dim = 512, distanceMetric = "COSINE").
  - Default vector field name: memory.
  - Default dimension: 512.
- preset: "rag" uses a small default schema: contents: TEXT + vector: VECTOR (HNSW, dim=512, COSINE).
- preset: "none" lets you pass a full custom schema via schema.
See src/vector-search/src/defaults.ts for details.
How search is executed (RediSearch)
- Uses FT.SEARCH with KNN (* => [KNN k @<field> $vector AS vector_score]).
- Parameters are supplied via PARAMS and the dialect is set to 2.
- RETURN fields and SORTBY vector_score are passed as proper command options (not embedded in the query string).
Error handling
- All methods return FunctionResponse<T>.
- Typical error messages:
  - "Error creating index" (RediSearch index creation failure)
  - "Missing vector" when adding docs without a vector (and no embed function provided)
  - "Redis search error" for FT.SEARCH failures
  - "Error parsing Redis result" if the reply cannot be parsed to typed results
Practical recipes
Use your own embedding provider
import { VoyageAIClient } from "voyageai";
async function embed(text: string): Promise<number[]> {
const v = new VoyageAIClient({ apiKey: process.env.VOYAGE_API_KEY! });
const res = await v.embed({ input: text, model: "voyage-3-lite", outputDimension: 512 });
return res.data[0].embedding; // number[]
}

Custom schema (preset "none")
import type { FtCreateSchemaField } from "@hasanshoaib/ai-kit";
const schema: FtCreateSchemaField[] = [
{ identifier: "title", type: "TEXT", sortable: true },
{ identifier: "contents", type: "TEXT" },
{
identifier: "my_vector",
type: "VECTOR",
vectorType: "HNSW",
dim: 1536,
distanceMetric: "COSINE",
},
];
const vs = new VectorSearch({
client,
index: "docs:index",
prefix: "docs:",
preset: "none",
vectorField: "my_vector",
schema,
});

Selecting return fields
const { data } = await vs.search({
buffer: queryVector,
numberOfResults: 5,
scoreLimit: 0.9,
returnFields: ["title", "createdAt"],
});

Troubleshooting
- Ensure your query vector dimension matches the index’s configured dimension.
- Ensure RediSearch module is available and dialect 2+ is supported.
- If you see ERR Query syntax error, upgrade to the latest version (>= 0.5.5) and confirm you are not passing a string buffer as the query vector.
Provider guides
- OpenAI: docs/providers/openai.md
- VoyageAI: docs/providers/voyageai.md
- Cohere: docs/providers/cohere.md
Each guide shows how to configure the provider and produce vectors with the correct dimension.
Examples
Runnable example projects:
- Memory preset (dim=512): examples/memory-basic
- Custom schema (dim=1536): examples/custom-schema
Each example has its own README with run instructions. Both work with Docker Redis or Redis Cloud via REDIS_URL.
Beginner setup guide
Running Redis with RediSearch (Docker)
If you don't already have Redis with RediSearch:
docker run -it --rm \
-p 6379:6379 \
redislabs/redisearch:latest

Your app can now connect to redis://localhost:6379.
Verify RediSearch availability
Run any Redis CLI (or use a GUI like RedisInsight):
# Option 1: Check modules
redis-cli MODULE LIST
# You should see a row with name "search"
# Option 2: Check a command
redis-cli FT._LIST

If these fail, you are likely not running RediSearch. Use the Docker image above or install the module in your Redis deployment.
Choosing an embedding model and dimension
- The default memory preset assumes vectors of dimension 512 and COSINE distance.
- Use an embedding model that can output 512 dimensions (or change the schema accordingly).
- If you change the dimension in the schema, ensure your document vectors and query vectors use the same dimension.
Examples of compatible settings:
- memory preset (default): dim=512, COSINE
- custom schema: set dim to match your provider (e.g., 1536) and pass vectorField accordingly.
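A cheap guard before indexing or querying can catch dimension mismatches early. This is a hypothetical helper for illustration; the library itself will surface a RediSearch error on mismatch.

```typescript
// Hypothetical guard: verify a vector's length before sending it to Redis.
function assertDim(vector: number[] | Float32Array, expected: number): void {
  if (vector.length !== expected) {
    throw new Error(`Vector has dim ${vector.length}, index expects ${expected}`);
  }
}

assertDim(new Array(512).fill(0), 512);     // ok for the memory preset
// assertDim(new Array(1536).fill(0), 512); // would throw: dimension mismatch
```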
Redis Cloud (managed)
If you prefer a managed option:
- Create a free Redis Cloud account.
- Create a database with the RediSearch module enabled.
- Note the connection string (host, port, password) and construct a URL like redis://default:<password>@<host>:<port>.
- Set it as REDIS_URL in your environment.
Release and publishing
This repo includes a helper script publish.sh to streamline releases.
./publish.sh "chore: release x.y.z"

What it does:
- Builds the package
- Commits and pushes changes: git add . && git commit -m "..." && git push
- Checks for an existing npm version and, if needed, bumps the patch version
- Creates and pushes a git tag like vX.Y.Z when a bump occurs
- Publishes to npm (as configured in the script)
Notes:
- Ensure you are logged in to npm: npm login
- The script avoids --access public; adjust if you need a different access scope
- If publishing under a scope, ensure your package.json name is scoped (e.g., @scope/pkg)
Node and TypeScript notes
- ESM: import { VectorSearch } from "@hasanshoaib/ai-kit"
- CJS: const { VectorSearch } = require("@hasanshoaib/ai-kit")
- Built targets: ESM and CJS; types included.
- Requires Node 18+; a TypeScript es2022 target (or compatible) is recommended.
Security
- Do not log API keys (e.g., embedding provider keys). Use environment variables.
- For Redis Cloud or exposed instances, use TLS if available and strong passwords.
- Validate input sizes; vectors must match your schema dimension to avoid errors.
FAQ
Q: I get ERR Query syntax error.
- Ensure you're on >= 0.5.5.
- Do not embed RETURN/SORTBY inside the query string; the library handles them.
- Confirm RediSearch is available and dialect 2+ is supported.
Q: My vectors are number[]. Is that okay?
- Yes. Pass number[] | Uint8Array | Buffer. The library converts to a Float32 Buffer internally.
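For intuition, the conversion the FAQ describes amounts to packing the numbers into a FLOAT32 byte buffer. This is an illustrative sketch, not the library's actual code; it assumes a little-endian platform, which covers all current Node targets.

```typescript
// Illustrative: pack number[] into a Float32 byte buffer — the binary
// layout RediSearch expects for FLOAT32 vector fields.
function toFloat32Buffer(vector: number[]): Buffer {
  return Buffer.from(new Float32Array(vector).buffer);
}

toFloat32Buffer([1, 2, 3]).length; // 12 bytes = 3 floats × 4 bytes each
```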
Q: What dimension should I use?
- The memory preset defaults to 512. Use an embedding model that outputs 512, or use a custom schema with your desired dimension.
Q: Do I need to migrate existing data when I change dimension?
- Yes. The index schema and all stored vectors must share the same dimension. Changing it requires reindexing.
Q: Can I store JSON docs?
- The current addDocuments() writes HASH fields. You can still store JSON externally and index text fields in a Redis HASH for search.
Q: How do scores work?
- vector_score is a distance (lower is better). For COSINE, you can request a convenience similarity by passing scoring: "similarity" to search(), which adds similarity = 1 - vector_score to each result.
- Filtering still uses distance (scoreLimit / maxDistance). See docs/vector-search.md for details.
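For intuition on those scores: cosine similarity ranges from -1 to 1, and RediSearch's COSINE distance is 1 - similarity, so scoring: "similarity" simply inverts it back. A standalone illustration (not library code):

```typescript
// Cosine similarity between two vectors. RediSearch's COSINE metric
// reports distance = 1 - similarity, which is why lower scores are better.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

cosineSimilarity([1, 0], [1, 0]); // 1 — identical direction, distance 0
cosineSimilarity([1, 0], [0, 1]); // 0 — orthogonal, distance 1
```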
Contributing
We welcome issues and PRs!
- Keep changes small and focused (KISS)
- Avoid duplication (DRY) and favor clarity (Clean Code)
- Add tests or examples when fixing bugs or adding features
- Follow conventional commits for messages (e.g., feat: ..., fix: ..., docs: ...)
Local development:
pnpm i
pnpm build
pnpm test # add tests as needed

Changelog
Changes are tracked via git history and releases. Consider adding a CHANGELOG.md if your workflow needs curated release notes.
License
MIT
