# @yarflam/potion-base-8m (v1.0.4)
Fast Model2Vec inference for minishlab/potion-base-8M embeddings — zero dependencies, pure JavaScript.
- ⚡ Lightning fast — Static embeddings with no neural network at runtime
- 📦 Zero dependencies — No PyTorch, TensorFlow, ONNX, or HuggingFace libraries
- 🔧 Simple API — Just `embed(texts)` and go
- 🔍 Built-in semantic search — `SemanticSearch` class included
- 🪶 Tiny footprint — 256-dimensional embeddings, perfect for edge devices
- 🏠 Built-in tokenizer — Custom WordPiece tokenizer, no external deps
## Installation

```bash
npm install @yarflam/potion-base-8m
```

## Usage
```js
import { embed } from '@yarflam/potion-base-8m';

const texts = ['Hello world', 'How are you?'];
const embeddings = await embed(texts);

console.log(embeddings[0].length); // 256
console.log(embeddings[0]); // Float32Array(256) [...]
```

## How it works
Model2Vec uses static embeddings — no neural network needed at runtime:
1. Tokenize the input with the built-in WordPiece tokenizer
2. Look up each token's vector in the embedding matrix
3. Mean-pool all token vectors
4. L2-normalize the result
Done! Pure JavaScript, zero ML framework overhead.
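The four steps above can be sketched in plain JavaScript. Note that `EMBEDDINGS` and `tokenize` below are toy stand-ins for illustration only — the real package ships a full WordPiece tokenizer and the potion-base-8M matrix:

```js
// Toy sketch of the Model2Vec inference pipeline.
// EMBEDDINGS and tokenize() are illustrative stand-ins, not the real model data.
const DIM = 4; // the real model uses 256 dimensions

const EMBEDDINGS = {
  hello: Float32Array.from([1, 0, 0, 0]),
  world: Float32Array.from([0, 1, 0, 0]),
};

// 1. Tokenize (real package: WordPiece; here: a lowercase whitespace split)
const tokenize = (text) => text.toLowerCase().split(/\s+/).filter(Boolean);

function embedOne(text) {
  const tokens = tokenize(text).filter((t) => t in EMBEDDINGS);
  const vec = new Float32Array(DIM);

  // 2. Look up each token's vector and 3. mean-pool
  for (const t of tokens) {
    const row = EMBEDDINGS[t];
    for (let i = 0; i < DIM; i++) vec[i] += row[i] / tokens.length;
  }

  // 4. L2-normalize
  const norm = Math.hypot(...vec) || 1;
  for (let i = 0; i < DIM; i++) vec[i] /= norm;
  return vec;
}

console.log(embedOne('Hello world')); // unit-length mean of the two token vectors
```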
## API

### `embed(texts)`

Embed one or more texts using potion-base-8M.

Parameters:
- `texts` (string | string[]) — Text(s) to embed

Returns:
- `Promise<Float32Array[]>` — Array of 256-dimensional embeddings
Example:

```js
// Single text
const [embedding] = await embed('Hello world');

// Multiple texts
const embeddings = await embed(['Text one', 'Text two', 'Text three']);
```

### `cosineSimilarity(a, b)`
Compute cosine similarity between two embeddings.
```js
import { embed, cosineSimilarity } from '@yarflam/potion-base-8m';

const [emb1, emb2] = await embed(['cat', 'dog']);
const similarity = cosineSimilarity(emb1, emb2);
console.log(similarity); // 0.0 to 1.0
```
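For reference, cosine similarity is just the dot product of two vectors divided by the product of their norms; since `embed` returns L2-normalized vectors, it reduces to a plain dot product. A minimal stand-alone version (a reference sketch, not the package's actual source) looks like:

```js
// Minimal cosine similarity for Float32Array vectors.
// Reference sketch, not the package's actual source.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

console.log(cosine(Float32Array.from([1, 0]), Float32Array.from([1, 0]))); // 1
console.log(cosine(Float32Array.from([1, 0]), Float32Array.from([0, 1]))); // 0
```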
### `SemanticSearch`

Built-in semantic search class for finding similar sentences.
```js
import { SemanticSearch } from '@yarflam/potion-base-8m';

const search = new SemanticSearch();

// Index your documents
await search.index([
  'The cat sleeps on the couch',
  'The dog plays in the garden',
  'A bird sings in the tree'
]);

// Search
const results = await search.search('feline resting', { nb_results: 2 });
// [{ sentence: 'The cat sleeps on the couch', score: 0.85 }, ...]
```

Methods:
- `index(sentences: string[]): Promise<SemanticSearch>` — Index sentences for search
- `search(query: string, options?): Promise<Array<{sentence, score}>>` — Search indexed sentences
  - `options.nb_results` — Maximum results (default: 10)
  - `options.threshold` — Minimum similarity score 0-1 (default: null)
- `clear(): void` — Clear the index
- `size: number` — Number of indexed sentences (getter)
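Conceptually, a search like this is "embed everything, then rank by cosine similarity". A sketch of the ranking step (illustrative names, plain number arrays standing in for the real embeddings, which are L2-normalized so a dot product suffices):

```js
// Sketch of the ranking logic behind a SemanticSearch-style class.
// Assumes precomputed, L2-normalized embeddings; names here are illustrative.
function rank(queryVec, indexed, { nb_results = 10, threshold = null } = {}) {
  // With normalized vectors, cosine similarity is just a dot product.
  const dot = (a, b) => a.reduce((s, x, i) => s + x * b[i], 0);
  return indexed
    .map(({ sentence, vec }) => ({ sentence, score: dot(queryVec, vec) }))
    .filter((r) => threshold === null || r.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, nb_results);
}

const indexed = [
  { sentence: 'cat', vec: [1, 0] },
  { sentence: 'dog', vec: [0.6, 0.8] },
  { sentence: 'car', vec: [0, 1] },
];
console.log(rank([1, 0], indexed, { nb_results: 2 }));
// [{ sentence: 'cat', score: 1 }, { sentence: 'dog', score: 0.6 }]
```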
## Model files

The package downloads model files from the HuggingFace Hub during installation (build time only):

- `model.safetensors` — Embedding matrix `[vocab_size, 256]`
- `tokenizer.json` — WordPiece tokenizer vocabulary
- `config.json` — Model metadata

Files are cached in the `models/` directory and included in the published package.
Runtime: Zero dependencies — no network calls, no external libraries.
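The safetensors format itself is simple enough to read without any library: an 8-byte little-endian header length, a UTF-8 JSON header describing each tensor's dtype, shape, and data offsets, then the raw tensor bytes. A sketch of reading the header (illustrative, not the package's actual loader; the example buffer is synthetic):

```js
// Parse a safetensors header: 8-byte LE u64 header length, then JSON.
// Illustrative sketch, not the package's actual loader.
function parseSafetensorsHeader(buf) {
  const headerLen = Number(new DataView(buf.buffer, buf.byteOffset).getBigUint64(0, true));
  const json = new TextDecoder().decode(buf.subarray(8, 8 + headerLen));
  return JSON.parse(json);
}

// Build a tiny in-memory example file to demonstrate.
const header = JSON.stringify({ embeddings: { dtype: 'F32', shape: [2, 4], data_offsets: [0, 32] } });
const file = new Uint8Array(8 + header.length + 32);
new DataView(file.buffer).setBigUint64(0, BigInt(header.length), true);
file.set(new TextEncoder().encode(header), 8);

console.log(parseSafetensorsHeader(file).embeddings.shape); // shape of the example tensor: [2, 4]
```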
## Development

```bash
# Download model files (only needed for build/packaging)
npm run download-models

# Run tests
npm test
```

Note: `npm install` has no runtime dependencies to install. The package is dependency-free!
## GitLab CI Setup

To enable npm publishing in GitLab CI:

1. Add your npm token as a CI/CD variable:
   - Go to Settings > CI/CD > Variables
   - Add `NPM_TOKEN` with your npm access token
2. Create a tag to trigger the publish:

   ```bash
   git tag v1.0.0
   git push origin v1.0.0
   ```

Alternatively, trigger the `publish-main` job manually from the main branch.
## Model Credit
This package uses minishlab/potion-base-8M by Minish Lab.
Model2Vec paper: arXiv:2411.01001
## Authors
- Yarflam — Creator & maintainer
- Mira 🤫 — Assistant & co-conspirator
## License
MIT
