@akhyar11/ml-v1

v2.3.0

TypeScript + Rust Native machine learning library. Matrix ops, layers (Dense, Embedding, RNN, LSTM, GRU, MultiHeadAttention, etc.), models (Sequential, Transformers), and BPE tokenizer.

ML-V1

A TypeScript + Rust Native machine learning library — Matrix operations, neural network layers, Transformer models, and a BPE tokenizer, all in one package.

npm version · License: ISC · Node.js >= 18


What is ML-V1?

ML-V1 is a low-to-mid-level machine learning library built with TypeScript and accelerated by a Rust native backend (via napi-rs). It gives you full control over every detail of the training loop — shapes, parameter updates, and custom architectures — without depending on a large ML framework.

Why ML-V1?

  • Full manual control over training loops, tensor shapes, and parameter updates.
  • A research playground for custom model architectures.
  • The productivity of TypeScript combined with Rust performance on hot paths.
  • Graceful fallback to pure JavaScript when the native backend is unavailable.

Features

  • Matrix — flat Float32Array-backed tensor with zero-copy hot-path access via _data.
  • Math primitives — dotProduct, add, sub, sumAxis, clipGradients, and more; automatically dispatched to Rust or JS.
  • Layers — Dense, Embedding, RNN, LSTM, GRU, SelfAttention, MultiHeadAttention, LayerNormalization, Dropout, PositionalEncoding, Flatten, Convolution.
  • Models — Sequential, Transformers (causal LM), DimentionalityReduction.
  • BPE Tokenizer — train, incremental update, Unicode-aware pre-tokenization, encode/decode with special tokens, padding, and JSON save/load.
  • Rust-accelerated ops — dot-product, activations, LayerNorm, embedding lookup, attention, and optimizer updates; auto-fallback to JS when unavailable.
  • Dynamic padding trim (trimPadding) — reduces effective sequence length per batch, cutting attention cost from O(seqLen²) to O(effectiveSeqLen²).

Installation

npm install @akhyar11/ml-v1

Prerequisites for Native Acceleration

The library works out of the box with a pure JavaScript fallback. For up to 10× faster matrix operations, install the Rust toolchain so the native addon can be compiled automatically on npm install:

  1. Rust Toolchain — install via rustup.rs.
  2. C/C++ Build Tools — required by the native binding compiler (e.g. build-essential on Linux, Xcode CLI on macOS, MSVC on Windows).

Note: If Rust is not installed, a warning is printed and the library falls back to pure JavaScript automatically. Performance will be noticeably slower for large models.


Building from Source

If you cloned the repository or need a manual build:

# Install dependencies
npm install

# Build the native Rust addon (release mode)
npm run build:rust

# Build the TypeScript distribution
npm run build:publish

Rust Native Backend

The native backend is loaded by src/math/rust_backend.ts. You can check whether it is active at runtime:

import { isNativeAvailable } from "@akhyar11/ml-v1";
console.log("Native active:", isNativeAvailable());

To force JavaScript-only execution (useful for debugging or regression comparisons):

ML_DISABLE_NATIVE=1 node your-script.js

Quick Start

Train a simple XOR classifier in a few lines:

import { Dense, mj, Sequential } from "@akhyar11/ml-v1";

const model = new Sequential({
  layers: [
    new Dense({ units: 2, outputUnits: 4, activation: "relu", status: "input" }),
    new Dense({ units: 4, outputUnits: 1, activation: "sigmoid", status: "output", loss: "mse" }),
  ],
});

model.compile({ alpha: 0.01, optimizer: "adam", error: "mse" });

const X = [mj.matrix([[0], [0]]), mj.matrix([[0], [1]]), mj.matrix([[1], [0]]), mj.matrix([[1], [1]])]; // XOR inputs as [2, 1] columns
const Y = [mj.matrix([[0]]), mj.matrix([[1]]), mj.matrix([[1]]), mj.matrix([[0]])]; // XOR targets

const result = model.fit(X, Y, 200, {
  batchSize: 4,
  validationSplit: 0.25,
  earlyStoppingPatience: 10,
  verbose: true,
  onEpochEnd: (epoch, loss, valLoss) => {
    console.log(`epoch=${epoch} loss=${loss} valLoss=${valLoss}`);
  },
});
console.log("best", result.bestEpoch, result.bestLoss);
const pred = model.predict(mj.matrix([[1], [0]]));
pred.print();

A legacy callback overload is also supported for backward compatibility:

model.fit(X, Y, 200, (loss) => console.log("loss", loss));

Examples

Matrix & Math Operations

import { mj } from "@akhyar11/ml-v1";

const a = mj.matrix([[1, 2], [3, 4]]);
const b = mj.matrix([[5, 6], [7, 8]]);
const c = mj.dotProduct(a, b);
const d = mj.add(c, 1);
console.log(c._shape, d._shape);

BPE Tokenizer

import { BPETokenizer } from "@akhyar11/ml-v1";

const tokenizer = new BPETokenizer({ vocabSize: 120, minFrequency: 2 });
tokenizer.train(["hello world", "hello there"]);
const ids = tokenizer.encodeWithSpecial("hello world");
const padded = tokenizer.padSequence(ids, 12);
console.log(ids, padded, tokenizer.decode(ids));

Unicode and Multilingual Tokenization

ML-V1 supports custom and built-in pre-tokenizers for non-Latin text. The default is still "char" for backward compatibility; use "unicode-grapheme" or "script-aware" for multilingual corpora.

Supported modes:

  • char
  • unicode-grapheme
  • unicode-word
  • whitespace
  • script-aware

import { BPETokenizer } from "@akhyar11/ml-v1";

const tokenizer = new BPETokenizer({
  vocabSize: 1000,
  preTokenizer: "script-aware"
});

tokenizer.train([
  "hello world",
  "مرحبا بالعالم",
  "こんにちは世界",
  "你好世界",
  "ภาษาไทย",
  "한국어테스트",
  "ꦱꦺꦴꦥꦺꦴ",
  "x² + y² = z²",
  "hello ꦱꦺꦴꦥꦺꦴ 😊 你好"
]);

BPE alone is not enough for every writing system. Pre-tokenization is important for scripts without spaces, combining marks, emoji sequences, and mixed text. script-aware is a general built-in mode; for language-specific behavior, pass a custom (text: string) => string[] pre-tokenizer. Intl.Segmenter improves grapheme and word segmentation when the runtime supports it. Fallback behavior is deterministic but may be less linguistically accurate.
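
As a hedged sketch of that custom hook (the splitter below is illustrative, not a library export): split on whitespace, then break each chunk into grapheme clusters with Intl.Segmenter when the runtime provides it.

import { BPETokenizer } from "@akhyar11/ml-v1";

// Illustrative custom pre-tokenizer: whitespace split, then grapheme
// clusters via Intl.Segmenter (falls back to code points without it).
const graphemePreTokenizer = (text: string): string[] => {
  const seg =
    typeof Intl !== "undefined" && "Segmenter" in Intl
      ? new Intl.Segmenter(undefined, { granularity: "grapheme" })
      : null;
  return text
    .split(/\s+/)
    .filter(Boolean)
    .flatMap((chunk) =>
      seg ? [...seg.segment(chunk)].map((s) => s.segment) : [...chunk]
    );
};

const tokenizer = new BPETokenizer({ vocabSize: 1000, preTokenizer: graphemePreTokenizer });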

Transformer Causal LM — Training

import { mj, Transformers } from "@akhyar11/ml-v1";

const model = new Transformers({ units: 64, seqLen: 8, vocabSize: 500, heads: 8, alpha: 0.001, padTokenId: 0 });
model.compile({ alpha: 0.001, optimizer: "adam", error: "softmaxCrossEntropy" });
model.train();

const x = mj.matrix([[0], [0], [10], [20], [30], [40], [50], [60]]); // shape [seqLen, 1]
const y = mj.matrix([[0], [10], [20], [30], [40], [50], [60], [0]]); // shifted targets [seqLen, 1]

const logits = model.forward(x); // shape [vocabSize, seqLen * batch]
model.backward(y);
console.log("shape", logits._shape, "loss", model.loss);

Transformer — Generation / Inference

import { mj, Transformers } from "@akhyar11/ml-v1";

const model = new Transformers({
  units: 64,
  seqLen: 8,
  vocabSize: 500,
  heads: 8,
  alpha: 0.001,
  padTokenId: 0,
  predictMode: "next-token",
});
model.eval();

const x = mj.matrix([[0], [0], [10], [20], [30], [40], [50], [60]]);
const nextTokenLogits = model.predict(x); // shape [vocabSize, batch]
model.setPredictMode("full-sequence");
const fullSequenceLogits = model.predict(x); // shape [vocabSize, seqLen * batch]

API Overview

Models

| Model | Description |
|---|---|
| Sequential | Generic layer stack (Dense, Embedding, Attention, CNN, etc.). |
| Transformers | Multi-block causal language model. Supports numBlocks >= 1, full-sequence training, and configurable predictMode ("next-token" / "full-sequence"). |
| DimentionalityReduction | Extends Sequential with an encoder/decoder split via the outputReduction layer status. |

Layers

| Layer | Description |
|---|---|
| Dense | Fully-connected layer with activation, optimizer, and loss handling. |
| Embedding | Token-ID-to-vector lookup with resize() support. |
| LayerNormalization | Per-column/token normalization. |
| Dropout | Active only during training mode. |
| PositionalEncoding | Fixed sinusoidal positional encoding. |
| MultiHeadAttention / SelfAttention | Causal attention mask with padding support. |
| RNN / LSTM / GRU | Recurrent sequence modeling with BPTT, gradient clipping, save/load, and stateful mode. returnSequences is supported; returnState is not yet supported and will throw explicitly. |
| Flatten / Convolution | Standard CNN building blocks. |

Tokenizer

BPETokenizer supports:

| Method | Description |
|---|---|
| train(corpus) | Initial BPE training on a string array. |
| update(corpus) | Incremental vocabulary update without retraining from scratch. |
| encode(text) / encodeWithSpecial(text) | Encode text to token IDs, with or without special tokens. |
| decode(ids) | Convert token IDs back to text. |
| padSequence(ids, length) | Pad or truncate a sequence to a fixed length. |
| save(path) / load(path) | Persist and restore the tokenizer as a JSON file. |

Tokenizer options:

type PreTokenizer = (text: string) => string[];

type BuiltInPreTokenizer =
  | "char"
  | "unicode-grapheme"
  | "unicode-word"
  | "whitespace"
  | "script-aware";

type BPETokenizerOptions = {
  vocabSize?: number;
  minFrequency?: number;
  preTokenizer?: BuiltInPreTokenizer | PreTokenizer;
};

Built-in pre-tokenizer names are saved in tokenizer JSON files. Custom pre-tokenizer functions are not serialized; saved metadata records "custom", and the same function must be passed again to BPETokenizer.load(path, { preTokenizer }).
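
A minimal round-trip sketch of that rule (the file name and splitter are illustrative):

import { BPETokenizer } from "@akhyar11/ml-v1";

// Illustrative whitespace splitter standing in for a real custom pre-tokenizer.
const splitter = (text: string): string[] => text.split(/\s+/).filter(Boolean);

const tok = new BPETokenizer({ vocabSize: 500, preTokenizer: splitter });
tok.train(["hello world", "hello there"]);
tok.save("tok.json"); // metadata records "custom", not the function itself

// On load, the same function must be supplied again:
const restored = BPETokenizer.load("tok.json", { preTokenizer: splitter });
console.log(restored.decode(restored.encode("hello world")));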


Core Concepts

  • Shape convention: most layers use [rows, cols]; batched Transformer inputs use column-sequence layout [seqLen, batchSize].
  • Recurrent convention: recurrent layers expect a single sequence sample with shape [features, seqLen]. The generic Sequential.fit() does not batch recurrent sequences yet — use batchSize: 1.
  • Sparse classification targets: use softmaxCrossEntropy with a dense output layer and a target of shape [1, batch] containing class indices (see the sketch after this list).
  • Training / eval mode: call model.train() before training and model.eval() before inference. Layers like Dropout respect this flag.
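
A hedged sketch of that sparse-target convention, reusing the Sequential / Dense API from the Quick Start (layer sizes, the output activation, and the per-layer loss wiring are illustrative assumptions):

import { Dense, mj, Sequential } from "@akhyar11/ml-v1";

// 3-class classifier on 4-feature inputs; inputs use the [features, batch]
// column convention from the Quick Start.
const model = new Sequential({
  layers: [
    new Dense({ units: 4, outputUnits: 8, activation: "relu", status: "input" }),
    new Dense({ units: 8, outputUnits: 3, activation: "sigmoid", status: "output", loss: "softmaxCrossEntropy" }),
  ],
});
model.compile({ alpha: 0.01, optimizer: "adam", error: "softmaxCrossEntropy" });

const x = mj.matrix([[0.1, 0.9], [0.2, 0.8], [0.3, 0.7], [0.4, 0.6]]); // [4, 2]: two samples as columns
const y = mj.matrix([[2, 0]]); // [1, 2]: class indices, not one-hot rows

model.train();
// Train via model.fit() or a manual forward()/backward() loop as described
// in Training Workflow below.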

Training Workflow

  1. Prepare data as Matrix inputs and targets.
  2. Build your model and add layers.
  3. Call model.compile({ alpha, optimizer, error }).
  4. Run model.fit() (high-level) or loop forward() / backward() manually (see the sketch after this list).
  5. Save the model and tokenizer with save().
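
A minimal sketch of the manual variant of step 4, reusing the Transformer training example above (hyperparameters are illustrative):

import { mj, Transformers } from "@akhyar11/ml-v1";

const model = new Transformers({ units: 64, seqLen: 8, vocabSize: 500, heads: 8, alpha: 0.001, padTokenId: 0 });
model.compile({ alpha: 0.001, optimizer: "adam", error: "softmaxCrossEntropy" });
model.train(); // enable training mode (Dropout, etc.)

const x = mj.matrix([[0], [0], [10], [20], [30], [40], [50], [60]]); // [seqLen, 1]
const y = mj.matrix([[0], [10], [20], [30], [40], [50], [60], [0]]); // shifted targets

for (let epoch = 0; epoch < 50; epoch++) {
  model.forward(x);  // forward pass over the whole sequence
  model.backward(y); // backprop + optimizer update
  if (epoch % 10 === 0) console.log(`epoch=${epoch} loss=${model.loss}`);
}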

Inference Workflow

  1. Load the model and tokenizer.
  2. Convert input text to token IDs and pad to seqLen.
  3. Call model.predict() (respects predictMode for Transformers) or model.forward().
  4. Extract the argmax or raw logits as required by your task.
  5. Decode token IDs back to text for NLP tasks.
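
A hedged end-to-end sketch of these steps (the tokenizer file, the untrained model, and the assumption that encodeWithSpecial / padSequence return number arrays are all illustrative):

import { BPETokenizer, mj, Transformers } from "@akhyar11/ml-v1";

// 1. Load the tokenizer saved earlier; a trained model would be restored similarly.
const tokenizer = BPETokenizer.load("tokenizer.json");
const model = new Transformers({ units: 64, seqLen: 8, vocabSize: 500, heads: 8, alpha: 0.001, padTokenId: 0, predictMode: "next-token" });
model.eval();

// 2. Text → token IDs, padded to seqLen, laid out as a [seqLen, 1] column.
const ids = tokenizer.encodeWithSpecial("hello world");
const padded = tokenizer.padSequence(ids, 8);
const x = mj.matrix(padded.map((id: number) => [id]));

// 3–4. Next-token logits come back as [vocabSize, batch]; take the argmax.
const logits = model.predict(x);
let best = 0;
for (let i = 1; i < logits._data.length; i++) {
  if (logits._data[i] > logits._data[best]) best = i;
}

// 5. Decode back to text.
console.log(tokenizer.decode([...ids, best]));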

Performance Notes

  • The Rust backend accelerates dot-product, activations, LayerNorm, embedding lookup, attention, and optimizer hot paths.
  • Matrix uses Float32Array to minimize allocation overhead. Use _data directly in hot paths (see the sketch after this list).
  • Several layers use pre-allocated output buffers to reduce garbage collection pressure.
  • Dynamic padding trim (trimPadding: true, the default) reduces effectiveSeqLen per batch, cutting attention cost from O(seqLen²) to O(effectiveSeqLen²) and output projection cost from vocabSize × seqLen × batch to vocabSize × effectiveSeqLen × batch.
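
A minimal sketch of that hot-path pattern (assuming only that _data is the flat Float32Array view described in Features):

import { mj } from "@akhyar11/ml-v1";

const m = mj.matrix([[1, 2], [3, 4]]);

// Reading the flat backing store avoids per-element accessor overhead.
const data = m._data;
let sum = 0;
for (let i = 0; i < data.length; i++) sum += data[i];
console.log(sum); // 10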

Dynamic Padding Trim (v2.2.0+)

When training a Transformer on long-context sequences (e.g. seqLen=1024), enable trimPadding to avoid paying the full quadratic attention cost on padding tokens:

import { Transformers } from "@akhyar11/ml-v1";

const model = new Transformers({
  units: 64,
  seqLen: 1024,
  vocabSize: 5000,
  heads: 8,
  numBlocks: 2,
  padTokenId: 0
});

// Right-padding (recommended for new datasets)
model.fit(trainX, trainY, 80, {
  batchSize: 8,
  trimPadding: true,
  paddingSide: "right",
  shuffle: true
});

// Left-padding (for datasets already padded on the left)
model.fit(trainX, trainY, 80, {
  batchSize: 8,
  trimPadding: true,
  paddingSide: "left",
  shuffle: true
});

Options:

  • trimPadding: true (default) — enabled automatically.
  • paddingSide: "right" (default) — trailing PAD tokens are trimmed; positionOffset is 0.
  • paddingSide: "left" — leading PAD tokens are trimmed; positionOffset is adjusted so that positional encodings for real tokens remain unchanged.
  • trimPadding: false — disables the feature entirely.
  • Only applies to full-sequence targets with shape [seqLen, batch]. Legacy targets with shape [1, batch] are not trimmed.

Best Practices

  • Use softmaxCrossEntropy for sparse token classification tasks.
  • Keep seqLen consistent between your preprocessing pipeline and the model constructor.
  • Set padTokenId in both the tokenizer and the model's Embedding layer.
  • For Transformers, prepare shifted next-token targets with shape [seqLen, batch] and fill invalid positions with padTokenId (see the sketch after this list).
  • Call model.train() before training and model.eval() before inference.
  • For Transformer inference, use model.predict() as the primary entry point and set predictMode to "next-token" or "full-sequence" as needed.
  • For stateful recurrent models, avoid shuffle: true and validationSplit > 0 in the generic Sequential.fit() loop.
  • Start debugging with ML_DISABLE_NATIVE=1 when comparing JS vs. native behavior.
  • If loss does not decrease, verify tensor shapes at every layer boundary.
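
A sketch of the shifted-target recipe for batch size 1 (makeShiftedTargets is a hypothetical helper, not a library export):

import { mj } from "@akhyar11/ml-v1";

// targets[t] = inputs[t + 1]; the final slot is filled with padTokenId.
function makeShiftedTargets(inputIds: number[], padTokenId: number) {
  const shifted = inputIds.slice(1).concat(padTokenId);
  return mj.matrix(shifted.map((id) => [id])); // [seqLen, 1] column
}

const y = makeShiftedTargets([0, 0, 10, 20, 30, 40, 50, 60], 0);
// matches the shifted targets in the Transformer training example above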

Troubleshooting

| Problem | Solution |
|---|---|
| Native backend not available | Run npm run build:rust, or verify that the .node binary matches your current platform. |
| Shape mismatch in dot product | Check that dimensions satisfy [aRows × aCols] · [bRows × bCols] where aCols === bRows. |
| Loss is NaN or Inf | Reduce the learning rate alpha, verify target format, and check for out-of-range token IDs in the embedding. |
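
For the shape-mismatch case, a hypothetical guard (checkedDot is not a library export; it assumes only the [rows, cols] _shape convention described in Core Concepts):

import { mj } from "@akhyar11/ml-v1";

type Mat = ReturnType<typeof mj.matrix>;

// Asserts aCols === bRows before multiplying.
function checkedDot(a: Mat, b: Mat): Mat {
  const [, aCols] = a._shape;
  const [bRows] = b._shape;
  if (aCols !== bRows) {
    throw new Error(`shape mismatch: [${a._shape}] · [${b._shape}]`);
  }
  return mj.dotProduct(a, b);
}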


Project Structure

src/
  activation/   cost/   optimizer/
  matrix/        math/
  layers/        models/
  tokenizer/
  utils/
src-rust/
  src/lib.rs       ← Rust native ops (napi-rs)
test/
dataset/
docs/

Architecture Overview

| Module | Role |
|---|---|
| src/matrix | Core Matrix data structure (Float32Array-backed). |
| src/math | Numeric primitives + adaptive Rust/JS dispatch. |
| src/activation, src/cost, src/optimizer | Training building blocks. |
| src/layers | Neural network layer implementations. |
| src/models | High-level model compositions. |
| src/tokenizer | Text preprocessing (BPE). |
| src-rust | Native ops compiled via napi-rs. |


Benchmark & Testing

Run npm test to execute the correctness suite and the synthetic benchmark (see Development & Contributing below).

📖 Documentation

For in-depth guides, see the official documentation:

  1. Overview & Philosophy — Introduction to the library design and system architecture.
  2. Installation & Setup — How to install and enable Rust native acceleration.
  3. Practical Tutorial — Step-by-step guide to building a logic bot and a generative (GPT-style) bot.
  4. Full API Reference — Technical documentation for Matrix, Math, Layers, Tokenizer, Optimizers, and related APIs.

Versioning

This project follows MAJOR.MINOR.PATCH semantic versioning. The current version is 2.2.8.

  • MAJOR — breaking changes or major architectural shifts.
  • MINOR — new backward-compatible features or improvements.
  • PATCH — bug fixes, small optimizations, or minor internal changes.

Recent changelog:

| Version | Summary |
|---|---|
| 2.2.8 | Full Native Optimizer support (Adam, SGD, AdaGrad, Momentum, NAG) and Sparse Embedding native backend. |
| 2.2.7 | Unicode-aware BPE pre-tokenizers and multilingual tokenizer documentation. |
| 2.2.5 | Hot-path optimizations for training/validation, embedding lookup, and BPE tokenizer. |
| 2.2.4 | Transformers.predictMode API ergonomics, docs sync, and correctness suite refactor. |
| 2.2.3 | Training/inference hot-path optimizations and updated correctness learning snapshots. |
| 2.2.2 | Combined root suite, family model benchmarks, and correctness learning snapshots. |
| 2.2.0 | Dynamic padding trim + positional encoding offset. |
| 2.0.2 | Transformer projector optimizations with no API changes. |


Development & Contributing

npm install          # install dependencies
npm run build:rust   # compile the Rust native addon
npm test             # run the correctness suite + synthetic benchmark

Type-check only (no emit):

npx tsc --noEmit

Roadmap

  • Stabilize public API entry points (currently imported directly from src/*).
  • Add deterministic floating-point tests.
  • Clean up scripts that reference non-existent project folders.
  • Add dataset recipe documentation and benchmark workflow guides.

License & Credits