@akhyar11/ml-v1
v2.3.0
TypeScript + Rust Native machine learning library. Matrix ops, layers (Dense, Embedding, RNN, LSTM, GRU, MultiHeadAttention, etc.), models (Sequential, Transformers), and a BPE tokenizer.
ML-V1
A TypeScript + Rust Native machine learning library — Matrix operations, neural network layers, Transformer models, and a BPE tokenizer, all in one package.
What is ML-V1?
ML-V1 is a low-to-mid-level machine learning library built with TypeScript and accelerated by a Rust native backend (via napi-rs). It gives you full control over every detail of the training loop — shapes, parameter updates, and custom architectures — without depending on a large ML framework.
Why ML-V1?
- Full manual control over training loops, tensor shapes, and parameter updates.
- A research playground for custom model architectures.
- The productivity of TypeScript combined with Rust performance on hot paths.
- Graceful fallback to pure JavaScript when the native backend is unavailable.
Features
- `Matrix` — flat `Float32Array`-backed tensor with zero-copy hot-path access via `_data`.
- Math primitives — `dotProduct`, `add`, `sub`, `sumAxis`, `clipGradients`, and more; automatically dispatched to Rust or JS.
- Layers — `Dense`, `Embedding`, `RNN`, `LSTM`, `GRU`, `SelfAttention`, `MultiHeadAttention`, `LayerNormalization`, `Dropout`, `PositionalEncoding`, `Flatten`, `Convolution`.
- Models — `Sequential`, `Transformers` (causal LM), `DimentionalityReduction`.
- BPE Tokenizer — train, incremental update, Unicode-aware pre-tokenization, encode/decode with special tokens, padding, and JSON save/load.
- Rust-accelerated ops — dot-product, activations, LayerNorm, embedding lookup, attention, and optimizer updates; auto-fallback to JS when unavailable.
- Dynamic padding trim (`trimPadding`) — reduces effective sequence length per batch, cutting attention cost from O(seqLen²) to O(effectiveSeqLen²).
Installation
```bash
npm install @akhyar11/ml-v1
```
Prerequisites for Native Acceleration
The library works out of the box with a pure JavaScript fallback. For up to 10× faster matrix operations, install the Rust toolchain so the native addon can be compiled automatically on npm install:
- Rust Toolchain — install via rustup.rs.
- C/C++ Build Tools — required by the native binding compiler (e.g. `build-essential` on Linux, Xcode CLI on macOS, MSVC on Windows).
Note: If Rust is not installed, a warning is printed and the library falls back to pure JavaScript automatically. Performance will be noticeably slower for large models.
Building from Source
If you cloned the repository or need a manual build:
```bash
# Install dependencies
npm install

# Build the native Rust addon (release mode)
npm run build:rust

# Build the TypeScript distribution
npm run build:publish
```
Rust Native Backend
The native backend is loaded by `src/math/rust_backend.ts`. You can check whether it is active at runtime:
```ts
import { isNativeAvailable } from "@akhyar11/ml-v1";

console.log("Native active:", isNativeAvailable());
```
To force JavaScript-only execution (useful for debugging or regression comparisons):
```bash
ML_DISABLE_NATIVE=1 node your-script.js
```
Quick Start
Train a simple XOR classifier in a few lines:
```ts
import { Dense, mj, Sequential } from "@akhyar11/ml-v1";

const model = new Sequential({
  layers: [
    new Dense({ units: 2, outputUnits: 4, activation: "relu", status: "input" }),
    new Dense({ units: 4, outputUnits: 1, activation: "sigmoid", status: "output", loss: "mse" }),
  ],
});
model.compile({ alpha: 0.01, optimizer: "adam", error: "mse" });

const X = [mj.matrix([[0], [0]]), mj.matrix([[0], [1]]), mj.matrix([[1], [0]]), mj.matrix([[1], [1]])];
const Y = [mj.matrix([[0]]), mj.matrix([[1]]), mj.matrix([[1]]), mj.matrix([[0]])];

const result = model.fit(X, Y, 200, {
  batchSize: 4,
  validationSplit: 0.25,
  earlyStoppingPatience: 10,
  verbose: true,
  onEpochEnd: (epoch, loss, valLoss) => {
    console.log(`epoch=${epoch} loss=${loss} valLoss=${valLoss}`);
  },
});
console.log("best", result.bestEpoch, result.bestLoss);

const pred = model.predict(mj.matrix([[1], [0]]));
pred.print();
```
A legacy callback overload is also supported for backward compatibility:
```ts
model.fit(X, Y, 200, (loss) => console.log("loss", loss));
```
Examples
Matrix & Math Operations
```ts
import { mj } from "@akhyar11/ml-v1";

const a = mj.matrix([[1, 2], [3, 4]]);
const b = mj.matrix([[5, 6], [7, 8]]);
const c = mj.dotProduct(a, b);
const d = mj.add(c, 1);
console.log(c._shape, d._shape);
```
BPE Tokenizer
```ts
import { BPETokenizer } from "@akhyar11/ml-v1";

const tokenizer = new BPETokenizer({ vocabSize: 120, minFrequency: 2 });
tokenizer.train(["hello world", "hello there"]);

const ids = tokenizer.encodeWithSpecial("hello world");
const padded = tokenizer.padSequence(ids, 12);
console.log(ids, padded, tokenizer.decode(ids));
```
Unicode and Multilingual Tokenization
ML-V1 supports custom and built-in pre-tokenizers for non-Latin text. The default is still `char` for backward compatibility; use `unicode-grapheme` or `script-aware` for multilingual corpora.
Supported modes: `char`, `unicode-grapheme`, `unicode-word`, `whitespace`, `script-aware`.
```ts
import { BPETokenizer } from "@akhyar11/ml-v1";

const tokenizer = new BPETokenizer({
  vocabSize: 1000,
  preTokenizer: "script-aware"
});

tokenizer.train([
  "hello world",
  "مرحبا بالعالم",
  "こんにちは世界",
  "你好世界",
  "ภาษาไทย",
  "한국어테스트",
  "ꦱꦺꦴꦥꦺꦴ",
  "x² + y² = z²",
  "hello ꦱꦺꦴꦥꦺꦴ 😊 你好"
]);
```
BPE alone is not enough for every writing system. Pre-tokenization is important for scripts without spaces, combining marks, emoji sequences, and mixed text. `script-aware` is a general built-in mode; for language-specific behavior, pass a custom `(text: string) => string[]` pre-tokenizer. `Intl.Segmenter` improves grapheme and word segmentation when the runtime supports it. Fallback behavior is deterministic but may be less linguistically accurate.
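As a sketch of that custom path, here is a hypothetical word-level pre-tokenizer built on the standard `Intl.Segmenter` API (a JavaScript built-in, not an ML-V1 API); it satisfies the `(text: string) => string[]` signature documented in the Tokenizer section below:
```ts
import { BPETokenizer } from "@akhyar11/ml-v1";

// Hypothetical word-level pre-tokenizer using Intl.Segmenter
// (available in Node.js 16+); matches (text: string) => string[].
const segmenter = new Intl.Segmenter("th", { granularity: "word" });
const wordPreTokenizer = (text: string): string[] =>
  Array.from(segmenter.segment(text), (seg) => seg.segment);

const tokenizer = new BPETokenizer({
  vocabSize: 1000,
  preTokenizer: wordPreTokenizer,
});
tokenizer.train(["ภาษาไทยไม่มีช่องว่างระหว่างคำ"]);
```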
Transformer Causal LM — Training
```ts
import { mj, Transformers } from "@akhyar11/ml-v1";

const model = new Transformers({ units: 64, seqLen: 8, vocabSize: 500, heads: 8, alpha: 0.001, padTokenId: 0 });
model.compile({ alpha: 0.001, optimizer: "adam", error: "softmaxCrossEntropy" });
model.train();

const x = mj.matrix([[0], [0], [10], [20], [30], [40], [50], [60]]); // shape [seqLen, 1]
const y = mj.matrix([[0], [10], [20], [30], [40], [50], [60], [0]]); // shifted targets [seqLen, 1]

const logits = model.forward(x); // shape [vocabSize, seqLen * batch]
model.backward(y);
console.log("shape", logits._shape, "loss", model.loss);
```
Transformer — Generation / Inference
```ts
import { mj, Transformers } from "@akhyar11/ml-v1";

const model = new Transformers({
  units: 64,
  seqLen: 8,
  vocabSize: 500,
  heads: 8,
  alpha: 0.001,
  padTokenId: 0,
  predictMode: "next-token",
});
model.eval();

const x = mj.matrix([[0], [0], [10], [20], [30], [40], [50], [60]]);
const nextTokenLogits = model.predict(x); // shape [vocabSize, batch]

model.setPredictMode("full-sequence");
const fullSequenceLogits = model.predict(x); // shape [vocabSize, seqLen * batch]
```
API Overview
Models
| Model | Description |
|---|---|
| Sequential | Generic layer stack (Dense, Embedding, Attention, CNN, etc.). |
| Transformers | Multi-block causal language model. Supports numBlocks >= 1, full-sequence training, and configurable predictMode ("next-token" / "full-sequence"). |
| DimentionalityReduction | Extends Sequential with an encoder/decoder split via the outputReduction layer status. |
Layers
| Layer | Description |
|---|---|
| Dense | Fully-connected layer with activation, optimizer, and loss handling. |
| Embedding | Token-ID-to-vector lookup with resize() support. |
| LayerNormalization | Per-column/token normalization. |
| Dropout | Active only during training mode. |
| PositionalEncoding | Fixed sinusoidal positional encoding. |
| MultiHeadAttention / SelfAttention | Causal attention mask with padding support. |
| RNN / LSTM / GRU | Recurrent sequence modeling with BPTT, gradient clipping, save/load, and stateful mode. returnSequences is supported; returnState is not yet supported and will throw explicitly. See the input-shape sketch after this table. |
| Flatten / Convolution | Standard CNN building blocks. |
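As a rough illustration of the recurrent input convention (a single `[features, seqLen]` sample, batched with `batchSize: 1` per Core Concepts below), here is a sketch. The exact `LSTM` constructor options are an assumption modeled on the `Dense` options shown in Quick Start; consult the Full API Reference for the real signature:
```ts
import { LSTM, mj, Sequential } from "@akhyar11/ml-v1";

// ASSUMPTION: option names mirror Dense; verify against the API reference.
const model = new Sequential({
  layers: [
    new LSTM({ units: 3, outputUnits: 4, status: "input", returnSequences: false }),
  ],
});

// One sample with shape [features, seqLen]: 3 features over 5 time steps.
// Sequential.fit() does not batch recurrent sequences yet, so use batchSize: 1.
const x = mj.matrix([
  [0.1, 0.2, 0.3, 0.4, 0.5],
  [0.0, 0.1, 0.0, 0.1, 0.0],
  [1.0, 0.9, 0.8, 0.7, 0.6],
]);
```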
Tokenizer
BPETokenizer supports:
| Method | Description |
|---|---|
| train(corpus) | Initial BPE training on a string array. |
| update(corpus) | Incremental vocabulary update without retraining from scratch (see the sketch after this table). |
| encode(text) / encodeWithSpecial(text) | Encode text to token IDs, with or without special tokens. |
| decode(ids) | Convert token IDs back to text. |
| padSequence(ids, length) | Pad or truncate a sequence to a fixed length. |
| save(path) / load(path) | Persist and restore the tokenizer as a JSON file. |
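A short sketch of incremental vocabulary growth with `update()`, using only the methods listed above; the corpus strings are illustrative:
```ts
import { BPETokenizer } from "@akhyar11/ml-v1";

const tokenizer = new BPETokenizer({ vocabSize: 300, minFrequency: 2 });
tokenizer.train(["hello world", "hello there"]); // initial BPE training

// Later, fold new text into the existing vocabulary without retraining.
tokenizer.update(["goodbye world", "goodbye there"]);
console.log(tokenizer.decode(tokenizer.encode("goodbye world")));
```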
Tokenizer options:
```ts
type PreTokenizer = (text: string) => string[];

type BuiltInPreTokenizer =
  | "char"
  | "unicode-grapheme"
  | "unicode-word"
  | "whitespace"
  | "script-aware";

type BPETokenizerOptions = {
  vocabSize?: number;
  minFrequency?: number;
  preTokenizer?: BuiltInPreTokenizer | PreTokenizer;
};
```
Built-in pre-tokenizer names are saved in tokenizer JSON files. Custom pre-tokenizer functions are not serialized; saved metadata records `"custom"`, and the same function must be passed again to `BPETokenizer.load(path, { preTokenizer })`.
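A minimal sketch of that round trip, assuming a file path of your choosing; the `BPETokenizer.load` call follows the form shown in the preceding paragraph, and the whitespace pre-tokenizer is a hypothetical example:
```ts
import { BPETokenizer } from "@akhyar11/ml-v1";

// Hypothetical custom pre-tokenizer: simple whitespace split.
const myPreTokenizer = (text: string): string[] => text.split(/\s+/).filter(Boolean);

const tokenizer = new BPETokenizer({ vocabSize: 500, preTokenizer: myPreTokenizer });
tokenizer.train(["hello world"]);
tokenizer.save("tokenizer.json"); // metadata records "custom" for the pre-tokenizer

// The function itself is not serialized, so pass it again on load.
const restored = BPETokenizer.load("tokenizer.json", { preTokenizer: myPreTokenizer });
```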
Core Concepts
- Shape convention: most layers use `[rows, cols]`; batched Transformer inputs use column-sequence layout `[seqLen, batchSize]`.
- Recurrent convention: recurrent layers expect a single sequence sample with shape `[features, seqLen]`. The generic `Sequential.fit()` does not batch recurrent sequences yet — use `batchSize: 1`.
- Sparse classification targets: use `softmaxCrossEntropy` with a dense output layer and a target of shape `[1, batch]` containing class indices (see the sketch below).
- Training / eval mode: call `model.train()` before training and `model.eval()` before inference. Layers like `Dropout` respect this flag.
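A minimal sketch of the sparse-target convention; the layer sizes are illustrative, and the `"softmax"` activation name is an assumption (the examples above only show `"relu"` and `"sigmoid"`):
```ts
import { Dense, mj, Sequential } from "@akhyar11/ml-v1";

// 3-class classifier head trained with sparse targets.
const model = new Sequential({
  layers: [
    new Dense({ units: 4, outputUnits: 8, activation: "relu", status: "input" }),
    // ASSUMPTION: "softmax" is a valid activation name here.
    new Dense({ units: 8, outputUnits: 3, activation: "softmax", status: "output", loss: "softmaxCrossEntropy" }),
  ],
});
model.compile({ alpha: 0.01, optimizer: "adam", error: "softmaxCrossEntropy" });

// Target shape [1, batch]: one class index per sample (here, batch = 3).
const y = mj.matrix([[2, 0, 1]]);
```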
Training Workflow
- Prepare data as `Matrix` inputs and targets.
- Build your model and add layers.
- Call `model.compile({ alpha, optimizer, error })`.
- Run `model.fit()` (high-level) or loop `forward()` → `backward()` manually (see the sketch below).
- Save the model and tokenizer with `save()`.
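A sketch of the manual alternative to `model.fit()`, reusing the `forward()`/`backward()` calls from the Transformer training example above and assuming, as in that example, that `backward()` applies the optimizer configured in `compile()`:
```ts
import { mj, Transformers } from "@akhyar11/ml-v1";

const model = new Transformers({ units: 64, seqLen: 8, vocabSize: 500, heads: 8, alpha: 0.001, padTokenId: 0 });
model.compile({ alpha: 0.001, optimizer: "adam", error: "softmaxCrossEntropy" });
model.train();

// One (input, shifted-target) pair, as in the Transformer training example.
const X = [mj.matrix([[0], [0], [10], [20], [30], [40], [50], [60]])];
const Y = [mj.matrix([[0], [10], [20], [30], [40], [50], [60], [0]])];

for (let epoch = 0; epoch < 50; epoch++) {
  let epochLoss = 0;
  for (let i = 0; i < X.length; i++) {
    model.forward(X[i]);  // compute logits for the sample
    model.backward(Y[i]); // backprop; updates use the compiled optimizer
    epochLoss += model.loss;
  }
  console.log(`epoch=${epoch} meanLoss=${epochLoss / X.length}`);
}
```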
Inference Workflow
- Load the model and tokenizer.
- Convert input text to token IDs and pad to `seqLen`.
- Call `model.predict()` (respects `predictMode` for `Transformers`) or `model.forward()`.
- Extract the argmax or raw logits as required by your task.
- Decode token IDs back to text for NLP tasks (see the end-to-end sketch below).
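A minimal end-to-end sketch of these steps, combining the tokenizer and Transformer APIs shown above. The argmax loop reads `_data` directly and assumes a single sequence (batch = 1), so the `[vocabSize, 1]` logits flatten to one column:
```ts
import { BPETokenizer, mj, Transformers } from "@akhyar11/ml-v1";

// 1. Build (or load) the tokenizer and model.
const tokenizer = new BPETokenizer({ vocabSize: 500 });
tokenizer.train(["hello world", "hello there"]);
const model = new Transformers({
  units: 64, seqLen: 8, vocabSize: 500, heads: 8, alpha: 0.001,
  padTokenId: 0, predictMode: "next-token",
});
model.eval();

// 2. Text -> token IDs, padded to seqLen, in column-sequence layout [seqLen, 1].
const ids = tokenizer.padSequence(tokenizer.encodeWithSpecial("hello"), 8);
const x = mj.matrix(ids.map((id) => [id]));

// 3. Predict, then take the argmax over the [vocabSize, 1] logits via _data.
const logits = model.predict(x);
let best = 0;
for (let i = 1; i < logits._data.length; i++) {
  if (logits._data[i] > logits._data[best]) best = i;
}

// 4. Decode the winning token ID back to text.
console.log("next token:", tokenizer.decode([best]));
```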
Performance Notes
- The Rust backend accelerates dot-product, activations, LayerNorm, embedding lookup, attention, and optimizer hot paths.
- `Matrix` uses `Float32Array` to minimize allocation overhead. Use `_data` directly in hot paths (see the sketch below).
- Several layers use pre-allocated output buffers to reduce garbage collection pressure.
- Dynamic padding trim (`trimPadding: true`, the default) reduces `effectiveSeqLen` per batch, cutting attention cost from O(seqLen²) to O(effectiveSeqLen²) and output projection cost from `vocabSize × seqLen × batch` to `vocabSize × effectiveSeqLen × batch`.
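A small illustration of the `_data` hot-path access pattern, assuming `_data` is the flat `Float32Array` described in Features:
```ts
import { mj } from "@akhyar11/ml-v1";

const m = mj.matrix([[1, 2], [3, 4]]);

// Zero-copy read over the flat Float32Array backing store; no per-element
// accessor calls and no intermediate allocations.
let sum = 0;
const data = m._data;
for (let i = 0; i < data.length; i++) sum += data[i];
console.log(sum); // 10
```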
Dynamic Padding Trim (v2.2.0+)
When training a Transformer on long-context sequences (e.g. `seqLen = 1024`), enable `trimPadding` to avoid paying the full quadratic attention cost on padding tokens:
```ts
import { Transformers } from "@akhyar11/ml-v1";

const model = new Transformers({
  units: 64,
  seqLen: 1024,
  vocabSize: 5000,
  heads: 8,
  numBlocks: 2,
  padTokenId: 0
});

// Right-padding (recommended for new datasets)
model.fit(trainX, trainY, 80, {
  batchSize: 8,
  trimPadding: true,
  paddingSide: "right",
  shuffle: true
});

// Left-padding (for datasets already padded on the left)
model.fit(trainX, trainY, 80, {
  batchSize: 8,
  trimPadding: true,
  paddingSide: "left",
  shuffle: true
});
```
Options:
- `trimPadding: true` (default) — enabled automatically.
- `paddingSide: "right"` (default) — trailing PAD tokens are trimmed; `positionOffset` is 0.
- `paddingSide: "left"` — leading PAD tokens are trimmed; `positionOffset` is adjusted so that positional encodings for real tokens remain unchanged.
- `trimPadding: false` — disables the feature entirely.
- Only applies to full-sequence targets with shape `[seqLen, batch]`. Legacy targets with shape `[1, batch]` are not trimmed.
Best Practices
- Use `softmaxCrossEntropy` for sparse token classification tasks.
- Keep `seqLen` consistent between your preprocessing pipeline and the model constructor.
- Set `padTokenId` in both the tokenizer and the model's `Embedding` layer.
- For `Transformers`, prepare shifted next-token targets with shape `[seqLen, batch]` and fill invalid positions with `padTokenId` (see the sketch after this list).
- Call `model.train()` before training and `model.eval()` before inference.
- For Transformer inference, use `model.predict()` as the primary entry point and set `predictMode` to `"next-token"` or `"full-sequence"` as needed.
- For stateful recurrent models, avoid `shuffle: true` and `validationSplit > 0` in the generic `Sequential.fit()` loop.
- Start debugging with `ML_DISABLE_NATIVE=1` when comparing JS vs. native behavior.
- If loss does not decrease, verify tensor shapes at every layer boundary.
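A sketch of building shifted next-token targets for one sequence, mirroring the x/y pair in the Transformer training example above (y[t] = x[t + 1], with the final position filled with `padTokenId`); `makeShiftedPair` is a hypothetical helper:
```ts
import { mj } from "@akhyar11/ml-v1";

// Hypothetical helper: build a [seqLen, 1] input/target pair from raw token IDs.
function makeShiftedPair(ids: number[], padTokenId: number) {
  const x = mj.matrix(ids.map((id) => [id]));
  const shifted = [...ids.slice(1), padTokenId]; // y[t] = x[t + 1], last = pad
  const y = mj.matrix(shifted.map((id) => [id]));
  return { x, y };
}

const { x, y } = makeShiftedPair([0, 0, 10, 20, 30, 40, 50, 60], 0);
// x rows: 0,0,10,20,30,40,50,60   y rows: 0,10,20,30,40,50,60,0
```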
Troubleshooting
| Problem | Solution |
|---|---|
| Native backend not available | Run npm run build:rust, or verify that the .node binary matches your current platform. |
| Shape mismatch in dot product | Check that dimensions satisfy [aRows × aCols] · [bRows × bCols] where aCols === bRows. |
| Loss is NaN or Inf | Reduce the learning rate alpha, verify target format, and check for out-of-range token IDs in the embedding (see the check after this table). |
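For the NaN/Inf case, a quick pre-flight check for out-of-range token IDs before they reach the `Embedding` layer; `assertIdsInRange` is a hypothetical helper, and `vocabSize` is whatever you passed to the model:
```ts
// Hypothetical check: every token ID must index a valid embedding row.
function assertIdsInRange(ids: number[], vocabSize: number): void {
  for (const id of ids) {
    if (!Number.isInteger(id) || id < 0 || id >= vocabSize) {
      throw new Error(`token id ${id} is outside [0, ${vocabSize})`);
    }
  }
}

assertIdsInRange([0, 10, 20], 500); // ok
```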
Project Structure
```
src/
  activation/  cost/  optimizer/
  matrix/  math/
  layers/  models/
  tokenizer/
  utils/
src-rust/
  src/lib.rs   ← Rust native ops (napi-rs)
test/
dataset/
docs/
```
Architecture Overview
| Module | Role |
|---|---|
| src/matrix | Core Matrix data structure (Float32Array-backed). |
| src/math | Numeric primitives + adaptive Rust/JS dispatch. |
| src/activation, src/cost, src/optimizer | Training building blocks. |
| src/layers | Neural network layer implementations. |
| src/models | High-level model compositions. |
| src/tokenizer | Text preprocessing (BPE). |
| src-rust | Native ops compiled via napi-rs. |
Benchmark & Testing
- Full test + benchmark entry point: `test/index.ts` — run with `npm test`.
- Correctness suite: `test/correctness/index.ts`.
- Synthetic benchmark suite: `test/benchmark/index.ts`.
- Recurrent model benchmarks: `test/benchmark/testFamilyRnn.test.ts`.
- Transformer mode benchmarks: `test/benchmark/testFamilyTransformers.test.ts`.
- Benchmark history: `docs/benchmark-sintetis/README.md`.
- Correctness history: `docs/correctness/README.md`.
📖 Documentation
For in-depth guides, see the official documentation:
- Overview & Philosophy — Introduction to the library design and system architecture.
- Installation & Setup — How to install and enable Rust native acceleration.
- Practical Tutorial — Step-by-step guide to building a logic bot and a generative (GPT-style) bot.
- Full API Reference — Technical documentation for Matrix, Math, Layers, Tokenizer, Optimizers, and related APIs.
Versioning
This project follows MAJOR.MINOR.PATCH semantic versioning. The current version is 2.2.8.
- MAJOR — breaking changes or major architectural shifts.
- MINOR — new backward-compatible features or improvements.
- PATCH — bug fixes, small optimizations, or minor internal changes.
Recent changelog:
| Version | Summary |
|---|---|
| 2.2.8 | Full Native Optimizer support (Adam, SGD, AdaGrad, Momentum, NAG) and Sparse Embedding native backend. |
| 2.2.7 | Unicode-aware BPE pre-tokenizers and multilingual tokenizer documentation. |
| 2.2.5 | Hot-path optimizations for training/validation, embedding lookup, and BPE tokenizer. |
| 2.2.4 | Transformers.predictMode API ergonomics, docs sync, and correctness suite refactor. |
| 2.2.3 | Training/inference hot-path optimizations and updated correctness learning snapshots. |
| 2.2.2 | Combined root suite, family model benchmarks, and correctness learning snapshots. |
| 2.2.0 | Dynamic padding trim + positional encoding offset. |
| 2.0.2 | Transformer projector optimizations with no API changes. |
Development & Contributing
```bash
npm install          # install dependencies
npm run build:rust   # compile the Rust native addon
npm test             # run the correctness suite + synthetic benchmark
```
Type-check only (no emit):
```bash
npx tsc --noEmit
```
Roadmap
- Stabilize public API entry points (currently imported directly from `src/*`).
- Add deterministic floating-point tests.
- Clean up scripts that reference non-existent project folders.
- Add dataset recipe documentation and benchmark workflow guides.
License & Credits
- License: ISC — see `package.json`.
- Native backend: `napi-rs`, `matrixmultiply`, `rayon`.
- Issues & feature requests: use the GitHub issue tracker.
- Support the project: Saweria.