# inferless

v0.1.1

Run scikit-learn models in TypeScript. Zero dependencies.
Train in Python → Export as JSON → Infer anywhere TypeScript runs.
```sh
npm install inferless
```

## Why
Serverless and edge environments (Vercel, Cloudflare Workers, Bun, Deno) don't support Python runtimes. This makes deploying classical ML models painful:
| Approach | Problem |
|---|---|
| Python runtime (Flask/FastAPI) | Not available on edge/serverless |
| ONNX Runtime | ~30MB binary, no Cloudflare Workers support |
| TensorFlow.js | Designed for neural networks, massive overhead for classical ML |
| External ML API | Adds latency, cost, and vendor lock-in |
| inferless | Pure TypeScript, zero dependencies, <10KB |
The insight: classical ML inference (linear models, gradient boosted trees) reduces to a handful of arithmetic operations — a dot product, a scaler transform, a tree traversal. These take ~150 lines of TypeScript and no external packages.
inferless packages this as a library. Export your sklearn model as a JSON artifact from Python, then load and run it anywhere TypeScript runs.
## Quick Start

### 1. Train and export your model (Python)
```sh
pip install scikit-learn
```

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.preprocessing import StandardScaler
from inferless_export import export_model  # pip install inferless-export

FEATURES = ["days_since_visit", "total_visits", "visit_frequency"]

scaler = StandardScaler().fit(X_train)
clf = GradientBoostingClassifier(n_estimators=40).fit(scaler.transform(X_train), y_train)

export_model(
    model=clf,
    scaler=scaler,
    feature_columns=FEATURES,
    output_path="churn_model.json",
    validate=True,
    test_X=X_test,
)
```

### 2. Run inference (TypeScript)
```typescript
import { loadModel, predict } from 'inferless'
import modelJson from './churn_model.json'

const model = loadModel(modelJson)

const result = predict(model, {
  days_since_visit: 45,
  total_visits: 3,
  visit_frequency: 0.4,
})

console.log(result.probability) // 0.72 — high churn risk
```

That's it. No Python. No containers. No external API calls.
## Supported Algorithms

| Algorithm | sklearn class | `model_type` |
|---|---|---|
| Ridge Regression | `Ridge` | `ridge_regression` |
| Lasso Regression | `Lasso` | `lasso_regression` |
| Elastic Net | `ElasticNet` | `elastic_net` |
| Linear Regression | `LinearRegression` | `linear_regression` |
| Logistic Regression | `LogisticRegression` | `logistic_regression` |
| Gradient Boosted Classifier | `GradientBoostingClassifier` | `gradient_boosting_classifier` |
| Gradient Boosted Regressor | `GradientBoostingRegressor` | `gradient_boosting_regressor` |
Scalers: `StandardScaler`, `MinMaxScaler`
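Both scaler transforms are simple elementwise formulas, which is what keeps the TypeScript side dependency-free. A minimal sketch of the two transforms (not the library's internal code):

```typescript
// StandardScaler: (x - mean) / scale, per feature.
function standardScale(x: number[], mean: number[], scale: number[]): number[] {
  return x.map((v, i) => (v - mean[i]) / scale[i])
}

// MinMaxScaler: (x - min) / (max - min), per feature.
function minMaxScale(x: number[], min: number[], max: number[]): number[] {
  return x.map((v, i) => (v - min[i]) / (max[i] - min[i]))
}

console.log(standardScale([10], [8], [2])) // [1]
console.log(minMaxScale([5], [0], [10]))   // [0.5]
```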
## API

### `loadModel(json)`
Parse a model artifact from a JSON object. Validates structure.
```typescript
// From a bundled JSON file
const model = loadModel(require('./model.json'))

// Or fetched at runtime
const model = loadModel(await fetch('/model.json').then(r => r.json()))
```

### `predict(model, input)`
Run inference on a single input. Input can be a named object (recommended) or an ordered array.
```typescript
// Named object — order-independent, recommended
predict(model, { feature_a: 1.2, feature_b: 0.5 })

// Ordered array — must match feature_names order in artifact
predict(model, [1.2, 0.5])
```

Returns `{ value: number, probability?: number }`.
- `value`: raw model output (log-odds for classifiers, regression score for regressors)
- `probability`: sigmoid-transformed `value`, for classifiers only
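For classifiers, `probability` is the logistic sigmoid of the raw log-odds `value`. Assuming that relationship, the conversion can be sketched as:

```typescript
// Logistic sigmoid: maps a raw log-odds score to a probability in (0, 1).
function sigmoid(logOdds: number): number {
  return 1 / (1 + Math.exp(-logOdds))
}

// A raw value of 0 corresponds to a 50% probability.
console.log(sigmoid(0)) // 0.5

// Positive log-odds push the probability toward 1.
console.log(sigmoid(2) > 0.8) // true
```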
### `predictBatch(model, inputs)`
Run inference on multiple inputs. Returns an array of results.
```typescript
const results = predictBatch(model, guests.map(g => ({
  days_since_visit: g.daysSince,
  total_visits: g.visits,
})))
```

## How It Works
When sklearn trains a model, all the "intelligence" is encoded in the model parameters:
- Linear models: a coefficient vector + intercept
- GBT: an array of decision trees (each a set of split conditions and leaf values)
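Evaluating one tree from a GBT is a root-to-leaf walk against the split conditions. A minimal sketch, with a node layout that is illustrative rather than the exact artifact schema:

```typescript
// Illustrative tree node layout; the real artifact schema may differ.
interface TreeNode {
  feature?: number   // index of the feature to test (internal nodes)
  threshold?: number // go left if x[feature] <= threshold
  left?: TreeNode
  right?: TreeNode
  value?: number     // leaf output (present only on leaves)
}

// Walk from the root to a leaf and return the leaf value.
function evalTree(node: TreeNode, x: number[]): number {
  while (node.value === undefined) {
    node = x[node.feature!] <= node.threshold! ? node.left! : node.right!
  }
  return node.value
}

// A GBT prediction sums all tree outputs (plus a base score).
const tree: TreeNode = {
  feature: 0, threshold: 10,
  left: { value: -0.5 },
  right: { value: 0.8 },
}
console.log(evalTree(tree, [45])) // 0.8
```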
At inference time, these parameters are all that's needed. inferless exports them as a JSON artifact and reimplements the inference math in TypeScript.
The artifact is a self-contained description of the model — no sklearn, no Python, no dependencies needed to read it.
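For a linear artifact, inference reduces to scaling the input, taking a dot product, and adding the intercept. A sketch under that assumption (hypothetical `predictLinear`, not the library's API):

```typescript
// Minimal linear inference over an exported artifact (sketch, not the library's code).
interface LinearArtifact {
  coefficients: number[]
  intercept: number
  scaler?: { mean: number[]; scale: number[] }
}

function predictLinear(m: LinearArtifact, x: number[]): number {
  // Apply the standard scaler, if one was exported with the model.
  const scaled = m.scaler
    ? x.map((v, i) => (v - m.scaler!.mean[i]) / m.scaler!.scale[i])
    : x
  // Dot product with the coefficient vector, plus the intercept.
  return scaled.reduce((sum, v, i) => sum + v * m.coefficients[i], m.intercept)
}

const artifact: LinearArtifact = {
  coefficients: [0.42, -0.18, 0.93],
  intercept: 1.24,
  scaler: { mean: [12.3, 0.45, 7.8], scale: [4.2, 0.12, 2.1] },
}
// With every feature at its training mean, the scaled vector is all zeros,
// so the prediction is just the intercept.
console.log(predictLinear(artifact, [12.3, 0.45, 7.8])) // 1.24
```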
JSON artifact format (linear):

```json
{
  "model_type": "ridge_regression",
  "feature_names": ["x1", "x2", "x3"],
  "coefficients": [0.42, -0.18, 0.93],
  "intercept": 1.24,
  "scaler": {
    "type": "standard",
    "mean": [12.3, 0.45, 7.8],
    "scale": [4.2, 0.12, 2.1]
  },
  "trained_at": "2026-01-15T10:00:00Z",
  "version": "1"
}
```

## Validation
The Python exporter includes a round-trip validation step. When validate=True, it runs the same inference math as the TypeScript library (implemented in Python for testing) and checks that predictions match sklearn's output within a configurable tolerance.
```python
export_model(
    model=clf,
    feature_columns=FEATURES,
    output_path="model.json",
    validate=True,
    test_X=X_test,
    tolerance=1e-5,  # default
)
# [inferless] Validation passed (max diff: 3.45e-09)
```

## Limitations
- Classical ML only: Neural networks require non-portable inference (matrix operations over thousands of parameters). Use ONNX or TensorFlow.js for those.
- Binary classification: Multi-class GBT (`n_classes > 2`) is not currently supported.
- Large GBTs: Models with thousands of trees will have large JSON artifacts. Typically fine for ≤200 trees.
- Feature preprocessing: Only `StandardScaler` and `MinMaxScaler` are supported. Complex pipelines (e.g. `ColumnTransformer`) should be applied before export.
## Used in Production
inferless was extracted from Signal — a consumer intelligence platform for Africa's hospitality economy — where it powers four ML models (guest churn prediction, rep reliability, event forecasting, attribution scoring) on Vercel serverless functions.
## License
MIT © Oluwasemilogo Benson
