starlight-preprocess
v1.0.0
Preprocessing utilities for machine learning in Starlight
starlight-preprocess
Starlight Preprocess is a lightweight preprocessing library for machine learning in the Starlight ecosystem. It provides common data transformation tools (scaling, encoding, and dataset splitting) and is designed to work with the other Starlight ML packages.
Features
Feature Scaling
- StandardScaler (mean = 0, std = 1)
- MinMaxScaler (range 0–1)
Label Encoding
- LabelEncoder (class → integer)
- OneHotEncoder (categorical → binary vectors)
Dataset Splitting
- Train / Test split utility
Pipeline Friendly
- Works with starlight-regression, starlight-classifier, starlight-cluster
- Designed for starlight-pipeline
Installation
npm install starlight-preprocess
Usage Examples
Standard Scaling
import { StandardScaler } from "starlight-preprocess";
const X = [
[1, 10],
[2, 20],
[3, 30]
];
const scaler = new StandardScaler();
const XScaled = scaler.fitTransform(X);
console.log(XScaled);
Min-Max Scaling
import { MinMaxScaler } from "starlight-preprocess";
// Same sample matrix as in the Standard Scaling example
const X = [[1, 10], [2, 20], [3, 30]];
const scaler = new MinMaxScaler();
const normalized = scaler.fitTransform(X);
Label Encoding
import { LabelEncoder } from "starlight-preprocess";
const labels = ["cat", "dog", "cat", "bird"];
const encoder = new LabelEncoder();
const encoded = encoder.fitTransform(labels);
console.log(encoded);
// [0, 1, 0, 2]
One-Hot Encoding
import { OneHotEncoder } from "starlight-preprocess";
const encoder = new OneHotEncoder();
const oneHot = encoder.fitTransform(["red", "blue", "red"]);
console.log(oneHot);
// [[1,0],[0,1],[1,0]]
Train / Test Split
import { trainTestSplit } from "starlight-preprocess";
const X = [[1], [2], [3], [4], [5]];
const y = [2, 4, 6, 8, 10];
// Hold out 20% of the samples for testing (testSize = 0.2)
const { XTrain, XTest, yTrain, yTest } = trainTestSplit(X, y, 0.2);
console.log(XTrain, XTest);
API Overview
Scalers
- StandardScaler
- MinMaxScaler
Methods
- fit(X)
- transform(X)
- fitTransform(X)
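The split between fit and transform matters when statistics learned from one dataset are reused on another. The sketch below shows one way a standard scaler with this method set could work. `SketchStandardScaler` is a hypothetical stand-in, not the library's source, and the use of the population standard deviation and the handling of constant columns are assumptions.

```javascript
// Hypothetical stand-in for illustration; not the library's implementation.
class SketchStandardScaler {
  // Learn the per-column mean and standard deviation (rows are samples).
  fit(X) {
    const n = X.length;
    const cols = X[0].length;
    this.mean = Array(cols).fill(0);
    const variance = Array(cols).fill(0);
    for (const row of X) {
      row.forEach((v, j) => { this.mean[j] += v / n; });
    }
    for (const row of X) {
      row.forEach((v, j) => { variance[j] += (v - this.mean[j]) ** 2 / n; });
    }
    // Assumption: population standard deviation; a constant column
    // (std = 0) falls back to 1 to avoid division by zero.
    this.std = variance.map((v) => Math.sqrt(v) || 1);
    return this;
  }

  // Apply the statistics learned in fit() to any matrix with the same columns.
  transform(X) {
    return X.map((row) => row.map((v, j) => (v - this.mean[j]) / this.std[j]));
  }

  // Convenience: fit on X, then transform the same X.
  fitTransform(X) {
    return this.fit(X).transform(X);
  }
}

const X = [[1, 10], [2, 20], [3, 30]];
const scaler = new SketchStandardScaler();
const XScaled = scaler.fitTransform(X);
console.log(XScaled);

// New data is scaled with the statistics learned above, without refitting.
const XNew = scaler.transform([[2, 20]]);
console.log(XNew);
```

A min-max scaler would follow the same shape, storing each column's minimum and maximum in fit and computing (v - min) / (max - min) in transform.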
Encoders
- LabelEncoder
- OneHotEncoder
Methods
- fit(values)
- transform(values)
- fitTransform(values)
- inverseTransform() (LabelEncoder)
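To make the encoder methods concrete, here is a minimal sketch of how they could fit together. `SketchLabelEncoder` and `sketchOneHot` are hypothetical names, not the library's code. The outputs in the usage examples suggest that classes are numbered in order of first appearance, so the sketch assumes that. The argument to `inverseTransform` and the error on unseen labels are also assumptions.

```javascript
// Hypothetical stand-in for illustration; not the library's implementation.
class SketchLabelEncoder {
  // Record each distinct value in order of first appearance.
  fit(values) {
    this.classes = [...new Set(values)];
    this.index = new Map(this.classes.map((c, i) => [c, i]));
    return this;
  }

  // Map each value to its integer code; unseen values throw (an assumption).
  transform(values) {
    return values.map((v) => {
      if (!this.index.has(v)) throw new Error(`Unseen label: ${v}`);
      return this.index.get(v);
    });
  }

  fitTransform(values) {
    return this.fit(values).transform(values);
  }

  // Map integer codes back to the original values.
  inverseTransform(codes) {
    return codes.map((i) => this.classes[i]);
  }
}

// One-hot encoding can reuse the same index: one column per class,
// with a 1 in the column of the value's code and 0 elsewhere.
function sketchOneHot(values) {
  const enc = new SketchLabelEncoder().fit(values);
  return enc.transform(values).map((code) =>
    enc.classes.map((_, j) => (j === code ? 1 : 0))
  );
}

const encoder = new SketchLabelEncoder();
const encoded = encoder.fitTransform(["cat", "dog", "cat", "bird"]);
const decoded = encoder.inverseTransform(encoded);
const oneHot = sketchOneHot(["red", "blue", "red"]);

console.log(encoded); // [0, 1, 0, 2]
console.log(decoded); // ["cat", "dog", "cat", "bird"]
console.log(oneHot);  // [[1,0],[0,1],[1,0]]
```

Keeping the fitted encoder around is what makes the round trip possible: predictions that come back as integers can be turned into readable labels with the same instance.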
Utilities
trainTestSplit(X, y, testSize = 0.2, shuffle = true)
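The signature above can be read as: optionally shuffle the samples, then hold out a `testSize` fraction as the test set. A minimal sketch of that behavior follows. `sketchTrainTestSplit` is a hypothetical stand-in, not the library's source. The rounding rule and the choice to take the test rows from the front of the (shuffled) order are assumptions.

```javascript
// Hypothetical stand-in for illustration; not the library's implementation.
function sketchTrainTestSplit(X, y, testSize = 0.2, shuffle = true) {
  if (X.length !== y.length) {
    throw new Error("X and y must have the same length");
  }

  // Work on row indices so X and y stay aligned after shuffling.
  const indices = X.map((_, i) => i);
  if (shuffle) {
    // Fisher-Yates shuffle: every ordering is equally likely.
    for (let i = indices.length - 1; i > 0; i--) {
      const j = Math.floor(Math.random() * (i + 1));
      [indices[i], indices[j]] = [indices[j], indices[i]];
    }
  }

  // Assumption: the test count is rounded to the nearest integer.
  const nTest = Math.round(X.length * testSize);
  const testIdx = indices.slice(0, nTest);
  const trainIdx = indices.slice(nTest);

  return {
    XTrain: trainIdx.map((i) => X[i]),
    XTest: testIdx.map((i) => X[i]),
    yTrain: trainIdx.map((i) => y[i]),
    yTest: testIdx.map((i) => y[i]),
  };
}

const X = [[1], [2], [3], [4], [5]];
const y = [2, 4, 6, 8, 10];

// shuffle = false keeps the original order, which makes the result repeatable.
const ordered = sketchTrainTestSplit(X, y, 0.2, false);
console.log(ordered.XTrain, ordered.XTest);

// With shuffling, the rows differ per run but each X row keeps its y value.
const shuffled = sketchTrainTestSplit(X, y, 0.2);
console.log(shuffled.XTrain, shuffled.yTrain);
```

Shuffling indices rather than the arrays themselves is the design choice worth noting: it guarantees that each feature row and its label land in the same partition.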
Works Great With
- starlight-ml
- starlight-vec
- starlight-classifier
- starlight-regression
- starlight-cluster
- starlight-pipeline
License
MIT License © Dominex Macedon
Roadmap
- Missing value handling
- Text normalization helpers
- Image preprocessing
- Auto-preprocessing in pipelines
