starlight-classifier
v1.0.0
A simple text classification library for Starlight using TF-IDF and Naive Bayes.
Starlight Classifier
Starlight Classifier is a lightweight text classification library for Starlight, built on top of starlight-vec and starlight-ml. It uses TF-IDF vectorization and a simple Naive Bayes approach for classifying documents quickly and efficiently.
Features
- Vectorizes text using TF-IDF (via starlight-vec)
- Simple Naive Bayes classifier for text
- Predict labels for single or multiple documents
- Works with the rest of the Starlight ML ecosystem
- Customizable stopwords support
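To make the TF-IDF and stopword features above concrete, here is a minimal, self-contained sketch in plain JavaScript. It is not the starlight-vec implementation: the function names (tokenize, fitIdf, tfidf, cosine), the tokenization rule, and the smoothed IDF formula are all assumptions chosen for illustration. It also includes a cosine similarity helper, the same measure the Advanced Usage example reports.

```javascript
// Illustrative sketch only: NOT the starlight-vec implementation.
// All function names and formulas here are assumptions.

// Lowercase, split on non-alphanumeric characters, and drop stopwords.
function tokenize(text, stopwords = []) {
  const stop = new Set(stopwords.map((w) => w.toLowerCase()));
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0 && !stop.has(t));
}

// Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1.
function fitIdf(docs, stopwords = []) {
  const df = new Map();
  for (const doc of docs) {
    // Count each term at most once per document.
    for (const term of new Set(tokenize(doc, stopwords))) {
      df.set(term, (df.get(term) || 0) + 1);
    }
  }
  const idf = new Map();
  for (const [term, count] of df) {
    idf.set(term, Math.log((1 + docs.length) / (1 + count)) + 1);
  }
  return idf;
}

// Map one document to a sparse vector of (term frequency * idf) weights.
// Terms that were not seen while fitting are ignored.
function tfidf(doc, idf, stopwords = []) {
  const tokens = tokenize(doc, stopwords);
  const vec = new Map();
  for (const term of tokens) {
    if (idf.has(term)) vec.set(term, (vec.get(term) || 0) + 1);
  }
  for (const [term, count] of vec) {
    vec.set(term, (count / tokens.length) * idf.get(term));
  }
  return vec;
}

// Cosine similarity between two sparse vectors (0 if either is empty).
function cosine(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const [term, w] of a) {
    normA += w * w;
    if (b.has(term)) dot += w * b.get(term);
  }
  for (const w of b.values()) normB += w * w;
  return normA && normB ? dot / Math.sqrt(normA * normB) : 0;
}

const sampleDocs = ["I love machine learning", "Starlight ML is amazing"];
const idf = fitIdf(sampleDocs, ["is"]);
const v1 = tfidf(sampleDocs[0], idf, ["is"]);
const v2 = tfidf(sampleDocs[1], idf, ["is"]);
console.log("Self similarity:", cosine(v1, v1));
console.log("Similarity of two documents with no shared terms:", cosine(v1, v2));
```

The two sample documents share no terms once "is" is removed, so their similarity is zero, while any non-empty vector compared with itself scores approximately 1.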
Installation
npm install starlight-classifier starlight-vec starlight-ml
Usage
Importing
import { NaiveBayesClassifier, classify } from "starlight-classifier";
import * as ml from "starlight-ml";
Example
// Sample documents
const docs = [
"I love machine learning",
"Starlight ML is amazing",
"Python is great for AI",
"I enjoy coding in JavaScript"
];
// Corresponding labels
const labels = ["tech", "tech", "tech", "programming"];
// Stopwords (optional)
const stopwords = ["is", "in"];
// Train classifier
const clf = classify(docs, labels, stopwords);
// Predict a new document
const newDoc = "I love AI and machine learning";
const prediction = clf.predict(newDoc);
console.log("Predicted class:", prediction); // e.g., "tech"
// Predict multiple documents
const batch = ["I code every day", "Starlight ML is cool"];
const batchPredictions = clf.predictBatch(batch);
console.log("Batch predictions:", batchPredictions);
Advanced Usage
import { NaiveBayesClassifier } from "starlight-classifier";
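// Reuses docs and labels from the Example section; run as a separate
// snippet, since clf is also declared there
// Create classifier with stopwords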
const clf = new NaiveBayesClassifier(["is", "and"]);
// Fit on documents
clf.fit(docs, labels);
// Transform and analyze similarity
const vec1 = clf.vectorizer.transform(docs[0]);
const vec2 = clf.vectorizer.transform(docs[1]);
const similarity = NaiveBayesClassifier.cosine(vec1, vec2);
console.log("Cosine similarity between doc1 and doc2:", similarity);
API
NaiveBayesClassifier(stopwords = [])
- Creates a classifier instance.
- stopwords: optional array of words to ignore during vectorization.
fit(docs, labels)
- Train classifier on an array of documents with corresponding labels.
predict(doc)
- Predict the class of a single document.
predictBatch(docs)
- Predict the classes for multiple documents.
classify(docs, labels, stopwords)
- Convenience function: creates a classifier, fits it, and returns it.
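The fit / predict / predictBatch flow listed above can be sketched in plain JavaScript as follows. This is not the starlight-classifier source: SketchNaiveBayes and classifySketch are hypothetical names, and the sketch assumes a multinomial Naive Bayes over raw token counts with add-one (Laplace) smoothing rather than the TF-IDF weights the library describes. It is meant only to show what each API call does conceptually.

```javascript
// Illustrative sketch only: NOT the starlight-classifier source.
// SketchNaiveBayes and classifySketch are hypothetical names.
class SketchNaiveBayes {
  constructor(stopwords = []) {
    this.stopwords = new Set(stopwords.map((w) => w.toLowerCase()));
    this.docCounts = new Map();  // label -> number of training documents
    this.termCounts = new Map(); // label -> Map(term -> count)
    this.totalTerms = new Map(); // label -> total number of tokens
    this.vocab = new Set();      // every term seen during fit
    this.totalDocs = 0;
  }

  // Lowercase, split on non-alphanumeric characters, drop stopwords.
  tokenize(text) {
    return text
      .toLowerCase()
      .split(/[^a-z0-9]+/)
      .filter((t) => t.length > 0 && !this.stopwords.has(t));
  }

  // Accumulate per-label document and term counts.
  fit(docs, labels) {
    if (docs.length !== labels.length) {
      throw new Error("docs and labels must have the same length");
    }
    docs.forEach((doc, i) => {
      const label = labels[i];
      if (!this.termCounts.has(label)) {
        this.termCounts.set(label, new Map());
        this.docCounts.set(label, 0);
        this.totalTerms.set(label, 0);
      }
      this.docCounts.set(label, this.docCounts.get(label) + 1);
      this.totalDocs += 1;
      const counts = this.termCounts.get(label);
      for (const term of this.tokenize(doc)) {
        counts.set(term, (counts.get(term) || 0) + 1);
        this.totalTerms.set(label, this.totalTerms.get(label) + 1);
        this.vocab.add(term);
      }
    });
    return this;
  }

  // Return the label with the highest log posterior (null if untrained).
  predict(doc) {
    // Terms never seen in training carry no information here, so skip them.
    const tokens = this.tokenize(doc).filter((t) => this.vocab.has(t));
    let best = null;
    let bestScore = -Infinity;
    for (const [label, nDocs] of this.docCounts) {
      // Log prior plus log likelihoods with add-one smoothing.
      let score = Math.log(nDocs / this.totalDocs);
      const counts = this.termCounts.get(label);
      const denom = this.totalTerms.get(label) + this.vocab.size;
      for (const term of tokens) {
        score += Math.log(((counts.get(term) || 0) + 1) / denom);
      }
      if (score > bestScore) {
        bestScore = score;
        best = label;
      }
    }
    return best;
  }

  predictBatch(docs) {
    return docs.map((doc) => this.predict(doc));
  }
}

// Mirrors the convenience function: create, fit, and return the model.
function classifySketch(docs, labels, stopwords = []) {
  return new SketchNaiveBayes(stopwords).fit(docs, labels);
}

const model = classifySketch(
  ["I love machine learning", "Python is great for AI", "I enjoy coding in JavaScript"],
  ["tech", "tech", "programming"],
  ["is", "in"]
);
console.log(model.predict("machine learning with Python"));
console.log(model.predictBatch(["coding in JavaScript", "AI is great"]));
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and add-one smoothing keeps a term that is missing from one class from forcing that class's probability to zero.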
Dependencies
- starlight-vec: TF-IDF vectorization for Starlight
- starlight-ml: Text preprocessing utilities
License
MIT © Dominex Macedon
