@mera-vansh/ms-ltd
v2.3.0
Zero-dependency TypeScript NLP engine for multilingual Indian-language applications
@mera-vansh/ms-ltd
Multilingual NLP engine for Indian-language applications.
@mera-vansh/ms-ltd is a zero-dependency TypeScript NLP engine built for 18 languages used in India: 17 Indic languages plus English. It provides semantic retrieval over your own documents, emotion and tone classification, Unicode script and language detection, a 2200-entry curated lexicon, Sanskrit grammar tools, and prefix autocomplete, all without external runtime dependencies, ML models, or API calls.
Key Capabilities
- Semantic retrieval — index your own documents and query them in any Indian language or Romanised English
- Emotion detection — classifies text into REVERENCE, JOY, GRIEF, ANGER, CONFUSION, or NEUTRAL
- Tone detection — classifies formality as REVERENTIAL, FORMAL, URGENT, CURIOUS, INFORMAL, or NEUTRAL
- Script & language detection — identifies the dominant script and narrows it to one of 18 language codes
- Lexicon autocomplete — prefix search over 2200 curated entries (salutations, kinship, geography, literature, time, numbers, and the 9 Sanskrit rasas) across 18 languages
- Sanskrit grammar tools — nominal inflection, adjective agreement, IAST transliteration to 9 Indic scripts
- Feedback loop — boost or penalise individual documents based on whether they were helpful
- Serialisable state — export and restore the full engine as plain JSON
Supported Languages
| Code | Language | Script | Code | Language | Script |
|------|----------|--------|------|----------|--------|
| en | English | Latin | pa | Punjabi | Gurmukhi |
| hi | Hindi | Devanagari | or | Odia | Odia |
| mr | Marathi | Devanagari | sa | Sanskrit | Devanagari |
| ne | Nepali | Devanagari | kok | Konkani | Devanagari |
| ma | Maithili | Devanagari | ta | Tamil | Tamil |
| bn | Bengali | Bengali | te | Telugu | Telugu |
| as | Assamese | Bengali | ml | Malayalam | Malayalam |
| gu | Gujarati | Gujarati | kn | Kannada | Kannada |
| ur | Urdu | Arabic | sd | Sindhi | Arabic |
Installation
npm install @mera-vansh/ms-ltd
# or
pnpm add @mera-vansh/ms-ltd
Requirements: Node.js ≥ 22
Quick Start
import { LTD } from "@mera-vansh/ms-ltd";
const ltd = new LTD();
// 1. Index your knowledge base
ltd.ingest([
  { id: "g1", text: "भारद्वाज गोत्र के बारे में जानकारी", metadata: { topic: "gotra" } },
  { id: "g2", text: "Bharadwaj gotra pravara rishis", metadata: { topic: "gotra" } },
  { id: "r1", text: "माता पिता का रिश्ता पारिवारिक संबंध", metadata: { topic: "family" } },
]);
// 2. Query in any language
const result = ltd.call("मेरा गोत्र भारद्वाज है");
console.log(result.emotion); // "NEUTRAL"
console.log(result.tone); // "NEUTRAL"
console.log(result.lang); // "hi"
console.log(result.candidates[0]); // { id: "g1", score: 0.82, metadata: { topic: "gotra" } }
API
new LTD(options?)
const ltd = new LTD({ defaultTopK: 5 });
| Option | Type | Default | Description |
|---|---|---|---|
| defaultTopK | number | 5 | Default number of candidates returned per query |
ltd.ingest(docs)
Indexes a batch of documents. Prefer one large ingest() call over many small ones to get the best retrieval quality across your full corpus.
ltd.ingest([
  { id: "q1", text: "namaste greetings hello", metadata: { lang: "en" } },
  { id: "q2", text: "नमस्ते प्रणाम", metadata: { lang: "hi" } },
  { id: "q3", text: "வணக்கம் நன்றி", metadata: { lang: "ta" } },
]);
Calling ingest() again adds to the existing store rather than replacing it. A document whose ID already exists overwrites the earlier entry with that ID.
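
The add-and-overwrite behaviour described above can be pictured as a map keyed by document ID. The following is only a minimal sketch of that idea, not the package's implementation; `SketchDoc` and `SketchStore` are hypothetical names introduced for illustration.

```typescript
// Hypothetical document shape, mirroring the { id, text, metadata } objects used above.
interface SketchDoc {
  id: string;
  text: string;
  metadata?: Record<string, unknown>;
}

// A store keyed by document ID: new IDs are added, existing IDs are overwritten.
class SketchStore {
  private docs = new Map<string, SketchDoc>();

  ingest(batch: SketchDoc[]): void {
    for (const doc of batch) {
      this.docs.set(doc.id, doc);
    }
  }

  size(): number {
    return this.docs.size;
  }

  textOf(id: string): string | undefined {
    return this.docs.get(id)?.text;
  }
}

const store = new SketchStore();
store.ingest([
  { id: "q1", text: "namaste greetings hello" },
  { id: "q2", text: "नमस्ते प्रणाम" },
]);
// Second call: "q2" replaces the earlier entry, "q3" is new.
store.ingest([
  { id: "q2", text: "नमस्ते प्रणाम नमस्कार" },
  { id: "q3", text: "வணக்கம் நன்றி" },
]);
console.log(store.size()); // 3
```

A consequence worth keeping in mind: re-ingesting a corrected document under the same ID is enough to replace it, with no separate delete step.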
ltd.add(doc)
Adds a single document after the initial ingest. Terms that are completely new to the vocabulary are silently ignored for retrieval scoring.
ltd.add({ id: "q4", text: "additional context for gotra queries", metadata: { source: "user" } });
ltd.call(input, targetLang?, topK?): LTDResponse
Runs the full NLP pipeline: normalise → detect script and language → retrieve candidates → classify emotion and tone.
const res = ltd.call("guruji ka ashirwad chahiye");
// res.tone === "REVERENTIAL"
// res.emotion === "REVERENCE"
// res.lang === "hi"
const res2 = ltd.call("meri samasya urgent hai!!", undefined, 3);
// res2.tone === "URGENT"
Response shape:
interface LTDResponse {
  input: string; // normalised input text
  lang: LangCode | null; // detected language ("hi", "te", etc.) or null for mixed script
  script: Script; // dominant script ("Devanagari", "Tamil", etc.)
  emotion: Emotion; // emotional register
  tone: Tone; // formality register
  candidates: LTDCandidate[]; // ranked retrieval results
  confidence: number; // top candidate score [0, 1]
}
ltd.suggest(input, maxResults?): LexiconSuggestion[]
Searches the built-in 2200-entry lexicon for entries whose text or romanised form starts with the given prefix. Works with Devanagari, IAST-romanised input, and plain ASCII.
// Devanagari prefix
ltd.suggest("नमस्ते");
// → [{ entry: { text: "नमस्ते", lang: "hi", romanized: "namaste", gloss: "hello", category: "salutation" }, matchedPrefix: "नमस्ते" }]
// IAST romanised prefix
ltd.suggest("dādā", 3);
// → entries for paternal-grandfather across languages
// Geography
ltd.suggest("gaṃgā");
// → Ganga entries in Hindi, Sanskrit, etc.
// Returns empty array for no match (never null)
ltd.suggest("xyz_no_match"); // → []
| Parameter | Type | Default | Description |
|---|---|---|---|
| input | string | — | Devanagari, IAST-romanised, or plain ASCII prefix |
| maxResults | number | 10 | Maximum suggestions returned |
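
Prefix matching over both the native text and the romanised form can be sketched as below. This is an illustration under stated assumptions, not the package's code: the three sample entries, `fold`, and `suggestSketch` are hypothetical, and folding diacritics so that plain ASCII input matches IAST is one plausible way to support all three input styles.

```typescript
// Hypothetical, cut-down entry shape and sample data for illustration only.
interface SketchEntry {
  text: string;
  romanized: string;
  gloss: string;
}

const ENTRIES: SketchEntry[] = [
  { text: "नमस्ते", romanized: "namaste", gloss: "hello" },
  { text: "नमस्कार", romanized: "namaskāra", gloss: "greetings" },
  { text: "गंगा", romanized: "gaṃgā", gloss: "Ganga" },
];

// Lower-case and strip Latin combining diacritics so "ganga" matches "gaṃgā".
// Devanagari code points are outside the stripped range and pass through unchanged.
function fold(s: string): string {
  return s.normalize("NFD").replace(/[\u0300-\u036f]/g, "").toLowerCase();
}

function suggestSketch(prefix: string, maxResults = 10): SketchEntry[] {
  const p = fold(prefix);
  if (p.length === 0) return [];
  return ENTRIES
    .filter((e) => fold(e.text).startsWith(p) || fold(e.romanized).startsWith(p))
    .slice(0, maxResults);
}

console.log(suggestSketch("ganga").map((e) => e.gloss)); // [ 'Ganga' ]
console.log(suggestSketch("xyz_no_match")); // []
```

A linear scan is enough for a few thousand entries; a sorted array with binary search or a trie would be the usual next step for larger lexicons.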
ltd.feedback(id, signal)
Adjusts a document's retrieval weight based on whether the result was helpful.
ltd.feedback("q1", "positive"); // promote: weight increases
ltd.feedback("q1", "negative"); // demote: weight decreases
ltd.feedback("q1", "neutral"); // no change
ltd.export() / ltd.import(state)
Persist and restore the full engine state as plain JSON. Because the snapshot is a JSON-serialisable object, it can be stored in any database or file that accepts JSON.
// Save to MongoDB
const snapshot = ltd.export();
await db.collection("brain").replaceOne({ _id: "v1" }, { _id: "v1", ...snapshot }, { upsert: true });
// Restore in a new process
const saved = await db.collection("brain").findOne({ _id: "v1" });
const ltd2 = new LTD();
if (saved) ltd2.import(saved);
ltd.reset()
Clears all documents and resets the engine to its initial empty state.
ltd.reset();
ltd.storeSize(); // → 0
ltd.storeSize() / ltd.hasDocument(id)
ltd.storeSize(); // number of indexed documents
ltd.hasDocument("q1"); // → true / false
Emotion Detection
detectEmotion(text) classifies text into one of six registers in strict priority order:
| Priority | Emotion | Example triggers |
|---|---|---|
| 1 | REVERENCE | namaste, pranam, jai, श्री, ওম, ੴ, ನಮಸ್ಕಾರ, हरे कृष्ण, राम राम, ॐ नमः शिवाय |
| 2 | JOY | good, great, धन्यवाद, நன்றி, ধন্যবাদ, 😊🎉 |
| 3 | GRIEF | died, मृत्यु, மரணம், మరణం, مرحوم, 😢💔 |
| 4 | ANGER | wrong, error, गलत, தவறு, ভুল, غلط, 😠😡 |
| 5 | CONFUSION | confused, समझ नहीं, புரியவில்லை, అర్థం కాలేదు, ?? |
| 6 | NEUTRAL | (fallback — no rule matched) |
import { detectEmotion } from "@mera-vansh/ms-ltd";
detectEmotion("नमस्ते, आपका स्वागत है"); // → "REVERENCE"
detectEmotion("हरे कृष्ण"); // → "REVERENCE"
detectEmotion("ॐ नमः शिवाय"); // → "REVERENCE"
detectEmotion("bahut acha kiya!"); // → "JOY"
detectEmotion("wrong answer!!"); // → "ANGER"
detectEmotion("what do you mean??"); // → "CONFUSION"
detectEmotion("my gotra is Bharadwaj"); // → "NEUTRAL"
When multiple emotion cues are present, the highest-priority one always wins: a message containing both a greeting and an angry phrase is classified as REVERENCE, not ANGER.
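
Strict priority ordering of this kind is typically implemented by walking an ordered rule list and returning on the first hit. The sketch below shows that pattern only; it is not the package's rule set. `RULES`, `classifyByPriority`, and the tiny trigger lists are hypothetical, and plain substring matching is an assumption made to keep the example short.

```typescript
type EmotionLabel = "REVERENCE" | "JOY" | "GRIEF" | "ANGER" | "CONFUSION" | "NEUTRAL";

interface PriorityRule {
  label: EmotionLabel;
  triggers: string[];
}

// Rules are listed highest priority first; the trigger lists are small illustrative subsets.
const RULES: PriorityRule[] = [
  { label: "REVERENCE", triggers: ["namaste", "pranam", "नमस्ते"] },
  { label: "JOY", triggers: ["great", "धन्यवाद"] },
  { label: "GRIEF", triggers: ["died", "मृत्यु"] },
  { label: "ANGER", triggers: ["wrong", "गलत"] },
  { label: "CONFUSION", triggers: ["confused", "??"] },
];

function classifyByPriority(text: string): EmotionLabel {
  const haystack = text.toLowerCase();
  for (const rule of RULES) {
    // The first rule with any matching trigger wins; later rules are never consulted.
    if (rule.triggers.some((t) => haystack.includes(t))) return rule.label;
  }
  return "NEUTRAL"; // fallback when no rule matched
}

console.log(classifyByPriority("namaste, this answer is wrong")); // REVERENCE
console.log(classifyByPriority("this answer is wrong")); // ANGER
```

Because the first match returns immediately, reordering the rule list is the only way to change which cue wins in a mixed message.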
Tone Detection
detectTone(text) classifies the formality or register of text in strict priority order:
| Priority | Tone | Example triggers |
|---|---|---|
| 1 | REVERENTIAL | param pujya, guruji, swami, माता श्री, ஐயா, అయ్యా |
| 2 | FORMAL | Dr., Mr., aap, आप, shriman |
| 3 | URGENT | abhi, jaldi, immediately, asap, right now, !! |
| 4 | CURIOUS | ?, why, how, what, kyun, kaise, batao |
| 5 | INFORMAL | tu, tum, yaar, dost, bhai, lol, haha |
| 6 | NEUTRAL | (fallback — no rule matched) |
import { detectTone } from "@mera-vansh/ms-ltd";
detectTone("Param Pujya Guruji ka ashirwad"); // → "REVERENTIAL"
detectTone("Dr. Sharma please help"); // → "FORMAL"
detectTone("abhi batao, urgent!"); // → "URGENT"
detectTone("gotra kya hota hai?"); // → "CURIOUS"
detectTone("yaar bata na"); // → "INFORMAL"
detectTone("my gotra is Bharadwaj"); // → "NEUTRAL"
// Highest priority wins when multiple rules fire
detectTone("aap ko pujya mata shri pranam");
// "aap" → FORMAL, "pujya mata shri" → REVERENTIAL → "REVERENTIAL"
Script & Language Detection
ScriptDetector.detectScript(text)
Identifies the dominant Unicode writing system in the input.
import { ScriptDetector } from "@mera-vansh/ms-ltd";
ScriptDetector.detectScript("नमस्ते"); // → "Devanagari"
ScriptDetector.detectScript("hello world"); // → "Latin"
ScriptDetector.detectScript("வணக்கம்"); // → "Tamil"
ScriptDetector.detectScript("hello नमस्ते"); // → "Mixed"
Returns one of: Devanagari | Tamil | Telugu | Bengali | Gujarati | Odia | Malayalam | Kannada | Gurmukhi | Arabic | Latin | Mixed | Unknown
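
Dominant-script detection is commonly done by counting code points per Unicode block. The sketch below covers three scripts only and is not the package's implementation: `SCRIPT_RANGES`, `countScripts`, `dominantScript`, and the rule that a second script holding at least a 10% share yields "Mixed" are all assumptions made for illustration.

```typescript
type SketchScript = "Devanagari" | "Tamil" | "Latin";

// Unicode block ranges for the three scripts covered by this sketch.
const SCRIPT_RANGES: { script: SketchScript; from: number; to: number }[] = [
  { script: "Devanagari", from: 0x0900, to: 0x097f },
  { script: "Tamil", from: 0x0b80, to: 0x0bff },
  { script: "Latin", from: 0x0041, to: 0x005a }, // A-Z
  { script: "Latin", from: 0x0061, to: 0x007a }, // a-z
];

// Count how many code points fall into each script; digits, spaces and punctuation are ignored.
function countScripts(text: string): Record<SketchScript, number> {
  const counts: Record<SketchScript, number> = { Devanagari: 0, Tamil: 0, Latin: 0 };
  for (const ch of Array.from(text)) {
    const cp = ch.codePointAt(0);
    if (cp === undefined) continue;
    const hit = SCRIPT_RANGES.find((r) => cp >= r.from && cp <= r.to);
    if (hit) counts[hit.script] += 1;
  }
  return counts;
}

function dominantScript(text: string, mixedShare = 0.1): SketchScript | "Mixed" | "Unknown" {
  const counts = countScripts(text);
  const scripts = Object.keys(counts) as SketchScript[];
  const total = scripts.reduce((sum, s) => sum + counts[s], 0);
  if (total === 0) return "Unknown";
  let top: SketchScript = "Latin";
  for (const s of scripts) {
    if (counts[s] > counts[top]) top = s;
  }
  // Assumption: more than one script at or above the share threshold counts as mixed.
  const present = scripts.filter((s) => counts[s] / total >= mixedShare);
  return present.length > 1 ? "Mixed" : top;
}

console.log(dominantScript("नमस्ते")); // Devanagari
console.log(dominantScript("hello नमस्ते")); // Mixed
```

Counting code points rather than UTF-16 units keeps the shares honest for scripts that rely on combining marks, since each mark is counted once.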
ScriptDetector.detectLanguage(text)
Narrows the script to a specific BCP-47 language code using whole-word keyword disambiguation. Correctly separates languages that share the same script (Hindi, Nepali, Marathi, Sanskrit; Bengali and Assamese; Urdu and Sindhi).
ScriptDetector.detectLanguage("मेरा गोत्र भारद्वाज है"); // → "hi"
ScriptDetector.detectLanguage("माझ्या घरी आहे"); // → "mr"
ScriptDetector.detectLanguage("मेरो नाम के छ"); // → "ne"
ScriptDetector.detectLanguage("भवति करोति"); // → "sa"
ScriptDetector.detectLanguage("মোৰ গোত্ৰ কি"); // → "as" (Assamese, not Bengali)
ScriptDetector.detectLanguage("hello मेरा"); // → null (Mixed script)
ScriptDetector.detectMixedScripts(text, threshold?)
Returns all scripts present above a share threshold (default 10%).
ScriptDetector.detectMixedScripts("hello नमस्ते world गोत्र");
// → ["Devanagari", "Latin"]
LEXICON
The built-in LEXICON contains ~2200 curated entries across 7 semantic categories covering 18 languages, with IAST romanisations and English glosses.
Categories
| Category | Description | Example entries |
|---|---|---|
| salutation | Greetings and farewells | नमस्ते, வணக்கம், నమస్కారం |
| kinship | Family relations (24 roles × 18 langs) | पिता, माता, दादा, नानी |
| emotion_rasa | The 9 Sanskrit rasas | SHRINGAR, KARUNA, VEERA, SHANTA… |
| geography | Sacred rivers, pilgrimage sites | गंगा (gaṃgā), काशी (kāśī) |
| literature | Epics and canonical texts | रामायण (rāmāyaṇa), महाभारत |
| time | Vikrama Samvat months, days, tithis | चैत्र (caitra), सोमवार |
| number | Numerals in Devanagari, Tamil, etc. | ०१२, ௦௧௨ |
import { LEXICON } from "@mera-vansh/ms-ltd";
// All salutations
const salutations = LEXICON.salutation;
// Hindi father
LEXICON.kinship.find(e => e.subcategory === "father" && e.lang === "hi");
// → { text: "पिता", romanized: "pitā", gloss: "father", ... }
// All 9 Sanskrit rasas
LEXICON.emotion_rasa
  .filter(e => e.lang === "sa")
  .map(e => e.subcategory);
// → ["SHRINGAR", "HASYA", "KARUNA", "RAUDRA", "VEERA", "BHAYANAK", "BIBHATSA", "ADBHUTA", "SHANTA"]
Grammar Tools
Transliterator
Converts IAST romanisation to 9 Indic scripts, and maps Devanagari characters to IAST.
import { Transliterator } from "@mera-vansh/ms-ltd";
// IAST → script
Transliterator.iastToScript("k", "Devanagari"); // → "क"
Transliterator.iastToScript("k", "Bengali"); // → "ক"
Transliterator.iastToScript("k", "Tamil"); // → "க"
Transliterator.iastToScript("k", "Telugu"); // → "క"
Transliterator.iastToScript("ā", "Devanagari"); // → "आ"
// Devanagari → IAST
Transliterator.devanagariToIast("क"); // → "k"
Transliterator.devanagariToIast("आ"); // → "ā"
// Check if a string contains IAST diacritics
Transliterator.isIAST("rāma"); // → true
Transliterator.isIAST("rama"); // → false
Supported target scripts: Devanagari | Bengali | Tamil | Telugu | Gujarati | Gurmukhi | Odia | Kannada | Malayalam
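
The examples above map single characters. For whole words, a transliterator also has to choose between independent vowel letters and dependent vowel signs, and to insert a virama between adjacent consonants. The sketch below shows one way to do that for Devanagari. It is a different, syllable-aware technique written for illustration, not the package's Transliterator: the tables are a small hypothetical subset, digraphs such as kh or ai are not handled, and a bare consonant comes out with a virama.

```typescript
// Hypothetical subset tables; a real mapping would cover the full IAST inventory.
const CONSONANTS: Record<string, string> = {
  k: "क", g: "ग", t: "त", n: "न", m: "म", r: "र", s: "स",
};
const VOWELS: Record<string, { letter: string; sign: string }> = {
  "a": { letter: "अ", sign: "" }, // inherent vowel: no sign after a consonant
  "ā": { letter: "आ", sign: "ा" },
  "i": { letter: "इ", sign: "ि" },
  "e": { letter: "ए", sign: "े" },
};
const MARKS: Record<string, string> = { "ḥ": "ः", "ṃ": "ं" };
const VIRAMA = "्";

function iastToDevanagariSketch(input: string): string {
  const chars = Array.from(input.normalize("NFC").toLowerCase());
  let out = "";
  let pendingConsonant = false; // true while the last consonant has no vowel yet
  for (const ch of chars) {
    const consonant = CONSONANTS[ch];
    const vowel = VOWELS[ch];
    if (consonant !== undefined) {
      if (pendingConsonant) out += VIRAMA; // consonant cluster: suppress the inherent vowel
      out += consonant;
      pendingConsonant = true;
    } else if (vowel !== undefined) {
      // After a consonant use the dependent sign, otherwise the independent letter.
      out += pendingConsonant ? vowel.sign : vowel.letter;
      pendingConsonant = false;
    } else {
      if (pendingConsonant) out += VIRAMA;
      out += MARKS[ch] ?? ch; // visarga, anusvara, or pass-through for anything unmapped
      pendingConsonant = false;
    }
  }
  if (pendingConsonant) out += VIRAMA; // word-final bare consonant
  return out;
}

console.log(iastToDevanagariSketch("namaste")); // नमस्ते
console.log(iastToDevanagariSketch("rāmaḥ")); // रामः
```

Tracking whether a consonant is still waiting for its vowel is the only state needed here, which keeps the function a single left-to-right pass.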
VibhaktiEngine
Sanskrit nominal inflection across all 8 vibhaktis (cases) and 3 vacanas (numbers) for 6 stem classes.
import { VibhaktiEngine } from "@mera-vansh/ms-ltd";
// Single form
VibhaktiEngine.inflect("rām", "a_m", 1, "sg");
// → { form: "rāmaḥ", vibhakti: 1, vacana: "sg", linga: "m", kāraka: "kartā" }
VibhaktiEngine.inflect("sīt", "aa_f", 3, "sg");
// → { form: "sītayā", vibhakti: 3, vacana: "sg", linga: "f", kāraka: "karaṇa" }
// Full 24-form paradigm (8 vibhaktis × 3 vacanas)
const paradigm = VibhaktiEngine.paradigm("rām", "a_m");
paradigm.length; // → 24
paradigm[0]!.form; // → "rāmaḥ" (Nominative singular)
paradigm[0]!.kāraka; // → "kartā"
Stem classes: a_m (masculine -a), aa_f (feminine -ā), i_m (masculine -i), ii_f (feminine -ī), u_m (masculine -u), cons (consonant-final)
Vibhakti numbers 1–8: Nominative, Accusative, Instrumental, Dative, Ablative, Genitive, Locative, Vocative
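
Table-driven inflection of this kind amounts to looking up an ending by case and number and appending it to the stem. The sketch below does this for masculine a-stems only, using the standard endings. It is not the package's VibhaktiEngine: `A_STEM_ENDINGS`, `inflectAStem`, and `paradigmAStem` are hypothetical names, and no sandhi is applied, so the retroflexion of n after r (as in rāmeṇa) is deliberately left out.

```typescript
type SketchVacana = "sg" | "du" | "pl";

// Endings for masculine a-stems; array index = vibhakti number - 1.
const A_STEM_ENDINGS: Record<SketchVacana, string>[] = [
  { sg: "aḥ", du: "au", pl: "āḥ" }, // 1 Nominative
  { sg: "am", du: "au", pl: "ān" }, // 2 Accusative
  { sg: "ena", du: "ābhyām", pl: "aiḥ" }, // 3 Instrumental
  { sg: "āya", du: "ābhyām", pl: "ebhyaḥ" }, // 4 Dative
  { sg: "āt", du: "ābhyām", pl: "ebhyaḥ" }, // 5 Ablative
  { sg: "asya", du: "ayoḥ", pl: "ānām" }, // 6 Genitive
  { sg: "e", du: "ayoḥ", pl: "eṣu" }, // 7 Locative
  { sg: "a", du: "au", pl: "āḥ" }, // 8 Vocative
];

function inflectAStem(stem: string, vibhakti: number, vacana: SketchVacana): string {
  const row = A_STEM_ENDINGS[vibhakti - 1];
  if (row === undefined) {
    throw new RangeError("vibhakti must be an integer between 1 and 8");
  }
  return stem + row[vacana];
}

// Full paradigm in the order 1/sg, 1/du, 1/pl, 2/sg, ... 8/pl.
function paradigmAStem(stem: string): string[] {
  const numbers: SketchVacana[] = ["sg", "du", "pl"];
  const forms: string[] = [];
  for (let v = 1; v <= 8; v++) {
    for (const n of numbers) {
      forms.push(inflectAStem(stem, v, n));
    }
  }
  return forms;
}

console.log(inflectAStem("dev", 3, "sg")); // devena
console.log(paradigmAStem("dev").length); // 24
```

Each additional stem class would be another ending table of the same shape, so the lookup function itself would not need to change.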
GenderAgreement
Adjective agreement and honorific pronoun detection for Hindi, Marathi, and Sanskrit.
import { GenderAgreement } from "@mera-vansh/ms-ltd";
// Hindi / Marathi adjective agreement
GenderAgreement.agreeAdjective("acchā", "m", "sg", "hi"); // → "acchā"
GenderAgreement.agreeAdjective("acchā", "f", "sg", "hi"); // → "acchī"
GenderAgreement.agreeAdjective("acchā", "m", "pl", "hi"); // → "acche"
// Sanskrit adjective agreement
GenderAgreement.agreeAdjective("sundarā", "m", "sg", "sa"); // → "sundaraḥ"
GenderAgreement.agreeAdjective("sundarā", "f", "sg", "sa"); // → "sundarā"
GenderAgreement.agreeAdjective("sundarā", "n", "sg", "sa"); // → "sundaram"
// Honorific pronoun detection
GenderAgreement.isHonorificPronoun("आप", "hi"); // → true
GenderAgreement.isHonorificPronoun("तपाईं", "ne"); // → true (Nepali)
GenderAgreement.isHonorificPronoun("आपण", "mr"); // → true (Marathi)
GenderAgreement.isHonorificPronoun("भवान्", "sa"); // → true (Sanskrit)
SovReorder
Converts English SVO sentence order to SOV — useful for building Hindi-style prompts from English training data.
import { SovReorder } from "@mera-vansh/ms-ltd";
SovReorder.reorder("Ram is studying"); // → "Ram studying is"
SovReorder.reorder("She is eating rice"); // → "She eating rice is"
SovReorder.reorder("They will go home"); // → "They go home will"
SovReorder.reorder("She is not eating"); // → "She eating is not"
// Non-Latin text is returned unchanged
SovReorder.reorder("मेरा नाम राम है"); // → "मेरा नाम राम है"
Application Examples
Multilingual FAQ bot
import { LTD } from "@mera-vansh/ms-ltd";
const bot = new LTD({ defaultTopK: 3 });
bot.ingest([
  {
    id: "faq-gotra-en",
    text: "What is gotra? gotra is a clan lineage system",
    metadata: { answer: "Gotra is a patrilineal clan system in Hindu tradition." },
  },
  {
    id: "faq-gotra-hi",
    text: "गोत्र क्या है गोत्र वंश परंपरा",
    metadata: { answer: "गोत्र एक पितृवंशीय कुल परंपरा है।" },
  },
  {
    id: "faq-nakshatra",
    text: "nakshatra birth star lunar mansion",
    metadata: { answer: "Nakshatra is the lunar mansion at the time of birth." },
  },
]);
function ask(userInput: string): string {
  const res = bot.call(userInput);
  if (res.confidence < 0.1) {
    return "Sorry, I don't have information on that yet.";
  }
  const top = res.candidates[0]!;
  bot.feedback(top.id, "positive");
  return top.metadata["answer"] as string;
}
ask("gotra kya hota hai?"); // → Hindi FAQ answer
ask("what is a nakshatra?"); // → English FAQ answer
Emotion-aware response routing
import { LTD } from "@mera-vansh/ms-ltd";
const ltd = new LTD();
ltd.ingest(myKnowledgeBase); // myKnowledgeBase: your own IngestDocument[] array
function handleMessage(userText: string) {
  const { emotion, tone, candidates, confidence } = ltd.call(userText);
  if (emotion === "GRIEF") {
    return { type: "condolence", message: "I'm so sorry for your loss." };
  }
  if (emotion === "REVERENCE") {
    return { type: "blessing", message: "Pranam. How may I assist you?" };
  }
  if (emotion === "CONFUSION") {
    return { type: "clarify", message: "Let me explain that more clearly." };
  }
  if (tone === "URGENT") {
    return { type: "priority", answer: candidates[0] };
  }
  return { type: "standard", answer: candidates[0], confidence };
}
Lexicon-powered autocomplete
import { LTD } from "@mera-vansh/ms-ltd";
const ltd = new LTD();
// Suggest as user types (Devanagari)
ltd.suggest("नमस्", 5).map(r => r.entry.text);
// → ["नमस्ते", "नमस्कार", ...]
// Suggest from IAST romanisation
ltd.suggest("rāmā", 3).map(r => ({
  text: r.entry.text,
  gloss: r.entry.gloss,
  lang: r.entry.lang,
}));
// → [{ text: "रामायण", gloss: "the Ramayana", lang: "hi" }, ...]
// Suggest kinship terms
ltd.suggest("dādā").map(r => r.entry.gloss);
// → ["paternal grandfather", ...]
Domain classifier
import { LTD } from "@mera-vansh/ms-ltd";
const classifier = new LTD();
classifier.ingest([
  { id: "astro-1", text: "nakshatra rashi horoscope kundali", metadata: { domain: "astrology" } },
  { id: "astro-2", text: "नक्षत्र राशि कुंडली ज्योतिष", metadata: { domain: "astrology" } },
  { id: "gotra-1", text: "gotra pravara rishi lineage clan", metadata: { domain: "gotra" } },
  { id: "gotra-2", text: "गोत्र प्रवर ऋषि वंश कुल", metadata: { domain: "gotra" } },
  { id: "ritl-1", text: "vivah puja samskara ritual ceremony", metadata: { domain: "ritual" } },
]);
function classify(userInput: string): string {
  const { candidates, confidence } = classifier.call(userInput);
  if (confidence < 0.15) return "unknown";
  // Cast to a possibly-undefined string so the ?? fallback is type-correct
  return (candidates[0]?.metadata["domain"] as string | undefined) ?? "unknown";
}
classify("my nakshatra is Rohini"); // → "astrology"
classify("bharadwaj gotra pravara"); // → "gotra"
classify("random unrelated words xyz"); // → "unknown"
Sanskrit inflection pipeline
import { VibhaktiEngine, GenderAgreement, Transliterator } from "@mera-vansh/ms-ltd";
// Inflect a noun and get its grammatical role
const nom = VibhaktiEngine.inflect("rām", "a_m", 1, "sg");
console.log(nom.form); // → "rāmaḥ"
console.log(nom.kāraka); // → "kartā" (nominative agent)
// Generate a full 24-form paradigm
VibhaktiEngine.paradigm("sīt", "aa_f")
.forEach(f => console.log(`${f.vibhakti}/${f.vacana}: ${f.form}`));
// Agree an adjective with its noun
GenderAgreement.agreeAdjective("sundarā", "f", "sg", "sa"); // → "sundarā"
// Transliterate to Devanagari
Transliterator.iastToScript("rāmaḥ", "Devanagari"); // → "रआमअः"
Script-aware input routing
import { ScriptDetector, detectTone } from "@mera-vansh/ms-ltd";
function classifyInput(text: string) {
  return {
    script: ScriptDetector.detectScript(text),
    lang: ScriptDetector.detectLanguage(text),
    tone: detectTone(text),
  };
}
classifyInput("aap ka gotra kya hai?");
// → { script: "Latin", lang: "en", tone: "FORMAL" }
classifyInput("आप का गोत्र क्या है?");
// → { script: "Devanagari", lang: "hi", tone: "FORMAL" }
classifyInput("enna gotra?");
// → { script: "Latin", lang: null, tone: "CURIOUS" }
Persistence: MongoDB
import { LTD } from "@mera-vansh/ms-ltd";
import { MongoClient } from "mongodb";
const client = new MongoClient(process.env.MONGODB_URI!);
const col = client.db("myapp").collection("ltd_brain");
// Save
const ltd = new LTD();
ltd.ingest(myDocs); // myDocs: your own IngestDocument[] array
await col.replaceOne({ _id: "v1" }, { _id: "v1", ...ltd.export() }, { upsert: true });
// Load
const saved = await col.findOne({ _id: "v1" });
const ltd2 = new LTD();
if (saved) ltd2.import(saved);
Persistence: file system
import { writeFileSync, readFileSync } from "node:fs";
import { LTD } from "@mera-vansh/ms-ltd";
// Save (assumes `ltd` is an existing, populated LTD instance)
writeFileSync("brain.json", JSON.stringify(ltd.export(), null, 2));
// Load
const ltd2 = new LTD();
ltd2.import(JSON.parse(readFileSync("brain.json", "utf8")));
TypeScript Types
import type {
  // Core engine
  LTDOptions, // { defaultTopK?: number }
  LTDResponse, // { input, lang, script, emotion, tone, candidates, confidence }
  LTDCandidate, // { id, score, metadata }
  LTDState, // exported state shape
  // Documents
  IngestDocument, // { id, text, metadata? }
  FeedbackSignal, // "positive" | "negative" | "neutral"
  // Classification
  Emotion, // "REVERENCE" | "JOY" | "GRIEF" | "ANGER" | "CONFUSION" | "NEUTRAL"
  Tone, // "REVERENTIAL" | "FORMAL" | "URGENT" | "CURIOUS" | "INFORMAL" | "NEUTRAL"
  Script, // "Devanagari" | "Tamil" | ... | "Mixed" | "Unknown"
  LangCode, // "en" | "hi" | "mr" | "bn" | ... (18 codes)
  // Lexicon
  LexiconEntry, // { text, lang, romanized, gloss, category, subcategory? }
  LexiconCategory, // "salutation" | "kinship" | "emotion_rasa" | "geography" | "literature" | "time" | "number"
  LexiconSuggestion, // { entry: LexiconEntry, matchedPrefix: string }
  // Grammar
  Vibhakti, // 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
  Vacana, // "sg" | "du" | "pl"
  Linga, // "m" | "f" | "n"
  StemClass, // "a_m" | "aa_f" | "i_m" | "ii_f" | "u_m" | "cons"
  InflectedForm, // { form, vibhakti, vacana, linga, kāraka }
  TranslitScheme, // "Devanagari" | "Bengali" | "Tamil" | "Telugu" | ...
} from "@mera-vansh/ms-ltd";
License
GPL-3.0 © Mera Vansh — dwivna
