lbkm

v1.0.0

Published

11 days ago

Local Base Knowledge Model - A lightweight in-memory knowledge base with file persistence

aditya.patange

knowledge-base search nlp in-memory local vector semantic cli

LBKM - Local Base Knowledge Model

A lightweight, zero-dependency in-memory knowledge base with file persistence for Node.js. Features full-text search with BM25/TF-IDF ranking, automatic tokenization, and a simple CLI.

Features

Zero Dependencies - Pure Node.js, no external packages required
In-Memory Search - Fast BM25 and TF-IDF ranking algorithms
File Persistence - Automatic JSON file storage
Full-Text Search - Tokenization, stemming, stop word removal
TypeScript Support - Full type definitions included
CLI Interface - Use via npx lbkm or install globally
ES Modules - Modern ESM package

Installation

npm install lbkm

Or use directly with npx:

npx lbkm --help

CLI Usage

Quick Start

# Add knowledge
npx lbkm add -p "The sky is blue due to Rayleigh scattering of sunlight"
npx lbkm add -p "Water freezes at 0 degrees Celsius"
npx lbkm add -p "The Earth orbits the Sun in approximately 365 days"

# Query
npx lbkm -p "Why is the sky blue?"
npx lbkm -p "What is the freezing point?"

Try the Demo

# Load 40 sample documents (science, tech, history, general)
npx lbkm demo

# Query with different output modes
npx lbkm -p "black holes"              # Truncated output
npx lbkm -p "black holes" -e           # Expanded (full content)
npx lbkm -p "machine learning" -s      # Summary (grouped by relevance)

Commands

lbkm query -p "search query"     # Search the knowledge base
lbkm add -p "content"            # Add content
lbkm add -f ./file.txt           # Add content from file
lbkm list                        # List all documents
lbkm stats                       # Show statistics
lbkm clear                       # Clear all documents
lbkm demo                        # Load demo documents
lbkm interactive                 # Interactive mode

Options

-p, --prompt <text>    Query or content to add
-f, --file <path>      File to add as content
-n, --name <name>      Knowledge base name (default: 'default')
-d, --dir <path>       Storage directory (default: '.lbkm')
-l, --limit <num>      Max results (default: 10)
-m, --metadata <json>  JSON metadata for content
-e, --expand           Show full content (no truncation)
-s, --summary          Show AI-style summary of results
-j, --json             Output as JSON

Programmatic Usage

Basic Example

import { KnowledgeBase } from 'lbkm';

const kb = new KnowledgeBase();

// Add documents
kb.add('The quick brown fox jumps over the lazy dog');
kb.add('Machine learning is a subset of artificial intelligence');
kb.add('Node.js is a JavaScript runtime built on Chrome V8');

// Search
const results = kb.search('JavaScript runtime');
console.log(results);
// Example output (scores vary with the corpus and algorithm):
// [{ document: {...}, score: 0.85, matchedTerms: ['javascript', 'runtim'] }]

With Persistence

import { PersistentKnowledgeBase } from 'lbkm';

const kb = new PersistentKnowledgeBase({
  name: 'my-kb',
  storagePath: './data'
});

// Load existing data
await kb.load();

// Add documents (auto-saved)
await kb.add('Important information here', { source: 'manual' });

// Search
const results = await kb.search('important');

// Clean up
await kb.close();

Advanced Options

import { KnowledgeBase } from 'lbkm';

const kb = new KnowledgeBase({
  algorithm: 'bm25',      // 'bm25' or 'tfidf'
  stemming: true,         // Apply word stemming
  removeStopWords: true,  // Remove common words
  caseSensitive: false    // Case-insensitive matching
});

// Batch add
kb.addBatch([
  { content: 'Document 1', metadata: { category: 'tech' } },
  { content: 'Document 2', metadata: { category: 'science' } }
]);

// Search with options
const results = kb.search('query', {
  limit: 5,
  minScore: 0.1
});

// Export/Import for custom storage
const state = kb.export();
// ... save state to database, etc.
kb.import(state);
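
As a rough sketch of the "custom storage" idea above, the snippet below writes the exported state to a JSON file of your choosing and reads it back. It assumes that the value returned by `export()` is JSON-serializable, which the README implies but does not state outright. `saveState` and `loadState` are hypothetical helper names, not part of lbkm.

```javascript
import { readFile, rename, writeFile } from 'node:fs/promises';

// Hypothetical helper: persist whatever kb.export() returns as JSON.
// The data is written to a temporary file first and then renamed over the
// target, so a crash mid-write does not leave a half-written file behind.
async function saveState(kb, filePath) {
  const tmpPath = `${filePath}.tmp`;
  await writeFile(tmpPath, JSON.stringify(kb.export()), 'utf8');
  await rename(tmpPath, filePath);
}

// Hypothetical helper: read the JSON back and hand it to kb.import().
async function loadState(kb, filePath) {
  const state = JSON.parse(await readFile(filePath, 'utf8'));
  kb.import(state);
}
```

Any object with `export()` and `import(state)` methods works here, so the same pair of helpers could be pointed at a database client instead of the file system.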

Using Utilities

import { tokenizer, vector } from 'lbkm';

// Tokenization
const tokens = tokenizer.process('Hello World!');
// ['hello', 'world']

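// Term frequency of the tokens above
const tf = vector.termFrequency(tokens);

// Cosine similarity between two term vectors
// (vecA and vecB stand for vectors you have already built)
const similarity = vector.cosineSimilarity(vecA, vecB);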
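
To make the two vector utilities concrete, here is a minimal standalone sketch of what term frequency and cosine similarity compute. These functions are illustrative stand-ins written for this example. They are not lbkm's own code, and the library's actual vector representation may differ (a `Map` of term to weight is an assumption).

```javascript
// Illustrative stand-ins, not lbkm's implementations.

// Count how often each token occurs, as a Map of term -> count.
function termFrequency(tokens) {
  const counts = new Map();
  for (const token of tokens) {
    counts.set(token, (counts.get(token) ?? 0) + 1);
  }
  return counts;
}

// Cosine similarity between two sparse vectors stored as Map(term -> weight).
// Returns 0 when either vector is empty, to avoid dividing by zero.
function cosineSimilarity(a, b) {
  let dot = 0;
  for (const [term, weight] of a) {
    dot += weight * (b.get(term) ?? 0);
  }
  const norm = (v) => Math.sqrt([...v.values()].reduce((sum, w) => sum + w * w, 0));
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

const vecA = termFrequency(['node', 'runtime', 'node']);
const vecB = termFrequency(['node', 'runtime']);
console.log(cosineSimilarity(vecA, vecB));
```

The result lies between 0 (no shared terms) and 1 (same direction) for non-negative weights such as raw counts.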

API Reference

KnowledgeBase

| Method | Description |
|--------|-------------|
| add(content, metadata?) | Add a document |
| addBatch(documents) | Add multiple documents |
| get(id) | Get document by ID |
| remove(id) | Remove a document |
| search(query, options?) | Search documents |
| all() | Get all documents |
| clear() | Remove all documents |
| export() | Export state for persistence |
| import(state) | Import previously exported state |
| stats() | Get statistics |

PersistentKnowledgeBase

Extends KnowledgeBase with automatic file persistence:

| Method | Description |
|--------|-------------|
| load() | Load from storage |
| save() | Save to storage |
| delete() | Delete from storage |
| close() | Flush and clean up |

FileStorage

Low-level file storage:

| Method | Description |
|--------|-------------|
| save(key, data) | Save JSON data |
| load(key) | Load JSON data |
| delete(key) | Delete data |
| list() | List all keys |
| clear() | Delete all data |

How It Works

LBKM uses an inverted index for fast full-text search:

Tokenization - Text is split into words, normalized, and optionally stemmed
Indexing - Terms are mapped to document IDs in an inverted index
Scoring - BM25 or TF-IDF algorithms rank documents by relevance
Persistence - State is serialized to JSON files with atomic writes
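
The first three steps above can be sketched in a few dozen lines. This is a toy pipeline under stated assumptions, not LBKM's internals: the stop word list is a tiny made-up sample, stemming is skipped, and the BM25 parameters `k1 = 1.2` and `b = 0.75` are common textbook defaults rather than values confirmed by the package.

```javascript
// A toy pipeline: tokenize -> inverted index -> BM25. Not lbkm's internals.
const STOP_WORDS = new Set(['a', 'an', 'the', 'is', 'of', 'on', 'in', 'to']);

// Lowercase, split on non-alphanumerics, drop stop words (no stemming here).
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t && !STOP_WORDS.has(t));
}

// Build term -> Map(docId -> term count), plus per-document token lengths.
function buildIndex(docs) {
  const postings = new Map();
  const lengths = [];
  docs.forEach((text, docId) => {
    const tokens = tokenize(text);
    lengths[docId] = tokens.length;
    for (const token of tokens) {
      if (!postings.has(token)) postings.set(token, new Map());
      const docCounts = postings.get(token);
      docCounts.set(docId, (docCounts.get(docId) ?? 0) + 1);
    }
  });
  const avgLength = lengths.reduce((a, b) => a + b, 0) / (lengths.length || 1);
  return { postings, lengths, avgLength };
}

// Rank documents with Okapi BM25 (assumed defaults: k1 = 1.2, b = 0.75).
function search(index, query, k1 = 1.2, b = 0.75) {
  const { postings, lengths, avgLength } = index;
  const scores = new Map();
  for (const term of new Set(tokenize(query))) {
    const docCounts = postings.get(term);
    if (!docCounts) continue; // term appears in no document
    const idf = Math.log(1 + (lengths.length - docCounts.size + 0.5) / (docCounts.size + 0.5));
    for (const [docId, tf] of docCounts) {
      const norm = tf + k1 * (1 - b + (b * lengths[docId]) / avgLength);
      scores.set(docId, (scores.get(docId) ?? 0) + (idf * tf * (k1 + 1)) / norm);
    }
  }
  return [...scores.entries()]
    .map(([docId, score]) => ({ docId, score }))
    .sort((x, y) => y.score - x.score);
}

const index = buildIndex([
  'The quick brown fox jumps over the lazy dog',
  'Machine learning is a subset of artificial intelligence',
  'Node.js is a JavaScript runtime built on Chrome V8',
]);
console.log(search(index, 'JavaScript runtime'));
```

Because the index maps each term straight to the documents containing it, a query only touches the postings for its own terms instead of scanning every document.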

Requirements

Node.js >= 20.0.0

License

MIT