# @mostlylucid/docsummarizer
A local-first RAG engine for documents: semantic segmentation, embeddings, salience-aware retrieval, and citation-grounded Q&A — without requiring cloud APIs.
```ts
import { DocSummarizer } from "@mostlylucid/docsummarizer";

const doc = new DocSummarizer();

const { summary } = await doc.summarizeFile("report.pdf");
const { answer, evidence } = await doc.askFile("report.pdf", "What are the key findings?");
```

## Install
```bash
npm install @mostlylucid/docsummarizer
```

The CLI binary is automatically downloaded during installation. Verify it works:
```bash
npx @mostlylucid/docsummarizer check
```

## What It Does
- RAG Q&A — Ask questions with source citations (single doc or folders)
- Salience-aware retrieval — Better chunks, less noise
- Deterministic ingestion — Same document = same segments = reproducible results
- Local-first embeddings — ONNX runtime, no API keys required
- Composable storage — Built-in stores or export embeddings to your own
## How It Works

1. Segment — Splits documents into semantic units (sentences, headings, lists, code blocks)
2. Embed — Generates 384-dim vectors using BERT (runs locally via ONNX)
3. Score — Computes salience scores based on position, structure, and content
4. Retrieve — Finds relevant segments using cosine similarity (sketched below)
5. Synthesize — Optionally uses an LLM to generate coherent answers
No API keys required for basic usage. Add Ollama for LLM-enhanced synthesis.
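For intuition, the retrieval step (4) is just ranking segment vectors by cosine similarity against a query vector; the library additionally folds salience scores into the ranking. Here is a minimal sketch of that ranking, not the library's internals: the `Segment` shape and pre-computed vectors are stand-ins for illustration.

```ts
// Illustrative only: the ranking behind step 4, not the library's internals.
interface Segment {
  id: string;
  text: string;
  vector: number[]; // 384-dim embedding
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function retrieve(query: number[], segments: Segment[], topK = 10): Segment[] {
  // Rank every segment by similarity to the query, highest first.
  return [...segments]
    .sort((s, t) => cosine(t.vector, query) - cosine(s.vector, query))
    .slice(0, topK);
}
```

At 384 dimensions and single-document scale, a brute-force scan like this is typically fast enough, which is one reason no external vector database is required.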
## API

### Summarize
```ts
// File (supports .md, .pdf, .docx, .txt)
const result = await doc.summarizeFile("./document.md");

// URL
const result = await doc.summarizeUrl("https://example.com/article");

// Raw markdown
const result = await doc.summarizeMarkdown("# My Doc\n\nContent here...");

// With options
const result = await doc.summarizeFile("./doc.md", {
  query: "focus on security concerns", // Focus summary on specific topic
  mode: "BertRag", // Summarization mode
});

console.log(result.summary);   // The summary text
console.log(result.wordCount); // Word count
console.log(result.topics);    // Extracted topics with sources
```

### Question Answering
```ts
const result = await doc.askFile("./contract.pdf", "What are the payment terms?");

console.log(result.answer);     // The answer
console.log(result.confidence); // "High" | "Medium" | "Low"
console.log(result.evidence);   // Source segments with similarity scores
```

### Semantic Search
Search finds relevant segments within a single document (not a global index).
```ts
const result = await doc.search("./document.md", "machine learning", {
  topK: 5, // Max results (default: 10)
});

result.results.forEach(r => {
  console.log(`[${r.score.toFixed(2)}] ${r.section}: ${r.preview}`);
});
```

### Diagnostics
```ts
// Quick check
const ok = await doc.check();

// Detailed diagnostics
const info = await doc.diagnose();
console.log(info.available);      // true/false
console.log(info.executablePath); // Resolved CLI path
console.log(info.output);         // Raw diagnostic output
```

## CLI Usage
The package includes a CLI passthrough:
```bash
# Check installation
npx @mostlylucid/docsummarizer check

# Run diagnostics
npx @mostlylucid/docsummarizer doctor

# Summarize (JSON output)
npx @mostlylucid/docsummarizer tool --file doc.md

# Search
npx @mostlylucid/docsummarizer search --file doc.md --query "topic" --json

# All CLI commands
npx @mostlylucid/docsummarizer --help
```

## Modes
| Mode | Description | Requires LLM |
|------|-------------|--------------|
| Auto | Auto-select based on document | Maybe |
| BertRag | BERT embeddings + retrieval | No |
| Bert | Pure extractive (BERT only) | No |
| BertHybrid | BERT + LLM synthesis | Yes |
| Rag | Full RAG pipeline | Yes |
```ts
// No LLM needed
await doc.summarizeFile("doc.md", { mode: "BertRag" });

// LLM-enhanced (requires Ollama)
await doc.summarizeFile("doc.md", { mode: "BertHybrid" });
```

## Configuration
### Constructor Options
```ts
const doc = new DocSummarizer({
  executable: "/path/to/docsummarizer", // Custom CLI path
  configPath: "./config.json",          // Config file
  model: "llama3.2",                    // Ollama model
  timeout: 300000,                      // Timeout (ms)
});
```

### Config File
Create `docsummarizer.json`:

```json
{
  "ollama": {
    "baseUrl": "http://localhost:11434",
    "model": "llama3.2"
  },
  "onnx": {
    "embeddingModel": "AllMiniLmL6V2"
  }
}
```

### Environment Variables
```bash
DOCSUMMARIZER_PATH=/path/to/cli            # Direct path to CLI
DOCSUMMARIZER_PROJECT=/path/to/proj.csproj # Use dotnet run (dev mode)
```

## Response Types
### SummaryResult

```ts
interface SummaryResult {
  schemaVersion: string; // "1.0.0"
  success: true;
  source: string;
  type: "summary";
  summary: string;
  wordCount: number;
  topics: Array<{
    topic: string;
    summary: string;
    sourceChunks: string[];
  }>;
  entities?: {
    people?: string[];
    organizations?: string[];
    locations?: string[];
    dates?: string[];
  };
  metadata: {
    documentId: string;
    totalChunks: number;
    chunksProcessed: number;
    coverageScore: number; // 0-1
    processingTimeMs: number;
    mode: string;
    model: string;
  };
}
```
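As an example of consuming this shape, the following walks `topics` and `metadata` (using only fields from the interface above; `doc` is a `DocSummarizer` instance):

```ts
const result = await doc.summarizeFile("./doc.md");

console.log(result.summary);
console.log(`Coverage: ${(result.metadata.coverageScore * 100).toFixed(0)}%`);

// Each topic carries its own mini-summary plus the chunks it was drawn from.
for (const t of result.topics) {
  console.log(`- ${t.topic}: ${t.summary} (${t.sourceChunks.length} source chunks)`);
}
```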
### QAResult

```ts
interface QAResult {
  schemaVersion: string;
  success: true;
  source: string;
  type: "qa";
  question: string;
  answer: string;
  confidence: "High" | "Medium" | "Low";
  evidence: Array<{
    segmentId: string;
    text: string;
    similarity: number; // 0-1
    section: string;
  }>;
  metadata: {
    processingTimeMs: number;
    model: string;
  };
}
```
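A common pattern is rendering `evidence` as numbered citations beneath the answer; this sketch uses only the fields defined above:

```ts
const qa = await doc.askFile("./contract.pdf", "What are the payment terms?");

console.log(`${qa.answer} (confidence: ${qa.confidence})`);

// Print each supporting segment as a numbered, similarity-scored citation.
qa.evidence.forEach((e, i) => {
  console.log(`[${i + 1}] ${e.section} (similarity ${e.similarity.toFixed(2)})`);
  console.log(`    ${e.text}`);
});
```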
### SearchResult

```ts
interface SearchResult {
  schemaVersion: string;
  query: string;
  results: Array<{
    section: string;
    score: number; // 0-1 (cosine similarity)
    preview: string;
  }>;
}
```

## Error Handling
```ts
import { DocSummarizer, DocSummarizerError } from "@mostlylucid/docsummarizer";

try {
  await doc.summarizeFile("missing.md");
} catch (err) {
  if (err instanceof DocSummarizerError) {
    console.error(err.message); // Error description
    console.error(err.output);  // Raw CLI output
  }
}
```

## Events
```ts
doc.on("stderr", (data: string) => {
  console.log("Progress:", data);
});
```

## Requirements
- Node.js 18+
- .NET 8+ runtime
- Ollama (optional, for LLM modes)
## Troubleshooting

### CLI not found

```bash
# Run diagnostics
npx @mostlylucid/docsummarizer doctor

# Re-run postinstall to download CLI
node node_modules/@mostlylucid/docsummarizer/scripts/postinstall.js

# Verify
npx @mostlylucid/docsummarizer check
```

### PDF/DOCX not working
PDF and DOCX conversion requires the Docling service. See CLI docs.
### Slow first run
First run downloads ONNX models (~50MB). Subsequent runs are fast.
## Why This Isn't a RAG Framework
- No agent orchestration — You control the flow
- No opinionated prompts — Bring your own LLM prompts
- No cloud dependency — Runs entirely local
## Why It Still Is RAG
This is a RAG engine, not a RAG framework. It handles:
- Semantic chunking with structure preservation
- Vector embeddings (local ONNX)
- Similarity-based retrieval with salience scoring
- Citation-grounded answers
What it doesn't handle (by design): conversational memory, agent loops, eval dashboards. Those are layers you add if you need them.
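For example, conversational memory can be a thin layer over `askFile`. This sketch is a hypothetical helper, not part of the package; it simply folds prior turns into each question:

```ts
// Hypothetical helper: the package itself is stateless by design.
const history: string[] = [];

async function askWithMemory(file: string, question: string) {
  // Prepend earlier turns so retrieval sees the conversation context.
  const contextual = [...history, question].join("\n");
  const result = await doc.askFile(file, contextual);
  history.push(`Q: ${question}`, `A: ${result.answer}`);
  return result;
}
```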
## License
MIT
