
treedex

v0.1.4


TreeDex

Tree-based, vectorless document RAG framework.

Index any document into a navigable tree structure, then retrieve relevant sections using any LLM. No vector databases, no embeddings — just structured tree retrieval.

Available for both Python and Node.js — same API, same index format, fully cross-compatible.

License: MIT · Python 3.10+ · Node 18+


How It Works

  1. Load — Extract pages from any supported format
  2. Index — LLM analyzes page groups and extracts hierarchical structure
  3. Build — Flat sections become a tree with page ranges and embedded text
  4. Query — LLM selects relevant tree nodes for your question
  5. Return — Get context text, source pages, and reasoning
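
The idea behind steps 2–5 can be sketched without the library. In the toy version below, a document tree is hand-built and a simple keyword match stands in for the LLM's node selection; all names are illustrative, not TreeDex's API:

```python
# Toy sketch of tree-based retrieval (illustrative only, not TreeDex's API).
# A document becomes a tree of sections; retrieval walks the tree and picks
# relevant nodes. A keyword match stands in for the LLM's selection here.

tree = {
    "title": "Document",
    "pages": (1, 14),
    "children": [
        {"title": "Maxwell's Equations", "pages": (1, 5), "children": []},
        {"title": "Electromagnetic Spectrum", "pages": (6, 14), "children": []},
    ],
}

def select_nodes(node, query_terms):
    """Collect nodes whose title mentions any query term."""
    hits = []
    if any(t.lower() in node["title"].lower() for t in query_terms):
        hits.append(node)
    for child in node.get("children", []):
        hits.extend(select_nodes(child, query_terms))
    return hits

selected = select_nodes(tree, ["spectrum"])
print([n["title"] for n in selected])  # titles of the selected sections
print([n["pages"] for n in selected])  # their exact source page ranges
```

Because every node carries its page range, the retriever can always report exactly where the returned context came from.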

Why TreeDex instead of a Vector DB?

No embedding model to run, no vector database to host, and exact page attribution. See the feature comparison under Benchmarks below.

Supported LLM Providers

TreeDex works with every major AI provider out of the box. Pick what works for you:

One-liner backends (zero config)

| Backend | Provider | Default Model | Python Deps | Node.js Deps |
|---------|----------|---------------|-------------|--------------|
| GeminiLLM | Google | gemini-2.0-flash | google-generativeai | @google/generative-ai |
| OpenAILLM | OpenAI | gpt-4o | openai | openai |
| ClaudeLLM | Anthropic | claude-sonnet-4-20250514 | anthropic | @anthropic-ai/sdk |
| MistralLLM | Mistral AI | mistral-large-latest | mistralai | @mistralai/mistralai |
| CohereLLM | Cohere | command-r-plus | cohere | cohere-ai |
| GroqLLM | Groq | llama-3.3-70b-versatile | groq | groq-sdk |
| TogetherLLM | Together AI | Llama-3-70b-chat-hf | None | None (fetch) |
| FireworksLLM | Fireworks | llama-v3p1-70b-instruct | None | None (fetch) |
| OpenRouterLLM | OpenRouter | claude-sonnet-4 | None | None (fetch) |
| DeepSeekLLM | DeepSeek | deepseek-chat | None | None (fetch) |
| CerebrasLLM | Cerebras | llama-3.3-70b | None | None (fetch) |
| SambanovaLLM | SambaNova | Llama-3.1-70B-Instruct | None | None (fetch) |
| HuggingFaceLLM | HuggingFace | Mistral-7B-Instruct | None | None (fetch) |
| OllamaLLM | Ollama (local) | llama3 | None | None (fetch) |

Universal backends

| Backend | Use case | Dependencies |
|---------|----------|--------------|
| OpenAICompatibleLLM | Any OpenAI-compatible endpoint (URL + key) | None |
| LiteLLM | 100+ providers via litellm library (Python only) | litellm |
| FunctionLLM | Wrap any function | None |
| BaseLLM | Subclass to build your own | None |


Quick Start

Install

```bash
# Python
pip install treedex

# With optional LLM SDK
pip install treedex[gemini]
pip install treedex[openai]
pip install treedex[claude]
pip install treedex[all]
```

```bash
# Node.js
npm install treedex

# With optional LLM SDK
npm install treedex openai
npm install treedex @google/generative-ai
npm install treedex @anthropic-ai/sdk
```

Pick your LLM and go

```python
from treedex import TreeDex, GeminiLLM

llm = GeminiLLM(api_key="YOUR_KEY")

index = TreeDex.from_file("doc.pdf", llm=llm)
result = index.query("What is the main argument?")

print(result.context)
print(result.pages_str)  # "pages 5-8, 12-15"
```

```typescript
import { TreeDex, GeminiLLM } from "treedex";

const llm = new GeminiLLM("YOUR_KEY");

const index = await TreeDex.fromFile("doc.pdf", llm);
const result = await index.query("What is the main argument?");

console.log(result.context);
console.log(result.pagesStr);  // "pages 5-8, 12-15"
```

All providers work the same way

```python
from treedex import *

# Google Gemini
llm = GeminiLLM(api_key="YOUR_KEY")

# OpenAI
llm = OpenAILLM(api_key="sk-...")

# Claude
llm = ClaudeLLM(api_key="sk-ant-...")

# Groq (fast inference)
llm = GroqLLM(api_key="gsk_...")

# Together AI
llm = TogetherLLM(api_key="...")

# DeepSeek
llm = DeepSeekLLM(api_key="...")

# OpenRouter (access any model)
llm = OpenRouterLLM(api_key="...")

# Local Ollama
llm = OllamaLLM(model="llama3")

# Any OpenAI-compatible endpoint
llm = OpenAICompatibleLLM(
    base_url="https://your-api.com/v1",
    api_key="...",
    model="model-name",
)
```
```typescript
import { /* any backend */ } from "treedex";

// Google Gemini
const llm = new GeminiLLM("YOUR_KEY");

// OpenAI
const llm = new OpenAILLM("sk-...");

// Claude
const llm = new ClaudeLLM("sk-ant-...");

// Groq (fast inference)
const llm = new GroqLLM("gsk_...");

// Together AI
const llm = new TogetherLLM("...");

// DeepSeek
const llm = new DeepSeekLLM("...");

// OpenRouter (access any model)
const llm = new OpenRouterLLM("...");

// Local Ollama
const llm = new OllamaLLM("llama3");

// Any OpenAI-compatible endpoint
const llm = new OpenAICompatibleLLM({
  baseUrl: "https://your-api.com/v1",
  apiKey: "...",
  model: "model-name",
});
```

Wrap any function

```python
from treedex import FunctionLLM

llm = FunctionLLM(lambda p: my_api(p))
```

```typescript
import { FunctionLLM } from "treedex";

const llm = new FunctionLLM((p) => myApi(p));
```

Build your own backend

```python
from treedex import BaseLLM

class MyLLM(BaseLLM):
    def generate(self, prompt: str) -> str:
        return my_api_call(prompt)
```

```typescript
import { BaseLLM } from "treedex";

class MyLLM extends BaseLLM {
  async generate(prompt: string): Promise<string> {
    return await myApiCall(prompt);
  }
}
```

Agentic RAG — get direct answers

Standard mode returns raw context. Agentic mode goes one step further — it retrieves the relevant sections, then generates a direct answer.

```python
# Standard: returns context + page ranges
result = index.query("What is X?")
print(result.context)

# Agentic: returns a direct answer
result = index.query("What is X?", agentic=True)
print(result.answer)     # LLM-generated answer
print(result.pages_str)  # source pages
```

```typescript
// Standard: returns context + page ranges
const result = await index.query("What is X?");
console.log(result.context);

// Agentic: returns a direct answer
const result = await index.query("What is X?", { agentic: true });
console.log(result.answer);    // LLM-generated answer
console.log(result.pagesStr);  // source pages
```

Swap LLM at query time

```python
# Build index with one LLM
index = TreeDex.from_file("doc.pdf", llm=gemini_llm)

# Query with a different one — same index, different brain
result = index.query("...", llm=groq_llm)
```

Save and load indexes

Indexes are saved as JSON. An index created in Python loads in Node.js and vice versa.

```python
# Save
index.save("my_index.json")

# Load
index = TreeDex.load("my_index.json", llm=llm)
```

```typescript
// Save
await index.save("my_index.json");

// Load
const index = await TreeDex.load("my_index.json", llm);
```

Supported Document Formats

| Format | Loader | Python Deps | Node.js Deps |
|--------|--------|-------------|--------------|
| PDF | PDFLoader | pymupdf | pdfjs-dist (included) |
| TXT / MD | TextLoader | None | None |
| HTML | HTMLLoader | None (stdlib) | htmlparser2 (optional, has fallback) |
| DOCX | DOCXLoader | python-docx | mammoth (optional) |

Use auto_loader(path) / autoLoader(path) for automatic format detection.
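
As an illustration of what extension-based detection could look like, here is a sketch only: the `pick_loader` helper and its dispatch table are hypothetical, not TreeDex internals, though the loader names come from the table above.

```python
from pathlib import Path

# Hypothetical dispatch table mapping file extensions to the loaders
# documented above (TreeDex's actual detection logic may differ).
LOADERS = {
    ".pdf": "PDFLoader",
    ".txt": "TextLoader",
    ".md": "TextLoader",
    ".html": "HTMLLoader",
    ".docx": "DOCXLoader",
}

def pick_loader(path: str) -> str:
    """Return the loader name for a file, based on its extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in LOADERS:
        raise ValueError(f"Unsupported format: {suffix}")
    return LOADERS[suffix]

print(pick_loader("notes.md"))  # TextLoader
```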


API Reference

TreeDex

| Method | Python | Node.js |
|--------|--------|---------|
| Build from file | TreeDex.from_file(path, llm) | await TreeDex.fromFile(path, llm) |
| Build from pages | TreeDex.from_pages(pages, llm) | await TreeDex.fromPages(pages, llm) |
| Create from tree | TreeDex.from_tree(tree, pages) | TreeDex.fromTree(tree, pages) |
| Query | index.query(question) | await index.query(question) |
| Agentic query | index.query(q, agentic=True) | await index.query(q, { agentic: true }) |
| Save | index.save(path) | await index.save(path) |
| Load | TreeDex.load(path, llm) | await TreeDex.load(path, llm) |
| Show tree | index.show_tree() | index.showTree() |
| Stats | index.stats() | index.stats() |
| Find large | index.find_large_sections() | index.findLargeSections() |

QueryResult

| Property | Python | Node.js | Description |
|----------|--------|---------|-------------|
| Context | .context | .context | Concatenated text from relevant sections |
| Node IDs | .node_ids | .nodeIds | IDs of selected tree nodes |
| Page ranges | .page_ranges | .pageRanges | [(start, end), ...] page ranges |
| Pages string | .pages_str | .pagesStr | Human-readable: "pages 5-8, 12-15" |
| Reasoning | .reasoning | .reasoning | LLM's explanation for selection |
| Answer | .answer | .answer | LLM-generated answer (agentic mode only) |

Cross-language Index Compatibility

TreeDex uses the same JSON index format in both Python and Node.js. All field names use snake_case in the JSON:

```json
{
  "version": "1.0",
  "framework": "TreeDex",
  "tree": [{ "structure": "1", "title": "...", "node_id": "0001", ... }],
  "pages": [{ "page_num": 0, "text": "...", "token_count": 123 }]
}
```

Build an index with Python, query it from Node.js (or vice versa).
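
Because the fields are plain snake_case JSON, an index can be inspected with nothing but the standard library. A sketch, using the example structure above; the `pages_str` helper is hypothetical and mimics the "pages 5-8, 12-15" formatting that QueryResult exposes:

```python
import json

# A minimal index in the documented on-disk shape (abbreviated).
raw = """{
  "version": "1.0",
  "framework": "TreeDex",
  "tree": [{"structure": "1", "title": "Intro", "node_id": "0001"}],
  "pages": [{"page_num": 0, "text": "Hello world", "token_count": 2}]
}"""

index = json.loads(raw)

# Hypothetical helper turning page ranges into the human-readable
# string QueryResult exposes: [(5, 8), (12, 15)] -> "pages 5-8, 12-15".
def pages_str(page_ranges):
    return "pages " + ", ".join(f"{a}-{b}" for a, b in page_ranges)

print(index["framework"])             # TreeDex
print(pages_str([(5, 8), (12, 15)]))  # pages 5-8, 12-15
```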


Benchmarks

TreeDex vs Vector DB vs Naive Chunking

A real benchmark run on the same document (NCERT Electromagnetic Waves, 14 pages, 10 queries). All three methods retrieve from the same content; only the indexing and retrieval approach differs. Results are auto-generated by CI on every push.


| Feature | TreeDex | Vector RAG | Naive Chunking |
|---------|---------|------------|----------------|
| Page Attribution | Exact source pages | Approximate | None |
| Structure Preserved | Full tree hierarchy | None | None |
| Index Format | Human-readable JSON | Opaque vectors | Text chunks |
| Embedding Model | Not needed | Required | Not needed |
| Infrastructure | None (JSON file) | Vector DB required | None |
| Core Dependencies | 2 | 5-8+ | 2-5 |

Run your own benchmarks:

```bash
# Python
python benchmarks/run_benchmark.py

# Node.js
npx tsx benchmarks/node/run-benchmark.ts
```

Architecture


Project Structure

```text
treedex/
├── treedex/                # Python package
│   ├── core.py
│   ├── llm_backends.py
│   ├── loaders.py
│   ├── pdf_parser.py
│   ├── tree_builder.py
│   ├── tree_utils.py
│   └── prompts.py
├── src/                    # TypeScript source
│   ├── index.ts
│   ├── core.ts
│   ├── llm-backends.ts
│   ├── loaders.ts
│   ├── pdf-parser.ts
│   ├── tree-builder.ts
│   ├── tree-utils.ts
│   ├── prompts.ts
│   └── types.ts
├── tests/                  # Python tests (pytest)
├── test/                   # Node.js tests (vitest)
├── examples/               # Python examples
├── examples/node/          # Node.js examples
├── benchmarks/             # Python benchmarks
├── benchmarks/node/        # Node.js benchmarks
├── pyproject.toml          # Python package config
├── package.json            # npm package config
├── tsconfig.json           # TypeScript config
└── tsup.config.ts          # Build config (ESM + CJS)
```

Running Tests

```bash
# Python
pip install -e ".[dev]"
pytest
pytest --cov=treedex
pytest tests/test_core.py -v
```

```bash
# Node.js
npm install
npm test
npm run test:watch
npm run typecheck
```

Examples

Python

```bash
python examples/quickstart.py path/to/document.pdf
python examples/multi_provider.py
python examples/custom_llm.py
python examples/save_load.py path/to/document.pdf
```

Node.js

```bash
npx tsx examples/node/quickstart.ts path/to/document.pdf
npx tsx examples/node/multi-provider.ts
npx tsx examples/node/custom-llm.ts
npx tsx examples/node/save-load.ts path/to/document.pdf
```

Contributing

```bash
git clone https://github.com/mithun50/TreeDex.git
cd TreeDex

# Python development
pip install -e ".[dev]"
pytest

# Node.js development
npm install
npm run build
npm test
```

License

MIT License — Mithun Gowda B