@nxuss/lemma

v0.4.8

Published

a day ago

Intelligent AI Gateway for IDEs & Agents — Semantic cache, Privacy Firewall, and Autonomous Cost-Optimization.

0High
0Medium
0Low

semantic-cache privacy-firewall ai-security cost-optimization intelligent-proxy ai-agents orchestration ide-sync llm-gateway openai-proxy anthropic-proxy vector-database

Lemma v0.4.7

The Intelligent AI Gateway — Privacy, Performance, and Precision for the Agentic Era.

Lemma is a high-performance orchestration layer that sits between your development environment and LLM providers. It transforms the way you build with AI by providing Shared Semantic Memory, Autonomous Cost Optimization, and Privacy Guardrails.

⚡ Killer Features

✂️ Codebase Context Squeezer (AST Tree-Shaker)

Unbelievable Prompt Compaction. Stop wasting Claude Pro context limits and throttled Cursor speeds on repetitive codebase payloads. Lemma automatically tree-shakes outgoing prompts, recursively pruning heavy, irrelevant implementations while keeping structural declarations and symbols intact. Get ultra-fast answers under 1 second and enjoy 10x longer chats without ever hitting Claude's 5-hour usage caps.

🛸 Local Cache-Augmented Response Synthesis (CARS)

Zero-Cost Response Generation. When you ask a query semantically similar to a previous question, Lemma avoids the cloud completely. Instead, it dynamically harnesses your local computational environment to adjust and synthesize the historical answer to fit your new requirements in less than a second. 0 Cloud Tokens, 0 Cloud API costs, and seamless offline fallback protection.

🛡️ Privacy Firewall (Semantic Scrubber)

Zero-Trust Prompts. Stop leaking sensitive data. Lemma automatically detects API keys, PII, and credentials in your prompts, masking them with secure tokens before they reach the cloud. Responses are seamlessly reconstructed locally.

🚦 Complexity Router (Cost-Optimizer)

Intelligence Where it Matters. Lemma analyzes the semantic complexity of every request. It autonomously routes lightweight tasks to hyper-efficient models like gpt-4o-mini, reserving premium models for high-reasoning challenges. Save up to 90% on simple tasks.

🧠 Telepathic Context Injector (Runtime Sync)

Bridge the Gap Between Code and Execution. Lemma synchronizes your application's live runtime state and exceptions directly with your IDE’s consciousness. Your AI assistant gains immediate "situational awareness" of crashes.

⚡ Shared Semantic Cache

Stop Paying for the Same Thought Twice. Lemma understands meaning. It recognizes similar prompts and returns instant (3ms) responses, saving 40-70% on total API expenditure.

🚀 Smart CLI (Zero-Config)

Lemma v0.4.7 introduces the Smart CLI, making it easier than ever to get started:

# 1. Install
npm install -g @nxuss/lemma

# 2. Initialize (Auto-configures .env and .lemma/)
lemma init

# 3. Start with Intelligence Report
lemma start

🧠 Intelligence Report

On startup, Lemma performs a System Check to detect dependencies like Ollama and ChromaDB, providing a real-time report of active features and optimizations.

💎 Tier Comparison

🛠️ Integration: Power Up Your Favorite Tools

Lemma is compatible with any tool that allows you to configure a custom OpenAI Base URL. This means you can add Lemma's intelligence to your existing workflow in seconds.

💬 Use it with AI Chats & IDEs

You don't need to change your habits. Just point your tool's "Base URL" to Lemma:

Cursor: Go to Settings > Models > OpenAI API > Override Base URL and set it to http://localhost:8081/v1.
VS Code (Continue): Update your config.json to use http://localhost:8081/v1 as the apiBase.
AutoGPT / BabyAGI: Set the OPENAI_API_BASE environment variable.
Custom Apps: Replace https://api.openai.com/v1 with http://localhost:8081/v1 in your SDK initialization.

⚡ Why use Lemma for Chat?

Privacy: Your IDE won't leak your secrets to the cloud.
Context: Lemma syncs your runtime crashes directly to your chat window.
Speed: Instant responses for similar questions via Semantic Cache.