llmshield

v0.1.4

Published

2 months ago

Zero-config LLM cost shield. One line. Cut your OpenAI / Azure / Anthropic bill by 30-60%.


aminos

openai azure-openai anthropic token optimization llm cost gpt claude shield llmshield cost-reduction

llmshield

Zero-config LLM cost shield. One line. Cut your OpenAI / Azure / Anthropic bill by 30–60%.

Enterprise Edition available — self-hosted gateway, real-time dashboard, multi-tenant policy engine, GDPR/HIPAA compliance, SLA support. Contact [email protected] for licensing and enterprise inquiries.

How it works

Before every LLM call, LLMShield:

Deduplicates — removes repeated sentences across the conversation history
Compresses — strips filler phrases, verbose openers, redundant adverbs (EN, FR, ES, IT, DE)
Condenses — rewrites wordy constructions ("degrees Celsius" → "°C", "white blood cell count" → "WBC", …)
Trims — enforces a token budget, keeping the system prompt and most-recent messages

Structured content (bullet points, numbered lists, measurements like 38.2°C, 120/80 mmHg) is never touched.
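To make the deduplicate and trim steps concrete, here is a minimal sketch of the two ideas. It is not llmshield's implementation: `estimateTokens`, `dedupeSentences`, and `trimToBudget` are hypothetical helpers, and the four-characters-per-token estimate is an assumption chosen only to keep the example self-contained.

```javascript
// Illustrative sketch only: NOT llmshield's actual implementation.
// All three helpers below are hypothetical names.

// Crude token estimate (assumption: roughly 4 characters per token).
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Drop sentences that already appeared earlier in the conversation.
// Splitting only at whitespace after . ! ? keeps values like 38.2°C intact.
function dedupeSentences(messages) {
  const seen = new Set();
  return messages.map((msg) => {
    const sentences = msg.content.split(/(?<=[.!?])\s+/);
    const kept = sentences.filter((sentence) => {
      const key = sentence.trim().toLowerCase();
      if (key === '' || seen.has(key)) return false;
      seen.add(key);
      return true;
    });
    return { ...msg, content: kept.join(' ') };
  });
}

// Keep every system message, then add the most recent messages that still fit.
function trimToBudget(messages, maxTokens) {
  const system = messages.filter((m) => m.role === 'system');
  const rest = messages.filter((m) => m.role !== 'system');
  let used = system.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  const kept = [];
  for (let i = rest.length - 1; i >= 0; i--) {
    const cost = estimateTokens(rest[i].content);
    if (used + cost > maxTokens) break; // older messages are dropped
    kept.unshift(rest[i]);
    used += cost;
  }
  return [...system, ...kept];
}

const history = [
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'I have a fever. It started today.' },
  { role: 'assistant', content: 'How high is it?' },
  { role: 'user', content: 'I have a fever. It is 38.2°C.' },
];

const deduped = dedupeSentences(history);
const trimmed = trimToBudget(deduped, 20);
console.log(trimmed);
```

In this run the repeated "I have a fever." is removed from the last user message, and the oldest user message is dropped because it no longer fits the 20-token budget. A real tokenizer would give different counts.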

Install

npm install llmshield

Integration — which files to change?

If you use the `openai` or `@anthropic-ai/sdk` npm packages

One change only — app.js (or your entry point):

// app.js  ← very first line, before anything else
require('llmshield/auto');

That's it. Every openai.chat.completions.create() call is automatically optimized. No other files need to change.

If you call the LLM via raw `fetch` / `axios` / Azure REST API

The auto-patch cannot intercept raw HTTP calls. You need two small changes:

1. In your chat controller (e.g. chatController.js) — add at the top:

// ✅ REQUIRED — safe import, app works normally even if llmshield is not installed
let _optimizeMessages;
try { ({ optimizeMessages: _optimizeMessages } = require('llmshield')); } catch { _optimizeMessages = null; }

2. Just before your LLM fetch call — add the optimize block:

// ✅ REQUIRED — optimize messages before sending
if (_optimizeMessages) {
  const result = _optimizeMessages(outgoing.messages);
  if (Array.isArray(result?.messages)) {
    outgoing.messages = result.messages;
  }
}

3. Optional: to log what was saved, replace the block from step 2 with this variant (adding both blocks would run the optimizer twice):

if (_optimizeMessages) {
  const result = _optimizeMessages(outgoing.messages);
  if (Array.isArray(result?.messages)) {
    outgoing.messages = result.messages;
    // ⬇ remove this line if you don't want console output
    console.log(`[llmshield] ${result.savedPercent}% saved (${result.tokensBefore} → ${result.tokensAfter} tokens)`);
  }
}

Then send the request as usual:

const resp = await fetch(url, {
  method: 'POST',
  headers: { 'api-key': apiKey, 'Content-Type': 'application/json' },
  body: JSON.stringify(outgoing),  // outgoing.messages is now optimized
});

When to use this: Azure OpenAI REST API, AWS Bedrock, Google Vertex, any custom LLM proxy.

Usage — other options

Auto-patch (openai / anthropic SDK only)

// app.js — first line
require('llmshield/auto');

Explicit wrap

const { OpenAI } = require('openai'); // openai v4+ (CommonJS)
const { wrap } = require('llmshield');
const openai = wrap(new OpenAI({ apiKey: process.env.OPENAI_API_KEY }));
await openai.chat.completions.create({ model: 'gpt-4o', messages });

Manual

const { optimizeMessages } = require('llmshield');
const { messages, savedPercent, tokensBefore, tokensAfter } = optimizeMessages(rawMessages);
// use optimized messages in your LLM call

Configuration

All options can be set via environment variables (auto mode) or passed as an options object (manual/wrap mode).

| Env var | Default | Description |
|---------|---------|-------------|
| LLMSHIELD_DEBUG=true | false | Log savings per request to stdout |
| LLMSHIELD_MAX_TOKENS | 8192 | Hard token budget for trimming |
| LLMSHIELD_CONTEXT_WINDOW | 8192 | Context window for dynamic limit calculation |
| LLMSHIELD_DEDUP=false | true | Disable deduplication |
| LLMSHIELD_COMPRESS=false | true | Disable compression |
| LLMSHIELD_OUTPUT_CONSTRAINT=false | true | Disable injecting a concise-output system hint |
| LLMSHIELD_DYNAMIC_LIMIT=false | true | Disable dynamic max_tokens calculation |
| LLMSHIELD_KEY | none | API key to send savings stats to the cloud dashboard |
| LLMSHIELD_URL | none | Self-hosted reporting endpoint (must be https://) |
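The table lists the value that changes each setting next to its default, which can be easy to misread. The sketch below shows one way those defaults could resolve from the environment. `resolveShieldConfig` and the returned property names are hypothetical, and how llmshield itself parses values (for example `1` or `TRUE`) is not documented here, so treat the parsing rules as assumptions.

```javascript
// Hypothetical helper: mirrors the defaults in the table, not llmshield's code.
function resolveShieldConfig(env = process.env) {
  // Assumption: only the literal string 'true' enables a flag.
  const flag = (name, fallback) =>
    env[name] === undefined ? fallback : env[name] === 'true';

  // Fall back to the default when the variable is missing or not a number.
  const int = (name, fallback) => {
    const parsed = Number.parseInt(env[name], 10);
    return Number.isNaN(parsed) ? fallback : parsed;
  };

  return {
    debug: flag('LLMSHIELD_DEBUG', false),
    maxTokens: int('LLMSHIELD_MAX_TOKENS', 8192),
    contextWindow: int('LLMSHIELD_CONTEXT_WINDOW', 8192),
    dedup: flag('LLMSHIELD_DEDUP', true),
    compress: flag('LLMSHIELD_COMPRESS', true),
    outputConstraint: flag('LLMSHIELD_OUTPUT_CONSTRAINT', true),
    dynamicLimit: flag('LLMSHIELD_DYNAMIC_LIMIT', true),
  };
}

// With nothing set, every optimization stays on and the budget is 8192 tokens.
const defaults = resolveShieldConfig({});

// Turning deduplication off and tightening the budget.
const custom = resolveShieldConfig({
  LLMSHIELD_DEDUP: 'false',
  LLMSHIELD_MAX_TOKENS: '4096',
});
console.log(defaults, custom);
```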

GDPR / HIPAA — PII Redaction

The scrubber runs before any content leaves your process:

LLMSHIELD_GDPR=true    # redact emails, phones, credit cards, SSNs
LLMSHIELD_HIPAA=true   # also redact MRNs, NPIs, dates of birth, IPs

Only user messages are scrubbed. System and assistant messages are never altered.
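As a rough illustration of the user-messages-only rule, the sketch below redacts two kinds of identifiers with regular expressions. The patterns, the placeholder labels, and the `scrubUserMessages` name are assumptions made for this example; they are not the rules llmshield ships, and simple patterns like these will miss many real-world formats.

```javascript
// Illustrative sketch only: patterns and labels are assumptions, not llmshield's rules.
const PATTERNS = [
  { label: '[EMAIL]', regex: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { label: '[SSN]', regex: /\b\d{3}-\d{2}-\d{4}\b/g }, // US format only
];

function scrubUserMessages(messages) {
  return messages.map((msg) => {
    // System and assistant messages pass through unchanged.
    if (msg.role !== 'user') return msg;
    const content = PATTERNS.reduce(
      (text, { label, regex }) => text.replace(regex, label),
      msg.content,
    );
    return { ...msg, content };
  });
}

const scrubbed = scrubUserMessages([
  { role: 'system', content: 'Escalate to support@example.com if unsure.' },
  { role: 'user', content: 'Mail jane@example.com, SSN 123-45-6789.' },
]);
console.log(scrubbed);
```

Note that the system message keeps its address, so anything sensitive placed in a system prompt still needs to be handled separately.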

Benchmarks

Tested across real-world prompt types (gpt-4o pricing: $0.0025 / 1K input tokens).

| Prompt type | Tokens before | Tokens after | Savings |
|-------------|---------------|--------------|---------|
| Verbose medical | 283 | 158 | 44% |
| Verbose chat | 108 | 62 | 43% |
| Coding question | 69 | 41 | 41% |
| CRISPR explanation | 100 | 63 | 37% |
| French medical | 50 | 27 | 46% |
| Medical w/ measurements | 99 | 87 | 12% |
| Repetitive prompt | 52 | 45 | 13% |
| Already concise | 29 | 29 | 0% (intentionally skipped) |
| Short prompt | 13 | 13 | 0% (intentionally skipped) |

Key points:

Verbose, conversational, and medical prompts: 37–46% savings
Already-concise and very short prompts: skipped automatically (no degradation)
All medical measurements (38.2°C, 120/80 mmHg, WBC) preserved across all tests
0 critical grammar artifacts across all outputs
Languages tested: English, French (ES, IT, DE patterns also covered)
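To translate a benchmark row into money, the arithmetic can be sketched as follows. The `savings` helper is hypothetical; the price is the gpt-4o input figure quoted above, and the request volume is an assumed example, not a measured workload.

```javascript
// gpt-4o input price quoted in the benchmark notes (USD per 1K input tokens).
const PRICE_PER_1K_INPUT_TOKENS = 0.0025;

// Hypothetical helper: input-token savings for one prompt type.
function savings(tokensBefore, tokensAfter, requests = 1) {
  const savedTokens = tokensBefore - tokensAfter;
  return {
    savedTokens,
    savedPercent: Math.round((savedTokens / tokensBefore) * 100),
    savedUsd: (savedTokens / 1000) * PRICE_PER_1K_INPUT_TOKENS * requests,
  };
}

// "Verbose medical" row, assuming one million such requests.
const verboseMedical = savings(283, 158, 1e6);
console.log(
  `${verboseMedical.savedTokens} tokens (${verboseMedical.savedPercent}%) per request, ` +
  `about $${verboseMedical.savedUsd.toFixed(2)} per million requests`,
);
```

For that row the saving is 125 input tokens per request, or roughly $312.50 per million requests. Output tokens are billed separately and are not part of this calculation.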

Supported SDKs

| SDK | Versions |
|-----|----------|
| openai | ≥ 4.0.0 |
| @anthropic-ai/sdk | ≥ 0.20.0 |
| Azure OpenAI (AzureOpenAI) | ≥ 4.0.0 |

Enterprise Edition

The open-source package covers client-side optimization.

The LLMShield Enterprise platform adds:

| Feature | Description |
|---------|-------------|
| Gateway server | Drop-in OpenAI-compatible proxy; just change base_url |
| Real-time dashboard | Token savings, cost trends, per-model and per-team breakdown |
| Multi-tenant policy engine | Token budgets, model allow-lists, rate limiting per team / API key |
| Audit log | Full request history with retention controls |
| GDPR / HIPAA compliance | PII redaction, no-body-logging mode, configurable data retention |
| SSO / RBAC | Single sign-on, role-based access control |
| SLA + Priority support | Dedicated support, uptime guarantee, custom deployment |

Enterprise is delivered under a commercial license and can be deployed on-premises or in your own cloud.

Contact: [email protected]

Disclaimer

This package is provided as-is, under the MIT License, without warranty of any kind — express or implied — including but not limited to warranties of merchantability, fitness for a particular purpose, or non-infringement.

Token optimization inherently modifies message content. While the engine is designed to preserve semantic meaning, LLMShield does not guarantee that optimized prompts will produce identical LLM responses.

Use in production is at your own risk. You are responsible for validating the output quality for your specific use case, particularly in regulated domains (medical, legal, financial). For compliance-critical deployments, evaluate the Enterprise Edition with its audit and policy controls.

The authors shall not be held liable for any damages, direct or indirect, arising from the use of this software.

License

Free for personal and non-commercial use. See LICENSE for full terms.

Commercial use (in a paid product, SaaS, or revenue-generating service) requires a commercial license.

Enterprise Edition with gateway, dashboard, and SLA support is available under a separate commercial agreement.

Contact: [email protected]
