
compress-lightreach

v1.0.6


Compress Light Reach

AI cost management SDK with intelligent model routing, prompt compression, and real-time token tracking

Node.js 14+ · MIT License

Compress Light Reach is a Node.js/TypeScript SDK that provides intelligent model routing and prompt compression for LLM applications, reducing token usage and costs while maintaining quality.

Features

  • Intelligent Model Routing: Automatically selects optimal model based on quality requirements (HLE) and available provider keys
  • Token-aware Compression: Replaces repeated substrings with shorter placeholders using a fast greedy algorithm
  • Lossless: Perfect decompression guaranteed
  • Output Compression: Optional model output compression support
  • Cloud API: Uses Light Reach's cloud service for compression and routing
  • Multi-provider Support: OpenAI, Anthropic, Google, DeepSeek, Moonshot
  • TypeScript: Full TypeScript support with type definitions
  • BYOK: Provider API keys managed securely in dashboard (never passed through SDK)

Installation

npm install compress-lightreach

or

yarn add compress-lightreach

Quick Start

The SDK uses intelligent model routing and targets POST /api/v2/complete.

  • Authenticate with your LightReach API key (env var PCOMPRESLR_API_KEY or LIGHTREACH_API_KEY)
  • Manage provider keys (OpenAI/Anthropic/Google/etc.) in the dashboard (BYOK)
  • System automatically selects optimal model based on your requirements

import { PcompresslrAPIClient } from 'compress-lightreach';

const client = new PcompresslrAPIClient("your-lightreach-api-key");

const result = await client.complete({
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Explain quantum computing in simple terms.' },
  ],
  desired_hle: 30,  // Quality ceiling (0-100). Current SOTA is ~40%.
});

console.log(result.decompressed_response);
console.log(`Selected: ${result.routing_info?.selected_model}`);
console.log(`Token savings: ${result.compression_stats.token_savings}`);

OpenAI-compatible API (Cursor / OpenAI SDKs)

LightReach also exposes a strict OpenAI-compatible surface (including streaming SSE) so you can use standard OpenAI tooling without changing your app.

  • Cursor base URL: https://compress.lightreach.io/v1/cursor
  • Generic OpenAI-compatible base URL: https://compress.lightreach.io/v1
  • Endpoints: GET /models, POST /chat/completions
  • Model id: lightreach

Example (cURL):

curl -sS https://compress.lightreach.io/v1/chat/completions \
  -H "Authorization: Bearer lr_your_lightreach_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "lightreach",
    "messages": [{"role":"user","content":"Say hello"}],
    "stream": true
  }'
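The same request can be built programmatically from Node. A minimal sketch: the endpoint and the `lightreach` model id come from the list above, the key is a placeholder, and the helper function is illustrative rather than part of the SDK.

```typescript
// Build a chat-completions request for the OpenAI-compatible surface.
// Base URL and model id ("lightreach") are from the docs above; the
// API key is a placeholder. The helper only assembles the request.
interface ChatMessage { role: string; content: string }

function buildChatRequest(apiKey: string, messages: ChatMessage[], stream = false) {
  return {
    url: "https://compress.lightreach.io/v1/chat/completions",
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model: "lightreach", messages, stream }),
    },
  };
}

const req = buildChatRequest("lr_your_lightreach_key", [
  { role: "user", content: "Say hello" },
]);
// const res = await fetch(req.url, req.init); // network call, not run here
```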

With Output Compression

const result = await client.complete({
  messages: [{ role: 'user', content: 'Generate a long report...' }],
  desired_hle: 25,
  compress_output: true,
});

console.log(result.decompressed_response);

Intelligent Model Routing

The system automatically selects the optimal model based on quality requirements and your available provider keys:

import { PcompresslrAPIClient } from 'compress-lightreach';

const client = new PcompresslrAPIClient("your-lightreach-api-key");

// Cross-provider optimization: system picks cheapest model meeting your quality bar
const result = await client.complete({
  messages: [{ role: 'user', content: 'Explain quantum computing' }],
  desired_hle: 30,  // Quality ceiling (0-100). Current SOTA is ~40%.
});

// Check what was selected
console.log(result.routing_info?.selected_model);           // e.g., "gpt-4o-mini"
console.log(result.routing_info?.selected_provider);        // e.g., "openai"
console.log(result.routing_info?.model_hle);                // e.g., 32.5
console.log(result.routing_info?.model_price_per_million);  // e.g., 0.15

Provider-Constrained Routing

Optionally constrain to a specific provider:

// Only use OpenAI models, but pick the cheapest one meeting HLE 35
const result = await client.complete({
  messages: [{ role: 'user', content: 'Write a poem' }],
  llm_provider: 'openai',  // Optional: constrain to one provider
  desired_hle: 35,
});

HLE Cascading with Admin Controls

Admins can set quality ceilings via the dashboard (global or per-tag) to control costs. Your desired_hle is a preference: if it exceeds an admin-set ceiling, the value is silently clamped to the ceiling and the request proceeds.

// Admin set global HLE ceiling to 30%
// Requesting above the ceiling will be clamped to 30 (no error)
const result = await client.complete({
  messages: [{ role: 'user', content: 'Process payment' }],
  desired_hle: 35,  // Will be clamped down to 30
  tags: { env: 'production' },
});

// Correct usage: request within ceiling
const okResult = await client.complete({
  messages: [{ role: 'user', content: 'Process payment' }],
  desired_hle: 25,  // OK: below ceiling of 30
  tags: { env: 'production' },
});

// Check if your HLE was lowered by an admin ceiling
if (result.routing_info?.hle_clamped) {
  console.log(`HLE lowered from ${result.routing_info.requested_hle} ` +
              `to ${result.routing_info.effective_hle} ` +
              `by ${result.routing_info.hle_source}-level ceiling`);
}
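The clamping rule above amounts to "effective HLE = requested HLE lowered to the tightest admin ceiling". A sketch of that logic; the precedence between tag and global ceilings shown here is an assumption for illustration, and the real resolution happens server-side.

```typescript
// Sketch of the documented clamping: the effective HLE is the requested
// value lowered to the tightest admin ceiling. Checking the tag ceiling
// before the global one is an assumption for illustration only.
function effectiveHle(
  requested: number,
  tagCeiling?: number,
  globalCeiling?: number,
): { effective: number; clamped: boolean; source: "request" | "tag" | "global" } {
  const ceilings: Array<[number, "tag" | "global"]> = [];
  if (tagCeiling !== undefined) ceilings.push([tagCeiling, "tag"]);
  if (globalCeiling !== undefined) ceilings.push([globalCeiling, "global"]);

  let effective = requested;
  let source: "request" | "tag" | "global" = "request";
  for (const [ceiling, name] of ceilings) {
    if (ceiling < effective) {
      effective = ceiling;
      source = name;
    }
  }
  return { effective, clamped: effective < requested, source };
}
```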

With Compression Config

Configure per-role compression settings:

import { PcompresslrAPIClient } from 'compress-lightreach';

const client = new PcompresslrAPIClient("your-lightreach-api-key");

const result = await client.complete({
  messages: [{ role: 'user', content: 'Hello!' }],
  desired_hle: 30,
  compress: true,
  compress_output: false,
  compression_config: {
    compress_system: false,
    compress_user: true,
    compress_assistant: false,
    compress_only_last_n_user: 1,
  },
  temperature: 0.7,
  max_tokens: 1000,
  tags: { env: 'production' },
});

console.log(result.decompressed_response);
console.log(`Model used: ${result.routing_info?.selected_model}`);

Compression Only (No LLM Call)

import { PcompresslrAPIClient } from 'compress-lightreach';

const client = new PcompresslrAPIClient("your-lightreach-api-key");

// Compress text without making an LLM call
const compressed = await client.compress(
  "Your text with repeated content here...",
  "gpt-4",      // Model for tokenization
  { env: 'dev' } // Optional tags
);

console.log(compressed.llm_format);
console.log(`Compression ratio: ${compressed.compression_ratio}`);

// Decompress later
const decompressed = await client.decompress(compressed.llm_format);
console.log(decompressed.decompressed);

Command Line Interface

# Set your API key
export PCOMPRESLR_API_KEY=your-api-key

# Compress a prompt
npx pcompresslr "Your prompt with repeated text here..."

API Reference

PcompresslrAPIClient

Main API client for intelligent model routing and compression.

Constructor

new PcompresslrAPIClient(apiKey?: string, apiUrl?: string, timeout?: number)

Parameters:

  • apiKey (string, optional): LightReach API key. Falls back to LIGHTREACH_API_KEY or PCOMPRESLR_API_KEY env vars.
  • apiUrl (string, optional): Override base API URL. Falls back to PCOMPRESLR_API_URL env var. Default: https://api.compress.lightreach.io
  • timeout (number, optional): Request timeout in milliseconds. Default: 900000 (15 minutes)

Methods

complete(request: CompleteV2Request): Promise<CompleteResponse>

Messages-first completion with intelligent routing (POST /api/v2/complete).

Request Parameters (CompleteV2Request):

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| messages | Message[] | required | Conversation history with role and content |
| llm_provider | 'openai' \| 'anthropic' \| 'google' \| 'deepseek' \| 'moonshot' | — | Optional provider constraint. Omit for cross-provider optimization |
| desired_hle | number | — | Quality ceiling (0-100). If above an admin ceiling, it is clamped down |
| compress | boolean | true | Whether to compress messages |
| compress_output | boolean | false | Whether to request compressed output from LLM |
| compression_config | object | — | Per-role compression settings (see below) |
| temperature | number | — | LLM temperature parameter |
| max_tokens | number | — | Maximum tokens to generate |
| tags | Record<string, string> | — | Tags for cost attribution and tag-level HLE ceilings |
| max_history_messages | number | — | Limit conversation history length |

compression_config options:

{
  compress_system?: boolean;         // default: false
  compress_user?: boolean;           // default: true
  compress_assistant?: boolean;      // default: false
  compress_only_last_n_user?: number | null;  // default: 1
}
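The defaults above mean that an omitted config compresses only the last user message. A small helper that applies those documented defaults to a partial config; the field names match the options block, but the helper itself is illustrative and not part of the SDK.

```typescript
// Apply the documented compression_config defaults to a partial config.
// Defaults mirror the options listed above; this helper is illustrative
// and not part of the SDK.
interface CompressionConfig {
  compress_system: boolean;
  compress_user: boolean;
  compress_assistant: boolean;
  compress_only_last_n_user: number | null;
}

function withDefaults(partial: Partial<CompressionConfig> = {}): CompressionConfig {
  return {
    compress_system: false,
    compress_user: true,
    compress_assistant: false,
    compress_only_last_n_user: 1,
    ...partial, // caller-supplied fields override the defaults
  };
}
```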

Response (CompleteResponse):

{
  decompressed_response: string;     // Final decompressed LLM response
  compression_stats: {
    compression_enabled: boolean;
    original_tokens: number;
    compressed_tokens: number;
    token_savings: number;
    compression_ratio: number;
    token_count_exact?: boolean;
    token_count_source?: string;
    token_accounting_note?: string;
    processing_time_ms?: number;
  };
  llm_stats: {
    provider?: string;
    model?: string;
    input_tokens: number;
    output_tokens: number;
    total_tokens: number;
    finish_reason?: string | null;
  };
  routing_info?: {
    selected_model: string;          // Model chosen by system
    selected_provider: string;       // Provider chosen by system
    selected_model_id: string;
    model_hle: number;               // HLE score of selected model
    model_price_per_million: number;
    requested_hle: number | null;
    effective_hle: number | null;    // Effective HLE after admin ceilings
    hle_source: 'request' | 'tag' | 'global' | 'none';
    hle_clamped: boolean;            // true if admin ceiling lowered your desired_hle
  };
  warnings?: string[];
  
  // Convenience aliases
  text?: string;                     // Alias for decompressed_response
  tokens_saved?: number;             // Alias for compression_stats.token_savings
  tokens_used?: number;              // Alias for llm_stats.total_tokens
  compression_ratio?: number;        // Alias for compression_stats.compression_ratio
}
compress(prompt, model?, tags?): Promise<CompressResponse>

Compression-only (POST /api/v1/compress).

Also supports a legacy call shape: compress(prompt, model, algorithm, tags?) (only "greedy" is supported).

Parameters:

  • prompt (string, required): Text to compress
  • model (string, optional): Model for tokenization. Default: 'gpt-4'
  • algorithm ("greedy", optional): Legacy-only parameter. Only "greedy" is supported.
  • tags (Record<string, string>, optional): Tags for attribution

Response (CompressResponse):

{
  compressed: string;
  dictionary: Record<string, string>;
  llm_format: string;
  compression_ratio: number;
  original_size: number;
  compressed_size: number;
  processing_time_ms: number;
  algorithm: string;
}
decompress(llmFormat): Promise<DecompressResponse>

Decompress an LLM-formatted compressed prompt (POST /api/v1/decompress).

Parameters:

  • llmFormat (string, required): The llm_format string from a compress response

Response (DecompressResponse):

{
  decompressed: string;
  processing_time_ms: number;
}
healthCheck(): Promise<HealthCheckResponse>

Check API health status (GET /health).

Response:

{
  status: string;
  version?: string;
}

Message Types

type MessageRole = 'system' | 'developer' | 'user' | 'assistant';

interface Message {
  role: MessageRole;
  content: string;
}

Environment Variables

| Variable | Description |
|----------|-------------|
| PCOMPRESLR_API_KEY | Your LightReach API key (primary) |
| LIGHTREACH_API_KEY | Your LightReach API key (alternative) |
| PCOMPRESLR_API_URL | Override the API base URL (advanced/testing) |

Exceptions

| Exception | Description |
|-----------|-------------|
| PcompresslrAPIError | Base exception class |
| APIKeyError | Invalid or missing API key |
| RateLimitError | Rate limit exceeded |
| APIRequestError | General API errors (including routing failures) |

import { APIKeyError, RateLimitError, APIRequestError } from 'compress-lightreach';

try {
  const result = await client.complete({ messages: [...] });
} catch (error) {
  if (error instanceof APIKeyError) {
    console.error('Invalid API key');
  } else if (error instanceof RateLimitError) {
    console.error('Rate limited, please retry later');
  } else if (error instanceof APIRequestError) {
    console.error('API error:', error.message);
  }
}
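Rate-limit errors are usually worth retrying with backoff. A generic sketch of that pattern: the retry predicate is parameterized so the helper is self-contained, and wiring it to the SDK's RateLimitError is shown in a comment rather than assumed.

```typescript
// Generic retry-with-exponential-backoff helper. The caller supplies a
// predicate deciding which errors are retryable, e.g.
// (e) => e instanceof RateLimitError when using this SDK.
async function withRetry<T>(
  fn: () => Promise<T>,
  shouldRetry: (err: unknown) => boolean,
  attempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  for (let i = 0; ; i++) {
    try {
      return await fn();
    } catch (err) {
      // Give up after the last attempt or on non-retryable errors.
      if (i + 1 >= attempts || !shouldRetry(err)) throw err;
      // Backoff doubles each attempt: 500ms, 1000ms, 2000ms, ...
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** i));
    }
  }
}

// Usage sketch with the SDK:
// const result = await withRetry(
//   () => client.complete({ messages }),
//   (e) => e instanceof RateLimitError,
// );
```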

How It Works

Compress Light Reach uses intelligent algorithms to identify repeated substrings in your prompts and replace them with shorter placeholders.

The library:

  1. Identifies repeated substrings using efficient suffix array algorithms
  2. Calculates token savings for each potential replacement
  3. Selects optimal replacements that reduce total token count
  4. Intelligently routes to the best model based on your quality requirements
  5. Formats the result for easy LLM consumption
  6. Provides perfect decompression
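The dictionary-substitution idea behind steps 1-6 can be shown with a toy round-trip. This is not the service's actual algorithm (which is token-aware and uses suffix arrays); it only illustrates why replacing repeated phrases with placeholders is lossless.

```typescript
// Toy sketch of dictionary-substitution compression: replace repeated
// phrases with short placeholders and keep a dictionary so the text can
// be restored exactly. Illustrative only; the real service is
// token-aware and discovers repeats itself.
type Compressed = { text: string; dictionary: Record<string, string> };

function toyCompress(input: string, phrases: string[]): Compressed {
  const dictionary: Record<string, string> = {};
  let text = input;
  phrases.forEach((phrase, i) => {
    const placeholder = `§${i}§`;
    // Only substitute when the phrase appears at least twice.
    if (text.split(phrase).length > 2) {
      dictionary[placeholder] = phrase;
      text = text.split(phrase).join(placeholder);
    }
  });
  return { text, dictionary };
}

function toyDecompress({ text, dictionary }: Compressed): string {
  let out = text;
  for (const [placeholder, phrase] of Object.entries(dictionary)) {
    out = out.split(placeholder).join(phrase);
  }
  return out;
}

const original =
  "Write a story about a cat. The cat is very friendly. " +
  "Write a story about a dog. The dog is very friendly.";
const packed = toyCompress(original, ["Write a story about a ", " is very friendly."]);
```

Because the placeholders never occur in the input and the dictionary maps each one back to its exact phrase, decompression reproduces the original byte for byte.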

Examples

Example 1: Complete with Compression

import { PcompresslrAPIClient } from 'compress-lightreach';

const client = new PcompresslrAPIClient("your-lightreach-api-key");

const prompt = `
Write a story about a cat. The cat is very friendly. 
Write a story about a dog. The dog is very friendly.
Write a story about a bird. The bird is very friendly.
`;

const result = await client.complete({
  messages: [{ role: "user", content: prompt }],
  desired_hle: 30,
});

console.log(result.decompressed_response);
console.log(`Model used: ${result.routing_info?.selected_model}`);
console.log(`Token savings: ${result.compression_stats.token_savings} tokens`);
console.log(`Compression ratio: ${(result.compression_stats.compression_ratio * 100).toFixed(2)}%`);

Example 2: Output Compression

import { PcompresslrAPIClient } from 'compress-lightreach';

const client = new PcompresslrAPIClient("your-lightreach-api-key");

const result = await client.complete({
  messages: [{ role: "user", content: "Generate a long report with repeated sections..." }],
  desired_hle: 35,
  compress_output: true,
});

console.log(result.decompressed_response);

Example 3: Multi-turn Conversation

import { PcompresslrAPIClient } from 'compress-lightreach';

const client = new PcompresslrAPIClient("your-lightreach-api-key");

const result = await client.complete({
  messages: [
    { role: "system", content: "You are a helpful coding assistant." },
    { role: "user", content: "How do I read a file in Python?" },
    { role: "assistant", content: "You can use open() with a context manager..." },
    { role: "user", content: "How about writing to a file?" },
  ],
  desired_hle: 30,
  compression_config: {
    compress_system: false,
    compress_user: true,
    compress_assistant: false,
    compress_only_last_n_user: 2,  // Only compress last 2 user messages
  },
});

Getting an API Key

To use Compress Light Reach, you need an API key from compress.lightreach.io.

  1. Visit compress.lightreach.io
  2. Sign up for an account
  3. Get your API key from the dashboard
  4. Set it as an environment variable: export PCOMPRESLR_API_KEY=your-key

Security & Privacy

BYOK model: Provider keys (OpenAI/Anthropic/Google/etc.) are managed in the dashboard and never passed through this SDK. The SDK only uses your LightReach API key for authentication with the service.

Requirements

  • Node.js 14.0.0 or higher
  • TypeScript 5.3.0+ (for TypeScript projects)

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.