
@dylitan/gemini-optimizer

v0.1.0

Published

Optimizes Gemini prompt costs by compressing chat history into 768×N (tall) images. The system instruction and last USER message remain as text. Auto mode decides based on real token counts.

Downloads

19

Readme


✨ What It Does

  • Saves tokens: compresses the previous chat history into one or more tall images (768×N) using dense typography (Arial 9px, lineHeight=1.10).
  • Maintains accuracy: keeps the system instruction and last user message in plain text.
  • Smart decisions: auto mode calls countTokens and compares text vs. image (≈ 259 tok/image for a logical 768×768 page).
  • Transcribe mode (test): measures OCR density to validate cost and accuracy.
  • Built-in debug: saves PNGs and an HTML inspector of the sanitized payload.

Real savings depend on the chat history; typically 20–80% for long contexts.
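As a back-of-the-envelope check, the claimed range follows from the per-page figure above (the history/page numbers here are made up for illustration):

```javascript
// Rough savings estimate, assuming ≈259 tokens per logical 768×768 page.
const TOKENS_PER_PAGE = 259;

function savingsPct(historyTextTokens, pages) {
  const imageTokens = pages * TOKENS_PER_PAGE;
  return Math.round((1 - imageTokens / historyTextTokens) * 100);
}

// A 5,000-token history that fits into 6 dense pages:
console.log(savingsPct(5000, 6)); // → 69 (~69% fewer history tokens)
```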


📦 Installation

npm i @dylitan/gemini-optimizer @google/genai
# Requires Node 18+

Create a .env file with:

GEMINI_API_KEY=your_api_key

🚀 Quickstart

import 'dotenv/config';
import { GoogleGenAI } from '@google/genai';
import { CostOptimizer } from '@dylitan/gemini-optimizer';

const ai = new CostOptimizer(GoogleGenAI, process.env.GEMINI_API_KEY, {
  strategy: 'auto',          // 'never' | 'always' | 'auto' (default)
  debugSaveDir: './_debug',  // optional: saves PNG + HTML inspector
});

const config = {
  generationConfig: { temperature: 0.3, maxOutputTokens: 1200 },
  systemInstruction: [{ text: 'You are AURA (B2B sales). Maintain Spanish. Do not reveal internal mechanisms.' }],
};

const contents = [
  { role: 'user',  parts: [{ text: 'Hi, what does NexaCloud do?' }] },
  { role: 'model', parts: [{ text: 'We unify data and automate processes.' }] },
  { role: 'user',  parts: [{ text: 'Give me an executive summary with phases and KPIs.' }] }, // ← last USER stays in plain text
];

const res = await ai.models.generateContent({ model: 'gemini-2.5-flash', config, contents });
console.log(res.text);

🧠 Strategies

  • never: baseline — everything as text (no compression).

  • always: always compresses history into tall 768×N images (system and last USER remain text).

  • auto (recommended):

    1. countTokens for the full text payload (baseline).
    2. countTokens for the tail (system + last USER as text).
    3. Estimate image cost: pages × 259 tok (logical pages 768×768).
    4. Choose image if tail + images < baseline, otherwise text.

Optional env vars: IMAGE_TOKENS_PER_IMAGE (default 259), TALL_MAX_PAGES_PER_IMAGE (default 40).
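The auto decision above can be sketched as a pure function (`chooseMode` is an illustrative name, not the library's internal API; the real implementation gets its token counts from `countTokens`):

```javascript
// Steps 1–4 above, condensed: compare the text baseline against
// tail-as-text plus the estimated image cost.
function chooseMode({ baselineTokens, tailTokens, pages, tokensPerImage = 259 }) {
  const imageEstimate = tailTokens + pages * tokensPerImage;
  return imageEstimate < baselineTokens ? 'image' : 'text';
}

// Long context: 4 pages of images beat 12,000 text tokens.
console.log(chooseMode({ baselineTokens: 12000, tailTokens: 500, pages: 4 })); // → 'image'
// Short context: the image overhead is not worth it.
console.log(chooseMode({ baselineTokens: 600, tailTokens: 500, pages: 1 })); // → 'text'
```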


🧾 What Is Sent to the Model

  • systemInstruction → text (intact).
  • Previous history (everything except the last USER) → tall images (768×N).
  • Last USER → plain text.
  • A short hint instructs the model to read images as context and reply normally.
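Put together, the transformed request looks roughly like this. The field names follow the Gemini `generateContent` schema; the hint wording and part ordering are assumptions, not the library's exact output:

```javascript
// Illustrative shape only — not the library's verbatim payload.
const transformed = {
  systemInstruction: [{ text: 'You are AURA (B2B sales). Maintain Spanish.' }], // intact
  contents: [
    {
      role: 'user',
      parts: [
        // Short hint telling the model to read the images as context.
        { text: 'The images below contain the earlier conversation; use them as context.' },
        // Previous history rendered as one or more tall 768×N PNGs.
        { inlineData: { mimeType: 'image/png', data: '<base64...>' } },
      ],
    },
    // Last USER message, untouched plain text.
    { role: 'user', parts: [{ text: 'Give me an executive summary with phases and KPIs.' }] },
  ],
};
```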

🔍 Transcription Mode (Density Validation)

const r = await ai.models.transcribe({
  model: 'gemini-2.5-flash',
  text: 'Long test text for OCR density validation...'
});

console.log('OCR:', r.transcription);
console.log('Image tokens:', r.tokens.totalImagesPlusPrompt, 'Text tokens:', r.tokens.plainText);

Useful for testing font/size/line-height combinations and their impact on cost vs. OCR accuracy.


🧩 API

new CostOptimizer(GoogleGenAIClass, apiKeyOrAuth, options?)
  • GoogleGenAIClass: usually GoogleGenAI from @google/genai.
  • apiKeyOrAuth: string (API key) or { apiKey } or { auth }.
  • options: see configuration table.

Methods (via models)

await ai.models.generateContent({ model, config?, contents })
await ai.models.generateContentStream({ model, config?, contents })
await ai.models.countTokens({ model, config?, contents }) // respects transformation if applied
await ai.models.transcribe({ model, text, prompt? })      // test OCR/cost mode

⚙️ Options

| Option | Type | Default | Description |
| ---------------------- | --------------------------------------------- | ----------: | -------------------------------------------- |
| strategy | 'never' \| 'always' \| 'auto' | auto | Compression policy. |
| canvasW | number | 768 | Image width. |
| pageH | number | 768 | Logical page height (for page estimation). |
| marginPx | number | 0 | Internal margin. |
| fontPx | number | 9 | Font size (Arial by default). |
| lineHeight | number | 1.10 | Line height. |
| letterSpacing | number | 0 | Letter spacing. |
| imageFormat | 'image/png' \| 'image/jpeg' \| 'image/webp' | image/png | Export format. |
| jpegQuality | number | 0.92 | JPEG quality. |
| webpQuality | number | 92 | WebP quality. |
| tallMaxPagesPerImage | number | 40 | Logical pages stacked per tall image. |
| languageConsistency | boolean | true | Keep the last USER language. |
| debugSaveDir | string \| null | null | Folder to save PNG + index.html inspector. |
| debugGenerateHTML | boolean | true | Generate HTML inspector. |
| onImage | (buf, meta) => void | undefined | Callback per generated image. |
| printTokenStats | boolean | true | Prints token usage/savings stats. |
| verboseAutoLogs | boolean | true | Detailed logs for auto-mode decisions. |
| cacheImages | boolean | true | In-memory LRU cache for base64 images. |
| lruSize | number | 200 | LRU cache size. |
| autoAccurateBaseline | boolean | true | Real countTokens baseline measurement. |
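For example, to force compression, export lighter JPEGs, and silence the logs (values chosen for illustration; pass the object as the constructor's third argument):

```javascript
// Example option set with illustrative values.
const options = {
  strategy: 'always',          // always compress history into tall images
  imageFormat: 'image/jpeg',   // smaller payloads than PNG
  jpegQuality: 0.85,
  printTokenStats: false,      // quiet mode
  verboseAutoLogs: false,
  debugSaveDir: './_debug',    // keep the PNG + HTML inspector around
};
console.log(options.strategy); // → 'always'
```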


🧪 Examples

See examples/:

  • 01-basic.mjs: minimal usage with auto.
  • 02-auto.mjs: compares never / always / auto and shows savings.
  • 03-transcribe.mjs: validates OCR and cost (text vs image).

Run with:

node examples/01-basic.mjs
node examples/02-auto.mjs
node examples/03-transcribe.mjs

🛠️ Accuracy Tips

  • Keep system and last USER as text (the lib already does this).
  • Use PNG for stable OCR when accuracy matters.
  • Avoid excessive letterSpacing; dense fonts increase capacity per 768×768 block.
  • For short histories, auto will skip compression (marginal or negative savings).

🔄 Short Roadmap

  • Semantic alignment heuristics to prioritize which parts of the history to compress.
  • Optional OCR quality metric in generateContent for alerts.
  • Native support for multi-turn streaming.

🤝 Contributing

  1. Fork and create a branch: feat/your-feature.
  2. npm i and npm run test.
  3. Submit a PR to main with a clear description.
  4. To publish: create a tag vX.Y.Z and push — CI will publish to npm if NPM_TOKEN is configured.

🧾 License

MIT © Dylitan — see LICENSE


Disclaimer: The per-image cost constant (≈ 259 tok/image 768×768) is a practical approximation. Always verify with the SDK’s countTokens for your specific cases, formats, and model versions.