ollama-helpers
v1.2.1
Production utilities for Ollama in Node.js — response caching, connection pooling, health checks, structured logging, and embedding cache
Install
npm install ollama ollama-helpers
What It Provides
import {
  // Response caching: avoid redundant inference calls
  ResponseCache,
  // Connection pooling: distribute across Ollama instances
  ConnectionPool,
  // Health checks: monitor server and model availability
  HealthCheck,
  // Structured logging: JSON logs for production pipelines
  StructuredLogger,
  // Embedding cache: deduplicate identical embedding calls
  EmbeddingCache,
} from "ollama-helpers";
These are not available in the official ollama package.
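To make the first item concrete, here is a minimal sketch of what a bounded response cache with expiry generally does. It is an illustration only, not this package's implementation: the class name `TtlCache`, its constructor arguments, and the injectable clock are all hypothetical.

```typescript
// Hypothetical sketch: a bounded cache with per-entry expiry and
// least-recently-used eviction. Names are illustrative only.
interface Entry<V> {
  value: V;
  expiresAt: number; // epoch milliseconds
}

class TtlCache<V> {
  private entries = new Map<string, Entry<V>>();
  private maxEntries: number;
  private ttlMs: number;
  private now: () => number;

  constructor(maxEntries: number, ttlMs: number, now: () => number = () => Date.now()) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, useful in tests
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    // Re-insert so the Map's insertion order tracks recency of use.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    if (this.entries.size >= this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

A plain `Map` is enough here because it iterates in insertion order, so deleting and re-inserting a key on every read keeps the oldest entry at the front.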
Quick Start
import { Ollama } from "ollama";
import { ResponseCache, StructuredLogger } from "ollama-helpers";

const ollama = new Ollama();
// Keep up to 200 responses, each for 10 minutes
const cache = new ResponseCache({ maxEntries: 200, defaultTtlMs: 600_000 });
const logger = new StructuredLogger({ serviceName: "my-app" });

const prompt = "Explain quantum computing";
const key = ResponseCache.createKey("llama3.1", prompt);

const cached = cache.get(key);
if (cached) {
  console.log("Cache hit:", cached);
} else {
  // Cache miss: call the model, log the timing, and store the reply
  logger.logRequest("llama3.1");
  const start = Date.now();
  const { message } = await ollama.chat({
    model: "llama3.1",
    messages: [{ role: "user", content: prompt }],
  });
  logger.logResponse("llama3.1", Date.now() - start);
  cache.set(key, message.content);
  console.log(message.content);
}
Connection Pooling
Distribute requests across multiple Ollama instances.
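As a rough illustration of the idea, the sketch below picks the least busy host and enforces a per-host cap. It assumes a least-connections strategy, which this README does not confirm; `HostPool` and its behavior when every host is full are hypothetical.

```typescript
// Hypothetical sketch of least-busy host selection with a per-host cap.
// This is not the package's ConnectionPool; names are illustrative.
class HostPool {
  private active = new Map<string, number>();
  private maxPerHost: number;

  constructor(hosts: string[], maxPerHost: number) {
    this.maxPerHost = maxPerHost;
    for (const host of hosts) this.active.set(host, 0);
  }

  // Pick the host with the fewest in-flight requests; throw if all are full.
  acquire(): { host: string; release: () => void } {
    let best: string | undefined;
    let bestCount = Infinity;
    for (const [host, count] of this.active) {
      if (count < this.maxPerHost && count < bestCount) {
        best = host;
        bestCount = count;
      }
    }
    if (best === undefined) throw new Error("all hosts are at capacity");

    const host = best;
    this.active.set(host, bestCount + 1);
    let released = false;
    return {
      host,
      release: () => {
        if (released) return; // a second release is a no-op
        released = true;
        this.active.set(host, (this.active.get(host) ?? 1) - 1);
      },
    };
  }
}
```

Whatever the selection strategy, calling `release()` in a `finally` block keeps a failed request from leaking a slot. The acquired host can be passed to the official client as `new Ollama({ host })`. The package's own interface looks like this: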
import { ConnectionPool } from "ollama-helpers";
const pool = new ConnectionPool({
  hosts: ["http://gpu-1:11434", "http://gpu-2:11434"],
  maxConnectionsPerHost: 5,
});
const conn = pool.acquire();
// Use conn.host for your Ollama client
conn.release();
Health Checks
Monitor Ollama availability for readiness probes.
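For context, a single readiness probe can be sketched against Ollama's `GET /api/tags` endpoint, which lists locally installed models. This is an assumption about how such a check could work, not a description of this package's internals; `probeOllama`, `FetchLike`, and the tag-matching rule are hypothetical.

```typescript
// Hypothetical sketch of a one-shot readiness probe against GET /api/tags.
type FetchLike = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

interface ProbeResult {
  healthy: boolean;
  missingModels: string[];
}

async function probeOllama(
  host: string,
  requiredModels: string[],
  fetchFn: FetchLike, // pass the global fetch in Node 18+, or a stub in tests
): Promise<ProbeResult> {
  try {
    const res = await fetchFn(`${host}/api/tags`);
    if (!res.ok) return { healthy: false, missingModels: requiredModels };

    const body = (await res.json()) as { models?: { name: string }[] };
    const installed = (body.models ?? []).map((m) => m.name);

    // Assumption: "llama3.1" is satisfied by any tag, e.g. "llama3.1:latest".
    const missingModels = requiredModels.filter(
      (required) =>
        !installed.some((name) => name === required || name.startsWith(`${required}:`)),
    );
    return { healthy: missingModels.length === 0, missingModels };
  } catch {
    // Network errors (server down, DNS failure) count as unhealthy.
    return { healthy: false, missingModels: requiredModels };
  }
}
```

Taking the fetch function as a parameter keeps the probe testable without a running server. The package's own interface looks like this: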
import { HealthCheck } from "ollama-helpers";
const health = new HealthCheck({
  host: "http://localhost:11434",
  requiredModels: ["llama3.1"],
  onStatusChange: (s) => console.log("Ollama:", s.healthy ? "up" : "down"),
});
health.start();
Embedding Cache
Avoid re-computing identical embeddings.
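The get-or-compute pattern behind this kind of cache can be sketched as follows. This is a generic illustration under stated assumptions, not the package's code: `SimpleEmbeddingCache`, the FIFO eviction, and the choice to skip caching failures are all hypothetical.

```typescript
// Hypothetical sketch of a get-or-compute embedding cache.
class SimpleEmbeddingCache {
  // Storing promises means concurrent callers share one in-flight request.
  private entries = new Map<string, Promise<number[]>>();
  private maxEntries: number;

  constructor(maxEntries: number) {
    this.maxEntries = maxEntries;
  }

  getOrCompute(
    model: string,
    text: string,
    compute: (text: string) => Promise<number[]>,
  ): Promise<number[]> {
    // JSON.stringify gives an unambiguous key for the (model, text) pair.
    const key = JSON.stringify([model, text]);
    const hit = this.entries.get(key);
    if (hit) return hit;

    if (this.entries.size >= this.maxEntries) {
      // Evict the oldest inserted entry (FIFO) to stay within the bound.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }

    const pending = compute(text);
    this.entries.set(key, pending);
    // Do not cache failures: remove the entry so a later call can retry.
    pending.catch(() => {
      if (this.entries.get(key) === pending) this.entries.delete(key);
    });
    return pending;
  }
}
```

Keying on both model and text matters because the same text yields different vectors under different embedding models. The package's own interface looks like this: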
import { Ollama } from "ollama";
import { EmbeddingCache } from "ollama-helpers";
const ollama = new Ollama();
const embedCache = new EmbeddingCache({ maxEntries: 10_000 });
const embedding = await embedCache.getOrCompute(
  "nomic-embed-text",
  "some document text",
  // Runs only on a cache miss
  async (text) => {
    const { embeddings } = await ollama.embed({ model: "nomic-embed-text", input: text });
    return embeddings[0];
  }
);
Documentation
Full guides and tutorials at ollama-js.dev.
Telemetry
This package runs a lightweight environment compatibility check on install and reports anonymous diagnostics (platform, Node version, architecture). No source code, tokens, or credentials are transmitted. The check respects DO_NOT_TRACK and can also be disabled explicitly at install time:
OLLAMA_HELPERS_TELEMETRY_DISABLED=1 npm install
License
MIT
