# @sriinnu/kosha-discovery
AI Model & Provider Discovery Registry -- kosha (कोश) treasury of models
Kosha (कोश — treasury) discovers AI models across providers, resolves credentials, enriches with pricing, and exposes the catalog through a library, CLI, HTTP API, and a built-in OpenAI-compatible proxy. One source of truth for model identity, pricing, and routing — so your app doesn't break when providers ship new SKUs or change rates.
## Install

```shell
npm install @sriinnu/kosha-discovery      # library / server
npm install -g @sriinnu/kosha-discovery   # global `kosha` CLI
```

## Quick start
### Library
```ts
import { createKosha } from "@sriinnu/kosha-discovery";

const kosha = await createKosha();
const models = kosha.models();                            // all
const cheapest = kosha.cheapestModels({ role: "image" }); // ranked
const sonnet = kosha.model("sonnet");                     // alias resolves
console.log(sonnet.pricing); // { inputPerMillion: 3, outputPerMillion: 15, ... }
```
### CLI

```shell
kosha discover                    # discover all providers (writes cache + manifest)
kosha list --provider anthropic   # filter from local cache
kosha model sonnet                # details for one model (alias-aware)
kosha cheapest --role embeddings  # rank cheapest for a role
kosha update                      # force a fresh fetch
kosha serve --port 3000           # HTTP API
```

After each discovery, a stable v1 manifest lands at `~/.kosha/registry.json` — any tool that reads JSON can consume it:

```shell
jq '.models[] | select(.pricing.inputPerMillion < 0.1) | .modelId' ~/.kosha/registry.json
```
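The same filter works from any language that reads JSON. Here is a minimal TypeScript sketch mirroring the jq one-liner; it assumes only the manifest fields the query above already uses (`models`, `modelId`, `pricing.inputPerMillion`), and the sample entries in the usage note are hypothetical:

```typescript
import { readFileSync } from "node:fs";
import { join } from "node:path";
import { homedir } from "node:os";

interface ManifestModel {
  modelId: string;
  pricing?: { inputPerMillion?: number };
}

// Load the manifest kosha writes after each discovery.
function loadManifest(): { models: ManifestModel[] } {
  const path = join(homedir(), ".kosha", "registry.json");
  return JSON.parse(readFileSync(path, "utf8"));
}

// Mirror of the jq query: model IDs priced under a ceiling
// (USD per million input tokens); unpriced models never match.
function cheapModelIds(models: ManifestModel[], ceiling: number): string[] {
  return models
    .filter((m) => (m.pricing?.inputPerMillion ?? Infinity) < ceiling)
    .map((m) => m.modelId);
}
```

`cheapModelIds(loadManifest().models, 0.1)` then reproduces the jq output.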
### HTTP API

```
GET  /api/models[?provider=…&role=…]
GET  /api/models/:idOrAlias
GET  /api/models/:idOrAlias/routes
GET  /api/models/cheapest?role=…
GET  /api/providers
GET  /api/roles
POST /api/refresh
GET  /health
```
## Proxy

Kosha runs as an OpenAI-compatible proxy. Point your SDK at `http://localhost:3000/proxy/v1` and it resolves the model, picks the right provider, injects credentials, and forwards the request — streaming included.
```shell
kosha serve   # start on :3000
```

```ts
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:3000/proxy/v1",
  apiKey: "not-used", // kosha resolves credentials from env
});

// Use any canonical model ID or alias
const res = await client.chat.completions.create({
  model: "sonnet",
  messages: [{ role: "user", content: "hello" }],
});

// Let kosha pick the cheapest model you have a key for
const cheap = await client.chat.completions.create({
  model: "kosha:cheapest",
  messages: [{ role: "user", content: "hello" }],
});

// Cheapest model with tool_use and at least 128k context
const routed = await client.chat.completions.create({
  model: "kosha:cheapest[tool_use,128k]",
  messages: [{ role: "user", content: "hello" }],
});
```

`kosha:cheapest` filter syntax (comma-separated, combinable):
| Filter | Example | Meaning |
|--------|---------|---------|
| capability | `tool_use`, `vision` | model must have this tag |
| `<N>k` | `128k`, `200k` | minimum context window |
| `provider:<id>` | `provider:groq` | pin to a specific provider |
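A hint like `kosha:cheapest[tool_use,128k]` is straightforward to take apart. The sketch below is an illustration of the syntax, not kosha's internal parser; the field names `capabilities`, `minContextK`, and `provider` are invented for the example:

```typescript
interface CheapestHint {
  capabilities: string[]; // e.g. ["tool_use", "vision"]
  minContextK?: number;   // from a "<N>k" filter
  provider?: string;      // from a "provider:<id>" filter
}

// Split "kosha:cheapest[...]" into its comma-separated filters;
// return null for anything that is not a cheapest-hint at all.
function parseCheapestHint(model: string): CheapestHint | null {
  const m = /^kosha:cheapest(?:\[([^\]]*)\])?$/.exec(model);
  if (!m) return null;
  const hint: CheapestHint = { capabilities: [] };
  const filters = (m[1] ?? "").split(",").map((s) => s.trim()).filter(Boolean);
  for (const raw of filters) {
    const ctx = /^(\d+)k$/.exec(raw);
    if (ctx) hint.minContextK = Number(ctx[1]);
    else if (raw.startsWith("provider:")) hint.provider = raw.slice("provider:".length);
    else hint.capabilities.push(raw); // anything else is a capability tag
  }
  return hint;
}
```

For example, `parseCheapestHint("kosha:cheapest[tool_use,128k,provider:groq]")` yields one capability, a 128k context floor, and a groq pin, while a plain model ID returns `null`.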
The response always includes `x-kosha-model`, `x-kosha-provider`, and `x-kosha-requested` headers, so the caller knows exactly what ran.

Supported transports: `openai`, `openai-compatible-http`, `ollama`. Anthropic, Google, Bedrock, and Vertex require wire-format translation — not yet proxied.
## Supported providers
| Provider | Discovery | Credential sources |
|----------|-----------|--------------------|
| Anthropic | /v1/models | ANTHROPIC_API_KEY, Claude CLI, Codex CLI |
| OpenAI | /v1/models | OPENAI_API_KEY, GitHub Copilot tokens |
| Google | /v1beta/models | GOOGLE_API_KEY, GEMINI_API_KEY, Gemini CLI, gcloud |
| AWS Bedrock | SDK → CLI → static | AWS_ACCESS_KEY_ID, ~/.aws/credentials, SSO, IAM |
| Vertex AI | API + gcloud | GOOGLE_APPLICATION_CREDENTIALS, ADC |
| Ollama | local API | — (local) |
| OpenRouter | API | OPENROUTER_API_KEY (optional) |
| NVIDIA / Together / Fireworks / Groq / Cerebras / Cohere / DeepInfra / Perplexity | API | provider key env var |
| DeepSeek / Mistral / Moonshot (Kimi) / GLM (Zhipu) / Z.AI / MiniMax | API | provider key env var |
Full credential setup: docs/credentials.md.
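Across providers, credential resolution amounts to "first source that yields a key wins". A minimal sketch of that pattern (the chain below is illustrative; kosha's real per-provider chains, including CLI config files and SSO, are in docs/credentials.md):

```typescript
type KeySource = () => string | undefined;

// Try each source in order; the first non-empty key wins.
function resolveKey(sources: KeySource[]): string | undefined {
  for (const source of sources) {
    const key = source();
    if (key) return key;
  }
  return undefined;
}

// Illustrative Anthropic-style chain: env var first, then a stand-in
// for a key read from a local CLI config file.
const anthropicKey = resolveKey([
  () => process.env.ANTHROPIC_API_KEY,
  () => undefined, // hypothetical: read Claude CLI config here
]);
```

Empty strings are treated as "no key", so an unset or blank env var falls through to the next source.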
## Architecture
- **Discovery layer** talks to provider APIs and local catalogs.
- **Enrichment layer** fills in pricing and context windows from the LiteLLM catalog and models.dev.
- **Resilience layer** (circuit breaker + stale-cache fallback + health tracker) turns a flaky provider into a degraded read, never a crash.
- **Manifest layer** writes a v1-stable JSON snapshot so downstream consumers — tokmeter, chitragupta, ayuh — read prices from one source instead of inventing their own.
- **Proxy layer** exposes an OpenAI-compatible endpoint that resolves `kosha:cheapest[…]` hints at request time, injects credentials, and forwards to the winning provider.
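The stale-cache half of the resilience story fits in a few lines. This is a sketch of the pattern, not kosha's implementation; the `CacheEntry` shape and `stale` flag are invented for the example:

```typescript
interface CacheEntry<T> {
  value: T;
  fetchedAt: number; // epoch ms of the last successful fetch
}

// Serve fresh data when the fetch succeeds; fall back to the last
// good snapshot (marked stale) when the provider is down.
async function fetchWithStaleFallback<T>(
  fetchFresh: () => Promise<T>,
  cache: CacheEntry<T> | undefined,
): Promise<{ value: T; stale: boolean }> {
  try {
    const value = await fetchFresh();
    return { value, stale: false };
  } catch (err) {
    if (cache) return { value: cache.value, stale: true }; // degraded read
    throw err; // nothing cached: surface the failure
  }
}
```

A circuit breaker would sit in front of `fetchFresh`, skipping the live call entirely while the provider's failure rate is above threshold.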
## Docs

| Doc | Covers |
|---|---|
| Credentials | Env vars, CLI tools, and config files for every provider |
| CLI | Commands, flags, examples |
| HTTP API | Endpoints, parameters, response schemas |
| Configuration | Aliases, routing, enrichment, programmatic config |
| Architecture | Discovery flow, module map, adding providers |
| Resilience | Circuit breakers, stale cache, health |
| Security | Threat catalogue, runtime scanning, pre-commit hook |
| Discovery Plane v1 | Stable daemon contract (deltas, SSE watch, binding hints) |
## Release
Tag-driven via GitHub Actions:
```shell
git tag -s vX.Y.Z -m "vX.Y.Z" && git push origin vX.Y.Z
# → Actions → "Manual Release (Tag + npm)" → run with tag=vX.Y.Z
```

The workflow checks tag ↔ package.json match, builds, tests, publishes to npm, and creates the GitHub Release. Requires the `NPM_TOKEN` secret.
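The tag ↔ package.json check amounts to a one-line comparison. A sketch of the idea (the actual workflow step may be written differently):

```typescript
// A release tag "vX.Y.Z" must match package.json's "version": "X.Y.Z".
function tagMatchesVersion(tag: string, pkgVersion: string): boolean {
  return /^v\d+\.\d+\.\d+$/.test(tag) && tag === `v${pkgVersion}`;
}
```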
## Credits
litellm (pricing data) · openrouter · ollama · chitragupta (registry patterns) · takumi (routing needs that drove kosha's creation).
## License
MIT
