
@fastpaca/cria

v1.7.7

Lightweight, fast, and tiny LLM context & memory layout renderer that enforces token budgets in long-running agents.

Downloads

585

Readme

The LLM space moves fast. New models drop often. Providers change their APIs. Better vector stores emerge. New memory systems appear. Your prompts shouldn't break every time the stack evolves.

Cria is prompt architecture as code: keep the same prompt logic and swap the building blocks underneath when you need to upgrade.

import { cria } from "@fastpaca/cria";
import { createProvider } from "@fastpaca/cria/openai";
import OpenAI from "openai";

const client = new OpenAI();
const model = "gpt-5-nano";
const provider = createProvider(client, model);

// `memory`, `conversation`, `recentTurns`, and `query` come from your application.
const summarizer = cria.summarizer({
  id: "history",
  store: memory,
  provider,
});
const vectors = cria.vectordb(store);
const summary = summarizer.plugin({ history: conversation });
const retrieval = vectors.plugin({ query, limit: 8 });

const messages = await cria
  .prompt(provider)
  .system("You are a research assistant.")
  .use(summary)
  .use(cria.history({ history: recentTurns }))
  .use(retrieval)
  .user(query)
  .render({ budget: 128_000 });

const response = await client.chat.completions.create({ model, messages });

Why Cria?

When you run LLM features in production, you need to:

  1. Build prompts that last — Swap providers, models, memory, or retrieval without rewriting prompt logic. A/B test components as the stack evolves.
  2. Test like code — Evaluate prompts with LLM-as-a-judge. Run tests in CI. Catch drift when you swap building blocks.
  3. Inspect what runs — See exactly what gets sent to the model. Debug token budgets. See when your RAG input messes up the context. (Local DevTools-style inspector: planned)

Cria gives you composable prompt blocks, explicit token budgets, and building blocks you can easily customise and adapt so you move fast without breaking prompts.
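The budget idea can be sketched independently of Cria: give each block a priority and drop the most expendable blocks until the rendered prompt fits. A minimal illustration (not Cria's internals; real systems count tokens with a tokenizer, here word count stands in):

```typescript
// Illustrative only: priority-based budget trimming, not Cria's implementation.
// Convention matches the examples below: a higher priority number = dropped first.
type Block = { text: string; priority: number };

// Crude token estimate; a real renderer would use the model's tokenizer.
const tokens = (s: string): number => s.split(/\s+/).filter(Boolean).length;

function fitToBudget(blocks: Block[], budget: number): Block[] {
  const kept = [...blocks];
  // While over budget, remove the block with the highest priority number.
  while (kept.reduce((n, b) => n + tokens(b.text), 0) > budget) {
    let worst = -1;
    for (let i = 0; i < kept.length; i++) {
      if (worst === -1 || kept[i].priority > kept[worst].priority) worst = i;
    }
    if (worst === -1) break;
    kept.splice(worst, 1);
  }
  return kept;
}

const blocks: Block[] = [
  { text: "You are a helpful assistant.", priority: 0 },
  { text: "Example: Q: 2+2? A: 4. Q: 3+3? A: 6.", priority: 3 },
  { text: "User: what is the capital of France?", priority: 0 },
];
// Over a 15-token budget, the few-shot examples block is dropped first.
console.log(fitToBudget(blocks, 15).map((b) => b.priority));
```

Cria layers smarter strategies on top of this (summarization instead of plain dropping), but the priority-driven shape is the same.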

What you get

| Capability | Status |
| --- | --- |
| Component swapping via adapters | ✅ |
| Memory + vector search adapters | ✅ |
| Token budgeting | ✅ |
| Fit & compaction controls | ✅ |
| Conversation summaries | ✅ |
| OpenTelemetry integration | ✅ |
| Prompt eval/test helpers | ✅ |
| Local prompt inspector (DevTools-style) | planned |

Quick start

npm install @fastpaca/cria

import { cria } from "@fastpaca/cria";
import { createProvider } from "@fastpaca/cria/openai";
import OpenAI from "openai";

const client = new OpenAI();
const model = "gpt-5-nano";
const provider = createProvider(client, model);

const messages = await cria
  .prompt(provider)
  .system("You are a helpful assistant.")
  .user("What is the capital of France?")
  .render({ budget: 128_000 });

const response = await client.chat.completions.create({ model, messages });

Core patterns

Retrieval (RAG)

// `qdrant` is your vector store adapter; `query` comes from your application.
const vectors = cria.vectordb(qdrant);
const retrieval = vectors.plugin({ query, limit: 10 });

const messages = await cria
  .prompt(provider)
  .system("You are a research assistant.")
  .use(retrieval)
  .user(query)
  .render({ budget: 128_000 });

Conversation summaries

const summarizer = cria.summarizer({
  id: "conv",
  store: redis,
  priority: 2,
  provider,
});
const summary = summarizer.plugin({ history: conversation });

const messages = await cria
  .prompt(provider)
  .system("You are a helpful assistant.")
  .use(summary)
  .last(conversation, { n: 20 })
  .user(query)
  .render({ budget: 128_000 });

Priorities & budget strategies

const summarizer = cria.summarizer({
  id: "conv",
  store: redis,
  priority: 2,
  provider,
});
const summary = summarizer.plugin({ history: conversation });
const vectors = cria.vectordb(qdrant);
const retrieval = vectors.plugin({ query, limit: 10 });

const messages = await cria
  .prompt(provider)
  .system(SYSTEM_PROMPT)
  // Dropped first when budget is tight
  .omit(examples, { priority: 3 })
  // Summaries are run ad-hoc once we hit budget limits
  .use(summary)
  // Sacred, need to retain but limit to only 10 entries
  .use(retrieval)
  .user(query)
  // 128k token budget, once we hit the budget strategies
  // will run based on priority & usage (e.g. summaries will
  // trigger).
  .render({ budget: 128_000 });

Prompt evals

import { c, cria } from "@fastpaca/cria";
import { createProvider } from "@fastpaca/cria/ai-sdk";
import { createJudge } from "@fastpaca/cria/eval";
import { openai } from "@ai-sdk/openai";

const judge = createJudge({
  target: createProvider(openai("gpt-4o")),
  evaluator: createProvider(openai("gpt-4o-mini")),
});

const prompt = await cria
  .prompt()
  .system("You are a helpful customer support agent.")
  .user("How do I update my payment method?")
  .build();

await judge(prompt).toPass(c`Provides clear, actionable steps`);

Works with

OpenAI (Chat Completions)

import OpenAI from "openai";
import { createProvider } from "@fastpaca/cria/openai";
import { cria } from "@fastpaca/cria";

const client = new OpenAI();
const model = "gpt-5-nano";
const provider = createProvider(client, model);

const messages = await cria
  .prompt(provider)
  .system("You are helpful.")
  .user(userQuestion)
  .render({ budget: 128_000 });

const response = await client.chat.completions.create({ model, messages });

OpenAI (Responses API)

import OpenAI from "openai";
import { createResponsesProvider } from "@fastpaca/cria/openai";
import { cria } from "@fastpaca/cria";

const client = new OpenAI();
const model = "gpt-5-nano";
const provider = createResponsesProvider(client, model);

const input = await cria
  .prompt(provider)
  .system("You are helpful.")
  .user(userQuestion)
  .render({ budget: 128_000 });

const response = await client.responses.create({ model, input });

Anthropic

import Anthropic from "@anthropic-ai/sdk";
import { createProvider } from "@fastpaca/cria/anthropic";
import { cria } from "@fastpaca/cria";

const client = new Anthropic();
const model = "claude-sonnet-4";
const provider = createProvider(client, model);

const { system, messages } = await cria
  .prompt(provider)
  .system("You are helpful.")
  .user(userQuestion)
  .render({ budget: 128_000 });

const response = await client.messages.create({ model, system, messages });

Vercel AI SDK

import { createProvider } from "@fastpaca/cria/ai-sdk";
import { cria } from "@fastpaca/cria";
import { generateText } from "ai";

// `model` is an AI SDK model instance, e.g. openai("gpt-5-nano").
const provider = createProvider(model);

const messages = await cria
  .prompt(provider)
  .system("You are helpful.")
  .user(userQuestion)
  .render({ budget: 128_000 });

const { text } = await generateText({ model, messages });

Redis memory store

import { cria, type StoredSummary } from "@fastpaca/cria";
import { RedisStore } from "@fastpaca/cria/memory/redis";

const store = new RedisStore<StoredSummary>({
  host: "localhost",
  port: 6379,
});

const summarizer = cria.summarizer({
  id: "conv-123",
  store,
  priority: 2,
  provider,
});
const summary = summarizer.plugin({ history: conversation });

const messages = await cria
  .prompt(provider)
  .system("You are a helpful assistant.")
  .use(summary)
  .last(conversation, { n: 20 })
  .user(query)
  .render({ budget: 128_000 });

Postgres memory store

import { cria, type StoredSummary } from "@fastpaca/cria";
import { PostgresStore } from "@fastpaca/cria/memory/postgres";

const store = new PostgresStore<StoredSummary>({
  connectionString: "postgres://user:pass@localhost/mydb",
});

const summarizer = cria.summarizer({
  id: "conv-123",
  store,
  priority: 2,
  provider,
});
const summary = summarizer.plugin({ history: conversation });

const messages = await cria
  .prompt(provider)
  .system("You are a helpful assistant.")
  .use(summary)
  .last(conversation, { n: 20 })
  .user(query)
  .render({ budget: 128_000 });

SQLite memory store

import { cria, type StoredSummary } from "@fastpaca/cria";
import { SqliteStore } from "@fastpaca/cria/memory/sqlite";

const store = new SqliteStore<StoredSummary>({
  filename: "cria.sqlite",
});

const summarizer = cria.summarizer({
  id: "conv-123",
  store,
  priority: 2,
  provider,
});
const summary = summarizer.plugin({ history: conversation });

const messages = await cria
  .prompt(provider)
  .system("You are a helpful assistant.")
  .use(summary)
  .last(conversation, { n: 20 })
  .user(query)
  .render({ budget: 128_000 });

SQLite vector store

import { z } from "zod";
import { cria } from "@fastpaca/cria";
import { SqliteVectorStore } from "@fastpaca/cria/memory/sqlite-vector";

const store = new SqliteVectorStore<string>({
  filename: "cria.sqlite",
  dimensions: 1536,
  embed: async (text) => await getEmbedding(text),
  schema: z.string(),
});

const vectors = cria.vectordb(store);
const retrieval = vectors.plugin({ query, limit: 10 });

const messages = await cria
  .prompt(provider)
  .system("You are a research assistant.")
  .use(retrieval)
  .user(query)
  .render({ budget: 128_000 });

Chroma vector store

import { ChromaClient } from "chromadb";
import { cria } from "@fastpaca/cria";
import { ChromaStore } from "@fastpaca/cria/memory/chroma";

const client = new ChromaClient({ path: "http://localhost:8000" });
const collection = await client.getOrCreateCollection({ name: "my-docs" });

const store = new ChromaStore({
  collection,
  embed: async (text) => await getEmbedding(text),
});

const vectors = cria.vectordb(store);
const retrieval = vectors.plugin({ query, limit: 10 });

const messages = await cria
  .prompt(provider)
  .system("You are a research assistant.")
  .use(retrieval)
  .user(query)
  .render({ budget: 128_000 });

Qdrant vector store

import { QdrantClient } from "@qdrant/js-client-rest";
import { cria } from "@fastpaca/cria";
import { QdrantStore } from "@fastpaca/cria/memory/qdrant";

const client = new QdrantClient({ url: "http://localhost:6333" });

const store = new QdrantStore({
  client,
  collectionName: "my-docs",
  embed: async (text) => await getEmbedding(text),
});

const vectors = cria.vectordb(store);
const retrieval = vectors.plugin({ query, limit: 10 });

const messages = await cria
  .prompt(provider)
  .system("You are a research assistant.")
  .use(retrieval)
  .user(query)
  .render({ budget: 128_000 });

Documentation

FAQ

What does Cria output? Prompt structures/messages (via a provider adapter). You pass the rendered output into your existing LLM SDK call.

What works out of the box? Provider adapters for OpenAI (Chat Completions + Responses), Anthropic, and Vercel AI SDK; store adapters for Redis, SQLite, Postgres, Chroma, and Qdrant.

How do I validate component swaps? Swap via adapters, diff the rendered prompt output, and run prompt eval/tests to catch drift.
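Diffing rendered output can be done with a few lines of plain TypeScript, assuming the render produces an array of `{ role, content }` messages as in the examples above (the `diffRendered` helper below is hypothetical, not part of Cria):

```typescript
type Msg = { role: string; content: string };

// Hypothetical helper: report where two rendered prompts diverge,
// e.g. the same prompt rendered before and after a component swap.
function diffRendered(a: Msg[], b: Msg[]): string[] {
  const diffs: string[] = [];
  const len = Math.max(a.length, b.length);
  for (let i = 0; i < len; i++) {
    const x = a[i];
    const y = b[i];
    if (!x || !y) diffs.push(`message ${i}: only in ${x ? "A" : "B"}`);
    else if (x.role !== y.role) diffs.push(`message ${i}: role ${x.role} vs ${y.role}`);
    else if (x.content !== y.content) diffs.push(`message ${i}: content differs`);
  }
  return diffs;
}

const before: Msg[] = [
  { role: "system", content: "You are helpful." },
  { role: "user", content: "What is the capital of France?" },
];
const after: Msg[] = [
  { role: "system", content: "You are helpful." },
  { role: "user", content: "Capital of France?" },
];
console.log(diffRendered(before, after)); // → [ "message 1: content differs" ]
```

An empty diff plus passing evals is a reasonable bar before shipping a swapped component.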

What's the API stability? We use Cria in production, but the API may change before 2.0. Pin versions and follow the changelog.

Contributing

Issues and PRs welcome. Keep changes small and focused.

License

MIT