
@mullion/ai-sdk

v0.3.0

Mullion integration with Vercel AI SDK

Installation

npm install @mullion/ai-sdk ai zod

Overview

This package provides a seamless integration between Mullion and the Vercel AI SDK, enabling type-safe context management for LLM operations with automatic confidence tracking, provider-aware caching, and cost estimation.

Features

  • Type-safe contexts - Full Mullion Owned<T, S> integration
  • Automatic confidence scoring - Based on finish reasons
  • Provider-aware caching - Anthropic/OpenAI optimizations
  • Cost estimation & tracking - Pre-call estimates and actual costs
  • Cache metrics - Hit rates, savings calculation
  • Fork integration - Warmup strategies, schema conflict detection
  • Safe-by-default caching - Never cache user content without opt-in
  • TTL support - '5m', '1h', '1d' cache lifetimes
  • All AI SDK providers - OpenAI, Anthropic, Google, custom

Quick Start

Basic Usage

import {createMullionClient} from '@mullion/ai-sdk';
import {openai} from '@ai-sdk/openai';
import {z} from 'zod';

// Create a client with your preferred model
const client = createMullionClient(openai('gpt-4'));

// Define your data schema
const EmailSchema = z.object({
  intent: z.enum(['support', 'sales', 'billing', 'general']),
  urgency: z.enum(['low', 'medium', 'high']),
  entities: z.array(z.string()).describe('Key entities mentioned'),
});

// Scoped LLM operations
const analysis = await client.scope('email-intake', async (ctx) => {
  const result = await ctx.infer(EmailSchema, userEmail);

  // Automatic confidence checking
  if (result.confidence < 0.8) {
    throw new Error('Low confidence - needs human review');
  }

  return ctx.use(result);
});

Multi-Scope Workflows

// Process data across different security contexts
const result = await client.scope('admin', async (adminCtx) => {
  // Admin scope: access sensitive data
  const adminData = await adminCtx.infer(DataSchema, sensitiveInput);

  return await client.scope('user', async (userCtx) => {
    // User scope: safe for customer-facing operations
    const bridged = userCtx.bridge(adminData); // ✅ Explicit bridge
    const response = await userCtx.infer(ResponseSchema, bridged.value);

    return userCtx.use(response);
  });
});

Supported Providers

Works with all Vercel AI SDK providers:

OpenAI

import {openai} from '@ai-sdk/openai';

const client = createMullionClient(openai('gpt-4'));
// or
const client = createMullionClient(openai('gpt-3.5-turbo'));

Anthropic

import {anthropic} from '@ai-sdk/anthropic';

const client = createMullionClient(anthropic('claude-3-5-sonnet-20241022'));

Google

import {google} from '@ai-sdk/google';

const client = createMullionClient(google('gemini-1.5-pro'));

Custom Providers

import {createOpenAI} from '@ai-sdk/openai';

const customProvider = createOpenAI({
  apiKey: process.env.CUSTOM_API_KEY,
  baseURL: 'https://your-custom-endpoint.com/v1',
});

const client = createMullionClient(customProvider('your-model'));

Features

Automatic Confidence Scoring

Confidence is automatically extracted from LLM finish reasons:

const result = await ctx.infer(schema, input);

// Confidence mapping:
// stop: 1.0          - Model completed naturally
// tool-calls: 0.95   - Model made tool calls
// length: 0.75       - Output truncated due to token limit
// content-filter: 0.6 - Content was filtered
// other: 0.5         - Unknown reason
// error: 0.3         - Error occurred

console.log(`Confidence: ${result.confidence}`);

Schema Integration

Full Zod schema support with type inference:

const ProductSchema = z.object({
  name: z.string().describe('Product name'),
  price: z.number().positive().describe('Price in USD'),
  category: z.enum(['electronics', 'clothing', 'books']),
  features: z.array(z.string()).optional(),
});

const product = await ctx.infer(ProductSchema, productDescription);
// product.value is fully typed as ProductSchema's inferred type

Inference Options

Customize LLM behavior:

const result = await ctx.infer(schema, input, {
  temperature: 0.7,
  maxTokens: 500,
  systemPrompt: 'You are a helpful assistant specialized in data extraction.',
});

Caching

Provider-aware caching with safe-by-default behavior and automatic optimization.

Basic Caching

const result = await client.scope('analysis', async (ctx) => {
  // Add cacheable content
  ctx.cache.addSystemPrompt('You are an expert data analyst.');
  ctx.cache.addDeveloperContent(largeDocument, {
    ttl: '5m', // Time-to-live: '5m' | '1h' | '1d'
    scope: 'ephemeral', // or 'persistent'
  });

  // This inference benefits from caching on repeat calls
  const analysis = await ctx.infer(AnalysisSchema, 'Analyze this data');

  // Check cache performance
  const stats = await ctx.getCacheStats();
  console.log(`Cache hits: ${stats.cacheReadTokens} tokens`);
  console.log(`Saved: $${stats.estimatedSavings.toFixed(4)}`);

  return ctx.use(analysis);
});

Cache Segments API

// System prompts (always safe to cache)
ctx.cache.addSystemPrompt('You are a helpful assistant');

// Developer content (your content, safe to cache)
ctx.cache.addDeveloperContent(documentation, {
  ttl: '1h',
  scope: 'persistent',
});

// User content (requires explicit opt-in)
ctx.cache.addDeveloperContent(userQuery, {
  scope: 'allow-user-content', // ⚠️ Only if safe!
  ttl: '5m',
});

Provider-Specific Features:

| Provider  | Min Tokens | TTL Options | Auto-Cache      |
| --------- | ---------- | ----------- | --------------- |
| Anthropic | 1024-4096  | 5m, 1h, 1d  | No (explicit)   |
| OpenAI    | 1024       | 1h (fixed)  | Yes (automatic) |

Learn more: See docs/reference/caching.md

Cost Estimation

Track and predict LLM costs before and after API calls.

Pre-Call Estimation

const estimate = await ctx.estimateNextCallCost(schema, input);
console.log(`Estimated cost: $${estimate.totalCost.toFixed(4)}`);
console.log(`Input tokens: ${estimate.inputTokens}`);
console.log(`Expected output tokens: ${estimate.outputTokens}`);

if (estimate.totalCost > 0.1) {
  console.warn('High cost operation!');
}

Post-Call Tracking

const result = await ctx.infer(schema, input);

const actual = await ctx.getLastCallCost();
console.log(`Actual cost: $${actual.totalCost.toFixed(4)}`);
console.log(`Cache saved: $${actual.cacheSavings.toFixed(4)}`);
console.log(`Net cost: $${actual.netCost.toFixed(4)}`);

// Compare against the pre-call estimate returned by estimateNextCallCost()
const diff = actual.totalCost - estimate.totalCost;
console.log(`Difference: $${diff.toFixed(4)}`);

Token Estimation

import {estimateTokens} from '@mullion/ai-sdk';

const estimate = estimateTokens(text, 'gpt-4');
console.log(`${estimate.tokens} tokens (${estimate.method})`);

Pricing API

import {getPricing, PRICING_DATA} from '@mullion/ai-sdk';

const pricing = getPricing('claude-3-5-sonnet-20241022');
console.log(`Input: $${pricing.inputTokenPrice}/token`);
console.log(`Output: $${pricing.outputTokenPrice}/token`);
console.log(`Cache write: $${pricing.cacheWritePrice}/token`);
console.log(`Cache read: $${pricing.cacheReadPrice}/token`);

// Custom pricing
PRICING_DATA['custom-model'] = {
  modelId: 'custom-model',
  provider: 'custom',
  inputTokenPrice: 0.000002,
  outputTokenPrice: 0.000008,
};

Learn more: See docs/reference/cost-estimation.md

Fork/Merge Integration

Parallel execution with cache optimization and cost tracking.

Warmup Strategies

const result = await ctx.fork({
  branches: {
    model1: (c) => c.infer(schema, prompt),
    model2: (c) => c.infer(schema, prompt),
    model3: (c) => c.infer(schema, prompt),
  },
  strategy: 'cache-optimized',
  warmup: 'first-branch', // Prime cache with first branch
});

// Aggregate cache stats
const stats = await Promise.all(
  Object.values(result).map((r) => r.context.getCacheStats()),
);

Schema Conflict Detection

import {detectSchemaConflict} from '@mullion/ai-sdk';

const result = await ctx.fork({
  branches: {
    simple: (c) => c.infer(SimpleSchema, prompt),
    complex: (c) => c.infer(ComplexSchema, prompt), // Different schema!
  },
  onSchemaConflict: 'warn', // or 'error', 'ignore'
});
// Console warning: "Schema conflict detected - limited cache reuse"

Best Practice: Use unified schemas for fork branches:

// ✅ GOOD: Same schema, full cache reuse
const UnifiedSchema = z.object({
  analysisA: SchemaA,
  analysisB: SchemaB,
});

const result = await ctx.fork({
  branches: {
    a: (c) => c.infer(UnifiedSchema, prompt),
    b: (c) => c.infer(UnifiedSchema, prompt),
  },
});

Learn more: See docs/reference/fork.md

Advanced Examples

Error Handling with Confidence

async function processWithConfidence<T>(
  ctx: Context<string>,
  schema: z.ZodType<T>,
  input: string,
  minConfidence = 0.8,
): Promise<T> {
  const result = await ctx.infer(schema, input);

  if (result.confidence < minConfidence) {
    throw new Error(
      `Low confidence: ${result.confidence.toFixed(2)} < ${minConfidence}. ` +
        `Trace ID: ${result.traceId}`,
    );
  }

  return ctx.use(result);
}

Multi-Step Processing

const analysis = await client.scope('analysis', async (ctx) => {
  // Step 1: Extract entities
  const entities = await ctx.infer(EntitiesSchema, rawText);

  // Step 2: Classify sentiment
  const sentiment = await ctx.infer(SentimentSchema, rawText);

  // Step 3: Combine results
  if (entities.confidence > 0.8 && sentiment.confidence > 0.8) {
    return {
      entities: ctx.use(entities),
      sentiment: ctx.use(sentiment),
    };
  } else {
    throw new Error('Insufficient confidence for analysis');
  }
});

Bridging Complex Data

const pipeline = await client.scope('ingestion', async (ingestCtx) => {
  const rawData = await ingestCtx.infer(RawSchema, input);

  return await client.scope('processing', async (processCtx) => {
    const bridged = processCtx.bridge(rawData);

    return await client.scope('output', async (outputCtx) => {
      const processed = outputCtx.bridge(bridged);
      const final = await outputCtx.infer(OutputSchema, processed.value);

      // final.__scope is 'ingestion' | 'processing' | 'output'
      return outputCtx.use(final);
    });
  });
});

API Reference

Client & Context

createMullionClient(model, options?)

  • Creates Mullion client with AI SDK integration
  • Returns: MullionClient with scope() method

MullionClient.scope<S, R>(name, fn)

  • Creates scoped execution context
  • Returns: Promise<R>

Context<S>.infer<T>(schema, input, options?)

  • Infer structured data using LLM
  • Returns: Promise<Owned<T, S>>

Context<S>.bridge<T, OS>(owned)

  • Transfer value from another scope
  • Returns: Owned<T, S | OS>

Context<S>.use<T>(owned)

  • Extract raw value (scope-safe)
  • Returns: T

Caching

Context Methods:

  • ctx.cache.addSystemPrompt(content) - Add system prompt to cache
  • ctx.cache.addDeveloperContent(content, options) - Add developer content
  • ctx.getCacheStats() - Get cache performance metrics

Utilities:

  • getCacheCapabilities(provider, model) - Get provider cache capabilities
  • supportsCacheFeature(provider, feature) - Check feature support
  • isValidTtl(ttl) - Validate TTL string
  • validateTtlOrdering(segments) - Validate TTL ordering
  • createAnthropicAdapter(options) - Create Anthropic adapter
  • createOpenAIAdapter(options) - Create OpenAI adapter
  • createCacheSegmentManager(options) - Create cache manager
  • parseAnthropicMetrics(response) - Parse Anthropic metrics
  • parseOpenAIMetrics(response) - Parse OpenAI metrics
  • aggregateCacheMetrics(stats) - Aggregate metrics
  • estimateCacheSavings(stats, pricing) - Estimate savings
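
For orientation, here is a minimal sketch of how a few of these helpers might be combined. The return shapes, the capability fields, and the 'ttl' feature name are assumptions for illustration, not documented behavior.

import {
  getCacheCapabilities,
  supportsCacheFeature,
  isValidTtl,
} from '@mullion/ai-sdk';

// Validate a TTL string before attaching it to a cache segment.
const ttl = '1h';
if (!isValidTtl(ttl)) {
  throw new Error(`Unsupported TTL: ${ttl}`);
}

// Inspect what the target provider/model can cache (return shape assumed).
const capabilities = getCacheCapabilities('anthropic', 'claude-3-5-sonnet-20241022');
console.log(capabilities);

// Per-provider feature flags ('ttl' feature name is an assumption).
if (supportsCacheFeature('anthropic', 'ttl')) {
  // Safe to request custom TTLs on cache segments for this provider.
}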

Types:

  • CacheSegmentManager, CacheSegment, CacheConfig
  • CacheStats, CacheCapabilities, CacheScope, CacheTTL

Cost Estimation

Context Methods:

  • ctx.estimateNextCallCost(schema, input, options?) - Estimate before call
  • ctx.getLastCallCost() - Get actual cost after call

Token Estimation:

  • estimateTokens(text, model?) - Estimate token count
  • estimateTokensForSegments(segments, model) - Estimate for segments

Pricing:

  • getPricing(modelId) - Get pricing for model
  • getAllPricing() - Get all pricing data
  • getPricingByProvider(provider) - Get provider pricing
  • PRICING_DATA - Global pricing object
  • exportPricingAsJSON() - Export pricing
  • importPricingFromJSON(data) - Import pricing
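
As a hedged illustration of the export/import helpers above (assuming the exported JSON round-trips cleanly), a pricing table can be snapshotted and later restored, e.g. around test-time overrides to PRICING_DATA:

import {
  getAllPricing,
  exportPricingAsJSON,
  importPricingFromJSON,
} from '@mullion/ai-sdk';

// Snapshot the current pricing table (serialized shape assumed to round-trip).
const snapshot = exportPricingAsJSON();

// ...apply temporary overrides to PRICING_DATA here, then restore the snapshot.
importPricingFromJSON(snapshot);

console.log(Object.keys(getAllPricing()).length, 'models with pricing data');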

Cost Calculation:

  • calculateCost(params) - Calculate cost from usage
  • estimateCost(params) - Estimate cost
  • calculateBatchCost(calls) - Calculate batch costs
  • formatCostBreakdown(cost) - Format for display
  • compareCosts(estimated, actual) - Compare costs
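
A minimal sketch of the cost-calculation helpers; the parameter shape passed to calculateCost (a model id plus token counts) is an assumption based on the usage fields shown elsewhere in this README.

import {calculateCost, formatCostBreakdown} from '@mullion/ai-sdk';

// Params shape is an assumption: model id plus token counts from a completed call.
const cost = calculateCost({
  modelId: 'gpt-4',
  inputTokens: 1200,
  outputTokens: 300,
});

// Human-readable breakdown for logging or dashboards.
console.log(formatCostBreakdown(cost));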

Types:

  • TokenEstimate, ModelPricing, CostBreakdown, TokenUsage

Fork/Merge Integration

Warmup:

  • explicitWarmup(config) - Explicit cache warmup
  • firstBranchWarmup(branches) - First-branch warmup
  • createWarmupExecutor(config) - Create warmup executor
  • setupWarmupExecutor(config) - Setup global executor
  • estimateWarmupCost(config) - Estimate warmup cost
  • shouldWarmup(estimate) - Warmup recommendation

Schema Conflicts:

  • computeSchemaSignature(schema) - Compute schema hash
  • detectSchemaConflict(branches, options) - Detect conflicts
  • handleSchemaConflict(conflict, behavior) - Handle conflict
  • areSchemasCompatible(schemaA, schemaB) - Check compatibility
  • describeSchemasDifference(schemaA, schemaB) - Describe diff
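
For illustration, a hedged sketch of checking two Zod schemas for cache compatibility before forking; the boolean and string return types are assumptions, and the two schemas below are stand-ins for your own.

import {z} from 'zod';
import {areSchemasCompatible, describeSchemasDifference} from '@mullion/ai-sdk';

const SimpleSchema = z.object({summary: z.string()});
const ComplexSchema = z.object({summary: z.string(), entities: z.array(z.string())});

// Assumed: boolean result indicating whether fork branches can share cached prefixes.
if (!areSchemasCompatible(SimpleSchema, ComplexSchema)) {
  // Assumed: human-readable description of the structural difference.
  console.warn(describeSchemasDifference(SimpleSchema, ComplexSchema));
}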

Types:

  • WarmupConfig, WarmupResult, SchemaInfo
  • DetectSchemaConflictOptions, DetailedSchemaConflictResult

Confidence Scoring

  • extractConfidenceFromFinishReason(reason) - Extract confidence

Confidence Mapping:

  • stop: 1.0 - Model completed naturally
  • tool-calls: 0.95 - Model made tool calls
  • length: 0.75 - Truncated by token limit
  • content-filter: 0.6 - Content filtered
  • other: 0.5 - Unknown reason
  • error: 0.3 - Error occurred
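
As a small worked example of the mapping above (assuming the helper takes the AI SDK finish-reason string and returns the listed number):

import {extractConfidenceFromFinishReason} from '@mullion/ai-sdk';

// 'length' means the output was truncated, so confidence drops to 0.75 per the mapping above.
const confidence = extractConfidenceFromFinishReason('length');
console.log(confidence); // 0.75

if (confidence < 0.8) {
  // Route to human review, retry with a higher maxTokens, etc.
}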

Related Packages

Documentation

Integration with ESLint

Use with @mullion/eslint-plugin for compile-time leak detection:

npm install @mullion/eslint-plugin --save-dev

// eslint.config.js
import mullion from '@mullion/eslint-plugin';

export default [
  {
    plugins: {'@mullion': mullion},
    rules: {
      '@mullion/no-context-leak': 'error',
      '@mullion/require-confidence-check': 'warn',
    },
  },
];

Examples

See the examples directory for complete implementations.

Contributing

Found a bug or want to contribute? See CONTRIBUTING.md for guidelines.

License

MIT - see LICENSE for details.