effect-inference

v0.1.0

Effect-native provider-blind inference runtime descriptors and resolution

Effect-native provider-blind runtime descriptors, route resolution, and replay-safe runtime evidence for text and embeddings workloads.

Core Model

effect-inference separates each part of runtime truth into its own authority:

  • DesiredRuntimeDescriptor records what you want to run.
  • ResolvedRouteDescriptor records how that request mapped onto a provider route, base URL, endpoint, deployment, and provider model where known.
  • ResolvedRuntimeDescriptor records what actually happened after a call completes, including response model identity, usage, finish metadata, and provider metadata.
  • RuntimeEvidence joins the pre-execution resolution record with post-execution runtime truth so downstream packages can store one replay-safe artifact.

This is the main value of the package: callers work against @effect/ai LanguageModel and EmbeddingModel, while effect-inference keeps the runtime metadata around those calls explicit and serializable.
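
To make the four-record split concrete, here is a minimal TypeScript sketch of how the records could relate. Only the field paths that appear in the Quick Start (desired.artifact.modelRef, resolvedRoute.route.family, resolvedRoute.providerModel, resolvedRuntime.responseModel) come from this README. Every other field, the Sketch type names, and joinEvidence are assumptions for illustration and are not the package's actual schema.

```typescript
// Hypothetical shapes: only the field paths used in the Quick Start are taken
// from the README; everything else is an illustrative assumption.
interface DesiredRuntimeSketch {
  readonly artifact: { readonly modelRef: string }
}

interface ResolvedRouteSketch {
  readonly route: { readonly family: string }
  readonly baseUrl?: string
  readonly providerModel?: string
}

interface ResolvedRuntimeSketch {
  readonly responseModel: string
  readonly finishReason?: string
}

// Pre-execution record: what was requested and how it was routed.
interface ResolutionSketch {
  readonly desired: DesiredRuntimeSketch
  readonly resolvedRoute: ResolvedRouteSketch
}

// Post-execution artifact: the resolution plus what actually happened.
interface EvidenceSketch extends ResolutionSketch {
  readonly resolvedRuntime: ResolvedRuntimeSketch
}

// Join the two halves into one plain-data record. It holds no secrets and no
// functions, so it survives a JSON round trip unchanged.
const joinEvidence = (
  resolution: ResolutionSketch,
  resolvedRuntime: ResolvedRuntimeSketch
): EvidenceSketch => ({
  desired: resolution.desired,
  resolvedRoute: resolution.resolvedRoute,
  resolvedRuntime
})

const sketchResolution: ResolutionSketch = {
  desired: { artifact: { modelRef: "meta-llama/Llama-3.3-70B-Instruct" } },
  resolvedRoute: {
    route: { family: "HuggingFace" },
    providerModel: "example-provider/llama-3.3-70b" // made-up provider model id
  }
}

const sketchEvidence = joinEvidence(sketchResolution, {
  responseModel:
    sketchResolution.resolvedRoute.providerModel ??
    sketchResolution.desired.artifact.modelRef,
  finishReason: "stop"
})

// Store and reload the evidence as JSON, as a downstream package might.
const replayedEvidence: EvidenceSketch = JSON.parse(JSON.stringify(sketchEvidence))
console.log(replayedEvidence.resolvedRuntime.responseModel)
```

In this sketch, "replay-safe" is read as "plain data that can be serialized, stored, and parsed back without loss". The package may attach a stricter meaning to the term.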

Quick Start

import * as EmbeddingModel from "@effect/ai/EmbeddingModel"
import * as LanguageModel from "@effect/ai/LanguageModel"
import { Effect, Redacted } from "effect"
import { HuggingFace, Runtime } from "effect-inference"

const program = Effect.gen(function* () {
  // Resolve the requested model onto a live Hugging Face route.
  // The token below is a placeholder; supply a real one, ideally from config.
  const resolution = yield* HuggingFace.resolveLiveRuntime({
    serveMode: "routed-marketplace",
    model: "meta-llama/Llama-3.3-70B-Instruct",
    accessToken: Redacted.make("hf_xxxxxxxxxxxxxx"),
    selectionPolicy: "fastest"
  })

  // Build the layers that back LanguageModel and EmbeddingModel for this resolution.
  const languageModelLayer = yield* HuggingFace.languageModelLayer(resolution)
  const embeddingModelLayer = yield* HuggingFace.embeddingModelLayer(resolution)

  // Generate text against the resolved route.
  const summary = yield* LanguageModel.generateText({
    prompt: "Explain runtime provenance in one sentence.",
    toolChoice: "none"
  }).pipe(Effect.provide(languageModelLayer))

  // Embed the generated text through the same resolution.
  const embeddings = yield* EmbeddingModel.EmbeddingModel.pipe(
    Effect.flatMap((model) => model.embedMany([summary.text])),
    Effect.provide(embeddingModelLayer)
  )

  // Join the pre-execution resolution with the post-execution response model,
  // falling back to the requested model ref when the provider model is unknown.
  const evidence = Runtime.makeRuntimeEvidence({
    resolution,
    resolvedRuntime: {
      responseModel: resolution.resolvedRoute.providerModel ?? resolution.desired.artifact.modelRef
    }
  })

  // Log the fields that tie the request to what actually ran.
  return yield* Effect.log({
    requested: evidence.desired.artifact.modelRef,
    routeFamily: evidence.resolvedRoute.route.family,
    responseModel: evidence.resolvedRuntime.responseModel,
    finishReason: summary.finishReason,
    embeddingDimensions: embeddings[0]?.length
  })
})

Using Hugging Face Live Runtimes

HuggingFace.resolveLiveRuntime(...) returns the canonical RuntimeResolution record for routed-provider and dedicated-endpoint usage, with requested descriptor truth, resolved route provenance, capability metadata, and authenticated live layers kept together. The related helpers build on it:

  • HuggingFace.resolveLiveRuntimeConfig(...) decodes the same routed or endpoint shape from env-backed config.
  • HuggingFace.resolveLiveRuntimeFromConfig(...) composes that config step with live runtime resolution in one call.
  • HuggingFace.languageModelLayer(...) and HuggingFace.embeddingModelLayer(...) take the resulting resolution and give you the exact layer to provide to LanguageModel.generateText(...) or EmbeddingModel.EmbeddingModel.
  • Runtime.makeRuntimeEvidence(...) turns the result into replay-safe runtime evidence after the call completes.

RuntimeResolver remains the provider-blind, secret-free resolver surface. The Hugging Face helpers are the auth-bound companion for real routed and endpoint execution.

Other Entry Paths

If you want a config-driven helper for hosted and brokered text providers, Runtime.resolveLiveTextProviderRuntime(...) builds descriptors and LanguageModel layers for OpenAI, Anthropic, and OpenRouter without pulling those provider names into the rest of your program.

Live Example Verification

bun run --filter 'effect-inference' examples:verify executes the live examples behind an explicit opt-in gate. Set EFFECT_INFERENCE_RUN_LIVE_EXAMPLES=true to enable the harness. To run a subset, set EFFECT_INFERENCE_LIVE_EXAMPLES to a comma-separated list of example names: runtime-config-decoding, hugging-face-routed-runtime, or hugging-face-endpoint-runtime.
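
The opt-in gate can be pictured as a small pure function over the environment. The sketch below is not the package's harness: selectLiveExamples is a hypothetical helper, and the choices to run every example when no list is given and to silently drop unknown names are assumptions. Only the two variable names and the three example names come from this README.

```typescript
// Example names listed in this README for EFFECT_INFERENCE_LIVE_EXAMPLES.
const KNOWN_LIVE_EXAMPLES = [
  "runtime-config-decoding",
  "hugging-face-routed-runtime",
  "hugging-face-endpoint-runtime"
] as const

type LiveExampleName = (typeof KNOWN_LIVE_EXAMPLES)[number]
type EnvRecord = Readonly<Record<string, string | undefined>>

// Hypothetical helper: returns the examples to run, or [] when the gate is closed.
// Assumption: an unset or empty list means "run every known example".
const selectLiveExamples = (env: EnvRecord): ReadonlyArray<LiveExampleName> => {
  if (env.EFFECT_INFERENCE_RUN_LIVE_EXAMPLES !== "true") return []

  const requested = (env.EFFECT_INFERENCE_LIVE_EXAMPLES ?? "")
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0)

  if (requested.length === 0) return KNOWN_LIVE_EXAMPLES

  // Unknown names are dropped here; a real harness might reject them instead.
  return KNOWN_LIVE_EXAMPLES.filter((known) => requested.includes(known))
}

// Gate closed: nothing runs, even if a list is supplied.
console.log(selectLiveExamples({ EFFECT_INFERENCE_LIVE_EXAMPLES: "runtime-config-decoding" }))

// Gate open with a subset selected.
console.log(
  selectLiveExamples({
    EFFECT_INFERENCE_RUN_LIVE_EXAMPLES: "true",
    EFFECT_INFERENCE_LIVE_EXAMPLES: "hugging-face-routed-runtime"
  })
)
```

Taking the environment as a parameter, rather than reading a global, keeps the gate testable without touching real credentials.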

The Hugging Face config helper reads env-backed keys such as HUGGINGFACE_ACCESS_TOKEN, HUGGINGFACE_SELECTION_POLICY, HUGGINGFACE_ENDPOINT_BASE_URL, HUGGINGFACE_ENDPOINT_ID, HUGGINGFACE_DEPLOYMENT_ID, and HUGGINGFACE_RUNTIME_FLAVOR. The routed example only needs a token unless you want to override the router URL or selection policy. The endpoint example needs a token plus real endpoint coordinates.
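
The difference between what the routed and endpoint examples require can be sketched as a decoder that returns either a config or the list of missing keys. This is an illustration only: decodeHfEnv, the result shapes, and the "dedicated-endpoint" label are hypothetical, and treating "endpoint coordinates" as base URL plus endpoint ID is an assumption. The environment key names and "routed-marketplace" are taken from this README. The package wraps tokens in Redacted; the sketch uses a plain string to stay dependency-free.

```typescript
type HfEnv = Readonly<Record<string, string | undefined>>

// Hypothetical decoded shapes; not the package's actual config types.
type HfConfigSketch =
  | {
      readonly serveMode: "routed-marketplace"
      readonly accessToken: string
      readonly selectionPolicy: string | undefined
    }
  | {
      readonly serveMode: "dedicated-endpoint"
      readonly accessToken: string
      readonly endpointBaseUrl: string
      readonly endpointId: string
    }

type HfDecodeResult =
  | { readonly ok: true; readonly config: HfConfigSketch }
  | { readonly ok: false; readonly missing: ReadonlyArray<string> }

const decodeHfEnv = (env: HfEnv): HfDecodeResult => {
  const accessToken = env.HUGGINGFACE_ACCESS_TOKEN
  const endpointBaseUrl = env.HUGGINGFACE_ENDPOINT_BASE_URL
  const endpointId = env.HUGGINGFACE_ENDPOINT_ID
  // Assumption: setting either endpoint key signals dedicated-endpoint mode.
  const wantsEndpoint = Boolean(endpointBaseUrl || endpointId)

  // Collect every missing key so the caller sees all problems at once.
  const missing: Array<string> = []
  if (!accessToken) missing.push("HUGGINGFACE_ACCESS_TOKEN")
  if (wantsEndpoint && !endpointBaseUrl) missing.push("HUGGINGFACE_ENDPOINT_BASE_URL")
  if (wantsEndpoint && !endpointId) missing.push("HUGGINGFACE_ENDPOINT_ID")

  if (!accessToken || (wantsEndpoint && (!endpointBaseUrl || !endpointId))) {
    return { ok: false, missing }
  }
  if (endpointBaseUrl && endpointId) {
    return {
      ok: true,
      config: { serveMode: "dedicated-endpoint", accessToken, endpointBaseUrl, endpointId }
    }
  }
  // Routed mode needs only the token; the selection policy is optional.
  return {
    ok: true,
    config: {
      serveMode: "routed-marketplace",
      accessToken,
      selectionPolicy: env.HUGGINGFACE_SELECTION_POLICY
    }
  }
}

console.log(decodeHfEnv({ HUGGINGFACE_ACCESS_TOKEN: "hf_example" }))
```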

Route Families

  • OpenAiCompatible — the stable transport family for brokered, dedicated, and self-hosted OpenAI-compatible text and embeddings runtimes
  • OpenAiResponses — direct OpenAI Responses support on an explicit companion lane
  • AnthropicMessages — direct Anthropic Messages support on an explicit companion lane
  • HuggingFace — Hugging Face routed-provider and dedicated-endpoint authorities with typed selection policy and deployment identity
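
One way to picture how provider names stay out of the rest of a program is a single mapping from a provider label to a route family. The four family names below come from the list above. The provider labels, the routeFamilyFor function, and the specific pairings (in particular OpenRouter as a brokered OpenAI-compatible runtime) are assumptions made for illustration, not documented behavior.

```typescript
// Family names from this README, modeled here as a string-literal union.
type RouteFamilySketch =
  | "OpenAiCompatible"
  | "OpenAiResponses"
  | "AnthropicMessages"
  | "HuggingFace"

// Hypothetical provider labels, used only at the configuration boundary.
type ProviderLabel = "openai" | "anthropic" | "openrouter" | "self-hosted" | "hugging-face"

// Assumed pairings. The switch is exhaustive, so adding a new ProviderLabel
// without handling it becomes a compile-time error.
const routeFamilyFor = (provider: ProviderLabel): RouteFamilySketch => {
  switch (provider) {
    case "openai":
      return "OpenAiResponses"
    case "anthropic":
      return "AnthropicMessages"
    case "openrouter": // assumed to be a brokered OpenAI-compatible runtime
    case "self-hosted":
      return "OpenAiCompatible"
    case "hugging-face":
      return "HuggingFace"
  }
}

console.log(routeFamilyFor("openrouter"))
```

Code downstream of this boundary would branch on the family, if at all, and never on the provider label.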

Example Stories

  • examples/01-openai-compatible-static-runtime.ts — self-hosted OpenAI-compatible descriptor and evidence assembly
  • examples/02-hugging-face-routed-runtime.ts — Hugging Face routed-provider live runtime resolution plus LanguageModel.generateText
  • examples/03-runtime-config-decoding.ts — config-driven direct provider runtime construction through Runtime.resolveLiveTextProviderRuntime
  • examples/04-hugging-face-endpoint-runtime.ts — Hugging Face dedicated endpoint live runtime resolution plus embeddings execution

Entry Points

  • effect-inference
  • effect-inference/Contracts
  • effect-inference/Errors
  • effect-inference/Runtime
  • effect-inference/OpenAiCompatible
  • effect-inference/HuggingFace
  • effect-inference/Testing
  • effect-inference/experimental

Testing

effect-inference/Testing exports deterministic fixtures and static layers so downstream packages can prove runtime boundaries without importing live provider adapters:

  • Testing.makeDesiredRuntimeDescriptor
  • Testing.makeResolvedRouteDescriptor
  • Testing.makeResolvedRuntimeDescriptor
  • Testing.makeRuntimeEvidenceFixture
  • Testing.staticRuntimeResolver
  • Testing.staticLanguageModel
  • Testing.staticEmbeddingModel
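
The idea behind static fixtures can be shown without the package itself. The sketch below uses hypothetical, synchronous stand-ins (ResolverSketch, TextModelSketch, makeStaticResolver, makeStaticTextModel); the real Testing exports are built on Effect and @effect/ai, and their signatures are not shown in this README.

```typescript
// Hypothetical stand-ins for a resolver and a text model.
interface ResolverSketch {
  readonly resolve: (modelRef: string) => {
    readonly modelRef: string
    readonly routeFamily: string
  }
}

interface TextModelSketch {
  readonly generateText: (prompt: string) => {
    readonly text: string
    readonly finishReason: string
  }
}

// A static resolver always maps to the same route, with no network or secrets.
const makeStaticResolver = (routeFamily: string): ResolverSketch => ({
  resolve: (modelRef) => ({ modelRef, routeFamily })
})

// A static model returns a canned answer regardless of the prompt.
const makeStaticTextModel = (text: string): TextModelSketch => ({
  generateText: () => ({ text, finishReason: "stop" })
})

// Code under test depends only on the two interfaces, never on a live adapter.
const runAndRecord = (
  resolver: ResolverSketch,
  model: TextModelSketch,
  modelRef: string,
  prompt: string
) => {
  const route = resolver.resolve(modelRef)
  const output = model.generateText(prompt)
  return {
    requested: modelRef,
    routeFamily: route.routeFamily,
    text: output.text,
    finishReason: output.finishReason
  }
}

const fixtureRecord = runAndRecord(
  makeStaticResolver("OpenAiCompatible"),
  makeStaticTextModel("fixed answer"),
  "local/test-model", // made-up model ref
  "any prompt"
)
console.log(fixtureRecord)
```

Because both doubles are deterministic, a downstream test can assert on the whole record instead of matching loosely on provider output.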

Development

bun run check
bun run check:tests
bun run lint
bun run test
bun run build
bun run docgen