
@vmallela/spectral

v0.1.0

Published

Causal Observability for AI Agents

Downloads

46

Readme

Spectral

Causal observability for AI agents. Drop one line into your app and get a full trace of every LLM call, tool use, cost, latency, and behavioral invariant — with a CLI to explore, replay, and evaluate everything.

npm install spectral-obs
spectral traces

Getting Started

1. Install

npm install spectral-obs

2. Wrap your Anthropic client

import Anthropic from '@anthropic-ai/sdk';
import { spectral } from 'spectral-obs';

const client = spectral.wrap(new Anthropic(), {
  taskType: 'code-review',   // label for grouping traces + evals
  captureInputs: true,       // store prompts (disable for sensitive data)
});

// Use client exactly as before — nothing else changes
const response = await client.messages.create({
  model: 'claude-opus-4-6',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Review this PR...' }],
});

That's it. Every call is now traced, hashed, and stored in ~/.spectral/spectral.db.


CLI

spectral traces                          # recent runs
spectral inspect  <trace-id>            # tree view of every span
spectral waterfall <trace-id>           # latency waterfall chart
spectral cost    --last 7d              # cost breakdown by model
spectral replay  <trace-id> \
  --swap-step 2 --with-input "..."     # re-run one step, see the diff
spectral scan                           # silent failure detection
spectral eval learn  <task-type>        # mine behavioral invariants
spectral eval run    <trace-id> <type>  # run invariants against a trace
spectral eval show   <task-type>        # list learned invariants
spectral eval pin    <invariant-id>     # lock an invariant across updates
spectral eval export <task-type>        # dump suite as JSON

Core features

Trace explorer

spectral traces lists your most recent runs with cost, latency, and status. spectral inspect <id> renders the full trace DAG as a tree with token counts and timing for every span.

Waterfall

spectral waterfall <trace-id> renders a terminal bar chart of every span's contribution to total latency — useful for finding which tool or LLM call is the bottleneck.

Cost tracking

spectral cost --last 7d breaks down spend by model across the last N days. Pricing is built in for all current Claude models.
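The rollup itself is simple to picture. A minimal sketch of a per-model cost aggregation, with made-up rates rather than Spectral's built-in pricing table:

```typescript
// Illustrative per-model cost rollup. The rates below are hypothetical
// placeholders, not Spectral's actual pricing data.
interface Call {
  model: string;
  inputTokens: number;
  outputTokens: number;
}

const RATES: Record<string, { in: number; out: number }> = {
  'model-a': { in: 3e-6, out: 15e-6 }, // $/token, invented for illustration
  'model-b': { in: 0.8e-6, out: 4e-6 },
};

function costByModel(calls: Call[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const c of calls) {
    const r = RATES[c.model];
    if (!r) continue; // unknown models are skipped in this sketch
    totals[c.model] =
      (totals[c.model] ?? 0) + c.inputTokens * r.in + c.outputTokens * r.out;
  }
  return totals;
}
```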

Replay engine

spectral replay abc123 --swap-step 2 --with-input "Be more concise"

Loads the cached trace, replaces step 2's input with your new prompt, calls the API live, and shows:

  • A line-by-line diff of the old vs new output
  • Cost delta and latency delta

No need to re-run your whole agent to test a single prompt change.

Silent failure detection

spectral scan

Runs z-score anomaly detection on the output hash distribution for each task type. Flags runs where the output unexpectedly changed while the input didn't — a common sign of silent regressions after a model upgrade or prompt edit.
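To make the statistics concrete, here is a hypothetical sketch of the idea (not Spectral's internals): compute z-scores over each task type's output-change rate and flag the outliers.

```typescript
// Hypothetical z-score outlier sketch: given the fraction of runs per task
// whose output hash changed while the input hash didn't, flag tasks whose
// change rate deviates strongly from the mean.
function zScores(values: number[]): number[] {
  const mean = values.reduce((a, b) => a + b, 0) / values.length;
  const variance =
    values.reduce((a, v) => a + (v - mean) ** 2, 0) / values.length;
  const std = Math.sqrt(variance) || 1; // guard against zero variance
  return values.map(v => (v - mean) / std);
}

// Fraction of runs with an unexpected output change, per task type
const changeRates = [0.02, 0.01, 0.03, 0.45, 0.02];

const flagged = zScores(changeRates)
  .map((z, index) => ({ index, z }))
  .filter(({ z }) => Math.abs(z) > 1.5); // flags index 3, the anomaly
```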


Behavioral evals

Spectral can learn what "normal" looks like from your production traces and then check new traces against those expectations automatically.

Learn invariants from traces

spectral eval learn code-review --limit 50

Analyzes your last 50 code-review runs and extracts invariants across three dimensions:

| Dimension | What it mines | Cost |
|------------|---------------|------|
| Structural | Tool ordering, call counts, step count, repetition loops, never-final tools | Free |
| Content | Output line-count bounds, LLM-extracted presence/absence/format patterns | Free + optional Haiku |
| Causal | Which tool outputs flow into downstream inputs (Jaccard similarity) | Free |

Each invariant gets a score:

score = 0.4·consistency + 0.25·specificity + 0.25·actionability − 0.1·cost

Only invariants above the threshold (default 0.5) are saved.
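The published formula translates directly into code. A sketch using the weights above (the signal names and 0..1 ranges are assumptions for illustration):

```typescript
// Scoring formula from the README:
// score = 0.4*consistency + 0.25*specificity + 0.25*actionability - 0.1*cost
// The 0..1 signal ranges are an assumption for this sketch.
interface InvariantSignals {
  consistency: number;   // how often the pattern held across traces
  specificity: number;   // how narrow/precise the pattern is
  actionability: number; // how useful a violation report would be
  cost: number;          // relative cost to evaluate the invariant
}

function invariantScore(s: InvariantSignals): number {
  return (
    0.4 * s.consistency +
    0.25 * s.specificity +
    0.25 * s.actionability -
    0.1 * s.cost
  );
}

// Only invariants at or above the threshold (default 0.5) are kept
const keep = (s: InvariantSignals, threshold = 0.5): boolean =>
  invariantScore(s) >= threshold;
```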

Run evals on a new trace

spectral eval run <trace-id> code-review

Checks the trace against all learned invariants in priority order:

  1. Structural — pure graph analysis, instant
  2. Deterministic content — regex / line-count, instant
  3. Heuristic causal — Jaccard similarity, instant
  4. LLM judge — Claude Haiku, skipped if a critical violation is already found

✓ 11/12 checks passed (91%)

Violations:
  ✗ [critical] search_files output flows into write_file input
    write_file shows 2% overlap with search_files output (min 8%)
    Fix: write_file may be ignoring output from search_files — blind operation detected
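The overlap percentage in that violation comes from a Jaccard check. A hypothetical sketch of the idea (not Spectral's actual tokenizer): compare the token set of one tool's output against a downstream tool's input.

```typescript
// Hypothetical sketch of the heuristic causal check: Jaccard similarity
// between a tool's output tokens and a downstream tool's input tokens.
// The tokenizer here (lowercase, split on non-word chars) is an assumption.
function tokens(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/\W+/).filter(Boolean));
}

function jaccard(a: Set<string>, b: Set<string>): number {
  const inter = [...a].filter(t => b.has(t)).length;
  const union = new Set([...a, ...b]).size;
  return union === 0 ? 0 : inter / union;
}

const searchOutput = tokens('found config.ts with stale API key');
const writeInput = tokens('write new config.ts replacing the stale API key');
const overlap = jaccard(searchOutput, writeInput);
// An overlap below the learned minimum (e.g. 8%) would flag a blind operation.
```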

Pin invariants

spectral eval pin inv_01abc123

Pinned invariants survive future eval learn refreshes — useful for invariants you've manually reviewed and want to treat as ground truth.


How it works

Zero-overhead hot path

messages.create() called
       │
       ▼
  generateId()          ← ~0.001 ms
  Date.now() ×2         ← ~0.001 ms
  pipeline.push(ref)    ← ring buffer write, ~0.001 ms
       │
       ▼                 (background, off the call stack)
  drain()               ← serialize + hash
  batch flush           ← single SQLite transaction

The intercepted call adds ~0.003 ms to TTFT. The rest happens asynchronously.

Storage

All data lives in ~/.spectral/spectral.db — a single WAL-mode SQLite file. No server, no account, no data leaves your machine.

Performance internals

| Component | Technique | Benefit |
|-----------|-----------|---------|
| `RingBuffer<T>` | Pre-allocated power-of-2 array, bitwise modulo | O(1) push/drain, no GC pressure |
| `fastHash` | Murmur3 × 2 seeds | ~35× faster than SHA-256 |
| `BatchWriter` | Prepared statement + `db.transaction()` | One fsync per batch, not per trace |
| `TracePipeline` | Three-lane: hot → drain → flush | Hot path never touches SQLite |
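The ring-buffer trick is worth spelling out: when capacity is a power of two, `index & (capacity - 1)` replaces the modulo, and pre-allocating the slots means pushes never allocate. A minimal sketch of the pattern (not Spectral's actual implementation):

```typescript
// Minimal power-of-2 ring buffer sketch. Bitwise AND replaces modulo, and
// the backing array is pre-allocated, so push() allocates nothing.
class RingBuffer<T> {
  private buf: (T | undefined)[];
  private mask: number;
  private head = 0; // next write position (monotonic counter)
  private tail = 0; // next read position (monotonic counter)

  constructor(capacityPow2: number) {
    this.buf = new Array(capacityPow2);
    this.mask = capacityPow2 - 1; // valid only for power-of-2 capacities
  }

  push(item: T): void {
    this.buf[this.head & this.mask] = item;
    this.head++;
    // When full, advance tail so the oldest entry is overwritten
    if (this.head - this.tail > this.buf.length) this.tail++;
  }

  drain(): T[] {
    const out: T[] = [];
    while (this.tail < this.head) {
      out.push(this.buf[this.tail & this.mask] as T);
      this.tail++;
    }
    return out;
  }
}
```

Overwriting the oldest entry on overflow is one policy choice; a tracing pipeline could equally drop the newest item to preserve earlier spans.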


SDK reference

import { spectral } from 'spectral-obs';

// Wrap a client
const client = spectral.wrap(anthropicClient, {
  taskType?: string,        // groups runs for evals + cost tracking
  captureInputs?: boolean,  // default true
  dbPath?: string,          // default ~/.spectral/spectral.db
});

// Access the underlying stores directly if needed
const store    = spectral.getStore();
const pipeline = spectral.getPipeline();

// Clean shutdown (flushes pending traces)
spectral.closeAll();

Development

npm test          # 249 tests, all green
npm run build     # compile to dist/
npm run dev       # watch mode

Tests use Vitest with pool: 'forks' for native module compatibility. All tests are self-contained and create temporary SQLite databases.


License

MIT