@vibeatlas/ship-reliability-sdk

v1.0.0

@ship-protocol/reliability-sdk

Domain-agnostic framework for scoring ANY AI agent's output reliability.

SHIP Protocol scores code today. This SDK generalizes the methodology to score any agent output: code agents, customer service agents, legal agents, and more — through pluggable domain adapters.

Install

npm install @ship-protocol/reliability-sdk

Python:

cd python && pip install -e .

Quick Start

3-Line LangChain Integration

import { ShipReliability } from '@ship-protocol/reliability-sdk';
const ship = new ShipReliability({ domain: 'code' });
chain = chain.pipe(ship.asLangChainMiddleware());

Direct Scoring

import { ShipReliability } from '@ship-protocol/reliability-sdk';

const ship = new ShipReliability({ domain: 'code' });
const score = await ship.score({
  input: 'Write a REST API',
  output: 'feat: add REST API with Express.js and error handling',
  metadata: { language: 'typescript', tool: 'claude' },
});

console.log(score);
// => { score: 73, grade: 'B', confidence: 0.82, factors: {...}, domain: 'code', ... }

Python

from ship_reliability import ShipReliability, ScoringInput

ship = ShipReliability(domain="code")
score = ship.score(ScoringInput(
    input="Write a REST API",
    output="feat: add REST API with Express.js",
    metadata={"language": "typescript", "tool": "claude"}
))
print(f"Score: {score.score}, Grade: {score.grade}")

Domains

code — Code Reliability (SHIP API)

Calls the live SHIP API /v2/score endpoint. Requires network access.

const ship = new ShipReliability({ domain: 'code' });
const score = await ship.score({
  input: 'Write authentication',
  output: 'feat: add JWT auth with bcrypt',
  metadata: { language: 'typescript', tool: 'claude', repo: 'owner/repo' },
});

Metadata fields:

  • language — Programming language (default: 'typescript')
  • tool — AI tool that generated the code (e.g. 'claude', 'copilot')
  • repo — Repository in owner/repo format
  • owner — Repository owner

customer-service — Customer Service Reliability

Heuristic scoring for customer service agent responses. Evaluates empathy, actionability, certainty, and channel fit.

const ship = new ShipReliability({ domain: 'customer-service' });
const score = await ship.score({
  input: 'My order arrived damaged',
  output: "I'm sorry about that. Here's how to get a replacement: go to Orders > Returns.",
  metadata: { channel: 'chat', category: 'returns' },
});

Factors scored: empathy, actionability, certainty, response_length, channel_fit

general — General Agent Reliability

Heuristic scoring for any text agent output. Works offline without API calls.

const ship = new ShipReliability({ domain: 'general' });
const score = await ship.score({
  input: 'Explain closures in JavaScript',
  output: 'A closure is a function that...',
});

Factors scored: completeness, coherence, relevance, specificity, consistency

Custom Domain Adapters

Implement DomainAdapter to add scoring for any agent type:

import { ShipReliability, scoreToGrade } from '@ship-protocol/reliability-sdk';
import type { DomainAdapter, ScoringInput, ScoringConfig, ReliabilityScore } from '@ship-protocol/reliability-sdk';

class LegalDomainAdapter implements DomainAdapter {
  readonly domain = 'legal';
  readonly name = 'Legal Document Reliability';

  validate(input: ScoringInput) {
    return { valid: !!input.output, errors: input.output ? [] : ['output required'] };
  }

  async score(input: ScoringInput, config: ScoringConfig): Promise<ReliabilityScore> {
    // Your domain-specific scoring logic here
    const score = 75;
    return {
      score,
      grade: scoreToGrade(score),
      confidence: 0.6,
      factors: { precision: 80, structure: 70 },
      domain: this.domain,
      timestamp: new Date().toISOString(),
      modelVersion: 'legal-v1',
      recommendations: [],
    };
  }
}

const ship = new ShipReliability({ adapter: new LegalDomainAdapter() });
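
The placeholder `const score = 75` above is where real heuristics would go. As a rough illustration only, the sketch below computes two factors from the output text and averages them. It runs without the SDK: `ScoringInput` is redeclared locally to mirror the shape used in this README, and the helper names (`citationFactor`, `precisionFactor`, `scoreLegalOutput`), the regular expressions and the weights are all hypothetical choices, not part of the package.

```typescript
// Local stand-in mirroring the input shape used in this README
// (assumption: the real SDK type may carry more fields).
interface ScoringInput {
  input: string;
  output: string;
  metadata?: Record<string, unknown>;
}

interface FactorResult {
  score: number;
  factors: Record<string, number>;
}

// Clamp and round a value into the 0-100 range used by ReliabilityScore.score.
function clamp100(n: number): number {
  return Math.max(0, Math.min(100, Math.round(n)));
}

// Hypothetical heuristic: reward explicit references such as "§ 12" or "Section 4".
function citationFactor(text: string): number {
  const matches = text.match(/(§\s*\d+|Section\s+\d+)/gi) ?? [];
  return clamp100(40 + matches.length * 20);
}

// Hypothetical heuristic: penalise vague hedging words.
function precisionFactor(text: string): number {
  const hedges = text.match(/\b(maybe|probably|might|perhaps)\b/gi) ?? [];
  return clamp100(100 - hedges.length * 15);
}

// Combine the factors with an equal-weight mean (an arbitrary choice for this sketch).
function scoreLegalOutput(input: ScoringInput): FactorResult {
  const factors = {
    precision: precisionFactor(input.output),
    structure: citationFactor(input.output),
  };
  const score = clamp100((factors.precision + factors.structure) / 2);
  return { score, factors };
}
```

Inside a real adapter, the body of `score()` could call a function like this and spread its `score` and `factors` into the returned `ReliabilityScore`.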

Integrations

LangChain

import {
  ShipReliability,
  ShipReliabilityCallback,
  createLangChainMiddleware,
} from '@ship-protocol/reliability-sdk';

const ship = new ShipReliability({ domain: 'code' });

// Option 1: Pipe middleware
chain = chain.pipe(ship.asLangChainMiddleware());

// Option 2: Standalone middleware
const middleware = createLangChainMiddleware({ domain: 'general' });

// Option 3: Callback handler
const callback = new ShipReliabilityCallback({ domain: 'code' });
await callback.onChainEnd(chainOutput);
console.log(callback.getAverageScore());

Python LangChain:

from ship_reliability import ShipReliability
from langchain_core.runnables import RunnableLambda

ship = ShipReliability(domain="code")
chain = my_chain | RunnableLambda(ship.as_langchain_middleware())

OpenAI

import { wrapOpenAIChat } from '@ship-protocol/reliability-sdk';
import OpenAI from 'openai';

const openai = new OpenAI();
const scoredChat = wrapOpenAIChat(
  openai.chat.completions.create.bind(openai.chat.completions),
  { domain: 'general' }
);

const { result, reliability } = await scoredChat({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'Write a function' }],
});
console.log(`Score: ${reliability.score}`);

Anthropic

import { wrapAnthropicMessages } from '@ship-protocol/reliability-sdk';
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic();
const scoredMessages = wrapAnthropicMessages(
  anthropic.messages.create.bind(anthropic.messages),
  { domain: 'general' }
);

const { result, reliability } = await scoredMessages({
  model: 'claude-sonnet-4-6',
  messages: [{ role: 'user', content: 'Write a function' }],
  max_tokens: 1024,
});
console.log(`Score: ${reliability.score}`);
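
Both wrappers above return the provider's response next to a reliability score. The pattern can be sketched generically as below. This is not the SDK's implementation: `wrapWithScore`, `Scorer`, `fakeChat` and `lengthScorer` are hypothetical names, and the stand-in API and scorer exist only so the block runs without network access or API keys.

```typescript
// Hypothetical scorer shape: receives the prompt text and the reply text.
type Scorer<S> = (input: string, output: string) => Promise<S>;

// Wrap any async API call: run it, extract the texts, score them,
// and return the raw result and the score side by side.
function wrapWithScore<Req, Res, S>(
  call: (req: Req) => Promise<Res>,
  getInput: (req: Req) => string,
  getOutput: (res: Res) => string,
  scorer: Scorer<S>,
): (req: Req) => Promise<{ result: Res; reliability: S }> {
  return async (req) => {
    const result = await call(req);
    const reliability = await scorer(getInput(req), getOutput(result));
    return { result, reliability };
  };
}

// Stand-in request/response types and API (assumptions for the demo only).
interface FakeRequest { prompt: string }
interface FakeResponse { text: string }

const fakeChat = async (req: FakeRequest): Promise<FakeResponse> => ({
  text: `echo: ${req.prompt}`,
});

// Toy scorer: reply length capped at 100. A real scorer would be far richer.
const lengthScorer = async (_prompt: string, reply: string) => ({
  score: Math.min(100, reply.length),
});

const scoredFakeChat = wrapWithScore(
  fakeChat,
  (req) => req.prompt,
  (res) => res.text,
  lengthScorer,
);
```

Calling `await scoredFakeChat({ prompt: 'hello' })` resolves to an object with `result` and `reliability`, the same shape the OpenAI and Anthropic examples destructure.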

ShipClient (Low-Level API Client)

Direct access to the SHIP Protocol API:

import { ShipClient } from '@ship-protocol/reliability-sdk';

const client = new ShipClient({
  baseUrl: 'https://ship-protocol.dhruvaapi.workers.dev', // default
  timeout: 10000,
  retries: 2,
});

// Score a commit
const score = await client.score({
  commit_message: 'feat: add auth',
  language: 'typescript',
  tool: 'claude',
});

// Detect AI-generated code
const detection = await client.detect({
  code: 'function foo() { return bar; }',
  language: 'javascript',
});

// Get AI tool comparison data and reliability rankings
const tools = await client.tools();
const leaderboard = await client.leaderboard();

// Check system health (GET /v2/crawler-status)
const health = await client.health();

API Reference

ShipReliability

Main class for domain-agnostic reliability scoring.

Constructor

new ShipReliability(config?: ShipReliabilityConfig)

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| domain | string | 'general' | Built-in domain: 'code', 'customer-service', 'general' |
| adapter | DomainAdapter | — | Custom domain adapter (overrides domain) |
| apiUrl | string | SHIP API URL | Base URL for API calls |
| timeout | number | 10000 | Request timeout in ms |
| retries | number | 2 | Number of retries on failure |
| cache | boolean | true | Enable LRU score caching |
| cacheSize | number | 100 | Max cached entries |
| cacheTtl | number | 300000 | Cache TTL in ms (5 min) |

Methods

| Method | Returns | Description |
|--------|---------|-------------|
| score(input) | Promise<ReliabilityScore> | Score an agent output |
| asLangChainMiddleware() | Function | Get a LangChain-compatible middleware |
| wrapOpenAI(fn) | Function | Wrap an OpenAI completion call |
| wrapAnthropic(fn) | Function | Wrap an Anthropic message call |
| clearCache() | void | Clear the score cache |
| domain | string | Current domain name |

ReliabilityScore

Universal score type returned by all adapters.

interface ReliabilityScore {
  score: number;          // 0-100
  grade: string;          // A+, A, B, C, D, F
  confidence: number;     // 0-1
  factors: Record<string, number>;
  domain: string;
  timestamp: string;      // ISO 8601
  modelVersion: string;
  recommendations: string[];
}
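
The custom adapter example imports `scoreToGrade` to turn a numeric score into one of the grades listed above. The SDK's actual cut-offs are not documented in this README, so the sketch below uses assumed boundaries, chosen only so that the Quick Start sample (score 73, grade 'B') holds. `sketchScoreToGrade` and `GRADE_BANDS` are hypothetical names, deliberately different from the real export.

```typescript
type Grade = 'A+' | 'A' | 'B' | 'C' | 'D' | 'F';

// Assumed cut-offs (not taken from the SDK), ordered from highest to lowest.
const GRADE_BANDS: ReadonlyArray<[number, Grade]> = [
  [95, 'A+'],
  [85, 'A'],
  [70, 'B'],
  [55, 'C'],
  [40, 'D'],
];

// Map a 0-100 score to a letter grade using the first band it reaches.
function sketchScoreToGrade(score: number): Grade {
  if (!Number.isFinite(score) || score < 0 || score > 100) {
    throw new RangeError(`score must be within 0-100, got ${score}`);
  }
  for (const [min, grade] of GRADE_BANDS) {
    if (score >= min) return grade;
  }
  return 'F'; // anything below the lowest band
}
```

Keeping the bands in a single ordered table makes it easy to swap in the real boundaries once they are known.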

DomainAdapter

Plugin interface for adding new agent types.

interface DomainAdapter {
  readonly domain: string;
  readonly name: string;
  score(input: ScoringInput, config: ScoringConfig): Promise<ReliabilityScore>;
  validate(input: ScoringInput): { valid: boolean; errors: string[] };
}

ShipClient

Low-level HTTP client for the SHIP Protocol API.

| Method | Returns | Description | |--------|---------|-------------| | score(req) | Promise<ScoreResponse> | POST /v2/score | | detect(req) | Promise<DetectResponse> | POST /v2/detect | | tools() | Promise<ToolsResponse> | GET /v2/tools | | leaderboard() | Promise<LeaderboardResponse> | GET /v2/leaderboard | | health() | Promise<HealthResponse> | GET /v2/crawler-status |

Configuration

Environment

The SDK works with Node.js 18+ (uses native fetch). No external HTTP dependencies.
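
The `timeout` and `retries` options described earlier can be layered on native fetch with an `AbortController`. The sketch below shows one way to do that and is an assumption about the general technique, not the SDK's code: `fetchWithRetry`, `FetchLike`, `RetryOptions` and the `backoffMs` option are hypothetical, as is the choice to retry only on network errors and 5xx responses.

```typescript
// Minimal fetch-like signature so a stub can be injected in tests.
type FetchLike = (
  url: string,
  init?: { signal?: AbortSignal },
) => Promise<{ ok: boolean; status: number }>;

interface RetryOptions {
  timeout: number;    // ms allowed per attempt
  retries: number;    // extra attempts after the first one
  backoffMs?: number; // hypothetical: base delay between attempts
}

async function fetchWithRetry(
  fetchImpl: FetchLike,
  url: string,
  { timeout, retries, backoffMs = 0 }: RetryOptions,
): Promise<{ ok: boolean; status: number }> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    const controller = new AbortController();
    // Abort the in-flight request if it exceeds the per-attempt timeout.
    const timer = setTimeout(() => controller.abort(), timeout);
    try {
      const res = await fetchImpl(url, { signal: controller.signal });
      if (res.status < 500) return res; // success or client error: do not retry
      lastError = new Error(`HTTP ${res.status}`);
    } catch (err) {
      lastError = err; // network failure or timeout abort
    } finally {
      clearTimeout(timer);
    }
    if (attempt < retries && backoffMs > 0) {
      // Exponential backoff: backoffMs, 2x, 4x, ...
      await new Promise((resolve) => setTimeout(resolve, backoffMs * 2 ** attempt));
    }
  }
  throw lastError;
}
```

On Node.js 18+, the global `fetch` should fit the `FetchLike` parameter, for example `fetchWithRetry(fetch, url, { timeout: 10000, retries: 2 })`.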

Caching

Built-in LRU cache prevents redundant API calls. Disable with cache: false or customize:

const ship = new ShipReliability({
  domain: 'code',
  cache: true,
  cacheSize: 200,    // Max entries
  cacheTtl: 600000,  // 10 min TTL
});
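
To illustrate what `cacheSize` and `cacheTtl` control, here is a minimal sketch of an LRU cache with per-entry expiry built on `Map` insertion order. It is not the SDK's cache: the class name `LruTtlCache`, the injectable clock and the string keys are assumptions, and how the SDK derives a cache key from a scoring input is not documented here.

```typescript
class LruTtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly maxSize: number;
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so expiry can be tested without waiting.
  constructor(maxSize: number, ttlMs: number, now: () => number = () => Date.now()) {
    this.maxSize = maxSize;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // entry has outlived its TTL
      return undefined;
    }
    // Re-insert to mark as most recently used (Map keeps insertion order).
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    if (this.entries.size >= this.maxSize) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  clear(): void {
    this.entries.clear();
  }
}
```

With settings like the ones above, `new LruTtlCache<number>(200, 600000)` would hold at most 200 entries and treat each as stale ten minutes after it was written.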

Testing

# TypeScript
npm test                    # Run all tests
npx tsc --noEmit            # Type check

# Python
cd python && python -m pytest tests/ -v

Examples

See the examples/ directory:

  • score-code.ts — Score AI-generated code via SHIP API
  • langchain-basic.ts — 3-line LangChain integration
  • custom-domain.ts — Create a custom domain adapter
  • langchain-basic.py — Python LangChain integration

Run with: npx tsx examples/score-code.ts

SHIP Protocol API

The code domain adapter calls /v2/score. ShipClient exposes all of these endpoints:

| Endpoint | Method | Description |
|----------|--------|-------------|
| /v2/score | POST | Score a commit's reliability |
| /v2/detect | POST | Detect AI-generated code |
| /v2/tools | GET | AI tool comparison data |
| /v2/leaderboard | GET | Tool reliability rankings |
| /v2/crawler-status | GET | System health |

Base URL: https://ship-protocol.dhruvaapi.workers.dev

No auth required. Rate limit: 100 req/min.
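
With a limit of 100 req/min, batch jobs may want to throttle on the client side. Below is a minimal sketch of a sliding-window limiter for that purpose. The class `SlidingWindowLimiter` and its `reserve` method are hypothetical and not part of the SDK, and how the API responds when the limit is exceeded is not documented here.

```typescript
class SlidingWindowLimiter {
  private readonly stamps: number[] = []; // send times inside the current window
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    if (limit < 1) throw new RangeError('limit must be at least 1');
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Drop timestamps that have left the window.
  private prune(now: number): void {
    let oldest = this.stamps[0];
    while (oldest !== undefined && now - oldest >= this.windowMs) {
      this.stamps.shift();
      oldest = this.stamps[0];
    }
  }

  // Returns 0 and records the request if it may be sent now;
  // otherwise returns how many ms to wait before trying again.
  reserve(now: number = Date.now()): number {
    this.prune(now);
    if (this.stamps.length < this.limit) {
      this.stamps.push(now);
      return 0;
    }
    const oldest = this.stamps[0];
    return oldest === undefined ? 0 : this.windowMs - (now - oldest);
  }
}
```

For the documented limit this would be constructed as `new SlidingWindowLimiter(100, 60000)`. A caller checks `reserve()` before each request and sleeps for the returned number of milliseconds when it is non-zero.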

License

MIT