provenby-ai-sdk · v0.1.0
# provenby-ai-sdk

Automatic skill tracking from LLM API usage. Privacy-first, zero-config, zero runtime dependencies.

Wraps LLM provider SDKs with a transparent Proxy that captures request/response metadata, extracts skills locally in your process, and sends only anonymized metadata to ProvenBy. Your conversation text never leaves your machine.
## Installation

```bash
npm install provenby-ai-sdk
```

## Quick Start
```ts
import { ProvenBy } from 'provenby-ai-sdk';
import OpenAI from 'openai';

const provenby = new ProvenBy({
  candidateId: 'your-candidate-id',
  apiKey: 'sig_your_api_key',
});

const openai = provenby.wrap(new OpenAI());

// Use exactly as normal — zero API changes
const completion = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Help me design a REST API' }],
});

// Skills extracted automatically in the background
```

## Supported Providers
### OpenAI

```ts
import OpenAI from 'openai';

const openai = provenby.wrap(new OpenAI());
await openai.chat.completions.create({ model: 'gpt-4o', messages: [...] });
```

### Anthropic

```ts
import Anthropic from '@anthropic-ai/sdk';

const anthropic = provenby.wrap(new Anthropic());
await anthropic.messages.create({ model: 'claude-sonnet-4-20250514', messages: [...] });
```

### Google Gemini

```ts
import { GoogleGenerativeAI } from '@google/generative-ai';

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const model = provenby.wrap(genAI.getGenerativeModel({ model: 'gemini-2.0-flash' }));
await model.generateContent('Help me build a database schema');
```

### Mistral

```ts
import { Mistral } from '@mistralai/mistralai';

const mistral = provenby.wrap(new Mistral({ apiKey: '...' }));
await mistral.chat.complete({ model: 'mistral-large-latest', messages: [...] });
```

### xAI / Grok (OpenAI-compatible)

```ts
import OpenAI from 'openai';

const xai = provenby.wrap(new OpenAI({ baseURL: 'https://api.x.ai/v1', apiKey: '...' }));
await xai.chat.completions.create({ model: 'grok-3', messages: [...] });
```

### DeepSeek (OpenAI-compatible)

```ts
import OpenAI from 'openai';

const deepseek = provenby.wrap(new OpenAI({ baseURL: 'https://api.deepseek.com', apiKey: '...' }));
await deepseek.chat.completions.create({ model: 'deepseek-chat', messages: [...] });
```

## Streaming
Streaming works transparently. The SDK collects chunks as they arrive, extracts skills after the stream completes, and returns each chunk to your code unmodified.
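Conceptually, this pass-through can be sketched as an async generator that yields each chunk unchanged while keeping a copy for post-stream extraction. This is a minimal illustration under stated assumptions: `teeStream` and `onComplete` are hypothetical names, not the SDK's internals.

```ts
// Hypothetical sketch: yield chunks unmodified, buffer them for extraction,
// and fire a callback only once the stream has fully completed.
async function* teeStream<T>(
  stream: AsyncIterable<T>,
  onComplete: (chunks: T[]) => void,
): AsyncGenerator<T> {
  const collected: T[] = [];
  for await (const chunk of stream) {
    collected.push(chunk); // keep a copy for local skill extraction
    yield chunk;           // caller receives the chunk unchanged
  }
  onComplete(collected);   // runs only after the stream ends
}
```

Because extraction happens after the final chunk, wrapping adds no per-chunk latency beyond the copy.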
```ts
const stream = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Write a React component' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

// Skills extracted automatically after stream completes
```

## Privacy Model
The SDK is designed with privacy as a first-class constraint:
**What IS captured (metadata only):**
- Provider name (openai, anthropic, etc.)
- Model name (gpt-4o, claude-sonnet, etc.)
- Detected programming languages (from code fences)
- Detected frameworks (from import statements)
- Domain classification (frontend, backend, devops, etc.)
- Complexity estimate (simple, moderate, complex)
- Whether debugging was involved
- Conversation turn count
- Token usage (if provided by the API)
**What is NEVER captured:**
- Raw conversation text
- User prompts or assistant responses
- API keys, tokens, or credentials
- File paths, URLs, or hostnames
- Email addresses, phone numbers, or any PII
- Company or product names
When `extractLocally: true` (the default), all extraction happens in your Node.js process. Raw conversation text never leaves your machine; only the `Extraction` metadata object is transmitted to ProvenBy. As a safety net, a PII-stripping layer also runs on the extraction output.
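As an illustration of what metadata-only extraction can look like, here is a minimal sketch that detects languages from code fences and returns only derived fields. The `ExtractionSketch` shape and `extractMetadata` helper are hypothetical, not the SDK's actual schema.

```ts
// Illustrative only: field names are assumptions, not the SDK's schema.
interface ExtractionSketch {
  provider: string;
  model: string;
  languages: string[];
  turnCount: number;
}

function extractMetadata(
  provider: string,
  model: string,
  text: string,
  turns: number,
): ExtractionSketch {
  // Detect languages from fenced code blocks, e.g. a fence opening with "python"
  const langs = new Set<string>();
  for (const m of text.matchAll(/`{3}(\w+)/g)) langs.add(m[1].toLowerCase());
  // The raw `text` is deliberately absent from the returned object.
  return { provider, model, languages: [...langs], turnCount: turns };
}
```

The key property is structural: the return type has no field that could carry conversation text, so nothing raw can leak by accident.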
## Configuration

```ts
const provenby = new ProvenBy({
  // Required
  candidateId: 'your-candidate-id',
  apiKey: 'sig_your_api_key',

  // Optional
  serverUrl: 'https://provenby.dev', // ProvenBy server URL
  extractLocally: true,              // Extract in-process, never send raw text (default: true)
  bufferIntervalMs: 60000,           // Flush buffer every 60 seconds (default)
  bufferMaxSize: 20,                 // Flush when buffer hits 20 extractions (default)
  debug: false,                      // Log extraction activity to console (default: false)
  onExtraction: (e) => {},           // Callback for each extraction — use for visibility
});
```

## Verifying What's Sent
Use `debug: true` to log every extraction to the console:

```ts
const provenby = new ProvenBy({
  candidateId: 'xxx',
  apiKey: 'sig_xxx',
  debug: true,
});
```

Or use the `onExtraction` callback for programmatic access:
```ts
const provenby = new ProvenBy({
  candidateId: 'xxx',
  apiKey: 'sig_xxx',
  onExtraction: (extraction) => {
    console.log('Skills detected:', extraction.skills);
    console.log('Languages:', extraction.languages);
    console.log('Domain:', extraction.domain);
  },
});
```

## Lifecycle
```ts
// Force-send any buffered extractions
await provenby.flush();

// Flush + stop background timer (call before process exit)
await provenby.close();
```

The background flush timer is unref'd, so it won't keep your Node.js process alive.
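The unref'd-timer behavior is a standard Node.js pattern, sketched here with a hypothetical `startFlushTimer` helper (not the SDK's actual internals):

```ts
// Sketch of a periodic flush timer that does not keep the event loop alive.
function startFlushTimer(
  flush: () => void,
  intervalMs: number,
): ReturnType<typeof setInterval> {
  const timer = setInterval(flush, intervalMs);
  timer.unref(); // Node can exit even while this timer is still scheduled
  return timer;
}
```

Without `unref()`, a pending interval would block process exit until `clearInterval` is called; with it, a short-lived script using the SDK terminates normally.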
## Architecture

The SDK uses a JavaScript Proxy to intercept calls on provider SDKs:

- `wrap()` detects the provider by checking for known property patterns
- It returns a Proxy that intercepts chat/completion method calls
- The real API call executes normally against the provider
- After the response arrives, skills are extracted from the conversation
- Extractions are buffered and periodically sent to ProvenBy
- The original response is returned unchanged

Zero runtime dependencies. Uses only Node.js built-ins.
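A minimal sketch of this interception pattern is below. It is illustrative only: `wrapSketch` and `onResult` are hypothetical names, and the real SDK also handles nested properties like `chat.completions`, streaming, and provider detection.

```ts
// Minimal Proxy-based method interception: run the real call unchanged,
// observe the result afterward, and return the response as-is.
function wrapSketch<T extends object>(
  client: T,
  onResult: (method: string, result: unknown) => void,
): T {
  return new Proxy(client, {
    get(target, prop, receiver) {
      const value = Reflect.get(target, prop, receiver);
      if (typeof value !== 'function') return value;
      return async (...args: unknown[]) => {
        const result = await value.apply(target, args); // real API call
        onResult(String(prop), result);                 // observe after the fact
        return result;                                  // caller sees it unchanged
      };
    },
  });
}
```

Because the trap only wraps function properties and always returns the original result, the wrapped client keeps the provider SDK's surface API intact.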
## License

MIT
