provenby-ai-sdk · v0.1.0
# provenby-ai-sdk

Automatic skill tracking from LLM API usage. Privacy-first, zero-config, zero runtime dependencies.

Wraps LLM provider SDKs with a transparent Proxy that captures request/response metadata, extracts skills locally in your process, and sends only anonymized metadata to ProvenBy. Your conversation text never leaves your machine.
## Installation

```bash
npm install provenby-ai-sdk
```

## Quick Start
```ts
import { ProvenBy } from 'provenby-ai-sdk';
import OpenAI from 'openai';

const provenby = new ProvenBy({
  candidateId: 'your-candidate-id',
  apiKey: 'sig_your_api_key',
});

const openai = provenby.wrap(new OpenAI());

// Use exactly as normal — zero API changes
const completion = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Help me design a REST API' }],
});

// Skills extracted automatically in the background
```

## Supported Providers
### OpenAI

```ts
import OpenAI from 'openai';

const openai = provenby.wrap(new OpenAI());
await openai.chat.completions.create({ model: 'gpt-4o', messages: [...] });
```

### Anthropic

```ts
import Anthropic from '@anthropic-ai/sdk';

const anthropic = provenby.wrap(new Anthropic());
await anthropic.messages.create({ model: 'claude-sonnet-4-20250514', messages: [...] });
```

### Google Gemini

```ts
import { GoogleGenerativeAI } from '@google/generative-ai';

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const model = provenby.wrap(genAI.getGenerativeModel({ model: 'gemini-2.0-flash' }));
await model.generateContent('Help me build a database schema');
```

### Mistral

```ts
import { Mistral } from '@mistralai/mistralai';

const mistral = provenby.wrap(new Mistral({ apiKey: '...' }));
await mistral.chat.complete({ model: 'mistral-large-latest', messages: [...] });
```

### xAI / Grok (OpenAI-compatible)

```ts
import OpenAI from 'openai';

const xai = provenby.wrap(new OpenAI({ baseURL: 'https://api.x.ai/v1', apiKey: '...' }));
await xai.chat.completions.create({ model: 'grok-3', messages: [...] });
```

### DeepSeek (OpenAI-compatible)

```ts
import OpenAI from 'openai';

const deepseek = provenby.wrap(new OpenAI({ baseURL: 'https://api.deepseek.com', apiKey: '...' }));
await deepseek.chat.completions.create({ model: 'deepseek-chat', messages: [...] });
```

## Streaming
Streaming works transparently. The SDK collects chunks as they arrive, extracts skills after the stream completes, and returns each chunk to your code unmodified.
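Conceptually, this pass-through can be sketched as an async generator that yields each chunk unchanged while keeping a copy for post-stream extraction. This is a minimal illustration under stated assumptions: `teeStream` and `onComplete` are hypothetical names, not the SDK's internals.

```ts
// Hypothetical sketch: yield chunks unmodified, buffer them for extraction,
// and fire a callback only once the stream has fully completed.
async function* teeStream<T>(
  stream: AsyncIterable<T>,
  onComplete: (chunks: T[]) => void,
): AsyncGenerator<T> {
  const collected: T[] = [];
  for await (const chunk of stream) {
    collected.push(chunk); // keep a copy for local skill extraction
    yield chunk;           // caller receives the chunk unchanged
  }
  onComplete(collected);   // runs only after the stream ends
}
```

Because extraction happens after the final chunk, wrapping adds no per-chunk latency beyond the copy.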
```ts
const stream = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Write a React component' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

// Skills extracted automatically after stream completes
```

## Privacy Model
The SDK is designed with privacy as a first-class constraint:
**What IS captured (metadata only):**
- Provider name (openai, anthropic, etc.)
- Model name (gpt-4o, claude-sonnet, etc.)
- Detected programming languages (from code fences)
- Detected frameworks (from import statements)
- Domain classification (frontend, backend, devops, etc.)
- Complexity estimate (simple, moderate, complex)
- Whether debugging was involved
- Conversation turn count
- Token usage (if provided by the API)
**What is NEVER captured:**
- Raw conversation text
- User prompts or assistant responses
- API keys, tokens, or credentials
- File paths, URLs, or hostnames
- Email addresses, phone numbers, or any PII
- Company or product names
When `extractLocally: true` (the default), all extraction happens in your Node.js process. Raw conversation text never leaves your machine; only the `Extraction` metadata object is transmitted to ProvenBy. As a safety net, a PII-stripping layer also runs on the extraction output.
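As an illustration of what metadata-only extraction can look like, here is a minimal sketch that detects languages from code fences and returns only derived fields. The `ExtractionSketch` shape and `extractMetadata` helper are hypothetical, not the SDK's actual schema.

```ts
// Illustrative only: field names are assumptions, not the SDK's schema.
interface ExtractionSketch {
  provider: string;
  model: string;
  languages: string[];
  turnCount: number;
}

function extractMetadata(
  provider: string,
  model: string,
  text: string,
  turns: number,
): ExtractionSketch {
  // Detect languages from fenced code blocks, e.g. a fence opening with "python"
  const langs = new Set<string>();
  for (const m of text.matchAll(/`{3}(\w+)/g)) langs.add(m[1].toLowerCase());
  // The raw `text` is deliberately absent from the returned object.
  return { provider, model, languages: [...langs], turnCount: turns };
}
```

The key property is structural: the return type has no field that could carry conversation text, so nothing raw can leak by accident.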
## Configuration

```ts
const provenby = new ProvenBy({
  // Required
  candidateId: 'your-candidate-id',
  apiKey: 'sig_your_api_key',

  // Optional
  serverUrl: 'https://provenby.dev', // ProvenBy server URL
  extractLocally: true,              // Extract in-process, never send raw text (default: true)
  bufferIntervalMs: 60000,           // Flush buffer every 60 seconds (default)
  bufferMaxSize: 20,                 // Flush when buffer hits 20 extractions (default)
  debug: false,                      // Log extraction activity to console (default: false)
  onExtraction: (e) => {},           // Callback for each extraction — use for visibility
});
```

## Verifying What's Sent
Use `debug: true` to log every extraction to the console:

```ts
const provenby = new ProvenBy({
  candidateId: 'xxx',
  apiKey: 'sig_xxx',
  debug: true,
});
```

Or use the `onExtraction` callback for programmatic access:
```ts
const provenby = new ProvenBy({
  candidateId: 'xxx',
  apiKey: 'sig_xxx',
  onExtraction: (extraction) => {
    console.log('Skills detected:', extraction.skills);
    console.log('Languages:', extraction.languages);
    console.log('Domain:', extraction.domain);
  },
});
```

## Lifecycle
```ts
// Force-send any buffered extractions
await provenby.flush();

// Flush + stop background timer (call before process exit)
await provenby.close();
```

The background flush timer is unref'd, so it won't keep your Node.js process alive.
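The unref'd-timer behavior is a standard Node.js pattern, sketched here with a hypothetical `startFlushTimer` helper (not the SDK's actual internals):

```ts
// Sketch of a periodic flush timer that does not keep the event loop alive.
function startFlushTimer(
  flush: () => void,
  intervalMs: number,
): ReturnType<typeof setInterval> {
  const timer = setInterval(flush, intervalMs);
  timer.unref(); // Node can exit even while this timer is still scheduled
  return timer;
}
```

Without `unref()`, a pending interval would block process exit until `clearInterval` is called; with it, a short-lived script using the SDK terminates normally.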
## Architecture

The SDK uses a JavaScript Proxy to intercept calls on provider SDKs:

- `wrap()` detects the provider by checking for known property patterns
- It returns a Proxy that intercepts chat/completion method calls
- The real API call executes normally against the provider
- After the response arrives, skills are extracted from the conversation
- Extractions are buffered and periodically sent to ProvenBy
- The original response is returned unchanged

Zero runtime dependencies. Uses only Node.js built-ins.
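A minimal sketch of this interception pattern is below. It is illustrative only: `wrapSketch` and `onResult` are hypothetical names, and the real SDK also handles nested properties like `chat.completions`, streaming, and provider detection.

```ts
// Minimal Proxy-based method interception: run the real call unchanged,
// observe the result afterward, and return the response as-is.
function wrapSketch<T extends object>(
  client: T,
  onResult: (method: string, result: unknown) => void,
): T {
  return new Proxy(client, {
    get(target, prop, receiver) {
      const value = Reflect.get(target, prop, receiver);
      if (typeof value !== 'function') return value;
      return async (...args: unknown[]) => {
        const result = await value.apply(target, args); // real API call
        onResult(String(prop), result);                 // observe after the fact
        return result;                                  // caller sees it unchanged
      };
    },
  });
}
```

Because the trap only wraps function properties and always returns the original result, the wrapped client keeps the provider SDK's surface API intact.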
## License

MIT
