sentinelseed
v1.0.0
AI safety guardrails using Sentinel alignment seeds. Add safety to any LLM with one line of code.
sentinelseed
Add AI safety to any LLM with one line of code.
Installation
npm install sentinelseed
Quick Start
import OpenAI from 'openai';
import { SentinelGuard } from 'sentinelseed';
// Create the OpenAI client (reads OPENAI_API_KEY from the environment)
const openai = new OpenAI();
// Create a guard with default settings (v2/standard)
const guard = new SentinelGuard();
// Wrap your messages with the safety seed
const messages = guard.wrapMessages([
  { role: 'user', content: 'Hello, how can you help me?' }
]);
// Use with OpenAI
const response = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: messages
});
Features
- Zero dependencies - Works with any LLM provider
- TypeScript support - Full type definitions included
- Multiple seed versions - Choose the right balance of safety vs latency
- THSP Protocol - Four-gate validation (Truth, Harm, Scope, Purpose)
- Heuristic analysis - Basic safety checking without API calls
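The provider-independence claim rests on the seed being plain text that goes into a system message. The sketch below illustrates that idea only; `ChatMessage` and `prependSeed` are hypothetical names, and the way the package's own `wrapMessages` handles a conversation that already has a system message is an assumption here, not documented behavior.

```typescript
// Hypothetical sketch, not the package's implementation.
type Role = 'system' | 'user' | 'assistant';

interface ChatMessage {
  role: Role;
  content: string;
}

// Return a new array with the seed as the first system message.
// Assumption: the seed is placed in front of any existing messages
// rather than merged into or replacing an existing system message.
function prependSeed(seed: string, messages: ChatMessage[]): ChatMessage[] {
  return [{ role: 'system', content: seed }, ...messages];
}

const wrapped = prependSeed('<seed text goes here>', [
  { role: 'user', content: 'Hello, how can you help me?' },
]);
```

Because the result is an ordinary array of role/content pairs, it can be passed to any chat-style API that accepts that shape.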
Seed Versions
| Version | Variant  | Tokens | Best For                  |
|---------|----------|--------|---------------------------|
| v2      | minimal  | ~350   | Chatbots, low latency     |
| v2      | standard | ~1,000 | General use (recommended) |
| v2      | full     | ~2,000 | Maximum safety            |
| v1      | minimal  | ~500   | Legacy support            |
| v1      | standard | ~1,200 | Legacy support            |
| v1      | full     | ~4,700 | Legacy support            |
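One way to read the table is as a token budget: pick the largest v2 seed that still fits the prompt overhead you can afford. The sketch below hard-codes the approximate token counts from the table; `SeedOption`, `V2_SEEDS`, and `pickVariant` are hypothetical names for illustration and are not part of the package.

```typescript
interface SeedOption {
  version: 'v1' | 'v2';
  variant: 'minimal' | 'standard' | 'full';
  approxTokens: number; // approximate figures copied from the table
}

const V2_SEEDS: SeedOption[] = [
  { version: 'v2', variant: 'minimal', approxTokens: 350 },
  { version: 'v2', variant: 'standard', approxTokens: 1000 },
  { version: 'v2', variant: 'full', approxTokens: 2000 },
];

// Return the largest v2 seed that fits the budget, or undefined if none fits.
function pickVariant(tokenBudget: number): SeedOption | undefined {
  return V2_SEEDS
    .filter((s) => s.approxTokens <= tokenBudget) // filter() copies, so sort() is safe
    .sort((a, b) => b.approxTokens - a.approxTokens)[0];
}

const choice = pickVariant(1500);
console.log(choice ? choice.variant : 'no seed fits'); // standard
```

The chosen variant can then be passed as the `variant` option when constructing the guard.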
v1 vs v2
- v1 (THS): Three gates - Truth, Harm, Scope
- v2 (THSP): Four gates - adds Purpose gate (requires actions to serve legitimate benefit)
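The difference between the two protocols can be written down as data. This is a minimal sketch with hypothetical names (`Gate`, `GATES`, `addedGates`); it only restates the gate lists above and says nothing about how each gate is evaluated.

```typescript
type Gate = 'truth' | 'harm' | 'scope' | 'purpose';
type SeedVersion = 'v1' | 'v2';

// Gate lists as described above: v1 is THS, v2 is THSP.
const GATES: Record<SeedVersion, Gate[]> = {
  v1: ['truth', 'harm', 'scope'],
  v2: ['truth', 'harm', 'scope', 'purpose'],
};

// Gates present in `to` but missing from `from`.
function addedGates(from: SeedVersion, to: SeedVersion): Gate[] {
  return GATES[to].filter((g) => GATES[from].indexOf(g) === -1);
}

console.log(addedGates('v1', 'v2')); // [ 'purpose' ]
```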
Usage Examples
Basic Usage
import { SentinelGuard, getSeed } from 'sentinelseed';
// Option 1: Use the guard class
const guard = new SentinelGuard({ version: 'v2', variant: 'standard' });
const messages = guard.wrapMessages([
  { role: 'user', content: 'Help me with something' }
]);
// Option 2: Get seed directly and build the message list yourself
const seed = getSeed('v2', 'standard');
const manualMessages = [
  { role: 'system', content: seed },
  { role: 'user', content: 'Help me with something' }
];
With OpenAI
import OpenAI from 'openai';
import { SentinelGuard } from 'sentinelseed';
const openai = new OpenAI();
const guard = new SentinelGuard();
async function chat(userMessage: string) {
  const messages = guard.wrapMessages([
    { role: 'user', content: userMessage }
  ]);
  return openai.chat.completions.create({
    model: 'gpt-4',
    messages
  });
}
With Anthropic
import Anthropic from '@anthropic-ai/sdk';
import { SentinelGuard } from 'sentinelseed';
const anthropic = new Anthropic();
const guard = new SentinelGuard();
async function chat(userMessage: string) {
  // Anthropic takes the system prompt as a separate field, so pass the seed there
  const seed = guard.getSeed();
  return anthropic.messages.create({
    model: 'claude-3-opus-20240229',
    max_tokens: 1024, // required by the Messages API; adjust as needed
    system: seed,
    messages: [{ role: 'user', content: userMessage }]
  });
}
Analyze Content
const guard = new SentinelGuard();
// Check if content is safe
const analysis = guard.analyze('How do I hack a computer?');
console.log(analysis);
// {
//   safe: false,
//   gates: { truth: 'pass', harm: 'fail', scope: 'pass', purpose: 'unknown' },
//   issues: ['Potential harm detected'],
//   confidence: 0.85
// }
// Quick check (userInput stands for whatever string your application received)
const userInput = 'How do I hack a computer?';
if (!guard.isSafe(userInput)) {
  console.log('Potentially unsafe content detected');
}
Custom Seed
const guard = new SentinelGuard({
  customSeed: 'Your custom system prompt here...'
});
API Reference
SentinelGuard
class SentinelGuard {
  constructor(config?: SentinelConfig);
  getSeed(): string;
  getMetadata(): { version, variant, tokens, protocol };
  wrapMessages(messages: Message[], options?): Message[];
  analyze(content: string): THSPAnalysis;
  isSafe(content: string): boolean;
}
SentinelConfig
interface SentinelConfig {
  version?: 'v1' | 'v2'; // Default: 'v2'
  variant?: 'minimal' | 'standard' | 'full'; // Default: 'standard'
  customSeed?: string; // Override with custom seed
}
Helper Functions
import { createGuard, getSeed, SEEDS } from 'sentinelseed';
// Create a guard
const guard = createGuard({ version: 'v2', variant: 'standard' });
// Get seed directly
const seed = getSeed('v2', 'standard');
// Access seeds object
console.log(SEEDS.v2_standard);
Benchmark Results
Tested across 6 models with 97.6% average safety rate:
| Benchmark      | Safety Rate |
|----------------|-------------|
| HarmBench      | 96.7%       |
| JailbreakBench | 97%         |
| SafeAgentBench | 97.3%       |
| BadRobot       | 99.3%       |
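The 97.6% headline is consistent with an unweighted mean of the four rows above. The check below assumes that is how the figure was derived; the README does not say how results from the 6 models were aggregated within each benchmark.

```typescript
// Safety rates copied from the table (percent).
const results = [
  { benchmark: 'HarmBench', rate: 96.7 },
  { benchmark: 'JailbreakBench', rate: 97 },
  { benchmark: 'SafeAgentBench', rate: 97.3 },
  { benchmark: 'BadRobot', rate: 99.3 },
];

// Unweighted mean: (96.7 + 97 + 97.3 + 99.3) / 4 = 390.3 / 4 = 97.575
const mean = results.reduce((sum, r) => sum + r.rate, 0) / results.length;

console.log(mean.toFixed(1) + '%'); // 97.6%
```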
Links
- Website: sentinelseed.dev
- Documentation: sentinelseed.dev/docs
- GitHub: github.com/sentinel-seed
- HuggingFace: huggingface.co/sentinelseed
License
MIT License - Sentinel Team
