@aiassesstech/sdk
v0.7.0
AI Assess Tech SDK for ethical AI assessment - Test AI systems across 4 dimensions: Lying, Cheating, Stealing, and Harm
AI Assess Tech SDK
Official TypeScript SDK for assessing the ethical alignment of AI systems. Test your AI across 4 dimensions: Lying, Cheating, Stealing, and Harm.
Features
- 🔒 Privacy-First: Your AI's API keys, system prompts, and configuration never leave your environment
- 🎯 Server-Controlled: Test configuration, questions, and thresholds managed via Health Check Key
- 🔄 CI/CD Ready: Auto-detects GitHub Actions, GitLab CI, CircleCI, and more
- 📊 Full Traceability: Each assessment generates IDs for audit trails
- ⚡ Simple Integration: One-line assessment with any AI provider
Installation
npm install @aiassesstech/sdk
Quick Start
import { AIAssessClient } from '@aiassesstech/sdk';
// 1. Create client with your Health Check Key
const client = new AIAssessClient({
  healthCheckKey: process.env.AIASSESS_KEY!
});
// 2. Run assessment - configuration comes from server
const result = await client.assess(async (question) => {
  // Your AI callback - send question to your AI and return response
  return await myAI.chat(question);
});
// 3. Check result
console.log('Passed:', result.overallPassed);
console.log('Scores:', result.scores);
console.log('Classification:', result.classification);
How It Works
┌─────────────────────────────────────────────────────────────┐
│  Your Environment                                           │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  1. SDK fetches config from AI Assess Tech server           │
│     (questions, thresholds, test mode)                      │
│                                                             │
│  2. SDK sends questions to YOUR AI via your callback        │
│     → Your API keys stay private                            │
│     → Your system prompts stay private                      │
│                                                             │
│  3. SDK submits responses to server for scoring             │
│                                                             │
│  4. You receive scores, pass/fail, and classification       │
│                                                             │
└─────────────────────────────────────────────────────────────┘
Usage Examples
Basic Assessment
const result = await client.assess(async (question) => {
  return await myAI.chat(question);
});
console.log(`Classification: ${result.classification}`);
console.log(`Lying Score: ${result.scores.lying}/10`);
console.log(`Overall: ${result.overallPassed ? 'PASSED ✅' : 'FAILED ❌'}`);
With Progress Updates
const result = await client.assess(
  async (question) => await myAI.chat(question),
  {
    onProgress: (progress) => {
      console.log(`${progress.percentage}% - Testing ${progress.dimension}`);
    }
  }
);
Startup Health Check (Blocking)
// Block until AI passes - exits process on failure
await client.blockUntilPass(
  async (question) => await myAI.chat(question),
  {
    maxRetries: 3,
    retryDelayMs: 60000, // 1 minute between retries
    exitOnFailure: true
  }
);
console.log('✅ AI passed ethical assessment, starting app...');
startApp();
CI/CD Pipeline Integration
const result = await client.assess(
  async (question) => await myAI.chat(question),
  {
    metadata: {
      gitCommit: process.env.GITHUB_SHA,
      branch: process.env.GITHUB_REF_NAME,
      deployVersion: process.env.VERSION
    }
  }
);
// Exit code for CI/CD
process.exit(result.overallPassed ? 0 : 1);
With OpenAI
import OpenAI from 'openai';
import { AIAssessClient } from '@aiassesstech/sdk';
const openai = new OpenAI();
const client = new AIAssessClient({
  healthCheckKey: process.env.AIASSESS_KEY!
});
const result = await client.assess(async (question) => {
  const response = await openai.chat.completions.create({
    model: 'gpt-4',
    messages: [{ role: 'user', content: question }]
  });
  return response.choices[0].message.content || '';
});
With Anthropic
import Anthropic from '@anthropic-ai/sdk';
import { AIAssessClient } from '@aiassesstech/sdk';
const anthropic = new Anthropic();
const client = new AIAssessClient({
  healthCheckKey: process.env.AIASSESS_KEY!
});
const result = await client.assess(async (question) => {
  const response = await anthropic.messages.create({
    model: 'claude-3-sonnet-20240229',
    max_tokens: 100,
    messages: [{ role: 'user', content: question }]
  });
  return response.content[0].type === 'text'
    ? response.content[0].text
    : '';
});
With Retry Wrapper (for Flaky APIs)
import { AIAssessClient, withRetry } from '@aiassesstech/sdk';
const result = await client.assess(
  withRetry(
    async (question) => await flakyAI.chat(question),
    { maxRetries: 3, backoffMs: 1000 }
  )
);
Dry Run Mode (Development)
// Only asks 5 questions, returns mock scores
const result = await client.assess(
  async (question) => await myAI.chat(question),
  { dryRun: true }
);
Assessment Result
The assess() method returns an AssessmentResult object:
interface AssessmentResult {
  runId: string; // Server-generated ID for verification
  sdkSessionId: string; // Client-generated session ID
  scores: {
    lying: number; // 0-10 scale
    cheating: number;
    stealing: number;
    harm: number;
  };
  passed: {
    lying: boolean; // Did score meet threshold?
    cheating: boolean;
    stealing: boolean;
    harm: boolean;
  };
  overallPassed: boolean; // All dimensions must pass
  classification:
    | "Well Adjusted" // High scores across all dimensions
    | "Misguided" // Low lying, high harm (believes false things but tries to do good)
    | "Manipulative" // High lying, low harm (deceives but avoids direct harm)
    | "Psychopath"; // Low scores across the board
  thresholds: { ... }; // Thresholds used from server config
  verifyUrl: string; // URL to verify this result
  completedAt: string; // ISO timestamp
  versions: { ... }; // SDK and question set versions
  keyName: string; // Name of Health Check Key used
}
Server-Controlled Configuration
Configuration is managed via the Health Check Key on the AI Assess Tech dashboard:
- Test Mode: ISOLATED (each question independent) or CONVERSATIONAL (coming in v0.8.0)
- Framework: Which question set to use
- Thresholds: Pass thresholds per dimension (0-10 scale)
- Rate Limits: Hourly/monthly assessment limits
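How per-dimension thresholds relate to the `passed` and `overallPassed` fields of a result can be pictured with a small sketch. Scoring itself happens on the server; the `evaluate` function and the sample numbers below are hypothetical, and the sketch assumes a dimension passes when its score is at or above its threshold.

```typescript
// Illustrative only: real scoring is performed by the AI Assess Tech server.
type Dimension = 'lying' | 'cheating' | 'stealing' | 'harm';
type PerDimension<T> = Record<Dimension, T>;

const DIMENSIONS: Dimension[] = ['lying', 'cheating', 'stealing', 'harm'];

// Assumption: "meets the threshold" means score >= threshold.
function evaluate(
  scores: PerDimension<number>,
  thresholds: PerDimension<number>
): { passed: PerDimension<boolean>; overallPassed: boolean } {
  const passed = {} as PerDimension<boolean>;
  for (const dim of DIMENSIONS) {
    passed[dim] = scores[dim] >= thresholds[dim];
  }
  // The overall result passes only if every dimension passes.
  const overallPassed = DIMENSIONS.every((dim) => passed[dim]);
  return { passed, overallPassed };
}

// Hypothetical scores and thresholds on the 0-10 scale.
const example = evaluate(
  { lying: 8.2, cheating: 7.5, stealing: 9.0, harm: 5.9 },
  { lying: 6, cheating: 6, stealing: 6, harm: 6 }
);
console.log(example.passed.harm);   // false (5.9 is below 6)
console.log(example.overallPassed); // false (one dimension failed)
```

A single failing dimension is enough to fail the run, which is why stricter thresholds on a production key make the whole assessment stricter.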
Create different keys for different scenarios:
- prod-strict: Production with strict thresholds
- staging-relaxed: Staging with relaxed thresholds
- ci-quick: CI/CD pipeline checks
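One way to pick the right key at runtime is to keep each key in its own environment variable and select by deployment stage. This is only a sketch: the variable names (`AIASSESS_KEY_PROD` and so on), the `Stage` type, and the `resolveHealthCheckKey` helper are assumptions for illustration and are not part of the SDK.

```typescript
// Hypothetical stages, each mapped to the env var that holds its key.
type Stage = 'production' | 'staging' | 'ci';

const KEY_ENV_VARS: Record<Stage, string> = {
  production: 'AIASSESS_KEY_PROD',   // e.g. the prod-strict key
  staging: 'AIASSESS_KEY_STAGING',   // e.g. the staging-relaxed key
  ci: 'AIASSESS_KEY_CI',             // e.g. the ci-quick key
};

// Look up the key for a stage and fail fast if it is missing.
function resolveHealthCheckKey(
  stage: Stage,
  env: Record<string, string | undefined>
): string {
  const varName = KEY_ENV_VARS[stage];
  const key = env[varName];
  if (!key) {
    throw new Error(`Missing ${varName} for stage "${stage}"`);
  }
  return key;
}

// Demo with a stand-in environment object instead of process.env.
const demoEnv: Record<string, string | undefined> = {
  AIASSESS_KEY_CI: 'hck_example',
};
console.log(resolveHealthCheckKey('ci', demoEnv)); // hck_example
```

In an application, `process.env` would be passed as the second argument and the returned string used as the `healthCheckKey` option of `AIAssessClient`.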
Error Handling
import {
  AIAssessClient,
  SDKError,
  ValidationError,
  RateLimitError,
  QuestionTimeoutError,
  ErrorCode
} from '@aiassesstech/sdk';
try {
  const result = await client.assess(callback);
} catch (error) {
  if (error instanceof RateLimitError) {
    console.log(`Rate limited. Retry after ${error.retryAfterMs}ms`);
  } else if (error instanceof ValidationError) {
    if (error.code === ErrorCode.KEY_EXPIRED) {
      console.log('Health Check Key has expired');
    } else if (error.code === ErrorCode.INVALID_KEY) {
      console.log('Invalid Health Check Key');
    }
  } else if (error instanceof QuestionTimeoutError) {
    console.log(`Question ${error.questionId} timed out`);
  } else if (error instanceof SDKError) {
    console.log(`SDK Error: ${error.message} (${error.code})`);
  }
}
Configuration Options
const client = new AIAssessClient({
  // Required: Your Health Check Key from the dashboard
  healthCheckKey: 'hck_...',
  // Optional: Override base URL (default: https://www.aiassesstech.com)
  baseUrl: 'https://www.aiassesstech.com',
  // Optional: Per-question timeout in ms (default: 30000 = 30s)
  perQuestionTimeoutMs: 30000,
  // Optional: Overall timeout in ms (default: 360000 = 6 min)
  overallTimeoutMs: 360000
});
Environment Detection
The SDK automatically detects CI/CD environments:
import { detectEnvironment, isCI } from '@aiassesstech/sdk';
console.log('Is CI:', isCI());
console.log('Environment:', detectEnvironment());
// {
//   nodeVersion: 'v20.10.0',
//   platform: 'linux',
//   ciProvider: 'github-actions',
//   ciJobId: '12345678',
//   gitCommit: 'abc123...',
//   gitBranch: 'main'
// }
Supported CI providers:
- GitHub Actions
- GitLab CI
- CircleCI
- Jenkins
- Travis CI
- Buildkite
- Azure Pipelines
- AWS CodeBuild
- Bitbucket Pipelines
- Drone CI
- Vercel
- Netlify
- And more...
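CI detection of this kind is usually based on environment variables that each provider sets on its runners. The sketch below shows the general idea for a handful of providers. It is not the SDK's actual logic: `detectCiProvider` is a hypothetical helper, and the provider labels other than `github-actions` are made up for the example.

```typescript
// Simplified sketch of env-var based CI detection; not the SDK's implementation.
type Env = Record<string, string | undefined>;

// Each entry pairs a provider label with a variable that provider sets.
const CI_MARKERS: Array<[string, string]> = [
  ['github-actions', 'GITHUB_ACTIONS'],
  ['gitlab-ci', 'GITLAB_CI'],
  ['circleci', 'CIRCLECI'],
  ['travis-ci', 'TRAVIS'],
  ['buildkite', 'BUILDKITE'],
  ['jenkins', 'JENKINS_URL'],
];

function detectCiProvider(env: Env): string | undefined {
  for (const [provider, envVar] of CI_MARKERS) {
    if (env[envVar]) return provider;
  }
  // Many providers also set a generic CI variable; treat that as an unknown CI.
  return env['CI'] ? 'unknown' : undefined;
}

console.log(detectCiProvider({ GITHUB_ACTIONS: 'true' })); // github-actions
console.log(detectCiProvider({}));                         // undefined
```

Passing the environment in as an argument, rather than reading `process.env` directly, keeps a helper like this easy to test.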
Requirements
- Node.js 18.0.0 or higher
- Valid Health Check Key from AI Assess Tech
Getting a Health Check Key
- Sign up at https://www.aiassesstech.com
- Go to Settings → Health Check Keys
- Click "Create New Key"
- Configure your key (test mode, thresholds, rate limits)
- Copy the key (hck_...) and store it securely
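Once the key is stored in an environment variable, it can help to validate it at startup and to avoid printing it in full. The following is a minimal sketch under two assumptions: the key lives in `AIASSESS_KEY` (as in the examples in this README) and valid keys begin with `hck_`. The `loadHealthCheckKey` and `maskKey` helpers are hypothetical and not part of the SDK.

```typescript
// Read the key from an environment-like object and fail fast on obvious mistakes.
function loadHealthCheckKey(env: Record<string, string | undefined>): string {
  const key = (env['AIASSESS_KEY'] ?? '').trim();
  if (key === '') {
    throw new Error('AIASSESS_KEY is not set');
  }
  // Assumption: keys start with "hck_", as shown in this README.
  if (!key.startsWith('hck_')) {
    throw new Error('AIASSESS_KEY does not look like a Health Check Key');
  }
  return key;
}

// Mask the key so the full value never appears in log output.
function maskKey(key: string): string {
  return key.length <= 8 ? '****' : `${key.slice(0, 4)}****${key.slice(-4)}`;
}

// Demo with a made-up key; in an application, pass process.env instead.
const demoKey = loadHealthCheckKey({ AIASSESS_KEY: ' hck_demo1234567890 ' });
console.log(maskKey(demoKey)); // hck_****7890
```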
Support
License
MIT © AI Assess Tech
