@artemiskit/sdk

Version: v0.3.3
Published: 10 days ago
Downloads: 319
Vulnerabilities: 0 High, 0 Medium, 0 Low
Maintainer: btsowa
Keywords: llm testing sdk ai agents evaluation openai anthropic vitest jest red-team security

Programmatic SDK for the ArtemisKit LLM evaluation toolkit - integrate into your Node.js apps

@artemiskit/sdk

Programmatic SDK for ArtemisKit - integrate LLM testing directly into your Node.js applications, CI/CD pipelines, and test frameworks.

Features

🚀 Simple API - Run tests, red team evaluations, and stress tests programmatically
📊 Event Emitters - Real-time progress tracking with onCaseStart, onCaseComplete, onProgress
🧪 Test Framework Integration - Custom matchers for Jest and Vitest
🔴 Red Team Testing - Adversarial security testing with OWASP LLM Top 10 coverage
🛡️ Guardian Mode - Runtime AI protection with injection detection and PII filtering
⚡ Stress Testing - Load testing with configurable concurrency
📝 TypeScript First - Full type definitions included

Installation

# Using bun
bun add @artemiskit/sdk

# Using npm
npm install @artemiskit/sdk

Quick Start

import { ArtemisKit } from '@artemiskit/sdk';

const kit = new ArtemisKit({
  provider: 'openai',
  model: 'gpt-4',
  project: 'my-project',
});

// Run test scenarios
const result = await kit.run({
  scenario: './my-tests.yaml',
});

if (!result.success) {
  console.error('Tests failed!');
  process.exit(1);
}

console.log('All tests passed! ✅');

API Reference

ArtemisKit Class

Constructor

// Signature sketch: every constructor option is optional
const kit = new ArtemisKit({
  project?: string;
  provider?: 'openai' | 'azure-openai' | 'anthropic' | ...;
  model?: string;
  timeout?: number;
  retries?: number;
  concurrency?: number;
});

run(options)

Run test scenarios against your LLM.

const result = await kit.run({
  scenario: './tests.yaml',
  tags?: string[],
  concurrency?: number,
  timeout?: number,
});

redteam(options)

Run red team adversarial security testing.

const result = await kit.redteam({
  scenario: './tests.yaml',
  mutations?: string[],
  countPerCase?: number,
});

stress(options)

Run stress/load testing.

const result = await kit.stress({
  scenario: './tests.yaml',
  concurrency?: number,
  duration?: number,
  rampUp?: number,
});

Event Handling

kit
  .onCaseStart((event) => {
    console.log(`Starting ${event.caseId}`);
  })
  .onCaseComplete((event) => {
    console.log(`${event.result.name}: ${event.result.ok ? '✅' : '❌'}`);
  })
  .onProgress((event) => {
    console.log(`[${event.phase}] ${event.message}`);
  });
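
The handlers above only log. As a minimal sketch of doing something with the events, the helper below keeps a running pass/fail tally. The `CaseCompleteEvent` shape is an assumption inferred from the fields used in the snippet above (`event.result.name`, `event.result.ok`), and `createTally` is a hypothetical helper, not part of the SDK.

```typescript
// Hypothetical event shape, inferred from the fields used in the README
// snippet (event.result.name, event.result.ok). The SDK's real types may differ.
interface CaseCompleteEvent {
  result: { name: string; ok: boolean };
}

interface Tally {
  passed: number;
  failed: number;
  failedNames: string[];
}

// Returns a handler to register for case-complete events, plus the tally it updates.
function createTally(): {
  tally: Tally;
  onCaseComplete: (event: CaseCompleteEvent) => void;
} {
  const tally: Tally = { passed: 0, failed: 0, failedNames: [] };

  const onCaseComplete = (event: CaseCompleteEvent): void => {
    if (event.result.ok) {
      tally.passed += 1;
    } else {
      tally.failed += 1;
      tally.failedNames.push(event.result.name);
    }
  };

  return { tally, onCaseComplete };
}
```

If the handler signature matches, it could be registered with `kit.onCaseComplete(onCaseComplete)` and the tally read once `kit.run(...)` resolves.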

Jest/Vitest Integration

Setup

// vitest.setup.ts
import '@artemiskit/sdk/vitest';

// or jest.setup.ts
import '@artemiskit/sdk/jest';

Usage

import { describe, it, expect } from 'vitest';
import { ArtemisKit } from '@artemiskit/sdk';

describe('LLM Tests', () => {
  const kit = new ArtemisKit({ provider: 'openai' });

  it('should pass all test cases', async () => {
    const result = await kit.run({ scenario: './tests.yaml' });
    expect(result).toPassAllCases();
  });

  it('should have high success rate', async () => {
    const result = await kit.run({ scenario: './tests.yaml' });
    expect(result).toHaveSuccessRate(0.95);
  });
});

Available Matchers

Run Result:

toPassAllCases()
toHaveSuccessRate(rate)
toPassCasesWithTag(tag)
toHaveMedianLatencyBelow(ms)
toHaveP95LatencyBelow(ms)

Red Team:

toPassRedTeam()
toHaveDefenseRate(rate)
toHaveNoCriticalVulnerabilities()
toHaveNoHighSeverityVulnerabilities()

Stress Test:

toPassStressTest()
toHaveStressSuccessRate(rate)
toAchieveRPS(rps)
toHaveStressP95LatencyBelow(ms)
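
To make the rate and latency matchers above more concrete, here is a rough sketch of the kind of numbers they compare against. It is not the SDK's implementation: the `CaseRecord` shape is hypothetical, and the nearest-rank percentile method is an assumption (the README does not say how percentiles are computed).

```typescript
// Hypothetical per-case record; not the SDK's actual result type.
interface CaseRecord {
  ok: boolean;
  latencyMs: number;
}

// Fraction of cases that passed, in [0, 1]. An empty run counts as 0 here.
function successRate(cases: CaseRecord[]): number {
  if (cases.length === 0) return 0;
  return cases.filter((c) => c.ok).length / cases.length;
}

// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error('no samples');
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}
```

Under these assumptions, a check like `toHaveSuccessRate(0.95)` would correspond to `successRate(cases) >= 0.95`, and `toHaveP95LatencyBelow(ms)` to `percentile(latencies, 95) < ms`.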

Guardian Mode

Protect your LLM applications at runtime:

import { createGuardian } from '@artemiskit/sdk/guardian';

const guardian = createGuardian({
  mode: 'guardian',
  blockOnFailure: true,
});

// Wrap your LLM client
const protectedClient = guardian.protect(myLLMClient);

// Validate inputs before sending to LLM
const inputResult = await guardian.validateInput(userMessage);
if (!inputResult.safe) {
  console.error('Blocked:', inputResult.reason);
}

// Validate outputs before returning to user
const outputResult = await guardian.validateOutput(llmResponse);

Guardian Features

Prompt Injection Detection - Pattern matching for injection attempts
PII Detection & Redaction - Email, phone, SSN, API keys
Action Validation - Validate agent tool/function calls
Intent Classification - Risk scoring for user intents
Circuit Breaker - Auto-block on repeated violations
Rate & Cost Limiting - Control usage
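
As an illustration of the pattern-based approach that the PII item describes, the sketch below redacts email addresses and US-style SSNs with regular expressions. The patterns and the `redactPII` helper are hypothetical and deliberately simple; they are not the detectors the SDK ships.

```typescript
// Illustrative patterns only; real PII detection needs broader, locale-aware rules.
const PII_PATTERNS: { label: string; pattern: RegExp }[] = [
  { label: 'EMAIL', pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { label: 'SSN', pattern: /\b\d{3}-\d{2}-\d{4}\b/g },
];

// Replaces each match with a bracketed label and records which kinds were found.
function redactPII(text: string): { redacted: string; found: string[] } {
  const found: string[] = [];
  let redacted = text;
  for (const { label, pattern } of PII_PATTERNS) {
    redacted = redacted.replace(pattern, () => {
      found.push(label);
      return `[${label}]`;
    });
  }
  return { redacted, found };
}
```

Replacing matches with a label rather than deleting them keeps the surrounding sentence readable while still recording what was removed.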

Guardian Modes

| Mode | Description |
|------|-------------|
| testing | Log violations but don't block |
| guardian | Block violations and log |
| hybrid | Block critical, log others |
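
The three modes can be read as a small decision table. The sketch below encodes that reading. The `Severity` scale is an assumption (the README only names "critical"), as is the choice to log critical violations in `hybrid` mode; `decide` is a hypothetical helper, not an SDK function.

```typescript
type GuardianMode = 'testing' | 'guardian' | 'hybrid';

// Hypothetical severity scale; the README only mentions "critical".
type Severity = 'low' | 'medium' | 'high' | 'critical';

interface Decision {
  block: boolean;
  log: boolean;
}

// Maps a mode and a violation severity to a block/log decision.
function decide(mode: GuardianMode, severity: Severity): Decision {
  switch (mode) {
    case 'testing':
      // Log violations but don't block.
      return { block: false, log: true };
    case 'guardian':
      // Block violations and log.
      return { block: true, log: true };
    case 'hybrid':
      // Block critical, log others. Logging critical ones too is an assumption.
      return { block: severity === 'critical', log: true };
  }
}
```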

License

Apache-2.0
