@mankinds/sdk
v1.0.1
TypeScript SDK for Mankinds AI Evaluation API
Evaluate AI systems with automated tests.
Register an AI system, optionally attach connectors (logs, databases), import or generate your golden dataset, and run automated evaluations covering privacy, security, performance, fairness, explainability, transparency and accountability.
Features
- System Management — Create, update, and configure AI systems with custom API endpoints
- Endpoint Configuration — Support for REST, SSE streaming, and multi-turn conversations
- Dataset Generation — Auto-generate or provide custom test scenarios
- Evaluation — Run evaluations with real-time polling and configurable profiles
- Connectors — Attach data sources (log files, Datadog, SQLite, PostgreSQL)
- Error Handling — Typed exceptions for all error cases
Documentation
Requirements
- Node.js ≥ 16
Installation
npm install @mankinds/sdk
Usage
The SDK follows a simple 3-step workflow: create a system, generate test data, run an evaluation.
Initialize the Client
import { MankindsClient } from "@mankinds/sdk";
const client = new MankindsClient("mk_...");
| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| apiKey | string | Yes | — | Your API key |
| baseUrl | string | No | https://app.mankinds.io | Custom API base URL |
| timeout | number | No | 120 | Request timeout in seconds |
Create an AI System
Register your AI system by providing its name, description, and API endpoint. The endpoint defines how your AI is called during evaluation.
const system = await client.createSystem(
  "Customer Support Bot",
  "A chatbot that handles order inquiries and returns for an e-commerce platform.",
  {
    url: "https://api.example.com/chat",
    method: "POST",
    headers: { Authorization: "Bearer your-token" },
    body: { message: "{{input}}" },
    response: { answer: "{{output}}" },
  }
);
const systemId = system.id;
Use {{input}} in the request body and {{output}} in the response mapping: during evaluation, test inputs are substituted into the request and the AI's answer is read from the marked response field.
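To make the binding concrete, the sketch below mimics what the two placeholders imply: {{input}} is substituted into a body template, and the position of {{output}} in the response mapping says where the answer is read from. The names bindInput, extractOutput, and Json are hypothetical and written only for illustration; they are not part of @mankinds/sdk, and the service's actual substitution logic may differ.

```typescript
// Illustration only: hypothetical helpers, not the SDK's implementation.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Recursively replace every "{{input}}" occurrence in a body template.
function bindInput(template: Json, input: string): Json {
  if (typeof template === "string") {
    return template.split("{{input}}").join(input);
  }
  if (Array.isArray(template)) {
    return template.map((item) => bindInput(item, input));
  }
  if (template !== null && typeof template === "object") {
    const bound: { [key: string]: Json } = {};
    for (const [key, value] of Object.entries(template)) {
      bound[key] = bindInput(value, input);
    }
    return bound;
  }
  return template; // numbers, booleans and null pass through unchanged
}

// Walk the mapping until "{{output}}" is found, following the same keys
// in the actual response, and return the value found at that position.
function extractOutput(mapping: Json, response: Json | undefined): Json | undefined {
  if (mapping === "{{output}}") {
    return response;
  }
  if (
    mapping !== null && typeof mapping === "object" && !Array.isArray(mapping) &&
    response !== null && typeof response === "object" && !Array.isArray(response)
  ) {
    for (const [key, sub] of Object.entries(mapping)) {
      const found = extractOutput(sub, response[key]);
      if (found !== undefined) {
        return found;
      }
    }
  }
  return undefined;
}

// The body and response templates from the example above.
const requestBody = bindInput({ message: "{{input}}" }, "Where is my order?");
const answer = extractOutput({ answer: "{{output}}" }, { answer: "It ships today.", id: 7 });
console.log(JSON.stringify(requestBody), answer);
```

Nested templates work the same way, so a mapping such as { data: { answer: "{{output}}" } } would read response.data.answer under these assumptions.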
Endpoint Configuration
The endpoint defines how to call your AI system during evaluation.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| url | string | Yes | API endpoint URL |
| method | string | Yes | HTTP method (POST, GET, etc.) |
| body | object | Yes | Request body with {{input}} placeholder |
| response | object | Yes | Response mapping with {{output}} placeholder |
| headers | object | No | HTTP headers |
| streaming | object | No | SSE streaming configuration |
| multiturn | object | No | Multi-turn conversation configuration |
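The streaming configuration shown later in this section carries a content_path such as choices[0].delta.content. As a rough sketch of what such a path implies, the code below resolves it against each parsed SSE data line and concatenates the text pieces. resolvePath and collectStream are hypothetical helpers, and the chunk layout is assumed to be OpenAI-style; how the evaluation service actually parses streams is not documented here.

```typescript
// Illustration only: hypothetical helpers, not part of @mankinds/sdk.

// Resolve a path like "choices[0].delta.content" against a parsed JSON value.
function resolvePath(value: unknown, path: string): unknown {
  // "choices[0].delta.content" -> ["choices", "0", "delta", "content"]
  const segments = path
    .replace(/\[(\d+)\]/g, ".$1")
    .split(".")
    .filter((segment) => segment.length > 0);
  let current: unknown = value;
  for (const segment of segments) {
    if (current === null || typeof current !== "object") {
      return undefined; // path leads through a non-object value
    }
    current = (current as Record<string, unknown>)[segment];
  }
  return current;
}

// Concatenate the text pieces carried by SSE "data:" lines.
// Assumes OpenAI-style chunks that end with a "data: [DONE]" sentinel.
function collectStream(sseLines: string[], contentPath: string): string {
  let text = "";
  for (const line of sseLines) {
    if (line.indexOf("data:") !== 0) {
      continue; // skip blank separators, comments and other SSE fields
    }
    const payload = line.slice("data:".length).trim();
    if (payload === "" || payload === "[DONE]") {
      continue;
    }
    const piece = resolvePath(JSON.parse(payload), contentPath);
    if (typeof piece === "string") {
      text += piece;
    }
  }
  return text;
}

const sampleLines = [
  'data: {"choices":[{"delta":{"content":"Hel"}}]}',
  "",
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  'data: {"choices":[{"delta":{}}]}',
  "data: [DONE]",
];
console.log(collectStream(sampleLines, "choices[0].delta.content"));
```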
Placeholders:
- {{input}} in body: replaced with test inputs during evaluation
- {{output}} in response: indicates which field contains the AI response
{
  body: { message: "{{input}}" },
  response: { answer: "{{output}}" }
}
Streaming (SSE):
{
  url: "https://api.example.com/chat",
  method: "POST",
  body: { message: "{{input}}" },
  response: { answer: "{{output}}" },
  streaming: {
    enabled: true,
    format: "openai", // "openai" | "anthropic" | "custom"
    content_path: "choices[0].delta.content",
  },
}
Multi-turn conversations:
{
  url: "https://api.example.com/chat",
  method: "POST",
  body: { message: "{{input}}", session_id: "{{session}}" },
  response: { answer: "{{output}}" },
  multiturn: {
    type: "session_id", // "none" | "session_id" | "history"
    field: "conversation_id",
    location: "body",
  },
}
Generate Evaluation Dataset
Test scenarios can be auto-generated based on your system description, or you can provide custom scenarios.
Auto-generate scenarios:
const dataset = await client.generateDataset(systemId, 20);
Provide custom scenarios:
const dataset = await client.generateDataset(systemId, 10, [
  { input: "Where is my order?", outputs: ["I can help you track your order."] },
  { input: "I want a refund", outputs: ["I'll process your refund request."] },
]);
Refine an existing dataset:
const dataset = await client.updateDataset(systemId, {
  orientation: "Add more edge cases about payment failures",
});
Note: generateDataset requires a validated system description. If validation fails, a DescriptionNotValidatedError is thrown with recommendations.
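When custom scenarios come from a file or another tool, it can help to check their shape before passing them to generateDataset. The sketch below validates the { input, outputs } structure used in the custom-scenario example. The Scenario interface and parseScenarios are hypothetical (the SDK exports its own ScenarioInput type), and the rule that input must be non-empty is an assumption rather than a documented constraint.

```typescript
// Illustration only: mirrors the scenario shape shown in this README.
// With the SDK installed, its exported ScenarioInput type would be the reference.
interface Scenario {
  input: string;
  outputs: string[];
}

// Validate untrusted data (for example, the result of JSON.parse) as a scenario list.
function parseScenarios(raw: unknown): Scenario[] {
  if (!Array.isArray(raw)) {
    throw new Error("Expected an array of scenarios");
  }
  return raw.map((item: unknown, index: number) => {
    if (item === null || typeof item !== "object") {
      throw new Error(`Scenario ${index}: expected an object`);
    }
    const candidate = item as { input?: unknown; outputs?: unknown };
    // Assumption: an empty input is treated as a mistake.
    if (typeof candidate.input !== "string" || candidate.input.trim() === "") {
      throw new Error(`Scenario ${index}: "input" must be a non-empty string`);
    }
    const outputs = candidate.outputs;
    if (!Array.isArray(outputs) || !outputs.every((o: unknown) => typeof o === "string")) {
      throw new Error(`Scenario ${index}: "outputs" must be an array of strings`);
    }
    return { input: candidate.input, outputs: outputs as string[] };
  });
}

const fileText =
  '[{"input":"Where is my order?","outputs":["I can help you track your order."]}]';
const scenarios = parseScenarios(JSON.parse(fileText));
console.log(scenarios.length);
// The validated list could then be passed as the third argument:
// await client.generateDataset(systemId, 10, scenarios);
```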
Run Evaluation
Start an evaluation to test your AI system. By default, the call blocks until the evaluation completes.
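For the non-blocking mode (wait: false, shown further down in this section), the caller checks the run with getEvaluation. A minimal polling loop with a deadline might look like the sketch below. waitForRun is a hypothetical helper, the fetch function is passed in so the code runs without the SDK, and the "completed" status value in the usage comment is an assumption, since the possible status strings are not listed here.

```typescript
// Illustration only: a generic poll-until-done helper, not part of @mankinds/sdk.
interface RunStatus {
  status: string;
}

async function waitForRun<T extends RunStatus>(
  fetchRun: () => Promise<T>,
  isDone: (run: T) => boolean,
  intervalMs: number,
  timeoutMs: number
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const run = await fetchRun();
    if (isDone(run)) {
      return run;
    }
    // Give up if the next poll would land after the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Run still "${run.status}" after ${timeoutMs} ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Usage with the SDK (not executed here; "completed" is an assumed status value):
// const result = await waitForRun(
//   () => client.getEvaluation(runId),
//   (run) => run.status === "completed",
//   10000,   // poll every 10 seconds
//   600000   // give up after 10 minutes
// );
```

Note that this helper takes milliseconds, whereas the SDK's own pollInterval option in the progress-callback example appears to be expressed in seconds.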
Block until complete (default):
const result = await client.evaluate(systemId);
console.log(`Score: ${JSON.stringify(result.summary)}`);
Start without waiting:
const runInfo = await client.evaluate(systemId, { wait: false });
const runId = runInfo.run_id;
// Check status later
const result = await client.getEvaluation(runId);
console.log(`Status: ${result.status}`);
With specific thematics:
const result = await client.evaluate(systemId, {
  thematicsConfig: {
    explainability: { justification: { nb_tests: 5 } },
    robustness: { prompt_injection: { nb_tests: 10 } },
  },
});
With evaluation profile:
const result = await client.evaluate(systemId, { profile: "extended" });
With progress callback:
const result = await client.evaluate(systemId, {
  pollInterval: 10,
  onPoll: (status, elapsed) => console.log(` ${status} (${elapsed}s)`),
});
Connectors
Connectors attach external data sources (logs, databases) to your system for richer evaluation context.
File logs:
import { FileConnector } from "@mankinds/sdk";
const connector = new FileConnector({ filePath: "/path/to/logs.json" });
await client.addConnector(systemId, connector);
Datadog logs:
import { DatadogConnector } from "@mankinds/sdk";
const connector = new DatadogConnector({
  apiKey: "dd-api-key",
  appKey: "dd-app-key",
  site: "datadoghq.eu", // default
});
await client.addConnector(systemId, connector);
SQLite database:
import { SqliteConnector } from "@mankinds/sdk";
const connector = new SqliteConnector({ filePath: "/path/to/database.db" });
await client.addConnector(systemId, connector);
PostgreSQL database:
import { PostgresqlConnector } from "@mankinds/sdk";
const connector = new PostgresqlConnector({
  host: "localhost",
  database: "mydb",
  user: "admin",
  password: "secret",
  port: 5432,
});
await client.addConnector(systemId, connector);
Manage connectors:
// List all connectors
const connectors = await client.getConnectors(systemId);
// Update a connector
const connector = new FileConnector({ filePath: "/path/to/new-logs.json" });
await client.updateConnector(systemId, connector);
// Remove a connector
await client.deleteConnector(systemId, connector);
Only one connector per category (logs, database) is allowed per system. Adding a duplicate throws ConnectorAlreadyExistsError.
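Because a second connector in the same category is rejected, scripts that may run more than once can fall back to an update when the add fails. The sketch below shows that pattern. addOrUpdateConnector and the ConnectorClient interface are hypothetical, the method signatures are inferred from the examples above, and the duplicate check is passed in as a predicate so the code runs without the SDK installed. It also assumes that updateConnector replaces the existing connector of the same category.

```typescript
// Illustration only: an add-then-update fallback, not part of @mankinds/sdk.

// Just the two client methods this helper needs (signatures inferred from the README).
interface ConnectorClient<Id, C> {
  addConnector(systemId: Id, connector: C): Promise<unknown>;
  updateConnector(systemId: Id, connector: C): Promise<unknown>;
}

async function addOrUpdateConnector<Id, C>(
  client: ConnectorClient<Id, C>,
  systemId: Id,
  connector: C,
  isDuplicate: (error: unknown) => boolean
): Promise<"added" | "updated"> {
  try {
    await client.addConnector(systemId, connector);
    return "added";
  } catch (error) {
    if (!isDuplicate(error)) {
      throw error; // unrelated failures are not swallowed
    }
    await client.updateConnector(systemId, connector);
    return "updated";
  }
}

// Usage with the SDK (not executed here):
// const outcome = await addOrUpdateConnector(
//   client,
//   systemId,
//   new FileConnector({ filePath: "./logs/production.json" }),
//   (e) => e instanceof ConnectorAlreadyExistsError
// );
```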
Complete Example
import { MankindsClient, FileConnector } from "@mankinds/sdk";
const client = new MankindsClient("mk_...");
// Create system
const system = await client.createSystem(
  "Support Bot",
  "A customer support chatbot for order tracking and returns.",
  {
    url: "https://api.example.com/chat",
    method: "POST",
    body: { message: "{{input}}" },
    response: { answer: "{{output}}" },
  }
);
const systemId = system.id;
// Attach production logs
const connector = new FileConnector({ filePath: "./logs/production.json" });
await client.addConnector(systemId, connector);
// Generate dataset and evaluate
const dataset = await client.generateDataset(systemId, 15);
const result = await client.evaluate(systemId, { profile: "extended" });
console.log(`Status: ${result.status}`);
console.log(`Score: ${JSON.stringify(result.summary)}`);
API Reference
MankindsClient
| Method | Description |
|--------|-------------|
| getSystem(systemId) | Get system details and configuration |
| createSystem(name, description, endpoint) | Create a new AI system |
| updateSystem(systemId, options) | Update an existing system |
| generateDataset(systemId, numScenarios?, scenarios?) | Generate and validate evaluation scenarios |
| updateDataset(systemId, options) | Refine or replace dataset scenarios |
| evaluate(systemId, options?) | Run an evaluation |
| getEvaluation(runId) | Get evaluation status and results |
| addConnector(systemId, connector) | Add a data source connector |
| getConnectors(systemId) | List all connectors for a system |
| updateConnector(systemId, connector) | Update a connector |
| deleteConnector(systemId, connector) | Remove a connector |
Types
The SDK exports all interfaces:
import type {
  EndpointConfig,
  StreamingConfig,
  MultiturnConfig,
  ScenarioInput,
  ThematicsConfig,
  SystemDetails,
  Dataset,
  EvaluationResult,
  ConnectorInfo,
} from "@mankinds/sdk";
Exceptions
| Exception | When Thrown |
|-----------|------------|
| CredentialsError | Missing API key |
| AuthenticationError | Invalid or expired API key (401) |
| NotFoundError | Resource not found (404) |
| ValidationError | Request validation failed (422) |
| RateLimitError | Too many requests (429) |
| ServerError | Server error (5xx) |
| InvalidEndpointError | Endpoint missing required fields |
| EndpointNotConfiguredError | Evaluation without endpoint |
| DescriptionNotValidatedError | Dataset generation before validation |
| ConnectorAlreadyExistsError | Duplicate connector category |
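Some of these errors, such as RateLimitError (429) and ServerError (5xx), are often worth retrying after a pause, while the others generally indicate a problem that a retry will not fix. The sketch below is one way to wrap a call in exponential backoff. withRetry is a hypothetical helper, not part of the SDK, and the decision about which errors to retry is passed in as a predicate so the code is self-contained.

```typescript
// Illustration only: generic exponential backoff, not part of @mankinds/sdk.
async function withRetry<T>(
  operation: () => Promise<T>,
  shouldRetry: (error: unknown) => boolean,
  maxAttempts: number = 4,
  baseDelayMs: number = 1000
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Stop on the last attempt or on errors the caller considers permanent.
      if (attempt >= maxAttempts || !shouldRetry(error)) {
        throw error;
      }
      // Delay doubles each time: base, 2x base, 4x base, ...
      const delayMs = baseDelayMs * Math.pow(2, attempt - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
}

// Usage with the SDK (not executed here):
// const result = await withRetry(
//   () => client.getEvaluation(runId),
//   (e) => e instanceof RateLimitError || e instanceof ServerError
// );
```

The usage comment wraps getEvaluation because it is a read. Retrying evaluate after a server error could start a second run if the first request was actually accepted, so that call probably deserves more care.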
import {
  AuthenticationError,
  InvalidEndpointError,
  DescriptionNotValidatedError,
} from "@mankinds/sdk";
try {
  const result = await client.evaluate(systemId);
} catch (error) {
  if (error instanceof AuthenticationError) {
    console.error("Invalid API key");
  } else if (error instanceof InvalidEndpointError) {
    console.error("Missing fields:", error.missingFields);
  } else if (error instanceof DescriptionNotValidatedError) {
    console.error("Fix description:", error.recommendations);
  }
}
License
© 2026 Mankinds. All rights reserved.
