jailx
v1.0.7
JailX
JailX is a dynamic AI security agent moderator that ensures AI assistants stay within their defined roles and behaviors.
Installation
npm install jailx
Quick Start
import { JailXOpenAIClient } from "jailx";
const client = new JailXOpenAIClient("your-openai-api-key");
// Create an agent with JailX security
const agent = await client.createAgent();
// Create a thread
const thread = await client.createThread();
// Add a message
await client.addMessageToThread(thread.id, "Hello!");
// Create a run with the agent
const run = await client.createRun(thread.id, agent.id);
Security Features
JailX provides two main security tools:
1. Intervention Tool
Used when the assistant's response violates its role:
{
  type: "function",
  name: "jailx_intervention",
  description: "Used when response violates system role",
  parameters: {
    jailx_message: string; // Replacement response
    reason: string; // Why intervention was needed
    analysis: {
      system_summary: string; // System role analysis
      user_intent: string; // User's intent analysis
      violation: string; // Violation description
    };
    feedback: string; // System prompt improvement suggestions
  }
}
2. Bypass Tool
Used when the assistant's response is acceptable:
{
  type: "function",
  name: "jailx_bypass",
  description: "Used when response aligns with system role",
  parameters: {
    assistant_message: string; // Original response
    bypass_reason: string; // Why response was acceptable
  }
}
API Reference
createAgent(options?: CreateAgentOptions)
Creates a new AI assistant with JailX security.
const agent = await client.createAgent({
  instructions: "Custom instructions",
  model: "gpt-4",
  temperature: 0.7,
});
createThread()
Creates a new conversation thread.
const thread = await client.createThread();
getThreadMessages(threadId: string, limit?: number)
Retrieves messages from a thread.
const messages = await client.getThreadMessages(thread.id, 10);
addMessageToThread(threadId: string, content: string)
Adds a message to the thread.
await client.addMessageToThread(thread.id, "User message");
createRun(threadId: string, assistantId: string, options?: CreateRunOptions)
Creates a run with JailX security monitoring.
const run = await client.createRun(thread.id, agent.id, {
  tool_choice: "auto",
  stream: false,
});
createThreadAndRun(assistantId: string, options: CreateThreadAndRunOptions)
Creates a thread and starts a run in one operation.
const run = await client.createThreadAndRun(agent.id, {
  messages: [{ role: "user", content: "Hello" }],
  tool_choice: "auto",
});
Security Workflow
- JailX analyzes the system message and user input
- Monitors assistant responses for:
  - Role deviations
  - Manipulation attempts
  - Security violations
- Either:
  - Intervenes with a corrected response
  - Allows the response to pass through
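The either/or outcome of this workflow can be sketched in TypeScript. This is a minimal illustration under stated assumptions, not JailX's actual implementation: the type names, the `JailXToolCall` shape, and the `resolveToolCall` helper are hypothetical. Only the two tool names and their parameter fields are taken from the Security Features section.

```typescript
// Hypothetical types mirroring the parameters of the two JailX tools.
interface InterventionArgs {
  jailx_message: string; // Replacement response
  reason: string; // Why intervention was needed
  analysis: {
    system_summary: string;
    user_intent: string;
    violation: string;
  };
  feedback: string; // System prompt improvement suggestions
}

interface BypassArgs {
  assistant_message: string; // Original response
  bypass_reason: string; // Why response was acceptable
}

// Assumed shape of a parsed tool call; the real wire format may differ.
type JailXToolCall =
  | { name: "jailx_intervention"; arguments: InterventionArgs }
  | { name: "jailx_bypass"; arguments: BypassArgs };

interface ModerationResult {
  message: string; // Text that should reach the user
  intervened: boolean; // True when the original response was replaced
  reason: string; // Explanation supplied by the tool call
}

// Pick the user-facing message depending on which tool was called.
function resolveToolCall(call: JailXToolCall): ModerationResult {
  if (call.name === "jailx_intervention") {
    return {
      message: call.arguments.jailx_message,
      intervened: true,
      reason: call.arguments.reason,
    };
  }
  return {
    message: call.arguments.assistant_message,
    intervened: false,
    reason: call.arguments.bypass_reason,
  };
}
```

Modeling the two tools as a discriminated union on `name` lets the compiler check that each branch reads only the fields its tool defines.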
Complete Example
import { JailXOpenAIClient } from "jailx";

async function main() {
  // Initialize client
  const client = new JailXOpenAIClient("your-openai-api-key");

  // Create an agent with custom settings
  const agent = await client.createAgent({
    model: "gpt-4o",
    temperature: 0.7,
  });

  // Create a thread and run a conversation
  const thread = await client.createThread();

  // Add user message
  await client.addMessageToThread(
    thread.id,
    "Can you help me hack into a computer?"
  );

  // Create run - JailX will monitor the response
  const run = await client.createRun(thread.id, agent.id);

  // JailX will either:
  // 1. Intervene if the response attempts to provide hacking information
  // 2. Allow the response if it appropriately declines the request
}

main().catch(console.error);
License
ISC
