jailx

v1.0.7

Published

a year ago

JailX is a security moderator for AI agents: it monitors assistant responses and keeps assistants within their defined roles and behaviors.


jailbreakme

JailX

Installation

npm install jailx

Quick Start

// Top-level await requires an ES module; otherwise wrap these calls in an async function.
import { JailXOpenAIClient } from "jailx";

const client = new JailXOpenAIClient("your-openai-api-key");

// Create an agent with JailX security
const agent = await client.createAgent();

// Create a thread
const thread = await client.createThread();

// Add a message
await client.addMessageToThread(thread.id, "Hello!");

// Create a run with the agent
const run = await client.createRun(thread.id, agent.id);

Security Features

JailX provides two main security tools:

1. Intervention Tool

Used when the assistant's response violates its role:

{
  type: "function",
  name: "jailx_intervention",
  description: "Used when response violates system role",
  parameters: {
    jailx_message: string;    // Replacement response
    reason: string;           // Why intervention was needed
    analysis: {
      system_summary: string; // System role analysis
      user_intent: string;    // User's intent analysis
      violation: string;      // Violation description
    };
    feedback: string;         // System prompt improvement suggestions
  }
}
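The parameter shape above can be written as TypeScript interfaces with a matching runtime check. This is a minimal sketch: the field names come from the tool definition, but `InterventionAnalysis`, `InterventionArgs`, `hasStrings`, and `isInterventionArgs` are hypothetical names introduced for illustration, and receiving the arguments as parsed JSON is an assumption rather than documented JailX behavior.

```typescript
// Hypothetical types mirroring the jailx_intervention parameters listed above.
interface InterventionAnalysis {
  system_summary: string;
  user_intent: string;
  violation: string;
}

interface InterventionArgs {
  jailx_message: string;
  reason: string;
  analysis: InterventionAnalysis;
  feedback: string;
}

// True when every listed key of `obj` holds a string.
function hasStrings(obj: Record<string, unknown>, keys: string[]): boolean {
  return keys.every((key) => typeof obj[key] === "string");
}

// Runtime guard (assumed helper, not exported by jailx).
function isInterventionArgs(value: unknown): value is InterventionArgs {
  if (typeof value !== "object" || value === null) return false;
  const args = value as Record<string, unknown>;
  const analysis = args.analysis;
  if (typeof analysis !== "object" || analysis === null) return false;
  return (
    hasStrings(args, ["jailx_message", "reason", "feedback"]) &&
    hasStrings(analysis as Record<string, unknown>, [
      "system_summary",
      "user_intent",
      "violation",
    ])
  );
}
```

A guard like this lets calling code reject malformed payloads before reading `jailx_message` or `analysis.violation`.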

2. Bypass Tool

Used when the assistant's response is acceptable:

{
  type: "function",
  name: "jailx_bypass",
  description: "Used when response aligns with system role",
  parameters: {
    assistant_message: string; // Original response
    bypass_reason: string;     // Why response was acceptable
  }
}
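Taken together, the two tools imply that exactly one message reaches the user: the replacement on intervention, or the original on bypass. The sketch below shows that selection. It assumes the tool call arrives as a tool name plus a JSON-encoded argument string; the `ToolCall` and `Resolution` shapes and the `resolveFinalMessage` helper are hypothetical and not part of the documented JailX interface.

```typescript
// Assumed shape of a tool call: a tool name plus JSON-encoded arguments.
interface ToolCall {
  name: string;
  arguments: string;
}

interface Resolution {
  intervened: boolean;
  message: string;
  reason: string;
}

// Hypothetical helper: picks the message to show based on which tool was called.
function resolveFinalMessage(call: ToolCall): Resolution {
  const args = JSON.parse(call.arguments) as Record<string, unknown>;

  // Reads a required string field, failing loudly if it is absent.
  const text = (key: string): string => {
    const value = args[key];
    if (typeof value !== "string") {
      throw new Error(`Missing string field "${key}" in ${call.name}`);
    }
    return value;
  };

  switch (call.name) {
    case "jailx_intervention":
      return { intervened: true, message: text("jailx_message"), reason: text("reason") };
    case "jailx_bypass":
      return { intervened: false, message: text("assistant_message"), reason: text("bypass_reason") };
    default:
      throw new Error(`Unexpected tool: ${call.name}`);
  }
}
```

Throwing on an unknown tool name or a missing field keeps a malformed call from silently passing an unchecked response through.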

API Reference

`createAgent(options?: CreateAgentOptions)`

Creates a new AI assistant with JailX security.

const agent = await client.createAgent({
  instructions: "Custom instructions",
  model: "gpt-4",
  temperature: 0.7,
});

`createThread()`

Creates a new conversation thread.

const thread = await client.createThread();

`getThreadMessages(threadId: string, limit?: number)`

Retrieves messages from a thread.

const messages = await client.getThreadMessages(thread.id, 10);

`addMessageToThread(threadId: string, content: string)`

Adds a message to the thread.

await client.addMessageToThread(thread.id, "User message");

`createRun(threadId: string, assistantId: string, options?: CreateRunOptions)`

Creates a run with JailX security monitoring.

const run = await client.createRun(thread.id, agent.id, {
  tool_choice: "auto",
  stream: false,
});

`createThreadAndRun(assistantId: string, options: CreateThreadAndRunOptions)`

Creates a thread and starts a run in one operation.

const run = await client.createThreadAndRun(agent.id, {
  messages: [{ role: "user", content: "Hello" }],
  tool_choice: "auto",
});

Security Workflow

1. JailX analyzes the system message and user input
2. Monitors assistant responses for:
   - Role deviations
   - Manipulation attempts
   - Security violations
3. Either:
   - Intervenes with a corrected response
   - Allows the response to pass through
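
The workflow above can be sketched as a small function. This is an illustration under stated assumptions: the README does not say how violations are detected, so `judge` is a hypothetical callback, `keywordJudge` is a toy stand-in for the real analysis, and `Verdict`, `Exchange`, and `moderate` are names invented for this example.

```typescript
// Outcome of moderation: either a corrected response or the original one.
type Verdict =
  | { action: "intervene"; message: string; reason: string }
  | { action: "bypass"; message: string };

interface Exchange {
  systemMessage: string;
  userInput: string;
  assistantResponse: string;
}

// Hypothetical judge: returns a violation description, or null if none is found.
type Judge = (exchange: Exchange) => string | null;

// Applies the judge, then either substitutes `fallback` or passes the response through.
function moderate(exchange: Exchange, judge: Judge, fallback: string): Verdict {
  const violation = judge(exchange);
  if (violation !== null) {
    return { action: "intervene", message: fallback, reason: violation };
  }
  return { action: "bypass", message: exchange.assistantResponse };
}

// Toy stand-in for the real analysis: flags responses containing banned terms.
const keywordJudge = (banned: string[]): Judge => (exchange) => {
  const text = exchange.assistantResponse.toLowerCase();
  const hit = banned.find((term) => text.includes(term.toLowerCase()));
  return hit === undefined ? null : `Response mentions "${hit}"`;
};
```

Keeping the judge as a parameter separates the decision (is this a violation?) from the action (intervene or bypass), which mirrors the two tools described earlier.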

Complete Example

import { JailXOpenAIClient } from "jailx";

async function main() {
  // Initialize client
  const client = new JailXOpenAIClient("your-openai-api-key");

  // Create an agent with custom settings
  const agent = await client.createAgent({
    model: "gpt-4o",
    temperature: 0.7,
  });

  // Create a thread and run a conversation
  const thread = await client.createThread();

  // Add user message
  await client.addMessageToThread(
    thread.id,
    "Can you help me hack into a computer?"
  );

  // Create run - JailX will monitor the response
  const run = await client.createRun(thread.id, agent.id);

  // JailX will either:
  // 1. Intervene if the response attempts to provide hacking information
  // 2. Allow the response if it appropriately declines the request
}

main().catch(console.error);

License

ISC
