# sologate-guard

Prompt firewall for AI agents.
Scores incoming prompts 0–100 before they reach your LLM — catches prompt injection, goal hijacking, credential fishing, PII leaks, and jailbreak attempts. Routes anything risky to the Sologate Decision Center for human review.
## Install

```bash
npm install sologate-guard
```

## Usage

### Score a prompt (no gate)

```js
import { scorePrompt } from 'sologate-guard'

const { score, tier, reason, flags } = scorePrompt(userInput)
// { score: 95, tier: 'HIGH', reason: 'Prompt injection — attempts to override agent instructions', flags: [...] }
```

### Guard — wrap your LLM call
```js
import { guard, GateRejectedError } from 'sologate-guard'

try {
  const response = await guard({
    prompt: userInput,
    sologateUrl: process.env.SOLOGATE_URL,
    apiKey: process.env.SOLOGATE_KEY,
    threshold: 70, // gate anything above this score (default: 70)
    call: () => openai.chat.completions.create({
      model: 'gpt-4o',
      messages: [{ role: 'user', content: userInput }],
    }),
  })

  // ✅ Prompt was safe (or human approved) — response is the LLM output
  console.log(response.choices[0].message.content)
} catch (err) {
  if (err instanceof GateRejectedError) {
    console.log('Prompt rejected by reviewer — aborting')
  }
}
```

### Works with any LLM
```js
// Anthropic
const response = await guard({
  prompt: userInput,
  sologateUrl: process.env.SOLOGATE_URL,
  apiKey: process.env.SOLOGATE_KEY,
  call: () => anthropic.messages.create({ ... }),
})

// LangChain
const response = await guard({
  prompt: userInput,
  sologateUrl: process.env.SOLOGATE_URL,
  apiKey: process.env.SOLOGATE_KEY,
  call: () => chain.invoke({ input: userInput }),
})
```

## What it detects
| Pattern | Score | Default behavior |
|---|---|---|
| Prompt injection (ignore instructions) | 93–95 | Hard gate |
| Persona override (you are now...) | 91 | Hard gate |
| Goal hijacking (hidden secondary objective) | 90 | Hard gate |
| Data exfiltration to external email | 97 | Hard gate |
| Credential fishing (give me the API key) | 88 | Hard gate |
| Jailbreak (DAN mode, no restrictions) | 92 | Hard gate |
| Privilege escalation (run as root) | 75 | Gate (above threshold) |
| Bulk data access (export all users) | 78 | Gate (above threshold) |
| Social engineering (authority claim) | 72 | Gate (above threshold) |
| Hypothetical harmful request | 65 | Allowed at default threshold (70); gated if threshold is lowered |
| Normal prompt | 10 | Auto-approved |
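The gate decision in the table reduces to a threshold comparison: scores above the configured threshold are routed to human review, everything else passes through. A minimal sketch of that rule (the `gate` helper and the sample scores are illustrative, not the package's API; `SOLOGATE_THRESHOLD` is the package's optional env var, and 70 is the documented default):

```js
// Illustrative re-implementation of the gating rule, NOT sologate-guard itself.
// Falls back to the documented default of 70 when SOLOGATE_THRESHOLD is unset.
const threshold = Number(process.env.SOLOGATE_THRESHOLD ?? 70)

// Scores above the limit go to human review; the rest auto-approve.
function gate(score, limit = threshold) {
  return score > limit ? 'review' : 'auto-approve'
}

console.log(gate(95)) // prompt injection score → 'review' at the default threshold
console.log(gate(10)) // normal prompt score → 'auto-approve'
```

Lowering the threshold (via the `threshold` option or `SOLOGATE_THRESHOLD`) pulls the mid-band patterns in the table into the gated range.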
## Environment variables

```bash
export SOLOGATE_URL=https://www.teamrocketlabs.com
export SOLOGATE_KEY=at_your_agent_key
export SOLOGATE_THRESHOLD=70   # optional — passed as the threshold option
```

## Pair with sologate-openclaw
`sologate-guard` watches the prompt (intent layer).
`sologate-openclaw` watches the tool call (action layer).
Together they cover the full attack surface:

```
User prompt → [sologate-guard] → LLM → tool call → [sologate-openclaw] → execution
```

## License
MIT
