# sologate-guard

Prompt firewall for AI agents.
Scores incoming prompts 0–100 before they reach your LLM — catches prompt injection, goal hijacking, credential fishing, PII leaks, and jailbreak attempts. Routes anything risky to the Sologate Decision Center for human review.
## Install

```bash
npm install sologate-guard
```

## Usage

### Score a prompt (no gate)

```js
import { scorePrompt } from 'sologate-guard'

const { score, tier, reason, flags } = scorePrompt(userInput)
// { score: 95, tier: 'HIGH', reason: 'Prompt injection — attempts to override agent instructions', flags: [...] }
```

### Guard — wrap your LLM call
```js
import { guard, GateRejectedError } from 'sologate-guard'

try {
  const response = await guard({
    prompt: userInput,
    sologateUrl: process.env.SOLOGATE_URL,
    apiKey: process.env.SOLOGATE_KEY,
    threshold: 70, // gate anything above this score (default: 70)
    call: () => openai.chat.completions.create({
      model: 'gpt-4o',
      messages: [{ role: 'user', content: userInput }],
    }),
  })

  // ✅ Prompt was safe (or human approved) — response is the LLM output
  console.log(response.choices[0].message.content)
} catch (err) {
  if (err instanceof GateRejectedError) {
    console.log('Prompt rejected by reviewer — aborting')
  }
}
```

### Works with any LLM
```js
// Anthropic
const response = await guard({
  prompt: userInput,
  sologateUrl: process.env.SOLOGATE_URL,
  apiKey: process.env.SOLOGATE_KEY,
  call: () => anthropic.messages.create({ ... }),
})

// LangChain
const response = await guard({
  prompt: userInput,
  sologateUrl: process.env.SOLOGATE_URL,
  apiKey: process.env.SOLOGATE_KEY,
  call: () => chain.invoke({ input: userInput }),
})
```

## What it detects
| Pattern | Score | Default behavior |
|---|---|---|
| Prompt injection (ignore instructions) | 93–95 | Hard gate |
| Persona override (you are now...) | 91 | Hard gate |
| Goal hijacking (hidden secondary objective) | 90 | Hard gate |
| Data exfiltration to external email | 97 | Hard gate |
| Credential fishing (give me the API key) | 88 | Hard gate |
| Jailbreak (DAN mode, no restrictions) | 92 | Hard gate |
| Privilege escalation (run as root) | 75 | Gate (above threshold) |
| Bulk data access (export all users) | 78 | Gate (above threshold) |
| Social engineering (authority claim) | 72 | Gate (above threshold) |
| Hypothetical harmful request | 65 | Allowed at default threshold (70); gated if threshold is lowered |
| Normal prompt | 10 | Auto-approved |
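The gate decision in the table reduces to a threshold comparison: scores above the configured threshold are routed to human review, everything else passes through. A minimal sketch of that rule (the `gate` helper and the sample scores are illustrative, not the package's API; `SOLOGATE_THRESHOLD` is the package's optional env var, and 70 is the documented default):

```js
// Illustrative re-implementation of the gating rule, NOT sologate-guard itself.
// Falls back to the documented default of 70 when SOLOGATE_THRESHOLD is unset.
const threshold = Number(process.env.SOLOGATE_THRESHOLD ?? 70)

// Scores above the limit go to human review; the rest auto-approve.
function gate(score, limit = threshold) {
  return score > limit ? 'review' : 'auto-approve'
}

console.log(gate(95)) // prompt injection score → 'review' at the default threshold
console.log(gate(10)) // normal prompt score → 'auto-approve'
```

Lowering the threshold (via the `threshold` option or `SOLOGATE_THRESHOLD`) pulls the mid-band patterns in the table into the gated range.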
## Environment variables

```bash
export SOLOGATE_URL=https://www.teamrocketlabs.com
export SOLOGATE_KEY=at_your_agent_key
export SOLOGATE_THRESHOLD=70   # optional — passed as the threshold option
```

## Pair with sologate-openclaw
`sologate-guard` watches the prompt (intent layer).
`sologate-openclaw` watches the tool call (action layer).
Together they cover the full attack surface:

```
User prompt → [sologate-guard] → LLM → tool call → [sologate-openclaw] → execution
```

## License
MIT
