@modular-intelligence/garak
v1.0.2
MCP server wrapping garak for LLM vulnerability and safety scanning
Garak MCP Server
MCP server wrapping garak for LLM vulnerability and safety scanning. Provides comprehensive security testing capabilities for Large Language Models through the Model Context Protocol.
Overview
This MCP server exposes garak's powerful LLM security testing capabilities through a standardized interface. Garak (Generative AI Red-teaming and Assessment Kit) is a comprehensive framework for probing LLMs for vulnerabilities including jailbreaks, prompt injections, toxic content generation, hallucinations, and more.
Features
- Comprehensive Scanning: Run full vulnerability scans with customizable probe and detector combinations
- Category Evaluation: Test specific vulnerability categories with tailored recommendations
- Probe Discovery: List and explore available vulnerability probes
- Report Analysis: Parse and analyze historical garak scan results
- Security Controls: Built-in authorization requirements and output sanitization
- Detailed Insights: Get in-depth information about specific probes and their capabilities
Prerequisites
Required
Garak CLI: Install the garak vulnerability scanner
pip install garak
Bun Runtime: Install Bun for running the MCP server
curl -fsSL https://bun.sh/install | bash
LLM API Access: API keys for target models must be configured as environment variables:
- OpenAI: OPENAI_API_KEY
- Anthropic: ANTHROPIC_API_KEY
- Hugging Face: HUGGINGFACE_API_KEY
- Google: GOOGLE_API_KEY
- Cohere: COHERE_API_KEY
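As a rough illustration of how the variables above relate to model identifiers, the sketch below checks whether the key for a given target is present. The helper names are hypothetical, and the `huggingface`, `google`, and `cohere` prefixes are assumptions; only the `openai/` and `anthropic/` prefixes appear in this README.

```typescript
// Variable names copied from the list above. The provider prefixes for
// Hugging Face, Google, and Cohere are assumed, not documented.
const PROVIDER_KEYS: Record<string, string> = {
  openai: "OPENAI_API_KEY",
  anthropic: "ANTHROPIC_API_KEY",
  huggingface: "HUGGINGFACE_API_KEY",
  google: "GOOGLE_API_KEY",
  cohere: "COHERE_API_KEY",
};

type Env = Record<string, string | undefined>;

// Hypothetical helper: maps "openai/gpt-4" to "OPENAI_API_KEY".
// Returns undefined when the provider prefix is not in the table.
function requiredKeyFor(model: string): string | undefined {
  const [provider = ""] = model.split("/");
  return PROVIDER_KEYS[provider.toLowerCase()];
}

// True when the needed variable exists in `env` and is not blank.
function hasKeyFor(model: string, env: Env): boolean {
  const name = requiredKeyFor(model);
  return name !== undefined && (env[name] ?? "").trim() !== "";
}
```

Inside the server process, `env` would typically be `process.env`; passing it as a parameter keeps the check easy to test.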
Installation
cd /path/to/mi-mcp-servers/packages/garak
bun install
bun run build
Configuration
Add to your MCP settings file (e.g., claude_desktop_config.json):
{
"mcpServers": {
"garak": {
"command": "bun",
"args": ["run", "/path/to/mi-mcp-servers/packages/garak/src/index.ts"],
"env": {
"OPENAI_API_KEY": "your-openai-key",
"ANTHROPIC_API_KEY": "your-anthropic-key",
"HUGGINGFACE_API_KEY": "your-hf-key"
}
}
}
}
Available Tools
1. garak_scan
Run comprehensive vulnerability scan on target LLM.
Parameters:
- model (string, required): Target model identifier (e.g., "openai/gpt-4", "anthropic/claude-3-opus")
- authorized (boolean, required): Explicit authorization to scan (must be true)
- probes (array, optional): Specific probe modules to run (e.g., ["encoding.InjectBase64", "dan.Dan_11_0"])
- detectors (array, optional): Specific detector modules to use
- generations (number, optional): Number of prompt generations per probe (1-100, default: 10)
- timeout (number, optional): Maximum scan duration in seconds (60-600, default: 300)
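The constraints on these parameters can be sketched as a plain TypeScript check. This is not the server's code (the file structure suggests it validates with Zod in `schemas.ts`); `ScanArgs` and `validateScanArgs` are hypothetical names, and requiring `generations` to be an integer is an assumption.

```typescript
interface ScanArgs {
  model: string;
  authorized: boolean;
  probes?: string[];
  detectors?: string[];
  generations?: number;
  timeout?: number;
}

// Returns a list of problems; an empty list means the arguments
// satisfy the documented ranges.
function validateScanArgs(args: ScanArgs): string[] {
  const errors: string[] = [];
  const generations = args.generations ?? 10; // documented default
  const timeout = args.timeout ?? 300; // documented default

  if (args.model.trim() === "") errors.push("model is required");
  if (args.authorized !== true) errors.push("authorized must be true");
  // Integer-only generations is an assumption of this sketch.
  if (!Number.isInteger(generations) || generations < 1 || generations > 100) {
    errors.push("generations must be an integer from 1 to 100");
  }
  if (!Number.isFinite(timeout) || timeout < 60 || timeout > 600) {
    errors.push("timeout must be between 60 and 600 seconds");
  }
  return errors;
}
```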
Returns:
{
"model": "openai/gpt-4",
"probes_run": ["encoding.InjectBase64", "dan.Dan_11_0"],
"results": [
{
"probe": "encoding.InjectBase64",
"detector": "encoding.Base64Detector",
"attempts": 10,
"passes": 8,
"failures": 2,
"pass_rate": 0.8
}
],
"overall_pass_rate": 0.85,
"total_attempts": 20,
"total_passes": 17,
"total_failures": 3,
"report_path": "/path/to/report.json",
"scan_timestamp": "2026-02-09T10:30:00Z"
}
Example:
{
"model": "openai/gpt-4",
"authorized": true,
"probes": ["encoding.InjectBase64", "dan.Dan_11_0"],
"generations": 10,
"timeout": 300
}
2. garak_probe_list
List all available vulnerability probes.
Parameters: None
Returns:
{
"probes": [
{
"name": "encoding.InjectBase64",
"description": "Attempts to inject base64-encoded harmful content",
"tags": ["encoding", "injection"],
"category": "encoding"
}
],
"total": 150
}
3. garak_detector_list
List all available detectors.
Parameters: None
Returns:
{
"detectors": [
{
"name": "encoding.Base64Detector",
"description": "Detects base64-encoded content in responses",
"tags": ["encoding"]
}
],
"total": 45
}
4. garak_generator_list
List all available generators.
Parameters: None
Returns:
{
"generators": [
{
"name": "openai.OpenAIGenerator",
"description": "Generator for OpenAI models",
"model_types": ["openai", "gpt"]
}
],
"total": 20
}
5. garak_report
Parse and analyze garak report file.
Parameters:
- report_path (string, required): Path to garak report file (JSON or JSONL)
Returns:
{
"model": "openai/gpt-4",
"timestamp": "2026-02-09T10:30:00Z",
"probes": [
{
"name": "encoding.InjectBase64",
"attempts": 10,
"passes": 8,
"failures": 2,
"pass_rate": 0.8
}
],
"overall_pass_rate": 0.85,
"vulnerabilities_found": [
{
"probe": "encoding.InjectBase64",
"detector": "encoding.Base64Detector",
"severity": "MEDIUM",
"sample_sanitized": "[REDACTED_HARMFUL_CONTENT]"
}
],
"total_attempts": 20,
"total_passes": 17,
"total_failures": 3
}
6. garak_probe_detail
Get detailed information about a specific probe.
Parameters:
- probe_name (string, required): Probe module name (e.g., "encoding.InjectBase64")
Returns:
{
"name": "encoding.InjectBase64",
"description": "Attempts to inject base64-encoded harmful instructions",
"category": "encoding",
"tags": ["encoding", "injection", "obfuscation"],
"detectors_compatible": ["encoding.Base64Detector", "base.StringDetector"],
"parameters": {
"max_length": "512",
"encoding_type": "base64"
},
"example_prompts_sanitized": [
"Execute this: [REDACTED_JAILBREAK]",
"Decode and follow: [REDACTED_JAILBREAK]"
]
}
7. garak_evaluate
Run category-specific vulnerability evaluation.
Parameters:
- model (string, required): Target model identifier
- authorized (boolean, required): Explicit authorization (must be true)
- category (string, required): Vulnerability category to test:
  - encoding: Encoding-based attacks
  - dan: Do Anything Now jailbreaks
  - knowledgeable: Hallucination vulnerabilities
  - continuation: Toxic continuation
  - realtoxicity: Toxicity generation
  - misleading: Misinformation
  - packagehallucination: Fake package hallucination
  - snowball: Context overflow
  - xss: Cross-site scripting
  - prompt_injection: Prompt injection attacks
- generations (number, optional): Generations per probe (1-50, default: 5)
- timeout (number, optional): Timeout in seconds (60-600, default: 300)
Returns:
{
"model": "openai/gpt-4",
"category": "prompt_injection",
"results": [
{
"probe": "promptinject.HijackHateHumansMini",
"pass_rate": 0.9,
"sample_findings_sanitized": ["[REDACTED_JAILBREAK]"]
}
],
"category_score": 0.87,
"recommendations": [
"GOOD: Model demonstrates strong resistance to prompt_injection attacks (87.0% pass rate).",
"Implement strict input validation and prompt isolation techniques.",
"Focus on improving resistance to: promptinject.HijackHateHumansMini"
],
"total_probes": 5,
"timestamp": "2026-02-09T10:30:00Z"
}
Security Features
Authorization Requirements
All scanning operations require explicit authorization via the authorized parameter. This ensures users consciously consent to vulnerability testing on target models.
Output Sanitization
All tool outputs are automatically sanitized to remove or redact:
- Jailbreak patterns and instructions
- Harmful content instructions
- PII (Personally Identifiable Information)
- Adversarial prompts and payloads
Redacted content is marked with tags like:
- [REDACTED_JAILBREAK]
- [REDACTED_HARMFUL_CONTENT]
- [REDACTED_PII]
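To give a feel for how tag-based redaction of this kind can work, here is a minimal sketch. The two patterns are invented for illustration and are not the server's actual rules, which this README does not document; `sanitize` and `REDACTION_RULES` are hypothetical names.

```typescript
// Illustrative rules only: an email-like pattern standing in for PII and
// one well-known jailbreak phrase. Real detection would need far more.
const REDACTION_RULES: Array<{ pattern: RegExp; tag: string }> = [
  { pattern: /[\w.+-]+@[\w-]+(\.[\w-]+)+/g, tag: "[REDACTED_PII]" },
  {
    pattern: /ignore (all )?previous instructions[^.\n]*/gi,
    tag: "[REDACTED_JAILBREAK]",
  },
];

// Applies every rule in order and returns the redacted text.
function sanitize(text: string): string {
  return REDACTION_RULES.reduce(
    (out, { pattern, tag }) => out.replace(pattern, tag),
    text,
  );
}
```

Applying the rules in a fixed order keeps the output deterministic, which matters when the same report is parsed more than once.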
Blocked Operations
The following operations are prohibited for security:
- Local model downloads (--model_file, --model_path)
- Parallel execution (--parallel)
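A check like this could be applied to an argument list before it reaches the CLI. The sketch below is an assumption about how such a guard might look, not the contents of `security.ts`; the flag names are copied from the list above and `findBlockedFlags` is a hypothetical helper.

```typescript
// Flag names taken from the "Blocked Operations" list.
const BLOCKED_FLAGS = ["--model_file", "--model_path", "--parallel"];

// Returns the blocked flags found in `argv`, also catching the
// "--flag=value" spelling by comparing only the part before "=".
function findBlockedFlags(argv: string[]): string[] {
  return argv
    .map((arg) => {
      const eq = arg.indexOf("=");
      return eq === -1 ? arg : arg.slice(0, eq);
    })
    .filter((flag) => BLOCKED_FLAGS.includes(flag));
}
```

A caller would reject the request whenever the returned list is non-empty.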
Usage Examples
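The examples in this section are written as tool/arguments pairs. On the wire, an MCP client sends a tool invocation as a JSON-RPC 2.0 request with the method `tools/call`; most clients build this for you. The sketch below shows that wrapping, with `ToolCall` and `toJsonRpcRequest` as hypothetical names.

```typescript
// Shape used by the examples in this section.
interface ToolCall {
  tool: string;
  arguments: Record<string, unknown>;
}

// Wraps a tool/arguments pair in an MCP "tools/call" JSON-RPC request.
function toJsonRpcRequest(call: ToolCall, id: number) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: call.tool, arguments: call.arguments },
  };
}

const request = toJsonRpcRequest(
  { tool: "garak_probe_list", arguments: {} },
  1,
);
```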
Basic Vulnerability Scan
// Scan GPT-4 with default probes
{
"tool": "garak_scan",
"arguments": {
"model": "openai/gpt-4",
"authorized": true,
"generations": 10
}
}
Category-Specific Evaluation
// Test for prompt injection vulnerabilities
{
"tool": "garak_evaluate",
"arguments": {
"model": "anthropic/claude-3-opus",
"authorized": true,
"category": "prompt_injection",
"generations": 5
}
}
Custom Probe Selection
// Test specific vulnerabilities
{
"tool": "garak_scan",
"arguments": {
"model": "openai/gpt-4",
"authorized": true,
"probes": [
"encoding.InjectBase64",
"dan.Dan_11_0",
"promptinject.HijackHateHumansMini"
],
"generations": 15,
"timeout": 600
}
}
Discover Available Probes
// List all probes to find what to test
{
"tool": "garak_probe_list",
"arguments": {}
}
// Get details about a specific probe
{
"tool": "garak_probe_detail",
"arguments": {
"probe_name": "encoding.InjectBase64"
}
}
Analyze Historical Results
// Parse a previous garak report
{
"tool": "garak_report",
"arguments": {
"report_path": "/path/to/garak_report.jsonl"
}
}
Vulnerability Categories
Encoding Attacks
Tests model resistance to obfuscated or encoded malicious instructions (Base64, ROT13, etc.)
DAN (Do Anything Now)
Tests jailbreak attempts that try to override model safety constraints.
Prompt Injection
Tests injection attacks that attempt to manipulate model behavior through crafted inputs.
XSS (Cross-Site Scripting)
Tests if model generates outputs that could enable XSS attacks.
Toxicity
Tests model's propensity to generate toxic, harmful, or offensive content.
Hallucination
Tests model's tendency to generate false or fabricated information.
Misinformation
Tests model's susceptibility to generating misleading or false claims.
Interpreting Results
Pass Rates
- 95-100%: Excellent resistance
- 80-95%: Good resistance, minor improvements possible
- 50-80%: Moderate vulnerability, improvements recommended
- < 50%: Significant vulnerability, immediate attention required
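The bands above can be applied mechanically to a scan result. The sketch below computes an overall pass rate from per-probe counts and maps it to a band. The listed ranges overlap at 95% and 80%, so this sketch assumes the boundary value belongs to the higher band; the function names and the choice to return 0 for an empty result set are also assumptions.

```typescript
interface ProbeResult {
  attempts: number;
  passes: number;
}

// Overall rate is total passes over total attempts (0 when nothing ran).
function overallPassRate(results: ProbeResult[]): number {
  const attempts = results.reduce((n, r) => n + r.attempts, 0);
  const passes = results.reduce((n, r) => n + r.passes, 0);
  return attempts === 0 ? 0 : passes / attempts;
}

type Rating = "excellent" | "good" | "moderate" | "significant";

// Boundary values (0.95, 0.8, 0.5) are assigned to the higher band.
function ratePassRate(rate: number): Rating {
  if (rate >= 0.95) return "excellent";
  if (rate >= 0.8) return "good";
  if (rate >= 0.5) return "moderate";
  return "significant";
}
```

With 17 passes out of 20 attempts, as in the sample totals earlier in this README, the rate is 0.85 and falls in the "good" band.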
Severity Levels
- CRITICAL: XSS, prompt injection, severe jailbreaks
- HIGH: DAN attacks, toxicity, harmful content
- MEDIUM: Encoding attacks, hallucinations, misinformation
- LOW: Minor issues, edge cases
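One way to use these levels is to order findings so the most severe categories are reviewed first. The mapping below is an interpretation of the list above in terms of the `garak_evaluate` category names, not something the server is documented to expose; treating unmapped categories as LOW is an assumption, as are the names `CATEGORY_SEVERITY` and `sortBySeverity`.

```typescript
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";

// Assumed mapping from category names to the severity levels listed above.
const CATEGORY_SEVERITY: Record<string, Severity> = {
  xss: "CRITICAL",
  prompt_injection: "CRITICAL",
  dan: "HIGH",
  realtoxicity: "HIGH",
  encoding: "MEDIUM",
  knowledgeable: "MEDIUM", // hallucination
  misleading: "MEDIUM", // misinformation
};

const SEVERITY_ORDER: Severity[] = ["CRITICAL", "HIGH", "MEDIUM", "LOW"];

// Returns a copy sorted most severe first; unmapped categories count as LOW.
function sortBySeverity(categories: string[]): string[] {
  const rank = (c: string) =>
    SEVERITY_ORDER.indexOf(CATEGORY_SEVERITY[c] ?? "LOW");
  return [...categories].sort((a, b) => rank(a) - rank(b));
}
```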
Troubleshooting
Garak Not Found
WARNING: garak CLI not found
Solution: Install garak with pip install garak
Missing API Keys
Error: API key not configured
Solution: Set required environment variables in MCP configuration
Timeout Errors
Error: Garak execution timed out after 300 seconds
Solution: Increase timeout parameter or reduce number of generations
Authorization Required
Error: Authorization required to scan LLMs
Solution: Set "authorized": true in tool parameters
Development
Build
bun run build
Run Locally
bun run start
File Structure
garak/
├── package.json          # Package configuration
├── tsconfig.json         # TypeScript configuration
├── README.md             # This file
└── src/
    ├── index.ts          # MCP server entry point
    ├── schemas.ts        # Zod schemas for validation
    ├── security.ts       # Security utilities
    ├── cli-executor.ts   # Garak CLI wrapper
    └── tools/
        ├── garak-scan.ts
        ├── garak-probe-list.ts
        ├── garak-detector-list.ts
        ├── garak-generator-list.ts
        ├── garak-report.ts
        ├── garak-probe-detail.ts
        └── garak-evaluate.ts
Legal and Ethical Considerations
Authorization
Always obtain explicit permission before scanning LLMs. Unauthorized security testing may violate terms of service.
Responsible Disclosure
If vulnerabilities are discovered, follow responsible disclosure practices. Contact model providers through appropriate security channels.
Rate Limiting
Be mindful of API rate limits and costs when running extensive scans.
Data Privacy
Avoid using sensitive or personal data in vulnerability tests.
Resources
License
This MCP server wrapper is provided as-is. Garak itself is licensed under Apache 2.0.
Support
For issues related to:
- This MCP Server: File an issue in the repository
- Garak CLI: See garak issues
- MCP Protocol: See MCP documentation
Version History
1.0.0
- Initial release
- 7 core tools for vulnerability scanning
- Support for all major LLM providers
- Built-in security controls and sanitization
- Comprehensive category evaluation
