@opena2a/semantic-engine
v0.1.1
Published
Semantic analysis engine for AI agent security scanning
Maintainers
Readme
@opena2a/semantic-engine
Semantic analysis engine for AI agent security scanning. Provides structural (Layer 2) and LLM-powered (Layer 3) analysis on top of the core scanner's regex-based checks (Layer 1).
Zero runtime dependencies. Used internally by @opena2a/core and @opena2a/cli.
Install
npm install @opena2a/semantic-engineArchitecture
| Layer | Engine | Description | |-------|--------|-------------| | 1 | Core scanner (regex) | Pattern matching for known credential formats | | 2 | StructuralAnalyzer | Parses configs structurally (JSON/YAML), understands context | | 3 | LLMAnalyzer | Calls Anthropic API for nuanced threat analysis |
Layer 2: Structural Analysis
Four analyzers that parse security-relevant files and detect issues regex cannot:
CredentialContextAnalyzer
Catches credentials regex misses.
| Check | Description |
|-------|-------------|
| SEM-CRED-001 | URL-embedded passwords (postgres://admin:password123@host) |
| SEM-CRED-002 | Generic tokens via key-name heuristics ("secret": "abc123...") |
| SEM-CRED-003 | Credentials in agent instruction files (CLAUDE.md, .cursorrules) |
| SEM-CRED-004 | Secrets hardcoded in MCP server env blocks |
McpConfigAnalyzer
Deep analysis of MCP server configurations.
| Check | Description |
|-------|-------------|
| SEM-MCP-001 | Overprivileged filesystem scope (/, /home, /Users) |
| SEM-MCP-002 | Sandbox bypass flags (--no-sandbox, --privileged) |
| SEM-MCP-003 | Secrets exposed in args array (visible to LLM) |
| SEM-MCP-004 | Wildcard permissions (allowedTools: ["*"]) |
| SEM-MCP-005 | Attack chains (filesystem + shell + network) |
| SEM-MCP-006 | Large attack surface (>5 MCP servers) |
InstructionAnalyzer
Scans agent instruction files for security risks.
| Check | Description |
|-------|-------------|
| SEM-INST-001 | Overly permissive instructions ("always execute", "never refuse") |
| SEM-INST-002 | Exfiltration-enabling patterns (webhook.site, "send results to") |
| SEM-INST-003 | Missing security boundaries |
| SEM-INST-004 | Large instruction files (>10KB prompt injection surface) |
PermissionModelAnalyzer
Analyzes Claude/editor settings for permission issues.
| Check | Description |
|-------|-------------|
| SEM-PERM-001 | Wildcard permission grants (permissions.allow: ["*"]) |
| SEM-PERM-002 | Unrestricted Bash access |
| SEM-PERM-003 | Write access outside project scope |
Layer 3: LLM Analysis
Optional LLM-powered analysis using the Anthropic API. Requires ANTHROPIC_API_KEY.
- Uses Haiku for credential detection (fast, cheap)
- Uses Sonnet for MCP/instruction analysis (complex reasoning)
- SHA-256 content-hash cache so repeated scans of unchanged files are free
- Daily budget cap (default $1/day) to prevent runaway API costs
import { LLMAnalyzer } from '@opena2a/semantic-engine';
const analyzer = new LLMAnalyzer({
apiKey: process.env.ANTHROPIC_API_KEY,
budgetPerDay: 1.00, // USD
});Usage
import { StructuralAnalyzer, toSecurityFindings } from '@opena2a/semantic-engine';
// Run structural analysis on a project directory
const analyzer = new StructuralAnalyzer();
const findings = await analyzer.analyze('/path/to/project');
// Convert to core scanner format
const securityFindings = toSecurityFindings(findings);File Discovery
The structural analyzer auto-discovers these security-relevant files:
- Agent instructions:
CLAUDE.md,.cursorrules,.windsurfrules,.clinerules,.github/copilot-instructions.md - MCP configs:
mcp.json,.cursor/mcp.json,.vscode/mcp.json - Claude settings:
.claude/settings.json - Env files:
.env,.env.local,.env.development,.env.production - Config files:
config.json,config.yaml,config.yml,settings.json
OASB Mapping
Semantic findings map to OASB benchmark controls:
SEM-CRED-*maps to OASB 5.1 (No Hardcoded Credentials)SEM-MCP-*maps to OASB 6.x (Supply Chain Integrity)SEM-INST-*maps to OASB 3.x (Input Security)SEM-PERM-*maps to OASB 2.x (Capability & Authorization)
License
Apache-2.0
