@reshimu/aryeh

v0.1.1

Published

a month ago

Deterministic, non-LLM scope and boundary enforcement classifier for agent loops.

reshimu

llm agents scope authorization boundary classifier ai-safety guardrails

ARYEH

Deterministic, non-LLM scope and boundary enforcement for agent loops.

@reshimu/aryeh answers one question before an agent action executes: is this action authorized by the caller's stated scope? It returns a four-level verdict — IN_SCOPE, BOUNDARY, OUT_OF_SCOPE, or INDETERMINATE — in under 5 ms, with zero external dependencies and no model calls.

Part of the Reshimu validator triad, with a fourth validator in development:

| Validator | Status | Question |
|---|---|---|
| ARYEH | this package | Is this action within the agent's declared scope? |
| NESHER | live | Is the action irreversible or destructive? |
| SHOR | live | Are the action's inputs grounded in the provided context? |
| PANIM ADAM | in development | Does the agent's output match the expected persona / instruction set? |

The three released validators are independent and designed to be composed at the call site.

Full docs: reshimu.ai/docs/aryeh/

Install

npm install @reshimu/aryeh
# or
pnpm add @reshimu/aryeh

Node ≥ 18 required. Ships as dual ESM + CJS.

Quick start

import { classify } from '@reshimu/aryeh'

const result = classify({
  action: "read_file('src/index.ts')",
  scope: {
    allowedTools:   ['read_file', 'list_dir'],
    allowedActions: ['read', 'list'],
  },
})

console.log(result.level)       // 'IN_SCOPE'
console.log(result.matchedRules) // ['allowedTools: read_file', 'allowedActions: read']
console.log(result.confidence)   // 1

API

`classify(input: ClassifyInput): ClassifyResult`

The single exported function. Synchronous. Pure. No side effects.

`ClassifyInput`

interface ClassifyInput {
  action: string | StructuredAction
  scope:  ScopeDefinition
}

action accepts either a free-form string (ARYEH extracts tool, verb, domain, and resource heuristically) or a pre-parsed StructuredAction. Structured input is strictly more reliable; the string path is a convenience for callers without a parsed call available.

interface StructuredAction {
  tool?:     string   // e.g. 'send_email', 'db.query'
  verb?:     string   // e.g. 'delete', 'read', 'post'
  domain?:   string   // hostname only: 'api.stripe.com'
  resource?: string   // file path, table name, any identifier
  raw?:      string   // original text, preserved for logging
}

`ScopeDefinition`

interface ScopeDefinition {
  allowedTools?:     string[]  // exact-match tool names
  deniedTools?:      string[]
  allowedDomains?:   string[]  // glob: '*.github.com', 'api.stripe.com'
  deniedDomains?:    string[]
  allowedActions?:   string[]  // verb-level: 'read', 'list', 'get'
  deniedActions?:    string[]  // 'delete', 'write', 'post', 'send'
  allowedResources?: string[]  // glob: 'src/**', 'users/*'
  deniedResources?:  string[]
  strictMode?:       boolean   // default false — see Strict mode below
}

All fields are optional. Omitting a field leaves that dimension unconstrained in non-strict mode; strictMode changes how a missing allowlist is treated (see Strict mode below). An empty array (allowedTools: []) is not the same as an omitted field: under strictMode: true, an empty allowlist means nothing is allowed in that dimension.

Glob syntax (domains and resources only): * matches any run of non-separator characters; ** matches across separators (path segments or domain labels); ? matches exactly one non-separator character. Tools and verbs are exact-match only.
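
The glob rules above can be illustrated with a small compiler from pattern to RegExp. This is a minimal sketch of the documented semantics, not ARYEH's actual source: the names compileGlob, escapeRe, and globCache are hypothetical, and the cache keyed on (pattern, separator) is an assumption modeled on the Performance section.

```typescript
// Hypothetical helpers: a sketch of the documented glob semantics, not ARYEH's code.
const globCache = new Map<string, RegExp>()

// Escape characters that are special in a regular expression.
const escapeRe = (s: string): string => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')

function compileGlob(pattern: string, separator: string): RegExp {
  const key = `${separator}\n${pattern}`
  const cached = globCache.get(key)
  if (cached) return cached

  const sep = escapeRe(separator)
  let source = ''
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern.charAt(i)
    if (ch === '*' && pattern.charAt(i + 1) === '*') {
      source += '.*'           // '**' matches across separators
      i++
    } else if (ch === '*') {
      source += `[^${sep}]*`   // '*' matches a run of non-separator characters
    } else if (ch === '?') {
      source += `[^${sep}]`    // '?' matches exactly one non-separator character
    } else {
      source += escapeRe(ch)   // everything else is literal
    }
  }
  const compiled = new RegExp(`^${source}$`)
  globCache.set(key, compiled)
  return compiled
}

// Domains use '.' as the separator, resource paths use '/'.
const singleLabel = compileGlob('*.github.com', '.')
const anyDepth = compileGlob('src/**', '/')
```

Under these rules, singleLabel accepts api.github.com but rejects a.b.github.com, because * does not cross a label boundary, while anyDepth accepts any path below src/.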

`ClassifyResult`

interface ClassifyResult {
  level:        'IN_SCOPE' | 'BOUNDARY' | 'OUT_OF_SCOPE' | 'INDETERMINATE'
  reason:       string    // single sentence, human-readable
  matchedRules: string[]  // structured rule identifiers that fired
  confidence:   number    // 0.0–1.0, informational — branch on level, not this
}

matchedRules format:

"allowedTools: read_file" — exact allowlist match
"deniedActions: delete" — exact denylist match
"allowedDomains: *.github.com → api.github.com" — glob match (pattern → value)
"strictMode: tool did not match allowlist" — strict-mode synthetic rule
"INDETERMINATE: empty scope" — structural reason

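For logging or metrics, a caller may want to split these rule strings back into parts. The following is a sketch that relies only on the string formats listed above; parseMatchedRule and ParsedRule are hypothetical names and are not exported by the package.

```typescript
// Hypothetical caller-side helper; assumes only the rule formats documented above.
interface ParsedRule {
  source: string    // e.g. 'allowedTools', 'strictMode', 'INDETERMINATE'
  detail: string    // everything after the first ': '
  pattern?: string  // set only for "pattern → value" glob rules
  value?: string    // set only for "pattern → value" glob rules
}

function parseMatchedRule(rule: string): ParsedRule {
  const colon = rule.indexOf(': ')
  if (colon === -1) return { source: rule, detail: '' }

  const source = rule.slice(0, colon)
  const detail = rule.slice(colon + 2)

  // Glob matches are written as "pattern → value".
  const arrow = detail.indexOf(' → ')
  if (arrow === -1) return { source, detail }

  return {
    source,
    detail,
    pattern: detail.slice(0, arrow),
    value: detail.slice(arrow + 3),
  }
}

const parsed = parseMatchedRule('allowedDomains: *.github.com → api.github.com')
```

Here parsed.source is 'allowedDomains', parsed.pattern is '*.github.com', and parsed.value is 'api.github.com'.
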
Classification levels

| Level | Meaning | Default caller action |
|---|---|---|
| IN_SCOPE | All checked dimensions matched their allowlists. | Proceed. |
| BOUNDARY | Partial match: some dimensions allowed, some missed; or only misses in non-strict mode. | Block and escalate to a human. |
| OUT_OF_SCOPE | At least one dimension matched a denylist, or all checked dimensions missed in strict mode. | Block. |
| INDETERMINATE | ARYEH could not check: scope is empty, or action provided no fields the scope has rules for. | Treat as BOUNDARY for irreversible actions; route to human. |

INDETERMINATE is not the same as OUT_OF_SCOPE. It means ARYEH could not check, not ARYEH checked and the action is unauthorized. Conflating them causes over-blocking.
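
One way a caller might turn the default actions above into code is a small dispatch function. This is a sketch: decide, Decision, and the reversibleIndeterminate parameter are hypothetical names. The documentation only specifies INDETERMINATE handling for irreversible actions, so the policy for reversible ones is left to the caller as an explicit parameter.

```typescript
type Level = 'IN_SCOPE' | 'BOUNDARY' | 'OUT_OF_SCOPE' | 'INDETERMINATE'
type Decision = 'proceed' | 'escalate' | 'block'

// Hypothetical caller-side policy; not part of the package.
function decide(
  level: Level,
  irreversible: boolean,
  // Assumption: the caller chooses what to do with reversible INDETERMINATE actions.
  reversibleIndeterminate: Decision = 'escalate',
): Decision {
  switch (level) {
    case 'IN_SCOPE':
      return 'proceed'
    case 'BOUNDARY':
      return 'escalate'   // block and route to a human
    case 'OUT_OF_SCOPE':
      return 'block'
    case 'INDETERMINATE':
      // Not a denial: treat as BOUNDARY when the action cannot be undone.
      return irreversible ? 'escalate' : reversibleIndeterminate
  }
}
```

Keeping INDETERMINATE as its own branch, rather than folding it into OUT_OF_SCOPE, is what avoids the over-blocking described above.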

Strict mode

By default, a scope dimension with no allowlist is permissive — if nothing is explicitly denied, it contributes an implicit allow. This is safe for callers who pass a partial scope and don't want to accidentally block dimensions they haven't thought about.

strictMode: true inverts the default: a dimension with no allowlist produces a no-match instead of an implicit allow. The result is that anything not explicitly allowed is either BOUNDARY (if something else was explicitly allowed) or OUT_OF_SCOPE (if nothing was).

// Non-strict: domain matched → IN_SCOPE (tool/verb are unconstrained)
classify({
  action: "fetch('https://api.github.com/repos')",
  scope: { allowedDomains: ['api.github.com'] },
})
// → IN_SCOPE

// Strict: domain matched, but tool/verb have no allowlist → BOUNDARY
classify({
  action: "fetch('https://api.github.com/repos')",
  scope: { allowedDomains: ['api.github.com'], strictMode: true },
})
// → BOUNDARY

strictMode never demotes a partial match to OUT_OF_SCOPE. A mix of explicit allow and no-match is always BOUNDARY — the human-in-the-loop tier — regardless of strict mode.
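
The way per-dimension outcomes roll up into a level can be sketched as follows. This is an illustration of the rules stated in this README, not the package's implementation: combine and Outcome are hypothetical names, and the sketch assumes that in non-strict mode unconstrained dimensions are simply left out of the list, while in strict mode a dimension with no allowlist is passed in as 'miss'.

```typescript
type Level = 'IN_SCOPE' | 'BOUNDARY' | 'OUT_OF_SCOPE' | 'INDETERMINATE'
type Outcome = 'allow' | 'miss' | 'deny'

// Hypothetical roll-up of per-dimension outcomes into a single level.
function combine(outcomes: Outcome[], strictMode: boolean): Level {
  // Any denylist hit decides the result on its own.
  if (outcomes.includes('deny')) return 'OUT_OF_SCOPE'
  // Nothing could be checked.
  if (outcomes.length === 0) return 'INDETERMINATE'

  const allows = outcomes.filter((o) => o === 'allow').length
  if (allows === outcomes.length) return 'IN_SCOPE'
  // A mix of explicit allow and miss is BOUNDARY in both modes.
  if (allows > 0) return 'BOUNDARY'
  // Only misses: strict mode blocks, non-strict mode escalates.
  return strictMode ? 'OUT_OF_SCOPE' : 'BOUNDARY'
}

// The two fetch examples above, expressed as outcome lists:
const nonStrict = combine(['allow'], false)              // domain only
const strict = combine(['allow', 'miss', 'miss'], true)  // domain, tool, verb
```

With these inputs, nonStrict is 'IN_SCOPE' and strict is 'BOUNDARY', matching the two examples.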

Examples

Deny beats allow

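If the same tool appears in both allowedTools and deniedTools, deny wins. The same holds across dimensions: in the example below the tool is allowlisted, but its verb is denylisted, so the result is OUT_OF_SCOPE.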

classify({
  action: "send_email({ to: 'user@example.com' })",
  scope: {
    allowedTools:  ['send_email'],
    deniedActions: ['send'],
  },
})
// { level: 'OUT_OF_SCOPE', matchedRules: ['deniedActions: send'], confidence: 1 }
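
Deny precedence within a single exact-match dimension can be sketched by checking the denylist before the allowlist. This is an illustrative helper, not ARYEH's code: checkExact and DimensionResult are hypothetical names, and the sketch assumes list entries are already lowercase.

```typescript
type DimensionResult =
  | { kind: 'deny'; rule: string }
  | { kind: 'allow'; rule: string }
  | { kind: 'miss' }
  | { kind: 'unconstrained' }

// Hypothetical check for one exact-match dimension (tools or verbs).
function checkExact(
  name: 'Tools' | 'Actions',
  value: string,
  allowed?: string[],
  denied?: string[],
): DimensionResult {
  const v = value.toLowerCase()
  // Denylist is consulted first, so a value in both lists is denied.
  if (denied?.includes(v)) return { kind: 'deny', rule: `denied${name}: ${v}` }
  // No allowlist at all: this dimension places no constraint on the action.
  if (allowed === undefined) return { kind: 'unconstrained' }
  if (allowed.includes(v)) return { kind: 'allow', rule: `allowed${name}: ${v}` }
  // An allowlist exists (possibly empty) and the value is not in it.
  return { kind: 'miss' }
}

const both = checkExact('Tools', 'send_email', ['send_email'], ['send_email'])
```

Here both.kind is 'deny', even though send_email is also allowlisted.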

Domain glob

classify({
  action: "fetch('https://api.github.com/repos')",
  scope: { allowedDomains: ['*.github.com'] },
})
// { level: 'IN_SCOPE', matchedRules: ['allowedDomains: *.github.com → api.github.com'], confidence: 1 }

Resource path

classify({
  action: "write_file('src/lib/util.ts')",
  scope: {
    allowedTools:     ['write_file'],
    allowedActions:   ['write'],
    allowedResources: ['src/**'],
  },
})
// { level: 'IN_SCOPE', confidence: 1 }

INDETERMINATE — unextractable action

classify({
  action: 'do the thing we discussed',
  scope: { allowedTools: ['read_file'] },
})
// { level: 'INDETERMINATE', confidence: 0 }

Structured action input

classify({
  action: { tool: 'db.users.delete_all', verb: 'delete', resource: 'users_archive' },
  scope: { deniedActions: ['delete', 'drop', 'truncate'] },
})
// { level: 'OUT_OF_SCOPE', matchedRules: ['deniedActions: delete'], confidence: 1 }

String-form extraction rules

When action is a plain string, ARYEH applies these rules in order (best-effort — missing fields are undefined and skipped, never treated as a violation):

| Field | Rule |
|---|---|
| tool | First token matching identifier( or dotted.identifier( |
| verb | First _-separated prefix of the tool name; falls back to the leading word if no tool call |
| domain | First URL host extracted via https?://host |
| resource | First quoted string; falls back to the first path-like token |

Verbs and tools are lowercased before matching. Domains are lowercased because URI hosts are case-insensitive (RFC 3986, section 3.2.2). Resources are case-sensitive.
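
The extraction rules can be approximated with a handful of regular expressions. This is a rough sketch: extractAction and Extracted are hypothetical names, and the exact patterns ARYEH uses are not documented here, so edge cases such as dotted tool names, ports, and escaped quotes may behave differently in the real package.

```typescript
interface Extracted {
  tool?: string
  verb?: string
  domain?: string
  resource?: string
  raw: string
}

// Hypothetical best-effort extractor following the rules in the table.
function extractAction(raw: string): Extracted {
  // tool: first identifier( or dotted.identifier(
  const tool = /([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)\(/.exec(raw)?.[1]?.toLowerCase()

  // verb: first '_'-separated prefix of the tool, else the leading word
  const leadingWord = /^\s*([A-Za-z]+)/.exec(raw)?.[1]
  const verb = (tool ? tool.split('_')[0] : leadingWord)?.toLowerCase()

  // domain: host part of the first http(s) URL, lowercased
  const domain = /https?:\/\/([^\/\s'":?#]+)/i.exec(raw)?.[1]?.toLowerCase()

  // resource: first quoted string, else the first path-like token (case preserved)
  const quoted = /'([^']*)'|"([^"]*)"/.exec(raw)
  const pathLike = /[\w.-]+(?:\/[\w.-]+)+/.exec(raw)?.[0]
  const resource = quoted ? (quoted[1] ?? quoted[2]) : pathLike

  // Fields that could not be extracted stay undefined and are skipped by the caller.
  return { tool, verb, domain, resource, raw }
}

const fileRead = extractAction("read_file('src/index.ts')")
const vague = extractAction('do the thing we discussed')
```

For fileRead the sketch yields tool 'read_file', verb 'read', and resource 'src/index.ts'. For vague only the leading word 'do' is extracted as a verb, so a scope that has only tool rules has nothing to check.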

Composing with NESHER and SHOR

import { classify as aryeh }  from '@reshimu/aryeh'
import { classify as shor }   from '@reshimu/shor'
import { classify as nesher } from '@reshimu/nesher'

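// Run cheapest check first; stop on the first block.
// Assumes `action` is a StructuredAction and that `scope`, `context`, `tool`, `verb`,
// `params`, and the block/escalate/execute handlers are defined by the caller.
// An INDETERMINATE scope result falls through to the later checks.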
const scopeCheck = aryeh({ action, scope })
if (scopeCheck.level === 'OUT_OF_SCOPE') return block(scopeCheck)
if (scopeCheck.level === 'BOUNDARY')      return escalate(scopeCheck)

const groundCheck = shor({ output: action.raw, context })
if (groundCheck.level === 'UNGROUNDED')   return block(groundCheck)
if (groundCheck.level === 'PARTIAL')      return escalate(groundCheck)

const riskCheck = nesher({ tool, verb, params })
if (riskCheck.color === 'RED')            return blockOrEscalate(riskCheck)

return execute()

All three validators are synchronous, sub-5 ms p99, and zero-dependency. The chained budget is well inside any tool call's latency floor.

Performance

p99 latency: < 5 ms on scope definitions with ≤ 100 rules per dimension and action strings ≤ 4 kB
Deterministic: identical (action, scope) input produces identical output across every run
No RegExp compilation in the hot path: glob patterns are compiled to RegExp once per (pattern, separator) pair and cached, so no new RegExp is constructed per call

Enforced by tests/performance.test.ts on every npm test run (coverage runs exempt due to instrumentation overhead).

What ARYEH checks vs. does not check

| ARYEH checks | ARYEH does not check |
|---|---|
| Tool calls outside the agent's allowlist | Whether an allowed call's outcome is dangerous (NESHER) |
| Verb violations: delete when only read/list allowed | Whether the inputs to an allowed call are grounded (SHOR) |
| Domain violations: outbound calls to non-allowlisted hosts | OS-level permissions (file mode, OAuth scope, ACLs) |
| Resource path violations: writes outside src/** | Semantic intent or adversarial parameters |
| Empty / underspecified scope: reported as INDETERMINATE, never a silent pass | Multi-step orchestration where each individual step is in-scope |

Design principles

No LLM inside ARYEH. The classifier is regex, string equality, and glob matching. Classification is reproducible and deterministic.

Zero runtime dependencies. The build is a single ESM + CJS bundle. No glob library, no parser, no external service.

Caller-supplied scope. ARYEH does not invent or infer a scope. It checks the ScopeDefinition the caller provides. Default scope is empty → INDETERMINATE.

Pre-commitment, not runtime permissions. ARYEH gates intent before execution. OS-level capability checks (file permissions, network ACLs) are a separate layer that composes with ARYEH, not a replacement for it.
