@velanir/openclaw-egress-guard
v0.1.10
OpenClaw plugin that redacts Velanir internal diagnostics before outbound user-channel delivery.
Velanir Egress Guard
OpenClaw plugin that rewrites outbound user-channel messages through the
message_sending and reply_payload_sending hooks before delivery.
It strips Velanir internal diagnostics, raw tool payload JSON, [object Object]
leaks, exact internal sentinels such as NO_REPLY, and partial-progress
timeout dumps. It defaults to enforce mode for public installs. In enforce
mode it returns sanitized content for redaction cases and cancels recoverable
blocks with structured recovery metadata. In shadow mode it only logs what
would be stripped.
When a message includes safe user-facing text before the unsafe section, the plugin preserves that text. When stripping leaves no answer, the plugin does not send fallback copy. It requests runtime-owned recovery so the runtime can notify the user once, retry internally, and send either a regenerated answer or a final issue notice.
The package never logs message content. Its logs include only counts, risk reasons, lengths, and routing metadata.
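The behavior described above can be illustrated with a minimal sketch. It is not the plugin's implementation: the type names, the function names, and the reason strings `internal_sentinel` and `object_leak` are hypothetical, and only two of the listed leak classes are handled. It shows the three outcomes (send unchanged, send redacted text, request recovery) and a log record that carries lengths and reasons but no message content.

```typescript
// Hypothetical types and reason names; the plugin's real internals are not shown in this README.
type SanitizeOutcome =
  | { action: "send"; text: string }
  | { action: "redact"; text: string; reasons: string[] }
  | { action: "recover"; reasons: string[] };

const OBJECT_LEAK = "[object Object]";

function sanitizeSketch(message: string): SanitizeOutcome {
  const reasons: string[] = [];
  let text = message;

  // Exact-match sentinel: the whole message is internal, so nothing is kept.
  if (text.trim() === "NO_REPLY") {
    reasons.push("internal_sentinel");
    text = "";
  }

  // Stringification leak: drop the marker and keep the surrounding text.
  if (text.indexOf(OBJECT_LEAK) !== -1) {
    reasons.push("object_leak");
    text = text.split(OBJECT_LEAK).join("");
  }

  text = text.trim();
  if (reasons.length === 0) return { action: "send", text: message };
  // Safe text survived: send it. Nothing survived: ask the runtime to recover.
  return text.length > 0
    ? { action: "redact", text, reasons }
    : { action: "recover", reasons };
}

// Content-free log record: reasons and lengths only, never the message text.
function logRecordSketch(original: string, outcome: SanitizeOutcome) {
  return {
    action: outcome.action,
    reasons: outcome.action === "send" ? [] : outcome.reasons,
    originalLength: original.length,
    sanitizedLength: outcome.action === "recover" ? 0 : outcome.text.length,
  };
}
```

In this sketch, a message such as `Your report is ready. [object Object]` keeps its leading sentence, while a bare `NO_REPLY` yields a recovery request rather than fallback copy.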
Default Pipeline
The deterministic sanitizer is still the hard boundary for known leak classes. The default public package pipeline is:
outbound Slack or Teams message
-> message_sending or reply_payload_sending
-> deterministic sanitizer
-> final-output rewrite model
-> deterministic sanitizer again
-> model gate classifier
-> send, redact, or request recovery
message_sending covers generic outbound delivery, cron, message-tool sends,
and older supported runtimes. On OpenClaw 2026.6.6 and newer,
reply_payload_sending covers normalized reply payloads before Slack and
Microsoft Teams channel delivery, including normal inbound reply flows that do
not pass through message_sending.
Only final reply payloads use the final-output rewrite model and model gate. Tool/progress/block payloads use deterministic cleanup only, so streaming and progress messages do not create extra model noise.
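As a rough illustration of that split, the sketch below maps a payload kind to the stages it would pass through. The `PayloadKind` and `Stage` names and the `stagesFor` function are hypothetical labels chosen for this example, not identifiers from the plugin.

```typescript
// Hypothetical names; the stage order follows the default pipeline described above.
type PayloadKind = "final" | "tool" | "progress" | "block";
type Stage = "deterministicSanitizer" | "finalOutputRewrite" | "modelGate";

function stagesFor(kind: PayloadKind): Stage[] {
  if (kind === "final") {
    // Final replies get the rewrite model, a second sanitizer pass, and the gate.
    return [
      "deterministicSanitizer",
      "finalOutputRewrite",
      "deterministicSanitizer",
      "modelGate",
    ];
  }
  // Tool, progress, and block payloads get deterministic cleanup only.
  return ["deterministicSanitizer"];
}
```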
The final-output rewrite prepares the visible answer for a non-technical business user. It removes execution mechanics, simplifies complicated wording, keeps the answer business-framed, uses light emoji only when appropriate, and avoids em dashes in final user-channel output.
The final-output rewrite does not define a plugin maxOutputTokens cap. It uses
the runtime/model defaults for the rewritten answer. modelGate.maxOutputTokens
only caps the internal classifier JSON response.
Configuration
Normal installs can use the defaults:
{
  "mode": "enforce"
}
A fully explicit config looks like this:
{
  "mode": "enforce",
  "scope": {
    "channels": ["slack", "msteams"]
  },
  "finalOutput": {
    "mode": "enforce",
    "timeoutMs": 5000,
    "audience": "nonTechnical",
    "simplify": true,
    "businessFraming": true,
    "respectResponsibilityOutputRequirements": true,
    "emojiRatio": 0.5,
    "forbidEmDash": true,
    "tone": {
      "style": "friendly_crisp",
      "emojiPolicy": "balanced",
      "maxEmojis": 3
    }
  },
  "modelGate": {
    "mode": "enforce",
    "threshold": 0.85,
    "timeoutMs": 3000,
    "maxOutputTokens": 256
  },
  "logging": {
    "strips": true,
    "includeEvaluation": false
  }
}
Model Selection
When model is omitted, the plugin uses the currently configured OpenClaw
coworker model: first agents.defaults.compaction.model, then
agents.defaults.model.primary.
Use model and modelFallback only as explicit overrides. They use
OpenClaw's provider/model model reference format, for example
openai/gpt-5.4-mini:
{
  "finalOutput": {
    "model": "openai/gpt-5.4-mini",
    "modelFallback": "openai/gpt-5.4-mini"
  },
  "modelGate": {
    "model": "openai/gpt-5.4-mini",
    "modelFallback": "openai/gpt-5.4-mini"
  }
}
If the runtime needs a specific OpenClaw auth profile, add it explicitly:
{
  "finalOutput": {
    "authProfileId": "existing-openclaw-auth-profile-id"
  },
  "modelGate": {
    "authProfileId": "existing-openclaw-auth-profile-id"
  }
}
Do not set authProfileId to a placeholder. Omit it unless that exact auth
profile exists in the OpenClaw runtime; when omitted, OpenClaw auto-selects the
active auth profile for the provider/model.
finalOutput has its own model override fields, but if they are omitted it can
reuse the modelGate override before falling back to the current coworker
model.
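The lookup order described in this section can be sketched as a chain of fallbacks. This is an assumption-laden illustration: the interface shapes and the `compactionModel` / `primaryModel` field names are hypothetical stand-ins for `agents.defaults.compaction.model` and `agents.defaults.model.primary`, and `modelFallback` is left out because the README does not say where it sits in the order.

```typescript
// Hypothetical shapes for illustration; not the plugin's config types.
interface ModelOverride {
  model?: string;
}

interface ModelSources {
  finalOutput?: ModelOverride;
  modelGate?: ModelOverride;
  // Stand-ins for agents.defaults.compaction.model and agents.defaults.model.primary.
  agentDefaults?: { compactionModel?: string; primaryModel?: string };
}

// Coworker model: compaction model first, then the primary model.
function coworkerModel(src: ModelSources): string | undefined {
  return src.agentDefaults?.compactionModel ?? src.agentDefaults?.primaryModel;
}

// finalOutput: own override, then the modelGate override, then the coworker model.
function resolveFinalOutputModel(src: ModelSources): string | undefined {
  return src.finalOutput?.model ?? src.modelGate?.model ?? coworkerModel(src);
}

// modelGate: own override, then the coworker model.
function resolveModelGateModel(src: ModelSources): string | undefined {
  return src.modelGate?.model ?? coworkerModel(src);
}
```

With only `modelGate.model` set to `openai/gpt-5.4-mini`, both resolvers in this sketch return that reference; with no overrides at all, both fall through to the coworker model.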
Tone And Emojis
finalOutput.tone is part of the published plugin schema. OpenClaw accepts it
in openclaw.json starting with @velanir/[email protected].
{
  "finalOutput": {
    "emojiRatio": 0.5,
    "tone": {
      "style": "friendly_crisp",
      "emojiPolicy": "balanced",
      "maxEmojis": 3
    }
  }
}
Tone settings:
- style: plain, friendly_crisp, or warm.
- emojiPolicy: off, conservative, or balanced.
- maxEmojis: integer from 0 to 3.
The rewrite model receives the tone settings, and the plugin also applies a deterministic cleanup pass afterward:
- emojiPolicy: "off" or style: "plain" removes emojis.
- All policies cap the final message to maxEmojis.
- emojiPolicy: "balanced" can add one leading success emoji for eligible completed-task messages when the rewrite model omitted one.
- Emojis are suppressed for errors, blockers, access/approval issues, security, legal, finance, customer escalations, and reconnection/auth messages.
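The first two cleanup rules can be approximated with a short sketch. It is a simplification under stated assumptions: `capEmojis` and the `Tone` interface are hypothetical, emoji detection uses the Unicode `Extended_Pictographic` property (which does not handle ZWJ sequences or flag emojis), and the leading-emoji and suppression rules are not modeled.

```typescript
// Hypothetical helper; the plugin's actual cleanup pass is not shown in this README.
type EmojiPolicy = "off" | "conservative" | "balanced";
type ToneStyle = "plain" | "friendly_crisp" | "warm";

interface Tone {
  style: ToneStyle;
  emojiPolicy: EmojiPolicy;
  maxEmojis: number;
}

// Built with the RegExp constructor so the Unicode property escape is parsed at runtime.
// Matches one pictographic code point plus an optional variation selector.
const EMOJI = new RegExp("\\p{Extended_Pictographic}\\uFE0F?", "gu");

function capEmojis(text: string, tone: Tone): string {
  // "off" policy or "plain" style removes every emoji.
  const removeAll = tone.emojiPolicy === "off" || tone.style === "plain";
  // Otherwise keep at most maxEmojis, clamped to the documented 0..3 range.
  const limit = removeAll ? 0 : Math.min(3, Math.max(0, Math.floor(tone.maxEmojis)));

  let kept = 0;
  const capped = text.replace(EMOJI, (match) => (++kept <= limit ? match : ""));

  // Tidy the gaps left by removed emojis (this also collapses other double spaces).
  return capped.replace(/ {2,}/g, " ").trim();
}
```

The cap keeps the earliest emojis and drops later ones, which is a design choice of this sketch rather than documented plugin behavior.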
Model Gate
The model gate scores the final user-visible candidate before delivery. It
defaults to enforce, so below-threshold output cancels the outbound message
and requests runtime-owned recovery instead of sending fallback copy.
The gate is category-first, not average-score-first. A classifier response can score highly and still be blocked when it marks any forbidden category as true:
{
  "egressScore": 0.96,
  "safeToSend": false,
  "userResponsiveScore": 0.94,
  "businessFramingScore": 0.91,
  "nonTechnicalScore": 0.93,
  "completionHonestyScore": 0.95,
  "forbiddenCategories": {
    "internalProcessDisclosure": true,
    "sessionMechanics": true,
    "backgroundTaskDisclosure": false,
    "staleMessageMechanics": false,
    "technicalRecoveryExplanation": false,
    "providerArtifact": false,
    "responsibilityContractViolation": false
  },
  "riskReasons": ["internal_process_disclosure", "session_mechanics"],
  "recommendedAction": "block_recover",
  "rewriteGoal": "Explain the user-facing outcome without internal mechanics."
}
Any true forbidden category forces recovery with
model_gate_forbidden_category, regardless of the numeric score. The current
categories cover internal process disclosure, session/thread mechanics,
background task disclosure, stale delayed-task mechanics, technical recovery
explanations, provider artifacts, and visible responsibility output-contract
violations.
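A category-first decision can be sketched as follows. The function and type names are hypothetical, the comparison of `egressScore` against `threshold` is an assumption (the README does not state which score the threshold applies to), and `model_gate_below_threshold` is an invented reason string; only `model_gate_forbidden_category` comes from the text above.

```typescript
// Hypothetical decision helper; field names follow the classifier JSON shown above.
interface GateResult {
  egressScore: number;
  forbiddenCategories: Record<string, boolean>;
}

type GateDecision =
  | { action: "allow" }
  | { action: "block_recover"; reason: string };

function decideGate(result: GateResult, threshold: number = 0.85): GateDecision {
  const categories = result.forbiddenCategories;

  // Category check runs first: any true flag blocks, whatever the score.
  const flagged = Object.keys(categories).some((name) => categories[name] === true);
  if (flagged) {
    return { action: "block_recover", reason: "model_gate_forbidden_category" };
  }

  // Assumption: the threshold is compared against egressScore.
  if (result.egressScore < threshold) {
    // Invented reason string, used only in this sketch.
    return { action: "block_recover", reason: "model_gate_below_threshold" };
  }

  return { action: "allow" };
}
```

Applied to the example response above, this sketch blocks the message despite its 0.96 score because two categories are flagged.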
In shadow mode the classifier can score output without altering delivery. To
reduce noise, clean allow decisions are not logged unless
logging.includeEvaluation is enabled; blocked, unavailable, and error
decisions still log route metadata and risk reasons.
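The logging rule reduces to a one-line predicate. The `GateOutcome` labels and the function name below are hypothetical; the sketch only restates the rule that clean allow decisions are logged when `logging.includeEvaluation` is enabled, while every other outcome is always logged.

```typescript
// Hypothetical outcome labels for illustration.
type GateOutcome = "allow" | "blocked" | "unavailable" | "error";

function shouldLogGateDecision(outcome: GateOutcome, includeEvaluation: boolean): boolean {
  // Blocked, unavailable, and error decisions always log; allow logs only on request.
  return outcome !== "allow" || includeEvaluation;
}
```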
Scope
No user scope configuration is required for the normal Platform install. The default protects Slack and Microsoft Teams for every source and delivery path:
{
  "scope": {
    "channels": ["slack", "msteams"]
  }
}
If a package install omits scope, the plugin uses the same default.
Advanced operators can still override scope explicitly for narrower or wider
rollouts. Add sources or deliveryPaths only when you intentionally want to
limit enforcement to a smaller path such as cron.
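One way to picture scope matching is the sketch below. It assumes that an omitted `sources` or `deliveryPaths` list matches everything, which is consistent with the default covering every source and delivery path. The `Route` shape, the `inScope` function, and the example values are hypothetical.

```typescript
// Hypothetical shapes; only the scope field names come from this README.
interface Scope {
  channels: string[];
  sources?: string[];
  deliveryPaths?: string[];
}

interface Route {
  channel: string;
  source: string;
  deliveryPath: string;
}

const DEFAULT_SCOPE: Scope = { channels: ["slack", "msteams"] };

function inScope(route: Route, scope: Scope = DEFAULT_SCOPE): boolean {
  // Assumption: an omitted list places no restriction on that dimension.
  const matches = (allowed: string[] | undefined, value: string): boolean =>
    allowed === undefined || allowed.indexOf(value) !== -1;

  return (
    scope.channels.indexOf(route.channel) !== -1 &&
    matches(scope.sources, route.source) &&
    matches(scope.deliveryPaths, route.deliveryPath)
  );
}
```

Under these assumptions, adding `deliveryPaths: ["cron"]` narrows enforcement to cron deliveries on the listed channels, and leaving both optional lists out keeps every path protected.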
OpenClaw Compatibility
The npm hook baseline supports [email protected] and newer. That version
provides the outbound message_sending hook, so generic outbound delivery,
cron, and message-tool sends remain covered on older supported runtimes.
Full Slack and Microsoft Teams reply coverage requires OpenClaw 2026.6.6 or
newer. That runtime installs reply_payload_sending on the channel reply
dispatcher before delivery, which lets this plugin evaluate normal channel
replies that may not pass through message_sending.
Enforced model rewriting and scoring also require the embedded model-run API to be available in the runtime.
For recoverable blocks, the strongest user experience requires runtime-owned
recovery handling for message_sending cancellation metadata. Without recovery
handling, the plugin can still block unsafe output, but the runtime may not be
able to regenerate a cleaner answer automatically.
For reply_payload_sending, OpenClaw exposes payload rewrite/cancel semantics
rather than the richer message_sending recovery metadata. In that path the
plugin returns a clean replacement payload when it can, and cancels only when no
safe visible payload should be delivered.
