@certaworks/confidence-gate-mcp-server

v0.1.0

Published 23 days ago

MCP server that scores agent output confidence before actions fire.

Vulnerabilities: 0 High, 0 Medium, 0 Low

blairhall

certaworks mcp ai-agent agent-safety confidence safety rollback

Confidence Gate MCP Server

MCP server that scores agent output confidence before actions fire. It returns a confidence score, factor trace, decision, durable local trace, approval state, policy metadata, enforcement receipts, and rollback signal for low-confidence or high-risk actions.

Type: MCP Server

Value: Stops risky agent actions by scoring confidence and returning approval, review, or rollback signals before tools fire.

Current Status

Complete as a local MCP server slice: usable over stdio, with durable local trace and policy stores, approval lifecycle tracking, and dashboard-grade query helpers.

Tools

| Tool | Purpose |
| --- | --- |
| score_confidence | Score agent output text from 0 to 1 with factor breakdown. |
| check_gate | Gate an action and return allow, review, or block. |
| set_threshold | Update the default, domain, or risk-level threshold. |
| get_config | Inspect the active gate configuration. |
| explain_confidence | Retrieve a saved trace by gate_id. |
| list_traces | List stored gate traces from the local trace store. |
| query_traces | Filter dashboard traces by decision, approval status, project, source, policy, domain, risk, or time range. |
| get_dashboard_summary | Return dashboard counts, recent checks, review queue, config, policies, and policy audit events. |
| resolve_approval | Approve, reject, or resolve a review item with approver identity and note. |
| record_enforcement_receipt | Record what a client actually did after receiving a gate decision. |
| list_policies | List named project policies. |
| upsert_policy | Create or version a named project policy with audit metadata. |
| list_policy_audit | List policy version and threshold-change audit events. |
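To make the tool list concrete, here is a minimal sketch of the JSON-RPC 2.0 message an MCP client sends to invoke one of these tools. The `tools/call` method and its `name` and `arguments` params follow the MCP specification. The argument fields passed to check_gate (`action`, `output`, `domain`) are hypothetical, because this README does not document the tool input schemas.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Build a request for any tool named in the table above.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Hypothetical arguments: the real check_gate input schema may differ.
const request = buildToolCall(1, "check_gate", {
  action: "send_refund",
  output: "The refund of $40 is approved per policy section 3.",
  domain: "financial",
});

console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client library would build and send this message over stdio; the sketch only shows what travels on the wire.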

Default Thresholds

| Scope | Threshold |
| --- | ---: |
| default | 0.75 |
| financial | 0.90 |
| medical | 0.92 |
| legal | 0.90 |
| code execution | 0.85 |
| general | 0.70 |
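The thresholds above can be read as a lookup with a fallback. The sketch below is illustrative only: the function names are hypothetical, the scope keys are copied from the table as text (the server's internal keys may differ), and it assumes that an unlisted domain falls back to the default and that a score passes when it is at or above the threshold.

```typescript
// Default thresholds copied from the table above.
const DEFAULT_THRESHOLD = 0.75;
const DOMAIN_THRESHOLDS: Record<string, number> = {
  financial: 0.9,
  medical: 0.92,
  legal: 0.9,
  "code execution": 0.85,
  general: 0.7,
};

// Assumption: a domain with no entry falls back to the default threshold.
function resolveThreshold(domain?: string): number {
  const configured: number | undefined =
    domain === undefined ? undefined : DOMAIN_THRESHOLDS[domain];
  return typeof configured === "number" ? configured : DEFAULT_THRESHOLD;
}

// Assumption: a score passes when it is at or above the threshold.
function meetsThreshold(score: number, domain?: string): boolean {
  return score >= resolveThreshold(domain);
}

console.log(meetsThreshold(0.88, "general")); // true
console.log(meetsThreshold(0.88, "financial")); // false
console.log(resolveThreshold("marketing")); // 0.75
```

The same score can pass in one domain and fail in another, which is the point of per-domain thresholds.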

Scoring Factors

Linguistic certainty
Internal consistency
Completeness
Contextual alignment
Factual grounding
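This README does not say how the five factors are weighted. As a rough illustration of how a factor breakdown could collapse into a single 0 to 1 score, the sketch below assumes equal weights and clamps each factor into range. The `FactorScores` type and `combineFactors` function are hypothetical and are not the server's actual scoring code.

```typescript
// The five factors listed above, each assumed to be scored in [0, 1].
interface FactorScores {
  linguisticCertainty: number;
  internalConsistency: number;
  completeness: number;
  contextualAlignment: number;
  factualGrounding: number;
}

// Assumption: equal weights. The real weighting is not documented here.
function combineFactors(factors: FactorScores): number {
  const values = [
    factors.linguisticCertainty,
    factors.internalConsistency,
    factors.completeness,
    factors.contextualAlignment,
    factors.factualGrounding,
  ];
  // Clamp each factor into [0, 1] before averaging.
  const clamped = values.map((v) => Math.min(1, Math.max(0, v)));
  return clamped.reduce((sum, v) => sum + v, 0) / clamped.length;
}

const example = combineFactors({
  linguisticCertainty: 0.9,
  internalConsistency: 0.8,
  completeness: 0.7,
  contextualAlignment: 0.85,
  factualGrounding: 0.6,
});

console.log(example.toFixed(2));
```

A weighted mean or a minimum over factors would be equally plausible designs; the trace returned by score_confidence is the place to check what the server actually reports.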

Setup

npm install
npm run build
npm test

Run

npm start

Example MCP Config

{
  "mcpServers": {
    "confidence-gate": {
      "command": "node",
      "args": ["/Users/source/Desktop/CC/products/02-confidence-gate-mcp-server/dist/index.js"]
    }
  }
}

The path in args points at the author's local checkout; replace it with the absolute path to dist/index.js in your own build.

Local Storage

Confidence Gate writes local JSON stores under .confidence-gate/ by default:

traces.json for gate results, approval updates, and enforcement receipts.
policies.json for named policies and policy audit events.

Use CONFIDENCE_GATE_TRACE_STORE, CONFIDENCE_GATE_POLICY_STORE, and MAX_TRACE_LENGTH to tune local storage paths and retention.
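As a sketch of how these settings might be resolved, the function below assumes the three names are environment variables and falls back to the default file locations described above. The `StorageSettings` type and `resolveStorageSettings` function are hypothetical. MAX_TRACE_LENGTH is assumed to be a positive integer, and because its default is not documented here it is left unset when absent or invalid.

```typescript
interface StorageSettings {
  traceStorePath: string;
  policyStorePath: string;
  maxTraceLength?: number;
}

// Assumption: the three names are environment variables. They are passed in as
// a plain object (for example process.env) so the sketch stays self-contained.
function resolveStorageSettings(
  env: Record<string, string | undefined>,
): StorageSettings {
  const settings: StorageSettings = {
    traceStorePath:
      env["CONFIDENCE_GATE_TRACE_STORE"] || ".confidence-gate/traces.json",
    policyStorePath:
      env["CONFIDENCE_GATE_POLICY_STORE"] || ".confidence-gate/policies.json",
  };
  // Assumption: MAX_TRACE_LENGTH is a positive integer; ignore anything else.
  const parsed = parseInt(env["MAX_TRACE_LENGTH"] || "", 10);
  if (!isNaN(parsed) && parsed > 0) {
    settings.maxTraceLength = parsed;
  }
  return settings;
}

console.log(resolveStorageSettings({ MAX_TRACE_LENGTH: "500" }));
```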

Current Limits

This is a local MCP product slice, not hosted SaaS.
There is no public npm publication, live checkout, authenticated HTTP/SSE service, hosted account system, API-key service, billing, team permissions, or support process.
Confidence scoring is heuristic and policy-assisted; it is not a factuality guarantee.
Rollback and enforcement receipts record client intent/outcomes, but actual rollback execution remains the caller's responsibility.
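Because the gate only returns signals, the caller has to act on them. The sketch below shows one way a client could do that: run the action only on allow, invoke its own rollback handler when the gate signals one, and produce a record of what happened that could then be passed to record_enforcement_receipt. Apart from gate_id and the three decision values, every field and function name here is hypothetical.

```typescript
type GateDecision = "allow" | "review" | "block";

// Hypothetical shape of the parts of a gate result the caller acts on.
interface GateResult {
  gate_id: string;
  decision: GateDecision;
  rollbackRecommended: boolean;
}

// Hypothetical receipt; the real record_enforcement_receipt schema may differ.
interface EnforcementReceipt {
  gate_id: string;
  decision: GateDecision;
  outcome: "executed" | "rolled_back" | "skipped";
}

// The caller supplies both handlers: the gate itself never runs or undoes anything.
function enforce(
  result: GateResult,
  handlers: { run: () => void; rollback: () => void },
): EnforcementReceipt {
  const base = { gate_id: result.gate_id, decision: result.decision };
  if (result.decision === "allow") {
    handlers.run();
    return { ...base, outcome: "executed" };
  }
  if (result.rollbackRecommended) {
    handlers.rollback();
    return { ...base, outcome: "rolled_back" };
  }
  // "review" or "block" without a rollback signal: do nothing and report it.
  return { ...base, outcome: "skipped" };
}

const receipt = enforce(
  { gate_id: "gate-123", decision: "review", rollbackRecommended: false },
  {
    run: () => console.log("running action"),
    rollback: () => console.log("rolling back"),
  },
);

console.log(receipt.outcome); // skipped
```

Keeping the handlers on the caller's side mirrors the limit stated above: the receipt records what the client did, not something the server performed.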

Verification

Fresh suite verification on 2026-05-28:

npm test passed, 20/20 tests.
npm run build passed.
