# ai-trust-score
v1.0.4
ai-trust-score — deterministic validator for LLM outputs (schema, heuristics, consistency checks)
ai-trust-score performs deterministic checks on model outputs to help teams gate, audit, and monitor LLM-generated text and structured responses. The repository is source-first: the package exposes src/ as the module entry and does not commit build artifacts.
## Install

```shell
npm install ai-trust-score
```

## Exports and what they do
### validateLLM(input, config?)

- `input`: `string | object`. Runs the enabled detectors over the provided model output and returns a compact report with fields `ok` (boolean), `score` (0–100), and `issues` (array).
- `config` options: `detectors` (object to enable/disable detectors), `verbose` (boolean to include `summary`, `meta`, and `config`), and detector-specific options.

### guardHandler(handler, options)

- Accepts an async handler that returns model output. Runs validation on the handler's output and maps the result to HTTP responses: success yields 200 with `{ output, report }`; blocked yields 422 with `{ blocked: true, report }`.
- Options include `threshold` (score threshold for blocking) and `validateConfig` (overrides for `validateLLM`).

### types

- Re-exported TypeScript type definitions for the public API and detector shapes.
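The success/blocked mapping described above can be pictured with a small sketch. This is a simplified illustration of assumed behavior (in particular, the assumption that outputs scoring below the threshold are blocked), not the library's actual implementation:

```js
// Simplified sketch of guardHandler's status mapping (assumed behavior:
// outputs whose score falls below the threshold are blocked).
function decide(report, output, threshold) {
  return report.score >= threshold
    ? { status: 200, body: { output, report } }
    : { status: 422, body: { blocked: true, report } };
}

console.log(decide({ ok: true, score: 95, issues: [] }, 'fine', 80).status); // 200
console.log(decide({ ok: false, score: 60, issues: [] }, 'bad', 80).status); // 422
```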
## Core config fields

- `detectors`: enable or disable detectors by name, for example `{ numeric: true, hallucination: true }`.
- `verbose`: when true, the report includes `summary`, `meta` (timestamp, inputType, packageVersion, detectors), and `config`.
- `threshold`: numeric threshold used by `guardHandler` to decide whether to block outputs.
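Putting those fields together, a typical config object might look like this (the specific values are illustrative):

```js
// Illustrative config: detectors and verbose are read by validateLLM;
// threshold is consumed by guardHandler.
const config = {
  detectors: { numeric: true, hallucination: true, overconfidence: false },
  verbose: true,
  threshold: 80
};

console.log(Object.keys(config.detectors).filter(k => config.detectors[k]));
// [ 'numeric', 'hallucination' ]
```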
## Built-in detectors

- `numeric` — basic numeric consistency checks (simple arithmetic and percent sanity checks).
- `hallucination` — pattern-based heuristics that flag likely hallucinated facts.
- `inconsistency` — detects internal contradictions within a single output.
- `overconfidence` — flags overly certain language when claims are uncertain.
## Issue object layout

- `type`: detector name (string)
- `severity`: `low | medium | high`
- `message`: short human-friendly description
- `meta`: optional detector-specific evidence object
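For example, a downstream consumer can triage issues by severity (the report values here are made up for illustration):

```js
// Hypothetical report fragment following the issue layout above.
const report = {
  ok: false,
  score: 89,
  issues: [
    { type: 'numeric', severity: 'medium', message: 'Percent change inconsistent', meta: { expected: '50%', actual: '20%' } },
    { type: 'overconfidence', severity: 'low', message: 'Certain language around an uncertain claim' }
  ]
};

// Surface only the issues worth a reviewer's attention.
const actionable = report.issues.filter(i => i.severity !== 'low');
console.log(actionable.length); // 1
```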
## Examples

CommonJS:

```js
const { validateLLM } = require('ai-trust-score');
const report = validateLLM('The capital of France is Berlin.', { detectors: { hallucination: true } });
console.log(report);
```

ESM / TypeScript:

```js
import { validateLLM } from 'ai-trust-score';
const report = validateLLM('Revenue grew 20% from 100 to 150', { detectors: { numeric: true }, verbose: true });
console.log(report.summary);
```

Express minimal handler:
```js
const express = require('express');
const { guardHandler } = require('ai-trust-score');

const app = express();
app.use(express.json());

app.post('/generate', guardHandler(async (req) => {
  // return model output (string or object); myLLM is your model client
  return await myLLM.generate(req.body.prompt);
}, { threshold: 80 }));

app.listen(3001);
```

Sample verbose report:
```json
{
  "ok": false,
  "score": 92,
  "issues": [
    { "type": "numeric", "severity": "medium", "message": "Percent change inconsistent: 100 -> 150 is 50% not 20%", "meta": { "expected": "50%", "actual": "20%" } }
  ],
  "summary": "Detected 1 issue(s). Trust score 92/100.",
  "meta": { "timestamp": "2026-02-18T00:48:32.716Z", "inputType": "string", "packageVersion": "1.0.1", "detectors": ["numeric", "hallucination"] },
  "config": { "detectors": { "numeric": true }, "verbose": true }
}
```

## Run CLI/demo locally (source-first)
```shell
# run CLI (uses src/)
npm run cli

# run demo script
npm run demo
```

## Notes and tips
- The project is intentionally source-first. If you prefer built artifacts, set `prepare` to run the build and point `main`/`bin` at `dist/`.
- Use `verbose: true` for debugging detector evidence.
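If you do prefer built artifacts, the `package.json` changes could look roughly like this (the build tool and file names are illustrative, not the project's actual setup):

```json
{
  "main": "dist/index.js",
  "bin": { "ai-trust-score": "dist/cli.js" },
  "scripts": {
    "build": "babel src -d dist",
    "prepare": "npm run build"
  }
}
```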
## Custom rules and tuning
You can extend and tune the built-in detectors without changing code by supplying custom pattern rules or by editing the included `patterns.json` file. This makes the library easy to adapt to your domain (add known institutions, project-specific phrases, or tune severities).
- Built-in patterns file: the repository ships `src/detectors/patterns.json`. Detectors that support configurable rules load patterns from this file automatically. Entries are simple objects with `pattern`, optional `flags`, `message`, `severity`, and `type`.
Example snippet (`src/detectors/patterns.json`):

```json
{
  "hallucination": [
    { "pattern": "\\b(My Fake Institute|Imaginary Lab)\\b", "flags": "i", "message": "Suspicious institution", "severity": "low" }
  ],
  "bias": [
    { "pattern": "\\b(obviously|no doubt)\\b", "flags": "i", "message": "Loaded language", "severity": "low" }
  ]
}
```

- Per-run `customRules` via config
You can pass a `customRules` object to `validateLLM` (or to the CLI via `--config`) to override or augment rules at runtime. This is useful for temporary experiments, CI checks, or environment-specific policies.

Example `myrules.json`:
```json
{
  "customRules": {
    "hallucination": [
      { "pattern": "\\b(Example Institute|Acme Research)\\b", "flags": "i", "message": "Domain-specific suspicious org", "severity": "low" }
    ],
    "bias": [
      { "pattern": "\\b(they are criminals)\\b", "flags": "i", "message": "Dehumanizing phrase", "severity": "medium" }
    ]
  }
}
```

Pass it programmatically:

```js
const { validateLLM } = require('ai-trust-score');
const config = require('./myrules.json');
const report = validateLLM(outputText, { verbose: true, customRules: config.customRules });
```

Or with the analyzer CLI (convenience):
```shell
node ./scripts/analyze_prompts.js prompts.json --config myrules.json --out report.json --format json
```

- Rule precedence and best practices
  - Detectors first consult `config.customRules[...]` (if provided), then fall back to the repository-level `patterns.json` rules, then to built-in heuristics. That means `customRules` can override or add rules per run without touching source.
  - Keep patterns conservative to avoid false positives. Start with `severity: low` and increase only if you need stronger signals.
  - Use the case-insensitive (`i`) flag for most human-readable patterns, and avoid overly broad regexes that match common words.
  - For fuzzy or typo-friendly matching, detectors use token-based fuzzy helpers where supported. When adding multi-word patterns, consider including expected variants.
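As a concrete illustration of how a single pattern entry behaves, here is a sketch of compiling and applying one (assumed mechanics; the library's actual detector internals may differ):

```js
// One patterns.json-style entry, compiled to a RegExp and applied to text.
const entry = {
  pattern: '\\b(obviously|no doubt)\\b',
  flags: 'i',
  message: 'Loaded language',
  severity: 'low'
};

function applyEntry(entry, text) {
  const re = new RegExp(entry.pattern, entry.flags);
  return re.test(text)
    ? { type: 'bias', severity: entry.severity, message: entry.message }
    : null; // no match, no issue
}

console.log(applyEntry(entry, 'Obviously this will work.')); // flags "Obviously"
console.log(applyEntry(entry, 'This might work.'));          // null
```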
- Tuning scoring
  - The overall trust `score` is computed by subtracting penalties per `severity` (configurable in code). If you need different weights, consider modifying `src/core/score.js` or opening a PR to make severity weights configurable via `validateLLM` options.
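A minimal sketch of that subtraction scheme (the penalty weights here are illustrative guesses, not the actual values in `src/core/score.js`):

```js
// Hypothetical per-severity penalties; the real weights live in src/core/score.js.
const PENALTIES = { low: 3, medium: 8, high: 20 };

function computeScore(issues) {
  const penalty = issues.reduce((sum, i) => sum + (PENALTIES[i.severity] || 0), 0);
  return Math.max(0, 100 - penalty); // clamp at zero
}

console.log(computeScore([{ severity: 'medium' }])); // 92
console.log(computeScore([]));                       // 100
```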
Possible future additions:

- A sample `myrules.example.json` with common useful entries for hallucination, bias, and overconfidence.
- A small `scripts/summary_report.js` that prints aggregated metrics (average score, counts by detector) from `report.json`.
## License
See the LICENSE file in this repository.
Made with love by ahmadraza100
