@shakircm/musaw

v1.0.1

Published

18 days ago

shakircm

🛡️ MuSAW — Cross-Model AI Output Verifier

The independent truth layer for any AI model. Extract claims, detect hallucinations, and find real evidence — works on Claude, GPT, Perplexity, Gemini, Llama, any model.

npx musaw check output.txt

The Gap

| Company | Can verify itself? | Why not? |
|---------|:------------------:|----------|
| Anthropic | ❌ | Claude can't audit Claude: same blind spots |
| OpenAI | ❌ | GPT shares the same hallucination patterns |
| Perplexity | ❌ | Their search summaries are the product itself |
| Google (Gemini) | ❌ | Same architecture, same weaknesses |
| MuSAW | ✅ | Model-agnostic, no shared DNA |

Every major AI lab is racing to build self-verification, but a model that audits itself inherits its own blind spots. True verification requires independence: a layer that doesn't share the model's training data, architecture, or biases.

Install

# Install globally
npm install -g musaw

# Or run directly
npx musaw

# Or use as a library
npm install musaw

Quick Start

# Run the demo
npx musaw demo

# Verify a file
npx musaw check output.txt

# Pipe from another command
cat output.txt | npx musaw check -

# Quick confidence check
npx musaw quick output.txt

# Compare two model outputs
npx musaw compare output-a.txt output-b.txt

# Run tests
npx musaw test

Use as a Library

import { MuSAW } from 'musaw';

const musaw = new MuSAW({ strictness: 'strict' });

// Verify any AI output
const result = await musaw.analyze(`
  According to a recent study published in Nature, 
  Google's Sycamore processor achieved quantum supremacy in 2019.
`);

console.log(result.report);
// ✅ Found: "Sodium Regulation in the Blood..." (Nature, 1959)
//    DOI: 10.1038/184283a0
//    Confidence: 60% — source exists but claim not verified

// Import individual modules
import { ClaimExtractor } from 'musaw/core/claim-extractor';
import { Verifier } from 'musaw/core/verifier';
import { EvidenceRetriever } from 'musaw/core/evidence-retriever';
import { ReportGenerator } from 'musaw/core/report-generator';

Architecture

AI Output ──→ MuSAW ──→ Per-Claim Confidence Score + Evidence
(any model)

Layer 1: CLAIM EXTRACTION
  Parses AI output into atomic verifiable statements.
  Identifies: entities, numbers, dates, sources, forgeability patterns
  Flags: "studies show", "experts say", over-precise percentages
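
To make Layer 1 concrete, here is a minimal sketch of claim extraction. It is not MuSAW's `ClaimExtractor`: the function name `extractClaims`, the returned field names, and the regexes are all assumptions chosen for illustration, and the sentence splitter is deliberately naive (it will split on abbreviations such as "e.g.").

```javascript
// Sketch only: a naive extractor, not MuSAW's ClaimExtractor.
// Function and field names here are hypothetical.
function extractClaims(text) {
  // Split on sentence-ending punctuation followed by whitespace.
  const sentences = text
    .split(/(?<=[.!?])\s+/)
    .map((s) => s.trim())
    .filter(Boolean);

  return sentences.map((sentence) => ({
    text: sentence,
    // Integers, decimals, and percentages.
    numbers: sentence.match(/\d+(?:\.\d+)?%?/g) ?? [],
    // Four-digit years from 1900 to 2099.
    years: sentence.match(/\b(?:19|20)\d{2}\b/g) ?? [],
    // Capitalized words that do not start the sentence, as a crude entity guess.
    entities: sentence.slice(1).match(/\b[A-Z][a-zA-Z]+\b/g) ?? [],
  }));
}
```

Each returned object corresponds to one candidate claim that later layers could score independently.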

Layer 2: VERIFICATION ENGINE (5 dimensions)
  • Internal consistency — does it contradict itself?
  • Numeric sanity — do the numbers make sense?
  • Attribution quality — is the source real and specific?
  • Temporal consistency — are timeframes realistic?
  • Hallucination patterns — known fabrication signatures?
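
The README does not say how the five dimensions are combined into one confidence figure. One plausible approach, sketched below, is an unweighted average reported as a percentage. The dimension keys, the `combineScores` name, and the equal weighting are assumptions, not MuSAW's actual scoring.

```javascript
// Sketch only: dimension keys mirror the list above; equal weighting is an assumption.
const DIMENSIONS = [
  'internalConsistency',
  'numericSanity',
  'attributionQuality',
  'temporalConsistency',
  'hallucinationPatterns',
];

// Each score is expected in [0, 1], where 1 means no problem was found.
function combineScores(scores) {
  let total = 0;
  for (const name of DIMENSIONS) {
    const value = scores[name];
    if (!Number.isFinite(value) || value < 0 || value > 1) {
      throw new RangeError(`${name} must be a number between 0 and 1`);
    }
    total += value;
  }
  // Report the mean as a whole-number percentage.
  return Math.round((total / DIMENSIONS.length) * 100);
}
```

A weighted average would be a natural variation if some dimensions, such as attribution quality, should count for more than others.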

Layer 3: EVIDENCE RETRIEVAL
  For claims referencing studies/journals:
  • Extracts DOIs, journal names, paper titles
  • Queries CrossRef API (free, no key needed)
  • Returns: real paper title, DOI, journal, author, direct URL
  • Boosts confidence when real source confirms the claim
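
As a rough illustration of Layer 3, the sketch below queries the public CrossRef REST API (`https://api.crossref.org/works` with the `query.bibliographic` parameter) and reduces each hit to the fields listed above. The helper names (`buildCrossrefUrl`, `summarizeWorks`, `findEvidence`, `titleOverlapsClaim`) are hypothetical and are not MuSAW's `EvidenceRetriever` API. The word-overlap check is an assumed stand-in for whatever MuSAW uses to decide that a source confirms a claim. The global `fetch` assumes Node 18 or later.

```javascript
// Sketch only: helper names are hypothetical, not MuSAW's API.
const CROSSREF_WORKS = 'https://api.crossref.org/works';

// Build a CrossRef search URL for a free-text citation.
function buildCrossrefUrl(citation, rows = 3) {
  const params = new URLSearchParams({
    'query.bibliographic': citation,
    rows: String(rows),
  });
  return `${CROSSREF_WORKS}?${params}`;
}

// Reduce a CrossRef response to title, DOI, journal, first author, and URL.
function summarizeWorks(response) {
  const items = response?.message?.items ?? [];
  return items.map((item) => ({
    title: item.title?.[0] ?? null,
    doi: item.DOI ?? null,
    journal: item['container-title']?.[0] ?? null,
    firstAuthor: item.author?.[0]?.family ?? null,
    url: item.DOI ? `https://doi.org/${item.DOI}` : item.URL ?? null,
  }));
}

// Assumed heuristic: a source supports a claim only if its title shares
// at least two words of four or more letters with the claim text.
function titleOverlapsClaim(title, claim) {
  const words = (s) => new Set(s.toLowerCase().match(/[a-z]{4,}/g) ?? []);
  const claimWords = words(claim);
  let shared = 0;
  for (const word of words(title ?? '')) {
    if (claimWords.has(word)) shared++;
  }
  return shared >= 2;
}

// The network call is kept separate so the pure helpers can be tested offline.
async function findEvidence(citation) {
  const res = await fetch(buildCrossrefUrl(citation));
  if (!res.ok) throw new Error(`CrossRef returned HTTP ${res.status}`);
  return summarizeWorks(await res.json());
}
```

Separating "a paper exists" from "the paper matches the claim" matters: a search can return a real DOI for an unrelated paper, in which case confidence should not be boosted.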

What MuSAW Detects

MuSAW automatically flags:

- "Studies show" / "Research indicates": unattributed research claims
- "Experts say" / "Some believe": vague expert consensus
- Over-precise statistics: "exactly 83.7%" is often fabricated
- Fake paper IDs: the "paper-2024-xyz" pattern
- Self-contradictions: saying "all" and "some" in the same breath
- Temporal vagueness: "recently", "soon", "historically"
- Suspiciously round or convenient numbers: 42, 100, 1000 popping up
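
Most of these flags can be approximated with plain regular expressions. The sketch below is illustrative only: the rule IDs, the regexes, and the `flagClaim` function are assumptions and do not reproduce MuSAW's actual rule set. Round-number detection is left out because it needs context to avoid false positives.

```javascript
// Sketch only: rule IDs and regexes are illustrative assumptions, not MuSAW's rules.
const RULES = [
  { id: 'unattributed-research', pattern: /\b(studies show|research indicates)\b/i },
  { id: 'vague-experts', pattern: /\b(experts say|some believe)\b/i },
  { id: 'over-precise-statistic', pattern: /\b\d+\.\d+%/ },
  { id: 'fake-paper-id', pattern: /\bpaper-\d{4}-[a-z0-9]+\b/i },
  { id: 'temporal-vagueness', pattern: /\b(recently|soon|historically)\b/i },
];

// Return the IDs of every rule that a claim triggers.
function flagClaim(text) {
  const flags = RULES.filter((rule) => rule.pattern.test(text)).map((rule) => rule.id);
  // Crude self-contradiction check: "all" and "some" in the same claim.
  if (/\ball\b/i.test(text) && /\bsome\b/i.test(text)) {
    flags.push('self-contradiction');
  }
  return flags;
}

console.log(flagClaim('Studies show that exactly 83.7% of users agree.'));
// → [ 'unattributed-research', 'over-precise-statistic' ]
```

A flag marks a claim for closer checking; it is not proof of fabrication, since a precise percentage or the word "recently" can be perfectly legitimate.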

Verified Models

MuSAW works on any text output from any model, with zero changes:

| Model | Company | Status |
|-------|---------|:------:|
| Claude | Anthropic | ✅ |
| GPT-4 / GPT-4o | OpenAI | ✅ |
| Perplexity | Perplexity AI | ✅ |
| Gemini | Google | ✅ |
| Llama | Meta | ✅ |
| Mistral | Mistral AI | ✅ |
| Grok | xAI | ✅ |
| DeepSeek | DeepSeek | ✅ |
| Copilot | Microsoft | ✅ |
| Any future model | Any company | ✅ |

Project Structure

musaw/
├── src/
│   ├── cli.js                    # CLI entry with demo/test/compare
│   ├── pitch.js                  # $100M pitch deck
│   ├── targets.js                # 14 acquisition targets
│   ├── verify-all-models.js      # Cross-model demo
│   └── core/
│       ├── musaw.js              # Main orchestrator
│       ├── claim-extractor.js    # Text → atomic claims
│       ├── verifier.js           # 5-dimension verification
│       ├── evidence-retriever.js # CrossRef real paper lookup
│       └── report-generator.js   # Beautiful reports
├── workspace/                    # Analysis outputs
├── vault/                        # Trusted fact storage (future)
├── package.json
├── README.md
└── LICENSE

Why This Matters

By the project's estimate, hallucination rates run 10-40% across major models, which it values at $100B+ in liability. The EU AI Act's obligations phase in over several years, with the bulk scheduled to apply from 2026. Every enterprise deploying AI needs an independent verifier.

MuSAW is the only one.

License

MIT — MuSAW Labs

Pitch

npx musaw pitch   # View $100M pitch deck
npx musaw targets  # View acquisition targets