docshield

v1.0.3

AI Document Tampering & AI-Generation Detection SDK (Node + Browser) - Forensic signal aggregation engine for detecting document tampering and authenticity anomalies

📄 docshield

AI Document Tampering & AI-Generation Detection SDK (Node.js + Browser)

A probabilistic forensic signal aggregation engine for detecting document tampering, AI-generated artifacts, and authenticity anomalies in PDFs and images.


1. 🎯 Purpose

docshield is a forensic SDK designed to analyze documents (PDFs, images) and produce a multi-signal, evidence-backed authenticity assessment.

It does NOT claim deterministic detection. Instead, it provides:

  • Probabilistic tampering likelihood
  • AI-generation likelihood
  • Metadata anomaly reports
  • OCR consistency analysis
  • Model-based deepfake probability
  • Structured forensic breakdown

Accuracy and interpretability are prioritized over sensational detection claims.


2. ⚖️ Design Philosophy

This SDK is built on the following principles:

  1. Multi-signal aggregation > single detector
  2. Explainability > black-box scoring
  3. Probabilistic scoring > binary classification
  4. Metadata + structural + content + ML analysis
  5. Forensic defensibility

Every output includes traceable evidence components.


3. 📦 Installation

Option 1: NPM (Recommended)

npm install docshield

Option 2: Yarn

yarn add docshield

Peer Dependencies

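docshield relies on optional peer dependencies for OCR (tesseract.js), image processing (sharp), PDF parsing (pdf-lib), and EXIF reading (exifreader). Install them with: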

npm install tesseract.js sharp pdf-lib exifreader

Optional ML models:

npm install onnxruntime-web @tensorflow/tfjs

Note: These peer dependencies are optional. The SDK works in degraded mode without them.


4. 🚀 Quick Start

Node.js / TypeScript

import fs from 'fs';
import { verifyDocument, quickVerify } from 'docshield';

// Load a PDF file
const pdfBuffer = fs.readFileSync('document.pdf');

// Option 1: Full verification with detailed forensic report
const result = await verifyDocument({
  buffer: pdfBuffer,
  type: 'pdf',
  filename: 'document.pdf'
});

console.log(result);
// Output:
// {
//   confidenceScore: 85,
//   tamperingProbability: 0.05,
//   aiGeneratedProbability: 0.02,
//   forensicSummary: {
//     riskLevel: 'Low',
//     primaryFlags: [],
//     explanation: 'Document appears authentic...'
//   },
//   ...
// }

// Option 2: Quick verification (essential scores only)
const quick = await quickVerify({
  buffer: pdfBuffer,
  type: 'pdf'
});

console.log(quick);
// Output: { confidenceScore: 85, riskLevel: 'Low', ... }

Browser / React

import { verifyImage } from 'docshield';

async function handleImageUpload(event: React.ChangeEvent<HTMLInputElement>) {
  const file = event.target.files?.[0];
  if (!file) return;

  const buffer = await file.arrayBuffer();
  const result = await verifyImage(new Uint8Array(buffer));

  console.log(`Authenticity: ${result.confidenceScore}%`);
  console.log(`Risk Level: ${result.forensicSummary.riskLevel}`);
}

Configuration

import { verifyDocument } from 'docshield';

// Reuses pdfBuffer from the Node.js example above

const result = await verifyDocument(
  { buffer: pdfBuffer, type: 'pdf' },
  {
    verbose: true,
    enableDeepfakeDetection: true,
    enableWatermarkAnalysis: true,
    weights: {
      metadataWeight: 0.25,
      ocrWeight: 0.20,
      watermarkWeight: 0.15,
      modelWeight: 0.40
    }
  }
);

5. 📚 API Reference

Main Functions

verifyDocument(fileInput, config?)

Full forensic verification with detailed analysis.

Parameters:

  • fileInput - File buffer and metadata
    • buffer: Buffer - File contents
    • type: 'pdf' | 'image' - Document type
    • filename?: string - Optional filename
  • config?: DocshieldConfig - Optional configuration

Returns: Promise<VerificationResult>

quickVerify(fileInput, config?)

Fast verification returning only essential scores.

Returns: Promise<{ confidenceScore, riskLevel, tamperingProbability, aiGeneratedProbability }>

verifyPDF(buffer, config?)

Shorthand for PDF verification.

verifyImage(buffer, config?)

Shorthand for image verification.


6. 📊 Output Schema

interface VerificationResult {
  confidenceScore: number // 0–100 authenticity confidence
  tamperingProbability: number // 0–1
  aiGeneratedProbability: number // 0–1

  forensicSummary: {
    riskLevel: "Low" | "Moderate" | "High" | "Critical"
    primaryFlags: string[]
    explanation: string
  }

  detectedIssues: string[]
  metadataAnalysis: MetadataReport
  ocrAnalysis: OCRReport
  watermarkAnalysis: WatermarkReport
  deepfakeAnalysis: DeepfakeReport

  technicalBreakdown: {
    metadataScore: number
    ocrConsistencyScore: number
    watermarkScore: number
    modelScore: number
  }

  evidenceHash: string // SHA-256 fingerprint
  analysisTimestamp: string
}

7. 🔍 Detection Modules

7.1 Metadata Analyzer

Detects structural and metadata inconsistencies in PDFs and images.

Checks:

  • Missing creation timestamps
  • Creation date later than modification date
  • Suspicious producer software
  • Known AI generation tags
  • Editing software traces
  • Inconsistent embedded fonts
  • Unusual compression artifacts
  • Resolution mismatch

Output:

interface MetadataReport {
  suspiciousFields: string[]
  softwareDetected: string[]
  timestampAnomalies: boolean
  structuralIntegrityScore: number
}
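
The two timestamp checks in the list above can be sketched as a small pure function. This is a minimal illustration under stated assumptions, not docshield's implementation: `checkTimestamps`, its return shape, and the field labels are hypothetical, and the dates are assumed to have been extracted already (for example with pdf-lib or exifreader).

```typescript
// Hypothetical helper: not part of the docshield API.
interface TimestampCheck {
  timestampAnomalies: boolean;
  suspiciousFields: string[];
}

function checkTimestamps(created?: Date, modified?: Date): TimestampCheck {
  const suspiciousFields: string[] = [];

  // Check 1: the creation timestamp is missing or unparseable.
  if (!created || Number.isNaN(created.getTime())) {
    suspiciousFields.push('creationDate:missing');
  }

  // Check 2: a document should not be modified before it was created.
  if (created && modified && created.getTime() > modified.getTime()) {
    suspiciousFields.push('creationDate:after-modificationDate');
  }

  return { timestampAnomalies: suspiciousFields.length > 0, suspiciousFields };
}
```

The result maps onto the `timestampAnomalies` and `suspiciousFields` members of `MetadataReport`; how the SDK derives `structuralIntegrityScore` from such checks is not documented here.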

7.2 EXIF & Image Integrity Analysis

Flags:

  • Missing camera model in real-world claim documents
  • AI tool signatures
  • Synthetic resolution patterns
  • Metadata wiped after editing
  • Layering inconsistencies

7.3 OCR Consistency Engine

Detects overlay tampering or digital text injection.

Method:

  1. Extract embedded PDF text
  2. Extract OCR text via Tesseract
  3. Compare similarity

If the similarity falls below the configured threshold, the document is flagged for potential overlay tampering.

Output:

interface OCRReport {
  extractedTextLength: number
  similarityScore: number
  overlaySuspicion: boolean
}
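
The comparison step can be illustrated with a simple stand-in metric. The README does not say which similarity measure the SDK uses, so the sketch below swaps in Jaccard similarity over word sets; `tokenize`, `jaccardSimilarity`, `compareTexts`, and the 0.75 default threshold are all hypothetical.

```typescript
// Hypothetical helpers: the metric docshield actually uses is not documented.
interface OCRReport {
  extractedTextLength: number;
  similarityScore: number;
  overlaySuspicion: boolean;
}

// Lowercase and split on non-word characters (ASCII-oriented, for brevity).
function tokenize(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/\W+/).filter(Boolean));
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, in the range 0..1.
function jaccardSimilarity(a: string, b: string): number {
  const setA = tokenize(a);
  const setB = tokenize(b);
  if (setA.size === 0 && setB.size === 0) return 1; // two empty texts agree
  let shared = 0;
  setA.forEach((word) => {
    if (setB.has(word)) shared++;
  });
  return shared / (setA.size + setB.size - shared);
}

// Compare embedded PDF text against OCR text and flag low agreement.
function compareTexts(embedded: string, ocr: string, threshold = 0.75): OCRReport {
  const similarityScore = jaccardSimilarity(embedded, ocr);
  return {
    extractedTextLength: ocr.length,
    similarityScore,
    overlaySuspicion: similarityScore < threshold,
  };
}
```

A set-based metric ignores word order and repeated words, so a production comparison would likely need an edit-distance or n-gram measure that tolerates OCR noise.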

7.4 AI Text Watermark Heuristic

Uses statistical analysis to detect AI-generated text:

  • Shannon entropy
  • Token distribution uniformity
  • Burstiness measurement
  • Repetition index
  • Stylometric signature

Output:

interface WatermarkReport {
  entropyScore: number
  repetitionIndex: number
  aiLikelihoodScore: number
  heuristicConfidence: number
}
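
Two of the listed statistics, Shannon entropy and a repetition index, can be sketched in a few lines. The function names and the exact definition of the repetition index (share of tokens that repeat an earlier token) are assumptions made for illustration; docshield's heuristics may differ.

```typescript
// Hypothetical helpers; not the SDK's actual heuristics.

// Shannon entropy of the token distribution, in bits per token.
function shannonEntropy(tokens: string[]): number {
  if (tokens.length === 0) return 0;
  const counts = new Map<string, number>();
  for (const token of tokens) {
    counts.set(token, (counts.get(token) ?? 0) + 1);
  }
  let entropy = 0;
  counts.forEach((count) => {
    const p = count / tokens.length;
    entropy -= p * Math.log2(p);
  });
  return entropy;
}

// Fraction of tokens that duplicate an earlier token (0 = all unique).
function repetitionIndex(tokens: string[]): number {
  if (tokens.length === 0) return 0;
  return 1 - new Set(tokens).size / tokens.length;
}
```

Neither number identifies AI-generated text on its own; they are inputs to a combined likelihood, which is why the report carries a separate `heuristicConfidence`.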

7.5 Deepfake / AI Image Model

Uses ONNX or TensorFlow.js models.

Pipeline:

  1. Convert image to tensor
  2. Normalize channels
  3. Run classifier
  4. Output probability

Output:

interface DeepfakeReport {
  aiImageProbability: number
  ganFingerprintDetected: boolean
  modelConfidence: number
}
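
Steps 1 and 2 of the pipeline (tensor conversion and channel normalization) might look like the sketch below. It assumes raw RGBA bytes as input and a classifier that expects channel-first float data normalized with the common ImageNet mean and standard deviation; `rgbaToChwTensor` and both constants are hypothetical, since the README does not describe the model's expected input.

```typescript
// Hypothetical preprocessing helper. MEAN/STD are the widely used ImageNet
// values, assumed here only for illustration.
const MEAN = [0.485, 0.456, 0.406];
const STD = [0.229, 0.224, 0.225];

function rgbaToChwTensor(
  rgba: Uint8Array | Uint8ClampedArray,
  width: number,
  height: number
): Float32Array {
  const pixels = width * height;
  if (rgba.length !== pixels * 4) {
    throw new Error('Expected RGBA data of width * height * 4 bytes');
  }
  const tensor = new Float32Array(3 * pixels);
  for (let i = 0; i < pixels; i++) {
    for (let c = 0; c < 3; c++) {
      const value = rgba[i * 4 + c] / 255; // scale to 0..1, alpha is skipped
      tensor[c * pixels + i] = (value - MEAN[c]) / STD[c]; // channel-first layout
    }
  }
  return tensor;
}
```

The resulting `Float32Array` has shape `[3, height, width]` when read in row-major order, which is the layout many ONNX image classifiers accept after adding a batch dimension.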

8. 🧠 Confidence Scoring Engine

The final authenticity score is a weighted aggregation of the four module scores:

Final Confidence =
  0.25 * metadataScore +
  0.20 * ocrConsistencyScore +
  0.15 * watermarkScore +
  0.40 * modelScore

Weights are configurable.
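
As a worked illustration of the formula, the sketch below computes the weighted sum with the default weights. It assumes each component score is on the same 0–100 scale as `confidenceScore`, and it divides by the weight total so that custom weights need not sum to exactly 1; both points are assumptions, and `aggregateConfidence` is a hypothetical name.

```typescript
// Hypothetical aggregation helper; field names follow the README's
// `weights` config and `technicalBreakdown` schema.
interface ScoreWeights {
  metadataWeight: number;
  ocrWeight: number;
  watermarkWeight: number;
  modelWeight: number;
}

interface TechnicalBreakdown {
  metadataScore: number;
  ocrConsistencyScore: number;
  watermarkScore: number;
  modelScore: number;
}

const DEFAULT_WEIGHTS: ScoreWeights = {
  metadataWeight: 0.25,
  ocrWeight: 0.2,
  watermarkWeight: 0.15,
  modelWeight: 0.4,
};

function aggregateConfidence(
  scores: TechnicalBreakdown,
  weights: ScoreWeights = DEFAULT_WEIGHTS
): number {
  const total =
    weights.metadataWeight + weights.ocrWeight + weights.watermarkWeight + weights.modelWeight;
  if (total <= 0) throw new RangeError('Weights must sum to a positive number');

  const weighted =
    weights.metadataWeight * scores.metadataScore +
    weights.ocrWeight * scores.ocrConsistencyScore +
    weights.watermarkWeight * scores.watermarkScore +
    weights.modelWeight * scores.modelScore;

  return weighted / total; // normalize in case weights do not sum to 1
}
```

For example, component scores of 100, 50, 0, and 100 give 25 + 10 + 0 + 40 = 75 under the default weights.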


9. 🔐 Evidence Integrity

Every analyzed document produces:

evidenceHash = SHA256(fileBuffer)

This ensures:

  • Forensic reproducibility
  • Chain-of-custody support
  • Audit trace reliability
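
In Node.js, the fingerprint above can be reproduced with the built-in `crypto` module. This is a minimal sketch: `computeEvidenceHash` is a hypothetical name, and the lowercase hex encoding is an assumption about how the SDK formats `evidenceHash`.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical helper mirroring evidenceHash = SHA256(fileBuffer).
function computeEvidenceHash(fileBuffer: Uint8Array): string {
  return createHash('sha256').update(fileBuffer).digest('hex');
}
```

Recomputing the hash from the stored file and comparing it with the recorded `evidenceHash` shows whether the bytes changed after analysis. In the browser, the Web Crypto API (`crypto.subtle.digest`) offers the same digest asynchronously.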

10. 📈 Risk Classification Logic

| Confidence Score | Risk Level |
| --- | --- |
| 85–100 | Low |
| 65–84 | Moderate |
| 40–64 | High |
| 0–39 | Critical |
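
The bands above translate directly into a lookup function. `classifyRisk` is a hypothetical name, and treating fractional scores (such as 84.5) as belonging to the lower band is an assumption, since the table only lists integer ranges.

```typescript
type RiskLevel = 'Low' | 'Moderate' | 'High' | 'Critical';

// Hypothetical helper mapping a 0–100 confidence score to a risk band.
function classifyRisk(confidenceScore: number): RiskLevel {
  if (!(confidenceScore >= 0 && confidenceScore <= 100)) {
    throw new RangeError('confidenceScore must be between 0 and 100');
  }
  if (confidenceScore >= 85) return 'Low';
  if (confidenceScore >= 65) return 'Moderate';
  if (confidenceScore >= 40) return 'High';
  return 'Critical';
}
```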


11. 🧩 Multi-Use Cases

Legal Document Verification

  • Court filings
  • Affidavits
  • Evidence submissions
  • Contract authenticity

HR / Background Verification

  • Resume tampering
  • Degree certificate validation

Insurance & Claims

  • Image manipulation detection
  • Accident report verification

Financial Institutions

  • Loan document fraud screening

Digital Forensics Teams

  • Chain-of-custody validation
  • Evidence screening

SaaS Integration

  • API verification endpoint
  • Browser upload validation
  • KYC pipeline integration

12. 🌐 Node + Browser Compatibility

Dual export structure (export conditions are matched in key order, so "browser" is listed before "import"):

{
  "main": "dist/node/index.js",
  "browser": "dist/browser/index.js",
  "exports": {
    ".": {
      "browser": "./dist/browser/index.js",
      "import": "./dist/node/index.js"
    }
  }
}

13. ⚠️ Accuracy Guarantees & Limitations

This SDK:

  • ✔ Detects statistical anomalies
  • ✔ Flags structural inconsistencies
  • ✔ Provides probabilistic scoring
  • ✔ Aggregates independent forensic signals

This SDK DOES NOT:

  • ✖ Guarantee 100% AI detection
  • ✖ Provide legal certification
  • ✖ Replace forensic lab analysis

All outputs are probabilistic.


14. 🔄 Future Enhancements

  • Blockchain notarization module
  • Digital signature verification
  • Tamper-evident watermark detection
  • Stylometric author fingerprinting
  • Large-scale ML ensemble voting
  • REST API wrapper
  • Enterprise audit logs

15. 🔬 Accuracy Optimization Guidelines

To maximize detection reliability:

  1. Use ensemble ML models
  2. Train on diverse datasets
  3. Regularly update AI detection models
  4. Calibrate thresholds per industry
  5. Store anonymized telemetry for model refinement
  6. Maintain strict test benchmarks

16. 🏁 Project Setup & Getting Started

📋 Project Structure

docshield/
├── src/
│   ├── types/              # TypeScript interfaces
│   ├── analyzers/          # Detection modules
│   │   ├── metadataAnalyzer.ts
│   │   ├── ocrAnalyzer.ts
│   │   ├── watermarkAnalyzer.ts
│   │   └── deepfakeAnalyzer.ts
│   ├── core/               # Core engine
│   │   ├── scoringEngine.ts
│   │   └── forensicAggregator.ts
│   ├── utils/              # Utility functions
│   │   ├── hashing.ts
│   │   └── validators.ts
│   ├── __tests__/          # Test suite
│   └── index.ts            # Main export
├── dist/                   # Compiled output
├── package.json            # Dependencies & scripts
├── tsconfig.json           # TypeScript config
├── jest.config.js          # Test config
├── .eslintrc.json          # ESLint config
├── .prettierrc.json        # Code formatting
└── README.md               # Documentation

🚀 Development Setup

1. Clone & Install

git clone https://github.com/yourusername/docshield.git
cd docshield
npm install

2. Build Project

npm run build

Output: dist/node/ and dist/browser/

3. Run Tests

npm test

Run in watch mode:

npm run test:watch

4. Development Mode

npm run dev

This starts TypeScript compilation in watch mode.

5. Code Quality

Format code:

npm run format

Lint code:

npm run lint

💡 Usage Examples

Example 1: Verify a Legal Document

import fs from 'fs';
import { verifyPDF } from 'docshield';

const contractBuffer = fs.readFileSync('contract.pdf');
const result = await verifyPDF(contractBuffer);

if (result.forensicSummary.riskLevel === 'Low') {
  console.log('✅ Contract appears authentic');
} else {
  console.log(`⚠️ Risk Level: ${result.forensicSummary.riskLevel}`);
  console.log(`Detected Issues: ${result.detectedIssues.join(', ')}`);
}

Example 2: Verify an Insurance Claim Image

import fs from 'fs';
import { verifyImage } from 'docshield';

const claimImageBuffer = fs.readFileSync('accident-photo.jpg');
const result = await verifyImage(claimImageBuffer);

console.log(`Authenticity Score: ${result.confidenceScore}%`);
console.log(`AI Generation Risk: ${(result.aiGeneratedProbability * 100).toFixed(1)}%`);

Example 3: Express.js Backend Integration

import express from 'express';
import fileUpload from 'express-fileupload';
import { verifyDocument } from 'docshield';

const app = express();
app.use(fileUpload());

app.post('/api/verify', async (req, res) => {
  try {
    if (!req.files?.document) {
      return res.status(400).json({ error: 'No file uploaded' });
    }

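    // express-fileupload yields a single file, or an array when the field repeats
    const uploaded = req.files.document;
    const file = Array.isArray(uploaded) ? uploaded[0] : uploaded;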
    const result = await verifyDocument({
      buffer: file.data,
      type: req.body.type || 'pdf'
    });

    res.json({
      authenticity: result.confidenceScore,
      riskLevel: result.forensicSummary.riskLevel,
      issues: result.detectedIssues,
      hash: result.evidenceHash
    });
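  } catch (error) {
    // `error` is `unknown` under strict TypeScript, so narrow it before reading `.message`
    const message = error instanceof Error ? error.message : 'Verification failed';
    res.status(500).json({ error: message });
  }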
});

app.listen(3000, () => console.log('Server running on :3000'));

Example 4: Advanced Configuration

import fs from 'fs';
import { verifyDocument } from 'docshield';

const pdfBuffer = fs.readFileSync('document.pdf');

const result = await verifyDocument(
  { buffer: pdfBuffer, type: 'pdf' },
  {
    verbose: true,
    enableDeepfakeDetection: true,
    enableWatermarkAnalysis: true,
    ocrThreshold: 0.75,
    weights: {
      metadataWeight: 0.3,  // Increase metadata importance
      ocrWeight: 0.25,
      watermarkWeight: 0.15,
      modelWeight: 0.3      // Reduce ML dependence
    }
  }
);

📝 License

MIT License - See LICENSE file for details

🤝 Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Submit a pull request

📞 Support

For issues & feature requests: GitHub Issues


📊 Summary

docshield provides:

  • ✨ Forensic-grade probabilistic detection
  • 🧠 Multi-signal analysis (metadata, OCR, watermark, ML)
  • 📋 Detailed explainable results
  • 🔐 Evidence integrity (SHA-256 hashing)
  • 🌍 Node.js + Browser support
  • 🛡️ Production-ready TypeScript

Happy Verifying! 🛡️