
AI Validator

AI Response Validator - Automated accuracy checking, hallucination prevention, and confidence scoring for AI responses.

🎯 Purpose

AI Validator helps you ensure the quality and reliability of AI-generated responses by:

  • LLM-as-Judge Context Validation - Semantic accuracy checking using OpenAI/Claude
  • Developer Mode - Strict code grounding validation for technical queries
  • Automated Accuracy Checking - Verify AI responses against source documents
  • Hallucination Prevention - Detect when AI invents information not in sources
  • Confidence Scoring - Get reliability scores for every response
  • Query Classification - Skip validation for greetings, typos, and small talk
  • Multi-LLM Support - Works with OpenAI and Claude

Perfect for RAG systems, knowledge bases, codebase Q&A, and any application where AI response quality matters.
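
To make the query-classification feature concrete, here is a minimal sketch of the idea: short greetings and small talk are flagged so that expensive validation can be skipped for them. The categories and patterns below are illustrative assumptions, not the library's actual implementation:

```typescript
// Hypothetical sketch of query classification (not the package's own rules):
// greetings and trivial inputs are flagged so validation can be skipped.
type QueryType = "greeting" | "small_talk" | "question";

function classifyQuery(query: string): QueryType {
  const q = query.trim().toLowerCase();
  // Common salutations: skip validation entirely.
  if (/^(hi|hello|hey|good (morning|afternoon|evening))\b/.test(q)) {
    return "greeting";
  }
  // Very short, non-interrogative inputs are treated as small talk.
  if (q.length < 15 && !q.includes("?")) {
    return "small_talk";
  }
  return "question";
}

console.log(classifyQuery("Hello!"));                    // greeting
console.log(classifyQuery("What is machine learning?")); // question
```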

🚀 Quick Start

Installation

npm install @vezlo/ai-validator

Or install globally for CLI access:

npm install -g @vezlo/ai-validator

For Local Development/Testing

# Clone the repository
git clone https://github.com/vezlo/ai-validator.git
cd ai-validator

# Install dependencies
npm install

# Build the project
npm run build

# Run the test CLI
npm test

💻 Usage

1. CLI Testing (Interactive)

Test the validator interactively without writing code:

# Using npx (no installation required)
npx vezlo-validator-test

# Or if installed globally
vezlo-validator-test

The CLI will guide you through:

  • Selecting LLM provider (OpenAI or Claude)
  • Entering API keys
  • Choosing models (any OpenAI or Claude model)
  • Configuring validation settings
  • Testing with your own queries and responses
  • Entering sources as plain text (no JSON required)

2. Code Usage (Programmatic)

Basic Example

import { AIValidator } from '@vezlo/ai-validator';

// Initialize with your API key and provider
const validator = new AIValidator({
  openaiApiKey: 'sk-your-openai-key',  // Your OpenAI API key
  llmProvider: 'openai'                 // 'openai' or 'claude'
});

// Validate a response
const validation = await validator.validate({
  query: "What is machine learning?",
  response: "Machine learning is a subset of AI that focuses on algorithms.",
  sources: [
    {
      content: "Machine learning is a subset of artificial intelligence that focuses on algorithms and statistical models.",
      title: "ML Guide",
      url: "https://example.com/ml-guide"
    }
  ]
});

// Check results
console.log(`Confidence: ${(validation.confidence * 100).toFixed(1)}%`);
console.log(`Valid: ${validation.valid}`);
console.log(`Accuracy: ${validation.accuracy.verified ? 'Verified' : 'Not verified'}`);
console.log(`Hallucination Risk: ${(validation.hallucination.risk * 100).toFixed(1)}%`);
console.log(`Warnings: ${validation.warnings.join(', ')}`);

Advanced Configuration

import { AIValidator } from '@vezlo/ai-validator';

const validator = new AIValidator({
  // API Keys (at least one required)
  openaiApiKey: 'sk-your-openai-key',
  claudeApiKey: 'sk-ant-your-claude-key',
  
  // LLM Provider (required)
  llmProvider: 'openai', // 'openai' or 'claude'
  
  // Model Selection (optional)
  openaiModel: 'gpt-4o-mini',  // lighter OpenAI model for the LLM judge
  claudeModel: 'claude-3-haiku-20240307',  // lighter Claude model for the LLM judge
  
  // Validation Settings (optional)
  confidenceThreshold: 0.7,             // 0.0 - 1.0 (default: 0.7)
  enableQueryClassification: true,      // Skip validation for greetings/typos
  enableContextValidation: true,        // Context relevance validation (default: true)
  useLLMJudge: true,                    // Use LLM-as-Judge for context (default: false)
  developerMode: false,                 // Strict code grounding mode (default: false)
  enableAccuracyCheck: false,           // LLM-based accuracy checking (default: false)
  enableHallucinationDetection: false   // LLM-based hallucination detection (default: false)
});

Integration with RAG Systems

// Example with a RAG system
const ragResponse = await yourRAGSystem.query(userQuestion);
const sources = await yourRAGSystem.getSources(userQuestion);

const validation = await validator.validate({
  query: userQuestion,
  response: ragResponse.content,
  sources: sources.map(s => ({
    content: s.text,
    title: s.title,
    url: s.url
  }))
});

if (validation.valid) {
  // Show response to user
  return ragResponse.content;
} else {
  // Handle low confidence response
  console.warn('Low confidence response:', validation.warnings);
  return "I'm not confident about this answer. Please consult additional sources.";
}

📊 Validation Results

interface ValidationResult {
  confidence: number;        // 0.0 - 1.0
  valid: boolean;            // true if confidence >= threshold
  accuracy: {
    verified: boolean;
    verification_rate: number;
    reason?: string;
  };
  context: {
    source_relevance: number;
    source_usage_rate: number;
    valid: boolean;
  };
  hallucination: {
    detected: boolean;
    risk: number;
    hallucinated_parts?: string[];
  };
  warnings: string[];
  query_type?: string;       // 'greeting', 'question', etc.
  skip_validation?: boolean; // true for greetings/typos
}
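
A typical consumer of this shape decides whether to serve, flag, or reject a response. The helper below is a sketch of that pattern; the `routeResponse` name and the 0.5 review cutoff are illustrative choices, not part of the package's API:

```typescript
// Route a response based on its ValidationResult (reduced to the fields used).
// The 0.5 "review" cutoff is an illustrative choice, not part of the API.
interface ValidationResultLite {
  confidence: number;
  valid: boolean;            // true if confidence >= threshold
  skip_validation?: boolean; // true for greetings/typos
  warnings: string[];
}

type Action = "serve" | "review" | "reject";

function routeResponse(result: ValidationResultLite): Action {
  if (result.skip_validation) return "serve"; // greetings, small talk
  if (result.valid) return "serve";
  // Not valid: borderline confidence goes to human review,
  // clearly low confidence is rejected outright.
  return result.confidence >= 0.5 ? "review" : "reject";
}

console.log(routeResponse({ confidence: 0.92, valid: true, warnings: [] }));  // serve
console.log(routeResponse({ confidence: 0.35, valid: false, warnings: [] })); // reject
```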

🔧 Configuration

Configuration Options

All configuration is done in code when initializing the validator:

interface AIValidatorConfig {
  // API Keys (at least one required)
  openaiApiKey?: string;      // Your OpenAI API key
  claudeApiKey?: string;       // Your Claude API key
  
  // Provider (required)
  llmProvider: 'openai' | 'claude';
  
  // Models (optional - specify any valid model from the chosen provider)
  openaiModel?: string;        // Default: 'gpt-4o'
  claudeModel?: string;        // Default: 'claude-sonnet-4-5-20250929'
  
  // Validation Settings (optional)
  confidenceThreshold?: number;           // Default: 0.7
  enableQueryClassification?: boolean;    // Default: true
  enableAccuracyCheck?: boolean;         // Default: true
  enableHallucinationDetection?: boolean; // Default: true
}
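
The "at least one API key" and provider constraints can be checked before constructing the validator. The guard below sketches that check; the function name and error messages are illustrative, not the library's own:

```typescript
// Reduced config shape with just the required/constrained fields.
interface AIValidatorConfigLite {
  openaiApiKey?: string;
  claudeApiKey?: string;
  llmProvider: "openai" | "claude";
}

// Throws if the config cannot work: no key at all, or a provider
// selected without a matching key. Messages are illustrative.
function assertUsableConfig(config: AIValidatorConfigLite): void {
  if (!config.openaiApiKey && !config.claudeApiKey) {
    throw new Error("At least one API key is required");
  }
  if (config.llmProvider === "openai" && !config.openaiApiKey) {
    throw new Error("llmProvider is 'openai' but openaiApiKey is missing");
  }
  if (config.llmProvider === "claude" && !config.claudeApiKey) {
    throw new Error("llmProvider is 'claude' but claudeApiKey is missing");
  }
}

assertUsableConfig({ openaiApiKey: "sk-your-openai-key", llmProvider: "openai" });
console.log("config ok");
```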

Model Support

OpenAI Models: You can use any OpenAI chat model by specifying it in openaiModel. Common choices include:

  • gpt-4o (default, recommended)
  • gpt-4o-mini (faster, cheaper)
  • gpt-4 (previous flagship)
  • gpt-4-turbo
  • Or any other OpenAI chat completion model

Claude Models: You can use any Claude model by specifying it in claudeModel. Common choices include:

  • claude-sonnet-4-5-20250929 (default, Claude 4.5 Sonnet)
  • claude-opus-4-1-20250805 (Claude 4.1 Opus)
  • claude-3-7-sonnet-20250219 (Claude 3.7 Sonnet)
  • Or any other Claude model identifier

The validator will work with any model supported by the respective provider's API.

CLI Commands

# Interactive testing CLI
npx vezlo-validator-test

# Development commands
npm run build   # Build the project
npm run clean   # Clean build files
npm test        # Run the test CLI

🎯 Use Cases

1. RAG Systems

Validate responses against retrieved documents to ensure accuracy.

2. Customer Support Bots

Prevent incorrect information from reaching customers.

3. Knowledge Base Applications

Ensure AI answers are grounded in your documentation.

4. Content Generation

Validate AI-generated content against source materials.

5. Educational Applications

Ensure AI tutoring responses are accurate and helpful.

⚡ Performance

  • Validation Time: 2-5 seconds per response (depending on LLM provider)
  • Cost: Additional LLM API calls for validation
  • Accuracy: High accuracy for responses with good sources
  • Reliability: Graceful handling of edge cases

🔍 How It Works

  1. Query Classification - Identifies greetings, typos, and small talk (skips validation)
  2. Accuracy Checking - Uses LLM to verify facts against source documents
  3. Hallucination Detection - Identifies information not present in sources
  4. Context Validation - Ensures response relevance to the query
  5. Confidence Scoring - Combines all metrics into a single score
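
One plausible way the per-check metrics could be folded into a single confidence score is a weighted average. The package does not document its actual formula, so the weights below are purely illustrative assumptions:

```typescript
// Combine per-check metrics into one confidence score in [0, 1].
// The 0.4/0.3/0.3 weights are illustrative, not the package's formula.
function combineConfidence(metrics: {
  verificationRate: number;  // accuracy check, 0-1
  sourceRelevance: number;   // context validation, 0-1
  hallucinationRisk: number; // hallucination detection, 0-1 (higher = worse)
}): number {
  const score =
    0.4 * metrics.verificationRate +
    0.3 * metrics.sourceRelevance +
    0.3 * (1 - metrics.hallucinationRisk);
  // Clamp against floating-point drift.
  return Math.min(1, Math.max(0, score));
}

console.log(combineConfidence({
  verificationRate: 0.95,
  sourceRelevance: 0.9,
  hallucinationRisk: 0.05,
}));
```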

📝 Examples

High Confidence Response

{
  confidence: 0.92,
  valid: true,
  accuracy: { verified: true, verification_rate: 0.95 },
  hallucination: { detected: false, risk: 0.05 },
  warnings: []
}

Low Confidence Response

{
  confidence: 0.35,
  valid: false,
  accuracy: { verified: false, verification_rate: 0.2 },
  hallucination: { detected: true, risk: 0.8 },
  warnings: ["No sources provided - high hallucination risk"]
}

Skipped Validation (Greeting)

{
  confidence: 1.0,
  valid: true,
  query_type: "greeting",
  skip_validation: true,
  warnings: []
}

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📄 License

This project is dual-licensed:

  • Non-Commercial Use: Free under AGPL-3.0 license
  • Commercial Use: Requires a commercial license - contact us for details

See the LICENSE file for complete AGPL-3.0 license terms.


Status: ✅ Production Ready | Version: 1.2.0 | License: AGPL-3.0 | Node.js: 20+

Made with ❤️ by Vezlo