ai-captcha

v0.1.0

Published

3 months ago

Proof of Intelligence — verify intelligence behind API calls. Filter scripts, detect AI, profile callers.


captcha ai llm proof-of-intelligence bot-detection agent-verification anti-script behavioral-analysis constraint-verification

AI CAPTCHA

Proof of Intelligence — verify intelligence behind API calls.

Traditional CAPTCHA: "Prove you're human." AI CAPTCHA: "Prove there's intelligence behind the call — and tell me what kind."

A zero-dependency TypeScript library that generates challenges, verifies answers, and builds behavioral profiles to distinguish scripts, humans, and AI agents.

┌────────────────────────────────────────────────────────────┐
│                    Who's calling your API?                  │
│                                                            │
│   Script          Human             AI Agent               │
│   ──────          ─────             ────────               │
│   curl loop       slow, imperfect   fast, near-perfect     │
│   fixed output    variable timing   consistent timing      │
│   no constraints  misses details    nails every constraint │
│                                                            │
│   ✗ BLOCKED       ✓ DETECTED        ✓ DETECTED             │
└────────────────────────────────────────────────────────────┘

What It Does

Blocks scripts — Challenges require language comprehension; curl loops can't answer
Scores AI likelihood — Behavioral analysis over multiple challenges profiles the caller
Zero dependencies — Pure TypeScript, runs anywhere (Node, Deno, Bun, Cloudflare Workers)

Quick Start

npm install ai-captcha

import { generateChallenge, ruleVerify, extractSignal, analyzeProfile } from "ai-captcha";

// 1. Generate a challenge
const challenge = generateChallenge();
// → { type: "constraint", prompt: "Write exactly 8 words about the ocean.", verify_data: {...} }

// 2. Send `prompt` to the caller (keep `verify_data` server-side).
//    `answer` is the caller's reply; `responseTimeMs` is the elapsed time you measured.

// 3. Verify + extract signal
const result = ruleVerify(challenge.type, answer, challenge.verify_data);
const signal = extractSignal(challenge.type, answer, challenge.verify_data, responseTimeMs);

// 4. Accumulate signals for this caller in `signals` (storage is yours), then analyze
const profile = analyzeProfile(signals);
// → { intelligence: 0.95, ai_likelihood: 0.82, confidence: 0.7, signals: {...} }

Four Challenge Types

Topic

"Write one sentence about the ocean."

Random topic from 200+ words. Tests basic language production.

Paraphrase

"Say this in different words: 'The weather is beautiful today'"

Must rephrase without copying (>80% similarity = fail). Tests comprehension.

Keyword

"Write a sentence that includes both 'moon' and 'river'."

Must include both words coherently. Random from 100+ pairs.

Constraint (the AI differentiator)

"Write exactly 8 words about the ocean."

"Write 2 sentences about technology. Start the first with 'In' and the second with 'Yet'."

"Write one sentence about the moon that ends with a question mark."

These are format constraints that LLMs satisfy more than 95% of the time but that humans frequently miss. That gap is the key signal for distinguishing AI callers from human ones.

Architecture

                  ┌──────────────┐
  API Call ──────►│  Your Server │
                  └──────┬───────┘
                         │
              ┌──────────▼──────────┐
              │  Layer 1: Rules     │  Zero-cost, in-process
              │  Garbage filter     │  Blocks scripts, gibberish
              └──────────┬──────────┘
                         │
              ┌──────────▼──────────┐
              │  Layer 2: LLM       │  Optional Cloudflare Worker
              │  Semantic check     │  Judges language understanding
              └──────────┬──────────┘
                         │
              ┌──────────▼──────────┐
              │  Layer 3: Constraint│  In-process, zero-cost
              │  Format compliance  │  Measures constraint adherence
              └──────────┬──────────┘
                         │
              ┌──────────▼──────────┐
              │  Layer 4: Behavioral│  In-process, stateless
              │  Profile analysis   │  Timing + consistency → AI score
              └─────────────────────┘

Layer 1 (Rule Verification) — Built-in garbage filter. Catches empty answers, gibberish, direct copies. Intentionally lenient.

Layer 2 (LLM Verification) — Optional Cloudflare Worker that sends answers to an LLM judge. Adds repeat detection and suspicious tracking.

Layer 3 (Constraint Verification) — Built-in. Checks exact word counts, sentence starters, ending punctuation, keyword presence. Returns a compliance score (0-1).

Layer 4 (Behavioral Analysis) — Built-in, stateless. Analyzes accumulated signals to build a caller profile. You store the signals; the library does the math.
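The Layer 3 checks described above (exact word count, sentence starters, ending punctuation) can be sketched as a function that returns the fraction of checks passed. This is an illustration only: the `FormatConstraint` shape, the naive sentence splitter, and the equal weighting of checks are assumptions, not the library's actual `verify_data` layout or scoring.

```typescript
// Hypothetical constraint description; the real verify_data shape is not documented here.
interface FormatConstraint {
  exactWords?: number;          // required total word count
  sentenceStarters?: string[];  // required first word of each sentence, in order
  endsWith?: string;            // required final character, e.g. "?"
}

function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Naive splitter: a sentence is a run of text up to ., ! or ?.
function splitSentences(text: string): string[] {
  return (text.match(/[^.!?]+[.!?]*/g) ?? [])
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Returns the share of checks that passed, from 0 to 1.
function complianceScore(answer: string, constraint: FormatConstraint): number {
  const checks: boolean[] = [];
  if (constraint.exactWords !== undefined) {
    checks.push(countWords(answer) === constraint.exactWords);
  }
  if (constraint.sentenceStarters !== undefined) {
    const sentences = splitSentences(answer);
    const starters = constraint.sentenceStarters;
    checks.push(sentences.length === starters.length);
    starters.forEach((starter, i) => {
      const sentence = sentences[i] ?? "";
      // Compare the first word, ignoring attached punctuation such as "In,".
      const firstWord = sentence.split(/\s+/)[0].replace(/[^A-Za-z']/g, "");
      checks.push(firstWord === starter);
    });
  }
  if (constraint.endsWith !== undefined) {
    checks.push(answer.trim().endsWith(constraint.endsWith));
  }
  // No constraints means nothing to violate.
  return checks.length === 0 ? 1 : checks.filter(Boolean).length / checks.length;
}
```

A partial score, rather than a pass/fail flag, keeps near misses visible to the behavioral layer, which is where the human/AI difference is expected to show up.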

Behavioral Analysis

After 5+ challenges, analyzeProfile() returns a caller profile:

{
  intelligence: 0.95,      // P(has language ability) — filters scripts
  ai_likelihood: 0.82,     // P(is AI vs human) — behavioral signals
  confidence: 0.73,        // reliability (reaches 1.0 at 15+ samples)
  signals: {
    timing_regularity: 0.90,        // AI: consistent timing
    timing_word_correlation: 0.85,  // AI: time ∝ output length
    constraint_compliance: 0.95,    // AI: near-perfect format adherence
    quality_consistency: 0.92       // AI: stable performance
  }
}
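The confidence values quoted in this README (0.07 after one answer, 0.67 after ten, 1.0 at 15+ samples) are consistent with a linear ramp over the sample count. The sketch below shows that reading; the formula and the name `estimateConfidence` are inferred from those numbers and are not taken from the library source.

```typescript
// Assumption: confidence grows linearly and saturates at 15 samples.
const FULL_CONFIDENCE_SAMPLES = 15;

function estimateConfidence(sampleCount: number): number {
  const nonNegative = Math.max(0, sampleCount);
  return Math.min(1, nonNegative / FULL_CONFIDENCE_SAMPLES);
}
```

Under this reading, a server that gates on the profile might wait until confidence passes a threshold of its own choosing before acting on `ai_likelihood`.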

How Each Signal Works

| Signal | What it measures | AI | Human | Script |
|--------|-----------------|-----|-------|--------|
| timing_regularity | Response time variance | Low CV (0.1-0.3) | High CV (0.5-1.5) | Near-zero CV |
| timing_word_correlation | Time vs word count correlation | Strong positive | Weak/none | None |
| constraint_compliance | Format constraint adherence | >95% | 50-80% | 0% |
| quality_consistency | Performance stability | Very stable | Variable | N/A |
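The two timing signals in the table rest on standard statistics: the coefficient of variation (CV, standard deviation divided by mean) and a correlation between response time and word count. The sketch below computes both; using the population standard deviation and Pearson correlation is an assumption, since the README does not say which estimators the library uses.

```typescript
function mean(values: number[]): number {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Coefficient of variation: population standard deviation / mean.
function coefficientOfVariation(values: number[]): number {
  if (values.length < 2) return 0;
  const avg = mean(values);
  if (avg === 0) return 0;
  const variance = mean(values.map((v) => (v - avg) ** 2));
  return Math.sqrt(variance) / avg;
}

// Pearson correlation between two equally long series, in [-1, 1].
// Returns 0 when either series is constant or too short to correlate.
function pearson(xs: number[], ys: number[]): number {
  if (xs.length !== ys.length || xs.length < 2) return 0;
  const mx = mean(xs);
  const my = mean(ys);
  let cov = 0;
  let varX = 0;
  let varY = 0;
  for (let i = 0; i < xs.length; i++) {
    const dx = xs[i] - mx;
    const dy = ys[i] - my;
    cov += dx * dy;
    varX += dx * dx;
    varY += dy * dy;
  }
  if (varX === 0 || varY === 0) return 0;
  return cov / Math.sqrt(varX * varY);
}

// Example inputs: response times in ms and the word count of each answer.
const responseTimesMs = [1200, 1500, 1350, 2100];
const wordCounts = [8, 12, 10, 20];
const timingCv = coefficientOfVariation(responseTimesMs);
const timeVsWords = pearson(wordCounts, responseTimesMs);
```

A CV near zero matches the "Script" column, and a strong positive correlation matches the "AI" column, where generation time scales with output length.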

API Reference

`generateChallenge(options?): Challenge`

Generate a random challenge. Optionally restrict types:

generateChallenge()                                    // any type
generateChallenge({ types: ["constraint"] })           // constraint only
generateChallenge({ types: ["topic", "constraint"] })  // mix

`ruleVerify(type, answer, verifyData): VerifyResult`

Verify an answer. Returns { passed, reason?, constraint_score? }.

`extractSignal(type, answer, verifyData, responseTimeMs): ChallengeSignal`

Extract a behavioral signal from a challenge-response pair. Store these for analysis.

`analyzeProfile(signals): CallerProfile`

Analyze accumulated signals into a behavioral profile. Stateless — you manage storage.
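Because `analyzeProfile` is stateless, the caller has to keep each caller's signals somewhere. A minimal in-memory sketch follows; the class name `SignalStore`, the per-caller cap, and the generic signal type `S` are all hypothetical, and a production server would more likely use a database or cache.

```typescript
// Keeps the most recent signals per caller, bounded to avoid unbounded growth.
class SignalStore<S> {
  private byCaller = new Map<string, S[]>();
  private maxPerCaller: number;

  constructor(maxPerCaller = 50) {
    this.maxPerCaller = maxPerCaller;
  }

  // Appends a signal and returns a copy of the caller's current history.
  add(callerId: string, signal: S): S[] {
    const history = this.byCaller.get(callerId) ?? [];
    history.push(signal);
    if (history.length > this.maxPerCaller) {
      history.shift(); // drop the oldest signal
    }
    this.byCaller.set(callerId, history);
    return history.slice();
  }

  get(callerId: string): S[] {
    return (this.byCaller.get(callerId) ?? []).slice();
  }
}
```

The array returned by `add` is what would be passed to `analyzeProfile` after each answered challenge.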

`stringSimilarity(a, b): number`

Dice coefficient on character bigrams. Returns 0-1.
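For readers unfamiliar with the measure, here is a sketch of a Dice coefficient over character bigrams: twice the number of shared bigrams divided by the total number of bigrams in both strings. The lowercasing and whitespace collapsing are assumptions, as is the link to the paraphrase rule; the library's own `stringSimilarity` may normalize differently.

```typescript
// Counts overlapping two-character windows in the normalized text.
function bigramCounts(text: string): Map<string, number> {
  const normalized = text.toLowerCase().replace(/\s+/g, " ").trim();
  const counts = new Map<string, number>();
  for (let i = 0; i < normalized.length - 1; i++) {
    const pair = normalized.slice(i, i + 2);
    counts.set(pair, (counts.get(pair) ?? 0) + 1);
  }
  return counts;
}

function diceSimilarity(a: string, b: string): number {
  const left = bigramCounts(a);
  const right = bigramCounts(b);
  let leftTotal = 0;
  let rightTotal = 0;
  let shared = 0;
  left.forEach((count, pair) => {
    leftTotal += count;
    shared += Math.min(count, right.get(pair) ?? 0);
  });
  right.forEach((count) => {
    rightTotal += count;
  });
  if (leftTotal + rightTotal === 0) {
    // Neither string has a bigram (length < 2): fall back to exact match.
    return a.trim().toLowerCase() === b.trim().toLowerCase() ? 1 : 0;
  }
  return (2 * shared) / (leftTotal + rightTotal);
}

// Assumed reading of the paraphrase rule: similarity above 0.8 counts as copying.
function looksCopied(original: string, answer: string): boolean {
  return diceSimilarity(original, answer) > 0.8;
}
```

Bigram overlap is cheap and tolerant of small edits, which suits a copy detector; it does not judge meaning, which is left to the optional LLM layer.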

Pools

import { TOPICS, SENTENCES, KEYWORD_PAIRS, SENTENCE_STARTERS, EXACT_WORD_COUNTS } from "ai-captcha";

Extend or replace the built-in pools.

Integration Pattern

Call 1: POST /api/work  { caller_id: "agent-1" }
  ← 403 { error: "CHALLENGE_REQUIRED", challenge: { id, prompt, expires_in } }

Call 2: POST /api/work  { caller_id: "agent-1", challenge_id, challenge_answer }
  ← 200 { success: true, profile: { intelligence: 0.9, ai_likelihood: 0.5, confidence: 0.07 }, next_challenge: {...} }

Call 10+: (profile becomes reliable)
  ← 200 { profile: { intelligence: 0.95, ai_likelihood: 0.82, confidence: 0.67 } }

See examples/express.ts for a complete working server.

Why Not Just Rate Limiting?

Rate limiting answers "how often?" — AI CAPTCHA answers "who's calling?"

| Defense | Blocks scripts | Detects AI | Detects humans |
|---------|:---:|:---:|:---:|
| Rate limiting | Partial | No | No |
| Traditional CAPTCHA | Yes | No | No |
| AI CAPTCHA | Yes | Yes | Yes |

Error Codes

| Code | Meaning |
|------|---------|
| CHALLENGE_REQUIRED | No challenge provided; the response includes one to answer |
| CHALLENGE_INVALID | Challenge ID not found or belongs to another caller |
| CHALLENGE_EXPIRED | Challenge TTL exceeded (default: 5 min) |
| CHALLENGE_FAILED | Verification failed; the response includes a new challenge |
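The first three codes can be produced by a small server-side store that tracks issued challenges. The sketch below is one possible arrangement, not the code in examples/express.ts: the class `ChallengeStore`, its method names, and the single-use behavior are assumptions. `CHALLENGE_FAILED` is left out because it comes from answer verification rather than bookkeeping.

```typescript
type ChallengeError = "CHALLENGE_REQUIRED" | "CHALLENGE_INVALID" | "CHALLENGE_EXPIRED";

interface PendingChallenge {
  callerId: string;
  issuedAt: number; // ms timestamp
}

class ChallengeStore {
  private pending = new Map<string, PendingChallenge>();
  private ttlMs: number;

  // Default TTL of 5 minutes, matching the table above.
  constructor(ttlMs = 5 * 60 * 1000) {
    this.ttlMs = ttlMs;
  }

  issue(challengeId: string, callerId: string, now: number): void {
    this.pending.set(challengeId, { callerId, issuedAt: now });
  }

  // Returns an error code, or null when the answer may go on to verification.
  check(challengeId: string | undefined, callerId: string, now: number): ChallengeError | null {
    if (challengeId === undefined) return "CHALLENGE_REQUIRED";
    const entry = this.pending.get(challengeId);
    if (entry === undefined || entry.callerId !== callerId) return "CHALLENGE_INVALID";
    this.pending.delete(challengeId); // assumption: each challenge is single-use
    if (now - entry.issuedAt > this.ttlMs) return "CHALLENGE_EXPIRED";
    return null;
  }
}
```

Passing `now` in explicitly keeps the store easy to test; a server would typically pass `Date.now()`.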

License

MIT — ClawPlaza
