@codeslick/ai-detection

v1.1.0

Published 9 days ago

AI-generated code detection library with 164 signals (119 hallucinations + 13 heuristics + 32 LLM fingerprints)

Vulnerabilities: 0 High, 0 Medium, 0 Low

Maintainer: vitorlourenco

Keywords: ai-detection, code-analysis, security, static-analysis, hallucination-detection, llm-fingerprinting

@codeslick/ai-detection

A library for detecting AI-generated code. It provides code-smell heuristics, LLM fingerprint detectors, and a scoring function that combines their signals into a confidence level.

Version 1.1.0 - Complete Detection System

This release ships 45 detectors (13 heuristics and 32 LLM fingerprints) plus the scoring logic that combines them.

See IMPLEMENTATION_STATUS.md for implementation details.

Features

✅ Available Now (v1.1.0)

13 Code Smell Heuristics: Structural patterns that indicate AI-generated code
- 8 original heuristics (over-engineering, wrappers, comments, etc.)
- 5 "perfect code" patterns (AI code is "too perfect")
32 LLM Fingerprints: Behavioral patterns specific to GPT-4, GitHub Copilot, Claude Code, and Cursor
- GPT-4: 8 patterns (verbose docstrings, defensive null checks, etc.)
- GitHub Copilot: 7 patterns (boilerplate comments, generic names, etc.)
- Claude Code: 5 patterns (explanatory comments, custom errors, etc.)
- Cursor: 8 patterns (AI markers, diff comments, artifacts, etc.)
Scoring System: Confidence levels (HIGH/MEDIUM/LOW) and severity mapping
- Combines hallucinations (60%), heuristics (25%), LLM fingerprints (15%)
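
The 60/25/15 split can be read as a weighted average of three sub-scores. The sketch below illustrates that reading only; it is not the package's actual formula. The names `combineSignals` and `SubScores` are hypothetical, and it assumes each sub-score has already been normalized to 0.0-1.0 (how a raw hallucination count is normalized is not specified here).

```typescript
// Hypothetical sketch: NOT the library's implementation.
// Assumes each sub-score is already normalized to the range 0.0-1.0.
interface SubScores {
  hallucination: number; // share attributed to hallucination patterns
  heuristic: number;     // combined code-smell heuristic score
  fingerprint: number;   // combined LLM fingerprint score
}

// Category weights as stated in the README: 60% / 25% / 15%.
const CATEGORY_WEIGHTS: SubScores = {
  hallucination: 0.6,
  heuristic: 0.25,
  fingerprint: 0.15,
};

// Clamp a value into [0, 1] so out-of-range inputs cannot skew the result.
function clamp01(value: number): number {
  return Math.min(1, Math.max(0, value));
}

// Weighted average of the three signal categories.
function combineSignals(scores: SubScores): number {
  return (
    CATEGORY_WEIGHTS.hallucination * clamp01(scores.hallucination) +
    CATEGORY_WEIGHTS.heuristic * clamp01(scores.heuristic) +
    CATEGORY_WEIGHTS.fingerprint * clamp01(scores.fingerprint)
  );
}

const combined = combineSignals({ hallucination: 1, heuristic: 0.4, fingerprint: 0 });
console.log(combined.toFixed(2)); // 0.6 * 1 + 0.25 * 0.4 + 0.15 * 0 = "0.70"
```

Because hallucinations carry the largest weight, a file with strong heuristic and fingerprint signals but no hallucinations tops out at 0.40 under this reading.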

📋 Note on Hallucination Patterns

119 Hallucination Patterns: The language-specific API misuse patterns are maintained in the CodeSlick analyzers, not in this package.
This package provides the reusable detection primitives shared by CodeSlick and Endure.

Installation

npm install @codeslick/ai-detection

Usage

Basic Heuristic Detection

import {
  calculateHeuristicScores,
  detectOverEngineeredErrorHandling,
  detectZeroEdgeCases,
  isTestFile,
  type HeuristicScores
} from '@codeslick/ai-detection';

// Sample code to analyze (in practice, read it from a file)
const code = `
function processData(data) {
  result = data.append(value);  // Python-style method (hallucination)
  return result;
}
`;

const lines = code.split('\n');

// Calculate heuristic scores (0.0-1.0)
const scores: HeuristicScores = calculateHeuristicScores(lines);

console.log('AI Detection Scores:', scores);
// Example output shape (actual values depend on the input):
// { overEngineeredErrors: 0.2, unnecessaryWrappers: 0.0, ... }

LLM Fingerprint Detection

import {
  detectVerboseDocstrings,
  detectBoilerplateComments,
  detectDetailedExplanatoryComments,
  detectAICommandMarkers,
  type LLMFingerprintScores
} from '@codeslick/ai-detection';

const code = `
/**
 * This function processes the data by iterating through each element
 * and applying the transformation. It returns the result.
 * @param data - The input data
 * @returns The processed result
 */
function processData(data) {
  return data;  // One-line function with 5+ line docstring = GPT-4 fingerprint
}

// TODO: implement error handling
// your code here
`;

const lines = code.split('\n');

// Detect LLM fingerprints
const gpt4Verbose = detectVerboseDocstrings(lines);
const copilotBoilerplate = detectBoilerplateComments(lines);

console.log('GPT-4 Verbose Docstrings:', gpt4Verbose);  // 1.0 (detected)
console.log('Copilot Boilerplate:', copilotBoilerplate);  // 0.66 (2/3 threshold)
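
To make the "one-line function with 5+ line docstring" fingerprint concrete, here is a rough sketch of how such a check could work. It is not the library's implementation: `sketchVerboseDocstring` is a hypothetical name, and the sketch only handles JSDoc-style blocks placed directly above a `function` declaration whose body is a single line.

```typescript
// Hypothetical sketch: NOT the library's implementation.
// Flags a JSDoc block of 5+ lines that sits directly above a function
// whose body is a single line.
const MIN_DOC_LINES = 5;

function sketchVerboseDocstring(lines: string[]): boolean {
  for (let i = 0; i < lines.length; i++) {
    if (!lines[i].trim().startsWith("/**")) continue;

    // Find the line that closes the JSDoc block.
    let end = i;
    while (end < lines.length && !lines[end].includes("*/")) end++;
    if (end >= lines.length) return false; // unterminated comment

    const docLines = end - i + 1;
    const header = lines[end + 1] ?? "";  // expected: function header
    const closing = lines[end + 3] ?? ""; // expected: closing brace
    const isFunctionHeader = /^\s*(async\s+)?function\b.*\{\s*$/.test(header);

    if (docLines >= MIN_DOC_LINES && isFunctionHeader && closing.trim() === "}") {
      return true;
    }
    i = end; // continue scanning after this comment block
  }
  return false;
}

const docSample = [
  "/**",
  " * Processes the data by iterating through each element",
  " * and applying the transformation.",
  " * @param data - The input data",
  " * @returns The processed result",
  " */",
  "function processData(data) {",
  "  return data;",
  "}",
];
console.log(sketchVerboseDocstring(docSample)); // true
```

A line-based check like this is cheap but brittle; arrow functions, class methods, and differently formatted bodies would need extra cases.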

Complete AI Code Confidence Scoring

import {
  calculateHeuristicScores,
  calculateAICodeConfidence,
  detectVerboseDocstrings,
  detectBoilerplateComments,
  detectDetailedExplanatoryComments,
  detectAICommandMarkers,
  // ... import other LLM fingerprints as needed
  type DetectionResult,
  type HeuristicScores,
  type LLMFingerprintScores
} from '@codeslick/ai-detection';

const code = `/* your AI-generated code here */`;
const lines = code.split('\n');

// 1. Calculate heuristic scores
const heuristicScores: HeuristicScores = calculateHeuristicScores(lines);

// 2. Calculate LLM fingerprint scores (optional)
const llmScores: Partial<LLMFingerprintScores> = {
  // GPT-4 fingerprints
  verboseDocstrings: detectVerboseDocstrings(lines),
  // GitHub Copilot fingerprints
  boilerplateComments: detectBoilerplateComments(lines),
  // Claude Code fingerprints
  detailedExplanatoryComments: detectDetailedExplanatoryComments(lines),
  // Cursor fingerprints
  aiCommandMarkers: detectAICommandMarkers(lines),
  // ... add more as needed
};

// 3. Calculate overall confidence
// hallucinationCount would come from language-specific analyzers (not this package)
const hallucinationCount = 0; // Example: detected in your analyzer

const result: DetectionResult | null = calculateAICodeConfidence(
  hallucinationCount,
  heuristicScores,
  llmScores
);

if (result) {
  console.log('AI Code Detected!');
  console.log('Confidence:', result.confidence);  // HIGH, MEDIUM, or LOW
  console.log('Severity:', result.severity);      // CRITICAL, HIGH, or MEDIUM
  console.log('Hallucinations:', result.hallucinationPatterns);
  console.log('Combined Score:', result.heuristicScore);
}

Available Detectors

Heuristic Detectors (8 original patterns)

detectOverEngineeredErrorHandling(lines) - 4+ nested if/else in catch blocks
detectUnnecessaryWrappers(lines) - Single-line function wrappers
detectVerboseComments(lines) - Comments that duplicate code (>70% overlap)
detectMixedNamingConventions(lines) - camelCase + snake_case in same window
detectRedundantNullChecks(lines) - Multiple null checks on same variable
detectUnnecessaryAsync(lines) - async functions without await
detectGenericVariableOveruse(lines) - Generic names (data, result, temp) overused
detectInconsistentStringConcatenation(lines) - Mixed concatenation methods
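
As an illustration of how a detector in this list might work, the sketch below approximates the "async functions without await" check. It is not the library's implementation: `sketchUnnecessaryAsync` is a hypothetical name, it assumes the score is the fraction of offending functions, and it ignores comments, strings, and nested functions.

```typescript
// Hypothetical sketch: NOT the library's implementation.
// Returns the fraction of async functions whose body never uses `await`.
function sketchUnnecessaryAsync(lines: string[]): number {
  const source = lines.join("\n");
  // Matches `async function name(...) {` and `async (...) => {` openings.
  const asyncStart = /\basync\s+(?:function\b[^{]*|\([^)]*\)\s*=>\s*)\{/g;
  let total = 0;
  let withoutAwait = 0;

  let match: RegExpExecArray | null;
  while ((match = asyncStart.exec(source)) !== null) {
    // Walk forward from the opening brace to its matching closing brace.
    const bodyStart = match.index + match[0].length;
    let depth = 1;
    let i = bodyStart;
    while (i < source.length && depth > 0) {
      if (source[i] === "{") depth++;
      else if (source[i] === "}") depth--;
      i++;
    }
    const bodyEnd = depth === 0 ? i - 1 : i; // exclude the closing brace
    const body = source.slice(bodyStart, bodyEnd);

    total++;
    if (!/\bawait\b/.test(body)) withoutAwait++;
  }

  return total === 0 ? 0 : withoutAwait / total;
}

const asyncSample = [
  "async function loadConfig() {",
  "  return { retries: 3 };",
  "}",
  "async function fetchUser(id) {",
  "  const res = await fetch('/users/' + id);",
  "  return res.json();",
  "}",
];
console.log(sketchUnnecessaryAsync(asyncSample)); // 0.5 (one of two async functions never awaits)
```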

Perfect Code Heuristics (5 patterns)

detectZeroEdgeCases(lines) - Functions with no error handling (AI assumes happy path)
detectUniformIndentation(lines) - Perfectly aligned code (humans are messier)
detectTextbookVariableNames(lines) - Generic tutorial-style names
detectNoCommentsWithPerfectStructure(lines) - Clean code with zero comments
detectExcessiveParameterValidation(lines) - Validation in all functions (even private)

Utilities

isTestFile(filename?) - Checks if file is a test file
removeCommentsAndStrings(line, language) - Preprocesses code to avoid false positives

API Reference

`calculateHeuristicScores(lines: string[]): HeuristicScores`

Runs all 13 heuristic detectors and returns normalized scores (0.0-1.0).

Parameters:

lines: Array of code lines

Returns:

HeuristicScores object with all 13 heuristic scores

`isTestFile(filename?: string): boolean`

Checks if a filename indicates a test file.

Supported patterns:

.test., .spec., __tests__/
Test.java, _test.py, _test.go
test_*.py
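
The supported patterns above could be expressed as regular expressions along the following lines. This is an illustrative sketch, not the library's code: `looksLikeTestFile` and `TEST_FILE_PATTERNS` are hypothetical names, and returning `false` for a missing filename is an assumption.

```typescript
// Hypothetical sketch: NOT the library's implementation.
// One regular expression per pattern listed in the README.
const TEST_FILE_PATTERNS: RegExp[] = [
  /\.test\./,              // utils.test.ts
  /\.spec\./,              // utils.spec.js
  /__tests__\//,           // __tests__/utils.ts
  /Test\.java$/,           // ParserTest.java
  /_test\.(py|go)$/,       // parser_test.py, parser_test.go
  /(^|\/)test_[^/]*\.py$/, // test_parser.py
];

function looksLikeTestFile(filename?: string): boolean {
  // Assumption: a missing filename is treated as "not a test file".
  if (!filename) return false;
  // Normalize Windows separators so the directory patterns still match.
  const normalized = filename.replace(/\\/g, "/");
  return TEST_FILE_PATTERNS.some((pattern) => pattern.test(normalized));
}

console.log(looksLikeTestFile("src/utils.test.ts")); // true
console.log(looksLikeTestFile("src/utils.ts"));      // false
```

A check like this is typically run before scoring so that test fixtures, which often contain deliberately odd code, can be skipped or scored separately.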

Weights and Scoring

Heuristics are weighted to sum to 1.0:

const HEURISTIC_WEIGHTS = {
  overEngineeredErrors: 0.10,
  unnecessaryWrappers: 0.08,
  verboseComments: 0.07,
  mixedNaming: 0.09,
  redundantNullChecks: 0.10,
  unnecessaryAsync: 0.08,
  genericVariables: 0.07,
  inconsistentStrings: 0.09,
  // Perfect code heuristics
  zeroEdgeCases: 0.08,
  uniformIndentation: 0.07,
  textbookVariableNames: 0.07,
  noCommentsWithPerfectStructure: 0.05,
  excessiveParameterValidation: 0.05,
};
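
Since the weights sum to 1.0, one natural way to use them is a weighted sum of the per-heuristic scores, which stays in the 0.0-1.0 range. The sketch below shows that arithmetic; `weightedHeuristicScore` is a hypothetical helper, and whether the library combines scores exactly this way is an assumption. Missing scores are treated as 0.

```typescript
// Weights copied from the README; they sum to 1.0.
const HEURISTIC_WEIGHTS: Record<string, number> = {
  overEngineeredErrors: 0.10,
  unnecessaryWrappers: 0.08,
  verboseComments: 0.07,
  mixedNaming: 0.09,
  redundantNullChecks: 0.10,
  unnecessaryAsync: 0.08,
  genericVariables: 0.07,
  inconsistentStrings: 0.09,
  // Perfect code heuristics
  zeroEdgeCases: 0.08,
  uniformIndentation: 0.07,
  textbookVariableNames: 0.07,
  noCommentsWithPerfectStructure: 0.05,
  excessiveParameterValidation: 0.05,
};

// Hypothetical helper: NOT part of the library's documented API.
// Computes the weighted sum of per-heuristic scores (each 0.0-1.0).
// Heuristics missing from `scores` contribute 0 (an assumption of this sketch).
function weightedHeuristicScore(scores: Record<string, number | undefined>): number {
  let total = 0;
  for (const name of Object.keys(HEURISTIC_WEIGHTS)) {
    total += HEURISTIC_WEIGHTS[name] * (scores[name] ?? 0);
  }
  return total;
}

const weighted = weightedHeuristicScore({ redundantNullChecks: 1, zeroEdgeCases: 0.5 });
console.log(weighted.toFixed(2)); // 0.10 * 1 + 0.08 * 0.5 = "0.14"
```

Under this reading, no single heuristic can contribute more than 0.10 to the combined score, so a high value requires several heuristics to fire together.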

TypeScript Support

Full TypeScript support with exported types:

import type { DetectionResult, HeuristicScores, LLMFingerprintScores } from '@codeslick/ai-detection';

License

Related Projects

CodeSlick: Security-first code analysis platform (uses this library)
Endure: Architecture archaeology tool (will use this library)

Development

# Install dependencies
npm install

# Build
npm run build

# Run tests
npm test

# Type check
npm run type-check
