@aloth/cred1
v2026.6.16
CRED-1 domain credibility dataset — JavaScript library and CLI for checking news source reliability
CRED-1: Open Domain Credibility Dataset
CRED-1 is an open, reproducible domain-level credibility dataset combining multiple openly-licensed source lists with computed enrichment signals. It provides credibility scores for 2,672 domains known to publish mis/disinformation, conspiracy theories, or other unreliable content.
🎓 Presented at ACM WebSci 2026 (Braunschweig). Landing page: aloth.github.io/agentic-ai-information-integrity/cred-1. First production integration: Trackless Links for iOS and macOS, with free codes for readers and attendees: gutscheinhub.de/ratgeber/trackless-links-cred-1-acm-websci-2026.
Paper: A. Loth, M. Kappes, and M.-O. Pahl, "CRED-1: An Open Multi-Signal Domain Credibility Dataset for Automated Pre-Bunking of Online Misinformation," Preprint, 2026. doi:10.2139/ssrn.6448466
Install
# CLI (global)
npm install -g @aloth/cred1
# Library (project dependency)
npm install @aloth/cred1
# Or try without installing
npx @aloth/cred1 check infowars.com
CLI Usage
# Single domain lookup
cred1 check infowars.com
# 🔴 infowars.com
# Score: 0.073 / 1.000
# Category: conspiracy
# Level: low
# Sources: 2
# Domain not in dataset
cred1 check nytimes.com
# ⚪ nytimes.com
# Not found in CRED-1 dataset — treat as unknown/neutral
# Batch processing (stdin)
echo -e "rt.com\ninfowars.com\nnytimes.com" | cred1 batch
# JSON output
cred1 check breitbart.com --json
# Search
cred1 search "news"
cred1 search "\.ru$"
# Statistics
cred1 stats
cred1 categories
Domain normalization is automatic: https://www.infowars.com/politics/ resolves to infowars.com.
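The normalization described above can be sketched in a few lines. This is only an illustration of the idea (strip the protocol, a leading www., and any path); `normalizeDomain` is a hypothetical name, and the package's real logic may handle more cases.

```javascript
// Hypothetical helper for illustration; not the package's actual implementation.
function normalizeDomain(input) {
  const trimmed = String(input).trim().toLowerCase();
  // The URL parser needs a scheme, so prepend one for bare hosts like "rt.com/news".
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//.test(trimmed);
  const { hostname } = new URL(hasScheme ? trimmed : `http://${trimmed}`);
  // Strip one leading "www." label; other subdomains are left untouched.
  return hostname.replace(/^www\./, '');
}

console.log(normalizeDomain('https://www.infowars.com/politics/')); // infowars.com
```

Normalizing before lookup means the same dataset key is hit whether the input is a full article URL or a bare host.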
MCP Server (Claude Desktop / Cursor / Windsurf)
CRED-1 ships an MCP server so AI assistants can check domain credibility directly.
Tools exposed
| Tool | Description |
|------|-------------|
| check_domain | Check a single domain (score, category, level, metadata) |
| batch_check | Check up to 100 domains at once |
| search_domains | Search domains by substring or regex pattern |
| get_stats | Dataset statistics (total, per-category counts, version) |
| get_categories | Category taxonomy with descriptions and score ranges |
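Because batch_check accepts up to 100 domains per call, a client with a longer list has to split it first. A minimal sketch of that splitting step follows; `chunkDomains` and `MAX_BATCH_SIZE` are hypothetical names, and the MCP call itself is left out.

```javascript
// Assumption: the 100-domain ceiling is taken from the batch_check row in the table above.
const MAX_BATCH_SIZE = 100;

// Hypothetical helper: splits a domain list into batches no larger than `size`.
function chunkDomains(domains, size = MAX_BATCH_SIZE) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const chunks = [];
  for (let i = 0; i < domains.length; i += size) {
    chunks.push(domains.slice(i, i + size));
  }
  return chunks;
}

const sample = Array.from({ length: 250 }, (_, i) => `site${i}.example`);
console.log(chunkDomains(sample).map((c) => c.length)); // [ 100, 100, 50 ]
```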
Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
"mcpServers": {
"cred1": {
"command": "npx",
"args": ["-y", "@aloth/cred1", "--mcp"]
}
}
}
Note: --mcp is handled by the CLI wrapper. You can also run the dedicated binary directly.
Alternatively, if the package is installed globally:
{
"mcpServers": {
"cred1": {
"command": "cred1-mcp"
}
}
}
Cursor
Add to .cursor/mcp.json (project) or ~/.cursor/mcp.json (global):
{
"mcpServers": {
"cred1": {
"command": "npx",
"args": ["-y", "@aloth/cred1", "--mcp"]
}
}
}
Windsurf
Add to ~/.codeium/windsurf/mcp_config.json:
{
"mcpServers": {
"cred1": {
"command": "npx",
"args": ["-y", "@aloth/cred1", "--mcp"]
}
}
}
OpenClaw / any MCP-compatible host
{
"command": "cred1-mcp",
"transport": "stdio"
}
Library Usage
import { checkDomain, searchDomains, getStats } from '@aloth/cred1';
// Single lookup
const result = checkDomain('infowars.com');
// { domain: 'infowars.com', score: 0.073, category: 'conspiracy', level: 'low', sources: 2, domainAge: 27.3, trancoRank: 15889 }
// Not found → null
const unknown = checkDomain('nytimes.com'); // null
// Search by pattern (substring or regex)
const russian = searchDomains('\\.ru$');
// Dataset statistics
const stats = getStats();
// { totalDomains: 2673, categories: { unreliable: 2001, fake: 233, ... }, version: '1.0.0' }
Traffic-Light Scoring
| Level | Score | Emoji | Meaning |
|---------|-----------|-------|----------------------------------|
| low | ≤ 0.20 | 🔴 | High credibility risk |
| mixed | 0.21–0.50 | 🟡 | Unreliable or mixed signals |
| ok | > 0.50 | 🟢 | Generally considered reliable |
| neutral | not found | ⚪ | Unknown (absence ≠ trustworthy) |
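The thresholds in the table can be expressed as a small mapping function. This is a sketch, not the library's code: `levelForScore` is a hypothetical name, and it assumes that a score between 0.20 and 0.21 counts as mixed and that a missing score (domain not found) maps to neutral.

```javascript
// Hypothetical helper mirroring the traffic-light table; thresholds are read from the README.
// Assumption: scores in the gap between 0.20 and 0.21 are treated as "mixed".
function levelForScore(score) {
  if (score === null || score === undefined) {
    // Not in the dataset: unknown, which is not the same as trustworthy.
    return { level: 'neutral', emoji: '⚪' };
  }
  if (score <= 0.2) return { level: 'low', emoji: '🔴' };
  if (score <= 0.5) return { level: 'mixed', emoji: '🟡' };
  return { level: 'ok', emoji: '🟢' };
}

console.log(levelForScore(0.073).level); // low
console.log(levelForScore(null).level); // neutral
```

Keeping neutral as a separate branch makes it harder for a caller to treat an unknown domain as a reliable one by accident.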
Key Features
- 2,672 domains with credibility scores (0.0–1.0)
- Dual-mode — works as CLI tool and JavaScript library
- Fully reproducible — Python pipeline rebuilds the dataset from scratch
- Multi-signal scoring combining source labels, domain age, web popularity, fact-check frequency, and threat intelligence
- Privacy-preserving — designed for on-device client-side deployment (no server calls needed)
- Two openly-licensed sources — no proprietary data dependencies
- Domain normalization — handles www., protocols, paths automatically
Dataset Schema
Compact Format (cred1_compact.json)
{
"infowars.com": { "c": "c", "s": 0.073, "n": 2, "d": "1999-10-04", "r": 15889 }
}
| Field | Description |
|-------|-------------|
| c | Category code: f=fake, u=unreliable, m=mixed, c=conspiracy, s=satire, r=reliable |
| s | Credibility score (0.0–1.0, lower = less credible) |
| n | Number of independent source lists flagging this domain |
| d | Domain registration date (optional) |
| r | Tranco Top-1M rank (optional — lower rank = more popular) |
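To make the single-letter keys easier to work with, a consumer could expand each compact record into named fields. The sketch below assumes the field meanings from the table above; `expandCompact` and the output property names are hypothetical and do not describe the library's own return shape.

```javascript
// Category codes as listed in the field table above.
const CATEGORY_CODES = {
  f: 'fake', u: 'unreliable', m: 'mixed', c: 'conspiracy', s: 'satire', r: 'reliable',
};

// Hypothetical helper: output field names are illustrative, not the library's API.
function expandCompact(domain, record) {
  return {
    domain,
    category: CATEGORY_CODES[record.c] ?? 'unknown',
    score: record.s,
    sources: record.n,
    registered: record.d ?? null, // optional field
    trancoRank: record.r ?? null, // optional field
  };
}

const compact = {
  'infowars.com': { c: 'c', s: 0.073, n: 2, d: '1999-10-04', r: 15889 },
};
const expanded = Object.entries(compact).map(([domain, rec]) => expandCompact(domain, rec));
console.log(expanded[0].category); // conspiracy
```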
Full Format (cred1_current.json)
{
"infowars.com": {
"category": "fake",
"credibility_score": 0.14,
"domain_age_years": 26.4,
"domain_registered": "1999-10-04T04:00:00Z",
"iffy_factual": "VL",
"iffy_bias": "FN",
"iffy_score": 0.1,
"factcheck_claims": 52,
"safe_browsing_flagged": false,
"score_age": 0.2,
"score_cat": 0.05,
"score_factcheck": 0.0,
"score_iffy": 0.1,
"score_safebrowsing": 0.05,
"score_tranco": 0.1,
"sources": 2,
"tranco_rank": 4382
}
}
See CODEBOOK.md for full field documentation.
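The full format exposes one score_* field per signal alongside the final credibility_score. This README does not state how the signals are weighted, so the sketch below only pulls them out for inspection; `componentScores` is a hypothetical helper, and the sample record is abbreviated from the example above.

```javascript
// Hypothetical helper: collects every "score_*" field from a full-format record.
// It makes no assumption about how the signals combine into credibility_score.
function componentScores(record) {
  return Object.fromEntries(
    Object.entries(record).filter(([key]) => key.startsWith('score_'))
  );
}

// Abbreviated copy of the infowars.com record shown above.
const fullRecord = {
  credibility_score: 0.14,
  score_age: 0.2,
  score_cat: 0.05,
  score_factcheck: 0.0,
  score_iffy: 0.1,
  score_safebrowsing: 0.05,
  score_tranco: 0.1,
  sources: 2,
};

console.log(Object.keys(componentScores(fullRecord)));
```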
Rebuilding the Dataset
cd pipeline/
python3 build_dataset.py # Full pipeline
python3 build_dataset.py --step fetch # Download raw data only
python3 build_dataset.py --step merge # Parse + merge (requires prior fetch)
python3 enrich_dataset.py # Add enrichment signals (API keys required)
Versioning
CRED-1 uses calendar versioning (CalVer) across all distribution channels:
| Channel | Example | Notes |
|---------|---------|-------|
| GitHub Release | v2026-06-13 | Tag + Zenodo archive |
| npm package | 2026.6.13 | Same date, dot-separated (valid semver) |
A new version is released weekly with rescored domains. The npm package updates automatically with each GitHub release — no separate version scheme needed.
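The relationship between the two formats can be sketched as a conversion from a release tag to the npm version string. `tagToNpmVersion` is a hypothetical helper, and it assumes tags always follow the vYYYY-MM-DD pattern shown in the table.

```javascript
// Hypothetical helper: converts a release tag such as "v2026-06-13" into the
// dot-separated npm form. Leading zeros are dropped because semver forbids them.
function tagToNpmVersion(tag) {
  const match = /^v(\d{4})-(\d{2})-(\d{2})$/.exec(tag);
  if (!match) {
    throw new Error(`Unexpected tag format: ${tag}`);
  }
  const [, year, month, day] = match;
  return [year, month, day].map((part) => String(Number(part))).join('.');
}

console.log(tagToNpmVersion('v2026-06-13')); // 2026.6.13
```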
To pin a specific dataset version:
npm install @aloth/cred1@2026.6.13
Production Integrations
- Trackless Links — Safari extension for iOS and macOS with real-time CRED-1 credibility warnings
- HuggingFace — Dataset mirror for ML pipelines
Citation
@misc{loth2026cred1,
author = {Loth, Alexander and Kappes, Martin and Pahl, Marc-Oliver},
title = {{CRED-1}: An Open Multi-Signal Domain Credibility Dataset for Automated Pre-Bunking of Online Misinformation},
year = 2026,
doi = {10.2139/ssrn.6448466},
url = {https://github.com/aloth/cred-1}
}
License
Author
Alexander Loth — alexloth.com · @xlth · ORCID
