backlink-detector
v1.0.0
Detect unauthorized external links (backlinks) in HTML content. Find links pointing to domains outside your whitelist.
Use Case
When managing website content (especially user-generated or third-party content), you may want to:
- Detect unauthorized backlinks to external sites
- Find links pointing to competitor domains
- Audit content for SEO compliance
- Remove unwanted external links while preserving content
Features
- Domain whitelist - Specify allowed domains, detect everything else
- Subdomain support - Automatically includes subdomains (www.example.com matches example.com)
- Anchor text extraction - See what text is being used for links
- HTML cleaning - Remove backlinks while preserving content
- Zero dependencies - Uses the native fetch API (Node.js 18+)
- TypeScript - Full type definitions included
- CLI & API - Use from command line or as a library
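The anchor text extraction and HTML cleaning features listed above can be pictured with a small regex-based sketch. This is an illustration under stated assumptions, not the package's actual implementation: `findAnchors`, `unwrapAnchors`, `hostOf`, and the `isAllowed` callback are hypothetical names, and a regular expression will miss unusual markup that a real HTML parser would handle.

```typescript
// Hypothetical sketch: NOT the internals of backlink-detector.
interface FoundLink {
  url: string;
  anchorText: string;
  fullMatch: string;
}

// Returns a fresh regex each time so the global flag's lastIndex is never shared.
const anchorRe = (): RegExp =>
  /<a\b[^>]*\bhref\s*=\s*["']([^"']*)["'][^>]*>([\s\S]*?)<\/a>/gi;

// Extract the host from an absolute or protocol-relative URL; null otherwise.
function hostOf(url: string): string | null {
  const m = /^(?:https?:)?\/\/([^\/?#:@]+)/i.exec(url);
  return m ? m[1].toLowerCase() : null;
}

// Collect every <a href="..."> with its visible text (nested tags stripped).
function findAnchors(html: string): FoundLink[] {
  const re = anchorRe();
  const links: FoundLink[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(html)) !== null) {
    links.push({
      url: m[1],
      anchorText: m[2].replace(/<[^>]+>/g, '').trim(),
      fullMatch: m[0],
    });
  }
  return links;
}

// Replace each disallowed anchor with its inner content, keeping the text.
function unwrapAnchors(html: string, isAllowed: (host: string) => boolean): string {
  return html.replace(anchorRe(), (full: string, url: string, inner: string) => {
    const host = hostOf(url);
    if (host === null) return full; // relative link: leave untouched
    return isAllowed(host) ? full : inner;
  });
}

const snippet = '<p>Check out <a href="https://external.com">this link</a> for more.</p>';
console.log(findAnchors(snippet)[0].anchorText); // this link
console.log(unwrapAnchors(snippet, (h) => h === 'mysite.com'));
// <p>Check out this link for more.</p>
```

Passing the allow-list decision in as a callback keeps the sketch independent of any particular domain-matching rule.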
Installation
npm install backlink-detector
Or use directly with npx:
npx backlink-detector -f page.html -d example.com
CLI Usage
Detect backlinks in an HTML file
# Check HTML file - only allow links to example.com
backlink-detector -f page.html -d example.com
# Multiple allowed domains
backlink-detector -f page.html -d "example.com,mysite.org,blog.example.com"
Fetch and analyze a live page
backlink-detector -u https://example.com/page -d example.com
Get cleaned HTML (backlinks removed)
backlink-detector -f page.html -d example.com -c > cleaned.html
Output formats
# Detailed report
backlink-detector -f page.html -d example.com -r
# JSON output
backlink-detector -f page.html -d example.com --json
# List all links (no filtering)
backlink-detector -f page.html -l
Options
-f, --file <path> HTML file to analyze
-u, --url <url> URL to fetch and analyze
-d, --domains <list> Comma-separated list of allowed domains (required unless -l is used)
-o, --output <path> Output results to file
-c, --clean Output cleaned HTML (backlinks removed)
-r, --report Output detailed report
-l, --list List all links without filtering
--json Output as JSON
-q, --quiet Minimal output
-h, --help Show help
-v, --version Show version
API Usage
Detect Backlinks
import { detectBacklinks } from 'backlink-detector';
const html = `
<html>
<body>
<a href="https://mysite.com/page">My Site</a>
<a href="https://competitor.com/link">Competitor</a>
<a href="https://spam-site.net">Spam</a>
</body>
</html>
`;
const result = detectBacklinks(html, {
  allowedDomains: ['mysite.com'],
});
console.log(result.stats);
// {
//   totalLinks: 3,
//   allowedLinks: 1,
//   externalLinks: 2,
//   uniqueExternalDomains: 2
// }
console.log(result.backlinks);
// [
//   { url: 'https://competitor.com/link', domain: 'competitor.com', anchorText: 'Competitor', ... },
//   { url: 'https://spam-site.net', domain: 'spam-site.net', anchorText: 'Spam', ... }
// ]
console.log(result.externalDomains);
// ['competitor.com', 'spam-site.net']
Remove Backlinks
import { removeBacklinks } from 'backlink-detector';
const html = '<p>Check out <a href="https://external.com">this link</a> for more.</p>';
const { html: cleanedHtml, removedCount } = removeBacklinks(html, {
  allowedDomains: ['mysite.com'],
});
console.log(cleanedHtml);
// '<p>Check out this link for more.</p>'
console.log(removedCount); // 1
Quick Checks
import { hasBacklinks, findExternalDomains } from 'backlink-detector';
// Check if any backlinks exist
if (hasBacklinks(html, ['mysite.com'])) {
  console.log('Backlinks found!');
}
// Get list of external domains
const domains = findExternalDomains(html, ['mysite.com']);
console.log(domains); // ['competitor.com', 'spam.net']
Extract All Links
import { extractLinks } from 'backlink-detector';
const { links, count } = extractLinks(html);
console.log(count); // 5
console.log(links);
// [
//   { url: 'https://...', anchorText: 'Click here', fullMatch: '<a href="...">Click here</a>' },
//   ...
// ]
Generate Report
import { getBacklinkReport } from 'backlink-detector';
const report = getBacklinkReport(html, {
  allowedDomains: ['mysite.com'],
});
console.log(report);
// ==================================================
// BACKLINK DETECTION REPORT
// ==================================================
//
// Total Links: 10
// Allowed Links: 3
// External Links (Backlinks): 7
// Unique External Domains: 4
// ...
Options
DetectOptions
interface DetectOptions {
  // List of allowed domains (required)
  allowedDomains: string[];
  // Include subdomains of allowed domains (default: true)
  // When true: "www.example.com" matches "example.com"
  includeSubdomains?: boolean;
  // Case-insensitive domain matching (default: true)
  caseInsensitive?: boolean;
}
How Domain Matching Works
With includeSubdomains: true (default):
| Allowed Domain | URL Domain | Match? |
|---------------|------------|--------|
| example.com | example.com | ✅ |
| example.com | www.example.com | ✅ |
| example.com | blog.example.com | ✅ |
| example.com | sub.blog.example.com | ✅ |
| example.com | notexample.com | ❌ |
| example.com | example.com.evil.com | ❌ |
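One way to get the behavior in the table is an exact comparison plus a suffix check anchored on a dot. The sketch below is an assumption about how such matching could work, not the package's source; `isAllowedHost` and `MatchOptions` are hypothetical names that mirror the `includeSubdomains` and `caseInsensitive` options documented above.

```typescript
// Hypothetical sketch: NOT the internals of backlink-detector.
interface MatchOptions {
  includeSubdomains?: boolean; // assumed default: true
  caseInsensitive?: boolean;   // assumed default: true
}

function isAllowedHost(
  host: string,
  allowedDomains: string[],
  { includeSubdomains = true, caseInsensitive = true }: MatchOptions = {},
): boolean {
  const norm = (s: string): string => (caseInsensitive ? s.toLowerCase() : s);
  const h = norm(host);
  return allowedDomains.some((domain) => {
    const d = norm(domain);
    if (h === d) return true;
    // The leading dot is what rejects "notexample.com" for "example.com",
    // and a suffix check is what rejects "example.com.evil.com".
    return includeSubdomains && h.endsWith('.' + d);
  });
}

const allowed = ['example.com'];
console.log(isAllowedHost('sub.blog.example.com', allowed)); // true
console.log(isAllowedHost('notexample.com', allowed));       // false
console.log(isAllowedHost('example.com.evil.com', allowed)); // false
console.log(isAllowedHost('www.example.com', allowed, { includeSubdomains: false })); // false
```

A plain substring or unanchored suffix test would accept the last two rows of the table, which is why the comparison is anchored on the dot boundary.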
Common Use Cases
SEO Audit
import { detectBacklinks, COMMON_SAFE_DOMAINS } from 'backlink-detector';
// Allow your domains + common safe domains
const allowedDomains = [
  'mysite.com',
  'myblog.com',
  ...COMMON_SAFE_DOMAINS, // Google, Facebook, Wikipedia, etc.
];
const result = detectBacklinks(html, { allowedDomains });
Content Moderation
import { removeBacklinks } from 'backlink-detector';
// Clean user-generated content
const userContent = await getUserContent();
const { html: cleanedContent } = removeBacklinks(userContent, {
  allowedDomains: ['mysite.com'],
});
// Save cleaned content
await saveContent(cleanedContent);
CI/CD Check
# Exit with error code 1 if backlinks found
backlink-detector -f dist/index.html -d mysite.com
# Use in CI pipeline
if backlink-detector -f build/index.html -d mysite.com -q; then
  echo "No backlinks found"
else
  echo "Unauthorized backlinks detected!"
  exit 1
fi
Requirements
- Node.js 18+ (uses native fetch)
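Because the package relies on the global fetch, a calling script may want to fail fast on older runtimes. The following is a minimal sketch of such a guard; `isSupportedNodeVersion` and `hasNativeFetch` are hypothetical helpers and are not exported by backlink-detector.

```typescript
// Hypothetical helpers: NOT part of backlink-detector.

// Parse a version string such as "v18.17.0" or "16.20.2" and compare its major part.
function isSupportedNodeVersion(version: string, minMajor: number = 18): boolean {
  const major = parseInt(version.replace(/^v/, '').split('.')[0], 10);
  return !isNaN(major) && major >= minMajor;
}

// Feature-detect the global fetch instead of trusting the version number alone.
function hasNativeFetch(): boolean {
  return typeof (globalThis as any).fetch === 'function';
}

console.log(isSupportedNodeVersion('v18.17.0')); // true
console.log(isSupportedNodeVersion('16.20.2'));  // false
```

In a Node.js script the version string would typically come from `process.versions.node`; feature detection with `hasNativeFetch()` is the more direct check, since it tests for the capability the package actually needs.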
License
MIT License - see LICENSE file.
Credits
Built with ❤️ by Hayati Ali Keles
