bear-tracker
v1.4.4
Lightweight bot detection middleware for tracking AI crawler visits (OpenAI, ChatGPT, etc.) with API support and analytics
Bear Tracker 🐻
A lightweight, zero-dependency npm package for tracking AI/LLM bots (OpenAI, Google, etc.) in web applications. Perfect for Vercel, Express.js, Next.js, and other Node.js frameworks.
Features
- 🤖 AI Bot Detection: Identifies OpenAI bots (GPTBot, ChatGPT-User, OAI-SearchBot), Googlebot, and other AI/LLM bots
- 🚀 Minimal Integration: Less than 5 lines of code to get started
- 📊 Structured Logging: Vercel-friendly JSON logs for easy parsing and analysis
- 🎯 AI-Focused: Specialized for tracking AI training, search, and user interaction bots
- 🔧 Framework Agnostic: Works with Express, Next.js, Fastify, and any Node.js middleware system
- 📦 Zero Dependencies: Lightweight with no external dependencies
Installation
npm install bear-tracker
Quick Start (< 5 lines)
Express.js
const express = require('express');
const { createBotTracker } = require('bear-tracker');
const app = express();
app.use(createBotTracker('info')); // Only this line needed!
// Your existing routes...
Next.js API Routes
// middleware.js
import { createBotTracker } from 'bear-tracker';
export const middleware = createBotTracker('warn'); // Only this line needed!
export const config = {
  matcher: '/api/:path*'
};
Express with Custom Logging
const { createCustomBotTracker } = require('bear-tracker');
app.use(createCustomBotTracker((botInfo) => {
  if (botInfo.isBot) console.log(`AI Bot detected: ${botInfo.name} - ${botInfo.description}`);
}));
Detected AI/LLM Bots
The package specializes in detecting these AI and search bots:
OpenAI Bots
- OAI-SearchBot: OpenAI SearchBot for linking and surfacing websites in ChatGPT search results
- ChatGPT-User: ChatGPT user actions and Custom GPTs web interactions
- GPTBot: OpenAI GPTBot for training generative AI foundation models
Search Engines
- Googlebot: Google web crawler for search indexing
Additional AI Bots
- Claude-Web: Anthropic Claude web interactions
- Bard: Google Bard AI interactions
- AI Bot: Generic AI or bot-like user agents
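As a rough illustration of how user-agent matching for the bots listed above could work, here is a minimal sketch. It is not the package's actual rule set: `BOT_SIGNATURES` and `sketchDetect` are hypothetical names, and the `ai_user` and `search_engine` type labels are assumptions (only `ai_training` and `ai_search` appear in this README).

```javascript
// Hypothetical signature table for illustration; NOT bear-tracker's real rules.
// 'ai_user' and 'search_engine' are assumed labels, not documented ones.
const BOT_SIGNATURES = [
  { pattern: /OAI-SearchBot/i, name: 'OAI-SearchBot', type: 'ai_search' },
  { pattern: /ChatGPT-User/i, name: 'ChatGPT-User', type: 'ai_user' },
  { pattern: /GPTBot/i, name: 'GPTBot', type: 'ai_training' },
  { pattern: /Googlebot/i, name: 'Googlebot', type: 'search_engine' },
];

// Return a small result object for the first signature that matches.
function sketchDetect(userAgent) {
  // Treat a missing or non-string header as an empty user agent.
  const ua = typeof userAgent === 'string' ? userAgent : '';
  const match = BOT_SIGNATURES.find((sig) => sig.pattern.test(ua));
  return match
    ? { isBot: true, name: match.name, type: match.type, userAgent: ua }
    : { isBot: false, name: null, type: null, userAgent: ua };
}

const sample = sketchDetect('Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.1; +https://openai.com/gptbot');
console.log(sample.name, sample.type);
```

User-agent strings are self-reported and easy to spoof, so a match like this is a hint about the client rather than proof of its identity.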
Advanced Usage
Full Configuration
const { BotTracker } = require('bear-tracker');
const tracker = new BotTracker({
  enableLogging: true,
  trackOnlyBots: true, // Only log when AI bots are detected
  includeIp: true, // Include IP addresses in logs
  logLevel: 'warn', // 'info', 'warn', or 'error'
  customLogger: (botInfo) => {
    // Your custom logging logic
    console.log(`${botInfo.name}: ${botInfo.description} from ${botInfo.ip}`);
  }
});
app.use(tracker.middleware());
Accessing Bot Info in Routes
app.get('/api/data', (req, res) => {
  const botInfo = res.locals.botInfo;
  if (botInfo && botInfo.isBot) {
    console.log(`API accessed by ${botInfo.name}: ${botInfo.description}`);
    // Handle different AI bot types; return after responding so that
    // only one response is ever sent per request
    if (botInfo.type === 'ai_training') {
      // GPTBot - you might want to limit what content is accessible
      return res.json({ message: 'Limited data for training bots' });
    }
    if (botInfo.type === 'ai_search') {
      // OAI-SearchBot - optimize for search indexing
      return res.json({ data: 'SEO-optimized content for search' });
    }
  }
  // Humans and any other bot types get the default response
  res.json({ data: 'your data' });
});
Manual Bot Detection
const { detectBotFromUserAgent } = require('bear-tracker');
const userAgent = 'Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.1; +https://openai.com/gptbot';
const result = detectBotFromUserAgent(userAgent);
console.log(result);
// {
//   name: 'GPTBot',
//   type: 'ai_training',
//   isBot: true,
//   userAgent: '...',
//   timestamp: 2024-01-01T12:00:00.000Z,
//   description: 'OpenAI GPTBot for training generative AI foundation models'
// }
Log Output Format
The structured logs are perfect for Vercel and other serverless platforms:
{
  "timestamp": "2024-01-01T12:00:00.000Z",
  "bot_detected": true,
  "bot_name": "GPTBot",
  "bot_type": "ai_training",
  "bot_description": "OpenAI GPTBot for training generative AI foundation models",
  "user_agent": "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.1...",
  "ip_address": "40.84.180.224"
}
Framework-Specific Examples
Vercel/Next.js
// middleware.js
import { createBotTracker } from 'bear-tracker';
export const middleware = createBotTracker('info');
export const config = {
  matcher: [
    '/api/:path*',
    '/((?!_next/static|favicon.ico).*)'
  ]
};
Express.js with AI Bot Analytics
const express = require('express');
const { BotTracker } = require('bear-tracker');
const app = express();
const aiTracker = new BotTracker({
  logLevel: 'warn',
  trackOnlyBots: true,
  customLogger: (botInfo) => {
    // Send AI bot data to your analytics service.
    // `analytics` is your own client, initialized elsewhere; it is not part of bear-tracker.
    analytics.track('ai_bot_visit', {
      bot_name: botInfo.name,
      bot_type: botInfo.type,
      bot_description: botInfo.description,
      timestamp: botInfo.timestamp
    });
  }
});
app.use(aiTracker.middleware());
Use Cases for AI Bot Tracking
- AI Training Control: Detect GPTBot and control what content is used for AI training
- Search Optimization: Optimize content delivery for OAI-SearchBot and Googlebot
- Rate Limiting: Apply different limits for AI bots vs human users
- Content Strategy: Track which AI services are accessing your content
- Compliance: Monitor and log AI bot access for regulatory requirements
- Performance: Serve optimized responses to different types of AI bots
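For the content-strategy use case above, one option is to tally visits per bot from the structured log lines. The sketch below assumes each log line is a standalone JSON object with the `bot_detected` and `bot_name` fields shown under Log Output Format; `countBotVisits` is a hypothetical helper, not something exported by the package.

```javascript
// Hypothetical helper: count visits per bot from JSON log lines.
// Assumes one JSON object per line with `bot_detected` and `bot_name` fields.
function countBotVisits(logLines) {
  const counts = new Map();
  for (const line of logLines) {
    let entry;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // Skip lines that are not valid JSON (e.g. other log output)
    }
    if (!entry || entry.bot_detected !== true) continue;
    const name = entry.bot_name || 'unknown';
    counts.set(name, (counts.get(name) || 0) + 1);
  }
  return counts;
}

const exampleLines = [
  '{"bot_detected": true, "bot_name": "GPTBot", "bot_type": "ai_training"}',
  '{"bot_detected": true, "bot_name": "GPTBot", "bot_type": "ai_training"}',
  '{"bot_detected": true, "bot_name": "OAI-SearchBot", "bot_type": "ai_search"}',
  'plain text line that is not JSON',
];
for (const [name, count] of countBotVisits(exampleLines)) {
  console.log(`${name}: ${count}`);
}
```

A `Map` is used instead of a plain object so that an unexpected `bot_name` value cannot collide with built-in object properties.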
API Reference
createBotTracker(logLevel?)
Quick setup function that tracks only AI bots.
logLevel: 'info' | 'warn' | 'error' (default: 'info')
createCustomBotTracker(customLogger)
Setup with custom logging function.
customLogger: (botInfo: BotInfo) => void
BotTracker(options?)
Full-featured class with configuration options.
detectBotFromUserAgent(userAgent)
Manually detect if a user-agent string belongs to an AI bot.
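To show how the pieces in this reference could fit together, here is a simplified sketch of an Express-style middleware that detects a bot, stores the result on `res.locals.botInfo`, and calls a custom logger. It is an illustration under assumptions, not the package source: `sketchTrackerMiddleware` and `sketchDefaultDetect` are hypothetical names, the `detect` option does not exist in `BotTrackerOptions`, and the `trackOnlyBots` default shown here is a guess.

```javascript
// Stand-in detector for this sketch only; it recognizes just GPTBot.
const sketchDefaultDetect = (ua) => {
  const isBot = /GPTBot/i.test(ua);
  return { isBot, name: isBot ? 'GPTBot' : null, userAgent: ua };
};

// Hypothetical middleware factory; NOT bear-tracker's implementation.
function sketchTrackerMiddleware(options = {}) {
  const {
    trackOnlyBots = true, // assumed default
    customLogger = () => {},
    detect = sketchDefaultDetect, // illustrative option, not a real one
  } = options;

  return function botTrackerSketch(req, res, next) {
    const userAgent = (req.headers && req.headers['user-agent']) || '';
    const botInfo = detect(userAgent);
    // Expose the result to later route handlers.
    res.locals = res.locals || {};
    res.locals.botInfo = botInfo;
    // Log bots always; log everyone else only when trackOnlyBots is false.
    if (botInfo.isBot || !trackOnlyBots) customLogger(botInfo);
    next();
  };
}

// Exercise the middleware with plain objects instead of a real server.
const demoRes = {};
sketchTrackerMiddleware({ customLogger: (info) => console.log('bot:', info.name) })(
  { headers: { 'user-agent': 'compatible; GPTBot/1.1' } },
  demoRes,
  () => console.log('next() called, isBot =', demoRes.locals.botInfo.isBot)
);
```

Calling `next()` unconditionally keeps tracking passive: the request continues through the rest of the middleware chain whether or not a bot was detected.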
TypeScript Support
Full TypeScript support with exported types:
import { BotInfo, BotTrackerOptions, BotTracker } from 'bear-tracker';
const options: BotTrackerOptions = {
  enableLogging: true,
  trackOnlyBots: true
};
const tracker = new BotTracker(options);
License
MIT
Contributing
Contributions welcome! Please feel free to submit issues and pull requests.
Perfect for Vercel deployments: the structured JSON logs integrate with Vercel's logging and analytics systems, so you can track OpenAI bots, Google crawlers, and other AI services with minimal code.
