@podx/scraper

v2.0.2

πŸ” Advanced Twitter/X scraping, bot detection, and crypto analysis toolkit for PODx

The scraper package provides comprehensive Twitter/X data collection, analysis, and signal generation capabilities for the PODx ecosystem. It includes advanced scraping algorithms, bot detection, sentiment analysis, cryptocurrency token extraction, and trading signal generation.

πŸ“¦ Installation

# Install from workspace
bun add @podx/scraper@workspace:*

# Or install from npm (when published)
bun add @podx/scraper

πŸ—οΈ Architecture

The scraper package is organized into several specialized modules:

packages/scraper/src/
β”œβ”€β”€ scrapers/         # Core scraping functionality
β”‚   β”œβ”€β”€ baseScraper.ts    # Base scraper with authentication
β”‚   β”œβ”€β”€ searchScraper.ts  # Search-based tweet scraping
β”‚   β”œβ”€β”€ commentScraper.ts # Comment/reply scraping
β”‚   └── index.ts          # Scraper exports
β”œβ”€β”€ services/         # Service layer
β”‚   └── index.ts          # Main ScraperService
β”œβ”€β”€ auth/             # Authentication handling
β”œβ”€β”€ analyzers/        # Data analysis modules
β”‚   β”œβ”€β”€ BotDetector.ts    # Bot detection algorithms
β”‚   β”œβ”€β”€ SentimentAnalyzer.ts # Sentiment analysis
β”‚   β”œβ”€β”€ SignalGenerator.ts # Trading signal generation
β”‚   β”œβ”€β”€ TokenExtractor.ts # Cryptocurrency token extraction
β”‚   └── index.ts          # Analyzer exports
β”œβ”€β”€ crypto/           # Cryptocurrency analysis
β”œβ”€β”€ signals/          # Signal processing and generation
β”œβ”€β”€ types/            # TypeScript type definitions
└── index.ts          # Main exports

πŸš€ Quick Start

import { ScraperService, SentimentAnalyzer, TokenExtractor } from '@podx/scraper';

// Initialize scraper service
const scraper = new ScraperService();

// Scrape tweets from a user
const tweets = await scraper.scrapeAccount({
  targetUsername: 'cryptowhale',
  maxTweets: 100,
  progressCallback: (progress) => {
    console.log(`Scraped ${progress.count}/${progress.max} tweets`);
  }
});

// Analyze sentiment
const analyzer = new SentimentAnalyzer();
const sentiment = await analyzer.analyze(tweets);

// Extract cryptocurrency tokens
const extractor = new TokenExtractor();
const tokens = await extractor.extract(tweets);

// Save results
const result = await scraper.saveTweetsToFile(tweets, 'cryptowhale');
console.log(`Saved ${tweets.length} tweets to ${result.filename}`);

πŸ” Authentication

The scraper supports multiple authentication methods for Twitter/X API access:

Environment Variables Setup

# Required credentials
export XSERVE_USERNAME="your_twitter_username"
export XSERVE_PASSWORD="your_twitter_password"
export XSERVE_EMAIL="your_email@example.com"  # Optional, for account recovery

Authentication Flow

import { ScraperService } from '@podx/scraper';

const scraper = new ScraperService();

// Authentication happens automatically on first API call
try {
  const tweets = await scraper.scrapeAccount({
    targetUsername: 'example',
    maxTweets: 10
  });

  console.log('Authentication successful!');
} catch (error) {
  if (error.code === 'AUTHENTICATION_FAILED') {
    console.error('Please check your Twitter credentials');
  }
}
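Because authentication only surfaces at the first API call, it can help to validate the environment up front and fail fast with a clear message. A minimal sketch (the `missingCredentials` helper is illustrative, not part of the package API):

```typescript
// Hypothetical helper: report which required XSERVE_* credentials are unset
// before constructing a ScraperService.
function missingCredentials(env: Record<string, string | undefined>): string[] {
  const required = ['XSERVE_USERNAME', 'XSERVE_PASSWORD'];
  return required.filter((key) => !env[key] || env[key]!.trim() === '');
}

// Fail fast instead of hitting AUTHENTICATION_FAILED mid-scrape.
const unset = missingCredentials(process.env);
if (unset.length > 0) {
  console.error(`Missing credentials: ${unset.join(', ')}`);
}
```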

πŸ“Š Core Scraping Features

Account Scraping

import { ScraperService } from '@podx/scraper';

const scraper = new ScraperService();

// Scrape tweets from a specific account
const tweets = await scraper.scrapeAccount({
  targetUsername: 'VitalikButerin',
  maxTweets: 500,
  progressCallback: (progress) => {
    const percent = Math.round((progress.count / progress.max) * 100);
    console.log(`Progress: ${percent}% (${progress.count}/${progress.max})`);
  }
});

// Process scraped tweets
tweets.forEach(tweet => {
  console.log(`@${tweet.username}: ${tweet.text}`);
  console.log(`Likes: ${tweet.likes}, Retweets: ${tweet.retweets}`);
});

Search-Based Scraping

import { SearchScraper } from '@podx/scraper/scrapers';

// Search for tweets with specific criteria
const searchScraper = new SearchScraper();

const tweets = await searchScraper.search({
  query: 'bitcoin OR ethereum',
  maxTweets: 1000,
  filters: {
    language: 'en',
    dateRange: {
      from: new Date('2024-01-01'),
      to: new Date('2024-01-31')
    },
    minLikes: 10,
    minRetweets: 5
  }
});

Comment/Reply Scraping

import { CommentScraper } from '@podx/scraper/scrapers';

const commentScraper = new CommentScraper();

// Scrape replies to a specific tweet
const replies = await commentScraper.scrapeComments({
  tweetId: '1234567890123456789',
  maxReplies: 200,
  includeNested: true  // Include replies to replies
});

// Analyze conversation threads
const threads = commentScraper.buildConversationThreads(replies);
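The thread-building step can be pictured as grouping each reply under its parent to form nested conversations. A self-contained sketch of that idea (the `buildThreads` helper and its types are illustrative, not the package's actual implementation):

```typescript
interface Reply { id: string; parentId: string | null; text: string; }
interface ThreadNode extends Reply { children: ThreadNode[]; }

// Group flat replies into nested threads: replies with no (known) parent
// become roots; everything else is attached under its parent.
function buildThreads(replies: Reply[]): ThreadNode[] {
  const nodes = new Map<string, ThreadNode>();
  for (const r of replies) nodes.set(r.id, { ...r, children: [] });

  const roots: ThreadNode[] = [];
  for (const node of nodes.values()) {
    const parent = node.parentId ? nodes.get(node.parentId) : undefined;
    if (parent) parent.children.push(node);
    else roots.push(node);
  }
  return roots;
}
```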

🧠 Advanced Analysis

Bot Detection

import { BotDetector } from '@podx/scraper/analyzers';

const detector = new BotDetector();

// Analyze tweets for bot-like behavior
const analysis = await detector.analyze(tweets);

analysis.results.forEach(result => {
  console.log(`@${result.username}: ${result.botProbability}% bot probability`);
  console.log(`Reasons: ${result.reasons.join(', ')}`);
});

// Filter out likely bots
const humanTweets = analysis.results
  .filter(result => result.botProbability < 30)
  .map(result => result.tweet);
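The probability itself comes from heuristics over account behavior. As a rough, purely illustrative sketch of the kind of scoring involved (the features, thresholds, and weights here are invented, not the ones BotDetector actually uses):

```typescript
interface AccountStats {
  tweetsPerDay: number;
  followers: number;
  following: number;
  accountAgeDays: number;
}

// Invented heuristic: accumulate weighted penalties for bot-like traits
// and clamp the result to a 0-100 probability.
function botProbability(s: AccountStats): number {
  let score = 0;
  if (s.tweetsPerDay > 50) score += 40;                                 // hyperactive posting
  if (s.following > 0 && s.followers / s.following < 0.1) score += 30;  // follow-spam ratio
  if (s.accountAgeDays < 30) score += 30;                               // very new account
  return Math.min(score, 100);
}
```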

Sentiment Analysis

import { SentimentAnalyzer } from '@podx/scraper/analyzers';

const sentimentAnalyzer = new SentimentAnalyzer();

// Analyze sentiment of tweets
const sentimentResults = await sentimentAnalyzer.analyze(tweets);

sentimentResults.forEach(result => {
  console.log(`Tweet: ${result.text}`);
  console.log(`Sentiment: ${result.sentiment} (${result.confidence}%)`);
  console.log(`Emotions: ${result.emotions.join(', ')}`);
});

// Aggregate sentiment over time
const timeSeries = sentimentAnalyzer.createSentimentTimeSeries(sentimentResults);
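Bucketing per-tweet scores by hour is one plausible way such a time series gets built. A self-contained sketch under assumed shapes (`createSentimentTimeSeries`'s real output format may differ):

```typescript
interface ScoredTweet { createdAt: Date; score: number; }  // score in [-1, 1]

// Average sentiment per hour, keyed by an ISO hour prefix like "2024-01-15T09".
function hourlyAverageSentiment(items: ScoredTweet[]): Map<string, number> {
  const buckets = new Map<string, { sum: number; n: number }>();
  for (const it of items) {
    const key = it.createdAt.toISOString().slice(0, 13);
    const b = buckets.get(key) ?? { sum: 0, n: 0 };
    b.sum += it.score;
    b.n += 1;
    buckets.set(key, b);
  }
  return new Map([...buckets].map(([k, b]) => [k, b.sum / b.n]));
}
```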

Cryptocurrency Token Extraction

import { TokenExtractor } from '@podx/scraper/analyzers';

const tokenExtractor = new TokenExtractor();

// Extract cryptocurrency mentions and addresses
const tokenResults = await tokenExtractor.extract(tweets);

tokenResults.forEach(result => {
  console.log(`Found ${result.tokens.length} tokens in tweet`);
  result.tokens.forEach(token => {
    console.log(`- ${token.symbol}: ${token.address} (${token.blockchain})`);
    console.log(`  Context: ${token.context}`);
    console.log(`  Confidence: ${token.confidence}%`);
  });
});

// Get trending tokens
const trending = tokenExtractor.getTrendingTokens(tokenResults, {
  timeframe: '24h',
  minMentions: 5
});
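At its core, token extraction boils down to pattern-matching cashtags and on-chain addresses in tweet text. A simplified, self-contained sketch (the shipped TokenExtractor presumably layers context and confidence scoring on top of this):

```typescript
// Match $TICKER-style cashtags (2-10 uppercase letters) and EVM-style
// 40-hex-character addresses in raw tweet text.
function extractMentions(text: string): { symbols: string[]; addresses: string[] } {
  const symbols = [...text.matchAll(/\$([A-Z]{2,10})\b/g)].map((m) => m[1]);
  const addresses = [...text.matchAll(/\b0x[a-fA-F0-9]{40}\b/g)].map((m) => m[0]);
  return { symbols, addresses };
}
```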

πŸ“ˆ Signal Generation

Trading Signals

import { SignalGenerator } from '@podx/scraper/analyzers';

const signalGenerator = new SignalGenerator();

// Generate trading signals from tweet analysis
const signals = await signalGenerator.generateSignals({
  tweets,
  sentimentResults,
  tokenResults,
  marketData: {
    btcPrice: 45000,
    ethPrice: 2500
  }
});

signals.forEach(signal => {
  console.log(`Signal: ${signal.type} for ${signal.token}`);
  console.log(`Strength: ${signal.strength}/10`);
  console.log(`Reason: ${signal.reason}`);
  console.log(`Confidence: ${signal.confidence}%`);
  console.log(`Timeframe: ${signal.timeframe}`);
});

// Filter high-confidence signals
const strongSignals = signals.filter(s => s.confidence > 80 && s.strength >= 7);

Market Sentiment Signals

// Generate market sentiment signals
const marketSignals = await signalGenerator.generateMarketSignals({
  tweets,
  sentimentData: sentimentResults,
  tokenData: tokenResults,
  marketContext: {
    overallSentiment: 'bullish',
    fearGreedIndex: 75,
    volume24h: 1250000000
  }
});

marketSignals.forEach(signal => {
  console.log(`Market Signal: ${signal.type}`);
  console.log(`Direction: ${signal.direction}`);
  console.log(`Strength: ${signal.strength}`);
  console.log(`Timeframe: ${signal.timeframe}`);
  console.log(`Rationale: ${signal.rationale}`);
});

πŸ”§ Advanced Configuration

Custom Scraping Options

import { ScraperService } from '@podx/scraper';

// Advanced scraping with custom options
const scraper = new ScraperService();

const tweets = await scraper.scrapeAccount({
  targetUsername: 'crypto_influencer',
  maxTweets: 1000,
  filters: {
    minLikes: 10,
    minRetweets: 5,
    dateRange: {
      from: new Date('2024-01-01'),
      to: new Date('2024-01-31')
    },
    language: 'en',
    excludeReplies: false,
    excludeRetweets: true
  },
  rateLimit: {
    requestsPerMinute: 30,
    delayBetweenRequests: 2000
  },
  retryPolicy: {
    maxRetries: 3,
    backoffMultiplier: 2,
    initialDelay: 1000
  }
});
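The retryPolicy above implies an exponential backoff schedule: each attempt waits initialDelay Γ— backoffMultiplier^(attempt βˆ’ 1). A sketch of that arithmetic (the helper name is illustrative):

```typescript
// Delay before a given retry attempt (1-based) under exponential backoff.
function retryDelay(attempt: number, initialDelay: number, backoffMultiplier: number): number {
  return initialDelay * Math.pow(backoffMultiplier, attempt - 1);
}

// With initialDelay 1000 and multiplier 2:
// attempt 1 β†’ 1000ms, attempt 2 β†’ 2000ms, attempt 3 β†’ 4000ms
```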

Custom Analysis Pipeline

import { 
  SentimentAnalyzer, 
  TokenExtractor, 
  BotDetector,
  SignalGenerator 
} from '@podx/scraper/analyzers';

// Create custom analysis pipeline
class CryptoAnalysisPipeline {
  constructor(
    private sentimentAnalyzer = new SentimentAnalyzer(),
    private tokenExtractor = new TokenExtractor(),
    private botDetector = new BotDetector(),
    private signalGenerator = new SignalGenerator()
  ) {}

  async analyze(tweets: Tweet[]): Promise<AnalysisResult> {
    // Step 1: Filter out bots
    const botAnalysis = await this.botDetector.analyze(tweets);
    const humanTweets = botAnalysis.results
      .filter(r => r.botProbability < 50)
      .map(r => r.tweet);

    // Step 2: Analyze sentiment
    const sentiment = await this.sentimentAnalyzer.analyze(humanTweets);

    // Step 3: Extract tokens
    const tokens = await this.tokenExtractor.extract(humanTweets);

    // Step 4: Generate signals
    const signals = await this.signalGenerator.generateSignals({
      tweets: humanTweets,
      sentimentResults: sentiment,
      tokenResults: tokens
    });

    return {
      originalTweetCount: tweets.length,
      humanTweets: humanTweets.length,
      sentiment,
      tokens,
      signals,
      analysisTimestamp: new Date()
    };
  }
}

// Use the pipeline
const pipeline = new CryptoAnalysisPipeline();
const result = await pipeline.analyze(tweets);

πŸ’Ύ Data Storage and Export

File Storage

import { ScraperService } from '@podx/scraper';

const scraper = new ScraperService();

// Scrape and save to file automatically
const result = await scraper.scrapeAndSave({
  targetUsername: 'cryptopunk',
  maxTweets: 200
});

console.log(`Saved ${result.tweets.length} tweets to ${result.filename}`);

// Custom file naming and organization
const customResult = await scraper.saveTweetsToFile(
  tweets, 
  'custom_username',
  {
    format: 'json',
    compress: true,
    includeMetadata: true,
    splitByDate: true
  }
);

Database Integration

import { DatabaseService } from '@podx/core';
import { ScraperService, SentimentAnalyzer, TokenExtractor } from '@podx/scraper';

const db = new DatabaseService(config.database);
const scraper = new ScraperService();
const analyzer = new SentimentAnalyzer();
const tokenExtractor = new TokenExtractor();

// Scrape and store in database
const tweets = await scraper.scrapeAccount({
  targetUsername: 'defi_pulse',
  maxTweets: 100
});

// Store with analysis
for (const tweet of tweets) {
  const analysis = await analyzer.analyze([tweet]);
  const tokens = await tokenExtractor.extract([tweet]);

  await db.save('analyzed_tweets', {
    ...tweet,
    sentiment: analysis[0]?.sentiment,
    tokens: tokens[0]?.tokens || [],
    analyzedAt: new Date()
  });
}

πŸ“Š Analytics and Reporting

Generate Reports

import { AnalyticsEngine } from '@podx/scraper/analytics';

const analytics = new AnalyticsEngine();

// Generate comprehensive report
const report = await analytics.generateReport({
  tweets,
  sentimentResults,
  tokenResults,
  signals,
  timeframe: {
    from: new Date('2024-01-01'),
    to: new Date('2024-01-31')
  }
});

// Export report
await analytics.exportReport(report, {
  format: 'pdf',
  includeCharts: true,
  includeRawData: false
});

// Get insights
const insights = analytics.extractInsights(report);
console.log('Key Insights:');
insights.forEach(insight => {
  console.log(`- ${insight.category}: ${insight.description}`);
});

Real-time Monitoring

import { RealtimeMonitor } from '@podx/scraper/monitoring';

const monitor = new RealtimeMonitor({
  targetUsernames: ['cryptowhale', 'defi_pulse'],
  keywords: ['bitcoin', 'ethereum', 'defi'],
  updateInterval: 30000  // 30 seconds
});

// Monitor with callbacks
monitor.onTweet((tweet) => {
  console.log(`New tweet from @${tweet.username}: ${tweet.text}`);
});

monitor.onSignal((signal) => {
  console.log(`New signal: ${signal.type} for ${signal.token}`);
  // Send notification, update dashboard, etc.
});

// Start monitoring
await monitor.start();

πŸ”§ API Reference

ScraperService

scrapeAccount(options: ScrapingOptions): Promise<Tweet[]>

Scrapes tweets from a specific Twitter account.

Parameters:

  • targetUsername: string - Twitter username to scrape
  • maxTweets: number - Maximum number of tweets to scrape
  • progressCallback?: (progress: ScrapingProgress) => void - Progress callback

scrapeAndSave(options: ScrapingOptions): Promise<{ tweets: Tweet[]; filename: string }>

Scrapes tweets and saves them to file.

saveTweetsToFile(tweets: Tweet[], username: string, options?: SaveOptions): Promise<{ filename: string }>

Saves tweets to a JSON file and resolves with the filename it was written to.

Analyzers

SentimentAnalyzer.analyze(tweets: Tweet[]): Promise<SentimentResult[]>

Analyzes sentiment of tweets.

TokenExtractor.extract(tweets: Tweet[]): Promise<TokenResult[]>

Extracts cryptocurrency tokens from tweets.

BotDetector.analyze(tweets: Tweet[]): Promise<BotAnalysis>

Detects bot-like behavior in tweets.

SignalGenerator.generateSignals(params: SignalParams): Promise<Signal[]>

Generates trading signals from tweet analysis.

πŸ“‹ Data Types

Tweet

interface Tweet {
  id: string;
  username: string;
  text: string;
  createdAt: Date;
  likes: number;
  retweets: number;
  replies: number;
  isReply: boolean;
  isRetweet: boolean;
  media?: MediaData[];
  hashtags: string[];
  mentions: string[];
  urls: string[];
}

SentimentResult

interface SentimentResult {
  tweet: Tweet;
  sentiment: 'positive' | 'negative' | 'neutral';
  confidence: number;
  emotions: string[];
  score: number;
}

TokenResult

interface TokenResult {
  tweet: Tweet;
  tokens: TokenMention[];
}

interface TokenMention {
  symbol: string;
  address?: string;
  blockchain: string;
  context: string;
  confidence: number;
}

Signal

interface Signal {
  id: string;
  type: 'buy' | 'sell' | 'hold' | 'alert';
  token: string;
  strength: number;  // 1-10
  confidence: number; // 0-100
  reason: string;
  timeframe: string;
  timestamp: Date;
  supportingTweets: Tweet[];
}

πŸ§ͺ Testing

import { describe, test, expect, mock } from 'bun:test';
import { ScraperService } from '@podx/scraper';

describe('ScraperService', () => {
  test('should scrape tweets from account', async () => {
    const scraper = new ScraperService();

    // Mock the scraper
    mock.module('agent-twitter-client', () => ({
      Scraper: class {
        async login() {}
        async getTweets() {
          return [
            {
              id: '1',
              username: 'testuser',
              text: 'Hello world!',
              createdAt: new Date(),
              likes: 10,
              retweets: 5,
              replies: 2
            }
          ];
        }
      }
    }));

    const tweets = await scraper.scrapeAccount({
      targetUsername: 'testuser',
      maxTweets: 1
    });

    expect(tweets).toHaveLength(1);
    expect(tweets[0].username).toBe('testuser');
  });

  test('should handle authentication errors', async () => {
    const scraper = new ScraperService();

    // Mock authentication failure
    mock.module('agent-twitter-client', () => ({
      Scraper: class {
        async login() {
          throw new Error('Invalid credentials');
        }
      }
    }));

    await expect(
      scraper.scrapeAccount({
        targetUsername: 'testuser',
        maxTweets: 1
      })
    ).rejects.toThrow('Invalid credentials');
  });
});

⚑ Performance Optimization

Rate Limiting

// Configure rate limiting to avoid Twitter API limits
const scraper = new ScraperService({
  rateLimit: {
    requestsPerMinute: 30,
    delayBetweenRequests: 2000
  }
});
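One plausible reading of how these two options interact: the requestsPerMinute budget and the delayBetweenRequests floor combine into a single effective gap between calls, whichever is larger. A sketch (the helper is illustrative, not part of the package):

```typescript
// Minimum gap (ms) between requests given a per-minute budget and a hard floor.
function minGapMs(requestsPerMinute: number, delayBetweenRequests: number): number {
  return Math.max(60_000 / requestsPerMinute, delayBetweenRequests);
}

// With 30 req/min and a 2000ms floor, the effective gap is 2000ms.
```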

Caching

// Cache analysis results to improve performance
const cache = new AnalysisCache();

const analyzer = new SentimentAnalyzer({
  cache: cache
});

// Results are cached automatically
const result1 = await analyzer.analyze(tweets);
const result2 = await analyzer.analyze(tweets); // Uses cache
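One way such a cache can key results is by the set of tweet ids being analyzed, so the same batch hits the cache regardless of order. A sketch (AnalysisCache's real keying strategy is not documented; this class is purely illustrative):

```typescript
// Illustrative cache: derive an order-independent key from the tweet ids
// in a batch, and memoize whatever analysis result T is stored under it.
class AnalysisCacheSketch<T> {
  private store = new Map<string, T>();

  private keyFor(ids: string[]): string {
    return ids.slice().sort().join(',');
  }

  get(ids: string[]): T | undefined {
    return this.store.get(this.keyFor(ids));
  }

  set(ids: string[], value: T): void {
    this.store.set(this.keyFor(ids), value);
  }
}
```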

Parallel Processing

// Process multiple accounts in parallel
const usernames = ['user1', 'user2', 'user3'];
const results = await Promise.allSettled(
  usernames.map(username =>
    scraper.scrapeAccount({ targetUsername: username, maxTweets: 100 })
  )
);
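Promise.allSettled fires every request at once, which can collide with rate limits when the list of accounts grows. If you need to cap simultaneous scrapes, a small self-contained limiter can help (this helper is not part of the package):

```typescript
// Run fn over items with at most `limit` in flight at once, preserving order.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;  // claim the next index
      results[i] = await fn(items[i]);
    }
  }

  await Promise.all(Array.from({ length: Math.min(limit, items.length) }, worker));
  return results;
}
```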

πŸ”’ Security Considerations

  • Credential Protection: Never store credentials in code
  • Rate Limiting: Respect Twitter's API limits
  • Data Privacy: Handle user data responsibly
  • Error Handling: Don't expose sensitive information in errors
  • Logging: Be careful with sensitive data in logs

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Ensure all tests pass
  5. Submit a pull request

πŸ“ License

This package is licensed under the ISC License. See the LICENSE file for details.

πŸ”— Related Packages

πŸ“ž Support

For support and questions: