@pioneer-platform/pioneer-discovery-service

v0.2.90

Published

2 months ago

AI-powered discovery service for dApps, networks, and assets

bithighlander

blockchain dapp discovery agent ai

Pioneer Discovery Service

AI-powered persistent agent for discovering and analyzing dApps, networks, and assets in the Pioneer ecosystem.

Overview

The Discovery Service is a Node.js worker that runs periodically (hourly by default) to:

Analyze Existing Data: Reviews CAIPs, networkIds, and dapps that Pioneer users are using
Price Discovery: Tests free price APIs and monitors primary asset prices
DApp Investigation: Deep analysis of dApps including contracts, social, metrics, and security
Scam Detection: Identifies potential scams and malicious services
Web Crawling: Investigates the open internet to find new dapps and services (TODO)
Database Population: Adds verified dapps to the MongoDB Atlas vector database
Report Generation: Creates detailed reports on each discovery run
Discord Alerts: Sends notifications for empty prices and rate limits

Architecture

MongoDB Collections

The service uses MongoDB Atlas with vector search capabilities:

discovery_dapps - Dapp records with analysis and vector embeddings
discovery_networks - Network/blockchain records
discovery_assets - Asset/token records with CAIP identifiers
discovery_reports - Generated analysis reports
discovery_state - Crawler state and scheduling info
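
As a rough illustration of what a discovery_dapps record might look like, here is a minimal TypeScript sketch. Only `embedding`, `whitelist`, `scamScore`, and `analysis` are named elsewhere in this README; every other field name and all value types are assumptions for illustration, and the actual interfaces live in src/types/.

```typescript
// Hypothetical shape of a discovery_dapps record. Only `embedding`,
// `whitelist`, `scamScore`, and `analysis` are named in this README;
// the remaining fields and all value types are assumptions.
interface DiscoveryDappSketch {
  id: string;                          // assumed identifier field
  name: string;                        // assumed display name
  url: string;                         // assumed dapp URL
  whitelist: boolean;                  // assumed to be a boolean flag
  scamScore: number;                   // assumed to be a numeric risk score
  analysis?: Record<string, unknown>;  // analysis payload, shape unknown
  embedding?: number[];                // vector used by Atlas Vector Search
}

// Narrow an unknown value (for example a raw document) to the sketch type.
function isDiscoveryDappSketch(value: unknown): value is DiscoveryDappSketch {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.id === "string" &&
    typeof v.name === "string" &&
    typeof v.url === "string" &&
    typeof v.whitelist === "boolean" &&
    typeof v.scamScore === "number" &&
    (v.embedding === undefined ||
      (Array.isArray(v.embedding) && v.embedding.every((n) => typeof n === "number")))
  );
}
```

A guard like this can be useful when reading documents written by older versions of the service, since it rejects records that lack the fields the vector index filters on.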

Components

Agent (src/agent/) - Main discovery orchestrator
Database (src/db/) - MongoDB wrapper with vector collections
Fetchers (src/fetchers/) - Data retrieval from existing Pioneer databases
Analyzer (src/analyzer/) - Dapp, network, and asset analysis
Reporter (src/reporter/) - Report generation and statistics
Workers (src/workers/) - Background workers for specialized tasks
- Price Discovery - Tests free price APIs and monitors primary assets
- DApp Investigator - Deep investigation of dApps (contracts, social, security)
Types (src/types/) - TypeScript interfaces for all entities

Installation

# Install dependencies
bun install

# Build
bun run build

# Development mode (auto-reload)
bun run dev

# Production
bun run start

Configuration

Environment Variables

# MongoDB connection (required)
MONGO_CONNECTION=mongodb://username:password@host:port/database
# Or for Atlas:
# MONGO_CONNECTION=mongodb+srv://username:password@<cluster-host>/database

# Optional: Cron schedule (default: "0 * * * *" = hourly)
DISCOVERY_CRON_SCHEDULE="0 * * * *"

Cron Schedule

Default: 0 * * * * (every hour at minute 0)

For testing: */5 * * * * (every 5 minutes)

To change the schedule, set DISCOVERY_CRON_SCHEDULE or edit the default in src/index.ts.
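
The fallback from DISCOVERY_CRON_SCHEDULE to the hourly default could be resolved along these lines. This is a minimal sketch, not the code in src/index.ts: the function name `resolveSchedule` is hypothetical, and falling back to the default on a malformed value is an assumption about the desired behavior.

```typescript
// Default taken from this README.
const DEFAULT_SCHEDULE = "0 * * * *";

// Resolve the cron expression from an env-like object, falling back to the
// default when the variable is unset, blank, or not a five-field expression.
// The five-field check is a shallow sanity test, not a full cron parser.
function resolveSchedule(env: Record<string, string | undefined>): string {
  const raw = env.DISCOVERY_CRON_SCHEDULE?.trim();
  if (!raw) return DEFAULT_SCHEDULE;
  const fields = raw.split(/\s+/);
  return fields.length === 5 ? raw : DEFAULT_SCHEDULE;
}
```

In a Node.js process this would typically be called as `resolveSchedule(process.env)`.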

Usage

Running the Service

# Development with auto-reload
bun run dev

# Production
bun run build && bun run start

The service will:

Initialize MongoDB connection and indexes
Run discovery agent immediately on startup
Schedule hourly cron runs
Continue running until stopped (Ctrl+C)
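
The startup order above, together with graceful shutdown, can be sketched with injected dependencies. Everything named here (`ServiceDeps`, `startService`, and the four callbacks) is hypothetical and only illustrates the sequence; the real wiring lives in src/index.ts and may differ.

```typescript
// Hypothetical dependencies; the real service wires these up in src/index.ts.
interface ServiceDeps {
  initDatabase: () => Promise<void>;   // connect and ensure indexes
  runDiscovery: () => Promise<void>;   // one discovery run
  schedule: (cron: string, job: () => void) => { stop: () => void };
  closeDatabase: () => Promise<void>;  // release the connection
}

// Start the service in the order listed above and return a shutdown function.
async function startService(
  deps: ServiceDeps,
  cron: string = "0 * * * *",
): Promise<() => Promise<void>> {
  await deps.initDatabase();
  await deps.runDiscovery(); // immediate run on startup
  const task = deps.schedule(cron, () => {
    // Log and swallow errors so one failed run does not stop the scheduler.
    deps.runDiscovery().catch((err) => console.error("discovery run failed:", err));
  });
  return async () => {
    task.stop();                // stop scheduling new runs
    await deps.closeDatabase(); // then close the database
  };
}
```

A caller could register the returned function as a SIGINT handler so that Ctrl+C stops the scheduler before the database connection is closed.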

Manual Trigger

To manually trigger a discovery run, you can import and call the agent:

import { discoveryAgent } from './src/agent';

await discoveryAgent.run();

Features

Current Features (v0.1.0)

✅ MongoDB Atlas vector collection setup
✅ Data fetching from existing Pioneer databases
✅ Dapp analysis with basic scam detection
✅ Network and asset synchronization
✅ Report generation with statistics
✅ Hourly cron scheduling
✅ Graceful shutdown handling

Planned Features

🔲 Web crawler for discovering new dapps
🔲 Advanced scam detection with ML models
🔲 Vector embeddings for semantic search
🔲 Social media presence verification
🔲 Contract audit integration
🔲 User reputation system
🔲 Automated whitelist management
🔲 REST API for querying discovery data

MongoDB Atlas Vector Search

The collections are designed to work with MongoDB Atlas Vector Search. To enable vector search:

Create a Search Index in Atlas UI
Select "JSON Editor" mode
Use this index definition:

{
  "fields": [
    {
      "type": "vector",
      "path": "embedding",
      "numDimensions": 1536,
      "similarity": "cosine"
    },
    {
      "type": "filter",
      "path": "whitelist"
    },
    {
      "type": "filter",
      "path": "scamScore"
    }
  ]
}

Apply to discovery_dapps, discovery_networks, and discovery_assets collections
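
Once the index exists, a query against it would go through the `$vectorSearch` aggregation stage. The sketch below only builds the pipeline; the helper name, the index name `vector_index`, and the `maxScamScore` threshold (which assumes a lower scamScore means lower risk) are assumptions rather than part of this service.

```typescript
// Options for the sketch; the index name and threshold are assumptions.
interface DappSearchOptions {
  indexName?: string;    // hypothetical Atlas index name
  limit?: number;        // number of results to return
  maxScamScore?: number; // assumed: lower scamScore means lower risk
}

// Build an aggregation pipeline that runs $vectorSearch over the
// `embedding` path and pre-filters on `whitelist` and `scamScore`,
// the two filter fields declared in the index definition above.
function buildDappSearchPipeline(
  queryVector: number[],
  opts: DappSearchOptions = {},
): Record<string, unknown>[] {
  const { indexName = "vector_index", limit = 10, maxScamScore = 0.5 } = opts;
  if (queryVector.length !== 1536) {
    throw new Error(`expected 1536 dimensions, got ${queryVector.length}`);
  }
  return [
    {
      $vectorSearch: {
        index: indexName,
        path: "embedding",
        queryVector,
        // Oversample candidates before ranking; Atlas caps this at 10000.
        numCandidates: Math.min(limit * 10, 10000),
        limit,
        filter: {
          $and: [
            { whitelist: { $eq: true } },
            { scamScore: { $lt: maxScamScore } },
          ],
        },
      },
    },
    // Attach the similarity score, then drop the large embedding array.
    { $addFields: { score: { $meta: "vectorSearchScore" } } },
    { $project: { embedding: 0 } },
  ];
}
```

The resulting array can be passed to a collection's `aggregate()` call. The query vector must come from the same embedding model used to populate the `embedding` field, since the index is fixed at 1536 dimensions.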

API Integration

The service is designed to be used by the Pioneer Server (pioneer-server). The existing dapps controller can query the discovery database for enhanced dapp information.

Example integration in dapps.controller.ts:

import { discoveryDB } from '@pioneer-platform/pioneer-discovery-service';

// Get analyzed dapp data
const dapp = await discoveryDB.getDapp(dappId);
console.log(dapp.analysis); // Full analysis
console.log(dapp.scamScore); // Scam risk score

Reports

Each discovery run generates a detailed report that is saved to the discovery_reports collection:

Statistics: Dapps analyzed, discovered, whitelisted, flagged
Findings: New dapps, scams detected, verified dapps
Recommendations: Action items for manual review
Logs: Complete execution log

View latest report:

import { discoveryDB } from './src/db';

const report = await discoveryDB.getLatestReport();
console.log(report);
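
A report with the four parts listed above might be modelled and summarized as follows. The field names and the `summarizeReport` helper are assumptions for illustration; the actual report type is defined in src/types/.

```typescript
// Hypothetical report shape mirroring the four parts listed above.
interface DiscoveryReportSketch {
  statistics: {
    analyzed: number;
    discovered: number;
    whitelisted: number;
    flagged: number;
  };
  findings: string[];        // new dapps, scams detected, verified dapps
  recommendations: string[]; // action items for manual review
  logs: string[];            // execution log lines
}

// Produce a one-line summary suitable for a log line or an alert message.
function summarizeReport(report: DiscoveryReportSketch): string {
  const s = report.statistics;
  const head =
    `analyzed=${s.analyzed} discovered=${s.discovered} ` +
    `whitelisted=${s.whitelisted} flagged=${s.flagged}`;
  const review = report.recommendations.length;
  return review > 0 ? `${head} (${review} item(s) for manual review)` : head;
}
```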

Development

Project Structure

pioneer-discovery/
├── src/
│   ├── agent/          # Main discovery orchestrator
│   ├── analyzer/       # Analysis logic
│   ├── db/             # Database layer
│   ├── fetchers/       # Data fetchers
│   ├── reporter/       # Report generation
│   ├── types/          # TypeScript types
│   ├── workers/        # Background workers (price discovery, dApp investigator)
│   └── index.ts        # Entry point with cron
├── package.json
├── tsconfig.json
└── README.md

Adding New Analyzers

Extend DiscoveryAnalyzer in src/analyzer/index.ts
Add new analysis fields to types in src/types/index.ts
Update report generation in src/reporter/index.ts

Adding Web Crawlers

TODO: Implement in Phase 3 of agent workflow

Planned approach:

Use cheerio for HTML parsing
Puppeteer for JavaScript-heavy sites
Rate limiting and respectful crawling
Domain whitelist/blacklist
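
Since the crawler is not implemented yet, the following is only a sketch of how the last two items (rate limiting and the domain whitelist/blacklist) could work. Both `isCrawlAllowed` and `HostRateLimiter` are hypothetical names, and the rule that the blacklist overrides the whitelist is an assumption.

```typescript
// Decide whether a URL may be crawled. The blacklist wins over the whitelist,
// and an empty whitelist means "allow anything that is not blacklisted".
// Domains in both lists are expected to be lowercase.
function isCrawlAllowed(rawUrl: string, whitelist: string[], blacklist: string[]): boolean {
  let host: string;
  try {
    host = new URL(rawUrl).hostname.toLowerCase();
  } catch {
    return false; // not a parseable absolute URL
  }
  // Match the domain itself or any of its subdomains.
  const matches = (domain: string) => host === domain || host.endsWith("." + domain);
  if (blacklist.some(matches)) return false;
  return whitelist.length === 0 || whitelist.some(matches);
}

// Per-host rate limiter: reserve() returns how many milliseconds the caller
// should wait before the next request to `host`, and books that time slot.
class HostRateLimiter {
  private readonly minIntervalMs: number;
  private readonly nextSlot = new Map<string, number>();

  constructor(minIntervalMs: number) {
    this.minIntervalMs = minIntervalMs;
  }

  reserve(host: string, now: number = Date.now()): number {
    const slot = Math.max(now, this.nextSlot.get(host) ?? 0);
    this.nextSlot.set(host, slot + this.minIntervalMs);
    return slot - now;
  }
}
```

Keeping the limiter per host means a slow or strict site does not hold up requests to other domains.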

Contributing

This service is part of the Pioneer Platform monorepo. Follow the monorepo development guidelines.

License

ISC