@pioneer-platform/pioneer-discovery-service
v0.2.16
AI-powered discovery service for dApps, networks, and assets
Pioneer Discovery Service
AI-powered persistent agent for discovering and analyzing dApps, networks, and assets in the Pioneer ecosystem.
Overview
The Discovery Service is a Node.js worker that runs periodically (hourly by default) to:
- Analyze Existing Data: Reviews CAIPs, networkIds, and dapps that Pioneer users are using
- Price Discovery: Tests free price APIs and monitors primary asset prices
- DApp Investigation: Deep analysis of dApps including contracts, social, metrics, and security
- Scam Detection: Identifies potential scams and malicious services
- Web Crawling: Investigates the open internet to find new dapps and services (TODO)
- Database Population: Adds verified dapps to the MongoDB Atlas vector database
- Report Generation: Creates detailed reports on each discovery run
- Discord Alerts: Sends notifications for empty prices and rate limits
Architecture
MongoDB Collections
The service uses MongoDB Atlas with vector search capabilities:
- discovery_dapps - Dapp records with analysis and vector embeddings
- discovery_networks - Network/blockchain records
- discovery_assets - Asset/token records with CAIP identifiers
- discovery_reports - Generated analysis reports
- discovery_state - Crawler state and scheduling info
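As a rough illustration of what a discovery_dapps record might look like, the sketch below defines a hypothetical TypeScript shape with a small validity check. Only embedding, whitelist, scamScore, and analysis are named elsewhere in this README; id, name, url, and the validateDappRecord helper are assumptions for illustration, and the real definitions live in src/types/.

```typescript
// Hypothetical shape of a discovery_dapps record. Field names other than
// embedding, whitelist, scamScore, and analysis are assumptions.
interface DiscoveryDappRecord {
  id: string; // assumed identifier field
  name: string; // assumed display name
  url: string; // assumed dapp URL
  whitelist: boolean; // filter field in the vector index
  scamScore: number; // filter field in the vector index (scale not documented here)
  analysis?: Record<string, unknown>; // free-form analysis payload
  embedding?: number[]; // vector embedding, when present
}

// numDimensions taken from the index definition in this README.
const EMBEDDING_DIMENSIONS = 1536;

// Returns a list of problems; an empty list means the record looks well-formed.
function validateDappRecord(record: DiscoveryDappRecord): string[] {
  const problems: string[] = [];
  if (record.id.trim() === "") {
    problems.push("id is empty");
  }
  if (!Number.isFinite(record.scamScore)) {
    problems.push("scamScore is not a finite number");
  }
  if (record.embedding !== undefined && record.embedding.length !== EMBEDDING_DIMENSIONS) {
    problems.push(
      `embedding has ${record.embedding.length} dimensions, expected ${EMBEDDING_DIMENSIONS}`
    );
  }
  return problems;
}

const exampleRecord: DiscoveryDappRecord = {
  id: "example-dapp",
  name: "Example Dapp",
  url: "https://example.com",
  whitelist: false,
  scamScore: 0,
  embedding: new Array<number>(EMBEDDING_DIMENSIONS).fill(0),
};
```

Checking the embedding length before writing a record is one way to catch a mismatch with the index's numDimensions early, rather than at query time.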
Components
- Agent (src/agent/) - Main discovery orchestrator
- Database (src/db/) - MongoDB wrapper with vector collections
- Fetchers (src/fetchers/) - Data retrieval from existing Pioneer databases
- Analyzer (src/analyzer/) - Dapp, network, and asset analysis
- Reporter (src/reporter/) - Report generation and statistics
- Workers (src/workers/) - Background workers for specialized tasks
  - Price Discovery - Tests free price APIs and monitors primary assets
  - DApp Investigator - Deep investigation of dApps (contracts, social, security)
- Types (src/types/) - TypeScript interfaces for all entities
Installation
# Install dependencies
bun install
# Build
bun run build
# Development mode (auto-reload)
bun run dev
# Production
bun run start
Configuration
Environment Variables
# MongoDB connection (required)
MONGO_CONNECTION=mongodb://username:password@host:port/database
# Or for Atlas:
# MONGO_CONNECTION=mongodb+srv://username:password@<cluster-host>/database
# Optional: Cron schedule (default: "0 * * * *" = hourly)
DISCOVERY_CRON_SCHEDULE="0 * * * *"
Cron Schedule
Default: 0 * * * * (every hour at minute 0)
For testing: */5 * * * * (every 5 minutes)
To change the schedule, set DISCOVERY_CRON_SCHEDULE or edit src/index.ts.
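A minimal sketch of how the schedule could be resolved from the environment, assuming the service falls back to the hourly default when DISCOVERY_CRON_SCHEDULE is unset. The resolveSchedule and looksLikeCron helpers are hypothetical and not part of the service; the check only confirms the five-field shape and does not validate value ranges.

```typescript
const DEFAULT_SCHEDULE = "0 * * * *"; // hourly, at minute 0

// Loose shape check: exactly five whitespace-separated fields built from the
// characters used in numeric cron fields. Value ranges are not checked.
function looksLikeCron(expression: string): boolean {
  const fields = expression.trim().split(/\s+/);
  return fields.length === 5 && fields.every((field) => /^[\d*\/,-]+$/.test(field));
}

// Hypothetical helper: picks the configured schedule or the default.
function resolveSchedule(env: Record<string, string | undefined>): string {
  const configured = env["DISCOVERY_CRON_SCHEDULE"];
  if (configured === undefined || configured.trim() === "") {
    return DEFAULT_SCHEDULE;
  }
  if (!looksLikeCron(configured)) {
    throw new Error(`Invalid DISCOVERY_CRON_SCHEDULE: "${configured}"`);
  }
  return configured.trim();
}
```

In the entry point this would be called as resolveSchedule(process.env), so a typo in the variable fails at startup instead of silently never firing.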
Usage
Running the Service
# Development with auto-reload
bun run dev
# Production
bun run build && bun run start
The service will:
- Initialize MongoDB connection and indexes
- Run discovery agent immediately on startup
- Schedule hourly cron runs
- Continue running until stopped (Ctrl+C)
Manual Trigger
To manually trigger a discovery run, you can import and call the agent:
import { discoveryAgent } from './src/agent';
await discoveryAgent.run();
Features
Current Features (v0.1.0)
- ✅ MongoDB Atlas vector collection setup
- ✅ Data fetching from existing Pioneer databases
- ✅ Dapp analysis with basic scam detection
- ✅ Network and asset synchronization
- ✅ Report generation with statistics
- ✅ Hourly cron scheduling
- ✅ Graceful shutdown handling
Planned Features
- 🔲 Web crawler for discovering new dapps
- 🔲 Advanced scam detection with ML models
- 🔲 Vector embeddings for semantic search
- 🔲 Social media presence verification
- 🔲 Contract audit integration
- 🔲 User reputation system
- 🔲 Automated whitelist management
- 🔲 REST API for querying discovery data
MongoDB Atlas Vector Search
The collections are designed to work with MongoDB Atlas Vector Search. To enable vector search:
- Create a Search Index in Atlas UI
- Select "JSON Editor" mode
- Use this index definition:
{
  "fields": [
    {
      "type": "vector",
      "path": "embedding",
      "numDimensions": 1536,
      "similarity": "cosine"
    },
    {
      "type": "filter",
      "path": "whitelist"
    },
    {
      "type": "filter",
      "path": "scamScore"
    }
  ]
}
- Apply to the discovery_dapps, discovery_networks, and discovery_assets collections
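Once the index exists, a query against it could be built roughly as follows. This is a sketch, not the service's own query code: the buildDappSearchPipeline helper, the index name "vector_index", and the idea of capping scamScore are assumptions. The stage itself is Atlas's $vectorSearch aggregation stage, and the filter only uses the two fields declared as filter fields above.

```typescript
// Options for the hypothetical helper below.
interface DappSearchOptions {
  queryVector: number[]; // must match the index's numDimensions
  limit: number; // number of results to return
  maxScamScore?: number; // optional upper bound; the score scale is an assumption
  whitelistedOnly?: boolean;
}

const INDEX_DIMENSIONS = 1536; // numDimensions from the index definition
const MAX_CANDIDATES = 10000; // upper limit Atlas accepts for numCandidates

// Builds an aggregation pipeline around the $vectorSearch stage.
// "vector_index" is an assumed name; use the name chosen in the Atlas UI.
function buildDappSearchPipeline(
  options: DappSearchOptions,
  indexName: string = "vector_index"
): Record<string, unknown>[] {
  if (options.queryVector.length !== INDEX_DIMENSIONS) {
    throw new Error(`queryVector must have ${INDEX_DIMENSIONS} dimensions`);
  }

  // Pre-filter conditions on the fields indexed with type "filter".
  const conditions: Record<string, unknown>[] = [];
  if (options.whitelistedOnly) {
    conditions.push({ whitelist: { $eq: true } });
  }
  if (options.maxScamScore !== undefined) {
    conditions.push({ scamScore: { $lte: options.maxScamScore } });
  }

  const vectorSearch: Record<string, unknown> = {
    index: indexName,
    path: "embedding",
    queryVector: options.queryVector,
    // Oversample candidates relative to the limit, within the allowed maximum.
    numCandidates: Math.min(options.limit * 20, MAX_CANDIDATES),
    limit: options.limit,
  };
  if (conditions.length === 1) {
    vectorSearch["filter"] = conditions[0];
  } else if (conditions.length > 1) {
    vectorSearch["filter"] = { $and: conditions };
  }

  return [
    { $vectorSearch: vectorSearch },
    // Attach the similarity score to each returned document.
    { $addFields: { score: { $meta: "vectorSearchScore" } } },
  ];
}
```

The resulting array can be passed to collection.aggregate(...) in the MongoDB Node.js driver. The query vector has to come from the same embedding model that produced the stored embeddings, which this README does not specify.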
API Integration
The service is designed to be used by the Pioneer Server (pioneer-server). The existing dapps controller can query the discovery database for enhanced dapp information.
Example integration in dapps.controller.ts:
import { discoveryDB } from '@pioneer-platform/pioneer-discovery-service';
// Get analyzed dapp data
const dapp = await discoveryDB.getDapp(dappId);
console.log(dapp.analysis); // Full analysis
console.log(dapp.scamScore); // Scam risk score
Reports
Each discovery run generates a detailed report saved to the discovery_reports collection:
- Statistics: Dapps analyzed, discovered, whitelisted, flagged
- Findings: New dapps, scams detected, verified dapps
- Recommendations: Action items for manual review
- Logs: Complete execution log
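To make the four parts of a report concrete, here is a hypothetical TypeScript shape together with a one-line summary helper. The property names (statistics, findings, recommendations, logs, and their sub-fields) mirror the bullets above but are assumptions, as is summarizeReport; the actual report type is defined in src/types/.

```typescript
// Hypothetical report shape; property names are assumed from the list above.
interface DiscoveryReport {
  statistics: {
    analyzed: number;
    discovered: number;
    whitelisted: number;
    flagged: number;
  };
  findings: {
    newDapps: string[];
    scamsDetected: string[];
    verifiedDapps: string[];
  };
  recommendations: string[]; // action items for manual review
  logs: string[]; // execution log lines
}

// Condenses a report into a single line suitable for a log or alert message.
function summarizeReport(report: DiscoveryReport): string {
  const { analyzed, discovered, whitelisted, flagged } = report.statistics;
  return (
    `${analyzed} analyzed, ${discovered} discovered, ` +
    `${whitelisted} whitelisted, ${flagged} flagged; ` +
    `${report.recommendations.length} item(s) for manual review`
  );
}

const sampleReport: DiscoveryReport = {
  statistics: { analyzed: 12, discovered: 2, whitelisted: 9, flagged: 1 },
  findings: { newDapps: ["dapp-a", "dapp-b"], scamsDetected: ["dapp-x"], verifiedDapps: [] },
  recommendations: ["Review dapp-x"],
  logs: ["run started", "run finished"],
};
```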
View latest report:
import { discoveryDB } from './src/db';
const report = await discoveryDB.getLatestReport();
console.log(report);
Development
Project Structure
pioneer-discovery/
├── src/
│ ├── agent/ # Main discovery orchestrator
│ ├── analyzer/ # Analysis logic
│ ├── db/ # Database layer
│ ├── fetchers/ # Data fetchers
│ ├── reporter/ # Report generation
│ ├── types/ # TypeScript types
│ ├── workers/ # Background workers (price discovery, dApp investigation)
│ └── index.ts # Entry point with cron
├── package.json
├── tsconfig.json
└── README.md
Adding New Analyzers
- Extend DiscoveryAnalyzer in src/analyzer/index.ts
- Add new analysis fields to types in src/types/index.ts
- Update report generation in src/reporter/index.ts
Adding Web Crawlers
TODO: Implement in Phase 3 of agent workflow
Planned approach:
- Use cheerio for HTML parsing
- Puppeteer for JavaScript-heavy sites
- Rate limiting and respectful crawling
- Domain whitelist/blacklist
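The last two points of the planned approach can be sketched without any crawling library. The following is an illustrative outline only: CrawlPolicy, isAllowed, and HostRateLimiter are hypothetical names, and nothing here reflects an existing implementation, since the crawler is still a TODO. HTML fetching and parsing with cheerio or Puppeteer is deliberately left out.

```typescript
// Hypothetical crawl policy; names and semantics are illustrative only.
interface CrawlPolicy {
  allowedDomains: Set<string>; // if non-empty, only these hosts may be crawled
  blockedDomains: Set<string>; // always skipped; takes precedence over allowedDomains
}

// Decides whether a URL may be crawled under the given policy.
function isAllowed(url: string, policy: CrawlPolicy): boolean {
  const host = new URL(url).hostname.toLowerCase();
  if (policy.blockedDomains.has(host)) {
    return false;
  }
  return policy.allowedDomains.size === 0 || policy.allowedDomains.has(host);
}

// Enforces a minimum gap between requests to the same host.
class HostRateLimiter {
  private readonly minDelayMs: number;
  private readonly nextSlotBase = new Map<string, number>();

  constructor(minDelayMs: number) {
    this.minDelayMs = minDelayMs;
  }

  // Returns how many milliseconds the caller should wait before requesting
  // `host` at time `now`, and records the time the request will be made.
  reserve(host: string, now: number): number {
    const last = this.nextSlotBase.get(host);
    const wait = last === undefined ? 0 : Math.max(0, last + this.minDelayMs - now);
    this.nextSlotBase.set(host, now + wait);
    return wait;
  }
}
```

A crawler loop would check isAllowed first, then call reserve(host, Date.now()) and sleep for the returned duration before fetching. Passing the clock in as a parameter keeps the limiter easy to test.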
Contributing
This service is part of the Pioneer Platform monorepo. Follow the monorepo development guidelines.
License
ISC
