@thehotelsnetwork/ripley
v1.1.0
AI-powered parser generator for hotel booking engines
MCP Parser Generator
Automated system for generating hotel data extractors using artificial intelligence. Creates specialized functions to extract information from hotel booking and room rates pages.
🎯 New Feature: Visible Browser Mode
You can now watch the AI perform its automation tasks in real time. Set MCP_BROWSER_HEADLESS=false to see the browser window during parser generation. See docs/visible-browser-mode.md for details.
🚀 Main Commands
# === Main commands ===
npm run booking-draft # Generate extractor for User Register Page
npm run price-parser # Generate Rooms and Rates extractor
npm run deep-link # Generate deep link builder functions
npm run validate-deep-link # Validate deep-link generator functions
# === Development tools ===
npm run mcp:server # Start MCP server
MCP_BROWSER_HEADLESS=false npm run mcp:server:dev # Start MCP server in visible browser mode
npm run clear-parsers # Clean empty parser sessions
# === Utilities ===
npm run list-tools # List available MCP tools
📁 Parser Output Directories
Generated extractors are automatically saved in:
/parsers/
  /booking-draft/ # Checkout/registration extractors
    /session-2025-10-15T09-08-56-652Z/
      /domain-name-1/
        - parser.js # Main function
        - debug-response.md # Debug information
        - session.log # Processing log
  /price/ # Room rates extractors
    /session-2025-10-14T11-28-13-405Z/
      /domain-name-1/
        - parser.js # Main function
        - debug-response.md # Debug information
        - session.log # Processing log
  /deep-link/ # Deep link URL builders
    /session-2025-10-28T14-30-22-123Z/
      /domain-name-1/
        - parser.js # Main function
        - debug-response.md # Debug information
        - session.log # Processing log
  /deep-link-validation/ # Deep-link generator validator sessions
    /session-2025-11-05T14-30-22-123Z/
      - attempts/ # Per-attempt artifacts (validation-results.json)
      - final-validation.json # Aggregated statuses & final score
      - final-summary.md # Human-friendly summary
      - run-summary.json # Summary stats
      - session.log # Execution log
Each parser includes:
- ✅ Complete specialized function (bookingDraftParseData(), priceParseData(), or generateDeepLink())
- ✅ Metadata (engine, URL, date, model used)
- ✅ Documentation of extracted fields or parameters
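The directory layout above lends itself to a small lookup helper. The following is a minimal sketch, not part of the package: `findLatestParsers` is a hypothetical name, and it assumes only what the tree shows, namely that session folders start with `session-` followed by a sortable timestamp and that each domain folder may contain a `parser.js`.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper (not shipped with the package): returns the parser.js
// paths from the most recent session of one parser type
// ('booking-draft', 'price' or 'deep-link'), or [] if no session exists.
function findLatestParsers(parsersRoot, type) {
  const typeDir = path.join(parsersRoot, type);
  if (!fs.existsSync(typeDir)) return [];

  // Session folder names embed an ISO-like timestamp, so a plain string
  // sort orders them chronologically.
  const sessions = fs
    .readdirSync(typeDir, { withFileTypes: true })
    .filter((entry) => entry.isDirectory() && entry.name.startsWith('session-'))
    .map((entry) => entry.name)
    .sort();
  if (sessions.length === 0) return [];

  const latest = path.join(typeDir, sessions[sessions.length - 1]);
  return fs
    .readdirSync(latest, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .map((entry) => path.join(latest, entry.name, 'parser.js'))
    .filter((file) => fs.existsSync(file));
}
```

Calling `findLatestParsers('parsers', 'price')` from the project root would then list the room rates parsers produced by the last run, one per domain folder.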
🛠 Detailed Commands
User Register Page extractor (booking-draft)
Generates bookingDraftParseData() functions to extract registration form data:
npm run booking-draft
What it does:
- Navigates to the hotel's User Register Page
- Analyzes the data for the selected room and the reservation dates
- Generates function to automate form filling
Room and Rates extractor (price-parser)
Generates priceParseData() functions to extract pricing information:
npm run price-parser
What it does:
- Navigates to room search pages
- Analyzes displayed availability and prices
- Generates function to obtain rate information programmatically
Deep Link Generator (deep-link)
Generates generateDeepLink() functions to build booking URLs with parameters:
npm run deep-link
What it does:
- Analyzes booking flow and URL structure
- Identifies all possible booking parameters
- Generates function to build deep links programmatically
Usage example:
const url = generateDeepLink({
  dateBegin: '2025/03/01',
  dateEnd: '2025/03/05',
  language: 'en',
  currency: 'EUR',
  roomsDispo: [
    { adults: 2, children: 1, childrenAges: [8] },
    { adults: 2, children: 0 }
  ]
});
See docs/deep-link-command.md for detailed documentation.
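To make the shape of such a function concrete, here is a minimal sketch of what a builder accepting the same input could look like. It is an illustration only: `buildExampleDeepLink`, the base URL, and every query parameter name (`checkin`, `checkout`, `lang`, `room1_adults`, and so on) are invented, since each generated generateDeepLink() targets the URL scheme of a specific booking engine.

```javascript
// Hypothetical builder: the base URL and query parameter names are invented
// for illustration and do not correspond to any real booking engine.
function buildExampleDeepLink({ dateBegin, dateEnd, language, currency, roomsDispo = [] }) {
  if (!dateBegin || !dateEnd) {
    throw new Error('dateBegin and dateEnd are required');
  }

  const url = new URL('https://booking.example.com/search');
  // Convert 'YYYY/MM/DD' to 'YYYY-MM-DD' for the assumed engine format.
  url.searchParams.set('checkin', dateBegin.replaceAll('/', '-'));
  url.searchParams.set('checkout', dateEnd.replaceAll('/', '-'));
  if (language) url.searchParams.set('lang', language);
  if (currency) url.searchParams.set('currency', currency);

  // One group of parameters per requested room, numbered from 1.
  roomsDispo.forEach((room, index) => {
    const n = index + 1;
    url.searchParams.set(`room${n}_adults`, String(room.adults));
    url.searchParams.set(`room${n}_children`, String(room.children ?? 0));
    if (room.childrenAges && room.childrenAges.length > 0) {
      url.searchParams.set(`room${n}_ages`, room.childrenAges.join('-'));
    }
  });

  return url.toString();
}
```

Treating the dates as required and everything else as optional mirrors how the validator described in the next section weighs parameters; the real generated function may handle missing values differently.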
Deep Link Validator
Validate a JavaScript deep‑link generator function end‑to‑end.
What it does:
It always invokes generateDeepLink(dataset), navigates to the resulting URL in a clean browser context, and verifies Rooms & Rates and parameter application (dates are mandatory; other params are informative and scored).
Usage example:
# From file
npm run validate-deep-link -- --fn-file /abs/path/generator.js [--max-attempts 10] [--dataset '{"language":"en"}']
Artifacts:
- parsers/deep-link-validation/session-<ISO>/attempts/*/validation-results.json
- final-validation.json, final-summary.md, run-summary.json, session.log
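The "dates are mandatory; other params are informative and scored" rule can be pictured with a short sketch. Everything here is an assumption made for illustration: `summarizeChecks` is a hypothetical name, the `'ok'` status is invented (only `unknown` and `not_supported` appear in this README), and the ratio-based score is not the validator's actual formula.

```javascript
// Hypothetical scoring sketch: dates act as a gate, the remaining
// parameters only contribute to a score between 0 and 1.
function summarizeChecks(checks) {
  const datesOk = checks.dateBegin === 'ok' && checks.dateEnd === 'ok';
  if (!datesOk) {
    // Failing the date gate fails the whole run, whatever else was applied.
    return { passed: false, score: 0 };
  }

  const optional = Object.entries(checks).filter(
    ([name]) => name !== 'dateBegin' && name !== 'dateEnd'
  );
  const applied = optional.filter(([, status]) => status === 'ok').length;
  const score = optional.length === 0 ? 1 : applied / optional.length;
  return { passed: true, score };
}
```

With this sketch, a run where both dates and the language are applied but the currency is `not_supported` would pass with a score of 0.5, while any run with a missing date would fail outright.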
Prompts overview:
- src/core/prompts/deep-link/generator-validation.md: Used by the generator-based validator. Tells the model to execute generateDeepLink(dataset) in an isolated context, navigate, and output a single VALIDATION RESULTS JSON (classification, urlParamCoverage, guestsCoverage, constraintsDetected, nextSuggestedDataset, etc.). It enforces dates-as-gate, focused rechecks for unknown/not_supported, and a strict JSON contract.
- src/core/prompts/deep-link/url-validation.md: Used to validate an existing deep-link URL (no generator). The model infers expected parameters from the URL itself, navigates, extracts observed values from UI, compares expected vs observed, and returns a concise JSON with checks and reasons.
See docs/validate-deeplink-command.md for detailed documentation.
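A strict JSON contract like the one the generator-validation prompt enforces is typically checked before a result is trusted. The sketch below is a guess at what such a check could look like: `findMissingKeys` is a hypothetical helper, and the key list contains only the fields named in the prompt description, which ends in "etc.", so the real contract is likely larger.

```javascript
// Keys named in the prompt description; the actual contract lists more
// ("etc."), so this set is deliberately incomplete.
const REQUIRED_KEYS = [
  'classification',
  'urlParamCoverage',
  'guestsCoverage',
  'constraintsDetected',
  'nextSuggestedDataset'
];

// Hypothetical helper: parses a raw model response and reports which of the
// known top-level keys are absent. Throws if the text is not a JSON object.
function findMissingKeys(rawJson) {
  const parsed = JSON.parse(rawJson); // throws SyntaxError on malformed JSON
  if (parsed === null || typeof parsed !== 'object' || Array.isArray(parsed)) {
    throw new TypeError('validation results must be a JSON object');
  }
  return REQUIRED_KEYS.filter((key) => !(key in parsed));
}
```

An empty array means all known keys are present; it says nothing about whether the values themselves are well formed, which would need the full schema from the prompt file.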
Quick Start
- Copy the .env.example file and set the following AI provider credentials
# Option A: OpenAI (preferred provider)
export AI_PROVIDER="openai"
export OPENAI_API_KEY="your-openai-key"
export OPENAI_MODEL="gpt-5" # or the model you plan to use
# Option B: Anthropic
ANTHROPIC_API_KEY=your-anthropic-key # Anthropic API key (optional if using OpenAI)
CLAUDE_MODEL=claude-opus-4-1 # Claude model to use (default: claude-opus-4-1)
# AI Provider Selection (mandatory)
AI_PROVIDER=auto # Which AI provider to use: 'claude', 'openai', or 'auto' (auto tries OpenAI first, falls back to Claude)
MCP_SERVER_URL=http://localhost:8931 # Local MCP server URL (if using local server)
- Start the MCP server (new terminal)
npm run mcp:server
- Run a quick single-URL test
# Checkout / User Register Page flow
npm run booking-draft -- --url "https://example-hotel.com"
# Rooms and Rates flow
npm run price-parser -- --url "https://example-hotel.com"
- Batch mode (optional)
# Create urls.csv with header and at least one row
cat > urls.csv << 'EOF'
booking_system,site_id,page_name,url
unknown,hotel-1,home,https://example-hotel.com
EOF
Run in batch
npm run booking-draft
# or
npm run price-parser
- Where to find results
# Generated files are saved under:
parsers/booking-draft/session-*/<domain-n>/
parsers/price/session-*/<domain-n>/
parsers/deep-link/session-*/<domain-n>/
parsers/deep-link-validation/session-*/
# Common files per URL folder:
# - parser.js (generated parser)
# - debug-response.md (raw AI response for debugging)
# - session.log (log of the run)
# For validator sessions, see detailed artifact structure in docs/validate-deep-link-command.md
❓ Troubleshooting
No file generated in parsers/
- Ensure your AI credentials and MCP server are correctly configured
- Try single-URL mode first to validate connectivity
- Check debug-response.md and session.log under the URL folder
Page doesn't load correctly
- Some sites may have anti-bot protections
- Try the URL manually first
- Consider using more specific internal page URLs
📚 Documentation
- Architecture Overview - Detailed system architecture and design patterns
- Environment Setup - Configuration and setup guide
- Best Practices - Development guidelines and tips
- Local MCP Server Guide - Running local MCP servers
- Deep‑link Validator Command - Detailed documentation for the deep‑link validation CLI
- Validation Explained - Beginner‑friendly overview of how validation and scoring work
