universal-read-api
v1.0.0
A serverless infrastructure tool that turns any website URL into structured JSON data for AI agents
Universal Read API
Build AI agents that can read any website.
A serverless API that turns any URL into structured JSON data using Cloudflare Workers and Google Gemini 2.5 Flash-Lite.
🚀 Features
- Free to Host: Runs entirely on the Cloudflare Workers Free plan.
- Fast Extraction: Uses a lightweight HTTP fetch plus regex-based HTML cleaning instead of a headless browser such as Puppeteer.
- Smart Schema: Define exactly what JSON structure you want back.
- Universal: Works on ~80% of websites (blogs, news, documentation, etc.).
- Auto-Summarization: If no schema is provided, it intelligently summarizes the page.
🛠️ Tech Stack
- Runtime: Cloudflare Workers (Hono framework)
- AI Model: Gemini 2.5 Flash-Lite
- Parsing: Zero-dependency Regex HTML-to-Markdown converter
- Language: TypeScript
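To illustrate the parsing layer listed above, here is a minimal sketch of what a zero-dependency, regex-based HTML-to-Markdown step might look like. It is not the project's actual converter: the function name `htmlToMarkdown`, the set of handled tags, and the handful of decoded entities are all assumptions made for this example.

```typescript
// Hypothetical helper for illustration; the project's real converter may differ.
function htmlToMarkdown(html: string): string {
  return html
    // Drop elements whose contents are never useful to the model.
    .replace(/<(script|style|noscript)\b[^>]*>[\s\S]*?<\/\1>/gi, "")
    // Drop HTML comments.
    .replace(/<!--[\s\S]*?-->/g, "")
    // Headings: <h1>..<h6> become "#".."######".
    .replace(
      /<h([1-6])\b[^>]*>([\s\S]*?)<\/h\1>/gi,
      (_match: string, level: string, text: string) =>
        `\n\n${"#".repeat(Number(level))} ${text.trim()}\n\n`
    )
    // Links with a double-quoted href: keep both the text and the target.
    .replace(/<a\b[^>]*\bhref="([^"]*)"[^>]*>([\s\S]*?)<\/a>/gi, "[$2]($1)")
    // List items become Markdown bullets.
    .replace(/<li\b[^>]*>/gi, "\n- ")
    // Closing block tags and <br> become paragraph breaks.
    .replace(/<\/(p|div|section|article|ul|ol|tr)>|<br\s*\/?>/gi, "\n\n")
    // Strip every remaining tag.
    .replace(/<[^>]+>/g, "")
    // Decode a few common entities (&amp; last, to avoid double decoding).
    .replace(/&nbsp;/g, " ")
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&amp;/g, "&")
    // Collapse runs of spaces and excess blank lines.
    .replace(/[ \t]+/g, " ")
    .replace(/\n[ \t]+/g, "\n")
    .replace(/\n{3,}/g, "\n\n")
    .trim();
}
```

Regular expressions are not a full HTML parser, so deeply nested or malformed markup may convert imperfectly. That trade-off is what keeps this approach small enough to run without a browser.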
⚡ Quick Start
1. Clone & Install
git clone https://github.com/RajeshKalidandi/universal-read-api.git
cd universal-read-api
npm install
2. Set Up Gemini API Key
Get a free API key from Google AI Studio.
For Local Development:
Create a .dev.vars file:
GEMINI_API_KEY=your_actual_api_key_here
For Production:
npx wrangler secret put GEMINI_API_KEY
3. Run Locally
npm run dev
4. Deploy
npm run deploy
🔌 API Usage
Endpoint: POST /extract
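For TypeScript callers, the request and response shapes documented in this section can be captured in a few types and a thin client. This is a sketch under assumptions: the names `ExtractRequest`, `ExtractResponse`, `FetchLike`, `buildExtractRequest`, and `extract` are hypothetical, the field list is inferred only from the example in this README, and the error handling is a guess because the README does not document failure responses.

```typescript
// Shapes inferred from the README's example; the real API may carry more fields.
interface ExtractRequest {
  url: string;
  schema?: Record<string, unknown>; // omit to get an automatic summary
}

interface ExtractResponse<T> {
  success: boolean;
  data: T;
  metadata: {
    url: string;
    model: string;
    tokensUsed: number;
    processingTimeMs: number;
  };
}

interface RequestOptions {
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Minimal fetch-compatible signature so the client can be tested with a stub.
type FetchLike = (
  input: string,
  init: RequestOptions
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Builds the POST options for /extract; the schema key is left out when not given.
function buildExtractRequest(
  url: string,
  schema?: Record<string, unknown>
): RequestOptions {
  const body: ExtractRequest = schema ? { url, schema } : { url };
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}

// Calls POST {baseUrl}/extract and returns the parsed envelope.
// Assumption: failures surface as a non-2xx status or as success: false.
async function extract<T>(
  fetchImpl: FetchLike,
  baseUrl: string,
  url: string,
  schema?: Record<string, unknown>
): Promise<ExtractResponse<T>> {
  const res = await fetchImpl(`${baseUrl}/extract`, buildExtractRequest(url, schema));
  if (!res.ok) {
    throw new Error(`Extract request failed with HTTP ${res.status}`);
  }
  const json = (await res.json()) as ExtractResponse<T>;
  if (!json.success) {
    throw new Error("Extract request reported success: false");
  }
  return json;
}
```

In a runtime with a global `fetch` (Cloudflare Workers, modern browsers, Node 18+), it can be passed in directly, for example `extract(fetch, "https://universal-read-api.rajeshdev.workers.dev", "https://example.com")`.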
Request:
curl -X POST https://universal-read-api.rajeshdev.workers.dev/extract \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com",
"schema": {
"title": "string",
"summary": "string",
"dates": ["string"]
}
}'
Response:
{
"success": true,
"data": {
"title": "Example Domain",
"summary": "This domain is for use in documentation examples...",
"dates": []
},
"metadata": {
"url": "https://example.com",
"model": "gemini-2.5-flash-lite",
"tokensUsed": 341,
"processingTimeMs": 1205
}
}
🤝 Contributing
We welcome contributions, whether that means fixing bugs, improving documentation, or adding new features.
How to Contribute
- Fork the repository
- Clone your fork:
git clone https://github.com/YOUR_USERNAME/universal-read-api.git
- Create a branch:
git checkout -b feature/amazing-feature
- Make your changes
- Commit:
git commit -m 'Add some amazing feature'
- Push:
git push origin feature/amazing-feature
- Open a Pull Request
Ideas for Contributions
- [ ] Add support for Puppeteer (Browser Rendering) as an optional mode
- [ ] Add rate limiting using Cloudflare KV/Durable Objects
- [ ] Add scraping fallback for different site structures
- [ ] Improve prompt engineering for specific extraction types
⚠️ Limitations
This version uses standard HTTP requests (fetch), not a full browser.
- Works great for: Static sites, blogs, news sites, wikis, and docs.
- Does not work for: Heavily client-side-rendered apps (some React/SPA sites) that need JavaScript to display any content.
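Because a plain `fetch` only sees the initial HTML, a caller may want to detect pages that are probably empty shells before spending tokens on them. The heuristic below is a hypothetical sketch and not part of this project: the function name `looksClientRendered` and the 200-character default threshold are arbitrary assumptions.

```typescript
// Hypothetical heuristic: a page that ships scripts but almost no visible
// text in its initial HTML is likely rendered on the client.
function looksClientRendered(html: string, minTextLength = 200): boolean {
  const visibleText = html
    // Remove non-visible content first.
    .replace(/<(script|style|noscript)\b[^>]*>[\s\S]*?<\/\1>/gi, "")
    // Replace remaining tags with spaces so adjacent words do not merge.
    .replace(/<[^>]+>/g, " ")
    // Normalize whitespace before measuring.
    .replace(/\s+/g, " ")
    .trim();
  const hasScripts = /<script\b/i.test(html);
  return hasScripts && visibleText.length < minTextLength;
}
```

A check like this can produce false positives on very short static pages that include a script, so the threshold is best tuned against the sites you actually target.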
📄 License
MIT © Rajesh Kalidandi
