browsefn
v0.0.1
Self-hosted web browsing and data extraction platform with web scraping, image search, and geolocation services
BrowseFn
Self-hosted web browsing and data extraction platform for developers
BrowseFn combines web scraping, batch metadata extraction, image search and download, and geolocation services in a single self-hosted platform built for developers.
Features
- Web Scraping & Crawling: Multiple provider support (Firecrawl, Puppeteer, Playwright, Cheerio, Jina AI)
- Batch Metadata Extraction: Extract metadata from multiple URLs efficiently
- Image Search & Download: Browse and download images from Unsplash, Pixabay, Pexels
- Geolocation Services: Geocoding, reverse geocoding, and place search
- Provider-Agnostic: Unified interface with automatic fallback
- Performance: Built-in caching, rate limiting, and retry logic
- Type-Safe: Full TypeScript support with type inference
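The "automatic fallback" idea in the feature list can be pictured roughly as follows. This is a minimal sketch of the general pattern, not BrowseFn's implementation: the `Provider` type, `fetchWithFallback`, and the two stub providers are hypothetical names invented for illustration and are not part of the documented API.

```typescript
// Hypothetical shape of a scraping backend; NOT BrowseFn's actual interface.
type Provider<T> = {
  name: string;
  load: (url: string) => Promise<T>;
};

// Try each provider in order and return the first successful result.
// If every provider fails, throw one error that lists each failure.
async function fetchWithFallback<T>(
  providers: Provider<T>[],
  url: string
): Promise<{ provider: string; result: T }> {
  const errors: string[] = [];
  for (const p of providers) {
    try {
      const result = await p.load(url);
      return { provider: p.name, result };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${p.name}: ${message}`);
    }
  }
  throw new Error(`All providers failed for ${url}: ${errors.join("; ")}`);
}

// Stub providers standing in for real backends (assumptions for the demo).
const flaky: Provider<string> = {
  name: "primary",
  load: async () => {
    throw new Error("timeout");
  },
};
const stable: Provider<string> = {
  name: "secondary",
  load: async (url) => `content of ${url}`,
};

// The first provider fails, so the second one serves the request.
fetchWithFallback([flaky, stable], "https://example.com").then((out) => {
  console.log(out.provider); // logs "secondary"
});
```

Collecting the per-provider error messages, rather than rethrowing only the last one, makes it easier to see why the whole chain failed.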
Installation
npm install browsefn
Optional Dependencies
# For Puppeteer support
npm install puppeteer
# For Playwright support
npm install playwright
Quick Start
import { browseFn } from 'browsefn';

const browse = browseFn({
  web: {
    defaultProvider: 'cheerio',
    cache: { enabled: true }
  }
});

// Scrape a webpage
const page = await browse.web.getPage('https://example.com', {
  format: 'markdown'
});
console.log(page.content);
Documentation
See SPEC.md for full API documentation.
License
Apache-2.0
