postal-code-scraper
v1.0.4
A tool for scraping country data, including regions and their postal codes
Postal Code Scraper
📌 Overview
Postal Code Scraper is an automated web scraper designed to extract postal code data from countries worldwide. It efficiently fetches postal codes and organizes them into structured JSON files for easy use in applications.
This library uses Puppeteer for web scraping, Cheerio for HTML parsing, and p-limit for controlling concurrency.
🚀 Features
- Scrape postal codes for one country or all countries
- Save results as JSON files for easy integration
- Region-structured output (country → region1 → region2 → region3 → ... → postal codes)
- Postal-code lookup output (postal code → region path)
- Configurable options (concurrency, retries, headless mode, output directory, logging, etc.)
- Fully asynchronous for optimized performance
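The region-structured output in the list above is a nested object whose leaves are arrays of postal codes. As a rough sketch of how such a file might be consumed, the hypothetical helper below (`collectPostalCodes`, not part of the package) walks the tree and yields each code together with its region path. The sample object is abridged from the package's Romania output example, and the assumption that every leaf is an array is taken from that example only.

```javascript
// Hypothetical helper; assumes nested objects with arrays of codes at the leaves.
function* collectPostalCodes(node, path = []) {
  if (Array.isArray(node)) {
    // Leaf: every entry is a postal code belonging to the current region path.
    for (const code of node) yield { code, path };
    return;
  }
  for (const [region, child] of Object.entries(node)) {
    yield* collectPostalCodes(child, [...path, region]);
  }
}

// Abridged stand-in for the parsed contents of romania-postal-codes.json.
const sampleTree = {
  cluj: { agarbiciu: ["407146"], "cluj-napoca": ["400001", "400002"] },
};

const flat = [...collectPostalCodes(sampleTree)];
console.log(flat.length); // 3
```

A generator keeps memory use flat for large countries, since codes are produced one at a time rather than collected up front.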
📦 Installation
Install via npm:
npm install postal-code-scraper
Or with Yarn:
yarn add postal-code-scraper
📖 Usage Guide
1️⃣ Import the Library
ES Module (Recommended):
import { PostalCodeScraper } from "postal-code-scraper";
CommonJS:
const { PostalCodeScraper } = require("postal-code-scraper");
2️⃣ Instantiate Scraper
const scraper = new PostalCodeScraper();
3️⃣ Scrape a Single Country
import { PostalCodeScraper } from "postal-code-scraper";
const scraper = new PostalCodeScraper();
await scraper.scrapeCountry("Romania");
📌 Output Files (saved in the output directory):
- romania-postal-codes.json
- romania-lookup.json
4️⃣ Scrape All Countries
import { PostalCodeScraper } from "postal-code-scraper";
const scraper = new PostalCodeScraper();
await scraper.scrapeCountries();
📌 This will fetch postal codes for every available country.
5️⃣ Customize Scraper Configuration
🛠 Configuration Options
| Option | Type | Default | Description |
| --------------- | --------------- | -------------------------------- | ------------------------------------------------------------ |
| directory | string | src/data | The directory to save data |
| concurrency | number | 15 | Maximum concurrent requests to process |
| maxRetries | number | 5 | Number of retries for failed requests |
| headless | boolean | true | Run Puppeteer in headless mode |
| usePrettyName | boolean | false | Use country pretty names instead of default names |
| logger | object or null | Logger (custom implementation) | Handles event logging; set to null to disable logging |
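To make the concurrency and maxRetries options more concrete, here is a minimal sketch of what a concurrency cap with per-task retries can look like. It is not the library's implementation (which, per the overview, relies on p-limit); `withRetries` and `runLimited` are hypothetical helper names used for illustration only.

```javascript
// Hypothetical helpers; not part of postal-code-scraper's API.

// Run an async task, retrying up to maxRetries more times after the first failure.
async function withRetries(task, maxRetries) {
  let lastError;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

// Run async tasks with at most `concurrency` in flight; results keep input order.
async function runLimited(tasks, concurrency, maxRetries) {
  const results = new Array(tasks.length);
  let next = 0;
  async function worker() {
    // Each worker pulls the next unclaimed task until none are left.
    while (next < tasks.length) {
      const index = next++;
      results[index] = await withRetries(tasks[index], maxRetries);
    }
  }
  const workerCount = Math.min(concurrency, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, worker));
  return results;
}
```

In this sketch a task that still fails after its last retry rejects the whole run; whether the package behaves the same way is not stated in this README.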
import { PostalCodeScraper } from "postal-code-scraper";
const customScraper = new PostalCodeScraper({
  concurrency: 10, // Limit concurrent requests
  maxRetries: 3, // Max retries per request
  headless: false, // Run Puppeteer in visible mode
  usePrettyName: true, // Store data using country pretty names
  logger: console, // Enable console logging (set to null to disable)
  directory: "src/data", // Output directory
});
await customScraper.scrapeCountry("Germany");
📁 Output Data Format
🔹 romania-postal-codes.json
{
  "cluj": {
    "agarbiciu": [
      "407146"
    ],
    "aghiresu": [
      "407005"
    ],
    "cluj-napoca": [
      "400001",
      "400002",
      "400003",
      "..."
    ]
  }
}
🔹 romania-lookup.json
{
  "postalCodeMap": {
    "337563": "tamasesti_2",
    "337564": "valea_4",
    "400001": "cluj-napoca_1",
    "400002": "cluj-napoca_1",
    "400003": "cluj-napoca_1"
  },
  "regions": {
    "cluj-napoca_1": ["cluj", "cluj-napoca"],
    "tamasesti_2": ["hunedoara", "tamasesti"],
    "valea_4": ["hunedoara", "valea"]
  }
}
❓ FAQs
1. Where are the postal code files stored?
By default, they are saved in:
src/data/
Each country has two JSON files: one with the region-structured postal codes and another with a postal-code lookup.
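As a sketch of how the lookup file might be consumed, the function below resolves a postal code to its region path using the postalCodeMap and regions keys shown in the romania-lookup.json sample above. `resolveRegionPath` is a hypothetical helper, not part of the package, and the inline object stands in for the parsed file.

```javascript
// Hypothetical helper; assumes the lookup shape from the sample above:
// { postalCodeMap: code -> regionKey, regions: regionKey -> array of region names }.
function resolveRegionPath(lookup, postalCode) {
  if (!Object.hasOwn(lookup.postalCodeMap, postalCode)) return null; // unknown code
  const regionKey = lookup.postalCodeMap[postalCode];
  return lookup.regions[regionKey] ?? null;
}

// Inline stand-in for the parsed contents of romania-lookup.json.
const sampleLookup = {
  postalCodeMap: { "400001": "cluj-napoca_1", "337563": "tamasesti_2" },
  regions: {
    "cluj-napoca_1": ["cluj", "cluj-napoca"],
    "tamasesti_2": ["hunedoara", "tamasesti"],
  },
};

console.log(resolveRegionPath(sampleLookup, "400001")); // [ 'cluj', 'cluj-napoca' ]
```

Going through the intermediate region key means each region path is stored once, however many postal codes point at it.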
2. Can I scrape multiple countries at once?
Yes, using scrapeCountries(), which scrapes all countries automatically.
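If only a handful of countries are needed, one option is to call scrapeCountry() in a loop rather than scraping everything. The sketch below assumes only that the scraper exposes the async scrapeCountry(name) method shown earlier and that it rejects when a scrape fails, which this README does not confirm. `scrapeSelected` is a hypothetical helper, and the scraper is passed in as a parameter so the function can be tried with a stub.

```javascript
// Hypothetical helper; not part of postal-code-scraper's API.
// Scrapes the given countries one after another and records failures
// instead of aborting the whole run on the first error.
async function scrapeSelected(scraper, countries) {
  const succeeded = [];
  const failed = [];
  for (const country of countries) {
    try {
      await scraper.scrapeCountry(country);
      succeeded.push(country);
    } catch (err) {
      failed.push({ country, message: String(err?.message ?? err) });
    }
  }
  return { succeeded, failed };
}
```

With a real instance this could be called as `await scrapeSelected(new PostalCodeScraper(), ["Romania", "Germany"])`.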
3. Can I change the output directory?
Yes, by setting the directory option in the configuration.
4. Does this package work with TypeScript?
Yes! The package includes TypeScript types for a better development experience.
5. How can I turn off logging?
By setting the logger option in the configuration to null.
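Between full console logging and null, a custom logger can filter or reformat messages. This README only states that logger accepts an object such as console, so the method names below (debug, info, warn, error) are an assumption borrowed from console and should be checked against the package's own Logger type. `createLevelLogger` is a hypothetical helper.

```javascript
// Hypothetical helper. Assumes the scraper calls console-style methods
// (debug/info/warn/error); this is not confirmed by the package README.
const LEVELS = ["debug", "info", "warn", "error"];

function createLevelLogger(minLevel, sink = console) {
  const threshold = LEVELS.indexOf(minLevel);
  if (threshold === -1) throw new Error(`Unknown level: ${minLevel}`);
  const logger = {};
  LEVELS.forEach((level, rank) => {
    logger[level] = (...args) => {
      // Forward only messages at or above the configured minimum level.
      if (rank >= threshold) sink[level]("[postal-code-scraper]", ...args);
    };
  });
  return logger;
}

// Would be passed as: new PostalCodeScraper({ logger: createLevelLogger("warn") })
```

The sink parameter defaults to console but can be swapped for any object with the same four methods, for example to write to a file.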
🏗 Future Enhancements
- ✅ Support for exporting data as CSV
🤝 Contributing
Contributions are welcome! Feel free to submit a pull request or open an issue.
📜 License
MIT License © 2024
