geo_radius_scraper

v1.0.0

Geo Radius Scraper

Helps when scraping geodata from an API that returns points near a given location.

How does it work

We want to query an API to retrieve all points inside a specific target area.

  1. Start by determining your target area, which can be defined as either a bounding box or a polygon. This is the area you're interested in getting data for.

  2. Select any point at the edge of this target area. You will use this point to make your first request to the API.

  3. The API responds with a set of data entries. Add these entries to your collection of results, for example a Map keyed by entry ID so that duplicates are overwritten.

  4. Now, imagine drawing a circle around the request point you chose. This circle should be the smallest possible that can cover all the data entries you received. So the radius of this circle is the distance from the request point to the furthest data entry.

  5. We can reasonably assume that there are no more data entries within this circle, because the API returns the entries nearest to the request point and has therefore already delivered everything inside it. So we can "subtract" this circle from our target area, excluding it from further requests.

  6. We then repeat the process: select a new point in the remaining target area, make an API request, gather the data, draw a new circle, and subtract that from the target area.

  7. Continue this process until you have covered the entire target area, retrieving all the available data from the API for your region of interest.
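The steps above can be sketched in a few lines. This is not the package's implementation: instead of subtracting real circles from a polygon, the sketch approximates the target bbox by a grid of cell centers and drops every center that falls inside a covered circle. The names `haversineKm` and `scrapeBboxOnGrid` and the `steps` parameter are made up for this illustration.

```javascript
// Great-circle (haversine) distance in kilometers between two [lon, lat] points.
function haversineKm([lon1, lat1], [lon2, lat2]) {
	const toRad = (deg) => (deg * Math.PI) / 180;
	const dLat = toRad(lat2 - lat1);
	const dLon = toRad(lon2 - lon1);
	const a =
		Math.sin(dLat / 2) ** 2 +
		Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
	return 2 * 6371.0088 * Math.asin(Math.sqrt(a));
}

// Hypothetical helper: covers a bbox by repeatedly querying uncovered grid cells.
// `callback(point)` must return the covered radius in kilometers (steps 3 and 4).
// Resolves with the number of requests that were made.
async function scrapeBboxOnGrid([lonMin, latMin, lonMax, latMax], callback, steps = 20) {
	// Step 1: approximate the target area by the centers of steps x steps cells.
	let remaining = [];
	for (let i = 0; i < steps; i++) {
		for (let j = 0; j < steps; j++) {
			remaining.push([
				lonMin + ((i + 0.5) * (lonMax - lonMin)) / steps,
				latMin + ((j + 0.5) * (latMax - latMin)) / steps,
			]);
		}
	}

	let requests = 0;
	while (remaining.length > 0) {
		// Steps 2 and 6: pick any point that is still uncovered.
		const point = remaining[0];
		const radiusKm = await callback(point);
		if (!(radiusKm >= 0)) throw new Error('callback must return a non-negative radius in km');
		requests++;

		// Step 5: "subtract" the circle by dropping every cell center inside it.
		// The requested cell itself is always dropped, so the loop terminates.
		remaining = remaining.filter(
			(cell) => cell !== point && haversineKm(point, cell) > radiusKm
		);
	}
	return requests;
}
```

The grid only approximates the remaining area, so small gaps between cell centers can go unvisited; a finer grid (larger `steps`) narrows them at the cost of more requests.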

How to use this package?

Install via npm

npm i geo_radius_scraper

Then import the package and run the scraper:

// import this package
import { runScraper } from 'geo_radius_scraper';

// use a Map to collect all results
let results = new Map();

// run the scraper, using a bbox and an async callback
await runScraper([5.9, 47.3, 15.1, 55.0], async (point, progress) => {
	// log progress
	process.stderr.write('\r' + (100 * progress).toFixed(2) + '% - ' + results.size);

	// make a request (myRequest is your own function that queries the API)
	let responses = await myRequest(point);

	// add the entries to your results
	responses.forEach(r => results.set(r.id, r));

	// return the maximum distance. If the response is empty, return 50 km
	// (Math.max() of an empty list is -Infinity, so check the length first)
	return responses.length > 0 ? Math.max(...responses.map(r => r.distance)) : 50;
});
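
The callback's return value needs some care when a response is empty: `Math.max()` called without arguments returns `-Infinity`, which is truthy, so an `||` fallback alone would not apply. A small helper can make the fallback explicit. The name `coveredRadiusKm` and the `distance` field on each response are assumptions taken from the example above, not part of the package.

```javascript
// Hypothetical helper: largest finite `distance` among the responses,
// or a fallback radius (in km) when there is none.
function coveredRadiusKm(responses, fallbackKm = 50) {
	const distances = responses.map((r) => r.distance).filter(Number.isFinite);
	return distances.length > 0 ? Math.max(...distances) : fallbackKm;
}
```

Inside the callback this becomes `return coveredRadiusKm(responses);`.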

References

runScraper

async function runScraper(feature, callback)
  • argument feature can be a GeoJSON feature or a bbox [lonMin, latMin, lonMax, latMax]
  • argument callback is an async callback of the form (point, progress) => distance
    • argument point: a two-number array [lon, lat]
    • argument progress: the progress as a number between 0 and 1
    • returns distance: the maximum distance of the API responses in kilometers. This is the radius of the circle around point that this request has covered.
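
To illustrate the two input forms, the sketch below builds a GeoJSON Polygon feature that describes the same rectangle as a bbox. The helper name `bboxToFeature` is made up, and it is an assumption that runScraper treats the resulting feature like the bbox; the reference above only says that both forms are accepted.

```javascript
// Hypothetical helper: turns [lonMin, latMin, lonMax, latMax] into a
// GeoJSON Feature with a rectangular Polygon geometry.
function bboxToFeature([lonMin, latMin, lonMax, latMax]) {
	return {
		type: 'Feature',
		properties: {},
		geometry: {
			type: 'Polygon',
			// One closed exterior ring, counter-clockwise; first and last position are equal.
			coordinates: [[
				[lonMin, latMin],
				[lonMax, latMin],
				[lonMax, latMax],
				[lonMin, latMax],
				[lonMin, latMin],
			]],
		},
	};
}

// The bbox from the usage example, expressed as a feature.
const exampleFeature = bboxToFeature([5.9, 47.3, 15.1, 55.0]);
```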

calcDistance

function calcDistance(point1, point2);

Calculates the distance between point1 and point2 in kilometers. This function is a re-export of turf.distance.
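
A distance function like this is useful when the API returns coordinates but no distances. The sketch below computes the covered radius from the entries themselves. The helper name `furthestEntryKm` and the `lon`/`lat` fields on each entry are assumptions for illustration; the distance function is passed in, so the package's calcDistance could be used for it.

```javascript
// Hypothetical helper: distance in km from the request point to the furthest entry.
// `distanceFn(a, b)` takes two [lon, lat] points and must return kilometers.
// Returns 0 when there are no entries.
function furthestEntryKm(point, entries, distanceFn) {
	let furthest = 0;
	for (const entry of entries) {
		furthest = Math.max(furthest, distanceFn(point, [entry.lon, entry.lat]));
	}
	return furthest;
}
```

In a callback this could read `return furthestEntryKm(point, responses, calcDistance) || 50;`, where the fallback applies because an empty response yields 0.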

createCachedClient

function createCachedClient(path)

Returns request functions that cache all requests as local files. This helps during development, so you don't have to make identical requests twice.

  • argument path: path to the directory where all requests/responses are cached as txt files
  • returns an object with request functions; currently only get:

const { get } = createCachedClient(path)

Use get like this:

let text = await get(url);
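
For readers curious how such a file cache can work, here is a minimal sketch under stated assumptions. It is not the package's code: the name `createFileCachedGet`, the SHA-256 file naming and the injectable `fetchText` parameter are all made up for this illustration, and the default fetcher relies on the global fetch available in Node.js 18 and later.

```javascript
const fs = require('node:fs/promises');
const path = require('node:path');
const crypto = require('node:crypto');

// Hypothetical re-implementation: caches GET response bodies as .txt files
// named after a hash of the URL.
function createFileCachedGet(cacheDir, fetchText = defaultFetchText) {
	return async function get(url) {
		const name = crypto.createHash('sha256').update(url).digest('hex') + '.txt';
		const file = path.join(cacheDir, name);

		try {
			// Cache hit: return the stored response body.
			return await fs.readFile(file, 'utf8');
		} catch (err) {
			// Anything other than "file not found" is a real error.
			if (err.code !== 'ENOENT') throw err;
		}

		// Cache miss: fetch the body, store it, then return it.
		const text = await fetchText(url);
		await fs.mkdir(cacheDir, { recursive: true });
		await fs.writeFile(file, text, 'utf8');
		return text;
	};
}

// Default fetcher; rejects on non-2xx responses so that errors are not cached.
async function defaultFetchText(url) {
	const response = await fetch(url);
	if (!response.ok) throw new Error(`GET ${url} failed with status ${response.status}`);
	return response.text();
}
```

Hashing the URL keeps file names short and safe for any query string, at the cost of cache files that are not human-readable by name.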