mav-stations
v0.7.0
A list of MAV stations.
A collection of all stations of Magyar Államvasutak (MÁV, Hungarian State Railways), requested from an endpoint used by their website, plus thousands more discovered by crawling the MÁV timetable API.
Coverage
36,400+ rail stations across 27 countries. The interactive station map shows all geocoded stations, color-coded by source (official MÁV list vs. discovered via timetable crawl or brute-force code scanning).
Note: Station discovery is an ongoing process. The dataset grows with each crawl and may not yet cover all stations reachable through the MÁV timetable, particularly in countries far from Hungary.
Installing
npm install mav-stations
Note: This Git repo does not contain the actual station data from the MÁV endpoint, but the npm package does. To retrieve station data and merge it with discovered stations, run:
npm run build
Usage
readStations() returns an async iterable of Friendly Public Transport Format station objects.
import { readStations } from 'mav-stations';
for await (const station of readStations()) {
  console.log(station);
}
{
  type: 'station',
  id: '005510009', // station ID, used throughout booking system
  name: 'BUDAPEST*',
  aliasNames: ['Bp (BUDAPEST*)'], // if several names for the same station exist, otherwise empty list
  baseCode: '3638', // internal MÁV identifier, only set on Hungarian stations
  isInternational: false, // true if international trains available (?)
  canUseForOfferRequest: true,
  canUseForPassengerInformation: false,
  country: 'Hungary',
  countryIso: 'HU',
  isIn108_1: true, // only true for select Hungarian stations; likely refers to UIC leaflet 108.1 (international tariff stations)
  transportMode: {code: 100, name: 'Rail', description: 'Rail. Used for intercity or long-distance travel.'},
  location: {type: 'location', latitude: 47.499444, longitude: 19.025} // or null if not geocoded
  // ⚠️ location is unofficial: sourced from Wikidata/OSM, not from MÁV. Not guaranteed to be accurate.
}
// and a lot more…
Tools
Station Discovery (crawler/)
BFS (breadth-first search) crawler that follows trains from seed stations through the MÁV timetable API, discovering stations not already in the dataset. Compares against data.ndjson (which includes both official and previously discovered stations), so each crawl only finds genuinely new stations. Run npm run build before crawling to ensure the comparison base is up to date.
npm run discover -- [options]
| Option | Default | Description |
| ---------------------- | ---------------------------------- | --------------------------------------------------------------------------- |
| --seed <codes> | 008016321,008069685,… | Comma-separated station codes to start from |
| --max-depth <n> | 3 | BFS depth limit (use infinite to keep going while new stations are found) |
| --max-trains <n> | 5 | Max trains to follow per station |
| --delay <ms> | 250 | Delay between API calls in ms |
| --date <iso> | current timestamp | Travel date for timetable queries |
| --output <path> | crawler/discovered-stations.json | Output file path |
| --seen-trains <path> | crawler/seen-trains.json | Seen trains file (for incremental crawling) |
| --help | | Show help message |
# Discover stations reachable from German hubs, depth 4, 15 trains per station
npm run discover -- --seed 008013240,008014350 --max-depth 4 --max-trains 15 --delay 500
# Keep crawling until no new stations are found
npm run discover -- --seed 008503000,005454300,008101003 --max-depth infinite --max-trains 10
Output: crawler/discovered-stations.json — merged into the main dataset during npm run build.
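The crawl described above can be pictured with a minimal sketch. This is not the code in crawler/; the names discover, getTrains and sampleNetwork are hypothetical, the timetable API is replaced by an in-memory object, and the exact meaning of the depth limit is an assumption. It only illustrates how a depth limit, a per-station train cap and the set of already known stations interact in a breadth-first search.

```javascript
// Hypothetical stand-in for the timetable API: station code -> departing trains,
// each train given as the ordered list of station codes it calls at.
const sampleNetwork = {
  A: [['A', 'B', 'C'], ['A', 'D']],
  B: [['B', 'E']],
  D: [['D', 'F']],
  E: [['E', 'G']],
};
const getTrains = (code) => sampleNetwork[code] || [];

// Breadth-first search from the seed stations.
// `known` plays the role of the stations already present in data.ndjson.
function discover(seeds, known, { maxDepth = 3, maxTrains = 5 } = {}) {
  const visited = new Set(seeds);
  const found = [];
  let frontier = [...seeds];
  // Assumption: one depth level = following trains from every station in the frontier once.
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next = [];
    for (const code of frontier) {
      // Follow at most `maxTrains` trains per station.
      for (const stops of getTrains(code).slice(0, maxTrains)) {
        for (const stop of stops) {
          if (visited.has(stop)) continue;
          visited.add(stop);
          next.push(stop);
          // Only stations missing from the comparison base count as new.
          if (!known.has(stop)) found.push(stop);
        }
      }
    }
    frontier = next;
  }
  return found;
}

console.log(discover(['A'], new Set(['A', 'B']), { maxDepth: 2 }));
```

Passing maxDepth: Infinity in this sketch keeps the loop running until a level yields no unvisited stations, which mirrors the behaviour documented for --max-depth infinite.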
Brute-Force Code Scanner (crawler/)
Tests UIC station codes against the MÁV StationInfo endpoint to find stations not discoverable via BFS crawling (e.g. stations with seasonal or no current service). Uses the Trainline stations dataset as a primary candidate source for efficiency.
# Test Trainline candidates for a country (UIC 80 = Germany)
npm run brute-force -- --uic 80 --trainline --start 1 --end 0
# Trainline candidates + range scan (NL has a narrow range)
npm run brute-force -- --uic 84 --trainline --start 1 --end 800
# Merge hits into discovered-stations.json
node crawler/merge-brute-force.js
Discovered stations are tagged with source: "crawled" or source: "brute-force" to distinguish confirmed active stations from those that merely exist in the system.
Geocoding (geocode/)
Looks up geographic coordinates (latitude/longitude) for stations via Wikidata SPARQL queries — first by UIC station code (P722), then via the Trainline stations dataset (UIC match), then by station name + country as fallback, with multi-language label matching. Stations not found in Wikidata are resolved via the Overpass API (OpenStreetMap), querying all railway nodes per country and matching by name. Generates an interactive map using Leaflet.
# Full geocode (only queries stations not already in cache)
npm run geocode
# Regenerate map from cache (no API calls)
npm run geocode:map
Manual overrides for rejected matches go in geocode/geocode-overrides.json (format: {"stationCode": {"lat": ..., "lon": ...}}).
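The lookup order described in this section amounts to a fallback chain: try one source after another and keep the first hit. The sketch below illustrates that idea only and is not the code in geocode/. The function names, the sample resolvers and the made-up codes and coordinates are hypothetical. It also assumes that manual overrides take precedence over lookups and that the override key equals the station id.

```javascript
// Convert an override-style {lat, lon} pair into the location shape used by station objects.
function toLocation({ lat, lon }) {
  return { type: 'location', latitude: lat, longitude: lon };
}

// Try each resolver in order and return the first hit, or null if none matches.
function geocode(station, overrides, resolvers) {
  // Meta-stations (names ending in *) are virtual groupings and are not geocoded.
  if (station.name.endsWith('*')) return null;
  // Assumption: a manual override wins over any automatic lookup.
  if (overrides[station.id]) return toLocation(overrides[station.id]);
  for (const resolve of resolvers) {
    const hit = resolve(station);
    if (hit) return toLocation(hit);
  }
  return null;
}

// Hypothetical in-memory resolvers standing in for the code-based and name-based lookups.
const byCode = (s) => ({ '000000001': { lat: 47.5, lon: 19.1 } }[s.id] || null);
const byName = (s) => ({ 'Example Halt': { lat: 46.2, lon: 20.1 } }[s.name] || null);

// Made-up override in the documented {"stationCode": {"lat": ..., "lon": ...}} format.
const overrides = { '000000003': { lat: 45.0, lon: 21.0 } };

console.log(geocode({ id: '000000002', name: 'Example Halt' }, overrides, [byCode, byName]));
```

Ordering the resolvers from most to least specific means a name match is only consulted when no code-based match exists.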
Notes on Data Quality
- canUseForOfferRequest is unreliable for discovered stations. The flag is set by the timetable system, not the pricing engine. Entire countries (e.g. France, Moldova) appear in the timetable via cross-border services but are outside MÁV's booking scope: no tickets can be purchased to or from those stations at the time of writing.
- Meta-stations (names ending in *, e.g. BUDAPEST*, WIEN*) are virtual groupings of nearby stations, not physical locations. They are excluded from geocoding.
- Ghost stations: 88 stations (mostly small Serbian halts) are returned by the timetable API with id: 0 and no metadata. Their names and country codes have been resolved via the StationInfo endpoint and UIC prefix. They are real operational stops where trains call, but MÁV doesn't have them in its official station list.
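In practice, these notes suggest filtering the stream before plotting or measuring distances. The following is a minimal sketch under stated assumptions: sampleStations is a hypothetical stand-in for readStations() with made-up records (only the BUDAPEST* entry mirrors the example above), and geocodedPhysicalStations is an illustrative helper, not part of the package.

```javascript
// Meta-stations are recognisable by a trailing asterisk in their name.
const isMetaStation = (station) => station.name.endsWith('*');

// Yield only stations that are physical places and have coordinates.
async function* geocodedPhysicalStations(stations) {
  for await (const station of stations) {
    if (isMetaStation(station)) continue;
    if (station.location == null) continue; // not geocoded
    yield station;
  }
}

// Hypothetical stand-in for readStations(); swap in the real iterable when using the package.
async function* sampleStations() {
  yield { type: 'station', id: '005510009', name: 'BUDAPEST*', location: null };
  yield { type: 'station', id: '000000001', name: 'Example Halt', location: null };
  yield {
    type: 'station',
    id: '000000002',
    name: 'Example Central',
    location: { type: 'location', latitude: 47.5, longitude: 19.0 },
  };
}

// Collect the names of the stations that survive the filter.
async function collectNames() {
  const names = [];
  for await (const station of geocodedPhysicalStations(sampleStations())) {
    names.push(station.name);
  }
  return names;
}

collectNames().then((names) => console.log(names));
```

The sketch deliberately does not filter on canUseForOfferRequest, since that flag is described above as unreliable for discovered stations.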
Related
- mav-prices – Query MÁV connection prices.
- db-stations – A list of DB stations (data from DB station API).
- db-stations-autocomplete – Search for stations of DB (data from DB station API).
- db-hafas-stations – A list of DB stations, taken from HAFAS.
- db-hafas-stations-autocomplete – Search for stations of DB (data from HAFAS).
Contributing
If you have a question, found a bug or want to propose a feature, have a look at the issues page.
