Parsing flights information

Apparently, there’s no open API for getting flight information.
But since Google Flights can show a calendar with prices
for each date across two months, I decided to use it.

So I generated every possible pair of interesting destinations in South America, plus
flights to and from Amsterdam, and simulated user interaction by changing the destination inputs and
opening and closing the calendar. At the end, I wrote the results as JSON to a new tab. The full code
isn’t that interesting and is available in the gist. At a high level it looks like:

    const getFlightsData = async ([from, to]) => {
      await setDestination(FROM, from);
      await setDestination(TO, to);

      const prices = await getPrices();

      return prices.map(([date, price]) => ({
        date, price, from, to,
      }));
    };

    const collectData = async () => {
      let result = [];
      for (let flight of getAllPossibleFlights()) {
        const flightsData = await getFlightsData(flight);
        result = result.concat(flightsData);
      }
      return result;
    };

    const win = window.open('');

    collectData().then(
      (data) => win.document.write(JSON.stringify(data)),
      (error) => console.error("Can't get flights", error),
    );
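The list of routes to query is the only part of that outline that is pure logic, so it is easy to sketch separately. The following is a minimal Python illustration of what a helper like `getAllPossibleFlights` might produce, not the code from the gist. The destination list is a made-up subset, and the function name `get_all_possible_flights` is an assumption mirroring the JavaScript name.

```python
from itertools import permutations

HOME = "Amsterdam"
# Illustrative subset only; the real list of destinations is not given in the post.
DESTINATIONS = ("Rio de Janeiro", "Montevideo", "Buenos Aires")


def get_all_possible_flights(home=HOME, destinations=DESTINATIONS):
    """Yield every directed (from, to) pair that is worth querying."""
    # Every ordered pair of distinct destinations: A -> B and B -> A.
    yield from permutations(destinations, 2)
    # Flights out of the home city and back to it.
    for city in destinations:
        yield (home, city)
        yield (city, home)
```

With n destinations this gives n * (n - 1) pairs between destinations plus 2 * n pairs involving the home city, so the number of calendar lookups grows quadratically with the list.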

In action:

I ran it twice to get separate data for flights with and without stops, and saved
the results to JSON files with content like:

    [
      {"date": "2018-07-05", "price": 476, "from": "Rio de Janeiro", "to": "Montevideo"},
      {"date": "2018-07-06", "price": 470, "from": "Rio de Janeiro", "to": "Montevideo"},
      {"date": "2018-07-07", "price": 476, "from": "Rio de Janeiro", "to": "Montevideo"},
      ...
    ]

Although it mostly works, in some rare cases Google Flights seems to have some sort of
anti-scraping protection and shows “random” prices.
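Before trips can be searched, records like the ones above have to be indexed so that a flight can be found by origin, day, and destination. The sketch below shows one way that might be done in Python. The name `from_id2day_number2to_id2flight` and the `to_id` and `price` fields come from the trip generator in this post; the other `Flight` fields, the way city ids are assigned, and counting days from a `first_day` argument are all assumptions made for illustration.

```python
from datetime import date
from typing import NamedTuple


class Flight(NamedTuple):
    # to_id and price are used by the trip generator; the rest is assumed.
    from_id: int
    to_id: int
    day_number: int
    price: int


def build_index(records, first_day):
    """Return (city_ids, from_id2day_number2to_id2flight) for scraped records."""
    city_ids = {}
    from_id2day_number2to_id2flight = {}
    for record in records:
        # Assign small integer ids to cities in order of first appearance.
        from_id = city_ids.setdefault(record["from"], len(city_ids))
        to_id = city_ids.setdefault(record["to"], len(city_ids))
        # Day 0 is first_day; later dates count up from there.
        day_number = (date.fromisoformat(record["date"]) - first_day).days
        by_day = from_id2day_number2to_id2flight.setdefault(from_id, {})
        by_day.setdefault(day_number, {})[to_id] = Flight(
            from_id, to_id, day_number, record["price"])
    return city_ids, from_id2day_number2to_id2flight
```

Plain nested dicts keep the chained `.get(..., {})` lookups safe for missing origins or days, and integer ids and day numbers make the keys cheap to compare and copy.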

    def _generate_trips(can_visit, can_travel, can_spent, current_id,
                        current_day, trip_flights):
        # The last flight is to home city, the end of the trip
        if trip_flights[-1].to_id == home_city_id:
            yield Trip(
                price=sum(flight.price for flight in trip_flights),
                flights=trip_flights)
            return

        # Everything visited or no vacation days left or no money left
        if not can_visit or can_travel < MIN_STAY or can_spent == 0:
            return

        # The minimal amount of cities visited, can start "thinking" about going home
        if len(trip_flights) >= MIN_VISITED and home_city_id not in can_visit:
            can_visit.add(home_city_id)

        for to_id in can_visit:
            can_visit_next = can_visit.difference({to_id})
            for stay in range(MIN_STAY, min(MAX_STAY, can_travel) + 1):
                current_day_next = current_day + stay
                flight_next = from_id2day_number2to_id2flight \
                    .get(current_id, {}).get(current_day_next, {}).get(to_id)
                if not flight_next:
                    continue
                can_spent_next = can_spent - flight_next.price
                if can_spent_next < 0:
                    continue
                yield from _generate_trips(
                    can_visit_next, can_travel - stay, can_spent_next, to_id,
                    current_day + stay, trip_flights + [flight_next])

As the algorithm is easy to parallelize, I made it possible to run it with multiprocessing.Pool.imap_unordered,
pre-sorting each worker’s results so that they can later be combined with a merge sort.
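The pattern can be sketched roughly as follows. This is an illustration under assumptions, not the code from the gist: each task is taken to be a list of `(price, route)` tuples standing in for the trips one worker would generate, and the function names `search_and_sort`, `merge_sorted_chunks`, and `find_trips_parallel` are hypothetical. Each worker sorts its own chunk, and `heapq.merge` then performs the merge step of a merge sort over the pre-sorted chunks.

```python
import heapq
from multiprocessing import Pool


def search_and_sort(trips):
    """Worker: sort one chunk of (price, route) tuples, cheapest first.

    Hypothetical stand-in for the real search, which would generate the
    trips for its chunk before sorting them.
    """
    return sorted(trips)


def merge_sorted_chunks(sorted_chunks):
    """Lazily merge already-sorted chunks into one sorted stream."""
    return heapq.merge(*sorted_chunks)


def find_trips_parallel(tasks, processes=None):
    """Run the workers in a process pool and merge their sorted results."""
    with Pool(processes) as pool:
        # imap_unordered yields each chunk as soon as its worker finishes,
        # in completion order; the merge below restores the price order.
        sorted_chunks = list(pool.imap_unordered(search_and_sort, tasks))
    return list(merge_sorted_chunks(sorted_chunks))
```

When run as a script, `find_trips_parallel` should be called under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Sorting inside the workers moves most of the comparison work into the parallel part, leaving only a linear merge for the parent process.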

Without optimizations, it took more than an hour and consumed almost all of the RAM
(apparently typing.NamedTuple isn’t memory efficient with multiprocessing at all),
but the current implementation takes 1 minute 22 seconds on my laptop.

As the last step, I saved the results to CSV (the code isn’t interesting and is available in the gist), like: