Category: bigdata

Unlike the beautifully concise and familiar one-liner that dumps a document to a text file:

File.open(local_filename, 'w') {|f| f.write(doc) }

The Ruby CSV library requires quite a bit more typing, and its documentation is easy to misunderstand.

One of my primary needs is data wrangling: changing the contents of a CSV file for use in another framework, whether that means reversing coordinates, stripping unwanted columns, or adding needed columns to the data. I always trip up on how to dump the changed CSV after manipulating it.

As a reminder to myself, and maybe a hint to others, I’ve included the proper way to dump out your arr_of_arrs once you’ve manipulated it as you will.
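A minimal sketch of that dump, assuming arr_of_arrs is an array of rows (each row an array of fields) and using the standard library's CSV.open. The dump_csv helper name and the sample data are made up for illustration:

```ruby
require 'csv'

# Hypothetical sample data: a header row followed by data rows.
arr_of_arrs = [
  %w[name lat lon],
  %w[Berlin 52.52 13.40],
  ['Los Angeles', '33.94', '-118.41']
]

# Hypothetical helper: writes each inner array as one CSV row.
# CSV.open yields a CSV writer; << appends a row and takes care of quoting.
def dump_csv(filename, rows)
  CSV.open(filename, 'w') do |csv|
    rows.each { |row| csv << row }
  end
end
```

Calling dump_csv(local_filename, arr_of_arrs) writes the file, and CSV.read(local_filename) should hand the same rows back as arrays of strings. It is more typing than the File.open one-liner, but the library deals with commas and quotes inside fields for you.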

Caveats

Travel Data

As Co-Founder and former CTO/Designer of Everbread and its Haystack Flight Shopping Engine, I’m mildly qualified to speak and pontificate about Air Travel Technology. There is a lot more that I don’t know than I do, but I guess I know enough to understand some things.

Usability

On Usability, UI, UX and other forms of the field’s name, I’m a novice. I know what I like, and I can observe what tends to work well and intuitively, but I’m no Jason Putorti.

Travel Business

Anything I have to say on this topic is extremely likely to be wrong, incorrect, fallacious, idiotic, and misguided. But I won’t let that stop me.

The first ingredient in producing scheduled flights with a ticketable fare is pure computational magic. I’m not giving away any secrets in this section. Most of what I can say is well known, at least “inside” the Travel Tech sector. ITA’s primary Data Service product, QPX, is built in LISP and is pretty fast at computing the set of possible fares that can apply to the possible flight itineraries, and then validating the complex dynamic rule set needed to allow a fare to be shopped. (No point in showing a fare that can’t be ticketed, now is there?)

For those of you who aren’t familiar with just how difficult this is to do correctly and completely, here’s the simple version: there’s an often-cited slideshow produced by Carl de Marcken from ITA Software. I’ve added a second source in case it ever gets pulled from the ITA site.

Anyway, the 2nd Ingredient needed to display ticketable flights with fares is Seat Availability Data. QPX is completely dependent upon ITA’s DACS system, which requires airlines to participate in a sensitive data sharing relationship with ITA. Only a few of the world’s airlines do, and most of these are US Domestic carriers sharing data only for US Domestic routes (Continental is one of ITA’s chief partners in this area).

As a result, QPX is currently reputed to work best with US Domestic itineraries, and given ITA’s customer base, it will certainly have an up-to-date, fresh cache that is centered on US Domestic flights.

The 3rd Ingredient in this Data Shuffle is the Results Cache. It’s the industry’s dirtiest little secret. All of the GDSs use a results cache to manage load. Why shouldn’t they? If you just asked for fares from LAX-BER 10 seconds ago, the likelihood that the answer has changed in those 10 seconds is very, very low. Unfortunately, most of the fare shopping systems carry much higher loads than they were designed for, and most queries produce little actionable revenue (a lot more people shop for fares than actually book them), so there is a lot of weird black magic in deciding whether to compute an answer or serve an old answer. As for speed, this is where caches beat pure computation, and this is what you’re seeing on the Google Flight Search system. It may be a cache that is being constantly refreshed (if someone is seeing data change on the results page after the query has been completed, please point it out; that would indicate that the page is getting relatively fresh updates from the cache).
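To make the compute-or-serve decision concrete, here is a toy sketch of the simplest possible policy: serve the stored answer while it is younger than a fixed time-to-live, otherwise recompute. Everything in it (the ResultsCache class, the ttl_seconds and clock parameters, the route key) is hypothetical and only illustrates the idea; it is not how any GDS, ITA, or Google actually does it, and real systems weigh far more than age.

```ruby
# Hypothetical results cache keyed by query (e.g. a route string).
class ResultsCache
  Entry = Struct.new(:value, :stored_at)

  # ttl_seconds: how long a stored answer counts as fresh.
  # clock: callable returning the current time in seconds (injectable for testing).
  def initialize(ttl_seconds:, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @ttl_seconds = ttl_seconds
    @clock = clock
    @entries = {}
  end

  # Returns the stored answer for key if it is still fresh. Otherwise runs the
  # block (standing in for the expensive fare computation), stores its result
  # with the current time, and returns it.
  def fetch(key)
    now = @clock.call
    entry = @entries[key]
    return entry.value if entry && (now - entry.stored_at) < @ttl_seconds

    value = yield
    @entries[key] = Entry.new(value, now)
    value
  end
end
```

With something like cache.fetch('LAX-BER') { compute_fares }, where compute_fares is a stand-in for the real shopping call, a repeat query 10 seconds later never touches the computation. That is where the speed comes from, and also why a cached answer can be stale.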

Baking a Cake

There are a lot more factors that I’m not prepared to babble on about, but private fares and point-of-sale issues will also have an impact on the quality of the fares being shown, most of which is driven by the seat availability cache. There are many ways to bake a cake and different spices you can add to make it tastier. However, most people are happy with good cake; it doesn’t have to be heavenly to eat it. After all, it’s still cake! That’s my way of saying that it’s certainly not obvious that Google’s Flight Search product will produce cheaper or more convenient fares/flights than Travelocity, Orbitz, Expedia, or Kayak for that matter, who are using a mix of data providers, not just ITA. Furthermore, the system is not going to perform at its best for international and non-US Domestic flights until ITA addresses that in its core product offering (QPX).

Still, for a first effort, it’s an amazing solution (it really is FAST) that produces a wide range of results and will likely satisfy the airfare shopping needs of a majority of customers, who may not shop beyond the search and click. (More on this later)