The linear model is definitely not a great fit here, as the relationships aren’t particularly linear. Age is roughly linear above 22, where older riders get consistently slower, but below 22 the trend reverses and younger riders are slower.
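To make the age kink concrete, here is a minimal sketch of fitting one straight line below age 22 and another at or above it. This is not the model used in the post: the split-at-22 approach, the function names, and the numbers are all assumptions for illustration, and the speeds are synthetic values made up to mimic the shape described above.

```python
def ols_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_split_at(ages, speeds, knot=22):
    """Fit separate lines for riders younger than `knot` and for everyone else."""
    below = [(a, s) for a, s in zip(ages, speeds) if a < knot]
    above = [(a, s) for a, s in zip(ages, speeds) if a >= knot]
    return ols_line(*zip(*below)), ols_line(*zip(*above))


# Synthetic, noise-free data (NOT from the Citi Bike dataset): speed rises
# with age until 22, then declines slowly. The mph values are invented.
ages = list(range(16, 61))
speeds = [8.0 + 0.1 * (a - 16) if a < 22 else 8.6 - 0.02 * (a - 22) for a in ages]

(_, slope_young), (_, slope_old) = fit_split_at(ages, speeds)
print(f"slope below 22: {slope_young:+.3f} mph per year")
print(f"slope at 22 and above: {slope_old:+.3f} mph per year")
```

The two slopes come out with opposite signs, which is exactly what a single linear age term can't capture.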

Distance traveled really doesn’t follow a linear relationship, especially for longer distances over ~3 miles. It would be inaccurate to extrapolate this model to long trips!

The daily/weekly customers are much slower, I think probably because more of them are tourists who are pedaling around leisurely, not really trying to get from point A to point B quickly. In any case, subscribers account for ~97% of this dataset, so restricting to subscribers doesn’t change much.

This is a recurring assumption throughout the post, so we might as well get it out of the way: yeah, it’s definitely not a true assumption! But it’s convenient, and in most cases it’s probably not too far off.

Awesome work! Out of curiosity, did you run all your Google Maps cycling directions queries (for all station-to-station combos) at once and, if so, do you have any idea if the cycling directions vary by traffic, time of day, etc. like driving directions do?

Why September 16, 2015? It’s the 3rd biggest day in program history. Sep 24/25 2015 were slightly bigger, but those were also the days when Pope Francis was in town, so I figured they might be less indicative of a typical day if there were all sorts of road closures and traffic changes.