Understanding NYC Citi Bike Riders with SAP Analytics Cloud

If you’ve ever been to New York, you’ve surely noticed the blue-branded Citi Bikes around town. Riders can unlock a bike at any of hundreds of stations across New York and, for a fee, return it to another station.

When I’m in New York, I tend to see most of them around Midtown, usually with a young man fighting through rush-hour traffic. But as an analytics geek, I needed data to confirm this impression or dispel it as a myth.

Did You Know that These Bikes Are Data-Collecting IoT Machines?

While it shouldn’t be a surprise, the bike stations are Internet-enabled data collection devices. They capture all sorts of data about each trip: trip time, start and stop date/time, start and stop station, bike ID, and some descriptive information about the user. Furthermore, these bikes have been collecting data since mid-2013. But as with most data collection machines, the data has its fair share of challenges.

The first challenge is that each month is a separate data set, so you need to append the months to one another to form a history. The second challenge is that the column names change periodically, so the columns have to be remapped and relabeled. A third challenge is that there are numerous codes with no descriptive information, so many of these codes need to be mapped to recognizable locations to make sense of them.
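These three preparation steps can be sketched in a few lines of Python. This is only an illustration using the standard library, not the workflow actually used for this post (the analysis here was done in SAP Analytics Cloud). The header variants, canonical column names, station code, and sample rows are all assumptions made up for the example.

```python
import csv
import io

# Assumed header variants mapped to one canonical name (illustrative only).
CANONICAL = {
    "tripduration": "trip_duration",
    "trip duration": "trip_duration",
    "starttime": "start_time",
    "start time": "start_time",
    "start station id": "start_station_id",
    "bikeid": "bike_id",
    "bike id": "bike_id",
}

# Hypothetical code-to-description lookup; a real one would come from a station list.
STATION_NAMES = {"S1": "Pershing Square"}


def normalize_header(name):
    """Map a raw column label to its canonical name."""
    key = name.strip().lower()
    return CANONICAL.get(key, key.replace(" ", "_"))


def append_months(monthly_files):
    """Stack the rows of several monthly CSV files into one list of dicts."""
    history = []
    for handle in monthly_files:
        reader = csv.reader(handle)
        header = [normalize_header(h) for h in next(reader)]
        for row in reader:
            record = dict(zip(header, row))
            # Add a readable station name next to the raw code.
            code = record.get("start_station_id", "")
            record["start_station"] = STATION_NAMES.get(code, code)
            history.append(record)
    return history


# Two tiny made-up months whose columns are labelled differently.
january = io.StringIO(
    "tripduration,starttime,start station id,bikeid\n"
    "600,2014-01-01 08:00:00,S1,100\n"
)
february = io.StringIO(
    "Trip Duration,Start Time,Start Station ID,Bike ID\n"
    "900,2014-02-01 09:00:00,S1,101\n"
)
trips = append_months([january, february])
print(len(trips), trips[1]["trip_duration"], trips[1]["start_station"])
```

Normalizing the header before reading the rows means every month ends up with the same keys, so the appended history can be treated as one table.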

The final challenge was data quality. Subscribers can enter “fake” information about themselves, such as their gender and age. Anyway, it took a few hours, but I finally had a usable data set for my analytics.
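One way to handle self-reported fields like these is to blank out values that are clearly implausible rather than drop the whole trip. The Python sketch below is an assumption about how that could look, not the cleaning actually applied here: the field names, the 16 to 90 age window, and the gender coding (1 = male, 2 = female, anything else unknown) are illustrative choices.

```python
# Assumed gender coding; any other value is treated as unknown.
GENDER_LABELS = {"1": "male", "2": "female"}


def clean_rider(row, ride_year, min_age=16, max_age=90):
    """Return a copy of row with 'age' and 'gender' set to None when implausible."""
    cleaned = dict(row)
    try:
        # Birth years sometimes arrive as text such as "1988.0" or as blanks.
        age = ride_year - int(float(row.get("birth_year")))
    except (TypeError, ValueError):
        age = None
    if age is not None and not (min_age <= age <= max_age):
        age = None
    cleaned["age"] = age
    cleaned["gender"] = GENDER_LABELS.get(row.get("gender"))
    return cleaned


# Made-up riders: one plausible, one with a "fake" birth year, one left blank.
plausible = clean_rider({"birth_year": "1988.0", "gender": "2"}, ride_year=2018)
fake = clean_rider({"birth_year": "1885", "gender": "1"}, ride_year=2018)
blank = clean_rider({"birth_year": "", "gender": "0"}, ride_year=2018)
print(plausible["age"], fake["age"], blank["gender"])
```

Keeping the trip but nulling the suspect attribute preserves ride counts and durations while keeping bad ages out of the demographic views.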

Let’s Take a Look at the Data

The data set goes from January 2014 to March 2018. Overall, we can see that New Yorkers have become quite fond of these bikes and their popularity has grown each year.

Do People Ride in the Winter?

New Yorkers are quite tough and battle through the elements on their bikes. While fewer people bike in the winter, many brave folks soldier through.

What Stations Get Used the Most?

If we look at the map, we can see that the most popular pick-up and drop-off locations are in Midtown and Lower Manhattan. And if we zoom into the map further, we can see that the largest bubbles are around Grand Central Station.

Where Are They Going from These Stations?

If we filter on the top station (Pershing Square), we can see that riders typically don’t bike very far (10-15 blocks away). This makes sense, since the average ride duration is around 15 minutes.
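The same "find the busiest station, then filter on it" step can be expressed outside a dashboard as two frequency counts. The Python sketch below is only an illustration under assumed field names; apart from Pershing Square, the station names and trips are made up.

```python
from collections import Counter


def top_station_and_destinations(trips, n=3):
    """Return the busiest start station and its n most common end stations."""
    starts = Counter(trip["start_station"] for trip in trips)
    top_station, _ = starts.most_common(1)[0]
    destinations = Counter(
        trip["end_station"] for trip in trips if trip["start_station"] == top_station
    )
    return top_station, destinations.most_common(n)


# Made-up trips; "Station A" and "Station B" are placeholder names.
sample_trips = [
    {"start_station": "Pershing Square", "end_station": "Station A"},
    {"start_station": "Pershing Square", "end_station": "Station A"},
    {"start_station": "Pershing Square", "end_station": "Station B"},
    {"start_station": "Station A", "end_station": "Pershing Square"},
]
top_station, top_destinations = top_station_and_destinations(sample_trips)
print(top_station, top_destinations)
```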

When Do People Ride?

Wednesday is the most common day to ride, but people ride longer on the weekends.
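Both halves of that statement come from one grouping by day of week: the ride count answers "most common day" and the average duration answers "longest rides". A minimal Python sketch of that grouping follows, assuming a `start_time` text field and a `trip_duration` in seconds; the three sample trips are invented.

```python
from collections import defaultdict
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def weekday_summary(trips):
    """Return {weekday: (ride count, average ride length in minutes)}."""
    totals = defaultdict(lambda: [0, 0.0])  # weekday -> [rides, total seconds]
    for trip in trips:
        started = datetime.strptime(trip["start_time"], "%Y-%m-%d %H:%M:%S")
        bucket = totals[DAYS[started.weekday()]]
        bucket[0] += 1
        bucket[1] += float(trip["trip_duration"])
    return {
        day: (count, seconds / count / 60)
        for day, (count, seconds) in totals.items()
    }


# Made-up trips: two short weekday rides and one longer weekend ride.
sample_trips = [
    {"start_time": "2018-03-07 08:15:00", "trip_duration": "600"},
    {"start_time": "2018-03-07 17:40:00", "trip_duration": "900"},
    {"start_time": "2018-03-10 13:05:00", "trip_duration": "1800"},
]
summary = weekday_summary(sample_trips)
print(summary)
```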

What Types of People Ride?

While there is very limited demographic data on the riders, the data does capture age and gender. The majority of the riders are millennial-generation men.

When Do These Different Generations Ride and Why?

Here’s where things get a bit more interesting. In the visual on the left, we can see that millennials use these bikes the most, but primarily during the week. In the visual on the right, we can see that these millennials are riding during afternoon and evening hours (likely on the way back home from work). However, if we pivot this and look at the average ride time (instead of the number of rides), we can see a different trend. In the visual on the left, we can see that while traditionalists (seniors) and baby boomers ride much less, they tend to take much longer rides, and these rides are primarily in the afternoon and at night.
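The pivot described here, from number of rides to average ride time per generation, can be sketched in Python as follows. The birth-year cut-offs are an assumption (this post does not state which ones were used), as are the field names and the sample riders.

```python
from collections import defaultdict

# Assumed cut-offs: (last birth year in the group, label).
GENERATIONS = [
    (1945, "Traditionalist"),
    (1964, "Baby Boomer"),
    (1980, "Generation X"),
    (1996, "Millennial"),
]


def generation(birth_year):
    """Assign a birth year to a generation using the assumed cut-offs."""
    for last_year, label in GENERATIONS:
        if birth_year <= last_year:
            return label
    return "Generation Z"


def rides_and_avg_minutes(trips):
    """Return {generation: (ride count, average ride length in minutes)}."""
    totals = defaultdict(lambda: [0, 0.0])  # generation -> [rides, total seconds]
    for trip in trips:
        bucket = totals[generation(trip["birth_year"])]
        bucket[0] += 1
        bucket[1] += trip["trip_duration"]
    return {
        label: (count, seconds / count / 60)
        for label, (count, seconds) in totals.items()
    }


# Made-up riders: two millennials on short rides, one baby boomer on a long one.
sample = [
    {"birth_year": 1988, "trip_duration": 600},
    {"birth_year": 1990, "trip_duration": 840},
    {"birth_year": 1950, "trip_duration": 1800},
]
by_generation = rides_and_avg_minutes(sample)
print(by_generation)
```

Reading the first element of each pair gives the "who rides most" view, and reading the second gives the "who rides longest" view, which is why the two charts can tell different stories from the same rows.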

What’s the Profile of These Bike Riders?

Based on the answers above, it’s very easy to paint a picture of who these riders are. The typical Citi Bike riders are:

Young men who live and work around the city

Short-haul riders who travel between the train station and work

Bicyclists who prefer to ride home, when they have more time, maybe more energy, or want to burn off a few calories

What Does This All Mean?

The power of analytics is that it gives us the whole story behind the data and it can help us to validate our thought process or to gain new insights into our business. We can see in the data that:

How people use your product can tell you more than who uses it. Citi Bike riders provide very little information about themselves, but their usage tells you a lot. The fact that the number of riders has grown dramatically and that people are using these bikes to ride to and from work tells us that they value the cost and convenience of these bikes.

Citi Bike is targeting the right customers. Their tagline, “Faster than walking, cheaper than a taxi, and more fun than the subway,” paired with pictures of young adults, targets the exact demographic that’s using their bikes. The only exception is that it seems to be mostly young men who are riding.

People need analytics. Not data. While New York Citi Bike does an excellent job providing timely and updated “raw data,” they provide very little analytics and insight into this data, like what we’re seeing above. This type of analytics lets us see the whole story in the data.

What Do You Think?

Have a question, want a demo, or just want to know more? Get in touch or connect with me on LinkedIn and I’d be happy to share more details.
