Aleem's Blog

Tuesday, April 5, 2011

In this post I'll be analyzing Facebook Places and Foursquare data to better understand their relative competitive positions by looking at the number of venues and checkins they have. While it may seem obvious that Foursquare is clearly winning in this space, I wanted to quantify the extent to which this is true and also see if there are any situations (either by geography or by category of business) where Facebook may have a chance.

So, where's the data from?

To collect the data for this analysis, I used the Facebook Graph API and the Foursquare API. The first step was to collect a list of venues in Facebook's database and another list of venues in Foursquare's database for a particular geographic region. Since there is no easy way to query the APIs for a list of all the places/venues and their checkin counts, I had to split up the geographic region into a grid and get the venues for each of these smaller regions. I then combined and normalized the data to get a list of venues for each service.
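To make the grid idea concrete, here is a minimal sketch in Python. It is not the code used for this analysis: the bounding-box layout, the function names and the "id" field are all assumptions for illustration, and the actual API calls are left out.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# (south, west, north, east) in decimal degrees. This layout is an assumption.
BBox = Tuple[float, float, float, float]


def grid_cells(bbox: BBox, rows: int, cols: int) -> Iterator[BBox]:
    """Split a bounding box into rows x cols smaller boxes."""
    south, west, north, east = bbox
    lat_step = (north - south) / rows
    lon_step = (east - west) / cols
    for r in range(rows):
        for c in range(cols):
            yield (
                south + r * lat_step,
                west + c * lon_step,
                south + (r + 1) * lat_step,
                west + (c + 1) * lon_step,
            )


def cell_center(cell: BBox) -> Tuple[float, float]:
    """Center point of a cell, usable as the lat/lon of a nearby-venue search."""
    south, west, north, east = cell
    return ((south + north) / 2, (west + east) / 2)


def merge_venues(batches: Iterable[List[dict]]) -> Dict[str, dict]:
    """Combine per-cell results, dropping venues returned by more than one cell.

    Assumes each venue is a dict with a unique "id" key (hypothetical field name).
    """
    merged: Dict[str, dict] = {}
    for batch in batches:
        for venue in batch:
            merged.setdefault(venue["id"], venue)
    return merged
```

A finer grid means more requests but less chance that a single cell returns more venues than the API is willing to hand back in one response.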

Next, to compare Facebook with Foursquare data I had to match places in the Facebook data set with venues in the Foursquare one. Unfortunately, the data sets aren't exactly clean and there can be differences between how a place is listed in Facebook and Foursquare. To get around this limitation, I wrote an algorithm that fuzzy matches places in both data sets by using a combination of name, address and geo location. The algorithm can match places even if they aren't listed in exactly the same way. By design, it is overly conservative in its matching, so that only pairs that really are the same place end up in the "matched" data set. Once the matching was complete, I used each service's API to get the checkin counts for each matched venue. The aggregate statistics are presented below.
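For a rough idea of what a conservative name + address + location match can look like, here is a small Python sketch. It is an illustration only, not the actual algorithm from this analysis: the field names ("name", "address", "lat", "lon") and all three thresholds are assumptions, and the string similarity comes from the standard library's difflib.

```python
import math
import re
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]", "", text.lower())
    return " ".join(text.split())


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1] after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two lat/lon points."""
    radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(h))


def is_match(a: dict, b: dict, max_dist_m: float = 100.0,
             min_name: float = 0.85, min_addr: float = 0.7) -> bool:
    """Conservative match: distance, name AND address must all agree.

    The thresholds are made-up defaults, not values from the analysis.
    """
    if haversine_m(a["lat"], a["lon"], b["lat"], b["lon"]) > max_dist_m:
        return False
    if similarity(a["name"], b["name"]) < min_name:
        return False
    return similarity(a["address"], b["address"]) >= min_addr
```

Requiring all three signals to agree is what keeps false positives low, at the cost of throwing away real matches whose listings differ too much.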

The Results

Unfortunately, doing the analysis for an entire country would have taken far too long computationally and would certainly have run into API limits. Instead, below are the raw results for 6 different North American cities (San Francisco, Cambridge, Manhattan, Toronto, Orlando and Cleveland).

The first obvious conclusion here is that Foursquare has more venues and more checkins than Facebook Places. However, this does vary by geography and venue category. For example, in Orlando, Facebook is actually very close to Foursquare in terms of venue count, but in San Francisco, Foursquare has almost double the venues (see chart 1 and chart 2 below).

The graphs also show how many of the venues have been "matched" (that is, we think they are the same venue in both data sets). The actual overlap is likely higher than what is shown, because the algorithm is deliberately conservative and minimizes false positives in order to get accurate comparison data.

From these matched venues we can compare checkin counts from Facebook and Foursquare. The following graphs show the percentage of these matched venues for which either Foursquare or Facebook has more checkins than the other. For example, in Manhattan, only 8% of matched venues had more checkins on Facebook than on Foursquare:

Whereas in Cleveland, Facebook does almost twice as well, with more checkins at 14.2% of matched venues:

It seems that outside of major tech hubs (NY, SF and Toronto), Facebook Places does much better (though still nowhere near Foursquare). In Orlando, Facebook Places wins at a whopping 23.9% of matched venues. This seems reasonable: given Facebook's existing penetration throughout the US, it sees relatively more usage where the population isn't as drawn to new tech services like Foursquare.

Finally, we can look at how each service fares when broken down by venue category. For example, in Cambridge, we can see that Facebook Places "wins" (has more checkins than Foursquare does at the same place) when looking at travel spots. This seems consistent with the view that visitors from outside Cambridge are more likely to use Facebook, while Foursquare is very popular among people in Cambridge.
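The "wins" percentages quoted above, both overall and per category, boil down to a simple count over matched venues. Here is a small Python sketch of that calculation. The tuple layout and function names are assumptions, and counting ties as "not a Facebook win" is my own simplification.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (category, facebook_checkins, foursquare_checkins) for one matched venue.
# This record layout is an assumption for illustration.
Match = Tuple[str, int, int]


def facebook_win_share(matches: Iterable[Match]) -> float:
    """Percentage of matched venues with strictly more Facebook checkins."""
    matches = list(matches)
    if not matches:
        return 0.0
    wins = sum(1 for _, fb, fsq in matches if fb > fsq)
    return 100.0 * wins / len(matches)


def win_share_by_category(matches: Iterable[Match]) -> Dict[str, float]:
    """Same percentage, computed separately for each venue category."""
    groups = defaultdict(list)
    for match in matches:
        groups[match[0]].append(match)
    return {category: facebook_win_share(group) for category, group in groups.items()}
```

Feeding in the matched venues for one city gives that city's percentage, and grouping by category gives the per-category breakdown.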

Looking at other cities - one interesting trend is that for the most part, Facebook Places does best at College & University locations. Any guesses why (please comment)?

The above are only a few snapshots of all the data collected. You can find the full analysis + data sets here:

Additionally, and more for fun, I created a few visualizations of the data. Below is Facebook and Foursquare checkin data plotted on Google Earth. Pink lines represent Foursquare, green lines represent Facebook. The height of each line represents the number of checkins. This only shows checkin data for venues where I could find a match on both Facebook and Foursquare.

What doesn't this show?

This data is merely a snapshot (as of April 1, 2011); it doesn't show how things are changing over time. Is Facebook catching up in venues/checkins or is Foursquare growing faster? This analysis would have to be repeated multiple times over a long enough period to draw any meaningful conclusions (no historic data is available from the APIs).

Assumptions

One critical assumption in the analysis done is that the places and venues that were matched are a representative sample of all the places and venues that are in common between Facebook and Foursquare. However, I can't think of a reason why it wouldn't be representative (i.e. why would spelling errors of a large enough magnitude that my algorithm discarded the possible match, be limited to one specific segment of venues/places?).

Further Work

Improve matching algorithm - it currently matches very few of the overall venues. Is the algorithm too conservative, or is the overlap of places not actually very high? If the overlap isn't very high, what could this mean?

Any gotchas?

While the Facebook API has some fairly generous API limits, Foursquare on the other hand hobbles you with a 5,000 request limit/hour. When their results only return 50 items at a time, this can be limiting. One way to get around the limit is to register multiple API clients and cycle through them when making requests so you don't hit the API limit on any single client.
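To see why those two numbers are limiting, here is the back-of-the-envelope arithmetic as a short Python sketch. The 5,000 requests/hour and 50 items per response come from above; the function names are just for illustration.

```python
import math

REQUESTS_PER_HOUR = 5000   # Foursquare limit quoted above
ITEMS_PER_RESPONSE = 50    # results returned per request, as quoted above


def max_items_per_hour() -> int:
    """Upper bound on items fetched per hour if every response is full."""
    return REQUESTS_PER_HOUR * ITEMS_PER_RESPONSE


def min_seconds_between_requests(limit: int = REQUESTS_PER_HOUR) -> float:
    """Pacing interval that spreads requests evenly across the hour."""
    return 3600 / limit


def hours_needed(num_requests: int, limit: int = REQUESTS_PER_HOUR) -> int:
    """Whole hours needed to issue num_requests under the hourly limit."""
    return math.ceil(num_requests / limit)
```

In practice the ceiling is lower than 250,000 items an hour, since neighboring grid cells return overlapping venues and each per-venue checkin lookup also costs a request.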

Social Strategy Implications

Implications for Local Businesses: As a local business implementing a social strategy, you must be engaging with your customers on both platforms. While Foursquare has a lot of the buzz now (rightfully so), Facebook Places usage is not insignificant, especially outside of SF and NYC. Local businesses just starting to experiment with offering deals in location-based services clearly need to do the basics: 1) make sure their venue/place is on both services with up-to-date information and 2) check stats to see whether Facebook or Foursquare has more usage for their particular venue. It is likely worth offering the same deals on both services, as the multihoming cost isn't very high (just check activity on two websites) and you are likely to attract different sets of customers on each service (the high variation in checkins per category seems to confirm this).

The implications for Facebook and Foursquare are less clear. Having only a snapshot of this data at one particular point in time just tells us that Facebook is currently smaller (almost 10x smaller on average) than Foursquare. This seems obvious, as Facebook Places launched well after Foursquare. To get a real sense of which of the two is growing faster, you would need to look at aggregate user data (i.e. checkins per active user and how that has changed over time).

Assuming Facebook is currently losing this battle (i.e. not growing faster than Foursquare), there are several ways they could compete. First, they should leverage their success in photos + mobile. Currently in Facebook, you can tag other people in photos - why not be able to tag the place they are at too? I could imagine a pretty slick interface too - since mobile photos contain GPS coordinates in the EXIF data, Facebook could automatically suggest nearby locations when tagging a photo with a place. It could also automatically check in all the users tagged in the photo to the place when the photo is uploaded.
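As a rough sketch of how the photo idea could work (purely hypothetical, this is not anything Facebook exposes): EXIF stores GPS coordinates as degrees, minutes and seconds plus a hemisphere letter, so the first step is converting those to decimal degrees, and the second is ranking known places by distance. The function names, the place fields and the 200 meter radius are all assumptions.

```python
import math
from typing import List


def dms_to_decimal(degrees: float, minutes: float, seconds: float, ref: str) -> float:
    """Convert EXIF-style degrees/minutes/seconds + hemisphere to decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    return -value if ref in ("S", "W") else value


def distance_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Haversine distance in meters."""
    radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(h))


def suggest_places(lat: float, lon: float, places: List[dict],
                   radius_m: float = 200.0, limit: int = 5) -> List[str]:
    """Names of the closest places within radius_m of the photo, nearest first.

    Each place is assumed to be a dict with "name", "lat" and "lon" keys.
    """
    nearby = []
    for place in places:
        d = distance_m(lat, lon, place["lat"], place["lon"])
        if d <= radius_m:
            nearby.append((d, place["name"]))
    nearby.sort()
    return [name for _, name in nearby[:limit]]
```

The user would still pick from the suggestions, since GPS accuracy in a dense city block is rarely good enough to choose the exact venue automatically.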

Second, Facebook has the benefit of its users using the system for free-form status updates. If Facebook could extract or match places mentioned in a status update with the actual Facebook place, it could offer to automatically check in that user. More interesting, though, would be if Facebook alerted you when any of your friends have been to the same location, so you can ask them questions. For example, if I posted an update saying "Going to Cafe of India, food always smells great when I walk by....", Facebook should be able to use my current location + my update text to check me into Cafe of India in Cambridge. It should also alert me to all my friends who have been there so I can ask them what to order.

Friday, February 25, 2011

I love ZipCar. I save a ton of money not having to buy a car for my occasional driving use. There is no question that ZipCar is a great service - well thought out, great user experience, attention to detail, all the requisites for a great company.

As I was browsing around their site the other day, their environmental impact page claiming that ZipCar has green benefits caught my eye. Here's what they are claiming:

Screenshot of ZipCar's Environmental Impact Page

This got me thinking: is ZipCar claiming that using their service is better for the environment per given mile you drive? Or are they trying to say that members will end up driving less and that's where the benefits come in? If it's the latter, then ZipCar doesn't really have a green impact - it is just providing a less useful mode of transportation. It would be like saying that if we traded in all our cars for horses and buggies we would be more green. Sure, there would be fewer CO2 emissions, but at the cost of not having a workable mobility solution for society. Let's dig in a bit and see which of the two it is.

The first claim is that each ZipCar takes 15-20 personally owned vehicles off the road. Privately owned vehicles in the US tend to only have about a 5-10% utilization rate (and probably lower for ZipCar's demographic), so clearly sharing a single car among members could reduce the number of cars. However, this doesn't mean that there is less "driving" going on. Members are likely to drive the same amount as before, but in a shared vehicle. So in terms of the amount of emissions released, this seems to be unchanged by ZipCar. It's either x pounds of CO2 released by each of 15 cars, or 15x pounds of CO2 released by one car. There are benefits of course to sharing a vehicle - less parking is needed in cities, making finding a parking spot easier for other cars (ZipCars have their own dedicated spots). Less hunting for parking reduces emissions (by more than you think - some estimates peg the share of driving time spent looking for parking at 20%). However, unless ZipCar has a massive footprint, the savings here seem negligible.
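The "same emissions either way" point is just multiplication, sketched here in Python. The numbers in the usage note are placeholders picked to make the arithmetic easy, not real emissions data, and the function name is made up for illustration.

```python
def total_co2_lbs(vehicles: int, miles_per_vehicle: float, lbs_co2_per_mile: float) -> float:
    """Total CO2 for a fleet: vehicles x miles each x emissions per mile."""
    return vehicles * miles_per_vehicle * lbs_co2_per_mile


# Placeholder inputs (NOT real data): 15 members, 1,000 miles each, 1 lb CO2 per mile.
private = total_co2_lbs(vehicles=15, miles_per_vehicle=1000, lbs_co2_per_mile=1.0)
shared = total_co2_lbs(vehicles=1, miles_per_vehicle=15 * 1000, lbs_co2_per_mile=1.0)
```

Both come out to 15,000 lbs. The totals only diverge if members drive fewer total miles or if the shared car emits less per mile, which is exactly what the remaining claims hinge on.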

Next, ZipCar claims that people who use their service funnel their car savings into buying local and sustainable products. Is this real? Sounds like a guess to me, given there are no citations whatsoever. I highly suspect that they saw some correlation between ZipCar users and environmentally responsible citizens, and then presented it as causation rather than just correlation. There is no reason to believe that becoming a ZipCar user will suddenly cause you to start consuming in an environmentally responsible way.

ZipCar's final two arguments revolve around the assertion that ZipCar drivers drive fewer miles than private car owners. Well, how could this be? My view is that a customer has a certain mobility/transportation need. The customer then chooses, among the various transportation options that satisfy that need, the one with the lowest perceived cost. A user who switches from driving their own private car to driving a ZipCar still has the same mobility need and is thus likely to drive the same number of miles. In another case, people switching from public transit to ZipCar can still satisfy their mobility need, but are using a mode of transportation that is less environmentally friendly. This is not the fault of the consumer. ZipCar is a convenient and economical solution - but in this case not necessarily an environmentally friendly one. In the final case, you have users who did not have a previous mobility need but who, because of ZipCar's convenience and economics, actually increase their mobility - increasing the number of miles driven. I have yet to see a case where ZipCar's economics and convenience have led people to drive LESS - comment if you have some examples.

Now of course I could be wrong. Maybe ZipCar is environmentally friendly. The most compelling argument is that the existence of ZipCar will shift users from private vehicle ownership to public transit because they know they can use ZipCar as a backup. That is, some people may have turned away from public transit before because they knew they would need a vehicle in some cases. ZipCar allows users to shift to taking mostly public transit trips and using ZipCar for those rare cases where they need a car - overall, this causes fewer miles to be driven. This raises the question of how many people actually shift from private car ownership to mass public transit - a small number, I suspect.

Another possibility is that the fleet of ZipCars, on average, is more eco-friendly than the average American car. That is, if ZipCar uses a disproportionate number of hybrids and fuel-efficient vehicles, then less CO2 is being released for the same number of miles being driven. I will have to check the stats on ZipCar's fleet to confirm this.

Finally, ZipCar has a partnership with ZimRide, which encourages ride sharing. In this case, ZipCar is advertising another service that actually is environmentally friendly, so there is some benefit there. The question again arises of what percentage of trips are incrementally shared because of the ZimRide partnership. This is something GobiCab is trying to address, but in a more direct way - cab sharing at airports, where single passenger rides are common and the opportunity to share a ride because of similar destinations is also high.

I've painted a pretty dark picture of ZipCar, but this is not my intention. I think it is a great service, less wasteful, more economical and a general benefit to society - however, I believe some of the claims on ZipCar's site with respect to their environmental impact are a bit of a stretch. What do you think?