Inspiration

Ever wonder where the best place to set up your brand-new shop is? By mashing up Yelp data and open data sets, we can help you figure it out!

What it does

SweetSpot combines Yelp data for existing businesses with open data to predict where your new shop is likely to get the most walk-in traffic.

How we built it

We looked at Yelp data for cafes, restaurants, and other businesses. We used k-means clustering to group these businesses into geographic clusters, and for each cluster we computed statistics including business density (number of businesses divided by cluster area) and average star rating. For each business in the data, we also computed the distance to the nearest public transit station (bus or metro). We then built a regression model using the most significant features in the data to predict success for a new business. Finally, we plotted all real estate listings in the neighbourhood, overlaid with our computed suggestions for good locations to start your new business.
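The clustering and feature steps above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the code we ran: the coordinates and ratings are made up, the function names (`kmeans`, `cluster_stats`, `nearest_transit_km`) are hypothetical, k-means is run on raw latitude/longitude pairs as an approximation, and cluster area is assumed to be the bounding box of the cluster's members.

```python
import math
from collections import defaultdict

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means on (lat, lon) pairs; returns one cluster label per point."""
    centroids = list(points[:k])  # deterministic init, for illustration only
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its closest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = (sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members))
    return labels

def cluster_stats(businesses, labels):
    """Per-cluster count, average star rating, and density (businesses per km^2)."""
    groups = defaultdict(list)
    for biz, label in zip(businesses, labels):
        groups[label].append(biz)
    stats = {}
    for label, members in groups.items():
        lats = [b["lat"] for b in members]
        lons = [b["lon"] for b in members]
        # Assumption: cluster area is approximated by its bounding box.
        height = haversine_km((min(lats), min(lons)), (max(lats), min(lons)))
        width = haversine_km((min(lats), min(lons)), (min(lats), max(lons)))
        area = height * width
        stats[label] = {
            "count": len(members),
            "avg_stars": sum(b["stars"] for b in members) / len(members),
            "density_per_km2": len(members) / area if area > 0 else None,
        }
    return stats

def nearest_transit_km(biz, stations):
    """Distance in km from a business to its closest transit station."""
    return min(haversine_km((biz["lat"], biz["lon"]), s) for s in stations)

# Made-up sample data: four businesses and one transit station.
businesses = [
    {"lat": 45.500, "lon": -73.570, "stars": 4.0},
    {"lat": 45.502, "lon": -73.572, "stars": 3.0},
    {"lat": 45.540, "lon": -73.620, "stars": 5.0},
    {"lat": 45.542, "lon": -73.618, "stars": 4.0},
]
stations = [(45.501, -73.571)]

points = [(b["lat"], b["lon"]) for b in businesses]
labels = kmeans(points, k=2)
stats = cluster_stats(businesses, labels)
transit_km = [nearest_transit_km(b, stations) for b in businesses]

for label, s in sorted(stats.items()):
    print(label, s)
```

A real pipeline would likely project coordinates before clustering and use a better area estimate (for example, a convex hull), but the shape of the computation is the same: cluster, aggregate per cluster, then attach a transit distance to each business.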

Challenges we ran into

Cleaning the data, selecting features for our model, and visualizing the results in a useful way were all essential steps, and each was challenging to get right.

Accomplishments that we're excited about

We're excited about our project and think it has a lot of potential as a tool for prospective entrepreneurs. Deciding where to open a new business is difficult, and this tool uses data to compare candidate locations.

We were happy to find a large number of open data sets that we could use to enrich the original Yelp data and improve our model. The final model makes intuitive sense in that it places high significance on features such as proximity to public transit, neighbourhood, and the presence of other successful businesses nearby.

Created by

Mined shapefile data from the Montreal open data portal, merged the Yelp datasets with Montreal open data, cleaned and transformed the data, performed feature selection to identify important variables, fitted a linear regression model, and finally predicted the success metric for the commercial property listings on Remax.
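As a rough illustration of the last two steps (fitting a linear regression and scoring listings), here is a minimal sketch in plain Python. It is not the project's code: the training rows are synthetic, the two features (distance to transit in km and cluster average star rating) and the "success score" target are assumptions chosen for the example, and `fit_linear_regression`, `predict`, and the listing records are hypothetical names.

```python
def fit_linear_regression(X, y):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y.

    Returns [intercept, coef_1, ..., coef_n].
    """
    rows = [[1.0] + list(r) for r in X]  # prepend an intercept column
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        if abs(A[col][col]) < 1e-12:
            raise ValueError("features are collinear; cannot solve")
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

def predict(weights, features):
    """Apply fitted weights to one feature vector."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], features))

# Synthetic training data: (distance to transit in km, cluster average stars).
# The targets follow score = 5.0 - 2.0 * distance + 0.5 * stars exactly,
# purely so the fit can be checked; they are not real results.
X = [(0.1, 4.0), (0.5, 3.5), (1.0, 4.5), (0.3, 3.0), (0.8, 4.0)]
y = [5.0 - 2.0 * d + 0.5 * s for d, s in X]

weights = fit_linear_regression(X, y)

# Hypothetical commercial listings, already enriched with the same features.
listings = [
    {"id": "A", "transit_km": 0.2, "avg_stars": 4.2},
    {"id": "B", "transit_km": 0.9, "avg_stars": 3.8},
]
for listing in listings:
    listing["score"] = predict(weights, (listing["transit_km"], listing["avg_stars"]))

# Rank listings from most to least promising according to the model.
ranked = sorted(listings, key=lambda item: item["score"], reverse=True)
print([item["id"] for item in ranked])
```

With real data, the target would be whatever success metric is derived from the Yelp records, and the feature list would come out of the feature selection step rather than being fixed in advance.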