Looking for an open data store

Processing data is cool. Analyzing data is cool. Streaming data and processing it in real time is even cooler 🙂 Yet to make any of that happen, you need data…

You might get data from your business, but can you use it when you want to try a new technology on your own? So, like many others (I guess), I tried to get data from open data platforms. Data was easy to find, so I downloaded a dataset and tried my hand at analyzing it. I used a dataset called "Museums". Not knowing exactly what information I could extract from it, I first opened the file to see what was inside. To my great surprise, all the file contained was the name of each museum and its address…

At that moment, I was just wondering: Wait, what? What am I supposed to do with this?

I can tell you this is a real problem, because the message I take from it is: "Use this data. You will do nothing with it, but in the end, who cares…"

So what is the point of doing this? Giving access to data that brings nothing is just useless. You will try nothing with it, and you will learn nothing from it. After spending some time searching the Internet, I found data.world, a social network for data.

After signing up, you have access to tons of datasets on so many topics that you should find whatever you would like to work on. On this site, I could easily find datasets about pollution, politics, baseball…

If you are interested in investigating pollution, you can for instance download a 350 MB dataset of US pollution records from 2000 to 2016 (https://data.world/data-society/us-air-pollution-data). You will find thousands of datasets like this one. Admittedly, this is nothing compared to the streamed data you deal with at work, but I believe it is a good alternative when you want to try a new tool on a personal project.
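
Before starting any analysis on a downloaded file, it is worth checking what it actually contains, which is exactly what I should have done sooner with my museum file. Here is a minimal sketch in Python, assuming the dataset is a UTF-8 CSV file with a header row. The function name `summarize_csv` is my own and the file name in the comment is hypothetical; neither comes from data.world.

```python
import csv
from itertools import islice


def summarize_csv(path, sample_rows=5):
    """Return the column names, a few sample rows, and the total row count.

    Assumes a UTF-8 CSV file whose first line is a header.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Reading fieldnames consumes the header line; None means an empty file.
        columns = reader.fieldnames or []
        # Keep only the first few rows in memory as a preview.
        sample = list(islice(reader, sample_rows))
        # Count the remaining rows one at a time, without storing them.
        row_count = len(sample) + sum(1 for _ in reader)
    return columns, sample, row_count


# Hypothetical usage, with a file name made up for illustration:
# columns, sample, row_count = summarize_csv("us_pollution.csv")
# print(columns)
# print(row_count)
```

The file is read row by row, so a file of a few hundred megabytes should not need to fit in memory. If the columns turn out to be just a name and an address, at least you will know before investing more time.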