DESIGN THE WORLD

DESIGN DEV-OPS SERVICE

VE’AHAVTA PROJECT

OUR FB GROUP

WHOLE WORLD OF DESIGN

The Whole World of Design is an umbrella group that gathers information on graphic design, industrial design, visual communication, interior design, fashion design, architecture, environmental design and more.

STARTUPS IN ISRAEL

The group's goal is to share essential knowledge about all aspects of the technological ecosystem in Israel, and to help create connections between entrepreneurs, private investors and venture capital funds.

Members of the technological community are welcome to contact us with questions, proposals for cooperation, or requests for assistance. Good luck everyone!

EcoMotion Week 2020

ISOCI 2020

DESIGN 2020 Conference

Design Thinking 2020

InnoVEX 2020

Service Design Hong Kong 2020

The 50 Best Free Datasets for Machine Learning

Poor data quality is often the reason an algorithm gives poor predictions, and a clean, large, consistent dataset is difficult to obtain. I would like to recommend a cheat sheet listing sources of the best free datasets, which you might find useful for computer vision, sentiment analysis, NLP, financial predictions, etc.

What are some open datasets for machine learning? We at Gengo decided to create the ultimate cheat sheet for high-quality datasets. These range from the vast (looking at you, Kaggle) to the highly specific (data for self-driving cars).

First, a couple of pointers to keep in mind when searching for datasets. According to Dataquest:

A dataset shouldn’t be messy, because you don’t want to spend a lot of time cleaning data.

A dataset shouldn’t have too many rows or columns, so it’s easy to work with.

The cleaner the data, the better: cleaning a large dataset can be very time-consuming.

There should be an interesting question that can be answered with the data.
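The pointers above about messiness and size can be checked quickly before committing to a dataset. The following is a minimal sketch using only the Python standard library. The function name `summarize_dataset`, the list of missing-value markers, and the sample data are all illustrative assumptions, not part of the Dataquest guidance.

```python
import csv
import io


def summarize_dataset(csv_text, missing_markers=("", "NA", "NaN", "null")):
    """Return row count, column count, and the share of missing cells.

    The markers treated as "missing" are an assumption; adjust them to
    match the conventions of the dataset being inspected.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = list(reader)
    total_cells = len(rows) * len(header)
    # A cell counts as missing if the row is too short or the value
    # matches one of the assumed missing-value markers.
    missing = sum(
        1
        for row in rows
        for i in range(len(header))
        if i >= len(row) or row[i].strip() in missing_markers
    )
    return {
        "rows": len(rows),
        "columns": len(header),
        "missing_ratio": missing / total_cells if total_cells else 0.0,
    }


# Hypothetical sample data, for illustration only.
sample = """age,income,city
34,52000,Haifa
,61000,Tel Aviv
29,NA,
"""

report = summarize_dataset(sample)
print(report)
```

A high missing ratio or a very large row or column count suggests the dataset may need more cleaning than it is worth for a first project. Any cutoff is a judgment call rather than a fixed rule.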

UCI Machine Learning Repository: One of the oldest sources of datasets on the web, and a great first stop when looking for interesting datasets. Although the data sets are user-contributed, and thus have varying levels of cleanliness, the vast majority are clean. You can download data directly from the UCI Machine Learning repository, without registration.

General Datasets

Public Government datasets

Data.gov: This site makes it possible to download data from multiple US government agencies. Data can range from government budgets to school performance scores. Be warned though: much of the data requires additional research.

Berkeley DeepDrive BDD100k: Currently the largest dataset for self-driving AI. Contains over 100,000 videos covering more than 1,100 hours of driving across different times of day and weather conditions. The annotated images come from the New York and San Francisco areas.

Oxford’s Robotic Car: Over 100 repetitions of the same route through Oxford, UK, captured over a period of a year. The dataset captures different combinations of weather, traffic and pedestrians, along with long-term changes such as construction and roadworks.

Cityscapes Dataset: A large dataset that records urban street scenes in 50 different cities.

CSSAD Dataset: This dataset is useful for perception and navigation of autonomous vehicles. The dataset skews heavily toward roads found in the developed world.

If you think we’ve missed a dataset or two, let us know! And check out our more detailed list of datasets for natural language processing. Still can’t find what you need? Reach out to Gengo: we provide custom machine learning datasets. We manage the entire process, from designing a custom workflow to sourcing qualified workers for your specific project. Plus, our team includes more than 21,000 qualified native speakers of English as well as 36 other languages.