FAQs

How many datasets do I need?

One dataset per chapter is typical. Some courses use a single dataset throughout; in that case, it has to be rich and interesting enough to hold students' attention over the whole course. A few courses use multiple datasets in each chapter, but then you have to be careful not to spend all your time introducing datasets instead of teaching other concepts.

Where can I find datasets?

Here are a few data sources to get you started.

DataCamp can provide access to the datasets curated by ThinkNum. Ask your Curriculum Lead for details.

Can I create my own synthetic datasets?

Yes, but only if you can't find a real or semi-synthetic dataset. Students really enjoy using "real" datasets, so existing publicly available datasets are preferred over those generated from code.

What's a semi-synthetic dataset?

Semi-synthetic datasets are the data equivalent of a "based on a true story" movie. If you have a dataset that is not suitable for use on DataCamp due to licensing issues (for example, if it is commercially sensitive), then it is sometimes possible to anonymize it and change enough numbers that it retains the spirit of the original data but does not reveal anything commercially sensitive.
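
As a rough illustration of the idea, the sketch below replaces identifying names with opaque labels and jitters numeric values by a small random factor. The function name, field names, noise level, and sample records are all made up for this example and are not part of any DataCamp tooling. Real anonymization usually needs more care than this.

```python
import random


def make_semi_synthetic(rows, id_field, numeric_fields, noise=0.1, seed=42):
    """Return a copy of rows with IDs replaced and numeric values jittered.

    All parameter names here are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    aliases = {}  # maps each real identifier to an opaque label
    result = []
    for row in rows:
        new_row = dict(row)  # copy, so the original data is not modified
        original_id = row[id_field]
        if original_id not in aliases:
            aliases[original_id] = f"customer_{len(aliases) + 1:03d}"
        new_row[id_field] = aliases[original_id]
        for field in numeric_fields:
            # Scale each value by a random factor within +/- noise.
            factor = 1 + rng.uniform(-noise, noise)
            new_row[field] = round(row[field] * factor, 2)
        result.append(new_row)
    return result


# Made-up records standing in for a commercially sensitive dataset.
sales = [
    {"customer": "Acme Ltd", "revenue": 1200.0},
    {"customer": "Globex", "revenue": 830.0},
    {"customer": "Acme Ltd", "revenue": 455.0},
]
masked = make_semi_synthetic(sales, "customer", ["revenue"])
```

Repeated identifiers map to the same label, so per-customer patterns survive while the names and exact figures do not.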

Good ideas

Use the simplest dataset that will get your point across

Bigger datasets are not always better. In many cases, it is beneficial to have a dataset small enough that students can easily understand it in its entirety. Be ruthless about shrinking your dataset when you prepare it.
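
One possible way to shrink a dataset is to keep only the columns the exercises need and take a reproducible random sample of rows. The sketch below assumes the data is a list of dictionaries; the function name, field names, and generated records are hypothetical and only illustrate the approach.

```python
import random


def shrink(rows, keep_fields, n_rows, seed=1):
    """Keep only keep_fields and a reproducible random sample of n_rows rows."""
    rng = random.Random(seed)  # fixed seed so every build gives the same sample
    sample = rng.sample(rows, min(n_rows, len(rows)))
    return [{field: row[field] for field in keep_fields} for row in sample]


# Made-up records standing in for a larger raw dataset.
full = [
    {"id": i, "height_cm": 150 + i % 40, "weight_kg": 50 + i % 30, "notes": "n/a"}
    for i in range(1000)
]
small = shrink(full, ["height_cm", "weight_kg"], n_rows=20)
```

Fixing the seed keeps the sample stable between runs, which matters if exercise text quotes specific values from the data.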

Common problems and their solutions

Using the iris dataset

This dataset is very overused. Please choose a different one.

The dataset doesn't give the answer I wanted

Sometimes datasets give unexpected answers, and this can make the narrative in the exercise tricky or confusing. It is best to become reasonably familiar with your datasets during the course speccing phase, when it is easiest to swap a dataset for a different one.

How will this be reviewed?

Your Curriculum Lead will discuss your responses to the brainstorming questions with you. The responses will not be formally reviewed, though they provide important context for reviewers.