Validating

- [Voiceover] A topic that can be easily overlooked in data science analyses, unfortunately, is that of validating models. The question here is, Are you on target with your analysis? The problem is that many machine learning algorithms fail on implementation; many scientific studies cannot be replicated. And so there's the question of whether your analysis is actually giving you good insights into the general nature of the problem and not just the specific data at hand. To put it another way, your model fits the sample data beautifully.

It's tailored, it's great, but will it fit other data? This is considered the issue of generalizability or scalability. Now, one way of looking at this is with posterior probabilities, where you take information about your present data and you combine it with information about the past to get some sort of impression about the future. Most analyses give you the probability of the data given the hypothesis. Fine, that's the basis of standard testing, but there's more interest and more utility…
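As a rough illustration of the posterior-probability idea mentioned above, here is a minimal Python sketch of Bayes' rule for a single yes/no hypothesis. It is not taken from the course: the function name `posterior` and all of the numbers are hypothetical, chosen only to show how a prior (information about the past) combines with the probability of the data given the hypothesis (information about the present data) to give a posterior.

```python
def posterior(prior, p_data_given_h, p_data_given_not_h):
    """Return P(H | data) for a binary hypothesis H using Bayes' rule.

    prior              -- P(H), belief in H before seeing the data
    p_data_given_h     -- P(data | H), the quantity standard testing reports
    p_data_given_not_h -- P(data | not H)
    """
    # Total probability of the data, summed over both possibilities for H.
    evidence = p_data_given_h * prior + p_data_given_not_h * (1.0 - prior)
    if evidence == 0:
        raise ValueError("data has zero probability under both hypotheses")
    return p_data_given_h * prior / evidence


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    prior = 0.10               # assumed prior belief in the hypothesis
    p_data_given_h = 0.80      # assumed probability of the data if H is true
    p_data_given_not_h = 0.20  # assumed probability of the data if H is false

    post = posterior(prior, p_data_given_h, p_data_given_not_h)
    print(f"P(data | H) = {p_data_given_h:.2f}")
    print(f"P(H | data) = {post:.3f}")
```

With these made-up inputs the data are quite likely under the hypothesis, yet the posterior stays well below that figure because the prior is low, which is why the two probabilities should not be read as interchangeable.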


Author

Released: 7/5/2016

Introduction to Data Science provides a comprehensive overview of modern data science: the practice of obtaining, exploring, modeling, and interpreting data. While most people think only of the "big subject," big data, there are many more fields and concepts to explore. Here Barton Poulson explores disciplines such as programming, statistics, mathematics, machine learning, data analysis, visualization, and (yes) big data. He explains why data scientists are now in such demand, and the skills required to succeed in different jobs. He shows how to obtain data from legitimate open-source repositories via web APIs and page scraping, and introduces specific technologies (R, Python, and SQL) and techniques (support vector machines and random forests) for analysis. By the end of the course, you should better understand data science's role in drawing meaningful insights from the large, complex sets of data all around us.