Data mining and analytics

Learn how data mining and analytics work by analyzing different aspects of these two fascinating disciplines of data science. Explore core areas of data mining and analytics, such as text retrieval, classification, prediction, and clustering.

- [Instructor] Data mining and analytics involve a myriad of data manipulation techniques. Text retrieval is one of the most well-known data mining techniques. It builds on many foundational concepts and methods developed by Natural Language Processing, or NLP. Classification constructs a model that labels a group of data objects into a specific category. In the classification model, the classes with their own labels are discrete in nature.

For instance, the same classification model can categorize people into groups of trustworthy and untrustworthy users of an online banking system. Prediction builds a model that produces continuous or ordered values that form a trend. For instance, a prediction model can provide estimated mean time to failure, or MTTF, values for a computer. Clustering is a process of grouping similar data objects into a class.
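The difference between classification (discrete labels) and prediction (continuous values) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and does not come from the course: the single feature (failed logins per month), the operating-temperature data, the nearest-centroid rule, and the least-squares line are simple stand-ins for whatever models a real system would use.

```python
from statistics import mean

# Hypothetical training data: (failed logins per month, label).
labelled_users = [
    (0, "trustworthy"), (1, "trustworthy"), (2, "trustworthy"),
    (8, "untrustworthy"), (9, "untrustworthy"), (12, "untrustworthy"),
]


def fit_centroids(samples):
    """Return the mean feature value observed for each class label."""
    by_label = {}
    for value, label in samples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}


def classify(value, centroids):
    """Classification: the output is one of a fixed set of discrete labels."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


# Hypothetical history: (average operating temperature in C, MTTF in hours).
mttf_history = [(30, 9000), (40, 8000), (50, 7000), (60, 6000)]


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in points)
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar


def predict_mttf(temperature, slope, intercept):
    """Prediction: the output is a continuous value on a trend line."""
    return slope * temperature + intercept


centroids = fit_centroids(labelled_users)
slope, intercept = fit_line(mttf_history)
print(classify(3, centroids))               # a discrete class label
print(predict_mttf(55, slope, intercept))   # a continuous estimate in hours
```

The classifier can only ever return one of the labels it was trained on, while the prediction function can return any value along the fitted trend, including values for temperatures that never appeared in the history.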

Clustering helps reveal features that distinguish one class of data objects from the other, leading to new discoveries on a dataset. Uses of clustering analysis range from pattern recognition…
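Grouping similar data objects without predefined labels can be sketched in a few lines of Python. The transcript does not name a specific algorithm, so k-means is used here purely as one common example; the sample points, the choice of k = 2, and the deterministic initialization from the first k points are all assumptions made for illustration.

```python
from math import dist


def k_means(points, k, max_iterations=100):
    """Group points into k clusters; returns (labels, centroids)."""
    # Assumption: seed the centroids with the first k points so the
    # result is reproducible. Real implementations usually randomize.
    centroids = [points[i] for i in range(k)]
    labels = None
    for _ in range(max_iterations):
        # Assign every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


# Hypothetical, unlabelled 2-D data objects.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]
labels, centroids = k_means(points, k=2)
print(labels)
print(centroids)
```

Unlike the classification case, no labels are supplied up front. The groups emerge from the data, and the resulting centroids summarize the features that distinguish one group from the other.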

Released: 1/26/2018

The career opportunities in data science, big data, and data analytics are growing dramatically. If you're interested in changing career paths, determining the right course of study, or deciding if certification is worth your time, this course is for you.

Jungwoo Ryoo is a professor of information science and technology at Penn State. Here he reviews the history of data science and its subfields, explores the marketplaces for these fields, and reveals the five main skill areas: data mining, machine learning, natural language processing (NLP), statistics, and visualization. This leads to a discussion of the five biggest career opportunities, the six leading industry-recognized certifications available, and the most exciting emerging technologies. Along the way, Jungwoo discusses the importance of ethics and professional development, and provides pointers to online resources for learning more.

Topics include:

A history of data science

Why data analytics is important

How data science is used in fraud detection, disease control, network security, and other fields