Do you have data and wonder what it can tell you? Do you need a deeper understanding of the core ways in which machine learning can improve your business? Do you want to be able to converse with specialists about anything from regression and classification to deep learning and recommender systems?
In this course, you will get hands-on experience with machine learning from a series of practical case-studies. At the end of the first course you will have studied how to predict house prices based on house-level features, analyze sentiment from user reviews, retrieve documents of interest, recommend products, and search for images. Through hands-on practice with these use cases, you will be able to apply machine learning methods in a wide range of domains.
This first course treats the machine learning method as a black box. Using this abstraction, you will focus on understanding tasks of interest, matching these tasks to machine learning tools, and assessing the quality of the output. In subsequent courses, you will delve into the components of this black box by examining models and algorithms. Together, these pieces form the machine learning pipeline, which you will use in developing intelligent applications.
Learning Outcomes: By the end of this course, you will be able to:
-Identify potential applications of machine learning in practice.
-Describe the core differences in analyses enabled by regression, classification, and clustering.
-Select the appropriate machine learning task for a potential application.
-Apply regression, classification, clustering, retrieval, recommender systems, and deep learning.
-Represent your data as features to serve as input to machine learning models.
-Assess the model quality in terms of relevant error metrics for each task.
-Utilize a dataset to fit a model to analyze new data.
-Build an end-to-end application that uses machine learning at its core.
-Implement these techniques in Python.

MK

A great course, really designed to understand the underlying core concepts of machine learning using real-life examples which takes you through all that with little to no programming skills required!

DS

Sep 28, 2015

Excellent course, with really good lectures, material and assignment. Plus the professors are really amazing and their enthusiasm is really refreshing and makes the class more interesting. Loved it!

From the lesson

Clustering and Similarity: Retrieving Documents

A reader is interested in a specific news article and you want to find similar articles to recommend. What is the right notion of similarity? How do I automatically search over documents to find the one that is most similar? How do I quantitatively represent the documents in the first place?

In this third case study, retrieving documents, you will examine various document representations and an algorithm to retrieve the most similar subset. You will also consider structured representations of the documents that automatically group articles by similarity (e.g., document topic).

You will actually build an intelligent document retrieval system for Wikipedia entries in an IPython notebook.

Taught By

Carlos Guestrin

Amazon Professor of Machine Learning

Emily Fox

Amazon Professor of Machine Learning

Transcript

[MUSIC] So the first thing we need to describe is how we're gonna represent the documents that we're looking at. Okay, so perhaps the most popular model to represent a document is something called the bag of words model, where we simply ignore the order of the words that are present in the document. And the reason it's called a bag of words model is we think of taking a bag, throwing all the words from that document into the bag, and shaking it up. The new document we've created, with the words all jumbled up, has exactly the same representation, as I'll describe, as the original document where the words were ordered. And what we're gonna do, instead of considering the structure, the order of the words, is simply count the number of instances of every word in the document.

So let's look at a specific example of this. In this document we're gonna imagine that there's just one sentence, and that sentence says that Carlos calls the sport futbol, Emily calls the sport soccer. Actually, I guess that's really two sentences, but that's the entire document. And what we're gonna do to count the number of instances of words in this very short document is look at a vector. This vector is defined over the vocabulary in whatever language we're looking at. So maybe one word in our vocabulary is the name Carlos. Another place in this vector is the index for the word sport. And then somewhere else we have the word futbol, luckily in our English vocabulary that I'm writing here, and then let's say Emily is this last entry. What words are we missing? We're missing the word calls, and of course the word the.

Okay, so how many instances of Carlos? Well, there's only one. How many instances of the? We have two of the. Two of calls, two of sport, one of futbol, and I forgot the word soccer. One of Emily, and let's just throw soccer in here, imagining this was the index, and this would be our word count vector.
For this document, every other entry would be zero. And all these other entries represent all the other words that are out there in the vocabulary, like the word cat, and dog, and tree, and every other word you can think of. So it's a very, very long and sparse vector that counts the number of words that we see in this document.

Okay, so we talked about this representation of our documents in terms of just these raw word counts, this bag of words model. And now we want to talk about how we're gonna measure the similarity between different documents, because we're gonna use that in order to find documents that are related to one another and so on, like we talked about before. Carlos is reading an article, so what's another article he might be interested in?

Okay, so imagine that this is the count vector that we have for this article on soccer, with this famous Argentinian player, Messi. And then there's another article here that I'm showing in blue, with its associated word counts. And this article is about another famous soccer player, Pele. Is that right? >> Pele. >> Pele. [LAUGH]

So when we think about measuring similarity, what we can do is simply look at an element-wise product over these vectors. So for every element in the vector, we're gonna multiply the two elements appearing in these two different count vectors, and add up over all the different elements. So here I've done this math, where we have 1 times 3, all the other elements multiply to 0, except at some point, that fifth entry in the vector, we have 5 times 2. And if we do this multiplication over the whole vector, the sum of these terms is 13. So that measures the similarity between these two articles on soccer. But now let's compare to another article, which happens to be something about a conflict in Africa. And so I'm providing the examples of word counts that appear in this article.
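As a rough illustration of the bag of words representation described above, here is a minimal Python sketch that counts the words in the two-sentence example document. The function name bag_of_words and the lowercase, letters-only tokenization are assumptions made for this illustration; the lecture does not specify them. Storing only the words that actually occur is one way to handle the very long, sparse vector: any word that is not in the document implicitly has a count of zero.

```python
import random
import re
from collections import Counter


def bag_of_words(text):
    """Return a word -> count mapping that ignores word order."""
    # Assumption: lowercase the text and keep runs of letters as words.
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)


doc = "Carlos calls the sport futbol. Emily calls the sport soccer."
counts = bag_of_words(doc)

print(counts["carlos"])  # 1
print(counts["calls"])   # 2
print(counts["the"])     # 2
print(counts["sport"])   # 2
print(counts["cat"])     # 0, a vocabulary word that is absent from the document

# Shaking up the bag: jumbling the word order leaves the representation unchanged.
words = re.findall(r"[a-z]+", doc.lower())
random.shuffle(words)
shuffled_counts = bag_of_words(" ".join(words))
print(shuffled_counts == counts)  # True
```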
And what we see is, when we go to measure the similarity between these articles using the method that I described, of element-wise product and then adding, the similarity in this case is 0.

Okay, so let's talk about an issue that arises when we use these raw word counts to measure the similarity between documents. To do this, let's look at these green and blue articles that we had before. And so I'm repeating the word count vectors that we had, and what we calculated before was the fact that the similarity between these two articles that are both about soccer is 13. Okay, but now let's look at what happens if we simply double the length of the documents. So now every word that appeared in the original document appears twice in this twice-as-long document. So the word count vector is simply two times the word count vector we had before. So when we go to calculate the similarity here, what we see is that now the similarity is calculated to be 52.

So let's think about this. What we're saying is that two documents that are related to each other in the same way as before (they're both talking about the same sport, just with each one replicated twice) are a lot more similar. We would say, yes, Carlos is a lot more interested in this longer document than he was when he was reading the shorter documents. So this doesn't make a lot of sense when we're trying to do document retrieval, and it biases very strongly towards long documents.

So let's think about how we can cope with this. One solution is very straightforward, where we're simply gonna normalize this vector. So we take this word count vector and we compute the norm of the vector. And if you guys remember computing the norm of a vector, we simply add the square of every entry in the vector, and then take the square root. So in this case, we have the square root of 1 squared plus 5 squared plus 3 squared plus 1 squared.
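The similarity calculation above, and the bias towards long documents, can be sketched in a few lines of Python. The words used as keys below are hypothetical, since the transcript does not say which words the slide shows; only the counts (1, 5, 3, 1 for the green article, with 3 and 2 in the overlapping positions of the blue article) follow the lecture. The helper names dot and scale are also made up for this sketch.

```python
def dot(a, b):
    """Multiply matching word counts element-wise and add up the products."""
    return sum(count * b.get(word, 0) for word, count in a.items())


def scale(vec, k):
    """Word counts for a document whose text is repeated k times."""
    return {word: k * count for word, count in vec.items()}


# Hypothetical words; the counts are chosen to match the numbers in the lecture.
green = {"soccer": 1, "goal": 5, "messi": 3, "argentina": 1}
blue = {"soccer": 3, "goal": 2, "pele": 4, "brazil": 1}
africa = {"conflict": 4, "africa": 3, "peace": 1}

print(dot(green, blue))    # 13, from 1*3 + 5*2
print(dot(green, africa))  # 0, the articles share no words

# Doubling the length of both documents quadruples the raw similarity.
print(dot(scale(green, 2), scale(blue, 2)))  # 52
```

Nothing about the relationship between the two articles changed in the last line, which is why the raw count similarity is hard to use for retrieval on its own.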
And that happens to be the number 6, and so the resulting normalized word count vector is shown on the bottom of the slide here. And what this does is allow us to place all of the articles that we're considering, regardless of their length, on equal footing. And then we use this normalized vector when we go to do retrieval. [MUSIC]
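Here is a small Python sketch of the normalization step, assuming the same hypothetical word labels as a stand-in for the slide (only the counts 1, 5, 3, 1 come from the lecture). The function names l2_norm and normalize are chosen for this illustration. It checks that, once both vectors are divided by their norms, doubling the documents no longer changes the similarity.

```python
import math


def dot(a, b):
    """Multiply matching entries element-wise and add up the products."""
    return sum(value * b.get(word, 0) for word, value in a.items())


def l2_norm(vec):
    """Square root of the sum of the squared entries."""
    return math.sqrt(sum(value ** 2 for value in vec.values()))


def normalize(vec):
    """Divide every entry by the vector's norm."""
    norm = l2_norm(vec)
    return {word: value / norm for word, value in vec.items()}


# Hypothetical words; the counts follow the lecture's example.
green = {"soccer": 1, "goal": 5, "messi": 3, "argentina": 1}
blue = {"soccer": 3, "goal": 2, "pele": 4, "brazil": 1}

print(l2_norm(green))  # 6.0, the square root of 1 + 25 + 9 + 1

# The same two articles with every word appearing twice.
green_doubled = {word: 2 * count for word, count in green.items()}
blue_doubled = {word: 2 * count for word, count in blue.items()}

sim_short = dot(normalize(green), normalize(blue))
sim_long = dot(normalize(green_doubled), normalize(blue_doubled))
print(math.isclose(sim_short, sim_long))  # True
```

The similarity between two normalized count vectors is commonly known as cosine similarity, a term the lecture does not use here.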
