>>> By enrolling in this course you agree to the End User License Agreement as set out in the FAQ. Once enrolled you can access the license in the Resources area <<<
This course, Advanced Machine Learning and Signal Processing, is part of the IBM Advanced Data Science Specialization, which IBM is currently creating. It gives you easy access to invaluable insights into the supervised and unsupervised machine learning models used by experts in many relevant disciplines. We'll learn the fundamentals of linear algebra to understand how machine learning models work. Then we introduce the most popular machine learning frameworks for Python: Scikit-Learn and SparkML. SparkML makes up the greatest portion of this course, since scalability is key to addressing performance bottlenecks. We learn how to tune models by evaluating hundreds of different parameter combinations in parallel. We'll continuously use a real-life example from IoT (Internet of Things) to illustrate the different algorithms. To pass the course you are even required to create your own vibration sensor data using the accelerometer sensors in your smartphone, so you are actually working on a self-created, real dataset throughout the course.
If you choose to take this course and earn the Coursera course certificate, you will also earn an IBM digital badge. To find out more about IBM digital badges follow the link ibm.biz/badging.

Instructors

Romeo Kienzler

Nikolay Manchev

Transcript

Welcome to week three: Unsupervised Learning. Unsupervised learning is mostly about distances between point clouds, so let's have a look at distances in hyper-dimensional vector spaces. Consider these three points in this 3D space. How can we actually measure the distance between those points?

In one dimension it's pretty straightforward. Imagine we have a point x1 at position two and a point x2 at position 10. Then we can calculate the distance by subtracting x1 from x2. If we want to subtract x2 from x1 instead, we have to take the absolute value. To keep it simple, you can subtract either point from the other, square the result, and take the square root, which always gives a non-negative distance. This will also come in handy later.

Now, what happens in 2D space? In 2D space it's a bit more complicated. The distance is the square root of x1 minus x2 to the power of two plus y1 minus y2 to the power of two. This might look familiar because it's basically the Pythagorean theorem. If you consider c squared equals a squared plus b squared and take the square root on the left-hand and right-hand side, you get c equals the square root of a to the power of two plus b to the power of two. In the same way we can compute distances between points in vector spaces of any dimension, with one squared difference per dimension under the square root. This is the basic formula, and you can also write it as a sum. It is called the Euclidean distance, and it's the one we are mostly using.

I will give you an example of another distance measure, not because you need it but because it gives you the intuition that there exist other distance measures which might be useful in certain use cases. In Manhattan distance, we consider the distance between two points as travelled by a taxi in downtown New York City. That means we cannot take the direct connection; we have to follow the street grid. It turns out it doesn't matter which path you take, as long as you are taking a shortest path.
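The two distance measures described above can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library: the function names `euclidean` and `manhattan` and the sample points other than x1 = 2 and x2 = 10 are assumptions made for this example, not part of the lecture.

```python
import math


def euclidean(p, q):
    """Euclidean distance: square root of the sum of squared differences."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """Manhattan distance: sum of absolute differences along each axis."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    return sum(abs(a - b) for a, b in zip(p, q))


# 1D example from the lecture: x1 at position 2, x2 at position 10.
d_1d = euclidean([2], [10])

# 2D example (hypothetical points): a 3-4-5 right triangle.
d_2d = euclidean([0, 0], [3, 4])

# Same two points, but travelling along the grid instead of the diagonal.
d_taxi = manhattan([0, 0], [3, 4])

# The same function works in any number of dimensions (hypothetical 3D points).
d_3d = euclidean([1, 2, 3], [4, 6, 3])

print(d_1d, d_2d, d_taxi, d_3d)
```

In the 1D case the order of the points does not matter, because the difference is squared before the square root is taken. For the two 2D points, the Manhattan distance is larger than the Euclidean one because the taxi cannot cut across the diagonal.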
So there exist multiple shortest paths between two points in Manhattan distance. Manhattan distance is, for example, a bit more outlier resistant than Euclidean distance, and it's better suited for high-dimensional data, but all this is beyond the scope of this course. So, unsupervised machine learning is mostly about computing distances between points in a hyper-dimensional vector space. This actually also holds for supervised machine learning, but the fact that you have a target or label makes it easier to ignore this fact.
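To make the idea of "computing distances between points" concrete, here is a minimal sketch that computes all pairwise Euclidean distances in a tiny point cloud and picks the closest pair. The three 3D points and their labels are hypothetical values chosen for illustration; they are not taken from the lecture slides.

```python
import math
from itertools import combinations


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical point cloud: three labelled points in 3D space.
points = {
    "A": (0.0, 0.0, 0.0),
    "B": (1.0, 2.0, 2.0),
    "C": (4.0, 0.0, 3.0),
}

# Distance for every unordered pair of points.
pairwise = {
    (m, n): euclidean(points[m], points[n])
    for m, n in combinations(sorted(points), 2)
}

# The pair with the smallest distance is the most "similar" pair.
closest_pair = min(pairwise, key=pairwise.get)

for pair, dist in pairwise.items():
    print(pair, round(dist, 3))
print("closest pair:", closest_pair)
```

Many unsupervised methods start from exactly this kind of pairwise comparison, although real implementations use vectorised libraries instead of plain Python loops.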