Deep Learning and the Artificial Intelligence Revolution: Part 2

In part 1, we looked at the history of AI and why it is taking off now.

In today’s part 2, we will discuss the differences between AI, Machine Learning, and Deep Learning.

In part 3, we’ll dive deeper into deep learning and evaluate key considerations when selecting a database for new projects.

We’ll wrap up in part 4 with a discussion on why MongoDB is being used for deep learning, and provide examples of where it is being used.

In many contexts, artificial intelligence, machine learning, and deep learning are used interchangeably, but in reality, machine learning and deep learning are subsets of AI. We can think of AI as the branch of computer science focused on building machines capable of intelligent behaviour, while machine learning and deep learning are the practice of using algorithms to sift through data, learn from the data, and make predictions or take autonomous actions. Therefore, instead of being programmed with specific rules to follow, the algorithm is trained on large amounts of data, which gives it the ability to independently learn, reason, and perform a specific task.

So what’s the difference between machine learning and deep learning? Before defining deep learning, which we’ll do in part 3, let’s dig deeper into machine learning.

Machine Learning: Supervised vs. Unsupervised

There are two main classes of machine learning approaches: supervised learning and unsupervised learning.

Supervised Learning. Currently, supervised learning is the most common machine learning approach. With supervised learning, the algorithm takes input data that has been manually labeled by developers and analysts, and uses it to train the model and generate predictions. Supervised learning problems fall into two groups: regression and classification.

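Figure 2 demonstrates a simple regression problem. Here, there are two variables: an input, or feature (square feet), and a labeled output (price). The algorithm uses the labeled examples to fit a curve through the data, which can then be used to predict the price of other properties from their square footage. Regression problems result in continuous outputs.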
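To make the regression idea concrete, here is a minimal sketch in Python that fits a straight line to labeled examples using ordinary least squares. The training data, the function names, and the choice of a straight line rather than a more general curve are all assumptions made for illustration; they are not taken from Figure 2.

```python
# Hypothetical training data: (square feet, sale price in USD).
# The values are invented for illustration only.
training_data = [
    (1000, 200_000),
    (1500, 275_000),
    (2000, 350_000),
    (2500, 425_000),
]


def fit_line(points):
    """Ordinary least squares for a single feature.

    Returns (slope, intercept) of the line that minimises the squared
    error between predicted and labeled prices.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_price(square_feet, slope, intercept):
    """Predict a continuous output (price) for an unseen input."""
    return slope * square_feet + intercept


slope, intercept = fit_line(training_data)
estimate = predict_price(1800, slope, intercept)
print(f"Estimated price for 1,800 sq ft: ${estimate:,.0f}")
```

The labeled prices are what make this supervised: the line is chosen to agree with known answers, and only then is it used on a property whose price is unknown.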

Figure 3 is an example of supervised classification. The dataset is labeled with benign and malignant tumors for breast cancer patients. The supervised classification algorithm will attempt to segment tumors into two different classes by fitting a straight line through the data. Future data can then be classified as benign or malignant depending on which side of the line it falls. Classification problems result in discrete outputs, but the number of classes is not limited to two. Figure 3 has only two discrete outputs, but there could be many more classifications (benign, Type 1 malignant, Type 2 malignant, etc.).
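The straight-line classification described above can be sketched with a perceptron, one of the simplest algorithms that learns a linear boundary from labeled data. The article does not name a specific algorithm, so the perceptron is our choice for brevity. The two features, the data points, and all function names are hypothetical.

```python
# Hypothetical labeled data: two made-up measurements per tumor.
# Label 0 = benign, label 1 = malignant. Values are invented.
samples = [
    (1.0, 2.0), (1.5, 1.0), (2.0, 2.5), (1.0, 1.0),   # benign
    (4.0, 4.5), (5.0, 3.5), (4.5, 5.0), (5.5, 4.5),   # malignant
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]


def classify(point, weights, bias):
    """Return 1 if the point is on the positive side of the line, else 0."""
    score = weights[0] * point[0] + weights[1] * point[1] + bias
    return 1 if score > 0 else 0


def train_perceptron(samples, labels, max_epochs=1000):
    """Learn a separating line with the perceptron update rule.

    Whenever a training point is misclassified, the line is nudged
    towards classifying it correctly. Training stops after a full
    pass with no mistakes, or after max_epochs passes.
    """
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for point, label in zip(samples, labels):
            error = label - classify(point, weights, bias)
            if error != 0:
                weights[0] += error * point[0]
                weights[1] += error * point[1]
                bias += error
                mistakes += 1
        if mistakes == 0:
            break
    return weights, bias


weights, bias = train_perceptron(samples, labels)
names = ["benign", "malignant"]
for new_point in [(1.0, 1.5), (5.0, 4.5)]:
    print(new_point, "->", names[classify(new_point, weights, bias)])
```

A perceptron only settles on a boundary when the two classes can be separated by a straight line, which is true of this toy dataset by construction. Real tumor data is unlikely to be this clean.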

Unsupervised Learning. In our supervised learning example, labeled datasets (benign or malignant classifications) help the algorithm determine what the correct answer is. With unsupervised learning, we give the algorithm an unlabeled dataset and depend on the algorithm to uncover structures and patterns in the data.

In Figure 4, there is no information about what each data point represents, so the algorithm is asked to find structure in the data without any supervision. Here, the unsupervised learning algorithm might determine there are two distinct clusters and draw a straight-line boundary between them. Unsupervised learning is applied in many use cases, such as Google News, social network analysis, market segmentation, and astronomical analysis around galaxy formations.
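As a rough illustration of finding clusters in unlabeled data, the sketch below uses k-means, a common clustering algorithm that the article itself does not name. The data points are invented. The number of clusters (two) is supplied by us rather than discovered, and the choice of initial centroids is a simplifying assumption that keeps the example deterministic.

```python
import math

# Hypothetical unlabeled data: no point carries a class label.
points = [
    (1.0, 1.0), (1.5, 2.0), (2.0, 1.5),
    (8.0, 8.0), (8.5, 9.0), (9.0, 8.5),
]


def nearest_centroid(point, centroids):
    """Index of the centroid closest to the point (Euclidean distance)."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))


def k_means(points, initial_centroids, max_iterations=100):
    """Alternate between assigning points and recomputing centroids
    until the assignments stop changing."""
    centroids = list(initial_centroids)
    assignments = None
    for _ in range(max_iterations):
        new_assignments = [nearest_centroid(p, centroids) for p in points]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        for index in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == index]
            if members:  # keep the old centroid if a cluster is empty
                centroids[index] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return assignments, centroids


# Assumption: start from the first and last points. Real implementations
# usually pick starting centroids randomly or with a smarter heuristic.
assignments, centroids = k_means(points, [points[0], points[-1]])
print("Cluster per point:", assignments)
print("Cluster centres:", centroids)
```

Unlike the supervised examples, nothing here tells the algorithm what the groups mean. It only reports that the points fall into two groups, and interpreting those groups is left to the analyst.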

Wrapping Up Part 2

That wraps up the second part of our 4-part blog series. In Part 3, we’ll dive deeper into deep learning and evaluate key considerations when selecting a database for new projects.