Data Analytics, Artificial Intelligence, FinTech and BlockChain

Machine Learning – Introduction to Its Algorithms – MLAlgos

MLAlgos (Machine Learning Algorithms) – Describing and picturing machine learning algorithms is the main idea of this post. We will also attempt to answer a few basic questions. These questions have been answered many times before, and the answers are widely available on the open internet. Answering them again here from my own hands-on experience may add something that answers drawn purely from textbook material do not.

Points Covered in this Post:

This post is limited to the index items below. The points are covered at a high level, from a layman’s perspective, in simple English. Anyone looking for detailed explanations or code should get in touch with me. This post is divided into 3 parts:

Machine Learning at a Glance

Types of Machine Learning

Algorithms in Machine Learning – MLAlgos

As a matter of policy, we never share any code unless it is open source or a link to published work. Some of you might wonder why we need to know all of these algorithms, or what exactly each term means. We have gathered all of that information in one place, i.e. in this blog post.

Machine Learning at a Glance

Machine learning is a subset of Artificial Intelligence that borrows principles from computer science. It is not AI in itself, though; it is a focal point where business, data and experience meet emerging technology and work together. Machine learning is a way to achieve AI. ML helps us get up to speed and understand how Artificial Intelligence will impact global business, including the role of machine learning and deep learning in healthcare, transportation, customer service, manufacturing and financial services.

The landscape of Artificial Intelligence as it stands today is gravitating more towards machine learning and deep learning. Machine learning today helps organisations devise a strategy to move forward with a focus on the company’s most pressing points, i.e. its business and revenue growth. As of today, supervised learning is king and has something of a monopoly in the machine learning domain, but as machine learning advances towards deep learning, the day may not be far off when unsupervised learning becomes far more important, or even the main success factor.

Machine learning depends entirely on algorithms, which can be grouped in two ways: by learning style, and by symmetry and similarity. Machine learning is now the hottest subject and data scientist is called the sexiest job of today, but real-life business implementation has not reached the level of the hype. The real need today is to demonstrate, clarify and extract real value and benefit for the business that everyone can enjoy. Why is ML doing so well today? There are several reasons, some of which are listed below:

The explosion of big data

Hunger for new business and revenue streams in this business shrinking times

Advancements in machine learning algorithms

Development of extremely powerful machines with high capacity and faster computing ability

Storage capacity

There are success stories where organisations have made remarkable progress and added value for the business with each type of learning. Making the right choice of which ML technique to use for a particular business problem requires experience and a thorough understanding.

Each type of machine learning provides a strategic and competitive advantage, but the availability of quality data, on the basis of which a technique is chosen, is far more important. It is extremely important to know the types of machine learning algorithms and which one to use when. The goal of any machine learning task, and of everything being done in the field, is to break a real problem down into a design that machine learning systems can work on.

Until recently, ML remained largely confined to academia, even though its foundation was laid in 1950. Only recently has it got the spotlight and attention from industry. Everyone now talks about machine learning use cases like face recognition, image captioning, voice and text processing, and self-driving cars.

Types of Machine Learning

Before we get into MLAlgos, let’s understand some basics. The approach to developing ML involves learning from data inputs based on “what has happened”. Evaluating and optimising the results of different models remains the focus here. As of today, machine learning is widely used in data analytics as a method to develop algorithms for making predictions on data. It is related to probability, statistics, and linear algebra.

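Machine learning is classified into four categories at a high level, depending on the nature of the learning and the learning system: supervised, unsupervised, semi-supervised and reinforcement learning. Unsupervised learning, where the data carries no labels at all, is not described separately below. Semi-supervised learning is the most interesting of them all: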

Supervised learning: Supervised learning gets labelled inputs and their desired outputs. The goal is to learn a general rule that maps inputs to outputs.

Reinforcement learning: Here the algorithm interacts with a dynamic environment, in which it must achieve a certain goal without a guide or teacher.

Semi-supervised learning: Semi-supervised algorithms are the best candidates for model building when labels are missing for some of the data. So if the data is a mix of labelled and unlabelled examples, this can be the answer. Typically a small amount of labelled data is used together with a large amount of unlabelled data.
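To make the semi-supervised idea a little more concrete, here is a minimal sketch of one common approach, self-training, in plain Python. It is only an illustration under stated assumptions: the nearest-centroid classifier, the function names and the toy numbers are all made up for this example and are not taken from any particular library or project.

```python
def centroids(labelled):
    """Return the mean feature value for each label."""
    sums, counts = {}, {}
    for x, label in labelled:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(cents, x):
    """Pick the label whose centroid is closest to x."""
    return min(cents, key=lambda label: abs(x - cents[label]))


def self_train(labelled, unlabelled):
    """One round of self-training: pseudo-label, then refit on everything."""
    cents = centroids(labelled)
    pseudo = [(x, predict(cents, x)) for x in unlabelled]
    return centroids(labelled + pseudo)


# Hypothetical toy data: a small labelled set and a larger unlabelled one.
labelled = [(1.0, "low"), (9.0, "high")]
unlabelled = [2.0, 3.0, 7.0, 8.0]
model = self_train(labelled, unlabelled)
print(model)
print(predict(model, 4.0))
```

The small labelled set gives the first rough model, and the unlabelled points then pull each centroid towards where the data actually sits.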

ML also has a very close relationship to statistics, which can be called a graphical branch of mathematics (from a data representation point of view). It instructs an algorithm to learn for itself by analyzing data. In general, the more data it processes, the smarter the algorithm gets.

Some of the popular Machine Learning Algorithms (MLAlgos)

Linear Regression – Simple linear regression: there is only one independent variable. Multiple linear regression: defines a relationship between a dependent variable and two or more independent variables.
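For readers who like to see the idea in code, the simple (one-variable) case can be sketched with the usual least-squares formulas. This is a minimal illustration in plain Python; the function name and the toy numbers are assumptions made for this example.

```python
def fit_simple_linear_regression(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical toy data lying exactly on y = 2x + 1.
slope, intercept = fit_simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```

Multiple linear regression follows the same principle with more than one independent variable, which is usually solved with matrix methods rather than by hand.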

Logistic Regression – A simple form of regression analysis in which the outcome variable is binary or dichotomous. It helps to estimate prevalence rates adjusted for potential confounders (sociodemographic or clinical characteristics).
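As a rough sketch of how a binary outcome can be modelled, the snippet below fits a one-feature logistic regression with plain gradient descent. The learning rate, the number of epochs, the function names and the toy data are all assumptions chosen for illustration; real projects would normally rely on an established library.

```python
import math


def sigmoid(z):
    """Squash any real number into the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Gradient descent on the average log-loss for one feature."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
    return w, b


def predict(w, b, x):
    """Return 1 when the estimated probability is at least 0.5."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Hypothetical toy data: negative values belong to class 0, positive to class 1.
xs = [-2.0, -1.5, -1.0, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(xs, ys)
print(predict(w, b, 1.2), predict(w, b, -1.2))
```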

Linear Discriminant Analysis – A generalization of Fisher’s linear discriminant, a method used in statistics, pattern recognition and machine learning to find a linear combination of features that characterizes or separates two or more classes of objects or events.
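The “linear combination of features” can be illustrated for the simplest case: two classes and two features. The sketch below computes Fisher’s direction as the inverse of the within-class scatter matrix times the difference of the class means. The helper names and the toy points are assumptions made for this example, and the code only handles 2-D data.

```python
def mean_vector(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter_matrix(points, mean):
    """2x2 scatter matrix: sum of outer products of deviations from the mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - mean[0], p[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(class0, class1):
    """Return (w, threshold): projection weights and the midpoint cut-off."""
    m0, m1 = mean_vector(class0), mean_vector(class1)
    s0, s1 = scatter_matrix(class0, m0), scatter_matrix(class1, m1)
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    mid = [(m0[0] + m1[0]) / 2, (m0[1] + m1[1]) / 2]
    threshold = w[0] * mid[0] + w[1] * mid[1]
    return w, threshold


# Hypothetical toy data: two well-separated groups of points.
class0 = [(1, 1), (2, 1), (1, 2), (2, 2)]
class1 = [(4, 4), (5, 4), (4, 5), (5, 5)]
w, threshold = fisher_direction(class0, class1)
print(w, threshold)
```

A new point is assigned to the second class when its projection `w[0] * x + w[1] * y` is above the threshold, and to the first class otherwise.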

Classification and Regression Trees – Decision trees are an important type of algorithm for predictive modelling in machine learning. They are built by a greedy algorithm based on the divide-and-conquer rule: split the records based on an attribute test that optimizes a certain criterion. The real value is in determining how to split the records.
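Since the real value lies in choosing the split, here is a minimal sketch of how one greedy split could be picked for a single numeric attribute. It uses Gini impurity as the criterion, which is one common choice and an assumption on our part; the function names and toy data are likewise made up for illustration.

```python
def gini(labels):
    """Gini impurity: 0.0 for a pure group, higher for mixed groups."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(values, labels):
    """Try every threshold and keep the one with the lowest weighted impurity."""
    best_threshold, best_score = None, float("inf")
    n = len(values)
    # The largest value is skipped so the right-hand group is never empty.
    for threshold in sorted(set(values))[:-1]:
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Hypothetical toy data: small values are "no", large values are "yes".
values = [1, 2, 3, 10, 11, 12]
labels = ["no", "no", "no", "yes", "yes", "yes"]
print(best_split(values, labels))
```

A full tree repeats this step on each resulting group until the groups are pure or some stopping rule is met.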

Naive Bayes – Naive Bayes classifiers are a family of simple probabilistic classifiers based on applying Bayes’ theorem with strong (naive) independence assumptions between the features.
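The independence assumption means each feature contributes its own probability, and the contributions are simply multiplied (or, in log form, added). Below is a minimal sketch for categorical features with add-one (Laplace) smoothing. The smoothing choice, the function names and the toy weather data are assumptions made for this illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count how often each feature value appears with each label."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    values = defaultdict(set)              # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(i, label)][value] += 1
            values[i].add(value)
    return class_counts, feature_counts, values


def predict_naive_bayes(model, row):
    """Return the label with the highest log-probability score."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # prior
        for i, value in enumerate(row):
            # Add-one smoothing keeps unseen values from zeroing the product.
            numerator = feature_counts[(i, label)][value] + 1
            denominator = count + len(values[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: (weather, temperature) -> where to spend the day.
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
        ("rainy", "cool"), ("sunny", "hot")]
labels = ["out", "out", "in", "in", "out"]
model = train_naive_bayes(rows, labels)
print(predict_naive_bayes(model, ("sunny", "hot")))
print(predict_naive_bayes(model, ("rainy", "cool")))
```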

K-Nearest Neighbors – Often called the laziest algorithm, this is a very simple algorithm that stores all available cases and predicts the numerical target based on a similarity measure. KNN was already in use in statistical estimation and pattern recognition as a non-parametric technique at the beginning of the 1970s.
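The “store everything, compare at prediction time” idea fits in a few lines. The sketch below predicts a numerical target as the average of the k closest stored cases, using squared Euclidean distance as the similarity measure. The distance choice, the function name and the toy data are assumptions for illustration.

```python
def knn_predict(train, query, k=3):
    """Average the targets of the k stored cases closest to the query."""
    def squared_distance(item):
        features, _ = item
        return sum((a - b) ** 2 for a, b in zip(features, query))

    nearest = sorted(train, key=squared_distance)[:k]
    return sum(target for _, target in nearest) / len(nearest)


# Hypothetical toy data: (features, target) pairs with one feature each.
train = [((1.0,), 10.0), ((2.0,), 20.0), ((3.0,), 30.0), ((10.0,), 100.0)]
print(knn_predict(train, (2.1,), k=3))
```

There is no training step at all, which is why the algorithm is called lazy: all the work happens when a prediction is requested.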

Learning Vector Quantization – Its aim is to represent large amounts of data by a few prototype vectors, found by identifying similar data and grouping it in clusters.
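A single training step of the basic variant (usually called LVQ1) can be sketched as follows: find the closest prototype, then pull it towards the sample if the labels agree and push it away if they do not. The function name, the learning rate and the toy prototypes are assumptions made for this example.

```python
def lvq1_step(prototypes, x, label, lr=0.1):
    """Update the prototype closest to x; prototypes are [vector, label] pairs."""
    winner = min(
        prototypes,
        key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)),
    )
    # Move towards the sample when labels match, away from it otherwise.
    sign = 1.0 if winner[1] == label else -1.0
    winner[0] = [w + sign * lr * (xi - w) for w, xi in zip(winner[0], x)]
    return winner


# Hypothetical toy prototypes, one per class.
prototypes = [[[0.0, 0.0], "a"], [[10.0, 10.0], "b"]]
print(lvq1_step(prototypes, [2.0, 2.0], "a", lr=0.5))  # pulled towards the sample
print(lvq1_step(prototypes, [2.0, 2.0], "b", lr=0.5))  # pushed away from it
```

After many such steps over the data, the few prototypes stand in for the whole dataset, and a new point is classified by the label of its nearest prototype.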

Support Vector Machines – A Support Vector Machine (SVM) is a supervised machine learning algorithm that can be employed for both classification and regression purposes. SVMs are more commonly used in classification problems.
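For the classification case, one way to train a linear SVM is to minimise the hinge loss plus a small penalty on the weights using subgradient descent. The sketch below does exactly that and nothing more: it is not how production SVM libraries work internally, and the learning rate, the penalty `lam`, the function names and the toy data are all assumptions for illustration.

```python
def hinge_loss(w, b, x, y):
    """Zero when the point is on the right side with margin at least 1."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(0.0, 1.0 - y * score)


def fit_linear_svm(points, labels, lr=0.1, lam=0.01, epochs=200):
    """Batch subgradient descent on hinge loss with an L2 penalty; labels are +1/-1."""
    dim = len(points[0])
    n = len(points)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]
        grad_b = 0.0
        for x, y in zip(points, labels):
            if hinge_loss(w, b, x, y) > 0.0:  # margin violated
                for i in range(dim):
                    grad_w[i] -= y * x[i] / n
                grad_b -= y / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Classify by the sign of the linear score."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Hypothetical toy data: two groups on opposite sides of the origin.
points = [(2.0, 1.0), (1.0, 2.0), (3.0, 3.0), (-2.0, -1.0), (-1.0, -2.0), (-3.0, -3.0)]
labels = [1, 1, 1, -1, -1, -1]
w, b = fit_linear_svm(points, labels)
print([predict(w, b, x) for x in points])
```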

Bagging and Random Forest – Bagging is a portmanteau of bootstrap and aggregation. In general, it takes a number of bootstrapped samples from the original dataset, fits a model to each sample, and combines all of the model predictions. This is bootstrap aggregation, i.e. bagging.

The fundamental difference between bagging and random forest is that in random forests only a subset of features is selected at random out of the total, and the best split feature from that subset is used to split each node in a tree, unlike in bagging, where all features are considered for splitting a node. Does that mean that bagging is the same as random forest if only one explanatory variable (predictor) is used as input? With a single predictor there is only one feature to sample from, so the two effectively coincide.
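The bootstrap-then-aggregate recipe can be sketched in a few lines. To keep the example short, the base model here is a one-nearest-neighbour regressor rather than a decision tree, which is an assumption made purely for illustration, as are the function names, the seed and the toy data. The random feature selection that turns bagging into a random forest is not shown.

```python
import random


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]


def fit_1nn(sample):
    """Base model: predict the target of the closest stored point."""
    def predict(x):
        nearest = min(sample, key=lambda item: abs(item[0] - x))
        return nearest[1]
    return predict


def bagging_predict(data, x, n_models=25, seed=0):
    """Fit one base model per bootstrap sample and average their predictions."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    models = [fit_1nn(bootstrap_sample(data, rng)) for _ in range(n_models)]
    predictions = [model(x) for model in models]
    return sum(predictions) / len(predictions)


# Hypothetical toy data: (feature, target) pairs.
data = [(1.0, 1.0), (2.0, 2.1), (3.0, 2.9), (4.0, 4.2), (5.0, 5.1)]
print(bagging_predict(data, 3.0))
```

Each base model sees a slightly different version of the data, so averaging them tends to smooth out the quirks of any single model.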

Boosting and AdaBoost – Boosting is a machine learning ensemble meta-algorithm used primarily to reduce bias, and also variance, in supervised learning. It is also a family of machine learning algorithms that convert weak learners into strong ones. Algorithms that achieve hypothesis boosting quickly became known simply as “boosting”.
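To show how weak learners add up to a strong one, here is a compact sketch of AdaBoost with one-threshold decision stumps on a single feature. The stump form, the function names and the toy data are assumptions for illustration; labels are expected to be +1 or -1.

```python
import math


def stump_predict(threshold, polarity, x):
    """A weak learner: one threshold test that outputs +1 or -1."""
    return polarity if x > threshold else -polarity


def fit_adaboost(xs, ys, rounds=3):
    """Return a list of (alpha, threshold, polarity) weighted stumps."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        # Pick the stump with the lowest weighted error.
        best = None
        for threshold in sorted(set(xs)):
            for polarity in (1, -1):
                error = sum(w for w, x, y in zip(weights, xs, ys)
                            if stump_predict(threshold, polarity, x) != y)
                if best is None or error < best[0]:
                    best = (error, threshold, polarity)
        error, threshold, polarity = best
        error = min(max(error, 1e-10), 1 - 1e-10)  # avoid division by zero
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, threshold, polarity))
        # Increase the weight of misclassified points, then renormalise.
        weights = [w * math.exp(-alpha * y * stump_predict(threshold, polarity, x))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def adaboost_predict(ensemble, x):
    """Weighted vote of all stumps."""
    score = sum(alpha * stump_predict(t, p, x) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical toy data that no single threshold can separate.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = fit_adaboost(xs, ys, rounds=3)
print([adaboost_predict(ensemble, x) for x in xs])
```

On this toy data a single stump always gets some points wrong, while the weighted vote of three stumps classifies every training point correctly.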

Q-learning – A model-free reinforcement learning technique. It is able to compare the expected utility of the available actions (for a given state) without requiring a model of the environment.
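The heart of Q-learning is a single update rule applied after every action. The sketch below shows that rule on a small table of state-action values. The state and action names, the learning rate `alpha` and the discount factor `gamma` are assumptions chosen for illustration.

```python
def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Move Q(state, action) towards reward + gamma * best value of next_state."""
    # A next state that is not in the table is treated as terminal (value 0).
    best_next = max(q[next_state].values()) if next_state in q else 0.0
    target = reward + gamma * best_next
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


# Hypothetical two-state table of expected utilities.
q = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": 0.0, "right": 2.0},
}
print(q_update(q, "s0", "right", reward=1.0, next_state="s1"))
print(q_update(q, "s1", "right", reward=10.0, next_state="end"))
```

Notice that the update only needs the observed reward and the next state. No model of how the environment behaves is required, which is what “model-free” means here.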

Learning Process: Machine & Human

The basic learning process for a human child or a new machine algorithm is similar, regardless of whether the learner is made of flesh and bones or wires and metal. It can be divided into four interrelated components:

Data storage

Information abstraction from stored data

Generalization of information abstracted from the data.

Evaluation of each piece of information and making use of the same.
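The four components above can be mapped onto a very small program. This is only a loose sketch: the through-the-origin line used as the “abstraction”, the function names and the toy numbers are assumptions made for illustration.

```python
# 1. Data storage: keep the observations, holding some back for evaluation.
stored = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]
held_out = [(5.0, 10.1)]


def abstract(data):
    """2. Abstraction: summarise the stored data as one slope (y = slope * x)."""
    return sum(x * y for x, y in data) / sum(x * x for x, _ in data)


def generalize(slope, x):
    """3. Generalization: apply the abstraction to an input not seen before."""
    return slope * x


def evaluate(slope, data):
    """4. Evaluation: mean absolute error on data kept aside."""
    return sum(abs(generalize(slope, x) - y) for x, y in data) / len(data)


slope = abstract(stored)
print(slope)
print(generalize(slope, 5.0))
print(evaluate(slope, held_out))
```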

I will talk about each type of machine learning and its respective algorithms in detail in my upcoming “Machine Learning Series” posts, starting next week.

Points to Note:

All credit, if any, remains with the original contributors. We have covered a few machine learning algorithms at a high level in this post. Our last posts on Supervised Machine Learning and Unsupervised Machine Learning got some decent feedback. Our next post will talk about Reinforcement Learning: Markov Decision Processes.

Feedback & Further Question

Do you have any questions about Deep Learning or Machine Learning? Leave a comment or ask your question via email. I will try my best to answer it.

Conclusion – The point to note here is that AI is much more than ML. I think that getting to know the types of MLAlgos helps to see a somewhat clearer picture of AI. The answer to the question “What machine learning algorithm should I use?” is always “It depends.” It depends on the size, quality, and nature of the data. It depends on what you want to do with the answer.

It depends on how the math of the algorithm was translated into instructions for the computer you are using. And it depends on how much time you have.
