5.
Supervised vs. Unsupervised Learning
Unsupervised learning (clustering)
The class labels of the training data are unknown
Given a set of measurements, observations, etc., the aim is
to establish the existence of classes or clusters in
the data
Supervised learning (classification)
Supervision: The training data (observations,
measurements, etc.) are accompanied by labels indicating
the class of the observations
New data is classified based on the training set
Prof. Pier Luca Lanzi

7.
A model extracted from the contact lenses data
If tear production rate = reduced then recommendation = none
If age = young and astigmatic = no
and tear production rate = normal then recommendation = soft
If age = pre-presbyopic and astigmatic = no
and tear production rate = normal then recommendation = soft
If age = presbyopic and spectacle prescription = myope
and astigmatic = no then recommendation = none
If spectacle prescription = hypermetrope and astigmatic = no
and tear production rate = normal then recommendation = soft
If spectacle prescription = myope and astigmatic = yes
and tear production rate = normal then recommendation = hard
If age = young and astigmatic = yes
and tear production rate = normal then recommendation = hard
If age = pre-presbyopic
and spectacle prescription = hypermetrope
and astigmatic = yes then recommendation = none
If age = presbyopic and spectacle prescription = hypermetrope
and astigmatic = yes then recommendation = none
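The rule set above can be read as a small program. The following is a minimal sketch, assuming the rules are tried in the order listed and the first match wins (the slide does not state an evaluation order); the function name `recommend` and its parameter names are illustrative, not part of the original data set.

```python
from typing import Optional


def recommend(age: str, prescription: str, astigmatic: str, tear_rate: str) -> Optional[str]:
    """Apply the nine rules in listed order; first match wins (an assumption)."""
    if tear_rate == "reduced":
        return "none"
    if age == "young" and astigmatic == "no" and tear_rate == "normal":
        return "soft"
    if age == "pre-presbyopic" and astigmatic == "no" and tear_rate == "normal":
        return "soft"
    if age == "presbyopic" and prescription == "myope" and astigmatic == "no":
        return "none"
    if prescription == "hypermetrope" and astigmatic == "no" and tear_rate == "normal":
        return "soft"
    if prescription == "myope" and astigmatic == "yes" and tear_rate == "normal":
        return "hard"
    if age == "young" and astigmatic == "yes" and tear_rate == "normal":
        return "hard"
    if age == "pre-presbyopic" and prescription == "hypermetrope" and astigmatic == "yes":
        return "none"
    if age == "presbyopic" and prescription == "hypermetrope" and astigmatic == "yes":
        return "none"
    return None  # no rule fired


print(recommend("young", "myope", "no", "normal"))       # soft
print(recommend("presbyopic", "myope", "yes", "normal"))  # hard
```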

10.
What is classification?
It is a two-step process
Model construction
Given a set of data representing examples of
a target concept, build a model to “explain” the concept
Model usage
The classification model is used for classifying
future or unknown cases
Estimate accuracy of the model
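The two steps can be sketched in a few lines. This is only an illustration: the "model" here is a hypothetical single-attribute lookup (most common label per attribute value), and the toy data, function names, and attribute name are all made up for the example.

```python
from collections import Counter, defaultdict


def build_model(examples, attribute):
    """Step 1, model construction: most common label for each attribute value."""
    counts = defaultdict(Counter)
    for features, label in examples:
        counts[features[attribute]][label] += 1
    return {value: c.most_common(1)[0][0] for value, c in counts.items()}


def classify(model, features, attribute, default=None):
    """Step 2, model usage: label a future or unknown case."""
    return model.get(features[attribute], default)


def accuracy(model, examples, attribute):
    """Fraction of labelled examples that the model classifies correctly."""
    correct = sum(classify(model, f, attribute) == label for f, label in examples)
    return correct / len(examples)


# Made-up toy data, not the real contact lenses table.
training = [
    ({"tear_rate": "reduced"}, "none"),
    ({"tear_rate": "reduced"}, "none"),
    ({"tear_rate": "normal"}, "soft"),
    ({"tear_rate": "normal"}, "soft"),
    ({"tear_rate": "normal"}, "hard"),
]
held_out = [
    ({"tear_rate": "reduced"}, "none"),
    ({"tear_rate": "normal"}, "hard"),
]

model = build_model(training, "tear_rate")
print(model)                                                  # {'reduced': 'none', 'normal': 'soft'}
print(accuracy(model, held_out, "tear_rate"))                 # 0.5
print(classify(model, {"tear_rate": "normal"}, "tear_rate"))  # soft
```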

15.
Machine Learning perspective on classification
Classification algorithms are methods of supervised learning
In supervised learning
The experience E consists of a set of examples of a target
concept that have been prepared by a supervisor
The task T consists of finding a hypothesis h that
accurately explains the target concept
The performance P depends on how accurately the
hypothesis h explains the examples in E

16.
Machine Learning perspective on classification
Let us define the problem domain as the set of instances X
For instance, X contains different fruits
We define a concept over X as a function c which maps
elements of X into a range D
c:X→ D
The range D represents the type of concept that is going to
be analyzed
For instance, c: X → {apple, not_an_apple}

17.
Machine Learning perspective on classification
Experience E is a set of <x,d> pairs, with x∈X and d∈D.
The task T consists of finding a hypothesis h to explain E:
∀x∈X h(x)=c(x)
The set H of all the possible hypotheses h that can be used to
explain c is called the hypothesis space
The goodness of a hypothesis h can be evaluated as the
percentage of examples that are correctly explained by h
P(h) = |{x | x∈X ∧ h(x)=c(x)}| / |X|
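As a worked instance of these definitions, the sketch below builds a tiny instance space X, a target concept c, the experience E, one candidate hypothesis h, and computes P(h). The fruit descriptions and the particular c and h are invented for illustration; only the formula for P(h) comes from the slide.

```python
# Hypothetical instance space X: fruits described by (colour, shape).
X = [
    ("red", "round"),
    ("green", "round"),
    ("yellow", "long"),
    ("orange", "round"),
]


def c(x):
    """Target concept c: X -> D with D = {"apple", "not_an_apple"} (made-up truth)."""
    colour, shape = x
    return "apple" if shape == "round" and colour in {"red", "green"} else "not_an_apple"


# Experience E: the set of <x, d> pairs with d = c(x).
E = [(x, c(x)) for x in X]


def h(x):
    """A candidate hypothesis: 'anything round is an apple'."""
    return "apple" if x[1] == "round" else "not_an_apple"


def goodness(h, c, X):
    """P(h) = |{x in X : h(x) = c(x)}| / |X|."""
    return sum(h(x) == c(x) for x in X) / len(X)


print(goodness(h, c, X))  # 0.75 (h is wrong only on the orange)
```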

18.
Examples
Concept Learning
when D={0,1}
Supervised classification
when D consists of a finite number of labels
Prediction
when D is a subset of ℝⁿ

19.
Machine Learning perspective on classification
Supervised learning algorithms, given the examples in E,
search the hypothesis space H for the hypothesis h that best
explains the examples in E
Learning is viewed as a search in the hypothesis space
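The simplest possible search is exhaustive: score every hypothesis in H against E and keep the best. The sketch below assumes a deliberately tiny, hypothetical hypothesis space of threshold rules on one numeric attribute; real learners search far larger spaces and rarely enumerate them.

```python
# Made-up experience E: (x, d) pairs with a numeric attribute x and a label d in {0, 1}.
E = [(1.0, 0), (2.0, 0), (3.5, 1), (4.0, 1), (2.5, 0), (5.0, 1)]


def make_hypothesis(t):
    """Hypothesis h_t: predict 1 when x >= t, otherwise 0."""
    return lambda x: 1 if x >= t else 0


def score(h, examples):
    """Fraction of examples in E that h explains correctly."""
    return sum(h(x) == d for x, d in examples) / len(examples)


# A small, hand-picked hypothesis space H indexed by threshold.
H = {t: make_hypothesis(t) for t in [0.0, 1.5, 3.0, 4.5]}

# Learning as search: pick the hypothesis in H that best explains E.
best_t = max(H, key=lambda t: score(H[t], E))
print(best_t, score(H[best_t], E))  # 3.0 1.0
```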

20.
Searching for the hypothesis
The type of hypothesis required influences the search
algorithm
The more complex the representation
the more complex the search algorithm
Many algorithms assume that it is possible to define a partial
ordering over the hypothesis space
The hypothesis space can be searched using either a
general-to-specific or a specific-to-general strategy

21.
Exploring the Hypothesis Space
General to Specific
Start with the most general hypothesis and then go on
through specialization steps
Specific to General
Start with the set of the most specific hypotheses and then
go on through generalization steps
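One well-known specific-to-general procedure is Find-S, a textbook algorithm that the slides do not name; it is used here only as an illustration. The sketch assumes hypotheses are conjunctions of attribute values, with "?" meaning "any value", and the training examples are made up.

```python
def find_s(examples):
    """Specific-to-general search over conjunctions of attribute values.

    None means "no value accepted yet" (most specific); "?" means "any value".
    """
    n = len(examples[0][0])
    h = [None] * n  # start from the most specific hypothesis
    for x, is_positive in examples:
        if not is_positive:
            continue  # Find-S ignores negative examples
        for i, value in enumerate(x):
            if h[i] is None:
                h[i] = value   # first positive example: copy its values
            elif h[i] != value:
                h[i] = "?"     # generalization step: drop the constraint
    return h


def covers(h, x):
    """True when hypothesis h classifies instance x as positive."""
    return all(hv == "?" or hv == xv for hv, xv in zip(h, x))


# Made-up examples: (age, astigmatic, tear_rate) -> "soft lenses recommended?"
examples = [
    (("young", "no", "normal"), True),
    (("pre-presbyopic", "no", "normal"), True),
    (("young", "no", "reduced"), False),
]

h = find_s(examples)
print(h)                                          # ['?', 'no', 'normal']
print(covers(h, ("presbyopic", "no", "normal")))  # True
print(covers(h, ("young", "yes", "normal")))      # False
```

A general-to-specific search would run the other way: start from the all-"?" hypothesis and add constraints whenever a negative example is wrongly covered.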

22.
Inductive Bias
Set of assumptions that together with the training data
deductively justify the classification assigned by the learner
to future instances

23.
Inductive Bias
There can be a number of hypotheses consistent with
training data
Each learning algorithm has an inductive bias that imposes a
preference on the space of all possible hypotheses
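A small enumeration makes the point concrete. The sketch below assumes an unrestricted hypothesis space of all 16 Boolean functions of two bits and a made-up training set that leaves one instance unobserved; the "fewest 1 outputs" preference is a hypothetical bias chosen only to show that some preference is needed to break the tie.

```python
from itertools import product

# The four possible instances over two binary attributes.
inputs = list(product([0, 1], repeat=2))

# Every hypothesis is a full truth table: one output bit per instance (16 in total).
H = [dict(zip(inputs, outputs)) for outputs in product([0, 1], repeat=4)]

# Made-up training data; the instance (1, 1) is never observed.
training = {(0, 0): 0, (0, 1): 1, (1, 0): 1}

# Hypotheses consistent with every training example.
consistent = [h for h in H if all(h[x] == d for x, d in training.items())]
print(len(consistent))                        # 2
print(sorted(h[(1, 1)] for h in consistent))  # [0, 1]: they disagree on the unseen case

# A hypothetical preference bias: favour the hypothesis that outputs 1 least often.
preferred = min(consistent, key=lambda h: sum(h.values()))
print(preferred[(1, 1)])                      # 0
```

Without the preference, the training data alone cannot justify either answer for (1, 1).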

24.
Types of Inductive Bias
Syntactic bias: depends on the language used to represent
hypotheses
Semantic bias: depends on the heuristics used to filter
hypotheses
Preference bias: depends on the ability to rank and compare
hypotheses
Restriction bias: depends on the ability to restrict the search
space

25.
Why look for h?
Inductive Learning Hypothesis: any hypothesis (h) found to
approximate the target function (c) well over a sufficiently large
set of training examples will also approximate the target
function (c) well over other unobserved examples.

26.
Training and testing
Training: the hypothesis h is developed to explain the
examples in Etrain
Testing: the hypothesis h is evaluated (verified) with respect
to the previously unseen examples in Etest
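In practice Etrain and Etest are often obtained by splitting one labelled set E. The following is a minimal sketch under that assumption; the 25% test fraction, the fixed seed, the threshold learner, and the synthetic data are all illustrative choices, not something the slides prescribe.

```python
import random


def split_examples(E, test_fraction=0.25, seed=0):
    """Shuffle a copy of E and cut it into (E_train, E_test)."""
    shuffled = list(E)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def learn_threshold(E_train):
    """Training: pick the threshold t that explains the most training examples."""
    candidates = sorted(x for x, _ in E_train)
    return max(candidates, key=lambda t: sum((x >= t) == bool(d) for x, d in E_train))


def accuracy(t, examples):
    """Fraction of examples explained by the rule 'positive when x >= t'."""
    return sum((x >= t) == bool(d) for x, d in examples) / len(examples)


# Synthetic data: the (hidden) target concept is x >= 6.
E = [(float(x), int(x >= 6)) for x in range(12)]

E_train, E_test = split_examples(E)
t = learn_threshold(E_train)  # h is developed on E_train only
print("train accuracy:", accuracy(t, E_train))
print("test accuracy:", accuracy(t, E_test))  # evaluated on unseen examples
```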

27.
Generalization and Overfitting
The hypothesis h is developed based on a set of training
examples Etrain
The underlying assumption is that if h explains Etrain then it
can also be used to explain other examples in Etest not
previously used to develop h

28.
Generalization and Overfitting
When h explains “well” both Etrain and Etest we say that h is general
and that the method used to develop h has adequately generalized
When h explains Etrain but not Etest we say that the method used to
develop h has overfitted
Overfitting occurs when the hypothesis h explains Etrain so
closely that h is not general enough to be applied outside
Etrain
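An extreme case of overfitting is a learner that simply memorises Etrain. The sketch below contrasts such a memoriser with a simple threshold rule; the data (including one deliberately noisy training label), the default prediction of 0, and the threshold 5 are all made up for illustration.

```python
# Made-up data: the intended concept is x >= 5; (3, 1) is a noisy training label.
E_train = [(1, 0), (2, 0), (3, 1), (6, 1), (7, 1), (8, 1)]
E_test = [(0, 0), (4, 0), (5, 1), (9, 1)]


def memoriser(examples, default=0):
    """Hypothesis that stores E_train verbatim and answers `default` elsewhere."""
    table = dict(examples)
    return lambda x: table.get(x, default)


def threshold_rule(t):
    """Hypothesis that predicts 1 when x >= t, otherwise 0."""
    return lambda x: 1 if x >= t else 0


def accuracy(h, examples):
    """Fraction of examples that h explains correctly."""
    return sum(h(x) == d for x, d in examples) / len(examples)


h_memo = memoriser(E_train)
h_rule = threshold_rule(5)

# The memoriser explains E_train perfectly but fails on half of E_test.
print(accuracy(h_memo, E_train), accuracy(h_memo, E_test))            # 1.0 0.5
# The simpler rule misses the noisy training example but generalizes.
print(round(accuracy(h_rule, E_train), 2), accuracy(h_rule, E_test))  # 0.83 1.0
```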

29.
What are the general issues for classification in Machine Learning?
Type of training experience
Direct or indirect?
Supervised or not?
Type of target function and performance
Type of search algorithm
Type of representation of the solution
Type of Inductive bias

30.
Summary
Classification is a two-step process: building the
classification model, then testing and using it
Major issues for Data Mining include:
The type of input data
The representation used for the model
The generalization performance on unseen data
In Machine Learning, classification is viewed as
an instance of supervised learning
The focus is on the search process aimed at finding the
classifier (the hypothesis) that best explains the data
Major issues for Machine Learning include:
The type of input experience
The search algorithm
The inductive biases