Problem Set 5

Assigned: April 1
Due: April 15

Problem 1

Machine learning algorithms can be very sensitive to how values of
attributes are grouped together. Naive Bayes is one of the most sensitive.

Amy and Barbara are separately researching the question of what very
rich people like to drink. They are both working from the same data set.
An instance in this data set is a person. There are three attributes:

1. The make of the most expensive car the person owns.

2. Whether or not the person owns a yacht.

3. The person's favorite beverage.

Amy divides the beverages into two categories: alcoholic and non-alcoholic.
She then applies the Naive Bayes algorithm to predict the category of
drink for an individual with the predictive values "Owns a BMW" and
"Owns a yacht." The result of her calculation is that, given that a person
owns a BMW and a yacht, the probability is more than 90 per cent that
his/her favorite drink is non-alcoholic.

Barbara divides the beverages into two categories: champagne and everything
else. She then applies Naive Bayes in the same way. The result of her
calculation is that, given that a person
owns a BMW and a yacht, the probability is more than 90 per cent that
his/her favorite drink is champagne.

Describe a data set that supports both of these conclusions.
(The kind of "description" we're looking for would be something like,
"There are 100 instances of people who own a BMW, own a yacht and drink
champagne; 3 instances of people who don't own a BMW, own a yacht, and
drink beer;" etc.)
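
To check a candidate data set, the Naive Bayes calculation that Amy and Barbara perform can be sketched as below. This is a minimal sketch, assuming plain frequency estimates with no smoothing. The function name `naive_bayes_posterior`, the record layout, and the counts in `toy_records` are all hypothetical; the toy counts only illustrate the arithmetic and are not a solution to the problem.

```python
from collections import defaultdict


def naive_bayes_posterior(records, evidence):
    """Return {label: P(label | evidence)} under the Naive Bayes assumption.

    records  -- list of (attributes_dict, label, count) tuples
    evidence -- dict mapping attribute name to the observed value
    Probabilities are raw relative frequencies (no smoothing; an assumption).
    """
    label_count = defaultdict(float)   # label -> number of instances
    joint_count = defaultdict(float)   # (attribute, value, label) -> count
    for attrs, label, n in records:
        label_count[label] += n
        for attr, value in attrs.items():
            joint_count[(attr, value, label)] += n

    total = sum(label_count.values())
    scores = {}
    for label, n_label in label_count.items():
        if n_label == 0:
            scores[label] = 0.0
            continue
        score = n_label / total                      # prior P(label)
        for attr, value in evidence.items():         # times each P(value | label)
            score *= joint_count[(attr, value, label)] / n_label
        scores[label] = score

    norm = sum(scores.values())
    if norm == 0:
        raise ValueError("evidence never occurs with any label")
    return {label: s / norm for label, s in scores.items()}


# Hypothetical toy counts, for illustration only (not an answer to Problem 1).
toy_records = [
    ({"car": "BMW", "yacht": True}, "alcoholic", 20),
    ({"car": "BMW", "yacht": False}, "alcoholic", 10),
    ({"car": "Audi", "yacht": True}, "alcoholic", 10),
    ({"car": "BMW", "yacht": True}, "non-alcoholic", 10),
    ({"car": "Audi", "yacht": False}, "non-alcoholic", 50),
]

posterior = naive_bayes_posterior(toy_records, {"car": "BMW", "yacht": True})
print(posterior)
```

To compare the two researchers' groupings, relabel the same instances (alcoholic vs. non-alcoholic for Amy, champagne vs. everything else for Barbara) and call the function once per labeling.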

Problem 2

A. Let D be a data set with three predictive attributes, P, Q, and R, and one
classification attribute, C. Attributes P, Q, and C are Boolean.
Attribute R has three values:
1, 2, and 3. The data is as follows:

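| P | Q | R | C | Number of instances |
|---|---|---|---|---------------------|
| Y | Y | 1 | N | 20 |
| Y | Y | 2 | Y | 1 |
| Y | Y | 3 | Y | 2 |
| Y | N | 1 | Y | 8 |
| Y | N | 2 | Y | 2 |
| Y | N | 3 | Y | 0 |
| N | Y | 1 | N | 12 |
| N | Y | 2 | Y | 2 |
| N | Y | 3 | Y | 1 |
| N | N | 1 | Y | 25 |
| N | N | 2 | Y | 1 |
| N | N | 3 | Y | 4 |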

Trace the execution of the ID3 algorithm, and show the decision tree
that it outputs.
At each stage, you should
compute the average entropy AVG\_ENTROPY(A,C,T) for each attribute A.
(The book calls this "Remainder(A)" (p. 660).)
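
For checking hand calculations, the average entropy can be computed mechanically. The sketch below assumes the usual definition: AVG\_ENTROPY(A,C,T) is the sum, over each value v of A, of (|T_v| / |T|) times the entropy of C within T_v, where T_v is the subset of T with A = v. The names `ROWS`, `entropy`, and `avg_entropy` are hypothetical helpers, not part of the assignment.

```python
from math import log2

# Rows of the Problem 2 table: (P, Q, R, C, number of instances).
ROWS = [
    ("Y", "Y", 1, "N", 20), ("Y", "Y", 2, "Y", 1), ("Y", "Y", 3, "Y", 2),
    ("Y", "N", 1, "Y", 8),  ("Y", "N", 2, "Y", 2), ("Y", "N", 3, "Y", 0),
    ("N", "Y", 1, "N", 12), ("N", "Y", 2, "Y", 2), ("N", "Y", 3, "Y", 1),
    ("N", "N", 1, "Y", 25), ("N", "N", 2, "Y", 1), ("N", "N", 3, "Y", 4),
]
ATTRIBUTES = {"P": 0, "Q": 1, "R": 2}   # column index of each predictive attribute
CLASS_INDEX = 3                          # column index of C


def entropy(counts):
    """Entropy in bits of a distribution given as a collection of class counts."""
    counts = list(counts)
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((n / total) * log2(n / total) for n in counts if n > 0)


def avg_entropy(attr_index, rows):
    """Weighted average entropy of C after splitting `rows` on one attribute."""
    by_value = {}                        # attribute value -> {class label: count}
    for row in rows:
        value, label, n = row[attr_index], row[CLASS_INDEX], row[-1]
        class_counts = by_value.setdefault(value, {})
        class_counts[label] = class_counts.get(label, 0) + n

    total = sum(row[-1] for row in rows)
    result = 0.0
    for class_counts in by_value.values():
        subtotal = sum(class_counts.values())
        if subtotal:                     # skip values with no instances
            result += subtotal / total * entropy(class_counts.values())
    return result


# Top-level stage: one value per attribute, computed on the full table.
for name, index in ATTRIBUTES.items():
    print(name, round(avg_entropy(index, ROWS), 4))
```

For later stages of the trace, pass only the rows that reach the node in question, for example `[r for r in ROWS if r[1] == "Y"]` for the branch where Q is Y.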