CS188: Artificial Intelligence, Fall 2010
Written 3: Bayes Nets, VPI, and HMMs
Due: Tuesday 11/23 in 283 Soda Drop Box by 11:59pm (no slip days)
Policy: Can be solved in groups (acknowledge collaborators) but must be written up individually.
1 Scouting Information (13 points)
In StarCraft, you choose what types of units to build, and so does your opponent. (Knowledge of the actual
game of StarCraft will not make this problem easier or harder.) As in rock-paper-scissors, some units are strong
against others, so if you scout your opponent, you can build the unit type which counters their choice. Consider
a simple version of the game where you (the Zerg Overmind) are playing against a Terran. Your opponent’s
strategy (S) is either to build Tanks (S = t) or Marines (S = m). You must create an army to counter (C) their
strategy. Mutalisks (C = ) are strong against Tanks, while Banelings (C = b) are strong against Marines. If you
construct the right counter, you will probably win (W = +w), but if you don’t, you’ll likely lose (W = ¬w). Even
before you scout, you know that your opponent is somewhat fond of Tanks.
To make things more interesting, you are actually a professional StarCraft player, and so you aren’t just trying
to win: you’re trying to make money! Therefore, you want to maximize your utility U, which we’ll measure in
dollars. Winning always carries a prize, but if you win using the strategy the crowd wants (Banelings!) then
you’ll receive an even larger reward from endorsements. Using all this information, you’ve created the tables
below.

S    P(S)
t    0.8
m    0.2

W    C    U(W, C)
+w        4000
+w   b    5000
¬w        0
¬w   b    500

W    S    C    P(W | S, C)
+w   t         1
¬w   t         0
+w   t    b    0.5
¬w   t    b    0.5
+w   m         0.25
¬w   m         0.75
+w   m    b    1
¬w   m    b    0
(a) (2 pts) Draw the decision network over the variables U, C, W, and S.
(b) (2 pts) Given the tables above, what are the expected utilities for your possible actions (C = or C = b)?
What is the MEU? Make sure all three numbers are prominently visible in your answer.
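The computation asked for in part (b) can be sketched in Python as follows. This is only an illustrative sketch: the labels "mutalisks" and "banelings" for the two values of C, the boolean encoding of W, and all function names are assumptions made here, not notation from the assignment. The numbers are copied from the tables above.

```python
# Tables from the problem statement. "mutalisks" and "banelings" are
# hypothetical labels for the two values of C.
P_S = {"t": 0.8, "m": 0.2}

# P(W = +w | S, C); P(W = not-w | S, C) is one minus this value.
P_WIN = {
    ("t", "mutalisks"): 1.0,
    ("t", "banelings"): 0.5,
    ("m", "mutalisks"): 0.25,
    ("m", "banelings"): 1.0,
}

# U(W, C), keyed by (won, counter).
UTILITY = {
    (True, "mutalisks"): 4000,
    (True, "banelings"): 5000,
    (False, "mutalisks"): 0,
    (False, "banelings"): 500,
}


def expected_utility(counter, p_s=P_S):
    """EU(C = counter) = sum_s P(s) * sum_w P(w | s, counter) * U(w, counter)."""
    total = 0.0
    for s, ps in p_s.items():
        p_win = P_WIN[(s, counter)]
        total += ps * (p_win * UTILITY[(True, counter)]
                       + (1.0 - p_win) * UTILITY[(False, counter)])
    return total


def max_expected_utility(p_s=P_S):
    """MEU is the largest expected utility over the available actions."""
    return max(expected_utility(c, p_s) for c in ("mutalisks", "banelings"))
```

Passing a different distribution as `p_s` (for example, a posterior over S after an observation) reuses the same functions for conditional expected utilities.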
Now you must decide whether or not to scout your opponent’s base. Scouting will reveal what is at the front (F)
of their base. If your opponent is going for Tanks (S = t), you might see a few Marines (F = r), or you might
see nothing at all (F = φ). However, if their strategy is to build Marines (S = m), then you’ll definitely see
Marines (F = r). You create the following probability distributions:

F    S    P(F | S)
r    t    0.5
φ    t    0.5
r    m    1.0
φ    m    0.0
(c) (1 pt) Augment your existing drawing in part (a) above with the variable F. Be sure to include any arc(s).
Doing some calculations, you compute the expected utilities E[U(W, C = c) | F = f] for each of the possible
values of scouting information F and counter-strategies C. You also compute marginal probabilities P(F = f):

F    P(F)
r    0.6
φ    0.4

F    C    E[U(W, C) | F = f]
r         3000
r    b    3500
φ         4000
φ    b    2750
(d) (2 pts) Compute the VPI of observing F, given no other observations.
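One way to organize the VPI computation in part (d) is sketched below. The value of perfect information is the expected MEU after observing F minus the MEU with no observation. The labels "phi", "mutalisks", and "banelings" and the function name are assumptions made for this sketch; the MEU without the observation is the answer to part (b) and is left as an argument.

```python
# Marginal P(F) and E[U(W, C) | F = f] from the tables in the problem.
# "phi" stands for seeing nothing; the counter labels are hypothetical.
P_F = {"r": 0.6, "phi": 0.4}

EU_GIVEN_F = {
    ("r", "mutalisks"): 3000,
    ("r", "banelings"): 3500,
    ("phi", "mutalisks"): 4000,
    ("phi", "banelings"): 2750,
}


def vpi(p_f, eu_given_f, meu_without_f):
    """VPI(F) = sum_f P(f) * max_c E[U | f, c]  -  MEU with no observation."""
    meu_with_f = 0.0
    for f, pf in p_f.items():
        # After seeing F = f, act optimally for that observation.
        best = max(eu for (f2, _c), eu in eu_given_f.items() if f2 == f)
        meu_with_f += pf * best
    return meu_with_f - meu_without_f
```

A call such as `vpi(P_F, EU_GIVEN_F, meu)` takes the part (b) MEU as `meu`. The result is never negative when the tables are consistent, since acting on more information cannot lower the best achievable expected utility.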
(e) (1 pt) Now suppose that you have already observed that your opponent is using Tanks. What is the value
of perfect information of observing F given S = t, VPI(F | S = t)?
(f) (1 pt) What property of Bayesian networks explains the answer to (e)?
Finally, there’s even more information to consider! If your opponent is going for Tanks (S = t), you will likely
see them harvesting gas early (G = +g), while for Marines (S = m) you likely won’t (G = −g). Therefore, you
extend your model with a probability distribution P(G | S) and assume that G is conditionally independent of
all other variables given S.
(g) (1 pt) You would like to do inference in this full decision network using variable elimination. However,
the U node denotes a function and not a random variable. To formally convert your decision network into a
Bayes net, state the conditional probability distribution P(U | C, W), over values of U, that encodes the same
information as the current utility function U.
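A deterministic function can be encoded as a conditional distribution that puts all of its probability mass on the function's output. The sketch below builds such a table from the utility function. The labels for the values of C and W and the function name are assumptions made for illustration.

```python
# U(W, C) from the problem's utility table, keyed by (w, c).
# "mutalisks" and "banelings" are hypothetical labels for the values of C.
UTILITY = {
    ("+w", "mutalisks"): 4000,
    ("+w", "banelings"): 5000,
    ("-w", "mutalisks"): 0,
    ("-w", "banelings"): 500,
}


def utility_cpt(utility):
    """Return P(U = u | C = c, W = w) as a dict keyed by (u, c, w).

    The entry is 1.0 when u equals U(w, c) and 0.0 otherwise, so each
    (c, w) column is a point mass on the corresponding utility value.
    """
    values = sorted(set(utility.values()))
    return {
        (u, c, w): 1.0 if utility[(w, c)] == u else 0.0
        for (w, c) in utility
        for u in values
    }
```

Every column of the resulting table sums to one, which is what makes it a valid conditional distribution.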
You wish to compute P(U | G = +g, F = r), the distribution over utility values given that you have observed a
Marine in the front of the base and gas being harvested early.
(h) (2 pts) You start by eliminating S. What factors will you need to join and what factor will be created after
performing the join, but before summing out S?
Factors to join:
Resulting factor:
(i) (1 pt) For any variable X, let |X| denote the size of X’s domain (the number of values it can take). How
large is the table for the resulting factor from part (h), again before summing out S?
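The bookkeeping behind parts (h) and (i) can be sketched as below: collect every factor whose scope mentions the variable being eliminated, take the union of their scopes, and multiply the domain sizes of the variables that remain free. The factor names, the dictionary layout, and the convention that observed variables are fixed to their observed values (and so contribute no table dimension) are assumptions of this sketch.

```python
from math import prod

# Domain sizes |X| for the variables that can appear alongside S.
DOMAIN_SIZE = {"S": 2, "W": 2, "C": 2, "F": 2, "G": 2}

# Scopes of the factors in the converted Bayes net. Assumption: no factor
# over C alone is listed; it would not mention S in any case.
FACTOR_SCOPES = {
    "P(S)": {"S"},
    "P(W|S,C)": {"W", "S", "C"},
    "P(F|S)": {"F", "S"},
    "P(G|S)": {"G", "S"},
    "P(U|C,W)": {"U", "C", "W"},
}


def factors_mentioning(variable):
    """Names of the factors that must be joined to eliminate `variable`."""
    return sorted(name for name, scope in FACTOR_SCOPES.items() if variable in scope)


def joined_table_size(variable, evidence):
    """Entries in the joined factor, with observed variables held fixed."""
    scope = set()
    for name in factors_mentioning(variable):
        scope |= FACTOR_SCOPES[name]
    return prod(DOMAIN_SIZE[v] for v in scope - set(evidence))
```

For the query in the text, the call would be `joined_table_size("S", {"F", "G"})`, since F and G are observed.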
2 HMM: Search and Rescue (12 points)
You are an interplanetary search and rescue expert who has just received an urgent message: a rover on Mercury
has fallen and become trapped in Death Ravine, a deep, narrow gorge on the borders of enemy territory. You
zoom over to Mercury to investigate the situation.
Death Ravine is a narrow gorge 6 miles long, as shown below. There are volcanic vents at locations A and D,
indicated by the triangular symbols at those locations.
[Figure: locations A B C D E F in a row, with volcanic vent symbols at A and D]
The rover was heavily damaged in the fall, and as a result, most of its sensors are broken. The only ones still
functioning are its thermometers, which register only two levels: hot and cold. The rover sends back evidence
E = hot when it is at a volcanic vent (A and D), and E = cold otherwise. There is no chance of a mistaken
reading.
The rover fell into the gorge at position A on day 1, so X_1 = A. Let the rover’s position on day t be
X_t ∈ {A, B, C, D, E, F}. The rover is still executing its original programming, trying to move 1 mile east (i.e.
right, towards F) every day. However, because of the damage, it only moves east with probability 0.5, and it
stays in place with probability 0.5. Your job is to figure out where the rover is, so that you can dispatch your
rescue-bot.
(a) (2 pt) Three days have passed since the rover fell into the ravine. The observations were (E_1 = hot, E_2 =
cold, E_3 = cold). What is P(X_3 | hot_1, cold_2, cold_3), the probability distribution over the rover’s position
on day 3, given the observations?
You decide to attempt to rescue the rover on day 4. However, the transmission of E4 seems to have been
corrupted, and so it is not observed.
(b) (2 pt) What is the rover’s position distribution for day 4 given the same evidence, P(X_4 | hot_1, cold_2, cold_3)?
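Parts (a) and (b) are instances of exact filtering with the forward algorithm: alternate a time-elapse step with an observation step, and for a day with no evidence apply the time-elapse step alone. The sketch below is one possible implementation. The function names are illustrative, and the behavior at F (the rover stays put with probability 1 because it cannot move further east) is an assumption, since the problem does not specify it.

```python
POSITIONS = ["A", "B", "C", "D", "E", "F"]
VENTS = {"A", "D"}


def elapse_time(belief):
    """One transition step: move east w.p. 0.5, stay w.p. 0.5.

    Assumption: at F (the east end) the rover stays with probability 1.
    """
    new_belief = {x: 0.0 for x in POSITIONS}
    for i, x in enumerate(POSITIONS):
        if i + 1 < len(POSITIONS):
            new_belief[x] += 0.5 * belief[x]
            new_belief[POSITIONS[i + 1]] += 0.5 * belief[x]
        else:
            new_belief[x] += belief[x]
    return new_belief


def observe(belief, reading):
    """Condition on a noiseless reading ('hot' or 'cold') and renormalize.

    Raises ZeroDivisionError if the reading is impossible under `belief`.
    """
    weighted = {
        x: p if (reading == "hot") == (x in VENTS) else 0.0
        for x, p in belief.items()
    }
    total = sum(weighted.values())
    return {x: p / total for x, p in weighted.items()}


def filter_positions(readings):
    """Belief over X_t after readings E_1..E_t, starting from X_1 = A."""
    belief = {x: 1.0 if x == "A" else 0.0 for x in POSITIONS}
    belief = observe(belief, readings[0])
    for reading in readings[1:]:
        belief = observe(elapse_time(belief), reading)
    return belief
```

With this layout, `filter_positions(["hot", "cold", "cold"])` gives the day 3 belief, and applying `elapse_time` to that result once more gives the day 4 prediction without evidence.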
All this computation is taxing your computers, so the next time this happens you decide to try approximate
inference using particle filtering to track the rover.
(c) (2 pt) If your particles are initially in the top configuration shown below, what is the probability that
they will be in the bottom configuration shown below after one day (after time elapses, but before evidence is
observed)?
[Figure: two particle configurations (top and bottom) over locations A B C D E F, with volcanic vents at A and D]
(d) (2 pt) If your particles are initially in the configuration:
[Figure: particle configuration over locations A B C D E F, with volcanic vents at A and D]
and the next observation is E = hot, what is the probability that ALL particles will be at location D after this
observation has been taken into account? Assume that the number of particles is fixed at all times.
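The observation step of a particle filter can be sketched as follows, assuming that each particle is weighted by P(E | X) and that the new particle set is drawn independently, with replacement, in proportion to those weights. Under the noiseless sensor model a particle's weight is 1 if the reading matches its location and 0 otherwise. The function name is illustrative, and the particle list in the usage note is hypothetical; it does not correspond to the configuration in the figure.

```python
from collections import Counter

VENTS = {"A", "D"}


def prob_all_at(target, particles, reading="hot"):
    """Probability that every resampled particle lands on `target`.

    Assumes independent resampling with replacement and a fixed number
    of particles, equal to len(particles).
    """
    counts = Counter(particles)
    # Noiseless sensor: only locations consistent with the reading keep weight.
    weights = {
        x: n for x, n in counts.items()
        if (reading == "hot") == (x in VENTS)
    }
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no particle is consistent with the reading")
    p_one = weights.get(target, 0) / total
    return p_one ** len(particles)
```

For a hypothetical set of four particles, `prob_all_at("D", ["A", "D", "D", "B"])` evaluates (2/3)^4: the particle at B gets weight zero, and each of the four resampled particles then lands on D with probability 2/3.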
(e) (2 pt) Your co-pilot thinks you should model P(E | X) differently. Even though the sensors are not noisy,
she thinks you should use P(hot | no volcanic vent) = , and P(cold | volcanic vent) = , meaning that a hot
reading at a non-volcanic location has a small probability, and a cold reading at a volcanic location has a small
probability. She performs some simulations with particle filtering, and her version does seem to produce more
accurate results, despite the false assumption of noise. Explain briefly why this could be.
(f) (2 pt) The transition model (east: 0.5 and stay: 0.5) turns out to be an oversimplification. The rover’s
position X_t is actually determined by both its previous position X_{t−1} and also its current velocity, V_t, which
randomly drifts up and down from the previous day’s velocity. Draw the dynamic Bayes net that represents
this refined model. You do not have to specify the domains of V, X, or E, just the structure of the net.
3 Credit Card Fraud Detection (practice, not graded)
You are building a fraud detection system for a credit card company. They have the following records of
purchases for which the fraud status is known; here A is a coarse amount of a purchase, B is the kind of
business purchased from, C is the country (domestic or foreign), and F is whether the transaction is fraudulent.
A          B         C         F
cheap      candy     domestic  legit
cheap      jewelry   foreign   fraud
medium     bike      domestic  legit
expensive  jewelry   domestic  fraud
medium     bike      domestic  legit
cheap      game      domestic  legit
expensive  computer  foreign   fraud
You decide to build a Naive Bayes classifier using these samples.
(a) Using the unsmoothed relative frequency estimates, what is the classifier’s posterior distribution for the
example (cheap, computer, foreign)?
(b) Using relative frequency estimates smoothed with add-one Laplace smoothing, what is the classifier’s
posterior distribution for the same example? Assume that there are no values for any of the random variables that
are not present somewhere above.
(c) For add-k Laplace smoothing, what will the classifier’s posterior distribution approach as k → ∞? (Assume
the prior distribution over classes is also smoothed.)
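The estimates in parts (a) through (c) can be checked with a small Naive Bayes sketch that uses exact fractions. With add-k smoothing, each count gains k and each denominator gains k times the number of possible values; k = 0 gives the unsmoothed relative frequencies. The function name and data layout are assumptions made for this sketch, and each variable's domain is taken to be the set of values that appear in the table, as part (b) instructs.

```python
from fractions import Fraction

# Training records from the table in the problem: (A, B, C, F).
DATA = [
    ("cheap", "candy", "domestic", "legit"),
    ("cheap", "jewelry", "foreign", "fraud"),
    ("medium", "bike", "domestic", "legit"),
    ("expensive", "jewelry", "domestic", "fraud"),
    ("medium", "bike", "domestic", "legit"),
    ("cheap", "game", "domestic", "legit"),
    ("expensive", "computer", "foreign", "fraud"),
]


def posterior(example, k=0):
    """Naive Bayes posterior over F for `example`, with add-k smoothing."""
    classes = sorted({row[-1] for row in DATA})
    # Each feature's domain is the set of values seen anywhere in DATA.
    domains = [{row[i] for row in DATA} for i in range(len(example))]
    joint = {}
    for label in classes:
        rows = [row for row in DATA if row[-1] == label]
        # Smoothed class prior.
        score = Fraction(len(rows) + k, len(DATA) + k * len(classes))
        # Smoothed conditional for each feature, multiplied in.
        for i, value in enumerate(example):
            count = sum(1 for row in rows if row[i] == value)
            score *= Fraction(count + k, len(rows) + k * len(domains[i]))
        joint[label] = score
    total = sum(joint.values())
    return {label: score / total for label, score in joint.items()}
```

Calling `posterior(("cheap", "computer", "foreign"), k)` with k = 0, k = 1, and a very large k shows how the posterior moves as the smoothing strength grows.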
You train your classifier on a larger set of examples and deploy it for the company. You notice that the posteriors
are overly peaked; that is, the classifier tends to overestimate the probability of fraud. You decide the source of
this overconfidence is that the independence assumptions are too strong.
(d) Extrapolating from this sample, what single arc would be the best choice to add to the Naive Bayes network
to reduce this overconfidence? Justify your answer very briefly.
You decide to try out a perceptron classifier on your original training set. You use an indicator feature for each
outcome of each variable (as we did in lecture), and you choose to use the multiclass perceptron (even though
there are only two classes here). You break ties by predicting fraud.
(e) What will the weights be for the classifier after one pass through the data?
Class cheap medium expensive candy jewelry bike game computer domestic foreign
fraud
legit
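The multiclass perceptron update can be sketched as below. It assumes that all weights start at zero, that the examples are processed in the order they appear in the training table, and that there is no bias feature; none of these is stated explicitly in the problem. On a mistake, the active features are added to the true class's weights and subtracted from the predicted class's weights.

```python
FEATURES = ["cheap", "medium", "expensive", "candy", "jewelry", "bike",
            "game", "computer", "domestic", "foreign"]
CLASSES = ["fraud", "legit"]  # "fraud" listed first so that ties predict fraud

# Training records in table order: (A, B, C, F).
DATA = [
    ("cheap", "candy", "domestic", "legit"),
    ("cheap", "jewelry", "foreign", "fraud"),
    ("medium", "bike", "domestic", "legit"),
    ("expensive", "jewelry", "domestic", "fraud"),
    ("medium", "bike", "domestic", "legit"),
    ("cheap", "game", "domestic", "legit"),
    ("expensive", "computer", "foreign", "fraud"),
]


def predict(weights, active):
    """Return the class with the highest score over the active features."""
    scores = {c: sum(weights[c][f] for f in active) for c in CLASSES}
    # max() returns the first maximal item, so ties go to CLASSES[0].
    return max(CLASSES, key=lambda c: scores[c])


def train_one_pass(weights):
    """Run one pass of perceptron updates in place; return the mistake count."""
    mistakes = 0
    for *active, label in DATA:
        guess = predict(weights, active)
        if guess != label:
            mistakes += 1
            for f in active:
                weights[label][f] += 1   # pull the true class up
                weights[guess][f] -= 1   # push the wrong guess down
    return mistakes


# Assumption: weights are initialized to zero.
weights = {c: {f: 0 for f in FEATURES} for c in CLASSES}
first_pass_mistakes = train_one_pass(weights)
```

Calling `train_one_pass(weights)` again and checking whether it returns zero mistakes is one way to test empirically whether training has converged on this data.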
(f) Will the perceptron ever converge on this training data? Briefly justify your answer.