learning - exercises

1. Our prior beliefs about coins

a)

Recall our final coin weight model, “fair-vs-uniform”, in which the coin weight was either 0.5 with high probability or drawn from a uniform distribution otherwise. This implies that a two-faced coin (always heads) is exactly as likely as a coin that lands heads 70% of the time. Intuitively, you might think that a two-faced coin is easier to make, and thus more likely. Adjust the model to express this prior.
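One way to picture such a prior is as a three-way mixture: a fair coin, a two-faced coin, and a uniformly drawn weight. The sketch below is written in Python rather than in the chapter's own modelling language, and the function name `sample_coin_weight` and the mixture probabilities `p_fair` and `p_two_faced` are hypothetical values chosen only for illustration; it is one possible shape for the prior, not the intended solution.

```python
import random

def sample_coin_weight(rng, p_fair=0.9, p_two_faced=0.05):
    """Draw one coin weight from a three-component prior (illustrative values).

    With probability p_fair the coin is fair (weight 0.5), with probability
    p_two_faced it always lands heads (weight 1.0), and otherwise the weight
    is drawn uniformly from [0, 1).
    """
    u = rng.random()
    if u < p_fair:
        return 0.5
    if u < p_fair + p_two_faced:
        return 1.0
    return rng.random()

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sample_coin_weight(rng) for _ in range(20000)]
frac_fair = sum(1 for w in samples if w == 0.5) / len(samples)
frac_two_faced = sum(1 for w in samples if w == 1.0) / len(samples)
print(frac_fair, frac_two_faced)
```

Under this prior a weight of exactly 1.0 carries positive probability mass, whereas any other particular non-fair weight (such as 0.7) is only one point in the continuous uniform component.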

b)

How does your solution behave differently from the fair-vs-uniform model in the chapter? Find a data set for which the two learning curves are qualitatively different.

2. The strength of beliefs

a)

In the chapter, we graphed learning trajectories for a number of models. Below is one of these models (the one with the Beta(10,10) prior). In the chapter, we observed how the model’s best guess about the weight of the coin changed across a sequence of successive heads. See what happens if instead we see heads and tails in alternation:

It looks like we haven’t learned anything! Indeed, since our best estimate for the coin’s weight was 0.5 prior to observing anything, our best estimate is hardly going to change when we get data consistent with that prior.
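The flat curve can be checked by hand, because a Beta prior is conjugate to coin-flip data: after h heads and t tails, a Beta(a, b) prior becomes Beta(a + h, b + t), whose mean is (a + h) / (a + b + h + t). The following is a minimal Python sketch of that calculation, not the chapter's own code; the names `posterior_mean` and `alternating` are illustrative.

```python
from fractions import Fraction

def posterior_mean(a, b, observations):
    """Exact posterior mean of the coin weight under a Beta(a, b) prior.

    Uses the conjugate update: each head adds 1 to a, each tail adds 1 to b.
    """
    heads = sum(1 for o in observations if o == "h")
    tails = len(observations) - heads
    return Fraction(a + heads, a + b + heads + tails)

alternating = ["h", "t"] * 50  # 100 observations: h, t, h, t, ...

# Posterior mean after 0, 1, 2, ..., 100 observations.
means = [posterior_mean(10, 10, alternating[:n]) for n in range(101)]

largest_gap = max(abs(m - Fraction(1, 2)) for m in means)
print(float(means[0]), float(means[-1]), float(largest_gap))
```

The mean returns to exactly 0.5 after every even number of observations, and the largest excursion (after the very first head) is 1/42, so the expected estimate alone cannot reveal whether the distribution itself has changed.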

However, we’ve been looking at the average (or expected) estimate. Edit the code below to see whether our posterior distribution is at all changed by observing this data set. (You only need to compare the prior and the posterior after all 100 observations):

Based on your results, is anything learned? (Note that the bounds on the x-axis are likely to be different in the two graphs, which could obscure this. The viz package doesn’t easily allow you to adjust the bounds on the axes.)

b)

Ideally, we’d like to see how our belief distribution shifts as more data comes in. A particularly good measure would be entropy. Unfortunately, calculating entropy for a Beta distribution is somewhat involved.
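For reference, the differential entropy of a Beta(a, b) distribution has the standard closed form below, where B is the Beta function and ψ is the digamma function. Both special functions are the reason the calculation is awkward to do by hand or without a numerical library.

$$
h\bigl(\mathrm{Beta}(a,b)\bigr) = \ln B(a,b) - (a-1)\,\psi(a) - (b-1)\,\psi(b) + (a+b-2)\,\psi(a+b)
$$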

An alternative we can use is variance: the expected squared difference between a sample from the distribution and the distribution mean. This doesn’t take into account the shape of the distribution, and so won’t give us quite what we want if the distribution is non-symmetric; but it is a reasonable first try.
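For a Beta(a, b) distribution the variance has a simple closed form, ab / ((a + b)^2 (a + b + 1)), which makes it easy to sanity-check any sampled estimate. The Python sketch below applies that formula after each conjugate update; it is an independent cross-check under the assumption of a Beta(10,10) prior and alternating data, not the exercise's own code, and the names `beta_variance` and `variance_trajectory` are illustrative.

```python
def beta_variance(a, b):
    """Closed-form variance of a Beta(a, b) distribution."""
    return (a * b) / ((a + b) ** 2 * (a + b + 1))

def variance_trajectory(a, b, observations):
    """Variance of the belief distribution before any data and after each flip."""
    variances = [beta_variance(a, b)]
    for obs in observations:
        if obs == "h":
            a += 1  # a head increments the first Beta parameter
        else:
            b += 1  # a tail increments the second Beta parameter
        variances.append(beta_variance(a, b))
    return variances

alternating = ["h", "t"] * 50  # 100 observations
variances = variance_trajectory(10, 10, alternating)
print(variances[0], variances[-1])
```

With these assumptions the variance falls from 1/84 (about 0.0119) under the prior to 1/484 (about 0.0021) after 100 observations, even though the mean stays at 0.5.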

Edit the code below to see how variance changes as more data is observed.