Metropolis-Hastings Algorithm from Scratch

Metropolis Algorithm

The Metropolis algorithm is a common acceptance/rejection algorithm for sampling from target distributions and a key tool for Bayesian inference. The algorithm takes draws from a probability distribution creating a sequence where over time the draws approximate the target distribution. At each time step a draw is taken from a proposal distribution. The new draw is either accepted or rejected based on the ratio of the evaluated density at the proposal and the previous time point. The metropolis algorithm requires the proposal distribution to be symmetric, meaning it satisfies the condition . The algorithm is as follows:

Draw a starting point . This can be a user specified value or an approximation based on the data.

For
a) Sample from the proposal distribution. It is common to choose the proposal distribution to be of the same form as the target distribution but usually scaled 20-30%.
b) Calculate the ratio of the target densities evaluated at and ,

Set,

This will be demonstrated on a linear regression example using the metropolis algorithm to estimate the model parameters , and . First we’ll simulate the data to me estimated.

Set up a function to evaluate the likelihood of the model given the data. We will use non-informative prior distributions for the parameters. We now calculate the posterior as the sum of the likelihood and the prior distributions given we have used the log scale.

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

posterior<-function(par){

# take the input parameters of the intercept, slope and std dev

b0<-par[1]

b1<-par[2]

sd<-par[3]

# compute the expected value given the input parameters

y_hat<-b0+b1*x

# compute the log likelihoods

loglikelihoods<-dnorm(y,mean=y_hat,sd=sd,log=T)

# sum the log likelihoods and return

sumll<-sum(loglikelihoods)

# priors - non-formative

b0_prior<-dnorm(b0,sd=5,log=TRUE)

b1_prior<-dnorm(b1,sd=5,log=TRUE)

sd_prior<-dnorm(sd,sd=5,log=TRUE)

# now return the sum of all components

return(sum(sumll,b0_prior,b1_prior,sd_prior))

}

A key part of the Metropolis algorithm is the proposal distribution. From this we draw the proposal value, calculate the ratio $r$ and randomly accept or reject the proposal. Here we set the proposal distribution to be normally distributed with standard deviations for the parameters (0.5, 0.1, 0.25). These are reasonable values chosen from the linear model above.

This is a good demonstration how the algorithm works however, it could use some fine tuning e.g. chosing better proposal distributions. The trace for the intercept and slope exhibit some autocorrelation which could be improved by thinning the sequence. The best thing about R is that there are great packages to help us.

MHadaptive and MCMCpack

The metropolis algorithm can be implemented much more simply using the fantastics packages MHadaptive and MCMCpack. All that is needed is to specify the likelihood function of the model. The likelihood function from above can be passed directly to these functions.

As you can see the acceptance rate is 0.45 for both methods were as the the algorithm we implemented was approximately 0.25. This is likely due to the chosen proposal distribution. These acceptance rates are pretty good. Essentially we don’t want an acceptance rate too high since this will mean our jumping distribution is too narrow and the new proposal is more likely to be accepted which can be sub-optimal simulation of the posterior. Likewise, if it’s too low, our jumping distribution is too wide and the algorithm will essentially get stuck in one place for a long time and not explore the parameter space adequately. Fortunately MHadpative and MCMCpack functions take care of that for you and you can see the difference in this output.

I encourage you to play around with different parameter sets and proposal distributions to see the differences in the posterior sequences.