Assume that we have known data values for both $x_{2i}$ and $x_{3i}$ for all $i = 1, 2, \cdots, n$ in the normal linear regression model $y_i = \beta_1 + \beta_2 x_{2i} + \beta_3 x_{3i} + \epsilon_i$, $\epsilon_i \sim N(0, \sigma^2)$. Defining $\boldsymbol{\beta} = (\beta_1, \beta_2, \beta_3)'$ and assuming a non-informative prior of the form $p(\boldsymbol{\beta}, \sigma) \propto \frac{1}{\sigma}$, one can show that the conditional posterior pdfs $p(\boldsymbol{\beta}|\sigma, \mathbf{y})$ and $p(\sigma|\boldsymbol{\beta},\mathbf{y})$ are normal and inverted gamma, respectively.

The question is: use a Gibbs sampler to estimate the posterior pdf of the parameter function $\displaystyle \psi = \frac{\beta_2 + \beta_3}{\sigma^2}$.

Now I have run a Gibbs sampler (in R) and, after a burn-in period of 100 draws, I have obtained 1000 draws of $(\boldsymbol{\beta}^{(i)}, \sigma^{(i)})$; that is, I have a sample $(\beta_1^{(i)}, \beta_2^{(i)}, \beta_3^{(i)}, \sigma^{(i)})$ for $i = 1, 2, \cdots, 1000$. How can I use these draws to produce an estimate of the posterior pdf of $\psi$? In other words, how can I estimate $p(\psi|\mathbf{y})$?

EDIT: I'm new to MCMC algorithms. I do understand what you mean, but I am still not too sure how to use it in the context of this question. From what I've learnt so far: say we have $\boldsymbol{\theta} = (\theta_1, \theta_2)'$ and $p(\boldsymbol{\theta}|\mathbf{y}) = p(\theta_1, \theta_2|\mathbf{y})$ is the joint posterior; then the marginal posterior of $\theta_1$ is given by $p(\theta_1|\mathbf{y}) = \int p(\theta_1|\theta_2,\mathbf{y})\,p(\theta_2|\mathbf{y})\,d\theta_2$. Now say I have a sample of $M$ draws $(\theta_1^{(i)}, \theta_2^{(i)})$ from $p(\theta_1, \theta_2|\mathbf{y})$; then to estimate $p(\theta_1|\mathbf{y})$, we use the sample mean $\widehat{p(\theta_1|\mathbf{y})} = \frac{1}{M} \sum_{i=1}^M p(\theta_1|\theta_2^{(i)}, \mathbf{y})$. That is, we estimate the marginal density by averaging over the conditional densities. How can I implement that here?

EDIT2: With the help of @Tomas and @BabakP, I have tried to code the problem in R myself (I'm quite new to R, having only learnt the language in the last couple of weeks). Note: just for practice, I purposely went the other route of drawing from each full conditional rather than using @BabakP's more efficient decomposition.
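A minimal sketch of such a conditional-draws sampler is given below; the simulated data, starting value, and variable names are illustrative assumptions, not the original code. (Note that the prior $p(\boldsymbol{\beta},\sigma)\propto 1/\sigma$ on the $\sigma$ scale is equivalent, by the Jacobian of $\sigma\mapsto\sigma^2$, to $p(\boldsymbol{\beta},\sigma^2)\propto 1/\sigma^2$ on the $\sigma^2$ scale, so the code works with $\sigma^2$ directly.)

```r
set.seed(123)

## Simulated data; illustrative stand-ins for the real x_{2i}, x_{3i}, y_i
n  <- 50
x2 <- rnorm(n)
x3 <- rnorm(n)
X  <- cbind(1, x2, x3)                         # n x 3 design matrix
y  <- X %*% c(1, 2, -1) + rnorm(n, sd = 1.5)   # illustrative response

XtX.inv  <- solve(t(X) %*% X)
beta.hat <- XtX.inv %*% t(X) %*% y             # OLS estimate (X'X)^{-1} X'y

M    <- 1100                                   # 100 burn-in + 1000 retained
burn <- 100
draws  <- matrix(NA, M, 4,
                 dimnames = list(NULL, c("beta1", "beta2", "beta3", "sigma2")))
sigma2 <- 1                                    # starting value

for (i in 1:M) {
  ## beta | sigma^2, y ~ N(beta.hat, sigma^2 (X'X)^{-1})
  L    <- t(chol(sigma2 * XtX.inv))            # L %*% t(L) = sigma^2 (X'X)^{-1}
  beta <- as.vector(beta.hat + L %*% rnorm(3))
  ## sigma^2 | beta, y ~ Inverse-Gamma(n/2, SSR(beta)/2)
  ## under the prior p(beta, sigma^2) proportional to 1/sigma^2
  ssr    <- sum((y - X %*% beta)^2)
  sigma2 <- 1 / rgamma(1, shape = n / 2, rate = ssr / 2)
  draws[i, ] <- c(beta, sigma2)
}
draws <- draws[-(1:burn), ]                    # drop burn-in

## Plug-in draws of psi = (beta2 + beta3) / sigma^2 (per @Tomas's comment)
psi <- (draws[, "beta2"] + draws[, "beta3"]) / draws[, "sigma2"]
hist(psi, breaks = 40, freq = FALSE, main = "Estimated posterior of psi")
```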

You can obtain the posterior of any smooth function of the parameters just by direct plug-in of the parameter values obtained by Gibbs or any other MCMC algorithm. For example, if your samples are indexed as $\alpha_{i}$ and $\beta_{i}$, then the posterior sample for $\theta=\alpha^{2}+\beta^{2}$ is $\theta_{i}=\alpha_{i}^{2}+\beta_{i}^{2}$.
– Tomas, Aug 27 '13 at 12:24

@Tomas Thanks Tomas, I edited my original post as a reply to your comment.
– TeTs, Aug 27 '13 at 12:59

The nicety of Gibbs and MCMC in general is that after the sample is obtained, each chain (i.e., the sample for a separate parameter) is a sample from the corresponding marginal distribution. For example, you have two parameters $\theta_{1}$ and $\theta_{2}$. You run your algorithm and obtain a chain/sample of values $(\theta_{1,i},\theta_{2,i})$, which represents a sample from your joint posterior distribution, say $p(\theta_{1},\theta_{2}|y)$. If you take each chain/sample separately, say $\theta_{1,i}$, then you immediately have a sample from the marginal posterior distribution $p(\theta_{1}|y)$.
– Tomas, Aug 27 '13 at 14:59
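In code terms, with the joint draws stored in a matrix `draws` with one column per parameter (as in the sketch above; the `theta` names here just mirror the comment's notation), this amounts to taking columns:

```r
## Each column of the joint draws is already a sample from a marginal posterior
theta1.draws <- draws[, 1]   # sample from p(theta_1 | y)
theta2.draws <- draws[, 2]   # sample from p(theta_2 | y)
```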

Hence, you don't need to perform the marginalization separately, which in fact you were doing incorrectly.
– Tomas, Aug 27 '13 at 15:00


First, obtain a sample of $\psi$ the way I explained earlier; then, if you work in some statistical software like R, you can use a predefined function like hist(psi). Otherwise, just divide the interval [min(psi), max(psi)] into some number of subintervals, count how many values from the sample of $\psi$ fall into each subinterval, and then divide each count by the size of your sample times the subinterval width. That gives you your approximation of the posterior density of $\psi$.
– Tomas, Aug 27 '13 at 15:21
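In R, that by-hand binning might look like the following minimal sketch (assuming `psi` holds the draws from the sampler above):

```r
## By-hand histogram density estimate of p(psi | y);
## hist(psi, freq = FALSE) produces the same thing directly.
n.bins <- 30
breaks <- seq(min(psi), max(psi), length.out = n.bins + 1)
counts <- table(cut(psi, breaks = breaks, include.lowest = TRUE))
dens   <- as.numeric(counts) / (length(psi) * diff(breaks))  # counts -> density
mids   <- breaks[-1] - diff(breaks) / 2                      # bin midpoints
plot(mids, dens, type = "h", xlab = "psi", ylab = "density")
```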

1 Answer

I won't belabor the valid points made by @Tomas above, so I will just answer with the following: the likelihood for a normal linear model is given by

$$f(y|\beta,\sigma,X)\propto\left(\frac{1}{\sigma^2}\right)^{n/2}\exp\left\{-\frac{1}{2\sigma^2}(y-X\beta)'(y-X\beta)\right\}$$ so now using the standard diffuse prior $$p(\beta,\sigma^2)\propto\frac{1}{\sigma^2}$$
we can obtain draws from the posterior distribution $p(\beta,\sigma^2|y,X)$ with a very simple Gibbs sampler. Note that although your steps above for the Gibbs sampler are not wrong, I would argue that a more efficient decomposition is the following:
$$p(\beta,\sigma^2|y,X)=p(\beta|\sigma^2,y,X)p(\sigma^2|y,X)$$
and so now we want to design a Gibbs sampler to sample from $p(\beta|\sigma^2,y,X)$ and $p(\sigma^2|y,X)$.

After some tedious algebra, we obtain the following:

$$\beta|\sigma^2,y,X\sim N(\hat\beta,\sigma^2(X'X)^{-1})$$
and
$$\sigma^2|y,X\sim\text{Inverse-Gamma}\left(\frac{n-k}{2},\frac{(n-k)\hat\sigma^2}{2}\right)$$
where
$$\hat\beta=(X'X)^{-1}X'y$$
and
$$\hat\sigma^2=\frac{(y-X\hat\beta)'(y-X\hat\beta)}{n-k}$$

Now that we have all of that derived, we can obtain samples of $\beta$ and $\sigma^2$ from our Gibbs sampler, and at each iteration we obtain a draw of $\psi$ by plugging the sampled $\beta$ and $\sigma^2$ into $\psi=(\beta_2+\beta_3)/\sigma^2$.
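A minimal sketch of this decomposition in R follows; the simulated data are illustrative stand-ins for the actual $X$ and $y$. Since $\sigma^2$ is drawn from its marginal posterior, each $(\beta, \sigma^2)$ pair is an independent draw from the joint posterior, so no burn-in is needed:

```r
set.seed(42)

## Illustrative data; in practice X and y come from the actual problem
n <- 50
k <- 3                                        # number of regression coefficients
X <- cbind(1, rnorm(n), rnorm(n))
y <- X %*% c(1, 2, -1) + rnorm(n, sd = 1.5)

XtX.inv  <- solve(t(X) %*% X)
beta.hat <- XtX.inv %*% t(X) %*% y            # (X'X)^{-1} X'y
s2.hat   <- sum((y - X %*% beta.hat)^2) / (n - k)

M <- 1000
## sigma^2 | y, X ~ Inverse-Gamma((n - k)/2, (n - k) s2.hat / 2)
sigma2 <- 1 / rgamma(M, shape = (n - k) / 2, rate = (n - k) * s2.hat / 2)

## beta | sigma^2, y, X ~ N(beta.hat, sigma^2 (X'X)^{-1})
L    <- t(chol(XtX.inv))                      # L %*% t(L) = (X'X)^{-1}
beta <- t(sapply(sigma2,
                 function(s2) as.vector(beta.hat + sqrt(s2) * L %*% rnorm(k))))

## Plug-in draws of psi = (beta2 + beta3) / sigma^2
psi <- (beta[, 2] + beta[, 3]) / sigma2
hist(psi, breaks = 40, freq = FALSE, main = "Estimated posterior of psi")
```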

I edited my post above (check EDIT2) and posted my code using draws from each conditional rather than your more efficient decomposition, as I wanted some extra practice with the coding. I'm now wondering how I can estimate the 5% quantile of $p(\psi|\mathbf{y})$ using these draws of psi?
– TeTs, Aug 28 '13 at 11:17
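One way to do this in R, assuming `psi` is the vector of retained draws: the empirical quantile of the sample is a consistent estimate of the corresponding posterior quantile.

```r
## Empirical 5% quantile of the posterior draws of psi
quantile(psi, probs = 0.05)
```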