## Saturday, November 24, 2007

### Gibbs Sampling for Hierarchical Models

Gibbs sampling is an attractive "automatic" method of setting up a MCMC algorithm for many classes of models. Here we illustrate using R to write a Gibbs sampling algorithm for the normal/normal exchangeable model.

We write the model in three stages as follows.

1. The observations y1, ..., yk are independent where yj is N(thetaj, sigmaj^2), where we write N(mu, sigma2) to denote the normal density with mean mu and variance sigma2. We assume the sampling variances sigma1^2, ..., sigmak^2 are known.

2. The means theta1,..., thetak are a random sample from a N(mu, tau2) population. (tau2 is the variance of the population).

3. The hyperparameters (mu, tau) are assumed to have a uniform distribution. This implies that the parameters (mu, tau2) have a prior proportional to (tau2)^(-1/2).

To write a Gibbs sampling algorithm, we write down the joint posterior of all parameters (theta1, ..., thetak, mu, tau2):

$
\frac{1}{\tau} \prod_{j=1}^k \frac{1}{\sqrt{\sigma_j^2}} \exp\{-(y_j-\theta_j)^2/(2\sigma_j^2)\}
\times \prod_{j=1}^k \frac{1}{\sqrt{\tau^2}} \exp\{-(\theta_j - \mu)^2/(2 \tau^2)\}
$

From this expression, one can see

1. The posterior of thetaj conditional on all remaining parameters is normal, where the mean and variance are given by the usual normal density/normal prior updating formula.

2. The hyperparameter mu has a normal posterior with mean theta_bar (the sample mean of the thetaj) and variance tau2/k.

3. The hyperparameter tau2 has an inverse gamma posterior with shape (k-1)/2 and rate 1/2 sum(thetaj - mu)^2.

Given that all the conditional posteriors have convenient functional forms, we write a R function to implement the Gibbs sampling. The only inputs are the data matrix (columns containing yj and sigmaj^2) and the number of iterations m.

I'll display the function normnormexch.gibbs.R with notes.

normnormexch.gibbs=function(data,m)
{
y=data[,1]; k=length(y); sigma2=data[,2] # HERE I READ IN THE DATA

THETA=array(0,c(m,k)); MU=rep(0,m); TAU2=rep(0,m) # SET UP STORAGE

mu=mean(y); tau2=median(sigma2) # INITIAL ESTIMATES OF MU AND TAU2

for (j in 1:m) # HERE'S THE GIBBS SAMPLING
{
p.means=(y/sigma2+mu/tau2)/(1/sigma2+1/tau2) # CONDITIONAL POSTERIORS
p.sds=sqrt(1/(1/sigma2+1/tau2)) # OF THETA1,...THETAK
theta=rnorm(k,mean=p.means,sd=p.sds)

mu=rnorm(1,mean=mean(theta),sd=sqrt(tau2/k)) # CONDITIONAL POSTERIOR OF MU

tau2=rigamma(1,(k-1)/2,sum((theta-mu)^2)/2) # CONDITIONAL POSTERIOR OF TAU2

THETA[j,]=theta; MU[j]=mu; TAU2[j]=tau2 # STORE SIMULATED DRAWS
}
return(list(theta=THETA,mu=MU,tau2=TAU2)) # RETURN A LIST WITH SAMPLES
}

Here is an illustration of this algorithm for the SAT example from Gelman et al.

y=c(28,8,-3,7,-1,1,18,12)
sigma=c(15,10,16,11,9,11,10,18)
data=cbind(y,sigma^2)
fit=normnormexch.gibbs(data,1000)

In the Chapter 7 exercise that used this example, a different sampling algorithm was used to simulate from the joint posterior of (theta, mu, tau2) -- it was a direct sampling algorithm based on the decomposition

[theta, mu, tau2] = [mu, tau2] [theta | mu, tau2]

In a later posting, I'll compare the two sampling algorithms.