In the last posting, I illustrated fitting a t(4) sampling model to some baseball data where I suspected there was an outlier. To compare this model to other sampling models, we can use Bayes factors.
Here is a t sampling model with a convenient choice of prior.
1. y1, ..., yn is a random sample from a t density with location mu, scale sigma, and known degrees of freedom df.
2. mu and log sigma are independent, with mu distributed normal(mu.mean, mu.sd) and log sigma distributed normal(lsigma.mean, lsigma.sd).
I define an R function tsampling that computes the logarithm of the posterior. Note that I am careful to include all of the normalizing constants, since I am primarily interested in computing the marginal density of y.
tsampling=function(theta,datapar)
{
# extract the parameters: theta = (mu, log sigma)
mu=theta[1]; logsigma=theta[2]; sigma=exp(logsigma)
# extract the data and the prior hyperparameters
y=datapar$y; df=datapar$df
mu.mean=datapar$mu.mean; mu.sd=datapar$mu.sd
lsigma.mean=datapar$lsigma.mean; lsigma.sd=datapar$lsigma.sd
# t log-likelihood with location mu and scale sigma
# (the -log(sigma) term rescales the standard t density)
loglike=sum(dt((y-mu)/sigma,df,log=TRUE)-log(sigma))
# independent normal priors on mu and log sigma
logprior=dnorm(mu,mu.mean,mu.sd,log=TRUE)+
dnorm(logsigma,lsigma.mean,lsigma.sd,log=TRUE)
return(loglike+logprior)
}
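As a quick sanity check, the log posterior can be evaluated directly at a trial value of (mu, log sigma). The data vector y.check below is made-up for illustration only; the actual y is the baseball data from the last posting.

# made-up data for illustration only -- the real y comes from the
# baseball data in the previous post
y.check=c(85, 88, 90, 92, 95, 120)
datapar.check=list(y=y.check, df=4, mu.mean=90, mu.sd=20,
lsigma.mean=1, lsigma.sd=1)
tsampling(c(90, 1), datapar.check)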
We use this function together with the function laplace from the LearnBayes package to compute the log marginal density for two models. (The $int component of the laplace output is the Laplace approximation to the log marginal density.)
Model 1 -- t(4) sampling, mu is normal(90, 20), log sigma is normal(1, 1).
Model 2 -- t(30) sampling, mu is normal(90, 20), log sigma is normal(1, 1).
Note that I'm using the same prior for both models -- the only difference is the choice of degrees of freedom in the sampling density.
# log marginal density for the t(4) sampling model
dataprior1=list(y=y, df=4, mu.mean=90, mu.sd=20,
lsigma.mean=1, lsigma.sd=1)
log.m1=laplace(tsampling,c(80, 3), dataprior1)$int
# log marginal density for the t(30) sampling model
dataprior2=list(y=y, df=30, mu.mean=90, mu.sd=20,
lsigma.mean=1, lsigma.sd=1)
log.m2=laplace(tsampling,c(80, 3), dataprior2)$int
# Bayes factor in support of model 1 over model 2
BF.12=exp(log.m1-log.m2)
BF.12
[1] 14.8463
The Bayes factor of about 15 means the data are roughly 15 times more probable under the t(4) model, so there is substantial support for it over the "close to normal" t(30) model.
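One could extend this comparison, as sketched below, by computing the log marginal density over a grid of degrees of freedom to see where the support peaks. The grid values here are arbitrary choices for illustration; the call mirrors the laplace usage above and requires the LearnBayes package.

# sketch: log marginal density over a grid of degrees of freedom
df.grid=c(2, 4, 8, 15, 30)
log.marg=sapply(df.grid, function(df)
laplace(tsampling, c(80, 3),
list(y=y, df=df, mu.mean=90, mu.sd=20,
lsigma.mean=1, lsigma.sd=1))$int)
cbind(df.grid, log.marg)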
Thursday, November 29, 2007