Monday, December 31, 2007

New version of LearnBayes

Based on my experience teaching Bayes this fall, I've added some new functions to LearnBayes. The new version, LearnBayes 1.20, is available from the book website http://bayes.bgsu.edu/bcwr

One simple way of extending conjugate priors is by the use of discrete mixtures. There are three new functions, binomial.beta.mix, poisson.gamma.mix, and normal.normal.mix, that do the posterior computations for binomial, Poisson, and normal problems using mixtures of conjugate priors.
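
For instance, binomial.beta.mix handles a beta mixture prior for a proportion. Here is a minimal sketch of a call, assuming the binomial version follows the same pattern as the Poisson example below (a vector of prior probabilities, a matrix of component parameters, and the data as counts of successes and failures); the numbers are made up for illustration.

probs = c(.5, .5)
betapar = rbind(c(6, 14), c(14, 6))        # two hypothetical beta components
binomial.beta.mix(probs, betapar, c(7, 3)) # assumes data = (successes, failures)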

Here I'll illustrate one of the functions for Poisson sampling. Suppose we are interested in learning about the home run rate for Derek Jeter before the start of the 2004 season. (A home run rate is the proportion of official at-bats that are home runs.) Suppose our prior beliefs are that the median of the rate λ is 0.05 and the 90th percentile is 0.081.
Here are two priors that match this information. The first is a conjugate Gamma prior and the second is a mixture of two conjugate Gamma priors.

Prior 1: Gamma(shape = 6, rate = 113.5)

Prior 2: 0.88 x Gamma(shape = 10, rate = 193.5) + 0.12 x Gamma(shape = 0.2, rate = 0.415)

I check below that these two priors match the prior information.

pgamma(c(.05,.081),shape=6,rate=113.5)
[1] 0.5008145 0.8955648
.88*pgamma(c(.05,.081),shape=10,rate=193.5)+
 .12*pgamma(c(.05,.081),shape=.2,rate=.415)
[1] 0.5007136 0.9012596
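If you're wondering where parameter values like shape = 6 and rate = 113.5 come from, they can be found numerically. The helper below is my own sketch, not part of LearnBayes; it searches for the gamma parameters whose quantiles come closest to the two targets.

match.gamma = function(par)
{
  shape = exp(par[1]); rate = exp(par[2])   # log scale keeps both positive
  q = pgamma(c(.05, .081), shape = shape, rate = rate)
  sum((q - c(.50, .90))^2)                  # squared distance from the targets
}
fit = optim(c(log(5), log(100)), match.gamma)
exp(fit$par)                                # values close to shape = 6, rate = 113.5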

Graphs of the two priors are shown below. They look similar, but the mixture prior has flatter tails.
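Plots like these are easy to draw with curve; this plotting code is my own sketch, not from the package.

curve(dgamma(x, shape = 6, rate = 113.5), from = .001, to = .15,
      xlab = "home run rate", ylab = "prior density", lty = 1)
curve(.88 * dgamma(x, shape = 10, rate = 193.5) +
      .12 * dgamma(x, shape = .2, rate = .415), add = TRUE, lty = 2)
legend("topright", c("conjugate", "mixture"), lty = c(1, 2))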
Now we observe some data -- the number of at-bats and home runs hit by Jeter in the 2004 season. These data are available in the dataset jeter2004 in the LearnBayes package.

library(LearnBayes)
data(jeter2004)

The function poisson.gamma.mix computes the posterior for the mixture prior. The inputs are probs, the vector of prior probabilities of the components; gammapar, a matrix of the gamma parameters of the components (one row per component); and data, a list with components y (the observed counts) and t (the corresponding time intervals).

probs=c(.88,.12)
gammapar=rbind(c(10,193.5),c(.2,.415))
data=list(y=jeter2004$HR,t=jeter2004$AB)

Now we can run the function.

poisson.gamma.mix(probs,gammapar,data)
$probs
[1] 0.98150453 0.01849547

$gammapar
     [,1]    [,2]
[1,] 33.0 836.500
[2,] 23.2 643.415

We see from the output that the posterior for λ is the mixture

0.98 x Gamma(33.0, 836.5) + 0.02 x Gamma(23.2, 643.4)
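You can verify this by hand. Each Gamma(a, b) component updates conjugately to Gamma(a + sum(y), b + sum(t)), and the mixture weights are rescaled in proportion to each component's marginal likelihood. A sketch of the check (the log marginal likelihood below is the standard conjugate expression, up to a factor common to both components):

y = jeter2004$HR; t = jeter2004$AB
cbind(gammapar[, 1] + sum(y), gammapar[, 2] + sum(t))   # matches $gammapar
a = gammapar[, 1]; b = gammapar[, 2]
logm = a * log(b) - lgamma(a) + lgamma(a + sum(y)) -
       (a + sum(y)) * log(b + sum(t))
w = probs * exp(logm - max(logm))
w / sum(w)                                              # matches $probs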

Does the choice of prior make a difference here? The following figure displays the posteriors for the two priors. They look similar, indicating that inference about λ is robust to the choice of prior.
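With the conjugate Prior 1, the posterior is simply Gamma(6 + sum(y), 113.5 + sum(t)), so the comparison figure can be redrawn with the same curve pattern used for the priors. Again, my own plotting sketch:

curve(dgamma(x, shape = 6 + sum(jeter2004$HR), rate = 113.5 + sum(jeter2004$AB)),
      from = .01, to = .08, xlab = "home run rate", ylab = "posterior density", lty = 1)
curve(.98 * dgamma(x, shape = 33, rate = 836.5) +
      .02 * dgamma(x, shape = 23.2, rate = 643.415), add = TRUE, lty = 2)
legend("topright", c("conjugate", "mixture"), lty = c(1, 2))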


But this robustness will depend on the observed data. Suppose that Jeter is a "steroid slugger" during the 2004 season and hits 70 home runs in 500 at-bats.

We run the function poisson.gamma.mix for these data.

poisson.gamma.mix(probs,gammapar,list(y=70,t=500))
$probs
[1] 0.227754 0.772246

$gammapar
     [,1]    [,2]
[1,] 80.0 693.500
[2,] 70.2 500.415

Here the posterior is

0.23 x Gamma(80, 693.5) + 0.77 x Gamma(70.2, 500.4)

I've graphed the posteriors from the two priors below. Here the two posteriors are substantially different, so the inference really does depend on the choice of prior. The reason is visible in the updated weights: the flat Gamma(0.2, 0.415) component puts far more prior mass near the observed rate of 70/500 = 0.14 than the concentrated component does, so it receives most of the posterior weight.
