Wednesday, September 26, 2007

Inferences for Gamma Sampling Problem

In the previous post, I considered the problem of modeling lengths of cell phone calls. Here we focus on several types of inferences and predictions that might be of interest.

Following the general computing strategy described in Chapter 5 of BCWR, I first transform the gamma parameters (alpha, beta) to (theta1 = log alpha, theta2 = log mu= log (alpha beta)). The function gamma.sampling.post computes the posterior of (theta1, theta2). The function mycontour draws a contour plot and the function simcontour simulates from this grid. The figure shows the contour plot with the simulated draws placed on top.

> y=c(12.2,.9,.8,5.3,2,1.2,1.2,1,.3,1.8,3.1,2.8)
> library(LearnBayes)


> gamma.sampling.post=function(theta,data)
+ {

+ a=exp(theta[,1])

+ mu=exp(theta[,2])

+ n=length(data)

+ val=0*a

+ for (i in 1:n) val=val+dgamma(data[i],shape=a,scale=mu/a,log=TRUE)

+ return(val-log(a)+log(a)+log(mu))

+ }


> mycontour(gamma.sampling.post,c(-1.5,1.5,0,3),y)

> title(main="POSTERIOR OF (LOG ALPHA, LOG MU)",xlab="log alpha",

+ ylab="log mu")

> s=simcontour(gamma.sampling.post,c(-1.5,1.5,0,3),y,1000)

> points(s$x,s$y)




Suppose we are interested in the mean length of cell phone calls mu. In particular, what is the probability that the mean length exceeds 4 minutes? The figure displays a density estimate of the simulated draws of mu, and I have labeled the desired probability.

> mu=exp(s$y)
> alpha=exp(s$x)

> beta=mu/alpha
> plot(density(mu),main="POSTERIOR OF MEAN LENGTH",xlab="mu",lwd=3)
> lines(c(4,4),c(0,ss$y[135]),lwd=3)
> text(8,.15,"P(MU > 4) = 0.178")

> arrows(7,.1,4.5,.05,lwd=2)



Next, suppose we are interested in the predictive distribution of the length of a single cell phone call. Since we have already collected simulated draws from the posterior of (alpha, beta), it just takes one additional command to simulate the predictive distribution of y* (using the function rgamma). I have displayed a density estimate of the predictive density.

Note that the probability the mean call length exceeds 4 minutes is 0.178; the probability a future call exceeds 4 minutes is 0.263

> ys=rgamma(1000,shape=alpha,scale=beta)
> plot(density(ys),xlab="CALL LENGTH", lwd=3, main="POSTERIOR PREDICTIVE DENSITY")

> mean(ys>4)

[1] 0.263



Last, suppose you plan on making 20 calls next month and you're interested in the total amount of time used. By use of a loop, we simulate 20 draws from the predictive distribution -- the variable ysum contains 1000 realizations of the total.

> ysum=rep(0,1000)
> for (j in 1:20) ysum=ysum+rgamma(1000,shape=alpha,scale=beta)

> hist(ysum, main="PREDICTIVE DISTRIBUTION OF LENGTH OF 20 CALLS")

No comments: