Archive for December, 2009

h1

PDF annotation in Linux and update about Bayesian guide

December 11, 2009

I have been searching for a program that can annotate PDFs natively in Linux. I stumbled upon this PDF guide. However, I didn’t want to use Wine if I didn’t have to. Fortunately I came across Jarnal. While Jarnal isn’t as fully featured as using a PDF viewer under Wine, it does work. Also there’s a Debian package. So if you’re looking for an alternative to running Foxit or PDF-XChange Viewer via Wine, Jarnal is a viable option.

To use Jarnal, I do the following: File -> Open Background (and find the PDF I’m interested in) then I set Format -> Paper and Background “Lined” to “Plain”. I am not sure how to set this up as a default. But it works well and at least I can annotate.

Also, I plan to update the Bayesian guide during my break between semesters. Mostly, I plan to extend on the information about priors and maybe write up some simple code comparing GLM to MCMC results. The code will most likely be written in R but it might include some BUGS code. Ideally, it would be code in JAGS but I don’t know JAGS yet. We’ll see if I get time to learn it over break. Probably won’t.

h1

Sane priors in a ZIP multilevel model

December 8, 2009

My recent tardiness from this blog shouldn’t be an indication of it’s abandonment. Instead it’s an indication of a hectic semester with a trying analysis.

I have been hard at work with a large data set with a dependent variable that has a Poisson distribution. Unfortunately, lots of subjects have zero bon this measure and have forced me into the arena of zero-inflated Poisson models because of poor fit of the Poisson models. The ZIP model in this case is essential a Poisson model + Binomial model for the zero-inflation component. Also due to the complexity of my analysis and the multiple levels involved, I can’t estimate this model with maximum likelihood but instead am relying on Markov Chain Monte Carlo and specifically the MCMCglmm package in R. The author of the package, Jarrod Hadfield, has been incredibly helpful and extremely patient with my questions. Unfortunately, my problem has even stumped him a little bit (the overall extremely poor fit of the predicted values from the Poisson model). Right now I am trying to figure out the most sane priors to expedite mixing and converging. At the moment MCMCglmm v. 1.13 only allows Inverse Wishart priors for the variance components and Normal priors for the means (which I’ve not specified but instead am using the default). However, Jarrod has informed that there will be a new version with a massive overhaul with more options for different priors. I’ve decided to use weak but informative Inverse Wishart priors while fixing the residual variance of the zero-inflation component because it can’t be estimated.

prior=list(R=list(V=diag(c(1,1e-10)), nu=0.001, fix=2),
G=list(G1=list(V=diag(2)*0.021e-10, nu=0.001)))

To run the ZIP model I have sent MCMCglmm the following code:
system.time(q1.zip <- MCMCglmm(y ~ trait + trait:time-1 + trait:I(time)^2,saveX=TRUE,saveZ=TRUE,random=~us(trait):id.f, nitt=200,burnin=50,rcov=~idh(trait):units,
data=dat1,verbose=TRUE,singular.ok=TRUE,pr=TRUE,
pl=TRUE,prior=prior,DIC=TRUE,family="zipoisson"))

However, MCMCglmm doesn’t like these prior as they are too weak:
Error in MCMCglmm(y ~ trait + trait:time - 1 + trait:I(time)^2, saveX = TRUE, : Mixed model equations singular: use a (stronger) prior Timing stopped at: 1485.42 44.79 1551.59

Back to the drawing board … I will update this when/if I can get this model to converge!

PS – This blog has subscribers! Thanks and I’ll try to update more often to make subscriptions actual worth it.

PSS – I will start working on a MS in Statistics in Fall 2010 and hope to finish it in Spring 2012 and then my PhD in Quantitative Methods in Education in Fall 2013. I think my MS will help provide me with a more thorough understanding of the statistical models I run as well as giving me a deeper understanding of Bayesian methods and MCMC. My plans are to continue to use this blog as a way to deliver Bayesian in a non-technical way to the applied researcher in the social sciences and maybe Ecology if someone can send some data!