ZIP model updates and longitudinal talk

January 14, 2010

Since my last post, I’ve finally managed to make my ZIP models (a mixture model of Binomial + Poisson distributions) converge in MCMCglmm. The trick was that the model needed a longer burn-in and a larger number of iterations to converge. After 10,000 burn-ins and 60,000 iterations my model converged. I am in debt to Jarrod Hadfield for his time and patience helping me through this problem.

The reason I initially had to move to a ZIP model was that 86% of the student data in my data set had zero suspensions. So there was a strong need to account for this in a model. The Poisson model could not account for and did a horrible job predicting. However, the ZIP model did a much better job accounting for this. Below is the syntax that I used to solve my problem.

# Set priors for additional terms in G-structure
prior1=list(R=list(V=diag(2),nu=1, fix=2),
G=list(G1=list(V=1, nu=1, alpha.mu=0, alpha.V=25^2),
G2=list(V=1, nu=1, alpha.mu=0, alpha.V=25^2),
G3=list(V=1, nu=1, alpha.mu=0, alpha.V=25^2)))

q2.1000 <- MCMCglmm(sus~trait-1 + at.level(trait,1):grade + at.level(trait,1):I(grade^2) + at.level(trait,1):male.f + at.level(trait,1):ethnic.f + at.level(trait,1):sped.f +at.level(trait,1):risk.f, random=~us(at.level(trait,1)):id.f + us(at.level(trait,1)):schn.f + (at.level(trait,1)):schn.f:id.f, data=ransamp1000, rcov=~idh(trait):units, family="zipoisson", prior=prior1,nitt=60000, thin=50, burnin=10000)

The R structure in the prior corresponds to the random error variance-covariance matrix which is not estimated in a Binomial model so it is fixed. The G structure corresponds to the random effects variance-covariance matrix, in my case I have three elements and need three priors. I need a prior to random intercepts for the students (~us(at.level(trait,1)):id.f), schools (~us(at.level(trait,1)):schn.f), and students nested within schools (~us(at.level(trait,1)):schn.f:id.f). The priors on the G-structure are weak but informative Cauchy priors and I encourage you to read Jarrod’s extensive CourseNotes included with his package and see Gelman 2006 “Prior distributions for variance parameters in hierarchical models” for more info. about the priors.

My estimates, both ‘fixed’ and ‘random’, appear to be robust to the priors and using the Cauchy ones with parameter expansion just speeds up convergence. (Takes about 26.5 hours with 175,975 data points).

Finally, I am posting a recent talk I gave on longitudinal models here and the TeX file here . It was created in Beamer and should compile except for the references to the graphs that were not included. So comment these out.


One comment

  1. Does your implicit nesting structure in the random statement mean that the package doesn’t recognize nesting from the data structure? (supposedly lme4 does) I’m currently hunting around for a solution to a crossed – nesting combination situation.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: