Since my last post, I’ve finally managed to make my ZIP models (a mixture model of Binomial + Poisson distributions) converge in MCMCglmm. The trick was that the model needed a longer burn-in and a larger number of iterations to converge. After 10,000 burn-ins and 60,000 iterations my model converged. I am in debt to Jarrod Hadfield for his time and patience helping me through this problem.
The reason I initially had to move to a ZIP model was that 86% of the student data in my data set had zero suspensions. So there was a strong need to account for this in a model. The Poisson model could not account for and did a horrible job predicting. However, the ZIP model did a much better job accounting for this. Below is the syntax that I used to solve my problem.
# Set priors for additional terms in G-structure
prior1=list(R=list(V=diag(2),nu=1, fix=2),
G=list(G1=list(V=1, nu=1, alpha.mu=0, alpha.V=25^2),
G2=list(V=1, nu=1, alpha.mu=0, alpha.V=25^2),
G3=list(V=1, nu=1, alpha.mu=0, alpha.V=25^2)))
q2.1000 <- MCMCglmm(sus~trait-1 + at.level(trait,1):grade + at.level(trait,1):I(grade^2) + at.level(trait,1):male.f + at.level(trait,1):ethnic.f + at.level(trait,1):sped.f +at.level(trait,1):risk.f, random=~us(at.level(trait,1)):id.f + us(at.level(trait,1)):schn.f + (at.level(trait,1)):schn.f:id.f, data=ransamp1000, rcov=~idh(trait):units, family="zipoisson", prior=prior1,nitt=60000, thin=50, burnin=10000)
The R structure in the prior corresponds to the random error variance-covariance matrix which is not estimated in a Binomial model so it is fixed. The G structure corresponds to the random effects variance-covariance matrix, in my case I have three elements and need three priors. I need a prior to random intercepts for the students (~us(at.level(trait,1)):id.f), schools (~us(at.level(trait,1)):schn.f), and students nested within schools (~us(at.level(trait,1)):schn.f:id.f). The priors on the G-structure are weak but informative Cauchy priors and I encourage you to read Jarrod’s extensive CourseNotes included with his package and see Gelman 2006 “Prior distributions for variance parameters in hierarchical models” for more info. about the priors.
My estimates, both ‘fixed’ and ‘random’, appear to be robust to the priors and using the Cauchy ones with parameter expansion just speeds up convergence. (Takes about 26.5 hours with 175,975 data points).
Finally, I am posting a recent talk I gave on longitudinal models here and the TeX file here . It was created in Beamer and should compile except for the references to the graphs that were not included. So comment these out.