h1

Dropbox package for Debian Testing (Squeeze) for amd64

February 4, 2010

I am sharing a Dropbox package that I built from a Ubuntu source (from the Dropbox website) on a Debian testing machine. So it is binary compatible with Debian testing. Click here to download it. If there is a problem with it please let me know.

NOTE: This is not the same as installing the Ubuntu binary of nautilus-dropbox. My package is a complete rebuild and therefore is 100% compatible with Debian as it was built on Debian using Debian build dependencies.

h1

Icedove 3 amd64 packages for Debian Unstable (and Testing)

January 30, 2010

I grew impatient waiting for Icedove 3 (aka Thunderbird 3) to hit Debian Unstable, so I rolled my own packages for amd64 (only architectures available in Experimental were for i386 and armel). I’m uploading them in case they are of interest to anyone. They were built using build-deps from sid. I haven’t done any real testing with them but they *seem* to work fine here. Click here for the amd64 packages.

Note: They likely work on Testing but I haven’t tried and can’t. Feel free to leave a comment if they do so that others know.

EDIT (Feb 3, 2010): This package works fine on Debian Testing. So feel free to install it on Testing all the dependencies are in place. Also note this is just a rebuild of the source from Experimental so I haven’t done anything fancy other than just


apt-get source icedove
sudo apt-get build-dep icedove
dch -l local # To add the 'local' part to the package name #
debuild -us -uc
sudo dpkg -i ../icedove_3.0.1-1local1_amd64.deb

i.e. I did a complete rebuild not just a checkinstall.

h1

ZIP model updates and longitudinal talk

January 14, 2010

Since my last post, I’ve finally managed to make my ZIP models (a mixture model of Binomial + Poisson distributions) converge in MCMCglmm. The trick was that the model needed a longer burn-in and a larger number of iterations to converge. After 10,000 burn-ins and 60,000 iterations my model converged. I am in debt to Jarrod Hadfield for his time and patience helping me through this problem.

The reason I initially had to move to a ZIP model was that 86% of the student data in my data set had zero suspensions. So there was a strong need to account for this in a model. The Poisson model could not account for and did a horrible job predicting. However, the ZIP model did a much better job accounting for this. Below is the syntax that I used to solve my problem.


# Set priors for additional terms in G-structure
prior1=list(R=list(V=diag(2),nu=1, fix=2),
G=list(G1=list(V=1, nu=1, alpha.mu=0, alpha.V=25^2),
G2=list(V=1, nu=1, alpha.mu=0, alpha.V=25^2),
G3=list(V=1, nu=1, alpha.mu=0, alpha.V=25^2)))


q2.1000 <- MCMCglmm(sus~trait-1 + at.level(trait,1):grade + at.level(trait,1):I(grade^2) + at.level(trait,1):male.f + at.level(trait,1):ethnic.f + at.level(trait,1):sped.f +at.level(trait,1):risk.f, random=~us(at.level(trait,1)):id.f + us(at.level(trait,1)):schn.f + (at.level(trait,1)):schn.f:id.f, data=ransamp1000, rcov=~idh(trait):units, family="zipoisson", prior=prior1,nitt=60000, thin=50, burnin=10000)

The R structure in the prior corresponds to the random error variance-covariance matrix which is not estimated in a Binomial model so it is fixed. The G structure corresponds to the random effects variance-covariance matrix, in my case I have three elements and need three priors. I need a prior to random intercepts for the students (~us(at.level(trait,1)):id.f), schools (~us(at.level(trait,1)):schn.f), and students nested within schools (~us(at.level(trait,1)):schn.f:id.f). The priors on the G-structure are weak but informative Cauchy priors and I encourage you to read Jarrod’s extensive CourseNotes included with his package and see Gelman 2006 “Prior distributions for variance parameters in hierarchical models” for more info. about the priors.

My estimates, both ‘fixed’ and ‘random’, appear to be robust to the priors and using the Cauchy ones with parameter expansion just speeds up convergence. (Takes about 26.5 hours with 175,975 data points).

Finally, I am posting a recent talk I gave on longitudinal models here and the TeX file here . It was created in Beamer and should compile except for the references to the graphs that were not included. So comment these out.

h1

PDF annotation in Linux and update about Bayesian guide

December 11, 2009

I have been searching for a program that can annotate PDFs natively in Linux. I stumbled upon this PDF guide. However, I didn’t want to use Wine if I didn’t have to. Fortunately I came across Jarnal. While Jarnal isn’t as fully featured as using a PDF viewer under Wine, it does work. Also there’s a Debian package. So if you’re looking for an alternative to running Foxit or PDF-XChange Viewer via Wine, Jarnal is a viable option.

To use Jarnal, I do the following: File -> Open Background (and find the PDF I’m interested in) then I set Format -> Paper and Background “Lined” to “Plain”. I am not sure how to set this up as a default. But it works well and at least I can annotate.

Also, I plan to update the Bayesian guide during my break between semesters. Mostly, I plan to extend on the information about priors and maybe write up some simple code comparing GLM to MCMC results. The code will most likely be written in R but it might include some BUGS code. Ideally, it would be code in JAGS but I don’t know JAGS yet. We’ll see if I get time to learn it over break. Probably won’t.

h1

Sane priors in a ZIP multilevel model

December 8, 2009

My recent tardiness from this blog shouldn’t be an indication of it’s abandonment. Instead it’s an indication of a hectic semester with a trying analysis.

I have been hard at work with a large data set with a dependent variable that has a Poisson distribution. Unfortunately, lots of subjects have zero bon this measure and have forced me into the arena of zero-inflated Poisson models because of poor fit of the Poisson models. The ZIP model in this case is essential a Poisson model + Binomial model for the zero-inflation component. Also due to the complexity of my analysis and the multiple levels involved, I can’t estimate this model with maximum likelihood but instead am relying on Markov Chain Monte Carlo and specifically the MCMCglmm package in R. The author of the package, Jarrod Hadfield, has been incredibly helpful and extremely patient with my questions. Unfortunately, my problem has even stumped him a little bit (the overall extremely poor fit of the predicted values from the Poisson model). Right now I am trying to figure out the most sane priors to expedite mixing and converging. At the moment MCMCglmm v. 1.13 only allows Inverse Wishart priors for the variance components and Normal priors for the means (which I’ve not specified but instead am using the default). However, Jarrod has informed that there will be a new version with a massive overhaul with more options for different priors. I’ve decided to use weak but informative Inverse Wishart priors while fixing the residual variance of the zero-inflation component because it can’t be estimated.

prior=list(R=list(V=diag(c(1,1e-10)), nu=0.001, fix=2),
G=list(G1=list(V=diag(2)*0.021e-10, nu=0.001)))

To run the ZIP model I have sent MCMCglmm the following code:
system.time(q1.zip <- MCMCglmm(y ~ trait + trait:time-1 + trait:I(time)^2,saveX=TRUE,saveZ=TRUE,random=~us(trait):id.f, nitt=200,burnin=50,rcov=~idh(trait):units,
data=dat1,verbose=TRUE,singular.ok=TRUE,pr=TRUE,
pl=TRUE,prior=prior,DIC=TRUE,family="zipoisson"))

However, MCMCglmm doesn’t like these prior as they are too weak:
Error in MCMCglmm(y ~ trait + trait:time - 1 + trait:I(time)^2, saveX = TRUE, : Mixed model equations singular: use a (stronger) prior Timing stopped at: 1485.42 44.79 1551.59

Back to the drawing board … I will update this when/if I can get this model to converge!

PS – This blog has subscribers! Thanks and I’ll try to update more often to make subscriptions actual worth it.

PSS – I will start working on a MS in Statistics in Fall 2010 and hope to finish it in Spring 2012 and then my PhD in Quantitative Methods in Education in Fall 2013. I think my MS will help provide me with a more thorough understanding of the statistical models I run as well as giving me a deeper understanding of Bayesian methods and MCMC. My plans are to continue to use this blog as a way to deliver Bayesian in a non-technical way to the applied researcher in the social sciences and maybe Ecology if someone can send some data!

h1

How-to: VIM, GNU R, and LaTeX

October 24, 2009

Recently I switched from Emacs to VIM. Mostly because I’ve felt that the default keybindings are more sane. The first thing I needed to do was to find a plugin for VIM that worked for R. Well here is that plugin. The follow the directions there for installation. (If you need help leave me a comment). VIM will automatically detect that it is in R mode when opening a *.R file.

Two quick tips to get you started:

# To open R
\rf
# To send code to R
\d

Also, if you need a LaTeX plugin for VIM here it is. It also has very sane keybindings. I recommend not using the Debian package as it didn’t work for me. To get LaTeX to compile in PDF mode by default I added the following to ~/.vimrc:


" Set pdf as default target
let g:Tex_DefaultTargetFormat='pdf'

That’s it. And in complete disclosure, I am really using the GTK-front end of VIM known as GVIM.

h1

SEM in WinBUGS

October 24, 2009

For my MITER research, I have been working with a faculty member in the University of Minnesota’s Institute of Child Development. I have been involved on a couple of projects with this group. One project involved longitudinal modeling of work competency (from childhood to adulthood) using structural equation modeling (SEM). This project examined how childhood competence in externalizing behaviors/conduct, academics, and socializing behaviors predict work as an adult. If you’re interested in this work, it has just been accepted and should be in press soon. (Feel free to email me for a copy of this manuscript and I’ll see if I can get a copy).

To do the SEM, we initially tried the sem package in R. This proved to be to basic for us as it was unable to do a multi-group analysis. So we settled on MPlus. Even in MPlus, we were unable to run the models we really wanted to because the number of integration points were too many and MPlus would just error out. We dealt with it the best we could and I think our manuscript is strong and transparent about our computational problems.

After this paper was submitted, I become more interested in examining what sort of alternatives exist. Is there a Bayesian book on SEM? Well apparently there is. In the Bayesian framework, latent constructs are just dealt with as missing data problems, making them more computationally solvable. Additionally, using Bayesian you work directly from your observed data and not from a covariance matrix. Because of this you can create more highly complex, developmentally appropriate models that are infeasible in traditional software. Even Bollen is writing about this.

To help you get started. I am going to paste the code that I have used to replicate our models in WinBUGS. Unfortunately I can not share the data. But if you have specific data you can share then I’d be happy to modify the code as appropriate.


model {
for(i in 1:N){
##########################
# measurement equation model
##########################
for(j in 1:P){
y[i,j]~dnorm(mu[i,j],psi[j])
}
##########################
# alp[i] corresponds to the intercepts
##########################
mu[i,1]<-xi[i,1]+alp[1] ## Ext1
mu[i,2]<-lam[1]*xi[i,1]+alp[2]
mu[i,3]<-lam[2]*xi[i,1]+alp[3]
mu[i,4]<-xi[i,2]+alp[4] ## Soc1
mu[i,5]<-lam[3]*xi[i,2]+alp[5]
mu[i,6]<-lam[4]*xi[i,2]+alp[6]
mu[i,7]<-xi[i,3]+alp[7] ## Aca1
mu[i,8]<-lam[5]*xi[i,3]+alp[8]
mu[i,9]<-lam[6]*xi[i,3]+alp[9]
mu[i,10]<-xi[i,4]+alp[10] ## Ext2
mu[i,11]<-lam[7]*xi[i,4]+alp[11]
mu[i,12]<-lam[8]*xi[i,4]+alp[12]
mu[i,13]<-xi[i,5]+alp[13] ## Soc2
mu[i,14]<-lam[9]*xi[i,5]+alp[14]
mu[i,15]<-lam[10]*xi[i,5]+alp[15]
mu[i,16]<-xi[i,6]+alp[16] ## Aca2
mu[i,17]<-lam[11]*xi[i,6]+alp[17]
mu[i,18]<-lam[12]*xi[i,6]+alp[18]
mu[i,19]<-xi[i,7]+alp[19] ## Work2
mu[i,20]<-lam[13]*xi[i,7]+alp[20]
mu[i,21]<-lam[14]*xi[i,7]+alp[21]
##########################
# structural equation model
##########################
xi[i,1:3]~dmnorm(u[1:3],phi[1:3,1:3]) ## factor loadings are multivariate normal
xi[i,4]~dnorm(ext2[i],psd)
xi[i,5]~dnorm(soc2[i],psd)
xi[i,6]~dnorm(aca2[i],psd)
xi[i,7]~dnorm(work2[i],psd)
##########################
# continuity Paths
##########################
ext2[i]<-gam[1]*xi[i,1]
soc2[i]<-gam[2]*xi[i,2]
aca2[i]<-gam[3]*xi[i,3]
work2[i]<-gam[4]*xi[i,1] + gam[5]*xi[i,1] + gam[6]*xi[i,1]
} #end of i
##########################
# priors on intercepts
##########################
for(j in 1:21){alp[j]~dnorm(0.0, 1.0)}
##########################
# priors on loadings and coefficients
##########################
lam[1]~dnorm(0.8,psi[1])
lam[2]~dnorm(0.8,psi[2])
lam[3]~dnorm(0.8,psi[3])
lam[4]~dnorm(0.8,psi[4])
lam[5]~dnorm(0.8,psi[5])
lam[6]~dnorm(0.8,psi[6])
lam[7]~dnorm(0.8,psi[7])
lam[8]~dnorm(0.8,psi[8])
lam[9]~dnorm(0.8,psi[9])
lam[10]~dnorm(0.8,psi[10])
lam[11]~dnorm(0.8,psi[11])
lam[12]~dnorm(0.8,psi[12])
lam[13]~dnorm(0.8,psi[13])
lam[14]~dnorm(0.8,psi[14])
for(j in 1:6){gam[j]~dnorm(0.5, psd)}
##########################
#priors on precisions
##########################
for(j in 1:P){
psi[j]~dgamma(9.0, 4.0)
sgm[j]<-1/psi[j]
}
psd~dgamma(9.0, 4.0)
sgd<-1/psd
phi[1:3,1:3]~dwish(R[1:3,1:3], 5)
phx[1:3,1:3]<-inverse(phi[1:3,1:3])
} #end of model

I’d be happy to answer any questions about the code. Note this is just the model, not the starting values. I usually let WinBUGS generate them for me. Bad practice I know, but as long as my chains mix I am not worried.

h1

WinBUGS and OpenBUGS on Debian GNU/Linux Lenny

October 24, 2009

A quick how-to for installing WinBUGS and OpenBUGS on Debian GNU/Linux Lenny. I am sure this will work on other distributions. The only issue might be that certain distributions ship different versions of wine . I am use wine 1.0.1, with other versions of wine YMMV.

First, install wine

su -
aptitude install wine

Second, go here and download the WinBUGS and OpenBUGS binaries. It is important to note that OpenBUGS is the development version of WinBUGS and it’s open-source.

I also encourage you to check out JAGS. I have created a 64-bit package of JAGS that is available here for Debian. If you’re a R user and do a lot of multilevel modeling, I recommend MCMCglmm .

Third, assuming you’ve downloaded these files to ~/Downloads, for WinBUGS do the following:

unzip ~/Downloads/winbugs14.zip

And for OpenBUGS do the following:


mkdir -p ~/Downloads/openbugs
mv ~/Downloads/OpenBUGS.zip ~/Downloads/openbugs
unzip ~/Downloads/openbugs/OpenBUGS.zip

Now everything is reading to go. To launch WinBUGS:

wine ~/Downloads/WinBUGS14/WinBUGS14.exe

To launch OpenBUGS:

wine ~/Downloads/openbugs/winbugs.exe

Note the first line of code is for WinBUGS and the second line is for OpenBUGS.

Don’t forget to add the immortality key and the patch for WinBUGS.

h1

Bayesian Guide for Education

October 24, 2009

I have been working on a gentle, very applied introduction to Bayesian aimed at educational researchers. I’ve uploaded the guide. Feel free to download it and if you do please provide feedback.

Click here for the Bayesian guide

h1

R just got a little more love from Debian

July 30, 2009
Announcing cran2deb: 1700+ Debian packages from almost all of CRAN
------------------------------------------------------------------

Last Friday's presentation at UseR! 2009 was the first really public mention
of 'cran2deb'. It provides Debian packages of all of CRAN. It started as
Charles' project from last year's Google Summer of Code, was further extended
by us over the last year.

We currently build for the testing distribution and the i386 and amd64
architectures. This is now publically useable and we welcome wider testing
by Debian users. We hope to extend this to Ubuntu during the summer or fall.

I include what I blogged earlier (which you may have seen via Planet Debian
and/or Planet R).  This also contains the /etc/apt/sources.list entries:

   cran2deb: Would you like 1700+ new Debian / R packages ?

   As I mentioned in my [8]quick write-up of UseR 2009, one of my talks
   was about cran2deb: a system to turn (essentially) all [1]CRAN packages
   into directly apt-get-able binary packages.

   This is essentially a '2.0' version of earlier work with Steffen
   Moeller and David Vernazobres which we had presented in 2007. Then, the
   approach was top-down and monolithic which started to show its limits.
   This time, the idea was to borrow the successful bottom-up approach of
   my [2]CRANberries feed.

   The bulk of the work was done by Charles Blundell as part of his Google
   Summer of Code 2008 project which I had suggested and mentored. After
   that project had concluded, we both felt we should continue with it and
   bring it to 'production'. The CRAN hosts provided us with a (virtual
   Xen) machine to build on, and we are now ready to more publically
   announce the availability of the repositories for i386 and amd64:

      deb http://debian.cran.r-project.org/cran2deb/debian-i386 testing/

   and

      deb http://debian.cran.r-project.org/cran2deb/debian-amd64 testing/

   A few more details are provided in [3]our presentation slides. We look
   forward to hearing from folks using; the r-sig-debian list may be a
   good venue for this.

   References
   1. http://cran.r-project.org/
   2. http://dirk.eddelbuettel.com/cranberries
   3. http://dirk.eddelbuettel.com/papers/useR2009cran2deb.pdf

Please direct questions to r-sig-debian (which has subscriber-only posts) or
debian-science. 

Regards,  Dirk and Charles

--
Three out of two people have difficulties with fractions.

I would like to thank Dirk and Charles for their hard work!

Original message posted by Dirk:

https://stat.ethz.ch/pipermail/r-sig-debian/2009-July/000805.html