Archive for April, 2009

JAGS package for Debian Lenny AMD 64

April 28, 2009

I’ve packaged JAGS (Just Another Gibbs Sampler) for Debian Lenny on the AMD 64 architecture. It may or may not work on Ubuntu. I am working on getting it into Debian, but until then you can at least get the 64-bit version here. Please let me know if there are any problems with the package. If anyone is interested in creating a package for another architecture (e.g. *.i386.deb), please let me know and I’ll upload the appropriate files. Click here for the JAGS 1.0.3 Debian package.

Bayesian talk and DIC

April 28, 2009

I gave an introductory talk on Bayesian statistics for one of my classes. I’ve attached a PDF of the talk (click Thomas Bayes below to download it). Please feel free to leave comments and let me know what you think and whether it’s helpful. It’s supposed to give a conceptual overview; it’s not meant to be definitive. Finally, I have included some humor in the presentation (e.g. the Frequentist and Bayesian images). They are meant as a joke and do not at all imply that these individuals are representative of the different statistical lineages. There is one easter egg in the presentation: click the word “Bayesian” on the Harry Potter slide. Enjoy!

thomas_bayes1

I recently started working with the MCMCglmm package in R and came across an interesting information criterion, the DIC (Deviance Information Criterion), which is often used for model selection when running MCMC. I look forward to seeing how the BIC and DIC differ. Additionally, as I move more in the direction of employing Bayesian statistics in my research, I’ll probably start to favor the DIC over the BIC, as the former is computed from posterior draws and does not require maximum likelihood estimates. But we’ll see.
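For anyone who has not met the DIC before, here is a minimal sketch of how it can be computed from posterior draws: DIC = Dbar + pD, where Dbar is the posterior mean deviance and pD = Dbar minus the deviance at the posterior mean. Everything in it is an assumption for illustration: the data are simulated, the model is a normal with known standard deviation 1 and a flat prior on the mean (so the posterior can be drawn directly rather than by MCMC), and the names (`deviance_at`, `d_bar`, `p_d`) are my own. It is not how MCMCglmm computes its DIC internally.

```r
# DIC sketch: normal model with known sd = 1 and a flat prior on the mean.
# All data are simulated; nothing here comes from a real analysis.
set.seed(42)
y <- rnorm(50, mean = 2, sd = 1)
n <- length(y)

# With a flat prior, the posterior of mu is Normal(mean(y), 1 / n),
# so we can draw from it directly instead of running an MCMC sampler.
mu_draws <- rnorm(5000, mean = mean(y), sd = sqrt(1 / n))

# Deviance: -2 * log-likelihood of the data at a given value of mu.
deviance_at <- function(mu) -2 * sum(dnorm(y, mean = mu, sd = 1, log = TRUE))

d_bar <- mean(sapply(mu_draws, deviance_at))  # posterior mean deviance
d_hat <- deviance_at(mean(mu_draws))          # deviance at the posterior mean
p_d   <- d_bar - d_hat                        # effective number of parameters
dic   <- d_bar + p_d

c(p_d = p_d, DIC = dic)
```

With one free parameter, `p_d` should come out close to 1, which is a handy sanity check on the calculation.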

AIC vs. BIC revisited and other updates

April 22, 2009

The winner: BIC. See Raftery (1995; 1996) and Kadane and Lazar (2004). Three reasons have convinced me that the BIC is the better choice. First, as N increases to infinity, the probability that the BIC chooses the true model approaches 1 (assuming the true model is among the candidates). Second, the BIC has a Bayesian justification whereas the AIC does not. Third, the BIC favors parsimony. Clearly I am moving away from my ecological roots of AIC and the influence of Burnham and Anderson.
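To make the parsimony point concrete, here is a small sketch of the two penalties: AIC = -2 logLik + 2k and BIC = -2 logLik + k log(n). It uses the `mtcars` data that ships with R purely as a stand-in (it has nothing to do with my own data), and the variable names are just for illustration.

```r
# Fit a small linear model on a dataset that ships with R (a stand-in only).
fit <- lm(mpg ~ wt + hp, data = mtcars)

ll <- logLik(fit)
k  <- attr(ll, "df")   # number of estimated parameters (incl. residual variance)
n  <- nobs(fit)        # number of observations

aic_manual <- -2 * as.numeric(ll) + 2 * k
bic_manual <- -2 * as.numeric(ll) + log(n) * k

# Both should agree with R's built-in functions.
all.equal(aic_manual, AIC(fit))
all.equal(bic_manual, BIC(fit))

# BIC's per-parameter penalty, log(n), exceeds AIC's penalty of 2 once n >= 8.
log(n) > 2
```

Because log(n) keeps growing with the sample size while the AIC penalty stays at 2, the BIC charges more for each extra parameter in all but the smallest data sets.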

This summer I intend to update this blog more frequently. These updates will include Bayesian versions of analyses commonly performed in the social sciences (I plan to spend a great deal of time on Bayesian hierarchical modeling); Win/OpenBUGS code (i.e. MCMC simulations); and R code. My intention is to post weekly with an example of a Bayesian analysis of a typical problem, compare it with a frequentist analysis, and provide the code. I will try to keep it objective and let the procedures and results speak for themselves. Apparently there are at least a few folks who read this, so if you’re interested in seeing a Bayesian approach to a problem common in the social sciences, please comment to let me know and I will try my best to address it.

Emacs: The all-in-one R, Sweave, LaTeX, and BibTeX editor

April 6, 2009

Emacs is a cross-platform, highly extensible, open source text editor developed by this guy. It is the way to go if you use R and LaTeX. If you want a vanilla version of the package go to the above link, but if you want Emacs already preloaded with a bunch of goodies try the following for Mac OS X or this one. For Windows, this one has a bunch of nice features that enable it to integrate better with Windows, but it doesn’t come with ESS, so you might prefer this one, which does. For Linux, it’s usually a simple apt-get install emacs-gtk, urpmi emacs, zypper in emacs, or yum install emacs (you get the picture). With Linux you’ll also need to install ESS. Also, to run LaTeX, make sure you have a LaTeX bundle installed.

Here are some useful Emacs tips for using R (you could also use this cheat sheet):

The first thing you should do is create a file in Emacs and save it with the extension .Rnw (if you want to do Sweave) or .R (if you just want it to be an R script). For example, test.Rnw (though you could save it as test.R).
Then you can add R code. To run the code you can either invoke R in Emacs as described below, click the run code icon, or just have Emacs invoke R when you run your first line of code by typing C-c C-n. That’s it.

M-x R
(Starts R in Emacs)

M-n s

(Runs Sweave, but Emacs needs to know where Sweave is … If you’re running Linux (I should be careful here, I guess I mean Debian or Ubuntu) everything is set up fine, but in Windows or Mac OS X you have to tell Emacs where Sweave.sty is located. This file is included with R. Locate it and, in Mac OS X, create a symbolic link to it in the directory where your other *.sty files are. Alternatively, copy Sweave.sty to a directory on your computer, such as “/Users/me/Sweave/Sweave.sty”, and specify the full path to the Sweave file in your preamble.)
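If you are not sure where your copy of Sweave.sty lives, one way to find it is to ask R itself. This is only a sketch: it assumes the file sits somewhere under R's "share" directory, which is where I would expect it, but the exact subdirectory can differ between R versions and platforms, so the code searches rather than hard-coding a path. The name `sty_path` is just for illustration.

```r
# Search R's own "share" directory (and its subdirectories) for Sweave.sty.
sty_path <- list.files(R.home("share"), pattern = "^Sweave\\.sty$",
                       recursive = TRUE, full.names = TRUE)

# Prints the full path(s) found, or character(0) if nothing matched.
sty_path
```

The path it prints is the one to link to, copy from, or reference in your preamble.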

M-p
(Recalls the previous command in the R console buffer)

If you use scholar.google.com, BibTeX is the way to go. Set the Scholar preferences to “show links to import citations into BibTeX”. Then copy and paste each reference into a reference file with the extension .bib. After that, all you need to do is call the reference from Emacs when you cite it in LaTeX.

For Sweave, put your R code inside a chunk like the following:

<<>>=
x <- c(1,2,3,4,5)
@

If you want your comments to be kept in the output:

<<keep.source=TRUE>>=
# Assign 1 through 5 to x
x <- c(1,2,3,4,5)
@

For figures, use:

\begin{center}
<<fig=TRUE,echo=FALSE>>=
INSERT R CODE
@
\end{center}

Also, you must first “Sweave” the document, then run “LaTeX”, then “BibTeX”, then “LaTeX” again (a further LaTeX run is sometimes needed to resolve all citations). A small price to pay for a well-integrated way of creating documents.
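If you would rather drive the whole sequence from the R prompt, something along these lines should work. It is a sketch, not a recommendation over the Emacs key bindings: `build_sweave` is a hypothetical helper name of my own, and it assumes a working LaTeX installation and that the .tex file ends up in the current working directory. `tools::texi2dvi()` takes care of rerunning LaTeX and BibTeX as needed.

```r
# Hypothetical helper: Sweave a .Rnw file, then build the PDF.
build_sweave <- function(rnw_file) {
  # Step 1: Sweave runs the R chunks and writes a .tex file
  # into the current working directory.
  Sweave(rnw_file)
  tex_file <- paste0(tools::file_path_sans_ext(basename(rnw_file)), ".tex")

  # Step 2: texi2dvi() reruns LaTeX and BibTeX as often as needed
  # to resolve citations and cross-references.
  tools::texi2dvi(tex_file, pdf = TRUE)

  # Return the name of the PDF that should have been produced.
  invisible(sub("\\.tex$", ".pdf", tex_file))
}

# Usage (requires a LaTeX installation and an existing file):
# build_sweave("test.Rnw")
```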

BIC or AIC?

April 5, 2009

model   BIC        deltaBIC    AIC        deltaAIC
1       141978.9   112.05478   141789.5   154.048607
2       141924.9    57.98382   141648.5    13.137065
3       142006.4   139.49803   141643.2     7.810695
4       141866.9     0.00000   141677.4    41.993823
5       141911.7    44.84676   141635.4     0.000000

n ~ 39,000

The AIC seems to always select model 5, whether examining all the data or random splits (for model cross-validation). However, the BIC will select model 4 (which is more parsimonious than model 5) on random splits of the data but will select model 5 on all the data. Which model selection criterion should I go with?
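For reference, the delta columns in the table above are just each value minus the smallest value of that criterion. Here is a short sketch that rebuilds them from the BIC and AIC columns. The inputs are the rounded values printed in the table, so the deltas only match the printed ones to about one decimal place; the data frame name `ic` is arbitrary.

```r
# Values copied from the table above (already rounded to one decimal).
ic <- data.frame(
  model = 1:5,
  BIC = c(141978.9, 141924.9, 142006.4, 141866.9, 141911.7),
  AIC = c(141789.5, 141648.5, 141643.2, 141677.4, 141635.4)
)

# Delta = difference from the smallest (best) value of each criterion.
ic$deltaBIC <- ic$BIC - min(ic$BIC)
ic$deltaAIC <- ic$AIC - min(ic$AIC)

ic$model[which.min(ic$BIC)]  # model with the smallest BIC in this table
ic$model[which.min(ic$AIC)]  # model with the smallest AIC in this table

ic
```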

Migration from UMN blog and Free/Reduced Price Lunch in Minnesota

April 4, 2009

This is my first post on WordPress after migrating from the UMN blog roll. WordPress appears to be more feature-rich and customizable. This will now be my sole blogging source. I intend to include R and WinBUGS code here as well as other information relevant to my research.

As part of my research with the Institute of Child Development at the University of Minnesota I have been involved with modeling math and reading achievement in homeless/highly mobile (HHM) students in the Minneapolis public school district. This work has fostered a deeper interest in child development in general and HHM students in particular. I decided it might be interesting to look at the proportion of students on free/reduced price lunch by county in Minnesota. The map below was created quickly using the ggplot2 package in R. I am not sure that county is the best scale at which to examine free/reduced price lunch, but it does allow a cursory spatial examination of free lunch in Minnesota.

Free/Reduced Price Lunch