R: Analysis methods

Some examples of resampling statistics in R:

Permutation tests randomly reassign the treatment labels to the observations and recompute the test statistic for each new permutation of the data. If the treatment has a significant (yes, I said significant. So sue me!) effect, then we would expect the random permutations to exhibit less extreme among-treatment differences than the actual data. Permutation tests assume that the observations are exchangeable under the null hypothesis. Thus, tests of difference in location require equal underlying variance among treatments. Here is a simple example of a permutation test in R.

#simple permutation test:

#        1) generate fake data for contrasting quantitative data across two treatments
#            treatment A is a normal random variable (mean 10, sd 5) and treatment B is a
#            normal random variable (mean 15, sd 5).

x <- c(rnorm(50,10,5),rnorm(50,15,5))
treat <- rep(c("A","B"),each=50)

#        2) calculate absolute difference in treatment means

realdif <- abs(mean(x[which(treat=="A")]) - mean(x[which(treat=="B")]))

#        3) resample data, without replacement, randomizing treatment for each trial

z <- vector(mode="numeric",length=500)    #storage vector for resampled differences in means
for (i in 1:500)  {
    rstreat <- sample(treat,100,replace=FALSE)

#        4) calculate the dif. in treatment means for resampled data

    z[i] <- abs(mean(x[which(rstreat=="A")]) - mean(x[which(rstreat=="B")]))
}

#    5) calculate the 95th percentile of the difference in means for the permuted data
#                 and compare with "real" data

quantile95 <- quantile(z,0.95,names=FALSE)
print(c("REALDIF = ",realdif,"95percentile dif = ", quantile95),quote=F)

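In this case, the observed difference in means is well above the 95th percentile of the permuted differences, so the permutation test shows us that our treatment effect was significant at the 0.05 level (the data are random, so your numbers will differ from run to run):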

 REALDIF = 5.81602929424672
95percentile dif =  2.31977945130971
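
The percentile comparison above only gives a yes/no answer at one cutoff. A permutation p-value can be had from the same resampled differences; here is a minimal sketch, assuming the same fake-data setup as the example. The names absdif, nperm, permdif and pval are made up for illustration, the seed is arbitrary, and adding 1 to the numerator and denominator is one common convention (it counts the observed labelling as one of the permutations), not the only one.

```r
# Permutation p-value for the absolute difference in treatment means.
set.seed(2009)                       # arbitrary seed, only so the run is repeatable

x <- c(rnorm(50, 10, 5), rnorm(50, 15, 5))
treat <- rep(c("A", "B"), each = 50)

# absolute difference in means for a given labelling (hypothetical helper)
absdif <- function(labels) abs(mean(x[labels == "A"]) - mean(x[labels == "B"]))

realdif <- absdif(treat)

# shuffle the treatment labels nperm times and recompute the statistic
nperm <- 999
permdif <- replicate(nperm, absdif(sample(treat)))

# proportion of labellings (including the observed one) at least as extreme
pval <- (sum(permdif >= realdif) + 1) / (nperm + 1)
pval
```

With this convention the smallest p-value you can get is 1/(nperm + 1), so use more permutations if you need finer resolution.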

Note: You can find a lot of pre-built resampling tools in the boot and coin packages. It would be worth reading the (f-ing) manual, though.
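
As a taste of the pre-built tools, here is a rough sketch using boot() and boot.ci() from the boot package. Note that this is a bootstrap (resampling with replacement) rather than a permutation test: it puts a percentile confidence interval on the difference in treatment means. The data frame dat, the helper meandif, the seed and the 999 replicates are all assumptions made for illustration.

```r
library(boot)   # a recommended package, normally installed alongside R

set.seed(1)     # arbitrary seed, only so the run is repeatable
dat <- data.frame(x = c(rnorm(50, 10, 5), rnorm(50, 15, 5)),
                  treat = factor(rep(c("A", "B"), each = 50)))

# statistic for boot(): difference in treatment means (B minus A)
# computed on the rows picked out by idx (hypothetical helper)
meandif <- function(d, idx) {
  d <- d[idx, ]
  mean(d$x[d$treat == "B"]) - mean(d$x[d$treat == "A"])
}

# resample with replacement within each treatment, so group sizes stay at 50
b <- boot(dat, statistic = meandif, R = 999, strata = dat$treat)

# percentile bootstrap confidence interval (95% by default)
ci <- boot.ci(b, type = "perc")
print(ci)
```

If the interval excludes zero, that tells roughly the same story as the permutation test above, without the equal-variance assumption.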

Basic Statistics

This is a link to a helpful PDF from CRAN, "simpleR: Using R for Introductory Statistics" by John Verzani. All of your basic frequentist stats are covered, so if you need an easy reference for constructing confidence intervals or want to run a chi-square test, this is a good place to look.
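
For a quick idea of what those two tasks look like in base R, here is a small sketch using t.test() and chisq.test() from the stats package. The data are made up (the sample y and the 2x2 table counts are assumptions for illustration, not taken from the PDF).

```r
set.seed(42)                          # arbitrary seed, only so the run is repeatable

# 95% confidence interval for a mean, via the one-sample t-test
y <- rnorm(30, mean = 10, sd = 5)     # fake sample
tt <- t.test(y, conf.level = 0.95)
tt$conf.int                           # lower and upper bounds

# chi-square test of independence on a 2x2 table of fake counts
counts <- matrix(c(20, 15, 30, 35), nrow = 2,
                 dimnames = list(treat = c("A", "B"),
                                 outcome = c("yes", "no")))
cs <- chisq.test(counts)              # applies Yates' correction for 2x2 tables by default
cs$p.value
```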

page revision: 5, last edited: 30 Nov 2009 04:32