(R) code for working with data

table() counts the number of observations in each group or level. Given two vectors, it cross-tabulates every combination of their values

> color <- c("red", "red", "red", "blond", "blond", "blond", "blond", "brunette", "brunette", "brunette", "brunette")
> style <- c("perm", "mullet", "perm", "mohawk", "mullet", "mullet", "braids", "mullet", "mullet", "perm", "braids")
> table(color, style)

         style
color      braids mohawk mullet perm
  blond         1      1      2    0
  brunette      1      0      2    1
  red           0      0      1    2
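
As a small sketch building on the same two vectors, table() also works on a single vector, and addmargins() and prop.table() (both in base R) turn the counts into totals and proportions. The object names by.color, counts, with.totals and row.share are made up for this illustration.

```r
# Same example vectors as above
color <- c("red", "red", "red", "blond", "blond", "blond", "blond",
           "brunette", "brunette", "brunette", "brunette")
style <- c("perm", "mullet", "perm", "mohawk", "mullet", "mullet", "braids",
           "mullet", "mullet", "perm", "braids")

# One vector: a simple count per level
by.color <- table(color)

# Two vectors: the cross-tabulation, then row and column totals added
counts <- table(color, style)
with.totals <- addmargins(counts)

# Proportions within each row (margin = 1 means "by row")
row.share <- prop.table(counts, margin = 1)

print(by.color)
print(with.totals)
print(round(row.share, 2))
```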

tapply() applies a function (e.g., mean, max, length, or one of your own) to the observations within each group or level

> length.in.back <- c(2, 5, 1.5, 8, 3, 4.5, 13, 9, 2, 3.5, 3)
> tapply(length.in.back, INDEX = style, FUN = mean)

  braids   mohawk   mullet     perm 
8.000000 8.000000 4.700000 2.333333 

> tapply(length.in.back, INDEX = style, FUN = function(x) { sd(x)/mean(x) }) # coefficient of variation (C.V.); mohawk is NA because sd() of a single observation is NA

   braids    mohawk    mullet      perm 
0.8838835        NA 0.5709110 0.4460713
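
tapply() can also group by more than one vector at once if INDEX is given a list. The following is only a sketch using the example vectors from this page; mean.by.both is a made-up name. Combinations with no observations come back as NA.

```r
# Same example vectors as above
color <- c("red", "red", "red", "blond", "blond", "blond", "blond",
           "brunette", "brunette", "brunette", "brunette")
style <- c("perm", "mullet", "perm", "mohawk", "mullet", "mullet", "braids",
           "mullet", "mullet", "perm", "braids")
length.in.back <- c(2, 5, 1.5, 8, 3, 4.5, 13, 9, 2, 3.5, 3)

# Mean length for every color-by-style combination.
# The result is a matrix with colors as rows and styles as columns;
# cells with no observations (e.g., blond perms) are NA.
mean.by.both <- tapply(length.in.back, INDEX = list(color, style), FUN = mean)

print(mean.by.both)
```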

subset() returns the rows of a data frame (or the elements of a vector) that satisfy one or more conditions
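
A minimal sketch of subset(), assuming the example vectors from this page are first collected into a data frame. The names hair, mullets, long.blond and long.ones are hypothetical and chosen only for this illustration.

```r
# Same example vectors as above, collected into a data frame
color <- c("red", "red", "red", "blond", "blond", "blond", "blond",
           "brunette", "brunette", "brunette", "brunette")
style <- c("perm", "mullet", "perm", "mohawk", "mullet", "mullet", "braids",
           "mullet", "mullet", "perm", "braids")
length.in.back <- c(2, 5, 1.5, 8, 3, 4.5, 13, 9, 2, 3.5, 3)
hair <- data.frame(color, style, length.in.back)  # 'hair' is a hypothetical name

# One condition: keep only the rows where style is "mullet"
mullets <- subset(hair, style == "mullet")

# Two conditions joined with &, returning only the selected columns
long.blond <- subset(hair, color == "blond" & length.in.back > 4,
                     select = c(style, length.in.back))

# subset() also works directly on a vector
long.ones <- subset(length.in.back, length.in.back > 5)

print(mullets)
print(long.blond)
print(long.ones)
```

Column names can be used directly inside the condition because subset() evaluates it within the data frame. That is convenient at the console; inside your own functions, indexing with hair[condition, ] is usually the safer choice.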
