## By Kevin Keenan

The `apply` family of functions in `R` is extremely useful. I've been using them for quite a while now, generally in place of `for` loops. However, they are not particularly intuitive for `R` beginners, in the way that loops can be.

One `apply` function that I have never paid much attention to in the past is `mapply`. I've attempted to use it a few times but could never make sense of the help file and just resorted to loops instead. This morning, however, I was trying to calculate some statistics from the corresponding elements of two `lists` that I had generated, and was determined to avoid using a `for` loop (my default when writing `R` code).

A quick Google search suggested that `mapply` was the way to go. After some fumbling around and lots of trial and error, the scales dropped from my eyes as I hit 'CTRL+ENTER' (in RStudio of course) and the stop icon disappeared as if it was never there. Previously, when running similar calculations using `for` loops, the stop icon might have remained tauntingly for up to a minute, maybe more.

It appears that `mapply` is not only easier to use than I previously thought, but also lightning fast. Let the code below be a testament to its power:

### Example code

For the purpose of illustration, imagine we have two lists of length 100,000, each element being a 10 x 10 matrix of 100 random normal values.

Imagine that we are interested in calculating the element-wise product of each pair of matrices (i.e. \( mat1 \times mat2 \) computed entry by entry with `*`, rather than the matrix product `%*%`). Let's have a look at the speed difference between using a `for` loop and `mapply`.
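Before the timings, it may help to see what `mapply` actually does with two lists. The following is a minimal sketch on toy data; the names `a`, `b`, `elementwise` and `matprod` are made up for illustration and are not part of the benchmark. It also shows the difference between `*` (element-wise) and `%*%` (matrix product).

```r
# Two small lists of 2 x 2 matrices (toy data, not the benchmark data)
a <- list(matrix(1:4, ncol = 2), matrix(5:8, ncol = 2))
b <- list(matrix(2, nrow = 2, ncol = 2), matrix(10, nrow = 2, ncol = 2))

# mapply calls FUN(a[[1]], b[[1]]), then FUN(a[[2]], b[[2]]), and so on.
# SIMPLIFY = FALSE keeps the result as a list instead of trying to
# collapse it into a vector or array.
elementwise <- mapply(FUN = `*`, a, b, SIMPLIFY = FALSE)

# Swapping in `%*%` gives the true matrix product instead
matprod <- mapply(FUN = `%*%`, a, b, SIMPLIFY = FALSE)

elementwise[[1]]
matprod[[1]]
```

Each result is a list of the same length as the inputs, with element `i` built from `a[[i]]` and `b[[i]]`.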

```r
# generate the data
list1 <- list()
list2 <- list()
for (i in 1:1e+05) {
  list1[[i]] <- matrix(rnorm(100), ncol = 10)
  list2[[i]] <- matrix(rnorm(100), ncol = 10)
}

# Calculate the element-wise products using a 'for' loop
# (note that listProd1 starts empty and grows by one element per iteration)
system.time({
  listProd1 <- list()
  for (i in 1:1e+05) {
    listProd1[[i]] <- list1[[i]] * list2[[i]]
  }
})
```

```
##    user  system elapsed 
##   32.33    1.34   34.31
```

```r
# Calculate the element-wise products using 'mapply'
system.time({
  listProd2 <- mapply(FUN = `*`, list1, list2, SIMPLIFY = FALSE)
})
```

```
##    user  system elapsed 
##    0.34    0.03    0.38
```

```r
# Test to make sure both methods do the same thing
listProd1[[1]] == listProd2[[1]]
```

```
##       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
##  [1,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE  TRUE
##  [2,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE  TRUE
##  [3,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE  TRUE
##  [4,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE  TRUE
##  [5,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE  TRUE
##  [6,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE  TRUE
##  [7,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE  TRUE
##  [8,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE  TRUE
##  [9,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE  TRUE
## [10,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE  TRUE
```

```r
listProd1[[1000]] == listProd2[[1000]]
```

```
##       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
##  [1,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE  TRUE
##  [2,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE  TRUE
##  [3,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE  TRUE
##  [4,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE  TRUE
##  [5,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE  TRUE
##  [6,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE  TRUE
##  [7,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE  TRUE
##  [8,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE  TRUE
##  [9,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE  TRUE
## [10,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE  TRUE
```
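Spot-checking two elements is reassuring, but the whole list can also be compared in one go. The following is a minimal sketch, assuming a smaller stand-in data set (the size `n` and the seed are arbitrary choices of mine, made so the check runs quickly). It rebuilds both results and compares them with `identical`.

```r
# Small stand-in data so the check runs quickly (sizes are arbitrary)
set.seed(1)
n <- 1000
list1 <- replicate(n, matrix(rnorm(100), ncol = 10), simplify = FALSE)
list2 <- replicate(n, matrix(rnorm(100), ncol = 10), simplify = FALSE)

# The same two approaches as in the benchmark
listProd1 <- list()
for (i in seq_len(n)) {
  listProd1[[i]] <- list1[[i]] * list2[[i]]
}
listProd2 <- mapply(FUN = `*`, list1, list2, SIMPLIFY = FALSE)

# A single logical answer for the whole list, instead of one matrix per element
identical(listProd1, listProd2)
```

If exact equality is too strict for a given calculation, `all.equal` compares numeric values within a tolerance instead.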

We can see that there is a massive difference (in computational terms) in the performance of these two methods: about 34 seconds elapsed for the `for` loop against 0.38 seconds for `mapply`, roughly 90 times faster. Although I don't know for sure, I suspect the time penalty in the `for` loop comes from growing the list from scratch, one element at a time, which is slow and not the best way to do things.
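One way to probe that suspicion is to time a third variant in which the result list is allocated at its full length before the loop starts. The following is only a sketch under my own assumptions: the smaller `n`, the seed, and the names `prodGrow`, `prodPrealloc` and `prodMapply` are not from the benchmark above. Timings will depend on the R version and the machine, so none are shown here.

```r
set.seed(42)
n <- 1e4  # smaller than the 1e5 used above, purely to keep the sketch quick
list1 <- replicate(n, matrix(rnorm(100), ncol = 10), simplify = FALSE)
list2 <- replicate(n, matrix(rnorm(100), ncol = 10), simplify = FALSE)

# Loop that grows the result list from empty, as in the benchmark
timeGrow <- system.time({
  prodGrow <- list()
  for (i in seq_len(n)) {
    prodGrow[[i]] <- list1[[i]] * list2[[i]]
  }
})

# Same loop, but the result list is allocated at full length up front
timePrealloc <- system.time({
  prodPrealloc <- vector("list", n)
  for (i in seq_len(n)) {
    prodPrealloc[[i]] <- list1[[i]] * list2[[i]]
  }
})

# mapply, for reference
timeMapply <- system.time({
  prodMapply <- mapply(FUN = `*`, list1, list2, SIMPLIFY = FALSE)
})

# Elapsed seconds for the three variants, side by side
c(grow     = timeGrow[["elapsed"]],
  prealloc = timePrealloc[["elapsed"]],
  mapply   = timeMapply[["elapsed"]])
```

If the preallocated loop lands close to `mapply`, that would support the idea that growing the list is the main cost; if it does not, the overhead lies elsewhere.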

### Reproducibility

```
## R Under development (unstable) (2013-09-29 r64014)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] knitr_1.5.1
## 
## loaded via a namespace (and not attached):
## [1] digest_0.6.3   evaluate_0.4.7 formatR_0.9    stringr_0.6.2 
## [5] tools_3.1.0
```