R programming tips

This page will be an ongoing collection of tips and suggestions I find useful (or found out through much trial and effort) when using R. As a living document, it will start as a haphazard collection, but should it grow, I may re-order it.

Use a consistent coding style

I have mainly been following Hadley Wickham’s style guide, although I have not settled on a consistent variable and function naming schema as of yet. Another good resource is Paul Johnson’s brief exposition (PDF).

Benchmark your code

There are multiple ways to time code. Personally, I use the microbenchmark package. There is also the rbenchmark package, and the tried-and-true workhorse system.time(foo) (note the lowercase "s"; R is case-sensitive). Regardless of which you use, it can be illuminating to compare slightly different implementations. Which brings us to the next suggestion…
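As a minimal sketch of such a comparison, the snippet below times two versions of the same calculation with base R's system.time. The functions sum_sq_loop and sum_sq_vec are hypothetical stand-ins invented for illustration; the microbenchmark call at the end only runs if that package happens to be installed.

```r
# Two hypothetical implementations of the same task: a sum of squares.
sum_sq_loop <- function(x) {
  total <- 0
  for (xi in x) total <- total + xi^2
  total
}
sum_sq_vec <- function(x) sum(x^2)

x <- runif(1e6)

# Base R: time a single run of each version.
t_loop <- system.time(res_loop <- sum_sq_loop(x))
t_vec  <- system.time(res_vec  <- sum_sq_vec(x))
print(c(loop = t_loop[["elapsed"]], vectorised = t_vec[["elapsed"]]))

# Both versions should agree up to floating-point rounding.
stopifnot(isTRUE(all.equal(res_loop, res_vec)))

# Optional: repeated timings, only if microbenchmark is installed.
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  print(microbenchmark::microbenchmark(
    loop       = sum_sq_loop(x),
    vectorised = sum_sq_vec(x),
    times      = 20L
  ))
}
```

A single system.time run is noisy, so for small differences the repeated timings from microbenchmark (or a manual loop around system.time) are the more trustworthy comparison.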

Profile slow code

Use R’s code profiling mechanisms, specifically Rprof, when dealing with slow code. Identifying the bottleneck and recoding it, or moving it into C++, can provide speed gains measured not in multiples but orders of magnitude!
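A rough sketch of a profiling session with Rprof and summaryRprof follows. The function slow_triangular is a deliberately inefficient, hypothetical example (it grows a vector one element at a time), and the sampling interval is just a choice made for this illustration.

```r
# Hypothetical slow function: grows the result vector one element at a time,
# which forces a copy on every iteration.
slow_triangular <- function(n) {
  out <- numeric(0)
  for (i in seq_len(n)) out <- c(out, sum(seq_len(i)))
  out
}

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)  # start sampling the call stack
res <- slow_triangular(20000)
Rprof(NULL)                        # stop profiling

# Summarise time by function; the largest entries are the bottleneck candidates.
# tryCatch guards against a run too short to record any samples.
prof <- tryCatch(summaryRprof(prof_file), error = function(e) NULL)
if (!is.null(prof)) print(head(prof$by.self))
```

In a profile like this, I would expect most of the time to be attributed to the repeated c() calls, which points at preallocating the vector as the fix.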

Use R’s built-in optimized code as much as possible

This was not immediately obvious to me, but it makes sense. As an example, compare 1 - pnorm(4) with pnorm(4, lower.tail=FALSE). There is a small but measurable speed increase in the latter, probably because the subtraction happens inside the compiled routine and not at the R level. If you need to make several billion calls, the savings can become meaningful. I’ve tested it with many of the basic distributions (normal, lognormal, gamma, etc.) and, as a rule, it seems to hold. I’ll be keeping an eye out for this one in the future.
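The comparison can be sketched as below. The number of repetitions (n_calls) is an arbitrary choice for illustration, and the measured difference will depend on the machine and R version, so the code reports the timings without assuming which one wins.

```r
x <- 4

upper_subtract <- 1 - pnorm(x)                  # subtraction done at the R level
upper_direct   <- pnorm(x, lower.tail = FALSE)  # upper tail returned directly

# The two forms give the same answer up to rounding.
print(c(subtract = upper_subtract, direct = upper_direct))
stopifnot(isTRUE(all.equal(upper_subtract, upper_direct, tolerance = 1e-8)))

# Time many repeated scalar calls of each form.
n_calls <- 2e5
t_subtract <- system.time(for (i in seq_len(n_calls)) 1 - pnorm(x))
t_direct   <- system.time(for (i in seq_len(n_calls)) pnorm(x, lower.tail = FALSE))
print(c(subtract = t_subtract[["elapsed"]], direct = t_direct[["elapsed"]]))
```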

Compare methods when inverting matrices

In much of my code, I have to invert a matrix (the Hessian of the negative log-likelihood function at the point of convergence, to find the variance-covariance matrix of the fitted parameters, if you really want to know). For various reasons, I used to use the QR factorization. Using a Cholesky decomposition followed by chol2inv was markedly faster in my cases. Note that the Cholesky route requires a symmetric positive-definite matrix, which a Hessian at a proper minimum should be.
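A small sketch of the two approaches, under the assumption that the matrix is symmetric positive definite. The matrix H here is a randomly generated stand-in, not a real Hessian, and its size and the repetition count are arbitrary.

```r
set.seed(1)
p <- 200
X <- matrix(rnorm(p * p), p, p)
H <- crossprod(X) + diag(p)  # stand-in for a Hessian: symmetric positive definite

inv_qr   <- qr.solve(H)         # inverse via QR factorization
inv_chol <- chol2inv(chol(H))   # inverse via the Cholesky factor

# Both routes should give the same inverse up to rounding.
stopifnot(isTRUE(all.equal(inv_qr, inv_chol,
                           tolerance = 1e-6, check.attributes = FALSE)))

# Time repeated inversions with each method.
t_qr   <- system.time(for (i in 1:20) qr.solve(H))
t_chol <- system.time(for (i in 1:20) chol2inv(chol(H)))
print(c(qr = t_qr[["elapsed"]], cholesky = t_chol[["elapsed"]]))
```

If chol fails with an error about the leading minor not being positive definite, that is itself useful information: the optimizer may not have converged to a proper minimum.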

Use a fast BLAS if possible

See this post and this update for more detail.
Test files used in 3.1.0 speed tests:

4 Responses

  1. jot_en, January 16, 2014 at 7:51 AM

    Thanks for the tips.
    P.S. The link to Paul Johnson’s brief exposition is broken.

  2. Michael, January 16, 2014 at 5:54 PM

    Most of your tips focus on code optimisation, so my tip would be:
    Focus on writing clear, readable code. Don’t worry about optimisation unless you have a performance problem.

    1. MagicSpik3, August 10, 2015 at 9:13 AM

      I ditto that. But I’m concerned with research more than commercial applications.

