Coding Style Guide

Introduction

This page will serve as my ever-evolving coding style guide. It may also serve as a source style guide for projects in which I am lead maintainer, in lieu of anything else. Before you complain, yes, I know that I have packages on CRAN which do not conform to the guide as it stands now. That is mainly due to the guide’s evolution over time. Once an API is released, there needs to be an earth-shattering reason to break backwards compatibility. Sometimes, a function of mine is patterned after one in base or recommended R, such as adm in the revss package being based on mad in the stats package. In those cases, conformity with base R trumps conformity with my personal style. Lastly, these are all recommendations: “shoulds” and not “shalls”. Given a good reason, style guides may be broken. If you are using this guide to contribute to one of my projects and you wish to break the style guide, defend yourself well enough and the pull request may be accepted!

Note: This style guide defaults to the use of R. For example, see the section on the assignment operator. Languages with specific style requirements of their own may be discussed later.

General/R Style Guide

Variable Names

  • Variable names should be descriptive yet concise.
  • Acronyms should not be used unless they are self-explanatory from context.
  • Variable names should begin with a lowercase letter, may contain letters and numbers, and should use either underscores or lowerCamelCase to concatenate multiple words.
  • Variable names should not contain periods as that normally implies a method or class hierarchy.
  • Hungarian notation should not be used.

Examples

Good                               | Bad
x (when representing ‘x’ in math)  | x (when representing ‘weight of emu after diet’)
lowerBound                         | lb (could it be pounds?)
subPrem (subject Premium)          | sp (???)

Function Names

  • Function names should be descriptive yet concise.
  • Acronyms should not be used unless they are self-explanatory from context.
  • Function names should begin with a lowercase letter, may contain letters and numbers, and should use either underscores or lowerCamelCase to concatenate multiple words.
  • Function names should not contain periods as that normally implies a method or class hierarchy.

Examples

Good                 | Bad
fitDist(x)           | fD(x)
smoothResults(x, y)  | sR(x, y)

Line Length

  • Line length should be capped at 80 characters.
    • Even though most people have wide-screen monitors, it is still prudent to cap at 80 characters where possible. This allows for both neater printing and easier diff comparisons on a wide-screen device.
    • Going over by one or two characters is not a big deal, but going over by three characters or more implies a hard return may be necessary.

Indentation

  • Each level of indentation should be two spaces—not tabs.
  • If a wrapped line is inside parentheses, start the next line at the character position after the opening parenthesis.

Examples
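A minimal sketch of both indentation rules. The function scaleValues and its arguments are hypothetical and exist only for illustration.

```r
# A hypothetical function used only to illustrate indentation.
scaleValues <- function(values, center = 0, scale = 1) {
  # Each nested level adds two spaces.
  if (scale == 0) {
    stop("scale must be nonzero")
  }
  (values - center) / scale
}

# The wrapped arguments line up just after the opening parenthesis.
scaledWeights <- scaleValues(c(10.5, 12.25, 9.75, 11),
                             center = 10,
                             scale = 0.5)
```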

Spacing

  • All binary operators should have spaces before and after them, including the caret used for exponentiation and the assignment operator.
  • Never place a space before a comma; always place a space after a comma.
  • Always place a space before a left parenthesis except in a function call.
  • Always place a space after a right parenthesis.
  • Never place a space after a left parenthesis or before a right parenthesis.
  • Never place spaces between brackets and their contents.

Examples
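The spacing rules above might look like this in practice. The variable names are made up for the illustration, and the commented-out lines show the discouraged forms for contrast.

```r
values <- c(3, 4, 12)

# Binary operators, including "^" and "<-", have a space on each side.
sumOfSquares <- sum(values ^ 2)

# No space before a comma, one space after; no spaces inside brackets.
firstTwo <- values[c(1, 2)]
lookup <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)
topRight <- lookup[1, 3]

# A space before "(" in control flow, but none in a function call.
if (sumOfSquares > 100) {
  euclideanNorm <- sqrt(sumOfSquares)
}

# Discouraged equivalents, for contrast:
# sumOfSquares<-sum( values^2 )
# topRight <- lookup[ 1 ,3 ]
# if(sumOfSquares > 100){
```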

Dates

  • When writing code, dates should be written and stored in ISO 8601 format: “YYYY-MM-DD”.
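As a small sketch of the date rule in base R, with variable names and sample dates invented for the illustration: as.Date() reads ISO 8601 strings by default, and format() can normalize dates that arrive in another layout.

```r
# The default format for as.Date() and format() is ISO 8601.
policyStart <- as.Date("2024-03-15")
policyEnd <- policyStart + 30

# Dates arriving in another layout are converted once, then stored as ISO.
rawDate <- "03/15/2024"
parsedDate <- as.Date(rawDate, format = "%m/%d/%Y")
isoDate <- format(parsedDate, "%Y-%m-%d")
```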

Assignment Operator

  • The R assignment operator is “<-”; never “=”. The equals sign is used to pass parameters in a function call.
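A short illustration of the two roles, using made-up variable names: the arrow assigns, and the equals sign only names arguments inside a call.

```r
# "<-" assigns; "=" only names arguments inside a call.
claimCounts <- c(2, 0, 5, 1)
averageClaims <- mean(x = claimCounts, trim = 0)

# Discouraged:
# claimCounts = c(2, 0, 5, 1)
```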

Function Definitions and Calls

Arguments

  • When writing or using functions, it is preferable to first list arguments without default values and then the arguments with default values.
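One way this ordering could look, using a hypothetical function trimOutliers: the arguments without defaults come first and the defaulted argument comes last.

```r
# Hypothetical function: required arguments first, defaulted ones last.
trimOutliers <- function(values, lower, upper, removeMissing = TRUE) {
  if (removeMissing) {
    values <- values[!is.na(values)]
  }
  values[values >= lower & values <= upper]
}

trimmed <- trimOutliers(c(1, 50, NA, 7, 12), 0, 20, removeMissing = TRUE)
```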

Line Breaks

  • When writing or using functions, the call or definition may span multiple lines, but should be broken between arguments, after a comma.
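A sketch of a definition and a call that each span several lines, with every break placed after a comma. The function summarizePremium and its arguments are hypothetical.

```r
# Hypothetical function; each break falls after a comma, between arguments.
summarizePremium <- function(premium, exposure, roundDigits = 2,
                             removeMissing = TRUE) {
  totalPremium <- sum(premium, na.rm = removeMissing)
  totalExposure <- sum(exposure, na.rm = removeMissing)
  round(totalPremium / totalExposure, roundDigits)
}

ratePerUnit <- summarizePremium(c(1200, 850, NA),
                                c(10, 8, 2),
                                roundDigits = 1)
```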

Comments

  • Always comment code!
  • Use comments liberally and appropriately.
  • In general, comments should be on their own lines, starting with a space, followed by a “#”, followed by a space, and be complete sentences.
  • One may use a row of “-” or “=” to indicate new sections as well.
  • Very short comments which are not full sentences may be placed on the same row as code, preceded by two spaces.
    • If the comment is of even moderate length, it is preferable to write it as a full sentence.
  • Comments may be placed above the specific line of code and indented to the same level, if it is believed that will help readability.
    • See the curly bracket section for an example of a very short comment.

Examples
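The comment rules above could be applied as follows. The data and variable names are invented for the illustration.

```r
# Losses ---------------------------------------------------------------------

# Convert the reported losses to thousands before summarizing.
reportedLosses <- c(12500, 48000, 3100)
lossesInThousands <- reportedLosses / 1000

largestLoss <- max(lossesInThousands)  # in thousands
```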

Quotations

  • When delimiting strings, use double quotations (") and not single quotations (').
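A brief illustration with made-up strings. A side effect of delimiting with double quotations is that an apostrophe inside the string needs no escaping.

```r
# Double quotations delimit the string.
greeting <- "Hello, emu!"

# Single quotations may still appear inside it without escaping.
caption <- "The emu's weight after diet"

# Discouraged:
# greeting <- 'Hello, emu!'
```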

Semicolons

  • Do not use semicolons to have multiple calls on the same line.
    • Use separate lines for separate calls.
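For instance, with hypothetical variable names:

```r
# Preferred: one call per line.
lowerBound <- 0
upperBound <- 10
midpoint <- (lowerBound + upperBound) / 2

# Discouraged:
# lowerBound <- 0; upperBound <- 10
```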

Curly Braces and “Else”

  • Use Egyptian brackets.
  • The opening brace should always be on the same line as the call—never on its own line.
  • The closing brace should be on its own line, at the same level of indentation as the call it is closing.
    • Single-line clauses may have the closing brace on the same line.
  • There should be nothing else on the line of the closing bracket except an “else” or “else if”.
  • The else should always be on the line closing the previous clause and opening the next clause.

Examples
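A sketch of the brace and “else” placement described above. The functions classifyWeight and isPositive, and the weight thresholds, are hypothetical. The last line also shows a very short same-line comment.

```r
# Hypothetical function showing brace and "else" placement.
classifyWeight <- function(weight) {
  if (weight < 30) {
    label <- "light"
  } else if (weight < 45) {
    label <- "typical"
  } else {
    label <- "heavy"
  }
  label
}

# A single-line clause may keep its closing brace on the same line.
isPositive <- function(x) { x > 0 }  # very short
```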

Tabular Data Framework

  • Use of base R for simple tabular data is strongly preferred.
  • Use of data.table for more complex tabular data is strongly preferred.
  • Use of dplyr is strongly discouraged.
    • Combining different tabular data constructs gets confusing.
    • data.table syntax is more elegant in the author’s eyes.
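As a rough sketch of the two preferred approaches, the same grouped mean is computed first with a base R data frame and then with data.table. The data set is invented, and the data.table portion runs only if that package happens to be installed.

```r
# Simple case: base R data frame and aggregate().
birds <- data.frame(species = c("emu", "emu", "rhea", "rhea"),
                    weight = c(38, 42, 23, 27))
meanBySpecies <- aggregate(weight ~ species, data = birds, FUN = mean)

# More complex case: the same summary in data.table, run only if installed.
if (requireNamespace("data.table", quietly = TRUE)) {
  birdTable <- data.table::as.data.table(birds)
  meanByTable <- birdTable[, .(weight = mean(weight)), by = species]
}
```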

Functional Composition

  • Functional composition is strongly preferred.
    • This is how mathematicians and statisticians write.
  • Piping is strongly discouraged.
    • Both native R pipes and magrittr-style pipes will almost never be acceptable.
Dependencies

  • Dependencies should be minimized as per the tinyverse mentality.
  • Use of tidyverse member packages is discouraged if base R analogues exist.
  • Use of library(tidyverse) is extremely strongly discouraged.
    • I cannot think of a reason this would be allowed, but never say never.
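Two illustrative cases where a base R call can stand in for a tidyverse function. The choice of functions here is only an example, not an exhaustive mapping.

```r
speciesNames <- c("emu", "rhea", "ostrich")

# Base R analogue of stringr::str_detect(speciesNames, "e").
containsE <- grepl("e", speciesNames)

# Base R analogue of purrr::map_int(speciesNames, nchar).
nameLengths <- vapply(speciesNames, nchar, integer(1), USE.NAMES = FALSE)
```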

Creating R Packages

Documentation

  • Use hand-written .Rd files.
  • Do not use Roxygen-style comments.

File Names

  • File names should begin with a capital letter, may contain letters and numbers, and should use either underscores or UpperCamelCase to concatenate multiple words.
  • Script file names should end in .R.
  • R Markdown file names should end in .Rmd.
  • Sweave file names should end in .Rnw.