November 12, 2017

Trick or Tips?

Ever stumbled on a code chunk that made you say:
"I should have known this ¶ø?!@~&* piece of code long ago!?"
Chances are you have, frustratingly, just like we have, and on multiple occasions too.

In comes Trick or Tips!

Trick or Tips is a series of blog posts that each
present 5 -- hopefully helpful -- coding tips for a specific programming
language. Posts should be short (i.e. no more than 5 lines of code,
max 80 characters per line, except when appropriate) and provide tips of
many kinds: a function, a way of combining functions, a single argument,
a note about the philosophy of the language and its practical consequences,
tricks to improve the way you code, good practices, etc.

Note that while some tips might be obvious to careful documentation readers
(God bless them for their wisdom), we do our best to present what we find very
useful and underestimated. By the way, there are undoubtedly similar
initiatives on the web (e.g. the "One R Tip a Day" Twitter account). Also, feel
free to comment below with tip ideas or a post of code tips of your own, which
we will be happy to incorporate into Trick or Tips.
Enjoy and get ready to frustratingly appreciate our tips!

The drop argument of the [] operator

This is neither obvious nor well known, but there is a logical argument, drop, that can be passed to the [] operator, and I’ll try to explain why it can be useful! Let’s first create a data frame with ten rows and three columns:
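A minimal sketch of such a data frame (the column names and values are made up for illustration), followed by two extractions that show how [] simplifies a single column to a vector:

```r
# Hypothetical data: ten rows, three columns
df <- data.frame(
  var1 = 1:10,
  var2 = letters[1:10],
  var3 = seq(0.1, 1, by = 0.1)
)

# Two columns: the result is still a data frame
class(df[, c("var1", "var2")])  # "data.frame"

# One column: the result is silently simplified to a vector
class(df[, "var1"])             # "integer"
```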

When you extract a single column from a data frame with [], you get a vector back rather than a data frame. This behavior is actually very useful in many cases, as we are often happy to deal with a vector when we extract only one column. However, it might become an issue when we do extractions without knowing beforehand how many columns will be extracted (typically when extracting according to a request that can return any number of columns). In such a case, if the number is one, we end up with a vector instead of a data frame. The argument drop provides a workaround. By default it is set to TRUE and a one-column data frame becomes a vector, but using drop=FALSE prevents this from happening. Let’s try this:
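Here is a small sketch of the difference. The data frame and the cols vector are hypothetical; cols stands for a selection whose length we would not know in advance:

```r
df <- data.frame(var1 = 1:10, var2 = letters[1:10])
cols <- "var1"  # hypothetical selection that happens to hold one name

res_default <- df[, cols]             # drop = TRUE by default
res_keep <- df[, cols, drop = FALSE]  # keep the data frame structure

class(res_default)  # "integer"
class(res_keep)     # "data.frame"
dim(res_keep)       # 10 1
```

With drop = FALSE, the code that comes next can assume a data frame whatever the length of cols.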

Get the citation of a package

Many researchers (this is especially true in ecology) use R to carry out analyses and write papers for their research. When the time comes to cite a package, I guess they wonder how to do it. However, package authors actually provide this information in their package! Let’s have a look at the reference for the package knitr, as of version 1.17, using the function citation():
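A minimal sketch, assuming knitr is installed (the exact reference printed depends on the installed version). toBibtex() turns the same reference into a BibTeX entry:

```r
# Assumes the knitr package is installed
ref <- citation("knitr")

ref            # human-readable reference(s)
toBibtex(ref)  # the same reference(s) as BibTeX entries
```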

Even if you are not a LaTeX user, the BibTeX format is very helpful, as a BibTeX file can be read by reference managers such as Zotero. So now let’s say I use the following command line:

cat(toBibtex(citation("knitr")), file='biblio.bib', sep='\n')

Then the biblio.bib file just created can be imported into your favorite reference manager.

Using namespace

In R, functions are stored in packages, and adding a package is like adding a collection of functions. As you get more experienced with R, you likely know and use more and more packages. You might even come to the point where you have functions that have the same name but originate from different packages. If not, let me show you something:
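A minimal sketch, assuming the magrittr package is installed. The data frame df is hypothetical apart from its column var1:

```r
# Assumes the magrittr package is installed
library(magrittr)

df <- data.frame(var1 = 1:10, var2 = letters[1:10])

# extract() is magrittr's alias for `[`, so this is df["var1"]
extract(df, "var1")
```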

Here I use the function extract() from the magrittr package, which acts as [], to extract the column var1 from df. This function is actually designed to be used with pipes (if this sounds weird, have a look at the magrittr package). For instance, when piping you can write df %$% extract(var1) or even df %>% '['('var1') and this will give you the same column. So far, so good. Now I load the raster package:
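A hedged sketch of that step, assuming both magrittr and raster are installed. The call is wrapped in try() so the script keeps running, and the whole thing is skipped when raster is not available; df is the same hypothetical data frame as before:

```r
library(magrittr)
df <- data.frame(var1 = 1:10, var2 = letters[1:10])

if (requireNamespace("raster", quietly = TRUE)) {
  # Attached after magrittr, so raster's extract() is found first
  library(raster)
  # The same call as before now dispatches to raster's extract()
  res <- try(extract(df, "var1"), silent = TRUE)
  print(class(res))
}
```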

It does not work… Why? Briefly, extract() from raster is now the one being called (this is what the message printed on load said), and it does not get along well with data frames (this is the meaning of the error message). To overcome this, you can use an explicit namespace: put the name of the package followed by :: in front of the function name. This is basically the unique identifier of the function. Indeed, within a given package functions have different names, and on CRAN packages must have different names, so the combination of the two is unique (this holds true if you only use packages from CRAN). Let’s use it:
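A minimal sketch, assuming magrittr is installed. Because the package is named explicitly, the call works whichever other packages are attached (df is a hypothetical data frame with a column var1):

```r
df <- data.frame(var1 = 1:10, var2 = letters[1:10])

# Explicit namespace: always magrittr's extract(), never raster's
res <- magrittr::extract(df, "var1")
class(res)  # "data.frame"
```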

Using this is also very helpful when you develop a package that uses functions from different packages. Even if you only write scripts, when you use a large number of functions from various packages it can be better to keep track of which package each function comes from. Finally, note that this is not specific to R at all; it is actually very common in programming languages.

How to use non-exported functions?

Packages often contain functions that are not exported. These are often helper functions called by the exported functions, and they help structure the code of the package. However, when you try to understand how a package works, you may want to spend some time understanding how these functions work (especially given that they are not documented). There is actually a way to call them! Instead of using two colons (::), use three (:::)! Let’s have a look at the code of one of these functions from the knitr package (again version 1.17):

knitr:::.color.block

Interesting, isn’t it! To give you an idea of how frequent this is, this package has 103 exported functions and 425 non-exported ones. Below are a few examples of exported functions, followed by non-exported ones.
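One possible way to get such lists, sketched under the assumption that knitr is installed (this is not necessarily how the numbers above were obtained, and the counts will differ between versions):

```r
# Assumes the knitr package is installed
ns <- asNamespace("knitr")
objs <- ls(ns, all.names = TRUE)

# Keep only the objects of the namespace that are functions
is_fun <- vapply(
  objs, function(n) is.function(get(n, envir = ns)), logical(1)
)
funs <- objs[is_fun]

exported <- intersect(funs, getNamespaceExports("knitr"))
hidden <- setdiff(funs, exported)  # reachable only through :::

length(exported)
length(hidden)
head(sort(exported))
head(sort(hidden))
```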