esoteric R | Introducing Closures

Jeffrey A. Ryan, January 1, 2011

The R language provides object-oriented programming
through two primary systems, known as S3 and S4.
S3 implements a class-based dispatch mechanism, while S4 offers a more
traditional object-oriented scheme. Both implementations utilize list-style
constructs for objects and separate data from methods. A third
mechanism, closures, offers the programmer the option of integrating
methods within objects. This can be used as a lightweight object
design with benefits that neither S3 nor S4 offers.

The Basics of a Closure

A closure in R pairs a function with the environment in which it was
created. Here we use the term for an object that contains such functions,
all bound to the environment the closure was created in.
These functions maintain access to the scope in which they
were defined, allowing for powerful design patterns that
are difficult with the standard S3/S4
approach to objects in R.
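
As a minimal sketch of what "maintaining access to the scope in which they were defined" means, consider a counter generator. The names make_counter and count are hypothetical and chosen only for illustration.

```r
# make_counter() is a hypothetical helper used only for illustration.
make_counter <- function() {
  count <- 0                 # lives in the environment of this call
  function() {
    count <<- count + 1      # updates `count` in the enclosing environment
    count
  }
}

a <- make_counter()
b <- make_counter()
a()   # 1
a()   # 2
b()   # 1; b carries its own `count`
```

Each call to make_counter() creates a fresh environment, so the two counters never interfere with one another.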

To create closures, we use R's environment
object. This allows data and methods to
reside within the object instances,
making self-aware behavior and selective
inheritance easy. It's even possible to
mix this with traditional R by assigning a class to the environment.

We'll start the exploration with an example of functionality found in
other interpreted languages: the stack [1].

Example: A Stack in R

A stack implementation consists of three main components:

a container variable, a.k.a. the stack

a push method to add elements

a pop method to remove elements

The general idea is to be
able to add elements to a container and to modify
the container in place. In R this is possible using some assignment tricks into
the .GlobalEnv, but that approach can be fraught
with unintended consequences. Closures offer us a perfect alternative to
keep surprises to a minimum.
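
To make the hazard concrete, here is a rough sketch of the global-assignment approach. The names global_stack and push_global are hypothetical, and this is only one of several ways such a trick could be written.

```r
# `global_stack` and push_global() are hypothetical names for illustration.
assign("global_stack", vector(), envir = .GlobalEnv)

push_global <- function(x) {
  current <- get("global_stack", envir = .GlobalEnv)
  assign("global_stack", c(current, x), envir = .GlobalEnv)
}

push_global(1)
push_global(2)
get("global_stack", envir = .GlobalEnv)   # 1 2

# The hazard: any other code may reuse the same global name.
assign("global_stack", "oops", envir = .GlobalEnv)
push_global(3)
get("global_stack", envir = .GlobalEnv)   # "oops" "3"
```

Nothing protects the container from unrelated code, and only one such stack can exist per name.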

First, we'll create a single environment that will act as the container
and then add into that environment a stack vector (.Data) and the two
methods, push and pop.
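
A first attempt along these lines might look like the following sketch. The names s and .Data follow the text; the function bodies are an assumption about one reasonable way to write them, and the functions are assumed to be defined at top level.

```r
# First attempt: an environment holding the data and two methods.
s <- new.env()
s$.Data <- vector()

# Append a value to the stack vector.
s$push <- function(x) .Data <<- c(.Data, x)

# Remove the last value from the stack vector and return it.
s$pop <- function() {
  tmp <- .Data[length(.Data)]
  .Data <<- .Data[-length(.Data)]
  tmp
}
```

Nothing is called yet; these definitions only store the data and the two functions inside s.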

We are
using the double arrow <<- assignment operator
in the push function so that assignment
searches outward through the enclosing environments until it finds
an existing variable to bind to. This allows for non-local modifications
to our .Data variable.
The push method appends new data to the stack,
and pop removes the last element
of the stack and returns it to the caller. We can use the
$ operator to access the internal methods of
our environment.

s$push(1)
Error in s$push(1) : object '.Data' not found

Oops, something is wrong. It turns out that <<- can't find the
.Data object
stored in the s object. We haven't matched the environment
of the function to the object's environment.
R isn't starting its search for .Data in the correct
location; it needs more information.
The functions environment and as.environment work well here.

environment(s$push)
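
One way to supply that information, sketched here under the assumption that push and pop are plain functions stored in s, is to reassign each function's enclosing environment to s itself. The function bodies are illustrative rather than a definitive implementation.

```r
s <- new.env()
s$.Data <- vector()
s$push <- function(x) .Data <<- c(.Data, x)
s$pop <- function() {
  tmp <- .Data[length(.Data)]
  .Data <<- .Data[-length(.Data)]
  tmp
}

# Point each function at the environment that holds .Data, so that
# both the lookup of .Data and the <<- assignment start from s.
environment(s$push) <- as.environment(s)
environment(s$pop)  <- as.environment(s)

s$push(1)
s$push(2)
s$.Data   # 1 2
s$pop()   # 2
```

With the environments matched, the search for .Data begins in s and the methods modify the container in place.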

We can use S3
classes to create push and pop methods
so that the calls look more like normal R.
That completes our stack
object. Unfortunately, we currently need to recreate most of the above code
for each new "stack" object we'd like to create. A much better approach
would be to wrap this code in a constructor function.
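
A constructor along these lines could be sketched as follows. The name new_stack is the one used later in the text; the body is an assumption that simply packages the earlier steps.

```r
new_stack <- function() {
  stack <- new.env()
  stack$.Data <- vector()
  stack$push <- function(x) .Data <<- c(.Data, x)
  stack$pop <- function() {
    tmp <- .Data[length(.Data)]
    .Data <<- .Data[-length(.Data)]
    tmp
  }
  # Bind the methods to the new object's environment.
  environment(stack$push) <- as.environment(stack)
  environment(stack$pop)  <- as.environment(stack)
  class(stack) <- "stack"
  stack
}

# Each call yields an independent stack.
a <- new_stack()
b <- new_stack()
a$push(1)
b$push("x")
a$.Data   # 1
b$.Data   # "x"
```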

Not only can we now create stacks easily, we can also use this
to extend the class with new functionality via inheritance.

Example: Making a Better Stack

An interesting extension to our example comes
from extending our stack object with
additional "shift" and "unshift" methods.
Using the new_stack constructor, we
can extend the "stack" object
to a new class called "betterstack".
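
The extension might be sketched like this. The constructor name new_betterstack is used here for illustration, and the direction of the two new methods is an assumption: following the Perl/JavaScript convention, unshift adds to the front and shift removes from the front. The text does not pin this down.

```r
new_stack <- function() {
  stack <- new.env()
  stack$.Data <- vector()
  stack$push <- function(x) .Data <<- c(.Data, x)
  stack$pop <- function() {
    tmp <- .Data[length(.Data)]
    .Data <<- .Data[-length(.Data)]
    tmp
  }
  environment(stack$push) <- as.environment(stack)
  environment(stack$pop)  <- as.environment(stack)
  class(stack) <- "stack"
  stack
}

new_betterstack <- function() {
  stack <- new_stack()   # start from a regular stack: .Data, push, pop
  # Assumed convention: unshift prepends, shift removes the first element.
  stack$unshift <- function(x) .Data <<- c(x, .Data)
  stack$shift <- function() {
    tmp <- .Data[1]
    .Data <<- .Data[-1]
    tmp
  }
  environment(stack$unshift) <- as.environment(stack)
  environment(stack$shift)   <- as.environment(stack)
  class(stack) <- c("betterstack", "stack")
  stack
}

nb <- new_betterstack()
nb$push(1)
nb$push(2)
nb$unshift(0)
nb$.Data    # 0 1 2
nb$shift()  # 0
nb$pop()    # 2
```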

To make the experience more R-like, we again add S3
methods for shift and unshift, as we did for push
and pop. Putting it all together gets us a nice stack-like
object for R.
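
Putting the pieces together might look like the sketch below. The constructors are repeated so the example runs on its own; the generic signatures and the shift/unshift direction (unshift prepends, shift removes the first element) are assumptions.

```r
new_stack <- function() {
  stack <- new.env()
  stack$.Data <- vector()
  stack$push <- function(x) .Data <<- c(.Data, x)
  stack$pop <- function() {
    tmp <- .Data[length(.Data)]
    .Data <<- .Data[-length(.Data)]
    tmp
  }
  environment(stack$push) <- as.environment(stack)
  environment(stack$pop)  <- as.environment(stack)
  class(stack) <- "stack"
  stack
}

new_betterstack <- function() {
  stack <- new_stack()
  stack$unshift <- function(x) .Data <<- c(x, .Data)
  stack$shift <- function() {
    tmp <- .Data[1]
    .Data <<- .Data[-1]
    tmp
  }
  environment(stack$unshift) <- as.environment(stack)
  environment(stack$shift)   <- as.environment(stack)
  class(stack) <- c("betterstack", "stack")
  stack
}

# S3 generics and methods; push/pop dispatch on the inherited "stack" class.
push    <- function(x, value, ...) UseMethod("push")
pop     <- function(x, ...) UseMethod("pop")
unshift <- function(x, value, ...) UseMethod("unshift")
shift   <- function(x, ...) UseMethod("shift")
push.stack          <- function(x, value, ...) x$push(value)
pop.stack           <- function(x, ...) x$pop()
unshift.betterstack <- function(x, value, ...) x$unshift(value)
shift.betterstack   <- function(x, ...) x$shift()

nb <- new_betterstack()
push(nb, 1)
push(nb, 2)
unshift(nb, 0)
pop(nb)     # 2
shift(nb)   # 0
nb$.Data    # 1
```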

Conclusion

In this first installment on closures in R we covered a
few of the basics: creating objects using environment
objects, adding methods that act on private data, and even
incorporating this into the traditional S3 landscape.
Some simple usage patterns one may encounter would include
keeping track of 'static' data without relying on global
variables (hint: create incr and decr
methods for the .Data) or
allowing for method overrides by instance.
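
As a sketch of both hints, assuming a hypothetical new_counter constructor built in the same style as the stack, with incr and decr methods acting on .Data:

```r
# new_counter() is a hypothetical constructor used only for illustration.
new_counter <- function() {
  counter <- new.env()
  counter$.Data <- 0
  counter$incr <- function() .Data <<- .Data + 1
  counter$decr <- function() .Data <<- .Data - 1
  environment(counter$incr) <- as.environment(counter)
  environment(counter$decr) <- as.environment(counter)
  counter
}

hits <- new_counter()
hits$incr()
hits$incr()
hits$decr()
hits$.Data   # 1

# Per-instance override: only this counter steps by 10.
big <- new_counter()
big$incr <- function() .Data <<- .Data + 10
environment(big$incr) <- as.environment(big)
big$incr()
big$.Data    # 10
```

Because the methods live inside each object, replacing one on a single instance leaves every other instance untouched.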

In future articles we'll examine some of the more nuanced behavior of
closures in general, as well as explore how R's implementation is different
from implementations in other well-known programming languages.

esotericR™ is edited and published
by lemnica.
It covers common parts of R in-depth, and examines the lesser known
aspects of programming with R – from beginner to advanced.
Submissions from authors, developers, and users are encouraged.

[1] A stack is a common data structure used in programming. It is based
on the idea of last in, first out (LIFO).
Typically stacks have methods that
allow data to be pushed onto, and popped off of, the stack. A good visual
analogy is that of a stack of dishes in a cafeteria line.

In functional languages like R, side-effects
such as the in-place modification in a stack are
discouraged — part of the notion
of least surprise. Sometimes the
reality of functionality must triumph over philosophy though.

Other articles
explain S3 classes
in detail, but for our needs it is sufficient to understand it as a lightweight
mechanism used in R to provide function dispatch depending on the
'class' of an object.

Note that we needn't reimplement .Data,
pop, or push. These are
inherited from the original with new_stack().
This is a major benefit when dealing with complex
structures that have many variables or methods. Stubs
can be defined and methods can be overwritten by
the child objects with ease.

Examples of both implementations can be found in
the IBrokers package, which
interfaces with the Interactive Brokers trading platform.
See the twsConnect and eWrapper objects in the
package on CRAN.

About the author

Jeffrey Ryan is the founder of lemnica corp.,
a Chicago firm specializing in statistical software,
training, and on-demand support.
He helps organize the R/Finance conference series
[www.RinFinance.com],
and is a frequent speaker on software related topics. He
is the author or co-author of a variety of R packages
involving finance, large data, and visualizations including
quantmod, xts, Defaults, IBrokers, RBerkeley, mmap, and indexing.
He currently lives in Chicago, Illinois with his wife and three children.