CRAN Task View: Reproducible Research

The goal of reproducible research is to tie specific instructions to data analysis and experimental data so that scholarship can be recreated, better understood and verified.

R largely facilitates reproducible research through literate programming, in which a single document combines content and data analysis code. The
Sweave
function (in the base R utils package) and the
knitr
package can be used to blend the subject matter and R code so that a single document defines the content and the algorithms.
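
As a minimal sketch of this workflow, the following uses only the Sweave function from the utils package. The file name example.Rnw, the chunk label and the built-in cars data set are illustrative choices, not part of this task view.

```r
# Write a small Sweave (.Rnw) source that mixes LaTeX text and R code.
# The file name and the chunk label are illustrative.
rnw_file <- file.path(tempdir(), "example.Rnw")
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "The mean speed in the cars data is \\Sexpr{mean(cars$speed)}.",
  "<<speedsummary, echo=TRUE>>=",
  "summary(cars$speed)",
  "@",
  "\\end{document}"
), rnw_file)

# Sweave() writes its output to the working directory, so switch to
# the temporary directory while weaving and restore it afterwards.
old_wd <- setwd(tempdir())
utils::Sweave(rnw_file, quiet = TRUE)
setwd(old_wd)

# The woven LaTeX file now holds the text, the code and its output.
tex_file <- file.path(tempdir(), "example.tex")
file.exists(tex_file)
```

The resulting .tex file can then be compiled with a LaTeX installation; that step is outside this sketch.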

Basic packages can be structured into the following groups:

LaTeX Markup
:
The
Hmisc,
xtable
and
tables
packages contain functions to write R objects into LaTeX representations.
Hmisc
also includes methods for translating strings to proper LaTeX markup (e.g., ">=" to "$\geq$"). Animations can be inserted into LaTeX documents being converted to PDF via the
animation
package. The
pictex
function in the base grDevices package is a PicTeX graphics driver and the
tikzDevice
package can convert R graphics to
TikZ
markup. The
tth
package can convert TeX to HTML.

HTML Markup
:
The
R2HTML
package has drivers that allow
Sweave
to process HTML documents. Packages
R2HTML,
hwriter
and
ReporteRs
can be used to build HTML pages sequentially.
R2HTML,
xtable
and
hwriter
can also convert some R objects into HTML representations.
knitr
also has facilities to weave R code with HTML as well as convert markdown to HTML.
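
As a small illustration of converting an R object into an HTML representation, the sketch below uses xtable, assuming that package is installed. The cars data set and the caption text are illustrative.

```r
# Convert the first rows of a data frame to an HTML table with xtable.
# Assumes the xtable package is installed; the example is skipped otherwise.
if (requireNamespace("xtable", quietly = TRUE)) {
  tab <- xtable::xtable(head(cars), caption = "First rows of the cars data")
  # print.results = FALSE returns the markup as a string instead of printing it.
  html <- print(tab, type = "html", print.results = FALSE)
  cat(html)
} else {
  message("The xtable package is not installed; skipping the example.")
}
```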

ODF Markup
:
The
odfWeave
package extends
Sweave
to the
Open Document Format
. Word processing tools, such as OpenOffice.org, can then be used to blend content and programs. Many word processors can be used to translate the ODF document to other formats (e.g., Word, PDF or HTML).

Microsoft Formats
:
The
R2wd
and
R2PPT
packages for Windows can be used to communicate between R and Word or PowerPoint via the COM interface. Document elements (e.g., sections, text and images) that are created in R can be inserted into the document from R. The
rtf
package can also be used to create RTF format documents directly from R. Commercial R products that work with RTF and/or Word are
RTFGen
,
Inference for R
and
SWord
. The output from other packages (odfWeave
and
R2HTML) can also be opened by Word.
ReporteRs
can be used to create Word and PowerPoint documents.
RExcel
can integrate code with Microsoft Excel. Additionally, the
table1xls
package can convert summary tables to
Excel files.

Plain Text Formats
:
R code and output in
Sweave
files can be converted into
AsciiDoc
and other structured text formats using the
ascii
package. The
markdown
and
knitr
packages have tools for
markdown
format.
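
The markdown route can be sketched with knitr alone, assuming that package is installed. The report text below is illustrative; knit() is given the source as a character vector rather than a file, and the code fence is built with strrep() only to keep backticks out of the string literals.

```r
# Weave R code embedded in markdown text with knitr.
# Assumes the knitr package is installed; the example is skipped otherwise.
if (requireNamespace("knitr", quietly = TRUE)) {
  fence <- strrep("`", 3)  # the three-backtick markdown code fence
  src <- c(
    "# A tiny report",
    "",
    paste0(fence, "{r}"),
    "summary(cars$speed)",
    fence
  )
  # With text = ..., knit() returns the woven markdown as a string.
  md <- knitr::knit(text = src, quiet = TRUE)
  cat(md)
} else {
  message("The knitr package is not installed; skipping the example.")
}
```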

Syntax Highlighting
:
The
SweaveListingUtils
package can provide enhanced control over how R code chunks and their output are rendered in LaTeX.

Caching of R Objects
:
The
weaver
package allows caching of specific code chunks. The
R.cache
package can also be used but is not integrated with
Sweave.
knitr
also has the ability to cache the results of code chunks.
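
A minimal sketch of chunk caching with knitr follows, assuming that package is installed. The chunk label and its contents are hypothetical. The document is knitted twice from a temporary working directory because knitr stores cache files relative to the working directory by default; on the second run the chunk's results are expected to be loaded from the cache rather than recomputed.

```r
# Knit the same document twice with cache=TRUE on one chunk.
# Assumes the knitr package is installed; the example is skipped otherwise.
if (requireNamespace("knitr", quietly = TRUE)) {
  fence <- strrep("`", 3)  # the three-backtick markdown code fence
  src <- c(
    paste0(fence, "{r expensive-step, cache=TRUE}"),
    "x <- sum(1:10) - 13   # stands in for a slow computation",
    "x",
    fence
  )
  old_wd <- setwd(tempdir())   # cache files go under the working directory
  first  <- knitr::knit(text = src, quiet = TRUE)
  second <- knitr::knit(text = src, quiet = TRUE)  # reuses the cached chunk
  setwd(old_wd)
  cat(second)
} else {
  message("The knitr package is not installed; skipping the example.")
}
```

Changing the code inside the chunk invalidates its cache, so the chunk is evaluated again on the next run.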

Others
:
The
brew
and
R.rsp
packages contain alternative approaches to embedding R code into various markups.
knitr
is a comprehensive package derived from
Sweave
that includes code formatting, highlighting, caching, fine control of graphics, conditional evaluation, multiple markup formats and other features. The
pander
package can write R objects into
Pandoc's markdown
and can also convert those or complex reports to PDF/HTML/docx/ODT. The
rapport
package builds on
pander
and provides a way to create reproducible statistical report templates with graphs, tables and annotations that can be applied to any R data frame, with the results exported in different formats. The
installr
package for Windows can download and install MiKTeX, Pandoc (and other software), as well as quickly update R itself.
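
As one example of these alternative approaches, the brew package evaluates R expressions placed between template delimiters. The sketch below assumes brew is installed; the template text is illustrative.

```r
# Expand a small brew template: <%= ... %> is replaced by the value
# of the enclosed R expression.
# Assumes the brew package is installed; the example is skipped otherwise.
if (requireNamespace("brew", quietly = TRUE)) {
  template <- "The mean of 1:10 is <%= mean(1:10) %>.\n"
  # brew() writes to standard output by default, so capture it as text.
  brewed <- capture.output(brew::brew(text = template))
  print(brewed)
} else {
  message("The brew package is not installed; skipping the example.")
}
```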

An incomplete list of packages which facilitate literate programming for specific types of analysis or objects:

The base R utils package has generic functions to convert objects to LaTeX (via
toLatex) and BibTeX (via
toBibtex). The
bibtex
package can also be used to parse BibTeX files.
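
Both generics can be tried with base R alone. The sketch below applies toBibtex to the citation for R itself and toLatex to the current session information; the choice of these two objects is illustrative.

```r
# toBibtex(): a BibTeX entry for citing R itself.
bib <- toBibtex(citation())
print(bib)

# toLatex(): the current session information as a LaTeX itemize list.
ltx <- toLatex(sessionInfo())
print(ltx)
```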

Functions for creating LaTeX representations of summary statistics and visualizations can be found in the
Hmisc,
reporttools, and
r2lh
packages.
Hmisc
also has functions for marking up data frames and the
quantreg
and
memisc
packages can mark up matrices.

Cross-tabulations can be converted to LaTeX code using the
Hmisc
and
memisc
packages.

The
xtable
and
rms
packages provide LaTeX representations of some common models (e.g., the Cox proportional hazards model). For example, processing an
aov
object with the
xtable
function will generate LaTeX markup of the ANOVA table. Similarly, methods exist for
glm,
prcomp,
ts
and other types of objects.
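
The aov case can be sketched as follows, assuming the xtable package is installed. The PlantGrowth data set, the model formula and the caption are illustrative.

```r
# Fit a one-way ANOVA and render its table as LaTeX markup with xtable.
# Assumes the xtable package is installed; the example is skipped otherwise.
if (requireNamespace("xtable", quietly = TRUE)) {
  fit <- aov(weight ~ group, data = PlantGrowth)
  anova_tab <- xtable::xtable(fit, caption = "ANOVA for plant weight by group")
  # print.results = FALSE returns the LaTeX markup as a string.
  latex <- print(anova_tab, print.results = FALSE)
  cat(latex)
} else {
  message("The xtable package is not installed; skipping the example.")
}
```

The returned string can be placed in a LaTeX document or emitted from a Sweave or knitr chunk.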

The
texreg
package has functions to create LaTeX and HTML representations of one or more model objects (e.g.,
lm,
lme4). The
stargazer
package has similar functionality for showing models and summary tables in LaTeX and ASCII.