Chapter 1. Getting Started and Getting Help

Introduction

This chapter sets the groundwork for the other chapters. It explains
how to download, install, and run R.

More importantly, it also explains how to get answers to your
questions. The R community provides a wealth of documentation and
help. You are not alone. Here are some common sources of
help:

Local, installed documentation

When you install R on your computer, a mass of
documentation is also installed. You can browse the local
documentation (Recipe 1.6) and search it (Recipe 1.8). I am amazed how often I search the Web
for an answer only to discover it was already available in the
installed documentation.

Task views

A task view describes packages
that are specific to one area of statistical work, such as
econometrics, medical imaging, psychometrics, or spatial statistics.
Each task view is written and maintained by an expert in the field.
There are 28 such task views, so there is likely to be one or more
for your areas of interest. I recommend that every beginner find and
read at least one task view in order to gain a sense of R’s
possibilities (Recipe 1.11).

Package documentation

Most packages include useful documentation. Many also
include overviews and tutorials, called
vignettes in the R community. The
documentation is kept with the packages in package repositories,
such as CRAN, and it
is automatically installed on your machine when you install a
package.

Mailing lists

Volunteers have generously donated many hours of time
to answer beginners’ questions that are posted to the R mailing
lists. The lists are archived, so you can search the archives for
answers to your questions (Recipe 1.12).

Question and answer (Q&A) websites

On a Q&A site, anyone can post a question, and
knowledgeable people can respond. Readers vote on the answers, so
the best answers tend to emerge over time. All this information is
tagged and archived for searching. These sites are a cross between a
mailing list and a social network; the Stack Overflow site is a
good example.

The Web

The Web is loaded with information about R, and there are
R-specific tools for searching it (Recipe 1.10).
The Web is a moving target, so be on the lookout for new, improved
ways to organize and search information regarding R.

Click on “CRAN”. You’ll see a list of mirror sites,
organized by country.

Select a site near you.

Click on “MacOS X”.

Click on the .pkg file for the
latest version of R, under “Files:”, to download it.

When the download completes, double-click on the
.pkg file and answer the usual
questions.

Linux or Unix

The major Linux distributions have packages for
installing R. Here are some examples:

Distribution         Package name
Ubuntu or Debian     r-base
Red Hat or Fedora    R.i386
Suse                 R-base

Use the system’s package manager to download and install the
package. Normally, you will need the root password or
sudo privileges; otherwise, ask a system
administrator to perform the installation.

Discussion

Installing R on Windows or OS X is straightforward because there
are prebuilt binaries for those platforms. You need only follow the
preceding instructions. The CRAN Web pages also contain links to
installation-related resources, such as frequently asked questions
(FAQs) and tips for special situations (“How do I install R when using
Windows Vista?”) that you may find useful.

Theoretically, you can install R on Linux or Unix in one of two
ways: by installing a distribution package or by building it from
scratch. In practice, installing a package is the preferred route. The
distribution packages greatly streamline both the initial installation
and subsequent updates.

On Ubuntu or Debian, use apt-get to download and
install R. Run under sudo to have the necessary
privileges:

$ sudo apt-get install r-base

On Red Hat or Fedora, use yum:

$ sudo yum install R.i386

Most platforms also have graphical package managers, which you might find more
convenient.

Beyond the base packages, I recommend installing the documentation packages, too. On my Ubuntu machine, for
example, I installed r-base-html (because I like
browsing the hyperlinked documentation) as well as
r-doc-html, which installs the important R manuals
locally:

$ sudo apt-get install r-base-html r-doc-html

Some Linux repositories also include prebuilt copies of R packages
available on CRAN. I don’t use them because I’d rather get my software
directly from CRAN itself, which usually has the freshest
versions.

In rare cases, you may need to build R from scratch. You might have an obscure,
unsupported version of Unix; or you might have special considerations
regarding performance or configuration. The build procedure on Linux or
Unix is quite standard. Download the tarball from the home page of your
CRAN mirror; it’s called something like
R-2.12.1.tar.gz, except the “2.12.1” will be
replaced by the latest version. Unpack the tarball, look for a file
called INSTALL, and follow the directions.

See Also

R in a Nutshell (O’Reilly) contains more details of downloading and
installing R, including instructions for building the Windows and
OS X versions. Perhaps the ultimate guide is the one entitled
R Installation and Administration, available on CRAN, which describes
building and installing R on a variety of platforms.

This recipe is about installing the base package. See Recipe 3.9 for installing add-on packages from
CRAN.

1.2. Starting R

Problem

You want to run R on your computer.

Solution

Windows

Click on Start → All
Programs → R; or double-click on
the R icon on your desktop (assuming the installer created an icon
for you).

OS X

Either click on the icon in the
Applications directory or put the R icon on
the dock and click on the icon there. Alternatively, you can just
type R on a Unix command line in a
shell.

Linux or Unix

Start the R program from the shell prompt using the
R command (uppercase R).

Discussion

How you start R depends upon your platform.

Starting on Windows

When you start R, it opens a new window. The window
includes a text pane, called the R Console, where you enter R
expressions (see Figure 1-1).

Figure 1-1. R on Windows

There is an odd thing about the Windows Start menu for R. Every
time you upgrade to a new version of R, the Start menu expands to
contain the new version while keeping all the previously installed
versions. So if you’ve upgraded, you may face several choices such as “R
2.8.1”, “R 2.9.1”, “R 2.10.1”, and so forth. Pick the newest one. (You
might also consider uninstalling the older versions to reduce the
clutter.)

Using the Start menu is cumbersome, so I suggest starting R in one
of two other ways: by creating a desktop shortcut or by double-clicking on your .RData file.

The installer may have created a desktop icon. If not, creating a shortcut is easy: follow the Start menu to the
R program, but instead of left-clicking to run R, press and hold your
mouse’s right button on the program name, drag the program name to your
desktop, and release the mouse button. Windows will ask if you want to
Copy Here or Move Here. Select Copy Here, and the shortcut will appear
on your desktop.

Another way to start R is by double-clicking on a
.RData file in your working directory. This is the
file that R creates to save your workspace. When you first set up a
project directory, start R and change to that directory. Save your workspace
there, either by exiting or by using the save.image function. That will create
the .RData file. Thereafter, you can simply open
the directory in Windows Explorer and then double-click on the
.RData file to start R.

Perhaps the most baffling aspect of starting R on Windows is
embodied in a simple question: When R starts, what is the working directory? The answer, of course, is that “it
depends”:

If you start R from the Start menu, the working directory is
normally either C:\Documents
and Settings\<username>\My Documents
(Windows XP) or
C:\Users\<username>\Documents (Windows
Vista, Windows 7). You can override this default by setting the
R_USER environment variable to an alternative
directory path.

If you start R from a desktop shortcut, you can specify an
alternative startup directory
that becomes the working directory when R is started. To specify the
alternative directory, right-click on the shortcut, select
Properties, enter the directory path in the box labeled “Start in”,
and click OK.

Starting R by double-clicking on your
.RData file is the most straightforward
solution to this little
problem. R will automatically change its working directory to be the
file’s directory, which is usually what you want.

In any event, you can always use the getwd
function to discover your current working directory (Recipe 3.1).
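
As a minimal sketch of checking and changing the working directory from the R prompt: the companion function setwd changes it. The use of tempdir() as the target is only an illustrative assumption; substitute your own project directory.

```r
# Report the current working directory and remember it
old <- getwd()
print(old)

# Change to another directory; tempdir() is just a stand-in
# for your own project directory
setwd(tempdir())
print(getwd())

# Return to where we started
setwd(old)
```

Whichever way you started R, this lets you confirm where files such as .RData will be read from and written to.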

Just for the record, Windows also has a console version of R called Rterm.exe. You’ll find it in the
bin subdirectory of your R installation. It is much
less convenient than the graphic user interface (GUI) version, and I
never use it. I recommend it only for batch (noninteractive) usage such
as running jobs from the Windows scheduler. In this book, I assume you
are running the GUI version of R, not the console version.

Starting on OS X

Run R by clicking the R icon in the
Applications folder. (If you use R frequently, you
can drag it from the folder to the dock.) That will run the GUI version,
which is somewhat more convenient than the console version. The GUI
version displays your working directory, which is initially your home
directory.

OS X also lets you run the console version of R by typing
R at the shell prompt.

Starting on Linux and Unix

Start the console version of R from the Unix shell prompt
simply by typing R, the name of the program. Be
careful to type an uppercase R, not a lowercase
r.

The R program has a bewildering number of command line options.
Use the --help option to see the complete
list.

1.3. Entering Commands

Problem

Solution

Simply enter expressions at the command prompt. R will evaluate
them and print (display) the result. You can use command-line editing to
facilitate typing.

Discussion

R prompts you with “>”. To get
started, just treat R like a big calculator: enter an expression, and R will evaluate the
expression and print the result:

> 1+1
[1] 2

The computer adds one and one, giving two, and displays the
result.

The [1] before the 2
might be confusing. To R, the result is a vector, even though it has
only one element. R labels the value with [1] to
signify that this is the first element of the vector...which is not
surprising, since it’s the only element of the
vector.
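
You can confirm this for yourself with a few one-liners. This is only a small sketch; the variable name x is arbitrary.

```r
x <- 1 + 1
is.vector(x)   # TRUE: the result is a vector
length(x)      # 1: it has exactly one element
x[1]           # 2: the element that R labels [1]
```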

R will prompt you for input until you type a complete expression.
The expression max(1,3,5) is a complete expression,
so R stops reading input and evaluates what it’s got:

> max(1,3,5)
[1] 5

In contrast, “max(1,3,” is an incomplete
expression, so R prompts you for more input. The prompt changes from
greater-than (>) to plus (+),
letting you know that R expects more:

> max(1,3,
+ 5)
[1] 5

It’s easy to mistype commands, and retyping them is tedious and
frustrating. So R includes command-line editing to make life easier. It defines
single keystrokes that let you easily recall, correct, and reexecute
your commands. My own typical command-line interaction goes like
this:

I enter an R expression with a typo.

R complains about my mistake.

I press the up-arrow key to recall my mistaken line.

I use the left and right arrow keys to move the cursor back to
the error.

I use the Delete key to delete the offending
characters.

I type the corrected characters, which inserts them into the
command line.

I press Enter to reexecute the corrected command.

That’s just the basics. R supports the usual keystrokes for
recalling and editing command lines, as listed in Table 1-1.

Table 1-1. Keystrokes for command-line editing

Labeled key     Ctrl-key combination    Effect
Up arrow        Ctrl-P                  Recall previous command by moving backward through the history of commands.
Down arrow      Ctrl-N                  Move forward through the history of commands.
Backspace       Ctrl-H                  Delete the character to the left of the cursor.
Delete (Del)    Ctrl-D                  Delete the character to the right of the cursor.
Home            Ctrl-A                  Move cursor to the start of the line.
End             Ctrl-E                  Move cursor to the end of the line.
Right arrow     Ctrl-F                  Move cursor right (forward) one character.
Left arrow      Ctrl-B                  Move cursor left (back) one character.
                Ctrl-K                  Delete everything from the cursor position to the end of the line.
                Ctrl-U                  Clear the whole darn line and start over.
Tab                                     Name completion (on some platforms).

On Windows and OS X, you can also use the mouse to highlight commands and then use the usual copy
and paste commands to paste text into a new command line.

See Also

See Recipe 2.13. From the Windows main menu,
follow Help → Console for a complete
list of keystrokes useful for command-line editing.

1.4. Exiting from R

Problem

You want to exit from R.

Solution

Windows

Select File → Exit
from the main menu; or click on the red X in the upper-right
corner of the window frame.

OS X

Press CMD-q (apple-q); or click on the red X in the
upper-left corner of the window frame.

Linux or Unix

At the command prompt, press Ctrl-D.

On all platforms, you can also use the q function (as in
quit) to terminate the program.

> q()

Note the empty parentheses, which are necessary to call the
function.

Discussion

Whenever you exit, R asks if you want to save your workspace. You have three choices:

Save your workspace and exit.

Don’t save your workspace, but exit anyway.

Cancel, returning to the command prompt rather than
exiting.

If you save your workspace, then R writes it to a file called
.RData in the current working
directory. This will overwrite the previously saved workspace, if any,
so don’t save if you don’t like the changes to your workspace (e.g., if
you have accidentally erased critical data).
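
If you are worried about overwriting a good .RData file, one option is to save the workspace under a different name before you exit. The sketch below uses the file argument of save.image; the file name and the use of tempdir() are hypothetical choices for illustration.

```r
# Write the current workspace to a separate file instead of the
# default .RData. The file name here is a hypothetical example.
backup <- file.path(tempdir(), "workspace-backup.RData")
save.image(file = backup)

# Confirm that the file was written
file.exists(backup)
```

A workspace saved this way can be restored later with the load function, for example load(backup).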

See Also

1.6. Viewing the Supplied Documentation

Problem

Solution

Use the help.start function to see the
documentation’s table of contents:

> help.start()

From there, links are available to all the installed
documentation.

Discussion

The base distribution of R includes a wealth of
documentation—literally thousands of pages. When you install additional
packages, those packages contain documentation that is also installed on
your machine.

It is easy to browse this documentation via the
help.start function, which opens a window on the
top-level table of contents; see Figure 1-2.

Figure 1-2. Documentation table of contents

The two links in the Reference section are especially
useful:

Packages

Click here to see a list of all the installed
packages, both in the base packages and the additional, installed
packages. Click on a package name to see a list of its functions
and datasets.

Search Engine & Keywords

Click here to access a simple search engine, which
allows you to search the documentation by keyword or phrase. There
is also a list of common keywords, organized by topic; click one to see
the associated pages.

See Also

The local documentation is copied from the R Project
website, which may have updated documents.

1.7. Getting Help on a Function

Problem

You want to know more about a function that is installed
on your machine.

Solution

Use help to display the documentation for the
function:

> help(functionname)

Use args for a quick reminder of the function
arguments:

> args(functionname)

Use example to see examples of using the
function:

> example(functionname)

Discussion

I present many R functions in this book. Every R function has more
bells and whistles than I can possibly describe. If a function catches
your interest, I strongly suggest reading the help page for that
function. One of its bells or whistles might be very useful to
you.

Suppose you want to know more about the mean
function. Use the help function like this:

> help(mean)

This will either open a window with function documentation or
display the documentation on your console, depending upon your platform.
A shortcut for the help command is to simply type
? followed by the function name:

> ?mean

Sometimes you just want a quick reminder of the arguments to a function: What are they, and in what order
do they occur? Use the args function:
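
For instance, the calls for mean and sd look like the following sketch. The commented output is what recent versions of R print; details may differ slightly in other versions.

```r
args(mean)
# function (x, ...) 
# NULL

args(sd)
# function (x, na.rm = FALSE) 
# NULL
```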

The first line of output from args is a
synopsis of the function call. For mean, the synopsis
shows one argument, x, which is a vector of numbers.
For sd, the synopsis shows the same vector,
x, and an optional argument called
na.rm. (You can ignore the second line of output,
which is often just NULL.)

Most documentation for functions includes examples near
the end. A cool feature of R is that you can request that it execute the
examples, giving you a little demonstration of the function’s
capabilities. The documentation for the mean
function, for instance, contains examples, but you don’t need to type
them yourself. Just use the example function to watch them
run:
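
A minimal sketch of the call follows. No output is shown here because the examples, and therefore what gets printed, depend on your version of R.

```r
# Run the examples from the help page for mean; R echoes each
# example line and prints its result
example(mean)
```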

See Also

1.8. Searching the Supplied Documentation

Problem

You want to know more about a function that is installed
on your machine, but the help function reports that
it cannot find documentation for any such function.

Alternatively, you want to search the installed documentation for
a keyword.

Solution

Use help.search to search the R
documentation on your computer:

> help.search("pattern")

A typical pattern is a function name or
keyword. Notice that it must be enclosed in quotation marks.

For your convenience, you can also invoke a search by using two question marks (in which case the
quotes are not required):

> ??pattern

Discussion

You may occasionally request help on a function only to be told R
knows nothing about it:

> help(adf.test)
No documentation for 'adf.test' in specified packages and libraries:
you could try 'help.search("adf.test")'

This can be frustrating if you know the
function is installed on your machine. Here the problem is that the
function’s package is not currently loaded, and you don’t know which
package contains the function. It’s a kind of catch-22 (the error
message indicates the package is not currently in your search path, so R
cannot find the help file; see Recipe 3.5 for more
details).

The solution is to search all your installed packages for the function. Just
use the help.search function, as suggested in the
error message:

> help.search("adf.test")

The search will produce a listing of all packages that contain the
function:

The following output, for example, indicates that the
tseries package contains the
adf.test function. You can see its documentation by
explicitly telling help which package contains the
function:

> help(adf.test, package="tseries")

Alternatively, you can insert the tseries
package into your search list and repeat the original help
command, which will then find the function and display the documentation.
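
The effect of the search list can be sketched with the find function, which reports which attached packages define a name. The example assumes a fresh session and attaches tseries only if it happens to be installed on your machine.

```r
# Which attached packages define a given name?
find("mean")       # "package:base" in a fresh session
find("adf.test")   # character(0) until tseries is attached

# Attach tseries only if it is installed (an assumption about your machine)
if (requireNamespace("tseries", quietly = TRUE)) {
  library(tseries)
  find("adf.test") # now includes "package:tseries"
}
```

Once the package is attached, the original help(adf.test) call can locate the help page.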

You can broaden your search by using keywords. R will then find
any installed documentation that contains the keywords. Suppose you want
to find all functions that mention the Augmented Dickey–Fuller (ADF)
test. You could search on a likely pattern:

> help.search("dickey-fuller")

On my machine, the result looks like this because I’ve installed
two additional packages (fUnitRoots and
urca) that implement the ADF test:

See Also

You can also access the local search engine through the
documentation browser; see Recipe 1.6 for how this
is done. See Recipe 3.5 for more about the search
path and Recipe 4.4 for getting help on
functions.

1.9. Getting Help on a Package

Problem

You want to learn more about a package installed on your
computer.

Solution

Use the help function and specify a package
name (without a function name):

> help(package="packagename")

Discussion

Sometimes you want to know the contents of a package (the
functions and datasets). This is especially true after you download and
install a new package, for example. The help function can provide the
contents plus other information once you specify the package
name.
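
As a complementary sketch, you can also list the contents of a package from the command line once it is attached. The example uses stats, which is attached by default in a standard R session; the variable name contents is arbitrary.

```r
# List the objects exported by an attached package
contents <- ls("package:stats")

length(contents)      # how many objects the package exports
head(contents)        # the first few names
"sd" %in% contents    # TRUE: sd lives in the stats package
```

This gives only the names; the help function described in this recipe adds the descriptions.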

This call to help will display the information for the
tseries package, an add-on package from CRAN (it is not part of the
base distribution, so it must be installed first):

> help(package="tseries")

The information begins with a description and continues with an
index of functions and datasets. On my machine, the first few lines look
like this:

Some packages also include vignettes, which are additional
documents such as introductions, tutorials, or reference cards. They are
installed on your computer as part of the package documentation when you
install the package. The help page for a package includes a list of its
vignettes near the bottom.

You can see a list of all vignettes on your computer by using the
vignette function:

> vignette()

You can see the vignettes for a particular package by including
its name:

> vignette(package="packagename")

Each vignette has a name, which you use to view the
vignette:

> vignette("vignettename")

See Also

See Recipe 1.7 for getting help on a
particular function in a package.

1.10. Searching the Web for Help

Problem

You want to search the Web for information and answers
regarding R.

Solution

Inside R, use the RSiteSearch function to search by
keyword or phrase:

The Statistical Analysis area on Stack Exchange is also a searchable Q&A site,
but it is oriented more toward statistics than programming.

Discussion

The RSiteSearch function will open a browser
window and direct it to the search engine on the R Project website. There you
will see an initial search that you can refine. For example, this call
would start a search for “canonical correlation”:

> RSiteSearch("canonical correlation")

This is quite handy for doing quick web searches without leaving
R. However, the search scope is limited to R documentation and the
mailing-list archives.

The rseek.org site provides a wider search.
Its virtue is that it harnesses the power of the Google search engine
while focusing on sites relevant to R. That eliminates the extraneous
results of a generic Google search. The beauty of
rseek.org is that it organizes the results in a
useful way.

Figure 1-3 shows the results
of visiting rseek.org and searching for “canonical
correlation”. The left side of the page shows general results from searching
R sites. The right side is a tabbed display that organizes the search
results into several categories:

Introductions

Task Views

Support Lists

Functions

Books

Blogs

Related Tools

Figure 1-3. Search results from rseek.org

If you click on the Introductions tab, for example, you’ll find
tutorial material. The Task Views tab will show any Task View that mentions your search term.
Likewise, clicking on Functions will show links to relevant R functions.
This is a good way to zero in on search results.

Stack Overflow is a
so-called Q&A site, which means that anyone can submit a question
and experienced users will supply answers—often there are multiple
answers to each question. Readers vote on the answers, so good answers
tend to rise to the top. This creates a rich database of Q&A
dialogs, which you can search. Stack Overflow is strongly problem
oriented, and the topics lean toward the programming side of R.

Stack Overflow hosts questions for many programming languages;
therefore, when entering a term into their search box, prefix it with
“[r]” to focus the search on questions tagged for R. For example,
searching via “[r] standard error” will select only the questions tagged
for R and will avoid the Python and C++ questions.

Stack Exchange (not Overflow) has a Q&A area for
Statistical Analysis. The area is more focused on
statistics than programming, so use this site when seeking answers that
are more concerned with statistics in general and less with R in
particular.

See Also

If your search reveals a useful package, use Recipe 3.9 to install it on your machine.

1.11. Finding Relevant Functions and Packages

Problem

Of the 2,000+ packages for R, you have no idea which ones
would be useful to you.

Solution

Visit the list of task views at http://cran.r-project.org/web/views/. Find and read
the task view for your area, which will give you links to and
descriptions of relevant packages. Or visit http://rseek.org, search by
keyword, click on the Task Views tab, and select an applicable task
view.

To find relevant functions, visit http://rseek.org, search by name or keyword, and
click on the Functions tab.

Discussion

This problem is especially vexing for beginners. You think R can
solve your problems, but you have no idea which packages and functions
would be useful. A common question
on the mailing lists is: “Is there a package to solve problem X?” That
is the silent scream of someone drowning in R.

As of this writing, there are more than 2,000 packages available
for free download from CRAN. Each package has a summary page with a
short description and links to the package documentation. Once you’ve
located a potentially interesting package, you would typically click on
the “Reference manual” link to view the PDF documentation with full
details. (The summary page also contains download links for installing
the package, but you’ll rarely install the package that way; see Recipe 3.9.)

Sometimes you simply have a generic interest—such as Bayesian
analysis, econometrics, optimization, or graphics. CRAN contains a set
of task view pages describing packages that may
be useful. A task view is a great place to start since you get an
overview of what’s available. You can see the list of task view pages at
http://cran.r-project.org/web/views/ or search for
them as described in the Solution.

Suppose you happen to know the name of a useful package—say, by
seeing it mentioned online. A complete, alphabetical list of packages is available at http://cran.r-project.org/web/packages/ with links to the
package summary pages.

1.12. Searching the Mailing Lists

Problem

You have a question, and you want to search the archives
of the mailing lists to see whether your question was answered
previously.

Solution

Open http://rseek.org in your browser.
Search for a keyword or other search term from your question. When
the search results appear, click on the “Support Lists” tab.

You can perform a search within R itself. Use the RSiteSearch function to initiate a
search:

> RSiteSearch("keyphrase")

The initial search results will appear in a browser. Under
“Target”, select the R-help
sources, clear the other sources, and resubmit your query.

Discussion

This recipe is really just an application of Recipe 1.10. But it’s an important application because you
should search the mailing list archives before submitting a new question
to the list. Your question has probably been answered before.

Solution

Read the Posting
Guide for instructions on writing an effective
submission.

Write your question carefully and correctly. If appropriate,
include a minimal self-contained example so that others can
reproduce your error or problem.

Mail your question to
r-help@r-project.org.

Discussion

The R mailing list is a powerful resource, but please treat it as
a last resort. Read the help pages, read the documentation, search the
help list archives, and search the Web. It is most likely that your
question has already been answered. Don’t kid yourself: very few
questions are unique.

After writing your question, submitting it is easy. Just mail it
to r-help@r-project.org. You must be a list
subscriber, however; otherwise your email submission may be rejected.

Your question might arise because your R code is causing an error
or giving unexpected results. In that case, a critical element of your
question is the minimal self-contained
example:

Minimal

Construct the smallest snippet of R code that displays your
problem. Remove everything
that is irrelevant.

Self-contained

Include the data necessary to exactly reproduce the error.
If the list readers can’t reproduce it, they can’t diagnose it.
For complicated data structures, use the dump
function to create an ASCII representation of your data and include it in
your message.
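
As a rough sketch of how dump can be used for this purpose, the example below builds a small data frame, prints its ASCII representation, and checks that the representation re-creates the same object. The data frame dfrm and the file name are hypothetical stand-ins for your own data.

```r
# A small data frame standing in for the data behind your question
# (hypothetical values)
dfrm <- data.frame(x = c(1.5, 2.5, 4.0), g = factor(c("a", "b", "a")))

# Print an ASCII representation to the console, ready to paste into
# your message
dump("dfrm", file = "")

# Round trip: write the representation to a file, remove the object,
# and re-create it with source()
f <- file.path(tempdir(), "dfrm.R")
dump("dfrm", file = f)
original <- dfrm
rm(dfrm)
source(f, local = TRUE)
all.equal(original, dfrm)
```

A reader of the list can paste the dumped text into their own session and start from exactly the same data you have.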

Including an example clarifies your question and greatly increases
the probability of getting a useful answer.

There are actually several mailing lists. R-help is the main list
for general questions. There are also many special interest group (SIG) mailing lists dedicated to
particular domains such as genetics, finance, R development, and even R
jobs. You can see the full list at https://stat.ethz.ch/mailman/listinfo. If your question
is specific to one such domain, you’ll get a better answer by selecting
the appropriate list. As with R-help, however, carefully search the SIG
list archives before submitting your question.