Working with NULL, NA, and NaN

Problem

You want to handle NULL, NA, and NaN values correctly.

Solution

Sometimes your data will include NULL, NA, or NaN. These work somewhat differently from “normal” values, and may require explicit testing.

Here are some examples of comparisons with these values:

x <- NULL
x > 5
# logical(0)

y <- NA
y > 5
# NA

z <- NaN
z > 5
# NA

Here’s how to test whether a variable has one of these values:

is.null(x)
# TRUE

is.na(y)
# TRUE

is.nan(z)
# TRUE
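The tests are not interchangeable. As a brief sketch of how they overlap (the variable names `na_value` and `nan_value` are introduced here only for illustration): is.na() also reports TRUE for NaN, whereas is.nan() reports FALSE for NA, and neither value is NULL.

```r
# Illustrative names, not part of the examples above
na_value  <- NA
nan_value <- NaN

is.na(nan_value)    # TRUE: is.na() treats NaN as missing too
is.nan(na_value)    # FALSE: NA is not NaN
is.null(na_value)   # FALSE: NA is a value, so it is not NULL
is.null(nan_value)  # FALSE
```

So if you need to tell NA apart from NaN, test with is.nan() rather than relying on is.na() alone.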

Note that NULL is different from the other two. NULL means that there is no value, while NA and NaN mean that there is some value, although one that is perhaps not usable. Here’s an illustration of the difference:
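A minimal sketch of the two checks, assuming x is NULL and y is NA as in the examples above (both are redefined here so the snippet runs on its own):

```r
x <- NULL
y <- NA

# Is y NULL?
is.null(y)
# FALSE

# Is x NA?
is.na(x)
# logical(0)
```

Some older versions of R also emit a warning for the second call, because is.na() is being applied to NULL.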

In the first case, it checks whether y is NULL, and the answer is no. In the second case, it tries to check whether x is NA, but there is no value to be checked, so the result is an empty logical vector.

Ignoring “bad” values in vector summary functions

If you run functions like mean() or sum() on a vector containing NA or NaN, they will return NA or NaN. That is generally unhelpful as a result, though it does alert you to the presence of the bad value. Many of these functions take the argument na.rm; setting na.rm=TRUE tells them to ignore these values.
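As a short sketch of this behavior (the vectors `vy`, `vz`, and `vx` are made up for illustration):

```r
# A vector containing NA
vy <- c(1, 2, 3, NA, 5)
mean(vy)
# NA
mean(vy, na.rm = TRUE)
# 2.75

# A vector containing NaN
vz <- c(1, 2, 3, NaN, 5)
sum(vz)
# NaN
sum(vz, na.rm = TRUE)
# 11

# NULL is dropped when the vector is built, so it causes no problem
vx <- c(1, 2, 3, NULL, 5)
length(vx)
# 4
mean(vx)
# 2.75
```

Note that na.rm=TRUE removes both NA and NaN, and that a NULL never ends up inside an atomic vector in the first place.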