Working with NA Values

There are cases in which you might specify an operation for which no data is available. For example, there might be no appropriate value for a given cell in a variable, for the return value of a function, or for the value of an expression that includes an arithmetic operator. In these cases, an NA (Not Available) value is automatically supplied.

NA is the value of any cell to which a specific data value has not been assigned or for which data cannot be calculated. An NA value has no specific data type.

Certain functions (for example, the aggregation functions) return an NA value when the information that is requested with the function is not available or cannot be calculated. Similarly, an expression whose value cannot be calculated has NA as its value.

To set the value of a variable or relation to NA, you can use the = command, as shown in the following example.

sales = NA

If sales is a dimensioned variable, then the = command loops through all of the values of sales, setting them to NA.

Controlling how NA values are treated

The following options and functions control how NA values are treated in expressions:

Using the PROPERTY command, you can set the value of the NATRIGGER property on a dimensioned variable so that when a cell of the variable that contains an NA value is read, the value of the NATRIGGER expression is substituted for the NA value. You can use this substitution to increase the efficiency of some kinds of calculations and to eliminate the need for some formula objects.

The following options control how NA values are treated in aggregation functions and in arithmetic operations with the addition (+) and subtraction (-) operators.

The NASKIP option controls how NA values are treated in aggregation functions.

The NASKIP2 option controls how NA values are treated in arithmetic operations with the addition (+) and subtraction (-) operators.

The NAFILL function returns the values of the source expression with any NA values appearing as the specified fill expression. You can include this function in an expression to control the format of its value.

Working with the NATRIGGER Property

An NATRIGGER property expression is evaluated before applying the NAFILL function or the NASKIP, NASKIP2, or NASPELL options. If the NATRIGGER expression is NA, then the NAFILL function and the NA options have an effect. Additionally, the NATRIGGER property allows you a good deal of flexibility about handling NA values:

You can make NA triggers recursive or mutually recursive by including triggered objects within the value expression. You must set the RECURSIVE option to yes before a formula, program, or other NATRIGGER expression can invoke a trigger expression again while it is executing. For limiting the number of triggers that can execute simultaneously, see the TRIGGERMAXDEPTH option.

You can replace the NA value in the cells of the variable with the NATRIGGER expression value by setting the TRIGGERSTOREOK option to yes and setting the STORETRIGGERVAL property on the variable to yes.

The ROLLUP and AGGREGATE commands and the AGGREGATE function ignore the NATRIGGER property setting for a variable during a rollup or aggregation operation. Additionally, the NATRIGGER property expression on a variable is not evaluated when the variable is simply exported with an EXPORT TO EIF file command. The NATRIGGER property expression is only evaluated if the variable is part of an expression that is calculated during the export operation.

Using NASKIP

The NASKIP option controls how NA values are treated in aggregation functions.

By default, the NASKIP option is set to YES, and NA values are ignored by aggregation functions. Only expressions with actual values are used in calculations.

If you set the NASKIP option to no, then NA values are considered as input to aggregation functions. If any of the values being considered are NA, then the function returns NA for that value.

Setting NASKIP to no is useful for cases in which having NA values in the data makes the calculation itself invalid. For example, when you use the MOVINGMAX function, you specify a range from which to select the maximum value.

If NASKIP is YES (the default), then MOVINGMAX returns NA only when all the values in the range are NA.

If NASKIP is NO and any value in the range is NA, then MOVINGMAX returns NA.

Using NASKIP2

The NASKIP2 option controls how NA values are treated in arithmetic operations with the addition (+) and subtraction (-) operators.

By default, the value of the NASKIP2 option is NO. NA values are treated as NAs in arithmetic operations using the addition (+) and subtraction (-) operators. If any of the operands being considered is NA, then the arithmetic operation evaluates to NA. For example, by default, 2+NA results in NA.

If you set the value of the NASKIP2 option to yes, then zeroes are substituted for NA values in arithmetic operations using the addition (+) and subtraction (-) operators. The two special cases of NA+ NA and NA-NA both result in NA.

Using NAFILL

NASKIP and NASKIP2 do not change your data. They only affect the results of calculations on your data. If you would prefer a more targeted influence on any kind of expressions, and want the option of making an actual change in your data, then you can use the NAFILL function.

The effect of the NAFILL function is limited to the single expression you specify. It can be any kind of expression, not just a function or an addition (+) or subtraction (-) operation. In addition, you can use NAFILL to substitute anything for the NAs in the expression, not just zeroes. Moreover, using assignment statements, you can use NAFILL to make a permanent substitution for NAs in your data.

NAFILL returns the value of a specified expression unless its value is NA, in which case NAFILL returns the substitute value you specify.

The following command uses NAFILL to replace the NA values in the sales variable with the number 1 and then assign those values to the variable. This makes the substitution permanent in your data.

sales = NAFILL(sales, 1)

The following command illustrates the use of NAFILL for more specialized purposes. By substituting zeros for NA values, NAFILL in this example forces the AVERAGE function to include NA values when it counts the number of values it is averaging. The substitution is temporary, lasting only for the duration of this command.