Impala Mathematical Functions

Mathematical functions, or arithmetic functions, perform numeric calculations that are typically more complex
than basic addition, subtraction, multiplication, and division. For example, these functions include
trigonometric, logarithmic, and base conversion operations.

Note:
In Impala, exponentiation uses the pow() function rather than an exponentiation operator
such as **.

Purpose: Returns a string representation of an integer value in a particular base. The input value
can be a string, for example to convert a hexadecimal number such as fce2 to decimal. To
use the return value as a number (for example, when converting to base 10), use CAST()
to convert to the appropriate type.

Purpose: Computes the factorial of an integer value.
It works with any integer type.

Added in:Impala 2.3.0

Usage notes: You can use either the factorial() function or the ! operator.
The factorial of 0 is 1. Likewise, the factorial() function returns 1 for any negative value.
The maximum positive value for the input argument is 20; a value of 21 or greater overflows the
range for a BIGINT and causes an error.

Purpose: Returns the largest integer that is less than or equal to the argument.

Return type:bigint or decimal(p,s) depending on the type of
the input argument

fmod(double a, double b), fmod(float a, float b)

Purpose: Returns the modulus of a floating-point number. Equivalent to the % arithmetic operator.

Return type:float or double, depending on type of arguments

Added in: Impala 1.1.1

Usage notes:

Because this function operates on DOUBLE or FLOAT
values, it is subject to potential rounding errors for values that cannot be
represented precisely. Prefer to use whole numbers, or values that you know
can be represented precisely by the DOUBLE or FLOAT
types.

Examples:

The following examples show equivalent operations with the fmod()
function and the % arithmetic operator, for values not subject
to any rounding error.

The following examples show operations with the fmod()
function for values that cannot be represented precisely by the
DOUBLE or FLOAT types, and thus are
subject to rounding error. fmod(9.9,3.0) returns a value
slightly different than the expected 0.9 because of rounding.
fmod(9.9,3.3) returns a value quite different from
the expected value of 0 because of rounding error during intermediate
calculations.

Purpose: Returns a consistent 64-bit value derived from the input argument, for convenience of
implementing hashing logic in an application.

Return type:BIGINT

Usage notes:

You might use the return value in an application where you perform load balancing, bucketing, or some
other technique to divide processing or storage.

Because the result can be any 64-bit value, to restrict the value to a particular range, you can use an
expression that includes the ABS() function and the % (modulo)
operator. For example, to produce a hash value in the range 0-9, you could use the expression
ABS(FNV_HASH(x)) % 10.

This function implements the same algorithm that Impala uses internally for hashing, on systems where
the CRC32 instructions are not available.

This function implements the
Fowler–Noll–Vo
hash function, in particular the FNV-1a variation. This is not a perfect hash function: some
combinations of values could produce the same result value. It is not suitable for cryptographic use.

Similar input values of different types could produce different hash values, for example the same
numeric value represented as SMALLINT or BIGINT,
FLOAT or DOUBLE, or DECIMAL(5,2) or
DECIMAL(20,5).

Return type: same as the initial argument value, except that integer values are promoted to
BIGINT and floating-point values are promoted to DOUBLE; use
CAST() when inserting into a smaller numeric column

hex(bigint a), hex(string a)

Purpose: Returns the hexadecimal representation of an integer value, or of the characters in a
string.

Return type:string

is_inf(double a),

Purpose: Tests whether a value is equal to the special value "inf", signifying infinity.

Return type:boolean

Usage notes:

Infinity and NaN can be specified in text data files as inf and nan
respectively, and Impala interprets them as these special values. They can also be produced by certain
arithmetic expressions; for example, 1/0 returns Infinity and
pow(-1, 0.5) returns NaN. Or you can cast the literal values, such as CAST('nan' AS
DOUBLE) or CAST('inf' AS DOUBLE).

is_nan(double a),

Purpose: Tests whether a value is equal to the special value "NaN", signifying "not a
number".

Return type:boolean

Usage notes:

Infinity and NaN can be specified in text data files as inf and nan
respectively, and Impala interprets them as these special values. They can also be produced by certain
arithmetic expressions; for example, 1/0 returns Infinity and
pow(-1, 0.5) returns NaN. Or you can cast the literal values, such as CAST('nan' AS
DOUBLE) or CAST('inf' AS DOUBLE).

Return type: same as the initial argument value, except that integer values are promoted to
BIGINT and floating-point values are promoted to DOUBLE; use
CAST() when inserting into a smaller numeric column

Purpose: Returns the logarithm of the second argument to the specified base.

Return type:double

log10(double a),
dlog10(double a)

Purpose: Returns the logarithm of the argument to the base 10.

Return type:double

log2(double a)

Purpose: Returns the logarithm of the argument to the base 2.

Return type:double

max_int(), max_tinyint(), max_smallint(),
max_bigint()

Purpose: Returns the largest value of the associated integral type.

Return type: The same as the integral type being checked.

Usage notes: Use the corresponding min_ and max_ functions to
check if all values in a column are within the allowed range, before copying data or altering column
definitions. If not, switch to the next higher integral type or to a DECIMAL with
sufficient precision.

Usage notes: Use the corresponding min_ and max_ functions to
check if all values in a column are within the allowed range, before copying data or altering column
definitions. If not, switch to the next higher integral type or to a DECIMAL with
sufficient precision.

mod(numeric_type a, same_type b)

Purpose: Returns the modulus of a number. Equivalent to the % arithmetic operator.
Works with any size integer type, any size floating-point type, and DECIMAL
with any precision and scale.

Return type: Same as the input value

Added in:Impala 2.2.0

Usage notes:

Because this function works with DECIMAL values, prefer it over fmod()
when working with fractional values. It is not subject to the rounding errors that make
fmod() problematic with floating-point numbers.
The % arithmetic operator now uses the mod() function
in cases where its arguments can be interpreted as DECIMAL values,
increasing the accuracy of that operator.

Examples:

The following examples show how the mod() function works for
whole numbers and fractional values, and how the % operator
works the same way. In the case of mod(9.9,3),
the type conversion for the second argument results in the first argument
being interpreted as DOUBLE, so to produce an accurate
DECIMAL result requires casting the second argument
or writing it as a DECIMAL literal, 3.0.

The following examples show how the fmod() function sometimes returns a negative value
depending on the sign of its arguments, and the pmod() function returns the same value
as fmod(), but sometimes with the sign flipped.

Purpose: Returns the first argument raised to the power of the second argument.

Return type:double

precision(numeric_expression)

Purpose: Computes the precision (number of decimal digits) needed to represent the type of the
argument expression as a DECIMAL value.

Usage notes:

Typically used in combination with the scale() function, to determine the appropriate
DECIMAL(precision,scale) type to declare in a
CREATE TABLE statement or CAST() function.

Return type:int

Examples:

The following examples demonstrate how to check the precision and scale of numeric literals or other
numeric expressions. Impala represents numeric literals in the smallest appropriate type. 5 is a
TINYINT value, which ranges from -128 to 127, therefore 3 decimal digits are needed to
represent the entire range, and because it is an integer value there are no fractional digits. 1.333 is
interpreted as a DECIMAL value, with 4 digits total and 3 digits after the decimal point.

Purpose: Returns the first argument divided by the second argument, discarding any fractional
part. Avoids promoting integer arguments to DOUBLE as happens with the / SQL
operator. Also includes an overload that accepts DOUBLE arguments,
discards the fractional part of each argument value before dividing, and again returns BIGINT.
With integer arguments, this function works the same as the DIV operator.

Return type:bigint

radians(double a)

Purpose: Converts argument value from degrees to radians.

Return type:double

rand(), rand(int seed),
random(),
random(int seed)

Purpose: Returns a random value between 0 and 1. After rand() is called with a
seed argument, it produces a consistent random sequence based on the seed value.

Return type:double

Usage notes: Currently, the random sequence is reset after each query, and multiple calls to
rand() within the same query return the same value each time. For different number
sequences that are different for each query, pass a unique seed value to each call to
rand(). For example, select rand(unix_timestamp()) from ...

Examples:

The following examples show how rand() can produce sequences of varying predictability,
so that you can reproduce query results involving random values or generate unique sequences of random
values for each query.
When rand() is called with no argument, it generates the same sequence of values each time,
regardless of the ordering of the result set.
When rand() is called with a constant integer, it generates a different sequence of values,
but still always the same sequence for the same seed value.
If you pass in a seed value that changes, such as the return value of the expression unix_timestamp(now()),
each query will use a different sequence of random values, potentially more useful in probability calculations although
more difficult to reproduce at a later time. Therefore, the final two examples with an unpredictable seed value
also include the seed in the result set, to make it possible to reproduce the same random sequence later.

Purpose: Rounds a floating-point value. By default (with a single argument), rounds to the nearest
integer. Values ending in .5 are rounded up for positive numbers, down for negative numbers (that is,
away from zero). The optional second argument specifies how many digits to leave after the decimal point;
values greater than zero produce a floating-point return value rounded to the requested number of digits
to the right of the decimal point.

Return type:bigint for single double argument.
double for two-argument signature when second argument greater than zero.
For DECIMAL values, the smallest
DECIMAL(p,s) type with appropriate precision and
scale.

scale(numeric_expression)

Purpose: Computes the scale (number of decimal digits to the right of the decimal point) needed to
represent the type of the argument expression as a DECIMAL value.

Usage notes:

Typically used in combination with the precision() function, to determine the
appropriate DECIMAL(precision,scale) type to
declare in a CREATE TABLE statement or CAST() function.

Return type:int

Examples:

The following examples demonstrate how to check the precision and scale of numeric literals or other
numeric expressions. Impala represents numeric literals in the smallest appropriate type. 5 is a
TINYINT value, which ranges from -128 to 127, therefore 3 decimal digits are needed to
represent the entire range, and because it is an integer value there are no fractional digits. 1.333 is
interpreted as a DECIMAL value, with 4 digits total and 3 digits after the decimal point.

Purpose: Removes some or all fractional digits from a numeric value.
With no argument, removes all fractional digits, leaving an integer value.
The optional argument specifies the number of fractional digits to include
in the return value, and only applies with the argument type is DECIMAL.
truncate() and dtrunc() are aliases for the same function.