Literals

Each of the Impala data types has corresponding notation for literal values of that type. You specify literal values in SQL statements, such as in the SELECT list or WHERE clause of a query, or as an argument to a function call. See Data
Types for a complete list of types, ranges, and conversion rules.

Numeric Literals

To write literals for the integer types (TINYINT, SMALLINT, INT, and BIGINT), use a sequence of digits with optional leading zeros.

To write literals for the floating-point types (DECIMAL, FLOAT, and DOUBLE), use a sequence of digits with an optional decimal point (. character). To preserve accuracy during arithmetic expressions, Impala interprets
floating-point literals as the DECIMAL type with the smallest appropriate precision and scale, until required by the context to convert the result to FLOAT or DOUBLE.

Integer values are promoted to floating-point when necessary, based on the context.

You can also use exponential notation by including an e character. For example, 1e6 is 1 times 10 to the power of 6 (1
million). A number in exponential notation is always interpreted as floating-point.

When Impala encounters a numeric literal, it considers the type to be the "smallest" that can accurately represent the value. The type is promoted to
larger or more accurate types if necessary, based on subsequent parts of an expression.

For example, you can see by the types Impala defines for the following table columns how it interprets the corresponding numeric literals:

String Literals

String literals are quoted using either single or double quotation marks. You can use either kind of quotes for string literals, even both kinds for different literals within the same
statement.

Quoted literals are considered to be of type STRING. To use quoted literals in contexts requiring a CHAR or
VARCHAR value, CAST() the literal to a CHAR or VARCHAR of the appropriate
length.

Escaping special characters:

To encode special characters within a string literal, precede them with the backslash (\) escape character:

\t represents a tab.

\n represents a newline or linefeed. This might cause extra line breaks in impala-shell output.

\r represents a carriage return. This might cause unusual formatting (making it appear that some content is overwritten) in impala-shell output.

\b represents a backspace. This might cause unusual formatting (making it appear that some content is overwritten) in impala-shell output.

\0 represents an ASCII nul character (not the same as a SQL NULL). This might not be
visible in impala-shell output.

\Z represents a DOS end-of-file character. This might not be visible in impala-shell output.

\% and \_ can be used to escape wildcard characters within the string passed to the LIKE
operator.

\ followed by 3 octal digits represents the ASCII code of a single character; for example, \101 is ASCII 65, the character
A.

Use two consecutive backslashes (\\) to prevent the backslash from being interpreted as an escape character.

Use the backslash to escape single or double quotation mark characters within a string literal, if the literal is enclosed by the same type of quotation mark.

If the character following the \ does not represent the start of a recognized escape sequence, the character is passed through unchanged.

Quotes within quotes:

To include a single quotation character within a string value, enclose the literal with either single or double quotation marks, and optionally escape the single quote as a \' sequence. Earlier releases required escaping a single quote inside double quotes. Continue using escape sequences in this case if you also need to run your SQL code on older
versions of Impala.

To include a double quotation character within a string value, enclose the literal with single quotation marks, no escaping is necessary in this case. Or, enclose the literal with double
quotation marks and escape the double quote as a \" sequence.

Note: The CREATE TABLE clauses FIELDS TERMINATED BY, ESCAPED
BY, and LINES TERMINATED BY have special rules for the string literal used for their argument, because they all require a single character. You can use a regular
character surrounded by single or double quotation marks, an octal sequence such as '\054' (representing a comma), or an integer in the range '-127'..'128' (with
quotation marks but no backslash), which is interpreted as a single-byte ASCII character. Negative values are subtracted from 256; for example, FIELDS TERMINATED BY
'-2' sets the field delimiter to ASCII code 254, the "Icelandic Thorn" character used as a delimiter by some data formats.

impala-shell considerations:

When dealing with output that includes non-ASCII or non-printable characters such as linefeeds and backspaces, use the impala-shell options to save
to a file, turn off pretty printing, or both rather than relying on how the output appears visually. See impala-shell Configuration
Options for a list of impala-shell options.

Boolean Literals

For BOOLEAN values, the literals are TRUE and FALSE, with no quotation marks and
case-insensitive.

Timestamp Literals

Impala automatically converts STRING literals of the correct format into TIMESTAMP values. Timestamp values are accepted in
the format "yyyy-MM-dd HH:mm:ss.SSSSSS", and can consist of just the date, or just the time, with or without the fractional second portion. For example, you can specify
TIMESTAMP values such as '1966-07-30', '08:30:00', or '1985-09-25
17:45:30.005'. Casting an integer or floating-point value N to TIMESTAMP produces a value that is N seconds past the start of the epoch date (January 1, 1970). By default, the result value represents a date and time in the UTC time zone. If the setting --use_local_tz_for_unix_timestamp_conversions=true is in effect, the resulting TIMESTAMP represents a date and time in the local time
zone.

You can also use INTERVAL expressions to add or subtract from timestamp literal values, such as CAST('1966-07-30' AS TIMESTAMP) +
INTERVAL 5 YEARS + INTERVAL 3 DAYS. See TIMESTAMP Data Type for details.

Depending on your data pipeline, you might receive date and time data as text, in notation that does not exactly match the format for Impala TIMESTAMP
literals. See Impala Date and Time Functions for functions that can convert between a variety of string literals
(including different field order, separators, and timezone notation) and equivalent TIMESTAMP or numeric values.

NULL

The notion of NULL values is familiar from all kinds of database systems, but each SQL dialect can have its own behavior and restrictions on NULL values. For Big Data processing, the precise semantics of NULL values are significant: any misunderstanding could lead to inaccurate results or
misformatted data, that could be time-consuming to correct for large data sets.

NULL is a different value than an empty string. The empty string is represented by a string literal with nothing inside, "" or ''.

In a delimited text file, the NULL value is represented by the special token \N.

When Impala inserts data into a partitioned table, and the value of one of the partitioning columns is NULL or the empty string, the data is placed in a
special partition that holds only these two kinds of values. When these values are returned in a query, the result is NULL whether the value was originally NULL or an empty string. This behavior is compatible with the way Hive treats NULL values in partitioned tables. Hive does not allow empty strings as
partition keys, and it returns a string value such as __HIVE_DEFAULT_PARTITION__ instead of NULL when such values are returned from a
query. For example:

create table t1 (i int) partitioned by (x int, y string);
-- Select an INT column from another table, with all rows going into a special HDFS subdirectory
-- named __HIVE_DEFAULT_PARTITION__. Depending on whether one or both of the partitioning keys
-- are null, this special directory name occurs at different levels of the physical data directory
-- for the table.
insert into t1 partition(x=NULL, y=NULL) select c1 from some_other_table;
insert into t1 partition(x, y=NULL) select c1, c2 from some_other_table;
insert into t1 partition(x=NULL, y) select c1, c3 from some_other_table;

There is no NOT NULL clause when defining a column to prevent NULL values in that column.

There is no DEFAULT clause to specify a non-NULL default value.

If an INSERT operation mentions some columns but not others, the unmentioned columns contain NULL for all inserted
rows.

In Impala 1.2.1 and higher, all NULL values come at the end of the result set for ORDER BY ... ASC queries, and at the
beginning of the result set for ORDER BY ... DESC queries. In effect, NULL is considered greater than all other values for sorting
purposes. The original Impala behavior always put NULL values at the end, even for ORDER BY ... DESC queries. The new behavior in Impala
1.2.1 makes Impala more compatible with other popular database systems. In Impala 1.2.1 and higher, you can override or specify the sorting behavior for NULL by adding
the clause NULLS FIRST or NULLS LAST at the end of the ORDER BY clause.

Note: Because the NULLS FIRST and NULLS LAST keywords are not currently available in
Hive queries, any views you create using those keywords will not be available through Hive.

In all other contexts besides sorting with ORDER BY, comparing a NULL to anything else returns NULL, making the comparison meaningless. For example, 10 > NULL produces NULL, 10 <
NULL also produces NULL, 5 BETWEEN 1 AND NULL produces NULL, and so on.

Several built-in functions serve as shorthand for evaluating expressions and returning NULL, 0, or some other substitution value depending on the
expression result: ifnull(), isnull(), nvl(), nullif(), nullifzero(), and zeroifnull(). See Impala Conditional Functions
for details.

If this documentation includes code, including but not limited to, code examples, Cloudera makes this available to you under the terms of the Apache License, Version 2.0, including any required
notices. A copy of the Apache License Version 2.0 can be found here.