This is the documentation for Cloudera Impala 2.1.x.
Documentation for other versions is available at Cloudera.com.

DDL Statements

DDL refers to "Data Definition Language", a subset of SQL statements that change the
structure of the database schema in some way, typically by creating, deleting, or modifying
schema objects such as databases, tables, and views.
Most Impala DDL statements start with the keywords CREATE, DROP,
or ALTER.

After Impala executes a DDL command, information about available tables, columns, views, partitions, and so on
is automatically synchronized between all the Impala nodes in a cluster.
(Prior to Impala 1.2, you had to issue a REFRESH or INVALIDATE METADATA statement
manually on the other nodes to make them aware of the changes.)

If the timing of metadata updates is significant, for example if you use round-robin scheduling
where each query could be issued through a different Impala node,
you can enable the SYNC_DDL query option
to make the DDL statement wait until all nodes have been notified about the metadata changes.

Although the INSERT statement is officially classified as a DML (data manipulation language)
statement, it also involves metadata changes that must be broadcast to all Impala nodes, and so is also
affected by the SYNC_DDL query option.

Because the SYNC_DDL query option makes each DDL operation take longer than normal,
you might enable it only before the last DDL operation in a sequence.
For example, if you are running a script that issues multiple
DDL operations to set up an entire new schema, add several new partitions, and so on,
you might minimize the performance overhead by enabling the query option only
before the last CREATE, DROP, ALTER,
or INSERT statement.
The script finishes only when all the relevant metadata changes are recognized by all the Impala
nodes, so afterward you can connect to any node and issue queries through it.
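
As a minimal sketch of that approach, a setup script run through impala-shell might look like the following. The database, table, and column names (sales_demo, orders, and so on) are hypothetical and chosen only for illustration; the point is where the SET SYNC_DDL statement is placed.

```sql
-- Hypothetical schema setup; all object names are illustrative assumptions.
-- The earlier statements run with SYNC_DDL at its default (disabled) setting,
-- so each one returns without waiting for every node to be notified.
CREATE DATABASE IF NOT EXISTS sales_demo;

CREATE TABLE sales_demo.orders (id BIGINT, amount DOUBLE)
  PARTITIONED BY (year INT);

ALTER TABLE sales_demo.orders ADD PARTITION (year=2013);
ALTER TABLE sales_demo.orders ADD PARTITION (year=2014);

-- Enable the option only before the final DDL statement, so that only
-- this last statement pays the cost of waiting for all nodes.
SET SYNC_DDL=1;
ALTER TABLE sales_demo.orders ADD PARTITION (year=2015);

-- Optionally turn the option off again for the rest of the session.
SET SYNC_DDL=0;
```

Placing the SET statement before only the last operation keeps the earlier statements fast while still ensuring the script does not finish until the metadata changes have reached every node.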

The classification of DDL, DML, and other statements is not necessarily the same
between Impala and Hive.
Impala organizes these statements in a way intended to be familiar to people
who have used relational databases or data warehouse products.
Statements that modify the metastore database, such as COMPUTE STATS,
are classified as DDL.
Statements that only query the metastore database, such as SHOW
or DESCRIBE, are put into a separate category of utility statements.
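
To illustrate these categories, the statements below are grouped the way this section classifies them. This is only a sketch: the table sales_demo.orders is a hypothetical name and is assumed to already exist with a single INT partition column named year.

```sql
-- DDL: modifies the metastore database.
COMPUTE STATS sales_demo.orders;
ALTER TABLE sales_demo.orders ADD PARTITION (year=2016);

-- Utility statements: only query the metastore database.
SHOW TABLES IN sales_demo;
DESCRIBE sales_demo.orders;
```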

Note:
The query types shown in the Impala debug web user interface might not match
exactly the categories listed here. For example, currently the USE
statement is shown as DDL in the debug web UI.
The query types shown in the debug web UI are subject to change,
for improved consistency.