Overview of Impala Databases

In Impala, a database is a logical container for a group of tables. Each database defines a separate
namespace. Within a database, you can refer to the tables inside it using their unqualified names. Different
databases can contain tables with identical names.

Creating a database is a lightweight operation. There are few database-specific properties to configure:
only LOCATION and COMMENT. There is no ALTER DATABASE statement.
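
As a minimal sketch of how lightweight the operation is, the statements below create a database with only a
COMMENT and then confirm that it exists. The name web_analytics and the comment text are hypothetical, chosen
only for illustration.

```sql
-- web_analytics is a hypothetical database name used only for illustration.
-- IF NOT EXISTS makes the statement safe to re-run.
CREATE DATABASE IF NOT EXISTS web_analytics
  COMMENT 'Tables for the web analytics project';

-- List the databases whose names start with "web".
SHOW DATABASES LIKE 'web*';
```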

Typically, you create a separate database for each project or application, to avoid naming conflicts between
tables and to make clear which tables are related to each other. The USE statement lets
you switch between databases. Unqualified references to tables, views, and functions refer to objects
within the current database. You can also refer to objects in other databases by using qualified names
of the form dbname.object_name.
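
The following sketch illustrates unqualified and qualified references. The databases sales_app and hr_app and
their customers tables are hypothetical; they show that two databases can each hold a table with the same name.

```sql
-- Hypothetical databases and tables, for illustration only.
CREATE DATABASE IF NOT EXISTS sales_app;
CREATE DATABASE IF NOT EXISTS hr_app;

-- The same table name can exist in both databases without conflict.
CREATE TABLE IF NOT EXISTS sales_app.customers (id BIGINT, name STRING);
CREATE TABLE IF NOT EXISTS hr_app.customers (id BIGINT, name STRING);

-- Switch the current database, then confirm which one is in effect.
USE sales_app;
SELECT current_database();

-- Unqualified name: resolves to sales_app.customers.
SELECT COUNT(*) FROM customers;

-- Qualified name: reaches the table in the other database without another USE.
SELECT COUNT(*) FROM hr_app.customers;
```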

Each database is physically represented by a directory in HDFS. When you do not specify a LOCATION
attribute, the directory is located in the Impala data directory, and the associated tables are managed by Impala.
When you do specify a LOCATION attribute, any read and write operations for tables in that
database are relative to the specified HDFS directory.
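
A sketch of the LOCATION case follows. The database name staging_data and the HDFS path are assumptions made
for illustration, and the path must be one that Impala is allowed to write to. DESCRIBE DATABASE is used here
to check the resulting directory; it may not be available in older Impala releases.

```sql
-- staging_data and the HDFS path below are hypothetical.
CREATE DATABASE IF NOT EXISTS staging_data
  LOCATION '/user/etl/staging_data.db';

-- Show the name, HDFS location, and comment recorded for the database.
DESCRIBE DATABASE staging_data;

-- A table created in this database without its own LOCATION clause
-- gets its data directory under the database's directory.
CREATE TABLE IF NOT EXISTS staging_data.raw_events (event_id BIGINT, payload STRING);
```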

There is a special database, named default, where you begin when you connect to Impala.
Tables created in default are physically located one level higher in HDFS than all the
user-created databases.

Impala includes another predefined database, _impala_builtins, that serves as the location
for the built-in functions. To see the built-in functions, use statements like the following:

show functions in _impala_builtins;
show functions in _impala_builtins like '*substring*';