By design, HDFS-encrypted files cannot be moved or loaded from one encryption
zone into another encryption zone, or from an encryption zone into an
unencrypted directory. Across these boundaries, encrypted files can only be
copied.

Within an encryption zone, files can be copied, moved, loaded, and
renamed.

Recommendations:

When loading unencrypted data into encrypted tables (e.g., LOAD
DATA INPATH), we recommend placing the source data (to be
encrypted) into a landing zone within the destination encryption
zone.

An attempt to load data from one encryption zone into another
results in a copy operation. Hive uses DistCp to speed up the copy
if the files being copied are larger than the value of the
hive.exec.copyfile.maxsize property. The default
limit is 32 MB.
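
As a minimal sketch, the threshold can be inspected and changed for the current session with SET, assuming your deployment allows this property to be modified at runtime. The 64 MB figure is an illustrative value, not a recommendation.

```sql
-- Print the current threshold. The value is in bytes;
-- the 32 MB default corresponds to 33554432.
SET hive.exec.copyfile.maxsize;

-- Illustrative only: raise the threshold to 64 MB (64 * 1024 * 1024 bytes),
-- so that only files larger than 64 MB are copied with DistCp.
SET hive.exec.copyfile.maxsize=67108864;
```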

Here are two approaches for loading unencrypted data into an encrypted
table:

The first approach is to use the LOAD DATA ... statement.

If the source data does not reside inside the destination encryption
zone, the LOAD statement results in a copy. If your data
is already in HDFS, though, you can use distcp to
speed up the copying process.
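
The two cases can be sketched as follows. The paths and the table name sales_encrypted are hypothetical, and /encrypted_zone is assumed to be an existing encryption zone that also contains the table's storage location.

```sql
-- Hypothetical paths and table name, for illustration only.

-- Source outside the encryption zone: the load results in a copy
-- into the encrypted table.
LOAD DATA INPATH '/data/unencrypted/sales' INTO TABLE sales_encrypted;

-- Source staged in a landing directory inside the same encryption zone
-- as the table: the files can be moved rather than copied.
LOAD DATA INPATH '/encrypted_zone/landing/sales' INTO TABLE sales_encrypted;
```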

If the data is already inside a Hive table, create a new table
with a LOCATION inside an encryption zone, as
follows:
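
A minimal sketch of this approach is shown here. The table names, columns, file format, and paths are all hypothetical, and /encrypted_zone is assumed to be an existing encryption zone. The sketch uses a CREATE TABLE followed by an INSERT ... SELECT, which is one way to do it; other forms of the statement may suit your environment better.

```sql
-- Hypothetical names: sales_plain is the existing unencrypted table,
-- sales_encrypted is the new table stored inside the encryption zone.
CREATE TABLE sales_encrypted (
  id BIGINT,
  amount DOUBLE
)
STORED AS ORC
LOCATION '/encrypted_zone/warehouse/sales_encrypted';

-- Rewrite the rows into the new location; files written under the
-- encryption zone are encrypted by HDFS.
INSERT INTO TABLE sales_encrypted
SELECT id, amount FROM sales_plain;
```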

The location specified in the CREATE TABLE
statement must be within an encryption zone. If you create a
table whose LOCATION points to an unencrypted
directory, your data will not be encrypted. In that case, you must
copy your data to an encryption zone, and then point LOCATION
to a directory in that encryption zone.

If your source data is already encrypted, use the CREATE TABLE
statement. Point LOCATION to the encrypted source directory where
your data resides:
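
For illustration, a table over an already encrypted directory might look like the sketch below. The table name, columns, delimiter, and path are hypothetical. Declaring the table EXTERNAL is an assumption made here so that dropping the table leaves the source files in place; it is not something the text above requires.

```sql
-- Hypothetical names and layout: comma-delimited text files that already
-- reside in a directory inside an encryption zone.
CREATE EXTERNAL TABLE sales_encrypted_src (
  id BIGINT,
  amount DOUBLE
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/encrypted_zone/source/sales';
```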

LOCALSCRATCHDIR: The MapJoin optimization in Hive writes HDFS tables to a
local directory and then uploads them to the distributed cache. To keep
this data encrypted, either disable MapJoin (set
hive.auto.convert.join to false) or
encrypt the local Hive scratch directory
(hive.exec.local.scratchdir). Performance note: disabling MapJoin results in
slower join performance.
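
As a sketch, both settings can be checked from a Hive session. Whether session-level changes are permitted depends on how your cluster is configured, so treat this as an illustration rather than a required procedure.

```sql
-- Disable automatic conversion of joins to MapJoin for this session.
SET hive.auto.convert.join=false;

-- Print the local scratch directory currently in effect, to verify
-- that it points to an encrypted local directory.
SET hive.exec.local.scratchdir;
```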

DOWNLOADED_RESOURCES_DIR: JAR files that are
added to a user session and stored in HDFS are downloaded to
hive.downloaded.resources.dir. If you want these JAR
files to be encrypted, configure
hive.downloaded.resources.dir to be part of an
encryption zone. This directory must be accessible to
HiveServer2.

NodeManager Local Directory List:
Hive stores JAR files and MapJoin files in the distributed cache. To
use MapJoin, or to encrypt JAR files and other resource files, set the YARN
configuration property NodeManager Local Directory List
(yarn.nodemanager.local-dirs) to a
set of encrypted local directories on all nodes.

Alternatively, to disable MapJoin, set
hive.auto.convert.join to false.