Re: Upgrading CDH to use Hive 1.2.0 or higher

The short answer is NO. You can't upgrade Hive, or any other component, independently. The components in a given CDH release depend on one another and have been designed and tested to work together at the versions shipped in that release. Upgrading a single component breaks those dependencies and will cause issues in the cluster.

Which specific JIRA are you looking for in Hive 1.2? Chances are that it is already included in the CDH version of Hive. Please see the response to Daniil below for further details.

Re: Upgrading CDH to use Hive 1.2.0 or higher

Even though the latest CDH release ships Hive as version 1.1, many upstream JIRAs from later Hive versions have already been included in the CDH version of Hive 1.1. The version numbers of upstream Hive and CDH Hive are not directly comparable, because the two contain different code bases. The same is true of every other component, not just Hive.

The Hive version string hive-1.1.0+cdh5.13.1+1283 means that CDH Hive is based on upstream Hive 1.1.0, plus another 1283 JIRAs committed on top of it that are not available in the upstream Hive 1.1 release.
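To make the version string easier to read, here is a minimal Python sketch that splits it into its parts. The layout it assumes (component, upstream base version, CDH release, patch count) is inferred only from the single example above; the names `CdhVersion` and `parse_cdh_version` are made up for illustration and are not part of any Cloudera tool.

```python
import re
from typing import NamedTuple


class CdhVersion(NamedTuple):
    """Parts of a CDH package version string (hypothetical helper type)."""
    component: str       # e.g. "hive"
    upstream_base: str   # upstream version CDH started from, e.g. "1.1.0"
    cdh_release: str     # CDH release the package belongs to, e.g. "5.13.1"
    extra_patches: int   # number of changes added on top of the upstream base


# Assumed layout: <component>-<base>+cdh<release>+<count>
_PATTERN = re.compile(
    r"^(?P<component>[a-z0-9_]+)"
    r"-(?P<base>\d+(?:\.\d+)*)"
    r"\+cdh(?P<cdh>\d+(?:\.\d+)*)"
    r"\+(?P<patches>\d+)$"
)


def parse_cdh_version(text: str) -> CdhVersion:
    """Split a string such as 'hive-1.1.0+cdh5.13.1+1283' into its parts."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognized CDH version string: {text!r}")
    return CdhVersion(
        component=match.group("component"),
        upstream_base=match.group("base"),
        cdh_release=match.group("cdh"),
        extra_patches=int(match.group("patches")),
    )


if __name__ == "__main__":
    print(parse_cdh_version("hive-1.1.0+cdh5.13.1+1283"))
```

The point of the split is that the upstream base (1.1.0) alone says little about which fixes are present; the trailing count is what distinguishes CDH Hive from upstream Hive 1.1.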

Re: Upgrading CDH to use Hive 1.2.0 or higher

Hive 1.1 (CDH 5.4+) only offers UNION ALL (bag union), in which duplicate rows are not eliminated. Hive 1.2 introduced UNION DISTINCT, and from that version on, if no UNION type is explicitly specified, the default UNION operation is DISTINCT. However, the new UNION DISTINCT capability came with some other subtle changes to how UNION ALL works. We are unable to introduce those changes into CDH 5 because of the risk of affecting existing workloads. The feature will be available in CDH 6.

In CDH 5, the only supported UNION is UNION ALL. If it fulfils your business requirements, please include the ALL keyword.
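For anyone unsure how the two union types differ, the following is a rough sketch using Python's built-in sqlite3 module. SQLite is only a stand-in here, not Hive, and the tables `a` and `b` are hypothetical. The last query wraps UNION ALL in SELECT DISTINCT, a common SQL rewrite that removes duplicates when only UNION ALL is available; it is not something prescribed in this thread, so please verify it against your own CDH Hive version before relying on it.

```python
import sqlite3


def union_demo():
    """Compare bag union, set union, and DISTINCT over UNION ALL in SQLite."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (x INTEGER)")
    conn.execute("CREATE TABLE b (x INTEGER)")
    conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (2,)])
    conn.executemany("INSERT INTO b VALUES (?)", [(2,), (3,)])

    queries = {
        # Bag union: every row from both inputs is kept, duplicates included.
        "union_all": "SELECT x FROM a UNION ALL SELECT x FROM b",
        # Set union: in SQLite a bare UNION removes duplicate rows.
        "set_union": "SELECT x FROM a UNION SELECT x FROM b",
        # Emulation that needs only UNION ALL plus SELECT DISTINCT.
        "emulated": (
            "SELECT DISTINCT x FROM "
            "(SELECT x FROM a UNION ALL SELECT x FROM b) AS u"
        ),
    }
    # Sort in Python so the comparison does not depend on row order.
    results = {
        name: sorted(row[0] for row in conn.execute(sql))
        for name, sql in queries.items()
    }
    conn.close()
    return results


if __name__ == "__main__":
    for name, rows in union_demo().items():
        print(name, rows)
```

With this sample data the bag union keeps all five rows, while the set union and the emulated form both return each value once.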

Re: Upgrading CDH to use Hive 1.2.0 or higher

We do not have a publicly available roadmap for CDH 6 yet. While nothing is final until it is final, I think it's safe to say that we will be upgrading to at least Hive 1.2, which includes the requested feature.

Re: Upgrading CDH to use Hive 1.2.0 or higher

I have a similar problem with my data modeling tool (erwin). CDH 5.15 is unable to process the queries generated by erwin, which supports Hive 2.1. When I attempt to reverse engineer my MySQL metastore, the following errors are returned:

--This file is published using to trace the exected SQL
--ERwin RE SQL Trace for Hive 2.1.x started on 2018-06-21 10:23:42