Authentication and Authorization

SAML authentication fails with "Cloudera Manager Only" setting

With the following combination of Cloudera Manager configuration properties set, authentication to Navigator fails:

Authentication Backend Order: Cloudera Manager Only

External Authentication Type: SAML

Workaround: To configure Navigator for SAML authentication, use an Authentication Backend Order other than "Cloudera
Manager Only".

Affected Versions: Cloudera Navigator 6.0.0 and later

Fixed Versions: N/A

Cloudera Issue: NAV-6211

Local login errors return the browser to the SAML login page

With SAML authentication enabled for Navigator, administrators can use locallogin.html to log in with local credentials instead of SAML. However,
if the administrator enters a wrong username or password, the page is redirected to login.html?error=true.

When that happens, login.html is no longer a local login page, and the browser is redirected to the IDP
address for SAML authentication.

Workaround: After the login failure, the URL changes to something similar to:

https://hostname:7187/login.html?error=true

To return to the local login page, change the browser address to a URL similar to:

https://hostname:7187/locallogin.html
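
The URL change described in this workaround can be sketched in a few lines of Python. This is only an illustration: the function name local_login_url is hypothetical, and the sketch assumes the local login page is served from the same scheme, host, and port as the failed login page, as in the example URLs above.

```python
from urllib.parse import urlsplit, urlunsplit


def local_login_url(failed_url: str) -> str:
    """Build the locallogin.html address for the host in failed_url."""
    parts = urlsplit(failed_url)
    # Keep scheme and host:port, replace the path, and drop the
    # "?error=true" query string and any fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/locallogin.html", "", ""))


print(local_login_url("https://hostname:7187/login.html?error=true"))
```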

Affected Versions: Cloudera Navigator 6.0.0 and later

Fixed Versions: N/A

Cloudera Issue: NAV-5824

Cloudera Manager Configuration

Adding a blank audit filter removes filter configuration property

In Cloudera Manager, adding an empty rule to a service's Audit Event Filter and then saving the change causes all existing audit event filters to be lost. The filter configuration property
is removed from Cloudera Manager's list of configuration properties. Reverting the change in History and Rollback neither restores the previous filters nor brings back the filter property.

Affected Versions: Cloudera Navigator 6.0.0 and later

Fixed Versions: N/A

Cloudera Issue: NAV-6096

Overriding safety valve settings disables audit and lineage features

Customers or third-party applications such as Unravel may require that hive.exec.post.hooks be configured in a HiveServer2 safety valve. When such a safety valve is present, Cloudera Manager
comments out the hive.exec.post.hooks value that it configures when audit or lineage is enabled for Hive. The safety valve content shows the commented code:

<!--'hive.exec.post.hooks', originally set to
'com.cloudera.navigator.audit.hive.HiveExecHookContext,org.apache.hadoop.hive.ql.hooks.LineageLogger'
(non-final), is overridden below by a safety valve-->

This automated change disables Navigator's auditing and lineage features without notification.

At this time, there is no workaround.

Affected Versions: Cloudera Navigator 6.0.0 and later

Fixed Versions: N/A

Cloudera Issue: NAV-5331

Hive, Hue, Impala

Overriding safety valve settings disables audit and lineage features

Customers or third-party applications such as Unravel may require that hive.exec.post.hooks be configured in a HiveServer2 safety valve. When such a safety valve is present, Cloudera Manager
comments out the hive.exec.post.hooks value that it configures when audit or lineage is enabled for Hive. The safety valve content shows the commented code:

<!--'hive.exec.post.hooks', originally set to
'com.cloudera.navigator.audit.hive.HiveExecHookContext,org.apache.hadoop.hive.ql.hooks.LineageLogger'
(non-final), is overridden below by a safety valve-->

This automated change disables Navigator's auditing and lineage features without notification.

Affected Versions: Cloudera Navigator 6.0.0 and later

Workaround: To fix this problem, manually merge the original HiveServer2 safety valve content for hive.exec.post.hooks with
the new value. For example, in the case of Unravel, the new safety valve would look like the following:
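
A minimal sketch of the merge, under stated assumptions: the two Navigator hook classes are taken from the commented-out value shown above, while com.example.hooks.ThirdPartyPostHook is a hypothetical placeholder for whatever class the third-party application requires (it is not a real Unravel class name). The helper names merge_hooks and safety_valve_property, and the choice to list the third-party hook first, are also assumptions made for illustration.

```python
from xml.sax.saxutils import escape

# Hook classes that Cloudera Manager originally sets for audit and lineage,
# as shown in the commented-out safety valve content.
NAVIGATOR_HOOKS = [
    "com.cloudera.navigator.audit.hive.HiveExecHookContext",
    "org.apache.hadoop.hive.ql.hooks.LineageLogger",
]


def merge_hooks(custom_hooks, original_hooks=NAVIGATOR_HOOKS):
    """Combine custom and original hook classes, dropping duplicates."""
    merged = []
    for hook in list(custom_hooks) + list(original_hooks):
        hook = hook.strip()
        if hook and hook not in merged:
            merged.append(hook)
    return ",".join(merged)


def safety_valve_property(value):
    """Render the merged value as a Hadoop-style XML property element."""
    return (
        "<property>\n"
        "  <name>hive.exec.post.hooks</name>\n"
        f"  <value>{escape(value)}</value>\n"
        "</property>"
    )


# Hypothetical third-party hook class; substitute the real class name.
custom = ["com.example.hooks.ThirdPartyPostHook"]
print(safety_valve_property(merge_hooks(custom)))
```

The printed property element is what would be pasted into the HiveServer2 safety valve, so that the third-party hook and both Navigator hooks remain active together.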

Viewing Navigator tags in Hue overloads Metadata Server heap

When users view Cloudera Navigator tags through Hue, Navigator uses more memory than usual and does not release the memory after the user logs out of Hue. Eventually, the calls between Hue and
Navigator occupy most of the heap space allocated to Navigator Metadata Server.

Impala lineage delay when running queries from Hue

When you run Impala queries from Hue, the lineage does not appear in Navigator until Impala determines that the query is complete. Hue gives users the
opportunity to pull another set of results on the same query, so Impala holds the query open. Lineage metadata is sent after Impala reaches its configured query timeout or after an event such as another
query or a logout from Hue occurs.

Workaround: Set low timeouts for queries in Hue, or add an Impala query timeout specifically to the Hue safety valve and set the timeout to 3-5 minutes, so
that queries show up in Navigator after Hue is idle for some time. Hue will notify users that the query needs to be run again, but it also releases the query resources. Here are the
options:

HiveServer1 and Hive CLI support removed

Cloudera Navigator requires HiveServer2 for complete governance of Hive queries. Cloudera Navigator does not capture audit events for queries that are run on HiveServer1 or the Hive CLI, and
lineage is not captured for certain types of operations that are run on HiveServer1.

If you use Cloudera Navigator to capture auditing, lineage, and metadata for Hive operations, upgrade to HiveServer2 if you have not done so already.

Affected Versions: Cloudera Navigator 6.x

Fixed Versions: N/A

Cloudera Issue: TSB-185

Streaming Audit Events

Error blocks second streaming target

When streaming audit messages to both Flume and Kafka, if the Flume client throws an exception, Navigator Audit Server does not send the same messages to Kafka. To recover from this
problem, restore the Flume client to a working state.

Affected Versions: Cloudera Navigator 6.x

Fixed Versions: N/A

Cloudera Issue: NAV-7143

Navigator Audit Server

When running on Oracle Enterprise Linux 7.6 with an Oracle 12 database, Navigator Audit Server times out when connecting to the Oracle database instance. An error message similar to
the following appears in the Navigator Audit Server log:

Workaround: Add the following entry in the Cloudera Management Service configuration option "Java Configuration Options for Navigator Audit Server":

-Djava.security.egd=file:///dev/urandom

Affected Versions: Cloudera Navigator 6.2.0, 6.3.0

Cloudera Issue: NAV-7169

Logging Threshold setting is not honored

The value for Navigator Audit Server Logging Threshold found in Cloudera Manager is not honored. Instead, messages are logged at trace level and displayed at DEBUG syslog level. This
configuration property is set in Cloudera Management Service > Configuration > Navigator Audit Server.

Cloudera Issue: NAV-3737

Navigator Metadata Server

Navigator Embedded Solr can reach the limit on the number of documents it can store

Navigator Metadata Server extracts HDFS entities by performing a one-time bulk extraction and then switching to incremental extraction. In Cloudera Manager releases 5.10.0, 5.10.1 and
5.11.0 (Navigator releases 2.9.0, 2.9.1, and 2.10.0), a problem causes HDFS bulk extraction to be run more than one time, resulting in duplicate relations created for HDFS. Over time, embedded Solr
runs out of document IDs that it can assign to new relations and fails with the following error:

Log includes the error "EndPoint1 must not be null"

The following error may appear in the Navigator Metadata Server log in systems upgraded from Cloudera Manager version 5.x:

2017-10-17 13:00:23,007 ERROR com.cloudera.nav.hive.extractor.AbstractHiveExtractor [CDHExecutor-0-CDHUrlClassLoader@14784b7b]: Unable to parse hive view query *: EndPoint1 must not be null or empty
java.lang.IllegalStateException: EndPoint1 must not be null or empty

This error occurs because the Hive pull extraction for creating a Hive view produces an incorrect lineage relationship for the Hive view. However, Navigator also receives information for
the view creation through the push extractor, which correctly produces the lineage relation. You can safely ignore this error.

Affected Versions: Cloudera Navigator 6.0.0 and later

Fixed Versions: N/A

Cloudera Issue: NAV-4224

Purge

First purge job may run twice

Navigator purge jobs are scheduled using UTC. However, the first time Navigator runs a purge, the scheduler triggers the job twice: once in the UTC timezone and a second time in the local
timezone. After that, the schedule is triggered as expected. Other than the first purge running at an unexpected time, there are no side effects of this issue.
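
To see why the extra run lands at an unexpected time, the sketch below interprets the same wall-clock schedule once as UTC and once as local time. It is an illustration only: the function name first_purge_triggers is hypothetical, the UTC-8 offset is an example value, and the code does not reflect how the Navigator scheduler is implemented.

```python
from datetime import datetime, timedelta, timezone


def first_purge_triggers(scheduled: datetime, local_tz: timezone):
    """Return the UTC instants at which the first purge may fire."""
    wall = scheduled.replace(tzinfo=None)
    # The same wall-clock time read as UTC and as local time.
    as_utc = wall.replace(tzinfo=timezone.utc)
    as_local = wall.replace(tzinfo=local_tz).astimezone(timezone.utc)
    # A set collapses the two values when the local timezone is UTC.
    return sorted({as_utc, as_local})


# Example: a purge scheduled for 01:00 on a host whose local time is UTC-8.
pacific = timezone(timedelta(hours=-8))
for instant in first_purge_triggers(datetime(2019, 1, 5, 1, 0), pacific):
    print(instant.isoformat())
```

On a host whose local timezone is UTC, both readings give the same instant, so the function returns a single value.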

Affected Versions: Cloudera Navigator 6.0.0 and later

Fixed Versions: N/A

Cloudera Issue: NAV-6666

Purge can create data that's too big for Solr to process

Solr's POST request payload limit is set to 2 MB, which can be exceeded when purging a large Navigator metadata storage directory. The purge job fails with an error similar to the
following:

Use the identity of the matching HDFS service for this cluster as the sourceId.

Affected Versions: Cloudera Navigator 6.0.0 and later

Fixed Versions: N/A

Cloudera Issue: NAV-3537

Spark

Spark Lineage Limitations and Requirements

Spark lineage diagrams are supported in the Cloudera Navigator 6.0 release. Spark lineage is supported for Spark 1.6 and Spark 2.3. Lineage is not available for Spark when Cloudera
Manager is running in single user mode. In addition to these requirements, Spark lineage has the following limitations:

Lineage is produced only for data that is read, written, and processed using the DataFrame and SparkSQL APIs. Lineage is not available for data that is read, written, or processed using
Spark's RDD APIs.

Lineage information is not produced for calls to aggregation functions such as groupBy().

The default lineage directory for Spark on YARN is /var/log/spark/lineage. No process or user should write files to this directory, because doing so can cause
agent failures. In addition, changing the Spark on YARN lineage directory has no effect: the default remains /var/log/spark/lineage.

Spark extractor enabled using safety valve deprecated

The Spark extractor that was included prior to CDH 5.11 and enabled by setting the safety valve nav.spark.extraction.enable=true is deprecated and could be
removed completely in a future release. If you are upgrading from CDH 5.10 or earlier and were using the extractor configured with this safety valve, be sure to remove the setting when you
upgrade.
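
Removing the setting amounts to dropping one line from the safety valve content. The sketch below assumes the safety valve is plain key=value text, one setting per line, that you have copied out of Cloudera Manager. The function name remove_setting and the other setting in the sample text are hypothetical.

```python
def remove_setting(safety_valve_text, key="nav.spark.extraction.enable"):
    """Return the safety valve text without any line that sets key."""
    kept = []
    for line in safety_valve_text.splitlines():
        # Compare only the part before "=" so other settings are untouched.
        name = line.split("=", 1)[0].strip()
        if name != key:
            kept.append(line)
    return "\n".join(kept)


# Hypothetical safety valve content; only the Spark extractor line is removed.
before = "example.other.setting=1\nnav.spark.extraction.enable=true"
print(remove_setting(before))
```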

Upgrade Issues and Limitations

Upgrading Cloudera Navigator from Cloudera Manager 5.9 or Earlier Can be Extremely Slow

Upgrading a cluster running Cloudera Navigator to Cloudera Manager 5.10 (or higher) can be extremely slow due to an internal change made to the Solr schema in Cloudera Navigator 2.9. A
Solr instance is embedded in Cloudera Navigator and supports its search capabilities. The Solr schema used by Cloudera Navigator has been modified in the 2.9 release to use datatype long rather than string for an internal id field. This change makes Cloudera Navigator far more robust and scalable
over the long term.

However, the upgrade process itself can take a long time because the existing Solr documents (the indexed and searchable data structures used by Solr that are contained in
the Cloudera Navigator storage directory) are migrated to the new schema. This change to the Solr schema affects only those Cloudera Navigator deployments that use the metadata and lineage features.

Note: Cloudera Navigator deployments that use only the auditing features of the product, not metadata or lineage, are not affected by this
issue.

The upgrade process for Cloudera Navigator starts automatically at the end of the Cloudera Manager upgrade, and the migration to the new schema occurs automatically as part of that upgrade process.
The Navigator Metadata Server and Navigator console are not available during the upgrade. Navigator Audit Server runs normally. The amount of time that administrators should allow for this process
depends on the amount of data stored at the Navigator Metadata Server Storage Dir (nav.data.dir, or simply "storage directory") location, as listed here:

Metadata and lineage usage: None
Description: Deployments that use only the Cloudera Navigator audit capability, without metadata or
lineage, do not have the issue. Back up the Navigator Metadata Server storage directory and then delete it before upgrading.

Metadata and lineage usage: Storage directory < 60 GB
Description: For deployments with relatively small Navigator Metadata Server data directories, the upgrade
process may take 1 to 2 days to complete. See the workaround below for steps to take before upgrading to Cloudera Manager 5.10 to possibly reduce the upgrade time.

Metadata and lineage usage: Storage directory > 60 GB
Description: For deployments with relatively large Navigator Metadata Server data directories, the upgrade
process may take several days to complete. See the workaround below for steps to take before upgrading to Cloudera Manager 5.10 to possibly reduce the upgrade time.

Workaround: To reduce the time required for upgrading the Navigator Metadata Server data directories for deployments that currently
run Cloudera Navigator 2.8 and use its metadata and lineage features, consider removing unneeded entries from the metadata before the upgrade. The Navigator Purge feature allows you to remove
metadata for deleted entities and for entities and operations older than a specified date. For more information on what metadata you can remove with Purge, see Managing Metadata Storage with Purge.

Run purge before starting the Cloudera Manager upgrade (to Cloudera Manager 5.10), following the steps below.

Warning: These steps may mitigate but do not fully resolve the issue. Follow these steps before starting the
Cloudera Manager upgrade for any Cloudera Manager 5.9 or earlier cluster that currently uses the Cloudera Navigator metadata and lineage features.

Check the Navigator Metadata Server storage directory size. The path is /var/lib/cloudera-scm-navigator (default) unless configured otherwise. If you
need to check the setting:

Set options to purge metadata for deleted HDFS entities and any operations.

Check the storage directory size again. If needed, re-run the purge with a shorter time span to retain metadata. If the storage directory size cannot be reduced below 60 GB, do
not start the Cloudera Manager upgrade. Instead, contact Cloudera support to help you with this upgrade.
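
The size checks in the steps above can be scripted. The following is a minimal sketch, assuming the default storage directory path given earlier and treating 60 GB as 60 * 1024^3 bytes (the source does not say whether GB is decimal or binary). The function names directory_size_bytes and below_upgrade_threshold are hypothetical.

```python
import os

# 60 GB threshold from the steps above, interpreted here as GiB (assumption).
THRESHOLD_BYTES = 60 * 1024**3


def directory_size_bytes(path):
    """Sum the sizes of regular files under path, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.islink(full):
                continue
            try:
                total += os.path.getsize(full)
            except OSError:
                # The file disappeared or is unreadable; ignore it.
                pass
    return total


def below_upgrade_threshold(size_bytes, threshold=THRESHOLD_BYTES):
    """True when the storage directory is smaller than the threshold."""
    return size_bytes < threshold


if __name__ == "__main__":
    # Default location; adjust if the storage directory is configured elsewhere.
    path = "/var/lib/cloudera-scm-navigator"
    size = directory_size_bytes(path)
    print(f"{path}: {size / 1024**3:.1f} GiB")
    if not below_upgrade_threshold(size):
        print("Storage directory is 60 GB or larger; purge again or contact support.")
```

Run the script as a user who can read the storage directory, once before the purge and once after it, to confirm that the purge reduced the size.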
