I have a cluster installed by Cloudera Manager Path B with Parcel CDH 5.13.1 activated. I have some doubts about how configuration files work in this environment (I was used to editing them manually in the Quickstart virtual machine provided by Cloudera).

For example: I need to add or modify a configuration property in Oozie, so I searched the node where the Oozie server is installed for the file "oozie-site.xml". I got the following results:

<?xml version="1.0"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<configuration>
<!--
Refer to the oozie-default.xml file for the complete list of
Oozie configuration properties and their default values.
-->
<!-- Proxyuser Configuration -->
<!--
<property>
<name>oozie.service.ProxyUserService.proxyuser.#USER#.hosts</name>
<value>*</value>
<description>
List of hosts the '#USER#' user is allowed to perform 'doAs'
operations.
The '#USER#' must be replaced with the username o the user who is
allowed to perform 'doAs' operations.
The value can be the '*' wildcard or a list of hostnames.
For multiple users copy this property and replace the user name
in the property name.
</description>
</property>
<property>
<name>oozie.service.ProxyUserService.proxyuser.#USER#.groups</name>
<value>*</value>
<description>
List of groups the '#USER#' user is allowed to impersonate users
from to perform 'doAs' operations.
The '#USER#' must be replaced with the username o the user who is
allowed to perform 'doAs' operations.
The value can be the '*' wildcard or a list of groups.
For multiple users copy this property and replace the user name
in the property name.
</description>
</property>
-->
<!-- Default proxyuser configuration for Hue -->
<property>
<name>oozie.service.ProxyUserService.proxyuser.hue.hosts</name>
<value>*</value>
</property>
<property>
<name>oozie.service.ProxyUserService.proxyuser.hue.groups</name>
<value>*</value>
</property>
</configuration>

Are the files in /run the ones used by the cluster? If I need to edit the configuration, can I do it manually (and in that case, which file in /run should I modify?), or do I have to do it through the Cloudera Manager web console?

Re: How configuration files are used in in a Cloudera Manager managed cluster?

If you have a cluster managed by Cloudera Manager, it is recommended that you manage your configuration through Cloudera Manager.

Please do not edit the configuration files manually unless you are very familiar with them: copies of the same configuration file are maintained on different nodes, and in different locations on the same node, for various reasons.

Re: How configuration files are used in in a Cloudera Manager managed cluster?

When using Cloudera Manager to manage your cluster, configuration for all your services is stored centrally in Cloudera Manager. When a service role is started, Cloudera Manager assembles the necessary configuration which the agents will download and distribute to a unique "process" directory. Those are the /run/cloudera-scm-agent directories you found to contain oozie-site.xml.
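To see what a running role was actually given, you can inspect its newest process directory. The following is only a minimal sketch: it assumes the process directories are named `<number>-<service>-<ROLE>` (for example `105-oozie-OOZIE_SERVER`) and that a higher number means a more recent deployment. The helper names `latest_process_dir` and `read_properties` are made up for this illustration, and reading these directories may require root.

```python
import glob
import os
import re
import xml.etree.ElementTree as ET

# Location of the agent's process directories, as mentioned in this thread.
PROCESS_ROOT = "/run/cloudera-scm-agent/process"


def latest_process_dir(role_suffix, root=PROCESS_ROOT):
    """Return the process directory ending in role_suffix with the highest
    numeric prefix, or None if there is no match.

    Assumption: directories are named "<number>-<service>-<ROLE>" and the
    number grows with each new deployment.
    """
    best, best_id = None, -1
    for path in glob.glob(os.path.join(root, "*" + role_suffix)):
        match = re.match(r"(\d+)-", os.path.basename(path))
        if match and os.path.isdir(path) and int(match.group(1)) > best_id:
            best, best_id = path, int(match.group(1))
    return best


def read_properties(site_xml):
    """Parse a Hadoop-style *-site.xml file into a {name: value} dict.

    Properties inside XML comments are ignored by the parser.
    """
    props = {}
    for prop in ET.parse(site_xml).getroot().iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props
```

For example, `latest_process_dir("-oozie-OOZIE_SERVER")` would return the directory to look in, and `read_properties(os.path.join(that_dir, "oozie-site.xml"))` would list the properties deployed for that run. This is for inspection only; the files are regenerated when the role is restarted.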

If you wish to change configuration for a service or role, you do so in Cloudera Manager itself so it can deploy the necessary configuration files and set the environment variables to run the process.

The oozie-site.xml file you found in the /opt/cloudera/parcels... directory is a "stock" file that ships with the parcels. It is not intended for use or editing and should not be modified unless instructed by Cloudera.

In order for you to run client commands on a host, you will need to have Cloudera Manager distribute Client Configuration files.

Thus if I understand correctly I should use Cloudera Manager for everything...

During the installation procedure with CM I included the Teradata Connector parcel (which I need in order to connect to a Teradata database through Oozie/Sqoop). In the CM web console I can see that the Teradata Connector parcel has been deployed and activated by CM, but a Sqoop job using the Teradata connector fails because it does not find the connector. According to https://www.cloudera.com/documentation/other/connectors/teradata/1-x/topics/cctd_topic_3.html#concep... the connector should be copied into the Sqoop library inside the parcel, but the file is not present. The documentation also says that, in order to use a Sqoop action inside Oozie that performs a Teradata import, the following should be present in the oozie-site.xml file:

but this property is not set in any of the oozie-site.xml files inside the /run/cloudera-scm-agent/process/* folders, nor does it seem to be present in the Cloudera Manager configuration for Oozie in the web console. In this situation what should I do?
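A check like this one can be scripted instead of opening every file by hand. This is a rough sketch, not an official tool: the function name `files_defining` is invented for the example, and the property name to search for has to be supplied by you (the one passed below is only a placeholder).

```python
import glob
import os
import xml.etree.ElementTree as ET


def files_defining(property_name, filename, root="/run/cloudera-scm-agent/process"):
    """Return {path: bool} for every <root>/*/<filename>, where the bool says
    whether that file contains a <property> whose <name> is property_name."""
    result = {}
    for path in sorted(glob.glob(os.path.join(root, "*", filename))):
        try:
            names = {
                (p.findtext("name") or "").strip()
                for p in ET.parse(path).getroot().iter("property")
            }
        except (ET.ParseError, OSError):
            continue  # unreadable or malformed file: skip it
        result[path] = property_name in names
    return result


if __name__ == "__main__":
    # "some.property.name" is a placeholder, not a real Oozie property.
    for path, found in files_defining("some.property.name", "oozie-site.xml").items():
        print(("set     " if found else "missing ") + path)
```

The same function can be pointed at sqoop-site.xml by changing the `filename` argument.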

The instructions you reference regarding the XML configuration apply to installation when Cloudera Manager is not managing the cluster.

If the documentation is correct, you should only need to create a Sqoop 1 gateway on any hosts that will be using the Teradata connector and then deploy the client configuration. After that, distribute and activate the connector parcel.

Note the following that may be relevant to the problem you describe:

Important: The Sqoop 1 Client Gateway is required for the Teradata Connector to work correctly. Cloudera recommends installing the Sqoop 1 Client Gateway role on any host used to execute the Sqoop CLI. If you do not already have the Sqoop Client service running on your cluster, see Managing the Sqoop 1 Client for instructions on how to add the service using the Cloudera Manager Admin Console.

I've set the Sqoop 1 Client Gateway and deployed the configuration. The Teradata Connector parcel was already activated during the cluster configuration phase.

What puzzles me is that the Cloudera documentation for the manual installation says that the following property must be added to the sqoop-site.xml file in order to use the Teradata connector with Oozie:

I followed the Teradata connector installation path through Cloudera Manager, so I expected the property to have been injected automatically by CM, but it seems to be missing from the sqoop-site.xml files: