Introduction

Access control on data when executing Jobs on an execution server in Talend is one of the critical business requirements that clients look for in a platform. The need for compliance can stem from regulatory reasons or internal business confidentiality. Users therefore often create separate Service Accounts to manage access control on the JobServer and execute Job tasks accordingly. But this can be cumbersome and time consuming, and it can limit audit capabilities if done outside the Talend toolset.

When using Talend Administration Console (TAC), one improvement is to deploy multiple JobServers, with each server mapped to an individual Service Account that controls access to data. Once you lock the Service Account and JobServer combinations, no user can be configured to use RUN AS capabilities for executing JobServer tasks, because sudo capability (in other words, running programs with the security privileges of another user) can be curtailed when creating the Service Accounts. Following this design approach, this article describes a simple Job design that automates the creation of multiple JobServers, and explains how to configure them in TAC to enable access control when executing Jobs.

Creating a DI Job to create multiple JobServers

As shown in the diagram above, the Job automates the creation of multiple JobServers in the following steps.

Validate the context values in the properties file for ports and other details.

Fetch the JobServer Zip file from the shared folder and unarchive it in the current folder.

Create a copy of the extracted Zip file.

Modify the JobServer Properties files according to the configuration for the host machine.

Rename the JobServer according to the Service Account or other business conventions.

Note: This process is explained in the Talend Data Fabric installation guide provided as part of your product installation, and is available in the Talend Help Center. Navigate to Installing and Configuring Talend server modules, then scroll down to the Installing and configuring your JobServers section.
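The steps above can be sketched outside Studio as a short script. This is only an illustration of the sequence, not the DI Job itself, which is built from Talend components. The function names, the 1024-65535 port range, and the property key endings (for example COMMAND_SERVER_PORT) are assumptions; check the key names against the TalendJobServer.properties file shipped in your archive.

```python
import shutil
import zipfile
from pathlib import Path


def validate_ports(ports):
    """Step 1: reject ports that are out of range or used twice (assumed rules)."""
    for name, port in ports.items():
        if not 1024 <= port <= 65535:
            raise ValueError(f"{name}={port} is outside 1024-65535")
    if len(set(ports.values())) != len(ports):
        raise ValueError("each port must be unique")


def set_properties(path, updates):
    """Step 4: rewrite key=value lines whose key ends with a name in updates."""
    lines = []
    for line in path.read_text().splitlines():
        key, sep, _ = line.partition("=")
        suffix = key.strip().rsplit(".", 1)[-1]
        if sep and not line.lstrip().startswith("#") and suffix in updates:
            line = f"{key}={updates[suffix]}"
        lines.append(line)
    path.write_text("\n".join(lines) + "\n")


def create_jobserver(archive, work_dir, service_account, ports):
    """Run steps 1 to 5 and return the folder of the new JobServer."""
    validate_ports(ports)
    extracted = work_dir / "extracted"
    with zipfile.ZipFile(archive) as zf:       # Step 2: unarchive the Zip file
        zf.extractall(extracted)
    target = work_dir / service_account        # Steps 3 and 5: copy, named after the account
    shutil.copytree(extracted, target)
    for props in target.rglob("TalendJobServer.properties"):
        set_properties(props, ports)
    return target
```

A call such as create_jobserver(Path("Talend-JobServer.zip"), Path("."), "svc_talend", {"COMMAND_SERVER_PORT": 8000, "FILE_SERVER_PORT": 8001, "MONITORING_PORT": 8888}) would produce a svc_talend folder; the archive name here is a placeholder. Python's zipfile module does not restore Unix permission bits, so the start and stop scripts may need to be made executable again after extraction.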

Details of the configurable default values of the Job context are listed below:

Running the DI Job

From Studio, build the preceding Job as a binary so it can be used as a utility without further need for Studio.

Copy the binary utility to any appropriate place and extract it.

Navigate to the folder \src\main\resources\proj631\jobserver_0_1\contexts.

Edit the Default.properties file so that the values match the host environment where you plan to deploy the JobServer, using available ports. Importantly, ensure that no firewall or other software rule restricts access to the configured ports.

If you don't want to modify the Default.properties file directly, you can set the parameters at runtime by providing inline parameters when you execute the Job, such as:
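As a minimal sketch, assuming the built Job is started through a launcher script that accepts Talend's --context_param name=value option, the helper below assembles such a command line. The launcher name and the context parameter names are hypothetical; use the names defined in your Default.properties file.

```python
import shlex


def build_job_command(launcher, context_params):
    """Return the argument list for running a built Job with context overrides."""
    command = [launcher]
    for name, value in context_params.items():
        command += ["--context_param", f"{name}={value}"]
    return command


command = build_job_command(
    "./jobserver_run.sh",  # hypothetical launcher script name
    {"service_account": "svc_talend", "command_port": 8000},  # hypothetical parameter names
)
print(shlex.join(command))
```

The resulting list can be passed directly to subprocess.run, which avoids shell quoting problems when a value contains spaces.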

Running a JobServer as a service

After the Job finishes successfully, a new folder is created in the same folder as the archive file. The folder is named after the Service Account name you provided.

Navigate to the folder \svc_talend\Talend-JobServer-20170623_1246-V6.4.1\conf\opensuse_service.

Follow the instructions in the README.txt file to enable the JobServer as a Service.

Setting up TAC with JobServer / Service Accounts mapping to restrict access control

In TAC, access to resources is provided through the project(s). All the resources are tied to the project, and then you give users access to the project(s).

Setting up projects

In a Production TAC, you can create projects with the None storage option as shown below. In this case, there is no source repository like SVN or Git behind this project. The project is simply a label to which resources are attached, both for deploying the binaries and for security purposes.

Depending on how closely you want to manage access to the JobServers, and thus to the Service Accounts, you may decide to create one or more such projects. You can even create 21 such projects to enable you to manage access to the JobServers on a very granular level.

Setting up project authorization

Once you have the projects created, you can assign Ops Users the Operation Manager role and give them Read access to the project(s). In the diagram below, User1 has access to two projects named TENANT1 and TENANT2.

Setting up JobServer agents

Set up one JobServer for each Service Account by duplicating the JobServer directory so that you have one directory for each JobServer, as shown below:

Edit the start_jobserver.sh and stop_jobserver.sh shell scripts to use the correct directory path for each JobServer.

Edit the TalendJobServer.properties file so that the command, file transfer, and monitoring ports are different for every JobServer.

Set up the RUN_AS_WHITELIST parameter for each JobServer, as shown below, to further ensure that no user other than the whitelisted one can execute Jobs through this JobServer.

Set up a user in the users.csv file that you will use when setting up the JobServer. This user is a JobServer User, and it can be different for each JobServer. This prevents anyone from connecting to the JobServer from outside the TAC, for example from the Studio, and submitting Jobs through that JobServer unless they provide the correct credentials.

The whole JobServer folder for each Service Account is owned by that Service Account. So only a root user, or a user that is allowed to sudo to that Service Account, can see the configuration files.

The JobServer is set up to start under that Service Account.

The Service Account has access to all folders required for Jobs running under that Service Account to work.
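When many JobServer directories are duplicated by hand, a port that was not changed is easy to miss. The following sketch scans several TalendJobServer.properties files and reports any port number that appears more than once. The key endings in PORT_SUFFIXES are assumptions to verify against your own properties files, and the function names are illustrative.

```python
from collections import defaultdict
from pathlib import Path

# Assumed endings of the command, file transfer, and monitoring port keys.
PORT_SUFFIXES = ("COMMAND_SERVER_PORT", "FILE_SERVER_PORT", "MONITORING_PORT")


def read_ports(properties_file):
    """Return {key: port} for the port settings found in one properties file."""
    ports = {}
    for line in Path(properties_file).read_text().splitlines():
        key, sep, value = line.partition("=")
        key = key.strip()
        if sep and not key.startswith("#") and key.endswith(PORT_SUFFIXES):
            ports[key] = int(value.strip())
    return ports


def find_port_conflicts(properties_files):
    """Return {port: [locations]} for every port used more than once."""
    users = defaultdict(list)
    for path in properties_files:
        for key, port in read_ports(path).items():
            users[port].append(f"{path}:{key}")
    return {port: where for port, where in users.items() if len(where) > 1}
```

An empty result means every configured port is unique across the files that were checked; it does not confirm that the ports are free on the host or open in the firewall.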

Creating a Keytab

You need a Kerberos keytab for each Service Account. The keytab is placed in the home directory of that Service Account, so only processes started under that Service Account will be able to access the keytab file.
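Placing the keytab in the home directory only protects it if the file permissions are restrictive as well. As a small sketch for POSIX systems, the check below returns True when neither the group nor other users have any permission on the file. The function name is illustrative, and the check does not verify who owns the file.

```python
import os
import stat


def keytab_is_private(path):
    """True if group and other users have no permissions on the keytab file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

A keytab with mode 600 passes this check, while one with mode 644 does not.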

As an example, the ktpass documentation describes how to create a keytab on Windows, but follow the correct process for your operating system.

Setting up the JobServers in TAC

Set up the JobServers in the TAC with the correct ports as shown below.

Setting up JobServer authorization in TAC

Associate each server in the TAC with the corresponding project(s). You can associate one JobServer with one project, or many JobServers with one project, depending on the level of granularity you want. In the beginning, it may be easier to assign one JobServer to one project.

Managing access rights

Once you have set up all the projects and servers, limit the Rights of the Operation Manager to only the two items related to Job Conductor, as shown below.

Configuring Job properties

Since the Production TAC only has projects with the None storage option, you need to provide Jobs as Zip files or from Nexus. When the Job Conductor imports a Zip file from disk or from Nexus, it reads the jobInfo.properties file, which is provided within the Zip file of the Job, to determine which project to link the Job to. To attach the Job to the project label of a different tenant, modify the project= attribute in this file to match your project in Production. For example, this allows a binary Job built from a project called xyz_dev to be attached to the TENANT1 project in Production. This change can be automated through a build process.
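One way to automate this change in a build process is sketched below. It copies a Job archive and replaces the project= line in any entry named jobInfo.properties. The function name is illustrative, and the sketch assumes the attribute appears on its own line starting with project=; inspect a real Job archive before relying on it.

```python
import zipfile


def relabel_job_zip(source_zip, target_zip, project):
    """Copy a Job archive, pointing jobInfo.properties at another project."""
    with zipfile.ZipFile(source_zip) as src, \
            zipfile.ZipFile(target_zip, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item)
            if item.filename.split("/")[-1] == "jobInfo.properties":
                # latin-1 round-trips every byte, so other lines stay untouched.
                lines = data.decode("latin-1").splitlines()
                lines = [
                    f"project={project}" if line.startswith("project=") else line
                    for line in lines
                ]
                data = ("\n".join(lines) + "\n").encode("latin-1")
            dst.writestr(item, data)
```

For example, relabel_job_zip("job_xyz_dev.zip", "job_tenant1.zip", "TENANT1") writes a new archive for the TENANT1 project and leaves the original file unchanged; both file names are placeholders.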

Creating Tasks and Execution

When creating and editing tasks in the Job Conductor, users can only associate a Job with a JobServer that they have access to. As a result, a user can never use the RUN AS feature of TAC to run a Job under a Service Account that they do not have access to.
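The chain of mappings that produces this restriction can be illustrated with a toy model. This is not TAC's API or data model; it only mirrors the setup described in this article, where users are authorized on projects, JobServers are associated with projects, and each JobServer runs under one Service Account. User1, TENANT1, and TENANT2 come from the example above; TENANT3, the server names, and the account names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AccessModel:
    user_projects: dict = field(default_factory=dict)    # user -> set of projects
    project_servers: dict = field(default_factory=dict)  # project -> set of JobServers
    server_account: dict = field(default_factory=dict)   # JobServer -> Service Account

    def servers_for(self, user):
        """JobServers reachable through the projects the user is authorized on."""
        servers = set()
        for project in self.user_projects.get(user, set()):
            servers |= self.project_servers.get(project, set())
        return servers

    def accounts_for(self, user):
        """Service Accounts under which the user's Jobs can run."""
        return {self.server_account[s] for s in self.servers_for(user)}


model = AccessModel(
    user_projects={"User1": {"TENANT1", "TENANT2"}},
    project_servers={
        "TENANT1": {"js_tenant1"},
        "TENANT2": {"js_tenant2"},
        "TENANT3": {"js_tenant3"},
    },
    server_account={
        "js_tenant1": "svc_tenant1",
        "js_tenant2": "svc_tenant2",
        "js_tenant3": "svc_tenant3",
    },
)
print(sorted(model.accounts_for("User1")))
```

In this model, User1 can reach the Service Accounts behind TENANT1 and TENANT2 but not the one behind TENANT3, because no project authorization links User1 to that JobServer.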