IGI Installation and Configuration Guide for gLite 3.2 SL5 x86_64

This document is a complete guide to the installation and configuration of the INFNGRID profiles aligned with gLite middleware version 3.2 on SL5 (or other RHEL5 clones) for the x86_64 architecture (no i386 profiles have been deployed so far).

Currently only a few profiles have been ported to SL5 x86_64 and integrated in the INFNGRID release. At the following link you can find the current status of the gLite and INFNGRID porting:

Installation

OS installation

You may find information on the official repositories at [[http://igrelease.forge.cnaf.infn.it/doku.php?id=doc:tips:repos|Repositories for APT and YUM]]. If you want to set up a local installation server please refer to the Mrepo Quick Guide.

Check the FQDN hostname

Ensure that the hostnames of your machines are correctly set. Run the command:

hostname -f

It should print the fully qualified domain name (e.g. prod-ce.mydomain.it). Correct your network configuration if it prints only the hostname without the domain. If you are installing WNs on a private network, the command must return the external FQDN for the CE and the SE (e.g. prod-ce.mydomain.it) and the internal FQDN for the WNs (e.g. node001.myintdomain).
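If the FQDN is wrong, a minimal sketch of the usual fix on SL5 is the following; the hostname and IP address are placeholders for your own:

# /etc/sysconfig/network
HOSTNAME=prod-ce.mydomain.it

# /etc/hosts
192.0.2.10   prod-ce.mydomain.it   prod-ce

Then restart the network service (service network restart) or reboot the node and check again with "hostname -f".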

CAs installation

Metapackage installation

Please consider that the x86_64 WN profiles have to be installed using the "groupinstall" yum command as follows:

yum groupinstall <WN_profile>

where <WN_profile> could be one of: ig_WN, ig_WN_noafs, ig_WN_torque, ig_WN_torque_noafs, ig_WN_LSF, ig_WN_LSF_noafs or

yum groupinstall <UI_profile>

where <UI_profile> could be one of: ig_UI, ig_UI_noafs
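For example, to install a Torque worker node profile without AFS (one of the profiles listed above) you would run:

yum groupinstall ig_WN_torque_noafs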

If you are installing any other profile use:

yum install <metapackage>

where <metapackage> is one of those reported in the table above (Metapackages column).

IMPORTANT NOTE:
When you are installing ig_CREAM, ig_CREAM_LSF or ig_CREAM_torque, remember to add an exclude line to the .repo file, e.g. in /etc/yum.repos.d/slc5-updates.repo or /etc/yum.repos.d/sl5-security.repo:

exclude=c-ares
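A minimal sketch of what the resulting stanza may look like; the repository name and baseurl are placeholders, only the exclude line matters:

[sl5-security]
name=SL5 security updates
baseurl=http://your.mirror.example/sl5x/updates/security/x86_64/
enabled=1
gpgcheck=1
exclude=c-ares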

Special cases

CE and Batch Server configuration (access to log files)

Regardless of the kind of deployment you have, with the batch-system master (TORQUE or LSF) on a different machine than the CREAM CE or on the same one, you have to make sure that the CE can access the batch-system log files.
You must set up a mechanism to ''transfer'' the accounting logs to the CE (see the sketch after this list):

through NFS (don't forget to set $BATCH_LOG_DIR and $DGAS_ACCT_DIR in the <your-site-info.def> configuration file)

through a daily cron job that copies them into the directories defined by $BATCH_LOG_DIR and $DGAS_ACCT_DIR in the <your-site-info.def> configuration file
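A minimal sketch of the NFS option, assuming a Torque Batch Server named batch-master.mydomain.it that keeps its logs under /var/torque and a CE named prod-ce.mydomain.it (all names and paths are placeholders):

# on the Batch Server, /etc/exports
/var/torque   prod-ce.mydomain.it(ro,sync)

# on the CE, /etc/fstab
batch-master.mydomain.it:/var/torque   /var/torque   nfs   ro,defaults   0 0

The cron alternative simply copies the same files once a day into the directories pointed to by $BATCH_LOG_DIR and $DGAS_ACCT_DIR.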

CREAM_torque (multiple CEs and single Batch Server)

This configuration requires particular attention both during the metapackage installation and during the nodetype configuration.

Install the first CE (that will act also as Batch Server) following the usual procedure:

yum install ig_CREAM_torque

Install the other secondary CEs without batch server software as follows:

Batch system installation (only for WN)

LSF server/client installation must be done *manually*, whereas Torque server/client installation is included in the metapackage.

Configuration

Configuration files

IGI YAIM configuration files

YAIM configuration files should be stored in a directory structure. All the involved files HAVE to be under the same folder <confdir>, in a safe place, which is not world readable. This directory should contain:

<your-site-info.def>

whole-site

List of configuration variables in the format of key-value pairs. It's a mandatory file and it's the parameter passed to the ig_yaim command. IMPORTANT: You should always check whether your <your-site-info.def> is up-to-date by comparing it with the latest /opt/glite/yaim/examples/siteinfo/ig-site-info.def template deployed with ig-yaim, and merge the differences you find. For example you may use vimdiff, as shown below:
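A comparison command of this kind could be used (<confdir> is a placeholder for your configuration directory):

vimdiff /opt/glite/yaim/examples/siteinfo/ig-site-info.def <confdir>/<your-site-info.def>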

<your-wn-list.conf>

whole-site

Worker node list, in the format of one hostname.domainname per row. It's a mandatory file. It's defined by the WN_LIST variable in <your-site-info.def>.

<your-users.conf>

whole-site

Pool account user mapping. It's a mandatory file. It's defined by the USERS_CONF variable in <your-site-info.def>. IMPORTANT: You may create <your-users.conf> starting from the /opt/glite/yaim/examples/ig-users.conf template deployed with ig-yaim, but you will probably have to fill it according to your site policy on uids/gids. We suggest proceeding as explained here: "Whole site: How to create local users.conf and configure users".

<your-groups.conf>

whole-site

VOMS group mapping. It's a mandatory file. It's defined by the GROUPS_CONF variable in <your-site-info.def>. IMPORTANT: You may create <your-groups.conf> starting from the /opt/glite/yaim/examples/ig-groups.conf template deployed with ig-yaim.

Additional files

Furthermore the configuration folder can contain:

Directory

Scope

Details

services/

service-specific

It contains one file per nodetype, with the name format ig-<node_type>. The file contains a list of configuration variables specific to that nodetype. Each yaim module distributes a configuration file in /opt/glite/yaim/examples/siteinfo/services/[ig or glite]-<node_type>. It's a mandatory directory if required by the profile and you should copy it under the same directory where <your-site-info.def> is.

nodes/

host-specific

It contains a file per host with the name format: hostname.domainname. The file contains host specific variables that are different from one host to another in a certain site. It's an optional directory.

vo.d/

VO-specific

It contains one file per VO, with the name format <vo_name>, but most of the VO settings are still placed in the ig-site-info.def template. For example, for "lights.infn.it", see the sketch after this table entry.

It's an optional directory for “normal” VOs (like atlas, alice, babar), mandatory only for “fqdn-like” VOs. In case you support such VOs you should copy the structure vo.d/<vo.specific.file> under the same directory where <your-site-info.def> is.
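A minimal sketch of such a vo.d file, assuming the standard YAIM VO variables; the VOMS server name, DNs and port below are purely illustrative placeholders:

# <confdir>/vo.d/lights.infn.it
SW_DIR=$VO_SW_DIR/lights
DEFAULT_SE=$SE_HOST
VOMS_SERVERS="'vomss://voms.example.org:8443/voms/lights.infn.it?/lights.infn.it'"
VOMSES="'lights.infn.it voms.example.org 15013 /C=IT/O=EXAMPLE/CN=voms.example.org lights.infn.it'"
VOMS_CA_DN="'/C=IT/O=EXAMPLE/CN=Example CA'"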

group.d/

VO-specific

It contains a file per VO with the name format: groups-<vo_name>.conf. The file contains VO specific groups and it replaces the former <your-groups.conf> file where all the VO groups were specified all together. It's an optional directory.

The optional folders are created to allow system administrators to organise their configurations in a more structured way.

IMPORTANT NOTE: If your site intends to support more VOs than the default ones, you should have a look at "Whole site: How to enable a VO", especially for the enmr.eu VO: once the configuration is finished you should also follow "extra_configuration".

Default files

Variables that have a meaningful default value are distributed under the ''/opt/glite/yaim/defaults/'' directory and don't need to be changed unless you are an advanced user and know what you are doing. The files are:

In case you really need to change these variables, you don't have to edit those files: you can simply redefine the same variable in <your-site-info.def>, since that will override the value declared in the defaults files. See the configuration flow in the next section and the sketch below.
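For example, a minimal sketch of such an override using CREAM_CE_STATE, whose default value "Special" is set in /opt/glite/yaim/defaults/glite-creamce.pre (the value chosen here is just an example):

# added to <confdir>/<your-site-info.def>: overrides the default from glite-creamce.pre
CREAM_CE_STATE="Production"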

Configuration flow

This is the order in which the different configuration files are sourced (''<confdir>'' refers to the path of the configuration folder which is the path of ''<your-site-info.def>''):

defaults ''.pre'' files in ''/opt/glite/yaim/defaults/'';

''<confdir>/<your-site-info.def>'';

service-specific files in ''<confdir>/services/'';

defaults ''.post'' files in ''/opt/glite/yaim/defaults/'';

node-specific files in ''<confdir>/nodes/'';

VO-specific files in ''<confdir>/vo.d/'';

function files in ''/opt/glite/yaim/node-info.d/'';

VO-specific group settings in ''<confdir>/group.d/*''.

Configuration variables

General

In the documentation, all the INFNGRID variables and some important gLite variables that can be configured in ''<your-site-info.def>'' are listed in alphabetical order (links to the gLite variables are also provided):

Batch server

Path of the batch-system log files: "/var/torque" (for Torque); "/lsf_install_path/work/cluster_name/logdir" (for LSF). In case of a separate batch master (not on the same machine as the CE) PLEASE make sure that these directories are READABLE from the CEs.
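A minimal sketch for a Torque site, assuming the default log path above; the accounting subdirectory is the usual Torque one and should be adjusted to your installation:

BATCH_LOG_DIR="/var/torque"
DGAS_ACCT_DIR="/var/torque/server_priv/accounting"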

BDII Top

By default, a BDII_top is configured to contact the GOC_DB. If you want to modify this behaviour, after the configuration you have to edit the /etc/glite/glite-info-update-endpoints.conf file using the following settings:

CREAM CE

ACCESS_BY_DOMAIN

By default the cream db is installed on localhost and accessible only from localhost. Setting ACCESS_BY_DOMAIN to true, you allow the cream db to be accessed from all the computers in your domain.

>= 4.0.5-4

BATCH_CONF_DIR

C

LSF settings: path where lsf.conf is located

>= 4.0.5-4

BLAH_JOBID_PREFIX

C

This parameter sets the BLAH jobId prefix (it MUST be 6 characters long, begin with "cr" and end with '_'). It is important when more than one CE connects to the same blparser: in that case it is better that each CREAM CE has its own prefix. The default value for this variable (as specified in /opt/glite/yaim/defaults/glite-creamce.pre) is "cream_"
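For example, two CEs sharing the same blparser could use prefixes like the following (hypothetical values that respect the 6-character rule, beginning with "cr" and ending with '_'):

BLAH_JOBID_PREFIX="crce1_"   (on the first CREAM CE)
BLAH_JOBID_PREFIX="crce2_"   (on the second CREAM CE)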

>= 4.0.5-4

BLPARSER_HOST

C

The host where the BLAH Log Parser (blparser) is running (i.e. the variable xxx_BLPserver of blah.config). The batch system logs must be accessible on this machine.

>= 4.0.5-4

BLP_PORT

C

The port where the BLAH Log Parser is listening (i.e. the variable xxx_BLPport of blah.config, GLITE_CE_BLPARSERxxx_PORT1 of blparser.conf)

>= 4.0.5-4

CREAM_CE_STATE

C

This is the value to be published as GlueCEStateStatus instead of Production. The default value for this variable (as specified in /opt/glite/yaim/defaults/glite-creamce.pre) is "Special"

>= 4.0.5-4

CREAM_DB_USER

C

Database user to access the cream database (different from root). Yaim will create this user and grant it the necessary access rights

>= 4.0.5-4

CEMON_HOST

O

CREAM CE host name. In a more complex layout, the CE-Monitor can be installed on a host different from the CREAM CE host. In that case you need to put the right CE-Monitor hostname in this variable.

>= 4.0.5-4

CREAM_PORT

C

The port on which the parser listens for CREAM (i.e. the variable GLITE_CE_BLPARSERxxx_CREAMPORT1 of blparser.conf)

>= 4.0.5-4

JOB_MANAGER

C

The name of the job manager used by the gatekeeper. For CREAM please define: pbs, lsf, sge or condor

DGAS

''DGAS_IGNORE_JOBS_LOGGED_BEFORE''

O

Bound date for the backward processing of jobs. The backward processing doesn't consider jobs logged before this date. Use the format as in this example:

DGAS_IGNORE_JOBS_LOGGED_BEFORE="2007-01-01"

Default value: ''2008-01-01''

4.0.2-9

''DGAS_JOBS_TO_PROCESS''

O

Specify the type of job which the CE has to process. ATTENTION: set "all" on "the main CE" of the site (the one with the best hardware), "grid" on the others. Default value: ''all''

4.0.2-9

''DGAS_HLR_RESOURCE''

C

Reference Resource HLR hostname. There is no need to specify the port as in the previous yaim versions (default value "''56568''" will be set by yaim).

4.0.2-9

''DGAS_USE_CE_HOSTNAME''

O

Only for LSF. Main CE of the site. ATTENTION: set this variable only in the case of a site with a single LSF Batch Master (no need to set this variable in the Torque case) in which there is more than one CE or local submission host (i.e. hosts from which you may submit jobs directly to the batch system). In this case, the ''DGAS_USE_CE_HOSTNAME'' parameter must be set to the same value for all hosts sharing the "lrms", and this value can be arbitrarily chosen among these submitting hostnames (you may choose the best one). Otherwise leave it commented. For example:

DGAS_USE_CE_HOSTNAME="my-ce.my-domain"

4.0.2-9

''DGAS_MAIN_POLL_INTERVAL''

O

UR Box polling interval, that is, when all jobs have been processed: seconds to wait before looking for new jobs in the UR Box. Default value: "5"

4.0.11-4

''DGAS_JOB_PER_TIME_INTERVAL''

O

Number of jobs to process at each processing step (several steps per mainPollInterval, depending on the number of jobs found in the chocolateBox). Default value: "40"

4.0.11-4

''DGAS_TIME_INTERVAL''

O

Time in seconds to sleep after each processing step (if there are still jobs to process; otherwise a new mainPollInterval is started). Default value: "3"

FTS

GLEXEC_wn

The gLExec system, used in combination with the LCAS site-local authorization system and the LCMAPS local credential mapping service, provides an integrated solution for site access control to grid resources. With the introduction of gLExec, the submission model can be extended from the traditional gatekeeper models, where authorization and credential mapping only take place at the site's 'edge'. Retaining consistency in access control, gLExec allows a larger variety of job submission and management scenarios that include per-VO schedulers on the site and the late binding of workload to job slots in a scenario where gLExec is invoked by pilot jobs on the worker node. But it is also the mapping ingredient of a new generation of resource access services, like CREAM.

The section "HYDRA_PEERS" contains the information about the servers installed on remote machines, that have to work with the server you're installing. Please note that HYDRA_ID is the remote hydra server "instance id" specified into its site-info configuration file. If you have only one installation for every single machine, most probably the HYDRA_IDs are all "=1". Ask remote servers' site admins.

After the Hydra installation and configuration, you'll be able to get the service information via an LDAP query against <hydra_hostname>:2170, so you'll need to register the service in your site-BDII. All the Hydra services must be found in a top-BDII in order to work together.
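A minimal sketch of such a query, assuming the standard resource-BDII base DN (replace <hydra_hostname> with the real host name):

ldapsearch -x -h <hydra_hostname> -p 2170 -b "mds-vo-name=resource,o=grid"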

StoRM

Enable the support for checksum agents. Available values: ''[true false]'' Default value: ''false''

4.0.9-0

''STORM_DEFAULT_ROOT''

C

In the ig-site-info.def template. Default directory for the Storage Areas.

4.0.2-9

''STORM_DB_HOST''

O

Host for database connection. Default value: ''$STORM_BACKEND_HOST''

4.0.2-9

''STORM_DB_PWD''

C

Password for database connection.

4.0.2-9

''STORM_DB_USER''

O

User for database connection. Default value: ''storm''

4.0.2-9

''STORM_FRONTEND_HOST_LIST''

O

StoRM Frontend service host list: the SRM endpoint can be made of more than one host, possibly different from STORM_BACKEND_HOST (e.g. a dynamic DNS alias for multiple StoRM Frontends). Default value: ''$STORM_BACKEND_HOST''

''STORM_FSTYPE''

O

File System Type (default value for all Storage Areas). Note: you may change the settings for each SA acting on the ''$STORM_<SA>_FSTYPE'' variable. Available values: ''[posixfs xfs gpfs]'' Default value: ''posixfs''

4.0.4-0

''STORM_GRIDFTP_POOL_LIST''

O

GRIDFTP servers pool list (default value for all Storage Areas). Note: you may change the settings for each SA acting on the ''$STORM_<SA>_GRIDFTP_POOL_LIST'' variable. ATTENTION: this variable defines a list of comma-separated pairs hostname,weight, e.g.:

STORM_GRIDFTP_POOL_LIST="host1,weight1 host2,weight2 host3,weight3"

Weight has a 0-100 range; if not specified, the weight will be 100. Default value: ''$STORM_BACKEND_HOST''

''STORM_RFIO_HOST''

O

Rfio server (default value for all Storage Areas). Note: you may change the settings for each SA acting on the ''$STORM_<SA>_RFIO_HOST'' variable. Default value: ''$STORM_BACKEND_HOST''

4.0.9-0

''STORM_ROOT_HOST''

O

Root server (default value for all Storage Areas). Note: you may change the settings for each SA acting on ''$STORM_<SA>_ROOT_HOST'' variable. Default value: ''$STORM_BACKEND_HOST''

4.0.9-0

''STORM_SIZE_LIMIT''

O

Limit the maximum available space on the Storage Area (default value for all Storage Areas). Note: you may change the settings for each SA acting on the ''$STORM_<SA>_SIZE_LIMIT'' variable. Available values: ''[true false]'' Default value: ''true''

4.0.7-0

''STORM_STORAGEAREA_LIST''

O

List of supported Storage Areas. Usually at least one Storage Area for each VO specified in ''$VOS'' should be created. Default value: ''$VOS''

4.0.2-9

''STORM_STORAGECLASS''

O

Storage Class type (default value for all Storage Areas). Note: you may change the settings for each SA acting on the ''$STORM_<SA>_STORAGECLASS'' variable. Available values: ''[T0D1 T1D0 T1D1]'' - No default value.

Then, for each ''<SA>'' "Storage Area" listed in ''STORM_STORAGEAREA_LIST'' variable you have to edit the following compulsory variables:

NOTE:

''<SA>'' has to be written in capital letters, as the other ''<site-info.def>'' variables, otherwise default values will be used! ATTENTION: for DNS-like names (containing special characters such as "." (dot) and "-" (minus)) you have to remove the "." and "-": e.g. for ''STORM_STORAGEAREA_LIST="enmr.eu"'', ''<SA>'' should be ''ENMREU'', as in the sketch below:
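A minimal sketch of what such per-SA variables may look like for the enmr.eu example; the variable set and the values are only illustrative placeholders (check the StoRM documentation for the full list of compulsory per-SA variables):

STORM_ENMREU_VONAME="enmr.eu"
STORM_ENMREU_ROOT="$STORM_DEFAULT_ROOT/enmr.eu"
STORM_ENMREU_ACCESSPOINT="/enmr.eu"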