Local system administration of SCS Dragon hosts

There are well over 1000 Unix/Linux workstations within the CMU School of Computer Science. In order to better manage an environment of this size, SCS Facilities has made several modifications and additions to the "standard" vendor software. This document describes the major changes that we have made to OS-level software, and how to make local modifications to this software on Facilities-supported Unix/Linux workstations. It also describes some Facilities policies for local system administration. People making configuration changes to their hosts, or installing network-aware software, should also be aware of SCS network use policies.

Some of the procedures described in this document involve becoming root on your machine and modifying system configuration files. You can seriously damage your machine's functionality if you make a mistake or are unfamiliar with the procedure.

If you aren't sure of how to perform some administrative operation, you may contact SCS Facilities by sending e-mail to help@cs.cmu.edu or by calling the Help Center (x8-4231; 9-5, M-F) and we will either do it for you or assist you in performing it yourself.

If you need root access to your machine, contact SCS Facilities. The usual way we provide root access is to add an entry of the form

username/root@CS.CMU.EDU

in the file .k5login in root's home directory. This entry will allow you to su using your username/root Kerberos instance password, and will also allow you to log in remotely (via ssh) as root if you have username/root credentials on the client system. If you do not have a /root Kerberos instance, you can create one for yourself by using the Instance Manager. If you forget your /root instance password, you will have to contact SCS Facilities to have it changed. Because the above method of root access depends on Kerberos, and thus upon the network working, you may want to have a local root password. You can set a local root password by becoming root and using the command:

passwd root

Facilities generally does not make use of local root passwords, so feel free to set this password to whatever you wish (though please pick a secure password). Please do not remove Facilities staff members from root's .k5login file, since that contains the necessary entries needed for Facilities staff members to access the machine. Also, please do not change root's home directory, since .k5login needs to be in root's home directory in order to enable remote access.
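
Before relying on Kerberos-based root access, it can be useful to confirm that your entry really is present in root's .k5login. The following is only a minimal sketch: the function name has_k5login_entry is hypothetical (it is not part of any SCS tool), and it assumes that the file lists one principal per line.

```shell
#!/bin/sh
# Hypothetical helper (not an SCS-provided tool): succeed if the given
# Kerberos principal appears as a complete line in the given .k5login file.
has_k5login_entry() {
    principal=$1
    k5login=$2
    # -F: match a fixed string, -x: match the whole line, -q: print nothing.
    [ -r "$k5login" ] && grep -Fxq -- "$principal" "$k5login"
}

# Example, run as root ("username" is a placeholder):
#   has_k5login_entry "username/root@CS.CMU.EDU" ~root/.k5login && echo present
```

Because the match is on whole lines, an entry with trailing whitespace or a different realm will be reported as missing.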

On systems where using sudo to perform administrator actions is the norm, Facilities will also add you to the appropriate system groups or sudo configuration to enable you to invoke administrator commands by providing your normal SCS Kerberos password to sudo.

If you have a Linux host, you can boot it into single user mode by appending the word:

single

to the line in the GRUB bootloader menu. You may have to type

e

to modify the GRUB commands. If your machine asks for a local root password during boot (for example, because it needs fscking) and you don't know (or haven't set) a local root password, you may be able to boot directly to a shell by appending:

If you are at all uncertain about the procedures involved in creating accounts on your machine, please contact SCS Facilities and we will do it for you. Facilities attempts to keep usernames and user IDs (i.e. the number that corresponds to a username) synchronized across machines. We do not do any synchronization of group IDs. If you are creating a local account for somebody who already has an SCS identity, you can use the following command to add their account:

/usr/cs/bin/scsuseradd <username>

This will create a local account for the user, automatically filling in details like Unix UID, full name, default shell, etc, as they appear in LDAP.

By default, the scsuseradd program will create accounts that use the user's AFS home directory. On hosts that are to be used as desktop workstations, it is recommended that accounts be created with a local home directory instead, as GNOME and some desktop software do not deal well with keeping their dotfiles in a shared filesystem. To create an account with a local homedir, invoke scsuseradd as follows:

/usr/cs/bin/scsuseradd -L <username>

The local homedir will also be populated with the standard SCS shell profiles.

The scsuseradd program can take additional flags and arguments. Usage information can be seen via:

/usr/cs/bin/scsuseradd -h

Extended documentation can be viewed by typing:

perldoc /usr/cs/bin/scsuseradd

Note that this command cannot be used to create local accounts for users who do not have SCS identities.

If you are creating a local account for somebody who does not have an official SCS identity (such as a spouse or friend), you can use one of the UIDs that we've set aside for this purpose: 1515, 1516, 1518, 1519, 1520. It is suggested that you choose a username for such accounts that is different from any existing SCS username, since otherwise there may be login problems caused by, for example, having the same username as somebody whose Kerberos account has expired. Note that the mail system will deliver mail addressed to username@machine to the username in the global LDAP database, if it exists, not the local user. As a result, if the username you picked gets allocated to a real SCS user, mail sent to that local account will no longer be delivered there.

If an account you create for a guest or friend (i.e. somebody who isn't already an SCS user) is abused in any way, you are responsible. SCS Facilities can provide little or no assistance to people who do not have valid SCS Kerberos identities, or in creating accounts for such people.

The recommended procedure for installing specific applications is to use the packages supplied by the vendor and available via the local SCS mirror of the official Canonical repositories. Dragon (the SCS custom managed-computing environment) is designed to ensure that vendor packages an end user chooses to install will not interfere or collide with the Dragon installation. By design, the SCS Facilities Dragon environment will NOT alter or modify a vendor installation. The following are some recommendations and suggestions to consider when installing certain packages on a Dragon-based machine.

Important note: Backups

By default, the only partitions backed up on a Dragon system that is subscribed to the Facilities backup service are /etc and /usrN, for any integer value of N. Many vendor packages default to storing data under /var (as in the specific examples of Apache and MySQL below, but this caveat may apply to any vendor package), which is not backed up by default, as it also contains temporary data, runtime-specific data, and often-voluminous logfiles. Your machine's /var partition may also not be large enough to store large quantities of data alongside everything else that lives there. You may wish to consider moving your app's data store to a /usrN partition.

Important Note: AppArmor

Some Dragon platforms (e.g. Ubuntu) employ AppArmor by default, which is a kernel-based security module that can be used to restrict access by various daemons and programs to a limited list of directory paths in the filesystem. If you do decide to move an application's datastore from /var to a larger or backed-up partition like /usr0, you may need to adjust the application's AppArmor profile under /etc/apparmor.d.

Apache 2 Web Server

If it is expected that this application will involve a significant utilization of disk space, SCS recommends moving the document root directory to another location with more disk space. The vendor installs the document root directory in /var/www. On Dragon machines the /var partition may not be big enough to support a large-scale web site. SCS Facilities recommends creating the document root directory under /usr0 (e.g. /usr0/www) and then symlinking /var/www to this newly created directory.
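
The move-and-symlink approach described above can be sketched as a small shell function. This is an illustration under stated assumptions, not a Facilities-supplied procedure: the function name relocate_with_symlink is hypothetical, cp -a is assumed to be available (it is on GNU/Linux), and the original directory is kept as a .orig copy rather than deleted.

```shell
#!/bin/sh
# Hypothetical helper: copy a directory to a new location, keep the original
# as "<src>.orig", and leave a symlink at the old path pointing to the copy.
relocate_with_symlink() {
    src=$1
    dst=$2
    # Refuse to run if the source is missing or has already been replaced
    # by a symlink, or if the destination already exists.
    [ -d "$src" ] && [ ! -L "$src" ] || { echo "bad source: $src" >&2; return 1; }
    [ ! -e "$dst" ] || { echo "destination exists: $dst" >&2; return 1; }
    cp -a "$src" "$dst" || return 1     # -a preserves ownership, modes, times
    mv "$src" "$src.orig" || return 1   # keep the original until verified
    ln -s "$dst" "$src"
}

# Example, run as root with the web server stopped:
#   relocate_with_symlink /var/www /usr0/www
```

Once the relocated site has been checked, the .orig copy can be removed by hand to reclaim the space on /var.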

Alternatively, you may change the DocumentRoot directive in /etc/apache2/sites-available/default (or whichever config stub file is being used for a site in question) to point to the alternative location.

MySQL Database

If it is expected that this application will involve a significant utilization of disk space, SCS recommends moving the data directory to another location with more disk space. The vendor installs the database in /var/lib/mysql. On Dragon machines the /var partition may not be big enough to support large databases. As with Apache 2 above, SCS Facilities recommends relocating the MySQL data directory to a larger partition such as /usr0 if it is expected that the database will house large amounts of data.

The default data directory can either be moved and a symlink provided to the new location, or moved and the datadir directive updated in /etc/mysql/my.cnf to point to the new location.

Note that MySQL heavily employs AppArmor to restrict access via the mysqld binary. If you relocate MySQL's datastore from /var/lib/mysql to, for example, /usr0/mysql, you will need to add the following to /etc/apparmor.d/local/usr.sbin.mysqld:

/usr0/mysql/ r,
/usr0/mysql/** rwk,

Note that the above lines must end in commas, as this file is included in the middle of a list of allowed paths by /etc/apparmor.d/usr.sbin.mysqld.
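
Adding those two rules can be done by hand in an editor; the sketch below shows one way to append them without creating duplicates if it is run more than once. The function name add_mysql_apparmor_rules is hypothetical, and the sketch assumes the rules are written without leading whitespace, exactly as shown above.

```shell
#!/bin/sh
# Hypothetical helper: append the two AppArmor rules for a relocated MySQL
# data directory to a local profile file, skipping rules already present.
add_mysql_apparmor_rules() {
    datadir=${1%/}      # strip one trailing slash, if any
    profile=$2
    for rule in "$datadir/ r," "$datadir/** rwk,"; do
        # Append the rule only if no identical line exists yet.
        grep -Fxq -- "$rule" "$profile" 2>/dev/null ||
            printf '%s\n' "$rule" >> "$profile"
    done
}

# Example, run as root:
#   add_mysql_apparmor_rules /usr0/mysql /etc/apparmor.d/local/usr.sbin.mysqld
```

The modified profile normally has to be reloaded before the new rules take effect, for instance with apparmor_parser -r /etc/apparmor.d/usr.sbin.mysqld; check this step against your platform's AppArmor documentation.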

You should assume that our network is hostile, and that any traffic on it may be monitored by potential intruders. For this reason, you should use ssh when connecting to machines. You should assume that if somebody can log in to a machine that they can become root on it by exploiting some vulnerability (we install patches for all known remote exploits, but we may not install patches for all local-only exploits, and patch availability may lag behind the most recent known exploits). For this reason, you should be aware that there is a risk in typing passwords at any machine on which other people have accounts. If you are an administrator for a group of machines, it is suggested that you give yourself root access on those machines by adding your root instance to root's .k5login file. By su-ing on your local machine, and typing:

ssh remote-host

you will be able to ssh in as root to machines you administer without typing a password on them.

You can also use the program iptables to set up a firewall on your machine. Note that the default behavior when setting up iptables is to deny access to all machines, so the default configuration must be adjusted to allow at a minimum CMU SCS Facilities machines and users access to your machine. See the iptables(8) man page for details.

Please do not configure iptables, tcpwrappers, or any other security service in a way that denies access to the machine by Facilities staff or services. SCS Facilities cannot provide assistance for any machine to which access has been blocked.

If you suspect that your machine has been broken into, contact SCS Facilities at once, so that our security staff can handle the situation. If you are looking for signs of a break-in yourself, be aware that it is common for intruders to replace system binaries such as ls, ps, netstat, etc, so you should not trust the output of such programs.

Facilities-written system management software on SCS Dragon machines is typically automatically updated by a program called depot. Global system configuration information is distributed by a system called SUP. These programs are run nightly by /usr/cs/bin/dosupdepot. Occasionally, a machine will fail to depot/SUP because of some problem (such as a full disk, or AFS problems). If you suspect that your machine is not getting software updates, you can look at /usr/fac/log/depot.log to see when it last successfully ran dosupdepot. At any time, you can run dosupdepot by hand to, for example, force an update after you've subscribed a machine to a different software release level (see below). Automatic dosupdepot upgrades can be disabled entirely by creating a file called /etc/disableupgrade. (Note: don't do this; you will not get security fixes and other important software upgrades if you do.) You can force a depot run even if /etc/disableupgrade exists by running dosupdepot with the -force option.

Every machine has a particular release level of software that it is subscribed to by default (individual software collection release levels may be overridden by entries in /usr/cs/depot/depot.pref.local). The machine-wide software release level is controlled by the file /etc/releaselevel. By default (if that file does not exist), the release level is omega. You can subscribe a machine to alpha or beta release levels of software by putting a single line reading alpha or beta in /etc/releaselevel.
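
To see at a glance which release level a machine is subscribed to and whether automatic upgrades have been disabled, a check along the following lines may be convenient. It is a sketch only: the function names release_level and upgrades_disabled are hypothetical, and the only facts it relies on are the file locations and the omega default described above.

```shell
#!/bin/sh
# Hypothetical helper: print the machine-wide release level, falling back
# to "omega" when the release level file does not exist.
release_level() {
    file=${1:-/etc/releaselevel}
    if [ -f "$file" ]; then
        head -n 1 "$file"   # the file is expected to hold a single line
    else
        echo omega
    fi
}

# Hypothetical helper: succeed if the upgrade-disabling flag file exists.
upgrades_disabled() {
    [ -e "${1:-/etc/disableupgrade}" ]
}

# Example:
#   echo "release level: $(release_level)"
#   upgrades_disabled && echo "warning: automatic upgrades are disabled"
```

Both functions accept an alternate path as an optional argument, which makes them easy to try out without touching the real files in /etc.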

Depot controls only the contents of /usr/cs/, and does not modify any files outside of that directory.

Important note: We recommend not refusing upgrades unless there is an overwhelming reason to do so. Facilities staff will not debug problems caused by refusing some portion of (or all) Facilities software updates.

All SCS Dragon machines run OpenAFS. AFS provides a wide-scale, shared file system with reasonable security features.

Occasionally, the AFS cache on a machine may become corrupted. Symptoms of this problem include an inability to access certain files or directories in AFS, or being unable to run a binary located in AFS without an immediate core dump. If these symptoms occur, you should verify that it is a local problem (as opposed to an AFS server problem) by seeing if it occurs on other machines of that type, or comparing checksums for the binaries between the machine having the problem and other machines. You can use the command

fs checkservers

to check on the status of AFS servers that your local machine's cache manager has recently contacted. The command

fs checkvolumes

will check the status of alternate locations of volumes for replicated collections, and may fix some problems. If the problem is local to a particular machine, then it is very possibly caused by AFS cache corruption. There are a few things to try to fix the problem. If it is a single file or volume, you can run

fs flush path-to-file

or

fs flushv path-to-volume

Alternatively, you can try interactively reducing the cache size, and then increasing it back to the default. To do so, run

fs setcachesize 20 (or some other small number)

wait until that command completes, and then run

fs setcachesize -reset

to reset it to its original size. Sometimes, in cases of severe corruption, the above procedures may not fix the problem. In order to completely clear the cache, remove the CacheItems file in the cache directory and reboot (instead of rebooting, you could try manually stopping and starting AFS using the appropriate rc script and arguments, but that does not always work).

With very few exceptions, all SCS hosts should be members of the system:friendlyhost AFS group. If you have trouble accessing files from your machine that should be accessible by system:friendlyhost, contact Facilities, and we will add that machine to system:friendlyhost. One consequence of being a friendlyhost is that, if you are running a webserver on your machine, you should not allow the server to access files in /afs/cs, as that would circumvent the purpose of the system:friendlyhost access controls.

The information above applies to machines running the SCS Dragon computing environment. SCS Facilities can only provide best-effort support for machines that do not run our environment. However, we can provide a set of Service Configuration Add-Ons for certain Unix and Linux platforms, to be used on machines for which the SCS Dragon computing environment is inappropriate (for example, grant-funded projects with access restrictions that would be violated by allowing Facilities staff members access to your systems) or for machines you would prefer to administer yourself.

These configuration add-ons are provided in a vendor-native format and provide service configuration information only, to facilitate interoperation with core SCS services, such as AFS, Kerberos authentication, printing to SCS printers, and system backups.

If you or your project have such a machine, then you or your project are responsible for taking care of it. In particular, you are responsible for providing security patches and upgrades, and ensuring that it does not become a problem (e.g. by running a password sniffer or being used for denial-of-service attacks) for the rest of the facility.

This site is maintained by SCS Computing Facilities; send
comments to help@cs.cmu.edu.