Archive for the 'Shell' Category

Few years ago I went on a quest to find a perfect shell prompt. I asked the mighty Internets for ideas, but it seemed futile. I tried many things, simple prompts, complex prompts, but nothing could satisfy my requirements (I don't even remember what were my requirements back then.) So I picked best of both worlds and got this little monster.

:) oscar:~#

Happy face! And in case of an error, it looks sad.

:( 2 oscar:~#

Cute, a? It even prints the exit code. Useful and cute at the same time!
And here is definition of the prompt. As you can see it uses simple function to determine return code of executed command and adjust its feelings accordingly.

Strace is a tool that should be in a toolbox of every system administrator. Not only that it can help in troubleshooting simple problems (ie. missing libraries in newly created chroot, which ldd mysteriously misses to report) but it also helps in debugging very complex system problems and performance issues.

Recently I experienced a very strange problem with one of the RHEL 3 servers we’ve got. Problem manifested in a very strange way, SSH and su logins hanged, other daemons were also hanging during the startup, only way to reboot or shutdown the server was to physically press the restart/power off button, etc. All this could have been caused by problems on both software and hardware level. First suspicious was bad RAID controller, but after tests this proved to be a mislead. After more tests and brainstorms hardware problems were definitely excluded, so problem has to be on the software side. But what could be the problem?

After few more misleading steps I tried to trace system calls created by su command and found very interesting results.

And this is where the strace output ends and su command hangs. Audit device file is opened (file descriptor 3) and as soon as the first request is dispatched to this device (ioctl system call to file descriptor 3) command freezes. According to this I should just disable audit on the server and the problem will be gone.
As a test, audit daemon was temporarily stopped and I tried to switch to another user and the problem was indeed gone.

When the free space in the filesystem holding the audit logs is less than 20%, the above notify command will error out and auditd will enter suspend mode. This causes all system calls to block.

So this behavior is not a bug but actual feature of the software. :o) From security point of view this is expected behaviour – attacker could fill up filesystem where audit logs are stored before the attack and audit will be disabled, meaning no logs of his activity, so better not to allow ANY activity on the system if audit is not able to write to its logs. But still, this kind of behaviour renders the system completely useless to legitimate users.

The topic of this post is not audit, so I will stop here. Important thing is that strace led us directly to the main source of the problem. Resolution of issues like this would be much more complex and time consuming without this great little tool. :)