Friday, November 30, 2007

Just a few random musings. Hopefully these will save you some money and some headache, like they did me.

1. If you ever have to replace the battery on the CPU board of a Sun v480, v490, v880 or v890 server (probably more models, too) and you have pay-per-incident support with Sun (or a VAR), schedule some downtime on the server that's giving you a hard time (you may "have" to take it anyway). We're assuming that you've either talked with Sun's phone support or used a search engine to find out what the cryptic error message in your log file means, and all signs point to replacing the CPU board's battery.

Once you've gotten your outage period approved, power down the system and pull it out of the rack, or out on its rack slides (however you can get at the side, so you can unscrew the top and open up the right side). Pull out the CPU board that's throwing the errors and check the battery. You'll notice that it looks a lot like the battery from your grandmother's hearing aid. That's because it is. Jot down the information on the battery (some will even say "Panasonic" on them) and head down to the corner drug store. For about 10 bucks you can get the replacement battery you need, and maybe a few extra. You just saved the company a few hundred dollars (don't forget to put this on your annual review, unless you're doing this without anyone's knowledge because it isn't condoned ;)

2. The A1000 and D1000 disk arrays are pretty much obsolete now. I'm positive they're past their EOL ("End Of Life") for sales and OS upgrades/patches, but lots of folks still use them because they're big and sturdy and, for the most part, reliable. The batteries on these machines "expire" every two years. I put "expire" in quotes for one simple reason: it's just a term. One that's misused about 98% of the time when it comes to the A1000 and D1000.

The issue here is that the batteries actually last up to 3 years before Sun considers them bad. Sun has actually released a patch (I don't have a direct link to it here, but you can ask Sun, if you have paid support -- even if you don't, you might get lucky. If they won't tell you, ask the FE they send out to do a service call on your A1000) that will make it so that your A1000 or D1000 won't complain until it's 3 years old. These complaints are also programmed, and can be modified. That is to say, they don't rely on anything other than their own internal state (assuming no "real" errors are being issued by the batteries). The error message you'll generally see is "Battery age is between 720 days and 810 days," "Battery Age has Exceeded Specified Limit" or something like that. Not a big deal. You still have another year. If you pay for parts, you just saved the company even more money.

3. Sticking with the A1000's and D1000's; the batteries actually last longer than 3 years. In fact, you can say (with a fair amount of confidence) that the batteries are going to be okay until they throw an actual error - not a status report on their "age." As I mentioned above, these "error" messages about the age of the battery are determined by the RaidManager utility's knowledge of the battery's state. You can actually change this from the command line, as long as you're root, and make your batteries brand new again (In theory ;) like so:

xyz.com # raidutil -c c3t0d0s2 -B
Battery age is between 720 days and 810 days.
raidutil succeeded!

Now you can wait for something to actually "happen" before replacing the battery. Arguments can be made that it's not a good idea to wait for a problem to happen when you can fix it pre-emptively, but when the batteries die on these devices, all you've really lost is your read-ahead cache. If no one has noticed that lack of cache has slowed down performance in the period before you find out your battery's dead, its use as a speed-up for read/write operations was never really all that important anyway. So, for whatever amount of time you can squeeze out of that battery, you've saved your company even more money.

And one last tip, even though some folks frown on this (Sun, the company, in particular. Most Sun FE's are all right with it): It's perfectly okay to replace the battery on an A1000 while it's up and running. It's not, technically, hot-swappable like the disk, but you can remove the old battery, put in the new, and be confident things will work out okay. I've never ever crashed one by replacing the battery in the 4 or 5 years I worked on these devices. All you need to do is this (I promise I'm not lying):

1. Turn off caching, if it's enabled, with: raidutil -c TheDiskName -w off TheLunName
2. Remove the old battery.
3. Replace it with the new battery.
4. Run: raidutil -c TheDiskName -R
5. Turn caching back on, if you had to turn it off, with: raidutil -c TheDiskName -w on TheLunName
6. Wait about 15 minutes for the error state to clear.

Thursday, November 29, 2007

Every once in a while, it's good practice to take a look at any system you have set up (whether it be a single system, business process, program or computing environment) and make sure it's doing okay. Some sort of regular auditing also helps you find out if there are any areas in which you can make an improvement or fix errors.

One of the areas that seems to require looking over most consistently is your DNS setup. New versions are released regularly, bugs are found just as regularly, and acceptable syntax can sometimes change between releases and/or RFC's.

Note for today's script - It is intended only for your benefit, and I strongly urge you to use any free DNS reporting service you can find to accomplish what we're accomplishing here. I neither work for, nor do any affiliate marketing for, www.dnsstuff.com. We only use it where I work because it's the standard. Probably, most readers already know about it. The site tools require a free registration, if you're not a paying member, but also require payment after 30 DNS Reports. I have placed a very obvious "COMMENT" in the script above the only line you'd need to change to use another service. If you "do" find something equal, or better, I would recommend that you use it (and, maybe, send me an email to let me know the URL so I can check it out and update my scripts, too :) That being said, I'm not trashing them either. If you have to do this sort of thing a lot (for your employer, let's say), the company can probably shell out a few bucks a month for the service. It's worth the price if you need to use it regularly.

Aside from your own internal auditing, it's good to get a fair and objective third-party assessment of the state of your DNS setup. A great site to get this accomplished on the web is located at http://www.dnsstuff.com. They have a tool called "DNS Report" (which used to be its own domain - www.dnsreport.com) which can be used very effectively, even if you choose not to be a paying member of the site. A long long time ago, it was available to everyone for free, but since it's pay-for now, you really have to be careful not to deluge it with requests to check your DNS zones (all 30 of them, if you're doing this for free), or you'll get dumped on their blacklist and barred from using the service at all.

The little Perl script I've written below will help you to automate the usage of that web service for all of your DNS zones. The reports are nice and easy to understand, even for the highest of higher-ups (green = good, red = bad ;) and the service does a very good job of pointing out weaknesses, or areas that don't conform to current RFC expectations, in your DNS zones. This script does require that you have Perl installed (although it's just a script I whipped up, so it's submitted under the GPL (GNU General Public License), at best, and you can feel free to rewrite it to suit your own needs). It also makes use of curl. You can substitute lynx or wget or any other program that can download and save web pages to your Unix/Linux hard drive. And here it is, below (please read the comments regarding use of dnsstuff's dnsreport tool):

#
# 2007 - Mike Golvach - eggi@comcast.net
#
# Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License
#
# Registration is now required to use this tool - It used to be free.
# When/If you register, be sure to uncheck the two opt-in mail
# checkboxes at the bottom unless you want to receive those emails.
#
# Login to dnsstuff.com before running this script or it will just return
# a page indicating that you need to log in before submitting a full request
# without that request being linked to from one of their pages
#
# Simple Error Checking here, just want to be sure that a file exists at all.
# Disregard everything else on the command line. This script will fail if the
# file named on the command line doesn't include one URL per line, anyway :)
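If you'd rather not run Perl, the same loop is easy to sketch in plain shell. This is only a hedged sketch of the idea: the report URL format below is an assumption (verify it against the site before relying on it), and the pauses are deliberately long so you don't get blacklisted.

```shell
# Hedged sketch of the fetch loop (not the original Perl). The dnsreport
# URL format is an assumption -- verify it against the site first.
# Override FETCH with "echo" to do a harmless dry run.
dns_report_all() {
    zonefile=$1
    pause=${PAUSE:-120}            # seconds between requests; be generous
    fetch=${FETCH:-"curl -s -o"}   # any downloader that takes: <outfile> <url>
    while read zone; do
        [ -z "$zone" ] && continue
        $fetch "${zone}.report.html" \
            "http://www.dnsstuff.com/tools/dnsreport.ch?domain=${zone}"
        sleep "$pause"
    done < "$zonefile"
}
```

Call it as "dns_report_all my_zones.txt" (one zone per line), after logging in to the site with the same client your downloader uses, or dry-run it first with FETCH=echo.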

And that's that! As I mentioned before, this tool will blacklist you if you hit dnsstuff too hard. My script assumes that you can hit the site at the rate I used last time I used it. Since it's a pay-for service now, I would recommend changing the wait times to at least twice as much. To give you an idea, I ran this when the service was free and ended up getting blacklisted in under a few hours. Granted, there were certain other factors that contributed to my getting the boot; most prominently that I was checking the DNS for about 300 to 400 zones we were hosting. It could have been left at checking one and making sure all the others were the same, but, as noted above, some bosses like to see reports with lots of colors and lots of pages.

Also, just so you don't feel like I'm leaving you in the lurch, if you do happen to get blacklisted, just go to their member forums (you get access to post to these since you had to do the free registration - you can only browse them if you're not registered) at http://member.dnsstuff.com/forums/ and do a search on "banned," "blacklist" or "black list." Most folks just have to start a thread and request their access back in order to get off the blacklist.

Wednesday, November 28, 2007

There are quite a few bugs out there on SunSolve, regarding system crashes due to this or that network device failure (or driver issues, depending on how you look at it). Most of the time, the advice is to disable the network device as non-destructively as possible. That can be difficult, given the right circumstances. Of course, I'm talking about the "non-destructive" part. Disabling an interface is easy. Lots of people who don't know the first thing about Solaris can do it, if they bang enough keys and have the proper system privilege ;)

The first thing to do is incredibly obvious. Just take out any references to the network device on the host. Say, if you are using ce0 and ce1 and no longer want to use ce1, you could just do the following and you'd be good from that point on and for future reboots:

ifconfig ce1 down
ifconfig ce1 unplumb
rm /etc/hostname.ce1
vi /etc/hosts          (remove the ce1 host name/IP entry)
vi /etc/netmasks       (remove ce1's netmask setting -- unless it's on the same subnet as ce0!)
vi OTHER_FILES

The second thing to look for would be OS operations you could perform (perhaps during boot up). As a "for instance," certain Sun machines, running certain patch levels, have an issue with the hme0 network device (it's technically a device driver) if it's not connected to the network, even if you aren't using it. This is somewhat annoying because, if you don't set up the configuration to plumb the network device and bring it up, it shouldn't give you any errors. You should only know hme0 exists by looking in a system file like /etc/path_to_inst. But the hme0 device causes the following error to post constantly during boot up and for a while after:

SUNW,hme0:Parallel detection fault

Working around this bug is fairly simple. In this case (and each case will probably be slightly different - troubleshooting can be long and hard sometimes), you could run the first two lines below, to stop the activity immediately, while logged in. The second line could be added to /etc/system so that the problem wouldn't recur on reboot. It is strongly recommended to backup, or copy off, /etc/system before changing it, so you can use "boot -a" at the PROM level to boot using your old version if the new one causes your future boots to fail!
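As a hedged example of what those lines tend to look like (the parameter names come from the hme driver's documentation, but verify them against your OS and patch level before using them):

```shell
# Run live, to quiet things down immediately: point ndd at the right
# instance, then turn off auto-negotiation ("parallel detection" is part
# of that handshake) on the unused interface.
ndd -set /dev/hme instance 0
ndd -set /dev/hme adv_autoneg_cap 0

# The /etc/system equivalent of that second line, so it sticks across
# reboots (back up /etc/system first, as noted above!):
#   set hme:hme_adv_autoneg_cap=0
```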

The third, and most drastic, way you'd go about this is to disable the problematic network device at the Solaris PROM level. Before you bring the machine down, look up the full device path for the interface (we're still using hme0 as an example) and note the output by copying and pasting it into notepad, or even writing it down:

Assuming we've already executed something like "init 0" as the root user, we could do the following to disable hme0 from the PROM "ok" prompt. Note that if you run "show-nets" at the PROM level and see truncated information, use it to compare with the device you have listed from before and use the most similar (just slightly clipped) entry in your future arguments.

If none of the patterns match at all, you may have gotten the wrong info from /etc/path_to_inst or your device tree is screwed up beyond what we're specifically dealing with here today. At this point you should be absolutely certain you know which network device to disable at the PROM level. Now, run the following to make the PROM disable, and Solaris forget all about, hme0:

ok nvedit
0: probe-all install-console banner
1: " /pci@1f,4000/network@1" $delete-device drop
2: (press ctrl-c here -- the control key and the c key together will break you out of nvedit and put you back at the ok prompt)
ok nvstore
ok setenv use-nvramrc? true
use-nvramrc? = true
ok reset-all

Note: run "setenv auto-boot? false" before the reset-all, if you want your system to stay at the PROM after it resets.

Now, the hme0 network device should finally be completely disabled, at a level below the OS. Solaris should not even know it exists! You may want to consider doing a reconfigure boot ("init 0" followed by "boot -r" at the PROM "ok" prompt or "reboot -- -r" from the OS -- there are a few more ways to do it, but I digress).

And, to answer the inevitable, and reasonable, question: How can I re-enable my network device on Solaris' PROM once I've disabled it?, here's how:

ok setenv use-nvramrc? false
ok nvedit
0: " /pci@1f,4000/network@1" $delete-device drop
(press ctrl-u -- the control key and the u key together -- to delete the current line from the nvedit buffer. You only need to do this for the device you previously wanted to ignore; if the device and the delete-device instruction are on separate lines in your nvedit session, you must delete both lines)
(press ctrl-c to leave nvedit)
ok nvstore
ok reset-all
ok boot -r

Hopefully this has saved you more headaches than it can potentially cause :)

Tuesday, November 27, 2007

A lot of times, when you're asked to find something on a machine, and you only have a moderate idea of what you're specifically looking for, you'll use the obvious command: find. find is a great command to use because you can use wildcards in your expression argument. So, if you know that you're looking for something like "theWordiestScriptEver," and you have no idea where it's located on your box, you could find it by typing just this:

find / -iname "*word*" -print

This will find every file on the system (even on non-local mounts, if you have them set up) and only print the results for files with the word "word" in the name. Note that the "-iname" option matches without regard to case, so h and H both match. This option isn't available in all versions of find; if you don't have it, you'll get an error when you run the above line (just use "-name" instead). The standard Solaris find does not do "case insensitive" pattern matching, so your best bet is to find the smallest substring that you're sure of the case on, or use another attribute to search for the file (like -user for the userid or -atime for the last access time). Alternatively, you could spend hours stringing together a bunch of "or" conditions for every conceivable combination of upper and lower case letters in your expression.
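One middle ground, if you're only unsure about a letter or two, is a character class in the -name pattern. This works with the standard Solaris find, too:

```shell
# Match "word" with either case of the first letter, using a character
# class instead of -iname (portable to finds that lack -iname):
find . -name "*[wW]ord*" -print
```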

Now suppose you needed to perform an action on a file you found. You could use find's built-in exec function, like so:

find / -iname "*word*" -print -exec grep spider {} \;

This will perform the command "grep spider" on all files that match the expression. Which brings us around to the next predicament. What do you do if you have to try and find something simple, have no idea where it is on your box, "and" that box hosts file systems that Windows users are allowed to write files to? The above example should work just fine on those. My own advice is, if you can get away with just using find, do so, since it handles all of the rogue characters, tabs and spaces in Windows files on its own.

Now, if you have to do something much more complicated (or convoluted), you'll want to pipe to a program like xargs, which is where all those funny Windows file names and characters (some of which are special to your shell) start to cause issues. Again, this would return ok:

But this will become an issue if you pipe it to xargs, as shown below:

# find . -name "*word*" -print | xargs ls
xargs: Missing quote: files

Ouch! xargs doesn't deal with those spaces, tabs and special characters very well. You can fix the space/tab problem very simply by using xargs' "named variable" option. Normally, xargs acts on the input it receives (thus: "xargs ls," above, is processing ls on each file name find sends it), but you can alter how it deals with that data in a simple way (at least as far as the spacing issue is concerned). Example below:

But, in the second invocation above, you see that it still can't handle the "shell special" characters, like the single or double quote (characters like \ / : * ? < and > aren't a concern, since Windows won't allow them as parts of file names). It seems easier just to react on anything that isn't a letter or number and pass it along with enough escapes (back slashes) so that xargs can parse it correctly, and get you back good information. Here's how to do that, using sed (and a little grep, to keep it neat):
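A hedged reconstruction of the idea (this version skips the grep tidy-up and just does the escaping):

```shell
# Backslash-escape anything that isn't a letter, digit, or a character
# that's already safe in a path, so xargs can parse names containing
# spaces, tabs and quotes.
find . -name "*word*" -print | sed 's/[^A-Za-z0-9/._-]/\\&/g' | xargs ls -d
```

On systems whose find and xargs support it, "find ... -print0 | xargs -0" sidesteps the whole problem without any sed gymnastics.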

Monday, November 26, 2007

If you use Solaris and/or most major Linux distros, AIX administration can be a bit unsettling to jump right into. Luckily, AIX includes a nice VT interface, called "smitty," that can get almost anything done. You can also call it from the command line as "smit."

Since I have to work on AIX, I thought I'd write a quick listing of the more common commands I use in AIX (that differ the most from Solaris) and what they're used for. Since the list is a bit long, it'll be the meat of this post:
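As a hedged sampler (the device names and the APAR number below are placeholders, and flags can vary by AIX level), these are the ones I reach for most:

```shell
oslevel -r          # show the OS version and maintenance level
lslpp -l            # list installed filesets (roughly Solaris' pkginfo)
instfix -ik IY12345 # check whether a given fix (APAR) is installed
lsdev -C            # list all configured devices
lsattr -El hdisk0   # show a device's attributes
lspv                # list physical volumes (disks)
lsvg rootvg         # show a volume group's details
lsvg -l rootvg      # list the logical volumes in a volume group
errpt -a            # read the system error log in detail
mksysb /dev/rmt0    # bootable backup of rootvg to tape
```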

Sunday, November 25, 2007

As much as I'd love to, I won't give you my opinion of Tivoli Storage Manager (TSM) in this post. It's not really about what I think, but will hopefully help out a few folks. TSM, in my experience, is not too difficult to install and set up on a host, but troubleshooting it can be another issue. A lot of times, the error messages it gives you lead you away from the problem rather than to it.

Which brings us to today's post topic: TSM's most misleading error message. This error specifically has to do with Oracle backups.

Assuming your installation has gone without incident, you've already logged into the TSM server from your host and verified all's well, like so:

host.xyz.com# dsmc q sess

And, after logging in, if you haven't set that up already, you see some marginal information about the TSM version, when the last backup was run, the schedule of backups, etc.

Now, you're ready to turn the product over to the client. It's been installed, but never run. And, surely enough, the next day (at the latest ;) you receive a call that the Oracle backups didn't work. Some minimal investigation into the log files (since logging into the TSM interface from the host using "dsmc" doesn't seem to show any issues, except that no Oracle backups have run) shows that something's very very wrong! You find yourself faced with this scary error:

So, your first inclination might be to go talk to the folks that manage site backups, because there's obviously a problem with the tape loader or a tape drive. But, this couldn't be farther from the truth.

Here's the kicker. The real issue has nothing at all to do with the error message! The issue is with the "dsmc" process not being able to write to its logs as the oracle user id! Yes, that's true :)

The good news is that, once you've gotten past the mystery, the solution is relatively simple. You can fix this problem very quickly by doing the following:

1. View the dsm.sys file that you've set up (it should be linked to /usr/bin/dsm.sys) and find out where the error logs and activity logs are being written. If you haven't specified this, your logs should be under the default installation directory, under logs/tivoli/tsm (depending on your version, the location may vary).

2. Now simply fix the permissions on the log so that the oracle user id can write to them:

chgrp dba errlog.log
chmod g+w errlog.log

3. Verify that all directories above this are accessible as well. The easiest way to figure out if you've gotten it right is to use the oracle user id to do the check:
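A small sketch to make that check mechanical: walk from the log file up to the root, listing permissions at each level, so you can spot exactly where the oracle user loses access. The path in the usage comment is an assumption; use whatever your dsm.sys says.

```shell
# List ownership/permissions on a log file's entire directory chain.
check_log_path() {
    f=$1
    ls -ld "$f"
    d=`dirname "$f"`
    while [ "$d" != "/" ] && [ "$d" != "." ]; do
        ls -ld "$d"
        d=`dirname "$d"`
    done
}
# Usage (path is an assumption -- take it from your dsm.sys):
#   check_log_path /opt/tivoli/tsm/client/ba/bin/errlog.log
```

Run it, then run "su - oracle" and try to touch the log; any directory in the listing that the dba group can't traverse is your culprit.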

Saturday, November 24, 2007

Most Solaris 10 users and admins already know that a whole lot about the Solaris OS changed when they switched from version 9 to 10. One of the most significant things that changed is the way Solaris deals with internal IPC settings, semaphores, shared memory and other kernel tunables.

In all previous versions, you needed to include those sorts of settings in /etc/system. The upside to this was that you could keep all of your information in one file and easily report on it by calling a common system command like "sysdef." Of course, one of the big disadvantages of having to set all of that stuff up in /etc/system was that you had to reboot your machine, every time you made a change, in order for the settings to take effect (/etc/system is read during the boot process at a level which can't be simulated at any user-operable run level).

In Solaris 10, a lot of the old settings in /etc/system have been deprecated. You can still include them without causing a problem with your boot process, but Solaris 10 will ignore most of the options you put in there with regards to the old system settings. The new preferred way of setting these variables in Solaris 10 is the "project." Among its many advantages, you can now modify a lot of settings, that used to require a reboot, on-the-fly!

This post isn't going to go into the entire "project" concept, but, instead, focus on two different ways of using it. That is, we'll be looking at using "project" settings on a "per user" basis and a "per action" basis.

Enabling project settings on a "per user" basis is generally preferred by most users. For instance, the oracle user account will want to have the maximum amount of shared memory set to an exact value in the kernel whenever it is invoked. The settings need to be the same all the time, and enabling project settings on "per user" basis is the easiest way to accomplish that.

For a small example, here is how we would set the oracle user up to always have its shared memory maximum set to 17GB (no matter how the account is called or used):
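A hedged sketch of that setup -- the "user.oracle" project name follows the Solaris convention for per-user projects, and the resource-control syntax can vary slightly between Solaris 10 updates, so check it against your release:

```shell
# Create a per-user project for oracle and cap shared memory at 17GB.
projadd -c "Oracle settings" -U oracle user.oracle
projmod -s -K "project.max-shm-memory=(privileged,17G,deny)" user.oracle
```

Once that lands in /etc/project, any login by the oracle account picks it up automatically; "prctl $$" from a fresh oracle login should show the cap.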

Now, you may want to make it so these values can be used on a "per action" basis. This way the user can log in and not necessarily have to use the project values assigned. This is preferable if you only need to run certain processes with the project settings, but don't want them set while you do other things. The process is pretty much the same:

Now when the user logs in, he or she can verify that the "project" settings aren't in effect by running "prctl $$" as above. Output should show that user is in the default project for the OS. He or she can now also take advantage of the project settings by using a command called "newtask," like so:

newtask -p user.oracle COMMAND

or attach to an existing process with:

newtask -v -p user.oracle -c PROCESS_ID

For the "per task" processes, you can verify that they're in effect for the process you started (or attached to) them with by doing:

prctl -i PROCESS_ID

or

prctl -n user.oracle (optionally with "-i PROCESS_ID" -- running this without specifying the PROCESS_ID will show all processes running under the user.oracle project resource).

And that's it. Play around with it and have fun. Like we mentioned before, you can try this out a couple different ways until you get it just the way you like. With the new "project" method of setting kernel tunables, you'll not have to reboot any more between your iterations :)

While there are more than a few ways to pack up and distribute software distros you've either downloaded and custom-compiled, or built yourself from scratch, if you work on a lot of Solaris boxes, it's nice to be able to produce legitimate "pkg" files to showcase your efforts.

Today, we'll look at how easily this can be done (although, of course, many other methods are much much less drawn-out). After reading through this post, you should be able to create your own "pkg" files and even (we'll look at this in some detail in a future post) script out and automate the process with a little ingenuity :)

For the purposes of this post, we'll assume that we're going to build the latest version of PROGRAM (pardon my lack of originality in naming ;)

1. First, ensure that you have the disk space, and compile your program with the "--prefix" option (pretty much standard now with all publicly available source builds):

2. Now you should have your program built and installed under /usr/local/builds/PROGRAM, with all the subdirectories beginning there (e.g. /usr/local/builds/PROGRAM/bin, /usr/local/builds/PROGRAM/sbin, etc). Make sure that all the files have the ownership and permissions that you will want them to have when your "pkg" file is complete!

3. Now we'll create the "prototype" file. Add all necessary links (like usr=/usr) for all subdirectories that you want your package to unpack into. We're going to assume that your package is going to unpack into the root directory (/) like most Solaris "pkg" files:
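A hedged sketch of generating it with pkgproto (the path mappings below are assumptions -- use whatever matches your build layout and where you want things to land):

```shell
cd /usr/local/builds/PROGRAM
# Map each build subdirectory to its destination on the target system;
# pkgproto writes one prototype line (type, class, path, mode, owner,
# group) per file, and you can edit any of them before packaging.
pkgproto ./bin=/usr/local/bin ./sbin=/usr/local/sbin > prototype
```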

4. Then, add pointers to the prototype file. You'll "need" one to the "pkginfo" file that we'll create soon, and you can also create entries for "preinstall," "postinstall" and "checkinstall" files, although they're not completely necessary. "checkinstall" is a good file to have if you want to be able to ensure that your package won't install if it is somehow corrupt or not being installed on a proper system. Note that the "checkinstall" script is run by the user "nobody," and runs automatically, so your source needs to be accessible for checks by that user in order for it to work correctly. "preinstall" and "postinstall" are run as "root" and prompt the user to run them. You can also add a line to reference a script to check dependencies within the "prototype" file of the form "depend=/path/to/depend/script." Add the following to the top of the /usr/local/builds/PROGRAM/prototype file:

5. Next, create the "pkginfo" file. Lots of its variables aren't necessary, and most are self-explanatory. ISTATES lists the acceptable run levels in which the package can be installed; RSTATES, the acceptable run levels in which it can be removed.
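For reference, a minimal pkginfo sketch along those lines (every value below is a placeholder):

```shell
PKG="PROGRAM"
NAME="PROGRAM - custom Solaris build"
VERSION="1.0"
ARCH="sparc"
CATEGORY="application"
VENDOR="XYZ Corp"
BASEDIR="/"
CLASSES="none"
ISTATES="S s 1 2 3"
RSTATES="S s 1 2 3"
```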

6. And create a simple "checkinstall" file as well (both the "pkginfo" and "checkinstall" files need to be in the same directory as the "prototype" file and the software - in our case /usr/local/builds/PROGRAM):

7. Now, we're finally ready to create the package with "pkgmk" and "pkgtrans":

cd /usr/local/builds/PROGRAM
pkgmk -b `pwd` -d /tmp
pkgtrans -s /tmp /usr/local/builds/PROGRAM.pkg PROGRAM.pkg

For pkgmk, "-b" is your source base directory and "-d" indicates the "device" (in this case, the /tmp directory) into which you want to build the actual "pkg" file. For pkgtrans, "-s" indicates that you want to translate the PROGRAM.pkg package in /tmp into datastream format as /usr/local/builds/PROGRAM.pkg.

Now you're all set to grab the PROGRAM.pkg file from the /usr/local/builds directory and install it with "pkgadd" on any Solaris system you want to (depending, of course, on the restrictions you put in place in your "checkinstall" - and possibly "depend" - script)!

Friday, November 23, 2007

Apologies for the length of this post. I couldn't find any reasonable way to make it any shorter - so I'm including a link to where Sun is officially keeping all their documentation on this subject, should you want to explore it beyond this post. You can access that info here!

As I always say when working with Solaris 10 (or whatever the latest version of Solaris is), you should do a trial run of a zone migration before you actually move any zone from one machine to another. Make up a bogus zone (like "origzone"), set up the basic attributes, and use that before you risk your important data!

The move is fairly simple at a high level (I will avoid broaching the subject of Solaris' RBAC to keep this at least somewhat short!):

1. The zone is halted and detached from its current host.
2. The zone is packed up.
3. The zonepath is moved to the target host, where it's attached.

In order to migrate the zone, you have to meet a few pre-requisites:

1. The global zone on the target system has to be running the same Solaris release as the original host.

2. To ensure that the zone will run properly, the target system needs to have the same versions of required OS packages and patches as the original. Any other packages and patches can be different; for instance, any third-party products and their respective patches don't need to be the same. Any packages that are OS-related and/or zone-inheritable must be included!

3. The host and target systems have to be of the same machine architecture (easily verifiable using "uname -m").

Once that's all been confirmed, we're off! Running "zoneadm detach" on the original host sets the zone up so it can be attached to the target system. Running "zoneadm attach" on the target system attaches the migrated zone and verifies that the new system can host it.

Now we'll walk step-by-step through migrating the non-global zone from the original machine to the target host. Again, this process must be run as root in the global zone in order to ensure that it works without error (not to mention at all ;)

1. Become the root user and halt the zone to be migrated ("origzone" in this procedure).

orighost.xyz.com# zoneadm -z origzone halt

2. Now detach the zone.

orighost.xyz.com# zoneadm -z origzone detach

3. Move the zonepath for origzone to the new host.

You can do this any way you want, basically. Just cd into /export/zones, and tar up the contents of the origzone directory. Then move the tarred up zone file to the target server using whatever file transfer facility you prefer, so that you have /export/zones/origzone set up on your target host.
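In shell, that step might look like this -- a hedged sketch (the host names and paths come from the example above, and scp is just one transfer option among many):

```shell
# On the original host: pack up the detached zone's zonepath.
cd /export/zones
tar cf origzone.tar origzone

# Copy it to the target (scp shown; use whatever transfer you like)
# and unpack it in the same location there:
scp origzone.tar targethost.xyz.com:/export/zones/
ssh targethost.xyz.com 'cd /export/zones && tar xf origzone.tar'
```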

4. On the target host, configure the zone:

targethost.xyz.com# zonecfg -z origzone

You will get a message from Solaris letting you know that origzone doesn't exist on the target machine, and needs to be created. This is only partially true ;)

5. To create the zone origzone on the target host, use the "create" subcommand with the "-a" (attach) flag at the zonecfg prompt from the previous step, pointing it at the zonepath you moved over:

zonecfg:origzone> create -a /export/zones/origzone

6. Make any necessary modifications to the configuration. For instance, to borrow from the example on sun.com, the NIC on the target host may be different, or other devices may not be the same. To change the NIC:
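In zonecfg, a NIC swap looks roughly like this (hedged -- the interface names ce0 and bge0 are just example devices; substitute your own):

```shell
targethost.xyz.com# zonecfg -z origzone
zonecfg:origzone> select net physical=ce0
zonecfg:origzone:net> set physical=bge0
zonecfg:origzone:net> end
zonecfg:origzone> commit
zonecfg:origzone> exit
targethost.xyz.com# zoneadm -z origzone attach
```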

Note that, at this point, you'll be warned if you didn't meet any of the pre-requisites regarding required-same packages and patches on the target host, as we mentioned above.

As an additional end-note, you can force the attach operation without performing the validation if you're absolutely sure that everything is okay and the message is in error. Of course, if the message is correct, and you're not, forcing the attach may result in random and bizarre zone behaviour later on. It's not recommended (by Sun or by me :)

targethost.xyz.com# zoneadm -z origzone attach -F

Hopefully now you've gotten your zone migrated, and can script out as much as possible, to make mass migrations as simple and expeditious as possible in the future!

Thursday, November 22, 2007

Today, in honor of the spirit of the day, I thought I'd write a little something for which you might, eventually, give many thanks :)

As we mentioned in yesterday's post, once you have SSH keys set up network-wide -- so that you can log in passwordlessly, as yourself, on all the machines you administer -- our "scmd" script can be a blessing; especially if you're ever stuck with the seemingly impossible task of retrieving some obscure bit of information off of every single box!

I won't lie to you; this is the worst part of the process. In order to get your keys initially set up, you'll have to put forth a good deal of effort (Your misery will be commensurate with the size of your network). But, once you're done, you'll be on easy street.

For our purposes, we're going to assume that you're using Solaris' (now) standard implementation of SSH (Basically OpenSSH) and craft our instructions using those assumptions. Depending on what kind of SSH client/server you're using, there may be differences in the names of certain files (e.g. authorized_keys2 might be authorized_keys -- The man pages are your friends :) We're also assuming that you actually already have user accounts on all the machines you're required to do work on.

So, to get the worst part over with, simply do the following (cut it up into chunks if it becomes frustrating. Depending on the size of your network, this is quite possible):

1. Create a file with the names or IP addresses of all of the machines you need to work on. For our purposes, format that file with one hostname/IP per line. We'll assume it's named "HostFile"

2. If you haven't done so already, generate your personal ssh-keys on the one machine that you're going to use as your central hub for administrating all the others, like so:

ssh-keygen -t dsa (Hit return to accept the default file locations -- /your/home/directory/.ssh/id_dsa and /your/home/directory/.ssh/id_dsa.pub) (Also, just hit return both times you're prompted for a passphrase. While an empty passphrase is slightly less secure than a double-confirmed one, entering a passphrase here would completely defeat the purpose of setting up this easy administration)

3. Run the following from your home directory (This is the first really long and trying part, as you'll have to interact manually):

for x in `cat HostFile`;do ssh -n $x "mkdir -m 700 .ssh";done (The actual SSH command may vary depending on your distro. I'm using the "-n" flag, which redirects ssh's standard input from /dev/null, to avoid having SSH break out of the "for" loop)

Now, if we're starting from scratch, you'll need to answer "yes" for each machine's initial prompt (it will ask you if you trust the host and want to accept and save the key) and enter your password.

4. When that's finally over, cd into your home directory's .ssh directory and run the following on the command line ("HostFile" is still in your home directory, so we're using a relative path to it here):

for x in `cat ../HostFile`;do scp id_dsa.pub $x:/your/home/dir/.ssh/authorized_keys2;done (Again, you'll need to enter your password for every machine. Note that if you run a variety of servers, or build standards have changed over time so that your home directory isn't always in the exact same place on every box, you can use "~/.ssh/authorized_keys2" as a substitute for "/your/home/dir/.ssh/authorized_keys2")

5. And, now, you're all done, except for making sure that it actually worked! My favorite way to do this is to just run the same command again. Overwriting your authorized_keys2 file on all the remote hosts won't hurt, and - this time - you should be just kicking back, watching scp's progress meters fly by as you try not to fall asleep ;) Again, the "~/.ssh/authorized_keys2" substitution is perfectly valid and useful.

For all the machines that still ask you for your password, take note of those hosts and troubleshoot as necessary. For the most part, you shouldn't have to worry about running into too many of those.

And from here on out, you can use the "scmd" command we put in our pre-Thanksgiving post, or your own custom scripts, to effortlessly run any command (as yourself, of course) on all your machines, by typing a single line on your hub computer.

Wednesday, November 21, 2007

You may recall that we looked at one of the core components of this script in an earlier post, located here.

Today, in preparation for Thanksgiving, I'm putting up a little script that can help you run almost any command line you can dream up, and save the output for you in a relatively nicely formatted report. I'm calling it "scmd" (short for SSH command) but you can call it whatever you want :)

Now, of course, this script comes with a pre-requisite. For instance, it won't really help you save time if you don't have ssh-key-based passwordless logins set up for yourself on all the servers you manage. For Thanksgiving day, we'll go over how to easily set yourself up with those.

For today, assuming you've got your passwordless logins all set, feel free to modify this script as you wish, and use it to kick back and relax when the boss asks you to verify the version of Veritas Foundation Suite on all 300 servers in your farm. Otherwise, until tomorrow, enjoy only having to type in your password over and over and over again (at least you won't have to keep re-typing the command ;)

The script is posted below, for your enjoyment. Note that the "hostfile" is set to be of the format: hostname colon ip_address (e.g. host.xyz.com : 192.168.0.1). The hostname is the only necessary component of each line and all lines that begin with pound symbols (#'s) will be skipped. One host per line, please.

Again, tomorrow, we'll go over how to easily set up ssh keys for yourself network wide, and feel free to modify this little script to make your worklife as effortless as possible:
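Here's a minimal sketch of what "scmd" can look like -- treat it as a starting point rather than the definitive version; the hostfile parsing and report naming just follow the conventions described in this post:

```shell
#!/bin/sh
# scmd -- run one command on every host in "hostfile" over ssh and
# collect all of the output into a single report.

CMD="$1"

# Build a filename-safe tag from the command by stripping everything
# that isn't a letter or digit, then append our PID so reruns of the
# same command don't overwrite each other's reports.
TAG=`echo "$CMD" | sed 's/[^A-Za-z0-9]//g'`
OUT="OUTPUT.${TAG}.$$"

# hostfile lines look like "host.xyz.com : 192.168.0.1" -- the hostname
# (field 1) is all we need, and lines starting with # get skipped.
for host in `awk '$1 !~ /^#/ {print $1}' hostfile`
do
    echo "==================== $host ====================" >> $OUT
    ssh -n $host "$CMD" >> $OUT 2>&1
done

echo "Report written to $OUT"
```

Invoke it like: scmd "/usr/sbin/pkginfo VRTSvcs"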

After a successful run of, for instance - scmd "/usr/sbin/pkginfo VRTSvcs" - you'll end up with a nicely formatted report with the output from all of the servers in your "hostfile," named OUTPUT.usrsbinpkginfoVRTSvcs.14756; the only real variable in the output report name will be the process id tacked on the end so you can run the same script multiple times and not overwrite your old data.

Tuesday, November 20, 2007

Most of the time, if you want to estimate how fast your network is moving, you'll want to look at the output of tools like "netstat" and the like. Some file transfer programs, like "scp" are kind enough to print out the speed at which they're transferring a file to you while they're doing it.

But, sometimes, you'll be stuck waiting on a file to make it all the way over to your machine using a protocol like "ftp." And, while it's possible to measure the speed of transfer in many different ways, I threw together this little script to let you know how fast your file is being "ftp"'ed to you, by taking the measure of the file being transferred, itself.

Check it out. It's kind of fun to see that there really is more than one way to skin a cat. In this case, you don't even need a skinner ;)

Note: Adjust the time variables to your liking. I wrote this for transfer of rather large files! The script, below, at the most basic level, measures the difference in the received file's size against the passing of time to determine approximately how fast your network is transferring that file.

And, yes, this is another experiment in "brute force scripting" (Getting results as fast as possible using scripting, even if it means all your t's aren't dotted ;), so you'll notice that the first output will be 0 and is accompanied by a message to the effect that measurement can't take place until more data is collected.
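A sketch of the idea (the file to watch and the sample interval are arguments; the rate is just bytes-grown divided by seconds elapsed, converted to Kb):

```shell
#!/bin/sh
# ftprate -- watch a file that's being ftp'ed to us and print the
# approximate transfer rate. Usage: ftprate /path/to/file [seconds]

FILE="$1"
INTERVAL=${2:-10}      # seconds between samples; raise it for huge files

LAST=0
FIRST=1
while [ -f "$FILE" ]
do
    SIZE=`ls -l "$FILE" | awk '{print $5}'`
    if [ "$FIRST" -eq 1 ]
    then
        # brute force scripting: the first pass has nothing to compare to
        echo "0 Kb/s - can't measure until more data is collected..."
        FIRST=0
    else
        RATE=`expr \( $SIZE - $LAST \) / $INTERVAL / 1024`
        echo "approximately ${RATE} Kb/s (file is now $SIZE bytes)"
    fi
    LAST=$SIZE
    sleep "$INTERVAL"
done
```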

Here's another interesting tidbit from the "terminally boring" archives of system administration ;)

A lot of times, when you get a complaint that /var, on a Solaris box, is exceeding whatever size limitation you've placed on monitoring it, your first inclination is to go and wipe out the largest (but not most necessary) files immediately and see if that takes care of the problem.

Every once in a while, if you're checking around, you may notice that /var/adm/lastlog is gigantic. Theoretically, zeroing that out (catting /dev/null into it), should take care of your disk usage problem as it seems fairly obvious. Some of us would just leave it at that. The rest of us would check "df -k /var" again and notice that the percentage of partition space used is relatively the same. That doesn't seem to make any sense.

This is where the interesting part comes in. Solaris' implementation of lastlog has an interesting bug/feature that makes it seem larger than it is; but only some of the time.

The reason for this is that, while its actual size stays fairly static (about 24 Kb maximum), lastlog always reports its size (when using "ls -l") relative to the user account id that last logged in (once the 24 Kb maximum is reached). The equation is roughly "the user account id number" multiplied by "28 bytes." So, when root logs in with a userid of "zero" (after you've zeroed out the file), it seems to grow to a size of 28 bytes (Yes, this is the minimum - and, yes, 28 times zero should equal zero ;) However, if you do an "ls -s" (to figure out the number of blocks) and a "du -k" (to figure out the size in Kb) on /var/adm/lastlog, you'll see that it's not really taking up all that much space. Below:

If a user with a userid of 6504 logs in (after zeroing out the file), the block and Kb sizes will show the maximum (48 and 24, respectively), but "ls -l" reports:

$ ls -l /var/adm/lastlog
-r--r--r--   1 root     root      182112 Nov 19 17:53 lastlog

(6504 userids times 28 bytes apiece works out to 182112 -- the math checks.)

Crazy, yeah? But, interesting to know, and helpful, since you can avoid this file when trying to pare down the size of the /var partition.

As a caveat, the "fake" size reported by "ls -l" is only fake while lastlog is being manipulated by the Solaris Operating System in the manner in which it was specifically designed to be manipulated. If you copy that 500Mb-looking file (or move it, tar it, etc.), all of the "blank" space gets padded out with NULs, and you end up having a file on your hands that really "is" insanely large!
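You can watch the same effect with any sparse file; a quick demonstration (nothing Solaris-specific here, and the /tmp paths are just for illustration):

```shell
# Create a file whose "ls -l" size is a megabyte, but which is almost
# all "hole" -- one real byte written at the very end.
dd if=/dev/zero of=/tmp/sparse bs=1 count=1 seek=1048575 2>/dev/null

ls -l /tmp/sparse    # claims 1048576 bytes
du -k /tmp/sparse    # only a block or two actually allocated

# Copy it by reading the data back -- the holes come out as real
# zeros, so (on most filesystems) the copy genuinely occupies the space.
cat /tmp/sparse > /tmp/fat
du -k /tmp/fat
```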

Monday, November 19, 2007

If you've ever found yourself in a situation where a Veritas Cluster Server (VCS) resource/service group is reporting faulted, you may have seen the situation described here. Today we're going to focus on one specific condition where the most common wisdom I've seen on the web is to do an "hastop -local -force" and follow up with an "hastart" to solve the problem. Common sense dictates that this is the last thing you'll want to do, unless you're on a testing environment, as you'll be completely downing the cluster in the process.

The reason the stop/start answer is the most common is that, since the resource/service group's "resource" is waiting to go OFFLINE while the resource/service "group" is trying to start (go ONLINE), simply attempting to "online" or "offline" either one results in an infinite hang. Each entity's successful action depends on the other entity's failure, and VCS won't fail if it can wait instead.

You'll know you're looking at this sort of error if you see the following (this generally can be found by running "hasum" or "hastatus -summary"):

A. The resource/service "group" will be showing in a "STARTING|PARTIAL" state.

B. An individual "resource" within that resource/service "group" will be showing in the "W_OFFLINE" state.

The following steps to resolution are certainly not the only way to get everything back to an "up" state and, also, assume that there is nothing really "wrong" with the individual resource or resource/service group. That sort of troubleshooting is outside the scope of this blog-sized how-to.

So, again, our assumptions here are:

1. An Oracle database resource/service group, named "oracledb_sg", has faulted on the server "vcsclusterserver1.xyz.com".

2. An individual resource, a member of the resource/service group "oracledb_sg", named "oracledb_oracle_u11192007", is really the only thing that's failed, or shows as failing, in "oracledb_sg".

3. There is actually nothing wrong with the resource or the resource/service group. Somehow, while flipping service groups back and forth between machines, somebody made an error in execution, or VCS ran into a state problem that it caused by itself ("split brain" or some similar condition).

4. Note that we've reached these assumptions based partially on the fact that the resource is waiting to go OFFLINE, and the resource/service group is waiting to go ONLINE (stuck in STARTING|PARTIAL), on the same server!

And the following are the steps we could take to resolve this issue and get on with our lives:

1. First, get a summary of the resource group that you've been alerted as failed or faulted, like this:

hastatus -summary|grep oracledb_sg|grep vcsclusterserver1 (or some variation thereof, depending on how much information you want to get back)
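From here, the usual sequence (worth double-checking against your VCS version's hagrp man page) is to flush the group's pending online/offline waits, clear any fault flag, and then bring the group online cleanly:

```
vcsclusterserver1.xyz.com# hagrp -flush oracledb_sg -sys vcsclusterserver1
vcsclusterserver1.xyz.com# hagrp -clear oracledb_sg -sys vcsclusterserver1
vcsclusterserver1.xyz.com# hagrp -online oracledb_sg -sys vcsclusterserver1
```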

Now your resource/service group should be back in the straight ONLINE state, and you shouldn't see any messages (In "hasum" or "hastatus -summary" output) regarding the individual resource. Time to relax :)

Sunday, November 18, 2007

It's generally good practice, as an administrator, to make sure that you'll be able to account for who was on what box and when, just in case something bad ever happens and the higher-ups want an explanation.

This issue generally comes up with "generic" (or group-used) accounts that are directly accessible. If anyone can log into the "genericAccountName" user account, it's harder to establish an audit trail, since you have to start outside of the box on which your investigation should be centering.

Fortunately, it's relatively simple to set up "generic" user accounts so that users have to log on to your server as themselves first, and then su to that account. This means you'll be able to look at the output from "last" and the "su log" to quickly determine who was logged in as the "generic" user at any given point in time.

The steps for setting up a "generic" account so that it's bound by these restrictions (and is still usable) are the following (substitute the appropriate profile file if you're not using ksh or sh):

1. Give the user's .profile 755 permissions and make it owned by root:root, so the user can't edit it or change its permissions.

2. At the end of the .profile, source in another file (.kshrc or something) so that users can still modify their environment, to suit their needs, once they've su'ed to the account.

3. In the .profile that you've made "755 root:root" do a simple check to see if "logname" matches the account name. I suggest using "logname" because that's set by the login program and never changes; even if you su. The $LOGNAME variable, on the other hand, can be changed and is easy to manipulate. Something like this should be sufficient:
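A sketch (the account name is, obviously, a placeholder):

```shell
# Top of the generic account's root-owned .profile. "logname" returns
# the name you originally logged in with, even after an su -- so a
# direct login as the generic account gets caught here.
if [ "`logname 2>/dev/null`" = "genericAccountName" ]
then
    echo "Please log in as yourself and su to this account."
    exit
fi
```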

4. You should also check the value of "$-" in the system's /etc/profile (specifically, look for the "i" that indicates an interactive shell) and make sure that you don't allow commands to be executed in a non-interactive shell (you need to do this in /etc/profile, instead of the user's .profile, because a non-interactive session won't read the .profile). If you skip this, users won't be able to log in and work interactively, but they will still be able to use ssh, or rlogin, to execute commands remotely and directly, as such:

ssh genericAccountName@host.xyz.com "ls"
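In /etc/profile, something like this (again a sketch, with a placeholder account name) closes that hole:

```shell
# If the shell is non-interactive ($- contains no "i") and the login
# identity is the generic account, bail out before any command runs.
case $- in
    *i*) : ;;                         # interactive -- the .profile check handles it
    *)   if [ "`logname 2>/dev/null`" = "genericAccountName" ]
         then
             exit
         fi ;;
esac
```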

5. If the "generic" account needs to run cron jobs under its own identity, the simple check at the beginning of the .profile will keep those cron jobs from executing (That is, when cron executes and reads the .profile, the initial check will see that "logname" equals the "generic" account name, and fail). The easiest fix for this is to put an additional check in the user's root-owned .profile to allow direct login as long as the user is already on the host (I know that sounds goofy, since you can't invoke the "login" command as a user once you're on the box, but that's essentially what cron will do). You can verify the host info with the "w" command, like:
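One hedged way to pull the "from" field out of "w" (column positions and flags vary between systems, so treat this as a sketch):

```shell
# Grab the "from" host for the generic account's session; an empty or
# local value suggests the session originated on this box, as cron's do.
FROM=`w 2>/dev/null | awk -v u="genericAccountName" '$1 == u {print $3; exit}'`
if [ -z "$FROM" ] || [ "$FROM" = "`uname -n`" ]
then
    :   # local session -- let it through so cron jobs still run
fi
```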

combine that with the basic "logname" check, however you like, and the "generic" user can still use cron even though the account is now "su-only."

And that should make it so people can't login to your machine directly, and ensure that they will have to su to the "generic" account. This is great for a cheapo-quickie audit trail and should save you a good deal of time if that "generic" account is used improperly or is involved in causing any other issues.

Saturday, November 17, 2007

Today's little tip can actually come in useful even if the information you're seeking isn't "mission critical" (which, by the way, ranks among one of my least favorite terms. If there's one thing positive I can say about where I work now, it's that they don't describe every problem, resolution or project as if we were engaged in war -- but that could be an entirely separate post ;).

I've actually been asked to figure out what process was running on what port more often for information's sake than to try and figure out why something was "wrong," but the same principles apply. The scenario is generally something like the following:

Internal customer Bob needs to start (or restart) an application, but it keeps crashing and getting errors about how it can't bind to a port. This part is necessarily vague since, in my experience, it's very common to be asked to figure something out with little or no information. I consider myself lucky if I have a somewhat-specific description of the problem at the onset. As we all know, folks will sometimes just complain that "the server is broken." What does that mean? ;)

The troubleshooting process here is pretty simple and linear (perhaps more detail and information in a future post regarding similar issues, as any problem or situation can be fluid and not always follow the rules). In order to try and fix Bob's problem, we'll do the following:

1. Double check that the port (We'll use 1647 as a random example) is actually in use by running netstat.

netstat -an|grep 1647|grep LIST

You can leave out the final "grep LIST" if you just want to know if anything is going on on port 1647 at all. Generally, the output to look for is in the local address column (the format is generally IP_ADDRESS:PORT - like 192.168.1.45:1647 or *:1647 - although, depending on your OS, the colon may be a dot). Whether or not you're checking for a LISTENing process, information about a connection from your machine on any port to foreign port 1647 shouldn't concern you.

2. We're going to assume that you actually found that the port is either LISTENing, or actively connected to, on your local machine (if it isn't, your troubleshooting would likely take a much different turn at this point). Now we'll try to figure out what process is using that port.

If you have lsof installed on your machine, figuring this out is fairly simple. Just type:

lsof -i :1647

and you should get a dump of the list of processes (or single process) listening on port 1647 (Easily found under the PID column). They're probably all going to be the same, but, if not, take note of all of them.

3. Run something along the lines of:

ps -ef|grep PID

and the problem is solved! You now know what process is listening on port 1647, and you'll probably end up having to hard kill it if Bob doesn't have any idea why it won't let go of the port using the standard methods associated with whatever program is using it.

But, sometimes, the last part isn't that simple, so:

4. What's that? lsof isn't installed on your machine? My first inclination is to recommend that you download it ;) Seriously, it's a valuable tool that you'll find a million uses for. But you can find out the process ID another way, just in case you can't get your hands on it and/or time is of the essence, etc.

In this instance, and we'll just assume the worst, you can use two commands called "ptree" and "pfiles" (these are standard on Solaris in /usr/proc/bin - may be located elsewhere on your OS of choice and/or named somewhat differently). Use the following command to just grab all the information possible and weed it down to the process using port 1647:
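A sketch of that brute-force sweep (Solaris-flavored: pfiles lives in /usr/proc/bin there, and its exact output format varies between releases; I enumerate PIDs with ps, though ptree works just as well for grabbing them):

```shell
# For every process on the box, ask pfiles what it has open, and keep
# only the ones bound to port 1647; print the PID alongside.
for pid in `ps -e -o pid=`
do
    /usr/proc/bin/pfiles $pid 2>/dev/null | grep "port: 1647" >/dev/null &&
        echo "$pid is using port 1647"
done
echo "scan complete"
```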

and you'll get the line of output that maps your PID to your port. The above is, admittedly, somewhat messy (not really messy, but you'll end up printing a lot of blank lines ;) Feel free to tailor it to your needs and make it more general (I explicitly used port 1647, but that should also be a variable if you want to create a little script to keep in your war chest).

Run your ps, as above, and now you should know what process is hogging that port and, in the process, making Bob's life miserable. If you cleanly kill that process, Bob should have one less thing to worry about and his program should be able to bind to the now-free port :)

Friday, November 16, 2007

Yes, the title is somewhat misleading. You can't really use up all of any given disk partition and actually have 98% of that space still available for use. Not technically, anyway. This is something fun to try on anyone you like (or like playing pranks on) if you've got the time and the spare machine to do it.

You'd be surprised at how many Solaris admins don't consider what's actually filling up the disk when you throw them a humdinger like this one. Even the ones who do still have to go an extra step. Here are the ingredients for the recipe:

1. Write a simple script to just make directories. You can write an infinite loop that executes "mkdir a;cd a" or get more complicated (to speed up the process) and write a little script that will cd into multiple subdirectories, exec itself and then re-run, ad infinitum. In any event, do this until your computer complains that there is no space left on the partition.
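A sketch of the simple version (run it, as warned, only somewhere you can afford to trash):

```shell
#!/bin/sh
# eat_inodes -- create endlessly nested empty directories until mkdir
# fails. Directories consume inodes but almost no data blocks.
eat_inodes() {
    cd "$1" || return 1
    while mkdir a 2>/dev/null
    do
        cd a
    done
    echo "mkdir failed -- out of inodes (or permission) on this filesystem"
}

# e.g.: eat_inodes /mydir/newdir
```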

I'm sure the astute observer has noticed that I'm simulating this information. I can't afford to do this for real right now ;)

So, this is the gist of it. You've used up all the available inodes by creating a bunch of empty directories, but you've barely used up any disk space in the process!

4. Now, to take it one step further (and further your enjoyment in the process), run this little Perl command line (assuming the top level directory you created on the /mydir partition was /mydir/newdir):

perl -e "unlink '/mydir/newdir';"

Now the top level directory, and just the top level directory, has been removed. The link to the inode no longer exists, making it impossible to cd into the directories below it, which still exist. To the naked eye it appears that there is nothing on /mydir that could possibly be causing the problem. Even utilities like "du" will report the correct total size (which is misleading) but not list the disk space usage for your non-existent directory when run as, say, "du -sk /mydir/*"

Eventually your co-worker and/or friend (maybe not anymore ;) will figure out, after doing an fsck, that the directory inode needs to be relinked and, if he's seen this before, will look in the /mydir/lost+found directory to find the missing "inode number" pointer and relink it manually.

That is why I highly recommend you remove the lost+found directory as well!

Enjoy your weekend. Remember - Please only do this in fun and, for your own safety, only to someone with a good sense of humor :)

Thursday, November 15, 2007

As hinted in the subtitle of this little blog, every once in a while I like to sound off on the English language. It's at least as interesting as all the different programming ones :)

I'm just coming off a long day of work after a long night of fighting a cold and watching a lot of bad TV. The topic for today is something I notice constantly, but never really say anything about, because the misuse of the meaning of these words is so commonly traded, it seems odd when a screenwriter or news anchor uses them correctly.

I can't tell you how many times I'll be watching a movie and some guy (usually a tough guy - the hard ass) will be putting the press on his nemesis-of-the-moment and he'll say something like "Listen, Pal. You've got two choices. We can do this the easy way, or the hard way." Something like that. Feel free to embellish with expletives.

So, then, I think a passive "what the.." and let it go. I mean, no one wants to hear about it every time it happens - because it happens all the time! I don't expect the victim on the TV screen to correct some hooligan's English, but somebody needs to sit all these screenwriters, commentators, and other people whose job it is to "speak" to the general public down in a room and not let them leave until they understand the meaning of certain terms. Of course, we'll have to excuse the ones who are clever enough to point out that their characters only speak that way because everybody else in the world does. Sadly, it's called authenticity ;)

What I'm saying is, the victim here doesn't have two choices. Only one! He does have two options, however. He must choose between those two options, make a decision, and select the one that will propel the plot in an interesting direction; or, at least, keep him from being beaten too badly.

I would actually like to a see a scene where some rough slouching beast of a bad-guy says something like "All right, buddy (as a blogger's side note, I think it's strange that bad guys are always so chummy with their prey. Do they need friends that badly? They should try being nice ;) - Starting again - "All right, buddy. You've got two choices. You can either do this the easy way or the hard way and you can either eat your beans or take a vitamin supplement." More confusing? Yes, but at least the guy really has two choices.

I'd imagine it would be really hard to sound cool responding to that multiple-threat. Who's got a snappy comeback for two completely unrelated evaluations? "Oh, I think we're gonna do this the hard way, my friend... and pass the beans!"

That's enough for now, but keep your ears open; pay attention to the words fictional characters, newsfolk, and even our governmental leaders use when they speak. Always be asking yourself: Are these multiple choices or one choice composed of several options? Is someone making a choice (actively constructing a set of options), or are they making a decision? Once they've decided, can they still choose?

Consider the meaning of these three words and their derivatives - Choice - Option - Decision. Once you and your friends have agreed upon the definitions (after consulting a dictionary or Strunk and White), turn on the TV and slam a shot or beer every time somebody completely mangles the meaning.

Warning: If you engage in the above activity, you may not be able to drive home safely at the end of the evening. If you are watching CNN, you may not be alive after an hour or so ;)

Wednesday, November 14, 2007

Most of the places I've ever worked, where I needed to manage Bind and DNS, it was a few little things at a slow pace. You had your main zone, maybe a few subzones and it all boiled down to a few files to manage in a relatively small setup. I'm talking about the kind of setup that a Sun Ultra5 could eat for lunch. The NIC was almost always the bottleneck.

However, some places (like ISP's and other service providers) will have gigantic amounts of zones that they serve. In those sorts of instances, doing things on the fly doesn't much cut it, unless you can afford to spend the majority of your work life doting over your DNS setup.

One of the main things that you need to worry about, after the initial setup and any time you do a migration or merge DNS zone depots, is making sure you didn't screw the pooch in the process. Thankfully, Bind comes with a few programs to make your life easy.

Given an unlimited number of zones, you can use the named-checkzone command to verify them all while you surf the web ...or work really really hard ;). Its syntax is simple:

named-checkzone the_zone_name the_zone_filename

And, if you wrap that in a simple loop, iterating over the files in your DNS directory, you'll end up with a simple-to-analyze report in minutes. Consider the following:
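Something like this does the trick (ZONEDIR and the ZoneName.db naming convention are my assumptions here -- adjust both to your own layout):

```shell
#!/bin/sh
# Check every zone file in ZONEDIR with Bind's own named-checkzone and
# keep a copy of the results in "confstate" for later analysis.
ZONEDIR=/var/named/zones
if cd $ZONEDIR 2>/dev/null
then
    for zf in *.db
    do
        # the zone name is just the filename minus the trailing ".db"
        zone=`echo $zf | sed 's/\.db$//'`
        named-checkzone $zone $zf
    done | tee confstate
fi
```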

Above, very simply, we've taken every zone file (like most folks, I like to keep things simple when they don't have to be complicated) and checked it using Bind's own checker (how much better could that get?). In this case, I named my zone files using the convention ZoneName.db. To create the_zone_name variable, I use a simple sed command to capture every part of the filename except for the trailing ".db" - that output all gets dumped into the confstate file using tee.

You can do the same thing with your named.conf (using Bind's named-checkconf), but there's almost no reason to script that. Hopefully you've only got one configuration file, and you can check it as simply as typing:

named-checkconf /etc/named.conf

or whatever you named your conf file and wherever it is.

Enjoy your snooze. Hopefully all of your zone files are in compliance :)