At the IBM Technical Symposium in Sydney last week, a person approached me to discuss NIM and some of its capabilities. During the conversation we discussed how NIM could be used to copy files from the NIM master to its NIM clients. I promised to send them some information on how to achieve this ASAP. They were smart and followed up with an email the next day! They’d even tried to configure this on their systems but hit a small problem.

"Hello Chris,

So trying to push out a new netbackup tar file to one of our nim clients (usually we do all sort of stuff via our SSH deployment server). But decided to use NIM.

So I created the resource file_res called netbackup. Where the tar file is placed on the master

But for the life of me I cannot find anywhere under NIM to allocate the resource to the client i.e. to push the file out..
I though it would be under install software, but the resource is not listed only lpp_sources.

Do you have any ideas, this one I am stuck on.

Can I actually push files out by themselves? Thanks"

The answer is yes! You can define a file_res resource on the NIM master. From the AIX 7.2 Knowledge Centre:

"A file_res resource is where NIM allows for resource files to be stored on the server. When the resource is allocated to a client, a copy of the directory contents is placed on the client at a location that is specified by the dest_dir attribute."

"-a location=Value Specifies the full path name of the directory on the NIM server. This path is used as a source directory among clients.
-a dest_dir=Value Specifies the full path name of the directory on the NIM client. This path is where the source directory is recursively copied into.

Notes: If the target directory does not exist on the destination machine, the entire source directory contents are copied (including the hidden files in the top-level directory). If the target directory exists on the destination machine, the source directory contents are copied (excluding the hidden files in the top-level directory)."

Essentially, you place the files you need to distribute to your NIM clients in a directory on the NIM master. Then you create a file_res NIM resource that points to this directory. After that, you allocate this resource to the NIM client or NIM machine group, and run a NIM customisation operation against the client (or machine group). This copies the directory and all its files to the NIM client. Pretty cool really!

Here's an example. I want to distribute (copy) all of the files located in /usr/local/etc, to a NIM client. The original (source) directory and files reside on the NIM master.
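The steps can be sketched roughly as follows. This is a minimal sketch, not a definitive procedure: the resource name (localetc) and client name (aixlpar1) are hypothetical, the define attributes are the ones quoted from the Knowledge Centre above, and the allocate and cust invocations are my assumption of how the allocation and customisation steps look on the command line. The script only prints the commands unless you set DRY_RUN=0.

```shell
#!/bin/sh
# Sketch: push /usr/local/etc from the NIM master to a client via file_res.
# RES and CLIENT are hypothetical names; change them to suit.
RES=${RES:-localetc}
CLIENT=${CLIENT:-aixlpar1}
SRC_DIR=/usr/local/etc     # source directory on the NIM master
DEST_DIR=/usr/local/etc    # target directory on the NIM client

# Print the command; only execute it when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# 1. Define the file_res resource on the master.
run nim -o define -t file_res -a location="$SRC_DIR" \
    -a dest_dir="$DEST_DIR" -a server=master "$RES"

# 2. Allocate the resource to the client (or machine group).
run nim -o allocate -a file_res="$RES" "$CLIENT"

# 3. Run a customisation operation to copy the files across.
run nim -o cust "$CLIENT"
```

Run it once with the default dry-run setting to review the commands, then again with DRY_RUN=0 on the NIM master.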

If you’re looking for a fast and cheap way of copying a bunch of files from a central location to one or more servers, then this is a very good option for AIX administrators. It’s like a “poor man’s” file collections; similar to what we once had with Cluster Systems Management (CSM, which is now discontinued) and PowerHA File Collections (but not as powerful or configurable).

On the odd occasion, NIM may report that a resource is allocated to a NIM client when, in fact, it is not. Typically, you’d check whether the resource really is allocated to any NIM client and, if it is, reset that client, which resolves the issue. But if that doesn’t work, you may need to take additional action. This doesn’t happen very often, but it can be frustrating when it does.

Here’s an example of the problem. I try to remove an lpp_source resource but I’m told that it’s still allocated to a client. But it isn’t, I tell you!

# nim -o remove liveupdaterte
0042-001 nim: processing error encountered on "master":
   0042-061 m_rmpdir: the "liveupdaterte" resource is currently
        allocated for client use

Even lsnim is telling me that the resource is still allocated, somewhere, because alloc_count is set to 1.

# lsnim -Fl liveupdaterte
liveupdaterte:
   id          = 1447111715
   class       = resources
   type        = lpp_source
   comments    = LIVE
   arch        = power
   Rstate      = ready for use
   prev_state  = verification is being performed
   location    = /export/nim/cglpp
   alloc_count = 1
   server      = master

After trying to de-allocate the resources, by resetting my NIM clients (see my script at the bottom of the page), and still receiving the same error, I’m left with little choice but to manually reset the alloc_count value to 0, using the (almost undocumented) /usr/lpp/bos.sysmgt/nim/methods/m_chattr NIM utility.

For reference, this is how the NIM documentation describes m_chattr usage:

"The release level of the resource is incomplete, or incorrectly specified. The level of the resource can be obtained by running the lsnim -l ResourceName command and viewing the version, release, and mod attributes. To correct the problem, either recreate the resource, or modify the NIM database to contain the correct level using the following command on the NIM master: /usr/lpp/bos.sysmgt/nim/methods/m_chattr -a Attribute=Value ResourceName, where Attribute is version, release, or mod; Value is the correct value; and ResourceName is the name of the resource with the incorrect level specification."
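Applied to the stuck lpp_source from the example above, the reset might look like the following. This is a sketch only: it assumes m_chattr accepts alloc_count in the same Attribute=Value form it uses for version, release and mod, and it prints the commands rather than running them unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: clear a stale allocation count on a NIM resource, then remove it.
# Run on the NIM master. RES defaults to the resource from the example.
RES=${RES:-liveupdaterte}
M_CHATTR=/usr/lpp/bos.sysmgt/nim/methods/m_chattr

# Print the command; only execute it when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Force the allocation count back to zero in the NIM database.
run "$M_CHATTR" -a alloc_count=0 "$RES"

# Confirm that alloc_count now reads 0.
run lsnim -Fl "$RES"

# Retry the remove operation that failed earlier.
run nim -o remove "$RES"
```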

One question that comes to mind is: how did the NIM resource end up in this state? Most likely it was the result of a failed NIM operation involving the lpp_source and the NIM client to which it was to be allocated. This can be tricky to pick up and, almost always, it’s the next person who tries to use the resource that finds the problem, with no idea what events led up to this point.

As always, use caution when experimenting with this tool. If in doubt, take a backup of your NIM database before you start messing with the attributes, just in case you need it in the future.
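A backup could be taken along these lines. This is a sketch under assumptions: to my knowledge m_backup_db is the NIM method that SMIT uses to back up the database, taking the backup file name as its argument, and the file location shown here is hypothetical. Verify both on your own NIM master before relying on it. The command is only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: back up the NIM database before changing attributes by hand.
# BACKUP_FILE is a hypothetical location; pick one with enough free space.
BACKUP_FILE=${BACKUP_FILE:-/tmp/nimdb.backup}
M_BACKUP_DB=/usr/lpp/bos.sysmgt/nim/methods/m_backup_db

# Print the command; only execute it when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Write the NIM database backup to BACKUP_FILE.
run "$M_BACKUP_DB" "$BACKUP_FILE"
```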

Here’s my NIM client reset script. It resets the client and de-allocates any resources assigned to it. It also resets the NIM client cpuid. This is not always required, but I often use the same NIM client definition to install multiple AIX partitions across several Power servers, so it’s useful to me (and probably only me)! You can remove that line if need be.
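A minimal sketch of a script that performs the three steps described (reset, de-allocate, clear cpuid) is shown below. It is an illustration of the approach rather than a verbatim copy of the script; the function name reset_nim_client is hypothetical, and the commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of a NIM client reset script; run on the NIM master.
# Usage: resetnimclient <nim_client>

# Print the command; only execute it when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

reset_nim_client() {
    client=$1
    if [ -z "$client" ]; then
        echo "Please specify a NIM client to reset." >&2
        return 1
    fi
    # Force the client's NIM state to be reset.
    run nim -o reset -a force=yes "$client"
    # De-allocate every resource still assigned to the client.
    run nim -Fo deallocate -a subclass=all "$client"
    # Clear the stored cpuid so the same client definition can be reused
    # for another partition (optional; remove this line if not needed).
    run nim -Fo change -a cpuid= "$client"
}

# Only act when a client name was supplied on the command line.
if [ $# -gt 0 ]; then
    reset_nim_client "$1"
fi
```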

I’ve been working with a customer recently on an issue with nimadm. They were attempting to migrate a system from AIX 5.3 to 7.1 using nimadm. The NIM client AIX level was 5.3 TL12 SP4 and the NIM master was running AIX 7.1 TL1 SP1.

Given that the error appeared to be related to init_multibos, we assumed the failure was due to some multibos checks being performed by alt_disk_copy on the client. The client system did not have an existing multibos standby instance. So, we tried two things. First, we created a standby instance on the client (multibos -s -X) and re-tried the nimadm operation. This failed. Next, we removed the standby instance (multibos -R) and re-tried the nimadm operation. This worked and the client then migrated to AIX 7.1 successfully. We re-tried the same operations (i.e. create standby instance, remove standby instance and nimadm) several times and each worked as expected.

So it appeared that the unofficial workaround to this problem would be to create and then remove a standby multibos instance prior to the nimadm migration. However, the customer has over 200 LPARs that they need to migrate to AIX 7.1. If possible, they would really rather avoid this extra step in the AIX 7.1 migration plan. We’ve made contact with IBM support and are hoping they can assist us in identifying the root cause of the issue and provide us with an official solution to the problem.

And just yesterday we hit the same problem when migrating from AIX 6.1 to 7.1 using nimadm. I’ll update my blog with any progress we make with this problem. In the meantime, our unofficial workaround will get us “out of hot water”!

UPDATE (14/12/2011): The simple fix is to remove the /bos_inst directory before attempting the AIX migration.

If you have a multibos image in rootvg, remove it. AIX migrations are not supported on multibos-enabled systems. Ensure all rootvg LVs are renamed to their legacy names. If necessary, create a new instance of rootvg and reboot the LPAR. For example:

# multibos -sXp

# multibos -sX

# shutdown -Fr

Confirm the legacy LV names are now in use, that is, names without the bos_ prefix.

# lsvg -l rootvg | grep hd | grep open
hd6       paging    80   160   2   open/syncd   N/A
hd8       jfs2log   1    2     2   open/syncd   N/A
hd4       jfs2      1    2     2   open/syncd   /
hd2       jfs2      7    14    2   open/syncd   /usr
hd3       jfs2      16   32    2   open/syncd   /tmp
hd1       jfs2      1    2     2   open/syncd   /home
hd9var    jfs2      8    16    2   open/syncd   /var
hd7       sysdump   8    8     1   open/syncd   N/A
hd7a      sysdump   8    8     1   open/syncd   N/A
hd10opt   jfs2      8    16    2   open/syncd   /opt

Remove the old multibos instance.

# multibos -R

Unfortunately, it appears that ‘multibos -R’ may not clean up the /bos_inst directory. If this directory exists, the nimadm operation will most likely fail.
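A pre-migration check for the leftover directory could be sketched like this. It is a minimal sketch to run on the NIM client before starting nimadm; the function name clean_bos_inst is hypothetical, and the rm command is only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: make sure /bos_inst is gone on the client before running nimadm.

# Print the command; only execute it when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Remove the given directory if it exists; otherwise report nothing to do.
clean_bos_inst() {
    dir=${1:-/bos_inst}
    if [ -d "$dir" ]; then
        run rm -r "$dir"
    else
        echo "$dir not present; nothing to clean up"
    fi
}

clean_bos_inst /bos_inst
```

Check the printed command first, and confirm that no multibos standby instance is still in use, before re-running with DRY_RUN=0.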