Chris's AIX Blog

I received an email this week from a colleague who worked with me on the NIM Redbook back in 2006. He was experiencing an issue with DSM and NIM. He was attempting to use the dgetmacs command to obtain the MAC addresses of the network adapters on an LPAR. The command was failing to return the right information.

I experienced this very issue during
the writing of the AIX 7.1 Differences
Guide Redbook. And given that I was in Austin, sitting in the same building
as the AIX development team, I was able to speak with the developers directly
about the issue. At that time they provided me with the following workaround.

First, they asked me to check the size of the /usr/lib/nls/msg/en_US/IBMhsc.netboot.cat message catalog file.

# ls -l /usr/lib/nls/msg/en_US/IBMhsc.netboot.cat

-rw-r--r--    1 bin      bin        3905 Aug 08 09:54 /usr/lib/nls/msg/en_US/IBMhsc.netboot.cat

They were surprised to find that the file appeared to be “too small”. They promptly sent me the catalog file from one of their development AIX 7.1 systems. I replaced the file as follows:

# cd /usr/lib/nls/msg/en_US/

# ls -ltr IBMhsc*

-rw-r--r--    1 bin      bin        3905 Aug 08 09:54 IBMhsc.netboot.cat

# cp -p IBMhsc.netboot.cat IBMhsc.netboot.cat.old

# cp /tmp/lpar1/IBMhsc.netboot.cat.new IBMhsc.netboot.cat

# ls -ltr IBMhsc*

-rw-r--r--    1 bin      bin        3905 Aug 08 09:54 IBMhsc.netboot.cat.old

-rw-r--r--    1 bin      bin       26374 Dec 23 11:24 IBMhsc.netboot.cat

This fixed the problem for me during
the residency.

So I asked my friend to do the same (after I sent him the message catalog file). He ran the dgetmacs command again and this time it returned the MAC addresses for all the network adapters in his LPAR. Success!
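The catalog-replacement workaround can be sketched as a small script. This is a hedged illustration only: it works on a scratch copy created with mktemp, not the real /usr/lib/nls/msg/en_US directory, and the file contents are stand-ins for the actual binary catalog (which you would obtain from IBM support or a known-good system).

```shell
# Illustrative scratch directory; the real catalog lives in /usr/lib/nls/msg/en_US.
CATDIR=$(mktemp -d)
echo "old catalog" > "$CATDIR/IBMhsc.netboot.cat"
echo "replacement catalog from a known-good system" > "$CATDIR/IBMhsc.netboot.cat.new"

# Preserve the suspect catalog first; cp -p keeps permissions and timestamps.
cp -p "$CATDIR/IBMhsc.netboot.cat" "$CATDIR/IBMhsc.netboot.cat.old"

# Swap in the known-good copy.
cp "$CATDIR/IBMhsc.netboot.cat.new" "$CATDIR/IBMhsc.netboot.cat"

ls -l "$CATDIR"
```

The key point is the `cp -p` backup before overwriting, so the original catalog can always be put back.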

I received the following errors whilst running dsh on a NIM master recently.

root@nim1 : / # dsh -waixlpar1 date

0042-053 lsnim: there is no NIM object named "aixlpar1"

The node aixlpar1 is not defined in NIM database.

aixlpar1: Mon Aug 4 14:01:57 EET 2014

To resolve this, I had to set an environment variable, shown below. Setting DSH_CONTEXT to DSH prevents the dsh command from referring to the NIM database and instead forces it to query a user-defined node list.

root@nim1 : / # export DSH_CONTEXT=DSH

root@nim1 : / # dsh -waixlpar1 date

aixlpar1: Mon Aug 4 14:02:22 EET 2014

root@nim1 : / # env | grep -i dsh

DSH_CONTEXT=DSH

DSH_NODE_RSH=/usr/bin/ssh

root@nim1 : / # dsh -q

DSH:DCP_DEVICE_OPTS=

DSH:DCP_DEVICE_RCP=

DSH:DCP_NODE_OPTS=-q

DSH:DCP_NODE_RCP=/usr/bin/scp

DSH:DSH_CONTEXT=DSH

DSH:DSH_DEVICE_LIST=

DSH:DSH_DEVICE_OPTS=

DSH:DSH_DEVICE_RCP=

DSH:DSH_DEVICE_RSH=

DSH:DSH_ENVIRONMENT=

DSH:DSH_FANOUT=

DSH:DSH_LOG=

DSH:DSH_NODEGROUP_PATH=

DSH:DSH_NODE_LIST=/usr/local/etc/csmnodes.list

DSH:DSH_NODE_OPTS=

DSH:DSH_NODE_RCP=

DSH:DSH_NODE_RSH=/usr/bin/ssh

DSH:DSH_OUTPUT=

DSH:DSH_PATH=

DSH:DSH_REPORT=

DSH:DSH_SYNTAX=

DSH:DSH_TIMEOUT=

DSH:RSYNC_RSH=

Here’s another dsh tip I picked up. By default dsh will use the default port for ssh connections to nodes. For example, by default sshd listens on port 22 on an AIX node. I recently came across a customer environment where they had configured sshd to listen on port 6666 (not the real port number!). They wanted to use dsh from a NIM master which would connect to all the defined nodes in their custom list. When they ran it they got the following error message:

# dsh date

aixlpar1: ssh: connect to host aixlpar1 port 22: Connection refused

dsh: 2617-009 aixlpar1 remote shell had exit code 255

On the AIX node, we could see that sshd was listening on port 6666:

# netstat -a | grep 6666 | grep LIST

tcp6 0 0 *.6666 *.* LISTEN

tcp4 0 0 *.6666 *.* LISTEN

We needed to find a way to force dsh to use a different port number when starting the ssh connection. This was accomplished by setting the DSH_REMOTE_OPTS variable, as shown below.

[root@nim1]/ # export DSH_REMOTE_OPTS=-p6666

[root@nim1]/ # dsh date

aixlpar1: Tue Aug 5 17:37:16 2014

[root@nim1]/ # env | grep DSH

DSH_REMOTE_CMD=/usr/bin/ssh

DSH_NODE_LIST=/etc/ibm/sysmgt/dsm/nodelist

DSH_REMOTE_OPTS=-p6666

DSH_NODE_RSH=/usr/bin/ssh

DSH CONTEXT

The DSH CONTEXT is the in-built context for all the DSH Utilities commands. It permits a user-defined node group database contained in the local file system. The DSH_NODEGROUP_PATH environment variable specifies the path to the node group database. Each file in this directory represents a node group, and contains one host name or TCP/IP address for each node that is a group member. Blank lines and comment lines beginning with a # symbol are ignored. If all nodes are requested for the DSH CONTEXT, a full node list is built from all groups in the DSH_NODEGROUP_PATH directory, and cached in /var/ibm/sysmgt/dsm/dsh/$DSH_NODEGROUP_PATH/AllNodes. This file is recreated each time a group file is modified or added to the DSH_NODEGROUP_PATH directory. Device targets are not supported in the DSH context.
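The node group layout described above can be sketched with plain shell. This is an illustration, not the dsh implementation: the group names (webservers, dbservers) and node names are made up, DSH_NODEGROUP_PATH points at a throwaway mktemp directory, and the final pipeline just emulates how a full node list could be assembled from the group files (one host per line, blank lines and # comments ignored).

```shell
# Scratch directory standing in for a persistent DSH_NODEGROUP_PATH.
export DSH_NODEGROUP_PATH=$(mktemp -d)

# Each file in the directory is one node group.
cat > "$DSH_NODEGROUP_PATH/webservers" <<'EOF'
# web tier
aixlpar1
aixlpar2
EOF

cat > "$DSH_NODEGROUP_PATH/dbservers" <<'EOF'
aixlpar3

EOF

# Emulate building the full node list from all groups:
# concatenate every group file, drop comments and blank lines.
ALLNODES=$(cat "$DSH_NODEGROUP_PATH"/* | grep -v '^#' | grep -v '^$' | sort -u)
echo "$ALLNODES"
```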

Someone installs SAP onto an AIX system and decides to use TCP
port 3901 as an SAP service port. This is the same port used by nimsh. In some
rare cases, nimsh may not be active on the LPAR, which makes it easy for the
SAP installation to hijack port 3901. If nimsh is active, the person installing
SAP may consciously stop nimsh and use port 3901 for SAP anyway. Hopefully that
doesn’t happen. Hopefully, they will talk to the AIX administrator and discuss
the best way forward. Hopefully...

In either case, if the port is taken by SAP, nimsh will no longer work. If you love using NIM as much as I do, this is a real problem! We could revert to using rsh, but no one does this anymore because of security concerns. And rightfully so!

The ports used by nimsh (3901 and 3902) are registered with the Internet Assigned Numbers Authority (IANA). These port numbers appear in the /etc/services file.

nimsh           3901/tcp        # NIM Service Handler

nimsh           3901/udp        # NIM Service Handler

nimaux          3902/tcp        # NIMsh Auxiliary Port

nimaux          3902/udp        # NIMsh Auxiliary Port

Considering these port numbers are registered with IANA, we can usually persuade our SAP colleagues to change their SAP installation to use a different port number. However, depending on the skills and experience of the SAP resource, one of two things usually happens: 1) they take an outage, re-install SAP and choose a different port number; or 2) the more experienced/confident SAP Basis resource takes an outage and modifies the instance to use a different port, without reinstalling SAP.

Perhaps SAP needs to include a warning in their install notes, advising customers not to use port 3901 on AIX systems (i.e. best practice)?

Now, if you must change nimsh to use a different port number, it
is possible. But not recommended.

To do this, you must change the /etc/services file on the NIM master
and the NIM client to reflect the same port numbers for nimsh. This will work
until the NIM master or the NIM client have their services file overwritten by
way of install or fileset updates. After which, the default values for nimsh
will be reinstated.

You would also need to change the services file on all of your NIM clients. Every time you
performed a NIM fileset update, you would need to remember to change the /etc/services
file again. This is painful and bound to catch someone out eventually!

In the following example I’ll demonstrate how to change the port
number used by nimsh.

We start with a typical nimsh configuration using port 3901. On
the NIM client, nimsh is listening on port 3901.

We can confirm that we have connected to the NIM client on port
39011 by looking at the output from lsof
and netstat. There is a TCP session
established between the master and the client on port 39011.
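The /etc/services edit itself can be sketched as follows. To keep this safe to run, the script works on a scratch copy of the file rather than /etc/services itself, and the replacement ports (10901/10902) are arbitrary example values; on a real system you would edit /etc/services on the NIM master and every NIM client, then restart nimsh.

```shell
# Scratch copy standing in for /etc/services.
SVC=$(mktemp)
cat > "$SVC" <<'EOF'
nimsh           3901/tcp        # NIM Service Handler
nimsh           3901/udp        # NIM Service Handler
nimaux          3902/tcp        # NIMsh Auxiliary Port
nimaux          3902/udp        # NIMsh Auxiliary Port
EOF

# Swap in the new port numbers for nimsh and nimaux.
sed -e 's/3901/10901/' -e 's/3902/10902/' "$SVC" > "$SVC.new" && mv "$SVC.new" "$SVC"

grep -E 'nimsh|nimaux' "$SVC"
```

Remember that this change must be identical on the master and all clients, and will be undone whenever an install or fileset update rewrites /etc/services.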

After
updating my NIM master to AIX 7.1 TL2 SP1 (7100-02-01-1245), I noticed a
problem. Whenever I installed a new AIX partition using NIM, the resources
allocated to the NIM client were not
being de-allocated, even though the installation was completing successfully. Also,
if I tried to run my usual ‘NIM client reset’ script (below), the resources
were still allocated.

#!/usr/bin/ksh
# Reset a NIM client.

if [[ "$1" = "" ]] ; then
        echo "Please specify a NIM client to reset e.g. aixlpar1."
else
        if lsnim -l $1 > /dev/null 2>&1 ; then
                nim -o reset -F $1
                nim -Fo deallocate -a subclass=all $1
                nim -Fo change -a cpuid= $1
        else
                echo "Not a valid NIM client!?"
        fi
fi

For example, here’s my NIM client with the lpp_source, mksysb and SPOT resources still assigned to it (even though the AIX install completed OK).

root@nim1 : / # lsnim -l aixlpar1

aixlpar1:

   class           = machines
   type            = standalone
   connect         = shell
   platform        = chrp
   netboot_kernel  = 64
   if1             = network1 aixlpar1 0
   cable_type1     = N/A
   Cstate          = ready for a NIM operation
   prev_state      = not running
   Mstate          = currently running
   boot            = boot
   lpp_source      = lpp_sourceaix710105
   mksysb          = aixlpar1-71
   nim_script      = nim_script
   spot            = spotaix710105
   cpuid           = 00C453C75C00
   control         = master
   Cstate_result   = success
   installed_image = aixlpar1-71

My workaround was to use 'smit nim_mac_res' to manually de-allocate resources from the client:

====

De-allocate Network Install Resources

aixlpar1                 machines      standalone

> lpp_sourceaix710105    lpp_source

> spotaix710105          spot

> aixlpar1-71            mksysb

====

It appears that others were also experiencing this problem. I found the following thread on the IBM developerWorks AIX user forum:

Recent releases of AIX installation media
(for 7.1 and 6.1) now contain the OpenSSH base installation filesets. This is
very handy; we no longer need to download or locate the software from other
sources.

One thing to consider is what this means
for future AIX migrations.

If you are migrating a system (that already
has a version of SSH installed) to AIX 7.1 then you may notice that the first
time you attempt to connect to the server (after the 7.1 migration) the
following ssh message appears:

root@nim1 : / # ssh aixlpar1

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

@  WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!       @

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!

Someone could be eavesdropping on you right now (man-in-the-middle attack)!

It is also possible that a host key has just been changed.

The fingerprint for the RSA key sent by the remote host is
59:68:05:71:60:b5:d1:96:87:df:f6:9c:ca:9a:14:3e.

Please contact your system administrator.

Add correct host key in /.ssh/known_hosts to get rid of this message.

Offending RSA key in /.ssh/known_hosts:17

RSA host key for aixlpar1 has changed and you have requested strict checking.

Host key verification failed.

In the output above I’m attempting to SSH from another system to the newly migrated AIX 7.1 LPAR. This is essentially informing us that the SSH host keys on the AIX 7.1 server don’t match the host key stored in the local system’s /.ssh/known_hosts file. Something has changed.

Now of course I could simply accept this change and update my known_hosts file, like so:

root@nim1 : / # ssh-keygen -R aixlpar1

/.ssh/known_hosts updated.

Original contents retained as /.ssh/known_hosts.old

With known_hosts updated, I’m able to SSH to the AIX 7.1 system successfully.

cgibson@nim1 : /home/cgibson $ ssh aixlpar1 date

Mon Aug 20 19:44:20 EET 2012

But that’s just my own SSH known_hosts file. What about all the users that connect to this system via SSH/SFTP/SCP? Do I really expect all of them to update their known_hosts files with the new host key information?

This could create problems for automated tasks, like file transfers. If these transfers stop working then there could be “hell to pay”. So the question I’m often asked is: what can I do to prevent this from happening in the first place? Luckily, there is a way.

In this
example, we are using nimadm to migrate from AIX 5.3 to 7.1. The
AIX 7.1 lpp_source resource was created using the AIX 7.1 installation media
DVDs. All filesets were copied from the DVDs, verbatim, to the new 7.1
lpp_source resource on the NIM master.

First we
verify that the openssh* filesets are in fact in the AIX 7.1 lpp_source on the
NIM master.

root@nim1 : / # nim -o showres lpp_sourceaix710101 | grep -i ssh

  openssh.base.client      5.4.0.6100    I  N usr,root

  openssh.base.client      5.8.0.6101    I  N usr,root

  openssh.base.server      5.4.0.6100    I  N usr,root

  openssh.base.server      5.8.0.6101    I  N usr,root

  openssh.man.en_US        5.4.0.6100    I  N usr

  openssh.man.en_US        5.8.0.6101    I  N usr

  openssh.msg.EN_US        5.8.0.6101    I  N usr

  openssh.msg.en_US        5.4.0.6100    I  N usr

  openssh.msg.en_US        5.8.0.6101    I  N usr

On the NIM client (running AIX 5.3), we verify there is an older version of SSH already installed. The migration will remove these filesets (and the associated /etc/ssh/ssh_host_* files). The newer version of SSH will be installed and new ssh_host_key* files will be generated (hence the problem with the remote SSH clients’ known_hosts files no longer holding the correct host keys).

Rather than update these filesets manually after the migration, you can include this step as a post-migration task with nimadm.

An alternative way to work around this problem (after the fact) would be to restore the original ssh_host_key* files from a backup. For example, I copied the original ssh_host_key* files to my home directory before starting the AIX migration.

aixlpar1 : / # cd /etc

aixlpar1 : /etc # cp -pr ssh /home/cgibson/ssh_orig/

In the output below, I discover that my ssh_host_key* files have all been recreated during the migration.

aixlpar1 : /etc/ssh # ls -ltr

total 352

-rw-r--r--    1 root     system         1288 May 01 2007  ssh_config
-rw-r--r--    1 root     system         1155 May 04 2007  sshd_banner
-rw-r--r--    1 root     system         2867 Oct 29 2008  sshd_config
-rw-r-----    1 root     system            7 Aug 20 21:00 sshd.pid
-rw-r--r--    1 root     system         2341 Aug 20 21:19 ssh_prng_cmds
-rw-------    1 root     system       132839 Aug 20 21:19 moduli
-rw-r-----    1 root     system          382 Aug 20 21:45 ssh_host_rsa_key.pub
-rw-------    1 root     system         1679 Aug 20 21:45 ssh_host_rsa_key
-rw-r-----    1 root     system          630 Aug 20 21:45 ssh_host_key.pub
-rw-------    1 root     system          965 Aug 20 21:45 ssh_host_key
-rw-r-----    1 root     system          590 Aug 20 21:45 ssh_host_dsa_key.pub
-rw-------    1 root     system          668 Aug 20 21:45 ssh_host_dsa_key

I copy the original files back to the /etc/ssh directory. The sshd subsystem is also restarted to pick up the restored ssh_host* files.

aixlpar1 : /etc/ssh # cp -p /home/cgibson/ssh_orig/ssh_host_* .

aixlpar1 : /etc/ssh # ls -ltr

total 352

-rw-r--r--    1 root     system          210 Feb 03 2006  ssh_host_rsa_key.pub
-rw-------    1 root     system          887 Feb 03 2006  ssh_host_rsa_key
-rw-r--r--    1 root     system          319 Feb 03 2006  ssh_host_key.pub
-rw-------    1 root     system          515 Feb 03 2006  ssh_host_key
-rw-r--r--    1 root     system          590 Feb 03 2006  ssh_host_dsa_key.pub
-rw-------    1 root     system          668 Feb 03 2006  ssh_host_dsa_key
-rw-r--r--    1 root     system         1288 May 01 2007  ssh_config
-rw-r--r--    1 root     system         1155 May 04 2007  sshd_banner
-rw-r--r--    1 root     system         2867 Oct 29 2008  sshd_config
-rw-r-----    1 root     system            7 Aug 20 21:00 sshd.pid
-rw-r--r--    1 root     system         2341 Aug 20 21:19 ssh_prng_cmds
-rw-------    1 root     system       132839 Aug 20 21:19 moduli

aixlpar1 : /etc/ssh # stopsrc -s sshd

0513-044 The sshd Subsystem was requested to stop.

aixlpar1 : /etc/ssh # startsrc -s sshd

0513-059 The sshd Subsystem has been started. Subsystem PID is 3997822.
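The backup-then-restore sequence can be sketched end to end as a script. This is a simulation, not a drop-in tool: ETC_SSH and BACKUP are throwaway mktemp directories standing in for /etc/ssh and the backup location, and the file contents are placeholders for the real key material. On a real LPAR you would take step 1 before the migration, and finish step 3 with `stopsrc -s sshd; startsrc -s sshd`.

```shell
# Scratch directories standing in for /etc/ssh and the backup location.
ETC_SSH=$(mktemp -d)
BACKUP=$(mktemp -d)
echo "original-rsa-key" > "$ETC_SSH/ssh_host_rsa_key"
echo "original-rsa-pub" > "$ETC_SSH/ssh_host_rsa_key.pub"

# 1. Before the migration: preserve the host keys (cp -p keeps modes/ownership).
cp -p "$ETC_SSH"/ssh_host_* "$BACKUP"/

# 2. Simulate the migration regenerating the host keys.
echo "regenerated-rsa-key" > "$ETC_SSH/ssh_host_rsa_key"

# 3. After the migration: put the originals back (then restart sshd on AIX).
cp -p "$BACKUP"/ssh_host_* "$ETC_SSH"/
```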

I was working at a client site today, on a NIM master that I
configured a month or so ago. I was there to install the TSM backup client
software on about 30 or so LPARs. Of course I was going to use NIM to
accomplish this task.

The software install via NIM worked for the majority of the LPARs but I noticed a few of them were failing. This was very odd, as the last time I’d used the same NIM method to install software, everything was fine.

I suspected that perhaps something had changed on the client
LPARs...maybe with their /etc/niminfo
file for instance. So I performed the following steps to reconfigure the /etc/niminfo file
and configure the nimsh subsystem on
the client LPAR.

lpar1# mv /etc/niminfo /etc/niminfo.old

lpar1# niminit -a master=nim1 -a name=`hostname`

lpar1# stopsrc -s nimsh

lpar1# smit nim_config_services

Configure Client Communication Services

Type or select values in entry fields.

Press Enter AFTER making all desired changes.

                                                         [Entry Fields]
* Communication Protocol used by client                 [nimsh]          +

  NIM Service Handler Options
*   Enable Cryptographic Authentication                 [disable]        +
      for client communication?
    Install Secure Socket Layer Software (SSLv3)?       [no]             +
    Absolute path location for INSTALLP package         [/dev/cd0]       /
       -OR-
    lpp_source which contains INSTALLP package          []               +

  Alternate Port Range for Secondary Connections
  (reserved values will be used if left blank)
    Secondary Port Number                               []                #
    Port Increment Range                                []               +#

The last step failed with the following error message:

0042-358 niminit: The connect attribute may only be assigned a service value of "shell" or "nimsh".

I checked the NIM client and confirmed it was configured for nimsh
and it was fine. However, I did notice something odd when I ran the following
command:

lpar1# egrep 'nimsh|nimaux' /etc/services

lpar1#

The entries for nimsh were missing from the /etc/services file!

Somebody had decided that these entries were not required and had
simply removed them! Gee, thanks so much for that!

After adding the following entries back into the services file,
everything started working again!

nimsh           3901/tcp        # NIM Service Handler

nimsh           3901/udp        # NIM Service Handler

nimaux          3902/tcp        # NIMsh Auxiliary Port

nimaux          3902/udp        # NIMsh Auxiliary Port
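A quick sanity check for this failure mode can be scripted. The sketch below runs against a scratch copy of the file (so it is safe to execute anywhere); on a real LPAR you would point SERVICES at /etc/services itself. The service names checked are the real nimsh/nimaux entries; everything else is illustrative.

```shell
# Scratch copy standing in for /etc/services on the NIM client.
SERVICES=$(mktemp)
cat > "$SERVICES" <<'EOF'
ftp             21/tcp
nimsh           3901/tcp        # NIM Service Handler
nimsh           3901/udp        # NIM Service Handler
nimaux          3902/tcp        # NIMsh Auxiliary Port
nimaux          3902/udp        # NIMsh Auxiliary Port
EOF

# Flag any missing nimsh/nimaux entries.
missing=0
for svc in nimsh nimaux ; do
    if ! grep -q "^$svc[[:space:]]" "$SERVICES" ; then
        echo "$svc is missing from $SERVICES"
        missing=1
    fi
done
[ $missing -eq 0 ] && echo "nimsh entries look OK"
```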

I’ve also encountered this error when there is another
process (other than nimsh) using
port 3901 or 3902.

Another error message you might confront, if those
entries are either missing or commented out, is on the NIM master:

nimmast# nim -o showlog -a log_type=lppchk lpar1

0042-001 nim: processing error encountered on "master":

   0042-006 m_showlog: (From_Master) connect Error 0

poll: setup failure

I thought I’d also mention another error message that can potentially drive you insane (especially if you haven’t had your morning coffee!). The error doesn’t relate to nimsh at all, but I thought I’d describe it anyway. The message appears when running the nim -o showlog command against a client LPAR.

nimmast# nim -o showlog lpar1

0042-001 nim: processing error encountered on "master":

   0042-006 m_showlog: (From_Master) connect Error 0

0042-008 nimsh: Request denied - wronghostname

I’ve modified the output a little to make it easier to
identify the problem. Can you see it? I thought so! Upon investigation you may
find that the IP address for the NIM master is resolving to a different
hostname on the client. For example:

On the NIM master:

nimmast# host nimmast

nimmast is 172.29.150.177

nimmast# host 172.29.150.177

nimmast is 172.29.150.177

nimmast# grep 177 /etc/hosts

172.29.150.177    nimmast

On the NIM client:

lpar1# host nimmast

nimmast is 172.29.150.177

lpar1# host 172.29.150.177

wronghostname is 172.29.150.177

lpar1# grep 172.29.150.177 /etc/hosts

172.29.150.177    wronghostname

172.29.150.177    nimmast

In this example, someone placed two host entries in /etc/hosts with the same IP address. The client was resolving the IP address to an incorrect hostname. This caused our nim -o showlog command to fail.
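Duplicate entries like these are easy to detect mechanically. The sketch below flags any IP address that appears on more than one line of an /etc/hosts-style file; HOSTS is a scratch copy here (so the example is self-contained), but on a real client you would point it at /etc/hosts.

```shell
# Scratch copy standing in for the client's /etc/hosts.
HOSTS=$(mktemp)
cat > "$HOSTS" <<'EOF'
127.0.0.1       loopback localhost
172.29.150.177  wronghostname
172.29.150.177  nimmast
EOF

# Print the first field (the IP) of every non-comment, non-blank line,
# then report any IP that occurs more than once.
DUPES=$(awk '!/^#/ && NF { print $1 }' "$HOSTS" | sort | uniq -d)
echo "Duplicate IP entries: $DUPES"
```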

Do you use SSL with nimsh on AIX? No? Well, you might want to consider it. If you regularly use LPM to migrate AIX partitions from one server to another, you may have found that, on occasion, your NIM master has trouble communicating with its NIM clients afterwards. This is by design, as nimsh uses the NIM client’s cpuid to authenticate with the NIM master. During an LPM operation, the cpuid of the NIM client changes, and it’s possible the NIM master may reject the client as a result. This problem can occur even when CPU validation is disabled on the NIM master.

In the example below, I’ve LPM’ed a NIM client (750lpar1) to another server. Immediately afterwards, I’m able to execute a NIM command against the NIM client, from the NIM master (750lpar4). At this point the NIM client is configured with standard nimsh authentication i.e. no SSL.

NIM CLIENT (AFTER LPM):

[root@750lpar1]/ # uname -a

AIX 750lpar1 1 7 00F603CD4C00

[root@750lpar1]/ # uname -a

AIX 750lpar1 1 7 00F627664C00

[root@750lpar1]/var/adm/ras# echo HELLO > /var/adm/ras/nim.installp

NIM MASTER:

[root@750lpar4]/ # nim -o change -a validate_cpuid=no master

[root@750lpar4]/ # lsnim -l master | grep -i cpu

validate_cpuid = no

[root@750lpar4]/ # lsnim -l 750lpar1

750lpar1:

class = machines

type = standalone

connect = nimsh

platform = chrp

netboot_kernel = 64

if1 = 10_1_50 750lpar1 0

cable_type1 = N/A

Cstate = ready for a NIM operation

prev_state = not running

Mstate = currently running

cpuid = 00F603CD4C00 << Notice that the cpuid is different.

[root@750lpar4]/ # nim -o showlog 750lpar1

HELLO

The cpuid is cached by the nimsh daemon, so the previous system id is retained in memory and passed to the NIM master, which allows the operation to complete successfully. But if I restart the nimsh daemon on the NIM client, I find that the NIM master is no longer able to communicate with the client.

[root@750lpar1]/ # stopsrc -s nimsh

0513-044 The nimsh Subsystem was requested to stop.

[root@750lpar1]/ # startsrc -s nimsh

0513-059 The nimsh Subsystem has been started. Subsystem PID is 6160610.

When Live Partition Mobility (LPM) is used to move a machine from one physical server to another and the machine is defined as a Network Installation Management (NIM) client, the NIM administrator must update the cpuid attribute for the NIM client to reflect the new hardware value after the LPM migration completes. To update the cpuid attribute, complete the following steps:

On the NIM client, acquire the new cpuid ID by running the following command:

uname -a

On the NIM master, run the following command:

nim -o change -a cpuid=<cpuid> <client>

However, there is a better way. Using nimsh with SSL-enabled authentication prevents the checking of the cpuid during nimsh service handling. This is the recommended mode of operation (since the client and server can agree upon identity using the certificate information passed during the SSL handshake). Once the certificate is in place, the NIM master will disregard any cpuid validation and instead rely on the success of the SSL handshake. This configuration works well with LPM. With standard nimsh, the limitation of cpuid updating still applies (because NIM has no way of automatically updating a client once the value has changed).

Both the NIM master and client must be configured to support SSL-enabled authentication. To configure SSL-enabled authentication, we can use ‘nimconfig -c’ on the NIM master.

I’ve received a couple of requests for an example of using a
post migration script with nimadm.
What follows is a simple example of using such a resource with NIM. If you are
not familiar with the nimadm tool then
perhaps you’d like to start first by reading my article on using nimadm
to migrate to AIX 6.1.

The nimadm utility
can perform both pre and post migration tasks. This is accomplished by running
NIM scripts either before or after a migration. The tool accepts the following
flags for pre and post migration script resources:

pre-migration

This script resource is run on the NIM master, but in the environment of the client's alt_inst file system that is mounted on the master (this is done by using the chroot command). This script is run before the migration begins.

post-migration

This script resource is similar to the pre-migration script, but it is executed after the migration is complete.

We are going to focus on post-migration only, although the
configuration is the same for both.

In this example I need to uninstall and install a third-party device fileset for a storage device. I need to perform this task as part of the migration process. To protect the innocent, I have not named the storage vendor in this post. But I will say that it was not IBM storage we were dealing with in this case.

Before we start, we first collect all the necessary device filesets that provide support for this type of storage on AIX. We place them into a local directory on the NIM master. Along with the software, I also place a copy of my NIM script in the same directory. The script name is XYZpost.ksh.

root@nim1 : /usr/local/XYZ # ls -ltr

total 544

-r-xr-xr-x    1 root     system        51200 Mar 11 2011  MPIO_1001I
-r-xr-xr-x    1 root     system        51200 Mar 11 2011  MPIO_1002U
-r-xr-xr-x    1 root     system        51200 Mar 11 2011  MPIO_1003U
-r-xr-xr-x    1 root     system        51200 Mar 11 2011  MPIO_1004U
-r-xr-xr-x    1 root     system        51200 May 18 16:39 MPIO_1005U
-r-xr-xr-x    1 root     system          715 May 24 16:57 XYZpost.ksh
-rw-r--r--    1 root     system         2310 May 25 14:57 .toc

The contents of my script are simple. It de-installs the old device fileset and then immediately installs the latest version of Vendor XYZ’s device fileset. The script then changes the attributes for the vendor’s storage to more appropriate default values.

At this point I copy the same directory and all of its contents
to the NIM client.

root@nim1 : /usr/local # scp -pr XYZ lparaix01:/usr/local/

…etc…

lparaix01 : /usr/local/XYZ # ls -ltr

total 0

-r-xr-xr-x    1 root     system        51200 Mar 11 2011  MPIO_1001I
-r-xr-xr-x    1 root     system        51200 Mar 11 2011  MPIO_1002U
-r-xr-xr-x    1 root     system        51200 Mar 11 2011  MPIO_1003U
-r-xr-xr-x    1 root     system        51200 Mar 11 2011  MPIO_1004U
-r-xr-xr-x    1 root     system        51200 May 18 16:39 MPIO_1005U
-r-xr-xr-x    1 root     system          715 May 24 16:57 XYZpost.ksh
-rw-r--r--    1 root     system         2310 May 25 14:57 .toc

Make sure that any scripts you write for use with nimadm start with an appropriate ‘hashbang’ (shebang) line, to announce that it is a shell script and which shell must be used to execute it, e.g. #!/usr/bin/ksh. If you forget to do this, nimadm will fail to execute your script and will report an error message similar to the following:

/lparaix01_alt/alt_inst/tmp/.alt_mig_chroot_script.11731036: Cannot run a file that does not have a valid format.
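A simple pre-flight check for this is easy to script. The sketch below verifies that a script file begins with the `#!` magic bytes before you define it as a NIM resource; SCRIPT here is a throwaway example file created for the demonstration.

```shell
# Throwaway example script standing in for your nimadm script resource.
SCRIPT=$(mktemp)
printf '#!/usr/bin/ksh\necho post-migration work\n' > "$SCRIPT"

# A valid script resource must start with the two bytes "#!".
if [ "$(head -c 2 "$SCRIPT")" = "#!" ] ; then
    echo "OK: $SCRIPT has a valid interpreter line"
else
    echo "WARNING: $SCRIPT has no shebang; nimadm will fail to execute it"
fi
```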

The next step is to define the script as a NIM resource so that nimadm can call the resource during the migration process. I’ve decided to call this new NIM resource XYZPOST.

This is easily achieved using smit nim_mkres:

root@nim1 : / # smit nim_mkres

  | script = an executable file which is executed on a client |

Define a Resource

Type or select values in entry fields.

Press Enter AFTER making all desired changes.

                                          [Entry Fields]
* Resource Name                          [XYZPOST]
* Resource Type                          script
* Server of Resource                     [master]                       +
* Location of Resource                   [/usr/local/XYZ/XYZpost.ksh]   /

We can confirm that the NIM script resource is now available
using the lsnim command.

root@nim1 : / # lsnim -t script

XYZPOST    resources    script

root@nim1 : / # lsnim -l XYZPOST

XYZPOST:

   class       = resources
   type        = script
   Rstate      = ready for use
   prev_state  = unavailable for use
   location    = /usr/local/XYZ/XYZpost.ksh
   alloc_count = 0
   server      = master

Now that the script is in place, and defined to NIM, we are ready to test it. We will migrate the system from AIX 5.3 to AIX 6.1 using nimadm. Once the migration phase is complete (phases 1 to 6), the post-migration script will be executed in the NIM client’s nimadm (chroot) environment on the NIM master. Once this is finished, the NIM client’s data is synced back to the client’s alternate disk and the boot image is created. The migration process is then complete.

We add the -z flag to our nimadm command line options to specify the post-migration script resource.

In normal operation we would simply let nimadm run all phases in sequence with the following command.

Phase 6 has completed successfully. The NIM client’s rootvg data has been migrated from AIX 5.3 to 6.1 on the NIM master. The data has not yet been synced back to the NIM client.

At this stage we can run phase 7 separately and ensure that it performs the required task. We expect it will de-install the device fileset, install the latest version and change the ODM default attributes for the device type. Again, you’ll notice that we specify the -P flag for phase 7 only.

Great news! Our script has worked as expected. The old fileset
was de-installed, the new fileset was installed and the PdAt default attributes
were changed successfully.

Note: You can also review the post-migration script output at a later date if you wish. All nimadm activities are logged, on the NIM master, to /var/adm/ras/alt_mig/NIMclientname_alt_mig.log (where NIMclientname is the name of the NIM client being migrated by nimadm).

With regard to nimadm log files, please be aware that if you choose to run nimadm in phases (as I’ve shown in this example), each run will generate a new log file. So in my case, when I ran phases 1 to 6, this created a log file named lparaix01_alt_mig.log. When I ran phase 7, the original log file was moved to lparaix01_alt_mig.log.prev, and a new log file was created and used for phase 7. Then, when I ran phases 8 to 12, the phase 7 log file was moved to lparaix01_alt_mig.log.prev and a new log file was used for phases 8 to 12. For this reason, you may want to back up each log file to a unique file name as you execute each phase group, so that you do not lose any of the information logged to the .log or .log.prev files.
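That per-phase backup is trivial to script. The sketch below snapshots the current log to a unique name after each phase group; LOGDIR is a scratch directory standing in for /var/adm/ras/alt_mig, and the client name and phase tag are illustrative.

```shell
# Scratch directory standing in for /var/adm/ras/alt_mig on the NIM master.
LOGDIR=$(mktemp -d)
CLIENT=lparaix01
echo "phases 1-6 output" > "$LOGDIR/${CLIENT}_alt_mig.log"

# Copy the current log to a unique name before the next phase group runs,
# so nimadm's .log.prev rotation cannot silently discard it.
snapshot_log() {
    phase_tag=$1
    cp -p "$LOGDIR/${CLIENT}_alt_mig.log" "$LOGDIR/${CLIENT}_alt_mig.log.$phase_tag"
}

snapshot_log phases1-6
ls "$LOGDIR"
```

Call snapshot_log again (e.g. `snapshot_log phase7`) after each subsequent phase group.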

Now we can complete the rest of the migration and execute the
remaining phases, 8 through 12.

At the IBM Technical Symposium in Sydney last week, a person approached me to discuss NIM and some of its capabilities. During the conversation we discussed how NIM could be used to copy files from the NIM master to its NIM clients. I promised to send them some information on how to achieve this ASAP. They were smart and followed up with an email the next day! They’d even tried to configure this on their systems but hit a small problem.

"Hello Chris,

So trying to push out a new netbackup tar file to one of our nim clients (usually we do all sort of stuff via our SSH deployment server). But decided to use NIM.

So I created the resource file_res called netbackup. Where the tar file is placed on the master

But for the life of me I cannot find anywhere under NIM to allocate the resource to the client i.e. to push the file out..
I though it would be under install software, but the resource is not listed only lpp_sources.

Do you have any ideas, this one I am stuck on.

Can I actually push files out by themselves? Thanks"

The answer is yes! You can define a file_res resource on the NIM master. From the AIX 7.2 Knowledge Centre:

"A file_res resource is where NIM allows for resource files to be stored on the server. When the resource is allocated to a client, a copy of the directory contents is placed on the client at a location that is specified by the dest_dir attribute."

"-a location=Value Specifies the full path name of the directory on the NIM server. This path is used as a source directory among clients.
-a dest_dir=Value Specifies the full path name of the directory on the NIM client. This path is where the source directory is recursively copied into.

Notes: If the target directory does not exist on the destination machine, the entire source directory contents are copied (including the hidden files in the top-level directory). If the target directory exists on the destination machine, the source directory contents are copied (excluding the hidden files in the top-level directory)."

Essentially, you can place the files you need to distribute to your NIM clients, into a directory on the NIM master. Then you can create a file_res NIM resource that points to this directory. After that, you can allocate this resource to the NIM client or NIM machine group, and run a NIM customisation operation against the client (or machine group). This will copy the directory and all its files to the NIM client. Pretty cool really!

Here's an example. I want to distribute (copy) all of the files located in /usr/local/etc, to a NIM client. The original (source) directory and files reside on the NIM master.

If you’re looking for a fast and cheap way of copying a bunch of files from a central location to one or more servers, then this is a very good option for AIX administrators. It’s like a “poor man’s” file collections; similar to what we once had with Cluster System Management (CSM, which is now discontinued) and PowerHA File Collections (but not as powerful or configurable).
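The copy semantics quoted from the Knowledge Centre above can be illustrated with plain cp. This is only a local simulation of the documented behaviour (existing target: hidden top-level files skipped; missing target: everything copied), not what NIM itself runs; SRC and DEST are scratch paths, and the file names are made up.

```shell
# Scratch paths standing in for the file_res location and dest_dir.
SRC=$(mktemp -d)
DEST="$(mktemp -d)/dest"          # a path that does not exist yet
echo "config" > "$SRC/app.conf"
echo "hidden" > "$SRC/.secret"

if [ ! -d "$DEST" ] ; then
    # Target absent: the entire source directory is copied,
    # including hidden files in the top-level directory.
    cp -r "$SRC" "$DEST"
else
    # Target present: top-level hidden files are not matched by the glob,
    # mirroring the documented exclusion.
    cp -r "$SRC"/* "$DEST"/
fi

ls -a "$DEST"
```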

I was contacted recently by a customer who was attempting to restore an AIX 5.3 Versioned WPAR (VWPAR) from backup using NIM. The restore worked OK but the data was restored to the wrong volume group!

When the VWPAR was created, the -g option was specified with mkwpar to force the creation of the VWPAR file systems in a separate volume group (named wparvg), rather than the default location of the Global root volume group (rootvg).

# mkwpar -g wparvg -n p8vw2 -B /cg/53gibbo.mksysb -C -O

Running lsvg against wparvg confirmed the file systems were in the right location after creation.

# lsvg -l wparvg
wparvg:
LV NAME     TYPE      LPs  PPs  PVs  LV STATE    MOUNT POINT
loglv00     jfs2log   1    1    1    open/syncd  N/A
fslv02      jfs2      4    4    1    open/syncd  /wpars/p8vw2
fslv03      jfs2      2    2    1    open/syncd  /wpars/p8vw2/home
fslv04      jfs2      12   12   1    open/syncd  /wpars/p8vw2/opt
fslv05      jfs2      6    6    1    open/syncd  /wpars/p8vw2/tmp
fslv06      jfs2      56   56   1    open/syncd  /wpars/p8vw2/usr
fslv07      jfs2      12   12   1    open/syncd  /wpars/p8vw2/var

Before handing the VWPAR over for production use, the customer wanted to ensure they could successfully backup and recover the VWPAR using NIM. First they took a backup of the VWPAR using NIM. From the NIM master, they created a “savewpar backup image”, as shown below.
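Creating a savewpar image from the NIM master can be done with a command along these lines (the resource name p8vw2-backup matches the one used later; the location is an example path, and this assumes the VWPAR is already defined to NIM):

```shell
# Define a savewpar resource and create the backup image of the
# running VWPAR (p8vw2) in one step; /export/nim/savewpar is an
# example location on the NIM master.
nim -o define -t savewpar -a server=master \
    -a location=/export/nim/savewpar/p8vw2-backup \
    -a source=p8vw2 -a mk_image=yes p8vw2-backup
```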

In the Global environment, we then stopped and removed the VWPAR (p8vw2).

# stopwpar -F p8vw2

# rmwpar -F p8vw2

Back on the NIM master, we attempted to restore the VWPAR from the recently created backup image (p8vw2-backup).

# smit nim_wpar_create

p8vw2

Create a Managed Workload Partition

Type or select values in entry fields.

Press Enter AFTER making all desired changes.

[Entry Fields]

* Target Name [p8vw2]

Remain NIM client after install? [yes] +

Specification Resource [] +

WPAR Options

WPAR Name p8vw2

Resource for WPAR Backup Image [p8vw2-backup] +

Resource for System Backup Image [] +

Alternate DEVEXPORTS for installation [] +

Alternate SECATTRS for installation [] +

The restore completed successfully but to our surprise, the VWPAR file systems were in the Global rootvg not wparvg.

# lsvg –l rootvg

rootvg:

LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT

hd5 boot 1 1 1 closed/syncd N/A

hd6 paging 8 8 1 open/syncd N/A

hd8 jfs2log 1 1 1 open/syncd N/A

hd4 jfs2 6 6 1 open/syncd /

hd2 jfs2 35 35 1 open/syncd /usr

hd9var jfs2 7 7 1 open/syncd /var

hd3 jfs2 2 2 1 open/syncd /tmp

hd1 jfs2 1 1 1 open/syncd /home

hd10opt jfs2 5 5 1 open/syncd /opt

hd11admin jfs2 2 2 1 open/syncd /admin

lg_dumplv sysdump 16 16 1 closed/syncd N/A

livedump jfs2 4 4 1 open/syncd /var/adm/ras/livedump

cglv jfs2 100 100 1 open/syncd /cg

fslv02 jfs2 2 2 1 open/syncd /wpars/p8vw2

fslv03 jfs2 1 1 1 open/syncd /wpars/p8vw2/home

fslv04 jfs2 6 6 1 open/syncd /wpars/p8vw2/opt

fslv05 jfs2 3 3 1 open/syncd /wpars/p8vw2/tmp

fslv06 jfs2 28 28 1 open/syncd /wpars/p8vw2/usr

fslv07 jfs2 6 6 1 open/syncd /wpars/p8vw2/var

# lsvg -l wparvg

wparvg:

LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT

loglv00 jfs2log 1 1 1 closed/syncd N/A

We attempted the restore again but this time we explicitly included a WPAR “Specification Resource”. We did this to ensure that the restwpar process was using the correct specification file.

# smit nim_wpar_create

p8vw2

Create a Managed Workload Partition

Type or select values in entry fields.

Press Enter AFTER making all desired changes.

[Entry Fields]

* Target Name [p8vw2]

Remain NIM client after install? [yes] +

Specification Resource [p8vw2-spec] +

WPAR Options

WPAR Name p8vw2

Resource for WPAR Backup Image [p8vw2-backup] +

Resource for System Backup Image [] +

Alternate DEVEXPORTS for installation [] +

Alternate SECATTRS for installation [] +

We created a WPAR Specification NIM resource. The file was created in the Global environment, using the mkwpar command to write out the VWPAR specification details to a text file. This file was then copied to the NIM master to be used to create the NIM resource.
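On the NIM master, the resource definition itself is a one-liner; something like this (the attribute values match the lsnim output that follows):

```shell
# Define the wpar_spec resource, pointing at the specification file
# generated in the Global environment with mkwpar -e ... -w -o.
nim -o define -t wpar_spec -a server=master \
    -a location=/tmp/cg/p8vw2_cg.cf p8vw2-spec
```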

# mkwpar -e p8vw2 -w -o /tmp/cg/p8vw2_cg.cf

# lsnim -l p8vw2-spec

p8vw2-spec:

class = resources

type = wpar_spec

Rstate = ready for use

prev_state = unavailable for use

location = /tmp/cg/p8vw2_cg.cf

alloc_count = 1

server = master

# lsnim -t wpar_spec

p8vw2-spec resources wpar_spec

The specification file contained the volume group name (wparvg) in which each of the VWPAR file systems was located.

# grep vg /tmp/cg/p8vw2_cg.cf

rootvgwpar = "no"

vg = "wparvg"

vg = "wparvg"

vg = "wparvg"

vg = "wparvg"

vg = "wparvg"

vg = "wparvg"

# grep -p vg /tmp/cg/p8vw2_cg.cf

general:

version = "1"

name = "p8vw2"

hostname = "p8vw2"

checkpointable = "no"

directory = "/wpars/p8vw2"

privateusr = "yes"

uuid = "3e7a2bfb-6060-4770-ad7e-4d6b2a84f657"

devices = "/etc/wpars/devexports"

architecture = "none"

ostype = "1024"

xwparipc = "no"

auto = "no"

rootvgwpar = "no"

preserve = "no"

routing = "no"

mount:

logname = "/dev/loglv00"

directory = "/home"

vfs = "jfs2"

vg = "wparvg"

size = "131072"

mount:

logname = "/dev/loglv00"

mountopts = "rw"

directory = "/opt"

vfs = "jfs2"

vg = "wparvg"

size = "786432"

mount:

logname = "/dev/loglv00"

directory = "/var"

vfs = "jfs2"

vg = "wparvg"

size = "786432"

mount:

logname = "/dev/loglv00"

directory = "/tmp"

vfs = "jfs2"

vg = "wparvg"

size = "393216"

mount:

logname = "/dev/loglv00"

directory = "/"

vfs = "jfs2"

vg = "wparvg"

size = "262144"

mount:

logname = "/dev/loglv00"

mountopts = "rw"

directory = "/usr"

vfs = "jfs2"

vg = "wparvg"

size = "3670016"

However, even with the specification file in place, the result was the same and the VWPAR file systems were created in rootvg rather than wparvg.

Note: Both the Global environment and the NIM master were running AIX 7100-03-04-1441.

We were able to request an ifix from AIX support. Once we installed the ifix in the Global environment, the restore process via NIM worked as expected and the VWPAR file systems were recovered in wparvg. We did not need to use the WPAR specification NIM resource.

It’s no secret that I like
NIM. I’ve written about NIM on several occasions, in Redbooks, technical
articles and on my blog. It’s a great tool for installing and maintaining AIX. Occasionally
though, it can throw you an unusual error or two when things go wrong. For the uninitiated,
resolving these errors can be challenging.

Here are a couple of NIM-related questions I received recently. Both are troubleshooting queries.

“I'm getting the following error when trying
to create a NIM lpp_source for AIX 6.1”:

Define a Resource

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                              [Entry Fields]
* Resource Name                          [6100-06_lpp]
* Resource Type                          lpp_source
* Server of Resource                     [master]          +
* Location of Resource                   [/export/eznim]   /
  NFS Client Security Method             []                +
  NFS Version Access                     []                +
  Architecture of Resource               []                +
  Source of Install Images               [/dev/cd0]        +/
  Names of Option Packages               []
  Show Progress                          [yes]             +
  Comments                               []

...

Preparing to copy install images (this will take several minutes)...

0042-001 nim: processing error encountered on "master":
0042-081 m_mk_lpp_source: a resource already exists on "master" at
location "/export/eznim/bid_ow"; due to NFS export restrictions, the
new location "/export/eznim" may not be used

...

I could see what could be causing the issue, but I first asked to see the contents of the /etc/exports file, just to make sure that /export/eznim was not explicitly exported. It wasn’t.

# grep eznim /etc/exports
/export/eznim/vios/mksysb_image -ro,root=vio1,access=vio1
/export/eznim/bid_ow -ro,root=nim1,access=nim1

I suggested they retry the operation, but this time specify the full path to the lpp_source directory, like so:

Define a Resource

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                              [Entry Fields]
* Resource Name                          [6100-06_lpp]
* Resource Type                          lpp_source
* Server of Resource                     [master]                     +
* Location of Resource                   [/export/eznim/6100-06_lpp]  /
  NFS Client Security Method             []                           +
  NFS Version Access                     []                           +
  Architecture of Resource               []                           +
  Source of Install Images               [/dev/cd0]                   +/
  Names of Option Packages               []
  Show Progress                          [yes]                        +
  Comments                               []

This worked fine, and the lpp_source was created successfully.

# lsnim -l 6100-06_lpp

6100-06_lpp:

class = resources

type = lpp_source

locked = 26738726

Rstate = unavailable for use

prev_state =

location = /export/eznim/6100-06_lpp

alloc_count = 0

server = master

The second issue related to down-level filesets after an AIX 6.1 migration with nimadm. There was also an issue with disk space that generated an error during NIM operations.

“Hi Chris,

Thanks for replying back to my post. I am a newbie to the AIX/pSeries world.

I was following your document on upgrading AIX 5.3 to 6.1 using nimadm. It seemed to have worked partially for me, as I got to AIX 6100-00-00. The mksysb and the SPOT I created were for AIX 6100-06-05.

Before I re-attempt the upgrade, is there anything I need to clean up on the AIX 5.3 side that could have perhaps caused issues with the upgrade?

Thanks”

I recommended that the following commands be run on the NIM client, to determine which filesets were down-level after the migration.

# oslevel -s
# instfix -i | grep AIX
# instfix -i | grep SP

If the output contained messages like "Not all filesets for
6100-06_AIX_ML were found", or something similar, then they should
run the following command against each line (there could be more than one).

# instfix -icqk 6100-06_AIX_ML | grep ":-:"

This will tell you which filesets were not updated.

The output from ‘oslevel -s’ and ‘instfix -i | grep AIX’ looked OK.

# oslevel -s
6100-06-04-1112

# instfix -i | grep AIX

All filesets for 6100-00_AIX_ML were found.

All filesets for 6100-01_AIX_ML were found.

All filesets for 6100-02_AIX_ML were found.

All filesets for 6100-03_AIX_ML were found.

All filesets for 6100-04_AIX_ML were found.

All filesets for 6100-05_AIX_ML were found.

All filesets for 6100-06_AIX_ML were found.

But the output from ‘instfix -i | grep SP’ indicated an issue with 61-06-051115_SP.

# instfix -i | grep SP

All filesets for 61-00-010748_SP were found.
All filesets for 61-00-020750_SP were found.
All filesets for 61-00-030808_SP were found.
All filesets for 61-00-040815_SP were found.
All filesets for 61-01-010823_SP were found.
All filesets for 61-00-050822_SP were found.
All filesets for 61-00-060834_SP were found.
All filesets for 61-01-020834_SP were found.
All filesets for 61-00-070846_SP were found.
All filesets for 61-01-030846_SP were found.
All filesets for 61-02-010847_SP were found.
All filesets for 61-02-020849_SP were found.
All filesets for 61-00-080909_SP were found.
All filesets for 61-01-040909_SP were found.
All filesets for 61-02-030909_SP were found.
All filesets for 61-03-010921_SP were found.
All filesets for 61-00-090920_SP were found.
All filesets for 61-01-050920_SP were found.
All filesets for 61-02-040920_SP were found.
All filesets for 61-00-100939_SP were found.
All filesets for 61-01-060939_SP were found.
All filesets for 61-02-050939_SP were found.
All filesets for 61-03-020939_SP were found.
All filesets for 61-04-010944_SP were found.
All filesets for 61-04-021007_SP were found.
All filesets for 61-04-031009_SP were found.
All filesets for 61-00-110943_SP were found.
All filesets for 61-01-081014_SP were found.
All filesets for 61-02-060943_SP were found.
All filesets for 61-03-030943_SP were found.
All filesets for 61-03-041014_SP were found.
All filesets for 61-04-041014_SP were found.
All filesets for 61-05-011016_SP were found.
All filesets for 61-01-091015_SP were found.
All filesets for 61-02-081015_SP were found.
All filesets for 61-03-051015_SP were found.
All filesets for 61-04-051015_SP were found.
All filesets for 61-01-070943_SP were found.
All filesets for 61-02-071014_SP were found.
All filesets for 61-02-091034_SP were found.
All filesets for 61-03-061034_SP were found.
All filesets for 61-04-061034_SP were found.
All filesets for 61-05-021034_SP were found.
All filesets for 61-06-011043_SP were found.
All filesets for 61-02-101036_SP were found.
All filesets for 61-03-071036_SP were found.
All filesets for 61-04-071036_SP were found.
All filesets for 61-05-031036_SP were found.
All filesets for 61-06-021044_SP were found.
All filesets for 61-06-031048_SP were found.
All filesets for 61-05-041048_SP were found.
All filesets for 61-04-081048_SP were found.
All filesets for 61-03-081048_SP were found.
All filesets for 61-03-091112_SP were found.
All filesets for 61-04-091112_SP were found.
All filesets for 61-05-051112_SP were found.
All filesets for 61-06-041112_SP were found.
Not all filesets for 61-06-051115_SP were found.

The culprit
was the csm.hc_utils fileset from
TL6 SP5. It was still at level 1.7.1.0 instead of the expected 1.7.1.1.

I recommended that the csm.hc_utils 1.7.1.1 fileset be added to the AIX 6.1 lpp_source. The migration was attempted again and the client migrated to AIX 6.1 TL6 SP5 successfully. I also suggested they determine if this fileset was really used and/or needed (CSM is no longer supported/available starting with AIX 7.1 anyway).

And just
when I thought all his problems were resolved, he replied with: “however upon trying to install these from
the NIM server, I get errors on NIM”.

On the odd occasion, NIM may report that a resource is allocated to a NIM client when, in fact, it is not. Typically, you’d check whether the resource really was allocated to any NIM client; if it was, you’d reset that client, and this would resolve the issue. But if that doesn’t work, you may need to take an additional step to resolve the problem. This doesn’t happen very often, but it can frustrate you when it does.

Here’s an example of the problem. I try to remove an lpp_source resource but I’m told that it’s still allocated to a client. But it isn’t, I tell you!

# nim -o remove liveupdaterte

0042-001 nim: processing error encountered on "master":

0042-061 m_rmpdir: the "liveupdaterte" resource is currently

allocated for client use

Even lsnim is telling me that the resource is still allocated, somewhere, because alloc_count is set to 1.

# lsnim -Fl liveupdaterte

liveupdaterte:

id = 1447111715

class = resources

type = lpp_source

comments = LIVE

arch = power

Rstate = ready for use

prev_state = verification is being performed

location = /export/nim/cglpp

alloc_count = 1

server = master

After trying to de-allocate the resources, by resetting my NIM clients (see my script at the bottom of the page), and still receiving the same error, I’m left with little choice but to manually reset the alloc_count value to 0, using the (almost undocumented) /usr/lpp/bos.sysmgt/nim/methods/m_chattr NIM utility.
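The reset itself should be a one-liner, given that m_chattr takes attribute=value pairs (take a backup of your NIM database first):

```shell
# Manually clear the stale allocation count (use with caution).
/usr/lpp/bos.sysmgt/nim/methods/m_chattr -a alloc_count=0 liveupdaterte

# Confirm the change, then retry the remove operation.
lsnim -Fl liveupdaterte | grep alloc_count
nim -o remove liveupdaterte
```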

IBM’s NIM troubleshooting documentation describes m_chattr in the context of correcting a resource’s level attributes: “The release level of the resource is incomplete, or incorrectly specified. The level of the resource can be obtained by running the lsnim -l ResourceName command and viewing the version, release, and mod attributes. To correct the problem, either recreate the resource, or modify the NIM database to contain the correct level using the following command on the NIM master: /usr/lpp/bos.sysmgt/nim/methods/m_chattr -a Attribute=Value ResourceName, where Attribute is version, release, or mod; Value is the correct value; and ResourceName is the name of the resource with the incorrect level specification.”

One question that comes to mind is how did the NIM resource end up in this state? Most likely it was the result of a failed NIM operation on the lpp_source and NIM client to which it was to be allocated. This can be tricky to pick up and almost always, it’s the next person who tries to use the resource that finds the problem and has no idea what events led up to this point.

As always, use caution when experimenting with this tool. If in doubt, take a backup of your NIM database before you start messing with the attributes, just in case you need it in the future.

Here’s my NIM client reset script. It resets the client and de-allocates any resources assigned to it. It also resets the NIM client cpuid (this is not always required) but I often use the same NIM client to install multiple AIX partitions across several Power servers, so it’s useful to me only (probably)! You can remove that line if need be.
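In outline, a reset script of this kind does something like the following (a minimal sketch; the original script may differ in detail):

```shell
#!/usr/bin/ksh
# Reset a NIM client and deallocate all of its resources.
# Usage: nimreset <client_name>

client=$1
if [ -z "$client" ]; then
    echo "Usage: $0 <client_name>"
    exit 1
fi

# Force the client back to the "ready for a NIM operation" state.
nim -Fo reset "$client"

# Deallocate every resource currently assigned to the client.
nim -o deallocate -a subclass=all "$client"

# Clear the stored cpuid; remove this line if you don't reuse one
# client definition across multiple partitions/servers.
nim -Fo change -a cpuid= "$client"
```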

There’s a new NIM HTTP service handler included with AIX 7.2 (due for release next month, December 2015). This new service is designed “…to help Clients better conform to emerging data center policies restricting the use of NFS, NIM will now have support to apply updates to AIX or install new packages over HTTPS. Initial AIX installs will still require the use of NFS version 3 or the more secure NFS version 4 protocol.

In addition to fileset installs, NIM customization activities such as script execution and file_res copying also support access over HTTPS.

Major Advantages of using HTTP during NIM Management:

All communication occurs over a single http port, so the authorization through a firewall is quite easy to manage.

Actions are driven from the client's end (the install target), so remote access isn't necessary for pushing the commands.

Easy to consume by NIM or other products that currently use the client/server model of NFS.

Able to extend the end-product to support additional protocols (context driven).”

“How Does it Work?

AIX ships a new service handler (in 7.2.0) that provides http access to NIM resources. The service name (defined in /etc/services) is nimhttp and it listens for requests over port 4901. When active, NIM clients attempt file access and/or scripting customization requests from nimhttp. If http access fails or is denied, a failover attempt at NFS client access occurs. Future support will include options to remove NFS client attempts altogether.”

“On startup, the nimhttp service attempts to read the httpd.conf configuration file, located in the default home directory of the user. First-time users will notice that starting the service without a configuration file will result in one being created and populated with default service values.”

“document_root

….for now, the key detail to point out is that NIM expects all http accessible files to exist under the path of /export/nim/. This path location is defined as the document_root and cannot be modified at this time. Future enhancements will support multiple document_root paths. The document root path is not limited in depth and may contain many sub-directories. Client requests are able to traverse the path setting by using the enable_directory_listing option. If set to “no”, all files being served must reside in the current working directory of document_root.”

“The default authentication used in nimhttp for client access is a basic protocol handshake and is probably considered by some (if not all) as undesirable. To enable the more secure Digest Authentication method, users must provide valid paths for certificate authority and root certificate files for the server. The certificate authority and root PEM files used in nimhttp are easily created using the existing SSL management option in NIM. Run the following command on the NIM master to create the ssl.cert_authority and ssl.pemfiles used by the nimhttp service:

# nimconfig –c”

I tested this functionality during the AIX 7.2 Early Ship Program.

Warning: The information shown here was collected from testing conducted with beta level code. Some details may change in the final release.

Configuring the service was easy. For the sake of simplicity I chose not to use SSL with the authentication mechanism. With my NIM master already configured, all I need to do is confirm that the NIM client fileset is installed on the master and any client I wish to manage with the HTTP service.

NIM MASTER:

# lslpp -l | grep nim

bos.sysmgt.nim.master 7.2.0.0 COMMITTED Network Install Manager -

bos.sysmgt.nim.client 7.2.0.0 COMMITTED Network Install Manager –

NIM CLIENT:

# lssrc -s nimsh

Subsystem Group PID Status

nimsh nimclient 6554064 active

# lslpp -l | grep nim

bos.sysmgt.nim.client 7.2.0.0 COMMITTED Network Install Manager -

Start the NIMHTTP service on the NIM master. This starts the nimhttpd daemon (on the master only) and creates the default httpd.conf file (in root’s home directory, /).

# startsrc -s nimhttp

0513-059 The nimhttp Subsystem has been started. Subsystem PID is 6685178.

# lssrc -s nimhttp

Subsystem Group PID Status

nimhttp 6685178 active

# ps -ef | grep nimhttp

root 6685178 4194712 0 Nov 10 - 0:00 /usr/sbin/nimhttpd -v

# ls -ltr /httpd.conf

-rw-r--r-- 1 root system 1159 Nov 05 15:31 /httpd.conf

# cat /httpd.conf

#

#---------------------

# http service defines

#---------------------

#

service.name=nimhttp

# Designates the service name used when discovering the listening port for requests (i.e., nimhttp)

#

service.log=/var/adm/ras/nimhttp.log

# Log of access attempts and equivalent responses. Also useful for debug purposes.

#

# service.proxy_port=

# Designates the service port number used when configured as a proxy.

#

# service.access_list=

# White-list of IP (host) addresses which have access to our http file service. All others are denied.

#

#

#---------------------

# http configuration

#---------------------

#

document_root=/export/nim/

# Designates the directory to serve files from.

#

enable_directory_listing=yes

# Allow requests for listing served files/directories under the document root.

#

enable_proxy=no

# Enable the web service to act as a proxy server.

#

ssl.cert_authority=/ssl_nimsh/certs/root.pem

# Designates the file location of the certificate authority used for digital certificate signing.

#

ssl.pemfile=/ssl_nimsh/certs/server.pem

# Designates the file location of the PEM format file which contains both a certificate and private key.

#

I configured a new lpp_source resource (liveupdaterte) on the NIM master. I ensured that all the files for the lpp_source were in the correct location (i.e., /export/nim). This restriction will be lifted in the future, but during my testing the service required all files to be served from /export/nim on the master.

In my previous post on AIX Live Updates I discussed how to use the geninstall command to perform a non-disruptive (ifix) update on an AIX system. In this post I wanted to show you how to perform the same task using NIM.

NIM can be used to start an AIX Live Update operation on a target machine (NIM client) either from a NIM master or from the NIM client itself (with nimclient).

Note: The AIX Live Update operation started by NIM calls the hmcauth command during the cust operation to authenticate the NIM client with the HMC by using the HMC password file. The NIM master is responsible for obtaining password information from the HMC (using ssh). Without it, NIM clients will not have the password information necessary when running hmcauth as part of the NIM client operation. So, we must first define an hmc object in NIM and create the password file (used when accessing the console). Once this required step has been completed, all clients using NIM live_update have the ability to pass the proper HMC login credentials when configuring hmcauth.

First, I need to install the dsm.core fileset and configure SSH keys between the NIM master and the HMC.
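Roughly, those steps look like this (the HMC host name hmc01, the hscroot user and the password file path are examples; dpasswd and dkeyexch ship with dsm.core, but verify the exact options against the DSM documentation):

```shell
# Install the DSM fileset on the NIM master (media path is an example).
installp -agXd /dev/cd0 dsm.core

# Create the HMC password file, then exchange SSH keys with the HMC.
dpasswd -f /etc/ibm/sysmgt/dsm/config/hmc_passwd -U hscroot -P 'hmc_password'
dkeyexch -f /etc/ibm/sysmgt/dsm/config/hmc_passwd -I hmc -H hmc01

# Define the HMC object in NIM, referencing the password file.
nim -o define -t hmc -a if1="find_net hmc01 0" \
    -a passwd_file=/etc/ibm/sysmgt/dsm/config/hmc_passwd hmc01
```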

The NIM client must either be defined with or updated to include the Managed System name (Management Source) and LPAR id number.

# smit nim_chmac

Change/Show Characteristics of a Machine

Type or select values in entry fields.

Press Enter AFTER making all desired changes.

[Entry Fields]

Machine Name [AIXmig]

* Hardware Platform Type [chrp] +

* Kernel to use for Network Boot [64] +

Machine Type standalone

Network Install Machine State currently running

Network Install Control State ready for a NIM operation

Primary Network Install Interface

Network Name net1

Host Name [AIXmig]

Network Adapter Hardware Address [0]

Network Adapter Logical Device Name [ent]

Cable Type N/A +

Network Speed Setting [] +

Network Duplex Setting [] +

IPL ROM Emulation Device [] +/

VLAN Tag Priority (0 to 7) [] #

VLAN Tag Identifier (0 to 4094) [] #

CPU Id [00F94F584C00]

Communication Protocol used by client [nimsh] +

NFS Client Reserved Ports [] +

Comments []

Managing System Information

LPAR Options

Identity [88]

Management Source [S824]

# lsnim -l AIXmig

AIXmig:

class = machines

type = standalone

connect = nimsh

platform = chrp

netboot_kernel = 64

if1 = net1 AIXmig 0

cable_type1 = N/A

mgmt_profile1 = hsc02 88 S824 <<< LPAR id 88, Mgmt Src S824

Cstate = ready for a NIM operation

prev_state = ready for a NIM operation

Mstate = currently running

cpuid = 00F94F584C00

Cstate_result = success

I also need to configure an lpp_source for the ifix location (on the NIM master) and the Live Update data file (on the NIM master). This file can reside on the NIM client if you wish but I’ve chosen to manage all the resources on the NIM master.
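Defining these two resources is straightforward; something along these lines (the names and locations match my environment):

```shell
# Define the lpp_source containing the interim fix.
nim -o define -t lpp_source -a server=master \
    -a location=/nim/lvup/ifix liveupdatefix

# Define the Live Update data file resource.
nim -o define -t live_update_data -a server=master \
    -a location=/nim/lvup/lvupdate.data liveupdate_AIXmig
```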

# lsnim -t lpp_source

lpp_sourceaix72 resources lpp_source

liveupdatefix resources lpp_source

# lsnim -l liveupdatefix

liveupdatefix:

class = resources

type = lpp_source

arch = power

Rstate = ready for use

prev_state = unavailable for use

location = /nim/lvup/ifix

alloc_count = 0

server = master

# ls -ltr /nim/lvup/ifix

total 72

-rw-r----- 1 root system 35625 Oct 15 14:50 dummy.150813.epkg.Z

# lsnim -t live_update_data

liveupdate_AIXmig resources live_update_data

# lsnim -l liveupdate_AIXmig

liveupdate_AIXmig:

class = resources

type = live_update_data

Rstate = ready for use

prev_state = unavailable for use

location = /nim/lvup/lvupdate.data

alloc_count = 0

server = master

# ls -ltr /nim/lvup/

total 16

drwxr-xr-x 2 root system 256 Oct 15 14:54 ifix

-r--r----- 1 root system 4289 Oct 15 15:04 lvupdate.data

# tail -20 /nim/lvup/lvupdate.data

# Users need not provide redundant options such as "-a -U -C and -o"

# in the trc_option field for trace stanza.

# Do not add a trace stanza to the lvupdate.data file unless you

# want the live update commands to be traced.

#

general:

mode = automated

kext_check = no

disks:

nhdisk = hdisk0

mhdisk = hdisk1

tohdisk =

tshdisk =

hmc:

lpar_id = 88

management_console = 10.1.50.30

user = hscroot

Now I can perform a preview of the live update operation, from the NIM master. The preview operation will be run on the NIM client called AIXmig.

If you want, you could initiate the live update from the NIM client using the nimclient command. All the resources reside on the NIM master, but the NIM client starts the operation, not the NIM master.
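A sketch of both invocations follows; the cust attribute names (particularly live_update_preview) are assumptions on my part and should be verified against the AIX 7.2 documentation:

```shell
# Preview the Live Update operation from the NIM master (attribute
# names assumed; check the AIX 7.2 documentation).
nim -o cust -a live_update=yes -a live_update_preview=yes \
    -a live_update_data=liveupdate_AIXmig \
    -a lpp_source=liveupdatefix \
    -a filesets=dummy.150813.epkg.Z AIXmig

# Or start the operation from the NIM client itself.
nimclient -o cust -a live_update=yes \
    -a live_update_data=liveupdate_AIXmig \
    -a lpp_source=liveupdatefix \
    -a filesets=dummy.150813.epkg.Z
```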