I was working at a client site today, on a NIM master that I
configured a month or so ago. I was there to install the TSM backup client
software on about 30 or so LPARs. Of course I was going to use NIM to
accomplish this task.

The software install via NIM worked for the majority of the LPARs
but I noticed a few of them were failing. This was very odd, as the last time I’d used the same NIM method to install software, everything was fine.

I suspected that perhaps something had changed on the client
LPARs...maybe with their /etc/niminfo
file for instance. So I performed the following steps to reconfigure the /etc/niminfo file
and configure the nimsh subsystem on
the client LPAR.

lpar1# mv /etc/niminfo /etc/niminfo.old

lpar1# niminit -a master=nim1 -a name=`hostname`

lpar1# stopsrc -s nimsh

lpar1# smit nim_config_services

Configure Client Communication Services

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
* Communication Protocol used by client              [nimsh]           +

  NIM Service Handler Options
*   Enable Cryptographic Authentication              [disable]         +
      for client communication?
    Install Secure Socket Layer Software (SSLv3)?    [no]              +
    Absolute path location for INSTALLP package      [/dev/cd0]        /
           -OR-
    lpp_source which contains INSTALLP package       []                +
    Alternate Port Range for Secondary Connections
      (reserved values will be used if left blank)
      Secondary Port Number                          []                #
      Port Increment Range                           []               +#

The last step failed with the following error message:

0042-358 niminit: The connect attribute may only be assigned a service value of "shell" or "nimsh".

I checked the NIM client and confirmed it was configured for nimsh
and it was fine. However, I did notice something odd when I ran the following
command:

lpar1# egrep 'nimsh|nimaux' /etc/services
lpar1#

The entries for nimsh
were missing from the /etc/services
file!

Somebody had decided that these entries were not required and had
simply removed them! Gee, thanks so much for that!

After adding the following entries back into the services file,
everything started working again!

nimsh           3901/tcp        # NIM Service Handler
nimsh           3901/udp        # NIM Service Handler
nimaux          3902/tcp        # NIMsh Auxiliary Port
nimaux          3902/udp        # NIMsh Auxiliary Port
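With the entries back in place, it’s worth making sure the nimsh subsystem is up again on the client. A quick check along these lines (just a sketch, not output captured from the system in question) confirms it:

lpar1# startsrc -s nimsh
lpar1# lssrc -s nimsh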

I’ve also encountered this error when there is another
process (other than nimsh) using
port 3901 or 3902.
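If you suspect a port clash rather than missing entries, something like the following (illustrative only; the PCB address passed to rmsock comes from the netstat output) will show whether another process is already sitting on the nimsh ports:

lpar1# netstat -Aan | grep -E '\.390[12] '
lpar1# rmsock <pcb_address> tcpcb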

Another error message you might confront, if those
entries are either missing or commented out, is on the NIM master:

nimmast# nim -o showlog -a log_type=lppchk lpar1

0042-001 nim: processing error encountered on "master":
   0042-006 m_showlog: (From_Master) connect Error 0
   poll: setup failure

I thought I’d also mention another error message that
can potentially drive you insane (especially if you haven’t had your morning
coffee!). The error doesn’t relate to nimsh
at all but I thought I’d describe it anyway. The message appears when running
the nim -o showlog command against a
client LPAR.

nimmast# nim -o showlog lpar1

0042-001 nim: processing error encountered on "master":
   0042-006 m_showlog: (From_Master) connect Error 0
   0042-008 nimsh: Request denied - wronghostname

I’ve modified the output a little to make it easier to
identify the problem. Can you see it? I thought so! Upon investigation you may
find that the IP address for the NIM master is resolving to a different
hostname on the client. For example:

On the NIM master:

nimmast# host nimmast
nimmast is 172.29.150.177

nimmast# host 172.29.150.177
nimmast is 172.29.150.177

nimmast# grep 177 /etc/hosts
172.29.150.177    nimmast

On the NIM client:

lpar1# host nimmast
nimmast is 172.29.150.177

lpar1# host 172.29.150.177
wronghostname is 172.29.150.177

lpar1# grep 172.29.150.177 /etc/hosts
172.29.150.177    wronghostname
172.29.150.177    nimmast

In this example, someone had placed two host entries with the same IP address in /etc/hosts. The client was resolving the IP address to the incorrect hostname, which caused our nim -o showlog command to fail.

I was performing a volume group re-org, i.e. changing the INTER-POLICY of a logical volume from minimum to maximum.

# lslv fixeslv | grep INTER
INTER-POLICY:   minimum                RELOCATABLE:    yes

# chlv -e x fixeslv

# lslv fixeslv | grep INTER
INTER-POLICY:   maximum                RELOCATABLE:    yes

I attempted to run the reorgvg
command. I was greeted by the following error message!

# reorgvg tempvg fixeslv
0516-966 reorgvg: Unable to create internal map.

I ran the command again, this time with truss. I found that the /usr/sbin/allocp command was being called and was failing. I suspected this was because of a lack of free space in the volume group.
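For the record, the truss capture was nothing fancy; something like this (a sketch using standard truss flags) is enough to see the failing allocp call:

# truss -f -o /tmp/reorgvg.truss reorgvg tempvg fixeslv
# grep allocp /tmp/reorgvg.truss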

# /usr/sbin/allocp -?
/usr/sbin/allocp: Not a recognized flag: ?
0516-422 allocp: [-i LVid] [-t Type] [-c Copies] [-s Size]
        [-k] [-u UpperBound] [-e InterPolicy] [-a IntraPolicy]

The truss output showed:

statx("/usr/sbin/allocp", 0x2FF21ED8, 76, 0)= 0

statx("/usr/sbin/allocp", 0x20009E70, 176, 020) = 0

kioctl(2, 22528, 0x00000000, 0x00000000)Err#25
ENOTTY

kfork()=
3735812

_sigaction(20,
0x00000000, 0x2FF21F20)= 0

_sigaction(20,
0x2FF21F20, 0x2FF21F30)= 0

kwaitpid(0x2FF21F90,
-1, 6, 0x00000000, 0x00000000) = 3735812

And yes, my volume group was indeed out of free PPs!

# lsvg tempvg
VOLUME GROUP:       tempvg                   VG IDENTIFIER:  00f6027300004c0000000130773bdb73
VG STATE:           active                   PP SIZE:        512 megabyte(s)
VG PERMISSION:      read/write               TOTAL PPs:      99 (50688 megabytes)
MAX LVs:            256                      FREE PPs:       0 (0 megabytes)
LVs:                1                        USED PPs:       99 (50688 megabytes)
OPEN LVs:           1                        QUORUM:         2 (Enabled)
TOTAL PVs:          2                        VG DESCRIPTORS: 3
STALE PVs:          0                        STALE PPs:      0
ACTIVE PVs:         1                        AUTO ON:        yes
MAX PPs per VG:     32768                    MAX PVs:        1024
LTG size (Dynamic): 256 kilobyte(s)          AUTO SYNC:      no
HOT SPARE:          no                       BB POLICY:      relocatable
PV RESTRICTION:     none

Silly me, it clearly states in the reorgvg man page that there must be at least one free PP in the
volume group for the command to run successfully.

At least one free physical partition
(PP) must exist on the specified volume group for the reorgvg command to run
successfully. For mirrored logical volumes, one free PP per physical
volume (PV) is required in order for the reorgvg command to maintain logical
volume strictness during execution; otherwise the reorgvg command still runs,
but moves both copies of a logical partition to the same disk during its
execution.

So I shrunk the file system in question (there was a large amount
of allocated but unused file system space, so it was safe to shrink it).
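The shrink itself is a single chfs call (the mount point and size here are illustrative; online shrinking requires JFS2), after which the freed PPs show up in the volume group:

# chfs -a size=-512M /fixes
# lsvg tempvg | grep "FREE PPs"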

Looking at the lvmcfg alog (more on this below), I can see when my reorgvg command failed (rc=1) and when it succeeded (rc=0). This is also a good way of determining when a reorgvg command was issued and when it finished. Of course, an easier way would be to start the reorgvg command with the time command, which produces a nice little summary of the time taken.

# time reorgvg tempvg fixeslv
0516-962 reorgvg: Logical volume fixeslv migrated.

real    3m12.94s
user    0m1.52s
sys     0m4.60s

But if I forgot to use the time command, I can look at the lvmcfg alog file for an answer. In the following example, the reorgvg.sh process is started at 23:49. The entry in the log file begins with an uppercase S. The entry that starts with an uppercase E indicates the end of the reorgvg.sh process. It is the information in the third field that tells me how long the process ran for, in seconds:milliseconds.
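To pull those start and end entries out of the log, the standard alog command does the job; something like:

# alog -o -t lvmcfg | grep -i reorgvg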

Recently I came across a small but strange problem with an AIX 5.3 system running Oracle. When we tried to perform an offline filesystem backup of the sapdata1 filesystem, we received errors stating that certain files could not be opened. I confirmed this by using the 'file' command on some of the database files. I received the following messages:

gibsonc@hxaix35 /oracle/TC0/sapdata1/sr3_6 $ file sr3.data6

sr3.data6: 0653-902 Cannot open the specified file for reading.

I discovered that the sapdata1 filesystem had not been mounted with the CIO (Concurrent I/O) option. CIO was enabled within the Oracle database, however.
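The fix, in principle, was to bring the mount options into line with what the database expects. Something like the following (a sketch only; the path comes from the example above, and you would obviously coordinate the remount with the DBA) sets CIO on the filesystem:

# chfs -a options=cio /oracle/TC0/sapdata1
# umount /oracle/TC0/sapdata1; mount /oracle/TC0/sapdata1
# mount | grep sapdata1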

In order to create the WPAR, I needed an AIX 5.2 mksysb file to
supply to the mkwpar command.

Fortunately, I just happened to have an old AIX 5.2 mksysb image
in my archives!

I then executed the following command to build the WPAR:

# mkwpar -n wpar1 -C -B /home/cgibson/AIX5202_64bit-mksysb

The flags to the command are:

-n wparname

Specifies the name for the workload partition to be
created. You must specify a name, either using the -n flag or in a
specification file using the -f flag, unless the -p name or both -w and -o flags are used.

-B wparbackupdevice

Specifies a device containing a workload partition
backup image. This image is used to populate the workload partition file
systems. The wparBackupDevice parameter is a workload partition image that is
created with the savewpar, mkcd, or mkdvd command. The -B flag is used by the
restwpar command as part of the process of creating a workload partition from a
backup image.

-C

Creates a versioned workload partition. This option
is valid only when additional versioned workload partition software has been
installed.

I was then able to start my new AIX 5.2 WPAR successfully!

# startwpar -v wpar1

Starting workload partition wpar1.

Mounting all workload partition file systems.

Mounting /wpars/wpar1

Mounting /wpars/wpar1/home

Mounting /wpars/wpar1/mksysb

Mounting /wpars/wpar1/nre/opt

Mounting /wpars/wpar1/nre/sbin

Mounting /wpars/wpar1/nre/usr

Mounting /wpars/wpar1/opt

Mounting /wpars/wpar1/proc

Mounting /wpars/wpar1/tmp

Mounting /wpars/wpar1/usr

Mounting /wpars/wpar1/usr/local

Mounting /wpars/wpar1/var

Mounting /wpars/wpar1/var/log

Mounting /wpars/wpar1/var/tsm/log

Loading workload partition.

Exporting workload partition devices.

Exporting workload partition kernel extensions.

Starting workload partition subsystem cor_wpar1.

0513-059 The cor_wpar1 Subsystem has been started.
Subsystem PID is 8388822.

Verifying workload partition startup.

Return Status = SUCCESS.

The WPAR was now in an active state and the associated file
systems were mounted (as shown from the Global
environment).

# lswpar
Name   State  Type  Hostname  Directory     RootVG WPAR
--------------------------------------------------------
wpar1  A      S     wpar1     /wpars/wpar1  no

# mount | grep wpar
/dev/lv00   /wpars/wpar1              jfs    Jul 26 20:13 rw,log=/dev/loglv00
/dev/lv01   /wpars/wpar1/home         jfs    Jul 26 20:13 rw,log=/dev/loglv00
/dev/lv02   /wpars/wpar1/mksysb       jfs    Jul 26 20:13 rw,log=/dev/loglv00
/opt        /wpars/wpar1/nre/opt      namefs Jul 26 20:13 ro
/sbin       /wpars/wpar1/nre/sbin     namefs Jul 26 20:13 ro
/usr        /wpars/wpar1/nre/usr      namefs Jul 26 20:13 ro
/dev/lv03   /wpars/wpar1/opt          jfs    Jul 26 20:13 rw,log=/dev/loglv00
/proc       /wpars/wpar1/proc         namefs Jul 26 20:13 rw
/dev/lv04   /wpars/wpar1/tmp          jfs    Jul 26 20:13 rw,log=/dev/loglv00
/dev/lv05   /wpars/wpar1/usr          jfs    Jul 26 20:13 rw,log=/dev/loglv00
/dev/lv06   /wpars/wpar1/usr/local    jfs    Jul 26 20:13 rw,log=/dev/loglv00
/dev/lv07   /wpars/wpar1/var          jfs    Jul 26 20:13 rw,log=/dev/loglv00
/dev/lv08   /wpars/wpar1/var/log      jfs    Jul 26 20:13 rw,log=/dev/loglv00
/dev/lv09   /wpars/wpar1/var/tsm/log  jfs    Jul 26 20:13 rw,log=/dev/loglv00

I was curious what the WPAR environment was going to look like, so
I used clogin to access
it and run a few commands.

From the Global environment I confirmed I was indeed on an AIX 7
system.

# uname -W

0

# oslevel

V7BETA

From within the WPAR, I confirmed that I was indeed running AIX
5.2! Wow!

# clogin wpar1

wpar1 : / # oslevel

5.2.0.0

And I could see all 8 logical CPUs (4 hardware threads per POWER7
processor i.e. SMT-4).

wpar1 : / # sar -P ALL 1 5

AIX wpar1 2 5 00F602734C00    07/26/10

wpar1 configuration: lcpu=8 mem=4096MB ent=0.50

20:22:20 cpu    %usr    %sys    %wio   %idle   physc   %entc
20:22:21   0      77       8       1      14    0.01     0.0
           1       1      70       0      29    0.01     0.0
           2       0       1       0      99    0.00     0.0
           3       0       0       0     100    0.01     0.0
           4       0      35       0      65    0.00     0.0
           7       0      28       0      72    0.00     0.0
           U       -       -       0      93    0.47    93.9
           -       0       3       0      97    0.03     0.0

I noticed an interesting device in the lscfg output.

wpar1 : / # lscfg
INSTALLED RESOURCE LIST

The following resources are installed on the machine.
+/- = Added or deleted from Resource List.
*   = Diagnostic support not available.

  Model Architecture: chrp
  Model Implementation: Multiple Processor, PCI bus

+ sys0           System Object
* wio0           WPAR I/O Subsystem

I also noticed some new and interesting mount points, for example /nre/opt.

wpar1 : / # df
Filesystem    512-blocks      Free %Used    Iused %Iused Mounted on
Global            131072     99928   24%     1424     5% /
Global            131072    126704    4%       70     1% /home
Global           1048576   1015560    4%       17     1% /mksysb
Global            786432    428904   46%     7331    14% /nre/opt
Global            458752     88400   81%    10020    47% /nre/sbin
Global           4980736     24872  100%    53698    87% /nre/usr
Global            131072     63800   52%     1640    11% /opt
Global                 -         -    -         -     -  /proc
Global            131072    125080    5%       52     1% /tmp
Global           1572864    165744   90%    23183    12% /usr
Global            524288    494464    6%      154     1% /usr/local
Global            131072    111512   15%      493     4% /var
Global            262144    253744    4%       28     1% /var/log
Global            131072    126832    4%       20     1% /var/tsm/log

I did have one minor problem when I first tried to start my WPAR,
but that issue was quickly resolved by the AIX developers on the AIX 7 Open
Beta Forum.

You are receiving this mail because we have either been in touch regarding the LPAR2RRD tool about a year ago, or you are on the LPAR2RRD mailing list.

I’ve decided to offer professional paid support for LPAR2RRD. Here are the reasons which led me to that decision:
- I have not touched the source code for nearly 1.5 years (it looks like the last version is quite stable :) ). No time at all, and a lack of motivation...
- The only thing I am trying to keep up is support of the tool, but my response times are not something I am proud of.
- The tool itself would need a big time investment to implement new functionality, rewrite the web interface (I would need to hire a web designer for that), etc.

The only solution I can see to keep continuity and not let the project slowly die is paid support. Basically I feel this is the last chance for any further development or support from my side. I simply cannot manage to do everything else I do and still work on LPAR2RRD. If I get some money from the project, then I will be able to prioritize my other activities in favour of LPAR2RRD development and support.

For that reason I’ve created a new LPAR2RRD home here: http://www.lpar2rrd.com
I am about to found a company which would be responsible for support. Based on your feedback I will decide go or no-go. I could even imagine working on LPAR2RRD full time if there are enough support subscribers.

I do not intend to force you to order support. If you are happy with the current version and have no issues, then that makes me happy too. Making people happy was the main reason why I have spent my free time developing the tool since 2006. I do not regret that at all! It was fun.

During the testing of the migration process we noticed that some of
the sys0 tunables were being reset to their default settings after the
migration had completed. This was rather odd. I’d never had this issue during
an AIX migration in the past.

We noticed the following attributes had changed.

fullcore - Was set to true before migration. Set to false after migration.

iostat - Was set to true before migration. Set to false after migration.

maxuproc - Was set to 2048 before migration. Set to 128 after migration.

The maxuproc value was of particular concern as it has an impact on the number of processes an application (user) can start. So when one of our SAP/Oracle test systems was unable to start because maxuproc was set too low, we were very puzzled. After we discovered that maxuproc was incorrect, we changed it to the appropriate value and restarted SAP/Oracle successfully. We were then determined to identify the root cause of this issue. We could not see any issue with our migration process (via nimadm) and decided to log a call with IBM.
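For anyone hitting the same thing, checking and correcting the tunables is straightforward (the values here simply match this example; substitute your own standards):

# lsattr -El sys0 -a maxuproc -a fullcore -a iostat
# chdev -l sys0 -a maxuproc=2048 -a fullcore=true -a iostat=true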

IBM AIX support were able to assist us in determining the problem.

---------------------

After building more debug methods and performing further debug, which
involves multiple restore attempts, we figured out the root cause.

- During the second phase of boot, when cfgsys_chrp runs, it tries to set all
customized values for the sys0 device. However, in this process, if and
when an error occurs, all customized values for sys0 will be reset to
allow the system to boot (instead of hanging/crashing).

- In the case of the customer's scenario, when cfgsys_chrp() tries to set ncargs
to a value of 30, it fails with an error. The reason is that, for AIX 6.1, the
minimum value for ncargs is 256. If it is less than 256, the kernel returns an
error and then cfgsys_chrp "resets" all customized attributes.

Recently
a colleague contacted me with a question relating to hostname resolution and
DNS on AIX 6.1. I thought it was an interesting discussion so I thought I’d
share it with you here.

His
question was basically this:

“In AIX 6.1, as you know, the resolv.conf has
some additional options. Do you know what would happen if I have two
nameservers in my file and the target hostname isn't found, will the second
nameserver necessarily be looked up? The man page says:

If more than one name server is listed, the
resolver routines query each name server (in the order listed) until either the
query succeeds or the maximum number of attempts have been made.

but the rotate
option seems to be set for that purpose:

Enables the resolver to use all the nameservers
in the resolv.conf file, not just the first one.

If I have multiple name servers in /etc/resolv.conf, and the first one is available but the query fails, will the name resolution inevitably go to the second nameserver?”

By default, if the first nameserver is able to answer the query, either by returning the IP address for the target hostname OR a 'host does not exist' response, then this equates to a successful lookup. Only if the first nameserver does not respond and/or times out will the resolver routine send the query to the next nameserver in the list.

To debug this you could use the RES_OPTIONS
environment variable and examine the output to see what nameservers are being called and when and in what order. For
example:

- In the following test, my resolv.conf file has what you would typically configure, i.e. a couple of nameservers and a domain entry. Note that I have two nameservers listed in this file.

# cat /etc/resolv.conf

nameserver 10.1.50.201

nameserver 10.1.50.202

domain cg.com

- I then perform a lookup of a host that is known to DNS and returns an IP address. The output indicates that only one nameserver is queried, not both. As expected.

# RES_OPTIONS=debug host mygoodhostname |
grep Query

;; Querying server (# 1) address = 10.1.50.201

- Likewise, if I perform a lookup on a hostname that is not known to DNS, I receive a reply from the first nameserver in the list only. Again, as expected.

# RES_OPTIONS=debug host mybadhostname |
grep Query

host: 0827-801 Host name mybadhostname does
not exist.

;; Querying server (# 1) address = 10.1.50.201

- Now, if I add the new rotate option to my resolv.conf file,
I observe different behaviour. Both nameservers
are queried, regardless.

# cat /etc/resolv.conf

nameserver 10.1.50.201

nameserver 10.1.50.202

domain cg.com

options rotate

- Both nameservers are queried to look up the hostname of a host known to DNS.

# RES_OPTIONS=debug host mygoodhostname
| grep Query

;; Querying server (# 1) address = 10.1.50.202

;; Querying server (# 2) address =
10.1.50.201

- Again, both nameservers are queried to look up the hostname of a host not known to DNS. In this case, the second nameserver (10.1.50.202) is bogus and it is actually the first nameserver, 10.1.50.201, that replies, i.e. Query #1.

# RES_OPTIONS=debug host mybadhostname
| grep Query

host: 0827-801 Host name mybadhostname does
not exist.

;; Querying server (# 1) address = 10.1.50.202

;; Querying server (# 2) address =
10.1.50.201

;; Querying server (# 1) address = 10.1.50.201

I finished off my response by stating that this approach was probably good practice, but it might have the potential to slow down hostname lookups if there are several (a maximum of 3) nameservers to query. I expect the performance impact would be minimal. If he was concerned about the performance hit, he could always enable the netcd daemon to cache DNS lookups locally, which might speed things up for hosts that are referenced frequently.

Which brings me to the netcd daemon. This was first
introduced with AIX 6.1 and is included in the bos.net.tcp.client fileset.

# lslpp -f bos.net.tcp.client | grep netcd

/usr/sbin/netcdctrl

/usr/sbin/netcd

/usr/samples/tcpip/netcd.conf

This new subsystem
can be enabled to help improve network performance and reduce network traffic.
You can configure this daemon to cache answers from DNS, NIS and other server
queries. This daemon is not activated by default in AIX 6.1.

The netcd daemon can cache resolver lookups to a network resource such as a DNS server. It populates its cache with the result of each query. Negative answers are cached as well. When an entry is inserted into the cache, a TTL is associated with it. For DNS queries, the TTL value returned by the DNS server is used (with the default settings). The daemon also checks periodically for expired entries and removes them.

There are a number of configurable options for netcd.
However, on my test LPAR, I simply ran the following command to start the
daemon and test it. I used the lssrc command to get an overview of the
active configuration.

# startsrc -s netcd

# lssrc -ls netcd
Subsystem         Group            PID          Status
 netcd            netcd            569432       active
Debug                     Inactive
Configuration File        /etc/netcd.conf
Configured Cache          local services
Configured Cache          local protocols
Configured Cache          local hosts
Configured Cache          local networks
Configured Cache          local netgroup
Configured Cache          dns services
Configured Cache          dns protocols
Configured Cache          dns hosts
Configured Cache          dns networks
Configured Cache          dns netgroup
Configured Cache          nisplus services
Configured Cache          nisplus protocols
Configured Cache          nisplus hosts
Configured Cache          nisplus networks
Configured Cache          nisplus netgroup
Configured Cache          nis services
Configured Cache          nis protocols
Configured Cache          nis hosts
Configured Cache          nis networks
Configured Cache          nis netgroup
                          yp passwd.byname
                          yp passwd.byuid
                          yp group.byname
                          yp group.bygid
                          yp netid.byname
                          yp passwd.adjunct.byname
Configured Cache          ulm services
Configured Cache          ulm protocols
Configured Cache          ulm hosts
Configured Cache          ulm networks
Configured Cache          ulm netgroup

If you would like
the daemon to start automatically on a system restart, uncomment the following
entry from the /etc/rc.tcpip file.

#start /usr/sbin/netcd "$src_running"

By default, if you start the daemon without configuring its associated configuration file (/etc/netcd.conf), it will start with its default values, so just about everything is cached. If you want to trim down the configuration you can create your own /etc/netcd.conf file. There is a sample file located in /usr/samples/tcpip/netcd.conf. You can copy the file to the /etc directory and use it as a template for your configuration.
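Seeding your own configuration from the sample is as simple as something like this, after which you trim the file down to just the caches you want and restart the daemon:

# cp /usr/samples/tcpip/netcd.conf /etc/netcd.conf
# vi /etc/netcd.conf
# stopsrc -s netcd; startsrc -s netcd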

The netcdctrl command can be used to control and manage the netcd cache(s). You can dump the current contents of a cache,
flush a cache, change the logging level and view statistics. To verify that netcd was caching DNS lookups on my test system, I performed
the following.

- First I
dumped the DNS cache to a file. The contents did not contain any cached DNS lookups
at this point in time.

# netcdctrl -t dns -e hosts -a /tmp/dns.out

# cat /tmp/dns.out

CACHE dns, hosts, name

END CACHE dns, hosts, name

CACHE dns, hosts, address

END CACHE dns, hosts, address

- Next, I performed a DNS lookup of an internet host, ibm.com.

# host ibm.com

ibm.com is 129.42.17.103

- Again, I dumped the contents of the cache. Now I could see a
cached entry for ibm.com.

# netcdctrl -t dns -e hosts -a /tmp/dns.out

# cat /tmp/dns.out

CACHE dns, hosts, name

>>>>>>>>>>>>>>>>>>>>>>>>>>>>
ELEM #1

Expiration date : Wed Jan 27 07:50:24 2010

Ulm or resolver name : dns

Query type : 10100002

Query length : 7

Answer (0: positive; otherwise : negative) : 0

Query key : 1264134311

String used in query : ibm.com

Additional parameters in query:

query param1 : 2

query param2 : 0

Length of cached element : 37

################### hostent

Number of aliases = 0

Number of addresses = 3

Type = 2

Length = 4

Host name = ibm.com

Alias =

Address = 129.42.17.103

Address = 129.42.18.103

Address = 129.42.16.103

#################### end of hostent

>>>>>>>>>>>>>>>>>>>>>>>>>>>>
END ELEM #1

END CACHE dns, hosts, name

CACHE dns, hosts, address

END CACHE dns, hosts, address

It is also possible to flush the cache if something
is stale and needs to be refreshed manually.

# netcdctrl -t dns -e hosts -f

The netcd daemon can cache lookups for all sorts of resolver
queries (not just DNS). Some of these include local (/etc/hosts), NIS, NIS+ and
YP.

Essentially
this program is designed to give IBM customers, ISVs and IBM BPs the
opportunity to gain early experience with the latest release of AIX prior to
general availability. This is a great time to join forces and help IBM mould
the next generation of the AIX OS.

I got
involved in the AIX 6 Open
Beta back in 2007. It was a worthwhile experience. The time I
spent learning new features like WPARs and RBAC, put me in a good position when
it came time to actually implement these outside of my lab environment. It was
also a good opportunity to provide feedback to the IBM AIX development
community. Several AIX developers monitored the comments/questions in the Beta
Forum and provided advice (and sometimes fixes) for known (and unknown!) issues
with the beta release. It also provided the developers with plenty of real
world feedback that they could take back to the labs, long before the product
was officially released. This certainly helped fix bugs and improve certain enhancements before customers started using the OS in their computing environments.

The Getting Started guide provides useful
information that you will need to know before attempting to install the OS. For
example, the beta code will run on any IBM System p, eServer pSeries or POWER system
that is based on PPC970, POWER4, POWER5, POWER6 or POWER7 processors.

The guide
also describes what new functionality has been included in this release of the
beta. If this program is anything like the AIX 6 beta, there may be more than
one release of the code, with further enhancements available in each release.
The new function in this release includes:

AIX 5.2 Workload Partitions for AIX 7 provides the capability to create a WPAR running AIX 5.2 TL10 SP8. This allows a migration path for an AIX 5.2 system running on old hardware
to move to POWER7. All that is required is to create a mksysb image of the AIX
5.2 system and then provide this image when creating the WPAR. The WPAR must be
created on a system running AIX 7 on POWER7 hardware. This is a very
interesting feature, one that I am eager to test.

B. Removal of WPAR local storage device restrictions.

AIX 7 will allow for exporting a virtual or physical fibre channel
adapter to a WPAR. The WPAR will essentially own the physical adapter and its
child devices. This will allow for SAN storage devices to be directly assigned to
the WPAR's FC adapter(s). This means it will not be necessary to provision the storage
in the Global environment first and then export it to the WPAR. This is also
interesting as we may now be able to assign SAN disk to a WPAR for both rootvg
and data volume groups. Maybe even FC tape devices within a WPAR will work?

D. Etherchannel enhancements in 802.3ad mode.

There are some enhancements to AIX 7 EtherChannel support
for 802.3AD mode. The enhancement makes sure that the link is LACP ready before
sending data packets. If I’m interpreting this correctly, this will ensure that
the aggregated link is configured appropriately. If it’s not, it will provide
an error in the AIX errpt stating that the link is not configured correctly.
This can help avoid situations where the AIX EtherChannel is configured but the
Network Switch is not. At present there is very little an AIX administrator can
do as the link will appear to be functioning even if the Switch end has not
been configured for an aggregated link.

And
the most important point in the Getting
Started guide has to be, how to install the AIX beta code! An ISO image of
the code is provided for download. The installation steps are straightforward
as the image can be installed via a DVD device. Using a media repository on a
VIO server could be one way to accomplish this task. Unfortunately, there is no
mention of NIM install support yet. Here are the basic steps from the guide:

Installing the AIX Open Beta Driver

The
AIX 7 Open Beta driver is delivered by restoring a system backup (mksysb) of
the code downloaded via DVD ISO image from the AIX Open Beta web-site.

Once you have downloaded and created the AIX 7 Open Beta media (as described above), follow these steps to install the ‘mksysb’.

1. Put the DVD of the AIX Open Beta in the DVD drive. A series of screens/menus will be displayed. Follow the instructions on the screens and make the following selections:

• Type 1 and press Enter to have English during install.

• Type 1 to continue with the install.

• Type 1 to Start Install Now with Default Settings.

2. The system will start installing the AIX 7.0 BETA.

3. Upon completion of the install, the system will reboot. You can then login as “root”; no password is required.

Next I recommend that you take a look at the Release Notes. They provide a few bits of information that may come in handy when planning for the install, such as:

· The Open Beta code is being delivered via an “mksysb” install image. Migration installation is not supported with the open beta driver.

· The open beta driver does not support IBM Systems Director Agent.

· When installing on a disk smaller than 15.36 GB, the following warning is displayed: "A disk of size 15360 was specified in the bosinst.data file, but there is not a disk of at least that size on the system." You can safely ignore this warning.

· The image is known to install without issues on an 8 GB disk.

· oslevel output shows V7BETA.

By the way, if you are unable to find a spare system or LPAR on which to install the beta, perhaps you can consider using the IBM Virtual Loaner Program (VLP). They are planning to support LPARs running the AIX 7 Open Beta starting from July 17th. I use the VLP all the time and have found it to be a fantastic way to try new things without the need for, or expense of, my own IBM POWER system. There are some drawbacks, such as not having access to your own dedicated hardware, HMC, VIO server or NIM master, but it’s still great if you just want to test something on an AIX system.

I’ll
report back once I’ve got my AIX 7 Open Beta system up and running!

I received an email this week from a
colleague that worked with me on the NIM
Redbook back in 2006. He was experiencing an issue with DSM and NIM. He was
attempting to use the dgetmacs
command to obtain the MAC address of the network adapters on an LPAR. The
command was failing to return the right information.

I experienced this very issue during
the writing of the AIX 7.1 Differences
Guide Redbook. And given that I was in Austin, sitting in the same building
as the AIX development team, I was able to speak with the developers directly
about the issue. At that time they provided me with the following workaround.

First they asked me to check the size
of the /usr/lib/nls/msg/en_US/IBMhsc.netboot.cat message
catalog file.

# ls -l /usr/lib/nls/msg/en_US/IBMhsc.netboot.cat
-rw-r--r--    1 bin      bin         3905 Aug 08 09:54 /usr/lib/nls/msg/en_US/IBMhsc.netboot.cat

They were surprised to find that the file appeared to be “too small”. They promptly sent me the catalog file from one of their development AIX 7.1 systems. I replaced the file as follows:

# cd /usr/lib/nls/msg/en_US/

# ls -ltr IBMhsc*
-rw-r--r--    1 bin      bin         3905 Aug 08 09:54 IBMhsc.netboot.cat

# cp -p IBMhsc.netboot.cat IBMhsc.netboot.cat.old

# cp /tmp/lpar1/IBMhsc.netboot.cat.new IBMhsc.netboot.cat

# ls -ltr IBMhsc*
-rw-r--r--    1 bin      bin         3905 Aug 08 09:54 IBMhsc.netboot.cat.old
-rw-r--r--    1 bin      bin        26374 Dec 23 11:24 IBMhsc.netboot.cat

This fixed the problem for me during
the residency.

So I asked my friend to do the same
(after I sent him the message catalog file). He ran the dgetmacs command again and this time it returned the MAC address
for all the network adapters in his LPAR. Success!

This is something
that I experience on all new Power/AIX systems that I install:

Systems migrated to AIX 5.3 (or later) might experience a double boot

When booting AIX Version 5.3 (or
later) on a system that has previously been running an earlier release of AIX, you
may notice that the system automatically reboots and restarts the boot process.
This is how the firmware processes changed information in the boot image. This
reboot also occurs if the process is reversed. A system previously running AIX
5.3 (or later) that is booting a release of AIX prior to 5.3 goes through the
same process. This
″double boot″ occurs only once; if the stored value does not change,
then the second boot does not occur. If you install AIX 5.3 (or later) and
continue to use only that version, this double boot occurs once, and it occurs
only if your system was running a pre-AIX 5.3 release before you boot AIX 5.3
(or later). Systems
that are preinstalled with AIX 5.3 (or later) and use only that version do not
experience the ″double boot.″

Some of my favourite AIX update and migration tools (nimadm and multibos) have been enhanced to perform cool new tricks! In particular, multibos now supports AIX migrations, not just updates to newer TLs and SPs. Unfortunately, from what I've read, you need to be running AIX 6.1 first, so I won't be able to use multibos to migrate from AIX 5.3 to 6.1. I'll use nimadm instead. Some brief details are provided below. Refer to the online AIX documentation for more information.

Enhanced nimadm command.

The nimadm command is enhanced to allow the system administrator to do the following:

- Use a NIM client’s rootvg to create a NIM mksysb resource that has been migrated to a new version or release level of AIX.

- Use a NIM mksysb resource to create a NIM mksysb resource that is migrated to a new version or release level of AIX.

- Use a NIM mksysb resource to restore to a free disk, or disks, on a NIM client and simultaneously migrate to a new version or release level of AIX.

- To create a new migrated mksysb resource of a client named nim1, type the following:
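Based on the flags documented for this enhancement, the command would look something like the sketch below (all resource names are illustrative; -c is the client, -l the lpp_source, -s the SPOT at the target level, -j the volume group used for the cache filesystems, and -N the name of the new migrated mksysb resource):

# nimadm -c nim1 -l lpp_source_710 -s spot_710 -j nimadmvg -N nim1_710_mksysb -Y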

I’ve
written about multibos before, here and here. But recently I
started experimenting with multibos mksysb migration. A customer asked me how
this worked and apart from a high-level view I wasn’t able to provide any real
world experience, so I thought I’d give it a try. What follows is just a ‘brain
dump’ from my quick test.

First of all
this isn’t really a migration. It just simply populates a second instance of
AIX with a higher-version. It doesn’t really migrate (or merge) your existing
configuration into the second instance. So I’m not sure how useful this feature
really is right now.

Starting with 5.3 TL9 you can add a 6.1 TL2 (or above) instance. This is done with the new -M flag. You must be running the 64-bit kernel.

This isn’t really a migration because it populates the second instance using a
mksysb based on the new release.

In 6.1 TL2 a new flag (-M) was added to the mksysb command which allows you to
create a mksysb for use with multibos. It creates a backup of BOS (/, /usr,
/var, /opt).
bos.alt_disk_install.boot_images must be installed.

It is not advised to run in this environment for an extended period of time.
There could be problems if tfactor or maps are used. Be aware that 6.1 specific
attributes may not be reflected in the standby instance.

So in my
lab environment I have two AIX LPARs. One is running AIX 6.1 and the other
running AIX 7.1.

First I
take a mksysb (with the –M flag) of the AIX 7.1 system to a file. This file
will be called by multibos to populate the second instance.

aix7[/] > mksysb -Mie /data/aix7-mksysb

Creating information file (/image.data) for rootvg.

Creating list of files to back up.

Backing up 71643 files.....
71643 of 71643 files (100%)
0512-038 mksysb: Backup Completed Successfully.

aix7[/] > ls -ltr /data
total 4276112
drwxr-xr-x    2 root     system          256 Feb 21 20:59 lost+found
-rw-r--r--    1 root     system   2189363200 Feb 21 21:06 aix7-mksysb

I copied this
file over to my AIX 6.1 system. This was the system that was to be ‘migrated’.
The next step was to perform a preview of the multibos operation.
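The preview is essentially the standard multibos setup operation pointed at the mksysb image; roughly along these lines (a sketch: -s sets up the standby instance, -X expands filesystems if required, -p previews, and -M is the new flag that takes the mksysb file):

root@aix6 /# multibos -Xsp -M /data/aix7-mksysb
root@aix6 /# multibos -Xs -M /data/aix7-mksysb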

Upon
checking my bootlist output, I
noticed (as expected) that the list now contained two extra entries for bos_hd5. These were the boot logical
volume entries for the second instance. If I was to boot from this LV I’d be
booting into AIX 7.1. Cool.

root@aix6 /# bootlist -m normal -o

hdisk0 blv=bos_hd5

hdisk0 blv=bos_hd5

hdisk0 blv=hd5

hdisk0 blv=hd5

So at this
point, I’d created a second instance of AIX running 7.1. My current version of
(running) AIX was AIX 6.1. All I had to do now was reboot the LPAR and let it
restart as an AIX 7.1 system.

root@aix6 /# oslevel -s

6100-01-05-0920

root@aix6 / # shutdown -Fr

The LPAR
rebooted successfully and I found I was now running AIX 7.1, just as I’d hoped.

aix6[/] > oslevel -s

7100-00-01-1037

If I wanted
to go back to AIX 6.1, I would change my bootlist setting again and restart the
LPAR.
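Switching back is simply a matter of pointing the bootlist at the original boot logical volume (hd5) and rebooting; a quick sketch:

aix6[/] > bootlist -m normal hdisk0 blv=hd5
aix6[/] > bootlist -m normal -o
aix6[/] > shutdown -Fr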

Now that
I’ve actually tried this method of migration, I’m not sure I’d actually use it
in its current form.

Although the migration keeps my hostname and IP address, the file systems are not shared between instances and most of the target system's configuration is not retained. For example, any user accounts I create on my AIX 6.1 system would also need to be created on the existing AIX 7.1 system that I used to create the AIX 7.1 mksysb image. It reminds me a little of a preservation install.

IBM made some announcements today relating to
their latest POWER7 server offerings. The new line of systems includes new
entry level systems and the highly anticipated high-end system, the POWER7 795!
They also officially outlined some of
the new features available in AIX 7.1. You can review the details here.
I’ve discussed some of these new features here
and here.
The official AIX 7.1 announcement details are available here.

The announcement got me thinking about my
recent customer engagements and why some have chosen to deploy AIX into their
IBM POWER environments, while others are considering a Linux on POWER solution.

I’ve found that it usually comes down to a
skills decision more than anything else. Most customers are happy to either
continue working with AIX (if they are existing AIX users) or migrate from
another UNIX OS to AIX. I’ve seen very few customers actually migrate to Linux
on POWER, but I’ve worked with several that have seriously considered it. Those
that have chosen to deploy Linux are doing so purely because they have in-house
Linux skills. They are concerned that migrating to AIX may be too big a jump
for their technical staff. I find this thinking interesting, as most of the
customers I’ve dealt with who run other UNIX OS’s like Tru64, Solaris or HP-UX
are more than happy to migrate to AIX. They believe the move is relatively
minor and doesn’t require massive re-training of their UNIX admins. I tend to
agree.

For me, AIX
is my preferred “Enterprise class” UNIX Operating System. Notice I’m prefacing
this with the words Enterprise class. Don’t get me wrong, I have worked with Linux
systems in both small and large customer environments. It is a great OS. But
I’ve found that it really only fits into environments that have a relatively
small number of users and where significant downtime can be tolerated for
things like operating system maintenance. This doesn’t fit the Enterprise class
of UNIX server OS’s that I’m thinking of here. When I contemplate the word Enterprise, I think of servers and
operating systems that can respond to business demands in terms of performance,
reliability, stability and availability. An Enterprise UNIX can provide all of
these things without compromise. Linux can offer performance and reliability
(in my opinion). However, from what I’ve seen, it lacks features & functions
in the areas of stability and availability. AIX on the other hand ticks all the
boxes. Again, this is just my opinion based on my experiences with both AIX and
Linux in the Enterprise landscape. Others will no doubt have their own
experiences that may or may not match my own.

So when I’m designing an Enterprise UNIX
server environment for a customer, I always start with an AIX on POWER base. If
the customer wants Linux, sure I can look at that too, but I strongly recommend
AIX as my preferred choice for large systems. Most of my customers are running
relatively large SAP/Oracle systems. AIX on POWER is a great combination for
large Enterprise systems. If you need to deploy large database systems that
must service tens of thousands of users (like a big SAP system), then I believe
AIX is the perfect OS on which to provide a platform for these large scale
systems.

AIX is a very mature and powerful UNIX OS. It has been a major player in the UNIX server market for over 20 years. Some people are just not aware of how mature, robust and stable the AIX OS has become over the years. There are many impressive aspects of the OS in the areas of performance, scalability, reliability, management and administration.

Just looking at some of the administration
capabilities built into AIX are enough for me to always recommend AIX over
Linux (or any other UNIX OS), when it comes to large Enterprise servers.

For example, the System Management Interface Tool (SMIT) can make the UNIX admin's life a lot simpler. This is an interactive tool that is part of the AIX OS. Almost all tasks that an AIX administrator may need to perform can be executed using this tool. It is a text-based tool (there is also an X interface, but I recommend sticking with the text-based menus). Everything it does, it does through standard AIX commands and Korn shell functions. This feature is especially useful when you need to automate a repetitive task; you can have SMIT create the proper command-line sequence, and you can then use those commands in your own script. My compatriot, Anthony English, has a nice intro to SMIT on his AIX blog.
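As a quick illustration, every command SMIT builds is logged to $HOME/smit.script, and the -x flag builds the command without running it, which is handy for harvesting commands for your own scripts (the fastpath shown here is just an example):

# smit -x chfs
# tail -2 $HOME/smit.script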

The
AIX Logical Volume Manager (LVM)
is built into the OS, for free. AIX LVM helps UNIX system administrators manage their
storage in a very flexible manner. LVM allows logical
volumes to span multiple physical volumes. Data on logical volumes appears to
be contiguous to the user, but might not be contiguous on the physical volume.
This allows file systems, paging space, and other logical volumes to be resized
or relocated, span multiple physical volumes, and have their contents
replicated for greater flexibility and availability. It provides capabilities for mirroring data across disks, migrating data across disks and storage subsystems, expanding and shrinking filesystems, and more, all of which can be performed dynamically, with no downtime required. The concept, implementation and interface of the AIX LVM is one of a kind. All of its features support the ‘continuous availability’ philosophy.
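A couple of everyday examples of what that means in practice (the filesystem and disk names are illustrative): growing a filesystem and evacuating a disk, both while the system stays online.

# chfs -a size=+1G /oradata
# migratepv hdisk2 hdisk3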

One of the biggest reasons that I love AIX
over Linux is the mksysb. It’s built
into the OS and allows you to create a bootable image of your AIX system. This
image can be used to restore a broken AIX system or for cloning other systems.
The cloning feature is truly amazing. You can take an image created on a
low-end system and deploy it on any POWER system, all the way up to the
high-end POWER boxes. This simplifies the installation and cloning process when you need to install and manage many AIX LPARs. By using an SOE mksysb
image you can deploy consistent AIX images across your Enterprise POWER server
environment.

This brings me to another wonderful feature of
AIX, the Network Installation Manager
(NIM).
NIM is a powerful network installation tool (comparable to Linux Kickstart).
Using NIM you can backup/restore, update and upgrade one or more AIX systems
either individually or simultaneously. This can all be achieved over a network
connection, removing the need for handling physical installation media forever.

Another fine example of AIXs superior OS management tools is multibos. This tool allows an AIX administrator to
create and maintain two separate, bootable instances of the AIX OS within the
same root volume group (rootvg). This second instance of rootvg is known as a
standby Base Operating System (BOS) and is an extremely handy tool for
performing AIX TL and Service Pack (SP) updates. Multibos lets you install,
update and customize a standby instance of the AIX OS without impacting the running and active production instance of the
AIX OS. This is valuable in environments with tight maintenance windows.
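A rough sketch of a typical multibos update cycle looks like this (the update directory is illustrative): -Xs creates the standby instance, -Xac -l applies updates from the given directory to it, and the reboot brings the updated instance online.

# multibos -Xs
# multibos -Xac -l /export/aix/6100-06
# bootlist -m normal -o
# shutdown -Fr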

When it comes to upgrading the OS to a new
release of AIX, the nimadm
utility can assist the administrator greatly in this task. The nimadm
utility offers several advantages. For example, a system administrator can use nimadm to create a copy of a
NIM client's rootvg and migrate the disk to a newer version or release of AIX.
All of this can be done without
disruption to the client (there is no outage required to perform the
migration). After the migration is finished, the only downtime required will be
a single scheduled reboot of the system.

AIX 6.1 introduced a new capability that most UNIX operating systems are still working on: Concurrent Updates of the AIX Kernel, without a reboot! IBM is always working hard at making AIX an OS
that can provide continuous availability,
even if it needs to be patched. AIX now has the ability to update certain kernel
components and kernel extensions in place, without needing a system reboot. In
addition, concurrent updates can be removed from the system without needing a
reboot. Can you do this with other UNIX OSs?

And that’s just some of the features that make
AIX the only UNIX OS that I recommend
for Enterprise systems. There are many more tools and features that I couldn’t
live without (like alt_disk_install, savevg, installp, WPARs, etc, the list
goes on). If you are new to AIX and you are considering what your next UNIX OS
should be, then I recommend you take a very
close look at AIX.

And finally, the support provided by IBM is first class. Whenever I’ve needed assistance with an AIX issue or query, I have
always received timely, professional and useful advice. On the rare occasions
where I’ve uncovered a new bug, IBM AIX support have always been quick to
provide me with an interim fix to resolve or workaround a problem. That’s the
sort of support you’d expect for an Enterprise UNIX OS, isn’t it? What’s the
support like from your current UNIX (or Linux) OS vendor?

Linux is still a viable UNIX operating system.
However, I think it’s more suited to certain workloads like small to medium
size mail, web and other utility servers and services. AIX, however, would be
my platform of choice for my 10TB Oracle database running SAP ERP, not just for
performance reasons, but primarily because of the system administration
features of AIX that allow me to support and manage the system without
impacting my customers or enforcing reboots/outages whenever I need to change
something on the system.

All IBM need to do is create their own Linux
distribution (Blue Linux perhaps?)
that has all the features of AIX built in and then I’m sold! But why would
they? We already have AIX.