I was contacted recently by a customer who was attempting to restore an AIX 5.3 Versioned WPAR (VWPAR) from backup using NIM. The restore worked OK but the data was restored to the wrong volume group!

When the VWPAR was created, the -g option was specified with mkwpar to force the creation of the VWPAR file systems in a separate volume group (named wparvg) rather than the default location of the Global root volume group (rootvg).

# mkwpar -g wparvg -n p8vw2 -B /cg/53gibbo.mksysb -C -O

Running lsvg against wparvg confirmed the file systems were in the right location after creation.

# lsvg -l wparvg

wparvg:

LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT

loglv00 jfs2log 1 1 1 open/syncd N/A

fslv02 jfs2 4 4 1 open/syncd /wpars/p8vw2

fslv03 jfs2 2 2 1 open/syncd /wpars/p8vw2/home

fslv04 jfs2 12 12 1 open/syncd /wpars/p8vw2/opt

fslv05 jfs2 6 6 1 open/syncd /wpars/p8vw2/tmp

fslv06 jfs2 56 56 1 open/syncd /wpars/p8vw2/usr

fslv07 jfs2 12 12 1 open/syncd /wpars/p8vw2/var

Before handing the VWPAR over for production use, the customer wanted to ensure they could successfully back up and recover the VWPAR using NIM. First, they took a backup of the VWPAR by creating a savewpar backup image resource (p8vw2-backup) on the NIM master.
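For those unfamiliar with the NIM side of this, the backup image is held on the NIM master as a resource of type savewpar. I don't have the customer's exact steps, but a minimal sketch of defining such a resource from an existing image would look something like this (the location path is an assumption; smit nim_mkres offers the same function through menus):

# nim -o define -t savewpar -a server=master -a location=/export/nim/savewpar/p8vw2-backup p8vw2-backup   # location path is an assumption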

In the Global environment, we then stopped and removed the VWPAR (p8vw2).

# stopwpar -F p8vw2

# rmwpar -F p8vw2

Back on the NIM master, we attempted to restore the VWPAR from the recently created backup image (p8vw2-backup).

# smit nim_wpar_create

p8vw2

Create a Managed Workload Partition

Type or select values in entry fields.

Press Enter AFTER making all desired changes.

[Entry Fields]

* Target Name [p8vw2]

Remain NIM client after install? [yes] +

Specification Resource [] +

WPAR Options

WPAR Name p8vw2

Resource for WPAR Backup Image [p8vw2-backup] +

Resource for System Backup Image [] +

Alternate DEVEXPORTS for installation [] +

Alternate SECATTRS for installation [] +

The restore completed successfully but, to our surprise, the VWPAR file systems were in the Global rootvg, not wparvg.

# lsvg -l rootvg

rootvg:

LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT

hd5 boot 1 1 1 closed/syncd N/A

hd6 paging 8 8 1 open/syncd N/A

hd8 jfs2log 1 1 1 open/syncd N/A

hd4 jfs2 6 6 1 open/syncd /

hd2 jfs2 35 35 1 open/syncd /usr

hd9var jfs2 7 7 1 open/syncd /var

hd3 jfs2 2 2 1 open/syncd /tmp

hd1 jfs2 1 1 1 open/syncd /home

hd10opt jfs2 5 5 1 open/syncd /opt

hd11admin jfs2 2 2 1 open/syncd /admin

lg_dumplv sysdump 16 16 1 closed/syncd N/A

livedump jfs2 4 4 1 open/syncd /var/adm/ras/livedump

cglv jfs2 100 100 1 open/syncd /cg

fslv02 jfs2 2 2 1 open/syncd /wpars/p8vw2

fslv03 jfs2 1 1 1 open/syncd /wpars/p8vw2/home

fslv04 jfs2 6 6 1 open/syncd /wpars/p8vw2/opt

fslv05 jfs2 3 3 1 open/syncd /wpars/p8vw2/tmp

fslv06 jfs2 28 28 1 open/syncd /wpars/p8vw2/usr

fslv07 jfs2 6 6 1 open/syncd /wpars/p8vw2/var

# lsvg -l wparvg

wparvg:

LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT

loglv00 jfs2log 1 1 1 closed/syncd N/A

We attempted the restore again but this time we explicitly included a WPAR “Specification Resource”. We did this to ensure that the restwpar process was using the correct specification file.

# smit nim_wpar_create

p8vw2

Create a Managed Workload Partition

Type or select values in entry fields.

Press Enter AFTER making all desired changes.

[Entry Fields]

* Target Name [p8vw2]

Remain NIM client after install? [yes] +

Specification Resource [p8vw2-spec] +

WPAR Options

WPAR Name p8vw2

Resource for WPAR Backup Image [p8vw2-backup] +

Resource for System Backup Image [] +

Alternate DEVEXPORTS for installation [] +

Alternate SECATTRS for installation [] +

We created a WPAR Specification NIM resource. The file was created in the Global environment, using the mkwpar command to write out the VWPAR specification details to a text file. This file was then copied to the NIM master to be used to create the NIM resource.

# mkwpar -e p8vw2 -w -o /tmp/cg/p8vw2_cg.cf
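For completeness, a sketch of how the copied specification file can then be defined as a NIM resource on the master (the location below matches the lsnim output that follows; smit nim_mkres works just as well):

# nim -o define -t wpar_spec -a server=master -a location=/tmp/cg/p8vw2_cg.cf p8vw2-spec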

# lsnim -l p8vw2-spec

p8vw2-spec:

class = resources

type = wpar_spec

Rstate = ready for use

prev_state = unavailable for use

location = /tmp/cg/p8vw2_cg.cf

alloc_count = 1

server = master

# lsnim -t wpar_spec

p8vw2-spec resources wpar_spec

The specification file contained the volume group name (wparvg) where each of the VWPAR file systems was located.

# grep vg /tmp/cg/p8vw2_cg.cf

rootvgwpar = "no"

vg = "wparvg"

vg = "wparvg"

vg = "wparvg"

vg = "wparvg"

vg = "wparvg"

vg = "wparvg"

# grep -p vg /tmp/cg/p8vw2_cg.cf

general:

version = "1"

name = "p8vw2"

hostname = "p8vw2"

checkpointable = "no"

directory = "/wpars/p8vw2"

privateusr = "yes"

uuid = "3e7a2bfb-6060-4770-ad7e-4d6b2a84f657"

devices = "/etc/wpars/devexports"

architecture = "none"

ostype = "1024"

xwparipc = "no"

auto = "no"

rootvgwpar = "no"

preserve = "no"

routing = "no"

mount:

logname = "/dev/loglv00"

directory = "/home"

vfs = "jfs2"

vg = "wparvg"

size = "131072"

mount:

logname = "/dev/loglv00"

mountopts = "rw"

directory = "/opt"

vfs = "jfs2"

vg = "wparvg"

size = "786432"

mount:

logname = "/dev/loglv00"

directory = "/var"

vfs = "jfs2"

vg = "wparvg"

size = "786432"

mount:

logname = "/dev/loglv00"

directory = "/tmp"

vfs = "jfs2"

vg = "wparvg"

size = "393216"

mount:

logname = "/dev/loglv00"

directory = "/"

vfs = "jfs2"

vg = "wparvg"

size = "262144"

mount:

logname = "/dev/loglv00"

mountopts = "rw"

directory = "/usr"

vfs = "jfs2"

vg = "wparvg"

size = "3670016"

However, even with the specification file in place, the result was the same and the VWPAR file systems were created in rootvg rather than wparvg.

Note: Both the Global environment and the NIM master were running AIX 7100-03-04-1441.

We were able to request an ifix from AIX support. Once we installed the ifix in the Global environment, the restore process via NIM worked as expected and the VWPAR file systems were recovered in wparvg. We did not need to use the WPAR specification NIM resource.
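As a side note, interim fixes are installed with the emgr command. A minimal sketch, assuming a hypothetical epkg file name (the actual ifix number isn't listed here):

# emgr -e /tmp/IVnnnnn.150519.epkg.Z    # epkg file name is hypothetical
# emgr -l                               # list the interim fixes currently installed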

The AIX minidump facility was introduced with AIX 5.3 TL3. A mini dump is a small compressed dump that is stored to NVRAM when a system crashes or a dump is initiated, and then written to the AIX error log on reboot. It can be used to see some of the system’s state and do some debugging when a full dump is not available. It can also be used to get a quick snapshot of a crash without having to transfer the entire dump from the crashed system to IBM support.
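A quick way to check whether a system has any minidump data is to look in the AIX error log. A sketch (treat the grep string as an assumption and verify against the guide referenced below):

# errpt | grep -i minidump     # minidump entries appear in the error log; grep string is an assumption
# errpt -a -j <identifier>     # detailed view of a specific entry, including the compressed dump data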

Please refer to the official guide, "How to examine a minidump in AIX".

"Using this crash stack IBM support personnel can then search through the database to find what the fault may mean".

"The RAS effort mentioned..is part of an ongoing effort by AIX to increase stability and to make more information available for troubleshooting when a problem occurs. The ability to look at minidump data has helped solve many issues that would otherwise go unresolved".

Of course, minidumps have their limitations and do not replace the need for a full system dump in many cases.

"Limitations of minidumps
A minidump is of limited or no use in situations where a server has hung. In this situation we can use mdmprpt to see what was running on each cpu. Practically speaking a full dump is usually needed to determine how and why a server has hung."

Here are some examples of how I've used minidump to assist me (and IBM support) in diagnosing the root cause of a system crash.

The first example is from a customer who found that one of their AIX partitions would crash when they ran DPO (Dynamic Platform Optimiser, via the HMC optmem command) against one of their new POWER8 E880s. Immediately after optmem ran, one specific LPAR would crash. The LPAR reference LED code would show "888 102 300 C20". This LPAR was installed with AIX 7.1 TL3 SP4. When we tried to restart the partition, it would crash (and dump) several times before it would start successfully. Once the partition booted successfully, we noticed that there were several system dumps (SYSDUMP) and minidumps (COMPRESSED MINIMAL DUMP) shown in the AIX error report.

In this case, the stack trace showed messages relating to trc_generate and trc_inmem_recor. Using this information, I was able to find several hits (both internally within IBM and using my preferred internet search tool) for the potential cause of the crash. The problem was a known issue, related to an existing APAR.

We verified our findings with IBM support (who performed their own internal search) and we concluded that an update would be required. We chose to update the AIX system to AIX 7.1 TL4 SP3, and the problem went away.

In the next example, the customer had updated one of their AIX partitions to AIX 7.1 TL4 SP3 (previously on AIX 7.1 TL4 SP1). This LPAR also housed a single AIX 5.3 versioned WPAR. The update was successful, but whenever they ran the stopwpar command, the entire partition would crash (dump). So, once again, I employed the mdmprpt utility to check the stack traces.
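I won't reproduce the full report here, but the way I generate it is roughly as follows (the -l flag, which takes the error-log sequence number of the minidump entry, is how I recall invoking the tool; check the guide referenced earlier if your level differs):

# /usr/lib/ras/mdmprpt -l <sequence_number> > /tmp/minidump.rpt    # sequence number comes from errpt; -l usage is from memory

The resulting report includes the stack trace for each CPU at the time of the crash.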

The report showed us pcmUserGetDevIn and sddUserInterfac in the stack trace. Both are related to the IBM sddpcm device driver, so they appeared to be the likely culprits. Searching for these presented us with a couple of potential reasons for the crash, such as too many paths configured for an sddpcm device (which we checked and confirmed was far fewer than the maximum of 32). So, my next question was: what version of sddpcm was installed? We found that the customer (unexpectedly) had two different versions of sddpcm installed. In the Global LPAR, the sddpcm version was 2.6.9.0 (the latest) and in the vWPAR, 2.6.6.0. During the TL update, the customer had updated sddpcm to the latest level available, but had forgotten to update sddpcm inside the vWPAR as well. We then discovered that simply running "pcmpath query device" from inside the vWPAR would crash the partition. The latest sddpcm readme file contained the following entry: "5557 Fix for AIX 7.1 crash during pcmpath query device"! This was indeed the cause of the system crash. After the customer updated sddpcm inside the vWPAR, the system was stable and the problem was resolved.
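A quick way to compare driver levels between the Global environment and a versioned WPAR is to query the filesets in both places. A sketch (the sddpcm fileset name pattern is illustrative and varies by AIX level):

# lslpp -l "devices.sddpcm*"    # sddpcm level in the Global environment
# clogin <vwpar_name>
# lslpp -l "devices.sddpcm*"    # run the same query again, this time inside the vWPAR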

If you encounter a system crash in the future, and there's minidump data available, why not consider using the minidump reporting tool to analyse the issue? It might just help speed up the root cause analysis of the problem.

"I tested in the lab, for a versioned wpar you can use it more like a regular LPAR - extend a disk into rootvg, mirror on it, unmirror from the old one, reduce out and rmdev. No bosboot or bootlist needed. There is no bootset in a versioned wpar for some reason.”

I have a rootvg WPAR that is on one disk. Is there a method to move it to a new disk?

Answer

There may be an occasion where you have created a rootvg WPAR on a specific disk and you want to move the entire WPAR to another disk. One example might be that the original disk is from an older storage enclosure, and you wish to move the WPAR to newly purchased storage, connected to the system.

You can do this by means of an alternate bootset. Similar to how using the alt_disk_copy command in a global LPAR will create a copy of rootvg on another disk, an alternate bootset is a copy of a WPAR's rootvg on another disk.

The example in this technote will use a rootvg WPAR that is on a single disk (hdisk11) and has private /opt and /usr file systems (a.k.a. a "detached" WPAR). This was initially created using mkwpar options along the lines sketched below.
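A minimal sketch of creating a detached rootvg WPAR on hdisk11 (the WPAR name rwpar1 is hypothetical, and the exact options in the original technote may differ):

# mkwpar -n rwpar1 -l -O -D devname=hdisk11 rootvg=yes    # rwpar1 is a hypothetical name

Here -l gives the WPAR private, writable /usr and /opt file systems, -O allows mkwpar to overwrite any existing volume group data on the disk, and -D devname=hdisk11 rootvg=yes makes hdisk11 the WPAR's root volume group.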

That bootset also believes hdisk9 is "hdisk0" from its point of view, and the other disk is hdisk1. Notice that the bootset ID has not changed: bootset 0 is still on (global) disk hdisk11 and bootset 1 is on (global) disk hdisk9.

One of my customers was configuring a new AIX 5.3 Versioned WPAR when they came across a very interesting issue. I thought I'd share the experience here, just in case anyone else comes across the problem. We configured the VWPAR to host an old application. The setup was relatively straightforward: restore the AIX 5.3 mksysb into the VWPAR, export the data disk from the Global environment into the VWPAR, import the volume group and mount the file systems. Job done! However, we noticed some fairly poor performance during application load tests. After some investigation we discovered that disk I/O performance was worse in the VWPAR than on the source LPAR. The question was, why?

We initially suspected the customer's SAN and/or the storage subsystem, but both of these came back clean, with no errors or configuration issues. In the end, the problem was related to a lack of ODM attributes in the PdAt object class, which prevented the VWPAR disk from using the correct queue depth setting.

Let me explain by demonstrating the problem and the workaround.

First, let’s add a new disk to a VWPAR. This will be used for a data volume group and file system. The disk in question is hdisk3.

# uname -W

0

# lsdev -Cc disk

hdisk0 Available Virtual SCSI Disk Drive

hdisk1 Available Virtual SCSI Disk Drive

hdisk2 Defined Virtual SCSI Disk Drive

hdisk3 Available Virtual SCSI Disk Drive <<<<<<

We set the disk queue depth to an appropriate number, in this case 256.

Note: This value will differ depending on the storage subsystem type, so check with your storage team and/or vendor for the best setting for your environment.

# chdev -l hdisk3 -a queue_depth=256

hdisk3 changed

Using the lsattr command, we verify that the queue depth attribute is set correctly in both the ODM and the AIX kernel.

# lsattr -El hdisk3 -a queue_depth

queue_depth 256 Queue DEPTH True

# lsattr -Pl hdisk3 -a queue_depth

queue_depth 256 Queue DEPTH True

We can also use kdb to verify the setting in the kernel. Remember, at this stage we are concentrating on hdisk3, which is referenced by a specific kernel device address in kdb.
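The kdb output itself isn't reproduced at this point, but the check is essentially the same one shown later in this article, using the kernel device address reported for hdisk3:

# echo scsidisk 0xF1000A01C014C000 | kdb | grep queue_depth
        ushort queue_depth = 0x100;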

From the output, we can see that the queue depth is set correctly, i.e. 0x100 in hex (256 in decimal).

Next, we export hdisk3 to the VWPAR using the chwpar command. The disk, as expected, enters a Defined state in the Global environment. It is known as hdisk1 in the VWPAR.

# chwpar -D devname=hdisk3 p8wpar1

# lswpar -D p8wpar1 | head -2 ; lswpar -D p8wpar1 | grep hdisk

Name Device Name Type Virtual Device RootVG Status

-------------------------------------------------------------------

p8wpar1 hdisk3 disk hdisk1 no EXPORTED <<<<<<

p8wpar1 hdisk2 disk hdisk0 yes EXPORTED

[root@gibopvc1]/ # lsdev -Cc disk

hdisk0 Available Virtual SCSI Disk Drive

hdisk1 Available Virtual SCSI Disk Drive

hdisk2 Defined Virtual SCSI Disk Drive

hdisk3 Defined Virtual SCSI Disk Drive

In the VWPAR, we ran cfgmgr to discover the disk. We then created a new data volume group (datavg) and file system (datafs) for application use (note: the steps to create the VG and FS are not shown here). This is for demonstration purposes only; the customer imported their existing data volume groups on their system.
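For completeness, a sketch of what those VG and file system creation steps typically look like inside the VWPAR (my illustration only; the size is arbitrary, and the customer simply imported their existing data volume group):

# mkvg -y datavg hdisk1                                  # create the data VG on the exported disk
# crfs -v jfs2 -g datavg -m /datafs -A yes -a size=2G    # 2G size is arbitrary for this illustration
# mount /datafs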

We performed a very simple I/O test in the /datafs file system: we wrote a 1 GB file and timed the execution. We noticed immediately that the task took longer than expected.

# cd /datafs

# time lmktemp Afile 1024M

Afile

real 0m7.22s <<<<<<<<<<<<<<< SLOW?

user 0m0.04s

sys 0m1.36s

We ran the iostat command from the Global environment and noticed that "serv qfull" was constantly non-zero (very large numbers) for hdisk3. Essentially, the hdisk queue was full all the time. This was bad and unexpected, given the queue depth setting of 256!
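The iostat invocation isn't shown above, but the extended disk report is the usual way to watch the queue statistics, for example:

# iostat -D hdisk3 5    # extended disk report every 5 seconds

The queue section of the -D report includes the queue-full counter (sqfull, shown as "serv qfull" in the long listing), which in our case was climbing constantly.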

Now comes the interesting part. With a little help from our friends in IBM support, using kdb we found that the queue depth was reported as being set to 1 in the kernel, not 256! We also noticed that the hdisk name had changed from hdisk3 to hdisk1. This happened as a result of exporting hdisk3 to the VWPAR. The disk is known as hdisk1 in the VWPAR (not hdisk3), but the kernel address is the same.

Fortunately, IBM support was able to provide us with a workaround. The first step was to add the missing vparent PdAt entry to the ODM in the Global environment.

# cat addodm_pdat_for_vparent.txt

PdAt:

uniquetype = "wio/common/vparent"

attribute = "naca_1_spt"

deflt = "1"

values = "1"

width = ""

type = "R"

generic = ""

rep = "n"

nls_index = 0

# odmadd addodm_pdat_for_vparent.txt

# odmget PdAt | grep -p "wio/common/vparent"

PdAt:

uniquetype = "wio/common/vparent"

attribute = "naca_1_spt"

deflt = "1"

values = "1"

width = ""

type = "R"

generic = ""

rep = "n"

nls_index = 0

We did the same in the VWPAR.

# clogin p8wpar1

# uname -W

11

# odmget PdAt | grep -p "wio/common/vparent"

#

# odmadd addodm_pdat_for_vparent.txt

# odmget PdAt | grep -p "wio/common/vparent"

PdAt:

uniquetype = "wio/common/vparent"

attribute = "naca_1_spt"

deflt = "1"

values = "1"

width = ""

type = "R"

generic = ""

rep = "n"

nls_index = 0

In the VWPAR, we removed the hdisk and then discovered it again, ensuring that the queue depth attribute was set to 256 in the ODM.

# uname -W

11

# rmdev -dl hdisk1

hdisk1 deleted

# cfgmgr

# lspv

hdisk0 00f94f58a0b98ca2 rootvg active

hdisk1 none None

# lsattr -El hdisk1 -a queue_depth

queue_depth 256 Queue DEPTH True

# odmget CuAt | grep -p queue

CuAt:

name = "hdisk1"

attribute = "queue_depth"

value = "256"

type = "R"

generic = "UD"

rep = "nr"

nls_index = 12

Back in the Global environment we checked that the queue depth was set correctly in the kernel. And it was!

# uname -W

0

# echo scsidisk 0xF1000A01C014C000 | kdb | grep queue_depth

ushort queue_depth = 0x100;

We re-ran the simple I/O test and immediately found that the test ran faster and the hdisk queue (for hdisk3, as shown by iostat from the Global environment) was no longer full. Subsequent application load tests showed much better performance.

A colleague was attempting to recreate a WPAR. The WPAR had previously been removed using the WPAR Manager Web GUI. This worked, or so he tells me!

Anyway, when he tried to create the WPAR again from the GUI, using the same WPAR name (wpar20), it failed. The error claimed that /wpars/wpar20 already existed. But it didn't!

Now, before I go any further, looking back at the issue it is pretty obvious what the problem might have been. However, the solution was not evident at first. This proves one thing for certain: some problems cannot be resolved without TWO cups of coffee in the morning.

Of course, I resorted to the command line to see what was REALLY happening.

The mkwpar command was indeed complaining that /wpars/wpar20 already existed.

# mkwpar -n wpar20

mkwpar: 0960-288 The /wpars/wpar20 file system already exists.

mkwpar: 0960-417 Specify -p to create this workload partition using the existing file system data.

I could not find any directory or file system with that name, anywhere under /wpars or elsewhere.

# ls -l /wpars/wpar20

/wpars/wpar20 not found

# ls -l /wpars

total 0

#

# df -g | grep -i wpar

#

# lsvg | lsvg -il | grep -i wpar

#

I ran the mkwpar command with truss to try to find out where it was finding the reference to /wpars/wpar20. At this point you have probably already determined where the problem might be!

The truss output pointed me in the direction of /etc/filesystems... of course! Another "facepalm" moment!

# truss mkwpar -n wpar20 > /tmp/wpar20 2>&1

# vi /tmp/wpar20

statx("/wpars/wpar20", 0x30008AB8, 128, 010) = -1

kopen("/etc/filesystems", O_RDONLY|O_LARGEFILE) = 4

Whatever my learned colleague had done when he "removed" the WPAR with the GUI, it left entries for the WPAR's file systems in /etc/filesystems.
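If you hit something similar, the stale stanzas are easy to spot with a paragraph grep; a quick sketch (take a copy of the file before editing out the offending stanzas):

# cp /etc/filesystems /etc/filesystems.orig     # keep a copy before editing
# grep -p "/wpars/wpar20" /etc/filesystems      # show the stale stanzas left behind

From there, either remove the leftover stanzas or, as the 0960-417 message suggests, re-run mkwpar with -p to reuse the existing file system definitions.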

In order to create the WPAR, I needed an AIX 5.2 mksysb file to supply to the mkwpar command.

Fortunately, I just happened to have an old AIX 5.2 mksysb image in my archives!

I then executed the following command to build the WPAR:

# mkwpar -n wpar1 -C -B /home/cgibson/AIX5202_64bit-mksysb

The flags to the command are:

-n wparname

Specifies the name for the workload partition to be created. You must specify a name, either using the -n flag or in a specification file using the -f flag, unless the -p flag or both the -w and -o flags are used.

-B wparbackupdevice

Specifies a device containing a workload partition backup image. This image is used to populate the workload partition file systems. The wparBackupDevice parameter is a workload partition image that is created with the savewpar, mkcd, or mkdvd command. The -B flag is used by the restwpar command as part of the process of creating a workload partition from a backup image.

-C

Creates a versioned workload partition. This option is valid only when additional versioned workload partition software has been installed.

I was then able to start my new AIX 5.2 WPAR successfully!

# startwpar -v wpar1

Starting workload partition wpar1.

Mounting all workload partition file systems.

Mounting /wpars/wpar1

Mounting /wpars/wpar1/home

Mounting /wpars/wpar1/mksysb

Mounting /wpars/wpar1/nre/opt

Mounting /wpars/wpar1/nre/sbin

Mounting /wpars/wpar1/nre/usr

Mounting /wpars/wpar1/opt

Mounting /wpars/wpar1/proc

Mounting /wpars/wpar1/tmp

Mounting /wpars/wpar1/usr

Mounting /wpars/wpar1/usr/local

Mounting /wpars/wpar1/var

Mounting /wpars/wpar1/var/log

Mounting /wpars/wpar1/var/tsm/log

Loading workload partition.

Exporting workload partition devices.

Exporting workload partition kernel extensions.

Starting workload partition subsystem cor_wpar1.

0513-059 The cor_wpar1 Subsystem has been started. Subsystem PID is 8388822.

Verifying workload partition startup.

Return Status = SUCCESS.

The WPAR was now in an active state and the associated file systems were mounted (as shown from the Global environment).

# lswpar

Name   State  Type  Hostname  Directory      RootVG WPAR

---------------------------------------------------------

wpar1  A      S     wpar1     /wpars/wpar1   no

# mount | grep wpar

/dev/lv00   /wpars/wpar1             jfs     Jul 26 20:13  rw,log=/dev/loglv00

/dev/lv01   /wpars/wpar1/home        jfs     Jul 26 20:13  rw,log=/dev/loglv00

/dev/lv02   /wpars/wpar1/mksysb      jfs     Jul 26 20:13  rw,log=/dev/loglv00

/opt        /wpars/wpar1/nre/opt     namefs  Jul 26 20:13  ro

/sbin       /wpars/wpar1/nre/sbin    namefs  Jul 26 20:13  ro

/usr        /wpars/wpar1/nre/usr     namefs  Jul 26 20:13  ro

/dev/lv03   /wpars/wpar1/opt         jfs     Jul 26 20:13  rw,log=/dev/loglv00

/proc       /wpars/wpar1/proc        namefs  Jul 26 20:13  rw

/dev/lv04   /wpars/wpar1/tmp         jfs     Jul 26 20:13  rw,log=/dev/loglv00

/dev/lv05   /wpars/wpar1/usr         jfs     Jul 26 20:13  rw,log=/dev/loglv00

/dev/lv06   /wpars/wpar1/usr/local   jfs     Jul 26 20:13  rw,log=/dev/loglv00

/dev/lv07   /wpars/wpar1/var         jfs     Jul 26 20:13  rw,log=/dev/loglv00

/dev/lv08   /wpars/wpar1/var/log     jfs     Jul 26 20:13  rw,log=/dev/loglv00

/dev/lv09   /wpars/wpar1/var/tsm/log jfs     Jul 26 20:13  rw,log=/dev/loglv00

I was curious what the WPAR environment was going to look like, so I used clogin to access it and run a few commands.

From the Global environment I confirmed I was indeed on an AIX 7 system.

# uname -W

0

# oslevel

V7BETA

From within the WPAR, I confirmed that I was indeed running AIX 5.2! Wow!

# clogin wpar1

wpar1 : / # oslevel

5.2.0.0

And I could see all 8 logical CPUs (4 hardware threads per POWER7 processor, i.e. SMT-4).

wpar1 : / # sar -P ALL 1 5

AIX wpar1 2 5 00F602734C00    07/26/10

wpar1 configuration: lcpu=8 mem=4096MB ent=0.50

20:22:20 cpu  %usr  %sys  %wio  %idle  physc  %entc

20:22:21   0    77     8     1     14   0.01    0.0
           1     1    70     0     29   0.01    0.0
           2     0     1     0     99   0.00    0.0
           3     0     0     0    100   0.01    0.0
           4     0    35     0     65   0.00    0.0
           7     0    28     0     72   0.00    0.0
           U     -     -     0     93   0.47   93.9
           -     0     3     0     97   0.03    0.0

I noticed an interesting device in the lscfg output.

wpar1 : / # lscfg

INSTALLED RESOURCE LIST

The following resources are installed on the machine.

+/- = Added or deleted from Resource List.

* = Diagnostic support not available.

Model Architecture: chrp

Model Implementation: Multiple Processor, PCI bus

+ sys0    System Object

* wio0    WPAR I/O Subsystem

I also noticed some new and interesting mount points, for example /nre/opt.

wpar1 : / # df

Filesystem    512-blocks      Free %Used    Iused %Iused Mounted on

Global            131072     99928   24%     1424     5% /

Global            131072    126704    4%       70     1% /home

Global           1048576   1015560    4%       17     1% /mksysb

Global            786432    428904   46%     7331    14% /nre/opt

Global            458752     88400   81%    10020    47% /nre/sbin

Global           4980736     24872  100%    53698    87% /nre/usr

Global            131072     63800   52%     1640    11% /opt

Global                 -         -    -         -     -  /proc

Global            131072    125080    5%       52     1% /tmp

Global           1572864    165744   90%    23183    12% /usr

Global            524288    494464    6%      154     1% /usr/local

Global            131072    111512   15%      493     4% /var

Global            262144    253744    4%       28     1% /var/log

Global            131072    126832    4%       20     1% /var/tsm/log

I did have one minor problem when I first tried to start my WPAR, but that issue was quickly resolved by the AIX developers on the AIX 7 Open Beta Forum.

Whenever I'm building a new AIX system I always make sure to install lsof. I really like the fact that I can quickly list processes that are connected to TCP and UDP ports on my system. For example, to check for the current SSH connections on my system I can run lsof and check port 22 (SSH). Immediately I have a good idea of the existing SSH sessions/connections. I can also check to see if the SSH server (sshd daemon) is running and listening (LISTEN) on my AIX partition.
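A sketch of the sort of check I mean:

# lsof -i :22    # every process with a socket on port 22: the sshd listener plus any established SSH sessions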

But sometimes I work on systems that don't have lsof installed. It may not be practical or appropriate for me to install it either. So I have to find another tool (or tools) that will do something similar.

Of course, I could use netstat to check that a server daemon was listening on a particular TCP port and view any established connections. But this doesn't give me the associated process IDs.

$ netstat -a | grep -i ssh

tcp4       0      0  *.ssh                  *.*                     LISTEN

tcp4       0     48  aix01.ssh              172.29.131.16.50284     ESTABLISHED

Fortunately, the rmsock command can provide that information. So if I wanted to find the process id for the sshd daemon that is listening on my system I'd do the following. First I need to find the socket id using netstat*.

# netstat -@aA | grep -i ssh | grep LIST | grep Global

Global f1000700049303b0 tcp4       0      0  *.ssh             *.*               LISTEN

Then I can use rmsock to discover the process id associated with the socket. In this case it's PID 282700.

$ rmsock f1000200003e9bb0 tcpcb

The socket 0x3e9808 is being held by proccess 282700 (sshd).

Unlike what its name implies, rmsock does not remove the socket if it is being used by a process. It just reports the process holding the socket. Note that the second argument of rmsock is the protocol. It's tcpcb in this example to indicate that the protocol is TCP. The results of the command are also logged to /var/adm/ras/rmsock.log.

# tail /var/adm/ras/rmsock.log

socket 0xf100020001c45008 held by process 434420 (writesrv) can't be removed.

socket 0xf100020000663008 held by process 418040 (java) can't be removed.

socket 0xf1000200012ad008 held by process 418040 (java) can't be removed.

socket 0xf100020000dec008 held by process 163840 (inetd) can't be removed.

socket 0xf100020000deb008 held by process 163840 (inetd) can't be removed.

socket 0xf10002000016f808 held by process 192554 (snmpdv3ne) can't be removed.

socket 0xf100020001c51808 held by process 442596 (dtlogin) can't be removed.

socket 0xf1000200012a4008 held by process 418040 (java) can't be removed.

socket 0xf100020000666008 held by process 315640 (java) can't be removed.

socket 0xf100020000deb808 held by process 163840 (inetd) can't be removed.

*Note: In my example I specified the @ symbol with the netstat command. I also grep'ed for the string Global. You may have to do the same if you have WPARs running on your system. In my case I have two active WPARs that both have their own sshd process. My Global environment also has an sshd process. So in total there are three sshd daemons that I can view from the Global environment. By specifying the @ symbol with netstat, I can quickly determine which process belongs to the Global environment and which exist within each WPAR.