Category Archives: SuperCluster

One of the big challenges with the Software in Silicon features is monitoring them to see if they are actually doing anything. To monitor the DAX on the SPARC M7 chip you used to have to use busstat commands and then interpret the counters yourself, which is not easy: the documentation is not very thorough.

One challenge with Exadata cells in a lab environment is that they are secure! This means long lockout times after an incorrect login and strict account-lock settings. You can change these manually, but every time you update your cell there is a chance they will be reset.

So, you have a zpool provided by an iSCSI LUN which is tight on space, and you’ve done all the tidying you can think of. What do you do next? Well, if you’re lucky, you have space to expand the iSCSI LUN and then make the extra capacity available to your zpool.

First, find the LUN that holds the zpool using zpool status <poolname>
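The device name reported by zpool status is what you will match on the appliance. Here is a minimal sketch of pulling the backing LUN out of the status output; the pool name zoneroots and the captured sample below are assumptions, and on a live system you would feed in the real zpool status output instead:

```shell
# Hypothetical "zpool status zoneroots" output captured for parsing;
# on a live system: sample_output=$(zpool status zoneroots)
sample_output='  pool: zoneroots
 state: ONLINE
  scan: none requested
config:

        NAME                                     STATE     READ WRITE CKSUM
        zoneroots                                ONLINE       0     0     0
          c0t600144F0F0C4EECD00005436848B0001d0  ONLINE       0     0     0'

# Extract the device (LUN) backing the pool: the vdev lines under
# "config:" are indented and start with cXtY...dZ.
lun=$(printf '%s\n' "$sample_output" | awk '/^[[:space:]]+c[0-9]+t/ {print $1}')
echo "$lun"
```

The long hex string in the device name is the LUN identifier you match against the appliance.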

Locate the LUN on your storage appliance. If you are on a ZFS appliance there is a really handy script in Oracle Support Document 1921605.1. Otherwise you’ll have to use the tools supplied with your storage array, or your eyes 😉

So, I’ve located my LUN on my ZFS appliance by matching the LUN identifier, and now I need to change the LUN size.
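Once the appliance-side resize is done, ZFS on the host needs to be told to use the new space. A dry-run sketch follows: zpool here is a stub that prints each command rather than running it, since these commands are Solaris-specific, and the pool and device names are carried over from the example above as assumptions:

```shell
# Dry-run: zpool is stubbed to print the commands instead of executing them.
zpool() { echo "zpool $*"; }

# After growing the LUN on the appliance, let ZFS see the new size:
zpool set autoexpand=on zoneroots
zpool online -e zoneroots c0t600144F0F0C4EECD00005436848B0001d0
zpool list zoneroots    # confirm the new SIZE column
```

Either autoexpand=on or an explicit zpool online -e of the grown device makes the new capacity available to the pool.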

Application Zones on a SuperCluster Solaris 11 LDOM are subject to far fewer restrictions than the Exadata Database Zones. This also means that the documentation is less prescriptive and detailed.

This post will show a simple Solaris 11 zone creation; it is meant for example purposes only, not as a supported procedure. I am going to use a T5 SuperCluster for this walkthrough. The main differences you will need to consider for an M7 SuperCluster are:

both heads of the ZFS-ES are active, so you will need to select the correct head and InfiniBand interface name.

there is only one QGBE card available per PDOM. This means you may need to present VNICs from the domain that owns the card for the management network, if you require this connectivity.

Considerations

As per note 2041460.1, the best practice for the zone root filesystems is one LUN per LDOM, with a filesystem created on this shared pool for each application zone. Reservations and quotas can be used to prevent a zone from using more than its share.
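That layout can be sketched as follows. This is a dry-run: zfs is stubbed to print the commands, and the pool name, filesystem name, and quota/reservation sizes are assumptions for illustration, not recommendations:

```shell
# Dry-run: zfs is stubbed to print the commands instead of executing them.
zfs() { echo "zfs $*"; }

# One filesystem per application zone on the shared zone-root pool,
# with a quota so no single zone can exhaust the pool:
zfs create zoneroots/appzone1
zfs set quota=100g zoneroots/appzone1
zfs set reservation=20g zoneroots/appzone1
```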

You need to make sure you calculate the minimum number of cores required for the global zone as per note 1625055.1

You need to make sure that the IPS repos are all available, and that any IDRs you have applied to your global zone are available.

Preparation

Put entries into the global zone’s hosts file for your new zone. I will use three addresses: one for the 1Gbit management network, one for the 10Gbit client network, and one for the InfiniBand network on the storage partition (p8503).
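For illustration only, such entries might look like this (all names and addresses below are hypothetical):

```
# /etc/hosts entries for the new zone
10.10.14.50     myzone-mgmt     # 1Gbit management network
10.129.14.50    myzone          # 10Gbit client network
192.168.28.50   myzone-stor     # IB storage network, partition p8503
```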

Create an iSCSI LUN for the zone root filesystem if you do not already have one defined to hold zone roots. I am going to use the iscsi-lun.sh script that is designed for use by other tools which create the Exadata Database Zones. The good thing about using this is that it follows the naming convention etc. used for the other zones. However, it is not installed by default on Application Zones (it is provided by the system/platform/supercluster/iscsi package in the exa-family repository), and this is not a supported use of the script.

-z is the name of my ZFS-ES

-i is the 1Gbit hostname of my global zone

-n and -N are used by the exavm utility to create the LUNs. In our case they will both be set to 1.

-s The size of the LUN to be created.

-l the volume block size. I have selected 32K, you may have other performance metrics that lead you to a different block size.

root@sc5bcn01-d3:/opt/oracle.supercluster/bin# ./iscsi-lun.sh create \
-z sc5bsn01 -i sc5bcn01-d3 -n 1 -N 1 -s 500G -l 32K
Verifying sc5bcn01-d3 is an initiator node
The authenticity of host 'sc5bcn01-d3 (10.10.14.14)' can't be established.
RSA key fingerprint is 72:e6:d1:a1:be:a3:b3:d9:96:ea:77:61:bd:c7:f8:de.
Are you sure you want to continue connecting (yes/no)? yes
Password:
Getting IP address of IB interface ipmp1 on sc5bsn01
Password:
Setting up iscsi service on sc5bcn01-d3
Password:
Setting up san object(s) and lun(s) for sc5bcn01-d3 on sc5bsn01
Password:
Setting up iscsi devices on sc5bcn01-d3
Password:
c0t600144F0F0C4EECD00005436848B0001d0 has been formatted and ready to use

Create partitions so your zone can access the IB storage network (optional, but nice to have, and my example will include them). First locate the interfaces that have access to the IB storage network partition (PKEY=8503) using dladm, then create partitions on these interfaces.
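A dry-run sketch of that step: dladm is stubbed to print the commands, the underlying link names net6/net7 are assumptions you should replace with output from dladm show-phys, and you should check the PKEY syntax your release expects. The partition names match the zone configuration further down:

```shell
# Dry-run: dladm is Solaris-specific, so a stub prints each command.
dladm() { echo "dladm $*"; }

dladm show-part                  # list existing IB partitions and their PKEYs
# Create a partition per IB interface on the storage partition (PKEY 8503);
# net6/net7 are hypothetical physical link names:
dladm create-part -l net6 -P 8503 sc5b01d4_net7_p8503
dladm create-part -l net7 -P 8503 sc5b01d4_net8_p8503
```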

Create the Zone

Prepare your zone configuration file; here is mine. Note that I have non-standard link names to make it more readable. You will need to use dladm to determine the lower-link names that match your system.

create -b
set brand=solaris
set zonepath=/zoneroots/sc5b01-d4-rpool
set autoboot=true
set ip-type=exclusive
add net
set configure-allowed-address=true
set physical=sc5b01d4_net7_p8503
end
add net
set configure-allowed-address=true
set physical=sc5b01d4_net8_p8503
end
add anet
set linkname=net0
set lower-link=auto
set configure-allowed-address=true
set link-protection=mac-nospoof
set mac-address=random
end
add anet
set linkname=mgmt0
set lower-link=net0
set configure-allowed-address=true
set link-protection=mac-nospoof
set mac-address=random
end
add anet
set linkname=mgmt1
set lower-link=net1
set configure-allowed-address=true
set link-protection=mac-nospoof
set mac-address=random
end
add anet
set linkname=client0
set lower-link=net2
set configure-allowed-address=true
set link-protection=mac-nospoof
set mac-address=random
end
add anet
set linkname=client1
set lower-link=net5
set configure-allowed-address=true
set link-protection=mac-nospoof
set mac-address=random
end

Implement the zone configuration using your pre-configured file, or type it in manually.

Next you boot the zone and use zlogin -C to log in to the console and answer the usual Solaris configuration questions about root password, timezone, and locale. I do not usually configure the networking at this time; I add it later.
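The configure/install/boot sequence can be sketched as below. This is a dry-run in which the Solaris commands are stubbed to print rather than execute; the zone name sc5b01-d4 and the config file path are assumptions based on the zonepath above:

```shell
# Dry-run: zonecfg/zoneadm/zlogin are stubbed to print the commands.
zonecfg() { echo "zonecfg $*"; }
zoneadm() { echo "zoneadm $*"; }
zlogin()  { echo "zlogin $*"; }

zonecfg -z sc5b01-d4 -f /var/tmp/sc5b01-d4.cfg   # load the prepared config file
zoneadm -z sc5b01-d4 install                     # install from the IPS repos
zoneadm -z sc5b01-d4 boot
zlogin -C sc5b01-d4                              # console login for sysconfig
```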

Resource Capping

At the time of writing (20/04/16) virtual and physical memory capping is not supported on SuperCluster. This is mentioned in Oracle Support Document 1452277.1 (SuperCluster Critical Issues) as issue SOL_11_1.

This post is mainly related to SuperCluster configuration, but it can be applied to other Solaris-based systems.

In SuperCluster you can run the Oracle RDBMS in zones, and the zones are CPU-capped. You may want to change the number of CPUs assigned to your zone for a couple of reasons:

1) You have changed the number of CPUs that are available in the LDOM supporting this zone using a tool such as setcoremem and want to resize the zone to take this into account.

2) You need to change the zone size because you need more / less capacity.

Determine the number of processors that need to be reserved for the global zone (and all of your other zones!)

For SuperCluster you should reserve a minimum of 2 cores per IB HCA in domains with a single HCA, and a minimum of 4 cores in domains with 2 or more HCAs. You also need to take into account other zones running on the system.

So in my case there are 512 virtual CPUs, of which I need to reserve at least 32 (4 cores x 8 threads) for my global zone, and as I will also have an app zone on there, possibly a lot more. So let’s say I’m going to keep 16 cores in the global zone; that would leave 384 for my db zone.
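The arithmetic above, as a quick sanity check (a T5 core has 8 hardware threads):

```shell
# vCPU budget for the db zone from the figures in the text above.
total_vcpus=512
threads_per_core=8
global_cores=16                       # cores kept back for global + app zone
reserved=$((global_cores * threads_per_core))
dbzone_vcpus=$((total_vcpus - reserved))
echo "$dbzone_vcpus"                  # prints 384
```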

Get the name of the pool

The SuperCluster db zone creation tools create the pools with logical names based on the zone name. However, the way to be sure which pool is in use is to look at the zone’s definition.
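A sketch of that check, with zonecfg stubbed to print the command since it is Solaris-specific (the zone name is the one used in this example):

```shell
# Dry-run: zonecfg is stubbed to print the command instead of executing it.
zonecfg() { echo "zonecfg $*"; }

# On a live system this prints the resource pool bound to the zone:
zonecfg -z sc5acn01-d5 info pool
```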

If this doesn’t return the name of a pool, your zone is not using resource pools and may be using one of the other methods of capping CPU usage, such as allocating dedicated CPU resources (ncpus=X). If so, stop reading here, as this is not the blog post you are looking for.

Find the processor set associated with your pool, either by looking in /etc/pooladm.conf or by running pooladm with no parameters to get the current running config. The pooladm output is WAY more readable, so that is my preferred method.

It looks to me that the res="pset_1" in the pool definition points to the ref_id="pset_1" in the pset definition.

OK, so now I know my pset is called pset_sc5acn01-d5.blah.uk.mydomain.com_id_6135 and it currently has 8 cpus. I also know that my running config and my persistent config are synchronised.

Shutdown the database zone.

This may not be necessary, but since Oracle can make assumptions based on CPU count at startup, I think it is safest.
# zoneadm -z sc5acn01-d5 shutdown

Change the pset configuration

I’m going to do this by changing the config file to make it persistent, as there’s nothing more embarrassing than making a change that is lost by a reboot. I set the processor set to have a minimum of 384 CPUs and a maximum of 384 CPUs.
# poolcfg -c 'modify pset pset_sc5acn01-d5.blah.uk.mydomain.com_id_6135 ( uint pset.min=384 ; uint pset.max=384 )' /etc/pooladm.conf

Check that it has applied to your config file
# grep pset_sc5acn01-d5 /etc/pooladm.conf

Force it to re-read the file and use the new configuration

# pooladm -c
# pooladm -s

Now you can run pooladm without any arguments and check the running config. If it all looks OK, go ahead and boot your zone.

Create the required Oracle network resources on one node of the cluster. If you’re uncertain about the subnet to specify, you can use one of the online calculators such as http://www.subnet-calculator.com/

As root, define the network interface as a global public cluster interface.

Add entries to the grid home tnsnames.ora. There are slightly different configurations per node, as we will explicitly name the ‘local’ and ‘remote’ nodes. If we had more than 2 nodes, then our entries for LISTENER_IBREMOTE and LISTENER_IPREMOTE would list all of the non-local nodes.
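For illustration, the node 1 entries might look like the fragment below; all hostnames and ports are hypothetical placeholders, and node 2 would swap the local and remote hosts:

```
LISTENER_IPLOCAL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = node1-vip)(PORT = 1521)))

LISTENER_IBLOCAL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = node1-ib-vip)(PORT = 1522)))

LISTENER_IPREMOTE =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = node2-vip)(PORT = 1521))))

LISTENER_IBREMOTE =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = node2-ib-vip)(PORT = 1522))))
```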

Certificates that have been exported by Apache cannot be directly imported into OTD. Apache exports consist of 2 files, a .pem and a .key, and if you try to load the .pem file you will get the error ‘OTD-64112’

You then import them into Oracle Traffic Director. The first time I tried this I had problems: the new certificates were not visible in the GUI, despite it noticing that the files cert8.db/key3.db had changed and prompting for a reload. cert8.db/key3.db are the older Berkeley format databases, and OTD uses the newer SQLite format files cert9.db/key4.db. You can control this by setting the environment variable NSS_DEFAULT_DB_TYPE, or by putting the prefix sql: on the certificate directory path specified by -d.
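One way through is to convert the .pem/.key pair to PKCS#12 and import that, forcing the SQLite database format with the sql: prefix. A dry-run sketch follows, with the commands stubbed to print rather than run; the file names and the OTD config directory are hypothetical:

```shell
# Dry-run: the NSS/OpenSSL tools are stubbed to print each command.
openssl()  { echo "openssl $*"; }
pk12util() { echo "pk12util $*"; }
certutil() { echo "certutil $*"; }

# Combine the Apache-exported certificate and key into one PKCS#12 file:
openssl pkcs12 -export -in server.pem -inkey server.key -out server.p12
# The sql: prefix targets the newer cert9.db/key4.db database format:
pk12util -i server.p12 -d sql:/u01/otd/net-myconfig/config
certutil -L -d sql:/u01/otd/net-myconfig/config   # verify the import
```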

As Oracle

Stop any databases running from the ORACLE_HOME where you want to enable DNFS.
Ensure you can remotely authenticate as sysdba, creating a password file using orapwd if required.
Relink for DNFS support
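The relink step can be sketched as below; this is a dry-run with make stubbed to print the command, and the fallback ORACLE_HOME path is hypothetical:

```shell
# Dry-run: make is stubbed to print the command instead of relinking.
make() { echo "make $*"; }

# Relink from the rdbms library directory of the home being changed:
cd "${ORACLE_HOME:-/u01/app/oracle/product/12.1.0.2/dbhome_1}/rdbms/lib" 2>/dev/null || true
make -f ins_rdbms.mk dnfs_on     # dnfs_off disables it again
```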

I was a little uncertain about the oranfstab entries, as most examples relate to a ZFS-BA, which has many IB connections and two active heads, whereas the 7320 in this case was set up Active/Passive. I created $ORACLE_HOME/dbs/oranfstab with the following entries.
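The entries themselves are not shown above; for illustration, a file for a single active head might look like this (the server name, addresses, and export/mount paths are all hypothetical):

```
server: zfssa-head1
path: 192.168.28.1
path: 192.168.28.2
export: /export/dbpool/oradata  mount: /u02/oradata
```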