First of all a disclaimer: I don’t work for Oracle nor do I speak for them. I believe this information to be correct, but for licensing questions, Oracle themselves have the final word.

With that out of the way, followers of this blog may have seen some of the results from my testing of actual CPU capacity with public clouds like Amazon Web Services, Microsoft Azure, and Google Compute Engine. In each of these cases, a CPU “core” was actually measured to be equivalent to an x86 HyperThread, or half a physical core. So when provisioning public cloud resources, it’s important to include twice as many CPU cores as the equivalent physical hardware. However, the low price and elasticity of public cloud infrastructure can offset this differential and still result in cost savings over physical hardware.

One place this difference in CPU core calculation can have a significant impact, however, is software licensing. In this post I’ll look at Oracle database licensing in particular.

Oracle databases can be licensed using many metrics, including unlimited use agreements, embedded licenses, evaluation/developer licenses, partner licenses, and many more. But for those without a special agreement in place with Oracle, there are two ways to license products: Named User Plus (NUP) and processor licenses. NUP licenses are per-seat licenses which have a fixed cost per physical user or non-user device. The definition of a user is very broad, however. Quoting the Oracle Software Investment Guide:

Named User Plus includes both humans and non-human operated devices. All human users and non-human operated devices that are accessing the program must be licensed. A non-human operated device can be many things, such as a temperature-monitoring device. It is important to note that if the device is operated by a person, then this person must be licensed. As described in illustration #1, the 400 employees who are operating the 30 forklifts must be licensed because the forklift is not a “non-human operated device”.

So, if the application has any connection outside the organization (batch data feeds and public web users would be examples), it’s very difficult to fit the qualifications to count as NUP licenses.

Now, this leaves per-processor licenses, which use the processor cores that can potentially run the database software as the licensing metric. When running in a public cloud, however, there is an immediate issue: your Oracle instance could presumably run on any of the thousands of servers owned by the cloud provider, so unique physical processors are virtually impossible to count. Fortunately, Oracle has provided a way to properly license Oracle software in public cloud environments: Licensing Oracle Software in the Cloud Computing Environment. It sets out a few requirements, including:

Amazon EC2, Amazon S3, and Microsoft Azure are covered under the policy.

There are limits to the counting of sockets and the number of cores per instance for Standard Edition and Standard Edition One.

But most important is the phrase “customers are required to count each virtual core as equivalent to a physical core”. Knowing that each “virtual core” is actually half a physical core, this can shift the economics of public cloud usage for Oracle database significantly.

Here’s an example of a general-purpose AWS configuration and a close equivalent on physical hardware. I’m excluding costs of external storage and datacenter costs (power, bandwidth, etc) from the comparison.

Now let’s add in an Oracle license. From the Oracle Price List, a socket of Standard Edition One costs $5800, with an additional $1276/year for support. Due to the counting of CPU cores, our AWS hardware requires two sockets of licensing. So instead of saving $772, we end up paying $9628 more.

If we were to use Oracle Enterprise edition (excluding any options or discounts), that becomes an extra $157,700. Not small change anymore.

So before you make the jump to put your Oracle databases on a public cloud, check your CPU core counts to avoid unexpected licensing surprises.

For the n1 series of machine types, a virtual CPU is implemented as a single hyperthread on a 2.6GHz Intel Sandy Bridge Xeon or Intel Ivy Bridge Xeon (or newer) processor. This means that the n1-standard-2 machine type will see a whole physical core.

I still believe calling such a hyperthread a “virtual CPU” is misleading. When creating a virtual machine in a non-cloud VM platform, 1 virtual CPU = 1 physical core. Plain and simple. But when using a cloud platform, I need 2 virtual CPUs to get that same physical core.

Anyways, off to run some CPU tests. n1-standard-4 is a close match to the m3.xlarge instances previously tested, so I’ll try that.

Getting set up on Google Compute Engine

I already signed up with Google Compute Engine’s free trial and created a project I’m calling marc-cpu-test. Installing the gcloud compute command-line tools.
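
The instance creation itself is a one-liner. A sketch, noting that the image and zone flags here are my choices, and that gcloud’s exact flag names have changed across releases:

$ gcloud compute instances create marc-cpu-test-1 --zone us-central1-a --machine-type n1-standard-4 --image centos-6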

OK, instance all set and connected. As a CentOS 6 image it doesn’t allow SSH root logins by default, so attempting to set up a gcloud environment as a root user will get you “permission denied” errors on SSH. Serves me right for trying to run these tools as root in the first place :-).

So the taskset commands are working: when we ask for CPUs 0 and 1, we are getting them, but throughput shows that cores aren’t being shared. This means that the CPUs in the virtual machine are not statically bound to hardware threads the way they were under AWS. I’d call it a win, as it gives more consistent performance even if the guest operating system is forced to make poor CPU scheduling decisions, as in this case.

Lessons learned

Although Google maps virtual CPUs to hyperthreads like its competitors, it is very upfront about this behavior.

Actual throughput for a simple gzip workload is excellent.

Google Compute Engine has an abstraction layer in front of CPUs that dynamically schedules tasks between CPU threads, in addition to the regular scheduler in the virtual machine. In my testing, it allocates tasks efficiently across CPU cores, even when the OS scheduler is configured suboptimally.

I’ve been working on setting up a demo for my upcoming presentation on application continuity at RMOUG training days later this month. The challenge is to get a multi-node cluster, plus a load generator, and a host OS, to fit on a memory-constrained laptop.

According to the Oracle grid installation guide, 4GB per virtual host is the minimum requirement. However, with a few tweaks I’ve been able to get the full stack to run in 2GB of memory. For anyone else out there installing 12c clusters into virtual machines, here are a few tips.

But first the disclaimer: these changes are mostly unsupported by Oracle, and intended for test and demo databases. They can potentially cause unexpected behaviour, hangs, or crashes. Don’t try this in production!

Grid Infrastructure management repository database (GIMR): This is a full Oracle database that stores operating system workload metrics generated by the cluster health monitor, for use by Oracle QoS management and troubleshooting. Being a full database, it has a large memory and CPU footprint. I originally installed Oracle 12.1.0.1 skipping the database on install, and upgraded to 12.1.0.2 without it. However, it looks like it’s no longer possible to skip the installation in the GUI. My colleague Gleb suggests adding -J-Doracle.install.mgmtDB=false on the installer command line to skip it, as shown below.
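
Putting that together, the installer invocation would look something like this (the flag is Gleb’s suggestion above, not something I’ve seen in Oracle’s documentation):

$ ./runInstaller -J-Doracle.install.mgmtDB=false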

Cluster health monitor (CHM): this tool collects a myriad of workload-related metrics to store in the GIMR. And it uses a surprisingly high amount of CPU: it was the top CPU consumer in my VM before removal. It can be disabled fairly easily, with a hat tip to rocworks:
$ crsctl stop res ora.crf -init
# crsctl delete res ora.crf -init

Trace File Analyzer Collector (TFA): collects log and trace files from all nodes and products into a single location. Unfortunately it’s written in Java with its own Java Virtual Machine, again requiring a large memory footprint for the heap etc. It can be removed with a single command, though note that the next time you run rootcrs.pl (when patching, for example) it will reinstall itself.
# tfactl uninstall

Cluster Verification Utility (CVU): As you install Oracle Grid Infrastructure, the CVU tool automatically runs, pointing out configuration issues that may affect system operation (such as running under 4GB of RAM). In Oracle 12.1.0.2, it also gets scheduled to run automatically every time the cluster is started and periodically after that. The CVU itself and its checks use CPU and RAM resources, and are better run manually when such resources are limited. It’s also a quick removal:
$ srvctl stop cvu
$ srvctl disable cvu

ASM memory target: as of 12c, the ASM instance has a default memory target of 1 gigabyte, a big jump from the 256MB of Oracle 11g. And if you set a lower target, you’ll find it’s ignored unless it’s overridden with a hidden parameter. I’ve set it to 750MB with good results, and it can possibly be set even lower for light-utilization workloads:
$ sqlplus "/ as sysasm"
alter system set "_asm_allow_small_memory_target"=true scope=spfile;
alter system set memory_target=750m scope=spfile;
alter system set memory_max_target=750m scope=spfile;
exit
# service ohasd stop
# service ohasd start

A non-memory issue I’ve run into is the VKTM (virtual keeper of time) background process using large amounts of CPU time in both ASM and database instances. I’ve noticed it to be especially pronounced in virtual environments and in Oracle Enterprise Linux 6. I’ve ended up disabling it completely without obvious ill effects, but as always, don’t try this on your “real” production clusters.
alter system set "_disable_highres_ticks"=TRUE scope=spfile;

Additionally, Jeremy Schneider has taken on the biggest remaining GI memory user, the Oracle cluster synchronization service daemon (OCSSD). This is an important cluster management process, and Jeremy figured out a way to unlock its memory in the gdb debugger, allowing it to be swapped out. My own tests were less successful: the process wasn’t swapped out even after trying his changes. But his blog post is worth a read, and others may have more success than I did.

I also noted that during the link phase of installation and patching, the ld process alone takes over 1GB of RAM. So either shut down clusterware or add swap and wait while linking.

So to wrap up, I’ve managed to get a full Oracle GI 12.1.0.2 stack including database to run in a virtual machine with 2GB RAM. Readers, any other tips to put the goliath that is Oracle GI on a diet?

Signing up for Azure’s 30-day trial gives $200 in credit to use over the next 30-day period: more than enough for this kind of testing. Creating a new virtual machine, using the “quick create” option with Oracle Linux, and choosing a 4-core “A3” standard instance.

I must say I like the machine naming tied into the built-in “cloudapp.net” DNS that Azure uses: no mucking around with IP addresses. The VM provisioning definitely takes longer than AWS, though no more than a few minutes. And speaking of IP addresses, both machines get 191.236.x.x addresses, assigned to Microsoft’s Brazilian subsidiary through the Latin American LACNIC registry due to the lack of North American IP addresses.
Checking out the CPU specs as reported to the OS:

The CPUs are fully used, and entirely by gzip, so there’s no large system overhead here. Also, “%st”, the time reported as “stolen” by the hypervisor, is zero. We’re simply getting half the throughput of AWS.

Basic instances

In addition to standard instances, Microsoft makes available basic instances, which claim to offer “similar machine configurations as the Standard tier of instances offered today (Extra Small [A0] to Extra Large [A4]). These instances will cost up to 27% less than the corresponding instances in use today (which will now be called “Standard”) and do not include load balancing or auto-scaling, which are included in Standard” (http://azure.microsoft.com/blog/2014/03/31/microsoft-azure-innovation-quality-and-price/)

Having a look at throughput here, by creating a basic A3 instance “marc-cpu-basic” that otherwise exactly matches the marc-cpu instance created earlier.

Very consistent results, but consistently slow. They do show that cores aren’t being shared, but throughput is lower than even a shared core under AWS.

Wrapping up

Under this simple gzip test, we are testing CPU integer performance. The Azure standard instance got half the throughput of the equivalent AWS instance, in spite of a clock speed only 15% slower. But the throughput was consistent: no drops when running on adjacent cores. The basic instance was a further 33% slower than a standard instance, in spite of having the same CPU configuration.

Under Azure, we simply aren’t getting a full physical core’s worth of throughput. Perhaps the hypervisor is capping throughput, and capping even lower for basic instances? Or maybe the actual CPU is different than the E5-2660 reported? For integer CPU-bound workloads like our gzip test, we would need to purchase at least twice as much capacity under Azure than AWS, making Azure considerably more expensive as a platform.

I’ve been doing some testing to clarify what a vCPU in Amazon Web Services actually is. Over the course of the testing, I experienced inconsistent results on a 2-thread test on a 4-vCPU m3.xlarge system, due to the mislabeling of the vCPUs as independent single-core processors by the Linux kernel. This issue manifests itself in a CPU-bound, multithreaded workload where there is idle CPU time.

My test environment used a paravirtualized (PV) kernel, which moves some of the virtualization logic into the Linux kernel, reducing the need for high-overhead hardware emulation. One drawback is that the kernel cannot be modified to, for example, resolve the CPU mislabeling. But there is an alternative: an HVM system, relying on virtualization extensions in the CPU hardware and allowing custom kernels or even non-Linux operating systems to run. Historically the drawback has been a performance hit, though I read a very interesting post on Brendan Gregg’s blog indicating that what’s called HVM in Amazon EC2 is actually a hybrid of PV and HVM, combining aspects of both. A test run by Phoronix on EC2 showed HVM performance on par with PV, and in some cases even better. So it definitely seems worth repeating my earlier tests on an HVM instance.
As before, I fire up an instance, but this time using the latest HVM Amazon Linux image:
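
For reference, launching it from the AWS CLI looks roughly like this (the AMI ID and key name are hypothetical placeholders; use the current HVM Amazon Linux AMI for your region):

$ aws ec2 run-instances --image-id ami-xxxxxxxx --instance-type m3.xlarge --key-name my-key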

We see that work is split between adjacent CPUs, but that the scheduler is doing a good job of keeping the adjacent CPUs near 100% usage between them.

So based on these tests, it looks like, even though the CPU is still mislabeled, HVM has almost entirely avoided the issue of variability due to shared-core scheduling, at the cost of a small reduction in overall throughput.

Some months ago, Amazon Web Services changed the way they measure CPU capacity on their EC2 compute platform. In addition to the old ECUs, there is a new unit to measure compute capacity: vCPUs. The instance type page defines a vCPU as “a hyperthreaded core for M3, C3, R3, HS1, G2, and I2.” The description seems a bit confusing: is it a dedicated CPU core (which has two hyperthreads in the E5-2670 v2 CPU platform being used), or is it a half-core, single hyperthread?

I decided to test this out for myself by setting up one of the new-generation m3.xlarge instances (with thanks to Christo for technical assistance). It is stated to have 4 vCPUs running on an E5-2670 v2 processor at 2.5GHz on the Ivy Bridge-EP microarchitecture (or sometimes 2.6GHz in the case of xlarge instances).

Looks like I got some of the slightly faster 2.6GHz CPUs. /proc/cpuinfo shows four processors, each with physical id 0 and core id 0. Or in other words, one single-core processor with 4 threads. We know that the E5-2670 v2 is actually a 10-core processor, so the information we see at the OS level doesn’t quite correspond.
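
(The relevant fields can be pulled straight from /proc/cpuinfo like this:)

$ egrep 'processor|model name|physical id|core id' /proc/cpuinfo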

Nevertheless, we’ll proceed with a few simple tests. I’m going to run “gzip”, an integer-compute-intensive compression test, on 2.2GB of zeroes from /dev/zero. By using synthetic input and discarding output, we can avoid effects of disk I/O. I’m going to combine this test with taskset commands to impose processor affinity on the process.
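
Here’s a sketch of the single-thread version (the byte count is approximate; dd reports throughput on stderr, and because gzip is the bottleneck in the pipe, that figure effectively measures gzip):

$ taskset -c 0 sh -c 'dd if=/dev/zero bs=1M count=2200 | gzip -c > /dev/null'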

Sharing those cores

Now, let’s make things more interesting: two threads, on adjacent processors. If they are truly dedicated CPU cores, we should get a full 121 MB/s each. If our processors are in fact hyperthreads, we’ll see throughput drop.
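
(A sketch of the two-thread variant, pinning one copy to each of CPUs 0 and 1:)

$ for cpu in 0 1; do taskset -c $cpu sh -c 'dd if=/dev/zero bs=1M count=2200 | gzip -c > /dev/null' & done; wait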

We have our answer: throughput has dropped by a third, to 79.9 MB/sec, showing that processors 0 and 1 are threads sharing a single core. (But note that Hyperthreading is giving performance benefits here: 79.9 MB/s on a shared core is higher than the 60.5 MB/s we see when sharing a single hyperthread.)

Trying the exact same test, but this time, non-adjacent processors 0 and 2:

This means that the OS scheduler has no way of knowing which processors share cores, and cannot schedule tasks around it. Let’s go back to our two-thread test, but instead of restricting it to two specific processors, we’ll let it run on any of them.

We see throughput varying between 82 MB/sec and 120 MB/sec, for the exact same workload. To get some more performance information, we’ll configure top to run 10-second samples with per-processor usage information:
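
(The top output isn’t reproduced here. As an equivalent, mpstat from the sysstat package, my substitution rather than the tool used above, gives the same per-processor picture:)

$ mpstat -P ALL 10 1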

Although usage percentages are similar, we’ve seen earlier that throughput drops by a third when cores are shared, and we see varied throughput as the processes are context-switched between processors.

This type of situation arises when compute-intensive workloads are running and there are fewer processes than total CPU threads. And if only AWS would report correct core IDs to the system, this problem wouldn’t happen: the OS scheduler would make sure processes did not share cores unless necessary.

Here’s a chart summarizing the results:

Summing up

Over the course of the testing I’ve learned two things:

A vCPU in an AWS environment actually represents only half a physical core. So if you’re looking for equivalent compute capacity to, say, an 8-core server, you would need a so-called 4xlarge EC2 instance with 16 vCPUs. So take it into account in your costing models!

The mislabeling of the CPU threads as separate single-core processors can result in performance variability as processes are switched between threads. This is something the AWS and/or Xen teams should be able to fix in the kernel.

Readers: what has been your experience with CPU performance in AWS? If any of you has access to a physical machine running E5-2670 processors, it would be interesting to see how the simple gzip test runs.

Disabling Triggers in Oracle 11.2.0.4

Unfortunately Oracle seems to have disabled this use in 11.2.0.4, and most likely 12.1 as well. Boo-Hiss! This is needed functionality for DBAs!

A new parameter: enable_goldengate_replication

I tried this on an Oracle 11.2.0.4 system, and I indeed got an error:

SQL> exec sys.dbms_xstream_gg.set_foo_trigger_session_contxt(fire=>true);
BEGIN sys.dbms_xstream_gg.set_foo_trigger_session_contxt(fire=>true); END;
*
ERROR at line 1:
ORA-26947: Oracle GoldenGate replication is not enabled.
ORA-06512: at "SYS.DBMS_XSTREAM_GG_INTERNAL", line 46
ORA-06512: at "SYS.DBMS_XSTREAM_GG", line 13
ORA-06512: at line 1

A quick look at oerr gives a path forward, assuming you do indeed have a GoldenGate license:

[oracle@ora11gr2b ~]$ oerr ora 26947
26947, 00000, "Oracle GoldenGate replication is not enabled."
// *Cause: The 'enable_goldengate_replication' parameter was not set to 'true'.
// *Action: Set the 'enable_goldengate_replication' parameter to 'true'
// and retry the operation.
// Oracle GoldenGate license is needed to use this parameter.

ENABLE_GOLDENGATE_REPLICATION controls services provided by the RDBMS for Oracle GoldenGate (both capture and apply services). Set this to true to enable RDBMS services used by Oracle GoldenGate.
…
The RDBMS services controlled by this parameter also include (but are not limited to):
…
Service to suppress triggers used by GoldenGate Replicat
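
Following the oerr action text, setting the parameter is a one-liner; a sketch, assuming you do hold the required GoldenGate license:

SQL> alter system set enable_goldengate_replication=true scope=both;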

The SQL statement is actually checking two things. The first is looking for the current username in the dba_goldengate_privileges view. This view isn’t listed in the Oracle 11.2 documentation, but it does appear in the 12c docs:

ALL_GOLDENGATE_PRIVILEGES displays details about Oracle GoldenGate privileges for the user.

Oracle GoldenGate privileges are granted using the DBMS_GOLDENGATE_AUTH package.

Related Views

DBA_GOLDENGATE_PRIVILEGES displays details about Oracle GoldenGate privileges for all users who have been granted Oracle GoldenGate privileges.

USER_GOLDENGATE_PRIVILEGES displays details about Oracle GoldenGate privileges. This view does not display the USERNAME column.

I had previously run dbms_goldengate_auth to grant privs here, so should be OK.
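
(That grant looks along these lines; GGADMIN is a placeholder for your GoldenGate user:)

SQL> exec dbms_goldengate_auth.grant_admin_privilege('GGADMIN');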

The second check simply verifies that the DBA role has been granted to the current user, again as recommended by the documentation. (A side note: in previous versions, I had avoided granting the overly broad DBA role to the GoldenGate user, in favor of specific grants for the objects it uses. There’s no reason for the GoldenGate user to be able to read and modify data objects that aren’t related to its own replication activities, for example. And I would argue that restricted permissions help avoid errors, such as putting the wrong schema in a map statement. But sadly that’s no longer possible in the world of 11.2.0.4.)

Running the query manually to verify that the grants are indeed in place:
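
A direct look at the view is enough for a sketch of that check (the traced statement also joins in the DBA role grant):

SQL> select * from dba_goldengate_privileges;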

Because this SQL statement involves an ordinary select without an aggregate function, I can look at the FETCH line in the tracefile to get the number of rows returned. In this case it’s r=0, meaning no rows returned.

The query itself is looking for a system property I haven’t seen before: GG_XSTREAM_FOR_STREAMS. A Google search returns only a single result: the PDF version of the Oracle 11.2 XStream guide. Quoting:

ENABLE_GG_XSTREAM_FOR_STREAMS Procedure

This procedure enables XStream capabilities and performance optimizations for Oracle Streams components. This procedure is intended for users of Oracle Streams who want to enable XStream capabilities and optimizations. For example, you can enable the optimizations for an Oracle Streams replication configuration that uses capture processes and apply processes to replicate changes between Oracle databases.

These capabilities and optimizations are enabled automatically for XStream components, such as outbound servers, inbound servers, and capture processes that send changes to outbound servers. It is not necessary to run this procedure for XStream components.

When XStream capabilities are enabled, Oracle Streams components can stream ID key LCRs and sequence LCRs. The XStream performance optimizations improve efficiency in various areas, including:

- LCR processing
- Handling large transactions
- DML execution during apply
- Dependency computation and scheduling
- Capture process parallelism

On the surface, I don’t see what this would have to do with trigger execution, but I’m going to try enabling it as per the newly read document anyway:
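
The call itself comes from DBMS_XSTREAM_ADM, per the quoted guide (a sketch):

SQL> exec dbms_xstream_adm.enable_gg_xstream_for_streams(enable => true);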

Now we look in v$session, to see if a session associated with the process with OS PID 2293 (which happens to be the SPID of our current shadow process) has a PROGRAM column starting with the word extract. extract is, naturally, the name of the GoldenGate executable that captures data from the source system. In a GoldenGate system, however, trigger suppression does not happen in the extract process at all, but rather in the replicat process that applies changes on the target system. So I’m going to skip this check and move on to the next one in the tracefile:

This SQL is similar to the previous one, but instead of looking for a program called extract, it looks for one called replicat, and adds an extra check to see if the module column either starts with OGG or is called GoldenGate. And since it’s the replicat process that does trigger disabling in GoldenGate, this check is likely to be the relevant one.

To make this check succeed, I’m going to have to change both the program and module columns in v$session for the current session. Of the two, module is much easier to modify: a single call to dbms_application_info.set_module. But modifying program is less straightforward.

One approach is to use Java code with Oracle’s JDBC Thin driver, setting the aptly-named v$session.program property, as explained at De Roeptoeter. But I’m hoping to stay with something I can do in SQL*Plus. If you’ve looked through a packet trace of a SQL*Net connection being established, you will know that the program name is passed by the client at connection time, so it could be modified by altering the network packet in transit. But this is complex to get working, as it also involves fixing checksums and the like. There’s a post on Slavik’s blog with a sample OCI C program that modifies its own program information. Again, more complexity than I’d like, but it gave me an idea: if program is populated from the name of the client-side executable, why don’t we simply copy sqlplus to a name that dbms_xstream_gg likes better?
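
A minimal sketch of that idea, assuming the checks match on a program name starting with replicat and a module starting with OGG, as the traced SQL suggests:

$ cp $ORACLE_HOME/bin/sqlplus $ORACLE_HOME/bin/replicat
$ $ORACLE_HOME/bin/replicat / as sysdba
SQL> exec dbms_application_info.set_module(module_name => 'OGG', action_name => null);
SQL> exec sys.dbms_xstream_gg.set_foo_trigger_session_contxt(fire => true);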

Wrapping up

So it looks like you can disable triggers per-session in 11.2.0.4 just like previous versions, but need to jump through a few more hoops to do it. A few conclusions to draw:

Oracle patchsets, while normally intended to include bugfixes, can have major changes to underlying functionality too. See Jeremy Schneider’s post on adaptive log file sync for an even more egregious example. So before applying a patchset, test thoroughly!

The enforcement of full DBA privileges for the GoldenGate user in Oracle 11.2.0.4 means that very broad permissions are required to use GoldenGate, which can be a concern in security-conscious or consolidated environments.

TL;DR: Yes you can still disable triggers per-session in Oracle 11.2.0.4, but you have to have a GoldenGate license, set the enable_goldengate_replication parameter, use a program name that starts with replicat, and set your module to OGG.

I’ve spent the better part of the day troubleshooting an issue with Oracle’s Auto Service Request (ASR) and wanted to share my results in case it saves someone else some effort.

The ASR manager is designed to be a site-wide aggregation point for ASR alerts, receiving SNMP traps and forwarding them over HTTPS to transport.oracle.com. But if you’re using port 162 for SNMP traps on a Linux system, you may find that such traps are never sent to Oracle.
I was testing this by creating test traps through IPMI:
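
(A sketch of the kind of test event I mean: ipmitool’s predefined event 1 asserts a temperature threshold, which a BMC configured for platform event traps will forward as an SNMP trap:)

# ipmitool event 1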

This is part 3 of a multipart series of getting Oracle RAC running on a cloud environment. In part 1, we set up a NFS server for shared storage. In part 2, we set up OS components for each RAC server. Now we finish up the OS configuration and move to the Oracle grid infrastructure.

Passwordless SSH, take two

Now that we have Oracle users on both rac01 and rac02, we need to configure passwordless SSH between them. (It’s also possible to do it from the installer, but I prefer to do it myself.)

Getting RAM for the install

Before we run the Oracle installer, we should expand the physical RAM for each machine. This can be done from the Gandi control panel for each server. When I first tried this I got a quota error, and had to raise a support ticket (and wait for a response) to get the quota raised. A second issue with the RAM is that the VM doesn’t see the full amount allocated: when I fired up a 4GB instance, Linux only saw 3667716k available, and the Oracle installer promptly complained about insufficient memory.

So instead of 4096MB of memory, we’re going to adjust rac01 and rac02 to have 4800MB. After adjusting in the control panel, the operation may show as complete within a minute or so, but in my experience the server didn’t consistently get the additional memory. So while logged onto rac01 as oracle, have a look:

for host in rac01-pub rac02-pub; do echo $host; ssh $host free; done

If each node shows 4388612 total memory, you’re golden. Otherwise, reboot the nodes.
(And yes, 700MB seems like an awful lot of memory to simply not be available to the OS; I’m wondering what’s using the space.)

Getting ready for the installer

By now the Oracle software download should be complete, and we need to give the downloaded files .zip extensions and install an unzipper to use them, as shown below. (Note to Oracle packagers: unzip isn’t _all_ that common in the Linux world, and gzip provides better compression rates anyways. Why not do tarballs?)
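
Something along these lines (the downloaded file names are placeholders for whatever the download gave you):

# yum install -y unzip
$ mv linuxamd64_12c_grid_1of2 linuxamd64_12c_grid_1of2.zip
$ unzip linuxamd64_12c_grid_1of2.zip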

Starting the VNC server (still logged in as oracle); you’ll need to supply a password the first time around.

vncpasswd
vncserver :1

Now we start a VNC viewer locally. If you don’t have one already, you can download one from www.realvnc.com

Grid Infrastructure Install

Connecting to display 1 of the server’s external IP, you should get an xterm window if all went well. Running the installer:

cd /srv/datadisk01/dl/grid
./runInstaller

Skipping software updates, and doing a cluster install, using a standard cluster. Doing an advanced install. Using the default language. Under “grid plug and play” we need to set up the node naming. Using cluster name “rac-cluster”, and SCAN name “rac-cluster” as defined in /etc/hosts earlier. On the cluster node screen, we should see that rac01-pub has been detected. Adding rac02-pub too, with rac02-pub-vip as its VIP address.

Now comes the validation, where we learn if SSH, naming etc were properly set up. If all goes well, you’ll make it to the network interface usage screen. Here we need to make changes: eth0 shouldn’t be used, eth1 is public, and eth2 is private. The management repository is a choice: it takes up memory and install time, but it does allow us to use such things as QoS management, and it can only be created at install time. I chose to skip.

For storage, we’re using a shared file system: the NFS we created. Using external redundancy since it’s a single disk anyway. Doing the same for the voting disk.

Not using IPMI. We’ll also leave the ASM oper group blank, and accept the warning.

Using the default “/u01/app/oracle” and “/u01/app/12.1.0/grid” directories for ORACLE_BASE and grid home. Using /u01/app/oraInventory for oraInventory. You can either run sudo yourself or let the installer do it. I like to run it myself to have more control over re-running and deconfigs if required.

Now it’s time for the prerequisite checks. If all previous steps have succeeded, you shouldn’t see any warnings at all.

Saving the response file and kicking off the install itself.

Running the orainstRoot.sh from the oraInventory, plus root.sh from the grid home. Running on rac01 first.

Running orainstRoot.sh and root.sh ourselves, since we have sudo set up. It does take some time to run, as the grid infrastructure is shut down and started up a few times.

Database install

Now that the grid infrastructure is in place, we can move onto the actual database install. We can re-use the same VNC window to install:

Skipping software updates and skipping the DB creation too (software only). Picking a RAC install. At this point we should see both nodes detected. Using the default language.

Installing enterprise edition with default home locations. In the group selection, it won’t let me select dba, even though the group was created by the preinstall RPM. For now I’ll select oinstall.

The rest are default.

Running root.sh, which this time is very short.

Database creation assistant

With a database home we can run the creation assistant. But first, working on a hugepage configuration. /proc/meminfo is missing the HugePages lines entirely, and it does look like, regrettably, the supplied kernel does not support hugepages:

[oracle@rac01-pub ~]$ zgrep HUGETLB /proc/config.gz
# CONFIG_HUGETLBFS is not set
# CONFIG_HUGETLB_PAGE is not set

And a quick web search seems to show that, while custom kernel support has been a long-standing user request at Gandi, it’s still not available.

So onto the install. From the same VNC session:

/u01/app/oracle/product/12.1.0/dbhome_1/bin/dbca

Creating a new database. Using the advanced install with:

RAC database (default)

Admin-managed

General purpose/transaction processing

DB name: racdb

non-PDB

Selecting both nodes to run on

Configuring EM express

Running CVU periodically

Picking a password

File system storage

/srv/datadisk01/oradata/oradata – default

Default FRA, using the default size of 5G

Archiving disabled

Skipping sample schemas and Database Vault

Unselecting automatic memory management

Leaving the remaining parameters default

And we’re installed and have a database. It can be tested via SQL*Plus:

And that’s it for the series. I made it through with 50,000 credits remaining in my Gandi account to play with.

Feel free to ping me in case of issues getting running. Some of these steps are a combination of several iterations as bugs were worked out, so it’s likely that there are some gremlins still lurking, and I’ll try and incorporate fixes as issues are discovered.

Lessons Learned

Yes, Oracle RAC can be installed cleanly on a cloud environment, and at $17, the price is right

True shared storage from a cloud provider is still hard to come by, limiting the high-availability potential

There are quite a few extra steps required to satisfy the RAC installer and its prerequisite checks

In the Gandi environment, you need to overallocate RAM as not all of it is visible to the OS

The lack of hugepage support in the Gandi kernel (and complete lack of custom kernel support) further increases memory requirements

A dummy oracle-release RPM is all we need to keep the OS prerequisite checks happy

In part 1 of this series, we talked about some of the challenges of setting up Oracle RAC on a public cloud provider, and went on to order some VMs from provider Gandi, and finally configuring an NFS server for shared storage. In this post, we move on to configuring the RAC servers themselves, rac01 and rac02.

Network config

After starting up the RAC nodes in the GUI, we can log in via the SSH key we created. The first order of business is to set up networking. The eth1 and eth2 VLAN interfaces have no network configuration at all by default. We’ll set up a configuration that uses DHCP, sending a hostname to the DHCP server so it knows which IP address to give us; a sketch follows below. This ties into the dnsmasq configuration we set up in part 1, which automatically assigns IP addresses to the eth1 and eth2 private-VLAN network interfaces.
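
A sketch of what that looks like for eth1 (the rac01-priv hostname is my placeholder; DHCP_HOSTNAME is the option that passes the name to the DHCP server):

cat > /etc/sysconfig/network-scripts/ifcfg-eth1 <<-EOF
DEVICE=eth1
ONBOOT=yes
BOOTPROTO=dhcp
DHCP_HOSTNAME=rac01-priv
EOF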

Fun with hostnames

For DNS resolution, we simply point ourselves to the DNS server on server01, 10.100.0.1:

cat > /etc/resolv.conf <<-EOF
nameserver 10.100.0.1
EOF

One issue I ran into with hostnames: the grid infrastructure install expects its public network to be associated with the hostname of the machine. But in the Gandi setup, the hostname is associated with the Internet-facing IP. Modifying /etc/sysconfig/network to change the hostname to rac01-pub.

But even after rebooting, the hostname is still rac01. More digging showed it to be part of Gandi’s auto-configuration scripts. But conveniently they provide a file, /etc/sysconfig/gandi, where specific configurations can be turned off. There are a few settings that won’t play well with RAC: the hostname, as mentioned above, but also the mountpoint name, since the grid infrastructure expects the same mountpoint names on both RAC nodes, so the default /srv/rac01data mountpoints won’t work. And lastly, we have our own DNS server, so we don’t want the Gandi configuration to mangle our resolv.conf.

The relevant section of the /etc/sysconfig/gandi looks like this:

# set to 0 to avoid hostname automatic reconfigure
CONFIG_HOSTNAME=1
# set to 0 to avoid nameserver automatic reconfigure
CONFIG_NAMESERVER=1
# allow mounting the data disk to the mount point using the disk label
CONFIG_ALLOW_MOUNT=1
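
Flipping all three to 0 in one pass (a simple sed over the file shown above):

sed -i -e 's/^CONFIG_HOSTNAME=1/CONFIG_HOSTNAME=0/' \
    -e 's/^CONFIG_NAMESERVER=1/CONFIG_NAMESERVER=0/' \
    -e 's/^CONFIG_ALLOW_MOUNT=1/CONFIG_ALLOW_MOUNT=0/' /etc/sysconfig/gandi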

A bit about the /etc/fstab line: Gandi created our filesystem with a filesystem label, so we can use it to locate the disk even if the actual device node /dev/xvdb changes. The mountpoint options are taken from Gandi’s defaults. Notably, “noatime” avoids doing a disk write every time a file is accessed.
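
The line itself ends up looking something like this (the label and mountpoint are illustrative; check yours with e2label):

LABEL=datadisk01 /srv/datadisk01 ext4 defaults,noatime 0 0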

And Oracle checks for /dev/shm, just in case you want to use memory_target. Adding a config here to make it owned by oracle, to avoid ugly error messages from dbca later on. If you get an “ORA-00600: internal error code, arguments: [SKGMHASH]…” error message, it may very well be a permission issue on /dev/shm. The “54321” userid is the numeric ID of the oracle user that the preinstall RPM will create later on.
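
A sketch of the /etc/fstab entry for that (tmpfs accepts uid/gid mount options; 54321 is the oracle UID/GID mentioned above):

tmpfs /dev/shm tmpfs rw,uid=54321,gid=54321 0 0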

The preinstall RPM complains about a UEK dependency. On this kind of cloud environment we can’t install custom kernels like UEK, so we do need to stay with the default kernels. And we’re running CentOS anyway.

flashdba has an excellent blog post on this subject; he lists a two-line workaround: downloading the full UEK kernel package and installing it with the --justdb and --nodeps options. I can confirm that it works, but it requires a big UEK package download, plus it results in warnings about missing dependencies every time yum is run.

So instead, we’re going to create a dummy RPM package to satisfy the dependency. It won’t have any files or scripts, but will match the name that the preinstall RPM is looking for.

To actually turn this specification into an RPM, we need the rpm-build package installed as well:
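
A minimal sketch of such a spec and the build steps (the kernel-uek package name and version are my assumptions about what the dependency resolver wants; adjust them to match the dependency yum actually complains about):

# yum install -y rpm-build

cat > kernel-uek-dummy.spec <<'EOF'
Name: kernel-uek
Version: 2.6.39
Release: 0.dummy
Summary: Empty package to satisfy the preinstall RPM's UEK dependency
License: GPL
BuildArch: noarch
%description
Dummy package: provides kernel-uek but installs no files.
%files
EOF

$ rpmbuild -bb kernel-uek-dummy.spec
# rpm -ivh ~/rpmbuild/RPMS/noarch/kernel-uek-2.6.39-0.dummy.noarch.rpm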