Why does my Linux virtual machine lose time?

Although I am a real Unix adept, I run Windows XP on my laptop because it is much more accessible to the blind: its screen readers are more widely available and, I found, more feature rich than Unix screen readers (if they exist at all). But this doesn’t mean I can’t run Linux on my laptop as well. As a matter of fact, I prefer to run my Oracle demos on a VMware Server virtual machine running Oracle Enterprise Linux, which in turn runs my Oracle10g and Oracle11g databases.

The problem I encountered with this setup is that my Linux virtual machines lose time and the clock falls behind rather quickly. After a bit of research I discovered that the default Linux kernel runs at a 1000Hz internal clock frequency and that VMware is unable to deliver the clock interrupts on time without losing them. Some clock interrupts are therefore lost without notice to the Linux kernel, which assumes each interrupt marks 1/1000th of a second. So each lost clock interrupt makes the clock fall behind by 1/1000th of a second.

Although you can let VMware synchronize the guest O/S clock to the host O/S clock, I don’t recommend this because it makes your Linux clock very bumpy. As I understand it, if you enable this clock synchronization VMware sets the Linux clock equal to the host clock every minute. This means that if my clock falls 3 seconds behind every minute, the clock will jump forward 3 seconds each time VMware does its synchronization thing. You can imagine what this means for the Oracle database instrumentation. I also tried to keep the clock synchronized using the Network Time Protocol (NTP), but it didn’t work because the time loss is too unpredictable and NTP gave up. Nothing else I tried solved the problem. The solution is to recompile the Linux kernel with a 100Hz internal kernel frequency.
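To put numbers on the drift described above, here is a small shell sketch of the arithmetic. The function names are made up for illustration, and the 3 seconds per minute figure is just the example from the text, not a measurement. In this script the ‘#’ characters start comments; they are not a root prompt.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Timer ticks lost per minute for a drift of $1 whole seconds per minute
# on a kernel that expects $2 ticks per second (HZ).
lost_ticks_per_minute() {
    drift_seconds=$1
    hz=$2
    echo $(( drift_seconds * hz ))
}

# Share of all ticks that is lost, as a whole percentage.
# Lost ticks are drift * HZ out of 60 * HZ per minute, so HZ cancels out.
lost_tick_percent() {
    drift_seconds=$1
    echo $(( drift_seconds * 100 / 60 ))
}

echo "lost ticks per minute at 1000Hz: $(lost_ticks_per_minute 3 1000)"   # 3000
echo "percentage of ticks lost: $(lost_tick_percent 3)"                   # 5
```

At 1000Hz the guest expects 60000 interrupts per minute, so a 3 second drift means roughly 3000 of them never arrived. A 100Hz kernel only needs 6000 interrupts per minute, which is far easier for VMware to deliver on time.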

Recompiling the Linux kernel

Note: The following procedure is only applicable to Oracle Enterprise Linux 5. If there is enough demand I will explain the procedure for Oracle Enterprise Linux 4 in a future posting.

To recompile the Linux kernel I first need to know which kernel I am running and second I need to get the kernel source code for that kernel. I can get the kernel release with the uname command as shown below:

# uname -r
2.6.18-128.0.0.0.2.el5

Next I can download the kernel source code from the Oracle Open Source website. In my case I need to download the kernel-2.6.18-128.0.0.0.2.el5.src.rpm file. Once downloaded I can install this kernel source RPM with the rpm command as follows:

# rpm -i kernel-2.6.18-128.0.0.0.2.el5.src.rpm

Note: The ‘#’ prompt indicates that I ran this as the root user. The command prints warnings, which can be ignored.
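The file name to download follows directly from the kernel release. A minimal sketch, assuming the source RPM is always named after the plain release string (the function name is hypothetical, and a release with a variant suffix such as PAE or xen would need that suffix stripped first):

```shell
#!/bin/sh
# Hypothetical helper: build the source RPM file name for a kernel release.
# Assumes the naming pattern kernel-<release>.src.rpm seen in this posting.
srpm_name_for_release() {
    echo "kernel-$1.src.rpm"
}

# Print the file name that matches the running kernel.
srpm_name_for_release "$(uname -r)"
```

For the release shown above this prints kernel-2.6.18-128.0.0.0.2.el5.src.rpm.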

The kernel sources are now installed in the /usr/src/redhat/SOURCES directory, and a so-called SPEC file, which will be used to build the kernel RPM, is installed in /usr/src/redhat/SPECS. Before recompiling the kernel I first need to change the internal clock frequency from 1000Hz to 100Hz. This is done by changing a setting in a configuration file. The name of this configuration file depends on the hardware architecture, so I first need to get my machine type with the uname command as follows:

# uname -m
i686

The configuration file is located in /usr/src/redhat/SOURCES and the name is kernel-2.6.18-i686.config. In this file I need to change the line with CONFIG_HZ_1000=y into CONFIG_HZ_100=y and I am ready to compile the kernel with the rpmbuild command given the SPEC file as its input as shown below:

# cd /usr/src/redhat/SPECS
# rpmbuild --target=i686 -bb kernel-2.6.spec

This will run for an hour or more generating lots of output. Once finished the compiled kernel RPM is in /usr/src/redhat/RPMS/i686 with the name kernel-2.6.18-128.0.0.0.2.el5.i686.rpm waiting to get installed.
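The configuration change made before the build can also be scripted instead of done in an editor. The following is only a sketch: the function name is made up, and it changes nothing but the single CONFIG_HZ_1000=y line mentioned earlier, keeping a backup copy with an .orig suffix.

```shell
#!/bin/sh
# Hypothetical helper: switch a kernel config file from the 1000Hz to the
# 100Hz timer setting. The untouched original is kept as <file>.orig.
set_hz_100() {
    config_file=$1
    cp "$config_file" "$config_file.orig" &&
    sed 's/^CONFIG_HZ_1000=y$/CONFIG_HZ_100=y/' "$config_file.orig" > "$config_file"
}

# Usage (as root, before running rpmbuild):
#   set_hz_100 "/usr/src/redhat/SOURCES/kernel-2.6.18-$(uname -m).config"
```

The sketch avoids sed’s in-place option so that the original file is always available for comparison with diff.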

Installing the new kernel

The new kernel can be installed with the rpm command, but the same kernel is currently running, so a reboot with a different kernel is required before continuing. After the reboot I recommend removing the currently installed kernel with rpm before installing the newly compiled one.

Note: It is possible that the kernel remove fails due to dependencies which have to be removed before the kernel is removed, and reinstalled afterwards.
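The removal and installation might look like the sketch below. It assumes the package name and RPM path from the earlier steps; the run and replace_kernel helpers and the DRY_RUN variable are made up for this example. By default the script only prints the commands, so nothing is removed by accident.

```shell
#!/bin/sh
# Assumed names, taken from the build steps earlier in this posting.
KERNEL_PKG="kernel-2.6.18-128.0.0.0.2.el5"
NEW_RPM="/usr/src/redhat/RPMS/i686/$KERNEL_PKG.i686.rpm"

# Print each command by default; set DRY_RUN=0 to really execute it (as root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Remove the stock kernel package first, then install the rebuilt one.
# The install is only attempted when the removal succeeded.
replace_kernel() {
    run rpm -e "$KERNEL_PKG" &&
    run rpm -i "$NEW_RPM"
}

replace_kernel
```

Running it with DRY_RUN=0 only makes sense after booting a different kernel, as described above.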

A final reboot is required. After setting the clock it should no longer fall behind, but setting up NTP is still a wise thing to do.
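To check that the rebooted guest really runs a 100Hz kernel, the tick rate can be read back from the kernel’s config file. This sketch assumes the kernel RPM installed that file as /boot/config-<release>; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the CONFIG_HZ value recorded in a kernel
# config file. Lines such as CONFIG_HZ_100=y are deliberately not matched.
config_hz() {
    sed -n 's/^CONFIG_HZ=\([0-9][0-9]*\)$/\1/p' "$1"
}

# Usage on the rebooted guest:
#   config_hz "/boot/config-$(uname -r)"
```

If this still reports 1000, the guest most likely booted the old kernel.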

Warning: Recompiling (and installing) the Linux kernel yourself makes your environment unsupported by Oracle and should never be done on a production environment.

-Harald

This entry was posted on February 8, 2009 at 18:59 and is filed under Linux.

Adrian Hollay said

For SUSE SLES running 32-bit and 64-bit kernels I am using the “clock=pit” kernel setting and just a crontab entry to set the clock every minute using ntpdate:
* * * * * /usr/sbin/ntpdate 192.168.1.1 >> /var/log/ntpdate.log 2>&1

Harald van Breederode said

Many many thanks for pointing me to this KB article. Yesterday I installed a 1000Hz kernel in both an EL4 and an EL5 VMware Server VM and they both run on time with the mentioned kernel parameters, even without NTP. This saves me quite some kernel compilations.

If I had known this a week ago I wouldn’t have had to write this posting ;-)
-Harald

dikkiedick said

I noticed during the training you gave last week that the time on your VMware servers was falling further and further behind your Windows laptop’s time. Why was that, given that you’ve found a solution for the problem?

Harald van Breederode said

I recently installed a new kernel and somehow the extra kernel arguments “clock=pmtmr divider=10” from /etc/grub.conf were lost in this process. I re-added them and time keeping is back to normal.
-Harald
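Since these arguments can silently disappear when a new kernel is installed, a quick check after every kernel update may help. A minimal sketch, with a made-up function name, that tests a kernel command line for the arguments named above:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if every required argument appears as a
# whole word in the given kernel command line.
has_kernel_args() {
    cmdline=$1
    shift
    for arg in "$@"; do
        case " $cmdline " in
            *" $arg "*) ;;
            *) echo "missing kernel argument: $arg" >&2; return 1 ;;
        esac
    done
    return 0
}

# Usage on the guest:
#   has_kernel_args "$(cat /proc/cmdline)" clock=pmtmr divider=10
```

The whole-word match keeps divider=100 from being mistaken for divider=10.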