This blog article is not purely Oracle Database specific, yet it may be of interest to companies, and to those DBAs lucky or unlucky enough, that run Oracle Database on the Windows Server platform.

I am in the process of setting up a couple of new Windows servers to perform various non-Oracle Database tasks. I noticed that one of the servers had an odd issue – the server would occasionally become very slow at responding to mouse movements and keyboard input, for instance taking 30 seconds to move the mouse pointer a short distance across the screen. These servers are running Windows Server 2012, which shares the same kernel and includes much the same features as Windows 8 – with the exception that the server operating system opens to the desktop rather than Windows 8’s new start screen.

Two years ago I wrote a brain teaser article that asked how it was possible that a 10046 extended SQL trace could output c=15600,e=510 on a line of the trace file when executing a SQL statement without using parallel query – essentially asking how it was possible to consume 0.015600 seconds of CPU time in 0.000510 seconds of elapsed time when the SQL statement was restricted to running on no more than one CPU. In the comments section of the article I mentioned the ClockRes utility, but did not provide a link for the download of the program. So, I thought that I would run the ClockRes utility on one of the new servers, make a change to the server, and then run the ClockRes utility again:

As can be seen above, on the first execution of ClockRes the Current timer interval was 1.001 ms, while on the second execution of the ClockRes program the Current timer interval was 15.626 ms. There is an odd similarity between that 15.626ms time (which oddly exceeds the reported Maximum timer interval of 15.625ms) and the c=15600 reported in the Oracle 10046 extended SQL trace file. So, what change did I make to the server between the first execution of ClockRes utility and the second execution? For now I will just say that I stopped one of the background services on the server (more later).
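The c=15600,e=510 puzzle can be illustrated with a small model. This is only a sketch under a stated assumption: that CPU time is charged in whole timer ticks to whichever thread is running when the timer interrupt fires. The function name and the sample start times are hypothetical, chosen for illustration.

```python
TICK_US = 15_625  # default Windows timer interval, in microseconds (15.625 ms)


def charged_cpu_us(start_us: int, elapsed_us: int, tick_us: int = TICK_US) -> int:
    """CPU time charged to a thread that runs from start_us for elapsed_us.

    Simplified model (an assumption, not a description of the real scheduler):
    one full tick is charged for every timer interrupt that fires while the
    thread is on the CPU.
    """
    end_us = start_us + elapsed_us
    ticks_seen = end_us // tick_us - start_us // tick_us
    return ticks_seen * tick_us


# A 510 microsecond call that happens to straddle a tick boundary is charged a full tick.
print(charged_cpu_us(15_425, 510))
# The same call, falling entirely between two ticks, is charged nothing.
print(charged_cpu_us(1_000, 510))
```

Under this model a call with e=510 is charged either zero CPU time or one full tick, which is how the reported CPU time can exceed the elapsed time on a single CPU.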

I recall performing an experiment a couple of years ago with Oracle Database. I downloaded a utility that offered to change the Windows default timer resolution from 15.625ms to 1.0ms. That utility did in fact change the Windows timer resolution, resulting in Oracle Database outputting c= values in increments of 1000, rather than in increments of 15600. If I am remembering correctly, a second outcome of the experiment was a decrease in performance of the test Oracle database on the computer due to the higher resolution of the Windows timer.

Could the change in the resolution of the Windows timer from the Windows default of 15.625ms to 1.001ms be responsible for the occasionally sluggish performance of the server? One article that I found (and unfortunately did not save the link to) claimed that adjusting the Windows timer from the default of 15.625ms to a lower value, 1ms for example, could cause a significant negative impact in multitasking system performance (roughly 30% decrease, if I recall correctly). I located an article on Microsoft’s website that offered some level of clarification; below is a short quote from the article:

“Applications can call timeBeginPeriod to increase the timer resolution. The maximum resolution of 1 ms is used to support graphical animations, audio playback, or video playback. This not only increases the timer resolution for the application to 1 ms, but also affects the global system timer resolution, because Windows uses at least the highest resolution (that is, the lowest interval) that any application requests. Therefore, if only one application requests a timer resolution of 1 ms, the system timer sets the interval (also called the “system timer tick”) to at least 1 ms. For more information, see “timeBeginPeriod Function” on the MSDN® website.

Modern processors and chipsets, particularly in portable platforms, use the idle time between system timer intervals to reduce system power consumption. Various processor and chipset components are placed into low-power idle states between timer intervals. However, these low-power idle states are often ineffective at lowering system power consumption when the system timer interval is less than the default.

If the system timer interval is decreased to less than the default, including when an application calls timeBeginPeriod with a resolution of 1 ms, the low-power idle states are ineffective at reducing system power consumption and system battery life suffers.”

The above mentioned Microsoft article also suggested running the following command from the Windows command line:

powercfg /energy

I had actually executed the above command before running the ClockRes program for the first time, and again after running the ClockRes program for the second time. A very small portion of the powercfg generated HTML file follows, generated prior to the first execution of ClockRes:

Platform Timer Resolution:Platform Timer Resolution
The default platform timer resolution is 15.6ms (15625000ns) and should be used whenever the system is idle. If the timer resolution is increased, processor power management technologies may not be effective. The timer resolution may be increased due to multimedia playback or graphical animations.
Current Timer Resolution (100ns units) 10009
Maximum Timer Period (100ns units) 156250

This is the same section of the generated HTML file, generated after the second execution of ClockRes:

Platform Timer Resolution:Platform Timer Resolution
The default platform timer resolution is 15.6ms (15625000ns) and should be used whenever the system is idle. If the timer resolution is increased, processor power management technologies may not be effective. The timer resolution may be increased due to multimedia playback or graphical animations.
Current Timer Resolution (100ns units) 156261

That is potentially interesting. The output of powercfg (from the run made before the service was stopped) stated that PROGRA~2\APC\POWERC~1\agent\pbeagent.exe requested a timer resolution of 1.000 ms, which then changed the Windows server system-wide timer to 1.0009ms. PROGRA~2\APC\POWERC~1\agent\pbeagent.exe resolves to the “APC PBE Agent” service in Windows, which is a component of the American Power Conversion (APC) PowerChute Business Edition software. That software interfaces with an attached UPS to provide a gentle shutdown of the server in the event of an extended power outage. The “APC PBE Agent” service happens to be the service that I shut down between the first and second execution of the ClockRes utility.
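The powercfg report expresses timer values in 100 ns units, which makes them awkward to compare with the millisecond figures from ClockRes. A minimal conversion sketch follows; the helper names are hypothetical, and the three input values are the ones quoted from the reports above.

```python
def to_ms(units_100ns: int) -> float:
    """Convert a value expressed in 100 ns units to milliseconds."""
    return units_100ns / 10_000


def ticks_per_second(units_100ns: int) -> float:
    """Number of timer interrupts per second implied by a timer interval."""
    return 10_000_000 / units_100ns


for label, value in [
    ("Current resolution, APC service running", 10009),
    ("Maximum timer period", 156250),
    ("Current resolution, APC service stopped", 156261),
]:
    print(f"{label}: {to_ms(value):.4f} ms, {ticks_per_second(value):.1f} ticks/s")
```

At the default interval the timer fires 64 times per second; at roughly 1 ms it fires close to 1,000 times per second, which is the extra wake-up activity that the Microsoft article warns about.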

Interesting? Does that suggest that installing the APC PowerChute Business Edition software on a server potentially has a significant impact on the performance of that server due to the program’s insistence on changing the Windows system-wide timer resolution to 1ms? A quick observation indicates that the change made by the APC software to the Windows system-wide timer resolution does NOT apparently affect the reporting of the c=15600 entries in an Oracle Database 10046 extended SQL trace when the APC software is installed on the server. The question remains whether or not this APC software could significantly decrease the performance of that Oracle Database software (potentially by 30%, as suggested in the one unnamed article).

——

The Windows Server that is experiencing occasionally jittery mouse and keyboard input is reasonably high-end for a Windows server: Intel Xeon E5-2690 8 core CPU at 2.9GHz (with hyperthreading enabled, giving the appearance of 16 CPUs in Windows), 64GB of memory, RAID controller with 1GB of battery backed cache, 16 internal 10,000 RPM hard drives, two gigabit network adapters in a teamed configuration, etc. It should require a substantial load on the server to cause the jittery mouse and keyboard input behavior.

The power option plan in Windows was set to High Performance, while the default plan in Windows Server is Balanced. Various articles on Microsoft’s website state that the Balanced plan allows the server/operating system to use CPU speed throttling (reducing the CPU speed from the stated speed rating, 2.9GHz in the case of this server), and core parking (essentially putting one or more CPU cores to sleep) in order to reduce energy consumption. Some articles on Microsoft’s site indicate that, at least with Windows Server 2008, core parking may increase IO latencies – that, of course, would be bad if Oracle Database were installed on the server. Other articles on Microsoft’s site indicate that there are bugs, at least with Windows Server 2008, related to core parking which cause the parked cores not to wake up when the CPU load increases. I wonder if this particular bug is playing a part in the performance issue faced in this very recent Usenet thread that describes poor performance of Oracle Database running in Hyper-V on Windows?

Here is a screen capture of the Power Options window and Task Manager on the Windows Server 2012 machine that is experiencing occasionally jittery mouse and keyboard input (screen capture taken when the server was mostly idle):

Notice the inconsistency? The server’s CPU is throttled down from 2.9GHz to just 1.16GHz while the power option plan is set to High Performance. The Microsoft published “Performance Tuning Guidelines for Windows Server 2012” document on pages 16-17 states the following (I highlighted some of the words in red):

“High Performance: Increases performance at the cost of high energy consumption. Power and thermal limitations, operating expenses, and reliability considerations apply. Processors are always locked at the highest performance state (including “turbo” frequencies). All cores are unparked.

Power Saver: Limits performance to save energy and reduce operating cost. Caps processor frequency at a percentage of maximum (if supported), and enables other energy-saving features.”

Well, that is interesting, and is inconsistent with the above screen capture. Incidentally, when the server was experiencing the worst of the occasionally jittery mouse and keyboard input, the CPU utilization was hovering around 6% and the CPU speed was still coasting at 1.16GHz to 1.18GHz, the network performance hovered between 600Mbps and 1100Mbps, and the server’s internal hard drives barely noticed the traffic passing to/from the disks through the network interface (lower than 75MB/s and 137MB/s, respectively). 6% CPU utilization causes the mouse and keyboard input to become jittery? With hyperthreading enabled, there are essentially 16 CPU seconds available for each second of elapsed time. A quick check: 1/16 = 0.0625, so 1 (hyperthreaded) CPU at 100% utilization would be reported as a system-wide utilization of 6.25%. Interesting, but is that statistic relevant?

I happened to have the Windows Resource Monitor open during one of the jittery episodes. The Resource Monitor showed, shockingly, that 14 (possibly 15) of the hyperthreaded “CPUs” were parked! That result is also in conflict with the Microsoft document mentioned above regarding “all cores are unparked” when the High Performance power plan is selected. So, at 6% CPU utilization the server was CPU constrained. Modifying the setting in the server’s BIOS that controls whether or not cores may be parked, so that the cores could not be parked, fixed the issue in Windows Server 2012 that resulted in the 30 second delay that accompanied moving the mouse pointer a short distance across the screen.
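The relationship between the system-wide CPU figure and a single saturated logical CPU is simple arithmetic, sketched below. The function names are hypothetical; the inputs (16 logical CPUs, roughly 6% utilization) are the figures reported above.

```python
def system_wide_pct(busy_logical_cpus: float, logical_cpus: int) -> float:
    """System-wide utilization when busy_logical_cpus are each 100% busy."""
    return 100.0 * busy_logical_cpus / logical_cpus


def equivalent_busy_cpus(system_pct: float, logical_cpus: int) -> float:
    """How many fully busy logical CPUs a system-wide percentage represents."""
    return system_pct / 100.0 * logical_cpus


# One saturated logical CPU out of 16 shows up as 6.25% system-wide.
print(system_wide_pct(1, 16))
# The observed ~6% therefore corresponds to nearly one fully busy logical CPU.
print(equivalent_busy_cpus(6.0, 16))
```

With 14 or 15 of the 16 logical CPUs parked, a system-wide reading near 6% is consistent with the one or two unparked logical CPUs running flat out.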

The server still exhibits a bit of jittery behavior with mouse and keyboard input when the server’s teamed network cards are heavily used for file transfers to the server, but at least the CPU activity is no longer confined to a single hyperthreaded “CPU”:

—

Considering that this server was ordered from the manufacturer as “performance optimized”, I am a bit surprised at the power consumption of the server. The server was ordered with dual (redundant) 1100 watt power supplies. With the CPU’s 135 watt maximum TDP (per Intel: “Thermal Design Power (TDP) represents the near maximum power a product can draw for a thermally significant period while running commercially available software.”), 16 hard drives, and 64GB of memory, I fully expected the server to consume between 700 and 900 watts of electrical power.

Here is the server’s power consumption when the server is lightly loaded with roughly 68 running processes (note that the server is connected to a 120 volt power outlet):

Here is the server’s power consumption when the server is moderately loaded with between 600Mbps and 1100Mbps of network traffic (the mouse pointer was slightly jittery at this point):

One of the common arguments for server virtualization is energy savings – the above screen captures may suggest that energy savings may not be a significant cost-savings factor for virtualization with modern server hardware. One might question how much energy is really being saved when the network interface is maxed out by a single virtualized server, just 6% CPU utilization results in a jittering mouse pointer, and there are eight to ten virtualized servers stacked on the physical hardware (all competing for the scarce CPU and network resources).

With the BIOS option set to enabled, disk activity caused by network traffic results in occasionally jittery mouse movements on the server. Based on a bit of research, installing the Hyper-V role on either Windows Server 2012 or Windows 8 may disable CPU throttling and/or disable CPU parking.

—

Added June 5, 2014:

I finally had sufficient time to fully analyze this problem, where a 2.9GHz CPU in a Dell PowerEdge T620 server crawled along at a leisurely pace of about 1.16GHz, actually throttling back performance further as demand for the server’s resources increased. A second Dell PowerEdge T620 server with a 2.6GHz CPU that was purchased at the same time also coasted along at roughly 1.16GHz, but that server did not seem to throttle back performance further as demand for the server’s resources increased.

As a review, the screen capture shown below at the left shows the Windows Server 2012 Power Options settings and the Performance tab of the Task Manager. The screen capture below at the right shows the Windows Server 2012 Power Options settings and the Performance tab of the Task Manager after fixing this particular problem – note that the 2.9GHz CPU is now essentially overclocked at 3.28GHz (it has operated at roughly that speed since the fix).

The 2.9GHz PowerEdge T620 and the 2.6GHz PowerEdge T620 are both Active Directory domain controllers and internal DNS servers (along with supporting other tasks), so the occasionally slow (or extremely slow) performance of the servers negatively impacted the performance of other servers as well as client workstations.

There was a BIOS firmware update released in the third quarter of 2013, which was supposed to address some CPU throttling issues – that BIOS update did not seem to help the problem that I experienced.

I rebooted the server, pressed F2, and dug around in the settings a bit. I found that the System Profile Setting was set to “Performance per Watt” (I believe that this was how it was set when it left the Dell factory). I changed that setting to “Performance”, saved the changes, and rebooted the server again. The server is now consuming 200+ watts, and the CPU is freely exceeding its rated speed. The pictures below show the configuration changes made in the System BIOS settings to remove the electric power cap, thus allowing the server to behave as it should have from the factory:

I suppose that if a Dell PowerEdge T620 (or similar recent model Dell server) seems to be running a bit slower than expected (note that the particular problem mentioned above is NOT Windows specific – a Dell PowerEdge T620 running Linux should be affected in the same way), you might take a quick peek at the System Profile Setting in the System BIOS to make certain that the System Profile is set to Performance. As shipped from the factory, two Dell PowerEdge T620 servers purchased this year were NOT affected by the problems mentioned in this blog article.

Hello. Did Resource Monitor really list the cores as “parked” (as in actually writing this very word on each core)? If so then Microsoft has messed up power profiles a lot with their latest Windows, because on Windows 8 Professional none of the built in power schemes parks any core at all (W7 would park 50% by default).

Core Parking is not a processor feature anyway, but just a name for the OS scheduling processes/threads on cores to keep other cores unused. Even without Core Parking the cores are still put into sleep states (C3/5/7), just with a smaller chance. This again is handled by power profiles and may contribute to you experiencing different clock readings (Task-Manager is not the right tool for that, though). Even the very basic C1E state contributes to CPU frequency changes and it’s not controlled by Windows power schemes, but can either be turned off in BIOS or – among other things – by a free software called Throttlestop (if your BIOS allows it to).

Just to mention it: Having IOs spread among cores is not always a good thing. Microsoft even implemented an API into Windows Server for IO controllers to make sure that a core that processes the data also is the one that handles it to the IO driver/hardware. This is meant to keep the data within a single core’s cache instead of having to move it around between cores.

Yes, Resource Monitor listed the cores as Parked. I wish that I had taken a screen capture, but at the time I was more concerned about the availability of the server to respond to client requests and did not think to capture the screen at that time. I might try to reproduce the problem on the server.

Based on research that I performed after the problem was corrected with the BIOS modification, I found that Dell’s latest (12th generation) servers apparently have the ability to throttle total server power utilization to a specified “ideal” maximum power consumption ( en.community.dell.com/techcenter/extras/m/white_papers/20158582/download.aspx ). From what I could see, there were no default power profiles of that type enabled on the server. I do not know if Dell’s hardware power throttling features affected the performance problem that I experienced.

A couple of days after disabling CPU core parking in the BIOS, I decided to perform a test to see if any of Dell’s power throttling capabilities were at fault for the low reported CPU speed by attempting to keep the power consumption below a threshold, for instance 150 watts. I used a simple script included in the “Expert Oracle Practices: Oracle Database Administration from the Oak Table” book to induce a CPU load on the server for up to 10 minutes – CPULoad.vbs:

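' CPULoad.vbs - keep one logical CPU busy for up to 10 minutes
Dim i
Dim dteStartTime
dteStartTime = Now
' DateDiff("n", ...) returns the difference in minutes
Do While DateDiff("n", dteStartTime, Now) < 10
    i = i + 0.000001  ' trivial arithmetic, just to burn CPU cycles
Loop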

I concurrently executed the script 4 or so times, and noticed that the reported CPU speed in Task Manager increased from 1.16GHz all the way up to about 3.7GHz. I continued to execute additional copies of the script until there were 15-16 copies of the script running concurrently. The reported CPU speed in Task Manager dropped to about 2.9GHz (the rated speed). I then started one additional copy of the script – the mouse pointer movement became jerky/stuttering at that point. Surprisingly, the reported CPU speed automatically decreased (to around 1.5GHz if I remember right) until I started terminating copies of the CPULoad script. My memory is a bit fuzzy at the moment, but I believe that the server's power consumption had increased to about 350 watts at this point.
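For anyone who wants to repeat the experiment on a machine without Windows Script Host, a rough Python counterpart to the CPULoad script might look like the sketch below. It is not the script from the book; the function name and the short demonstration duration are assumptions made for illustration.

```python
import time


def burn_cpu(seconds: float) -> int:
    """Spin in a tight loop for roughly `seconds`; return the iteration count."""
    deadline = time.perf_counter() + seconds
    iterations = 0
    while time.perf_counter() < deadline:
        iterations += 1  # trivial work, just to keep one logical CPU busy
    return iterations


# Short demonstration run; pass 600 to approximate the 10 minute load.
print(burn_cpu(0.05) > 0)
```

As with the original script, each running copy keeps roughly one logical CPU busy, so starting several copies in separate processes reproduces the stepped load described above.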

Windows Server 2012 seems to do a better job of keeping processes running on the same CPU core than did older Windows versions, where a process may frequently jump between cores. I suspect that the better task scheduling will allow the computer to take better advantage of the L1, L2, and L3 processor caches.

I think your mouse (or rather USB) difficulties stem from some other driver misbehaving (DPC Latencies) or the USB itself doing so. I would try to get some current (USB) chipset drivers, current LAN drivers (turn off to check if this is the culprit) and try to disable USB power saving via Device Manager and power scheme.

Concerning the decrease of maximum CPU clock under constant load you can check (and sometimes change) the TPL of your CPU via Throttlestop. What this tells you is the *average* wattage allowed by the CPU (and Bios) for the whole CPU. Shorter peaks allow full Turbo Boost clocks, but prolonged load throttles the CPU down to match the average. In practice you still get some Turbo Boost, but not the max clock with prolonged load.

Those are excellent suggestions. When I first experienced the problem on the server I made certain that the server (which is roughly 1.5 months old now) had the most recent USB, network, RAID controller, video, and system bus drivers. I also upgraded the system BIOS and the RAID controller firmware, and also tried moving the mouse from the integrated USB 2 to a USB 3 card mounted in a PCI Express slot. The stuttering behavior did not change until I disabled Logical Processor Idling in the system BIOS. I believe that the issue might be related to system interrupt processing – the system interrupt processing seems to be handled by logical CPU 0, and if that logical CPU is overwhelmed with other processing tasks because the remaining logical CPUs are parked, then that would be a logical explanation for the stuttering mouse movement.

1.5 days ago I re-enabled Logical Processor Idling in the system BIOS, and so far the stuttering problem (and CPU parking issue) has not returned. There are now roughly 40 additional background processes running on the server, and the latest Windows updates were applied to the server – so those changes may be affecting the outcome of whether or not the logical CPUs are parking.

Regarding the term “TPL”, is that similar to “Max TDP” (Thermal Design Power) found on Intel’s website? http://ark.intel.com/products/64596/Intel-Xeon-Processor-E5-2690-20M-Cache-2_90-GHz-8_00-GTs-Intel-QPI
That makes a lot of sense. I was aware that Intel CPUs will slow down (throttle down the MHz/GHz speed) if the CPU starts to overheat, but I had not considered the possibility that the “Max TDP” may cause the performance to drop below the rated speed; higher wattage usage directly corresponds to higher generated heat, so what you stated is understandable.

Out of curiosity, I checked Resource Monitor on two different Windows 7 Pro computers – both showed that half of the logical CPUs were parked. I do not think that I noticed that behavior before – but I do wonder if those are the hyper-threaded virtual CPUs that were parked.

Interrupts – or mostly rather Deferred Procedure Calls (DPC) – can be handled by all cores. What “Core Parking” does is that the OS deliberately avoids scheduling threads on certain cores, which in turn increases the chance/time to send those cores into deeper sleep states (C3/5/7). This is why I see it more as an OS function than a processor feature. You get deep C states without core parking, too, just either less often or more distributed among different cores. And even though C states are a processor feature they are still fully controlled by the OS (Windows, OS X, Linux), in case of Windows by the active power scheme, which can even be changed by the user with the help of some Registry keys. BIOS/EFI can disable that control, though.

It is beneficial to keep low CPU load restricted to fewer cores. For one, this allows Turbo Boost to use higher clock frequencies on the fewer cores compared to having more cores active (=not in a C3/5/7 state). But sometimes more important, it keeps data inside the same cache instead of shifting it around different cores’ caches. And it keeps the cache of the active cores running instead of turning it off and on (which needs to reload all the data into the cache with C5/7, maybe even C3, I’d have to check again). So “Core Parking” is a good thing on its own, just the default power profiles don’t always match highest performance needs. One example for the latter is the negative influence of default Windows profiles on 4K random SSD throughput.

In your case Core Parking likely made all DPCs/Interrupts run on one core, because overall load was *low*. This becomes a problem if one driver keeps the core occupied longer than allowed (roughly 100 microseconds is what MS suggests, if I remember right), because only one driver can be served by each single core at a time, while all others have to wait unless other cores are available. Tools like DPC Latency Checker (not compatible with Windows 8 yet) and LatencyMon help to identify the driver that’s running afoul. Sometimes you need to use an older driver version, sometimes you need to do INF/registry hacks or even more exotic workarounds to make a driver behave, or you need to exchange the hardware for another one. For example, an old Broadcom WLAN driver that was only to be found in MS Web Driver Catalog would run with proper DPC times, same with some exotic NVidia GPU driver that had me do quite a creative workaround to get its own power-saving DPC issues in line.

–

Regarding “TPL”, it is not a MAX value like TDP, but an AVERAGE value. There are “Package Power limit” (watts), “Turbo Time Limit” (seconds) and “Package Current Limit” (ampere) listed in Throttlestop. My 2011 17″ Macbook Pro running a 2.3 GHz i7 (3.4 max) comes configured with a TPL Package Power Limit of 45 watts by EFI/BIOS. The listed specification for this CPU is 48 watts, though, so increasing the Package Power Limit to 48 watts makes about 0.1 GHz difference with prolonged maximum (near 100%) CPU load. Again, this is an AVERAGE value, so the CPU is allowed to go higher for several seconds. But with constantly high CPU load it will throttle down to match that wattage number. This can/will happen before the maximum temperature of the CPU is even reached (just did a short Prime95 test and TDP throttled down to 45 watts at 95°C, while the maximum is 105°C). So while TPL has an effect on temperature it is not controlled by temps.

–

Regarding the disabled Core Parking, or rather “Processor performance core parking min cores” being set to 100%: I don’t even have Hyper-V installed here and still all power schemes use this as a default. On Windows 7 this value was set to 50%, which corresponds to all hyperthreaded cores being parked, yes. In the light of Turbo Boost and Hyperthreading this doesn’t make full sense, but don’t forget that this is only a setting that tells the OS to keep threads away from those cores to force them to be idle. The latter still can happen even without cores being specifically “parked”, they just need to be idle enough. Other keys of the active power scheme control how C states are entered and left.

You should take a look at the C1/C3 “Auto demotion” setting in Throttlestop. Click on the button between BCLK and DTS (labeled something like C7, C7s or MAX). What this does is set two registers in the CPU which usually are left off by BIOS. This in turn tells the CPU directly (not the OS) to keep cores away from the deeper sleep states for longer, when it detects constant enough load. You could say that this needs the CPU to be “more idle” before entering a deeper sleep state (C1 Auto demotion keeps it longer in C1 instead of going C3, C3 Auto Demotion keeps it longer in C3 instead of C5).

Last, but not least, there is the C1E setting that ThrottleStop and some BIOSes allow you to change. C1 is the oldest “Idle” state of all, and at one point it was expanded by C1E. The latter allowed the CPU clock frequency to be throttled down directly by the CPU whenever C1 is entered. This can have some negative impact on anything that needs low latency reaction times. Before turning off all the other power-saving options and core parking, I’d try to change this single setting.

–

Just to mention it: I have no experience with Oracle databases (or other Server based ones) whatsoever. So you need to watch the impact of all these settings in practice on your own setups. ;-)
