from VMware's performance team

Tag Archives: ESXi

Along with the recent release of VMware vSphere 6.7 U2, we published a new whitepaper that shows the performance of a new scheduler option that was included in the 6.7 U2 update. We referred to this new scheduler option internally as the “sibling” scheduler, but the official name is the side-channel aware scheduler version 2, or SCAv2. The whitepaper includes full details about SCAv1 and SCAv2, the L1TF security vulnerability that made them necessary, and the performance implications with several different workload types. This blog is a brief overview of the key points, but we recommend that you check out the full document.

In August of 2018, a security vulnerability known as L1TF, affecting systems using Intel processors, was revealed, and patches and remediations were also made available. Intel provided micro-code updates for its processors, operating system patches were made available, and VMware provided an update for vSphere. The full details of the vCenter and ESXi patches are in a VMware security advisory that links to individual KB articles.

The ESXi-provided patches included a side-channel aware scheduler (SCAv1) that mitigated the concurrent-context attack vector for L1TF. Once that mode was enabled, the scheduler would only schedule processes on one thread for each core. This mode impacted performance mostly from a capacity standpoint because the system was no longer able to use both hyper-threads on a core. A server that was already fully utilized and running at maximum capacity would see a decrease in capacity of up to approximately 30%. A server that was running at 75% of capacity would see a much smaller impact to performance, but CPU utilization would rise.

In vSphere 6.7 U2, the side-channel aware scheduler has been enhanced (SCAv2) with a new policy to allow hyper-threads to be used concurrently if both threads are running vCPU contexts from the same VM. In this way, L1TF side channels are constrained to not expose information across VM/VM or VM/hypervisor boundaries.

Performance testing with several different workloads found a range of impact in performance for both SCAv1 and SCAv2 as compared to the default scheduler as the baseline of performance. If SCAv1 or SCAv2 were able to achieve the same performance, it would be 1.0, and if it achieved 75% of the performance, it would be .75. The graphs here show the performance impact at max server utilization and the impact at the reduced load of approximately 75% utilization.

The charts show that the SCAv2 scheduler, represented by the third bar in each group, recovers a significant percentage of performance in all cases, except for the monster VM test case. The monster VM test case was for a single large Oracle database VM that consumed an entire 4 socket host with 192 vCPUs. In configurations with a single large monster VM that uses all the logical threads of the host, SCAv1 had a slight performance advantage over SCAv2 in our testing.

The reduced load numbers show that at server usage levels of approximately 75%, the overall impact to performance is much lower. With SCAv2 and the overall load below 75%, tests show that the largest performance impact measured in these tests was 11%. The SCAv2 scheduler option, available in vSphere 6.7 U2, provides better performance than SCAv1 in almost all cases.

For full details about the individual benchmark tests as well as more details about L1TF and VMware’s response to it, please see the full whitepaper and VMware KB 55806.

VMmark is a free tool used by hardware vendors and others to measure the performance, scalability, and power consumption of virtualization platforms. If you’re unfamiliar with VMmark 3.x, each tile is a grouping of 19 virtual machines (VMs) simultaneously running diverse workloads commonly found in today’s data centers, including a scalable Web simulation, an E-commerce simulation (with backend database VMs), and standby/idle VMs.

I’m happy to announce that today we published the first VMmark 3.1 results. These results were obtained on systems meeting our industry-leading side-channel-aware mitigation requirements, thus continuing the benchmark’s ability to provide an indication of real-world performance.

Some mitigations for recently-discovered side-channel vulnerabilities (i.e., Spectre, Meltdown, and L1TF) incur significant performance impacts, while others have little or no impact. Today’s VMmark results demonstrate that even when additional mitigations are in place, ESXi hosts using the new 2nd-Generation Intel® Xeon® Scalable processors obtain higher VMmark scores than comparable 1st-Generation Intel Xeon Scalable processors. This is due to processor design improvements that reduce (or even negate) the performance impact of security mitigations, by mitigating some of the security vulnerabilities in hardware rather than in software.

These results, from Fujitsu, span all three VMmark publication categories:

So, how does this new performance result with Cascade Lake processors compare to the previous generation with Skylake processors? Hopefully a graph is worth a thousand words 😊…

As you can see, Fujitsu was able to achieve a higher score, while being able to run an additional tile (19 more VMs) and still meeting strict Quality-of-Service (QoS) compliance requirements imposed by the VMmark benchmark harness.

Industry-Leading Side-Channel Mitigation RequirementsGiven the numerous security vulnerabilities recently identified, we set a high bar in VMmark 3.1 that requires all applicable security mitigations in benchmarked environments to best represent secure, real-world customer environments.

These are the current security mitigation requirements for VMmark 3.1:

VMmark 3.1 Security Mitigations Table

Note: If “N/A” is listed, that vulnerability does not apply to that portion of the stack.

You’ve probably already heard about VMware Cloud on Amazon Web Services (VMC on AWS). It’s the same vSphere platform that has been running business critical applications for years, but now it’s available on Amazon’s cloud infrastructure. Following up on the many tests that we have done with Oracle databases on vSphere, I was able to get some time on a VMC on AWS setup to see how Oracle databases perform in this new environment.

It is important to note that VMC on AWS is vSphere running on bare metal servers in Amazon’s infrastructure. The expectation is that performance will be very similar to “regular” onsite vSphere, with the added advantage that the hardware provisioning, software installation, and configuration is already done and the environment is ready to go when you login. The vCenter interface is the same, except that it references the Amazon instance type for the server.

Our VMC on AWS instance is made up of four ESXi hosts. Each host has two 18-core Intel Xeon E5-2686 v4 (aka Broadwell) processors and 512 GB of RAM. In total, the cluster has 144 cores and 2 TB of RAM, which gives us lots of physical resources to utilize in the cloud.

In our test, the database VMs were running Red Hat Enterprise Linux 7.2 with Oracle 12c. To drive a load against the database VMs, a single 18 vCPU driver VM was running Windows Server 2012 R2, and the DVD Store 3 test workload was also setup on the cluster. A 100 GB test DS3 database was created on each of the Oracle database VMs. During testing, the number of threads driving load against the databases were increased until maximum throughput was achieved, which was around 95% CPU utilization. The total throughput across all database servers for each test is shown below.

In this test, the DB VMs were configured with 16 vCPUs and 128 GB of RAM. In the 8 VMs test case, a total of 128 vCPUs were allocated across the 144 cores of the cluster. Additionally the cluster was also running the 18 vCPU driver VM, vCenter, vSAN, and NSX. This makes the 12 VM test case interesting, where there were 192 vCPUs for the DB VMs, plus 18 vCPUs for the driver. The hyperthreads clearly help out, allowing for performance to continue to scale, even though there are more vCPUs allocated than physical cores.

The performance itself represents scaling very similar to what we have seen with Oracle and other database workloads with vSphere in recent releases. The cluster was able to achieve over 370 thousand orders per minute with good scaling from 1 VM to 12 VMs. We also recently published similar tests with SQL Server on the same VMC on AWS cluster, but with a different workload and more, smaller VMs.

UPDATE (07/30/2018): The whitepaper detailing these results is now available here.

In the past, I’ve always benchmarked performance of SQL Server VMs on vSphere with “on-premises” infrastructure. Given the skyrocketing interest in the cloud, I was very excited to get my hands on VMware Cloud on AWS – just in time for Amazon’s AWS Summit!

A key question our customers have is: how well do applications (like SQL Server) perform in our cloud? Well, I’m happy to report that the answer is great!

VMware Cloud on AWS Environment

The HTML5-based vSphere Client interface should be very familiar to vSphere administrators, making the move to the cloud extremely easy

This SDDC instance was auto-provisioned with 4 ESXi hosts and 2TB of memory, all of which were pre-configured with vSAN storage and NSX networking.

Each host is configured with two CPUs (Intel Xeon Processor E5-2686 v4); each socket contains 18 cores running at 2.3GHz, resulting in 144 physical cores in the cluster. For more information, see the VMware Cloud on AWS Technical Overview

Virtual machines are provisioned within the customer workload resource pool, and vSphere DRS automatically handles balancing the VMs across the compute cluster.

Benchmark Methodology

To measure SQL Server database performance, I used HammerDB, an open-source database load testing and benchmarking tool. It implements a TPC-C like workload, and reports throughput in TPM (Transactions Per Minute).

To measure how well performance scaled in this cloud, I started with a single 8 vCPU, 32GB RAM VM for the SQL Server database. To drive the workload, I created a 4 vCPU, 4GB RAM HammerDB driver VM. I then cloned these VMs to measure 2 database VMs being driven simultaneously:

I then doubled the number of VMs again to 4, 8, and finally 16. As with any benchmark, these VMs were completely driven up to saturation (100% load) – “pedal to the metal”!

Results

So, how did the results look? Well, here is a graph of each VM count and the resulting database performance:

As you can see, database performance scaled great; when running 16 8-vCPU VMs, VMware Cloud on AWS was able to sustain 6.7 million database TPM!

I’ll be detailing these benchmarks more in an upcoming whitepaper, but wanted to share these results right away. If you have any questions or feedback, please leave me a comment!

UPDATE (07/25/2018): The whitepaper detailing these results is now available here.

VMware and Microsoft have collaborated to validate and support the functionality and performance scalability of SQL Server 2017 on vSphere-based Linux VMs. The results of that work show SQL Server 2017 for Linux installs easily and has great performance within VMware vSphere virtual machines. VMware vSphere is a great environment to be able to try out the new Linux version of SQL Server and be able to also get great performance.

Using CDB, a cloud database benchmark developed by the Microsoft SQL Server team, we were able to verify that the performance of SQL Server for Linux in a vSphere virtual machine was similar to other non-virtualized and virtualized operating systems or platforms.

Our initial reference test size was relatively small, so we wanted to try out testing larger sizes to see how well SQL Server 2017 for Linux performed as the VM size was scaled up. For the test, we used a four socket Intel Xeon E7-8890 v4 (Broadwell)-based server with 96 cores (24 cores per socket). The initial test began with a 24 virtual CPU VM to match the number of physical cores of a single socket. Additional tests were run by increasing the size of the VM by 24 vCPUs for each test until, in the final test, the VM had 96 total vCPUs. We configured the virtual machine with 512 GB of RAM and separate log and data disks on an SSD-based Fibre Channel SAN. We used the same best practices for SQL Server for Linux as what we normally use for the windows version as documented in our published best practices guide for SQL Server on vSphere.

The results showed that SQL Server 2017 for Linux scaled very well as the additional vCPUs were added to the virtual machine. SQL Server 2017 for Linux is capable of scaling up to handle very large databases on VMware vSphere 6.5 Linux virtual machines.

We were able to get one of the new four-socket Intel Skylake based servers and run some more tests. Specifically we used the Xeon Platinum 8180 processors with 28 cores each. The new data has been added to the Oracle Monster Virtual Machine Performance on VMware vSphere 6.5 whitepaper. Please check out the paper for the full details and context of these updates.

The generational testing in the paper now includes a fifth generation with a 112 vCPU virtual machine running on the Skylake based server. Performance gain from the initial 40 vCPU VM on Westmere-EX to the Skylake based 112 vCPU VM is almost 4x.

The performance gained from Hyper-Threading was also updated and shows a 27% performance gain from the use of Hyper-Threads. The test was conducted by running two 112 vCPU VMs at the same time so that all 224 logical threads are active. The total throughput from the two VMs is then compared with the throughput from a single VM.

My colleague David Morse has also updated his SQL Server monster virtual machine whitepaper with Skylake data as well.

Back in March, I published a performance study of SQL Server performance with vSphere 6.5 across multiple processor generations. Since then, Intel has released a brand-new processor architecture: the Xeon Scalable platform, formerly known as Skylake.

Our team was fortunate enough to get early access to a server with these new processors inside – just in time for generating data that we presented to customers at VMworld 2017.

Each Xeon Platinum 8180 processor has 28 physical cores (pCores), and with four processors in the server, there was a whopping 112 pCores on one physical host! As you can see, that extra horsepower provides nice database server performance scaling:

We have just published a new whitepaper on the performance of Oracle databases on vSphere 6.5 monster virtual machines. We took a look at the performance of the largest virtual machines possible on the previous four generations of four-socket Intel-based servers. The results show how performance of these large virtual machines continues to scale with the increases and improvements in server hardware.

In addition to vSphere 6.5 and the four-socket Intel-based servers used in the testing, an IBM FlashSystem A9000 high performance all flash array was used. This array provided extreme low latency performance that enabled the database virtual machines to perform at the achieved high levels of performance.

Some similar tests with Microsoft SQL Server monster virtual machines were also recently completed on vSphere 6.5 by my colleague David Morse. Please see his blog post and whitepaper for the full details.

Achieving optimal SQL Server performance on vSphere has been a constant focus here at VMware; I’ve published past performance studies with vSphere 5.5 and 6.0 which showed excellent performance up to the maximum VM size supported at the time.

Since then, there have been quite a few changes! While this study uses a similar test methodology, it features an updated hypervisor (vSphere 6.5), database engine (SQL Server 2016), OLTP benchmark (DVD Store 3), and CPUs (Intel Xeon v4 processors with 24 cores per socket, codenamed Broadwell-EX).

Previous posts have shown vSphere can easily handle running Microsoft SQL Server on four-socket servers with large numbers of cores—with vSphere 5.5 on Westmere-EX and more recently with vSphere 6 on Ivy Bridge-EX. We recently ran similar tests on vCloud Air to measure how these enterprise databases with mission critical performance requirements perform in a cloud environment. The tests show that SQL Server databases scale very well on vCloud Air with a variety of virtual machine (VM) counts and virtual CPU (vCPU) sizes.

The benchmark tests were run with vCloud Air using their Virtual Private Cloud (VPC) subscription-based service. This is a very compelling hybrid cloud service that allows for an on-premises vSphere infrastructure to be expanded into the public cloud in a secure and scalable way. The underlying host hardware consisted of two 8-core CPUs for a total of 16 physical cores, which meant that the maximum number of vCPUs was 16 (although additional processors were available via Hyper-Threading, they were not utilized).