Significant performance regression (40%) (also present in 4.2.10)

Description

I have noticed a significant performance regression in CPU- and network-intensive operations between versions 4.2.6 and 4.2.8. The regression is also present in 4.2.10. The host runs Windows 7 x64 on an Intel i5-3570K, with Debian 6 (32-bit) guests.

Introduction

I am running a scientific project on routing and traffic management algorithms. We are testing new algorithms with our own software routing suite, which contains Click Modular Router and XORP (AGH Live Router).

Before deploying an experimental network in the real lab, I test it on virtual machines. VirtualBox is a very useful tool because of its performance and the simplicity of creating internal networks between VMs.

Details

The Click kernel module processes packets, which is a CPU-intensive operation. This means that the router's throughput is limited by the CPU, not by the throughput of the interface. For example, using Intel PRO/1000 virtual adapters I was able to reach 40 MB/s with the CPU ~95% loaded by the kernel Click thread.

However, after upgrading to version 4.2.10 I noticed a performance regression: throughput dropped to 24 MB/s. I started investigating the cause and found that the performance drop first appears in version 4.2.8.

I tested this on another machine (Core2Duo T7700, Win7 x64). Interestingly, in that case there is no performance degradation, or it is very small (13 to 12 MB/s).
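
To put the two measurements side by side, here is a minimal Python sketch that converts the reported throughput figures into a relative drop. The function name relative_drop is made up for illustration; the numbers are the ones quoted above.

```python
def relative_drop(before, after):
    """Return the throughput drop as a percentage of the 'before' value."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Figures reported in this ticket, in MB/s.
i5_drop = relative_drop(40, 24)    # i5-3570K host
c2d_drop = relative_drop(13, 12)   # Core2Duo T7700 host

print("i5-3570K: %.1f%% drop" % i5_drop)
print("Core2Duo T7700: %.1f%% drop" % c2d_drop)
```

The i5 figures give the 40% from the ticket title, while the Core2Duo figures work out to under 8%.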

How to reproduce this

Start all three virtual machines (R1, R2, R4). Each machine should boot from the appropriate live CD (R1 from r1.iso, R2 from r2.iso, etc.).

The system will start. After boot, an application called Clicky starts automatically. With Clicky you can inspect the currently loaded configuration of the Click kernel module. You can ignore it; if you see a graph inside Clicky, the configuration has been loaded correctly.

All three machines run OSPF daemons. After boot, wait ~30 s to let them establish adjacencies. You can check the propagation of routing information with the Linux command "route". Basically, you only need to check whether R1 "sees" network 192.168.4.0. If not, wait another minute. Sometimes XORP has problems installing routes into the kernel. If you don't see proper routes 2 minutes after boot, reboot R1.
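
The route check can also be done programmatically. The sketch below assumes the numeric output of "route -n" has been captured as text; has_route is a hypothetical helper, and the sample table (gateway, metric, interface names) is illustrative only, not captured from the actual R1 machine.

```python
def has_route(route_output, network):
    """Return True if `network` appears in the Destination column."""
    for line in route_output.splitlines():
        fields = line.split()
        # Data rows start with the destination address; the two header
        # lines start with other words and therefore never match.
        if fields and fields[0] == network:
            return True
    return False


# Illustrative sample only (assumed values, not real output from R1).
SAMPLE = """\
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
192.168.4.0     192.168.1.2     255.255.255.0   UG    2      0        0 eth0
"""

print(has_route(SAMPLE, "192.168.4.0"))
```

If the function returns False for 192.168.4.0, the OSPF adjacencies have probably not been established yet, or XORP has not installed the route.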

Create a large sample file on R4. You can use the "dd" command (see screenshots). After that, run "python -m SimpleHTTPServer 80" in the directory containing the created file.

Download this file to R1 with the command "wget 192.168.4.2/largefile".
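
For a throughput figure that does not depend on reading wget's progress display, the download can be timed directly. This is a rough sketch, not the method used for the numbers in this ticket: measure_download and throughput_mb_per_s are hypothetical names, 1 MB is assumed to be 10**6 bytes (wget's own rate display may use 1024-based units), and the code targets Python 3, so it may need adapting for the Python 2 on the Debian 6 guests.

```python
import time
import urllib.request


def throughput_mb_per_s(num_bytes, seconds):
    """Convert a byte count and a duration into MB/s (1 MB = 10**6 bytes)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return num_bytes / 1e6 / seconds


def measure_download(url, chunk_size=1 << 20):
    """Download `url`, discard the data, and return the average MB/s."""
    total = 0
    start = time.monotonic()
    with urllib.request.urlopen(url) as response:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return throughput_mb_per_s(total, time.monotonic() - start)
```

On R1 this would be called as measure_download("http://192.168.4.2/largefile"), using the same address as the wget command.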

Uninstall VirtualBox and install version 4.2.6. Do not make any changes to the VM configurations. Just start the machines and repeat the test from the first step.

The VT-x code got a major rewrite. The next major release should fix a number of bugs and should also increase the performance. Could you check if this test build still shows the performance problems you saw with VBox 4.2.12?

I have problems installing this build. The first time I managed to install it, but starting a VM resulted in a bluescreen and system crash, so I removed it. When I try to install it again, the installation ends prematurely with an error.