Description

I believe that some change between 5.1 and 5.2 effectively inverted the behavior of the "Use Host I/O cache" setting.

I mark this bug as critical because people who explicitly disable Host I/O caching to ensure data integrity may now effectively have Host I/O caching enabled and are in danger of losing data in power outage/reset situations.

I use VirtualBox for testing system installations and activate the Host I/O cache setting to increase performance when installing a lot of RPMs in the virtual machine, particularly because the VDI file on the host resides on a mechanical HDD. When switching from VirtualBox 5.1 to 5.2, I noticed a considerable slowdown.

I measured the slowdown with an Ansible playbook that brings a minimal system (installed with a Fedora kickstart) up to a production-level system. The playbook spends most of its time installing RPMs.

Some stats of the RPM installation:
Number of RPM packages before: 324
Number of RPM packages after: 1490
Total file size of all RPMs before: 734 MiB
Total file size of all RPMs after: 3.9 GiB

It is Fedora (guest) on Fedora (host).

This is with the old VirtualBox 5.1, where everything worked as expected.
Installation run on VirtualBox 5.1.32, "Use Host I/O cache" activated:

Copy-pasting my comment from bug #17746 since both bugs seem to reference the same issue:

I also noticed this issue with VirtualBox 5.2.14. I have a couple of CentOS servers running on VirtualBox inside a Windows 10 host for testing purposes. I use a script to automatically configure them all. The configuration can be pretty disk-intensive on the computer. In VirtualBox 5.1.38, the script takes about 4 minutes to complete. With VirtualBox 5.2.14, it takes 12 minutes. This seems like a major performance regression in VirtualBox 5.2. I also have "Use Host I/O Cache" switched on. Even if I turn off this setting, the VMs are still slower than when I was running on 5.1.38 without "Use Host I/O Cache". I am guessing that this regression is caused by the "first milestone of the I/O stack redesign" that was introduced in 5.2.0.

I've been digging through the VirtualBox source code (5.2.18 and 5.1.38), and I believe I've found the root cause: IgnoreFlush (which defaults to on) does not work in 5.2, so any IDE or SATA flush results in an fsync() on the VirtualBox VDI or VMDK file.

In src/VBox/Devices/Storage/DrvVD.cpp, IgnoreFlush is checked in drvvdFlush(): if IgnoreFlush is true, the rest of the drvvdFlush() function is skipped and VINF_SUCCESS is returned. drvvdFlush() is (was!) reached via a call to "pThis->IMedia.pfnFlush".
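
For illustration, that path behaves roughly like the sketch below. This is not the verbatim DrvVD.cpp code; the structure and the names (DrvState, fIgnoreFlush, flushBackingFile) are assumptions based on the description above.

    #include <cstdio>

    // Illustrative stand-ins; these are assumed names for this sketch,
    // not the real DrvVD.cpp identifiers.
    struct DrvState
    {
        bool fIgnoreFlush = true;   // IgnoreFlush defaults to on
    };

    // Stands in for the operation that ends up as fsync() on the VDI/VMDK file.
    static int flushBackingFile(DrvState &)
    {
        std::puts("fsync() on the backing image file");
        return 0;                   // stands in for VINF_SUCCESS
    }

    // Sketch of the 5.1-era synchronous flush (drvvdFlush): when IgnoreFlush is
    // set, the guest's flush request is acknowledged without touching the host.
    static int legacyFlush(DrvState &state)
    {
        if (state.fIgnoreFlush)
            return 0;               // skip the rest, no fsync() reaches the host

        return flushBackingFile(state);
    }

    int main()
    {
        DrvState state;
        legacyFlush(state);         // with the default config, nothing is printed
    }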

However, this path is no longer used! (Or at least not used for the typical async I/O case.)

Instead, pThis->IMediaEx.pfnIoReqFlush is called, which ends up in the drvvdIoReqFlush() function. No IgnoreFlush check here! It calls drvvdMediaExIoReqFlushWrapper(), which uses one of three methods to write any internal buffers out to the OS cache as needed and then generates an fsync().

I think it'd be appropriate to put an IgnoreFlush check in either drvvdIoReqFlush() or drvvdMediaExIoReqFlushWrapper().
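
A minimal sketch of what such a check might look like, again with illustrative names rather than the real DrvVD.cpp signatures (the actual fix would also have to complete the pending I/O request with the proper status rather than simply returning):

    #include <cstdio>

    // Same illustrative stand-ins as in the sketch above.
    struct DrvState
    {
        bool fIgnoreFlush = true;   // IgnoreFlush defaults to on
    };

    static int flushBackingFile(DrvState &)
    {
        std::puts("fsync() on the backing image file");
        return 0;
    }

    // 5.2 async path as described above: the wrapper drains any internal buffers
    // and then issues the fsync() unconditionally.
    static int flushWrapperSketch(DrvState &state)
    {
        // ... write internally buffered data out to the OS cache ...
        return flushBackingFile(state);
    }

    // Sketch of the proposed fix: consult IgnoreFlush before entering the
    // wrapper, mirroring what the synchronous drvvdFlush() path already does.
    static int asyncIoReqFlushSketch(DrvState &state)
    {
        if (state.fIgnoreFlush)
            return 0;               // acknowledge the flush, skip the fsync()

        return flushWrapperSketch(state);
    }

    int main()
    {
        DrvState state;
        asyncIoReqFlushSketch(state);   // default config: no fsync() on the host
    }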

Looks good! After installing it, I booted up a few VMs (Ubuntu 18.04 in this case). With 5.2.18 I could see on the host, via gkrellm, almost continuous disk writes, and in /proc/meminfo the dirty pages would climb to around 4-10 MB and then get flushed. With 5.2.19 r125117 I see no discernible write activity in gkrellm, and the dirty pages keep growing (as they should; I recently put 16 GB of RAM into this system, so the OS disk caches are quite large).
Thanks!