Performance of virtual storage (part 2) : QEMU

Jul 7, 2024 by Thibault Debatty | 511 views

Virtualization Linux Sysadmin

https://cylab.be/blog/351/performance-of-virtual-storage-part-2-qemu

In a previous blog post, I evaluated the performance penalty of virtual storage. I compared different host filesystems and different hypervisors (including QEMU). The conclusion was pretty harsh: in all tested configurations, virtual disks are almost 10 times slower than host drives. In this blog post, I will test additional QEMU configuration options, to see if I can get better results...


bench.sh

To run the different tests in a reproducible way, I created a small wrapper script around sysbench. You can run it yourself to compare your own results:

bash <(curl -Ls https://gist.githubusercontent.com/tdebatty/0358da0a2068eca1bf4583a06aa0acf2/raw/bench.sh)

(Screenshot: scaleway-local.png)
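Under the hood, the script is a thin wrapper around sysbench fileio. The exact parameters live in the gist, but a minimal sketch of the kind of commands it runs looks like this (the file size and duration below are illustrative, not necessarily the values bench.sh uses):

# prepare the test files
sysbench fileio --file-total-size=4G prepare

# random read, random write, sequential read, sequential write
sysbench fileio --file-total-size=4G --file-test-mode=rndrd --time=30 run
sysbench fileio --file-total-size=4G --file-test-mode=rndwr --time=30 run
sysbench fileio --file-total-size=4G --file-test-mode=seqrd --time=30 run
sysbench fileio --file-total-size=4G --file-test-mode=seqwr --time=30 run

# remove the test files
sysbench fileio --file-total-size=4G cleanup

The throughput figures in the tables below correspond to the MiB/s values printed by sysbench.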

QEMU RAW disk image

By default, QEMU virtual machines use disk images in the QCOW2 format, which is what I used in the previous tests. However, the documentation states that raw images provide better performance.
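For reference, creating a raw image, or converting an existing QCOW2 image to raw, can be done with qemu-img (the file names and size here are just examples):

# create an empty 20 GiB raw image
qemu-img create -f raw vm-disk.raw 20G

# or convert an existing QCOW2 image to raw (-p shows progress)
qemu-img convert -p -f qcow2 -O raw vm-disk.qcow2 vm-disk.raw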

Moreover, QEMU offers several cache modes that affect how data is handled by the guest storage controller, by the host page cache, and by the host storage controller (a RAID controller, or the built-in controller of an SSD). An example of how to select a mode is shown after the list below.

According to the documentation:

  • writeback : Data is cached in the host page cache, and written to the host storage controller when it is evicted from the cache or when the guest's virtual storage adapter sends a flush command. The guest's virtual storage adapter is informed of the writeback cache and is therefore expected to send flush commands when data must be committed to storage, which may not be supported by all guest operating systems. So writeback should offer good performance, but relies on the guest sending flush commands correctly.
  • writethrough : Data is written to the host storage controller and the host page cache simultaneously. Writes are reported as completed only when the data has been committed to the storage device. So it should provide good read performance.
  • none : The host page cache is bypassed, and reads and writes happen directly between the hypervisor and the host storage controller, so the guest still benefits from the caching mechanism of the storage controller itself.
  • directsync : Similar to none, but writes are reported as completed only when the data has been committed to the storage device.
  • unsafe : Similar to writeback, except that all flush commands from the guest are ignored. This should offer the best performance, at the price of catastrophic data loss in case of power failure.
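The cache mode is selected per disk. With plain QEMU it is the cache property of the -drive option; with libvirt it is the cache attribute of the disk's driver element. A minimal sketch (disk file name and memory size are arbitrary):

qemu-system-x86_64 -enable-kvm -m 2G \
    -drive file=vm-disk.raw,format=raw,cache=writeback

or, in the libvirt domain XML:

<driver name='qemu' type='raw' cache='writeback'/>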

So this time I tested the raw storage format, with each of the cache modes. The results of the different modes are listed below, together with the results of running the script directly on the host (on the same drive).

MiB/s          random read   random write   sequential read   sequential write
host                  4644           3722             14861               6422
writeback              489           1665              4272               2652
default                488           1673              4317               2681
writethrough           464            195              4283               2405
none                   154           1896              2083               4253
directsync             151            176              3086               3786
unsafe                 502           1805              4227               2949

As we can see:

  • writeback mode provides better performance than other modes (except for unsafe);
  • default provides the same performance as writeback, which suggests that the default mode is actually writeback;
  • unsafe provides slightly better performance than writeback, but at the expense of catastrophic data loss in case of power failure;
  • whatever the mode, the virtual disk is always slower than the host disk, especially for random read operations.

Cloud provider : Scaleway

I wanted to compare my results with results from a real cloud provider. As I already had an account at Scaleway, I used that one to create my virtual machines.

I chose DEV1-L instances hosted at Amsterdam1. These instances have 4 cores and 8GB of RAM. According to /proc/cpuinfo the CPU is an AMD EPYC 7281 16-Core. This CPU is already quite old (Q4 2017), but this should not be a problem for my storage tests.

These instances can be configured with local or block storage.

According to Scaleway's documentation, block storage is replicated three times on multiple nodes. It also supports snapshots. So I guess it's actually a GlusterFS cluster.

Local storage is not described in the documentation, but I assume data is simply stored on the host.

You can find the results of the tests below.

MiB/s          random read   random write   sequential read   sequential write
local                 1126            192              6987                222
block                 1000             33              6719                181

These results are very interesting! The write performance is really disappointing, especially when using block storage. For block storage this was predictable, since every write operation has to travel over the network.

The read performance, however, is impressive! It is still half the speed of my Samsung 990 PRO SSD on the host, but much faster than what I could achieve in my own guest machines!


This blog post is licensed under CC BY-SA 4.0
