Jul 7, 2024 by Thibault Debatty
https://cylab.be/blog/351/performance-of-virtual-storage-part-2-qemu
In a previous blog post, I evaluated the performance penalty of virtual storage. I compared different host filesystems and different hypervisors (including QEMU). The conclusion was pretty harsh: in all tested configurations, virtual disks are almost 10 times slower than host drives. In this blog post, I will test additional QEMU configuration options, to see if I can get better results…
To run the different tests in a reproducible way, I created a small wrapper script for sysbench. You can run it to compare your results with mine:
bash <(curl -Ls https://gist.githubusercontent.com/tdebatty/0358da0a2068eca1bf4583a06aa0acf2/raw/bench.sh)
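For readers who prefer not to pipe a remote script into bash, here is a minimal sketch of what a sysbench wrapper of this kind might do. It is not the content of bench.sh: the function names `sysbench_mode` and `bench_cmds` and the 1G default file size are assumptions made for illustration. Only the `sysbench fileio` test modes (`rndrd`, `rndwr`, `seqrd`, `seqwr`) and the `prepare`, `run` and `cleanup` commands come from sysbench itself.

```shell
# Hypothetical sketch, NOT the contents of bench.sh.
# Map a column name of the result tables to a sysbench fileio test mode.
sysbench_mode() {
    case "$1" in
        "random read")      echo rndrd ;;
        "random write")     echo rndwr ;;
        "sequential read")  echo seqrd ;;
        "sequential write") echo seqwr ;;
        *) echo "unknown test: $1" >&2; return 1 ;;
    esac
}

# Print (dry run) the three sysbench commands needed for one test.
# The second argument is an assumed total file size (default 1G).
bench_cmds() {
    mode="$(sysbench_mode "$1")" || return 1
    size="${2:-1G}"
    for step in prepare run cleanup; do
        echo "sysbench fileio --file-total-size=$size --file-test-mode=$mode $step"
    done
}
```

Running `bench_cmds "random read" | sh` would execute the printed commands in the current directory, assuming sysbench is installed.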
By default, the disk image for QEMU virtual machines uses QCOW2 format, which I used in the previous tests. However, the documentation states that raw images provide better performance.
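Switching an existing image from QCOW2 to raw can be done with `qemu-img convert`. The helper below is a small sketch that only prints the command, so it can be reviewed before running. The function name `qcow2_to_raw_cmd` and the file names are hypothetical.

```shell
# Print the qemu-img command that converts a QCOW2 image to raw.
# If no destination is given, replace the .qcow2 suffix with .raw.
qcow2_to_raw_cmd() {
    src="$1"
    dst="${2:-${src%.qcow2}.raw}"
    printf 'qemu-img convert -f qcow2 -O raw %s %s\n' "$src" "$dst"
}
```

For example, `qcow2_to_raw_cmd disk.qcow2 | sh` would write `disk.raw` next to the original image. Keep in mind that a raw image may take up to the full virtual disk size on the host.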
Moreover, QEMU offers different cache modes that affect how data is handled by the guest storage controller, by the host page cache, and by the host storage controller (a RAID controller, or the built-in controller of an SSD).
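Each cache mode is a shortcut for three lower-level settings listed in the QEMU manual: `cache.writeback`, `cache.direct` (bypass the host page cache) and `cache.no-flush` (ignore flush requests from the guest). The sketch below encodes that mapping and builds a `-drive` argument for a raw image. The function names and the `if=virtio` interface are assumptions for illustration, not the configuration used for the tests.

```shell
# Print the three low-level settings behind a QEMU cache mode,
# following the cache mode table of the QEMU manual.
cache_flags() {
    case "$1" in
        writeback)    echo "cache.writeback=on,cache.direct=off,cache.no-flush=off" ;;
        none)         echo "cache.writeback=on,cache.direct=on,cache.no-flush=off" ;;
        writethrough) echo "cache.writeback=off,cache.direct=off,cache.no-flush=off" ;;
        directsync)   echo "cache.writeback=off,cache.direct=on,cache.no-flush=off" ;;
        unsafe)       echo "cache.writeback=on,cache.direct=off,cache.no-flush=on" ;;
        *) echo "unknown cache mode: $1" >&2; return 1 ;;
    esac
}

# Build the value of a -drive option for a raw image and a cache mode.
# if=virtio is an assumption; adapt it to the guest storage controller.
drive_arg() {
    cache_flags "$2" >/dev/null || return 1
    echo "file=$1,format=raw,if=virtio,cache=$2"
}
```

A guest could then be started with something like `qemu-system-x86_64 -drive "$(drive_arg disk.raw none)"` followed by the other options of the machine.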
According to the documentation, the available modes are writeback (the default), writethrough, none, directsync and unsafe.
So this time I wanted to test the raw storage format, and test the different cache modes. The results of the different modes are listed below, together with the results of running the script directly on the host (and on the same drive).
cache mode (MiB/s) | random read | random write | sequential read | sequential write |
---|---|---|---|---|
host | 4644 | 3722 | 14861 | 6422 |
writeback | 489 | 1665 | 4272 | 2652 |
default | 488 | 1673 | 4317 | 2681 |
writethrough | 464 | 195 | 4283 | 2405 |
none | 154 | 1896 | 2083 | 4253 |
directsync | 151 | 176 | 3086 | 3786 |
unsafe | 502 | 1805 | 4227 | 2949 |
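To put the numbers of the table above in perspective, the following sketch computes how many times slower each cache mode is than the host. The function name `slowdown_table` is hypothetical. It expects the table rows on standard input, with the `host` row before the other rows.

```shell
# Read the result table on stdin and print, for each cache mode,
# the slowdown factor (host throughput / guest throughput) per test.
slowdown_table() {
    awk -F'|' '
        /^---/ || /MiB/ { next }                 # skip header and separator
        { gsub(/ /, "", $1) }                    # trim the row label
        $1 == "host" { for (i = 2; i <= 5; i++) h[i] = $i; next }
        {
            printf "%s", $1
            for (i = 2; i <= 5; i++) printf " %.1f", h[i] / $i
            printf "\n"
        }
    '
}
```

Saving the table to a file and running `slowdown_table < results.md` prints one line per cache mode, with the four factors in the same column order as the table.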
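As we can see, none of the cache modes brings the guest close to the host: random reads remain roughly 9 times slower at best (502 vs 4644 MiB/s). The writeback and default modes give the same results. The writethrough and directsync modes make random writes collapse below 200 MiB/s. The none and directsync modes, which bypass the host page cache, deliver the best sequential writes (4253 and 3786 MiB/s) but the worst random reads (about 150 MiB/s). The unsafe mode is only marginally faster than writeback in most tests.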
I wanted to compare my results with results from a real cloud provider. As I already had an account at Scaleway, I used that one to create my virtual machines.
I chose DEV1-L instances hosted at Amsterdam1. These instances have 4 cores and 8 GB of RAM. According to /proc/cpuinfo, the CPU is an AMD EPYC 7281 16-Core. This CPU is already quite old (Q4 2017), but this should not be a problem for my storage tests.
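The CPU model can be read from /proc/cpuinfo with a one-line awk filter. The sketch below wraps it in a hypothetical `cpu_model` function that accepts an optional file path, which makes it easy to try on a saved copy of the file.

```shell
# Print the CPU model name found in /proc/cpuinfo (or in the given file).
# Only the first "model name" entry is printed, as all cores report the same.
cpu_model() {
    file="${1:-/proc/cpuinfo}"
    awk -F': *' '/^model name/ { print $2; exit }' "$file"
}
```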
These instances can be configured with local or block storage.
According to Scaleway’s documentation, block storage is replicated three times on multiple nodes. It also supports snapshots. So I guess it’s actually a GlusterFS cluster.
Local storage is not described in the documentation, but I assume data is simply stored on the host.
You can find below the result of the tests.
storage (MiB/s) | random read | random write | sequential read | sequential write |
---|---|---|---|---|
local | 1126 | 192 | 6987 | 222 |
block | 1000 | 33 | 6719 | 181 |
These results are very interesting! The write performance is really disappointing, especially with block storage. That part was predictable: every write to block storage travels over the network and is replicated three times.
The read performance, however, is impressive! Sequential reads reach about half the speed of my Samsung 990 PRO SSD on the host (random reads about a quarter), and both are much faster than what I could achieve in my guest machines!
This blog post is licensed under CC BY-SA 4.0