SSD performance/benchmark on VM



Hello guys,

 

I'm about to become a new unRAID-er very soon (just waiting for components to arrive). I have been searching A LOT through the forum regarding SSD performance in virtual machines. The general feeling seems to be that "it's fine", but:

 

- I would love to know the real overhead. Let's say I have an SSD capable of 520 MB/s read and 520 MB/s write on bare metal. What would the speed be in the VM? I tried to benchmark my current OS X VMs on VMware using CrystalDiskMark, but I'm getting unrealistic results. Is there a way to benchmark a VM properly?

 

- Is passthrough the best option to get bare-metal performance? If so, from what I've been reading it seems I cannot pass through just one SATA port, only the entire "set" of ports (the whole controller). Am I right?

 

Thanks a lot!


Not sure how to answer your question more broadly, but happy to provide some data. I ran CrystalDiskMark within a Windows 10 VM running on my E3 Xeon TS140. The hard drive is a Samsung SSD 850 EVO 120GB.

-----------------------------------------------------------------------
CrystalDiskMark 5.1.2 x64 (C) 2007-2016 hiyohiyo
                           Crystal Dew World : http://crystalmark.info/
-----------------------------------------------------------------------
* MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
* KB = 1000 bytes, KiB = 1024 bytes

   Sequential Read (Q= 32,T= 1) :  5653.807 MB/s
  Sequential Write (Q= 32,T= 1) :   144.931 MB/s
  Random Read 4KiB (Q= 32,T= 1) :   599.689 MB/s [146408.4 IOPS]
Random Write 4KiB (Q= 32,T= 1) :   247.703 MB/s [ 60474.4 IOPS]
         Sequential Read (T= 1) :  3098.443 MB/s
        Sequential Write (T= 1) :   149.109 MB/s
   Random Read 4KiB (Q= 1,T= 1) :    70.047 MB/s [ 17101.3 IOPS]
  Random Write 4KiB (Q= 1,T= 1) :    65.852 MB/s [ 16077.1 IOPS]

  Test : 1024 MiB [F: 0.1% (0.1/79.9 GiB)] (x5)  [interval=5 sec]
  Date : 2016/05/07 19:58:31
    OS : Windows 10 Professional [10.0 Build 10586] (x64)



Thanks a lot! So the speed is 144 MB/s for write and 149 MB/s for read? That's actually less than half of its bare-metal performance :(



Welcome to my world... https://lime-technology.com/forum/index.php?topic=43931.0



@bluepr0 I don't think 149 MB/s is the read speed; I see that as the write speed: "Sequential Write (T= 1) :  149.109 MB/s"

@sudonim Are these results from a passed-through SSD or from a vdisk on an SSD?

  • 2 weeks later...

I'm getting this result from the 950 Pros in my cache pool:

 

-----------------------------------------------------------------------
CrystalDiskMark 5.1.2 x64 (C) 2007-2016 hiyohiyo
                           Crystal Dew World : http://crystalmark.info/
-----------------------------------------------------------------------
* MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
* KB = 1000 bytes, KiB = 1024 bytes

   Sequential Read (Q= 32,T= 1) :  5322.083 MB/s
  Sequential Write (Q= 32,T= 1) :  3262.831 MB/s
  Random Read 4KiB (Q= 32,T= 1) :   691.821 MB/s [168901.6 IOPS]
 Random Write 4KiB (Q= 32,T= 1) :   475.712 MB/s [116140.6 IOPS]
         Sequential Read (T= 1) :  2292.775 MB/s
        Sequential Write (T= 1) :  1558.035 MB/s
   Random Read 4KiB (Q= 1,T= 1) :    71.989 MB/s [ 17575.4 IOPS]
  Random Write 4KiB (Q= 1,T= 1) :    60.590 MB/s [ 14792.5 IOPS]

  Test : 1024 MiB [C: 34.1% (10.0/29.5 GiB)] (x5)  [interval=5 sec]
  Date : 2016/05/22 9:44:53
    OS : Windows 10 Professional [10.0 Build 10586] (x64)


Thanks Lebowski! The problem with using the cache pool is that you get these weird, unrealistically cached results :(. I think the best way to test real performance is to pass the disk through completely and see what results it gives there.

 

Thanks a lot anyway!


I'm trying to figure out the results, as my SSDs are capable of 2500 MB/s read and 1500 MB/s write, but I'm not sure whether any of the numbers showed that or whether it's all wrong due to caching issues.

I think they are off because of caching; I have no idea how to avoid this other than passing through the SSD entirely :(


You can change the amount of caching by editing your vm.xml.

In your disk settings, you have an option "cache=writeback".

 

To avoid any caching through qemu, you should go with "directsync"

cache=directsync

This mode causes qemu-kvm to interact with the disk image file or block device with both O_DSYNC and O_DIRECT semantics, where writes are reported as completed only when the data has been committed to the storage device, and when it is also desirable to bypass the host page cache. Like cache=writethrough, it is helpful to guests that do not send flushes when needed. It was the last cache mode added, completing the possible combinations of caching and direct access semantics.
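
For reference, here is a minimal sketch of what the disk block in vm.xml could look like with that option set (the vdisk path and target device here are placeholders, adjust them to your own setup):

<disk type='file' device='disk'>
  <!-- cache='directsync' = O_DIRECT + O_DSYNC: bypass the host page cache and
       report writes as completed only once committed to the device -->
  <driver name='qemu' type='raw' cache='directsync'/>
  <source file='/mnt/user/domains/Windows10/vdisk1.img'/>
  <target dev='hdc' bus='virtio'/>
</disk>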

 

Other options are explained HERE.

 

To completely bypass any cache, you would still need to disable caching in your VM or use benchmark software that does not use the cache. Otherwise even passthrough would not bypass the caching in the VM, just outside of it.

 

A good start to make sure there is not much caching going on is to run very long benchmarks (hours) with very large amounts of random data (multiple times the amount of RAM; 10GB+). That way it's very unlikely to hit any cached data.

To achieve that, I would recommend Iometer; you can test all sorts of possible workloads.
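
If you ever test from a Linux guest, a tool like fio can do the same kind of long, cache-bypassing run via direct I/O. A sketch (fio is my suggestion, not mentioned above; the file path is a placeholder):

# 4 KiB random read/write for one hour on a 10 GiB test file,
# opened with O_DIRECT so the guest page cache is bypassed
fio --name=randrw --filename=/tmp/fio-test.bin --size=10G \
    --ioengine=libaio --direct=1 --rw=randrw --bs=4k \
    --iodepth=32 --runtime=3600 --time_based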

 

But on the other side, 50-70% of bare metal is basically what you can expect from KVM (depending on the workload).

HERE is a comparison between bare metal, VMware and KVM (from 2013).

 

So don't blame unRAID or LT; it's a trade-off for a very cheap, highly compatible and very versatile system.

There is always room for optimization, like caching.

 

I recommend reading THIS page and, more importantly, all the linked sources. It explains very well where the problem is and how it "could" be solved. The main problem is the latency that comes with adding layers of virtualization to support almost any hardware.

They are implementing methods to make better use of CPU power / threads, but it's not there yet.


Thanks daigo and gridrunner! I would love to improve my VMs as much as I can so I can actually work with them.

 

My idea is to have very powerful hardware so that even considering overhead and latency it won't be noticeable (or it will actually still outperform my current iMac)... for work usage (graphic design is what I mostly do).

 

Can't wait to hear more! Also, this week I should be receiving the rest of the components to finally build my own unRAID box!

 

I honestly thought that passing through an entire SSD would give almost bare-metal performance (as it does with GPUs)... So if performance is not that good, it might not be worth having a separate SSD; I could just use the cache pool.


My idea is to have very powerful hardware so that even considering overhead and latency it won't be noticeable (or it will actually still outperform my current iMac)... for work usage (graphic design is what I mostly do).

Depending on your definition of "noticeable", it works for me.

But unless you take advantage of the features that KVM brings (caching & sharing resources), it can't be faster than bare metal, unless (like in your case) you compare against other/older hardware.

 

In addition to the disk-cache options I posted, I would suggest experimenting with CPU pinning like dlandon did HERE.

Keeping CPU cores "clean" should result in fewer context switches and thereby reduce latency. And keeping threads on corresponding cores will speed things up due to the shared CPU cache.
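
A minimal sketch of what that looks like in vm.xml (the cpuset numbers are invented; match them to your own topology, e.g. with lscpu -e):

<vcpu placement='static'>4</vcpu>
<cputune>
  <!-- pin each guest vCPU to a fixed host thread so the scheduler
       does not bounce it between cores -->
  <vcpupin vcpu='0' cpuset='2'/>
  <vcpupin vcpu='1' cpuset='6'/>
  <vcpupin vcpu='2' cpuset='3'/>
  <vcpupin vcpu='3' cpuset='7'/>
</cputune>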

 

I honestly thought that passing through an entire SSD would give almost bare-metal performance (as it does with GPUs)... So if performance is not that good, it might not be worth having a separate SSD; I could just use the cache pool.

I only tested it once, but yes, performance will be "near" bare metal (like GPUs -> 95-99%).

 

But you have to consider that not every operating system supports booting/installing on an NVMe disk.

And you need to be careful with your boot order, because it once happened to me (after a BIOS update) that my system booted from the OS of a VM instead of the USB stick. I had no issues with that VM afterwards, but who knows what another OS might do.


I wonder if Limetech are going to document or implement any of these performance improvements and suggestions centrally somewhere.

 

Thank you everyone for your research here!

I guess they are trying ...

 

But because it's too much for LT alone to manage, I guess they need to take the community approach.

So spread the word and post the stuff you found useful or that helped with an issue HERE.

 

That being said, I already found something interesting that may be a useful tweak in terms of disk performance :)

Here's one to add. It needs lots of RAM and really shines when you don't have a cache drive to speed up writes.

 

Add it to your go file. The number is a percentage of your RAM.

 

# set size of write cache to RAM
sysctl vm.dirty_ratio=75
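
As a note on what that does (my summary, not from the post above): vm.dirty_ratio caps how much of total RAM may hold dirty (not yet flushed) page-cache data before writers are forced to flush to disk, so 75 on a 32GB box lets roughly 24GB of writes land in RAM first. On a stock unRAID install, the go file lives at /boot/config/go.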

