crescentwire Posted July 24, 2023 (edited)

(Cross-posted to Reddit.) Hey everyone,

The full story is over on Reddit, but I'll recap here. I have a Dell R520 running unRAID 6.11.5 with:

- 4 x 4 TB SAS array disks
- 2 x 4 TB SAS parity disks
- 2 x 1 TB SATA SSD cache disks
- 2 x Xeon E5-2450 v2 CPUs (16 cores / 32 threads across two physical sockets)
- 160 GB DDR3 memory

In my mind, this setup should be plenty fast for running multiple VMs, Docker containers, and so on. This is all part of a home lab setup, so I have a Linux VM along with a few Windows Server/Windows 10 VMs.

Recently, I upgraded my cache drives from 200 GB Intel SSDs to 1 TB Micron M600 SSDs. I had previously been running several VMs on my array and, while performance wasn't great, it was moderately usable. I was eager to move all my VMs onto the new cache drives, alongside my Docker containers, for increased speed and room to add even more VMs.

Since moving the VMs to the new cache drives, read/write speeds have been unusably slow. I'm talking 2-3 MB/s (megabytes, not megabits). On the Windows VMs, Task Manager reports 100% disk active time almost all the time, with response times often in the hundreds to thousands of milliseconds. That's really, really bad. If I run one VM (Linux) and try booting three other Windows VMs, the Linux VM slows to a crawl and almost completely stops responding.

After watching a Red Hat-based VM take 9 minutes 44 seconds to boot from a powered-off state, I opened the logs window just for the heck of it... and saw this:

I had already run a BTRFS filesystem check with the array started in maintenance mode, but it didn't report any issues. I also don't see any errors listed next to the cache pool drives:

I'm in the process of moving my domains, appdata, and system shares back to the array to see if performance is any better there. If it is, then I suppose I'll be replacing these SSD cache drives.

Would you (like me) suspect a hardware issue (a bad SSD) at this point?
ih-nas01-diagnostics-20230724-1048.zip

Edited July 24, 2023 by crescentwire: added diagnostics file
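For anyone landing here with similar symptoms: btrfs keeps persistent per-device error counters, which is a quick way to tell whether only one member of the cache pool is throwing errors. On a live unRAID box the command is `btrfs device stats /mnt/cache` (where `/mnt/cache` is the usual cache mount point). A minimal sketch of filtering that output for nonzero counters; the sample counter values below are illustrative only, not taken from the diagnostics above:

```shell
# Sample `btrfs device stats` output; these counter values are
# made up for illustration, not from the poster's diagnostics.
stats='[/dev/sdc1].write_io_errs   152
[/dev/sdc1].read_io_errs    98
[/dev/sdc1].flush_io_errs   0
[/dev/sdc1].corruption_errs 7
[/dev/sdh1].write_io_errs   0
[/dev/sdh1].read_io_errs    0
[/dev/sdh1].flush_io_errs   0
[/dev/sdh1].corruption_errs 0'

# Keep only the nonzero counters. On a live system, pipe the real
# command instead:  btrfs device stats /mnt/cache | awk '$2 > 0'
echo "$stats" | awk '$2 > 0 {print $1, $2}'
# -> prints only the three nonzero sdc1 counters
```

If every nonzero counter belongs to one device, that device is the prime suspect; pairing this with `smartctl -a` on the same device usually settles it.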
Vr2Io Posted July 26, 2023 (Solution)

If the errors come only from sdc, and not also from sdh (the other M600 in the cache pool), then sdc is likely the bad drive.
crescentwire Posted July 26, 2023 (Author)

Thank you, that matches what I'm seeing. I stopped the array and unassigned sdc (the cache pool drive showing errors), but kept sdh. After starting the array again, speeds are now in the hundreds of MB/s, which is exactly what I would expect to see. So sdc is definitely a bad drive. Thank you for the help and confirmation!
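For a quick before/after number like the "hundreds of MB/s" above, a plain `dd` sequential write with `conv=fsync` gives a rough throughput figure. A minimal sketch; it defaults to a temp file so it runs anywhere, but you would point `TESTFILE` at a path on the cache pool (e.g. somewhere under `/mnt/cache`, the usual unRAID cache mount) to measure the SSDs themselves:

```shell
# Rough sequential-write throughput check. TESTFILE defaults to a
# temp file so this sketch runs anywhere; set TESTFILE to a path on
# the cache pool to measure the cache SSDs.
TESTFILE="${TESTFILE:-$(mktemp)}"

# Write 64 MiB; conv=fsync forces the data to disk before dd reports
# its speed, so the figure reflects the device, not the page cache.
dd if=/dev/zero of="$TESTFILE" bs=1M count=64 conv=fsync

rm -f "$TESTFILE"
```

dd prints the throughput on stderr when it finishes. A healthy SATA SSD should land in the hundreds of MB/s on a test like this, versus the 2-3 MB/s seen with the failing drive.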