j-kowalski Posted October 30, 2020

I've been evaluating Unraid as a replacement for my custom-built NAS, and so far I'm getting really bad performance and unusually high load. (From what I'm reading, all-SSD arrays aren't really recommended for performance reasons, but I wasn't expecting it to be this bad, so I'm clearly doing something wrong.)

I repurposed my secondary NAS for this test. It has 5x 2 TB SSDs and a 10 GbE NIC, and it previously ran Ubuntu with ZFS and performed really well. I installed the latest Unraid, added all 5 SSDs to the array (1 parity, 4 data drives), re-formatted them as XFS, and did not tweak any defaults.

I started copying data from my primary NAS using rsync; I had about 2 TB of data to copy, mostly large files (Blu-ray rips). While copying, I watched the network traffic on the switch: it stayed around 50 MB/s for the first 5 hours (surprisingly low, but I think the parity was still being built), then went somewhat higher, 100-200 MB/s, but very choppy. When I zoomed in on the graph, I noticed 1-2 minute bursts of reasonably high throughput separated by 1-2 minute periods of zero network activity.

While the copy is running, if I ssh into the server and run `top`, I see almost no CPU usage but a very high load of >100:

top - 09:14:51 up 11:53,  2 users,  load average: 101.27, 70.65, 40.85
Tasks: 266 total,   1 running, 265 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.1 us,  0.2 sy,  0.0 ni, 81.3 id, 18.4 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  31913.8 total,    329.1 free,   7356.1 used,  24228.6 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  23477.5 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  790 root      20   0       0      0      0 S   0.3   0.0   1:38.41 kswapd0
 1300 root      20   0       0      0      0 D   0.3   0.0   0:19.74 kworker/u16:4+flush-9:2
 2104 root      20   0  149724   8380   3736 S   0.3   0.0   0:21.78 nginx
 4084 root      20   0       0      0      0 D   0.3   0.0   2:09.32 unraidd2
 4163 root      20   0  348536   4580    932 S   0.3   0.0  12:30.89 shfs

At this point all filesystem operations are at a standstill.
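The high load with idle CPUs makes some sense once you notice the 18.4 wa (iowait) and that the kworker flush thread and unraidd2 are in D state: Linux counts uninterruptible (D-state) tasks in the load average, so lots of blocked writers show up as load >100 without any CPU usage. My working theory is that writes land in the page cache at line rate and then everything stalls while dirty pages are flushed to the parity-protected array. To check, I've started watching the dirty page counters while a copy runs (this is plain Linux procfs, nothing Unraid-specific):

# Sample the dirty/writeback page cache totals once a second.
# If Dirty grows to several GB and then drains during the
# zero-throughput gaps, the stalls are writeback flushes.
while true; do
    grep -E '^(Dirty|Writeback):' /proc/meminfo
    echo ---
    sleep 1
done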
For example, creating a 100 MB empty file during one of these stalls takes >35 s:

$ dd if=/dev/zero bs=1M count=100 of=/mnt/user/Videos/test.bin
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 35.5957 s, 2.9 MB/s

When I stop copying data to the server, it takes ~3 minutes for the load to drop to zero, and then the filesystem is fast again:

$ dd if=/dev/zero bs=1M count=100 of=/mnt/user/Videos/test.bin
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.0724426 s, 1.4 GB/s

If I wait for the load to drop to zero and then create lots of 100 MB files under /mnt/user in a loop, things very quickly get stuck again and I get completely unpredictable performance:

$ for x in `seq 1 100`; do dd if=/dev/zero bs=1M count=100 of=/mnt/user/Videos/test-3-$x.bin; done
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.0999753 s, 1.0 GB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 6.53282 s, 16.1 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 16.683 s, 6.3 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.0598832 s, 1.8 GB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.0620345 s, 1.7 GB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.067086 s, 1.6 GB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.484227 s, 217 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 1.62638 s, 64.5 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 1.08325 s, 96.8 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 1.23314 s, 85.0 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 1.28973 s, 81.3 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.820569 s, 128 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.593468 s, 177 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.251855 s, 416 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.190008 s, 552 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.0649713 s, 1.6 GB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.0622822 s, 1.7 GB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.124393 s, 843 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.333185 s, 315 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.418967 s, 250 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.427942 s, 245 MB/s

I verified that each individual SSD can achieve about 500 MB/s read and write. There are no obvious errors in dmesg or any other log file I could find. I'll keep digging into this. Any help is greatly appreciated.
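One thing I plan to try is capping the dirty page cache, so the kernel starts flushing earlier and in smaller chunks instead of buffering gigabytes and then stalling every writer while it drains. These are standard Linux vm sysctls, not Unraid settings, and the values below are just a starting guess for a 32 GB box:

# Start background flushing after ~256 MB of dirty pages and block
# writers at ~1 GB, instead of the percentage-based defaults
# (vm.dirty_background_ratio / vm.dirty_ratio, i.e. several GB here).
sysctl -w vm.dirty_background_bytes=268435456
sysctl -w vm.dirty_bytes=1073741824

If that smooths out the bursts, it would confirm the choppy graph is writeback-sized buffering rather than the network or the SSDs.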
JorgeB Posted October 30, 2020

If you haven't yet, enable turbo write; performance should be a little better, but don't expect 500 MB/s even if your SSDs can sustain that, and most SSDs can't, contrary to popular belief. Also make sure you're writing to a single disk at a time, i.e., don't use the "most free" allocation method, unless you have a much faster parity device that can cope with the overlapping writes.
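For context on why this matters: in the default read/modify/write mode, every write first reads the old data block and the old parity block, then writes the new data and the recomputed parity, so each logical write costs two reads plus two writes and sustained throughput roughly halves. Turbo write (reconstruct write) instead reads the corresponding blocks from all the other data disks and writes data and parity in one pass. You can switch it in Settings > Disk Settings (Tunable md_write_method), or from the console; the numeric values below are what the md driver has used in recent releases, so verify on your version:

# Enable turbo (reconstruct) write:
mdcmd set md_write_method 1

# Revert to read/modify/write:
mdcmd set md_write_method 0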
j-kowalski Posted October 30, 2020 (Author)

That did not really change much; I will do more testing...
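Next I want to rule out the user-share layer: /mnt/user is a FUSE filesystem (shfs) stacked on top of the array disks, and shfs was one of the busy processes in top. Writing the same files directly to a single array disk should tell me whether the bottleneck is shfs or the parity writes themselves (paths assume the standard Unraid layout where the first data disk is mounted at /mnt/disk1; the share directory is mine):

# Same test, bypassing the shfs user-share layer.
# If this is fast while /mnt/user is slow, shfs is the bottleneck;
# if both stall, it's parity/writeback.
for x in `seq 1 20`; do
    dd if=/dev/zero bs=1M count=100 of=/mnt/disk1/Videos/test-disk-$x.bin
done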
JorgeB Posted October 30, 2020

I get around 300 MB/s sustained to my small SSD server with turbo write enabled, and about 200 MB/s without it.