Evaluating Unraid on all-SSD array - very slow write performance and super high load



I've been evaluating Unraid as a replacement for my custom-built NAS, and so far I'm getting really bad performance and unusually high load. (From what I've read, SSDs aren't really supported in the array for performance reasons, but I wasn't expecting it to be this bad, so I'm clearly doing something wrong.)

 

I repurposed my secondary NAS for this test. It has 5x 2 TB SSDs and a 10 GbE NIC, and it previously ran Ubuntu with ZFS and performed really well.

 

I installed the latest Unraid, added all 5 SSDs to the array (1 parity, 4 data drives), formatted them as XFS, and didn't tweak any defaults. Then I started copying data from my primary NAS using rsync - about 2 TB, mostly large files (Blu-ray rips).

 

While copying, I watched the network traffic on the switch - it stayed around 50 MB/s for the first 5 hours (surprisingly low, but I think parity was still being built), then climbed to 100-200 MB/s, but very choppy.

 

When I zoomed in on the graph, I noticed 1-2 minute bursts of reasonably high throughput separated by 1-2 minute periods of zero network activity.

 

While copying data to the NAS, if I SSH into the server and run `top`, I see almost no CPU usage but a very high load average of over 100:

top - 09:14:51 up 11:53,  2 users,  load average: 101.27, 70.65, 40.85
Tasks: 266 total,   1 running, 265 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.1 us,  0.2 sy,  0.0 ni, 81.3 id, 18.4 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  31913.8 total,    329.1 free,   7356.1 used,  24228.6 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  23477.5 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                                                                     
  790 root      20   0       0      0      0 S   0.3   0.0   1:38.41 kswapd0                                                                                                                     
 1300 root      20   0       0      0      0 D   0.3   0.0   0:19.74 kworker/u16:4+flush-9:2                                                                                                     
 2104 root      20   0  149724   8380   3736 S   0.3   0.0   0:21.78 nginx                                                                                                                       
 4084 root      20   0       0      0      0 D   0.3   0.0   2:09.32 unraidd2                                                                                                                    
 4163 root      20   0  348536   4580    932 S   0.3   0.0  12:30.89 shfs                                                                                                                        
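
Nearly everything driving that load seems to be stuck in uninterruptible sleep (state D in the list above - note the flush kworker and unraidd2), and D-state tasks count toward the load average even with an idle CPU. Something like this should confirm it's all I/O wait (standard procps ps, but worth double-checking on Unraid):

# count tasks in uninterruptible sleep; these inflate the load average
ps -eo state,pid,comm | awk '$1 == "D"' | wc -l
# list them - if it's mostly flush/md/shfs threads, it's writeback, not CPU
ps -eo state,pid,comm | awk '$1 == "D"'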

At this point all filesystem operations are at a standstill - writing a 100 MB file of zeros takes over 35 seconds:

$ dd if=/dev/zero bs=1M count=100 of=/mnt/user/Videos/test.bin
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 35.5957 s, 2.9 MB/s
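
The burst/stall pattern makes me suspect dirty page writeback: writes land in RAM at full speed until the kernel's dirty limit is reached, then everything blocks while the array drains. Watching the writeback counters during the copy should show it (the counters are standard /proc/meminfo fields; the mechanism is my guess):

# if Dirty balloons to several GB and then drains during the stalls,
# that would line up with the bursty graph on the switch
watch -n 1 'grep -E "^(Dirty|Writeback):" /proc/meminfo'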

When I stop copying data to the server, it takes ~3 minutes for the load to drop back to zero, and then the filesystem is fast again:

$ dd if=/dev/zero bs=1M count=100 of=/mnt/user/Videos/test.bin
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.0724426 s, 1.4 GB/s
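
That 1.4 GB/s is almost certainly just the page cache absorbing the write (100 MB against ~32 GB of RAM). To measure what actually reaches the array, the same test can be made to flush first, e.g.:

# conv=fdatasync forces dd to flush file data to disk before reporting the rate
dd if=/dev/zero bs=1M count=100 of=/mnt/user/Videos/test.bin conv=fdatasync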

If I wait for the load to drop to zero and then create a bunch of 100 MB files under /mnt/user in a loop, things very quickly get stuck again and performance becomes completely unpredictable:

$ for x in `seq 1 100`; do dd if=/dev/zero bs=1M count=100 of=/mnt/user/Videos/test-3-$x.bin; done
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.0999753 s, 1.0 GB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 6.53282 s, 16.1 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 16.683 s, 6.3 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.0598832 s, 1.8 GB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.0620345 s, 1.7 GB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.067086 s, 1.6 GB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.484227 s, 217 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 1.62638 s, 64.5 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 1.08325 s, 96.8 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 1.23314 s, 85.0 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 1.28973 s, 81.3 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.820569 s, 128 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.593468 s, 177 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.251855 s, 416 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.190008 s, 552 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.0649713 s, 1.6 GB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.0622822 s, 1.7 GB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.124393 s, 843 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.333185 s, 315 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.418967 s, 250 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.427942 s, 245 MB/s
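
To see which disk is actually the bottleneck during the loop, sampling /proc/diskstats before and after a stall works even without sysstat installed (the mdX names are what I'd expect Unraid to use for the array members - adjust the pattern as needed):

# field 3 = device, field 10 = sectors written (512 B each); run twice a few
# seconds apart and diff - the parity disk racing far ahead of the data disks
# would point at read/modify/write amplification
awk '$3 ~ /^(sd[a-z]+|md[0-9]+)$/ {print $3, $10}' /proc/diskstats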

I verified that each individual SSD can do about 500 MB/s for both sequential reads and writes.
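
For anyone wanting to repeat that kind of check, a direct-I/O sequential read from the raw device is one way (sdX is a placeholder; I wouldn't run any write variant against a disk that's already in the array):

# read straight from the device, bypassing the page cache
dd if=/dev/sdX of=/dev/null bs=1M count=2048 iflag=direct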

 

There are no obvious errors in dmesg or any other log file I could find. I'll keep digging.

 

Any help is greatly appreciated.


If you haven't yet, enable turbo write and performance should be a little better, but don't expect 500 MB/s, even if your SSDs can sustain that (and contrary to popular belief, most SSDs can't).

 

Also make sure you're writing to a single disk at a time, i.e., don't use the most-free allocation method unless you have a much faster parity device that can cope with the overlapping writes.
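
Turbo write lives under Settings > Disk Settings (the md_write_method tunable). If memory serves, it can also be flipped from the console with something like this - verify against your release first:

# 0 = read/modify/write (default), 1 = reconstruct write ("turbo write")
mdcmd set md_write_method 1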
