[6.7.x] Very slow array concurrent performance

rclifton · August 23, 2019

51 minutes ago, Marshalleq said:

I'm not sure what you're saying here. Queue Depth is set to 1 on both 6.6.7 and latest stable. So how does 6.6.x have a higher Queue depth?

I think what he was saying is, now that he is back on 6.6.7 he checked and the queue depth is 1 on 6.6.7 as well. Which means that the speculation that NCQ in 6.7 might be part of the problem with that release would be incorrect since for him queue depth was 1 for both 6.6.7 and 6.7 but he has no issues with 6.6.7. Or at least that's how I read what he said anyway.

Marshalleq · August 23, 2019

OK, yes we had sort of established that, but this is confirmation then. Thanks.

s.Oliver · August 24, 2019

11 hours ago, Marshalleq said:

I'm not sure what you're saying here. Queue Depth is set to 1 on both 6.6.7 and latest stable. So how does 6.6.x have a higher Queue depth?

i was just reacting to @patchrules2000's post; he was setting all drives to QD=32 (even on 6.6.x).

s.Oliver · August 24, 2019

10 hours ago, rclifton said:

I think what he was saying is, now that he is back on 6.6.7 he checked and the queue depth is 1 on 6.6.7 as well. Which means that the speculation that NCQ in 6.7 might be part of the problem with that release would be incorrect since for him queue depth was 1 for both 6.6.7 and 6.7 but he has no issues with 6.6.7. Or at least that's how I read what he said anyway.

nearly perfect

i hadn't checked my own QD settings on 6.7.x before i left (no one had brought up the QD as a possible reason), but i looked at a friend's unRAID system. it is a fresh setup (just a few weeks old) and there all spinners are also on QD=1.

Marshalleq · August 24, 2019

1 hour ago, s.Oliver said:

i was just reacting to @patchrules2000's post; he was setting all drives to QD=32 (even on 6.6.x).

OK - regardless of it not being 'the' issue - yes I'd agree that these should be set to 32 on all drives - unless Unraid has also invented a way of replacing NCQ - which I highly doubt, or there is some reason to do with their Unraid parity logic or something. So this is an unexpected and hopefully additional performance benefit. It IS enabled on my SSDs if I recall correctly.

That said, the article below actually outlines areas where performance is decreased and specifically mentions RAID. So perhaps Limetech's testing found it works better switched off.

https://en.wikipedia.org/wiki/Native_Command_Queuing

Edited August 24, 2019 by Marshalleq

s.Oliver · August 24, 2019

13 hours ago, Marshalleq said:

That said, the article below actually outlines areas where performance is decreased and specifically mentions RAID. So perhaps Limetech's testing found it works better switched off.

https://en.wikipedia.org/wiki/Native_Command_Queuing

on normal SSDs (SATA) (at least as seen on one machine's cache drive) it is set to "32". but these are fast enough to handle it and they are not embedded in that special "RAID" operation like the data/parity drives.

because of the nature of unRAID's "RAID" mode, i guess, the drives are "faster" if they work on small chunks of data in 'sequential' order.

Edited August 24, 2019 by s.Oliver

simalex · August 24, 2019

1 hour ago, s.Oliver said:

because of the nature of unRAID's "RAID" mode, i guess, the drives are "faster" if they work on small chunks of data in sequential order.

I think it's more that when a sector is written on a data drive, then for parity to be consistent the same sector needs to be updated in near real time on the parity drive as well. The individual drives don't understand this concept, and allowing a drive to update in whichever order it chooses would increase the chance of the parity drive being out of sync with the actual data drives, especially when you have updates on multiple data drives simultaneously.

Imagine having to update sector 13456 on drive 3 and sector 25789 on drive 4 in that order, and then the parity drive deciding that it should update sector 25789 first and then 13456, with a power failure in between those writes. Then you would end up with 2 sectors with invalid parity data, even though your data drives both have the correct information.
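To make the "invalid parity" case concrete, here is a minimal sketch in shell. It assumes plain XOR single parity over one sector position and uses made-up byte values; the helper names xor_all and parity_ok are hypothetical, and this is not Unraid's actual driver logic.

```shell
# XOR all arguments together: single parity over one sector position.
xor_all() {
    local acc=0 v
    for v in "$@"; do
        acc=$(( acc ^ v ))
    done
    echo "$acc"
}

# Succeeds only if the stored parity matches the given data values.
# usage: parity_ok <stored_parity> <data values...>
parity_ok() {
    local stored="$1"
    shift
    [ "$(xor_all "$@")" -eq "$stored" ]
}

# Made-up byte values standing in for one sector on drive 3 and drive 4.
d3=90
d4=60
parity=$(xor_all "$d3" "$d4")

# The data write lands on drive 3 ...
d3=119
# ... and the power fails here, before the parity drive is updated.

if parity_ok "$parity" "$d3" "$d4"; then
    echo "parity in sync"
else
    echo "parity stale"
fi
```

The data on both drives is fine, yet the parity sector no longer matches it, which is the state a later parity check would have to correct.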

s.Oliver · August 24, 2019

8 hours ago, simalex said:

I think it's more that when a sector is written on a data drive, then for parity to be consistent the same sector needs to be updated in near real time on the parity drive as well. The individual drives don't understand this concept, and allowing a drive to update in whichever order it chooses would increase the chance of the parity drive being out of sync with the actual data drives, especially when you have updates on multiple data drives simultaneously.

Imagine having to update sector 13456 on drive 3 and sector 25789 on drive 4 in that order, and then the parity drive deciding that it should update sector 25789 first and then 13456, with a power failure in between those writes. Then you would end up with 2 sectors with invalid parity data, even though your data drives both have the correct information.

i didn't want to go too deep into the concept of unRAID's parity algorithm. so you're right, unRAID needs to be strict in writing the same sector to the data/parity drive(s) at (more or less) the same time (given how fast different drives are completing the request). so the slowest drive in the mix (which is in the data writing cycle – doesn't matter if parity or data) is responsible for the time needed (or how fast that write cycle will be completed).

but unRAID is not immune to data loss caused by unfinished write operations (whatever the reason) and has no concept of a journal (to my knowledge). so the file being written when the write was abruptly ended is damaged/incomplete; parity doesn't/can't change anything here and probably isn't in sync anyway. so unRAID usually forces a parity sync on the next start of the array (and it will rebuild the parity information completely and only based on the values of the data drive(s)).

unRAID would need some concept of journaling to replay the writes and find the missing part. it has none (again, to my knowledge). ZFS is one file system which has an algorithm to prevent exactly this.

my observation is, that it is a pretty much synchronous write operation (all drives which need to write data, do write the sectors in the same order/same time – else i imagine, i could hear much more 'noise' from my drives, especially if you do a rebuild).

but i do confess – that is only my understanding of unRAIDs way of writing data into the array.

Edited August 25, 2019 by s.Oliver

DaMAN · August 25, 2019

FWIW, new first-time Unraid build. I have an array speed issue (posted) that may be related: drives within the array are maxing out at 40 MB/s, including the clearing of 2 new drives. The same drives (unassigned) outside the array reach approx 170 MB/s on disk-to-disk transfers via Krusader. The drives within the array have an NCQ queue depth of 1 (all drives including parity), while identical drives outside the array have a value of 32 and are blazing quick. How can I set the NCQ value to 32 for the array drives? Or should I wait for a patch?

Marshalleq · August 25, 2019

Sadly, this kind of speed is normal for Unraid when writing. It's due to parity calculations. Like you I do struggle to see how it can be 'that' much slower, but it really is. Unraid trades this for allowing differing sized drives and being able to power down unused drives. Mostly this speed is enough, but when doing large copies it does become rather noticeable. That's why there is a cache option to offset the worst part of Unraid - writing. Some day, when SSDs are affordable for the array, we'll get a speed increase. Until then the choice is to put up with it, or to move to something like Proxmox, FreeNAS etc. I get about the same speed BTW and my disks perform at around 230 MB/s individually. Quite a decrease!

DaMAN · August 25, 2019

OK, google-fu... I changed the NCQ queue depth of the array drives from 1 to 31 and speed has increased dramatically: while clearing 2 drives it went from 38 MB/s to 130 MB/s as soon as I made the change. I am a happy camper! I was about to hit the go button on another HBA controller!

Command to change your NCQ level is:

echo 31 > /sys/block/sdX/device/queue_depth

sd'X' is your device and the number after echo is the level you want to change to; max is 32

Command to check what your NCQ is set to:

cat /sys/block/sdX/device/queue_depth

sd'X' is your device
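If you have many drives, the check can be wrapped in a small loop. This is only a sketch: the function name show_queue_depths and its optional directory argument are my own additions for illustration (the argument lets you point it at a test directory instead of /sys/block), and it only reads values, it changes nothing.

```shell
# Print "<device> <queue_depth>" for every sdX device found.
# The optional first argument is an assumed sysfs root, defaulting
# to /sys/block as used in the commands above.
show_queue_depths() {
    local root="${1:-/sys/block}"
    local f dev
    for f in "$root"/sd*/device/queue_depth; do
        [ -r "$f" ] || continue      # skip if the glob matched nothing
        dev="${f#"$root"/}"          # strip the leading root path
        dev="${dev%%/*}"             # keep only the device name, e.g. sdb
        printf '%s %s\n' "$dev" "$(cat "$f")"
    done
}

show_queue_depths
```

A value set with echo this way is not stored anywhere, so expect to re-apply it after a reboot.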

Edited August 25, 2019 by DaMAN

John_M · August 25, 2019

On 8/24/2019 at 11:05 AM, Marshalleq said:

OK - regardless of it not being 'the' issue - yes I'd agree that these should be set to 32 on all drives - unless Unraid has also invented a way of replacing NCQ - which I highly doubt,

It was determined a long time ago that the Linux kernel is better at queueing up disk I/O requests than the firmware on the drives themselves (the reverse of the situation with Windows), so in Unraid NCQ is disabled by default. See here.

It looks like something's broken in the Unraid 6.7.x kernel at the moment and enabling NCQ with a value of 32 helps with mitigation, but users reverting to Unraid 6.6.7 should set it to the default "Auto", which in this case means "Off".

John_M · August 25, 2019

6 hours ago, Marshalleq said:

Sadly, this kind of speed is normal for Unraid when writing. It's due to parity calculations. Like you I do struggle to see how it can be 'that' much slower, but it really is.

It is not to do with parity calculations, per se, because they are quick. It is to do with the way the parity drive is updated. There are two modes of operation.

The default is read-modify-write, which requires a block to be read from the parity disk, modified and rewritten. This depends on the rotational latency of the mechanical hard drive since, after reading, it has to wait for a whole rotation before it can write the data back in the same place on the platter. The advantage of this method is that only the data disk being written to and the parity disk need to be spun up - the rest of the disks can be spun down.

The alternative mode is reconstruct write, sometimes called "turbo write", which reads all the data disks simultaneously and calculates then writes parity without having to read it first. It's faster because it doesn't depend on the rotational latency of the parity drive but it requires that all disks be spun up.
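As a rough illustration of why both modes end up with the same parity, here is a sketch in shell. It assumes plain XOR single parity and made-up byte values for one sector position on three data disks; the helper names xor_all and rmw_parity are hypothetical and this is not the Unraid driver code.

```shell
# XOR all arguments together: single parity over one sector position.
xor_all() {
    local acc=0 v
    for v in "$@"; do
        acc=$(( acc ^ v ))
    done
    echo "$acc"
}

# Read-modify-write: needs only the old parity plus the old and new
# contents of the one data disk being written.
# usage: rmw_parity <old_parity> <old_data> <new_data>
rmw_parity() {
    echo $(( $1 ^ $2 ^ $3 ))
}

# Made-up values for the same sector on three data disks.
disk1=90
disk2=60
disk3=15
old_parity=$(xor_all "$disk1" "$disk2" "$disk3")

new_disk2=200   # new contents to be written to disk2

# Mode 1: read old parity and old data, then modify and write back.
p_rmw=$(rmw_parity "$old_parity" "$disk2" "$new_disk2")

# Mode 2: read all the other data disks and recompute parity from scratch.
p_reconstruct=$(xor_all "$disk1" "$new_disk2" "$disk3")

echo "read-modify-write: $p_rmw"
echo "reconstruct write: $p_reconstruct"
```

Both lines print the same parity value. The difference is in which disks have to be touched: mode 1 reads and rewrites the same spot on two disks, mode 2 reads every other data disk but never has to read parity first.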

J.Nerdy · August 25, 2019

Do we think that intermittent latency problems with a VM could be traced back to this as well? The VM is not on the array (nor cache) but booted from a dedicated nvme drive passed through.

It is incredibly frustrating to try and chase down intermittent problems...

Vr2Io · August 25, 2019

40 minutes ago, J.Nerdy said:

Do we think that intermittent latency problems with a VM could be traced back to this as well? The VM is not on the array (nor cache) but booted from a dedicated nvme drive passed through.

It is incredibly frustrating to try and chase down intermittent problems...

If the NVMe is passed through (PT), setting any command for it won't help.

Marshalleq · August 25, 2019

6 hours ago, John_M said:

It is not to do with parity calculations, per se, because they are quick. It is to do with the way the parity drive is updated. There are two modes of operation.

The default is read-modify-write, which requires a block to be read from the parity disk, modified and rewritten. This depends on the rotational latency of the mechanical hard drive since, after reading, it has to wait for a whole rotation before it can write the data back in the same place on the platter. The advantage of this method is that only the data disk being written to and the parity disk need to be spun up - the rest of the disks can be spun down.

The alternative mode is reconstruct write, sometimes called "turbo write", which reads all the data disks simultaneously and calculates then writes parity without having to read it first. It's faster because it doesn't depend on the rotational latency of the parity drive but it requires that all disks be spun up.

Even so, in either read-modify-write or reconstruct-write (I've done both) it is still an awful lot slower than RAID 5 done with striping. At least that's what came out with my testing. Perhaps I'm wrong.

Marshalleq · August 25, 2019

1 hour ago, Benson said:

If the NVMe is passed through (PT), setting any command for it won't help.

I assume you're taking into account the normal performance issues on NVMe caused by heat.

bonienl · August 25, 2019

20 minutes ago, Marshalleq said:

Even so, in either read-modify-write or reconstruct-write (I've done both) it is still an awful lot slower than RAID 5 done with striping. At least that's what came out with my testing. Perhaps I'm wrong.

I guess hardware plays a role too.

I have pretty fast drives and with reconstruct write on, I do get satisfying results.

Below a 12GB file copied from my PC (nvme disk) directly to the array over a 10G connection.

[Image: screenshot of the file copy]

It starts off at 1 GBps (10 Gbps) until the RAM cache is full and then continues writing at around 140 MBps (1.12 Gbps).

Marshalleq · August 25, 2019

Very nice - I'm yet to buy one end of my 10G connection. Anyway, as I said above it is a tradeoff and I've chosen to have the drives spin down rather than have the speed. I guess you can always turn reconstruct write on if you know you're going to do a large copy. But now we're starting to move away from the topic of this thread.

bonienl · August 25, 2019

Of course I wasn't talking about concurrent drives access...

(disclosure: I am testing on a newer kernel)

Edited August 25, 2019 by bonienl

StevenD · August 25, 2019

8 minutes ago, bonienl said:

Of course I wasn't talking about concurrent drives access...

(disclosure: I am testing on a newer kernel)

So, I'm guessing a new kernel fixes this problem?

I see similar speeds on 6.6.x, but not on any version of 6.7.x.

Marshalleq · August 25, 2019

1 minute ago, StevenD said:

So, I'm guessing a new kernel fixes this problem?

I see similar speeds on 6.6.x, but not on any version of 6.7.x.

So far nothing has been found that fixes this and @limetech continues to be silent.

Vr2Io · August 25, 2019

4 minutes ago, Marshalleq said:

Very nice - I'm yet to buy one end of my 10G connection. Anyway, as I said above it is a tradeoff and I've chosen to trade having the drives spin down rather than have the speed. I guess you can always turn reconstruct write on if you know you're going to do a large copy. But now we're starting to move away from the topic of this thread.

You are so interesting.

bonienl · August 25, 2019

17 minutes ago, StevenD said:

So, I'm guessing a new kernel fixes this problem?

I am afraid that simultaneous transfers to the array (different disks) are hampered.

Here are the results of the same 12GB file copied to two different disks simultaneously.

[Image: screenshot of the simultaneous copy]

This is not extremely low, but pales in comparison to when I read this 12GB file from the array 😐

[Image: screenshot of the read from the array]

Okay, I was cheating a little when reading from the array (things were cached in RAM).

I do get around 250 MBps in real world circumstances.

[Image: screenshot of the real-world read speed]

Edited August 25, 2019 by bonienl

Vr2Io · August 25, 2019

40 minutes ago, bonienl said:

Of course I wasn't talking about concurrent drives access...

(disclosure: I am testing on a newer kernel)

Unraid 6.8?
