s.Oliver
Report Comments posted by s.Oliver
-
-
13 hours ago, Marshalleq said:
That said, the below actually outlines areas where performance is decreased and specifically mentions RAID. So perhaps Limetech's testing found it works better switched off.
on normal SATA SSDs (at least on the one machine where i've seen one used as a cache drive) it is set to "32". but these are fast enough to handle it, and they are not part of that special "RAID" operation the way the data/parity drives are.
because of the nature of unRAID's "RAID" mode, i guess, the drives are "faster" if they work on small chunks of data in 'sequential' order.
-
10 hours ago, rclifton said:
I think what he was saying is, now that he is back on 6.6.7 he checked and the queue depth is 1 on 6.6.7 as well. Which means that the speculation that NCQ in 6.7 might be part of the problem with that release would be incorrect since for him queue depth was 1 for both 6.6.7 and 6.7 but he has no issues with 6.6.7. Or at least that's how I read what he said anyway.
nearly perfect
i hadn't checked my own QD settings on 6.7.x before i left (no one had brought up QD as a possible reason back then), but i looked at a friend's unRAID system: a fresh setup (just a few weeks old), and there all spinners are also on QD=1.
-
11 hours ago, Marshalleq said:
I'm not sure what you're saying here. Queue Depth is set to 1 on both 6.6.7 and latest stable. So how does 6.6.x have a higher Queue depth?
i was just reacting to @patchrules2000's post; he was setting all drives to QD=32 (even on 6.6.x).
-
my 2 cents here (i've been back on 6.6.7 for 12 days and all is as good as it ever was):
Disk Settings: Tunable (md_write_method): Auto (have never touched it)
cat /sys/block/sdX/device/queue_depth prints "1" for all rotational HDDs
QD for the cache NVMe drive is unknown (it doesn't have the same path to print the value)
wouldn't this contradict the opinion that the 6.6.x series performs better because it has a higher QD value?
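For anyone who wants to repeat the check above on their own box, here is a minimal sketch that loops over all sdX devices instead of running cat drive by drive. It only relies on the standard Linux sysfs attributes queue/rotational and device/queue_depth; the function name list_queue_depths and its optional root argument are made up for this example (the argument exists only so the loop can be tried against a fake directory tree).

```shell
# Print "<device> <queue_depth>" for every rotational sdX device.
# $1 is an optional sysfs root (assumption for testing); default is /sys/block.
list_queue_depths() {
    root="${1:-/sys/block}"
    for dev in "$root"/sd*; do
        # skip anything without a readable queue_depth (also covers an unmatched glob)
        [ -r "$dev/device/queue_depth" ] || continue
        # queue/rotational is 1 for spinning disks and 0 for SSDs
        [ "$(cat "$dev/queue/rotational" 2>/dev/null)" = "1" ] || continue
        echo "${dev##*/} $(cat "$dev/device/queue_depth")"
    done
}
```

Calling list_queue_depths with no argument reads the live /sys/block tree. NVMe devices are not covered, since they do not expose device/queue_depth under that path.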
-
need to correct my last post: the PLEX docker (media scan background task) has now crashed once. so it's possible that this isn't related to the kernel, or whatever.
-
5 hours ago, sirkuz said:
I bit the bullet and have decided to revert one system as well; as reported many times, things seem to be back to "normal". Not sure how long I will be able to hold off on reverting the secondary as well, as it was quite simple.
maybe you don't have to, if limetech can identify the problem and fix it.
-
funny thing, now another problem has disappeared (after going back to 6.6.7), one which had caused some serious headaches:
PLEX (docker) has some background tasks running (usually at night), one of which is the media scanning job. this one regularly crashed; a lot of people had this problem too and tried to find a solution. now, after some days of uptime with 6.6.7, i haven't seen one crash. YEAH!
at night i have some big backup jobs running, which write into the array. so i would guess that PLEX timed out accessing data in the array (albeit it just reads files).
-
well, couldn't stand it anymore, so back to 6.6.7 and all is back to normal, expected behavior.
though i'm missing stuff from 6.7, so i hope they can identify/fix the problem really soon.
-
i can add to this, and it's a major step down for unRAID from 6.7 onward.
before, i was reluctant to post about it, because i had done too few tests to be 100% sure that i hadn't changed some setting somewhere…
but now i'm sure. today i upgraded one more unRAID server from 6.6.x to 6.7.2 and i see the exact same behavior! so i have 2 machines here which haven't had a single change, except that they were upgraded to 6.7.x (meanwhile all on 6.7.2).
in my book, it doesn't matter how you access the data: coming from the network or locally on the server, using different machines to connect to the server… when a write into the array is ongoing, any reads are super slow, even from cache SSDs/NVMe and even from data or cache devices which aren't being written to. also, whenever a rebuild is happening, you'd better not try to read any file...
also the amount of RAM doesn't change anything, nor do the controllers used, nor the cpu (with mitigations enabled or disabled). and while i can't back it up with data, it seems that rebuilds are slower too.
this can have severe consequences in scenarios where some services are continuously writing data into the array (video surveillance, for example).
hopefully we can find a fast fix for this, because going back to 6.6.x isn't a good option anymore.
@limetech what can we do to help debug this?
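One way to put numbers on the "reads are slow while a write is running" observation is to time a read while a background write is in flight. The following is only a rough sketch under assumptions: the helper name read_during_write, its arguments and the scratch file name are invented for this example, and it assumes a Linux dd that accepts bs=1M. It is not an official test procedure.

```shell
# Hypothetical helper: read one file while a background write runs elsewhere.
# $1 = directory to write into (e.g. somewhere on the array)
# $2 = file to read back (ideally on a device that is NOT being written to)
# $3 = size of the background write in MiB (default 1024)
read_during_write() {
    rdw_dir=$1
    rdw_file=$2
    rdw_size=${3:-1024}
    rdw_scratch="$rdw_dir/rdw_scratch.$$"

    # background writer: stream zeros into a scratch file
    dd if=/dev/zero of="$rdw_scratch" bs=1M count="$rdw_size" 2>/dev/null &
    rdw_writer=$!

    # foreground reader: dd reports its own throughput on stderr when done
    rdw_status=0
    dd if="$rdw_file" of=/dev/null bs=1M || rdw_status=$?

    # stop the writer if it is still running, then remove the scratch file
    kill "$rdw_writer" 2>/dev/null || true
    wait "$rdw_writer" 2>/dev/null || true
    rm -f "$rdw_scratch"
    return "$rdw_status"
}
```

A call could look like read_during_write /mnt/disk1 /mnt/cache/somefile 4096, where both paths are placeholders to adjust. Running the same call on 6.6.7 and on 6.7.x and comparing the throughput dd prints would give a comparable figure. Pick a file that has not been read recently, otherwise the read may be served from RAM and say nothing about the disks.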
-
12 minutes ago, SpaceInvaderOne said:
Well, I just changed my motherboard for an ASRock board and swapped it out today. Then I read this post and saw there was a new bios for the Gigabyte boards! Typical.
I am seriously thinking of just using air cooling and going with a Noctua NH U14s TR4 then selling the Enermax when I get the replacement.
hey spaceinvaderone,
would you mind telling us the name/type of the mainboards you use for your threadripper cpus?
well, i can recommend the Noctua CPU air coolers; i've used several of them here, all excellent!
[6.7.x] Very slow array concurrent performance
in Stable Releases
Posted · Edited by s.Oliver
i didn't want to go too deep into the concept of unRAID's parity algorithm. so you're right, unRAID needs to be strict about writing the same sector to the data/parity drive(s) at (more or less) the same time (given how fast the different drives complete the request). so the slowest drive taking part in that write cycle (doesn't matter if parity or data) determines how long the write cycle takes to complete.
but unRAID is not immune to data loss from unfinished write operations (for whatever reason) and has no concept of a journal (to my knowledge). so the file that was being written when the write was abruptly cut off is damaged/incomplete; parity doesn't/can't change anything here and probably isn't in sync anyway. so unRAID usually forces a parity sync on the next start of the array (and it rebuilds the parity information completely, based only on the values of the data drive(s)).
unRAID would need some concept of journaling to replay the writes and find the missing part. it has none (again, to my knowledge). ZFS is one file system which has a mechanism to prevent exactly this.
my observation is that it is pretty much a synchronous write operation (all drives which need to write data write the sectors in the same order at the same time; otherwise, i imagine, i would hear much more 'noise' from my drives, especially during a rebuild).
but i do confess: that is only my understanding of unRAID's way of writing data into the array.
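To make the parity sync point above concrete, here is a toy model, assuming single parity computed as the XOR of the same position across all data drives. Each "drive" is reduced to a single byte, the values are made up, and xor_all is a helper invented for this sketch; it only illustrates the idea and is not how the md driver is implemented.

```shell
# Toy model of single parity: every "drive" holds one byte, and the parity
# byte is the XOR of all data bytes at the same position. Values are made up.

# XOR all arguments together and print the result.
xor_all() (
    p=0
    for v in "$@"; do
        p=$(( $p ^ $v ))
    done
    echo "$p"
)

d1=15; d2=51; d3=85
parity=$(xor_all "$d1" "$d2" "$d3")   # parity in sync with the data

# Interrupted write: d2 reaches the disk, but the matching parity update is lost.
d2=255

# A parity check recomputes the XOR from the data drives and compares.
if [ "$(xor_all "$d1" "$d2" "$d3")" -ne "$parity" ]; then
    echo "parity out of sync"
fi

# A parity sync overwrites parity with the value derived from the data drives.
parity=$(xor_all "$d1" "$d2" "$d3")

# With parity in sync again, a missing drive is the XOR of parity and the others.
rebuilt_d2=$(xor_all "$parity" "$d1" "$d3")
echo "rebuilt d2: $rebuilt_d2"
```

Note that the sync makes parity agree with whatever is on the data drives, including a half-written file. It restores consistency but cannot bring back the missing part of the write, which is where a journal would come in.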