January 20, 2013
Is it normal behaviour that a corrective parity check is started after an unclean shutdown? Both the web GUI and the syslog indicate so.
January 20, 2013
I'm still fairly new to unRAID, but it's been the case for me that every time after an unclean shutdown it has started a parity check.
January 20, 2013 (Author)
Quote: I'm still fairly new to unRAID, but it's been the case for me that every time after an unclean shutdown it has started a parity check.

A corrective or a non-corrective one?
January 20, 2013
It's normal. If there were outstanding writes when the system went down, in most cases the file system will correct itself when restarted. However, the file system won't update the parity drive, so if a drive should fail you'd lose data. That's why a corrective check is kicked off immediately: to bring parity back in line with any data corrected by the file system.
January 20, 2013
Quote: Is it normal behaviour that after an unclean shutdown a corrective parity check is started? Both the WEBGUI and syslog indicate so.

"Corrective" means that the parity information on the parity drive is changed to match the other drives. So the correction is made to the parity, not to the data drives.
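To make the distinction concrete, here is a minimal sketch of single-parity checking, assuming plain byte-wise XOR parity across the data drives. The function names and the sample blocks are made up for illustration and are not unRAID's actual code. Note that the corrective path only ever rewrites the parity value and never touches the data blocks.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized data blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_check(data_blocks, parity, correct=False):
    """Compare stored parity with parity recomputed from the data blocks.

    Returns (mismatch_found, parity_to_keep). With correct=True the
    recomputed parity replaces the stored one. The data blocks are
    never modified in either mode.
    """
    expected = xor_parity(data_blocks)
    mismatch = expected != parity
    return mismatch, (expected if (mismatch and correct) else parity)


# Three hypothetical data blocks and a stale parity block.
data = [b"\x0f\x01", b"\xf0\x02", b"\x00\x04"]
stale = b"\x00\x00"

print(parity_check(data, stale, correct=False))  # reports the mismatch only
print(parity_check(data, stale, correct=True))   # reports it and returns new parity
```

A non-correcting check is the same comparison with the write-back step skipped, which is why it can report an error but cannot say which drive holds the bad block.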
January 20, 2013
Quote: I'm still fairly new to unRAID, but it's been the case for me that every time after an unclean shutdown it has started a parity check.

Quote: A corrective or non corrective one?

Sorry, I should have put it in my post: a corrective check.
January 20, 2013
Quote: It's normal. If there were outstanding writes when the system went down, in most cases, the file system will correct itself when restarted. However, the file system won't update the parity drive so if a drive should fail you'd lose data. That's why a corrective check is kicked off immediately - to maintain parity with any data corrected by the file system.

That makes sense--superficially. But it may prove unwise.

The above assumes that the array was in perfect shape prior to the "crash" (i.e., if a non-correcting parity check had been performed then, it would have passed with 0 errors). But, as many of you know from running your periodic parity check and finding an error or two, that can't be assumed.

I don't use unRAID (and, if PCRx's description is accurate, this adds another reason), but I think a correcting parity check is a BAD idea under any condition, most especially one where the user had no say in the matter.

Remember, everything is fine--until it isn't.

--UhClem
January 21, 2013
Quote: It's normal. If there were outstanding writes when the system went down, in most cases, the file system will correct itself when restarted. However, the file system won't update the parity drive so if a drive should fail you'd lose data. That's why a corrective check is kicked off immediately - to maintain parity with any data corrected by the file system.

Quote: That makes sense--superficially. But it may prove unwise. The above assumes that the array was in perfect shape prior to the "crash" (ie, if a non-correcting parity check had been performed then, it would have passed w/0 errs). But, as many of you know, when you run your periodic parity check and find an error or two, it can't be assumed. I don't use unRAID (and, if PCRx's description is accurate, this adds another reason), but I think a correcting parity check is a BAD idea, under any condition, most especially one where the user had no say in the matter. Remember, everything is fine--until it isn't. --UhClem

I'll start by saying that from the posts I've read of yours, I've been highly impressed with your knowledge and level-headed analysis, but I do have to disagree with your conclusion here. If I may respectfully say, I suspect that you have drawn your conclusion because of a lower confidence in the integrity of the parity protection UnRAID provides, partly based on lack of use yourself, and partly based on some of the intense discussions we have had in the past about the issue of Correcting and Non-Correcting parity checks.

I believe we still respect each other's positions on the matter, but the one thing we may not have done was to keep the problem in perspective. In other words, the examples put forth where some believe the parity was corrupt are actually *extremely* rare cases. After all of those discussions, I accept that there may be cases where a Non Correcting parity check would be safer, but it's SO rare that I still strongly believe in Corrective parity checks.
I certainly do in this particular case, after a bad shutdown. The odds (to me) are so incredibly small that the parity might be improperly corrected, that I don't think it's worth checking for. Especially since, and here's the kicker for me, what can you possibly do, if you ran a Non Corrective parity check first and it indicated an error? Especially, what could the normal user do? The normal user will not have Par2 files for ALL of his data disks, and will not have a second parity protection system (such as Reed Solomon) installed. How can he possibly tell if this is not a normal parity error? And how can he determine which disk is actually at fault, has the incorrect block? His first issue is whether it is a problem at all, whether it is the normal case that just needs the parity drive to be updated. The chances are 100,000 to 1 (I'm being conservative, I think it is even higher than that) that it is the parity info on the parity drive that needs updating, than that bits in a sector were somehow changed without causing a disk error or corresponding parity update. The next issue is that it is not that easy to determine what the block is, generally requires some competent syslog analysis, and the syslog, the competent analysis, and even the block itself may or may not be available. Then *if* he obtains the block number, there is a rather technical process, time-consuming and error-prone, to try and determine what file if any is using that block, on *every* data disk. *If* he is able to obtain a list of suspect files, then he has to figure out which if any are actually corrupted. If he has a backup, he can compare. If it is a zip file or other file with built-in CRC, then he can test its integrity (but even then, the chances of it having already been corrupted (transferred that way to the disk) are much higher (in my opinion), than that this is the newly corrupted file that caused the parity error). 
But in all likelihood, he won't be able to tell if some of the files are corrupted or not. And if he cannot be definitive about *every* file, then how can he make any conclusion? He can't, so he has to fall back on the overwhelmingly likely choice, correct the parity info on the parity disk. There is one more case to deal with, but this one is a little more straightforward, what if the corrupted sector is not in a file, or in free space, but is in the file system itself. That means he will need to run reiserfsck on the disks, any disk for which he cannot identify a file. Since reiserfsck won't identify certain metadata changes, such as within the file name, or possibly its attributes or dates, he will also need to carefully examine the entire directory tree for visible damage. And again, if he can't be definitive about *every* disk... I doubt if even the most experienced and competent UnRAID user would want to go through that process, unless there was *other* evidence of a hardware issue with his server, that might indicate all of this work might be worth the effort and time. Perhaps someday, someone will write a script that will automate all or part of the file identifying process. I hope you will give UnRAID a try sometime. I personally am a believer in its reliability and usefulness for archiving, and the greatly increased data integrity it provides. I also don't expect everyone to agree with me.
January 21, 2013
Tom (LimeTech) gave his justification here: http://lime-technology.com/forum/index.php?topic=24921.msg217766#msg217766

Quote: The reasoning behind "correcting parity check upon restart after unclean shutdown" is this. This state of detecting a restart following an unclean shutdown only occurs when there's been, well, an unclean shutdown - that is, a crash, or a power failure, or some other case where the server is rebooted while the array is started and all the disks are mounted and there's possibly outstanding disk i/o. Let's consider this case of outstanding writes at the time of a hard reset. In this case the data on at least one of the disks is "incomplete" - either file data or metadata or both, as well as possibly the parity data. So this particular disk will have some corruption, which 99% of the time a subsequent reboot will fix due to reiserfs replaying journaled transactions. But this does not fix the parity. So we want to start up a parity check, and write corrected parity, as soon as possible, because if some other disk fails, we will not be able to completely rebuild it. So, following an unclean shutdown, provided the server comes up with no missing/disabled disks, I would say you always want to do a correcting parity check, and this is what the code does.
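The risk Tom describes can be shown with a toy two-disk array. This is only a sketch assuming byte-wise XOR parity; the disk contents and function names are invented for the example. A journal replay changes disk 1, parity is left stale, and disk 2 then fails before parity has been corrected.

```python
def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def rebuild_missing(surviving_blocks, parity):
    """Reconstruct the one missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


disk1_old, disk2 = b"AAAA", b"BBBB"
parity = xor_blocks([disk1_old, disk2])   # parity as it was before the crash

disk1_new = b"AAZZ"                       # journal replay rewrote part of disk 1
# Parity was NOT updated, and now disk 2 fails.
rebuilt_with_stale_parity = rebuild_missing([disk1_new], parity)

# Same failure, but after a correcting parity check has run.
fresh_parity = xor_blocks([disk1_new, disk2])
rebuilt_with_fresh_parity = rebuild_missing([disk1_new], fresh_parity)

print(rebuilt_with_stale_parity == disk2)  # rebuild is wrong where disk 1 changed
print(rebuilt_with_fresh_parity == disk2)  # rebuild matches the lost disk
```

Only the bytes that the replay touched come back wrong, which is the window the immediate correcting check is meant to close.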
January 22, 2013
Quote: I'll start by saying that from the posts I've read of yours, I've been highly impressed with your knowledge and level-headed analysis, but I do have to disagree with your conclusion here. If I may respectfully say, I suspect that you have drawn your conclusion because of a lower confidence in the integrity of the parity protection UnRAID provides, partly based on lack of use yourself, and partly based on some of the intense discussions we have had in the past about the issue of Correcting and Non-Correcting parity checks.

Just to back up a little ... I do believe the core unRAID mechanism for parity protection is solid (a modified/relaxed version of Linux's md RAID-4) [even more so now that the possible, but highly unlikely, "race condition" glitch was fixed in the 4.7-5.0 transition].

I also have a very high level of trust in modern disk drives--not that they won't fail, but that they can be relied upon not to return a sector, from a Read, whose contents differ from the sector previously stored during a Write (for a given LBA). A drive might return an error condition, i.e., No_Data (which is much!! better than Erroneous_Data [and NO error condition]).

The real issues arise just before and after (and even during) the RAID parity calculation, and the primary culprit is non-ECC memory. I believe that most unRAID users (as well as most PC users in general) use non-ECC memory. While memory parity errors are not commonplace, they are definitely not rare either. (Getting struck by lightning is rare.) It is these occasional memory glitches (bit-flips) that are the likely cause of a variety of (disk) data integrity problems. They can manifest [on disk] as bad data (but good parity) or good data (but bad parity) or bad data (and bad parity). [Using ECC memory does not eliminate all such risks, but is probably 10x-100x better than using non-ECC.] Regardless, for many users, because of motherboard and/or CPU, ECC is not even an option.
But that is where software-based efforts can reduce the resulting (disk) data errors, without resorting to all the complexities, and overhead, of something like ZFS. [I forget which thread here, but I briefly sketched out the idea for a "limited/targeted parity check" which was intended to follow a cache-drive=>unRAID-array session.] There are other possible enhancements too. But the goal is to catch any data integrity errors as early as possible; thus, they can be easily resolved, and they are not lying in wait to either cause errors (possibly silently) during a full drive restore, or to cause collateral damage during a correcting parity check, either explicit (user-initiated) or implicit (event-initiated [i.e., the subject of this thread]).

Oh yeah, speaking of this thread ... my quibble with doing a corrective parity check following the reiserfsck runs of a crash recovery procedure is this: while you definitely need to re-generate parity corresponding to the sectors/stripes modified by reiserfsck, those should be the only sectors/stripes which get correctively parity checked. (Again, a "targeted parity check".) This way, it finishes in minutes, not hours -- and we've basically eliminated any chance of collateral damage. An option should be added to reiserfsck to generate a list of LBAs which were written to effect the repair (and, of course, unRAID would use that). Those lists (one per repaired disk) would comprise the target for the "targeted [corrective] parity check".

And, yes, RobJ, I do believe that unRAID should include a tool, nee wholiveshere, which would take a list of LBAs (or Stripe#s) and, for each such arg, generate the "usage" of that address on each data disk. The need for it would be much less if error conditions were caught "in the act", but it would still come in handy in extreme cases.
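The "targeted parity check" idea can be sketched as follows. This is purely illustrative: it assumes XOR parity, models each disk as an in-memory list of blocks indexed by stripe number, and takes the list of repaired stripes as an input. No such list is produced by reiserfsck today (the post proposes it as a new option), so `repaired_stripes` is a hypothetical input.

```python
def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def targeted_parity_check(data_disks, parity_disk, repaired_stripes, correct=True):
    """Check (and optionally correct) parity only for the listed stripes.

    data_disks: list of disks, each a list of blocks indexed by stripe.
    parity_disk: list of parity blocks indexed by stripe (updated in place).
    Returns the stripe numbers whose parity did not match.
    """
    mismatched = []
    for stripe in sorted(set(repaired_stripes)):
        expected = xor_blocks([disk[stripe] for disk in data_disks])
        if expected != parity_disk[stripe]:
            mismatched.append(stripe)
            if correct:
                parity_disk[stripe] = expected
    return mismatched


# Toy array: two data disks, four stripes, parity initially consistent.
disks = [[bytes([d * 16 + s]) for s in range(4)] for d in range(2)]
parity = [xor_blocks([disk[s] for disk in disks]) for s in range(4)]

disks[0][1] = b"\xaa"        # stripe 1: changed by the (hypothetical) fsck repair
repaired_stripes = [1]       # hypothetical list reported by the repair tool

print(targeted_parity_check(disks, parity, repaired_stripes))
```

Because stripes outside the list are never read or rewritten, an unrelated pre-existing mismatch elsewhere on the array is left alone rather than being silently "corrected", which is the collateral damage the post is worried about.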
January 22, 2013
Quote: I also have a very high level of trust in modern disk drives--not that they won't fail, but that they can be relied upon to not return a sector, from a Read, whose contents differs from the sector previously stored, during a Write (for a given LBA) It might return an error condition--i.e., and No_Data (which is much!! better than Erroneous_Data [and NO error condition])

I do not share that confidence in the disks themselves. The reason for the post-preclear-read phase in my preclear_disk script is specifically that users have found disks that randomly returned different data for the same stored data block, with no other errors to indicate anything was amiss. These disks have been found multiple times by various users.

I know there is a sector-level checksum, and an additional checksum on the SATA communications, but somewhere in between the bits do get mangled undetected. I suspect marginal cache memory or associated electronics internal to the drive. They first showed themselves as random parity errors, as a given disk would return a different value than stored every once in a while.

There is a memory test for system RAM, but there is no equivalent for the memory internal to the disk (except perhaps for the manufacturer, and they are not sharing).

As far as the correcting parity sync... perhaps it can change to non-correcting once all the journals are replayed on the data disks.

Joe L.
January 22, 2013
I feel it is important to include this preface: unRAID users should note that what we are discussing here, while definitely possible, is pretty unlikely. It is not a reason to distrust unRAID, or even your hardware system. But it should motivate you to be sure that your important data is properly, and rigorously, backed up.

Quote: I also have a very high level of trust in modern disk drives--not that they won't fail, but ...

Quote: I do not share that confidence in the disks themselves, as the reason for the post-preclear-read phase in my preclear_disk script is specifically because users have found disks that randomly returned different data for the same stored data block and with no other errors to indicate anything was amiss. These disks have been found multiple times by various users.

Yes, there will always be the possibility of defective hardware, and it is prudent to use tools/procedures (like preclear) to help detect such hardware early. I suspect that some of those reports were actually provoked by RAM (parity) faults (somewhere in the test-chain), but those reports could be isolated (and "re-categorized") by further focused testing. The remaining cases provide an excellent basis, even when the server uses ECC memory, for making an effort to verify the successful completion of all modifications to one's array, and to verify all reports of errors in one's array before taking corrective [and irreversible] action.

Quote: I know there is a sector level checksum, and an additional checksum on the SATA communications, but somewhere in between the bits do get mangled undetected. I suspect marginal cache-memory or associated electronics internal to the drive. They first showed themselves as random parity errors, as a given disk would return a different value than stored every once in a while.

I agree with this 100%. "But, wait, there's more!" [I don't mean to be alarmist here, but it is important to "confront your demons", right?]
Isn't it very likely that only about half such occurrences have become visible? That same "bit mangling" is just as likely to have occurred on the way TO the disk, resulting in a persistent/consistent [array] parity error. I suspect that this particular scenario is the cause for most of the "apparently" true examples of bit rot--but, again, only a rigorous verification of all array modifications would allow one to distinguish between this apparent bit rot and a real bona fide (my holy grail of) bit rot.

That would be where, for a particular sector, the data bits and the ECC bits morphed just so, such that a sector contents different from what was originally stored and verified was returned (on a subsequent Read) successfully (i.e., no ECC error). And repeated Reads of that sector (successfully) return that same new contents (thus distinguishing it from one of the bit-mangling scenarios [FROM the drive] like Joe L. described). Statistically, I'd say that one ranks right up there with getting struck by lightning--except that you definitely know it when lightning strikes you.

Quote: There is a memory test for system RAM, but there is no equivalent for the memory internal to the disk. (except perhaps for the manufacturer, and they are not sharing)

I would expect that a quickie version of it is performed at each (drive) power-on initialization. But, since it needs to be a quick test, only the flagrant flakes will be detected.

--UhClem "Welcome to the Future Fair--a fair for all, and no fair to anybody."
January 22, 2013
Quote: I do not share that confidence in the disks themselves, as the reason for the post-preclear-read phase in my preclear_disk script is specifically because users have found disks that randomly returned different data for the same stored data block and with no other errors to indicate anything was amiss. These disks have been found multiple times by various users. I know there is a sector level checksum, and an additional checksum on the SATA communications, but somewhere in between the bits do get mangled undetected. I suspect marginal cache-memory or associated electronics internal to the drive. They first showed themselves as random parity errors, as a given disk would return a different value than stored every once in a while.

Joe, I know it will be very hard (impossible) to provide good numbers, but just to give me some perspective, could you come up with a ballpark figure as to how many times this has occurred? And over how many times Preclear may have been run? Would it be closer to 1 in 3000 drives, 1 in 10,000, 1 in 30,000, etc.?

It strikes me that it could be very useful to know if the problem is on the read side or the zero-writing side. When helping with video playback issues, the very first question is whether it is a problem in the recording or in its playback. To discover that, you ask them to play exactly the same defective part back, and determine if the defective play is absolutely identical. If it is, then you know you have a defect in the stored recording, and that is one set of issues. If the playback is different, not identical, then the issue is a playback issue, and a whole different set of diagnostics and issues applies.

Preclear could help out with this. Upon detecting a non-zero sector, first make sure all buffers and caches are flushed, then reread. We would then know if it was a random read error (sector is actually all zeroes), or a write zeroes error (sector truly is not all zeroes).
It could be very useful if we discovered that these Preclear nonzero errors were all one or the other problem (all read or all write). No clue what we should do about them, but more info gives us a start to better understanding them. What we really need is more info and better tests, that would help us rule out some of the possible suspects. There are still too many components and data paths involved. (Edit:) We're still just guessing.
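The reread idea above could look roughly like this. It is a sketch only, not part of the actual preclear_disk script: `read_sector` is a hypothetical callable that is assumed to return the sector's bytes fresh from the drive (i.e., with buffers and caches already flushed or bypassed), and the three labels are invented for the example.

```python
def classify_nonzero_sector(read_sector, rereads=5):
    """Re-read a sector that was expected to be all zeroes and classify it.

    read_sector: hypothetical callable returning the sector's bytes,
    assumed to bypass any caches so every call hits the drive.
    """
    samples = [read_sector() for _ in range(rereads)]

    if all(not any(sample) for sample in samples):
        # Every reread is all zeroes: the original non-zero read was a one-off.
        return "read-side glitch"
    if all(sample == samples[0] for sample in samples):
        # The same non-zero contents every time: the stored data really is non-zero.
        return "stored non-zero"
    # Results vary between rereads: the drive reads back inconsistently.
    return "intermittent read"


# Simulated drives for illustration.
print(classify_nonzero_sector(lambda: bytes(512)))
print(classify_nonzero_sector(lambda: b"\x01" + bytes(511)))
```

Consistent non-zero rereads point at the write path, while varying results point at the read path, which is the recording-versus-playback distinction drawn above.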
January 22, 2013
Quote: I do not share that confidence in the disks themselves, as the reason for the post-preclear-read phase in my preclear_disk script is specifically because users have found disks that randomly returned different data for the same stored data block and with no other errors to indicate anything was amiss. These disks have been found multiple times by various users. I know there is a sector level checksum, and an additional checksum on the SATA communications, but somewhere in between the bits do get mangled undetected. I suspect marginal cache-memory or associated electronics internal to the drive. They first showed themselves as random parity errors, as a given disk would return a different value than stored every once in a while.

Quote: Joe, I know it will be very hard (impossible) to provide good numbers, but just to give me some perspective, could you come up with a ballpark figure as to how many times this has occurred? And over how many times Preclear may have been run? Would it be closer to 1 in 3000 drives, 1 in 10,000, 1 in 30,000, etc?

Really hard to judge... I seem to remember at least 4 or 5 times in the past few years. The question is, how many disks does that represent? I'll bet more than a few thousand.

Quote: It strikes me that it could be very useful to know if the problem is on the read side or the zero-writing side. When helping with video playback issues, the very first question is if it is a problem in the recording or in its playback, and to discover that, you ask them to play exactly the same defective part back, and determine if the defective play is absolutely identical. If it is, then you know you have a defect in the stored recording, and that is one set of issues. If the playback is different, not identical, then the issue is a playback issue, and a whole different set of diagnostics and issues applies.

I would say it is on the read side in most cases. I remember disks that only occasionally returned the wrong value, but more often would read fine (doing repeated md5sums over ranges of blocks on the disk). Most md5sums are identical for a given set of blocks, then a different one pops up, then it returns to the original (presumed correct) value. A badly written bit would never pass its own CRC check and would be marked as an unreadable sector (checksum does not match sector contents).

Quote: Preclear could help out with this. Upon detecting a non-zero sector, first make sure all buffers and caches are flushed, then reread. We would then know if it was a random read error (sector is actually all zeroes), or a write zeroes error (sector truly is not all zeroes).

Actually, it prints the criteria needed to duplicate the "dd". In cases where we've retried, it has always ended up being written correctly and, at times, read back incorrectly.

Quote: It could be very useful if we discovered that these Preclear nonzero errors were all one or the other problem (all read or all write). No clue what we should do about them, but more info gives us a start to better understanding them.

Actually, in most cases, once a drive that acts this way is discovered, I suggest the user perform a "chock" test. In that test, the drive is removed from the server and placed behind the tire of their car. The test involves backing over the drive multiple times in an attempt to see how well it performs as a "wheel chock." If it does not impede the motion of the car, retest. If it does impede the motion, get a bigger car and retest.

Quote: What we really need is more info and better tests, that would help us rule out some of the possible suspects. There are still too many components and data paths involved. (Edit:) We're still just guessing.

When you have a drive that returns an occasional random bit but shows no other error, it is really hard to trace random parity errors to a specific drive. The best way identified by users so far is the set of scripts used to repeatedly read blocks from the disk and compare md5sums. They should always match.

The only thing that might help is, when a parity error is detected, to flush the buffers and retry, re-reading the same set of blocks across the disks. If the results differ (parity is now correct), it might be possible to ignore the first "glitch" and just report it instead of changing parity.

Joe L.
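For readers who want to try the repeated-md5sum approach, here is a rough sketch of the idea. It is not the users' actual scripts: the function names, the 512-byte default block size, and the pass count are assumptions for illustration. One caveat is that repeated reads of the same range may be served from the operating system's cache rather than the drive, so for a real test the cache would need to be flushed or bypassed between passes; this sketch does not do that.

```python
import hashlib
from collections import Counter


def md5_of_range(path, start_block, block_count, block_size=512):
    """Return the MD5 hex digest of a contiguous range of blocks."""
    with open(path, "rb") as dev:
        dev.seek(start_block * block_size)
        return hashlib.md5(dev.read(block_count * block_size)).hexdigest()


def repeated_checksums(path, start_block, block_count, passes=10, block_size=512):
    """Read the same block range several times and tally the digests seen."""
    return Counter(
        md5_of_range(path, start_block, block_count, block_size)
        for _ in range(passes)
    )


def is_consistent(digest_counts):
    """A healthy range yields exactly one distinct digest across all passes."""
    return len(digest_counts) == 1
```

A call such as `repeated_checksums("/dev/sdX", 0, 2048)` (device name hypothetical, run read-only and with suitable permissions) returns a tally of digests; more than one distinct digest for the same range is the symptom described above, where an odd value pops up and then the original one returns.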