garycase Posted November 2, 2015
If you do a New Config and check the "Parity is already valid" box [i.e. what's often called the "Trust Parity" option], the system immediately starts a parity check when the array is started. In situations where the goal of the New Config is to do a drive rebuild due to some user error and a previous configuration is being reconstructed to take this action, that parity check can do some "corrections" that will invalidate the parity disk and not allow a proper reconstruction of the drive. The check can be quickly Canceled => but ANY "corrections" already done will cause errors in the reconstructed drive.
I'd like to see one of the following changes ...
(1) Make this specific check a non-correcting check, so nothing is changed on the parity disk
OR
(2) Instead of automatically starting one, post a note advising it ... e.g. "A Parity Check should be run to confirm the parity disk is valid". You could post this note every time the array starts until a check is run.
OR
(3) When you check the "Parity is already Valid" box, display a 2nd box that says "Automatically check parity when the array starts" => and only start the automated check if that box is checked.
Any of those would eliminate the potential changes to parity when the array starts that are NOT desirable.
=================================
Great news => This is fixed in v6.1.5 !! "Trust Parity" now means exactly that -- the array will Start and NOT initiate a parity check. Not helpful for those who have already lost what would have been recoverable data (e.g. Archivist) ... but nice to know that it's no longer an issue.
JorgeB Posted November 2, 2015
+1 Any of those options would be good.
trurl Posted November 2, 2015
+1 There have been several incidents recently where a user did something that needed to be fixed with a New Config followed by a data disk rebuild.
Archivist Posted November 3, 2015
+1 Me too; see all of my problems in the "Help Please ??" post. I would also request some means of identifying an existing Parity Drive when applying a "new" configuration, with an override for changes to Parity (e.g. a larger Parity Drive replacement).
Dave
Brucey7 Posted November 4, 2015
I think the best solution is simply to add an option on the "Settings" page: default parity check to read-only or correcting.
garycase Posted November 4, 2015 (Author)
The option to default to a non-correcting check would indeed resolve this (note it was one of the 3 alternatives I mentioned).
The simple fact is that the "Trust Parity" option (i.e. checking the "Parity is already valid" box) allows reconfiguring an array when you KNOW that parity is already good. If the reason you re-created the array is to rebuild another drive that has been corrupted, then automatically starting a parity check will wipe out that good parity if ANY of the corrupted sectors are encountered before you can CANCEL the automatic check [since parity will be "corrected" to match the corrupted data].
I think the best option is to NOT do an automatic check in this circumstance => the system's been told to "Trust Parity" ... so it should TRUST it !!
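The failure mode described above can be illustrated with a minimal sketch. It assumes single parity computed as a bytewise XOR across the data disks, which is a simplified model and not unRAID's actual driver code; the disk contents and all function names are made up for illustration.

```python
from functools import reduce

def xor_parity(blocks):
    """Simplified single-parity model: bytewise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild(other_blocks, parity):
    """Reconstruct one missing block from the surviving blocks plus parity."""
    return xor_parity(other_blocks + [parity])

def parity_check(blocks, parity, correcting):
    """Return (mismatch count, parity as it stands after the check)."""
    expected = xor_parity(blocks)
    mismatches = sum(1 for p, e in zip(parity, expected) if p != e)
    # A correcting check overwrites parity to match whatever is on the disks.
    return mismatches, (expected if correcting else parity)

# Hypothetical disk contents (equal length), with parity known to be good.
disk1 = b"family photos"
disk2 = b"tax documents"
parity = xor_parity([disk1, disk2])

# disk1 gets partly overwritten, e.g. by being mis-assigned as parity.
corrupted_disk1 = b"\x00" * 4 + disk1[4:]

# Non-correcting check: parity is untouched, so disk1 is still recoverable.
_, kept = parity_check([corrupted_disk1, disk2], parity, correcting=False)
print(rebuild([disk2], kept) == disk1)

# Correcting check: parity now encodes the corruption, so a rebuild
# reproduces the corrupted contents instead of the original data.
_, fixed = parity_check([corrupted_disk1, disk2], parity, correcting=True)
print(rebuild([disk2], fixed) == corrupted_disk1)
```

In this model the only difference between the two outcomes is whether the check is allowed to write to parity, which is the point of the request.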
limetech Posted November 5, 2015
> if the reason you re-created the array is to rebuild another drive that has been corrupted
In this case you would not use the 'trust parity' box. Instead you restore the device configuration you had before then type a command at the console or telnet/ssh session:
mdcmd set invalidslot <N>
where <N> is the disk number that you want disabled. After typing this command, you click 'Start' back on the webGui (without doing an intervening browser refresh). The array will come up with that disk disabled, and if a device has been assigned, it will kick off the reconstruct.
[The way the 'trust parity' box is implemented is, if checked, emhttp will execute 'mdcmd set invalidslot 99' just before starting driver.]
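As a small sketch of the procedure just described, the helper below only formats the console command quoted in the post. The function name and the constant are hypothetical, nothing here is part of unRAID, and nothing is executed against a server.

```python
# Hypothetical helper (not part of unRAID): it only builds the command
# string quoted in the post above and does not run anything.
TRUST_PARITY_SLOT = 99  # per the post, the value emhttp passes for 'trust parity'

def invalidslot_command(slot):
    """Return the 'mdcmd set invalidslot <N>' command line for a slot number."""
    if isinstance(slot, bool) or not isinstance(slot, int) or slot < 0:
        raise ValueError(f"slot must be a non-negative integer, got {slot!r}")
    return f"mdcmd set invalidslot {slot}"

# Order of operations described in the post (comments only):
#   1. Restore the previous device configuration.
#   2. Type the command at the console or in a telnet/ssh session.
#   3. Click 'Start' in the webGui without refreshing the browser in between.
print(invalidslot_command(1))
print(invalidslot_command(TRUST_PARITY_SLOT))
```

The mapping between slot numbers and physical disks is not spelled out in the post, so choosing the right number is left to the caller here.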
trurl Posted November 5, 2015
> > if the reason you re-created the array is to rebuild another drive that has been corrupted
> In this case you would not use the 'trust parity' box. Instead you restore the device configuration you had before then type a command at the console or telnet/ssh session:
> mdcmd set invalidslot <N>
> where <N> is the disk number that you want disabled. After typing this command, you click 'Start' back on the webGui (without doing an intervening browser refresh). The array will come up with that disk disabled, and if a device has been assigned, it will kick off the reconstruct.
> [The way the 'trust parity' box is implemented is, if checked, emhttp will execute 'mdcmd set invalidslot 99' just before starting driver.]
Wish invalidslot was better documented. All I have ever seen are old posts for old versions. It is one of those things that seems more like unreliable folklore than recommended procedures.
I think adding it to the GUI would just be asking for trouble, but something in the wiki so that more "motivated" users could know exactly how to use it and what to expect would be great. Maybe somebody with a test server could work through testing a detailed procedure to follow and document it. Would certainly be better than the "procedure" we have come up with that prompted this feature request.
JorgeB Posted November 5, 2015
> > if the reason you re-created the array is to rebuild another drive that has been corrupted
> In this case you would not use the 'trust parity' box. Instead you restore the device configuration you had before then type a command at the console or telnet/ssh session:
> mdcmd set invalidslot <N>
> where <N> is the disk number that you want disabled. After typing this command, you click 'Start' back on the webGui (without doing an intervening browser refresh). The array will come up with that disk disabled, and if a device has been assigned, it will kick off the reconstruct.
> [The way the 'trust parity' box is implemented is, if checked, emhttp will execute 'mdcmd set invalidslot 99' just before starting driver.]
I'm trying to test this procedure but I think I'm doing something wrong. This is what I did:
(1) Created a parity + 2 disk array, xfs formatted, and copied some data to disk1
(2) Stopped the array
(3) New Config
(4) Selected the same parity and disk 2
(5) Selected a different disk as disk 1
(6) Without starting the array, typed on the console: mdcmd set invalidslot 1
(7) Started the array
Instead of rebuilding disk 1, it appears as unmountable and unRAID starts doing a parity sync.
I also tried invalidslot 2 with the same result, as I was not sure if slot 1 is parity or disk 1.
Can anyone see what I'm doing wrong?
garycase Posted November 5, 2015 (Author)
> > if the reason you re-created the array is to rebuild another drive that has been corrupted
> In this case you would not use the 'trust parity' box. Instead you restore the device configuration you had before then type a command at the console or telnet/ssh session:
> mdcmd set invalidslot <N>
> where <N> is the disk number that you want disabled. After typing this command, you click 'Start' back on the webGui (without doing an intervening browser refresh). The array will come up with that disk disabled, and if a device has been assigned, it will kick off the reconstruct.
> [The way the 'trust parity' box is implemented is, if checked, emhttp will execute 'mdcmd set invalidslot 99' just before starting driver.]
Not sure what you're saying here. The cases where this was needed were instances where somebody had already done a New Config, but had made some significant error -- e.g. incorrectly assigning a data drive as parity -- and when they started the system it corrupted a drive.
The users almost certainly do NOT have a copy of their previous config from the flash drive; so to "get back" to the configuration they HAD requires a New Config. As I understand it, if they do this, UnRAID will NOT recognize that the parity drive is valid unless the "Parity is already valid" box is checked. Is that not correct?
The problem is that if you check the box UnRAID immediately starts a parity check when you start the array. If there's a "known bad" disk in the system at that point, then that parity check will almost certainly do a bunch of "corrections" to the parity drive -- which kills the opportunity to rebuild that bad disk.
I guess my question is WHY doesn't the "Trust Parity" option result in parity actually being trusted instead of starting that parity check?
JorgeB Posted November 5, 2015
OK, I misunderstood: "restore the device configuration" means restoring from a flash backup, not doing a New Config.
This is what once happened to me, and why I'd like this feature request:
(1) Server was all green.
(2) I started to upgrade my parity drive.
(3) I forgot to make a flash backup of the old config; yes, I know I should have.
(4) During the parity sync one of the data disks redballed with read errors.
So I had to put the old parity back, did a New Config and started the array. Thankfully the problem disk's errors were not at the beginning, so I was able to stop the parity check without invalidating my parity, and then rebuilt the problem disk.
If the disk had been completely dead, I believe I could not have recovered from this without this feature request.
trurl Posted November 5, 2015
> ...I started to upgrade my parity drive...During the parity sync one of the data disks redballed with read errors...
Slightly OT, but I thought that a redball only results from a write error. Here is what I thought happened with a read error, and that might produce a redball: If the data cannot be read, then unRAID will "reconstruct" the data from the other disks + parity, and then attempt to write that data back to the disk. If that write fails then you get a redball.
But if parity is being built to a new disk how can it reconstruct the data that can't be read? Is this a special case for redballing, or have I got it all wrong?
JorgeB Posted November 5, 2015
> > ...I started to upgrade my parity drive...During the parity sync one of the data disks redballed with read errors...
> Slightly OT, but I thought that a redball only results from a write error. Here is what I thought happened with a read error, and that might produce a redball: If the data cannot be read, then unRAID will "reconstruct" the data from the other disks + parity, and then attempt to write that data back to the disk. If that write fails then you get a redball.
> But if parity is being built to a new disk how can it reconstruct the data that can't be read? Is this a special case for redballing, or have I got it all wrong?
I think you're right. It was some time ago; I think what I got were several read errors.
jbuszkie Posted November 5, 2015
> If you do a New Config and check the "Parity is already valid" box [i.e. what's often called the "Trust Parity" option], the system immediately starts a parity check when the array is started. In situations where the goal of the New Config is to do a drive rebuild due to some user error and a previous configuration is being reconstructed to take this action, that parity check can do some "corrections" that will invalidate the parity disk and not allow a proper reconstruction of the drive. The check can be quickly Canceled => but ANY "corrections" already done will cause errors in the reconstructed drive.
> I'd like to see one of the following changes ...
> (1) Make this specific check a non-correcting check, so nothing is changed on the parity disk
> OR
> (2) Instead of automatically starting one, post a note advising it ... e.g. "A Parity Check should be run to confirm the parity disk is valid". You could post this note every time the array starts until a check is run.
> OR
> (3) When you check the "Parity is already Valid" box, display a 2nd box that says "Automatically check parity when the array starts" => and only start the automated check if that box is checked.
> Any of those would eliminate the potential changes to parity when the array starts that are NOT desirable.
+1...
garycase Posted November 16, 2015 (Author)
This is REALLY a necessary change => when you need to "Trust Parity", the system should in fact trust it ... and NOT start an automatic check that COULD, depending on the reason for reconstituting the array, actually destroy the ability to do a rebuild.
Tom => Note that you once noted that it was NOT your intent for this to happen (way back in v5 days). Somewhere along the way it has started doing it again !!
> Yes I've noticed that a "trust parity" operation fires up a parity-check. This is not the intent. I intended that if you Start the array with "Parity is already valid" box checked, that no parity check is automatically started. This has been fixed in 5.0.
garycase Posted November 16, 2015 (Author)
Tom => In lieu of this being fixed, is there a command line parameter that can be changed before starting the array that will keep this check from starting ?? [Or will perhaps at least change it to a non-correcting check so it won't cause any harm]
garycase Posted November 16, 2015 (Author)
Tom => There's been yet-another casualty because of the automatic parity check. That's now the 4th case I can recall in the past few months where a drive could have easily been recovered after it was accidentally assigned as parity (and thus corrupted) IF the "Trust Parity" option actually TRUSTED it ... but the automatic parity check destroyed the ability to do a rebuild.
Here's the latest thread: http://lime-technology.com/forum/index.php?topic=44022.msg420282#msg420282
itimpi Posted November 16, 2015
Related to this would be an option in the New Config to leave all current assignments in place (as though one had done a New Config and then assigned all drives as they were before the New Config) so that one can now make any desired changes. As well as being a convenience, this would also dramatically reduce the chance of anyone accidentally assigning the drives incorrectly when doing the New Config. I even think it should be the default behaviour with a checkbox added labelled something like "Clear all current assignments"
garycase Posted November 16, 2015 (Author)
That would indeed be convenient => but by far the #1 thing that needs to be done is to make "Trust Parity" actually TRUST parity -- and NOT start a new check when the array is started.
ohlwiler Posted November 16, 2015
> Related to this would be an option in the New Config to leave all current assignments in place (as though one had done a New Config and then assigned all drives as they were before the New Config) so that one can now make any desired changes. As well as being a convenience, this would also dramatically reduce the chance of anyone accidentally assigning the drives incorrectly when doing the New Config. I even think it should be the default behaviour with a checkbox added labelled something like "Clear all current assignments"
Something I grumble about every time I hit that "New Config" button.
jbuszkie Posted November 16, 2015
> > Related to this would be an option in the New Config to leave all current assignments in place (as though one had done a New Config and then assigned all drives as they were before the New Config) so that one can now make any desired changes. As well as being a convenience, this would also dramatically reduce the chance of anyone accidentally assigning the drives incorrectly when doing the New Config. I even think it should be the default behaviour with a checkbox added labelled something like "Clear all current assignments"
> Something I grumble about every time I hit that "New Config" button.
Me too! +1 to that as well... But I agree that changing the default behavior so that it doesn't do a correcting parity check is the first priority.
garycase Posted December 1, 2015 (Author)
Great news => This is fixed in v6.1.5 !! "Trust Parity" now means exactly that -- the array will Start and NOT initiate a parity check. Not helpful for those who have already lost what would have been recoverable data (e.g. Archivist) ... but nice to know that it's no longer an issue.