January 23, 2010

Hi there,

Thanks for the fantastic software. However, I have a warning for others should they do something silly (as I did) and then decide to check whether the silly thing they did has affected their data.

If you think you have bad data on a drive, do NOT perform a parity check.

Unfortunately, the button in the webUI labelled "Check" actually writes data to the parity drive. If there are mismatches between parity and data, it assumes the parity is wrong rather than considering that the data may be wrong. So the "Check" function is actually an "Assume my data is OK and regenerate the parity information, telling me if there were any errors" function. It's not "Check" at all.

This is a bug. A user interface bug. I've just lost a shitload of data due to this. Hopefully not too much: fortunately I have some additional parity information at the file level which will at least tell me how much of the data is now dead.

How did this happen:

1. I added a disk that I was going to use for unprotected additional storage.
2. I wished to format this disk.
3. I made the mistake of running mkfs.reiserfs on the wrong disk (sdc1 rather than hdc1 - a reasonably easy mistake to make, unfortunately, but something I take full responsibility for - I am an idiot!)
4. I immediately recognised my mistake after this was started, and killed the process.
5. I re-checked the mount, and all the dir links were still there (thus one presumes the majority of the data is still there).
6. I figured the next step was to Check the array against parity (prior to then fixing the data from parity).

I repeat, for the purposes of warning others: the Check function actually writes to the parity disk. Do NOT run this if you think some of the data on one drive is invalid.

I strongly suggest that the software is altered so that this is made clear to the user from the user interface. Any function that writes to the disk should come with a warning.
Any suggestions on the next move here are greatly appreciated. Needless to say the array is now stopped, the "suspect" data drive the only one mounted (read only), and I'm currently running through the data to see just how much is dead. Cheers, Jonathan
January 23, 2010

Two items.

1. You might be able to recover most everything on that drive by using the process described here: http://antrix.net/journal/techtalk/reiserfs_data_recovery_howto.comments

The command would be:

    reiserfsck --rebuild-tree -S /dev/sdc1

2. There is a true "check, but do not update parity" process, but there is no web interface to it unless you install the unMENU add-on. It can be invoked at the command line as:

    /root/mdcmd check NOCORRECT

Unfortunately, it will give you a count of the errors but not tell you where they are. The web interface also looks the same in both cases, as it has not yet been changed to distinguish a NOCORRECT check from the normal "check", where parity is rewritten if a difference is detected.

It is described in this post: http://lime-technology.com/forum/index.php?topic=3430.msg29619#msg29619

Joe L.
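To illustrate the difference between the two kinds of check, here is a minimal Python sketch. It assumes simple byte-wise XOR parity across the data disks; the function names (`xor_parity`, `parity_check`) and the tiny in-memory "disks" are made up for the example and are not unRAID code. It is only a toy model of the behaviour described above, not how `mdcmd` is implemented.

```python
from functools import reduce


def xor_parity(blocks):
    """Byte-wise XOR of equal-length blocks (toy model of single parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_check(data_disks, parity, correct):
    """Compare stored parity with parity recomputed from the data disks.

    correct=True  models a correcting check: parity is overwritten to match data.
    correct=False models a NOCORRECT-style check: mismatches are only counted.
    Returns (error_count, parity_after_check).
    """
    expected = xor_parity(data_disks)
    errors = sum(1 for e, p in zip(expected, parity) if e != p)
    return errors, (expected if correct else parity)


# Hypothetical three-disk array with valid parity.
disks = [b"alpha", b"bravo", b"delta"]
parity = xor_parity(disks)

# One byte on the second data disk gets damaged.
damaged = [disks[0], b"brXvo", disks[2]]

errors_ro, parity_ro = parity_check(damaged, parity, correct=False)
errors_rw, parity_rw = parity_check(damaged, parity, correct=True)

# Rebuilding the second disk from the other disks plus parity:
rebuilt_after_ro = xor_parity([disks[0], disks[2], parity_ro])
rebuilt_after_rw = xor_parity([disks[0], disks[2], parity_rw])
```

In this toy model both checks report one error, but only the read-only check leaves parity in a state from which the original `b"bravo"` can still be rebuilt. After the correcting check, the rebuild reproduces the damaged data.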
January 23, 2010 (Author)

Thanks Joe for the swift and helpful reply.

I'll look into the unMENU extension, and will certainly keep that in mind for next time. I may also modify the current web interface to rename the feature so that I don't accidentally click it again - hopefully Tom will take this into account for future versions. Just knowing that errors exist is exactly what I was assuming the "Check" function did (after all, it's otherwise equivalent to Parity Sync, right?)

Am currently running reiserfsck on the drive and will let you know the results - the data are non-critical (i.e. obtainable from elsewhere given sufficient time and bandwidth) - an inconvenience at worst, and certainly a lesson learnt.

Cheers,
Jonathan
January 24, 2010

Hey JM - nice to see you on the boards...AND...welcome. Been using unRAID myself for several years (12TB system) and have been very happy with it - but don't ask me how it all works - more mysterious than xbmc by far. JoeL is definitely "Da Man" when help is required and has helped me (and many others) more than words express (Thx Joe). Hope you have fun with it.

Flambot (a.k.a Kidkiwi)
January 24, 2010

This is of no help, but I think that unMENU should be installed by default and used especially by novices to Linux. Thumbs up!
January 24, 2010

Quote: "Thus, if there are errors between parity and data, it assumes the parity is wrong rather than assuming that the data may be wrong. Thus, the "Check" function is actually a "Assume my data is OK and regenerate the parity information"

That got me thinking... Why is it really assuming that the parity disk is wrong and the data is right? Aren't the chances of things being the other way around equally likely?

Maybe, if it had to assume that the data is wrong, it would have no way of knowing exactly which data disk should be assumed wrong? Shouldn't we then be given a choice to tell it which disk slot should be considered wrong?
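The question of which disk to blame can be made concrete with a small Python sketch. It assumes plain byte-wise XOR parity over the data disks; the helper names (`xor_blocks`, `flip_byte`, `mismatches`, `rebuild_slot`) and the sample data are hypothetical and only serve the illustration. Under that assumption, a mismatch looks identical no matter which member of the array is damaged, so parity alone cannot point at a disk. If the user names the suspect slot, though, that slot can be recomputed from the remaining disks plus parity.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def flip_byte(block, index):
    """Return a copy of block with one byte corrupted."""
    damaged = bytearray(block)
    damaged[index] ^= 0xFF
    return bytes(damaged)


def mismatches(data_disks, parity):
    """Byte offsets where stored parity disagrees with recomputed parity."""
    expected = xor_blocks(data_disks)
    return [i for i, (e, p) in enumerate(zip(expected, parity)) if e != p]


def rebuild_slot(data_disks, parity, slot):
    """Recompute one data disk from all the other data disks plus parity."""
    others = [d for i, d in enumerate(data_disks) if i != slot]
    return xor_blocks(others + [parity])


data = [b"alpha", b"bravo", b"delta"]
parity = xor_blocks(data)

# Damage each data disk in turn, then the parity disk, and record what a
# check would see in each case.
seen = []
for slot in range(len(data)):
    damaged = list(data)
    damaged[slot] = flip_byte(data[slot], 2)
    seen.append(mismatches(damaged, parity))
seen.append(mismatches(data, flip_byte(parity, 2)))

# If the user says "slot 1 is the bad one", it can be rebuilt from the rest.
damaged = list(data)
damaged[1] = flip_byte(data[1], 2)
recovered = rebuild_slot(damaged, parity, slot=1)
```

Every entry in `seen` is the same single offset, which is why a check can only report that a mismatch exists. The rebuild works only because the suspect slot was supplied from outside and the parity had not been rewritten first.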
January 25, 2010 (Author)

A button labelled "check" should do nothing more than point out places where the parity is incorrect.

In most cases, assuming the data drives are correct is the only sane way to go - particularly in the absence of the hardware itself telling us that it's not OK. Most cases _except_ the case where the user _knows_ the data drive is likely corrupt (most likely due to the user doing something silly) and simply wishes to verify this, of course!

My particular problem could have been averted by properly labelling the functionality of the "Check" routine. As it turns out, there was very little data corruption, because the reiserfs formatting did not actually affect much of the data on the disk. There are a couple of extremely minor data losses, but it's much less than 1% of the disk in question, and what is lost is replaceable.

Just something for others to watch out for.

Cheers,
Jonathan