Tried to Remove Disk, Read Errors -> Backed Out. Now Parity Errors


So tonight I attempted to remove an old (emptied) drive from my array and recalculate parity, to minimize the number of disks I had running and improve my parity check speed. I was conservative when doing this operation and:

A) Screencapped my disk configuration from the main menu

B) Used a preclear'ed new disk for a new parity

C) Removed the old drive

D) Backed-up my flash before restarting

 

When I brought the server back up, I used "New Config" and re-assigned all my drives as they were previously (with the new parity drive). When I brought the array online to recalculate parity, one disk (1TB, "disk4") started reporting hundreds of read errors. ["shiz"]. I stopped the parity calculation, shut the machine down, re-checked all the cable connections, and brought it back up. Now this disk was being reported as Unformatted. I ran reiserfsck --check on it and it reported a bad superblock. ["super shiz"].

 

I decided to back out so I could rebuild disk4: I re-installed the original parity disk and the old disk, and re-flashed my USB key with the old configuration. This time the box started up like a charm and started the array on its own. The entire file structure of disk4 is even visible, and the few pictures and videos I sampled off the drive seem to be fine. However, if I run a non-correcting parity check, it errors out instantly with over 248 parity errors.

 

So I ask the experts: what should my next move be? I have a 2TB preclear'ed drive sitting in standby. Do I swap disk4 with the preclear'ed drive and rebuild, assuming that disk4 was damaged somehow by the read errors? Do I run SMART reports on every drive with the array offline and try to find a different drive that's reporting errors? Do I do something I don't even know about?
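For the SMART option, here is a rough Python sketch of pulling the attributes people usually look at first out of `smartctl -A` output. The `WATCHED` list, the helper names, and the `SAMPLE` text are my own choices for illustration (the sample numbers are made up, not from this server), and device names such as /dev/sdb would be placeholders to replace with your own.

```python
import subprocess

# Attributes commonly checked when a disk throws read errors.
# This selection is an assumption, not an official list.
WATCHED = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
)


def parse_attributes(text):
    """Return {attribute_name: raw_value} for watched rows of `smartctl -A` output."""
    found = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the raw value is the 10th column.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            raw = fields[9]
            found[fields[1]] = int(raw) if raw.isdigit() else raw
    return found


def smart_report(device):
    """Run `smartctl -A` on one device (needs smartmontools and root)."""
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    )
    return parse_attributes(result.stdout)


# Made-up sample in the usual attribute-table layout, for illustration only.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   078   078   000    Old_age   Always       -       19642
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
"""

sample_result = parse_attributes(SAMPLE)
```

Calling `smart_report` once per drive and comparing the results would show whether the pending or reallocated sector counts are nonzero on disk4 only or on some other drive as well. A rising CRC error count tends to point at cabling rather than the disk itself.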

 

*I'll note that I ran two parity checks before starting this operation, so everything was fine at the beginning!

 

Running version 5.0.4

 

 

Thanks for any help you can offer!


Realized I was running reiserfsck on the physical drive rather than the logical device. Re-did it on /dev/md4 and it seems clean:

 

###########
reiserfsck --check started at Wed Dec 18 21:46:22 2013
###########
Replaying journal: Done.
Reiserfs journal '/dev/md4' in blocks [18..8211]: 0 transactions replayed
Checking internal tree.. finished
Comparing bitmaps..finished
Checking Semantic tree:
finished
No corruptions found
There are on the filesystem:
        Leaves 234174
        Internal nodes 1400
        Directories 7359
        Other files 44141
        Data block pointers 231168541 (0 of them are zero)
        Safe links 0
###########
reiserfsck finished at Wed Dec 18 21:56:19 2013
###########

 

 

Still don't know what to do about the parity errors. Like I said, the New Config was done with a different parity drive. Would there have been writes to the data disks during the first 10 seconds of a New Config?


Overall sitting around 9TB of data space over 7 drives + parity + cache. Here's a capture from the WebGUI before I re-assigned all the drives when recovering:

 

http://i.imgur.com/pp3Dqmf.png

 

Disk3 was the target of the removal.

 

I guess my question becomes more generic now: what do I trust, the collective data drives or the parity drive that I had swapped out? If I go through a rebuild onto a new disk4 and the files I spot-check are fine, does that actually say that the overall parity is correct? From how I've understood Unraid, that's not how corruption works. (Shoutout to limetech to get P+Q parity and integrity on the short list ;P)
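To make the "what do I trust" problem concrete, here is a toy Python sketch of single XOR parity. It is a simplified model with made-up two-byte "disks", not Unraid's actual implementation, and the function names are mine. It shows that a stale parity disk and a silently corrupted data disk produce the same parity-check mismatch, and that a rebuild is only as good as the parity and the other disks it is computed from.

```python
from functools import reduce


def xor_parity(blocks):
    """Byte-wise XOR across equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def mismatches(disks, parity):
    """Byte offsets where the stored parity disagrees with the data disks."""
    computed = xor_parity(disks)
    return [i for i, (c, p) in enumerate(zip(computed, parity)) if c != p]


def rebuild(disks, parity, missing):
    """Reconstruct disk number `missing` from the other disks plus parity."""
    others = [d for i, d in enumerate(disks) if i != missing]
    return xor_parity(others + [parity])


# Three toy data disks and their correct parity.
disks = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
parity = xor_parity(disks)

# Case A: one bit on a data disk is silently wrong, parity is good.
corrupt_disks = [disks[0], b"\x33\x32", disks[2]]
# Case B: data disks are good, one bit on the parity disk is stale.
stale_parity = bytes([parity[0], parity[1] ^ 0x01])

errors_a = mismatches(corrupt_disks, parity)
errors_b = mismatches(disks, stale_parity)

# Rebuilding disk index 1 from good parity recovers it exactly;
# rebuilding from stale parity bakes the error into the new disk.
rebuilt_good = rebuild(disks, parity, missing=1)
rebuilt_bad = rebuild(disks, stale_parity, missing=1)
```

In this model both cases report an error at the same offset, so the check alone cannot say which side is wrong. Spot-checking files after a rebuild only exercises the blocks those files occupy, which is why a few good pictures do not prove the parity as a whole.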

 


If you don't have a full set of backups, I'd at least back up disk4 to an external location (external drive, spare drive on another system, etc.) and THEN go ahead and either rebuild disk4 onto another disk (if that works okay, you won't need the backup), or simply build your new configuration and then copy the data from the backup to the new config.

 
