On the edge of data loss! Please help me save the data!


Hi, I'm back after three years of inactivity.

 

Something extremely bad happened to my server :((

 

One of my disks failed today. I replaced it immediately with a new one. At the beginning of the data rebuild, a lot of errors appeared on another drive. As I found out later, the problem was most probably my case or a SATA cable. I replaced the SATA cable and rebooted the server. Now the disk that was replaced is orange (which indicates an interrupted data rebuild) and the other one is blue. What can I do to make UnRAID recognize that the blue one is actually a good disk, and get it to rebuild the new one?

[Attached screenshot: Przechwytywanie.PNG]

Link to comment

While I don't have an answer for you about the failed drive, I do have a few questions.

1. Do you do monthly parity checks?

2. Is the failed drive still functioning at some level? 

3. Can you do a smart test?

 

If #2 is true, you might be able to recover some of the data using ddrescue.

 

However, I would wait until someone more knowledgeable than I am responds.

I'm not sure about the "Trust My Array" procedure.

It did not work for me when I had a failure during a rebuild.

I had to use ddrescue to copy the failed drive to another drive, and then have it work in reverse on the failed sectors.

Eventually I recovered all sectors except 1.
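
For anyone wanting to try the same two-stage approach, here is a minimal sketch that only builds the GNU ddrescue command lines and does not run them. The device names and the mapfile path are placeholders I made up, not values from this thread, and the retry count of 3 is an arbitrary choice. Triple-check source and target before running anything like this, because the target gets overwritten.

```python
def ddrescue_passes(source, target, mapfile):
    """Return two GNU ddrescue command lines (built here, never executed).

    source, target and mapfile are hypothetical placeholders supplied by
    the caller, e.g. the failing disk, the spare disk and a log file path.
    """
    # Pass 1: copy everything that reads cleanly and skip the slow scraping
    # phase (-n). -f is required to write to a block device.
    first_pass = ["ddrescue", "-f", "-n", source, target, mapfile]
    # Pass 2: go back over the areas the mapfile marks as unrecovered,
    # reading in reverse (-R), with direct disk access (-d) and up to
    # three retry passes (-r3).
    reverse_retry = ["ddrescue", "-f", "-d", "-r3", "-R", source, target, mapfile]
    return [first_pass, reverse_retry]


if __name__ == "__main__":
    for command in ddrescue_passes("/dev/sdX", "/dev/sdY", "rescue.map"):
        print(" ".join(command))
```

The mapfile is what lets the second pass skip everything already copied, so both passes must use the same one.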

Link to comment

First, do NOTHING with the failed drive -- save it to examine later in the hopes you may be able to recover it if you can't successfully rebuild it onto a new one.

 

Second, the following will only work IF you have (a) not written ANYTHING to the array since this happened (except the rebuild attempt); and (b) you had GOOD parity.    If both of these are true, do the following:

 

(1)  Save your key file.  And note the drive #'s for all the current assignments.

(2)  Do a "New Config" => Be sure you assign the correct parity drive; and include the drive you attempted to rebuild in the array.

(3)  Check the "Trust Parity" box and Start the array.  Don't do anything with the array.

(4)  Stop the array and unassign the rebuilt disk.

(5)  Start the array and it should show a "missing" disk.

(6)  Stop the array and re-assign the new disk to the correct slot.

(7)  Start the array and it should start a rebuild.

 

Link to comment

Didn't see the two comments above before I posted -- obviously keeping the old drive for recovery attempts isn't necessary unless the data is important enough that you want to send it off for professional recovery (typically $400 and up ... can easily top $1000).

 

I gather from your question that you don't have backups -- if you did, the simplest approach would be to install the new drive and then restore the missing data from them.  In the future, remember that RAID (or UnRAID in this case) is NOT a backup.

 

Link to comment

I did as you suggested, Webo, and everything would have been fine, but a parity check started even though I had ticked "parity is already valid"! It corrected 326 "errors" on my parity drive. Thankfully I stopped it soon after that. So I think some files will be corrupted now. What should I do once the disk is fully rebuilt?

Link to comment


I think it was Garycase who suggested the procedure.

In any case, the affected files would be on the outermost tracks, so possibly the earliest files put on the drive.

If you don't have any kind of md5sum of each of the drives, it may be hard to figure out what was affected.
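
For future reference, a per-disk checksum list does not need a special tool. Below is a minimal sketch, assuming you run it once per disk while the data is known to be good and keep the result somewhere off the array. The function names are my own, and a path such as /mnt/disk1 for the disk's mount point is an assumption about your setup.

```python
import hashlib
import os


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of one file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its MD5 digest."""
    manifest = {}
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            full = os.path.join(folder, name)
            manifest[os.path.relpath(full, root)] = file_md5(full)
    return manifest


def changed_files(old, new):
    """Return sorted paths whose digest changed or that are missing from new."""
    return sorted(path for path, digest in old.items() if new.get(path) != digest)
```

After a rebuild you would build a second manifest for the same disk and pass both to changed_files to get the list of files that no longer match.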

 

Also, the question of monthly parity checks is still open. Not that it's going to save anything.

Do you do monthly parity checks as a preventative review of drive/array health?

 

I wonder whether it would have caught the marginal drive sooner, or whether it's really not much of a telltale sign of a failing drive.

Link to comment

If it's otherwise working okay now and doing a rebuild, let it finish.

 

However, clearly the rebuild will have some errors, since parity has been modified.    Without backups, or checksums of the data, there's no way to confirm exactly which files are wrong.  You can only be proactive in the future by ensuring you have good backups or at least checksums, so you can either fully recover (backups) or at least identify corrupted/damaged files (checksums).    But at this point you simply need to recognize that a few of the files on the rebuilt disk likely have some erroneous bits of data.
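
To illustrate why modified parity means a damaged rebuild, here is a toy sketch, assuming single parity computed as a byte-wise XOR across the data disks. The three four-byte "disks" and the flipped parity byte are invented for the example and have nothing to do with the real array contents.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three toy "data disks" and the parity computed from them.
disk1 = bytes([0x11, 0x22, 0x33, 0x44])
disk2 = bytes([0xA0, 0xB0, 0xC0, 0xD0])
disk3 = bytes([0x0F, 0x0E, 0x0D, 0x0C])
parity = xor_blocks([disk1, disk2, disk3])

# With good parity, the missing disk2 is recovered exactly.
rebuilt_good = xor_blocks([parity, disk1, disk3])

# Simulate a "correcting" check that rewrote the first parity byte.
bad_parity = bytes([parity[0] ^ 0xFF]) + parity[1:]
rebuilt_bad = xor_blocks([bad_parity, disk1, disk3])

# Offsets where the rebuilt disk no longer matches the original data.
damaged_offsets = [i for i, (a, b) in enumerate(zip(disk2, rebuilt_bad)) if a != b]
```

Only the offsets whose parity was rewritten come back wrong, which is why stopping the check early limits the damage to the start of the disk.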

 

Link to comment

As WeeboTech just noted, if you stopped the parity check quickly enough, the errors will all be on the outermost cylinders of the drive -- which most likely means they're in the first few files on that disk.    The problem is identifying those files -- I'm not aware of any utility that will sort files by their location on the disk.

 

Link to comment

The SMART report is OK. Nothing special in the syslog. I tried to repair the filesystem, but every reiserfsck command fails. What's interesting is that I cannot run reiserfsck --check on any healthy disk either, because it says that the superblock is not found, even when the disk is properly mounted by UnRAID. Why is that? And how can I rebuild the superblock on the failed disk (I mean, what are the parameters)?

Link to comment
"What is interesting is that I cannot run reiserfsck --check on any healthy disk, because it says that the superblock is not found, even when the disk is properly mounted by UnRAID."

What command EXACTLY are you typing? It sounds like you may be trying to check the raw device instead of the partition. You should operate on the md* devices whenever possible.
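
To make the distinction concrete, here is a small sketch that classifies a device path and builds (but does not run) a read-only check command. It assumes the usual Linux naming where /dev/sdb is a whole disk and /dev/sdb1 is its first partition, and it assumes array data disk N is exposed as /dev/mdN, which may not hold on every version. Both function names are made up for this example.

```python
import re


def classify_device(path):
    """Label a device path as an md device, a partition or a raw disk."""
    if re.fullmatch(r"/dev/md\d+", path):
        return "md device"
    if re.fullmatch(r"/dev/sd[a-z]+\d+", path):
        return "partition"
    if re.fullmatch(r"/dev/sd[a-z]+", path):
        return "raw disk"
    return "unknown"


def reiserfsck_check_command(disk_number):
    """Build a read-only reiserfsck check for array disk N (assumed /dev/mdN)."""
    if disk_number < 1:
        raise ValueError("array data disks are numbered from 1")
    return ["reiserfsck", "--check", f"/dev/md{disk_number}"]


if __name__ == "__main__":
    for candidate in ("/dev/sdb", "/dev/sdb1", "/dev/md3"):
        print(candidate, "->", classify_device(candidate))
    print(" ".join(reiserfsck_check_command(3)))
```

A raw disk has no filesystem superblock at the place reiserfsck looks for it, which would explain the "superblock is not found" message even on a healthy disk.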
Link to comment
