HELP! Is there hope or is this a catastrophe??



I sent a similar message to support last night but haven't heard back, so I thought I'd see if someone here could offer some advice, a suggestion, or just point me in the right direction.

 

Last evening, at the tail end of copying a large amount of media from my server to a remote device on my LAN, a problem developed, apparently due to the failure of a backplane that several of my drives were mounted in.  Both the parity drive (1.5 TB) and disk1 (1.5 TB) began registering massive numbers of errors and turned red, and the server began attempting to correct the errors with no luck, never progressing beyond 0%.  Eventually disk1 began displaying its status as "unformatted", and as the errors piled up I stopped the array.  At first glance the drives appeared to have failed and could no longer be seen by the server when I did a refresh.  After I tried replacing the parity drive with a new one and that wasn't seen either, I removed the drives from the backplane and plugged them straight into the motherboard, where they were recognized but now in an invalid state.  I obtained SMART reports on both drives and they passed, which led me to conclude that some type of data corruption (feel free to correct me at any point here) is causing the parity drive to display as "wrong", with 2.1 PB shown under the 1.5 TB in the size field.  With it and disk1 out of sorts, I'm obviously unable to start the array.

 

First question: is there any hope that either of these first two drives can be repaired and the data recovered?  Seems unlikely, but what do I know.  If so, could someone point me to where I might learn how to go about trying?  If not, how would I go about retrieving the large amount of data on the remaining 9 disk drives that were not directly affected by this event and incorporating it into a new array... or is that too much to expect as well at this point?

 

I've looked around and don't see anything that's an obvious starting point and I'm pretty much afraid to do anything for fear of causing more harm with my ignorance.

 

Anyone got a suggestion?

 

I have a Pro-2 license(s) and have been running 5.0-beta13 btw

1. SMART reports would not tell you of data corruption...  ever.  They do not have a clue as to how the drive is being used to store your data.

2. SMART reports are meaningless if the drive was unplugged from the backplane.

3. Have you considered posting a system log, so we can see what is actually happening?
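
For anyone following along, here is a minimal sketch of how the SMART data in points 1 and 2 is usually pulled with `smartctl` from smartmontools. The helper name and the device paths are hypothetical placeholders, and the sketch only builds and prints the command lines; it runs nothing. As point 1 says, the output describes the health of the drive itself, not the state of the file system stored on it.

```python
import shlex

def smart_commands(devices):
    """Build one smartctl command per whole-disk device (no partition number)."""
    # -H prints the overall health self-assessment, -A the vendor attribute table.
    return [["smartctl", "-H", "-A", dev] for dev in devices]

# Hypothetical device names; substitute the ones your server actually reports.
for cmd in smart_commands(["/dev/sdb", "/dev/sdc"]):
    print(shlex.join(cmd))
```

Run the printed commands only once the drives are attached to a port the server can actually see.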

 

Un-formatted simply indicates the drive could not be mounted.  (I hope you did not format it.)  You've already replaced the parity drive, so you've given up the ability to re-construct anything.
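
To illustrate why losing parity means losing the ability to reconstruct, here is a toy sketch of single-parity recovery. Single parity in unRAID is a bitwise XOR across the data disks; the byte strings and function names below are made up for illustration and are not unRAID code.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(parity, surviving):
    """Recover the one missing data block from parity plus all surviving blocks."""
    return xor_blocks([parity, *surviving])

# Three tiny pretend "disks" and the parity computed across them.
disk1, disk2, disk3 = b"\x10\x20", b"\x0f\xf0", b"\xaa\x55"
parity = xor_blocks([disk1, disk2, disk3])

# With parity and the other disks intact, disk1 can be rebuilt exactly.
assert rebuild_missing(parity, [disk2, disk3]) == disk1
```

With one data disk missing, parity and the survivors are enough. With the parity disk gone as well, there is nothing left to XOR against, so the missing disk's contents cannot be derived.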

 

You basically need to correct the power/cabling issue.  Since you have no valid parity, you can use reiserfsck on the first partition of each of the data disks in turn to fix any corruption you've incurred.  Remember, use it on /dev/sdX1 (with the trailing "1" to indicate the first partition).
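
A minimal sketch of that per-disk loop, assuming hypothetical device names. It only prints the commands rather than running them. `--check` is reiserfsck's read-only mode; anything stronger (`--fix-fixable`, `--rebuild-tree`) should wait until you have read what the check reports.

```python
import shlex

def first_partition(device):
    """Return the first-partition path for a whole-disk device such as /dev/sdb."""
    if device[-1].isdigit():
        raise ValueError(f"{device} already looks like a partition")
    return device + "1"

def check_commands(devices):
    """Build a read-only reiserfsck command for partition 1 of each data disk."""
    return [["reiserfsck", "--check", first_partition(dev)] for dev in devices]

# Hypothetical data-disk names; list only data disks here, never the parity disk.
for cmd in check_commands(["/dev/sdb", "/dev/sdc"]):
    print(shlex.join(cmd))
```

Printing first and running by hand keeps a typo in a device name from touching the wrong disk.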

 

Then set a new disk configuration and let your array calculate parity once more.

 

Joe L.


At about the same time you replied, lime-tech support did too, with the following.  I'll follow up with the results later.

 

On 3/7/2012 1:07 PM, Tom Mortensen wrote:

> The system log does not look unusual.  It appears that the 'super.dat'

> file on the flash is possibly corrupted.  So here's what I would do:

>

> 1. Create a backup of your current flash device:

> a) power down server

> b) plug flash into pc

> c) copy contents of flash to folder on your pc

> d) right-click flash under Computer, select Properties. From Tools tab

> click 'Check now'. If this shows problems, you should reformat and

> reinstall unRaid OS, and then restore contents of 'config' directory

> to newly initialized flash.

>

> 2. Plug flash back into server, boot up server.  It should come up in

> same state as you showed me.  If not, then stop and tell me what

> happened.

> 3. Keep a screen shot handy so you can re-assign all the disks properly.

>

> 4. Go to the  Utils page and click 'New Config'.  Check "I'm sure"

> check box, and click Apply.

>

> 5. Go to Main page and assign all your data disks in the proper slots,

> but leave the Parity disk unassigned.  We are doing this in case

> something goes wrong, we want to preserve parity.

>

> 6. Now click Start... Tell me if all your disks now get mounted.  Do

> NOT write any new files to your server because we don't have parity

> disk installed.

>

> Let me know how this goes.

>

> Cheers,

> Tom
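
Step 1 of Tom's instructions (copy the flash contents to a folder before touching anything) can also be scripted. This is a rough sketch, not part of his e-mail: the function names are my own, and the paths in the usage comment are placeholders for wherever your flash drive and backup folder live.

```python
import filecmp
import shutil
from pathlib import Path

def backup_flash(flash_root, backup_dir):
    """Copy the whole flash drive tree into a new folder and return that folder."""
    dest = Path(backup_dir)
    shutil.copytree(flash_root, dest)  # raises if dest already exists
    return dest

def same_tree(a, b):
    """True if both trees hold the same file names with identical contents."""
    cmp = filecmp.dircmp(a, b)
    if cmp.left_only or cmp.right_only or cmp.funny_files:
        return False
    # shallow=False forces a byte-by-byte comparison of every common file.
    _, mismatch, errors = filecmp.cmpfiles(a, b, cmp.common_files, shallow=False)
    if mismatch or errors:
        return False
    return all(same_tree(Path(a) / d, Path(b) / d) for d in cmp.common_dirs)

# Usage (placeholder paths):
#   dest = backup_flash("E:/", "C:/unraid-flash-backup")
#   print(same_tree("E:/", dest))
```

Verifying the copy matters here because the 'config' directory, which holds super.dat, is what gets restored if the flash has to be reformatted.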

 

 

But just to clarify a few points:

 

"2. SMART reports are meaningless if the drive was unplugged from the backplane."

 

I ran the SMART reports after the drives were unplugged from the backplane, but only once they were remounted and plugged straight into the motherboard (the unRAID OS was seeing them at this point but showing red), just to determine that they were still functioning and not bricks.

 

"3. Have you considered posting a system log, so we can see what is actually happening?"

 

I will post the log later when I return home and have access to it, but keep in mind that I stopped the array and rebooted the server as part of my attempt to get the drives (or their replacements) working.  The data in it therefore doesn't reflect what was happening at the time the errors occurred, and I imagine what is there is of limited value.  I didn't realize that logs got cleared on reboot.

 

"Un-formatted simply indicates the drive could not be mounted.  (I hope you did not format it.)"

 

No, I did not format it.  The option was there, but while the futile attempt to correct parity was under way I looked around to see what I might do in response and noticed warnings in several places not to do so... so I got that going for me  :)

 

"You've already replaced the parity drive, so you've given up the ability to re-construct anything."

 

Actually I didn't replace it.  I attempted to, but as soon as I put the brand new drive in the backplane slot where the parity drive had been and realized it wasn't being seen, I determined that the backplane was the problem, not the drive itself.  So I removed the whole unit and remounted the original drives, with the same cables plugged into the same slots, to make them visible again.  The original parity drive is still there, but along with disk1 it is showing red and not allowing me to restart the array due to whatever condition they're in.

 

 

 


An update for those interested:  I followed Tom's advice last night when I got home.  The flash drive with the OS checked out, so I booted up, took a screen cap of the existing drive assignments as he suggested, then set about creating a new array with the drives mounted sans the StarTech backplanes, which after several years of good reliable service seem to be dying one, and now two, slots at a time and are the root of the problem.  I assigned the drives as before, except the parity drive, as Tom instructed, to preserve it in case anything went wrong, but everything mounted without incident and all shares and data look to be intact... I love you unRAID Server :-)

 

Parity sync  is underway as I type this. 

 

Thanks for all the responses, but especially to Tom Mortensen, who walked me through this in e-mail one step at a time.  Great product, great support; I'm happier every day that I purchased these licenses.
