Disaster Struck



So here goes.

 

I could explain the long process I've gone through upgrading my server's hardware (mobo, CPU, RAM, PSU), but I'll skip to the problem.

 

I thought that taking everything out of the server caused the initial problems I had, maybe a bad connection to one of the 4 IcyBoxes I already had.

http://www.raidsonic.de/products/internal_cases/backplanes/index_en.php?we_objectID=1152

It seems that somehow the IcyBox killed 5 disks then another 2 disks when I was swapping things around to find where the problem was coming from.

 

I'm still in the process of testing these disks, but of the 3 I've connected to my Mac through a (new) USB hard drive enclosure, none are seen at all. When I connected 2 of the non-missing disks, OSX asked to initialise them, which of course I didn't do; I only ejected them.

 

I had a total of 17 disks before this issue.

 

https://www.dropbox.com/s/w35u75w1ak5qr4g/UNRAID-disks.png?dl=0

 

I only have one parity disk but even with 2 I know the data loss would have still been huge.

15x Data HDDs (4x4TB & 11x2TB)

1x Cache HDD at 500GB

 

 

So my question is, how can I start unraid without causing any more data loss?

Is it simply a question of starting and unticking "correct Parity errors"?

Thanks

Link to comment

What you could try is putting all your disks in, assigning them to their original locations, being absolutely certain that the parity disk is in the correct parity location (as the parity drive), and doing a new config and ticking the box that parity is good. Then see if any of the drives come back to life. If some don't, then you can remove those that are dead, do a new config again, and again tick that parity is good. Did you have a backup of any of the data you lost?

Link to comment

It seems that somehow the IcyBox killed 5 disks then another 2 disks when I was swapping things around to find where the problem was coming from.

Going to leave your actual question to others here, but I will say that if you happen to have any power splitters, always plug the female end into the male end prior to installing and make sure that the wire colors line up with each other. I have seen splitters in the past where an end was on upside down, which would easily cause this issue.

 

Side note:  Always hang onto an old small drive that you're never going to use to test out backplanes....

Link to comment

I did use a Molex splitter as the cables from the PSU were not long enough. I've already taken out the IcyBox that killed my disks, so I can't check exactly how it was connected, but the colours line up correctly on the splitter.

Although what you say makes sense, otherwise I don't know how 7 disks would suddenly die.

Damn  :'(

Link to comment

I've just finished checking via the USB enclosure and no disk even spins up anymore.

 

By a new config, do you mean just unassigning the dead disks?

 

No I don't have a backup of the disks  :'(

Link to comment

If you go into the TOOLS tab there is a New Config option. It will give you the chance to reassign all the disks, so start by assigning the parity drive as the parity drive, and then assign the remaining working disks to whichever slots you want. Make sure you check the box for 'parity is good' and then start the array. But only do this if you are 100% certain you have dead disks.

Link to comment

If you have a major issue in a server, I would suggest FIRST - before doing anything else ...

STOP. Go do something else. Come back when you are fresh and thinking clearly. I cannot stress this enough.

 

Then, remove / disconnect any unnecessary parts including controller cards, network cards, extra video cards. You want just enough computer to display a video signal from the POST. Install a single disk connected to a single (preferably new) SATA cable connected to a motherboard port. Power it from a pigtail coming straight from the PSU - no splitters. I'll sometimes put a drive upside down on some anti-static foam beside the computer. Obviously you want to be very careful not to touch anything when power is applied.

 

Back up your config folder to your desktop computer, so you have the original should you need it.

 

Copy the disk.cfg file from a fresh unRAID .zip file (or I've attached one for convenience) into the config directory on the flash.

And delete super.dat from same folder. So config will be as before but with a fresh disk.cfg and no super.dat.

 

(You'll have original copies of both of these should you need them).
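
The config backup and reset steps above could be scripted along these lines. This is only a minimal Python sketch, not an official procedure: the function name `reset_array_config` and all three paths are hypothetical and depend on where the flash drive is mounted on your desktop computer.

```python
import shutil
from pathlib import Path


def reset_array_config(flash_config: Path, backup_dir: Path, fresh_disk_cfg: Path) -> None:
    """Back up the flash config folder, then reset the array assignments.

    All three paths are assumptions for illustration; adjust them to your setup.
    """
    # 1. Keep a full copy of the original config folder before touching anything.
    #    copytree refuses to overwrite an existing backup, which is intentional.
    shutil.copytree(flash_config, backup_dir)

    # 2. Replace disk.cfg with the one taken from a fresh unRAID .zip file.
    shutil.copy2(fresh_disk_cfg, flash_config / "disk.cfg")

    # 3. Delete super.dat so unRAID comes up with an empty array.
    (flash_config / "super.dat").unlink(missing_ok=True)
```

Taking the backup first means the original disk.cfg and super.dat can be copied back if anything goes wrong.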

 

Boot. unRAID should come up with an empty array. Assign the one disk to the disk1 slot (note parity is at the top, so this will be the second dropdown!). DO NOT ASSIGN ANYTHING TO THE PARITY SLOT!

 

Start the array.

 

Check the contents of the disk. One of the disks may be parity and look unformatted. If you picked that one, this is normal. You can shut down and reboot with a different disk. unRAID may be unhappy and force you to do a new config to forget about that first disk. Or you could always reset the disk.cfg and delete super.dat on a different machine.

 

Once you have confidence that several of the disks are readable, you can start to rebuild the server bit by bit, testing along the way.

 

It is far better to break a server down like this, test in a very minimal configuration, and rebuild it until you have a problem, than it is to try to remove things one at a time until it works. I have never found the latter method to work. :)

 

If none of your disks show up in the dropdown list after trying several different disks, try on a different machine. This is your data - do what you can to make 100% sure you have a doornail of a hard drive before you give up on it. There are data recovery services that can help if the data is valuable. They tend to be pricey. Only you can decide the value of the data.

 

Best of luck!

disk.cfg

Link to comment

Thanks a lot for your advice. You're absolutely right: I've been at this for hours today, so I'll take a look tomorrow evening with a clearer head.

 

Link to comment

I know specialist companies exist but I couldn't afford that.

 

I have one or two disks of the same model lying around. Does anyone know what the chances are of swapping the HDD controller from these and getting something working again?

Before you throw in the towel on your data, there are companies that specialize in fixing pretty much exactly what happened to you, and the cost isn't as bad as a full data recovery house. They will swap circuit boards to get the drives working again, and they know what is involved in swapping the boards on the specific model of drives that you have. Some models require swapping chips from original to donor board, some don't.

 

http://www.donordrives.com/ These guys treated me fairly, the cost was very reasonable (less than replacing it with a new drive).

 

I would talk to them and get estimates BEFORE you write anything to any of the drives. If you can recover all but one of your drives, the remaining drive can be rebuilt from parity, but only if you don't write to ANY of the drives.

Link to comment

So the controllers definitely got fried

 

https://www.dropbox.com/s/0ccsdor8nkbxpbs/20170206_184035.jpg?dl=0

After reading some information on the link you posted I thought I'd try to swap over a controller.

 

3 of the dead disks are WD 2.0TB

One model is from Dec 2010 which is within 3 months of the spare disk I have, this is said to be important http://www.donordrives.com/blog/matching-guide#wd

The Model numbers match.

Made in the same country.

 

It actually spun up and OSX asked me to initialise it, but I ejected it. There was definitely a slight grinding/clicking noise though, so if it was OK in UNRAID I'd want to get my data off ASAP.

 

Obviously my ideal situation would be getting as much of my data back as I can.

 

I bought a 4TB disk today, would it be a good idea to follow what was said here by bjp999?

 

Back up your config folder to your desktop computer, so you have the original should you need it.

 

Copy the disk.cfg file from a fresh unRAID .zip file (or I've attached one for convenience) into the config directory on the flash.

And delete super.dat from same folder. So config will be as before but with a fresh disk.cfg and no super.dat.

 

(You'll have original copies of both of these should you need them).

 

Boot. unRAID should come up with an empty array. Assign the one disk to the disk1 slot (note parity is at the top, so this will be the second dropdown!). DO NOT ASSIGN ANYTHING TO THE PARITY SLOT!

 

Start the array.

 

I mean to say start with the new 4TB and the 2TB with the borrowed controller (Does UNRAID start without a parity drive?)

If this is successful then copy the data over from the 2TB to the 4TB.

If this works then I would be willing to order the other HDD chips/controllers from the site you mentioned. Damn, it would be amazing to be able to recover a lot, if not all, of what I lost.

 

Link to comment

One model is from Dec 2010 which is within 3 months of the spare disk I have, this is said to be important http://www.donordrives.com/blog/matching-guide#wd

The Model numbers match.

Made in the same country.

The link you posted is about scavenging the heads from compatible models. It's not about PCB matching, or whether chips must be moved or reprogrammed from the original to the replacement PCB to get your data back.

 

I STRONGLY advise you to call them and consult on your exact situation before you accidentally make it much worse.

 

Keep in mind the best chance of getting data back relies on NOT modifying any of the data on any of the drives. You can use the parity disk to replace any single data drive, but as soon as 2 or more drives are compromised, you lose the data on those drives.
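
To see why single parity covers exactly one missing drive, here is a small Python sketch. It assumes parity is the bytewise XOR of all data disks (how unRAID's single parity is commonly described) and uses tiny made-up byte strings in place of real disks; the helper name `xor_blocks` is hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three toy "data disks" (made-up contents) and the parity computed from them.
disk1, disk2, disk3 = b"\x10\x20\x30", b"\x0a\x0b\x0c", b"\xff\x00\x7f"
parity = xor_blocks([disk1, disk2, disk3])

# One disk lost: XOR of parity with every surviving disk gives it back.
rebuilt_disk2 = xor_blocks([parity, disk1, disk3])
assert rebuilt_disk2 == disk2

# Two disks lost: the same XOR only yields disk2 XOR disk3, one equation
# with two unknowns, so neither disk can be separated out again.
mixed = xor_blocks([parity, disk1])
assert mixed == xor_blocks([disk2, disk3])
```

The rebuild only works if the surviving disks still hold exactly the bytes they held when parity was last updated, which is why nothing should be written to them.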

Link to comment

Does UNRAID start without a parity drive?

YES

 

Good to know, thanks.

 

So as my HDDs are indeed dead and if I'm lucky and manage to buy replacement controllers and even luckier that they work, I'll need to get the data off them.

Does what I wrote above make sense?

I could add the new 4TB drive with copied data back to my proper array at a later date and continue to do the same.

Link to comment

I'll definitely contact them and see what they advise. I'm just thinking ahead, probably too far I guess, about copying my data off these drives and the best way to do that.

Link to comment

I could add the new 4TB drive with copied data back to my proper array at a later date and continue to do the same.

Not entirely clear what you mean by your "proper array". Since you have too many failed disks, parity can't help, so in a sense you no longer have your "proper array". You just have a bunch of drives, each with their own files, that are no longer related by parity. Some of the drives are still good, some are bad and maybe unreadable.  Whatever you wind up with as far as getting new drives, getting files from the failed drives, and including your still good drives with their files, will become your array.

 

I would definitely follow bjp999's advice. Work with one disk at a time.

 

Maybe the part you are asking about is how to reconfigure your array as needed as you proceed. Do you know how to use Tools - New Config? Or you can just use the other method mentioned by bjp999, resetting disk.cfg and deleting super.dat each time you need to change, which is effectively what New Config does.

 

 

Link to comment

You're quite correct that I no longer have a "proper" array. I just meant that, following bjp999's advice, I would check the disks one by one and then put them all together into what would be my new array.

 

I sent an email to www.donordrives.com, I'll see what they advise.

 

Let's imagine that all but one of the dead drives are repairable (surely too optimistic); parity would then do its job and I wouldn't have lost anything.

But am I correct in saying that at no time should I connect the current parity drive before having all (but one) HDDs back in the server?

Link to comment

But am I correct in saying that at no time should I connect the current parity drive before having all (but one) HDDs back in the server?

Yes. Leave parity out. Do you know which disk was parity? Parity is not a mountable disk since it has no filesystem, but it is also possible that one or more data disks would also be unmountable due to filesystem corruption, so if you already know which disk is parity it would simplify things.
Link to comment

Yes I do, luckily I had labelled all disks and took screenshots of the "Main" page.

 

On another note, does anyone know of a similar company in Europe? donordrives quoted me $60 for diagnostics, which is fair, but $55 per disk to ship them back to me in Switzerland.

 

Thanks for your help so far guys.

Link to comment

Here's an update.

 

I tested each previously working disk one by one and they are all OK (I have left out the parity drive). Thanks bjp999.

 

Of the 7 fried disks, one is working without doing anything.

I mentioned previously that I swapped a PCB from a spare equivalent disk to a fried disk, but I didn't connect it to UNRAID and subsequently swapped it back as it was before, therefore no change. Thanks johnathanm.

 

After finding nothing in Europe, I contacted donordrives again and I'm waiting for info but I'll order 2 PCBs and swap these myself changing the BIOS ROM too.

 

Now could someone look over my plan on how to continue please?

 

This may seem optimistic but here goes.

I'd like to be able to access my data before putting the (hopefully) recovered disks back in the array.

I will go on the assumption that I am able to recover 5 of the remaining fried disks. That leaves one that could be rebuilt using my "old" parity drive.

 

Ignoring the Cache and Parity drives for the moment, I have 9 working drives with my data.

 

Would it be reasonable to start the array with these 9 drives, leaving out the parity drive for later and not writing anything to the array in the hope of rebuilding as it was before if I can indeed repair the fried disks?

 

Even better, could I do the above but add a new disk for new data, being careful that anything new is written to this disk only and when the times comes to add the 5 "repaired" disks, I can remove the new disk and add the old parity drive and it would be as it was before?

 

I'm aware that in the meantime I'd be unprotected against any subsequent disk failures, since there would be no parity.

 

I hope I don't misunderstand completely how Unraid works and thanks for any input.

 

Link to comment

I wouldn't use the disks in an array in the meantime, there are simply way too many potential situations that can cause writes to them...

 

If you need access to the data while maintaining the hope of rebuilding one of the failed drives, make sure any access to the disks is 100% read-only.

 

If I understand correctly, if even 1 byte of data is written then I can forget about using the parity drive to recover in the future?

 

Is there a way to start the array in Read-Only mode?

Link to comment

If even 1 byte of data is written then it would compromise any rebuild, but people have mostly recovered from much worse. Whether or not that one byte would actually corrupt a file is uncertain.
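
What a single stray write does to a later rebuild can be illustrated with a short Python sketch. It assumes XOR parity and made-up three-byte "disks"; `xor_blocks` is a hypothetical helper, and real filesystem writes usually touch more than one byte (metadata, journals, timestamps).

```python
def xor_blocks(blocks):
    """Bytewise XOR of equally sized byte strings."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            out[i] ^= value
    return bytes(out)


# Toy array: two surviving disks, one failed disk, parity taken while all were intact.
good_a, good_b, failed = b"\x01\x02\x03", b"\x04\x05\x06", b"\x07\x08\x09"
old_parity = xor_blocks([good_a, good_b, failed])

# A single byte (offset 1) is written to a surviving disk while parity is unplugged.
good_a_modified = bytearray(good_a)
good_a_modified[1] = 0xFF
good_a_modified = bytes(good_a_modified)

# Rebuild the failed disk from the stale parity and the modified disk.
rebuilt = xor_blocks([old_parity, good_a_modified, good_b])

# Collect the offsets where the rebuilt disk differs from the original.
wrong_offsets = [i for i in range(len(failed)) if rebuilt[i] != failed[i]]
```

Here `wrong_offsets` comes out as `[1]`: the damage is confined to the same offset on the rebuilt disk. Whether that offset belongs to a file, to filesystem metadata, or to free space decides how much harm is actually done.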

 

It sounds as if you understand parity pretty well since the scenarios you propose are somewhat reasonable.

 

I don't think there is any way to start the array read-only. It would be possible to mount individual drives read-only from the command line.
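
As a rough illustration of the command-line route, the Python sketch below only builds and prints the commands rather than running them. The partition `/dev/sdX1` and mount point `/mnt/recovery` are placeholders, and `read_only_mount_commands` is a hypothetical helper. `blockdev --setro` and `mount -o ro` are standard Linux commands; the `blockdev` step is included because some filesystems may still replay their journal on a plain read-only mount, and a filesystem with a dirty journal may then refuse to mount at all.

```python
import shlex


def read_only_mount_commands(partition, mount_point):
    """Return the commands for a read-only mount, as argument lists.

    Nothing is executed here; the partition and mount point are placeholders.
    """
    return [
        # Mark the partition read-only at the block-device level first.
        ["blockdev", "--setro", partition],
        # Create the mount point if it does not exist yet.
        ["mkdir", "-p", mount_point],
        # Mount the data partition with the read-only option.
        ["mount", "-o", "ro", partition, mount_point],
    ]


commands = read_only_mount_commands("/dev/sdX1", "/mnt/recovery")
for argv in commands:
    print(shlex.join(argv))
```

Printing the commands first gives a chance to double-check the device name against the disk labels before anything is run as root.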

Link to comment
