
4 Drives and 2 Norco Cages Lost in Power Surge


JHTom


Helping to fix my friend's unRaid after a power surge destroyed 4 drives with 14TB of data (2 drives when put into a docking station won't even allow power to go on, and 2 aren't spinning at all). In addition, 2 Norco SS-500 drive cages were destroyed as well. This was all despite having a UPS that would consistently show 15 minutes of battery time attached.

 

I know all the data is lost, and that once I spin up the other drives that are being recognized, there could be additional data loss. Luckily, things are backed up so nothing critical has been lost.

 

My question relates to best practices for moving forward. The motherboard, CPU, RAM & PSU have all been replaced as part of an already planned upgrade, but should I be thinking of anything else or be looking out for any pitfalls?

 

Thanks

Link to comment

 

1) Look at all cabling in the installation. Lightning strikes and similar events don't only reach the server through its own mains power lines: phone lines, network cables, and so on are also vulnerable. And if part of the connected equipment isn't on a UPS or other good surge arrester, a power surge that kills a monitor can also reach the system through its graphics card.

 

2) Work out exactly which way the power surge reached the system. Not every UPS protects against every type of power surge. Many UPSes are just that, an uninterruptible power supply: they only protect against loss of power. A good UPS should have its own, very strong, surge arrester technology.

 

3) Make sure the PSU is a good one. Maybe the power surge didn't destroy any disks directly; maybe the PSU did. It's the PSU that is responsible for producing all the critical voltages that the sensitive electronics depend on. It's the PSU that is responsible for making sure that the 3V3, 5V, 12V, ... lines are always kept within the expected +/- 5% range.

 

4) Parity-based solutions only handle some types of system failures. The number of parity drives sets the maximum number of drives that can fail before the missing drives can no longer be rebuilt. Only a proper backup solution can handle all types of system failures, and at the same time protect against user errors like file overwrites. A good backup solution can handle the loss of 10 drives in a 10 drive system, or the loss of 30 drives in a 30 drive system. That's way better than what RAID technology can manage.
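
To put numbers on the +/- 5% range from point 3, here is a minimal Python sketch. It takes the three rails and the 5% tolerance straight from the post; the function names and the sample multimeter readings are made up for illustration, so treat it as a rough check and not as a PSU test procedure.

```python
# Nominal rail voltages and tolerance as quoted in the post above.
NOMINAL_RAILS = {"3V3": 3.3, "5V": 5.0, "12V": 12.0}
TOLERANCE = 0.05  # +/- 5%


def rail_window(nominal, tolerance=TOLERANCE):
    """Return the (low, high) voltage limits for one rail."""
    return nominal * (1 - tolerance), nominal * (1 + tolerance)


def out_of_spec(readings, tolerance=TOLERANCE):
    """Return the names of rails whose measured voltage is outside the window."""
    bad = []
    for rail, measured in readings.items():
        low, high = rail_window(NOMINAL_RAILS[rail], tolerance)
        if not (low <= measured <= high):
            bad.append(rail)
    return bad


if __name__ == "__main__":
    for rail, nominal in NOMINAL_RAILS.items():
        low, high = rail_window(nominal)
        print(f"{rail}: {low:.3f} V to {high:.3f} V")

    # Hypothetical readings: the 5V rail has sagged to 4.60 V.
    print(out_of_spec({"3V3": 3.31, "5V": 4.60, "12V": 12.1}))
```

With those sample readings only the 5V rail is flagged, since 4.60 V is below the 4.75 V lower limit.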

Link to comment

I would go into your disk settings and untick "Enable auto start" so the array doesn't start automatically on boot.

Note down the serial numbers of the disks that are detected.

Power down the system, move the disks around between the two cages, and see whether the disks are detected in the other cage.

Maybe the disks aren't faulty; it could be the Norco cage itself or even the power cables that plug into it.

Hopefully it's just the Norco cage backplane or one of the power leads.

The two Molex connectors supply power to the drives, so it could be that one of those is dead.

At least this will tell you whether the drives are fine.

 

Unraid is fine with drives being connected to different SATA ports, so moving disks around is safe.

If that works, then it's a PSU problem or a Molex connection problem, and it's a matter of working out which one it is.
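
If it helps to keep track while swapping, the bookkeeping above can be sketched in a few lines of Python. This is only an illustration: the serial numbers are invented, and `classify_drives` is a hypothetical helper, not anything Unraid provides. You would type in the serials you noted down after each boot.

```python
def classify_drives(all_serials, rounds):
    """Classify each drive by how often it was detected across test boots.

    all_serials: every serial number that should be in the system.
    rounds: one set of detected serials per boot (one per cage arrangement).
    """
    if not rounds:
        raise ValueError("need at least one round of detected serials")

    result = {}
    for serial in all_serials:
        hits = sum(serial in detected for detected in rounds)
        if hits == len(rounds):
            result[serial] = "detected every time"
        elif hits > 0:
            # Seen in some arrangements only: suspect the cage slot,
            # backplane or power lead rather than the drive itself.
            result[serial] = "position-dependent"
        else:
            result[serial] = "never detected"
    return result


if __name__ == "__main__":
    # Invented serial numbers, two boots with the disks swapped between cages.
    expected = ["WD-AAA111", "WD-BBB222", "ST-CCC333", "ST-DDD444"]
    boot_1 = {"WD-AAA111", "WD-BBB222"}
    boot_2 = {"WD-AAA111", "ST-CCC333"}
    for serial, status in classify_drives(expected, [boot_1, boot_2]).items():
        print(serial, "->", status)
```

A drive that shows up in one position but not another points at the cage or its power connection; a drive that never shows up anywhere is the one more likely to be dead.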

 

Link to comment
On 2/7/2018 at 2:08 PM, JHTom said:

I know all the data is lost, and that once I spin up the other drives that are being recognized, there could be additional data loss. Luckily, things are backed up so nothing critical has been lost.

If it turns out that you wish to attempt recovery, I personally have used http://www.donordrives.com/services and they solved similar issues for way less money than a full drive recovery firm like Ontrack. I'm not affiliated, just a satisfied customer.

Link to comment
Would you please tell us the brand of the UPS so that I make sure to skip them in the future?!

 

Thnx

Link to comment

Thanks for the thoughts and advice. Luckily the data is all backed up, so it is just a matter of replacing the drives and drive cages and importing the data again.

 

I will have to find out what UPS it was. I do know that power went out (though unknown for how long), and that unRaid did not complete a clean power down. The home is in the mountains and power sounds a bit spotty, though not enough to have an emergency generator. Some other devices attached to the UPS made it through with no issue.

Link to comment
Since it failed to protect your devices, aren't you entitled to some kind of compensation or perhaps an explanation?

Link to comment
16 minutes ago, Mat1926 said:

 

Since it failed to protect your devices, aren't you entitled to some kind of compensation or perhaps an explanation?

That would require that the customer can prove that the issue happened through the UPS - and that the UPS is of a type that specifies good surge arrester capabilities. And that there wasn't a surge of higher energy level than what the UPS was designed to handle.

 

In real life, there are more ways things can break.

Link to comment
4 minutes ago, pwm said:

That would require that the customer can prove that the issue happened through the UPS - and that the UPS is of a type that specifies good surge arrester capabilities. And that there wasn't a surge of higher energy level than what the UPS was designed to handle.

 

In real life, there are more ways things can break.

 

In other words, all those claims that you are protected up to so and so $$$$ are actually not true?!

Link to comment
3 hours ago, Mat1926 said:

 

In other words, all those claims that you are protected up to so and so $$$$ are actually not true?!

In other words: Most UPSes aren't sold with any such claims.

 

And such claims aren't applicable if your equipment has any connections that are not properly protected by the UPS. If the monitor isn't run on the UPS, then a power surge can hit the computer over the video cable. Most TV sets are killed by lightning strikes affecting the antenna, and not the power lines. I have broken a network card after a lightning strike in another building where the central networking switch was. There really are lots of ways for power surges to kill things without it being possible to specifically blame the UPS.

Link to comment

Please observe that any protection device has a surge rating in joules. If the power surge exceeds that rating, the protection will fail. A close lightning hit can easily exceed 100,000 joules. It can blow the top of a chimney completely off a house! (I know this to be a fact, as it was my house and the furthest pieces were 150 feet away! It also destroyed a submersible well pump motor 70 feet below ground level.)
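
As a rough sense of scale, the comparison can be written as a small Python sketch. The 100,000 joule figure is the one from the post; the 1,000 joule rating is a made-up example value, not taken from any particular product, and the simple "surge above rating means failure" rule is just the rule of thumb stated above.

```python
def protection_holds(surge_joules, rating_joules):
    """Rule of thumb from the post: protection fails once the surge exceeds the rating."""
    return surge_joules <= rating_joules


def overload_factor(surge_joules, rating_joules):
    """How many times larger the surge is than the device's rating."""
    return surge_joules / rating_joules


if __name__ == "__main__":
    close_strike = 100_000  # joules, figure quoted in the post
    example_rating = 1_000  # joules, hypothetical protector rating

    print(protection_holds(close_strike, example_rating))
    print(overload_factor(close_strike, example_rating))
```

With those numbers the surge is a hundred times the assumed rating, which is why a nearby strike can get past a protector that handles everyday spikes without trouble.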

Link to comment
