
Need help with a failed drive


Seanmc980
Solved by JorgeB

Hello, I went to shut down my server and noticed that one of the drives in my array had a red X next to it, with 64 errors listed in the errors column. When I first noticed this, the drive was also missing its temperature reading. I tried a restart, hoping a reboot would fix the issue (knowing that it probably wouldn't), but the system hung and never rebooted. I then issued a shutdown, and it eventually did shut down, but it reported that the shutdown wasn't clean. The drive is still red. I started the array in maintenance mode and started a parity check. I feel like this is probably a 20-hour waste of time, so I'm reaching out for help getting this back up and running.

After the reboot, the log files seem to have been cleared. The drive still shows under the array, but its device letter has changed: it went from sdk to sdg, and the drive is also listed further down under historical devices.
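Kernel device letters such as sdk or sdg are assigned at boot and can change between reboots, so it can help to look a disk up by a stable identifier instead. The following is a minimal sketch, assuming a Linux system that exposes `/dev/disk/by-id` symlinks (common udev behaviour, not something confirmed for this server in the thread); the function name `map_stable_ids` is hypothetical.

```python
import os
import re

# Partition entries in /dev/disk/by-id end in "-partN"; skip them so that
# only whole-disk identifiers are reported.
PARTITION_SUFFIX = re.compile(r"-part\d+$")

def map_stable_ids(by_id_dir="/dev/disk/by-id"):
    """Return {stable_id: kernel_name}, e.g. {"ata-MODEL_SERIAL": "sdg"}.

    Assumes each entry in by_id_dir is a symlink to a kernel device node.
    """
    mapping = {}
    for name in sorted(os.listdir(by_id_dir)):
        if PARTITION_SUFFIX.search(name):
            continue
        target = os.path.realpath(os.path.join(by_id_dir, name))
        mapping[name] = os.path.basename(target)
    return mapping

if __name__ == "__main__":
    # Only meaningful on a machine that actually has /dev/disk/by-id.
    if os.path.isdir("/dev/disk/by-id"):
        for stable_id, kernel_name in map_stable_ids().items():
            print(f"{kernel_name}\t{stable_id}")
```

Comparing the identifier before and after a reboot shows whether the same physical disk has simply been given a different letter.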

 

This array is made up of 4 12TB drives, with 1 parity, plus 1 2TB drive and 3 unassigned drives. I want to make sure I perform the right steps, since this is my main server for all my media and backups. I really do not want to screw this up, so could someone please walk me through the process? I found a couple of forum posts, but they linked to outdated and removed posts.

1 hour ago, Seanmc980 said:

started a parity check

Fortunately you didn't start a correcting parity check

Sep 10 14:07:13 BTCH kernel: mdcmd (36): check nocorrect

since what you need to do is rebuild the disabled disk using the existing parity.
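For anyone wanting to confirm the same thing from their own syslog, here is a minimal sketch that looks for `mdcmd ... check` lines and reports whether the `nocorrect` argument is present. The only line format it relies on is the one quoted above; how a correcting check is logged is not shown in this thread, so the sketch treats anything other than `nocorrect` as "not confirmed non-correcting". The function name `classify_check_lines` is hypothetical.

```python
import re

# Matches lines like:
#   Sep 10 14:07:13 BTCH kernel: mdcmd (36): check nocorrect
MDCMD_CHECK = re.compile(r"kernel: mdcmd \((\d+)\): check\b\s*(.*)$")

def classify_check_lines(lines):
    """Return one dict per 'mdcmd ... check' line found in lines."""
    results = []
    for line in lines:
        match = MDCMD_CHECK.search(line.rstrip())
        if not match:
            continue
        args = match.group(2).strip()
        results.append({
            "line": line.strip(),
            "args": args,
            # True only when the first argument is literally "nocorrect".
            "non_correcting": args.split()[:1] == ["nocorrect"],
        })
    return results

if __name__ == "__main__":
    sample = ["Sep 10 14:07:13 BTCH kernel: mdcmd (36): check nocorrect"]
    for entry in classify_check_lines(sample):
        status = "non-correcting" if entry["non_correcting"] else "not confirmed non-correcting"
        print(f"{status}: {entry['line']}")
```

To run it against a real log, pass the lines of the syslog file from the diagnostics instead of `sample`.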

 

syslog indicates emulated disk3 mounted before you restarted in maintenance mode, and disk3 SMART looks OK. Probably just a connection problem, but can't say for sure since it happened before reboot so nothing in syslog about that.

 

It should be OK to rebuild to the same disk, but you should check connections first.

 

https://docs.unraid.net/unraid-os/manual/storage-management/#rebuilding-a-drive-onto-itself

 

Do you have backups of anything important and irreplaceable?

 

 


This drive was powered through a SATA power adapter, one of those old-style HDD (Molex) to SATA power adapters. The one power cable was split three times to power 2 fans and then the SATA adapter, which was powering 4 drives, including the one with errors. I wasn't aware that this was a terrible idea until now. The power supply is an older modular ATX unit, but I lost all the extra modular cables in a move. A new power supply is ordered, and the server is shut down until I get it swapped and properly powered. I'll report back when I get it retested. Thanks for your time.

Edited by Seanmc980

Bad communication with a disk can be caused by bad cables (power or SATA), bad connectors (power, splitters, or SATA, at either end), or loose connections (power or SATA, at either end).

 

Each connection must sit squarely on the connector, with no tension in the cable that might cause it to move. Don't bundle data cables or you could get crosstalk interference. Don't put more than 4 drives on a single PSU cable.
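The wiring rules above can be turned into a quick checklist. This is a minimal sketch, assuming you write down by hand how many drives and how many splitters or adapters hang off each PSU cable; the `PowerCable` class, the `review_cable` function, and the cable names and counts are hypothetical, and the 4-drive limit is simply the guideline quoted in this thread.

```python
from dataclasses import dataclass

@dataclass
class PowerCable:
    name: str
    drives: int          # drives powered from this PSU cable
    splitters: int = 0   # splitters or Molex-to-SATA adapters in the chain

def review_cable(cable, max_drives=4):
    """Return a list of notes for one cable; an empty list means nothing flagged."""
    notes = []
    if cable.drives > max_drives:
        notes.append(
            f"{cable.name}: {cable.drives} drives exceeds the {max_drives}-drive guideline"
        )
    if cable.splitters > 0:
        notes.append(
            f"{cable.name}: {cable.splitters} splitter/adapter connection(s) to inspect"
        )
    return notes

if __name__ == "__main__":
    # Hypothetical layout for illustration only.
    cables = [
        PowerCable("molex-1", drives=4, splitters=3),
        PowerCable("sata-1", drives=5),
    ]
    for cable in cables:
        for note in review_cable(cable):
            print(note)
```

A cable with 4 drives is not over the quoted limit, but every splitter or adapter in the chain is an extra connection that can come loose, which is why the sketch lists those separately.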
