
My 10TB disabled itself and now my 4TB did on rebuild



Hey Guys,

Over the weekend, my 10TB Red data drive started getting errors and disabled itself. I did a check on it and it passed, so I set it to no device and re-enabled it to start the rebuild. During the rebuild my 4TB Data drive got a single error and disabled itself.

 

I feel I am in a pickle here. What should I do? I am not overly sure what I am looking at or for in the diag zip.

tower-diagnostics-20191223-0828.zip

Edited by JonesCKevin
10 minutes ago, johnnie.black said:

You shouldn't have resumed the rebuild, no point in going on with two invalid disks.

 

If all array data is unchanged since the rebuild started, you can use the invalid slot command to re-enable disk2, since its SMART report looks fine. It does have millions of CRC errors, though, so you should replace the SATA cable before doing it.

Roger that, 

 

To confirm before I break something, invalid slot command:

mdcmd set invalidslot 2

6 minutes ago, JonesCKevin said:

To confirm before I break something, invalid slot command:

It's more than that; I'll post the procedure in a few minutes. In the meantime, all your disks have millions of UDMA CRC errors, and you're using an LSI firmware with known issues. So ignore the above advice to replace the SATA cable; you need to update to 20.00.07.00 before rebuilding.
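
For anyone following along, one way to see which firmware the HBA is running is to look for the version the driver logs at boot. This is only a rough sketch: it assumes the mpt2sas/mpt3sas driver writes a `FWVersion(...)` field to the syslog, and the helper names `fw_version` and `fw_is_older` are made up for illustration.

```shell
#!/bin/bash
# Extract the first FWVersion(a.b.c.d) value found in syslog text on stdin.
# ASSUMPTION: the HBA driver logs a line containing "FWVersion(...)".
fw_version() {
    sed -n 's/.*FWVersion(\([0-9.]*\)).*/\1/p' | head -n 1
}

# Succeed if version $1 is strictly older than version $2.
# The four dot-separated fields are compared numerically, left to right.
fw_is_older() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" |
            sort -t. -k1,1n -k2,2n -k3,3n -k4,4n | head -n 1)" = "$1" ]
}

# Example usage (the syslog path is an assumption about your system):
#   current=$(fw_version < /var/log/syslog)
#   [ -n "$current" ] && fw_is_older "$current" 20.00.07.00 &&
#       echo "firmware $current is older than 20.00.07.00"
```

If nothing is printed, the log may simply have rotated since boot, so treat an empty result as "unknown" rather than "up to date".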


-Tools -> New Config -> Retain current configuration: All -> Apply
-Assign any missing disk(s) if needed
-Important - After checking the assignments leave the browser on that page, the "Main" page.

-Open an SSH session/use the console and type (don't copy/paste directly from the forum, as sometimes it can insert extra characters):

mdcmd set invalidslot 1 29

-Back on the GUI and without refreshing the page, just start the array. Do not check the "parity is already valid" box. The GUI will still show that data on the parity disk(s) will be overwritten; this is normal, as it doesn't account for the invalid slot command, and parity won't be overwritten as long as the procedure was done correctly. Disk1 will start rebuilding. The disk should mount immediately, but if it's unmountable don't format; wait for the rebuild to finish and then run a filesystem check.
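
Once the rebuild is running, progress can also be followed from the console instead of the GUI. The sketch below is an assumption-heavy illustration: it expects `name=value` lines that include `mdResyncPos=` and `mdResyncSize=`, which is what `mdcmd status` is believed to print, so check the actual output on your own server first. The function name `rebuild_percent` is hypothetical.

```shell
#!/bin/bash
# Print rebuild progress as a whole-number percentage.
# Reads "name=value" lines on stdin.
# ASSUMPTION: the input contains mdResyncPos= and mdResyncSize= lines.
rebuild_percent() {
    awk -F= '
        $1 == "mdResyncPos"  { pos = $2 }
        $1 == "mdResyncSize" { size = $2 }
        END {
            if (size > 0) printf "%d\n", (pos * 100) / size
            else print "unknown"
        }'
}

# Example usage on the server:
#   mdcmd status | rebuild_percent
```

It only reads status and changes nothing, so it is safe to run while the rebuild is in progress.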

I am just curious: should it be mdcmd set invalidslot 2 29, since it's disk 2 that isn't set?


Small Update:

Just an update on the progress so far. I decided to hold off on the firmware update and try to get the drives up and running first.

So far this may not be the best approach, as other drives get a single error at a random point in the rebuild, which causes those drives to be disabled. My next step will probably be to move the drives from the card to the onboard SATA ports until it is repaired.
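
To see which drives are still picking up CRC errors before and after moving ports, the raw count can be read from each drive's SMART attributes. A minimal sketch, assuming `smartctl` is available and the drives report the standard attribute 199 (UDMA_CRC_Error_Count); the function name `crc_raw_value` and the `/dev/sd?` device pattern are placeholders.

```shell
#!/bin/bash
# Print the raw value of SMART attribute 199 (UDMA_CRC_Error_Count)
# from `smartctl -A` output on stdin. In the attribute table the raw
# value is the 10th whitespace-separated column.
crc_raw_value() {
    awk '$1 == 199 { print $10 }'
}

# Example usage (device names are placeholders):
#   for dev in /dev/sd?; do
#       printf '%s %s\n' "$dev" "$(smartctl -A "$dev" | crc_raw_value)"
#   done
```

The count is cumulative and does not reset, so what matters is whether it keeps increasing between two runs, not the absolute number.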


New Update and Probably Final Update:

I moved as many drives as I could from the SAS cables to my onboard SATA ports, which covers 6 of my 8 drives. I did this to reduce the chance of CRC errors during the repair. I then followed the procedure above to have disk 1 start rebuilding again. As an extra measure, in case heat is an issue, I also placed a giant fan on my case, which kept all my drives at 26 degrees Celsius.

 

Everything appears to be up and running now.

