
Second disk disabled during rebuild


Solved by JorgeB


Hello, 

The other day Unraid disabled one of my data disks. The SMART report still showed the disk as healthy, so I followed these instructions from another post here:

  • stop array
  • remove disk from array
  • start array
  • stop array
  • add disk back to array
  • allow rebuild to begin

This seemed to work, but during the rebuild the disk became disabled again. So I swapped the SATA ports of the disabled disk and my optical drive and repeated the process above.
This seemed to be going fine until sometime last night, when, 48% of the way through the rebuild, another of my data disks became disabled and the rebuild stopped. I lost connection to the UI, so I did a hard restart, and now the array will not start, which I assume is because one disk is newly disabled while the other is still being emulated. I have started running extended SMART tests.

I am coming here to see if anyone has ideas on where to start and what I can do to recover any of the data on the disks. I understand, though, that since I only have a single parity drive it might be a lost cause.
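For context on why a single parity drive limits recovery: Unraid's single parity is a bytewise XOR across the data disks, so one missing disk can be recomputed from parity plus the surviving disks, but two missing disks cannot. A minimal Python sketch with made-up three-byte "disks" (the contents and the `xor_blocks` helper are illustrative only, not Unraid code):

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three toy "data disks" with hypothetical contents, plus their parity block.
disk1 = bytes([0x12, 0x34, 0x56])
disk2 = bytes([0xAB, 0xCD, 0xEF])
disk3 = bytes([0x0F, 0xF0, 0x55])
parity = xor_blocks([disk1, disk2, disk3])

# One disk missing: parity XOR the surviving disks gives it back exactly.
rebuilt_disk3 = xor_blocks([parity, disk1, disk2])
print(rebuilt_disk3 == disk3)

# Two disks missing: all that can be computed is (disk1 XOR disk3),
# which is not enough to separate the two disks again.
both_missing = xor_blocks([parity, disk2])
print(both_missing == xor_blocks([disk1, disk3]))
```

This is also why a disk that drops out during a rebuild stops the rebuild: with a second disk unavailable, there is no longer enough information to emulate the first one.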


I have attached my diagnostics file and will provide any other data as requested.

unraid-diagnostics-20240408-0838.zip

Edited by flashbeetle
Added information
Link to comment

You can try force enabling disk1 to try and rebuild disk3 again, but the diags show the SATA link going up and down for multiple disks, and you need to solve that first. Are you using any power splitters? You can also try a different PSU if available.

Link to comment
Posted (edited)

I think I am using a power splitter, but I am not sure. I am using the cables that came with this power supply. One of them has multiple connectors, and that is the one powering the disks. Based on the diagnostics, can you tell when the SATA link started going up and down? I am not yet sure how to find anything in the diagnostics myself.

I have an old power supply, and I can see whether it would work for testing purposes.
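On finding the SATA link events in a log: the Linux kernel's libata driver writes lines such as `ata3: SATA link down (...)`, `ata3: SATA link up 6.0 Gbps (...)` and `ata3: hard resetting link`. A rough Python sketch that groups those events by port is below. It assumes the traditional syslog timestamp prefix (for example `Apr  8 03:12:45`), and the sample lines, including the host name `Tower`, are made up for illustration rather than taken from the attached diagnostics:

```python
import re
from collections import defaultdict

# libata messages of interest, e.g. "ata3: SATA link down (SStatus 0 SControl 300)"
LINK_EVENT = re.compile(
    r"\b(ata\d+)(?:\.\d+)?: (SATA link (?:up|down)|hard resetting link)"
)


def find_link_events(lines):
    """Return {port: [(timestamp, event), ...]} for SATA link messages."""
    events = defaultdict(list)
    for line in lines:
        match = LINK_EVENT.search(line)
        if match:
            # Assumption: the first 15 characters are the syslog timestamp.
            timestamp = line[:15]
            events[match.group(1)].append((timestamp, match.group(2)))
    return dict(events)


# Hypothetical sample lines; in practice, pass the lines of the syslog file.
sample = [
    "Apr  8 03:12:45 Tower kernel: ata3: SATA link down (SStatus 0 SControl 300)",
    "Apr  8 03:12:51 Tower kernel: ata3: hard resetting link",
    "Apr  8 03:12:52 Tower kernel: ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)",
    "Apr  8 03:13:02 Tower kernel: ata5: SATA link down (SStatus 0 SControl 300)",
    "Apr  8 03:13:05 Tower kernel: usb 1-1: new high-speed USB device number 2 using xhci_hcd",
]

events = find_link_events(sample)
for port, entries in sorted(events.items()):
    print(port, "first event at", entries[0][0], "total events:", len(entries))
```

The earliest timestamp per port answers "when did it start", and several ports dropping within seconds of each other points at something shared, such as power or the controller, rather than at a single disk.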

Edited by flashbeetle
Add link
Link to comment

Gotcha, thanks for the information.

I think I may have resolved the power issue. I found a power cable that had come loose inside the machine while it was being moved around. I have kept the system on while running an extended SMART test on the second drive that became disabled, and it came back without errors. I have attached the diagnostics from this boot in case they help.

I am guessing that at this point my next steps would be to start things back up?

Force enabling disk1 and trying to rebuild disk3 again was mentioned earlier. How would I go about force enabling disk1?

Are there any other steps I should take to try and validate the data on disk1 before moving forward?

unraid-diagnostics-20240409-1427.zip

Link to comment

I may have made a mistake. I attempted what I guessed was the way to force enable disk1: I stopped the array, removed the disk, and restarted the array, then stopped it again and added the disk back. At this point I cannot start the array, as I am getting the message "Too many wrong and/or missing disks!"

What should my next move be?

Link to comment
  • Solution

This will only work if parity is still valid:

 

-Tools -> New Config -> Retain current configuration: All -> Apply
-Check all assignments and assign any missing disk(s) if needed.
-IMPORTANT - Check both "parity is already valid" and "maintenance mode" and start the array (note that the GUI will still show that data on parity disk(s) will be overwritten; this is normal, as the warning doesn't account for the checkbox, and parity won't be overwritten as long as the box is checked)
-Stop array
-Unassign disk3
-Start array (in normal mode now), and post new diags.


 

Link to comment
Posted (edited)

Things are looking much better. It seems replacing the power supply was the ticket to fix the SATA disconnects. The data rebuild completed and reported 0 errors. During the rebuild I did get a SMART warning for both disk1 and disk2: each showed a UDMA CRC Error Count with a raw value of 1. I have attached diagnostics again just in case, but based on some other posts here, it sounds like I should be OK with these disks as long as this value does not grow.
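One way to check whether the value grows is to compare two saved `smartctl -A` outputs for the same disk. The Python sketch below is only an illustration: it assumes the usual ten-column attribute table that `smartctl -A` prints for ATA disks and looks up the row by attribute ID 199, and the sample row and the helper names `udma_crc_raw` and `crc_count_grew` are made up for this example:

```python
def udma_crc_raw(smartctl_output):
    """Return the raw value of SMART attribute 199, or None if not found."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Columns: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[0] == "199":
            return int(fields[9])
    return None


def crc_count_grew(earlier_output, later_output):
    """True if the CRC count increased, False if not, None if unknown."""
    before = udma_crc_raw(earlier_output)
    after = udma_crc_raw(later_output)
    if before is None or after is None:
        return None
    return after > before


# Hypothetical attribute row in the layout smartctl uses for ATA disks.
earlier = "199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       1"
later = "199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       1"

print(udma_crc_raw(earlier))
print(crc_count_grew(earlier, later))
```

The CRC counter never resets, so a stable value after reseating or replacing cables suggests the errors were a past event rather than an ongoing problem.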

Should I just acknowledge these errors and move on? 

Thank you so much for all your help through this process!

unraid-diagnostics-20240413-2112.zip

Edited by flashbeetle
More data
Link to comment
Posted (edited)
6 hours ago, itimpi said:

Yes. As long as they only happen occasionally they will be a non-issue.

Perfect, that is great to hear.


Now, unfortunately, I am seeing the problem that started all this again. In short, it looks like all the SATA devices (3 data disks, 1 parity disk, and 1 optical drive) are not available. I am posting diagnostics. Since I am using a brand new power supply, I am wondering whether the motherboard is at fault and needs to be replaced.

unraid-diagnostics-20240414-0859.zip

Edited by flashbeetle
Clarification
Link to comment
Posted (edited)

I have now replaced the hardware that was presumably causing the failures. At this point I can see all the drives in Unraid.

I am thinking that, since all the drives dropped out at the same time with the last hardware failure, they should all be OK in terms of data.

For my next steps, I am guessing that I could either trigger another data rebuild for disk3, since that one is currently disabled, or force enable disk3 using the instructions @JorgeB provided above, since parity should be valid.

What do y'all think would be my best bet for next steps?

unraid-diagnostics-20240415-1556.zip

Edited by flashbeetle
Attached missing diagnostics
Link to comment
