
Missing drives after pausing and rebooting rebuild


mganoe


As the title describes, I was in the middle of a drive rebuild onto a precleared disk when my motherboard raised a high-temperature alarm on 2 CPUs. I paused the rebuild and shut the system down, thinking I might have dust buildup in the case. After blowing it out, I fired it back up to find another drive failing and 5 disks missing from the system. I double-checked the cables and everything seems fine. My attached cards all seem to see the drives, but they do not show up at all in Unraid (6.9.2) under Array Devices. I have not tried to restart the array because I am afraid it might do irreversible damage.

 

Any thoughts would be most welcome at this point.

Link to comment

Looks like you have dual parity and 17 data disks in the array, plus 2 cache and flash, for a total of 22 disks. Disk9 was rebuilding, but disk4 is missing. With dual parity you should be able to rebuild both.

 

Since you rebooted before getting the diagnostics, the syslog doesn't tell us anything from before that. I also can't tell which disk was assigned as disk4, but there are several unassigned disks in your SMART folder.

 

09:00.0 IDE interface [0101]: JMicron Technology Corp. JMB368 IDE controller [197b:2368]
	Subsystem: JMicron Technology Corp. JMB368 IDE controller [197b:2368]
	Kernel driver in use: pata_jmicron
	Kernel modules: pata_jmicron

Do you actually have any IDE drives? If not, check in the BIOS that AHCI is used for all your disks.

 

85:0e.0 RAID bus controller [0104]: Areca Technology Corp. ARC-1220 8-Port PCI-Express to SATA RAID Controller [17d3:1220]
	Subsystem: Areca Technology Corp. ARC-1220 8-Port PCI-Express to SATA RAID Controller [17d3:1220]
	Kernel driver in use: arcmsr
	Kernel modules: arcmsr
87:00.0 RAID bus controller [0104]: Areca Technology Corp. ARC-1680 series PCIe to SAS/SATA 3Gb RAID Controller [17d3:1680]
	Subsystem: Areca Technology Corp. ARC-1222 8-Port PCIe to SAS/SATA 3Gb RAID Controller [17d3:1222]
	Kernel driver in use: arcmsr
	Kernel modules: arcmsr

RAID controllers are not recommended, and I suspect this is the main problem.
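
For anyone wanting to spot this kind of controller in their own diagnostics, here is a minimal sketch that scans `lspci -nnk`-style text and flags devices whose class code is `0101` (IDE interface) or `0104` (RAID bus controller), the two classes quoted above. The function name `flag_controllers` and the regular expressions are my own illustration and assume the output layout matches the excerpts in this post; it is not part of Unraid or its diagnostics tooling.

```python
import re

# PCI class codes as they appear in brackets in the excerpts above.
FLAGGED_CLASSES = {"0101": "IDE interface", "0104": "RAID bus controller"}

# Device header, e.g.
# "85:0e.0 RAID bus controller [0104]: Areca ... [17d3:1220]"
HEADER = re.compile(
    r"^(?P<slot>[0-9a-f:.]+)\s+(?P<cls>.+?)\s+\[(?P<code>[0-9a-f]{4})\]:\s+(?P<name>.+)$"
)
# Indented detail line naming the bound kernel driver.
DRIVER = re.compile(r"^\s+Kernel driver in use:\s+(?P<driver>\S+)")


def flag_controllers(lspci_text):
    """Return one dict per IDE/RAID-class device found in the text."""
    flagged = []
    current = None  # the flagged device whose detail lines we are reading
    for line in lspci_text.splitlines():
        header = HEADER.match(line)
        if header:
            current = None
            if header.group("code") in FLAGGED_CLASSES:
                current = {
                    "slot": header.group("slot"),
                    "class_code": header.group("code"),
                    "name": header.group("name"),
                    "driver": None,
                }
                flagged.append(current)
            continue
        driver = DRIVER.match(line)
        if driver and current is not None:
            current["driver"] = driver.group("driver")
    return flagged


# Header and driver lines copied from the excerpts in this post.
SAMPLE = "\n".join([
    "09:00.0 IDE interface [0101]: JMicron Technology Corp. JMB368 IDE controller [197b:2368]",
    "\tKernel driver in use: pata_jmicron",
    "85:0e.0 RAID bus controller [0104]: Areca Technology Corp. ARC-1220 8-Port PCI-Express to SATA RAID Controller [17d3:1220]",
    "\tKernel driver in use: arcmsr",
])

flagged = flag_controllers(SAMPLE)
for dev in flagged:
    print(dev["slot"], FLAGGED_CLASSES[dev["class_code"]], dev["driver"])
```

A device showing up here is only a prompt to look closer; a card in this class may still pass disks through individually, so treat the result as a hint rather than a diagnosis.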

Link to comment

The array has all 24 slots filled with drives, which is what concerns me. The BIOS is set to use AHCI, and the RAID cards are only being used to connect the drives, not as an actual RAID configuration. I moved to Unraid a few years back when I began to get concerned about the age of the cards.

 

What is odd is that the disks that have dropped off are 18, 19, 20, 21, and 22, which, now that I look, are all on one RAID card, yet 17 is registered just fine and is on that same card. The disks you see as unassigned have always been unassigned; they were used in VMs once things spun up normally.

Link to comment
8 hours ago, mganoe said:

What is odd is that the disks that have dropped off are 18, 19, 20, 21, and 22, which, now that I look, are all on one RAID card, yet 17 is registered just fine and is on that same card. The disks you see as unassigned have always been unassigned; they were used in VMs once things spun up normally.

Disk 17 is on an LSI controller together with 7 other devices, and all are being detected. Also, unless you loaded an older Unraid config, there are only 17 data disks. Look at the screenshot you posted: there are 24 devices total, including cache devices, unassigned SSDs and disks, and an empty slot. You don't even have disk 17 there.

Link to comment

My issue is that 18, 19, 20, 21, and 22, as well as 4, all show as unassigned, with no option to select the drives that are actually connected to them. I did not load a new config; what you see is how it came back online after the reboot. Prior to that reboot, every slot had a drive assigned and nothing showed as unassigned. That said, I believe I did have 3 drives that were not part of the array.

 

Now I'm just trying to determine how to get the missing drives back so I can restart the array.

Link to comment

I now see what everyone has been talking about as far as the counts go. I just finished going through and pulling all the drives to verify what was where. It appears the only drive Unraid is not picking up is the 4TB drive with serial number WD-WCC4E3CECN91, which I suspect should be in disk 4. I have another drive on the way in case that drive has simply died. Should I wait to receive the drive before spinning up the array and restarting the rebuild process on my other failed drive?

 

Also, can I simply change the slot count on the Main page without any negative side effects? I have no idea how it got like that.
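
One way to double-check that a serial really is absent at the OS level, rather than just unassigned in the GUI, is to look for it among the names under `/dev/disk/by-id`. The sketch below is only an illustration: `entries_for_serial` and `list_by_id` are hypothetical helper names, and the matching assumes the serial appears verbatim in the link name, which may not hold for drives sitting behind a RAID controller.

```python
import os
import re

BY_ID_DIR = "/dev/disk/by-id"  # udev-managed symlinks on Linux

# Partition links end in "-part<N>"; we only want whole-disk entries.
PARTITION_SUFFIX = re.compile(r"-part\d+$")


def entries_for_serial(serial, names):
    """Return whole-disk by-id names that contain the given serial."""
    return sorted(
        name
        for name in names
        if serial in name and not PARTITION_SUFFIX.search(name)
    )


def list_by_id(path=BY_ID_DIR):
    """List by-id link names, or an empty list if the directory is absent."""
    return os.listdir(path) if os.path.isdir(path) else []


if __name__ == "__main__":
    serial = "WD-WCC4E3CECN91"  # serial quoted in this post
    matches = entries_for_serial(serial, list_by_id())
    if matches:
        print("Detected as:", ", ".join(matches))
    else:
        print("No by-id entry contains", serial)
```

An empty result is consistent with a dead drive, a bad port, or a controller that hides the serial, so it narrows things down without settling which one it is.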

Link to comment

If you wait for the other drive, you can rebuild both at the same time. You can also use your server with both disabled, though there is no protection. Or go ahead and rebuild one now; you would then be back to dual parity with only one disk disabled, and so have single-disk protection.
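
The trade-off between those options comes down to simple arithmetic: each disabled disk uses up one parity disk's worth of redundancy. The sketch below models only that count; `remaining_protection` is a hypothetical helper for illustration, not anything Unraid exposes.

```python
def remaining_protection(parity_disks, disabled_disks):
    """Number of further disk failures the array could still tolerate.

    Simplified model: each disabled disk consumes one parity disk's
    worth of redundancy.
    """
    if parity_disks < 0 or disabled_disks < 0:
        raise ValueError("counts must be non-negative")
    if disabled_disks > parity_disks:
        raise ValueError("more disks disabled than parity can cover")
    return parity_disks - disabled_disks


# The three states discussed above, with dual parity (2 parity disks).
scenarios = {
    "both disks disabled": remaining_protection(2, 2),
    "one rebuilt, one still disabled": remaining_protection(2, 1),
    "both rebuilt": remaining_protection(2, 0),
}

for label, spare in scenarios.items():
    print(f"{label}: can tolerate {spare} more failure(s)")
```

With both disks disabled the result is 0, which is why running the server in that state carries no protection, while rebuilding one disk first brings it back to 1.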

 

You can change the slot count.

Link to comment
