6.12 - HBA-connected disks shown as active when spun down - sudden read errors - fixed after reboot



Hi

 

I ran 6.12-rc5 before updating to the final release, and I am now seeing errors with my LSI 9300 HBA.

When everything is freshly booted and spun up, all is fine. However, when the disks spin down, the GUI still shows a green dot for the disks connected to the HBA, while the ones connected to my mainboard SATA controller correctly show a grey dot.
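One way to cross-check the GUI's spin-down indicator is to ask the drive itself for its power state. The following is only a minimal sketch, assuming `hdparm` is installed, the script runs as root, and the disk is a SATA drive that answers `hdparm -C`; the device path `/dev/sdX` is a placeholder, and the function names are made up for illustration.

```python
import subprocess
from typing import Optional


def parse_drive_state(hdparm_output: str) -> Optional[str]:
    """Extract the power state from the text printed by `hdparm -C`.

    The output normally contains a line such as
    ' drive state is:  standby' or ' drive state is:  active/idle'.
    Returns None if no such line is found.
    """
    for line in hdparm_output.splitlines():
        if "drive state is:" in line:
            return line.split(":", 1)[1].strip()
    return None


def query_drive_state(device: str) -> Optional[str]:
    """Run `hdparm -C <device>` and return the reported power state."""
    result = subprocess.run(
        ["hdparm", "-C", device],
        capture_output=True,
        text=True,
        check=False,  # do not raise; just return None if the query fails
    )
    return parse_drive_state(result.stdout)


if __name__ == "__main__":
    # Placeholder device path; replace with a real disk behind the HBA.
    print(query_drive_state("/dev/sdX"))
```

If a disk behind the HBA reports `standby` here while the GUI shows a green dot, the problem is more likely in how the state is read through the controller than in the disk itself.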

 

This wouldn't bother me, but the HBA disks also suddenly report read errors after some time. Rebooting the host clears them, but only until the disks enter standby again.

Could this be due to power management of the PCIe devices? Or is my HBA on the way out?


Hi JorgeB,

 

so, I think I f*cked up...

I pulled the HBA and repasted the heatspreader on the chip. The old thermal compound was completely dry and solid, and it had oozed a solidified liquid that looked like tree sap.

After reassembling, I booted Unraid and started the rebuild of disk 10 (as shown in the screenshot).

During the rebuild, the connection broke down again. I believe the card is toast and dies when it gets too warm (it has been pretty warm here the last few days), even with new thermal compound and a directly attached fan.

 

The issue now is that the rebuild of disk 10 hasn't finished, and disk 8 suddenly also showed as "disabled - content emulated". And this is where I think I made a mistake...

I stopped the array, set the failed disk 8 to "disabled", started the array in maintenance mode, stopped it again and tried reassigning the disk to slot 8. But now it shows as a new device... I can only start the array when I set disk 8 to unassigned; otherwise too many disks are missing or changed.

 

I don't want to carry on with the rebuild of disk 10 with this shot HBA; a new one should arrive tomorrow.

However, will I be able to fix this situation at all, and what would be the best course of action? Will I be able to correctly reassign disk 8 after disk 10 has been rebuilt, or is the data on disk 8 gone and I have to add it as a new device?

The partition is still there in Unassigned Devices and my array is only 30% full, so if I can save the files somehow, that would be great. The files are not irreplaceable, but they would nevertheless be a hassle to acquire again.

image.png.0dc5ff8e9523b2cd82f9da2b57810334.png

11 minutes ago, JorgeB said:

Unraid cannot emulate two disks with single parity

I know, that's why I'm a bit scared ;)

 

11 minutes ago, JorgeB said:

[...] what happened to disk8? It's not even assigned.

image.thumb.png.f23c525b48ff7fad4ed7efa02eb5baa5.png

 

It's showing up as new. The data on it is still accessible when mounted through Unassigned Devices, but I cannot reintroduce it into the array like this. And I also can't check the filesystem, because if I assign both, I cannot start the array due to two missing/new disks with single parity.
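To illustrate why single parity cannot cover two missing disks, here is a small sketch that assumes the single parity disk holds the byte-wise XOR of all data disks. The four-byte "disks" and the helper `xor_blocks` are invented for illustration only and are not Unraid code.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical array of three tiny data "disks" (4 bytes each).
disks = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_blocks(disks)

# One disk missing: XOR of parity and the surviving disks restores it.
rebuilt = xor_blocks([parity, disks[0], disks[2]])
assert rebuilt == disks[1]

# Two disks missing: parity and the one survivor only yield the XOR of
# the two lost disks, which is not enough to tell them apart.
combined = xor_blocks([parity, disks[0]])
assert combined == xor_blocks([disks[1], disks[2]])
```

With one unknown per byte position the equation can be solved; with two unknowns it cannot, which is why the array refuses to start with two missing or new disks.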

 

 

Edited by WoRie

Yes, disk 8 suddenly showed up as dead.

How can I force it back into the array? I believe parity should still be valid and the disks should be fine.

The issue was the HBA. In my old case it was cooled directly by a nearby case fan; in the new case this fan is missing, and it has been 30 °C the last few days. I believe that was the culprit and the HBA died. I zip-tied a small Noctua fan to the new HBA to be safe in the future.


This will only work if parity is still valid, but if nothing else, it should re-enable disk8 and its data:

 

-Tools -> New Config -> Retain current configuration: All -> Apply
-Check all assignments and assign any missing disk(s) if needed, including the old disk8 and current disk10
-IMPORTANT - Check both "parity is already valid" and "maintenance mode" and start the array (note that the GUI will still show that data on parity disk(s) will be overwritten, this is normal as it doesn't account for the checkbox, but it won't be as long as it's checked)
-Stop array
-Unassign disk10
-Start array (in normal mode now) and post new diags.


Disk 5 had problems negotiating a link, even with new cables. Now, after some back and forth, the array is up. Disk 10 reports as being emulated, and I can only perform a read check but not a rebuild of disk 10...

I think if I am able to restore the array in full, I should immediately move all files from these old disks, some of them from 2011, to the newer 18 TB drives...

Can I rebuild disk 10, which currently appears empty in the array, or should I wipe it and re-add it?

wonas-diagnostics-20230625-1811.zip

