Sparkie Posted March 26, 2021

Hi Guys: After upgrading from 6.8.3 to 6.9.1 I started getting parity errors (168 total). I checked SMART status on all drives and everything seems OK. The next day the server locked up. Upon reboot, all drives connected to the LSI controller were shown as missing. Rebooting the server and watching the LSI boot-up text before Unraid loads, all the missing drives are listed with their correct sizes (7 drives), but once Unraid finishes booting they are still missing. Note that all eight drives connected to the motherboard SATA ports are showing up OK.

All this started happening a few days after upgrading to 6.9.1, so I'm not sure if this is a coincidence or something else. As noted above, the first indicator was the parity check; I have been running successful parity checks on this server for well over a year with no errors. Hoping the issue might be tied to the upgrade, I reverted back to 6.8.3 and rebooted, but with no joy: the drives connected to the LSI controller are still missing. BTW, the controller has been flashed to IT mode since the original install well over a year ago with no problems. I do note that upon boot-up the missing drives on the LSI controller briefly flash their activity indicators, as if they are being scanned, but no data shows up on Unraid's Main page.

Also, after logging into Unraid, both my cache drives (Western Digital NVMe 1TB) were not shown in the cache pool; both appeared in Unassigned Devices. After re-assigning them to cache and rebooting, they again appear in Unassigned Devices and not in the actual cache pool.

I have attached diagnostics; any help or advice would be greatly appreciated.
System Specs:
Fatal1ty X399 Gaming MB
AMD Threadripper 2950X (16C/32T) CPU
32GB HyperX memory
3 x Supermicro 5-drive SATA drive cages
LSI 9207-8i drive controller
Zotac GeForce GTX 1050 OC 2GB video
1000W EVGA 1000GQ power supply
Rosewill 4U rackmount case
APC Back-UPS XS 1300 tower

tower-diagnostics-20210326-1305.zip
JorgeB Posted March 26, 2021

28 minutes ago, Sparkie said: I reverted back to 6.8.3 and rebooted but with no joy

This confirms it's not a v6.9 issue. Strange, though: the LSI shows up in the system devices and the driver is loaded, but it's never initialized. Make sure the card is well seated, or try a different slot if one is available.
Hoopster Posted March 26, 2021

38 minutes ago, Sparkie said: Drives connected to the LSI Controller are still missing.

I am no expert on hardware passthrough to VMs, but it looks like the controller has been bound to vfio-pci. Are you trying to pass through the entire controller and its attached disks to a VM?

08:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 [1000:0087] (rev 05)
	Subsystem: Broadcom / LSI 9207-8i SAS2.1 HBA [1000:3020]
	Kernel driver in use: vfio-pci
	Kernel modules: mpt3sas
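For anyone hitting the same symptom: on a live system you can check this yourself with `lspci -nnk -s 08:00.0`. The sketch below inlines the output quoted above as a string so the check is reproducible; the key line is "Kernel driver in use" (when it says vfio-pci instead of mpt3sas, the device is reserved for passthrough and invisible to the array).

```shell
# Minimal sketch: detect whether a PCI device is claimed by vfio-pci.
# The lspci output is inlined here for illustration; on a real server run:
#   lspci -nnk -s 08:00.0
out='08:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2308 [1000:0087]
	Kernel driver in use: vfio-pci
	Kernel modules: mpt3sas'

# grep the driver line: vfio-pci means the host kernel will not use the HBA
if printf '%s\n' "$out" | grep -q 'Kernel driver in use: vfio-pci'; then
  echo "device is bound to vfio-pci (hidden from the Unraid array)"
fi
```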
JorgeB Posted March 26, 2021

17 minutes ago, Hoopster said: I am no expert on hardware passthrough to VMs

Yep, missed that; you need to unbind the HBA.
Sparkie Posted March 26, 2021 (Author)

Hi Guys: Thanks for the quick response. When I upgraded to 6.9.1, I seem to remember "Fix Common Problems" stating that VFIO binding was now built into 6.9, so the separate add-on was no longer needed, and it suggested removing it. So I removed it. I wonder if doing that left the LSI controller mapped for VM passthrough. I will investigate and post an update. Thanks again for all the help. S.
Sparkie Posted March 26, 2021 (Author)

Hi again guys: I am a bit of a noob wrt Linux/Unraid. With six missing disks I will not be able to start the array, and therefore cannot access the VMs. How can I unbind the HBA? Is there a setting in Tools or Settings, or must this be done via the command line, and if so, what command would I use? Cheers, S.
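The thread resolves this through the config file on the flash drive (next post), which is the right fix since the binding happens at boot. For reference only, a device can also be released from vfio-pci at runtime through sysfs; this is a hedged sketch, not something from the thread, and the guard means it does nothing on a machine where the device isn't bound.

```shell
#!/bin/sh
# Sketch: release a PCI device from vfio-pci at runtime and ask the kernel
# to re-probe it so the native driver (here mpt3sas) can claim it.
# The address 0000:08:00.0 is taken from this thread; adjust for your system.
unbind_from_vfio() {
  dev=$1
  if [ -e "/sys/bus/pci/drivers/vfio-pci/$dev" ]; then
    # Detach the device from vfio-pci
    echo "$dev" > /sys/bus/pci/drivers/vfio-pci/unbind
    # Re-run driver matching so the normal kernel module can bind it
    echo "$dev" > /sys/bus/pci/drivers_probe
    echo "released $dev; kernel re-probed for the native driver"
  else
    echo "$dev is not bound to vfio-pci; nothing to do"
  fi
}

unbind_from_vfio 0000:08:00.0
```

Note this would not have helped Sparkie's boot-time problem by itself: the vfio-pci.cfg bind would simply re-apply on the next reboot.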
Hoopster Posted March 26, 2021

39 minutes ago, Sparkie said: Is there a setting in Tools or Settings or must this be done via command line

Do you have a vfio-pci.cfg file in the /config folder on the flash drive? Put your flash drive in a Windows PC and see if you can find that file. If the only bind in that file is 08:00.0, you can just delete vfio-pci.cfg (and the .bak, if it exists). If there are other devices in there that you want to keep bound, just remove the 08:00.0 entry from the file and save.
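The same edit can be scripted. This is only a sketch: the file layout below is an assumption based on the single `BIND=` line of `address|vendor:device` pairs that Unraid 6.9's built-in binding writes, so check what your own vfio-pci.cfg actually contains before editing it.

```shell
# Hypothetical example: strip one device's entry out of a vfio-pci.cfg.
# Assumed format (check yours first!): one line such as
#   BIND=0000:08:00.0|1000:0087 0000:0a:00.0|8086:1533
cfg=$(mktemp)
echo 'BIND=0000:08:00.0|1000:0087 0000:0a:00.0|8086:1533' > "$cfg"

# Remove the HBA's address together with its vendor:device pair
# (and the trailing space, if any), leaving other binds untouched
sed -i -E 's/0000:08:00\.0[|][0-9a-f]{4}:[0-9a-f]{4} ?//' "$cfg"
cat "$cfg"   # BIND=0000:0a:00.0|8086:1533
```

If 08:00.0 was the only entry, deleting the file entirely (as Hoopster suggests) is simpler and has the same effect.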
Sparkie Posted March 26, 2021 (Author)

Thanks Hoopster:

Reverted back to 6.9.1.
Unbound the LSI controller via Tools > System Devices.
Deleted vfio-pci.cfg and the .bak file from the flash drive, as I had nothing bound for the VMs I am using. I did at one time (see below).

I think what happened was that I had another video card at 08:00.0 and a third-party USB controller in the original system. Thinking these might have been my issue when the server froze, I powered down and removed them. It looks like the LSI controller then grabbed that address, inheriting the old bind?

Anyway, all is now working: array back online, cache drives connected. All is well. Much thanks for all the help.
ChatNoir Posted March 27, 2021

9 hours ago, Sparkie said: I think what happened was I had another video card at 08:00.0 and a third-party USB controller in the original system ... Possibly looks like the LSI controller grabbed that address?

Yes, addresses can change in that situation. Glad it now works for you.