[Solved] Upgrade to 6.9.1 from 6.8.3 - Parity Errors on first scheduled Check, LSI Controller Issues


Hi Guys:

After upgrading from 6.8.3 to 6.9.1 I started getting parity errors (168 total). I checked the SMART status on all drives and everything seems OK.

The next day the server locked up. Upon reboot, all drives connected to the LSI controller were shown as missing. Rebooting the server and looking at the LSI boot-up text before Unraid loads, all the missing drives are shown with their correct sizes. Continuing the boot into Unraid, the drives are still shown as missing. Note that all eight drives connected to the motherboard SATA ports show up OK.

All this started happening a few days after upgrading to 6.9.1, so I am not sure whether this is just a coincidence or something else. As noted above, the first indicator was the parity check. I have been running successful parity checks on this server for well over a year with no errors.

 

I reverted to 6.8.3 and rebooted, hoping the upgrade was connected to this issue, but with no joy. Drives connected to the LSI controller are still missing. BTW, the controller was flashed to IT mode at the original install well over a year ago and has had no problems since. I do note that during boot-up the missing drives connected to the LSI controller briefly flash their drive indicators, which suggests they are being scanned, but no data shows up in Unraid on the Main menu. The LSI pre-Unraid boot screen shows all drives connected with the correct sizes (7 drives).

I also noted that after booting and logging into Unraid, both of my cache drives (Western Digital NVMe 1TB) were not shown in the cache pool; both appeared in Unassigned Devices instead. After re-assigning them to the cache and rebooting, they again appear in Unassigned Devices and not in the actual cache.

I have attached diagnostics. Any help or advice would be greatly appreciated.

 

System Specs:

Fatal1ty X399 Gaming MB

AMD Threadripper 2950X (16C/32T) CPU

32GB HyperX Memory

3 - Supermicro 5-Drive SATA Drive Cages

LSI 9207-8i Drive Controller

Zotac GeForce GTX 1050 OC 2GB Video

1000W EVGA 1000GQ Power Supply

Rosewill 4U Rackmount Case

APC Back-UPS XS 1300

tower-diagnostics-20210326-1305.zip

28 minutes ago, Sparkie said:

I reverted back to 6.8.3 and rebooted but with no joy

 

This confirms it's not a v6.9 issue. Strange, though: the LSI is detected among the devices and the driver is loaded, but it's never initialized. Make sure the card is well seated, or try a different slot if one is available.

38 minutes ago, Sparkie said:

Drives connected to the LSI Controller are still missing.

I am no expert on hardware passthrough to VMs, but it looks like the controller has been bound to vfio-pci.  Are you trying to pass through the entire controller and attached disks to a VM?

 

08:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 [1000:0087] (rev 05)
    Subsystem: Broadcom / LSI 9207-8i SAS2.1 HBA [1000:3020]
    Kernel driver in use: vfio-pci
    Kernel modules: mpt3sas
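
The excerpt above shows the card's own driver (mpt3sas) available as a module while vfio-pci is the driver actually in use, which is why the disks never reach Unraid. As a rough sketch, the same check can be scripted against text in this format. This assumes the text has the same layout as the excerpt (as `lspci -nnk` produces); the function names `drivers_in_use` and `bound_to_vfio` are made up for illustration.

```python
import re

# Sample text copied from the excerpt in this post.
SAMPLE = """\
08:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 [1000:0087] (rev 05)
    Subsystem: Broadcom / LSI 9207-8i SAS2.1 HBA [1000:3020]
    Kernel driver in use: vfio-pci
    Kernel modules: mpt3sas
"""

# A device line starts at column 0 with a PCI address such as 08:00.0
# (optionally prefixed with a domain such as 0000:).
DEVICE_LINE = re.compile(
    r"^((?:[0-9a-fA-F]{4}:)?[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])\s+(.*)$"
)


def drivers_in_use(lspci_text):
    """Map each PCI address to (description, driver in use or None)."""
    result = {}
    current = None
    for line in lspci_text.splitlines():
        match = DEVICE_LINE.match(line)
        if match:
            current = match.group(1)
            result[current] = (match.group(2), None)
        elif current and line.strip().startswith("Kernel driver in use:"):
            driver = line.split(":", 1)[1].strip()
            result[current] = (result[current][0], driver)
    return result


def bound_to_vfio(lspci_text):
    """Return the addresses whose driver in use is vfio-pci."""
    return [
        addr
        for addr, (_, driver) in drivers_in_use(lspci_text).items()
        if driver == "vfio-pci"
    ]


print(bound_to_vfio(SAMPLE))
```

Any storage controller that turns up in that list is reserved for passthrough and will not present its disks to the host.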


Hi Guys:

Thanks for the quick response.

When I upgraded to 6.9.1 I seem to remember "Fix Common Problems" stating that something to do with VFIO was now a part of 6.9 and no longer needed as a separate "add-on" and suggested removing it. So I removed it. I wonder if doing that remapped the LSI controller to the VM.

 

I will investigate and post an update.

 

Thanks again for all the help.

S.


Hi again guys:

I am a bit of a noob wrt Linux/Unraid.

With six missing disks I will not be able to start the array and therefore cannot access the VMs.

How can I unbind the HBA?

Is there a setting in Tools or Settings, or must this be done via the command line, and if so, what command would I use?

Cheers,

S.

 

39 minutes ago, Sparkie said:

Is there a setting in Tools or Settings or must this be done via command line

Do you have a vfio-pci.cfg file in the /config folder on the flash drive? 

 

Put your flash drive in a Windows PC and see if you can find that file. If the only bind in that file is 08:00.0, you can just delete vfio-pci.cfg and the .bak copy if it exists. If you have more devices in there that you want to keep bound, just remove the bind of 08:00.0 from the file and save.
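
For anyone who would rather script the edit than do it by hand, here is a minimal sketch of removing one bind from the file's contents. It assumes the file holds a single `BIND=` line with space-separated entries that look like either `08:00.0` or `0000:08:00.0|1000:0087`; that layout is an assumption, so check your own file first. The function names and the second device in the example are hypothetical.

```python
def short_address(entry):
    """Strip an optional '|vendor:device' suffix and a leading PCI domain."""
    addr = entry.split("|")[0]
    if addr.count(":") == 2:          # e.g. 0000:08:00.0 -> 08:00.0
        addr = addr.split(":", 1)[1]
    return addr.lower()


def remove_bind(cfg_text, address):
    """Return cfg_text with every BIND entry for `address` removed.

    A BIND line that ends up with no entries is dropped, so an empty
    result means the file can simply be deleted.
    """
    out_lines = []
    for line in cfg_text.splitlines():
        if line.startswith("BIND="):
            entries = line[len("BIND="):].split()
            kept = [e for e in entries if short_address(e) != address.lower()]
            if kept:
                out_lines.append("BIND=" + " ".join(kept))
        else:
            out_lines.append(line)
    return "\n".join(out_lines) + ("\n" if out_lines else "")


# Hypothetical file contents: the LSI bind plus one made-up extra device.
example = "BIND=0000:08:00.0|1000:0087 0000:0a:00.0|aaaa:bbbb\n"
print(remove_bind(example, "08:00.0"), end="")
```

Keep a copy of the original file before writing anything back, and reboot afterwards so the change takes effect.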


Thanks Hoopster:

Reverted back to 6.9.1

Unbound LSI Controller from VMs via Tools/System Devices

Deleted vfio-pci.cfg & the .bak file from the flash drive as I had nothing bound to the VMs I am using. I did at one time (see below).

I think what happened was that I had another video card at 08:00.0 and a third-party USB controller in the original system. I thought these might have been the cause when the server froze, so I powered down and removed them. It looks like the LSI controller then took over that address?

Anyway all is now working, array back on-line, cache drives connected.

All is well, much thanks for all the help.

9 hours ago, Sparkie said:

I think what happened was that I had another video card at 08:00.0 and a third-party USB controller in the original system. I thought these might have been the cause when the server froze, so I powered down and removed them. It looks like the LSI controller then took over that address?

Addresses can change in that situation.

Glad it now works for you. :) 
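
Since a bind recorded by address alone follows whatever card lands at that address, one way to spot this kind of drift is to keep the vendor:device ID next to each address and compare it with what is in the slot now. The sketch below is only an illustration of that idea, not a description of what Unraid does; `stale_binds` is a made-up name and `aaaa:bbbb` is a placeholder ID standing in for the removed video card.

```python
def stale_binds(recorded, current):
    """Return (address, recorded_id, current_id) for every address whose
    occupant no longer matches the ID recorded when the bind was made."""
    stale = []
    for addr, expected_id in recorded.items():
        actual_id = current.get(addr)
        if actual_id is not None and actual_id != expected_id:
            stale.append((addr, expected_id, actual_id))
    return stale


# Placeholder ID for the card that used to sit at 08:00.0.
recorded = {"08:00.0": "aaaa:bbbb"}
# What the lspci output earlier in the thread shows at 08:00.0: the LSI SAS2308.
current = {"08:00.0": "1000:0087"}

print(stale_binds(recorded, current))
```

A non-empty result means a bind is pointing at a different device than the one it was created for, which is the situation described in this thread.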

