May 28
System: Supermicro CS836 chassis with a QNAP enclosure added via LSI SAS3216
Mobo: MSI Pro Z790-P WI-FI with 12700K
Drives: 22 mostly WDC SATA in BTRFS

Upgrade went smoothly. Rebooted and immediately noticed that the UNRAID CLI was repeatedly reporting the following when trying to scan the hard drives:

May 28 09:21:55 BigBoi kernel: I/O error, dev sdh, sector 128 op 0x0:(READ) flags 0x80700 phys_seg 48 prio class 2
May 28 09:21:55 BigBoi kernel: sd 0:0:10:0: attempting task abort!scmd(0x00000000fcba39fa), outstanding for 30463 ms & timeout 30000 ms
May 28 09:21:55 BigBoi kernel: sd 0:0:10:0: [sdk] tag#5615 CDB: opcode=0x28 28 00 00 00 00 80 00 01 80 00
May 28 09:21:55 BigBoi kernel: scsi target0:0:10: handle(0x0018), sas_address(0x300062b202aed2ce), phy(14)
May 28 09:21:55 BigBoi kernel: scsi target0:0:10: enclosure logical id(0x500062b202aed2c0), slot(4)
May 28 09:21:55 BigBoi kernel: scsi target0:0:10: enclosure level(0x0000), connector name( )
May 28 09:21:55 BigBoi kernel: sd 0:0:7:0: Power-on or device reset occurred

This happened over and over. Eventually the server booted (30-40 mins) and allowed me to log in, after which it struggled for another 30 mins to start the array (it eventually did).

I was able to grab snippets and diags prior to rolling back to 7.3.0. The rollback was successful and everything is running perfectly on 7.3.0.

Happy to provide more information.
Skip

bigboi-diagnostics-20260528-0934.zip
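
For anyone reading the log above: opcode 0x28 is a SCSI READ(10), and the bytes after it carry the start address and length of the read that timed out. A minimal Python sketch for decoding it follows. The names `Read10` and `decode_read10` are made up for illustration and are not part of any Unraid or kernel tool.

```python
from typing import NamedTuple


class Read10(NamedTuple):
    lba: int     # starting logical block address
    blocks: int  # number of logical blocks requested


def decode_read10(cdb_hex: str) -> Read10:
    """Decode a READ(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    # READ(10) layout: bytes 2-5 hold the LBA, bytes 7-8 the transfer
    # length, both big-endian.
    lba = int.from_bytes(cdb[2:6], "big")
    blocks = int.from_bytes(cdb[7:9], "big")
    return Read10(lba, blocks)


if __name__ == "__main__":
    # CDB copied from the [sdk] line in the log above.
    print(decode_read10("28 00 00 00 00 80 00 01 80 00"))
```

For the CDB in the log this gives LBA 128 and a length of 384 blocks, so the aborted command is an ordinary read near the start of the disk, the same start address the I/O error line shows for sdh.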
May 28 Community Expert
If it works fine again after downgrading, it could be a kernel regression. The other LSI controller is apparently working fine; the problem is just the 9305-16e.
It may be worth looking for a firmware update; we can also see if any other users report similar issues.
May 28 Author
@JorgeB Fair enough. I didn't dive into the diagnostics. Could you tell if the problem was actually coming from the 3216 or the 2308? I can check firmware versions, but I was pretty sure these old cards did not have updates.
Thanks,
Skip
May 28 Community Expert
31 minutes ago, Skipdog said: "could you tell if the problem was actually coming from the 3216"
It only appears to affect this one.
May 31 Author
Just one more note. OpenAI analysis spat out:

"Diagnostics show repeated 30-second I/O timeouts resulting in task aborts and device resets on host0 (LSISAS3216 / SAS9305-16e running FW 16.00.11.00). No corresponding aborts are seen on host1 (LSISAS2308 running FW 20.00.07.00). Rolling back from Unraid 7.3.1 to 7.3.0 immediately resolves the issue. Controller remains operational and does not enter IOC fault state; failures appear to be command timeout related rather than HBA crashes."

I'm wondering, would a BIOS/firmware update be worth it? My feeling is no, it would not help.
Skip
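
The host0 versus host1 comparison can be checked by hand rather than taken on trust from the AI summary. Below is a rough Python sketch that counts "attempting task abort" lines per SCSI host number. It assumes the syslog format shown in the first post; the function name `count_aborts_by_host` and the `SAMPLE` list are illustrative only, and which host number maps to which controller still has to be read from the diagnostics.

```python
import re
from collections import Counter

# Matches lines such as:
#   "... kernel: sd 0:0:10:0: attempting task abort!scmd(...)"
# The first number in host:channel:target:lun is the SCSI host.
ABORT_RE = re.compile(r"kernel: sd (\d+):\d+:\d+:\d+: attempting task abort")


def count_aborts_by_host(lines):
    """Return a Counter mapping SCSI host number -> task-abort count."""
    counts = Counter()
    for line in lines:
        match = ABORT_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Two lines copied from the first post, used here as sample input.
SAMPLE = [
    "May 28 09:21:55 BigBoi kernel: sd 0:0:10:0: attempting task abort!"
    "scmd(0x00000000fcba39fa), outstanding for 30463 ms & timeout 30000 ms",
    "May 28 09:21:55 BigBoi kernel: sd 0:0:7:0: Power-on or device reset occurred",
]

if __name__ == "__main__":
    for host, n in sorted(count_aborts_by_host(SAMPLE).items()):
        print(f"host{host}: {n} task abort(s)")
```

Feeding it the full syslog from the diagnostics zip (for example via `open(path)`) should show whether any aborts at all land on the second controller.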
June 1 Community Expert
15 hours ago, Skipdog said: "I'm wondering would BIOS/Firmware update be worth it ?"
It's definitely worth a try.
June 1 Author
@JorgeB I could be wrong, but for the 9305-16e it appears the latest stable firmware is:

IT_Nexus mode - fw: 16.00.11.00, nvdata: 10.00.91.xx: Channel_9305-16e_IT_Nexus.bin
Abort Task Set - fw: 16.00.11.00, nvdata: 10.00.92.xx: Channel_9305-16e_ATS.bin

I don't believe there is anything newer, but I definitely could be wrong.
Skip
Edited June 1 by Skipdog
June 1 Community Expert
Broadcom's site has 16.00.12.00:
https://www.broadcom.com/support/download-search?pg=Storage+Adapters,+Controllers,+and+ICs&pf=Legacy+Host+Bus+Adapters&pn=SAS+9305-16e+Host+Bus+Adapter&pa=&po=&dk=&pl=&l=true
June 1 Author
@jynxsee I wasn't following the answer "Same firmware". Same firmware as what? Are you running 16.00.11.00 or 16.00.12.00?
@JorgeB I'm very hesitant to even try the firmware upgrade, as it is the only card I have to drive the external enclosure and these cards have doubled in price now. I definitely don't want to render my system unusable. I would love to find out if anyone else having this issue is already on 16.00.12.
June 1 Author
@JorgeB Before taking the plunge: the AI assistant is recommending appending "pcie_aspm=off" to the boot statement. Do you think this is worth a try before flashing?
June 1 Community Expert
I've not seen that help with a similar issue, but it won't hurt, so it's worth a try.
June 1 Author
OK, some new data:

Upgraded the firmware to 16.00.12.00 and rebooted into the same 7.3.0 version to make sure everything worked (it did).
Upgraded to 7.3.1, rebooted, and confirmed the timeouts/resets are still happening (they are).
Tried adding pcie_aspm=off to the boot statement. It did not fix the problem.

The AI wants a couple more tests on the boot statement, like:

/bzimage initrd=/bzroot pcie_aspm=off pci=noaer
nomodeset pci=noaer pcie_aspm=off
pcie_port_pm=off pci=noaer

At this point I think I will revert to 7.3.0 and wait to see what can be done. Otherwise I will need to change out the card to move to newer UNRAID versions.
Skip
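
When testing boot parameters like these, it helps to confirm after the reboot that the kernel actually received them, since a misspelled name (for example pci_aspm instead of pcie_aspm) typically does not stop the boot and is easy to miss. A small Python sketch, assuming a Linux system where /proc/cmdline is readable; the helper name `missing_params` and the `WANTED` list are made up for this example.

```python
from pathlib import Path


def missing_params(cmdline: str, wanted: list[str]) -> list[str]:
    """Return the entries of `wanted` that are not exact tokens in `cmdline`."""
    present = set(cmdline.split())
    return [param for param in wanted if param not in present]


# Parameters from one of the suggested boot-statement variants.
WANTED = ["pcie_aspm=off", "pci=noaer"]

if __name__ == "__main__":
    proc = Path("/proc/cmdline")
    if proc.exists():
        # /proc/cmdline holds the command line the running kernel booted with.
        missing = missing_params(proc.read_text(), WANTED)
        print("missing:", missing if missing else "none")
    else:
        print("/proc/cmdline not available on this system")
```

An empty result only shows the parameters reached the kernel; it says nothing about whether they affect the timeouts.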
June 2 Community Expert (Solution)
I doubt any kernel parameters will help; most likely you will need to wait for a kernel fix.
June 2 Author
@JorgeB What is the best way forward? Is there an avenue to log the bug report (regression) with Slackware, etc.?
June 3 Community Expert
You can create a Linux kernel bug report, or just wait for a newer release. Since those controllers are widely used, I would think a fix will be available soon.
June 3 Community Expert
BTW, just found a user with a 9305-16i running Unraid without this issue, so it's not a general problem:

08:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS3224 PCI-Express Fusion-MPT SAS-3 [1000:00c4] (rev 01)
Subsystem: Broadcom / LSI SAS9305-16i [1000:3190]
Kernel driver in use: mpt3sas
Kernel modules: mpt3sas

Jun 1 22:06:14 DL380G9 kernel: mpt3sas_cm0: FW Package Ver(16.00.12.00)
Jun 1 22:06:46 DL380G9 emhttpd: Unraid(tm) System Management Utility version 7.3.1
June 3 Author
@JorgeB Thanks for the input. That being the case, there must be something different about the 9305-16e (my card) compared with the internal variant: SATA timings, expander negotiation, etc. I'll stay on the working version for now!
June 4
I am having the same issue with my 9305-16e: 7.3.1 can't finish booting and gets stuck in the loop described by Skipdog.

Configuration:
[1000:00c9] 01:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 (rev 01)
FWVersion(16.00.11.00), ChipRevision(0x01)
QNAP TL-D1600S enclosure with 8 drives
June 4 Author
@paolobosco Good to know my scenario isn't the result of some strange configuration on my side. I can confirm that going to 16.00.12.00 does not fix this problem, so you can skip that!
@JorgeB I can mark your response (wait for a potential fix via a kernel update) as the solution.
June 24 Author
@Kboogie Thanks for the detailed data point. It looks identical to my issue. I did look at another post where a user replaced his card with a 94xx series, but those are expensive and I quickly ditched that idea. I hope they are able to fix the regression. 7.3.0 works great.