pro_con Posted May 4, 2021 (edited)
Hi, I'm setting up a new Unraid instance hosted in an HP Z820 workstation. I have drives connected to the internal Intel disk controller, which appear successfully, and drives on an internal LSI SAS2308 on the motherboard, which also appear correctly. I also have disks attached to an LSI SAS9201-16e (SAS2116), and those drives don't appear in Unraid at all. The controller itself appears in Unraid's system devices assigned to an IOMMU group, and I can see where in the logs it's trying to initialize the controller, but without success: Quote May 4 15:48:11 SINGULARITY kernel: mpt2sas_cm1: CurrentHostPageSize is 0: Setting default host page size to 4k May 4 15:48:11 SINGULARITY kernel: mpt2sas_cm1: MSI-X vectors supported: 1 May 4 15:48:11 SINGULARITY kernel: no of cores: 24, max_msix_vectors: -1 May 4 15:48:11 SINGULARITY kernel: mpt2sas_cm1: 0 1 May 4 15:48:11 SINGULARITY kernel: mpt2sas_cm1: High IOPs queues : disabled May 4 15:48:11 SINGULARITY kernel: mpt2sas1-msix0: PCI-MSI-X enabled: IRQ 109 May 4 15:48:11 SINGULARITY kernel: mpt2sas_cm1: iomem(0x00000000db39c000), mapped(0x0000000089367df4), size(16384) May 4 15:48:11 SINGULARITY kernel: mpt2sas_cm1: ioport(0x0000000000008000), size(256) May 4 15:48:11 SINGULARITY kernel: mpt2sas_cm1: CurrentHostPageSize is 0: Setting default host page size to 4k May 4 15:48:11 SINGULARITY kernel: mpt2sas_cm1: scatter gather: sge_in_main_msg(1), sge_per_chain(9), sge_per_io(128), chains_per_io(15) May 4 15:48:11 SINGULARITY kernel: ------------[ cut here ]------------ May 4 15:48:11 SINGULARITY kernel: WARNING: CPU: 6 PID: 236 at mm/page_alloc.c:4935 __alloc_pages_nodemask+0x3a/0x1fc May 4 15:48:11 SINGULARITY kernel: Modules linked in: acpi_cpufreq(-) sb_edac wmi_bmof x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd i2c_i801 igb glue_helper i2c_algo_bit i2c_smbus rapl isci(+)
intel_cstate mpt3sas(+) i2c_core ahci intel_uncore libahci libsas e1000e raid_class scsi_transport_sas mlx4_core(+) wmi button May 4 15:48:11 SINGULARITY kernel: CPU: 6 PID: 236 Comm: kworker/6:1 Not tainted 5.10.28-Unraid #1 May 4 15:48:11 SINGULARITY kernel: Hardware name: Hewlett-Packard HP Z820 Workstation/158B, BIOS J63 v03.90 06/01/2016 May 4 15:48:11 SINGULARITY kernel: Workqueue: events work_for_cpu_fn May 4 15:48:11 SINGULARITY kernel: RIP: 0010:__alloc_pages_nodemask+0x3a/0x1fc May 4 15:48:11 SINGULARITY kernel: Code: 00 00 48 83 ec 30 65 48 8b 04 25 28 00 00 00 48 89 44 24 28 31 c0 48 89 e7 83 fe 0a f3 ab 76 11 0f ba e5 0d 0f 82 9a 01 00 00 <0f> 0b e9 93 01 00 00 b8 22 01 32 01 48 63 d2 41 89 f5 48 89 5c 24 May 4 15:48:11 SINGULARITY kernel: RSP: 0018:ffffc90006b37cc0 EFLAGS: 00010202 May 4 15:48:11 SINGULARITY kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 May 4 15:48:11 SINGULARITY kernel: RDX: 0000000000000001 RSI: 000000000000000b RDI: ffffc90006b37ce8 May 4 15:48:11 SINGULARITY kernel: RBP: 0000000000000cc0 R08: ffffffff824c4d58 R09: ffff8881000604f0 May 4 15:48:11 SINGULARITY kernel: R10: 0000000000000044 R11: 0000000000000228 R12: 0000000000428000 May 4 15:48:11 SINGULARITY kernel: R13: ffff888843ab50b8 R14: 000000000000000b R15: 0000000000000cc0 May 4 15:48:11 SINGULARITY kernel: FS: 0000000000000000(0000) GS:ffff88901fa00000(0000) knlGS:0000000000000000 May 4 15:48:11 SINGULARITY kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 4 15:48:11 SINGULARITY kernel: CR2: 0000000000656160 CR3: 000000000500a002 CR4: 00000000001706e0 May 4 15:48:11 SINGULARITY kernel: Call Trace: May 4 15:48:11 SINGULARITY kernel: intel_alloc_coherent+0x4f/0xf7 May 4 15:48:11 SINGULARITY kernel: dma_pool_alloc+0xac/0x163 May 4 15:48:11 SINGULARITY kernel: base_alloc_rdpq_dma_pool+0xe0/0x179 [mpt3sas] May 4 15:48:11 SINGULARITY kernel: mpt3sas_base_attach+0x648/0x17c6 [mpt3sas] May 4 15:48:11 SINGULARITY kernel: 
_scsih_probe+0x76e/0x86b [mpt3sas] May 4 15:48:11 SINGULARITY kernel: local_pci_probe+0x3c/0x7a May 4 15:48:11 SINGULARITY kernel: work_for_cpu_fn+0x11/0x17 May 4 15:48:11 SINGULARITY kernel: process_one_work+0x13c/0x1d5 May 4 15:48:11 SINGULARITY kernel: process_scheduled_works+0x22/0x27 May 4 15:48:11 SINGULARITY kernel: worker_thread+0x1a8/0x22f May 4 15:48:11 SINGULARITY kernel: ? process_scheduled_works+0x27/0x27 May 4 15:48:11 SINGULARITY kernel: kthread+0xe5/0xea May 4 15:48:11 SINGULARITY kernel: ? __kthread_bind_mask+0x57/0x57 May 4 15:48:11 SINGULARITY kernel: ret_from_fork+0x22/0x30 May 4 15:48:11 SINGULARITY kernel: ---[ end trace e43e97802c78afcf ]--- May 4 15:48:11 SINGULARITY kernel: mpt2sas_cm1: failure at drivers/scsi/mpt3sas/mpt3sas_scsih.c:11045/_scsih_probe()!
I've confirmed using sas2flsh that both HBAs are running firmware version 20.X in IT mode - the 2308 has firmware and BIOS, and the 2116 is firmware only. Everything I've found indicates that the 2116 doesn't need the BIOS to work correctly as an HBA in Unraid, though. I have tried attaching the 2116 to multiple PCIe ports with no change.
I've found posts by other people saying that this is an issue with Linux kernels after 5.8 and that it is fixed by max_queue_depth=10000, but my system already has a copy of /etc/modprobe.d/mpt3sas.conf with the correct content, and I've also tried adding the parameter directly to syslinux.cfg, with no success. I would prefer not to roll back to a previous version since I plan to use the Nvidia driver integration introduced in 6.9, but if that's the next thing that I have to do for testing, I'm willing to.
All of the other posts about this issue that don't refer to max_queue_depth seem to be for non-LSI HBAs, so I'm really not sure where to go next - any info or assistance would be greatly appreciated. Thank you!
Edited May 6, 2021 by pro_con (resolving thread)
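For anyone retracing these steps, here is a minimal Python sketch for double-checking that the max_queue_depth setting is really present in either of the two places mentioned above. It assumes the standard modprobe.d `options <module> <param>=<value>` syntax and the `mpt3sas.max_queue_depth=` command-line form; the function names are hypothetical, and /proc/cmdline is simply where Linux exposes the booted kernel command line. It only confirms that the text is there, not that the driver honoured it.

```python
import re
from pathlib import Path


def queue_depth_from_cmdline(cmdline):
    """Return mpt3sas.max_queue_depth from a kernel command line, or None."""
    m = re.search(r"(?:^|\s)mpt3sas\.max_queue_depth=(\d+)(?=\s|$)", cmdline)
    return int(m.group(1)) if m else None


def queue_depth_from_modprobe(conf_text):
    """Return max_queue_depth from an 'options mpt3sas ...' line, or None."""
    for line in conf_text.splitlines():
        # Simplification: treat everything after '#' as a comment.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 3 and fields[0] == "options" and fields[1] == "mpt3sas":
            for opt in fields[2:]:
                key, _, value = opt.partition("=")
                if key == "max_queue_depth" and value.isdigit():
                    return int(value)
    return None


if __name__ == "__main__":
    # Paths as named in the post; either may be absent on a given system.
    sources = (
        (Path("/proc/cmdline"), queue_depth_from_cmdline),
        (Path("/etc/modprobe.d/mpt3sas.conf"), queue_depth_from_modprobe),
    )
    for path, parse in sources:
        if path.exists():
            print(path, "->", parse(path.read_text()))
        else:
            print(path, "-> not found")
```

If both report None on a booted system, the parameter never reached the kernel, which would be worth ruling out before suspecting the card.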
JorgeB Posted May 5, 2021
If it's an option, try that controller in a different server/board just to see if it initializes correctly.
pro_con Posted May 5, 2021
Unfortunately, I don't have access to another suitable chassis to try as a host. I did find a spare flash drive and loaded up a copy of Unraid 6.8.3, with kernel version 4.19.107. It still doesn't work there, but I get a different failure after initialization in the logs: Quote May 5 12:10:34 Tower kernel: mpt2sas_cm1: CurrentHostPageSize is 0: Setting default host page size to 4k May 5 12:10:34 Tower kernel: mpt2sas_cm1: MSI-X vectors supported: 1, no of cores: 24, max_msix_vectors: -1 May 5 12:10:34 Tower kernel: mpt2sas1-msix0: PCI-MSI-X enabled: IRQ 69 May 5 12:10:34 Tower kernel: mpt2sas_cm1: iomem(0x00000000db39c000), mapped(0x00000000fa8615ef), size(16384) May 5 12:10:34 Tower kernel: mpt2sas_cm1: ioport(0x0000000000008000), size(256) May 5 12:10:34 Tower kernel: mpt2sas_cm1: CurrentHostPageSize is 0: Setting default host page size to 4k May 5 12:10:34 Tower kernel: WARNING: CPU: 6 PID: 879 at mm/page_alloc.c:4385 __alloc_pages_nodemask+0x46/0xae1 May 5 12:10:34 Tower kernel: Modules linked in: acpi_cpufreq(-) sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel wmi_bmof kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd mpt3sas(+) mlx4_core(+) glue_helper isci(+) libsas ahci i2c_i801 intel_cstate i2c_core igb(O) intel_uncore e1000e raid_class intel_rapl_perf scsi_transport_sas pcc_cpufreq libahci wmi button May 5 12:10:34 Tower kernel: CPU: 6 PID: 879 Comm: kworker/6:1 Tainted: G O 4.19.107-Unraid #1 May 5 12:10:34 Tower kernel: Hardware name: Hewlett-Packard HP Z820 Workstation/158B, BIOS J63 v03.90 06/01/2016 May 5 12:10:34 Tower kernel: Workqueue: events work_for_cpu_fn May 5 12:10:34 Tower kernel: RIP: 0010:__alloc_pages_nodemask+0x46/0xae1 May 5 12:10:34 Tower kernel: Code: 65 48 8b 04 25 28 00 00 00 48 89 84 24 b8 00 00 00 31 c0 48 8d 7c 24 58 83 fe 0a f3 ab 76 12 41 0f ba e0 09 0f 82 6b 0a 00 00 <0f> 0b e9 64
0a 00 00 b8 22 01 32 01 48 63 d2 41 89 f4 48 89 5c 24 May 5 12:10:34 Tower kernel: RSP: 0018:ffffc90007f3fbf8 EFLAGS: 00010202 May 5 12:10:34 Tower kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 May 5 12:10:34 Tower kernel: RDX: 0000000000000001 RSI: 000000000000000b RDI: ffffc90007f3fc78 May 5 12:10:34 Tower kernel: RBP: 00000000006000c0 R08: 00000000006000c0 R09: ffff88901c60fc38 May 5 12:10:34 Tower kernel: R10: ffff8890024da000 R11: ffffea003ffa0000 R12: 00000000006000c0 May 5 12:10:34 Tower kernel: R13: ffff88901bb940a8 R14: 0000000000402000 R15: 000000000000000b May 5 12:10:34 Tower kernel: FS: 0000000000000000(0000) GS:ffff88901f800000(0000) knlGS:0000000000000000 May 5 12:10:34 Tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 5 12:10:34 Tower kernel: CR2: 0000154e155e7b10 CR3: 0000000004e0a001 CR4: 00000000001606e0 May 5 12:10:34 Tower kernel: Call Trace: May 5 12:10:34 Tower kernel: ? __domain_mapping+0x150/0x2dd May 5 12:10:34 Tower kernel: ? __intel_map_single+0xc8/0x122 May 5 12:10:34 Tower kernel: intel_alloc_coherent+0x75/0xe2 May 5 12:10:34 Tower kernel: dma_pool_alloc+0x11f/0x1ea May 5 12:10:34 Tower kernel: mpt3sas_base_attach+0x1200/0x1ad0 [mpt3sas] May 5 12:10:34 Tower kernel: _scsih_probe+0x6a2/0x7bc [mpt3sas] May 5 12:10:34 Tower kernel: local_pci_probe+0x39/0x7a May 5 12:10:34 Tower kernel: work_for_cpu_fn+0x11/0x17 May 5 12:10:34 Tower kernel: process_one_work+0x16e/0x24f May 5 12:10:34 Tower kernel: process_scheduled_works+0x22/0x27 May 5 12:10:34 Tower kernel: worker_thread+0x1ff/0x2b8 May 5 12:10:34 Tower kernel: ? rescuer_thread+0x2a7/0x2a7 May 5 12:10:34 Tower kernel: kthread+0x10c/0x114 May 5 12:10:34 Tower kernel: ? 
kthread_park+0x89/0x89 May 5 12:10:34 Tower kernel: ret_from_fork+0x35/0x40 May 5 12:10:34 Tower kernel: ---[ end trace 11a417b55e91be44 ]--- May 5 12:10:34 Tower kernel: mpt2sas_cm1: reply pool: dma_pool_alloc failed As an interesting experiment, I added mpt3sas.max_queue_depth=10000 to my syslinux.cfg for 6.8.3, at which point the controller actually appears to initialize successfully without the crash and stack trace, but the disks connected to it still aren't visible: Quote May 5 12:23:53 Tower kernel: mpt2sas_cm1: CurrentHostPageSize is 0: Setting default host page size to 4k May 5 12:23:53 Tower kernel: mpt2sas_cm1: MSI-X vectors supported: 1, no of cores: 24, max_msix_vectors: -1 May 5 12:23:53 Tower kernel: mpt2sas1-msix0: PCI-MSI-X enabled: IRQ 69 May 5 12:23:53 Tower kernel: mpt2sas_cm1: iomem(0x00000000db39c000), mapped(0x000000005985eb2a), size(16384) May 5 12:23:53 Tower kernel: mpt2sas_cm1: ioport(0x0000000000008000), size(256) May 5 12:23:53 Tower kernel: mpt2sas_cm1: CurrentHostPageSize is 0: Setting default host page size to 4k May 5 12:23:53 Tower kernel: mpt2sas_cm1: Allocated physical memory: size(6776 kB) May 5 12:23:53 Tower kernel: mpt2sas_cm1: Current Controller Queue Depth(9997),Max Controller Queue Depth(29251) May 5 12:23:53 Tower kernel: mpt2sas_cm1: Scatter Gather Elements per IO(128) May 5 12:23:53 Tower kernel: mpt2sas_cm1: LSISAS2116: FWVersion(20.00.11.00), ChipRevision(0x02), BiosVersion(07.39.02.00) May 5 12:23:53 Tower kernel: mpt2sas_cm1: Protocol=( May 5 12:23:53 Tower kernel: Initiator May 5 12:23:53 Tower kernel: ,Target May 5 12:23:53 Tower kernel: ), May 5 12:23:53 Tower kernel: Capabilities=( May 5 12:23:53 Tower kernel: TLR May 5 12:23:53 Tower kernel: ,EEDP May 5 12:23:53 Tower kernel: ,Snapshot Buffer May 5 12:23:53 Tower kernel: ,Diag Trace Buffer May 5 12:23:53 Tower kernel: ,Task Set Full May 5 12:23:53 Tower kernel: ,NCQ May 5 12:23:53 Tower kernel: ) May 5 12:23:53 Tower kernel: scsi host9: Fusion MPT SAS Host May 
5 12:23:53 Tower kernel: mpt2sas_cm1: sending port enable !! May 5 12:23:53 Tower kernel: sas: phy-7:2 added to port-7:0, phy_mask:0x4 (500110ae0888c0f4) May 5 12:23:53 Tower kernel: sas: DOING DISCOVERY on port 0, pid:1348 May 5 12:23:53 Tower kernel: sas: Enter sas_scsi_recover_host busy: 0 failed: 0 May 5 12:23:53 Tower kernel: sas: ata7: end_device-7:0: dev error handler May 5 12:23:53 Tower kernel: ata7.00: supports DRM functions and may not be fully accessible May 5 12:23:53 Tower kernel: ata7.00: ATA-9: WDC WD120EDAZ-11F3RA0, 5PJHJRXC, 81.00A81, max UDMA/133 May 5 12:23:53 Tower kernel: ata7.00: 23437770752 sectors, multi 0: LBA48 NCQ (depth 32) May 5 12:23:53 Tower kernel: ata7.00: supports DRM functions and may not be fully accessible May 5 12:23:53 Tower kernel: ata7.00: configured for UDMA/133 May 5 12:23:53 Tower kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1 May 5 12:23:53 Tower kernel: scsi 7:0:0:0: Direct-Access ATA WDC WD120EDAZ-11 0A81 PQ: 0 ANSI: 5 May 5 12:23:53 Tower kernel: sd 7:0:0:0: Attached scsi generic sg5 type 0 May 5 12:23:53 Tower kernel: sd 7:0:0:0: [sdf] 23437770752 512-byte logical blocks: (12.0 TB/10.9 TiB) May 5 12:23:53 Tower kernel: sas: DONE DISCOVERY on port 0, pid:1348, result:0 May 5 12:23:53 Tower kernel: sd 7:0:0:0: [sdf] 4096-byte physical blocks May 5 12:23:53 Tower kernel: sas: phy-7:3 added to port-7:1, phy_mask:0x8 (500110ae0888c0f5) May 5 12:23:53 Tower kernel: sd 7:0:0:0: [sdf] Write Protect is off May 5 12:23:53 Tower kernel: sas: DOING DISCOVERY on port 1, pid:1348 May 5 12:23:53 Tower kernel: sd 7:0:0:0: [sdf] Mode Sense: 00 3a 00 00 May 5 12:23:53 Tower kernel: sd 7:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA May 5 12:23:53 Tower kernel: sas: Enter sas_scsi_recover_host busy: 0 failed: 0 May 5 12:23:53 Tower kernel: sas: ata7: end_device-7:0: dev error handler May 5 12:23:53 Tower kernel: sas: ata8: end_device-7:1: dev error handler May 5 12:23:53 
Tower kernel: mpt2sas_cm1: host_add: handle(0x0001), sas_addr(0x5000d31000181fa2), phys(16) May 5 12:23:53 Tower kernel: mpt2sas_cm1: port enable: SUCCESS May 5 12:23:53 Tower kernel: ata8.00: supports DRM functions and may not be fully accessible May 5 12:23:53 Tower kernel: ata8.00: ATA-9: WDC WD120EDAZ-11F3RA0, 5PJDUKGE, 81.00A81, max UDMA/133 May 5 12:23:53 Tower kernel: ata8.00: 23437770752 sectors, multi 0: LBA48 NCQ (depth 32) May 5 12:23:53 Tower kernel: ata8.00: supports DRM functions and may not be fully accessible May 5 12:23:53 Tower kernel: ata8.00: configured for UDMA/133 May 5 12:23:53 Tower kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1 May 5 12:23:53 Tower kernel: scsi 7:0:1:0: Direct-Access ATA WDC WD120EDAZ-11 0A81 PQ: 0 ANSI: 5 May 5 12:23:53 Tower kernel: sd 7:0:1:0: [sdg] 23437770752 512-byte logical blocks: (12.0 TB/10.9 TiB) May 5 12:23:53 Tower kernel: sd 7:0:1:0: Attached scsi generic sg6 type 0 May 5 12:23:53 Tower kernel: sd 7:0:1:0: [sdg] 4096-byte physical blocks May 5 12:23:53 Tower kernel: sas: DONE DISCOVERY on port 1, pid:1348, result:0 May 5 12:23:53 Tower kernel: sd 7:0:1:0: [sdg] Write Protect is off May 5 12:23:53 Tower kernel: sd 7:0:1:0: [sdg] Mode Sense: 00 3a 00 00 May 5 12:23:53 Tower kernel: sd 7:0:1:0: [sdg] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA May 5 12:23:53 Tower kernel: sdf: sdf1 May 5 12:23:53 Tower kernel: sd 7:0:0:0: [sdf] Attached SCSI disk May 5 12:23:53 Tower kernel: sdg: sdg1 May 5 12:23:53 Tower kernel: sd 7:0:1:0: [sdg] Attached SCSI disk I'm actually pretty confused by that output - the two disks that it initializes in the bottom of the block, sdf and sdg, after the "May 5 12:23:53 Tower kernel: mpt2sas_cm1: sending port enable !!" row are both attached to the internal Intel controller, not the external LSI one. 
Below I'm including the same block from the SAS2308, which is working, for comparison: Quote May 5 12:23:53 Tower kernel: mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k May 5 12:23:53 Tower kernel: mpt2sas_cm0: sending message unit reset !! May 5 12:23:53 Tower kernel: mpt2sas_cm0: message unit reset: SUCCESS May 5 12:23:53 Tower kernel: e1000e 0000:01:00.0 0000:01:00.0 (uninitialized): registered PHC clock May 5 12:23:53 Tower kernel: ata1: SATA link down (SStatus 0 SControl 300) May 5 12:23:53 Tower kernel: e1000e 0000:01:00.0 eth5: (PCI Express:2.5GT/s:Width x1) 40:a8:f0:c9:81:2a May 5 12:23:53 Tower kernel: e1000e 0000:01:00.0 eth5: Intel(R) PRO/1000 Network Connection May 5 12:23:53 Tower kernel: e1000e 0000:01:00.0 eth5: MAC: 3, PHY: 8, PBA No: FFFFFF-0FF May 5 12:23:53 Tower kernel: mpt2sas_cm0: Allocated physical memory: size(5974 kB) May 5 12:23:53 Tower kernel: mpt2sas_cm0: Current Controller Queue Depth(8056),Max Controller Queue Depth(8192) May 5 12:23:53 Tower kernel: mpt2sas_cm0: Scatter Gather Elements per IO(128) May 5 12:23:53 Tower kernel: mpt2sas_cm0: overriding NVDATA EEDPTagMode setting May 5 12:23:53 Tower kernel: mpt2sas_cm0: LSISAS2308: FWVersion(20.00.07.00), ChipRevision(0x05), BiosVersion(07.39.02.00) May 5 12:23:53 Tower kernel: mpt2sas_cm0: Protocol=( May 5 12:23:53 Tower kernel: Initiator May 5 12:23:53 Tower kernel: ,Target May 5 12:23:53 Tower kernel: ), May 5 12:23:53 Tower kernel: Capabilities=( May 5 12:23:53 Tower kernel: TLR May 5 12:23:53 Tower kernel: ,EEDP May 5 12:23:53 Tower kernel: ,Snapshot Buffer May 5 12:23:53 Tower kernel: ,Diag Trace Buffer May 5 12:23:53 Tower kernel: ,Task Set Full May 5 12:23:53 Tower kernel: ,NCQ May 5 12:23:53 Tower kernel: ) May 5 12:23:53 Tower kernel: scsi host8: Fusion MPT SAS Host May 5 12:23:53 Tower kernel: mpt2sas_cm0: sending port enable !! 
May 5 12:23:53 Tower kernel: mpt2sas_cm0: host_add: handle(0x0001), sas_addr(0x50030480301821cd), phys(8) May 5 12:23:53 Tower kernel: mpt2sas_cm0: port enable: SUCCESS May 5 12:23:53 Tower kernel: scsi 8:0:0:0: Direct-Access ATA CT1000MX500SSD1 033 PQ: 0 ANSI: 6 May 5 12:23:53 Tower kernel: scsi 8:0:0:0: SATA: handle(0x000c), sas_addr(0x4433221103000000), phy(3), device_name(0x500a0751e4eabce5) May 5 12:23:53 Tower kernel: scsi 8:0:0:0: enclosure logical id (0x50030480301821cd), slot(0) May 5 12:23:53 Tower kernel: scsi 8:0:0:0: atapi(n), ncq(y), asyn_notify(n), smart(y), fua(y), sw_preserve(y) May 5 12:23:53 Tower kernel: sd 8:0:0:0: Attached scsi generic sg1 type 0 May 5 12:23:53 Tower kernel: sd 8:0:0:0: [sdb] 1953525168 512-byte logical blocks: (1.00 TB/932 GiB) May 5 12:23:53 Tower kernel: sd 8:0:0:0: [sdb] 4096-byte physical blocks May 5 12:23:53 Tower kernel: scsi 8:0:1:0: Direct-Access ATA CT1000MX500SSD1 033 PQ: 0 ANSI: 6 May 5 12:23:53 Tower kernel: sd 8:0:0:0: [sdb] Write Protect is off May 5 12:23:53 Tower kernel: scsi 8:0:1:0: SATA: handle(0x000b), sas_addr(0x4433221102000000), phy(2), device_name(0x500a0751e4eabce8) May 5 12:23:53 Tower kernel: sd 8:0:0:0: [sdb] Mode Sense: 7f 00 10 08 May 5 12:23:53 Tower kernel: scsi 8:0:1:0: enclosure logical id (0x50030480301821cd), slot(1) May 5 12:23:53 Tower kernel: scsi 8:0:1:0: atapi(n), ncq(y), asyn_notify(n), smart(y), fua(y), sw_preserve(y) May 5 12:23:53 Tower kernel: sd 8:0:0:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA May 5 12:23:53 Tower kernel: sd 8:0:1:0: Attached scsi generic sg2 type 0 May 5 12:23:53 Tower kernel: sd 8:0:1:0: [sdc] 1953525168 512-byte logical blocks: (1.00 TB/932 GiB) May 5 12:23:53 Tower kernel: sd 8:0:1:0: [sdc] 4096-byte physical blocks May 5 12:23:53 Tower kernel: sd 8:0:1:0: [sdc] Write Protect is off May 5 12:23:53 Tower kernel: scsi 8:0:2:0: Direct-Access ATA Samsung SSD 860 4B6Q PQ: 0 ANSI: 6 May 5 12:23:53 Tower kernel: sd 8:0:1:0: [sdc] Mode 
Sense: 7f 00 10 08 May 5 12:23:53 Tower kernel: scsi 8:0:2:0: SATA: handle(0x000a), sas_addr(0x4433221101000000), phy(1), device_name(0x5002538ec0b140ce) May 5 12:23:53 Tower kernel: sd 8:0:1:0: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA May 5 12:23:53 Tower kernel: scsi 8:0:2:0: enclosure logical id (0x50030480301821cd), slot(2) May 5 12:23:53 Tower kernel: scsi 8:0:2:0: atapi(n), ncq(y), asyn_notify(n), smart(y), fua(y), sw_preserve(y) May 5 12:23:53 Tower kernel: sd 8:0:0:0: [sdb] Attached SCSI disk May 5 12:23:53 Tower kernel: sd 8:0:2:0: Attached scsi generic sg3 type 0 May 5 12:23:53 Tower kernel: sd 8:0:1:0: [sdc] Attached SCSI disk May 5 12:23:53 Tower kernel: sd 8:0:2:0: [sdd] 1953525168 512-byte logical blocks: (1.00 TB/932 GiB) May 5 12:23:53 Tower kernel: sd 8:0:2:0: [sdd] Write Protect is off May 5 12:23:53 Tower kernel: scsi 8:0:3:0: Direct-Access ATA Samsung SSD 860 4B6Q PQ: 0 ANSI: 6 May 5 12:23:53 Tower kernel: sd 8:0:2:0: [sdd] Mode Sense: 7f 00 10 08 May 5 12:23:53 Tower kernel: scsi 8:0:3:0: SATA: handle(0x0009), sas_addr(0x4433221100000000), phy(0), device_name(0x5002538ec0b140cf) May 5 12:23:53 Tower kernel: sd 8:0:2:0: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA May 5 12:23:53 Tower kernel: scsi 8:0:3:0: enclosure logical id (0x50030480301821cd), slot(3) May 5 12:23:53 Tower kernel: scsi 8:0:3:0: atapi(n), ncq(y), asyn_notify(n), smart(y), fua(y), sw_preserve(y) May 5 12:23:53 Tower kernel: sd 8:0:3:0: Attached scsi generic sg4 type 0 May 5 12:23:53 Tower kernel: sd 8:0:3:0: [sde] 1953525168 512-byte logical blocks: (1.00 TB/932 GiB) May 5 12:23:53 Tower kernel: sd 8:0:2:0: [sdd] Attached SCSI disk May 5 12:23:53 Tower kernel: mpt3sas 0000:42:00.0: can't disable ASPM; OS doesn't have ASPM control May 5 12:23:53 Tower kernel: sd 8:0:3:0: [sde] Write Protect is off May 5 12:23:53 Tower kernel: mpt2sas_cm1: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (65963140 kB) May 5 12:23:53 Tower 
kernel: sd 8:0:3:0: [sde] Mode Sense: 7f 00 10 08 May 5 12:23:53 Tower kernel: sd 8:0:3:0: [sde] Write cache: enabled, read cache: enabled, supports DPO and FUA May 5 12:23:53 Tower kernel: sd 8:0:3:0: [sde] Attached SCSI disk
All four disks from this log excerpt (sdb through sde) are actually attached to the LSI2308. The key difference seems to me to be that for the 2308, the "sending port enable !!" message is immediately followed by "May 5 12:23:53 Tower kernel: mpt2sas_cm0: host_add: handle(0x0001), sas_addr(0x50030480301821cd), phys(8)" and "May 5 12:23:53 Tower kernel: mpt2sas_cm0: port enable: SUCCESS", but the 2116 doesn't have the success message. Again, all of the log excerpts in this post are from 6.8.3, the first with the out-of-box syslinux.cfg and the last two with syslinux.cfg modified for max_queue_depth=10000.
However, even with all of that info, I still don't really have a sense of what it means as far as the actual underlying failure. Does it just make it more likely that the problem is with the card itself? Thanks again for your assistance.
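Part of what makes the excerpts above hard to read is that messages from several drivers are interleaved in a single stream. As a rough aid, the Python sketch below splits a flattened excerpt on the syslog header and keeps only the lines tagged with an `mpt2sas_cmN` / `mpt3sas_cmN` prefix, grouped per controller. The header pattern is an assumption based on the "May 5 12:23:53 Tower kernel:" format shown in this thread, and `group_by_controller` is a hypothetical helper name. Lines without that tag (for example the `sas:`, `ata7.00:` and `sd 7:0:0:0:` messages) are deliberately ignored, so the result says nothing about which disks belong to which controller.

```python
import re
from collections import defaultdict

# Syslog header as it appears in the excerpts: "May 5 12:23:53 Tower kernel: "
HEADER = re.compile(r"[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} \S+ kernel: ")
# Driver instance tag used by mpt3sas for each controller, e.g. "mpt2sas_cm1: "
CONTROLLER = re.compile(r"^(mpt[23]sas_cm\d+): (.*)$")


def group_by_controller(log_text):
    """Return {controller tag: [messages in log order]} for tagged lines only."""
    groups = defaultdict(list)
    for chunk in HEADER.split(log_text):
        m = CONTROLLER.match(chunk.strip())
        if m:
            groups[m.group(1)].append(m.group(2))
    return dict(groups)


if __name__ == "__main__":
    sample = (
        "May 5 12:23:53 Tower kernel: mpt2sas_cm1: sending port enable !! "
        "May 5 12:23:53 Tower kernel: sas: DOING DISCOVERY on port 0, pid:1348 "
        "May 5 12:23:53 Tower kernel: mpt2sas_cm0: port enable: SUCCESS"
    )
    for controller, messages in group_by_controller(sample).items():
        print(controller)
        for message in messages:
            print("   ", message)
```

Reading each controller's messages in isolation makes it easier to compare the two initialization sequences side by side.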
JorgeB Posted May 6, 2021
10 hours ago, pro_con said: "I added mpt3sas.max_queue_depth=10000 to my syslinux.cfg for 6.8.3,"
That's not needed for v6.8.x.
10 hours ago, pro_con said: "LSISAS2116: FWVersion(20.00.11.00),"
Never seen this firmware before, could be a beta. The latest version on Avago's site is 20.00.07.00; try that one.
pro_con Posted May 6, 2021
Yep, on downgrading the firmware to 20.00.07.00 the disks are now showing up correctly in 6.9.2. Super weird, the .11 firmware is what it shipped from the reseller with. Regardless, I really appreciate your help, and I'll go ahead and mark resolved.
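Since the fix turned out to hinge on the firmware version, here is a small Python sketch that pulls the `FWVersion(...)` value out of the mpt3sas banner line for each controller and flags anything newer than a reference version. The default reference, 20.00.07.00, is just the version cited in this thread as the latest on Avago's site at the time; comparing the four fields numerically is an assumption about how these versions are ordered, and the function names are hypothetical.

```python
import re

# Banner format as seen in the logs above:
# "mpt2sas_cm1: LSISAS2116: FWVersion(20.00.11.00), ChipRevision(0x02), ..."
FW_LINE = re.compile(r"(mpt[23]sas_cm\d+): (\w+): FWVersion\((\d+(?:\.\d+){3})\)")


def parse_version(text):
    """Turn '20.00.11.00' into (20, 0, 11, 0) for field-wise comparison."""
    return tuple(int(part) for part in text.split("."))


def firmware_versions(log_text):
    """Map controller tag -> (chip name, version tuple)."""
    return {
        tag: (chip, parse_version(version))
        for tag, chip, version in FW_LINE.findall(log_text)
    }


def newer_than(log_text, reference="20.00.07.00"):
    """Return {controller tag: chip} for firmware newer than the reference."""
    ref = parse_version(reference)
    return {
        tag: chip
        for tag, (chip, version) in firmware_versions(log_text).items()
        if version > ref
    }
```

Run against the 6.8.3 excerpts in this thread, this would single out the SAS2116 at 20.00.11.00, while the SAS2308 at 20.00.07.00 matches the reference.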