I've had this problem occur twice now.
I get a fault on the 9211 card, and then all my devices start erroring out.
The error looks like this:
Mar 31 05:24:59 Tower kernel: mpt2sas_cm0: fault_state(0x2622)!
Mar 31 05:24:59 Tower kernel: mpt2sas_cm0: sending diag reset !!
Mar 31 05:25:00 Tower kernel: mpt2sas_cm0: diag reset: SUCCESS
Mar 31 05:25:00 Tower kernel: mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
Mar 31 05:25:00 Tower kernel: mpt2sas_cm0: LSISAS2008: FWVersion(20.00.07.00), ChipRevision(0x03), BiosVersion(00.00.00.00)
Mar 31 05:25:00 Tower kernel: mpt2sas_cm0: Protocol=(
Mar 31 05:25:00 Tower kernel: Initiator
Mar 31 05:25:00 Tower kernel: ,Target
Mar 31 05:25:00 Tower kernel: ),
Mar 31 05:25:00 Tower kernel: Capabilities=(
Mar 31 05:25:00 Tower kernel: TLR
Mar 31 05:25:00 Tower kernel: ,EEDP
Mar 31 05:25:00 Tower kernel: ,Snapshot Buffer
Mar 31 05:25:00 Tower kernel: ,Diag Trace Buffer
Mar 31 05:25:00 Tower kernel: ,Task Set Full
Mar 31 05:25:00 Tower kernel: ,NCQ
Mar 31 05:25:00 Tower kernel: )
Mar 31 05:25:00 Tower kernel: mpt2sas_cm0: sending port enable !!
After this occurs, all my devices have I/O errors:
Mar 31 05:26:20 Tower kernel: print_req_error: I/O error, dev sdr, sector 3264152816
Mar 31 05:26:20 Tower kernel: sd 7:0:4:0: [sdf] tag#2 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00
Mar 31 05:26:20 Tower kernel: sd 7:0:17:0: [sds] tag#50 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00
Mar 31 05:26:20 Tower kernel: sd 7:0:4:0: [sdf] tag#2 CDB: opcode=0x88 88 00 00 00 00 02 2f 8c 76 c8 00 00 00 08 00 00
Mar 31 05:26:20 Tower kernel: sd 7:0:17:0: [sds] tag#50 CDB: opcode=0x88 88 00 00 00 00 00 c2 8f 04 f0 00 00 00 08 00 00
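In case it helps anyone reading the I/O error lines, here is a minimal Python sketch for decoding the CDB and hostbyte fields. It assumes opcode 0x88 is a standard READ(16) command (LBA in bytes 2-9, transfer length in bytes 10-13) and uses the Linux SCSI host byte codes. The names decode_read16 and HOST_BYTES are made up for illustration and are not part of any tool.

```python
# Sketch only: decode the "CDB:" and "hostbyte=" fields from the kernel log.
# Assumes a standard 16-byte READ(16) CDB (opcode 0x88).

# Subset of Linux SCSI host byte codes (include/scsi/scsi_status.h).
HOST_BYTES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
}


def decode_read16(cdb_hex: str) -> dict:
    """Return the LBA and block count from a READ(16) CDB given as hex bytes."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a 16-byte READ(16) CDB")
    return {
        "lba": int.from_bytes(cdb[2:10], "big"),      # bytes 2-9: logical block address
        "blocks": int.from_bytes(cdb[10:14], "big"),  # bytes 10-13: transfer length
    }


if __name__ == "__main__":
    # CDB copied from the [sdf] line in the log above.
    info = decode_read16("88 00 00 00 00 02 2f 8c 76 c8 00 00 00 08 00 00")
    print(hex(info["lba"]), info["blocks"], HOST_BYTES.get(0x01, "unknown"))
```

For the [sdf] line this gives LBA 0x22f8c76c8 with a transfer length of 8 blocks, and hostbyte=0x01 maps to DID_NO_CONNECT, which seems consistent with the drives being unreachable after the controller reset rather than with media errors on individual disks.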
The motherboard is an X9SCL in a Supermicro SC846 (SAS2 backplane), with ECC RAM.
The 9211-8i card has the p20 firmware.
Has anyone had this occur?
I've attached the diagnostics as well:
tower-diagnostics-20190331-1416.zip