LSI SAS 3008-8i reset loop during preclear


Sorry if this is the wrong place for this. I'm a new UnRaid user and just "completed" my first UnRaid build.

So I built this server:

AMD Ryzen Threadripper 2920X 12-Core @ 3500 MHz

Gigabyte Technology Co., Ltd. X399 AORUS XTREME-CF

Nvidia Quadro P2000

2x LSI SAS 3008-8i HBAs for connectivity to the hot-swap bays

It ran well for a few days, until I started adding drives and running a preclear to test them. I recently noticed that 2 of the drives had stalled at about 90% of the pre-read. When I looked at the log, I saw this:

Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: port enable: SUCCESS
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: search for end-devices: start
Oct 2 21:17:43 FileServer kernel: scsi target12:0:2: handle(0x0009), sas_addr(0x4433221100000000)
Oct 2 21:17:43 FileServer kernel: scsi target12:0:2: enclosure logical id(0x500605b000648d80), slot(3)
Oct 2 21:17:43 FileServer kernel: scsi target12:0:3: handle(0x000a), sas_addr(0x4433221106000000)
Oct 2 21:17:43 FileServer kernel: scsi target12:0:3: enclosure logical id(0x500605b000648d80), slot(4)
Oct 2 21:17:43 FileServer kernel: scsi target12:0:4: handle(0x000b), sas_addr(0x4433221107000000)
Oct 2 21:17:43 FileServer kernel: scsi target12:0:4: enclosure logical id(0x500605b000648d80), slot(5)
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: search for end-devices: complete
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: search for end-devices: start
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: search for PCIe end-devices: complete
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: search for expanders: start
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: search for expanders: complete
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: _base_fault_reset_work: hard reset: success
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: removing unresponding devices: start
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: removing unresponding devices: end-devices
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: Removing unresponding devices: pcie end-devices
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: removing unresponding devices: expanders
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: removing unresponding devices: complete
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: scan devices: start
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: scan devices: expanders start
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: break from expander scan: ioc_status(0x0022), loginfo(0x310f0400)
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: scan devices: expanders complete
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: scan devices: end devices start
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: break from end device scan: ioc_status(0x0022), loginfo(0x310f0400)
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: scan devices: end devices complete
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: scan devices: pcie end devices start
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: log_info(0x3003011d): originator(IOP), code(0x03), sub_code(0x011d)
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: log_info(0x3003011d): originator(IOP), code(0x03), sub_code(0x011d)
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: break from pcie end device scan: ioc_status(0x0021), loginfo(0x3003011d)
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: pcie devices: pcie end devices complete
Oct 2 21:17:43 FileServer kernel: mpt3sas_cm1: scan devices: complete
Oct 2 21:17:43 FileServer kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Oct 2 21:17:44 FileServer kernel: sd 12:0:4:0: Power-on or device reset occurred
Oct 2 21:17:44 FileServer kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Oct 2 21:17:44 FileServer kernel: mpt3sas_cm1: fault_state(0x5862)!
Oct 2 21:17:44 FileServer kernel: mpt3sas_cm1: sending diag reset !!
Oct 2 21:17:45 FileServer kernel: mpt3sas_cm1: diag reset: SUCCESS
Oct 2 21:17:45 FileServer rc.diskinfo[6400]: SIGHUP ignored - already refreshing disk info.
Oct 2 21:17:45 FileServer kernel: mpt3sas_cm1: CurrentHostPageSize is 0: Setting default host page size to 4k
Oct 2 21:17:45 FileServer kernel: mpt3sas_cm1: _base_display_fwpkg_version: complete
Oct 2 21:17:45 FileServer kernel: mpt3sas_cm1: LSISAS3008: FWVersion(16.00.01.00), ChipRevision(0x02), BiosVersion(08.37.00.00)
Oct 2 21:17:45 FileServer kernel: mpt3sas_cm1: Protocol=(
Oct 2 21:17:45 FileServer kernel: Initiator
Oct 2 21:17:45 FileServer kernel: ,Target
Oct 2 21:17:45 FileServer kernel: ),
Oct 2 21:17:45 FileServer kernel: Capabilities=(
Oct 2 21:17:45 FileServer kernel: TLR
Oct 2 21:17:45 FileServer kernel: ,EEDP
Oct 2 21:17:45 FileServer kernel: ,Snapshot Buffer
Oct 2 21:17:45 FileServer kernel: ,Diag Trace Buffer
Oct 2 21:17:45 FileServer kernel: ,Task Set Full
Oct 2 21:17:45 FileServer kernel: ,NCQ
Oct 2 21:17:45 FileServer kernel: )
Oct 2 21:17:45 FileServer kernel: mpt3sas_cm1: sending port enable !!

This sequence of messages repeated over and over, which led me down the Google path to this Bug Link and this other Bug Link. I'm wondering what I should be doing here. Am I screwed? Is this something I can fix, or should I look into returning my HBAs? I'm hoping for a little guidance here...
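To put a number on "over and over", a rough sketch like the one below can count how many times each controller reported a fault before its diag reset. It assumes the syslog line format shown in the excerpt above; the function name `count_faults` and the sample list are made up for illustration and are not part of any Unraid tool.

```python
import re
from collections import Counter

# Matches lines like: "... kernel: mpt3sas_cm1: fault_state(0x5862)!"
# (format assumed from the log excerpt above)
FAULT_RE = re.compile(r"kernel: (mpt3sas_cm\d+): fault_state\((0x[0-9a-fA-F]+)\)")


def count_faults(lines):
    """Return a Counter keyed by (controller, fault_code)."""
    counts = Counter()
    for line in lines:
        match = FAULT_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample; in practice, read the lines from a saved syslog file.
    sample = [
        "Oct 2 21:17:44 FileServer kernel: mpt3sas_cm1: fault_state(0x5862)!",
        "Oct 2 21:17:44 FileServer kernel: mpt3sas_cm1: sending diag reset !!",
        "Oct 2 21:17:45 FileServer kernel: mpt3sas_cm1: diag reset: SUCCESS",
    ]
    for (controller, code), n in count_faults(sample).items():
        print(f"{controller} fault_state {code}: {n} occurrence(s)")
```

A count that keeps climbing on one controller (here `mpt3sas_cm1`) while the other stays at zero would at least narrow the problem down to one card, its cabling, or its slot.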

Edited by JT Marshall

Okay, so here are diagnostics that I took last night and again this morning. It would appear that shortly after I posted this, the SAS card stopped looping and the preclear progress picked back up. Maybe I'm just being overly cautious with the build, but I really didn't understand what was happening, and I would still like to understand what I was seeing. Another symptom is that the drives connected to that SAS controller are not showing up in the Unassigned Devices section like their predecessors. I know I'm throwing a lot of information out there and don't expect it all to be relevant.
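As a small step toward understanding the log, the `loginfo(...)` values can be split into fields. The sketch below assumes the bit layout that the driver's own decoded line implies (`log_info(0x3003011d): originator(IOP), code(0x03), sub_code(0x011d)`): top 4 bits bus type, next 4 bits originator, then an 8-bit code and a 16-bit sub-code. The originator names for values 1 and 2 (`PL`, `IR`) are my reading of the mpt3sas driver source and should be treated as an assumption; the function name `decode_loginfo` is made up for illustration. It only splits the fields and does not say what a given code means.

```python
# Originator names: 0 is confirmed by the kernel's own decoded line in the log;
# 1 and 2 are assumed from the mpt3sas driver source.
ORIGINATORS = {0: "IOP", 1: "PL", 2: "IR"}


def decode_loginfo(value):
    """Split a 32-bit mpt3sas loginfo value into its fields."""
    return {
        "bus_type": (value >> 28) & 0xF,
        "originator": ORIGINATORS.get((value >> 24) & 0xF, "unknown"),
        "code": (value >> 16) & 0xFF,
        "sub_code": value & 0xFFFF,
    }


if __name__ == "__main__":
    # The two loginfo values that appear in the log excerpt.
    for raw in (0x3003011D, 0x310F0400):
        info = decode_loginfo(raw)
        print(
            f"{raw:#010x}: originator={info['originator']} "
            f"code={info['code']:#04x} sub_code={info['sub_code']:#06x}"
        )
```

Checking the first value against the kernel's own decoding is a quick sanity test of the layout; the second value (`0x310f0400`) then comes out with a different originator than the first.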

bifrost-diagnostics-20191003-0558.zip bifrost-diagnostics-20191003-1319.zip
