Array read error with newly installed SAS card (LSI 9207-8i)


Neejrow


System specs: Ryzen 3900X, MSI B550M Mortar Wifi, unRAID 6.9.0-rc2, headless no GPU.

Hello, I installed a SAS card yesterday (an LSI 9207-8i, in my top PCIe 4.0 x16 slot) that was recommended to me by someone on the unRAID Discord server.

 

It seemed to be working fine, but this morning I got an email from my server saying that there was an array read error. I have never had one of these before. I checked, and it's the drive that I connected through the SAS card. It's not a new drive; it has been running in the server for many months, but I decided to connect it via the SAS card instead of directly to a SATA port on the motherboard, to test the card. The server booted up fine, there were no errors, and the drive was detected and running, so I assumed everything was fine until this morning.

 

I did not "flash" the SAS card or anything, as the eBay listing says it's already flashed and works with unRAID.

 

I have attached the system log below, and from reading it I can now see that the read errors occurred after the drive had been spun down (spin-down at 11:52:49, errors at 13:58). I have all my drives set to spin down after 4 hours of no use, because it's only a Plex server right now and the drives don't get used very often. (Is 4 hours too soon? They are all NAS drives and I am basically only doing it to save power. I would like some opinions on what a more optimal setup is.)

Feb 23 11:52:49 Tower emhttpd: spinning down /dev/sde
Feb 23 13:58:01 Tower kernel: sd 7:0:0:0: attempting task abort!scmd(0x00000000bec51a11), outstanding for 15197 ms & timeout 15000 ms
Feb 23 13:58:01 Tower kernel: sd 7:0:0:0: [sde] tag#3307 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e5 00
Feb 23 13:58:01 Tower kernel: scsi target7:0:0: handle(0x0009), sas_address(0x4433221107000000), phy(7)
Feb 23 13:58:01 Tower kernel: scsi target7:0:0: enclosure logical id(0x500605b009db1330), slot(4)
Feb 23 13:58:05 Tower kernel: sd 7:0:0:0: task abort: SUCCESS scmd(0x00000000bec51a11)
Feb 23 13:58:05 Tower kernel: sd 7:0:0:0: [sde] tag#3832 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=19s
Feb 23 13:58:05 Tower kernel: sd 7:0:0:0: [sde] tag#3832 Sense Key : 0x2 [current]
Feb 23 13:58:05 Tower kernel: sd 7:0:0:0: [sde] tag#3832 ASC=0x4 ASCQ=0x0
Feb 23 13:58:05 Tower kernel: sd 7:0:0:0: [sde] tag#3832 CDB: opcode=0x88 88 00 00 00 00 01 00 d2 c1 10 00 00 00 20 00 00
Feb 23 13:58:05 Tower kernel: blk_update_request: I/O error, dev sde, sector 4308779280 op 0x0:(READ) flags 0x0 phys_seg 4 prio class 0
Feb 23 13:58:05 Tower kernel: md: disk4 read error, sector=4308779216
Feb 23 13:58:05 Tower kernel: md: disk4 read error, sector=4308779224
Feb 23 13:58:05 Tower kernel: md: disk4 read error, sector=4308779232
Feb 23 13:58:05 Tower kernel: md: disk4 read error, sector=4308779240
Feb 23 13:58:05 Tower emhttpd: read SMART /dev/sde
Feb 23 13:58:16 Tower emhttpd: read SMART /dev/sdb
Feb 23 13:58:16 Tower emhttpd: read SMART /dev/sdc
Feb 23 13:58:25 Tower emhttpd: read SMART /dev/sdg
Feb 23 13:58:26 Tower kernel: sd 7:0:0:0: Power-on or device reset occurred
Feb 23 13:58:26 Tower emhttpd: read SMART /dev/sdd
Feb 23 13:59:01 Tower sSMTP[17393]: Creating SSL connection to host
Feb 23 13:59:03 Tower sSMTP[17393]: SSL connection using TLS_AES_256_GCM_SHA384
Feb 23 13:59:08 Tower sSMTP[17393]: Sent mail for [email protected] (221 2.0.0 closing connection
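
As a side note for anyone reading the log, the raw CDB and sense values can be decoded by hand. The following is a minimal Python sketch, not a general SCSI decoder: the lookup tables contain only the values that appear in this log, and the function names are made up for illustration. It shows that the command that timed out (opcode 0x85) is an ATA pass-through carrying CHECK POWER MODE, that the failed read (opcode 0x88) is a READ(16) for the same sector the kernel reports, and that the sense data means the drive was not ready.

```python
# Decode the two CDBs and the sense data from the log excerpt above.
# The lookup tables cover only the values seen in this log.

SENSE_KEYS = {0x2: "NOT READY"}
ASC_ASCQ = {(0x04, 0x00): "LOGICAL UNIT NOT READY, CAUSE NOT REPORTABLE"}
ATA_COMMANDS = {0xE5: "CHECK POWER MODE"}


def parse_cdb(hex_string):
    """Turn a string like '88 00 00 ...' into a bytes object."""
    return bytes.fromhex(hex_string.replace(" ", ""))


def decode_read16(cdb):
    """Return (lba, block_count) for a READ(16) CDB (opcode 0x88)."""
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a READ(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2-9: logical block address
    blocks = int.from_bytes(cdb[10:14], "big")  # bytes 10-13: transfer length
    return lba, blocks


def decode_ata_passthrough16(cdb):
    """Return the ATA command byte of an ATA PASS-THROUGH(16) CDB (opcode 0x85)."""
    if len(cdb) != 16 or cdb[0] != 0x85:
        raise ValueError("not an ATA PASS-THROUGH(16) CDB")
    return cdb[14]  # byte 14 carries the ATA command register


read_cdb = parse_cdb("88 00 00 00 00 01 00 d2 c1 10 00 00 00 20 00 00")
ata_cdb = parse_cdb("85 06 20 00 00 00 00 00 00 00 00 00 00 40 e5 00")

lba, blocks = decode_read16(read_cdb)
ata_command = decode_ata_passthrough16(ata_cdb)

print(f"READ(16): LBA {lba}, {blocks} blocks")
print(f"ATA pass-through command: {ATA_COMMANDS.get(ata_command, hex(ata_command))}")
print(f"Sense: {SENSE_KEYS[0x2]} / {ASC_ASCQ[(0x04, 0x00)]}")
```

The decoded LBA is 4308779280, the same sector as in the `blk_update_request` line, and the 32-block length matches the four `md` read errors spaced 8 sectors apart. The `md` sector numbers are 64 lower than the device sector, which would be consistent with the data partition starting at sector 64 (an inference from the numbers, not something checked against the diagnostics).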

 

 

That's about all I can think of to provide. I have uploaded the diagnostics log and also the SMART report of the drive that had the read errors. Please let me know if there is anything else I need to provide. 

 

Thanks

tower-diagnostics-20210223-1459.zip tower-smart-20210223-1454.zip

2 hours ago, Vr2Io said:

pls try update HBA firmware

Sorry, I am a little confused. When you say HBA, do you mean my SAS PCIe card? (Attached SCSI controller: Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 (rev 05))
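
One way to check which firmware the card is actually running, without rebooting into its BIOS screen, may be to read it from sysfs. This is a minimal Python sketch under an assumption: that the mpt2sas/mpt3sas driver used for SAS2308 cards exposes `board_name`, `version_fw` and `version_bios` under `/sys/class/scsi_host/hostN/`. The function name `read_hba_versions` is made up for this example.

```python
from pathlib import Path


def read_hba_versions(sysfs_root="/sys/class/scsi_host"):
    """Collect firmware-related attributes for each SCSI host that exposes them.

    Assumption: the mpt2sas/mpt3sas driver publishes 'board_name',
    'version_fw' and 'version_bios' under /sys/class/scsi_host/hostN/.
    Hosts without a 'version_fw' file (e.g. onboard SATA) are skipped.
    """
    results = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return results
    for host in sorted(root.iterdir()):
        if not (host / "version_fw").is_file():
            continue
        info = {}
        for attr in ("board_name", "version_fw", "version_bios"):
            attr_file = host / attr
            if attr_file.is_file():
                info[attr] = attr_file.read_text().strip()
        results[host.name] = info
    return results


if __name__ == "__main__":
    for host, info in read_hba_versions().items():
        print(host, info)
```

If the attributes exist, the `version_fw` value can be compared with the version the card shows in its own BIOS screen.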

 

The eBay listing for it says "FLASHED TO LATEST LSI P20 IT MODE FIRMWARE", so I assumed I did not need to update it, and I am not sure how I would go about updating it anyway. I found a forum post on Google about my card from May 2020 saying that the latest firmware is 20.0.07.0, and that is the version my card reports in BIOS.

 

2 hours ago, Vr2Io said:

it also recommend turn off power saving ( i.e. ASPM ) for PCIe device in motherboard BIOS.

 

I tried looking into this and found nothing related to ASPM or PCIe power saving in the BIOS.
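
When the BIOS has no ASPM option, the policy the Linux kernel is using can still be inspected from the running system. The sketch below reads `/sys/module/pcie_aspm/parameters/policy`, where the active policy is the entry in square brackets. It only reports the policy and changes nothing; the helper name `active_aspm_policy` is made up for this example.

```python
import re
from pathlib import Path

ASPM_POLICY_FILE = Path("/sys/module/pcie_aspm/parameters/policy")


def active_aspm_policy(policy_text):
    """Return the bracketed (active) entry, e.g. 'default' from '[default] performance ...'."""
    match = re.search(r"\[([^\]]+)\]", policy_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    if ASPM_POLICY_FILE.is_file():
        print("Active ASPM policy:", active_aspm_policy(ASPM_POLICY_FILE.read_text()))
    else:
        # The file is absent when the kernel has no PCIe ASPM support built in.
        print("No ASPM policy file found.")
```

If ASPM does turn out to be a suspect, the `pcie_aspm=off` kernel boot parameter is one way to disable it from the OS side. Whether that helps with this particular error is untested here.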

 

I did do a BIOS update to make sure I was fully up to date, and I also moved my SAS card to the bottom PCIe x16 slot on my board (which runs at PCIe 3.0 x4), in case the top slot was causing some sort of compatibility issue because it runs at PCIe 4.0 natively. I also set the top slot to run in Gen3 mode just in case, although that slot will be empty from now on.
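
To confirm what link the card actually negotiated after the move, the kernel reports it per device in sysfs (`current_link_speed` and `current_link_width`). A minimal Python sketch follows; the PCI address in the usage comment is hypothetical and would need to be replaced with the card's real address from `lspci`, and the function names are made up for this example.

```python
from pathlib import Path

# Per-lane transfer rate (GT/s) for each PCIe generation.
GT_PER_SEC_TO_GEN = {2.5: 1, 5.0: 2, 8.0: 3, 16.0: 4}


def parse_link_speed(text):
    """Map a sysfs string such as '8.0 GT/s PCIe' or '8 GT/s' to a generation number.

    Returns None if the string cannot be parsed or the rate is not in the table.
    """
    try:
        rate = float(text.split()[0])
    except (ValueError, IndexError):
        return None
    return GT_PER_SEC_TO_GEN.get(rate)


def read_link(pci_address, sysfs_root="/sys/bus/pci/devices"):
    """Return (generation, lane_width) for one PCI device."""
    device = Path(sysfs_root) / pci_address
    speed = (device / "current_link_speed").read_text().strip()
    width = (device / "current_link_width").read_text().strip()
    return parse_link_speed(speed), int(width)


# Example (hypothetical address):
# print(read_link("0000:01:00.0"))
```

The 9207-8i is a PCIe 3.0 x8 card, so in a 3.0 x4 slot it would be expected to report generation 3 with a width of 4.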

 

I have also switched to the second set of Mini-SAS cables that came with the SAS card, and I am using the second Mini-SAS port on the card too. I am currently playing a bunch of movies off the drive connected to the SAS card to force a lot of reads and see if I can get any errors again. So far so good (though it has only been about 15 minutes).

 

I took a few photos in BIOS of the SAS card, in case any of this information is useful to someone who might know if this is the latest version of the firmware etc. 

 

Anyway, will let the drive run for a few hours reading data and see if it gets any more read errors. After that I'll force it to spin down and see if that causes any read errors again too. Will update this post tomorrow.
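
For the spin-down test, the syslog can be scanned automatically instead of read by eye. The sketch below flags any kernel I/O error logged for a device after that device was spun down. It assumes the message formats seen in the log earlier in this thread, and it treats an `emhttpd: read SMART` line as a sign that the drive is awake again, which is a guess based on that same log. The function name is made up for this example.

```python
import re

SPIN_DOWN = re.compile(r"emhttpd: spinning down /dev/(\w+)")
READ_SMART = re.compile(r"emhttpd: read SMART /dev/(\w+)")  # assumed to mean "awake again"
IO_ERROR = re.compile(r"I/O error, dev (\w+), sector (\d+)")


def errors_after_spin_down(lines):
    """Return (device, sector) for each I/O error logged while a device is marked spun down."""
    spun_down = set()
    hits = []
    for line in lines:
        match = SPIN_DOWN.search(line)
        if match:
            spun_down.add(match.group(1))
            continue
        match = READ_SMART.search(line)
        if match:
            spun_down.discard(match.group(1))
            continue
        match = IO_ERROR.search(line)
        if match and match.group(1) in spun_down:
            hits.append((match.group(1), int(match.group(2))))
    return hits


sample = """\
Feb 23 11:52:49 Tower emhttpd: spinning down /dev/sde
Feb 23 13:58:05 Tower kernel: blk_update_request: I/O error, dev sde, sector 4308779280 op 0x0:(READ) flags 0x0 phys_seg 4 prio class 0
Feb 23 13:58:05 Tower emhttpd: read SMART /dev/sde
""".splitlines()

print(errors_after_spin_down(sample))
```

On the sample lines this reports the one error on sde. To scan the live log, the lines of `/var/log/syslog` can be passed in instead of `sample`.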

 

SAS_CARD1.jpg

SAS_CARD2.jpg

SAS_CARD3.jpg

Edited by Neejrow