Neejrow Posted February 23, 2021

System specs: Ryzen 3900X, MSI B550M Mortar Wifi, unRAID 6.9.0-rc2, headless, no GPU.

Hello, I installed a SAS card (LSI 9207-8i) yesterday in my top PCIe 4.0 x16 slot. It was recommended to me by someone on the unRAID Discord server.

It seemed to be working fine, but this morning I got an email from my server saying there was an array read error. I have never had one of these before. I checked, and it's the drive that I connected through the SAS card. It's not a new drive. It has been running on the server for many months, but I decided to connect it via the SAS card instead of directly to a SATA port on the motherboard, to test the card. The server booted up fine, there were no errors, and the drive was detected and running, so I assumed everything was fine until this morning.

I did not "flash" the SAS card or anything, as the eBay listing says it's already flashed and works with unRAID.

Below is the relevant part of the system log. Reading it now, I can see that the read errors occurred after the drive had spun down. I have all my drives set to spin down after 4 hours of no use, because it's only a Plex server right now and the drives don't get used very often. (Is 4 hours too soon? They are all NAS drives and I am basically only doing it to save power. I would like some opinions on what a more optimal setup is.)

Feb 23 11:52:49 Tower emhttpd: spinning down /dev/sde
Feb 23 13:58:01 Tower kernel: sd 7:0:0:0: attempting task abort!scmd(0x00000000bec51a11), outstanding for 15197 ms & timeout 15000 ms
Feb 23 13:58:01 Tower kernel: sd 7:0:0:0: [sde] tag#3307 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e5 00
Feb 23 13:58:01 Tower kernel: scsi target7:0:0: handle(0x0009), sas_address(0x4433221107000000), phy(7)
Feb 23 13:58:01 Tower kernel: scsi target7:0:0: enclosure logical id(0x500605b009db1330), slot(4)
Feb 23 13:58:05 Tower kernel: sd 7:0:0:0: task abort: SUCCESS scmd(0x00000000bec51a11)
Feb 23 13:58:05 Tower kernel: sd 7:0:0:0: [sde] tag#3832 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=19s
Feb 23 13:58:05 Tower kernel: sd 7:0:0:0: [sde] tag#3832 Sense Key : 0x2 [current]
Feb 23 13:58:05 Tower kernel: sd 7:0:0:0: [sde] tag#3832 ASC=0x4 ASCQ=0x0
Feb 23 13:58:05 Tower kernel: sd 7:0:0:0: [sde] tag#3832 CDB: opcode=0x88 88 00 00 00 00 01 00 d2 c1 10 00 00 00 20 00 00
Feb 23 13:58:05 Tower kernel: blk_update_request: I/O error, dev sde, sector 4308779280 op 0x0:(READ) flags 0x0 phys_seg 4 prio class 0
Feb 23 13:58:05 Tower kernel: md: disk4 read error, sector=4308779216
Feb 23 13:58:05 Tower kernel: md: disk4 read error, sector=4308779224
Feb 23 13:58:05 Tower kernel: md: disk4 read error, sector=4308779232
Feb 23 13:58:05 Tower kernel: md: disk4 read error, sector=4308779240
Feb 23 13:58:05 Tower emhttpd: read SMART /dev/sde
Feb 23 13:58:16 Tower emhttpd: read SMART /dev/sdb
Feb 23 13:58:16 Tower emhttpd: read SMART /dev/sdc
Feb 23 13:58:25 Tower emhttpd: read SMART /dev/sdg
Feb 23 13:58:26 Tower kernel: sd 7:0:0:0: Power-on or device reset occurred
Feb 23 13:58:26 Tower emhttpd: read SMART /dev/sdd
Feb 23 13:59:01 Tower sSMTP[17393]: Creating SSL connection to host
Feb 23 13:59:03 Tower sSMTP[17393]: SSL connection using TLS_AES_256_GCM_SHA384
Feb 23 13:59:08 Tower sSMTP[17393]: Sent mail for [email protected] (221 2.0.0 closing connection

That's about all I can think of to provide. I have uploaded the diagnostics log and also the SMART report of the drive that had the read errors. Please let me know if there is anything else I need to provide. Thanks.

Attachments:
tower-diagnostics-20210223-1459.zip
tower-smart-20210223-1454.zip
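For anyone reading the log excerpt above, the two "CDB:" lines can be decoded by hand. The following is only a rough sketch, assuming the standard SCSI layouts: opcode 0x88 is READ(16), with an 8-byte big-endian LBA in bytes 2-9 and a 4-byte transfer length in bytes 10-13, and opcode 0x85 is ATA PASS-THROUGH(16), with the ATA command in byte 14. The function names are made up for this illustration and are not part of any unRAID tool.

```python
# Hex bytes copied from the two "CDB:" lines in the syslog excerpt.
READ16_CDB = "88 00 00 00 00 01 00 d2 c1 10 00 00 00 20 00 00"
ATA_PT_CDB = "85 06 20 00 00 00 00 00 00 00 00 00 00 40 e5 00"


def parse_cdb(text):
    """Turn a space-separated hex dump into raw bytes."""
    return bytes.fromhex(text.replace(" ", ""))


def decode_read16(cdb):
    """Return (lba, block_count) for a READ(16) command (opcode 0x88)."""
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a READ(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2-9: logical block address
    blocks = int.from_bytes(cdb[10:14], "big")  # bytes 10-13: transfer length
    return lba, blocks


def ata_command_byte(cdb):
    """Return the ATA command carried by an ATA PASS-THROUGH(16) CDB (opcode 0x85)."""
    if len(cdb) != 16 or cdb[0] != 0x85:
        raise ValueError("not an ATA PASS-THROUGH(16) CDB")
    return cdb[14]                              # byte 14: ATA command register


if __name__ == "__main__":
    lba, blocks = decode_read16(parse_cdb(READ16_CDB))
    print(f"READ(16): lba={lba} blocks={blocks}")
    print(f"ATA pass-through command: 0x{ata_command_byte(parse_cdb(ATA_PT_CDB)):02x}")
```

The decoded LBA matches the sector in the blk_update_request line (4308779280), and the transfer length is 32 blocks. The pass-through command byte is 0xe5, which in the ATA command set is CHECK POWER MODE. Together with Sense Key 0x2 (NOT READY) and ASC 0x4 / ASCQ 0x0, that is consistent with the drive not responding in time while it was spun down, although the log alone does not prove the cause.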
Vr2Io Posted February 23, 2021

2 hours ago, Neejrow said: is 4 hours too soon?

Not related.

There is only one disk (the problem disk) connected to the HBA. Please try updating the HBA firmware and changing the cable/port. It is also recommended to turn off power saving (i.e. ASPM) for PCIe devices in the motherboard BIOS.
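The ASPM advice above is about a BIOS setting, but the policy the running Linux kernel is using can also be checked from the OS side. This is a minimal sketch, assuming the kernel was built with PCIe ASPM support and exposes /sys/module/pcie_aspm/parameters/policy, where the active policy is the one shown in square brackets. The helper names are hypothetical, and whether this file is present on a given unRAID release is an assumption, not something confirmed in this thread.

```python
import re
from pathlib import Path

# Assumed location of the kernel's ASPM policy file (present only when the
# kernel has PCIe ASPM support compiled in).
ASPM_POLICY_PATH = Path("/sys/module/pcie_aspm/parameters/policy")


def active_policy(text):
    """Return the bracketed (active) entry, e.g. 'default', or None if absent."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def read_active_policy(path=ASPM_POLICY_PATH):
    """Read the policy file if it exists; return None when it is missing."""
    try:
        return active_policy(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    policy = read_active_policy()
    if policy is None:
        print("ASPM policy file not found; check the BIOS setting instead.")
    else:
        print(f"Active ASPM policy: {policy}")
```

A result of "performance" means the kernel itself is not applying ASPM power saving. "default" means it is following whatever the firmware configured, so the BIOS setting still matters in that case.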
Neejrow Posted February 23, 2021 (edited)

2 hours ago, Vr2Io said: pls try update HBA firmware

Sorry, I am a little confused. When you say HBA, do you mean my SAS PCIe card? (Attached SCSI controller: Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 (rev 05))

The eBay listing for it says "FLASHED TO LATEST LSI P20 IT MODE FIRMWARE", so I assumed I do not need to update it, and I am not even sure how I would go about updating it either. I found a forum post on Google about my card from May 2020 saying that the latest firmware is 20.0.07.0, and that is what my card says it is running in the BIOS.

2 hours ago, Vr2Io said: it also recommend turn off power saving ( i.e. ASPM ) for PCIe device in motherboard BIOS.

I tried looking into this and found nothing related to ASPM or PCIe power saving in the BIOS.

I did do a BIOS update to make sure I was fully up to date, and I also moved my SAS card to the bottom PCIe x16 slot on my board (which runs at PCIe 3.0 x4), in case the top slot was causing some sort of compatibility issue because it runs at PCIe 4.0 natively. I also set the top slot to run in Gen3 mode just in case, although that slot will be empty from now on.

I have also switched to the second Mini-SAS cable that came with the SAS card, and used the second Mini-SAS port on the card too.

I am currently playing a bunch of movies off the drive connected to the SAS card to force it to do a lot of reads, to see if I can get any errors again. So far so good (it has only been about 15 minutes though).

I took a few photos of the SAS card's BIOS screens, in case the information is useful to someone who can tell whether this is the latest version of the firmware.

Anyway, I will let the drive run for a few hours reading data and see if it gets any more read errors. After that I'll force it to spin down and see if that causes any read errors again. Will update this post tomorrow.
ChatNoir Posted February 23, 2021

Looks like the LSI board has the appropriate version of FW:

Feb 22 19:46:19 Tower kernel: mpt2sas_cm0: LSISAS2308: FWVersion(20.00.07.00), ChipRevision(0x05), BiosVersion(07.27.01.01)
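The same check can be scripted against a saved syslog. This is a small sketch, assuming the mpt2sas driver line always has the shape shown in the post above. The regular expression and function names are illustrative only, and the "expected" value is the 20.0.07.0 string mentioned earlier in the thread, compared numerically so that zero padding does not matter.

```python
import re

# Driver line copied from the syslog excerpt in the post above.
LOG_LINE = (
    "Feb 22 19:46:19 Tower kernel: mpt2sas_cm0: LSISAS2308: "
    "FWVersion(20.00.07.00), ChipRevision(0x05), BiosVersion(07.27.01.01)"
)

# Assumed line format: FWVersion(a.b.c.d) ... BiosVersion(a.b.c.d)
VERSION_RE = re.compile(
    r"FWVersion\((?P<fw>[\d.]+)\).*?BiosVersion\((?P<bios>[\d.]+)\)"
)


def firmware_info(line):
    """Return (firmware, bios) version strings, or None if the line does not match."""
    match = VERSION_RE.search(line)
    return (match.group("fw"), match.group("bios")) if match else None


def version_tuple(version):
    """Convert '20.00.07.00' to (20, 0, 7, 0) so padding differences are ignored."""
    return tuple(int(part) for part in version.split("."))


if __name__ == "__main__":
    fw, bios = firmware_info(LOG_LINE)
    expected = "20.0.07.0"  # version string quoted earlier in the thread
    same = version_tuple(fw) == version_tuple(expected)
    print(f"firmware={fw} bios={bios} matches {expected}: {same}")
```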
Vr2Io Posted February 23, 2021

My bad, the HBA has the latest firmware version.
Neejrow Posted February 24, 2021

Just an update. No read errors since making the changes. Hopefully it continues like this.