edrohler
Posted May 5, 2021

Hello, I have run into an issue with my SATA card randomly disabling disks attached to it, and I cannot figure out why. Below is the syslog output that I can find. The SMART report says the disk is healthy. Any ideas what could cause this? I have had to rebuild the data onto the same disk twice in 2 days now.

May 3 02:54:08 kernel: sd 7:0:0:0: attempting task abort!scmd(0x000000007cd90eff), outstanding for 15497 ms & timeout 15000 ms
May 3 02:54:08 kernel: sd 7:0:0:0: [sdh] tag#2356 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e5 00
May 3 02:54:08 kernel: scsi target7:0:0: handle(0x000a), sas_address(0x5001e677bbe6dfe0), phy(0)
May 3 02:54:08 kernel: scsi target7:0:0: enclosure logical id(0x5001e677bbe6dfff), slot(0)
May 3 02:54:08 kernel: sd 7:0:0:0: device_block, handle(0x000a)
May 3 02:54:10 kernel: sd 7:0:0:0: device_unblock and setting to running, handle(0x000a)
May 3 02:54:10 kernel: sd 7:0:0:0: [sdh] Synchronizing SCSI cache
May 3 02:54:10 kernel: sd 7:0:0:0: [sdh] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=0x00
May 3 02:54:10 rc.diskinfo[8086]: SIGHUP received, forcing refresh of disks info.
May 3 02:54:12 kernel: scsi 7:0:0:0: [sdh] tag#7232 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00 cmd_age=19s
May 3 02:54:12 kernel: scsi 7:0:0:0: [sdh] tag#7232 CDB: opcode=0x88 88 00 00 00 00 02 02 51 49 00 00 00 00 40 00 00
May 3 02:54:12 kernel: blk_update_request: I/O error, dev sdh, sector 8628816128 op 0x0:(READ) flags 0x0 phys_seg 8 prio class 0
May 3 02:54:12 kernel: md: disk4 read error, sector=8628816064
May 3 02:54:12 kernel: md: disk4 read error, sector=8628816072
May 3 02:54:12 kernel: md: disk4 read error, sector=8628816080
May 3 02:54:12 kernel: md: disk4 read error, sector=8628816088
May 3 02:54:12 kernel: md: disk4 read error, sector=8628816096
May 3 02:54:12 kernel: md: disk4 read error, sector=8628816104
May 3 02:54:12 kernel: md: disk4 read error, sector=8628816112
May 3 02:54:12 kernel: md: disk4 read error, sector=8628816120
May 3 02:54:12 kernel: scsi 7:0:0:0: task abort: SUCCESS scmd(0x000000007cd90eff)
May 3 02:54:12 kernel: mpt2sas_cm0: mpt3sas_transport_port_remove: removed: sas_addr(0x5001e677bbe6dfe0)
May 3 02:54:12 kernel: mpt2sas_cm0: removing handle(0x000a), sas_addr(0x5001e677bbe6dfe0)
May 3 02:54:12 kernel: mpt2sas_cm0: enclosure logical id(0x5001e677bbe6dfff), slot(0)
May 3 02:54:13 kernel: scsi 7:0:2:0: Direct-Access ATA ST8000VN004-2M21 SC60 PQ: 0 ANSI: 6
May 3 02:54:13 kernel: scsi 7:0:2:0: SATA: handle(0x000a), sas_addr(0x5001e677bbe6dfe0), phy(0), device_name(0x0000000000000000)
May 3 02:54:13 kernel: scsi 7:0:2:0: enclosure logical id (0x5001e677bbe6dfff), slot(0)
May 3 02:54:13 kernel: scsi 7:0:2:0: atapi(n), ncq(y), asyn_notify(n), smart(y), fua(y), sw_preserve(y)
May 3 02:54:13 kernel: scsi 7:0:2:0: qdepth(32), tagged(1), scsi_level(7), cmd_que(1)
May 3 02:54:13 kernel: sd 7:0:2:0: Attached scsi generic sg7 type 0
May 3 02:54:13 kernel: end_device-7:0:2: add: handle(0x000a), sas_addr(0x5001e677bbe6dfe0)
May 3 02:54:13 kernel: sd 7:0:2:0: Power-on or device reset occurred
May 3 02:54:13 kernel: sd 7:0:2:0: [sdi] 15628053168 512-byte logical blocks: (8.00 TB/7.28 TiB)
May 3 02:54:13 kernel: sd 7:0:2:0: [sdi] 4096-byte physical blocks
May 3 02:54:13 kernel: sd 7:0:2:0: [sdi] Write Protect is off
May 3 02:54:13 kernel: sd 7:0:2:0: [sdi] Mode Sense: 7f 00 10 08
May 3 02:54:13 kernel: sd 7:0:2:0: [sdi] Write cache: enabled, read cache: enabled, supports DPO and FUA
May 3 02:54:13 kernel: sdi: sdi1
May 3 02:54:13 kernel: sd 7:0:2:0: [sdi] Attached SCSI disk
May 3 02:54:13 rc.diskinfo[8086]: SIGHUP received, forcing refresh of disks info.
May 3 02:54:14 unassigned.devices: Disk with serial 'ST8000VN004-2M2101_WRD0FLA6', mountpoint 'ST8000VN004-2M2101_WRD0FLA6' is not set to auto mount.
May 3 02:54:15 kernel: br0: received packet on bond0 with own address as source address (addr:38:d5:47:aa:c8:f3, vlan:0)
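For anyone reading the log above, the failed `opcode=0x88` command is a SCSI READ(16), and its CDB bytes can be checked against the sector in the `blk_update_request` line. The following is a minimal sketch, assuming the standard READ(16) layout (8-byte big-endian LBA in bytes 2 to 9, 4-byte transfer length in bytes 10 to 13); the function name `decode_read16` is made up for illustration and is not part of any tool mentioned in this thread.

```python
def decode_read16(cdb_hex: str) -> tuple[int, int]:
    """Return (start LBA, block count) from a READ(16) CDB given as hex bytes."""
    cdb = bytes.fromhex(cdb_hex)  # whitespace between byte pairs is ignored
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("expected a 16-byte CDB with opcode 0x88 (READ(16))")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2..9: logical block address
    blocks = int.from_bytes(cdb[10:14], "big")  # bytes 10..13: transfer length
    return lba, blocks


# CDB copied from the May 3 log line for tag#7232
lba, blocks = decode_read16("88 00 00 00 00 02 02 51 49 00 00 00 00 40 00 00")
print(lba, blocks)  # 8628816128 64
```

The decoded LBA matches the sector reported in the I/O error, and the 64-block length lines up with the eight `md: disk4 read error` entries spaced 8 sectors apart. The md sector numbers being 64 lower than the device sector would be consistent with a partition that starts at sector 64, although the log itself does not state that.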
Squid
Posted May 5, 2021

If you're using breakout cabling, make sure everything is seated properly.
edrohler
Posted May 5, 2021

Thank you for the reply. I assume by breakout cabling you mean the external Mini SAS SFF-8088 cable? I am using an 8-bay external enclosure.
edrohler
Posted May 5, 2021

Thanks again for the help. I do believe it is how the cable is seated.
edrohler
Posted May 9, 2021

Well, it wasn't the cable. I replaced it and the issue still exists. The disk log shows this happening while synchronizing the SCSI cache. However, the disk was originally mounted at sdd and now it is trying to sync sdi. I am not sure why this is happening but it is making me like Unraid more and more. I just want to set it and forget it.

May 8 02:05:00 kernel: sd 7:0:5:0: [sdi] tag#9555 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e5 00
May 8 02:05:03 kernel: sd 7:0:5:0: [sdi] Synchronizing SCSI cache
May 8 02:05:03 kernel: sd 7:0:5:0: [sdi] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=0x00
May 8 02:05:04 kernel: scsi 7:0:5:0: [sdi] tag#9554 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00 cmd_age=20s
May 8 02:05:04 kernel: scsi 7:0:5:0: [sdi] tag#9554 CDB: opcode=0x88 88 00 00 00 00 02 85 d1 6a 80 00 00 01 00 00 00
May 8 02:05:04 kernel: blk_update_request: I/O error, dev sdi, sector 10835028608 op 0x0:(READ) flags 0x0 phys_seg 32 prio class 0

Also, I can't even just re-mount the disk to move the files onto one of the non-SFF-8088 cabled disks. I thought that if a data disk fails, we can just mount it and access the files? Now it seems like I need to wait another 11 hours for the parity rebuild to finish, hope it doesn't fail again while I move the files to more stable drives using unbalance, and then ditch the 8-bay external enclosure.

Edited May 9, 2021 by edrohler
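Both failures start with the same `opcode=0x85` command, which is a SCSI ATA PASS-THROUGH(16). A rough sketch of how to read those bytes follows, assuming the usual SAT layout (protocol in bits 4:1 of byte 1, CK_COND in bit 5 of byte 2, ATA device register in byte 13, ATA command in byte 14). The helper name `decode_ata_passthrough16` and the one-entry command table are illustrative only.

```python
# Only the ATA command that appears in this thread's logs is listed here.
ATA_COMMANDS = {0xE5: "CHECK POWER MODE"}


def decode_ata_passthrough16(cdb_hex: str) -> dict:
    """Pick out the main fields of an ATA PASS-THROUGH(16) CDB given as hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x85:
        raise ValueError("expected a 16-byte CDB with opcode 0x85")
    return {
        "protocol": (cdb[1] >> 1) & 0x0F,  # 3 means Non-data
        "ck_cond": bool(cdb[2] & 0x20),    # ask for ATA registers in sense data
        "device": cdb[13],
        "command": cdb[14],
        "command_name": ATA_COMMANDS.get(cdb[14], "unknown"),
    }


# CDB copied from the May 8 log line for tag#9555
fields = decode_ata_passthrough16("85 06 20 00 00 00 00 00 00 00 00 00 00 40 e5 00")
print(fields["command_name"])  # CHECK POWER MODE
```

If that reading is right, the command that timed out was a power-state query rather than a data transfer, which fits a problem related to the drive's spin-down or power state more than a bad cable. That is an interpretation of the log, not something the log confirms on its own.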
edrohler
Posted May 10, 2021

I think I found the issue. All of my disks are IronWolf 8TB drives, with single parity. 4 drives are directly attached to the motherboard SATA ports and only 1 is using the SFF-8088 Mini-SAS cable in the external enclosure. After a little digging, it is likely the issue described in these two threads: "[6.9.2] Ironwolf Drive Disablement" and "[6.9.x] LSI Controllers & Ironwolf Disks - Summary & Fix".