One of my disks consistently getting disabled. SMART data looks good

I have been having an issue where my Disk 3 errors out and gets disabled. The SMART data all looks good. I usually just restart the server, wipe the disk, and rebuild the array, and it is fine, but it is starting to happen with greater frequency. Last time I pulled the disk and re-seated it in case there was a connection issue. Here is a copy of the Unraid logs from when the disk gets disabled.

 

May 17 17:46:58 NASDex kernel: scsi target1:0:3: handle(0x000c), sas_address(0x4433221103000000), phy(3)
May 17 17:46:58 NASDex kernel: scsi target1:0:3: enclosure logical id(0x500605b006287f10), slot(4) 
May 17 17:46:58 NASDex kernel: sd 1:0:3:0: task abort: SUCCESS scmd(0x00000000e7172b62)
May 17 17:46:59 NASDex kernel: sd 1:0:3:0: Power-on or device reset occurred
May 17 17:46:59 NASDex kernel: sd 1:0:3:0: Power-on or device reset occurred
May 17 17:47:06 NASDex kernel: sd 1:0:3:0: attempting task abort!scmd(0x000000008f76039a), outstanding for 7395 ms & timeout 7000 ms
May 17 17:47:06 NASDex kernel: sd 1:0:3:0: [sde] tag#3095 CDB: opcode=0x12 12 00 00 00 fe 00
May 17 17:47:06 NASDex kernel: scsi target1:0:3: handle(0x000c), sas_address(0x4433221103000000), phy(3)
May 17 17:47:06 NASDex kernel: scsi target1:0:3: enclosure logical id(0x500605b006287f10), slot(4) 
May 17 17:47:06 NASDex kernel: sd 1:0:3:0: task abort: SUCCESS scmd(0x000000008f76039a)
May 17 17:47:06 NASDex kernel: sd 1:0:3:0: Power-on or device reset occurred
May 17 17:47:07 NASDex kernel: sd 1:0:3:0: Power-on or device reset occurred
May 17 17:47:37 NASDex kernel: sd 1:0:3:0: attempting task abort!scmd(0x000000007841706f), outstanding for 30035 ms & timeout 30000 ms
May 17 17:47:37 NASDex kernel: sd 1:0:3:0: [sde] tag#3110 CDB: opcode=0x88 88 00 00 00 00 05 8b 1a d3 40 00 00 00 08 00 00
May 17 17:47:37 NASDex kernel: scsi target1:0:3: handle(0x000c), sas_address(0x4433221103000000), phy(3)
May 17 17:47:37 NASDex kernel: scsi target1:0:3: enclosure logical id(0x500605b006287f10), slot(4) 
May 17 17:47:37 NASDex kernel: sd 1:0:3:0: task abort: SUCCESS scmd(0x000000007841706f)
May 17 17:47:37 NASDex kernel: sd 1:0:3:0: Power-on or device reset occurred
May 17 17:47:38 NASDex kernel: sd 1:0:3:0: Power-on or device reset occurred
May 17 17:47:44 NASDex kernel: sd 1:0:3:0: attempting task abort!scmd(0x0000000057f00eb9), outstanding for 7522 ms & timeout 7000 ms
May 17 17:47:44 NASDex kernel: sd 1:0:3:0: [sde] tag#3073 CDB: opcode=0x12 12 00 00 00 fe 00
May 17 17:47:44 NASDex kernel: scsi target1:0:3: handle(0x000c), sas_address(0x4433221103000000), phy(3)
May 17 17:47:44 NASDex kernel: scsi target1:0:3: enclosure logical id(0x500605b006287f10), slot(4) 
May 17 17:47:45 NASDex kernel: sd 1:0:3:0: task abort: SUCCESS scmd(0x0000000057f00eb9)
May 17 17:47:45 NASDex kernel: sd 1:0:3:0: Power-on or device reset occurred
May 17 17:47:45 NASDex kernel: sd 1:0:3:0: Power-on or device reset occurred
May 17 17:48:16 NASDex kernel: sd 1:0:3:0: attempting task abort!scmd(0x000000007bcdd088), outstanding for 30438 ms & timeout 30000 ms
May 17 17:48:16 NASDex kernel: sd 1:0:3:0: [sde] tag#3090 CDB: opcode=0x12 12 00 00 00 24 00
May 17 17:48:16 NASDex kernel: scsi target1:0:3: handle(0x000c), sas_address(0x4433221103000000), phy(3)
May 17 17:48:16 NASDex kernel: scsi target1:0:3: enclosure logical id(0x500605b006287f10), slot(4) 
May 17 17:48:16 NASDex kernel: sd 1:0:3:0: task abort: SUCCESS scmd(0x000000007bcdd088)
May 17 17:48:16 NASDex kernel: sd 1:0:3:0: attempting task abort!scmd(0x00000000f1a23639), outstanding for 30593 ms & timeout 30000 ms
May 17 17:48:16 NASDex kernel: sd 1:0:3:0: [sde] tag#3089 CDB: opcode=0x88 88 00 00 00 00 05 57 5b 2f a8 00 00 00 90 00 00
May 17 17:48:16 NASDex kernel: scsi target1:0:3: handle(0x000c), sas_address(0x4433221103000000), phy(3)
May 17 17:48:16 NASDex kernel: scsi target1:0:3: enclosure logical id(0x500605b006287f10), slot(4) 
May 17 17:48:16 NASDex kernel: sd 1:0:3:0: No reference found at driver, assuming scmd(0x00000000f1a23639) might have completed
May 17 17:48:16 NASDex kernel: sd 1:0:3:0: task abort: SUCCESS scmd(0x00000000f1a23639)
May 17 17:48:16 NASDex kernel: sd 1:0:3:0: Power-on or device reset occurred
May 17 17:48:17 NASDex kernel: sd 1:0:3:0: Power-on or device reset occurred
May 17 17:48:23 NASDex kernel: sd 1:0:3:0: attempting task abort!scmd(0x00000000cb2e9456), outstanding for 7333 ms & timeout 7000 ms
May 17 17:48:23 NASDex kernel: sd 1:0:3:0: [sde] tag#3082 CDB: opcode=0x12 12 00 00 00 fe 00
May 17 17:48:23 NASDex kernel: scsi target1:0:3: handle(0x000c), sas_address(0x4433221103000000), phy(3)
May 17 17:48:23 NASDex kernel: scsi target1:0:3: enclosure logical id(0x500605b006287f10), slot(4) 
May 17 17:48:23 NASDex kernel: sd 1:0:3:0: [sde] tag#3093 UNKNOWN(0x2003) Result: hostbyte=0x0b driverbyte=DRIVER_OK cmd_age=115s
May 17 17:48:23 NASDex kernel: sd 1:0:3:0: [sde] tag#3093 CDB: opcode=0x35 35 00 00 00 00 00 00 00 00 00
May 17 17:48:23 NASDex kernel: I/O error, dev sde, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
May 17 17:48:23 NASDex kernel: I/O error, dev sde, sector 2932785624 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
May 17 17:48:23 NASDex kernel: md: disk3 write error, sector=2932785560
May 17 17:48:23 NASDex kernel: I/O error, dev sde, sector 2932785632 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
May 17 17:48:23 NASDex kernel: md: disk3 write error, sector=2932785568
May 17 17:48:23 NASDex kernel: I/O error, dev sde, sector 2932785640 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
May 17 17:48:23 NASDex kernel: md: disk3 write error, sector=2932785576
May 17 17:48:23 NASDex kernel: I/O error, dev sde, sector 2932785648 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
May 17 17:48:23 NASDex kernel: md: disk3 write error, sector=2932785584
May 17 17:48:23 NASDex kernel: I/O error, dev sde, sector 2932785656 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
May 17 17:48:23 NASDex kernel: md: disk3 write error, sector=2932785592
May 17 17:48:23 NASDex kernel: I/O error, dev sde, sector 2932785664 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
May 17 17:48:23 NASDex kernel: md: disk3 write error, sector=2932785600
May 17 17:48:23 NASDex kernel: I/O error, dev sde, sector 2932785672 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
May 17 17:48:23 NASDex kernel: md: disk3 write error, sector=2932785608
May 17 17:48:23 NASDex kernel: I/O error, dev sde, sector 2932785688 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
May 17 17:48:23 NASDex kernel: md: disk3 write error, sector=2932785624
May 17 17:48:23 NASDex kernel: sd 1:0:3:0: task abort: SUCCESS scmd(0x00000000cb2e9456)
May 17 17:48:24 NASDex kernel: sd 1:0:3:0: Power-on or device reset occurred
May 17 17:48:24 NASDex kernel: sd 1:0:3:0: Power-on or device reset occurred
May 17 17:48:26 NASDex kernel: br-9ca54f048d75: port 8(veth0bbba04) entered disabled state
May 17 17:48:26 NASDex kernel: vethb049206: renamed from eth0
May 17 17:48:26 NASDex  avahi-daemon[13085]: Interface veth0bbba04.IPv6 no longer relevant for mDNS.
May 17 17:48:26 NASDex  avahi-daemon[13085]: Leaving mDNS multicast group on interface veth0bbba04.IPv6 with address fe80::c441:89ff:fe63:c4c9.
May 17 17:48:26 NASDex kernel: br-9ca54f048d75: port 8(veth0bbba04) entered disabled state
May 17 17:48:26 NASDex kernel: device veth0bbba04 left promiscuous mode
May 17 17:48:26 NASDex kernel: br-9ca54f048d75: port 8(veth0bbba04) entered disabled state
May 17 17:48:26 NASDex  avahi-daemon[13085]: Withdrawing address record for fe80::c441:89ff:fe63:c4c9 on veth0bbba04.
May 17 17:48:26 NASDex kernel: br-9ca54f048d75: port 8(veth5568b3c) entered blocking state
May 17 17:48:26 NASDex kernel: br-9ca54f048d75: port 8(veth5568b3c) entered disabled state
May 17 17:48:26 NASDex kernel: device veth5568b3c entered promiscuous mode
May 17 17:48:26 NASDex kernel: br-9ca54f048d75: port 8(veth5568b3c) entered blocking state
May 17 17:48:26 NASDex kernel: br-9ca54f048d75: port 8(veth5568b3c) entered forwarding state
May 17 17:48:26 NASDex kernel: eth0: renamed from veth0e987c1
May 17 17:48:26 NASDex kernel: IPv6: ADDRCONF(NETDEV_CHANGE): veth5568b3c: link becomes ready
May 17 17:48:28 NASDex  avahi-daemon[13085]: Joining mDNS multicast group on interface veth5568b3c.IPv6 with address fe80::40e6:aeff:fe78:8bac.
May 17 17:48:28 NASDex  avahi-daemon[13085]: New relevant interface veth5568b3c.IPv6 for mDNS.
May 17 17:48:28 NASDex  avahi-daemon[13085]: Registering new address record for fe80::40e6:aeff:fe78:8bac on veth5568b3c.*.
May 17 17:49:01 NASDex  sSMTP[17889]: Creating SSL connection to host
May 17 17:49:01 NASDex  sSMTP[17889]: SSL connection using TLS_AES_128_GCM_SHA256
May 17 17:49:02 NASDex  sSMTP[17889]: Authorization failed (535 Authentication failed)
May 17 17:49:02 NASDex  sSMTP[17934]: Creating SSL connection to host
May 17 17:49:02 NASDex  sSMTP[17934]: SSL connection using TLS_AES_128_GCM_SHA256
May 17 17:49:03 NASDex  sSMTP[17934]: Authorization failed (535 Authentication failed)

 

It looks like the disk simply stops responding for some reason. I am unsure what to make of this.
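To get an overview of a log like the one above, a short script can tally the abort, reset, and I/O error events per device. The following is only a minimal sketch: the regular expressions are assumptions based on the message formats visible in this post, the function name `summarize` is hypothetical, and the sample lines are copied from the log above. The opcode names are the standard SCSI ones (0x12 INQUIRY, 0x88 READ(16), 0x35 SYNCHRONIZE CACHE(10)). It summarizes what the kernel reported and does not diagnose the cause.

```python
import re
from collections import Counter

# Patterns assumed from the kernel messages shown in this thread.
ABORT_RE = re.compile(
    r"sd (?P<addr>\d+:\d+:\d+:\d+): attempting task abort!scmd\(0x[0-9a-f]+\), "
    r"outstanding for (?P<age>\d+) ms & timeout (?P<timeout>\d+) ms"
)
RESET_RE = re.compile(r"sd (?P<addr>\d+:\d+:\d+:\d+): Power-on or device reset occurred")
IOERR_RE = re.compile(r"I/O error, dev (?P<dev>\w+), sector (?P<sector>\d+)")
CDB_RE = re.compile(r"CDB: opcode=0x(?P<op>[0-9a-f]{2})")

# Standard SCSI opcode names for the opcodes that appear in the log.
OPCODES = {0x12: "INQUIRY", 0x88: "READ(16)", 0x35: "SYNCHRONIZE CACHE(10)"}


def summarize(lines):
    """Count aborts, resets, I/O errors and logged opcodes in syslog lines."""
    aborts, resets, io_errors, opcodes = Counter(), Counter(), Counter(), Counter()
    worst_overshoot_ms = 0
    for line in lines:
        if m := ABORT_RE.search(line):
            aborts[m["addr"]] += 1
            # How far past its timeout the command was when it was aborted.
            overshoot = int(m["age"]) - int(m["timeout"])
            worst_overshoot_ms = max(worst_overshoot_ms, overshoot)
        elif m := RESET_RE.search(line):
            resets[m["addr"]] += 1
        elif m := IOERR_RE.search(line):
            io_errors[m["dev"]] += 1
        elif m := CDB_RE.search(line):
            op = int(m["op"], 16)
            opcodes[OPCODES.get(op, hex(op))] += 1
    return {
        "aborts": aborts,
        "resets": resets,
        "io_errors": io_errors,
        "opcodes": opcodes,
        "worst_overshoot_ms": worst_overshoot_ms,
    }


# A few lines copied from the log above.
SAMPLE = [
    "May 17 17:46:59 NASDex kernel: sd 1:0:3:0: Power-on or device reset occurred",
    "May 17 17:47:06 NASDex kernel: sd 1:0:3:0: attempting task abort!scmd(0x000000008f76039a), outstanding for 7395 ms & timeout 7000 ms",
    "May 17 17:47:06 NASDex kernel: sd 1:0:3:0: [sde] tag#3095 CDB: opcode=0x12 12 00 00 00 fe 00",
    "May 17 17:47:06 NASDex kernel: sd 1:0:3:0: Power-on or device reset occurred",
    "May 17 17:47:37 NASDex kernel: sd 1:0:3:0: attempting task abort!scmd(0x000000007841706f), outstanding for 30035 ms & timeout 30000 ms",
    "May 17 17:47:37 NASDex kernel: sd 1:0:3:0: [sde] tag#3110 CDB: opcode=0x88 88 00 00 00 00 05 8b 1a d3 40 00 00 00 08 00 00",
    "May 17 17:48:23 NASDex kernel: I/O error, dev sde, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0",
    "May 17 17:48:23 NASDex kernel: I/O error, dev sde, sector 2932785624 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0",
]

if __name__ == "__main__":
    for key, value in summarize(SAMPLE).items():
        print(key, dict(value) if isinstance(value, Counter) else value)
```

To run it against a full log, pass the lines of a saved syslog file to `summarize` instead of `SAMPLE`. Many aborts and resets on a single SCSI address, with none on the other disks, would fit the picture of one device or its link dropping out.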

SMART data

[Attached screenshot of the SMART data, taken 2023-05-17 21:46:00]

  • Community Expert

This looks more like a power/connection issue. Replace the cables (both power and SATA) and try again. You should also update the firmware on the 1st HBA:


 

LSISAS2008: FWVersion(10.00.08.00)
LSISAS2008: FWVersion(20.00.07.00)

 

The 2nd one is using the latest.
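The comparison between the two firmware strings can also be done mechanically. This is a minimal sketch, assuming the `LSISAS2008: FWVersion(...)` line format quoted above; the helper name `parse_fw` is hypothetical, and "newest" here only means the highest version among the lines given, not the latest release from the vendor.

```python
import re

# Pattern assumed from the two lines quoted in this reply.
FW_RE = re.compile(r"(?P<chip>\w+): FWVersion\((?P<ver>[\d.]+)\)")


def parse_fw(line):
    """Return (chip, version tuple) for a FWVersion line, or None if no match."""
    m = FW_RE.search(line)
    if m is None:
        return None
    return m["chip"], tuple(int(part) for part in m["ver"].split("."))


lines = [
    "LSISAS2008: FWVersion(10.00.08.00)",
    "LSISAS2008: FWVersion(20.00.07.00)",
]
versions = [parse_fw(line) for line in lines]

# Tuples compare element by element, so (10, 0, 8, 0) < (20, 0, 7, 0).
newest = max(version for _, version in versions)
outdated = [i for i, (_, version) in enumerate(versions, start=1) if version < newest]

if __name__ == "__main__":
    print("HBAs behind the newest listed firmware:", outdated)
```

Comparing integer tuples instead of raw strings avoids ordering mistakes when the components have different digit counts.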

  • Author

The drive is in a Rosewill hot-swap cage, and none of the other drives are having issues. Do you think the hot-swap cage could be at fault?
