Posted September 4, 2018

Hi, hoping to get some advice on next steps. My server completed a successful parity check this morning (0 errors), but less than 15 hours later one of my drives became disabled (red 'X'), and I'm seeing the following in the disk log:

Sep 3 16:07:16 Proteus kernel: sd 1:0:1:0: [sde] tag#2 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Sep 3 16:07:16 Proteus kernel: sd 1:0:1:0: [sde] tag#2 Sense Key : 0x5 [current]
Sep 3 16:07:16 Proteus kernel: sd 1:0:1:0: [sde] tag#2 ASC=0x21 ASCQ=0x0
Sep 3 16:07:16 Proteus kernel: sd 1:0:1:0: [sde] tag#2 CDB: opcode=0x8a 8a 00 00 00 00 00 ae a9 eb c8 00 00 00 08 00 00
Sep 3 16:07:16 Proteus kernel: print_req_error: critical target error, dev sde, sector 2930371528
Sep 3 16:07:16 Proteus kernel: print_req_error: critical target error, dev sde, sector 2930371528

What are my next steps for troubleshooting/repair? Is the disk toast, or should I attempt a repair and put it back into service? Thanks!

proteus-diagnostics-20180903-1956.zip

Edited September 4, 2018 by quinnjudge
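For anyone reading the log above, the CDB line can be decoded by hand. The following is a minimal sketch, assuming the standard 16-byte READ(16)/WRITE(16) layout (opcode, flags, 8-byte big-endian LBA, 4-byte big-endian transfer length, group, control). The function name `decode_cdb16` is made up for illustration and is not part of any tool mentioned in this thread.

```python
import struct

# Opcodes for the two 16-byte commands that appear in this thread's logs.
OPCODE_NAMES = {0x88: "READ(16)", 0x8A: "WRITE(16)"}


def decode_cdb16(cdb_hex: str) -> dict:
    """Decode a 16-byte CDB given as space-separated hex bytes (hypothetical helper)."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16:
        raise ValueError(f"expected 16 bytes, got {len(cdb)}")
    # opcode(1) flags(1) LBA(8) transfer length(4) group(1) control(1), big-endian
    opcode, _flags, lba, sectors, _group, _control = struct.unpack(">BBQIBB", cdb)
    return {
        "opcode": OPCODE_NAMES.get(opcode, hex(opcode)),
        "lba": lba,
        "sectors": sectors,
    }


# CDB bytes copied from the sde log line above.
print(decode_cdb16("8a 00 00 00 00 00 ae a9 eb c8 00 00 00 08 00 00"))
# → {'opcode': 'WRITE(16)', 'lba': 2930371528, 'sectors': 8}
```

The decoded LBA matches the sector number in the print_req_error lines, so the failed request was an 8-sector write at that address.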
September 4, 2018 Community Expert

SMART for disk1 looks OK. Might just be a connection issue. You can rebuild to a spare disk if you have one. That would allow you to keep the original in reserve in case there is a problem rebuilding. Or you can rebuild to the same disk. Do you know the procedure? Do you have backups of any important and irreplaceable files?
September 4, 2018 Author

Thanks for the quick reply! I don't have a spare disk, so I'll have to rebuild onto the existing one. I'll shut down, check the connections, and bring the server back up. Can you point me to the rebuild procedure? (Having a spare sitting around is on my to-do list, lol!) I do have good backups; I just did a test restore.
September 4, 2018

Once you're happy with the connections, power up and, if the array is set to auto-start, stop it. (At this point you can check the SMART status again and run a SMART self-test if you want.) Then:

1. Unassign the disk.
2. Start the array.
3. Stop the array.
4. Re-assign the disk.
5. Start the array and the rebuild will begin.
September 4, 2018 Author

Server restarted, disk rebuilding. Looks like I have ~8 hours until the rebuild is complete; I'll go grab some popcorn and cross my fingers. Thank you @trurl and @John_M for your quick help, it is appreciated!
September 6, 2018 Author

Good news - rebuild completed without errors. Bad news - now I have a reported error on the same disk:

Sep 3 21:28:14 Proteus kernel: mdcmd (2): import 1 sdf 64 2930266532 0 WDC_WD30EFRX-68EUZN0_WD-WMC4N0862856
Sep 3 21:28:14 Proteus kernel: md: import disk1: (sdf) WDC_WD30EFRX-68EUZN0_WD-WMC4N0862856 size: 2930266532
Sep 3 21:28:49 Proteus emhttpd: shcmd (886): /usr/local/sbin/set_ncq sdf 1
Sep 3 21:28:49 Proteus emhttpd: shcmd (887): echo 128 > /sys/block/sdf/queue/nr_requests
Sep 5 20:57:41 Proteus kernel: sd 9:0:2:0: [sdf] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Sep 5 20:57:41 Proteus kernel: sd 9:0:2:0: [sdf] tag#0 Sense Key : 0x3 [current]
Sep 5 20:57:41 Proteus kernel: sd 9:0:2:0: [sdf] tag#0 ASC=0x11 ASCQ=0x0
Sep 5 20:57:41 Proteus kernel: sd 9:0:2:0: [sdf] tag#0 CDB: opcode=0x88 88 00 00 00 00 01 2f 2a 88 98 00 00 00 08 00 00
Sep 5 20:57:41 Proteus kernel: print_req_error: critical medium error, dev sdf, sector 5086283928

I ran a short SMART test against the drive right before I started the rebuild, and it came back successful. Next steps?

proteus-diagnostics-20180905-2155.zip
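The raw sense values in these kernel lines map to standard SCSI names, which can make the two errors in this thread easier to tell apart. Below is a rough sketch that pulls the Sense Key, ASC and ASCQ out of pasted log text and looks them up. The lookup tables only cover the codes that appear in this thread, and `describe_sense` is a hypothetical helper name, not an existing utility.

```python
import re

# Only the codes seen in this thread; the full SCSI tables are much larger.
SENSE_KEYS = {0x3: "MEDIUM ERROR", 0x5: "ILLEGAL REQUEST"}
ASC_ASCQ = {
    (0x11, 0x00): "UNRECOVERED READ ERROR",
    (0x21, 0x00): "LOGICAL BLOCK ADDRESS OUT OF RANGE",
}


def describe_sense(log_text: str) -> tuple:
    """Return (sense key name, ASC/ASCQ name) for the first match in log_text."""
    key = re.search(r"Sense Key : (0x[0-9a-fA-F]+)", log_text)
    asc = re.search(r"ASC=(0x[0-9a-fA-F]+) ASCQ=(0x[0-9a-fA-F]+)", log_text)
    if key is None or asc is None:
        raise ValueError("no sense data found in log text")
    key_code = int(key.group(1), 16)
    pair = (int(asc.group(1), 16), int(asc.group(2), 16))
    return (
        SENSE_KEYS.get(key_code, f"unknown key {key_code:#x}"),
        ASC_ASCQ.get(pair, f"unknown ASC/ASCQ {pair[0]:#x}/{pair[1]:#x}"),
    )


# Lines copied from the sdf log in the post above.
sdf_log = """
Sep 5 20:57:41 Proteus kernel: sd 9:0:2:0: [sdf] tag#0 Sense Key : 0x3 [current]
Sep 5 20:57:41 Proteus kernel: sd 9:0:2:0: [sdf] tag#0 ASC=0x11 ASCQ=0x0
"""
print(describe_sense(sdf_log))
# → ('MEDIUM ERROR', 'UNRECOVERED READ ERROR')
```

Run against the earlier sde lines (Sense Key 0x5, ASC=0x21 ASCQ=0x0), the same sketch returns ILLEGAL REQUEST and LOGICAL BLOCK ADDRESS OUT OF RANGE, so the two incidents are different kinds of error. Neither result is a diagnosis on its own.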