
Disabled Drive After Successful Parity Check?


quinnjudge


Hi, hoping to get some advice/next steps - my server just completed a successful parity check this morning (0 errors), but less than 15 hours later, one of my drives became disabled (red 'X'), and I'm seeing the following in the disk log:

 

Sep 3 16:07:16 Proteus kernel: sd 1:0:1:0: [sde] tag#2 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Sep 3 16:07:16 Proteus kernel: sd 1:0:1:0: [sde] tag#2 Sense Key : 0x5 [current]
Sep 3 16:07:16 Proteus kernel: sd 1:0:1:0: [sde] tag#2 ASC=0x21 ASCQ=0x0
Sep 3 16:07:16 Proteus kernel: sd 1:0:1:0: [sde] tag#2 CDB: opcode=0x8a 8a 00 00 00 00 00 ae a9 eb c8 00 00 00 08 00 00
Sep 3 16:07:16 Proteus kernel: print_req_error: critical target error, dev sde, sector 2930371528
Sep 3 16:07:16 Proteus kernel: print_req_error: critical target error, dev sde, sector 2930371528
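For anyone reading the log above, here is a minimal sketch of how the CDB line maps to the sector in the `print_req_error` line. It assumes the standard 16-byte READ(16)/WRITE(16) layout (opcode in byte 0, big-endian LBA in bytes 2-9, transfer length in bytes 10-13); the function name `decode_cdb16` is made up for illustration and is not part of any Unraid or kernel tool.

```python
def decode_cdb16(cdb_hex: str) -> dict:
    """Decode a 16-byte SCSI READ(16)/WRITE(16) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)  # fromhex ignores the spaces between bytes
    if len(cdb) != 16:
        raise ValueError(f"expected 16 bytes, got {len(cdb)}")

    # Only the two opcodes that appear in this thread are named here.
    opcodes = {0x88: "READ(16)", 0x8A: "WRITE(16)"}

    return {
        "command": opcodes.get(cdb[0], f"opcode 0x{cdb[0]:02x}"),
        "lba": int.from_bytes(cdb[2:10], "big"),      # starting sector
        "blocks": int.from_bytes(cdb[10:14], "big"),  # number of sectors requested
    }


if __name__ == "__main__":
    # Bytes copied from the "CDB: opcode=0x8a" log line above.
    print(decode_cdb16("8a 00 00 00 00 00 ae a9 eb c8 00 00 00 08 00 00"))
```

Decoded this way, the request is a WRITE(16) of 8 sectors starting at LBA 2930371528, which is the same sector the kernel reports as failing.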

 

What are my next steps for troubleshooting/repair?  Is the disk toast, or should I attempt a repair and put it back into service?

 

Thanks!

proteus-diagnostics-20180903-1956.zip

Link to comment

SMART for disk1 looks OK. Might just be a connection issue.

 

You can rebuild to a spare disk if you have one. That would allow you to keep the original in reserve in case there is a problem rebuilding.

 

Or you can rebuild to the same disk. Do you know the procedure?

 

Do you have backups of any important and irreplaceable files?

Link to comment

Thanks for the quick reply!

 

I don't have a spare disk, so I'll have to rebuild the existing one...I'll shut down, check the connections, and bring the server back up...can you point me to the rebuild procedure? (having a spare sitting around is on my to-do list, lol!)

 

I do have good backups; just did a test restore :)

Link to comment

Once you're happy with the connections, power up, and if the array is set to auto-start, stop it. (At this point you can check the SMART status again and run a SMART self-test if you want.) Unassign the disk. Start the array. Stop the array. Re-assign the disk. Start the array and the rebuild will begin.

Link to comment

Good news - rebuild completed without errors.  Bad news - now I have a reported error on the same disk:

 

Sep 3 21:28:14 Proteus kernel: mdcmd (2): import 1 sdf 64 2930266532 0 WDC_WD30EFRX-68EUZN0_WD-WMC4N0862856
Sep 3 21:28:14 Proteus kernel: md: import disk1: (sdf) WDC_WD30EFRX-68EUZN0_WD-WMC4N0862856 size: 2930266532

Sep 3 21:28:49 Proteus emhttpd: shcmd (886): /usr/local/sbin/set_ncq sdf 1
Sep 3 21:28:49 Proteus emhttpd: shcmd (887): echo 128 > /sys/block/sdf/queue/nr_requests
Sep 5 20:57:41 Proteus kernel: sd 9:0:2:0: [sdf] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Sep 5 20:57:41 Proteus kernel: sd 9:0:2:0: [sdf] tag#0 Sense Key : 0x3 [current]
Sep 5 20:57:41 Proteus kernel: sd 9:0:2:0: [sdf] tag#0 ASC=0x11 ASCQ=0x0
Sep 5 20:57:41 Proteus kernel: sd 9:0:2:0: [sdf] tag#0 CDB: opcode=0x88 88 00 00 00 00 01 2f 2a 88 98 00 00 00 08 00 00
Sep 5 20:57:41 Proteus kernel: print_req_error: critical medium error, dev sdf, sector 5086283928
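As a rough aid for reading these entries, the sketch below pulls the sense key and ASC/ASCQ values out of kernel log text and looks them up in a small table. The table only covers the codes that appear in this thread, using their names from the SCSI standard; the helper name `describe_sense` is hypothetical, and the lookup says nothing about the underlying cause.

```python
import re

# Only the codes seen in this thread; the full SCSI tables are much larger.
SENSE_KEYS = {
    0x3: "MEDIUM ERROR",
    0x5: "ILLEGAL REQUEST",
}
ADDITIONAL_SENSE = {
    (0x11, 0x00): "UNRECOVERED READ ERROR",
    (0x21, 0x00): "LOGICAL BLOCK ADDRESS OUT OF RANGE",
}


def describe_sense(log_text: str) -> tuple:
    """Return (sense key name, ASC/ASCQ name) for the first match in log_text."""
    key = re.search(r"Sense Key : (0x[0-9a-fA-F]+)", log_text)
    asc = re.search(r"ASC=(0x[0-9a-fA-F]+) ASCQ=(0x[0-9a-fA-F]+)", log_text)
    if key is None or asc is None:
        raise ValueError("no sense data found in log text")

    key_code = int(key.group(1), 16)
    asc_code = (int(asc.group(1), 16), int(asc.group(2), 16))
    return (
        SENSE_KEYS.get(key_code, "unknown sense key"),
        ADDITIONAL_SENSE.get(asc_code, "unknown ASC/ASCQ"),
    )


if __name__ == "__main__":
    sample = (
        "kernel: sd 9:0:2:0: [sdf] tag#0 Sense Key : 0x3 [current]\n"
        "kernel: sd 9:0:2:0: [sdf] tag#0 ASC=0x11 ASCQ=0x0\n"
    )
    print(describe_sense(sample))
```

For the entries above this gives MEDIUM ERROR with UNRECOVERED READ ERROR, which matches the "critical medium error" wording in the last log line.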

 

I did a short SMART test against the drive right before I started the rebuild (came back successful)...next steps?

proteus-diagnostics-20180905-2155.zip

Link to comment
