Disabled Drive After Successful Parity Check?

September 4, 2018

Hi, hoping to get some advice/next steps - my server just completed a successful parity check this morning (0 errors), but less than 15 hours later, one of my drives became disabled (red 'X'), and I'm seeing the following in the disk log:

Sep 3 16:07:16 Proteus kernel: sd 1:0:1:0: [sde] tag#2 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Sep 3 16:07:16 Proteus kernel: sd 1:0:1:0: [sde] tag#2 Sense Key : 0x5 [current]
Sep 3 16:07:16 Proteus kernel: sd 1:0:1:0: [sde] tag#2 ASC=0x21 ASCQ=0x0
Sep 3 16:07:16 Proteus kernel: sd 1:0:1:0: [sde] tag#2 CDB: opcode=0x8a 8a 00 00 00 00 00 ae a9 eb c8 00 00 00 08 00 00
Sep 3 16:07:16 Proteus kernel: print_req_error: critical target error, dev sde, sector 2930371528
Sep 3 16:07:16 Proteus kernel: print_req_error: critical target error, dev sde, sector 2930371528
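For anyone reading along, the CDB line in that log can be decoded by hand. The following is a minimal sketch, assuming the standard 16-byte SCSI command layout (opcode in byte 0, big-endian LBA in bytes 2-9, transfer length in bytes 10-13). The function name `decode_cdb16` is made up for illustration and is not part of any Unraid or kernel tool.

```python
# Hypothetical helper: decode a 16-byte SCSI CDB as printed by the kernel.
# Only the two opcodes that appear in this thread are named here.
OPCODES = {0x88: "READ(16)", 0x8A: "WRITE(16)"}


def decode_cdb16(cdb_hex: str) -> dict:
    """Return opcode name, starting LBA and block count for a 16-byte CDB."""
    raw = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(raw) != 16:
        raise ValueError("expected a 16-byte CDB")
    return {
        "opcode": OPCODES.get(raw[0], hex(raw[0])),
        "lba": int.from_bytes(raw[2:10], "big"),      # bytes 2-9
        "blocks": int.from_bytes(raw[10:14], "big"),  # bytes 10-13
    }


# CDB copied from the log above (the bytes after "opcode=0x8a").
cmd = decode_cdb16("8a 00 00 00 00 00 ae a9 eb c8 00 00 00 08 00 00")
print(cmd)
```

The decoded LBA matches the sector number in the `print_req_error` lines, so the failing command was an 8-block write starting at sector 2930371528.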

What are my next steps for troubleshooting/repair? Is the disk toast, or should I attempt a repair and put it back into service?

Thanks!

proteus-diagnostics-20180903-1956.zip

Edited September 4, 2018 by quinnjudge

September 4, 2018

Community Expert

SMART for disk1 looks OK. Might just be a connection issue.

You can rebuild to a spare disk if you have one. That would allow you to keep the original in reserve in case there is a problem rebuilding.

Or you can rebuild to the same disk. Do you know the procedure?

Do you have backups of any important and irreplaceable files?

September 4, 2018

Author

Thanks for the quick reply!

I don't have a spare disk, so I'll have to rebuild onto the existing one. I'll shut down, check the connections, and bring the server back up. Can you point me to the rebuild procedure? (Having a spare sitting around is on my to-do list, lol!)

I do have good backups; just did a test restore

September 4, 2018

Once you're happy with the connections, power up and, if the array is set to auto-start, stop it. (At this point you can check the SMART status again and run a SMART self-test if you want.) Unassign the disk. Start the array. Stop the array. Re-assign the disk. Start the array and the rebuild will begin.

September 4, 2018

Author

Server restarted, disk rebuilding...looks like I have ~8 hours until rebuild is complete; I'll go grab some popcorn and cross my fingers

Thank you @trurl and @John_M for your quick help, it is appreciated!

September 6, 2018

Author

Good news - rebuild completed without errors. Bad news - now I have a reported error on the same disk:

Sep 3 21:28:14 Proteus kernel: mdcmd (2): import 1 sdf 64 2930266532 0 WDC_WD30EFRX-68EUZN0_WD-WMC4N0862856
Sep 3 21:28:14 Proteus kernel: md: import disk1: (sdf) WDC_WD30EFRX-68EUZN0_WD-WMC4N0862856 size: 2930266532
Sep 3 21:28:49 Proteus emhttpd: shcmd (886): /usr/local/sbin/set_ncq sdf 1
Sep 3 21:28:49 Proteus emhttpd: shcmd (887): echo 128 > /sys/block/sdf/queue/nr_requests
Sep 5 20:57:41 Proteus kernel: sd 9:0:2:0: [sdf] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Sep 5 20:57:41 Proteus kernel: sd 9:0:2:0: [sdf] tag#0 Sense Key : 0x3 [current]
Sep 5 20:57:41 Proteus kernel: sd 9:0:2:0: [sdf] tag#0 ASC=0x11 ASCQ=0x0
Sep 5 20:57:41 Proteus kernel: sd 9:0:2:0: [sdf] tag#0 CDB: opcode=0x88 88 00 00 00 00 01 2f 2a 88 98 00 00 00 08 00 00
Sep 5 20:57:41 Proteus kernel: print_req_error: critical medium error, dev sdf, sector 5086283928
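The raw sense codes in these lines can be translated to their standard SCSI meanings. This is a minimal sketch, assuming the kernel's log format shown above. The lookup tables cover only the codes seen in this thread, and `describe_sense` is a hypothetical helper name, not an existing tool.

```python
import re

# Standard SCSI sense keys and ASC/ASCQ pairs, limited to the ones in this thread.
SENSE_KEYS = {0x3: "MEDIUM ERROR", 0x5: "ILLEGAL REQUEST"}
ASC_ASCQ = {
    (0x11, 0x00): "UNRECOVERED READ ERROR",
    (0x21, 0x00): "LOGICAL BLOCK ADDRESS OUT OF RANGE",
}


def describe_sense(lines):
    """Hypothetical helper: pull Sense Key / ASC / ASCQ out of kernel log lines."""
    key = asc = ascq = None
    for line in lines:
        m = re.search(r"Sense Key : (0x[0-9a-fA-F]+)", line)
        if m:
            key = int(m.group(1), 16)
        m = re.search(r"ASC=(0x[0-9a-fA-F]+) ASCQ=(0x[0-9a-fA-F]+)", line)
        if m:
            asc, ascq = int(m.group(1), 16), int(m.group(2), 16)
    return SENSE_KEYS.get(key, "unknown"), ASC_ASCQ.get((asc, ascq), "unknown")


# Lines copied from the log above.
log = [
    "Sep 5 20:57:41 Proteus kernel: sd 9:0:2:0: [sdf] tag#0 Sense Key : 0x3 [current]",
    "Sep 5 20:57:41 Proteus kernel: sd 9:0:2:0: [sdf] tag#0 ASC=0x11 ASCQ=0x0",
]
result = describe_sense(log)
print(result)
```

Run against the September 3 log in the first post, the same helper reports an illegal request with a logical block address out of range, which is a different kind of error from the medium error here.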

I ran a short SMART test against the drive right before I started the rebuild, and it came back successful. Next steps?

proteus-diagnostics-20180905-2155.zip

September 6, 2018

Community Expert

Disk1 is failing and needs to be replaced.
