(SOLVED) Disk disabled during parity check (Unraid 6.7.2)

Kevek79 · November 15, 2020

Dear fellow UnRaiders,

This morning, during a nice breakfast with the family, my server sent me a notification that one of my drives was disabled.

As this is my first real "red X" event in my time using Unraid, I want to make sure I don't make any mistakes.

The regular monthly parity check started at 04:00 this morning.

The notification about the disabled drive arrived at around 08:30.

The data is emulated. I did a quick spot check on the network shares, and data that resides on the emulated disk could be read.

The array is still running; the parity check was aborted by the server.

The drive in question is one of my oldest and should have been replaced by now, but what can I say.

I have 2 precleared replacement drives (hot spares) available, so I should be in good shape for replacing the disabled drive.

Now I just want to make sure that I have the drive replacement procedure right, and I hope someone can look into my diagnostics and give me a hint about what went wrong (besides using old drives) and what the best way forward would be.

Procedure to replace a data drive:

1. Stop Array

2. Unassign drive 3
(Do I need to start the array once with no drive assigned to the disk 3 slot?)

3. Assign one of the hot spares to disk 3 slot

4. Go to the Main -> Array Operation section

5. Put a check in the Yes, I'm sure checkbox (next to the information indicating the drive will be rebuilt)

6. Start the Array

Is the procedure for disk replacement correct as described above?

Can someone help me in finding out what caused the disk to be disabled?

Thanks in advance

morpheus-diagnostics-20201115-0945.zip

Edited November 15, 2020 by Kevek79
Diagnostics attached

itimpi · November 15, 2020

Your rebuild steps look fine (and there is no need for the extra step you queried; that is only required when trying to rebuild to the same disk).

According to the syslog, about 4 hours into the parity check you suddenly got read errors, starting with

Nov 15 08:29:33 Morpheus kernel: sd 2:0:1:0: [sdd] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Nov 15 08:29:33 Morpheus kernel: sd 2:0:1:0: [sdd] tag#0 Sense Key : 0x3 [current] 
Nov 15 08:29:33 Morpheus kernel: sd 2:0:1:0: [sdd] tag#0 ASC=0x11 ASCQ=0x0 
Nov 15 08:29:33 Morpheus kernel: sd 2:0:1:0: [sdd] tag#0 CDB: opcode=0x28 28 00 db 27 25 b8 00 04 00 00
Nov 15 08:29:33 Morpheus kernel: print_req_error: critical medium error, dev sdd, sector 3676775864
Nov 15 08:29:33 Morpheus kernel: md: disk3 read error, sector=3676775800
Nov 15 08:29:33 Morpheus kernel: md: disk3 read error, sector=3676775808
Nov 15 08:29:33 Morpheus kernel: md: disk3 read error, sector=3676775816
Nov 15 08:29:33 Morpheus kernel: md: disk3 read error, sector=3676775824

and after a while you started getting write errors as well. I am not sure, but my guess is that this indicates something failing internally within the drive.
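
For anyone who wants to read those kernel lines in more detail, here is a minimal Python sketch that decodes the CDB shown above. It assumes the standard READ(10) layout (opcode 0x28, a 4-byte big-endian LBA in bytes 2-5, a 2-byte transfer length in bytes 7-8) and the standard SCSI meanings of sense key 0x3 and ASC/ASCQ 0x11/0x00. The function name and the lookup tables are just for illustration and are not part of any Unraid tooling.

```python
# Standard SCSI meanings for the codes that appear in the log excerpt.
SENSE_KEYS = {0x3: "MEDIUM ERROR"}
ASC_ASCQ = {(0x11, 0x00): "UNRECOVERED READ ERROR"}


def decode_read10(cdb_hex: str) -> dict:
    """Decode a READ(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    return {
        "lba": int.from_bytes(cdb[2:6], "big"),     # first block requested
        "blocks": int.from_bytes(cdb[7:9], "big"),  # number of blocks requested
    }


# Values copied from the syslog lines above.
cdb = decode_read10("28 00 db 27 25 b8 00 04 00 00")
kernel_sector = 3676775864   # "critical medium error, dev sdd, sector ..."
first_md_sector = 3676775800  # first "md: disk3 read error, sector=..."

print(SENSE_KEYS[0x3], "/", ASC_ASCQ[(0x11, 0x00)])
print("request starts at LBA", cdb["lba"], "for", cdb["blocks"], "blocks")
print("failing sector is the first block of the request:", cdb["lba"] == kernel_sector)

# The md sector numbers are 64 lower than the device sector. That would fit
# md counting from the start of a partition that begins at sector 64, but
# that interpretation is an assumption, not something the log states.
offset = kernel_sector - first_md_sector
print("device sector minus md sector:", offset)
```

So the drive itself reported that it could not read the medium at the very first block of that request, which points at the disk rather than at cabling or the controller.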

Kevek79 · November 15, 2020

Thanks @itimpi for confirming the procedure.

I did see this portion of the syslog, and I also think that the drive is simply dying after more than 60,000 hours in operation (tough little spinner).

I will start rebuilding on a new disk as soon as I am back home.

In the meantime, let's see if anyone else has seen these error messages before and knows where they come from.

Kevek79 · November 15, 2020

So I am back in front of my server, and there is one more question for the gurus in the forum before I attempt the rebuild of my disabled disk.

As stated above the server was in the middle of a scheduled parity check when disk 3 got disabled this morning.

When I look at my Main page now, under Array Operation it says "read check in progress" and indicates that the read check is paused.

Is this expected behaviour?

Do I need to stop the paused read check first before I stop the array?

Could the paused parity / read check somehow interfere with the rebuild I am trying to achieve with the procedure above?

I just want to make sure that I do not make my situation worse.

Thank you all for your help.

itimpi · November 15, 2020

A parity check would not normally be paused unless you either paused it manually or have the Parity Check Tuning plugin installed and the criteria you set for pauses have been met.

Stopping the array automatically abandons any array operation that is currently running.

You might want to post your current diagnostics so we can check the current state.

Kevek79 · November 15, 2020

Current diagnostics attached

Besides stopping all Docker containers and stopping the scheduler from running the mover, nothing has changed since I pulled diagnostics this morning.

Thanks for investigating @itimpi

morpheus-diagnostics-20201115-2138.zip

Kevek79 · November 16, 2020

Quote

stopping the array automatically abandons any array operation that is currently running.

Do I understand correctly that stopping the array (not restarting the server) will also cancel the read check that is currently paused for whatever reason? And that after I restart the array, the system will not resume this paused check but will start a rebuild of the disabled disk instead?

Can I go forward with stopping the array and assigning the hot spare to the slot of the disabled disk?

Has anyone had the chance to look at my latest diagnostics so we can narrow down what happened?

Thanks for all your support

Toby

trurl · November 16, 2020

6 hours ago, Kevek79 said:

Can I go forward with stopping the array and assigning the hot spare to the slot of the disabled disk?

yes

trurl · November 16, 2020

6 hours ago, Kevek79 said:

latest diagnostics so we can narrow down what happened?

Looks like a disk problem. SMART attribute 1 is an important indicator on WD Red drives so you should add that to the attributes monitored on those disks.
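
If you also want to check that attribute by hand, here is a rough Python sketch that pulls attribute 1 out of the text that `smartctl -A /dev/sdX` prints. It assumes the usual ten-column attribute table; the function names are hypothetical, and the sample rows are made up for illustration and are not taken from the diagnostics in this thread.

```python
def parse_smart_attributes(text: str) -> dict:
    """Map attribute ID -> (name, raw value string) from `smartctl -A` text."""
    attrs = {}
    for line in text.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header row starts with "ID#" and is skipped by the isdigit test.
        if len(parts) >= 10 and parts[0].isdigit():
            attrs[int(parts[0])] = (parts[1], " ".join(parts[9:]))
    return attrs


def raw_read_error_rate(attrs: dict) -> int:
    """Return the raw value of attribute 1 as an integer."""
    return int(attrs[1][1].split()[0])


# Made-up sample rows in the usual smartctl layout (NOT from this thread).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
"""

attrs = parse_smart_attributes(SAMPLE)
print(attrs[1][0], "raw value:", raw_read_error_rate(attrs))
```

Comparing that raw value between two diagnostics runs shows whether it is changing, which is what the notification on a monitored attribute would alert you to.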

Kevek79 · November 16, 2020

Thanks to @trurl and @itimpi!

Rebuild is now running.

Will let you know how it goes.

Kevek79 · November 17, 2020

After about 15 hours of rebuilding, the server is back up and running as usual.

Thanks for all the help.

Thread Tagged Solved
