(SOLVED) Disk disabled during parity check (Unraid 6.7.2)


Kevek79


Dear fellow UnRaiders,

 

This morning, during a nice breakfast with the family, my server sent me a notification that one of my drives was disabled.

As this is my first real "red X" event in my time using Unraid, I want to make sure I don't make any mistakes.

 

The regular monthly parity check started last night at 04:00.

The notification about the disabled drive arrived this morning at around 08:30.

The data is emulated; I did a quick spot check on the network shares, and data that resides on the emulated disk could be read.

The array is still running; the parity check was aborted by the server.

 

The drive in question is one of my oldest and should have been replaced by now, but what can I say ;)

I have 2 precleared replacement drives (hot spares) available, so I should be in good shape for replacing the disabled drive.

 

Now I just want to make sure that I have the drive replacement procedure right, and I hope someone can look at my diagnostics and give me a hint about what went wrong (besides using old drives ;) ) and what the best way forward would be.

 

Procedure to replace a data drive:

1. Stop Array

2. Unassign drive 3
(Do I need to start the array once with no drive assigned to the disk 3 slot?)

3. Assign one of the hot spares to disk 3 slot

4. Go to the Main -> Array Operation section

5. Put a check in the Yes, I'm sure checkbox (next to the information indicating the drive will be rebuilt)

6. Start the Array

 

Is the disk replacement procedure described above correct?

Can someone help me in finding out what caused the disk to be disabled?

 

Thanks in advance

morpheus-diagnostics-20201115-0945.zip

Edited by Kevek79
Diagnostics attached

Your rebuild steps look fine (and there is no need for the extra step you asked about; that is only required when rebuilding onto the same disk).
 

Looking at the syslog, about 4 hours into the parity check you suddenly started getting read errors, beginning with:

Nov 15 08:29:33 Morpheus kernel: sd 2:0:1:0: [sdd] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Nov 15 08:29:33 Morpheus kernel: sd 2:0:1:0: [sdd] tag#0 Sense Key : 0x3 [current] 
Nov 15 08:29:33 Morpheus kernel: sd 2:0:1:0: [sdd] tag#0 ASC=0x11 ASCQ=0x0 
Nov 15 08:29:33 Morpheus kernel: sd 2:0:1:0: [sdd] tag#0 CDB: opcode=0x28 28 00 db 27 25 b8 00 04 00 00
Nov 15 08:29:33 Morpheus kernel: print_req_error: critical medium error, dev sdd, sector 3676775864
Nov 15 08:29:33 Morpheus kernel: md: disk3 read error, sector=3676775800
Nov 15 08:29:33 Morpheus kernel: md: disk3 read error, sector=3676775808
Nov 15 08:29:33 Morpheus kernel: md: disk3 read error, sector=3676775816
Nov 15 08:29:33 Morpheus kernel: md: disk3 read error, sector=3676775824

and after a while it started getting write errors as well. I am not sure, but my guess is that this indicates something failing internally within the drive.
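For anyone who wants to cross-check the first error by hand, here is a minimal Python sketch that decodes the CDB line from the excerpt above. It is not taken from the diagnostics. It assumes the standard SCSI READ(10) layout (opcode 0x28, a 4-byte big-endian LBA in bytes 2-5, a 2-byte transfer length in bytes 7-8), and the function name decode_read10 is made up for illustration.

```python
def decode_read10(cdb_hex: str) -> dict:
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes.

    Hypothetical helper for illustration; assumes the standard READ(10)
    layout and rejects anything that is not 10 bytes with opcode 0x28.
    """
    cdb = bytes.fromhex(cdb_hex)  # whitespace between bytes is ignored
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    return {
        "lba": int.from_bytes(cdb[2:6], "big"),     # first sector requested
        "blocks": int.from_bytes(cdb[7:9], "big"),  # number of sectors requested
    }


# CDB bytes copied from the syslog line above
info = decode_read10("28 00 db 27 25 b8 00 04 00 00")
print(info)  # {'lba': 3676775864, 'blocks': 1024}
```

The decoded LBA matches the sector reported in the print_req_error line (3676775864), so the very first sector of that read request already failed. Sense Key 0x3 with ASC=0x11 ASCQ=0x0 is the SCSI code for a medium error (unrecovered read error), which fits the "critical medium error" text. The md sector numbers are 64 lower than the device sector; that would be consistent with the partition starting at sector 64, but this is an assumption and not something confirmed in this thread.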


Thanks @itimpi for confirming the procedure.

 

I did see this portion of the syslog, and I also think that the drive is just dying after more than 60,000 h in operation (tough little spinner).

 

I will start rebuilding on a new disk as soon as I am back home.

In the meantime, let's see if anyone else has seen these error messages before and knows where they come from.
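In case it helps anyone who runs into the same messages, below is a small Python sketch for summarising the md errors in a syslog per disk. It is only a sketch under assumptions: the pattern is based on the "md: disk3 read error, sector=..." lines quoted earlier in this thread, write errors are assumed to use the same format with "write" in place of "read", and the name summarize_md_errors is made up for illustration.

```python
import re
from collections import defaultdict

# Pattern based on the md lines quoted in this thread; the "write" variant
# is an assumption, since only read error lines are shown here.
MD_ERR = re.compile(r"md: (disk\d+) (read|write) error, sector=(\d+)")


def summarize_md_errors(lines):
    """Collect md error sectors per disk, split into read and write errors."""
    summary = defaultdict(lambda: {"read": [], "write": []})
    for line in lines:
        match = MD_ERR.search(line)
        if match:
            disk, kind, sector = match.group(1), match.group(2), int(match.group(3))
            summary[disk][kind].append(sector)
    return dict(summary)


# Sample lines copied from the syslog excerpt posted above
sample = """\
Nov 15 08:29:33 Morpheus kernel: print_req_error: critical medium error, dev sdd, sector 3676775864
Nov 15 08:29:33 Morpheus kernel: md: disk3 read error, sector=3676775800
Nov 15 08:29:33 Morpheus kernel: md: disk3 read error, sector=3676775808
Nov 15 08:29:33 Morpheus kernel: md: disk3 read error, sector=3676775816
Nov 15 08:29:33 Morpheus kernel: md: disk3 read error, sector=3676775824
""".splitlines()

result = summarize_md_errors(sample)
for disk, errors in result.items():
    print(disk, "read errors:", len(errors["read"]), "write errors:", len(errors["write"]))
```

To run it against a real log, the sample list could be replaced with the lines of the syslog file from the diagnostics zip. A long run of consecutive sectors on a single disk points at that one drive rather than at something shared between several drives.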

 

 

 


So I am back in front of my server, and there is one more question for the gurus in the forum before I attempt the rebuild of my disabled disk.

 

As stated above the server was in the middle of a scheduled parity check when disk 3 got disabled this morning.

When I look at my Main page now, under Array Operation it says "read check in progress" and indicates that the read check is paused.

 

Is this expected behaviour?

Do I need to stop the paused read-check first before I stop the array?

 

Could the paused parity / read check somehow interfere with the rebuild I am trying to achieve with the procedure above?

I just want to make sure that I do not make my situation worse.

 

Thank you all for your help.


A parity check would not normally be paused unless you either paused it manually or have the Parity Check Tuning plugin installed and the criteria you set for pauses have been met.

 

Stopping the array automatically abandons any array operation that is currently running.

 

You might want to post your current diagnostics so we can check the current state.

 

 

Quote

Stopping the array automatically abandons any array operation that is currently running.

Do I understand correctly that stopping the array (not restarting the server) will also stop/cancel the read check that is currently paused for whatever reason? And that after restarting the array the system will not resume this paused check, but will start a rebuild of the disabled disk instead?

 

Can I go ahead with stopping the array and assigning the hot spare to the slot of the disabled disk?

 

Has anyone had the chance to look at my latest diagnostics so we can narrow down what happened?

 

Thanks for all your support

Toby
