Disk disabled due to read errors


eribob


Hi! 

My array had been working perfectly until today, when one of the disks was suddenly disabled due to read errors. However, the SMART report seems to indicate that the disk is healthy (it shows FAILED: Never on all attributes, as far as I can see). I have posted diagnostics. Should I replace the disk, or could this be some other kind of bug? I never had a warning from Unraid about the disk before today, which is strange if it was failing.

 

Best regards 

Erik

monsterservern-diagnostics-20200902-1225.zip


Hi again! 

I am following your instructions. I replaced the disk's data cable and removed the disk from the array. After that I re-added it, and the disk is now rebuilding. However, the rebuild process keeps getting paused with the message: 

Parity Tuning Operation: 2020-09-02 16:05

Notification
unknown action: recon D1 (1.6% completed) Pause

I can resume the process again when it pauses and it will run for another couple of minutes or so, but then the same thing happens again. 

 

The system log also mentions the drives being overheated. Is that causing the recon D1 problem? 

Sep  2 16:00:34 Monsterservern kernel: md: recovery thread: recon D1 ...
Sep  2 16:05:01 Monsterservern parity.check.tuning.php: Paused unknown action: recon D1  (1.6% completed) : Following drives overheated: 34 34 34 31 
Sep  2 16:05:01 Monsterservern kernel: mdcmd (44): nocheck PAUSE
Sep  2 16:05:01 Monsterservern kernel: 
Sep  2 16:05:02 Monsterservern kernel: md: recovery thread: exit status: -4
Sep  2 16:08:04 Monsterservern kernel: mdcmd (45): check Resume
Sep  2 16:08:04 Monsterservern kernel: md: recovery thread: recon D1 ...
Sep  2 16:10:02 Monsterservern parity.check.tuning.php: Paused unknown action: recon D1  (2.1% completed) : Following drives overheated: 34 34 34 31 
Sep  2 16:10:02 Monsterservern kernel: mdcmd (46): nocheck PAUSE
Sep  2 16:10:02 Monsterservern kernel: 
Sep  2 16:10:03 Monsterservern kernel: md: recovery thread: exit status: -4
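To see the pause pattern at a glance, the plugin's "Paused" lines can be pulled out of the syslog. The following is a minimal sketch, assuming the line format is exactly as in the excerpt above and that the numbers after "Following drives overheated:" are drive temperatures in °C. The function name `parse_pauses` and the regular expression are illustrative, not part of Unraid or the plugin.

```python
import re

# Matches the plugin's pause lines as they appear in the excerpt above.
# Assumption: the trailing numbers are drive temperatures in degrees C.
PAUSE_RE = re.compile(
    r"parity\.check\.tuning\.php: Paused .*?"
    r"\((?P<pct>[\d.]+)% completed\)\s*:"
    r"\s*Following drives overheated:\s*(?P<temps>[\d\s]+)"
)


def parse_pauses(lines):
    """Return one dict per pause line: completion percent and logged values."""
    events = []
    for line in lines:
        match = PAUSE_RE.search(line)
        if match:
            events.append({
                "percent": float(match.group("pct")),
                "temps": [int(t) for t in match.group("temps").split()],
            })
    return events


SAMPLE_LOG = [
    "Sep  2 16:00:34 Monsterservern kernel: md: recovery thread: recon D1 ...",
    "Sep  2 16:05:01 Monsterservern parity.check.tuning.php: Paused unknown action: recon D1  (1.6% completed) : Following drives overheated: 34 34 34 31 ",
    "Sep  2 16:05:01 Monsterservern kernel: mdcmd (44): nocheck PAUSE",
    "Sep  2 16:10:02 Monsterservern parity.check.tuning.php: Paused unknown action: recon D1  (2.1% completed) : Following drives overheated: 34 34 34 31 ",
]

for event in parse_pauses(SAMPLE_LOG):
    print(event["percent"], event["temps"])
```

Run against a full syslog, this would list every pause with the progress reached and the values the plugin reported at that moment.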

Perhaps I should remove the side panels from the case and attempt to continue? 

monsterservern-diagnostics-20200902-1614.zip

2 hours ago, eribob said:

Genius! It actually said so in the logs, I am too stressed to check properly. Thank you! 

The message being output by the Parity Check Tuning plugin does not look quite right. Is there any chance you could go into its settings, set the Debug logging to the 'testing' level, reproduce the symptoms, and then post a copy of the syslog (or diagnostics, which include the syslog) so I can get more detail on exactly what is happening? After doing that, turn the level down again. My system does not suffer from drives getting too hot, so I have trouble testing all real-world scenarios relating to the temperature checks. On the face of it you may have the temperature settings too low, but I am not sure. You can also disable the option to pause/resume based on temperature.


I disabled the option "pause and resume array operations if disks overheat" in the Parity Check Tuning plugin. My warning disk temperature was set to 45 and critical to 55 (I believe these are the defaults, since I can't remember ever changing those values). I have now also raised the warning to 50 and critical to 60. Since disabling the "pause if overheat" option, the rebuild process has been progressing without problems (now at 39%). So most likely it was pausing due to temperatures approaching the warning level.
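As a quick sanity check, the values from the pause messages (34 34 34 31) can be compared against these thresholds. This is a minimal sketch, assuming those values are drive temperatures in °C. The `classify` helper and its comparison rule are hypothetical and are not taken from the plugin's actual logic.

```python
def classify(temps, warning, critical):
    """Label each temperature against the warning/critical thresholds.

    Hypothetical rule: a value at or above a threshold triggers that level.
    """
    labels = []
    for temp in temps:
        if temp >= critical:
            labels.append("critical")
        elif temp >= warning:
            labels.append("warning")
        else:
            labels.append("ok")
    return labels


logged = [34, 34, 34, 31]  # values from the "drives overheated" log lines

# Original settings (warning 45, critical 55) and the raised ones (50, 60).
print(classify(logged, warning=45, critical=55))
print(classify(logged, warning=50, critical=60))
```

Under this simple rule every logged value is below even the original warning level of 45, so the thresholds alone may not fully explain the pauses. The debug-level syslog requested earlier in the thread would be the way to confirm what the plugin actually compared.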

 

Since I have important data on my array and no parity protection until the rebuild is finished, I want to let the rebuild complete first, so I would rather not try to reproduce the error right now. 

 

Thanks again for quick support!

