
Read errors and update strategy



Hello everyone,

 

This morning I woke up to a read error message. It occurred during the parity check, but the check finished with 0 errors. The SMART report is attached. I just ran a short self-test again and it completed without error; the attached report was generated before that self-test.

 

So, some questions:

1) Should I replace the drive right away?

2) I'm unsure what the best approach is in my situation. My drives are oldish. The affected drive is a 4 TB one (disk 2 in the array), which is also the size of my parity drive. I also have no slots left at the moment (Basic license with 4 data drives, one parity and one cache). The thing is, a 4 TB drive costs between 70 and 100 Euro (the 100 Euro one being a WD Red). A WD My Book with 8 TB (to shuck) is 130 Euro (4 TB USB drives are also 100 Euro, as they are just too old). So it's not a lot more for double the storage. But I can't just replace disk 2 with an 8 TB drive, because the parity drive has to be at least as large as the largest data drive, so I would first need to replace the parity drive.

 

I'm kinda unsure how to go forward. Should I replace the parity drive first and then disk 2? I'm afraid that disk 2 might fail during the parity rebuild that replacing the parity drive requires, or that the read errors would cause problems there. Is it safer to just replace the faulty disk 2 and, once everything is back to normal, go forward with upgrading? I also read about parity swap in the documentation, but it sounds "big" (it's been some time since I fiddled around with the array) and I'm not sure whether that is the correct route either.

 

Thanks for any suggestions. I'm currently on Unraid 6.8.3.

WDC_WD40EFRX-68WT0N0_WD-WCC4E1892616-20210307-1955.txt


Here you go. Please note that I removed log entries older than March from the syslog (the oldest were from October 2020) and also removed most of the mover logs, as they basically just listed “moving file ABC” and nothing else. I left in those that are close to the errors. You can see the errors starting at line 127 in the syslog.

 

Hope this helps.

 

mod-unraidtower-diagnostics-20210309-2038.zip


In the syslog it's also logged as a disk problem. What you can do for WD disks is monitor these attributes:

 

ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    70
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    0

 

Ideally both raw values should be 0. When they start climbing it's a bad sign, and if they keep climbing you're likely going to run into more issues in the future.
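
If you want to compare two readings of those attributes by hand, here is a minimal Python sketch. It assumes the report uses the same column layout as the table above (ID, name, flags, value, worst, thresh, fail, raw value); the function names and the later reading of 85 are made up for illustration and are not from this disk or from Unraid itself.

```python
# Attribute IDs from the table above:
# 1 = Raw_Read_Error_Rate, 200 = Multi_Zone_Error_Rate
WATCHED_IDS = (1, 200)

# The two rows posted above, in the same column layout.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    70
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    0
"""


def parse_raw_values(report):
    """Return {attribute ID: raw value} for every attribute row in the text."""
    raw = {}
    for line in report.splitlines():
        parts = line.split()
        # An attribute row has at least 8 columns and starts with a numeric ID;
        # the header row starts with "ID#" and is skipped by the isdigit() check.
        if len(parts) >= 8 and parts[0].isdigit() and parts[7].isdigit():
            raw[int(parts[0])] = int(parts[7])
    return raw


def increased(previous, current, watched=WATCHED_IDS):
    """Return {ID: (old, new)} for watched attributes whose raw value went up."""
    changes = {}
    for attr in watched:
        old = previous.get(attr, 0)
        new = current.get(attr, old)
        if new > old:
            changes[attr] = (old, new)
    return changes


if __name__ == "__main__":
    baseline = parse_raw_values(SAMPLE)
    # Hypothetical later reading, only to show what a climbing value looks like.
    later = {1: 85, 200: 0}
    print(increased(baseline, later))
```

Save the baseline once, then rerun against a fresh report every now and then; an empty result means neither raw value has moved.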


So you suggest to just keep an eye on it for now to see if the numbers get worse? No HDD change necessary?

Can I "accept" the state as is; like "ok UnRAID these 168 read errors are "fine" for now, but warn me again when the number rises"?

 

Thanks!

32 minutes ago, Sono said:

So you suggest to just keep an eye on it for now to see if the numbers get worse?

Yes, you can add those attributes to the ones Unraid monitors, so you will get notified if they increase.

 

32 minutes ago, Sono said:

like "ok UnRAID these 168 read errors are "fine" for now,

You can clear those by rebooting or by clicking on "clear stats".
