Read errors and update strategy

Sono · March 7, 2021

Hello everyone,

This morning I woke up to a read error message. It occurred during the parity check, but the check finished with 0 errors. SMART report attached. I just ran a short self-test again and it says it completed without error. The attached report was generated before the self-test I just ran.

So, some questions:

1) Should I replace the drive right away?

2) I'm unsure what the best approach is for my situation. My drives are oldish. The affected drive is a 4 TB one (disk 2 in the array), which is also the size of my parity drive. I also have no slots left currently (Basic with 4 data drives, one parity and one cache). The thing is, a 4 TB drive costs between 70 and 100 Euro (the 100 Euro one being a WD RED). A WD My Book with 8 TB (to shuck) is 130 Euro (4 TB USB drives are also 100 Euro, as they are just too old). So it's not a lot more for double the storage. But I can't just replace disk 2 with an 8 TB drive, as I would first need to replace the parity drive.

I'm kinda unsure how to go forward. If I replace the parity drive first and then disk 2, I'm afraid that disk 2 might fail during the parity rebuild that replacing parity requires, or that the read errors would cause problems during that rebuild. Is it safer to just replace the faulty disk 2 and then, after everything is back to normal, go forward with upgrading? I also read about parity swap in the documentation, but it sounds "big" (it has been some time since I fiddled around with the array) and I'm not sure if that is the correct route either.

Thanks for suggestions. I'm currently on unraid 6.8.3.

WDC_WD40EFRX-68WT0N0_WD-WCC4E1892616-20210307-1955.txt

JorgeB · March 8, 2021

It's logged as a disk problem, you should run an extended SMART test.

Sono · March 9, 2021

Attached are the results of the extended SMART test and also a screenshot of the view under the Main tab where errors are reported for disk 2.

WDC_WD40EFRX-68WT0N0_WD-WCC4E1892616-20210309-0540.txt

Edited March 9, 2021 by Sono

JorgeB · March 9, 2021

SMART test passed so disk is OK for now, keep monitoring for future issues.

Sono · March 9, 2021

Thanks. I would like to understand this better and how unraid recognizes these errors or rather why they show up.

Is there something I can read up on to get more knowledge about this?

JorgeB · March 9, 2021

If you saved the diags (or haven't rebooted yet) post them, they might give more details on the problem.

Sono · March 9, 2021

Here you go. Please note that I removed log entries older than March (the oldest were from October 2020) from the syslog and also removed most of the mover logs, as they basically just listed “moving file ABC” and nothing else. I left in those that are close to the errors. You can see the errors starting at line 127 in the syslog.

Hope this helps.

mod-unraidtower-diagnostics-20210309-2038.zip

JorgeB · March 10, 2021

In the syslog it's also logged as a disk problem. What you can do for WD disks is monitor these attributes:

ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    70
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    0

Ideally both raw values should be 0. When they start climbing it's a bad sign, and if they keep climbing you're likely to run into more issues in the future.
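As a rough illustration of checking those two attributes by hand, here is a minimal Python sketch that reads a SMART attribute table in the brief format shown above and reports the raw values for IDs 1 and 200. The sample text is copied from this post; the function names are made up for the example, and getting the table from `smartctl -A -f brief /dev/sdX` (with a placeholder device path) is an assumption about how you would collect it.

```python
# Sample table copied from the post above. In practice this text might come
# from `smartctl -A -f brief /dev/sdX` (device path is a placeholder).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    70
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    0
"""

# The two attribute IDs suggested for WD disks in this thread.
WATCHED_IDS = {1, 200}


def parse_raw_values(report):
    """Return {attribute_id: (attribute_name, raw_value)} for watched IDs."""
    values = {}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 8 columns;
        # this also skips the header row, whose first field is "ID#".
        if len(fields) < 8 or not fields[0].isdigit():
            continue
        attr_id = int(fields[0])
        if attr_id not in WATCHED_IDS:
            continue
        try:
            raw = int(fields[7])  # RAW_VALUE is the 8th column
        except ValueError:
            continue  # raw value not a plain integer, ignore in this sketch
        values[attr_id] = (fields[1], raw)
    return values


def nonzero_attributes(values):
    """Names of watched attributes whose raw value is not 0."""
    return [name for name, raw in values.values() if raw != 0]


if __name__ == "__main__":
    parsed = parse_raw_values(SAMPLE)
    for attr_id, (name, raw) in sorted(parsed.items()):
        print(f"{attr_id:>3} {name}: raw={raw}")
    print("non-zero:", nonzero_attributes(parsed))
```

With the table from this thread, only Raw_Read_Error_Rate is reported as non-zero. A single non-zero reading matters less than whether the value keeps climbing.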

Sono · March 10, 2021

So you suggest to just keep an eye on it for now to see if the numbers get worse? No HDD change necessary?

Can I "accept" the state as is; like "ok UnRAID these 168 read errors are "fine" for now, but warn me again when the number rises"?

Thanks!

JorgeB · March 10, 2021

32 minutes ago, Sono said:

So you suggest to just keep an eye on it for now to see if the numbers get worse?

Yes, you can add those attributes to the ones Unraid monitors, so you will get notified if they increase.
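To show the idea of "notify only when the value increases", here is a small Python sketch that compares a saved baseline of raw values against a later reading. It is not how Unraid implements its monitoring; the function name is invented for the example, the baseline numbers are the ones posted in this thread, and the later reading is hypothetical.

```python
def increased_attributes(baseline, current):
    """Return {name: (old_raw, new_raw)} for attributes whose raw value rose."""
    changes = {}
    for name, now in current.items():
        before = baseline.get(name)
        # Only report attributes present in the baseline that have grown.
        if before is not None and now > before:
            changes[name] = (before, now)
    return changes


# Raw values from the SMART report posted in this thread.
baseline = {"Raw_Read_Error_Rate": 70, "Multi_Zone_Error_Rate": 0}

# Hypothetical later reading, invented purely for illustration.
later = {"Raw_Read_Error_Rate": 85, "Multi_Zone_Error_Rate": 0}

if __name__ == "__main__":
    for name, (old, new) in increased_attributes(baseline, later).items():
        print(f"{name} rose from {old} to {new}")
```

Comparing against a baseline rather than against zero is what lets the existing value of 70 be treated as the accepted starting point.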

32 minutes ago, Sono said:

like "ok UnRAID these 168 read errors are "fine" for now,

You can clear those by rebooting or clicking on "clear stats".
