March 7, 2021 Hello everyone, this morning I woke up to a read error message. It occurred during the parity check, but the check finished with 0 errors. The SMART report is attached. I have since run a short self-test, which completed without error; the attached report predates that test.

Some questions:

1) Should I replace the drive right away?

2) I'm unsure what the best approach is. My drives are oldish. The affected drive is a 4 TB one (disk 2 in the array), which is also the size of my parity drive. I also have no slots left currently (Basic with 4 data drives, one parity and one cache).

The thing is, a 4 TB drive costs between 70 and 100 Euro (the 100 Euro one being a WD Red). A WD My Book with 8 TB (to shuck) is 130 Euro (4 TB USB drives are also 100 Euro, as they are just too old). So it is not a lot more for double the storage. But I can't just replace disk 2 with an 8 TB drive, as I would first need to replace the parity drive.

I'm unsure how to go forward. If I replace the parity drive first and then disk 2, I'm afraid that disk 2 might break during the parity rebuild that replacing parity requires, or that the read errors would become an issue there. Is it safer to just replace the faulty disk 2 and then, after everything is back to normal, go forward with upgrading? I also read about parity swap in the documentation, but it sounds "big" (it has been some time since I fiddled around with the array) and I'm not sure if that is the correct route either.

Thanks for suggestions. I'm currently on Unraid 6.8.3.

WDC_WD40EFRX-68WT0N0_WD-WCC4E1892616-20210307-1955.txt
March 8, 2021 Community Expert It's logged as a disk problem, so you should run an extended SMART test.
March 9, 2021 Author Attached are the results of the extended SMART test and a screenshot of the Main tab, where the errors are reported for disk 2. WDC_WD40EFRX-68WT0N0_WD-WCC4E1892616-20210309-0540.txt (Edited March 9, 2021 by Sono)
March 9, 2021 Community Expert The SMART test passed, so the disk is OK for now. Keep monitoring for future issues.
March 9, 2021 Author Thanks. I would like to understand this better: how does Unraid recognize these errors, or rather, why do they show up? Is there something I can read to learn more about this?
March 9, 2021 Community Expert If you saved the diags (or haven't rebooted yet), post them; they might give more details on the problem.
March 9, 2021 Author Here you go. Please note that I removed log entries older than March from the syslog (the oldest were from October 2020) and also removed most of the mover logs, as they basically just listed "moving file ABC" and nothing else. I left in those that are close to the errors. You can see the error starting at line 127 in the syslog. Hope this helps. mod-unraidtower-diagnostics-20210309-2038.zip
March 10, 2021 Community Expert In the syslog it's also logged as a disk problem. What you can do for WD disks is monitor these attributes:

ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    70
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    0

Ideally both raw values should be 0. When they start climbing it's a bad sign, and if they keep climbing you're likely to run into more issues in the future.
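For anyone who wants to track those two attributes outside the Unraid GUI, here is a minimal Python sketch. It assumes the attribute table from each saved SMART report is available as text in the brief format shown in this thread. The names `WATCHED`, `parse_raw_values` and `increased` are made up for illustration and are not part of Unraid or smartmontools.

```python
import re

# Rows copied from the SMART report quoted in this thread
# (columns: ID, name, flags, value, worst, thresh, fail, raw).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    70
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    0
"""

# Attribute IDs suggested above for WD disks (hypothetical constant name).
WATCHED = {1, 200}

# One attribute row: eight whitespace-separated columns, ID first, raw value last.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\S+)\s+(\d+)"
)


def parse_raw_values(report):
    """Return {attribute_id: raw_value} for the watched attributes."""
    values = {}
    for line in report.splitlines():
        match = ROW.match(line)
        if match and int(match.group(1)) in WATCHED:
            values[int(match.group(1))] = int(match.group(8))
    return values


def increased(previous, current):
    """Return the sorted IDs whose raw value is higher than in the previous report."""
    return sorted(
        attr_id
        for attr_id, raw in current.items()
        if raw > previous.get(attr_id, 0)
    )


if __name__ == "__main__":
    baseline = parse_raw_values(SAMPLE)
    print(baseline)
    # Compare against a later report; here the same report is reused,
    # so nothing is flagged as having increased.
    print(increased(baseline, parse_raw_values(SAMPLE)))
```

The idea is simply to keep the raw values from one report as a baseline and flag any watched attribute whose raw value is higher in a later report.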
March 10, 2021 Author So you suggest to just keep an eye on it for now to see if the numbers get worse? No HDD change necessary? Can I "accept" the state as is, like "ok UnRAID these 168 read errors are "fine" for now, but warn me again when the number rises"? Thanks!
March 10, 2021 Community Expert

32 minutes ago, Sono said: So you suggest to just keep an eye on it for now to see if the numbers get worse?

Yes, you can add those attributes to the ones Unraid monitors, so you will get notified if they increase.

32 minutes ago, Sono said: like "ok UnRAID these 168 read errors are "fine" for now,

You can clear those by rebooting or by clicking on "clear stats".