Sono Posted March 7, 2021
Hello everyone,

this morning I woke up to a read error message. It occurred during the parity check, but the check finished with 0 errors. The SMART report is attached. I just ran a short self-test again and it completed without error; the attached report is from before that self-test.

Some questions:

1) Should I replace the drive right away?

2) I'm unsure what the best approach is for my situation. My drives are oldish. The affected drive is a 4 TB one (disk 2 in the array), which is also the size of my parity drive. I also have no slots left currently (Basic with 4 data drives, one parity and one cache). A 4 TB drive costs 70-100 Euro (the 100 Euro one being a WD Red). A WD My Book with 8 TB (to shuck) is 130 Euro (4 TB USB drives are also 100 Euro, as they are just too old). So it is not a lot more for double the storage. But I can't just replace disk 2 with an 8 TB drive, as I would first need to replace the parity drive.

I'm unsure how to go forward. If I replace the parity drive first and then disk 2, I'm afraid that the drive might fail during the parity rebuild that replacing parity requires, or that the read errors would become an issue there. Is it safer to just replace the faulty disk 2 and then, after everything is back to normal, go forward with upgrading? I also read about parity swap in the documentation, but it sounds "big" (it has been some time since I fiddled around with the array) and I'm not sure if that is the correct route either.

Thanks for suggestions. I'm currently on Unraid 6.8.3.

WDC_WD40EFRX-68WT0N0_WD-WCC4E1892616-20210307-1955.txt
JorgeB Posted March 8, 2021
It's logged as a disk problem; you should run an extended SMART test.
Sono Posted March 9, 2021 (Edited March 9, 2021 by Sono)
Attached are the results of the extended SMART test and also a screenshot of the view under the Main tab, where errors are reported for disk 2.

WDC_WD40EFRX-68WT0N0_WD-WCC4E1892616-20210309-0540.txt
JorgeB Posted March 9, 2021
The SMART test passed, so the disk is OK for now; keep monitoring for future issues.
Sono Posted March 9, 2021
Thanks. I would like to understand this better: how does Unraid recognize these errors, or rather, why do they show up? Is there something I can read up on to learn more about this?
JorgeB Posted March 9, 2021
If you saved the diags (or haven't rebooted yet), post them; they might give more details on the problem.
Sono Posted March 9, 2021
Here you go. Please note that I removed log entries older than March from the syslog (the oldest were from October 2020) and also removed most of the mover logs, as they basically just listed "moving file ABC" and nothing else. I left in those that are close to the errors. You can see the error starting at line 127 in the syslog. Hope this helps.

mod-unraidtower-diagnostics-20210309-2038.zip
JorgeB Posted March 10, 2021
In the syslog it's also logged as a disk problem. What you can do for WD disks is monitor these attributes:

ID#  ATTRIBUTE_NAME         FLAGS   VALUE WORST THRESH FAIL RAW_VALUE
  1  Raw_Read_Error_Rate    POSR-K  200   200   051    -    70
200  Multi_Zone_Error_Rate  ---R--  200   200   000    -    0

Ideally both raw values should be 0. When they start climbing it's a bad sign, and if they keep climbing you're likely to run into more issues in the future.
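For anyone who wants to check these two attributes outside the GUI, here is a minimal sketch, not anything Unraid itself runs. It assumes the SMART attribute table is available as plain text in the eight-column layout shown above (for example from a saved SMART report); the function name `parse_raw_values` and the `WATCHED_IDS` set are made up for this illustration.

```python
# Sketch: pull the raw values of attributes 1 and 200 out of a SMART
# attribute table given as text. Assumes the eight-column layout from
# the post above; names here are illustrative, not part of any tool.

WATCHED_IDS = {1, 200}  # Raw_Read_Error_Rate, Multi_Zone_Error_Rate


def parse_raw_values(smart_text):
    """Return {attribute_name: raw_value} for the watched attribute IDs."""
    values = {}
    for line in smart_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 8 columns.
        if len(fields) >= 8 and fields[0].isdigit():
            attr_id = int(fields[0])
            if attr_id in WATCHED_IDS and fields[7].isdigit():
                values[fields[1]] = int(fields[7])
    return values


# The two rows quoted in the post above.
SAMPLE = """\
ID#  ATTRIBUTE_NAME         FLAGS   VALUE WORST THRESH FAIL RAW_VALUE
  1  Raw_Read_Error_Rate    POSR-K  200   200   051    -    70
200  Multi_Zone_Error_Rate  ---R--  200   200   000    -    0
"""

raw = parse_raw_values(SAMPLE)
nonzero = [name for name, value in raw.items() if value > 0]
print(raw)
print("Non-zero:", nonzero)
```

With the rows from this disk, only Raw_Read_Error_Rate comes back non-zero (70), which matches the advice to keep watching it rather than replace the drive immediately.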
Sono Posted March 10, 2021
So you suggest to just keep an eye on it for now to see if the numbers get worse? No HDD change necessary? Can I "accept" the state as is, like "OK Unraid, these 168 read errors are 'fine' for now, but warn me again when the number rises"? Thanks!
JorgeB Posted March 10, 2021
Sono said: So you suggest to just keep an eye on it for now to see if the numbers get worse?
Yes, you can add those attributes to the ones Unraid monitors, so you will get notified if they increase.
Sono said: like "ok UnRAID these 168 read errors are "fine" for now,
You can clear those by rebooting or by clicking on "clear stats".
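The "notify me only if it gets worse" idea can be sketched as a comparison against a saved baseline. This is only an illustration of the logic, not how Unraid implements its monitoring; the function name `increased_attributes` is invented here, and the later reading of 85 is a hypothetical value, not something from this disk.

```python
# Sketch: report only attributes whose raw value rose above a saved
# baseline. Illustrative logic only; not Unraid's implementation.


def increased_attributes(baseline, current):
    """Return {name: (old, new)} for attributes whose raw value increased."""
    changes = {}
    for name, new_value in current.items():
        old_value = baseline.get(name, 0)  # unseen attributes start at 0
        if new_value > old_value:
            changes[name] = (old_value, new_value)
    return changes


# Baseline taken from the SMART report discussed in this thread.
baseline = {"Raw_Read_Error_Rate": 70, "Multi_Zone_Error_Rate": 0}

# Hypothetical later reading, made up for the example.
later = {"Raw_Read_Error_Rate": 85, "Multi_Zone_Error_Rate": 0}

print(increased_attributes(baseline, baseline))
print(increased_attributes(baseline, later))
```

Comparing the baseline with itself reports nothing, so the current value of 70 is effectively "accepted"; only a later rise would show up.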