superderpbro Posted March 6, 2022 (edited)
Woke up to an email from my server saying 23 read errors on 1 disc.

Event: Unraid array errors
Subject: Warning [UNDERP] - array has errors
Description: Array has 1 disk with read errors
Importance: warning
Disk 4 - WDC_WD60EZRZ-00GZ5B1_WD-WXN1H26LW7ET (sdg) (errors 23)

Didn't have time to take a look until now. I'm in the middle of an extended test on the drive but I've attached the diagnostics. Funny.. I was just thinking the other day that I should replace all discs before something goes wrong. My server has been running without problems for longer than I usually like to run spinning discs (most discs have been spinning for 5.5 years-ish). I feel like buying all new discs (including cache SSD) and copying everything over... but money is tight and I'm not even sure how to go about doing that, lol. Sucks not knowing which, if any, files may be corrupt on the disc with errors.
Attachment: underp-diagnostics-20220306-1403.zip
OrdinaryButt Posted March 6, 2022 (edited) Why not just replace this 6TB drive first, assuming you have parity drives?
superderpbro Posted March 7, 2022 (edited) Well, I had planned to downsize. I have 5 6TBs (1 parity) and only 2.8TB of data on the server now. I stopped hoarding. Plus I feel like my drives are so old I'll kill one during the rebuild lol. Could I somehow take it out of the array? Take what data I can off it, put the data into the correct shares (on the other drives), and never add it back? I don't need it. lol
trurl Posted March 7, 2022 SMART attributes for disk1 look OK. Disable spindown on that disk and run an extended SMART test.
superderpbro Posted March 7, 2022 (edited) I never spin down and I am 60% into an extended test. I was in there 3 days ago, upgrading the RAM. Maybe I bumped a cable? If that is it.. odd that it took days to error tho.. one can hope! hehe Also.. it's Disk4.. or 5 if you count parity?
trurl Posted March 7, 2022 superderpbro said: "odd that it took days to error" Maybe it was days before the disk was accessed. SMART attributes for disk4 also look OK. Run an extended SMART test on that one too. I see you have Most Free allocation for many of your shares; that's actually less efficient than the default highwater allocation.
superderpbro Posted March 7, 2022 Disk4 is the one I'm running the test on. Too late to change now (Highwater)? Is it a big difference?
trurl Posted March 7, 2022 superderpbro said: "Disk4 is the one I'm running the test on. Too late to change now (Highwater)? Is it a big difference?" You should run it on disk1 too since it was also reporting errors. Most Free allocation makes Unraid switch disks just because one disk temporarily has more free space than another. That could require waiting for another drive to spin up. If lots of writing is happening, it could also get multiple data disks competing for parity updates at the same time. Since you're using Turbo Write it probably doesn't matter much.
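To make the disk-switching point concrete, here is a minimal Python sketch comparing the two allocation policies. It assumes the commonly documented behaviour: Most Free picks whichever disk currently has the most free space, while high-water picks the lowest-numbered disk whose free space is still above a mark that starts at half the largest disk and halves when no disk qualifies. The function names, disk sizes, and file sizes are made up for illustration, and the sketch ignores split levels, minimum free space, and include/exclude settings.

```python
def pick_most_free(free):
    """Index of the disk with the most free space (ties go to the lowest-numbered disk)."""
    return max(range(len(free)), key=lambda i: (free[i], -i))


def pick_high_water(free, mark):
    """Lowest-numbered disk with free space above the mark; halve the mark if none qualifies."""
    while True:
        for i, f in enumerate(free):
            if f > mark:
                return i, mark
        if mark == 0:
            raise RuntimeError("no disk has any free space left")
        mark //= 2


def simulate(policy, disk_sizes, file_sizes):
    """Return the list of disks chosen for each file under the given policy."""
    free = list(disk_sizes)
    mark = max(disk_sizes) // 2  # initial high-water mark (assumed: half the largest disk)
    chosen = []
    for size in file_sizes:
        if policy == "most-free":
            disk = pick_most_free(free)
        else:
            disk, mark = pick_high_water(free, mark)
        free[disk] -= size
        chosen.append(disk)
    return chosen


# Hypothetical array: four empty 6000 GB data disks, ten 50 GB files.
disks = [6000, 6000, 6000, 6000]
files = [50] * 10
most_free_seq = simulate("most-free", disks, files)
high_water_seq = simulate("high-water", disks, files)
print("most-free :", most_free_seq)
print("high-water:", high_water_seq)
```

In this toy case Most Free rotates through every disk while high-water stays on the first disk until it drops below the mark, which is the switching behaviour described above.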
trurl Posted March 7, 2022 superderpbro said: "Too late to change now" Not too late to change. It won't affect existing files, of course.
superderpbro Posted March 7, 2022 Disk1 is reporting errors too?! I don't spin my disks down. I also don't transfer much data .. *shrug* I may change it once I get all this sorted.
trurl Posted March 7, 2022 superderpbro said: "Disk1 is reporting errors too?!" My bad. Misread the notification in your first post.
superderpbro Posted March 7, 2022 Heh, no worries. Takes SO loooong. lol
superderpbro Posted March 7, 2022 (edited) Been stuck at 90% for hours. Anyways, my cache SSD is at 42TBW and is 5.5 years old (24/7) now. Crucial says its endurance is 80TBW. If I buy some new drives should I also replace it? Of course after I typed that it finished. Without errors. What do I do now? Attachment: underp-diagnostics-20220307-0054.zip
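As a rough way to reason about the SSD question, here is a small Python sketch that extrapolates from the numbers in the post (42 TB written in 5.5 years against a rated 80 TBW). It assumes the write rate stays constant, which is only an approximation, and the function name is made up for illustration. The rated figure is the manufacturer's endurance rating, not a prediction of when the drive will actually fail.

```python
def years_until_rated_endurance(written_tb, rated_tb, age_years):
    """Linear extrapolation: years left before total writes reach the rated TBW."""
    if written_tb <= 0 or age_years <= 0:
        raise ValueError("need a positive write total and age to estimate a rate")
    rate_tb_per_year = written_tb / age_years
    remaining_tb = max(rated_tb - written_tb, 0.0)
    return remaining_tb / rate_tb_per_year


# Numbers taken from the post above.
written_tb = 42.0   # total written so far
rated_tb = 80.0     # rated endurance quoted from Crucial
age_years = 5.5     # powered on 24/7

rate = written_tb / age_years
years_left = years_until_rated_endurance(written_tb, rated_tb, age_years)
print(f"average write rate: {rate:.2f} TB/year")
print(f"years until rated TBW at this rate: {years_left:.1f}")
```

At the current pace the drive is a little over halfway through its rated writes, so write endurance alone would not be the reason to replace it.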
JorgeB Posted March 7, 2022
Errors are logged as a disk issue, but since the SMART test passed it's OK for now. Keep monitoring, especially this attribute:

ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    2

If it keeps climbing you'll likely get more read errors; in that case replace the disk.
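One way to keep an eye on that attribute is to save the row from each diagnostics report and compare raw values over time. The Python sketch below parses a row in the layout quoted above and checks whether the raw value has climbed. The class and function names are made up for illustration, the "later" reading is invented sample data, and the parser assumes the raw value's first token is a plain integer, which is not true for every attribute or drive.

```python
from typing import NamedTuple


class SmartAttr(NamedTuple):
    attr_id: int
    name: str
    flags: str
    value: int
    worst: int
    thresh: int
    fail: str
    raw: int


def parse_attr_line(line):
    """Parse one row: ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE."""
    parts = line.split()
    if len(parts) < 8:
        raise ValueError(f"not a SMART attribute row: {line!r}")
    return SmartAttr(
        attr_id=int(parts[0]),
        name=parts[1],
        flags=parts[2],
        value=int(parts[3]),
        worst=int(parts[4]),
        thresh=int(parts[5]),
        fail=parts[6],
        raw=int(parts[7]),  # first token only; some attributes append extra text
    )


def has_climbed(previous, current):
    """True if the raw value increased between two readings of the same attribute."""
    if previous.attr_id != current.attr_id:
        raise ValueError("readings are for different attributes")
    return current.raw > previous.raw


# Row quoted in the post above.
baseline = parse_attr_line("  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    2")
# Invented later reading, for illustration only.
later = parse_attr_line("  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    9")

print(baseline.name, "baseline raw =", baseline.raw)
print("climbed since baseline:", has_climbed(baseline, later))
```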
superderpbro Posted March 7, 2022 (edited) Thanks. Can unspecified read errors cause corruption? I've tested every file that MC says is on disc4 and that had a checksum or torrent to recheck. They are all OK. I checksum a lot of stuff, but not everything. I don't really trust using this server with a questionable disc anymore. How do I just remove it permanently? I don't need the space. Keeping the data of course, heh.
JorgeB Posted March 7, 2022 superderpbro said: "Can unspecified read errors cause corruption?" They shouldn't. superderpbro said: "How do I just remove it permanently? I don't need the space. Keeping the data of course, heh." You have to manually move the data then shrink the array.
superderpbro Posted March 7, 2022 Thanks again. Are there any good guides on safely moving the data "inside" the server? Or should I pull it off and put it back on?
JorgeB Posted March 7, 2022 You can use the Unbalance plugin to move the data to other disks.
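For anyone who would rather do the move by hand and check it, here is a minimal Python sketch of a copy-then-verify pass between two directory trees. This is not how the Unbalance plugin works; it is just an illustration of the idea. The function names are made up, the disk paths in the usage comment are assumptions about which disks are involved, and the sketch copies rather than moves, so the source files stay in place until you delete them yourself.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src_root, dst_root):
    """Copy every file under src_root to the same relative path under dst_root.

    Returns the list of source files whose copy did not match by checksum.
    """
    src_root = Path(src_root)
    dst_root = Path(dst_root)
    mismatches = []
    for src in sorted(src_root.rglob("*")):
        if not src.is_file():
            continue
        dst = dst_root / src.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copies data plus timestamps and permission bits
        if sha256_of(src) != sha256_of(dst):
            mismatches.append(src)
    return mismatches


# Example usage (paths are assumptions; adjust to the disks involved):
#   bad = copy_and_verify("/mnt/disk4", "/mnt/disk1")
#   print("files that failed verification:", bad)
```

Copying from one disk path to another disk path, rather than mixing disk paths with user share paths, is generally advised on Unraid when moving files between array disks.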
superderpbro Posted March 8, 2022 (edited) Is the "target is busy" part a problem? EDIT: Seems to be working?
JorgeB Posted March 8, 2022 The script is known to be slow and/or not work properly with current releases. I recommend doing it manually or using the remove drive and rebuild parity option.
superderpbro Posted March 8, 2022 *sigh* It's been running for 6 hours. Will see if it worked when I get up..
superderpbro Posted March 9, 2022 (edited) They weren't kidding when they said it takes a long time, lol.. Figured it would take about as long as a parity check... it's been almost 30 hours now heh. EDIT: Still going... lol
superderpbro Posted March 11, 2022 Absolutely floored at how long this is taking. Been going since Monday night. Is it safe to stop it?
JorgeB Posted March 11, 2022 Should be, but might require forcing a reboot.