SkippyAlpha Posted April 1, 2019

Hello guys! Looking for maybe a bit of clarity on why this may have happened.

Monthly parity check kicked off last night at 2:00, April 1st. When I checked this morning I noticed that 3 consecutive disks had equal numbers of read errors, as well as sync errors, all at 128. I've looked through the syslog a bit but I'm not 100% sure what I might be looking for.

One thing that stood out was that my weekly appdata backup kicked off at the same time and happened to finish at almost the exact moment the read errors stopped. I've adjusted the scheduling for that now in case it caused a problem. Other than that, nothing glaring stood out. The 3 disks in question also looked healthy in their SMART data, and they were not dropped from the array, so it seems Unraid was able to recover.

Any thoughts on this? From memory, I don't recall ever having any sync/read errors before, so I was a bit surprised. Might some random file be corrupted because of this?

I've attached a copy of the syslog, as well as the SMART data for the 3 disks and a screenshot of my dashboard. The machine has been up for 29 days so there should be lots of historical data in there.

Any help would be appreciated, thanks in advance! Parity check is still running currently btw, at 18% now.

elmserver-smart-20190401-0746.zip
elmserver-smart-20190401-0747 (1).zip
elmserver-smart-20190401-0747.zip
elmserver-syslog-20190401-0723.zip
JorgeB Posted April 1, 2019 (edited)

Next time please post the complete diagnostics: Tools -> Diagnostics

All 3 disks timed out at the same time:

Apr 1 02:03:19 elmserver kernel: sd 10:0:20:0: timing out command, waited 180s
...
Apr 1 02:03:19 elmserver kernel: sd 10:0:21:0: timing out command, waited 180s
...
Apr 1 02:03:19 elmserver kernel: sd 10:0:22:0: timing out command, waited 180s

SMART reports for all 3 show a very high number of CRC errors, but they could be old errors. Keep an eye on that and see if the 3 disks share anything in common: power or miniSAS cable, backplane row, etc.

You should also update the LSI firmware to the latest version; there are known issues with that release.

Edited April 1, 2019 by johnnie.black
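For anyone wanting to run the same check on their own syslog, here is a minimal Python sketch that groups "timing out command" lines by timestamp and reports the moments when more than one device timed out together. The line format is assumed from the three lines quoted above, and the names `group_timeouts` and `simultaneous` are made up for this example, not part of any Unraid tool.

```python
import re
from collections import defaultdict

# Assumed line format, taken from the syslog lines quoted in this thread:
#   Apr 1 02:03:19 elmserver kernel: sd 10:0:20:0: timing out command, waited 180s
# The sd address is the usual SCSI host:channel:target:lun tuple.
TIMEOUT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"sd\s+(?P<addr>\d+:\d+:\d+:\d+):\s+timing out command"
)


def group_timeouts(lines):
    """Map each timestamp to the sorted SCSI addresses that timed out then."""
    groups = defaultdict(set)
    for line in lines:
        match = TIMEOUT_RE.match(line.strip())
        if match:
            # Collapse repeated spaces so "Apr  1" and "Apr 1" compare equal.
            stamp = " ".join(match.group("stamp").split())
            groups[stamp].add(match.group("addr"))
    return {stamp: sorted(addrs) for stamp, addrs in groups.items()}


def simultaneous(groups, minimum=2):
    """Keep only timestamps where at least `minimum` devices timed out."""
    return {s: a for s, a in groups.items() if len(a) >= minimum}


if __name__ == "__main__":
    sample = [
        "Apr 1 02:03:19 elmserver kernel: sd 10:0:20:0: timing out command, waited 180s",
        "Apr 1 02:03:19 elmserver kernel: sd 10:0:21:0: timing out command, waited 180s",
        "Apr 1 02:03:19 elmserver kernel: sd 10:0:22:0: timing out command, waited 180s",
    ]
    for stamp, addrs in simultaneous(group_timeouts(sample)).items():
        print(stamp, "->", ", ".join(addrs))
```

Several devices on the same host number timing out in the same second points toward something they share (cable, backplane, controller) rather than three disks failing independently, but the script only surfaces the pattern; it does not identify the cause.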
SkippyAlpha (Author) Posted April 1, 2019

Whoops! Sorry about that, I've attached it now. Thank you for the tip on the LSI card's firmware, I'll definitely look at getting it updated asap.

Also good call, parity disk 2 and those three drives all share a backplane... I'll be making sure all cables are tight once everything completes. The UDMA CRC errors are all old I believe, or at least the majority are.

Thanks again

elmserver-diagnostics-20190401-1004.zip