Read Errors and Sync Errors During Parity Check



Hello guys! Looking for a bit of clarity on why this may have happened. The monthly parity check kicked off last night at 2:00 on April 1st. When I checked this morning I noticed that 3 consecutive disks had the same number of read errors, as well as sync errors, all at 128. I've looked through the syslog a bit but I'm not 100% sure what I should be looking for. One thing that stood out was that my weekly appdata backup kicked off at the same time and happened to finish at almost the exact moment the read errors stopped. I've adjusted the scheduling for that now in case it caused a problem. Other than that, nothing glaring stood out. The 3 disks in question also looked healthy in their SMART data, and they were not dropped from the array, so it seems Unraid was able to recover.

 

Any thoughts on this? From memory, I don't recall ever having any sync/read errors before, so I was a bit surprised. Might some random file be corrupted because of this? I've attached a copy of the syslog, as well as the smart data for the 3 disks and a screenshot of my dashboard. The machine has been up for 29 days so there should be lots of historical data in there. Any help would be appreciated, thanks in advance! Parity check is still running currently btw, at 18% now.

Capture.PNG

elmserver-smart-20190401-0746.zip

elmserver-smart-20190401-0747 (1).zip

elmserver-smart-20190401-0747.zip

elmserver-syslog-20190401-0723.zip


Next time please post the complete diagnostics: Tools -> Diagnostics

 

All 3 disks timed out at the same time:

 

Apr  1 02:03:19 elmserver kernel: sd 10:0:20:0: timing out command, waited 180s
...
Apr  1 02:03:19 elmserver kernel: sd 10:0:21:0: timing out command, waited 180s
...
Apr  1 02:03:19 elmserver kernel: sd 10:0:22:0: timing out command, waited 180s
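For anyone who wants to check their own syslog for the same pattern, here is a minimal sketch that groups `timing out command` lines by timestamp so that disks timing out in the same second stand out. The function name `group_timeouts` and the regular expression are my own assumptions based on the three lines quoted above, not an Unraid tool, and other kernel versions may word the message differently.

```python
import re
from collections import defaultdict

# Matches lines shaped like the ones quoted in this thread, e.g.
# "Apr  1 02:03:19 elmserver kernel: sd 10:0:20:0: timing out command, waited 180s"
# The pattern is an assumption derived from those three lines only.
TIMEOUT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"sd\s+(?P<addr>\d+:\d+:\d+:\d+):\s+timing out command"
)


def group_timeouts(lines):
    """Return {timestamp: [scsi_address, ...]} for every timeout line found."""
    groups = defaultdict(list)
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match:
            groups[match.group("stamp")].append(match.group("addr"))
    return dict(groups)


if __name__ == "__main__":
    # The three lines posted above, used as sample input.
    sample = [
        "Apr  1 02:03:19 elmserver kernel: sd 10:0:20:0: timing out command, waited 180s",
        "Apr  1 02:03:19 elmserver kernel: sd 10:0:21:0: timing out command, waited 180s",
        "Apr  1 02:03:19 elmserver kernel: sd 10:0:22:0: timing out command, waited 180s",
    ]
    for stamp, addrs in group_timeouts(sample).items():
        # The first field of a SCSI address (host:channel:target:lun) is the host adapter.
        hosts = {addr.split(":")[0] for addr in addrs}
        print(stamp, addrs, "hosts:", sorted(hosts))
```

Several devices on the same host adapter timing out in the same second points more toward something shared (controller, cable, backplane, power) than toward three disks failing independently.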

The SMART reports for all 3 show a very high number of CRC errors, but those could be old errors. Keep an eye on that count, and check whether the 3 disks share anything in common: power cable, miniSAS cable, backplane row, etc.
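To tell old CRC errors from new ones, the raw value of SMART attribute 199 (`UDMA_CRC_Error_Count`) can be recorded now and compared after the next parity check. Below is a rough sketch that pulls that value out of the text printed by `smartctl -A`. The helper name `crc_error_count` is hypothetical, and the sample row is made up for illustration; it is not taken from the reports attached to this thread.

```python
def crc_error_count(smartctl_output):
    """Return the raw value of attribute 199 from `smartctl -A` text, or None.

    Assumes the usual ATA attribute table layout:
    ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    """
    for line in smartctl_output.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0] == "199":
            return int(fields[9])
    return None


if __name__ == "__main__":
    # Made-up sample row, not real data from the disks discussed here.
    sample = (
        "ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE\n"
        "199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       57\n"
    )
    print("CRC errors:", crc_error_count(sample))
```

If the number does not grow between checks, the errors are historical. If it keeps climbing, the cable or backplane connection for that disk is the first thing to look at.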

 

You should also update the LSI firmware to the latest version; there are known issues with the release you're running.

 

Edited by johnnie.black

Whoops! Sorry about that, I've attached it now. Thank you for the tip on the LSI card's firmware, I'll definitely look at getting it updated ASAP. Also, good call: parity disk 2 and those three drives all share a backplane... I'll be making sure all cables are tight once everything completes. The UDMA CRC errors are all old, I believe, or at least the majority are. Thanks again!

elmserver-diagnostics-20190401-1004.zip
