
Read Errors and Sync Errors During Parity Check


SkippyAlpha


Hello guys! Looking for a bit of clarity on why this may have happened. The monthly parity check kicked off last night at 2:00 on April 1st. When I checked this morning I noticed that 3 consecutive disks each showed the same number of read errors, 128, and the parity check reported 128 sync errors as well. I've looked through the syslog a bit but I'm not 100% sure what I should be looking for. One thing that stood out was that my weekly appdata backup kicked off at the same time and happened to finish at almost the exact moment the read errors stopped. I've adjusted the scheduling for that now in case it caused a problem. Other than that, nothing glaring stood out. The 3 disks in question also look healthy in their SMART data, and none of them were dropped from the array, so it seems Unraid was able to recover.

 

Any thoughts on this? From memory, I don't recall ever having any sync/read errors before, so I was a bit surprised. Might some random file be corrupted because of this? I've attached a copy of the syslog, as well as the smart data for the 3 disks and a screenshot of my dashboard. The machine has been up for 29 days so there should be lots of historical data in there. Any help would be appreciated, thanks in advance! Parity check is still running currently btw, at 18% now.

Capture.PNG

elmserver-smart-20190401-0746.zip

elmserver-smart-20190401-0747 (1).zip

elmserver-smart-20190401-0747.zip

elmserver-syslog-20190401-0723.zip


Next time please post the complete diagnostics: Tools -> Diagnostics

 

All 3 disks timed out at the same time:

 

Apr  1 02:03:19 elmserver kernel: sd 10:0:20:0: timing out command, waited 180s
...
Apr  1 02:03:19 elmserver kernel: sd 10:0:21:0: timing out command, waited 180s
...
Apr  1 02:03:19 elmserver kernel: sd 10:0:22:0: timing out command, waited 180s
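If you want to check a long syslog for this pattern yourself, here is a rough Python sketch that groups "timing out command" lines by timestamp, so devices that timed out in the same second show up together. The regular expression is only fitted to the three lines quoted above, and the function name `group_timeouts` is made up for this example; other kernel versions may word the message differently.

```python
import re
from collections import defaultdict

# Fitted to kernel lines of the form quoted in this thread, e.g.
#   Apr  1 02:03:19 elmserver kernel: sd 10:0:20:0: timing out command, waited 180s
# Assumption: the message wording and layout match this sample.
TIMEOUT_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"sd\s+(?P<addr>\d+:\d+:\d+:\d+):\s+timing out command"
)


def group_timeouts(lines):
    """Map each timestamp to the sorted SCSI addresses that timed out then."""
    groups = defaultdict(set)
    for line in lines:
        match = TIMEOUT_RE.match(line.strip())
        if match:
            groups[match.group("ts")].add(match.group("addr"))
    return {ts: sorted(addrs) for ts, addrs in groups.items()}


# The three syslog lines quoted in this post, used as sample input.
sample = [
    "Apr  1 02:03:19 elmserver kernel: sd 10:0:20:0: timing out command, waited 180s",
    "...",
    "Apr  1 02:03:19 elmserver kernel: sd 10:0:21:0: timing out command, waited 180s",
    "...",
    "Apr  1 02:03:19 elmserver kernel: sd 10:0:22:0: timing out command, waited 180s",
]

for timestamp, addresses in group_timeouts(sample).items():
    print(timestamp, addresses)
```

Several addresses under a single timestamp, all starting with the same host number (10 here), point toward something the devices share rather than three independent disk faults.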

The SMART reports for all 3 disks show a very high number of CRC errors, but those could be old errors. Keep an eye on that counter, and check whether the 3 disks share anything in common: power or miniSAS cable, backplane row, etc.
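One way to tell old CRC errors from new ones is to compare the raw value of SMART attribute 199 between two reports taken some time apart. The sketch below assumes the usual `smartctl -A` attribute table layout, where the raw value is the tenth column. The helper names and the sample numbers are invented for illustration and are not taken from the attached reports.

```python
def crc_error_count(smart_text):
    """Return the raw value of SMART attribute 199, or None if it is absent.

    Assumes the standard smartctl attribute table columns:
    ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    """
    for line in smart_text.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0] == "199":
            return int(fields[9])
    return None


def crc_delta(before_text, after_text):
    """Return how much the CRC counter grew between two reports."""
    before = crc_error_count(before_text)
    after = crc_error_count(after_text)
    if before is None or after is None:
        return None
    return after - before


# Made-up sample rows, NOT values from the disks in this thread.
before = "199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       1500"
after = "199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       1500"

print(crc_delta(before, after))
```

A delta of zero across a full parity check suggests the errors are historical; a counter that keeps climbing usually points at cabling or the backplane rather than the disk itself.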

 

You should also update the LSI firmware to the latest version; there are known issues with the release you are currently running.

 


Whoops! Sorry about that, I've attached it now. Thank you for the tip on the LSI card's firmware, I'll definitely look at getting it updated ASAP. Also good call: parity disk 2 and those three drives all share a backplane... I'll make sure all the cables are tight once everything completes. The UDMA CRC errors are all old, I believe, or at least the majority of them. Thanks again!

elmserver-diagnostics-20190401-1004.zip


