20 parity errors during latest check

iarp · May 9, 2018

I was just wondering whether the log system offers any way to track down what is located at the sectors listed as being in error. Also, does "P corrected" mean it auto-fixed parity?

I'm not sure where these errors came from. We've had 3 power outages in the last 2 weeks, but the UPS took over and I manually stopped the array and shut down properly before it ran out every time.

May  5 04:09:56 storage kernel: md: recovery thread: P corrected, sector=2666581768
May  5 04:09:56 storage kernel: md: recovery thread: P corrected, sector=2666581792
May  5 04:09:56 storage kernel: md: recovery thread: P corrected, sector=2666581800
May  5 04:09:56 storage kernel: md: recovery thread: P corrected, sector=2666581816
May  5 04:09:56 storage kernel: md: recovery thread: P corrected, sector=2666581832
May  5 04:09:56 storage kernel: md: recovery thread: P corrected, sector=2666581848
May  5 04:09:56 storage kernel: md: recovery thread: P corrected, sector=2666581872
May  5 04:09:56 storage kernel: md: recovery thread: P corrected, sector=2666581904
May  5 04:09:56 storage kernel: md: recovery thread: P corrected, sector=2666581944
May  5 04:09:56 storage kernel: md: recovery thread: P corrected, sector=2666581984
May  5 04:09:56 storage kernel: md: recovery thread: P corrected, sector=2666582048
May  5 04:09:56 storage kernel: md: recovery thread: P corrected, sector=2666582128
May  5 04:09:56 storage kernel: md: recovery thread: P corrected, sector=2666582240
May  5 04:09:56 storage kernel: md: recovery thread: P corrected, sector=2666582280
May  5 04:09:56 storage kernel: md: recovery thread: P corrected, sector=2666582408
May  5 04:09:56 storage kernel: md: recovery thread: P corrected, sector=2666582864
May  5 04:09:56 storage kernel: md: recovery thread: P corrected, sector=2666582872
May  5 04:09:56 storage kernel: md: recovery thread: P corrected, sector=2666582880
May  5 04:09:56 storage kernel: md: recovery thread: P corrected, sector=2666582888
May  5 04:09:56 storage kernel: md: recovery thread: P corrected, sector=2666583008

I'm just curious whether it's possible to track what data is at that sector/location to see if the file(s) are still OK or not.
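
As a starting point, the sector numbers can be pulled out of the syslog lines above and turned into byte offsets. This is only a minimal sketch: the function name `corrected_sectors` is hypothetical, and the 512-byte sector size is an assumption that nothing in this thread confirms.

```python
import re

# Matches lines such as:
# "May  5 04:09:56 storage kernel: md: recovery thread: P corrected, sector=2666581768"
SECTOR_RE = re.compile(r"md: recovery thread: P corrected, sector=(\d+)")

SECTOR_SIZE = 512  # assumption: sectors are counted in 512-byte units


def corrected_sectors(log_text):
    """Return the sector numbers of all 'P corrected' lines, in log order."""
    return [int(m.group(1)) for m in SECTOR_RE.finditer(log_text)]


if __name__ == "__main__":
    sample = (
        "May  5 04:09:56 storage kernel: md: recovery thread: P corrected, sector=2666581768\n"
        "May  5 04:09:56 storage kernel: md: recovery thread: P corrected, sector=2666581792\n"
    )
    for sector in corrected_sectors(sample):
        # Byte offset from the start of the device, under the sector-size assumption.
        print(f"sector {sector} -> byte offset {sector * SECTOR_SIZE}")
```

This only locates the affected region on the device. It does not say which file, if any, lives there.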

pwm · May 9, 2018

No file data is stored on these addresses on the parity drive - just raw parity.

It would depend on the file system if/how you can translate sector addresses into file data on the individual data disks.
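
The file-system-independent part of that translation is plain arithmetic. A rough sketch, assuming 512-byte sectors and 4 KiB file system blocks (both assumptions); `sector_to_fs_block` and `partition_start_sector` are hypothetical names used only for illustration:

```python
SECTOR_SIZE = 512      # assumption: 512-byte sectors
FS_BLOCK_SIZE = 4096   # assumption: 4 KiB file system blocks


def sector_to_fs_block(sector, partition_start_sector=0):
    """Map a sector number to (file system block index, byte offset inside that block).

    partition_start_sector is hypothetical: leave it at 0 if the reported
    sector is already relative to the start of the partition.
    """
    byte_offset = (sector - partition_start_sector) * SECTOR_SIZE
    return divmod(byte_offset, FS_BLOCK_SIZE)


if __name__ == "__main__":
    # First sector from the log in this thread.
    print(sector_to_fs_block(2666581768))  # -> (333322721, 0)
```

Mapping the resulting block index to a file name is the file-system-specific step, and it needs that file system's own tools.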

trurl · May 9, 2018

You should run another parity check to make sure it's good now. You always want your last parity check to have exactly zero errors.

iarp · May 9, 2018

15 hours ago, pwm said:

No file data is stored on these addresses on the parity drive - just raw parity.

It would depend on the file system if/how you can translate sector addresses into file data on the individual data disks.

How do we know that the sectors were on the parity side?

I'll start another check now just to be sure. It'll take another day to run to find out.

JorgeB · May 9, 2018

15 minutes ago, iarp said:

How do we know that the sectors were on the parity side?

We don't. We only know they were corrected on the parity side so that it agrees with the data on the disks. You'd need to have checksums (or be using btrfs) to check your data and make sure it's all unchanged, though it's more likely the errors really are on the parity disk.
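
For the checksum approach, something along these lines could work. It is a sketch only and not a specific unRAID feature or plugin: the helper names (`sha256_of`, `build_manifest`, `changed_files`) are made up for illustration, and it only helps if a manifest was recorded before the errors appeared.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Return {relative path: sha256 hex digest} for every file under root."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(old_manifest, new_manifest):
    """List files present in both manifests whose digest differs."""
    return sorted(
        name
        for name, digest in old_manifest.items()
        if name in new_manifest and new_manifest[name] != digest
    )
```

Build a manifest while the data is known to be good, store it somewhere safe, then rebuild and compare after a parity check that reports corrections. Files that were edited on purpose in between will also show up as changed.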

pwm · May 9, 2018

1 hour ago, johnnie.black said:

We don't. We only know they were corrected on the parity side so that it agrees with the data on the disks. You'd need to have checksums (or be using btrfs) to check your data and make sure it's all unchanged, though it's more likely the errors really are on the parity disk.

I assume unRAID gives higher priority to pushing out writes to the individual data disks - that's the traditional way for a system that doesn't have a battery-backed RAID controller.

But it would actually be interesting if unRAID noted the number of bit errors when it finds a block that needs correction.

Memory errors etc. normally result in one or a few bit errors in a 4kB disk block - and in that case it's impossible to guess which disks are affected.

Synchronization errors between disks relating to problematic shutdowns, file system bugs or similar would normally fail a large percentage of the bytes in each 4kB block, since the parity sector is computed for completely different data. In that case, it's normally the data disks that are "master". But that might still need file system repair, depending on what journaling/fallback logic the specific file system type has.
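
To illustrate the bit-error idea, here is a small sketch that recomputes single parity as the XOR of the data blocks and counts how many bits differ from the stored parity block. It is not how unRAID reports anything today. The function names and the 5% threshold in `classify` are arbitrary assumptions chosen for illustration.

```python
from functools import reduce


def xor_parity(blocks):
    """XOR equal-length data blocks together byte by byte (single parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def bit_errors(stored_parity, data_blocks):
    """Count the bits where the stored parity differs from the recomputed parity."""
    expected = xor_parity(data_blocks)
    return sum(bin(s ^ e).count("1") for s, e in zip(stored_parity, expected))


def classify(n_bits, block_size=4096, threshold=0.05):
    """Rough guess at the kind of mismatch; the threshold is a made-up cut-off."""
    fraction = n_bits / (block_size * 8)
    return "few bit flips" if fraction < threshold else "large mismatch"
```

A handful of differing bits would point towards something like a memory error, while a mismatch covering a large share of the block would point towards parity computed from different data, as described above.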
