
Have 783 errors on one tower


Blade


My parity check ran on Dec 1 and I have 783 errors on my 2nd tower.

What should I do to correct this?

I want to make sure I do this correctly.

Thx

It depends on where you see the errors.

 

Were they parity errors? Or read errors associated with a specific disk?

 

Joe L.


Did you have a power outage, or otherwise have to hard-boot your server?

 

Although it is true that unRAID will adjust parity to bring it into alignment with the data disks, that doesn't mean it truly corrected the cause of the problem. I suggest you post a syslog.


How can one determine if the errors are in data or parity? What if parity is "corrected" to reflect the data but the errors lie in the data? How do we decide whether to rebuild a data disk from parity or rebuild the parity disk from data?

If you press the "check" button on the supplied unMENU management interface, parity is ALWAYS corrected to match what was read from the data disks. The button is not a read-only check; it checks AND corrects parity.

 

It is possible, from the command line, to issue a read-only parity check.  It will report on the errors, but not correct them.  It can be run multiple times and will report the same errors, again and again.

 

Determining which disk is at fault in an error is a huge problem. You basically can only verify checksums of files, if you know what they should be, and the parity errors might be on a part of the disk not even in use. A parity error is where the number of 1 bits at a given bit position across all the disks is not "even". Knowing which bit needs to be flipped gets harder the more disks you have in your array.
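
To illustrate why the faulty disk can't be picked out from parity alone, here is a rough Python sketch. It is only a toy model of even (XOR) parity over one byte per disk, not how unRAID is actually implemented, and the names `parity_of` and `candidate_fixes` are made up for this example.

```python
from functools import reduce
from operator import xor


def parity_of(blocks):
    """XOR the byte at the same position on every disk (even parity)."""
    return reduce(xor, blocks, 0)


def candidate_fixes(data, parity):
    """Return every single-disk change that would make parity consistent.

    Each entry is (disk_index, repaired_value); the last index stands for
    the parity disk. This is a toy model, not unRAID's real code.
    """
    mismatch = parity_of(data) ^ parity  # non-zero bits mark parity errors
    if mismatch == 0:
        return []
    disks = list(data) + [parity]
    # Applying the mismatch to ANY one disk restores even parity, so every
    # disk is an equally plausible culprit.
    return [(i, value ^ mismatch) for i, value in enumerate(disks)]


# Three data disks, parity computed while everything was healthy.
data = [0b10110010, 0b01101100, 0b11100001]
parity = parity_of(data)

# One bit silently flips on data disk 1.
damaged = list(data)
damaged[1] ^= 0b00000100

for disk, value in candidate_fixes(damaged, parity):
    print(disk, format(value, "08b"))
```

Every disk, including parity, shows up as a candidate, and the list grows with each disk you add. Pressing "Check" simply picks the parity disk as the one to change.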

 

If you ran a read-only parity check (the automatic one on the first of the month that many of us have put into place is a read-only check), you can run it again and again in read-only mode. The first 10 or so parity errors will be printed in the syslog. If they are consistently in the same locations, the errors are probably on a disk. If they are in different locations, it is probably hardware: possibly memory, possibly a disk, possibly the power supply. This is the advantage of the read-only check.
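
The comparison between runs can be sketched in Python as below. This assumes you have already copied the error locations out of the syslog by hand into plain lists of numbers; it does not parse the syslog, and `classify_runs` is a hypothetical helper, not part of unRAID. Keep in mind that only the first 10 or so errors per run are logged, so the comparison only covers those.

```python
def classify_runs(runs):
    """Compare parity-error locations from two or more read-only checks.

    `runs` is a list of lists, one list of error locations per check
    (locations are assumed to have been copied from the syslog by hand).
    """
    if len(runs) < 2:
        raise ValueError("need at least two read-only checks to compare")

    sets = [set(run) for run in runs]
    seen_anywhere = set.union(*sets)
    seen_every_time = set.intersection(*sets)

    if not seen_anywhere:
        return "no errors"
    if seen_every_time == seen_anywhere:
        # Same locations on every pass: the bad data is probably on a disk.
        return "consistent"
    # Locations move between passes: suspect memory, a disk, the power
    # supply, or other hardware.
    return "varying"


# Made-up location numbers, purely for illustration.
print(classify_runs([[1024, 2048], [2048, 1024]]))
print(classify_runs([[1024, 2048], [1024, 4096]]))
```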

 

If the errors were detected in a read-only check, you could check the syslog for disk errors. You might be able to track the errors to a specific disk (and you might not).

 

If you press the "Check" button on the unRAID management interface, it will correct the parity disk to reflect what is being read from the data disks. If, after correcting parity, you press it again and the second parity check shows more errors, then it is probably a hardware issue or a failing disk.

