
Parity Valid but 1 Error After Check w/o Corrections. What next?!


Mat1926


5 hours ago, Mat1926 said:

 

Since Parity is okay now, doesn't this confirm that the data is okay?

 

When you get a difference after a failed shutdown, it's natural to assume that the difference is caused by some data not reaching the disk.

 

But when you get a difference without a hard shutdown, you can't know exactly what caused it. It might be an issue with the processor, the RAM, the PSU, transfer errors or some issue with one or more disks. The correcting parity run will clear any difference but that doesn't guarantee that all disks will contain correct data.

 

That's one reason why file servers should use ECC memory and should not be overclocked, and why it's good to use processors that have internal ECC for cache and memory buses. Your data depends on your hardware being 100% trustworthy.
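
To illustrate why a clean result after a correcting run says little about the data itself, here is a minimal sketch assuming plain single XOR parity. The three "disks" and their byte values are made up for the example, and this is not Unraid's actual code.

```python
from functools import reduce

def xor_parity(blocks):
    """Single (XOR) parity over equal-length byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data disks, one 4-byte block each (made-up values).
disks = [b"\x10\x20\x30\x40", b"\x01\x02\x03\x04", b"\xaa\xbb\xcc\xdd"]
original = list(disks)
parity = xor_parity(disks)

# Silent corruption: one bit flips on data disk 1.
disks[1] = b"\x01\x02\x0b\x04"

# A non-correcting check only reports that data and parity disagree.
mismatch_before = xor_parity(disks) != parity

# A correcting check rewrites parity from whatever the data disks return.
parity = xor_parity(disks)
mismatch_after = xor_parity(disks) != parity

print(mismatch_before)    # True
print(mismatch_after)     # False
print(disks == original)  # False: parity now agrees with the corrupted data
```

After the correcting step the check is clean, yet disk 1 still holds the flipped bit. Parity only says the disks agree with each other, not that any of them is right.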

Link to comment
2 hours ago, pwm said:

The correcting parity run will clear any difference but that doesn't guarantee that all disks will contain correct data.

 

@johnnie.black @pwm

I use ECC ram, and everything is @ stock....

 

In my case the correcting run did not find any errors at all, contradicting the previous non-correcting run! So, since all the disks are reporting the same numbers/hashes -not sure what you guys call them- then does this suggest that things look good for the existing data?

 

Thnx

Link to comment

If you get different results from two runs without having done any correction in-between, that means that

- one disk returned invalid data one time and correct data the next time (disk error, cable error, PSU error, controller card error, ...)

- the processor or RAM goofed in some way that resulted in data corruption of the just read data

- some corruption resulted in the wrong disk blocks being retrieved

- ...

 

In the end, something in the machine did goof during one of the non-correcting runs.

It doesn't help if the data is correctly stored on the disks, if the machine sometimes may hand over invalid data.
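
The first case in the list above can be sketched like this. It assumes single XOR parity and a hypothetical `FlakyPath` class standing in for a disk, cable or controller that corrupts exactly one read. None of these names come from Unraid.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

class FlakyPath:
    """Hypothetical read path that corrupts one specific read, then behaves."""
    def __init__(self, stored, bad_read_number=None):
        self.stored = stored                  # what is really on the platters
        self.bad_read_number = bad_read_number
        self.reads = 0

    def read(self):
        self.reads += 1
        if self.reads == self.bad_read_number:
            corrupted = bytearray(self.stored)
            corrupted[0] ^= 0x01              # one bit flipped in transit
            return bytes(corrupted)
        return self.stored

def check(data_paths, parity_path):
    """Non-correcting check over one block: returns 1 on mismatch, else 0."""
    data = [p.read() for p in data_paths]
    return int(xor_blocks(data) != parity_path.read())

stored = [b"\x10\x20", b"\x01\x02", b"\xaa\xbb"]   # made-up disk contents
parity_path = FlakyPath(xor_blocks(stored))
data_paths = [
    FlakyPath(stored[0]),
    FlakyPath(stored[1], bad_read_number=1),       # goofs on its first read only
    FlakyPath(stored[2]),
]

first = check(data_paths, parity_path)
second = check(data_paths, parity_path)
print(first, second)  # 1 0
```

Nothing stored on any disk changed between the two runs. Only what the machine handed back differed, which is the worrying part.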

Link to comment
1 minute ago, pwm said:

It doesn't help if the data is correctly stored on the disks, if the machine sometimes may hand over invalid data.

 

Isn't the data safely stored on the data disks and independent of any parity data? We need the parity to rebuild any missing data, but in my case I did not lose any data...

Link to comment
Just now, johnnie.black said:

Most likely, but it would be good to have checksums for the future in case it happens again.

 

I am in the process of transferring a lot of data to the array, then a non-correcting run will be executed, and if okay, then I will install the data integrity plugin...
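
For reference, the general idea behind per-file checksums can be sketched as follows. This is only an illustration using SHA-256 from Python's standard library, not how the plugin itself works; `file_sha256`, `snapshot`, `changed_files` and the sample file name are all hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path

def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files never need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def snapshot(root):
    """Map each file's relative path under root to its checksum."""
    root = Path(root)
    return {
        str(p.relative_to(root)): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

def changed_files(old, new):
    """Files present in both snapshots whose checksum differs."""
    return sorted(name for name in old if name in new and old[name] != new[name])

with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "movie.bin"
    sample.write_bytes(b"original contents")
    before = snapshot(tmp)                       # taken while the data is trusted
    sample.write_bytes(b"original c0ntents")     # simulate silent corruption
    after = snapshot(tmp)
    result = changed_files(before, after)

print(result)  # ['movie.bin']
```

Unlike a parity mismatch, a checksum mismatch points at a specific file, so you know what to restore from backup.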

Link to comment
36 minutes ago, Mat1926 said:

 

Isn't the data safely stored on the data disks and independent of any parity data? We need the parity to rebuild any missing data, but in my case I did not lose any data...

 

But if the machine returns the wrong data at one time, it can return the wrong data at a later time too. And without knowing which part of the system went wrong, we can't know whether it will be a data disk that returns the wrong data the next time, or whether the machine will write incorrect data instead of reading incorrect data.

Link to comment
6 minutes ago, pwm said:

 

But if the machine returns the wrong data at one time, it can return the wrong data at a later time too. And without knowing which part of the system went wrong, we can't know whether it will be a data disk that returns the wrong data the next time, or whether the machine will write incorrect data instead of reading incorrect data.

 

Parity is stored in the array and helps in rebuilding any missing data. One of the 2 parity disks returned a different value once, but we still have the stored data and the other parity, and they are in agreement... So doesn't this eliminate any doubts? Or is what I just said wrong due to my lack of knowledge on the subject?

Link to comment

@johnnie.black

After transferring over 1 TB of data to the array, I started a non-correcting parity check, which should finish within the next few hours. My array consists of 8 TB and 10 TB disks.

 

All the 8 TB disks' LEDs are now inactive, and only the 10 TB disks' LEDs (2 x parity and 2 x data) are active. So far, no errors... Does this suggest that all the data on the 8 TB disks is w/o errors and the system is now checking the remaining data on the 10 TB disks?
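
The LED behaviour would fit a simple model, sketched below. It assumes the check walks offsets from 0 up to the parity size and only reads disks that extend past the current offset, treating shorter disks as zeros beyond their end. The disk names and the number of 8 TB disks are placeholders, not my actual layout.

```python
TB = 10**12

def disks_read_at(offset, disk_sizes):
    """Disks that still hold a block at this byte offset (assumed model)."""
    return sorted(name for name, size in disk_sizes.items() if offset < size)

# Hypothetical array: two parity and two data disks of 10 TB, plus 8 TB data disks.
array = {
    "parity1": 10 * TB,
    "parity2": 10 * TB,
    "data1": 10 * TB,
    "data2": 10 * TB,
    "data3": 8 * TB,
    "data4": 8 * TB,
}

early = disks_read_at(1 * TB, array)
late = disks_read_at(9 * TB, array)

print(len(early))  # 6
print(late)        # ['data1', 'data2', 'parity1', 'parity2']
```

Under that assumption, once the check passes the 8 TB mark every block on the 8 TB disks has already been compared against parity, and only the 10 TB disks remain busy.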

Link to comment

Archived

This topic is now archived and is closed to further replies.
