
Are parity checks really needed?


pyrater


So I have been an unRAID user since Dec 2012 and have done a monthly parity check for the last 5 years. I have never once had a parity check find a single error, and once a month I have to deal with my server running super slow for 18 hours while the check takes place.

 

Is the parity check really needed, especially now with dual parity drives?

 

Thank you

 

-Py

Link to comment
1 minute ago, pyrater said:

I have never once had a parity check find a single error

That's how it should be, unless there was an unclean shutdown or a more serious hardware issue.

 

2 minutes ago, pyrater said:

Is the parity check really needed, especially now with dual parity drives?

Yes, it's the only way to regularly test all disks so you know they are OK if you need to rebuild one. You could alternatively run long SMART tests, but that would take the same time* and would not check parity.

 

*18 hours is a lot for 3TB parity, if your sig is correct. In your case the SMART tests would be much faster, but you might want to look at upgrading your hardware; my 8TB server takes about 15 hours.

Link to comment

Thanks johnnie, yeah, it normally takes 18 - 24 hours for my 3 TB system. I wasn't sure why, or that it was slower than others. I wonder what the issue is with the time it takes to do parity.

 

Total size: 3 TB  
Elapsed time: 13 hours, 48 minutes  
Current position: 1.68 TB (56.1 %)  
Estimated speed: 35.9 MB/sec  
Estimated finish: 10 hours, 12 minutes  
Sync errors corrected:
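
As a rough sanity check on those numbers, here is a minimal sketch of the arithmetic. It assumes decimal units (1 TB = 1,000,000 MB), and the function names are made up for illustration; nothing here comes from unRAID itself.

```python
# Figures copied from the status block above.
# Assumption: decimal units, so 1 TB = 1_000_000 MB.
TOTAL_TB = 3.0
POSITION_TB = 1.68
ELAPSED_S = 13 * 3600 + 48 * 60   # 13 hours, 48 minutes
CURRENT_MB_S = 35.9               # the "Estimated speed" line


def average_speed_mb_s(position_tb, elapsed_s):
    """Average throughput since the check started."""
    return position_tb * 1_000_000 / elapsed_s


def remaining_hours(total_tb, position_tb, speed_mb_s):
    """Time left if the check keeps running at speed_mb_s."""
    remaining_mb = (total_tb - position_tb) * 1_000_000
    return remaining_mb / speed_mb_s / 3600


avg = average_speed_mb_s(POSITION_TB, ELAPSED_S)
left = remaining_hours(TOTAL_TB, POSITION_TB, CURRENT_MB_S)
print(f"average so far: {avg:.1f} MB/s")
print(f"remaining at current speed: {left:.1f} h")
```

The remaining time works out to a little over 10 hours, which lines up with the "Estimated finish" line, so the estimate is consistent with the reported speed.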

 

 

Model: N/A
M/B: Supermicro - H8DM8-2
CPU: Six-Core AMD Opteron™ 2419 EE @ 1800
HVM: Enabled
IOMMU: Disabled
Cache: 768 kB, 3072 kB, 6144 kB
Memory: 8 GB Single-bit ECC (max. installable capacity 8 GB)
Network: eth0: 1000 Mb/s, full duplex, mtu 1500 
 eth1: not connected
Kernel: Linux 4.14.13-unRAID x86_64
OpenSSL: 1.0.2n


Link to comment

I was the one that initially suggested the monthly parity check. It was not even because parity needed frequent checking; it was because the parity check was easy to do, and as a byproduct all array disks are read completely. This gives the SMART system the opportunity to "notice" warning signs long before failure. Otherwise, you could have a seldom used disk that had developed bad sectors, and you'd only know when you need to do a rebuild and you get a failure. That's too late. The fact that parity is verified is just icing on the cake.

 

In those days we sometimes had users where parity was never built correctly. Unlike RAID, which would detect such situations (or never let them happen in the first place), unRAID can operate flawlessly for a very long time with parity completely wrong. It is only when a drive fails that parity is used, and this would get noticed when a rebuild produced garbage. So the monthly check also caught some of those situations and enabled users to get their parity straightened out before it was needed.
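
To make the "rebuild produced garbage" point concrete, here is a minimal sketch assuming plain XOR single parity. The disk contents and the helper names (`xor_blocks`, `rebuild`) are invented for illustration and are not unRAID's actual code.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_blocks, parity_block):
    """Reconstruct the one missing disk from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity_block])


# Three tiny pretend data disks (hypothetical contents).
disk1, disk2, disk3 = b"movies__", b"music___", b"photos__"

good_parity = xor_blocks([disk1, disk2, disk3])
# Parity that was never built correctly: it never saw disk3's data.
stale_parity = xor_blocks([disk1, disk2, bytes(8)])

# disk3 "fails"; rebuild it from the survivors.
from_good = rebuild([disk1, disk2], good_parity)
from_stale = rebuild([disk1, disk2], stale_parity)
print(from_good)   # the original disk3 contents
print(from_stale)  # not the original data
```

Nothing complains while all disks are healthy, because reads never touch parity. The bad parity only shows up at rebuild time, or earlier if a check is run.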

 

There is nothing magical about monthly. I actually don't schedule mine, and run them somewhere in the 30-60 day intervals. If 30 days is a little too frequent, changing your frequency is fine IMO. Just don't go too long.

Link to comment

The parity check, combined with a proper setup of the Notification system, can alert the user to a problem in his array while it is still confined to one disk and is easily correctable. Without these two simple precautions, the user will usually not even realize there is a problem until the second disk fails (or the third, in the case of dual parity). Then fixing the problem without losing data is often impossible!

Link to comment
32 minutes ago, pyrater said:

I wonder what the issue is on the time to do parity.

It's usually a combination of things. In your case I'd say it's mostly CPU related; having some older disks and different sizes doesn't help either. Those same disks in a more recent motherboard/CPU with LSI controllers should not take more than 10 hours, and you don't need a last gen CPU, e.g., a Supermicro X9 with a dual core Pentium will be more than enough, Xeon better if you also run VMs.

Link to comment

I have always been confused about the parity check and "Write corrections to disk". During a parity check, are you checking the data to make sure the parity is correct, or vice versa? How would you know which is correct, the data or the parity? After reading this thread it seems it's more of a check of the hard drives to detect an early failure (or a drive that has already failed) than anything to do with the data; it just does this check as a method to detect failures, kind of a 2 for 1 deal.

 

Passmark score of 5654 for my 2 cpu's.  It is an old old old system it may just be time to upgrade the internals...

Link to comment
4 minutes ago, pyrater said:

I have always been confused about the parity check and "Write corrections to disk". During a parity check, are you checking the data to make sure the parity is correct, or vice versa? How would you know which is correct, the data or the parity? After reading this thread it seems it's more of a check of the hard drives to detect an early failure (or a drive that has already failed) than anything to do with the data; it just does this check as a method to detect failures, kind of a 2 for 1 deal.

 

Passmark score of 5654 for my 2 cpu's.

The check does lots of things - but the regular scan should be run without writing corrections.

 

The two main things are:

1) It forces a full read of all disks, giving SMART a chance to detect and report weak sectors.

2) It verifies the integrity of the data - that 1+1+1=3 and not 2.9 or 3.17. So it tells you whether the machine has some unknown problem with memory, disk controller, etc. that may introduce bit errors.
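
For the "which is correct, the data or the parity?" question, a minimal sketch of a read-only check may help. It assumes plain XOR single parity and a made-up 4-byte stripe size; `xor_parity`, `check_parity` and `flip_bit` are illustrative names, not unRAID functions.

```python
def xor_parity(data_disks, length):
    """XOR the first `length` bytes of every data disk together."""
    parity = bytearray(length)
    for disk in data_disks:
        for i, byte in enumerate(disk[:length]):
            parity[i] ^= byte
    return bytes(parity)


def check_parity(data_disks, parity_disk, stripe_size=4):
    """Return the start offsets of stripes where XOR(data) != stored parity."""
    sync_errors = []
    for start in range(0, len(parity_disk), stripe_size):
        end = start + stripe_size
        stored = parity_disk[start:end]
        expected = xor_parity([d[start:end] for d in data_disks], len(stored))
        if expected != stored:
            sync_errors.append(start)
    return sync_errors


def flip_bit(block, index):
    """Return a copy of block with the lowest bit of one byte flipped."""
    damaged = bytearray(block)
    damaged[index] ^= 0x01
    return bytes(damaged)


data = [b"AAAAAAAA", b"BBBBBBBB", b"CCCCCCCC"]
parity = xor_parity(data, 8)

clean = check_parity(data, parity)
bad_data = check_parity([data[0], flip_bit(data[1], 5), data[2]], parity)
bad_parity = check_parity(data, flip_bit(parity, 5))
print(clean, bad_data, bad_parity)  # [] [4] [4]
```

A flipped bit on a data disk and a flipped bit on the parity disk produce exactly the same report. With single XOR parity the check can only say that a stripe does not add up, not which disk is wrong, which is one reason to run the regular scan without writing corrections.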

Link to comment
2 minutes ago, pyrater said:

If the parity check is CPU related, it must be single core only, as my system is only using 1 core at 100% and the rest are sub 20%... If that is true, is there any way to make it multi-threaded?

 

Yes, it's technically possible to make it multi-threaded, since each stripe is independent of other stripes. So the CPU could compute data for two stripes concurrently.

 

The main reason for not implementing a multi-threaded scan is the percentage of systems that are CPU-limited compared to the percentage of systems that are I/O-limited. Only systems that are CPU-limited would get any advantage from using multiple threads. My guess is that most installations are I/O-limited, either by the total bandwidth supported by the disk controller or by the maximum bandwidth of the slowest disk.
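
The stripe independence can be shown with a small sketch. It assumes plain XOR parity, a made-up 16-byte stripe size and made-up disk contents, and it only demonstrates that stripes can be handed to separate workers and give the same answer. It says nothing about real speed: in CPython, threads won't speed up pure Python math like this, and a real implementation would live in the kernel driver.

```python
from concurrent.futures import ThreadPoolExecutor


def stripe_parity(stripe_blocks):
    """XOR the blocks of one stripe (one block per data disk)."""
    out = bytearray(len(stripe_blocks[0]))
    for block in stripe_blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def split_into_stripes(disks, stripe_size):
    """Cut equal-sized disks into per-stripe lists of blocks."""
    length = len(disks[0])
    return [[d[s:s + stripe_size] for d in disks]
            for s in range(0, length, stripe_size)]


# Three pretend 64-byte data disks (hypothetical contents).
disks = [bytes((i * 7 + n) % 256 for i in range(64)) for n in range(3)]
stripes = split_into_stripes(disks, 16)

# One stripe after another, as a single-threaded check would do it.
sequential = b"".join(stripe_parity(s) for s in stripes)

# The same stripes handed to a pool of workers; map() keeps the order.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = b"".join(pool.map(stripe_parity, stripes))

print(sequential == parallel)
```

No stripe needs a result from any other stripe, so the work splits cleanly. Whether that helps depends on the system being CPU-limited in the first place.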

Link to comment
1 minute ago, pyrater said:

I did run a disk speed test a while back, and it seemed like the slowest drive was 88 MB/s, not the 30 that the parity check is doing. So I assume I'm CPU limited?

Most likely, assuming you're using SAT2-MV8s and not some slower PCI controllers. Those CPUs have very low single thread performance.

Link to comment
Just now, pyrater said:

I did run a disk speed test a while back, and it seemed like the slowest drive was 88 MB/s, not the 30 that the parity check is doing. So I assume I'm CPU limited?

Yes. No disk you would ever be interested in is as slow as 30 MB/s for sequential read.

And 100% really is the hard limit for the CPU.

 

Given how long unRAID has been in existence, I have to assume that the code already contains the "standard" code for auto-detecting the optimum CPU instructions to compute parity for your specific processor.

Link to comment
5 minutes ago, pyrater said:

How hard would it be for me to enable multi-core parity checks to see if there's a significant boost in speed?

You can't, unless you write your own program that computes parity.

 

And if you write your own parity computation program, that program would only work correctly if there are zero writes to any of the disks during the parity scan.

 

It's only LT that can introduce a multi-core parity scan that can compute in a live system, since they own the module responsible for writing parity and can catch all writes and synchronize reads and writes.

Link to comment

Archived

This topic is now archived and is closed to further replies.
