parity errors, possibly the same ones recurring from test to test

February 23, 2015

So, the last few times I've done a parity check, it's reported 5 errors. 3 times in a row, 5 errors. Obviously not a good thing, but too consistent to be a coincidence, no?

How do I determine what the errors are, and get them corrected?

February 24, 2015

Community Expert

Were you doing a non-correcting parity check? Or was it a correcting parity check that failed to correct the errors?

February 24, 2015

Author

Default parity check; "Write corrections to parity disk" is checked.

February 24, 2015

Community Expert

What about smart reports on all drives?

February 24, 2015

Author

all good, except this one

February 25, 2015

Author

I ran another parity check last night, and it just finished a bit ago. It showed 5 errors again.

How do I find these recurring errors, and 'fix' them for good?

March 3, 2015

Author

I ran another parity check yesterday, 5 errors again/still.

I really would like some feedback on how to 'fix' this issue once and for all.

March 3, 2015

Community Expert

The best thing to do is what was suggested earlier: provide SMART reports on all the drives. These may well show which drive is misbehaving. If they do not, then I do not think there is an easy way to identify which drive is causing the errors.

March 3, 2015

Author

all drives show good smart reports, but even if they didn't, from a usability standpoint, if unRAID feels there are errors, it should make it easy to see exactly what those errors are, and perhaps offer to help/fix them.

Asking a "normal" user to run smart reports on all their drives, one at a time, then post those results to the forum and hope for someone to find the errors is NOT a good way to handle this situation.

unRAID "knows" what these 5 errors are, why should I have to go individually scanning a dozen different drives to try to figure it out?

Attached is a pic showing the SMART reporting for all my drives. About 1/2 don't yet have extended tests, but I started an extended test for every one that still needs it.

March 3, 2015

Community Expert

The problem is that unRAID does NOT know which drive caused the errors - just that the data drives do not correspond to the parity drive. That is a limitation of the simple XOR parity scheme currently being used.

When (if) unRAID moves to supporting dual parity drives, then I expect that the scheme chosen WILL allow identification of which drive caused an error. However I would not hold your breath for that becoming available.
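To illustrate the single-parity limitation described above, here is a minimal Python sketch. It is not unRAID's driver code; the three "disks" and their contents are made up. It only shows that with plain XOR parity, a flipped bit on one disk produces exactly the same mismatch as the same flipped bit on another disk, so the check can count errors but not attribute them.

```python
from functools import reduce

def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data disks, one small stripe each (made-up contents).
data = [b"\x10\x20\x30", b"\x0f\x0e\x0d", b"\xaa\xbb\xcc"]
parity = xor_parity(data)

# Flip the same bit in the first byte, once on the second disk and once on the third.
bad_second = [data[0], bytes([data[1][0] ^ 0x01]) + data[1][1:], data[2]]
bad_third = [data[0], data[1], bytes([data[2][0] ^ 0x01]) + data[2][1:]]

# The "syndrome" is the difference between stored parity and recomputed parity.
syndrome_second = bytes(p ^ q for p, q in zip(parity, xor_parity(bad_second)))
syndrome_third = bytes(p ^ q for p, q in zip(parity, xor_parity(bad_third)))

# Both faults give the same mismatch, so the check cannot tell them apart.
print(syndrome_second == syndrome_third)
```

The same reasoning applies if the parity disk itself holds the bad bit: the mismatch looks identical from the point of view of the check.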

March 3, 2015

Community Expert

all drives show good smart reports, but even if they didn't, from a usability standpoint, if unRAID feels there are errors, it should make it easy to see exactly what those errors are, and perhaps offer to help/fix them.

Asking a "normal" user to run smart reports on all their drives, one at a time, then post those results to the forum and hope for someone to find the errors is NOT a good way to handle this situation.

unRAID "knows" what these 5 errors are, why should I have to go individually scanning a dozen different drives to try to figure it out?

Attached is a pic showing the SMART reporting for all my drives. About 1/2 don't yet have extended tests, but I started an extended test for every one that still needs it.

The Smart Report that they are asking for is this one. You get to it by double-clicking on 'Disk 1' (or 'Disk X' for the Xth disk) on the Main tab. Then click on the 'HEALTH' tab and the 'Disk attributes' in the box.

Added info in edit: most of the time, attributes #5, 196, 197, 198, and 199 are the ones that you should be concerned about. If any of them is non-zero, that is an indication of a problem.
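For anyone who would rather not eyeball a dozen reports, the attribute check above can be sketched in Python. This assumes you have saved the attribute table as text in the column layout that `smartctl -A` prints for ATA drives (ID in the first column, raw value in the tenth). The sample table and its values are made up for illustration, and the function name is my own.

```python
WATCHED_IDS = {5, 196, 197, 198, 199}

def nonzero_watched(smart_table):
    """Map attribute ID -> (name, raw value) for watched attributes whose raw value is non-zero."""
    flagged = {}
    for line in smart_table.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id = int(fields[0])
        raw = fields[9]
        if attr_id in WATCHED_IDS and raw.isdigit() and int(raw) != 0:
            flagged[attr_id] = (fields[1], int(raw))
    return flagged

# Illustrative sample only; not taken from any drive in this thread.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       5
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       12
"""

print(nonzero_watched(SAMPLE))
```

A non-zero value is only a prompt to look closer. Comparing the same report a few weeks apart shows whether a count is still growing.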

March 3, 2015

all drives show good smart reports, but even if they didn't, from a usability standpoint, if unRAID feels there are errors, it should make it easy to see exactly what those errors are, and perhaps offer to help/fix them.

You're expecting too much at this point in time. Sure, it would be nice, but it's not feasible right now except for SMART errors visible to unRAID6. Even then, unRAID can't offer to fix them.

Asking a "normal" user to run smart reports on all their drives, one at a time, then post those results to the forum and hope for someone to find the errors is NOT a good way to handle this situation.

That's the nature of the beast right now.

unRAID "knows" what these 5 errors are, why should I have to go individually scanning a dozen different drives to try to figure it out?

Currently the driver doesn't know what the 5 errors are, only that there were 5 errors.

ALL the drive attributes need to be reviewed.

There could be drive errors, an interface issue, or possibly a memory issue.

Pending sectors are a key attribute to review.

If you have md5sums of the files, they can be used to see whether there is any kind of bitrot, read error, or corruption.

After the parity check, review the syslog to see if there were any ATA errors. That would be a tell tale sign and point to a specific drive.
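A rough Python sketch of that syslog review follows. It assumes the kernel's libata messages are prefixed with the port name (such as `ata3` or `ata3.00`) and contain words like "exception", "failed command", "error", or "reset". The sample lines are illustrative, not taken from a real log, and the keyword list is my own guess at what is worth flagging.

```python
import re

# Match a libata port prefix followed by a keyword that suggests trouble.
ATA_LINE = re.compile(
    r"\b(ata\d+)(?:\.\d+)?: .*(exception|failed command|error|reset)",
    re.IGNORECASE,
)

def ata_error_counts(lines):
    """Count suspicious ATA messages per port (e.g. 'ata3')."""
    counts = {}
    for line in lines:
        match = ATA_LINE.search(line)
        if match:
            port = match.group(1)
            counts[port] = counts.get(port, 0) + 1
    return counts

# Illustrative sample lines only.
SAMPLE = [
    "Mar  3 02:14:07 Tower kernel: ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x400100 action 0x6 frozen",
    "Mar  3 02:14:07 Tower kernel: ata3.00: failed command: READ DMA EXT",
    "Mar  3 02:14:08 Tower kernel: ata3: hard resetting link",
    "Mar  3 02:20:00 Tower kernel: usb 1-1: new high-speed USB device number 2 using ehci-pci",
]

print(ata_error_counts(SAMPLE))
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `ata_error_counts(open("/var/log/syslog"))` if that is where your syslog lives. Mapping a port such as `ata3` back to a specific disk is a separate step that depends on your hardware.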

March 3, 2015

Community Expert

Asking a "normal" user to run smart reports on all their drives, one at a time, then post those results to the forum and hope for someone to find the errors is NOT a good way to handle this situation.

unRAID "knows" what these 5 errors are, why should I have to go individually scanning a dozen different drives to try to figure it out?

As was mentioned, unRAID does not know where the errors are. At least with v6 it is easy to get the SMART reports via the standard GUI. If you have notifications turned on, you can also be told about changes in key SMART attributes.

March 3, 2015

Author

Okay, so I'm mistaken about unRAID knowing where the errors are; it happens.

Meaning I have to find the problem(s) myself, with the smart reports (I've attached them all to this post).

I only saw a few errors while compiling the screenshots, most of which are on disk 5, which I mentioned in my first post, and linked to the thread discussing those errors. They have not grown or changed for a long time (as far as I can tell), so I'm still not sure why running a correcting parity check would result in exactly 5 errors coming back every time I run a parity check. I assume that they should be 'corrected' in the parity by this process, but it seems that's just not happening.

So, I still don't know what I can/should do to resolve this, nor do I know if my data is "okay" or not. I have no MD5 information for anything, nor do I have a good grasp of how to generate such information.
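On generating MD5 information: one possible approach, sketched in Python with only the standard library, is to walk a directory, hash every file, and keep the result as a manifest to compare against later. The function names are my own, and the demo runs on a throwaway temporary directory. On a real server you would point `build_manifest` at a disk mount instead (assumed here to be something like `/mnt/disk1`).

```python
import hashlib
import os
import tempfile

def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large media files do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root):
    """Return {relative path: md5 hex digest} for every file under root."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = md5_of_file(full)
    return manifest

def changed_files(old, new):
    """List paths present in both manifests whose digests differ."""
    return sorted(p for p in old if p in new and old[p] != new[p])

# Demo: hash a file, silently alter one byte, hash again.
with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "movie.bin")
    with open(target, "wb") as fh:
        fh.write(b"original contents")
    before = build_manifest(root)
    with open(target, "wb") as fh:
        fh.write(b"originaL contents")
    after = build_manifest(root)

print(changed_files(before, after))
```

A manifest only helps for files hashed while they were still good, so it cannot say whether existing data is already damaged. It does give a baseline for future checks.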

March 3, 2015

Community Expert

Meaning I have to find the problem(s) myself, with the smart reports (I've attached them all to this post).

Seem to be missing!

I only saw a few errors while compiling the screenshots, most of which are on disk 5, which I mentioned in my first post, and linked to the thread discussing those errors. They have not grown or changed for a long time (as far as I can tell), so I'm still not sure why running a correcting parity check would result in exactly 5 errors coming back every time I run a parity check. I assume that they should be 'corrected' in the parity by this process, but it seems that's just not happening.

That tends to mean that either a data disk is being read unreliably and does not return the same data each time, or there is a write issue on the parity disk so what is read back is not what was written. Hopefully the reports will give a clue.
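The first of those two cases can be shown with a toy Python simulation. This is only an illustration of the mechanism, not a diagnosis of this array: the disk contents are made up, and the "flaky" disk is assumed to return a wrong bit on five stripes during every other pass.

```python
class Disk:
    """A disk that always returns what was stored."""
    def __init__(self, values):
        self.values = list(values)

    def read(self, stripe):
        return self.values[stripe]

class FlakyDisk(Disk):
    """A disk that returns a flipped bit on some stripes during odd-numbered passes."""
    def __init__(self, values, flaky_stripes):
        super().__init__(values)
        self.flaky_stripes = set(flaky_stripes)
        self.pass_number = 0

    def read(self, stripe):
        value = self.values[stripe]
        if stripe in self.flaky_stripes and self.pass_number % 2 == 1:
            value ^= 1
        return value

def correcting_check(disks, parity):
    """Rewrite parity wherever it disagrees with the XOR of the data reads; return the error count."""
    errors = 0
    for stripe in range(len(parity)):
        expected = 0
        for disk in disks:
            expected ^= disk.read(stripe)
        if parity[stripe] != expected:
            parity[stripe] = expected  # parity now matches whatever was just read
            errors += 1
    return errors

good = Disk([3, 7, 11, 13, 17, 19, 23, 29])
flaky = FlakyDisk([2, 4, 6, 8, 10, 12, 14, 16], flaky_stripes={1, 2, 4, 5, 7})
parity = [g ^ f for g, f in zip(good.values, flaky.values)]

results = []
for pass_number in range(1, 5):
    flaky.pass_number = pass_number
    results.append(correcting_check([good, flaky], parity))

print(results)
```

Each pass "fixes" parity to match what it just read, but the next pass reads something different on the same stripes, so the same count comes back every time.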

March 3, 2015

Author

Sorry, they were too big to fit. I'm adding them now.

March 3, 2015

Author

last one...

March 3, 2015

Community Expert

Disk 2: you have ID# 199 CRC errors. This indicates a problem transferring data from the disk (where it was read correctly) to the motherboard, which generally points to cabling issues. It could be a bad cable, a loose connection, or cross-talk between cables (caused by tying cables together to make things 'neat'). A remote possibility is the SATA controller.

Disk 5, has ID# 5 reallocated sectors. While the number is high, it is not an indication of a problem unless the number keeps increasing.

A number of your disks report ID# 187 Reported uncorrect errors. I am not sure what the significance of this condition is... (None of my disks even report this parameter and a quick Google search found nothing to answer this.)

EDIT: Look here for information on SMART attributes:

https://en.wikipedia.org/wiki/S.M.A.R.T.#Known_ATA_S.M.A.R.T._attributes

March 3, 2015

Justin => I have to wonder if somehow the correcting/non-correcting status bit isn't set correctly.

Try "toggling" it.

i.e. Uncheck the box; start a parity check; then Stop the check. Now Check the box to correct the errors and then run another parity check.

At the end of the check, it should show that it corrected 5 errors (assuming this remains consistent). If so, then run it again and see if it's finally stable.

March 3, 2015

Author

Thanks for the analysis.

So, check/replace cable from Disk2 as step 1. I've not tied, or otherwise bundled any of the cables, but there are many cables, so it could just be loose from moving everything else around.

It seems that the only other 'issue' is the "ID# 187 Reported uncorrect errors" on disks 5, 8 and the cache disk. All 3 disks are different manufacturers, sizes and ages, so nothing really in common with them, other than this error; strange.

When I review/fix the cable for disk2, I'll just double check those disks also, and confirm if they are all connecting to the same SATA controller, or anything else they may have in common.

I'm still not too sure why the correcting parity check doesn't 'fix' the parity disk to reflect the array disks, even if one or more of the array disks have issues. It seems like parity should adjust to match the array disks, but as you can see, I'm nowhere near an expert in any of this.

March 3, 2015

Author

Justin => I have to wonder if somehow the correcting/non-correcting status bit isn't set correctly.

Try "toggling" it.

i.e. Uncheck the box; start a parity check; then Stop the check. Now Check the box to correct the errors and then run another parity check.

At the end of the check, it should show that it corrected 5 errors (assuming this remains consistent). If so, then run it again and see if it's finally stable.

good idea. I'll do that after I get a chance to check the disk cables, to hopefully eliminate that as a potential issue also.

I'm going to wait until all the extended SMART tests finish, so it'll be a couple of hours before I can do anything else.

March 3, 2015

Community Expert

Disk 5, has ID# 5 reallocated sectors. While the number is high, it is not an indication of a problem unless the number keeps increasing.

This might account for the '5 errors that will not go away' as Pending sectors can return a different value each time they are read (which is one reason they tend to impact any recovery from failure).

March 3, 2015

Community Expert

Disk 5, has ID# 5 reallocated sectors. While the number is high, it is not an indication of a problem unless the number keeps increasing.

This might account for the '5 errors that will not go away' as Pending sectors can return a different value each time they are read (which is one reason they tend to impact any recovery from failure).

It is my understanding that Reallocated Sectors are 'bad' sectors that have already been retired from service and have been replaced by 'good' sectors from a pool of sectors that the drive manufacturer set up for this purpose. Since none of the other parameters that are usually watched show any current failures, the drive may be OK. As I understand it, what we don't want is to see the number increase.

Oh, I went back through the disks again (I have problems reading the light gray text on black background) and noticed that Disks 4 and 6 also have CRC errors! I would be checking to see if a SATA card was common to all of these disks!

March 3, 2015

Disk 5, has ID# 5 reallocated sectors. While the number is high, it is not an indication of a problem unless the number keeps increasing.

This might account for the '5 errors that will not go away' as Pending sectors can return a different value each time they are read (which is one reason they tend to impact any recovery from failure).

No -- a reallocated sector is simply a sector that has been re-mapped to a spare sector. It's NOT a bad sector. This is a normal function for modern disks, which have a number of spare sectors that can be assigned to replace sectors that fail. A more significant issue is "pending sectors" -- which are sectors that have exhibited issues but have not yet been reallocated.

March 3, 2015

Community Expert

Disk 5, has ID# 5 reallocated sectors. While the number is high, it is not an indication of a problem unless the number keeps increasing.

This might account for the '5 errors that will not go away' as Pending sectors can return a different value each time they are read (which is one reason they tend to impact any recovery from failure).

No -- a reallocated sector is simply a sector that has been re-mapped to a spare sector. It's NOT a bad sector. This is a normal function for modern disks, which have a number of spare sectors that can be assigned to replace sectors that fail. A more significant issue is "pending sectors" -- which are sectors that have exhibited issues but have not yet been reallocated.

You are right - I saw the figure '5' without noticing that it was Reallocated Sectors rather than Pending Sectors. Hopefully this was made a bit more obvious by the fact that the text I posted mentioned Pending Sectors.
