Parity errors?

April 21, 2016

I recently added a parity drive, and after the initial parity check one of my 4 TB disks shows 9162 errors.

What are those errors? Can I repair them?

Is the system still protected by the parity drive?

Thank you

Gus

April 21, 2016

Community Expert

A correcting parity check can sometimes incorrectly update parity when there are disk errors. Please post your diagnostics (Tools > Diagnostics).

April 21, 2016

Author

Here it is.

Thank you

Gus

tower-diagnostics-20160421-1448.zip

April 21, 2016

Community Expert

Serial Number:    WD-WCC4E1170324
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       10

Disk2 has 10 pending sectors. Normally I would say rebuild it, but since you don't have reliable parity yet, I would start by copying all its files to a drive that is not in the array: either to another computer on your network or to a drive mounted outside the array with Unassigned Devices.

April 21, 2016

Community Expert

Disk 2 has pending sectors and should be replaced:

Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4E1170324

197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       10

However, parity was incorrectly updated, which is why I recommend always running non-correcting parity checks:

Apr 21 02:08:31 Tower kernel: md: disk2 read error, sector=7468304696
Apr 21 02:08:31 Tower kernel: md: disk2 read error, sector=7468304704
Apr 21 02:08:31 Tower kernel: md: disk2 read error, sector=7468304712
Apr 21 02:08:31 Tower kernel: md: disk2 read error, sector=7468304720
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304728
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304736
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304744
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304752
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304760
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304768
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304776
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304784
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304792
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304800
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304808
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304816
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304824
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304832
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304840
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304848
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304856
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304864
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304872
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304880
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304888
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304896
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304904
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304912
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304920
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304928
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304936
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304944
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304952
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304960
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304968
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304976
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304984
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304992
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305000
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305008
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305016
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305024
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305032
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305040
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305048
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305056
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305064
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305072
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305080
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305088
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305096
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305104
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305112
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305120
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305128
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305136
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305144
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305152
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305160
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305168
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305176
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305184
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305192
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305200
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305208
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305216
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305224
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305232
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305240
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305248
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305256
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305264
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305272
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305280
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305288
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305296
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305304
Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468305312
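
To get a quick overview of a syslog excerpt like the one above, something along these lines could be used. This is only a minimal Python sketch: the function name `summarize_md_log` is made up for illustration, and the regular expression assumes the lines look exactly like the two message shapes quoted in this post.

```python
import re
from collections import Counter

# Matches the two md driver message shapes quoted in this thread (an assumption
# about the format; other kernel versions may word these messages differently).
LINE_RE = re.compile(
    r"md: (?:(?P<disk>disk\d+) read error|(?P<fix>correcting parity)), "
    r"sector=(?P<sector>\d+)"
)

def summarize_md_log(lines):
    """Count read errors per disk and parity corrections, with sector ranges."""
    read_errors = Counter()
    read_sectors = []
    corrected_sectors = []
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # not one of the two message types we care about
        sector = int(match.group("sector"))
        if match.group("disk"):
            read_errors[match.group("disk")] += 1
            read_sectors.append(sector)
        else:
            corrected_sectors.append(sector)
    return {
        "read_errors": dict(read_errors),
        "read_range": (min(read_sectors), max(read_sectors)) if read_sectors else None,
        "parity_corrections": len(corrected_sectors),
        "corrected_range": (
            (min(corrected_sectors), max(corrected_sectors))
            if corrected_sectors else None
        ),
    }

# A few lines copied from the log above.
sample = [
    "Apr 21 02:08:31 Tower kernel: md: disk2 read error, sector=7468304696",
    "Apr 21 02:08:31 Tower kernel: md: disk2 read error, sector=7468304704",
    "Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304728",
    "Apr 21 02:08:31 Tower kernel: md: correcting parity, sector=7468304736",
]
print(summarize_md_log(sample))
```

Read errors followed immediately by parity corrections in the neighbouring sector range are the pattern to watch for.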

This is what I would do:

Replace disk2 and let it rebuild. Then, if you have checksums for all files, check which ones are corrupt and replace them.

If you don't have checksums, use a file compare utility to compare the rebuilt disk with the old one.
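
If no dedicated compare utility is at hand, the comparison could be sketched in Python roughly like this. It hashes every file on the rebuilt disk and its counterpart on the old disk and lists the ones that differ. The function names are illustrative only, and it assumes both disks are mounted somewhere readable at the same time.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large videos do not need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def compare_trees(rebuilt_root, old_root):
    """Return (differing, missing, unreadable) lists of relative paths."""
    rebuilt_root, old_root = Path(rebuilt_root), Path(old_root)
    differing, missing, unreadable = [], [], []
    for rebuilt_file in sorted(p for p in rebuilt_root.rglob("*") if p.is_file()):
        rel = rebuilt_file.relative_to(rebuilt_root)
        old_file = old_root / rel
        if not old_file.is_file():
            missing.append(str(rel))
            continue
        try:
            if sha256_of(rebuilt_file) != sha256_of(old_file):
                differing.append(str(rel))
        except OSError:
            # A read error (likely on the failing disk) means this pair
            # could not be compared at all.
            unreadable.append(str(rel))
    return differing, missing, unreadable
```

It would be called as `compare_trees("/path/to/rebuilt_disk", "/path/to/old_disk")`, where both paths are placeholders for wherever the two disks are mounted. Files reported as differing or unreadable are the candidates to restore from another source.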

April 21, 2016

Author

Sorry, but I'm not sure I understood...

Do those disk errors mean a hardware sector failure? Can I make a warranty claim?

Once I get all the content off this disk, is there any way to reuse it?

If I remember correctly, on my old Synology the system scanned a disk the first time you inserted it, to find any errors and mark them. Is that the same as what happens when we format a new drive in Unraid?

Replace disk2 and let it rebuild

For the moment I'll use the parity disk as the disk to copy the files to, and buy a new disk for parity.

The disk with errors basically contains some movies (nothing important).

When I copy those files to the new disk, will the files affected by those errors fail to copy?

Once I have discarded the problematic files, will it be safe to recreate parity?

Thank you

Gus

April 21, 2016

Community Expert

Pending sectors are usually bad sectors. If the disk is under warranty, WD will replace it.

You have more than one option to recover data, including:

1-Use a spare disk to rebuild disk2. Some files will be corrupt; if they are only videos they should still play, and the corruption can be almost unnoticeable. Use the old disk or checksums to find and replace corrupt files (this would be my preferred option).

2-Remove the disk from the array, do a new config with or without a new disk, and copy all the data you can from the old disk. You probably won't be able to copy files from the affected sectors.

3-Since you have space, move all the files you can from disk2 to other disk(s), then replace disk2 and do a parity sync instead of a rebuild.
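
For options 2 and 3, the "copy everything you can" step could look roughly like the Python sketch below. It keeps going when a file fails to read and records the failures instead of stopping. The function name is hypothetical and this is not an Unraid tool, just an illustration of the idea.

```python
import shutil
from pathlib import Path

def copy_what_you_can(src_root, dst_root):
    """Copy every file under src_root to dst_root, skipping files that fail."""
    src_root, dst_root = Path(src_root), Path(dst_root)
    copied, failed = [], []
    for src in sorted(p for p in src_root.rglob("*") if p.is_file()):
        rel = src.relative_to(src_root)
        dst = dst_root / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        try:
            shutil.copy2(src, dst)  # copies data plus timestamps
            copied.append(str(rel))
        except OSError as exc:
            # Typically an I/O error from an unreadable sector. Drop the
            # partial copy so it is not mistaken for a good file later.
            if dst.is_file():
                dst.unlink()
            failed.append((str(rel), str(exc)))
    return copied, failed
```

The `failed` list then tells you which files sit on the affected sectors and need to be replaced from elsewhere.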

April 21, 2016

Author

One last thing.

I have disabled the parity disk to use it as a temporary backup (until I get a replacement unit) of the drive with those bad sectors, but the bad sectors have disappeared...

What's the explanation for this?

Thank you

Gus

April 21, 2016

What makes you think the bad sectors are gone? The drive error counter in the Unraid web GUI is reset every time you stop the array, and is only incremented when Unraid is unable to read from the disk. You need to look at the SMART report to see the bad sector counts.

April 21, 2016

Author

OK, I got it.

How do I view the SMART report?

Thank you

Gus

April 21, 2016

Click on the device text and scroll down to the attributes section.

April 21, 2016

Author

Here they are:

Is attribute #1 the field to look at?

#	Attribute Name	Flag	Value	Worst	Threshold	Type	Updated	Failed	Raw Value
1	Raw read error rate	0x002f	200	200	051	Pre-fail	Always	Never	229
3	Spin up time	0x0027	193	175	021	Pre-fail	Always	Never	7341
4	Start stop count	0x0032	092	092	000	Old age	Always	Never	8712
5	Reallocated sector count	0x0033	200	200	140	Pre-fail	Always	Never	0
7	Seek error rate	0x002e	100	253	000	Old age	Always	Never	0
9	Power on hours	0x0032	080	080	000	Old age	Always	Never	14838 (1y, 8m, 9d, 6h)
10	Spin retry count	0x0032	100	100	000	Old age	Always	Never	0
11	Calibration retry count	0x0032	100	100	000	Old age	Always	Never	0
12	Power cycle count	0x0032	100	100	000	Old age	Always	Never	154
192	Power-off retract count	0x0032	200	200	000	Old age	Always	Never	91
193	Load cycle count	0x0032	198	198	000	Old age	Always	Never	8655
194	Temperature celsius	0x0022	120	105	000	Old age	Always	Never	32
196	Reallocated event count	0x0032	200	200	000	Old age	Always	Never	0
197	Current pending sector	0x0032	200	200	000	Old age	Always	Never	10
198	Offline uncorrectable	0x0030	100	253	000	Old age	Offline	Never	0
199	UDMA CRC error count	0x0032	200	200	000	Old age	Always	Never	0
200	Multi zone error rate	0x0008	100	253	000	Old age	Offline	Never	0

Thank you

Gus

April 21, 2016

Community Expert

Attribute 197 (Current pending sector) is one of the most important attributes, and it is this disk's problem. It should always be 0.
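
As a rough illustration, rows copied from the attributes table can be checked with a few lines of Python. This is a sketch only: it assumes tab-separated rows in the column order shown in the table above, and the watch list is partly an assumption. Attribute 197 comes from this thread, while 5 and 198 are added here because they are also commonly monitored.

```python
# 197 is named in this thread; 5 and 198 are an assumption added for illustration.
WATCHED_IDS = {5, 197, 198}

def nonzero_watched(rows, watched=WATCHED_IDS):
    """Return {attribute_id: (name, raw_value)} for watched attributes with raw != 0."""
    flagged = {}
    for row in rows:
        cols = row.split("\t")
        if len(cols) < 10 or not cols[0].strip().isdigit():
            continue  # header line or malformed row
        raw_parts = cols[9].split()
        if not raw_parts or not raw_parts[0].isdigit():
            continue  # raw value is not a plain number
        attr_id = int(cols[0])
        raw = int(raw_parts[0])  # ignores trailing text such as "(1y, 8m, ...)"
        if attr_id in watched and raw != 0:
            flagged[attr_id] = (cols[1], raw)
    return flagged

# Three rows taken from the table posted above.
rows = [
    "5\tReallocated sector count\t0x0033\t200\t200\t140\tPre-fail\tAlways\tNever\t0",
    "197\tCurrent pending sector\t0x0032\t200\t200\t000\tOld age\tAlways\tNever\t10",
    "198\tOffline uncorrectable\t0x0030\t100\t253\t000\tOld age\tOffline\tNever\t0",
]
print(nonzero_watched(rows))  # prints {197: ('Current pending sector', 10)}
```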

April 21, 2016

Author

Thank you @johnnie.black

I'm now using mc to copy the content of the disk with those bad sectors to a new drive.

Should the error counter in the Unraid web GUI increase while I'm copying data?

Thank you

Gus

April 21, 2016

Community Expert

When you try to copy a file on the affected sectors you should see the error counter increase and will probably get an error from mc that it can't copy that file.

With some luck it only affects 1 or 2 files.

April 22, 2016

Author

mc stopped the copy process (I SSH into Unraid). Perhaps it gets disconnected when the computer enters sleep mode?

I'm now copying the content through my Windows VM... As you said, the errors have appeared: so far 92 errors and 2 files with read errors... 216 min to go.

Gus

April 22, 2016

Precisely. If you are accessing the console remotely (as opposed to typing on a keyboard attached to Unraid), then you need to either make sure the session will not drop, or invoke the screen command (available in the nerdtools plugin) before starting any lengthy activity. The cool thing about screen is that you can detach from a session started from one location or method, and reattach to the same session from elsewhere. For example, you could start a screen session from the local keyboard, detach and leave it running, then reattach from an SSH session later.

Just be cognizant of what you leave open. If you have a session open with an active prompt on an array drive, shutdown will fail until you close that session.

April 22, 2016

Author

Thank you @jonathanm

Going to take a look at screen!

Gus

April 24, 2016

Author

I'm on cycle 2 of 3 of a preclear of the "faulty" drive with those pending sectors,

but when I checked the SMART report again, the "Current pending sector" value of 10 has disappeared and it now shows 0.

Must I wait for the 3rd preclear cycle to finish to get a "reliable" result?

Has the drive corrected those sectors?

I have an RMA. What should I do now? (WD will not see any pending sectors.)

Thank you

Gus

April 24, 2016

Personal opinion here, others may have different viewpoints. If the final SMART stats after all the preclear cycles look good, I'd be tempted to keep the drive vs. getting an unknown refurb from the warranty process. Devil you know, etc. Post a SMART report after all three cycles are done and you should get some better opinions on keep vs. trade.

BTW, WD will warranty replace a perfectly good drive if you tell them you don't trust it. It's too much hassle for them to vet each RMA request, and they are just going to turn around and test your incoming drive. If it looks decent to them, it will get a refurb label, have its SMART data reset, and be sent out to the next customer (victim) with an RMA. You never know if the drive you get as a replacement has some funky issue that evaded their testing.

April 24, 2016

Or better yet, show the various SMART reports to the vendor you bought the drive from and get a replacement from them. (This is one of the reasons I never buy hard drives online: it is so much easier to return them at a brick and mortar store if they are semi-questionable, and well worth the extra $10 I spend.)

EDIT: Although at 14000+ power on hours, you pretty much have no choice but to RMA it if you want.

April 24, 2016

Community Expert

Personal opinion here, others may have different viewpoints. If the final SMART stats after all the preclear cycles look good, I'd be tempted to keep the drive vs. getting an unknown refurb from the warranty process. Devil you know, etc. Post a SMART report after all three cycles are done and you should get some better opinions on keep vs. trade.

BTW, WD will warranty replace a perfectly good drive if you tell them you don't trust it. It's too much hassle for them to vet each RMA request, and they are just going to turn around and test your incoming drive. If it looks decent to them, it will get a refurb label, have its SMART data reset, and be sent out to the next customer (victim) with an RMA. You never know if the drive you get as a replacement has some funky issue that evaded their testing.

+1

In my experience, refurbished disks have at best a 50/50 chance of lasting more than a couple of months. Post a SMART report when the preclear finishes.

April 24, 2016

Author

In this case, the vendor (a big online vendor) will give me the money back to buy a new one.

Gus

April 24, 2016

Community Expert

In that case I'd take the new one.
