Removed two drives, did newconfig, parity check, new disk4 - 32 read errors


Hello, I recently upgraded my raid to 5.0.5, all ok.  Mapped all my drives and rebuilt parity ok with no errors.

After getting to 5.0.5 I upgraded my parity drive to 4TB.  Rebuilt parity on it - no problem.

Then I upgraded one drive from 1TB to 4TB (Disk4) - Did a parity data rebuild on Disk4 - no problem.

Moved data from Disk5 to Disk4 using windows PC with parity on - no problem.

Moved data from Disk15 to Disk4 using windows PC with parity on - no problem.

 

So I then wanted to remove disk5 and disk15 from the array since they aren't needed and I would have spare slots again.

I did a new config, remapped my drives (Except for disk5 and disk15 to be dropped)

Then started a parity rebuild - once finished got 32 sync errors all on Disk4.

 

I did run a reiserfsck on Disk4 with No corruptions found.

 

Trying to figure out my next step to verify everything is ok with no data loss, or if there was loss, which file(s)?

Also wondering, since Parity and Disk4 are new, if either drive might have issues.  Didn't run any preclear checks on either drive (parity or disk4) before installing.

 

1. Best method to know if data on disk4 is ok?  Copy it? (Or will the array re-create the bad sectors from parity?)

2. Best method to test parity and/or disk4 (has 2.5TB of data) to make sure they are good?

3. Array says parity is valid but wondering if good?

 

Thanks for any guidance to make sure my array will be stable going forward.

Array has 13 data drives now, parity active, two spare bays.

Still have original disk4, disk5, and disk15 not online.

Syslog and smart files for parity and disk4 attached.

syslog20140818.zip

smart_parity_20140818.txt

smart_disk4_20140818.txt

Link to comment

...

Didn't run any preclear checks on either drive (parity or disk4) before installing.

...

 

Disk4 has a single pending sector. A read error in unRAID is not a catastrophic event. In theory, if there is a read error from a disk, unRAID will reconstruct the data (from parity and all of the other disks), attempt to write the data back to the disk with the error, and return the reconstructed data. The writing of the data back to the original disk should cause a sector remap to occur (a sector remap means that a spare sector is substituted for the bad sector. Afterwards the pending count is reduced by one, the remapped count increases by one, and the actual tiny piece of the disk surface that caused the problem is never used again).

 

But it didn't happen on your disk. We see no reallocated sectors and we still see the pending sector.
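For anyone curious what "reconstruct the data from parity and all of the other disks" means in practice, here is a minimal sketch. It assumes plain single-parity XOR across the data disks; the three "disks" and their byte values are made up for illustration and this is not unRAID's actual driver code.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Toy array: three data disks, one tiny "sector" each (values are made up).
data = [b"\x0f\xa0\x33\x01", b"\xf0\x0a\x55\x02", b"\x11\x22\x44\x04"]

# Parity is the XOR of the same sector on every data disk.
parity = xor_blocks(data)

# Pretend the disk at index 1 returned a read error for this sector:
# XOR parity with the same sector from every surviving disk to get it back.
survivors = [block for index, block in enumerate(data) if index != 1]
rebuilt = xor_blocks(survivors + [parity])

print(rebuilt == data[1])
```

The same property is why a rebuild needs every other disk to read cleanly: one more unreadable sector at the same position and the XOR no longer has enough information.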

 

A preclear may have identified this problem prior to your adding the disk to your array. It is definitely something everyone should do on every new drive. Others go for multiple cycles, but I use a single cycle which is sufficient for finding most issues. Maybe you'll run it next time!

 

At this point I think I would run a few parity checks and monitor the smart report. If the pending sector gets remapped and nothing else goes south, I'd say you are fine. But my experience is that even a single error like this is often a precursor to more and more sectors starting to misbehave. But we don't know. My rule of thumb is that if I run 3 consecutive preclears and get consistent results, things are stable for now.
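If you want to track those two counters between parity checks without eyeballing the whole report, something like the sketch below could work. It assumes you saved the text output of `smartctl -A` to a file and that the drive reports the usual attribute names `Reallocated_Sector_Ct` and `Current_Pending_Sector`; the function name and the sample rows are illustrative, not copied from the attached reports.

```python
def read_smart_counts(report_text,
                      names=("Reallocated_Sector_Ct", "Current_Pending_Sector")):
    """Pull raw values for the named attributes out of a smartctl -A style table."""
    counts = {}
    for line in report_text.splitlines():
        fields = line.split()
        # Attribute rows look like:
        # ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in names:
            counts[fields[1]] = int(fields[9])
    return counts

# Illustrative rows only (hypothetical values in the usual smartctl layout).
sample = """
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       1
"""

print(read_smart_counts(sample))
```

Run it on a fresh report after each check: pending dropping to 0 with reallocated going up by one is the remap described above, while either number climbing from check to check is the bad sign.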

 

On running preclear, it is funny that I have skipped preclearing a disk on only one occasion since the tool came out, as I was trying to make urgent repairs and figured that the chances of a bad disk were quite small. I had probably precleared 20 straight disks with no errors at that time. But the one disk I didn't preclear was a bad disk and caused me havoc. I wonder if the systematic read and write cycles that preclear generates on the drive might be therapeutic in breaking in a new disk and not only detect, but actually avoid, errors for some disks. But whether it is coincidence or not, preclearing every disk is certainly recommended.

Link to comment

Oh boy, I didn't see the 1 sector pending.  Any command to tell the drive to commit?

 

I never got any errors when I wrote the data to this drive.  It was when I was reconstructing my parity since I dropped two drives from the array. 

 

What about the 32 sector read errors during the parity reconstruct (every eighth sector from 1040768160 to 1040768408)?  Does that mean my parity, even though it says it's valid, might not contain those 32 sectors?

 

Best way to know if those 32 sectors contained data or were not used?

 

Should I preclear the troubled disk4 and start over moving the data from the three removed drives if it passes?
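As a sanity check on the numbers quoted above, here is a quick sketch of the arithmetic. It assumes 512-byte sectors and that each logged error stands for one 8-sector (4 KiB) block; both are assumptions, and the syslog alone does not say whether the sector numbers count from the start of the device or of the partition.

```python
SECTOR_BYTES = 512  # assumption: 512-byte logical sectors

first, last, step = 1040768160, 1040768408, 8

# Every eighth sector from first to last, inclusive.
reported = list(range(first, last + 1, step))
count = len(reported)

# If each report covers `step` sectors, the errors form one contiguous run.
span_bytes = count * step * SECTOR_BYTES

print(count, "reports covering", span_bytes // 1024, "KiB")
print("starting roughly", first * SECTOR_BYTES // 2**30, "GiB into the disk")
```

So under those assumptions the 32 errors are one small contiguous patch rather than 32 scattered bad spots, which fits with a single pending sector in the SMART report.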

Link to comment


I suggested running parity checks.

 

You could compute md5 sums on the old disks and new one and check for mismatches to identify if there was any corruption.
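A minimal sketch of that md5 comparison, assuming the old disk and the new one are both mounted and the files were copied with the same relative paths. The function names here are just for illustration, and where the old 1TB disk gets mounted is up to you.

```python
import hashlib
import os

def md5_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large media files do not fill memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def md5_tree(root):
    """Map each file's path (relative to root) to its md5 hex digest."""
    sums = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(folder, name)
            sums[os.path.relpath(full, root)] = md5_of_file(full)
    return sums

def compare_trees(old_root, new_root):
    """Return (missing, mismatched) relative paths, judged from the old tree."""
    old_sums = md5_tree(old_root)
    new_sums = md5_tree(new_root)
    missing = sorted(p for p in old_sums if p not in new_sums)
    mismatched = sorted(p for p in old_sums
                        if p in new_sums and old_sums[p] != new_sums[p])
    return missing, mismatched
```

Calling `compare_trees` with the old disk's mount point and `/mnt/disk4` gives two lists: files that never made it across and files whose contents differ. A file that cannot be read at all will raise an error instead, which is also worth knowing about.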

Link to comment

Thanks everyone.  Since I still had the original three 1TB drives with the original data I decided to preclear the new 4TB drive (like I should have in the first place ::) and then copy over all of the data again.  It took a number of days but seems to be ok now, and when I rebuilt parity afterwards I got no errors.  Smart is passing and doesn't show any pending sectors now.

Link to comment
