UNRAID6 Parity Check Sync Errors but UNRAID5 OK


unRAID OS Version:

6.0.1

Description:

I upgraded from Unraid5 (final, not beta) to Unraid6. Everything works fine except that a parity check finds sync errors, about 1500-1600 for the whole array.

I ran SMART tests on all my drives, and I tested copying files to the array and checking their md5 sums: no errors.
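
A copy-and-verify test like the one described above could look roughly like this in Python. This is only a sketch: the function names are made up for illustration, and it says nothing about how the original test was actually run. Keep in mind that reading back a freshly written file may be served from the RAM cache rather than from the disk, so a match here does not prove the data on disk is good.

```python
import hashlib


def md5_of(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(source_path, copy_path):
    """True if both files have the same MD5 digest (hypothetical helper)."""
    return md5_of(source_path) == md5_of(copy_path)
```

Usage would be something like `copies_match("/path/to/original", "/path/to/copy/on/array")`, with both paths replaced by real files.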

So I decided to let Unraid6 correct parity (thinking maybe something had happened), then I ran another parity check and still got sync errors...

I formatted my USB again, put back my Unraid5 backup and let it correct the parity that Unraid6 f**ked up.

Then I ran another parity check with Unraid5: no errors.

 

No plugins installed, just plain Unraid6. I did play with Dockers, but the problem was there as soon as Unraid6 booted.

 

Maybe a driver problem?

 

How to reproduce:

That's the question... maybe by getting the same hardware and putting Unraid6 on it, I don't know.

 

Expected results:

No parity sync errors

 

Actual results:

parity sync errors only in Unraid6

 

Other information:

M/B: Gigabyte Technology Co., Ltd. - GA-890GPA-UD3H

CPU: AMD Phenom II X4 955 @ 3200

No SATA Cards. Using onboard SATA.

 

 

Thanks for the help

 

Link to comment

Here are the lines I'm looking at in the syslog:

 

First it does show an unclean previous shutdown that typically does lead to parity mismatch:

 

Aug 13 09:30:05 Tower emhttp: unclean shutdown detected

Aug 13 09:30:05 Tower kernel: mdcmd (43): check CORRECT

Aug 13 09:30:05 Tower kernel: md: recovery thread woken up ...

Aug 13 09:30:05 Tower kernel: md: recovery thread checking parity...

Aug 13 09:30:05 Tower kernel: md: using 1536k window, over a total of 2930266532 blocks.

 

Here are a couple:

 

Aug 13 09:30:14 Tower kernel: md: correcting parity, sector=1643264

Aug 13 09:30:15 Tower kernel: md: correcting parity, sector=1879488

 

Then it looks like you cancelled the operation:

 

Aug 13 09:30:16 Tower kernel: mdcmd (44): nocheck

Aug 13 09:30:16 Tower kernel: md: md_do_sync: got signal, exit...

Aug 13 09:30:16 Tower kernel: md: recovery thread sync completion status: -4

 

and then restarted it with "write corrections" unchecked:

 

Aug 13 09:30:25 Tower kernel: mdcmd (45): check NOCORRECT

Aug 13 09:30:25 Tower kernel: md: recovery thread woken up ...

Aug 13 09:30:25 Tower kernel: md: recovery thread checking parity...

Aug 13 09:30:25 Tower kernel: md: using 1536k window, over a total of 2930266532 blocks.

 

And we see some more mismatches pop out:

 

Aug 13 09:30:28 Tower kernel: md: parity incorrect, sector=470624

Aug 13 09:30:33 Tower kernel: md: parity incorrect, sector=1384880

Aug 13 09:30:34 Tower kernel: md: parity incorrect, sector=1643264

Aug 13 09:30:34 Tower kernel: md: parity incorrect, sector=1693688

Aug 13 09:30:35 Tower kernel: md: parity incorrect, sector=1879488

 

What's troubling is that two of these (sector=1643264 and sector=1879488) match ones supposedly corrected before you cancelled.

 

Are you saying, with everything in this state, if you shutdown (cleanly) and then boot unRaid-5 you don't see those errors?  But then if you shutdown (cleanly) and merely boot unRaid-6 the errors reappear?  If so, are they in the exact same locations?
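
For anyone who wants to repeat this comparison on a longer syslog, here is a minimal Python sketch. It assumes only the two message formats quoted above ("md: correcting parity, sector=N" and "md: parity incorrect, sector=N"). The function names are hypothetical, and the sample lines are copied from the excerpts in this post.

```python
import re

# Matches the two kernel messages quoted in the syslog excerpts above.
SECTOR_RE = re.compile(r"md: (correcting parity|parity incorrect), sector=(\d+)")


def sectors_by_kind(lines):
    """Group reported sector numbers by message type."""
    found = {"correcting parity": [], "parity incorrect": []}
    for line in lines:
        match = SECTOR_RE.search(line)
        if match:
            found[match.group(1)].append(int(match.group(2)))
    return found


def repeated_sectors(lines):
    """Sectors logged as corrected that are also logged as incorrect."""
    found = sectors_by_kind(lines)
    corrected = set(found["correcting parity"])
    return sorted(corrected & set(found["parity incorrect"]))


SAMPLE = """\
Aug 13 09:30:14 Tower kernel: md: correcting parity, sector=1643264
Aug 13 09:30:15 Tower kernel: md: correcting parity, sector=1879488
Aug 13 09:30:28 Tower kernel: md: parity incorrect, sector=470624
Aug 13 09:30:33 Tower kernel: md: parity incorrect, sector=1384880
Aug 13 09:30:34 Tower kernel: md: parity incorrect, sector=1643264
Aug 13 09:30:34 Tower kernel: md: parity incorrect, sector=1693688
Aug 13 09:30:35 Tower kernel: md: parity incorrect, sector=1879488
"""

if __name__ == "__main__":
    print(repeated_sectors(SAMPLE.splitlines()))  # [1643264, 1879488]
```

The sketch does not look at timestamps, so it only makes sense on a log where the correcting pass ran before the non-correcting one, as it did here.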

Link to comment

Here is more information:

After going back to Unraid5, where everything works fine, I formatted my USB again and put Unraid6 back on it to capture the logs.

When I booted Unraid6 for the first time after switching from Unraid5, it said unclean shutdown detected and automatically started a parity check (with auto correct). I didn't have the option not to correct, so I cancelled the parity check right away and then started a new one without parity correction. That explains the logs.

 

Since Unraid6 made 2 corrections, when I go back to Unraid5, Unraid5 will have to correct those errors from Unraid6, and then everything will be fine on Unraid5.

But on Unraid6, if I let it run, it will correct 1500-1600 sync errors, and if I start it again it will still find other errors... false errors, because the parity is OK.

 

When I first moved to Unraid6 I ran a parity check just before and it was perfect.

Link to comment

When you change versions, are you perhaps copying a backup of config/super.dat that you took over the network while the array was running? That would cause an unclean shutdown because when it boots up super.dat says it was already running instead of stopped.
Link to comment

Ah you got it.. When I made my Unraid5 backup folder I copied it over the network while Unraid5 was running.

I then kept those files when I created my Unraid6 USB. That would explain the unclean shutdown but not the sync errors :(

Link to comment

The flash share still exists after you stop the array, so you can get a copy of super.dat over the network with the stopped array status if you just stop the array before making the copy. I often do this after making any changes to the array or any settings.
Link to comment

Yes, I'll take a new copy of super.dat while the array is stopped.

 

I hope I'll find a way to fix the parity sync errors, because I really like Unraid6 but I'm stuck with Unraid5 :(

I don't have any clue, because it is not hardware: Unraid5 is working fine. The log doesn't give precise details.

I don't know where to look.

 

I think something is happening when it reads data from disk: it gets corrupted or isn't read properly... or something is happening in memory while processing. I don't know.

I'll try running Plex and watching more programs to see if everything is fine... today my girlfriend was watching something and was getting intermittent MPEG mosaic artifacts or pink bars on an episode of a show. Maybe the file itself was like that, but maybe it was a read error... so I'll check whether it seems to happen with everything.

 

Link to comment

memtest on the boot menu.

 

I could, but I have and have had no problems at all with my server on UNRAID or other OSes.

I only have it with UNRAID6... but if I had a problem with my RAM... it just came to mind that I have 4 GB of RAM,

so 32-bit Unraid5 only uses about 3 GB maximum, if I'm right, while 64-bit Unraid6 will use all 4 GB.

 

I have nothing to lose doing a memtest. I'll go start it right away.

Link to comment

Confirmed no memory problem. Memtest x64 ran for more than 12 hours and completed 17 passes with no errors.

memtest only tests the DRAM cells and the CPU/memory datapath.  It does not test the DMA/memory datapath.  I have seen 'bad memory' pass memtest but still write the wrong thing to storage devices.  Probably this is not the case here, just pointing it out.

 

In my last post I asked two questions... can you answer please?

Link to comment

 

 

What's troubling is that two of these (sector=1643264 and sector=1879488) match ones supposedly corrected before you cancelled.

 

Are you saying, with everything in this state, if you shutdown (cleanly) and then boot unRaid-5 you don't see those errors?  But then if you shutdown (cleanly) and merely boot unRaid-6 the errors reappear?  If so, are they in the exact same locations?

In the case where UNRAID6 has corrected (that is, corrupted) parity, when I boot UNRAID5 it will fix it.

In UNRAID6 the errors always reappear; UNRAID6 always finds sync errors.

 

 

- If I don't let UNRAID6 correct the "false sync errors" (false because parity is 100% correct, tested in UNRAID5), then when I boot UNRAID5 everything is fine.

- If I let UNRAID6 correct the parity, it corrupts the parity. Then when I boot UNRAID5, UNRAID5 detects the corruption made by UNRAID6 and corrects it once, and all later parity checks are fine, with no errors.

- If I let UNRAID6 correct the parity completely and then restart a parity check, UNRAID6 still finds sync errors.

- The errors seem to be random.

- It looks like read errors when reading from disk.

Link to comment

I guess I'll have to run UNRAID5 until UNRAID6 is final final.

I'll try again when there is another version out.

 

For now, I think I'll try to go ESXi with UNRAID5 to be able to run everything on the same machine.

 

Let me know if you have any clue what is wrong with my UNRAID6

 

Thank you

 

 

Link to comment

I have been unable to reproduce this.  What is your h/w config (motherboard, controllers)?  I notice you're using a PATA drive for cache and you have 7 SATA drives, but I count only 6 SATA controller ports. Are you using a PATA/SATA converter?

Link to comment

I'm using a GA-890GPA-UD3H motherboard that has

 

6 SATA 6Gb/s ports, Marvell (south bridge)

2 SATA 3Gb/s ports, Gigabyte chipset

1 PATA port

 

So there are 8 SATA ports onboard.

I'm using onboard controllers, NO extra PCI/PCI-E card.

I'm not using a converter; my cache drive is on the PATA port and is only used for apps.

 

Thanks

 

Edited: more info on the board. North bridge: AMD 890GX, without IOMMU

 

Link to comment

I was working on getting ESXi running and I saw that my HDD controller was set to IDE, so I changed it to AHCI.

I also added 8 GB of RAM and lowered its speed.

 

Now UNRAID6 still reports sync errors, but significantly fewer...

So I'm letting the parity check run for a while to get a better idea, and tomorrow I'll remove my old RAM and keep only the new one, even though memtest didn't find errors and Unraid5 runs perfectly. Just to double and triple check and rule out that possibility.

 

If it still produces sync errors, then it may have something to do with the Unraid6 HDD controller driver for my controllers.

 

 

Link to comment

Problem is fixed.

 

Even though that server was running perfectly with UNRAID5, ESXi, Win7 etc.

Even though memtest x64 didn't find any errors.

 

It seems like changing the RAM fixed my problem; that RAM didn't like Unraid6, I guess... a bit weird, but I'm glad it's working fine on Unraid6 now.

 

I did a complete parity check on Unraid6 without any sync errors.

 

Thanks all for the help

 

Link to comment
