Parity sync errors, always 5, not Marvell



Hello. 

 

I'm scratching my head over this one. Not a Marvell controller; it's the built-in ASMedia 1166. It is a QNAP TS-664 with 16 GB RAM. I have never had an error shown in the Main tab, always zero errors, but I always get 5 sync errors when running the parity check, always the same sectors. They are identical to the sectors other people around here have reported in the past, but they seem to have been using Marvell controllers at the time.

 

May  2 11:15:54 unRaid kernel: md: recovery thread: P corrected, sector=3519069768
May  2 11:15:54 unRaid kernel: md: recovery thread: P corrected, sector=3519069776
May  2 11:15:54 unRaid kernel: md: recovery thread: P corrected, sector=3519069784
May  2 11:15:54 unRaid kernel: md: recovery thread: P corrected, sector=3519069792
May  2 11:15:54 unRaid kernel: md: recovery thread: P corrected, sector=3519069800
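For anyone comparing their own logs against these: below is a minimal Python sketch (not an Unraid tool) that pulls the sector numbers out of syslog lines like the ones above and prints the spacing between them. The names `corrected_sectors` and `strides` are made up for illustration, and the 512-byte sector size is an assumption, not something the log states.

```python
import re

# Syslog lines copied from the post above.
LOG = """\
May  2 11:15:54 unRaid kernel: md: recovery thread: P corrected, sector=3519069768
May  2 11:15:54 unRaid kernel: md: recovery thread: P corrected, sector=3519069776
May  2 11:15:54 unRaid kernel: md: recovery thread: P corrected, sector=3519069784
May  2 11:15:54 unRaid kernel: md: recovery thread: P corrected, sector=3519069792
May  2 11:15:54 unRaid kernel: md: recovery thread: P corrected, sector=3519069800
"""

SECTOR_RE = re.compile(r"P corrected, sector=(\d+)")
SECTOR_BYTES = 512  # assumption: the log counts 512-byte sectors


def corrected_sectors(log_text):
    """Return every corrected sector number found in the log text, in order."""
    return [int(m.group(1)) for m in SECTOR_RE.finditer(log_text)]


def strides(sectors):
    """Return the distance between each pair of neighbouring sectors."""
    return [b - a for a, b in zip(sectors, sectors[1:])]


sectors = corrected_sectors(LOG)
print(sectors)
print(strides(sectors))                    # [8, 8, 8, 8]
print(strides(sectors)[0] * SECTOR_BYTES)  # 4096 bytes, if the assumption holds
```

So the five errors are evenly spaced, 8 sectors apart, which would make them five neighbouring 4 KiB blocks if the 512-byte assumption is right.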

 

Basically I have no idea how to fix this one. There is nothing to change in the BIOS, and adjusting the tunables does not fix the issue.

 

Any suggestions ??

 

Thank you so much.

 

 

unraid-diagnostics-20230502-1129.zip

Link to comment

I'm currently double-checking that. In the beginning I did see a parity check run after rebooting, until I changed Shutdown time-out and VM shutdown time-out to larger values, but I'm almost sure I have already run 2 parity checks without rebooting and got the same 5 sync errors. Anyhow, if this parity check does not produce errors but they reappear after a reboot that seems to be clean, what can I do?

 

Does the parity check need to go all the way to the end to actually correct errors ? 

Link to comment
38 minutes ago, adolfotregosa said:

Does the parity check need to go all the way to the end to actually correct errors ?

You need to run a correcting check (when you run such a check it will report the same number of errors as it corrects them) - were you doing this earlier? Your diagnostics only showed the one (correcting) check so it is not possible to check from them whether earlier ones were correcting or not. Any subsequent checks after a correcting one should then report 0 errors.
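To illustrate the difference between the two kinds of check, here is a toy Python model of single XOR parity. It is only a sketch of the idea and not Unraid's md driver; the function names and the tiny 8-block "disks" are invented for the example.

```python
from functools import reduce
from operator import xor


def compute_parity(data_disks):
    """XOR the blocks at each position across all data disks."""
    return [reduce(xor, blocks) for blocks in zip(*data_disks)]


def parity_check(data_disks, parity, correct):
    """Return the positions where parity disagrees with the data.

    With correct=True the bad parity blocks are also rewritten in place,
    which is what makes the next check come back clean.
    """
    expected = compute_parity(data_disks)
    bad = [i for i, (p, e) in enumerate(zip(parity, expected)) if p != e]
    if correct:
        for i in bad:
            parity[i] = expected[i]
    return bad


# Two tiny hypothetical data disks and their parity.
data_disks = [[1, 2, 3, 4, 5, 6, 7, 8], [9, 10, 11, 12, 13, 14, 15, 16]]
parity = compute_parity(data_disks)

# Damage five parity blocks to mimic the 5 sync errors.
for i in (2, 3, 4, 5, 6):
    parity[i] ^= 0xFF

print(len(parity_check(data_disks, parity, correct=False)))  # 5, nothing fixed
print(len(parity_check(data_disks, parity, correct=False)))  # 5 again
print(len(parity_check(data_disks, parity, correct=True)))   # 5, reported and fixed
print(len(parity_check(data_disks, parity, correct=True)))   # 0
```

In this model a non-correcting check keeps reporting the same count forever, while a correcting check reports the count once and the following check reports 0.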

Link to comment
Just now, itimpi said:

You need to run a correcting check (when you run such a check it will report the same number of errors as it corrects them) - were you doing this earlier? Your diagnostics only showed the one (correcting) check so it is not possible to check from them whether earlier ones were correcting or not. Any subsequent checks after a correcting one should then report 0 errors.

 

Yeah, I know. I'm almost sure I did that yesterday, twice, but then I rebooted. It is currently doing another pass to see if they pop up. If they do not, I'll do another pass. If there is still nothing I'll reboot, and my gut says they will come back then. I'll report back when it finishes all the steps.

Link to comment

Ok. 

I confirm that the sync errors were corrected. I stopped the array, started it and re-ran the parity check. Still zero errors.
 

I then rebooted and the 5 errors came back. 


diagnostics 0750 has 2 parity checks, zero errors. I stopped the array between checks. 
 

Diagnostics 1252 is after reboot. 5 errors.

 

What do I do now? 

unraid-diagnostics-20230503-0750.zip unraid-diagnostics-20230503-1252.zip

Link to comment
5 minutes ago, trurl said:

SSDs in the array cannot be trimmed, and can only be written at parity speed.

 

I wonder if those Crucial MX500s might be involved in this.


I have no idea why they would be, but… I’ll wait for instructions from the masters. I confirm the issue only comes back after a reboot.
 

Write speed is not an issue because of cache. 
 

What I want to see is the moment the sync errors appear. In my case I’m wondering if it is at 50%, where the smaller 2 TB SSD ends and stops taking part in parity.

Edited by adolfotregosa
Link to comment

OK, as far as testing goes, it seems that EVERY time AFTER a reboot, no matter which disks are selected, I will get sync errors. The spinning-rust-only run came back with 2 sync errors, and coincidentally they are always the "same sectors", in this case:

 

May  2 11:15:54 unRaid kernel: md: recovery thread: P corrected, sector=3519069768
May  2 11:15:54 unRaid kernel: md: recovery thread: P corrected, sector=3519069800

 

The parity drive seems to be fine, I think. I mean, if I don't reboot and keep checking parity, it always comes back with zero sync errors after they are "corrected".

 

Basically, after every reboot I get sync errors. The "error sectors" strangely coincide with what other people have gotten in the past. I dunno. Bug?

 

I am now in the process of testing ONLY the 2 TB SSD with the 4 TB parity HDD and, just for my knowledge, is it normal that the parity sync keeps going past the end of the 2 TB SSD? Probably! But what is it syncing? Air? :D

 

So what now ?? lol help !

2TB.png

Edited by adolfotregosa
Link to comment

Well, I give up. No matter which disks I choose, I always get sync errors after a reboot. Coincidentally, they are the same sectors other people have had.

 

May  2 11:15:54 unRaid kernel: md: recovery thread: P corrected, sector=3519069768
May  2 11:15:54 unRaid kernel: md: recovery thread: P corrected, sector=3519069800

 

I even took the risk and updated the ASMedia 1166 firmware, no dice. If this is not an Unraid issue and the hardware is fine, I dunno what more to do. Is there anything else to try? Parity is fine until I reboot. I can stop and restart the array with no issues, but if I reboot, the sync errors appear.

Link to comment
10 minutes ago, JorgeB said:

That would point to a controller issue.

"Probably", but I’m not going to assume that a brand new QNAP is broken, and from searching the web the ASMedia 1166 does not seem to have issues with Unraid, so I’ll have to find alternatives to Unraid. Kinda gutted on this one. I’ve been using Unraid for such a long time now. Oh well.

Edited by adolfotregosa
Link to comment
  • 2 weeks later...

I managed to borrow 3 identical 3 TB HDDs and retested with only those 3 HDDs.

image.thumb.png.812e5254cad502659e019f0bb86229ce.png

 

After reboot:

image.thumb.png.086c62009d24a0f7ed36e59dc6815299.png

May 14 03:58:43 unRaid kernel: md: recovery thread: P corrected, sector=1565565768
May 14 03:58:43 unRaid kernel: md: recovery thread: P corrected, sector=1565565800

 

Same sectors like before ??
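A quick way to compare the two runs is to put the numbers side by side. The sketch below is plain Python arithmetic on the sector values from the May 2 and May 14 log lines in this thread; the variable and function names are only illustrative.

```python
# Sector numbers copied from the log lines in this thread.
run_may_2 = [3519069768, 3519069800]
run_may_14 = [1565565768, 1565565800]


def gap(sectors):
    """Distance between the first and last corrected sector in one run."""
    return sectors[-1] - sectors[0]


def shift(earlier, later):
    """Per-error difference in sector number between two runs."""
    return [a - b for a, b in zip(earlier, later)]


print(run_may_2 == run_may_14)       # False: the absolute sectors differ
print(gap(run_may_2), gap(run_may_14))  # 32 32: same spacing in both runs
print(shift(run_may_2, run_may_14))  # [1953504000, 1953504000]
```

So the sector numbers are not literally the same as before, but the pattern is: both runs show two errors 32 sectors apart, and both errors moved by the same amount between the two disk sets.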

 

In the meantime I've been using mergerfs with SnapRAID, and while I understand it works on a different principle, I can reboot as many times as I want and the parity check always comes back clean.

I cannot understand how this can be a hardware error, I'm sorry. If I don't reboot, the Unraid parity check always comes back clean.

 

I gave it my best shot.

Link to comment
  • 2 months later...

Just a follow up. 

 

I realized I had not tried ONE last combination after all this time: use one of the NVMe slots for parity! And wouldn't you know... it actually stops the 5 parity errors. It's the exact same hardware, same HDDs, SSDs and NVMe drives, same everything, just with the disk roles swapped. The NVMe drives are not connected to the ASMedia chip, but the other 6 HDDs are, just like before. I do not understand why this works, but it works ¬¬. Any suggestions?

Edited by adolfotregosa
Link to comment
20 minutes ago, JorgeB said:

Same, kind of strange.

You still think it is not some kind of bug? Really strange, this one. I even got it while running Unraid on top of Proxmox with the HDDs passed through, not the controller itself!! I could stop and start the Unraid VM with no issues, but if I rebooted Proxmox the issue appeared. I did the NVMe experiment while I was on Proxmox and just now with Unraid on bare metal. Both situations gave no parity errors as long as the ASMedia was not used for parity.

Link to comment
Just now, JorgeB said:

Not really, cannot see how this can be an Unraid bug, or there would be a lot more cases, IMHO it must be hardware related.

Hardware or not, it is working correctly this way, so it's kinda strange to think it's still hardware. Same HDDs, same controllers, just different roles. Oh well, I'll keep an eye on it.

Link to comment
