It's been a month since I got the first Hardware Error, and it has only gotten worse.
The system has been rebooting randomly since the end of July (kernel panic reboots, see attached capture).
I haven't been able to complete a parity check because the system always reboots before it finishes (10 TB, usually 25 hours). I know there are parity errors, so I'm living on the edge now.
When I'm not running a parity check, the longest stretch without a reboot has been 4 days, but it is so random that sometimes it reboots before I can even start the array again.
This is what I've ruled out and why:
RAM: I removed all sticks but one and ran the system. Same reboots. I repeated this with 3 different sticks and in different slots.
PSU: I have dual PSUs and have tried running on only one at a time, with the same result.
UPS: I bypassed it and plugged the system directly into AC. Same results.
Latest Unraid upgrade: The problems started, more or less, when I upgraded to 6.7.2. I downgraded to 6.7.1, but the reboots continue as before.
I also removed both CPUs, checked for dust or bent pins, and applied new thermal grease afterwards.
I contacted the retailer and after some hardware tests they said this:
Could it be related to buggy microcode or to a software problem? They say I could try downgrading to 6.3.2, as that seemed to be the point of discussion in that thread. What do you think? Is it worth trying?
Also, two days ago I got a new Hardware Error:
Thanks, all, for your help.
PS: Title changed to reflect the new symptoms.
syslog