vb543 · Posted December 5, 2018

I've been using unRAID on an old Phenom X6 build for a while now without any issues. Over the last week I've updated my hardware, and since then the system completely locks up at random times: sometimes a few hours after power-on, sometimes after more than a day. When the freeze occurs, nothing responds. Connecting a monitor just displays a login prompt, and connecting a keyboard doesn't yield anything; the keyboard itself doesn't even light up when plugged in, forcing me to hard power cycle the system. Any suggestions? I'm currently letting a memtest run overnight, but I'm not sure how to proceed if that passes.

Build:
- Gigabyte GA-7PESH2
- Dual E5-2680
- 2x 16 GB DDR3 DIMMs
- StarTech PEXESATA32
- SuperNOVA G1 650W
- Mediasonic ProBox

I have eight drives connected via the onboard SAS headers, four via the onboard SATA headers, and three via the external Mediasonic enclosure. The enclosure was formerly connected via USB without any issues. The BIOS is also fully updated, for what it's worth, and I'm on the latest version of unRAID (updated shortly after the hardware upgrade). Thanks!
jonp · Posted December 5, 2018

The common thread here is that Gigabyte motherboards tend to be problematic. I literally just had to tell some other poor soul that this was likely his issue as well. Gigabyte just doesn't have the quality assurance/control that Asus, ASRock, and others have.

That said, if you'd like to try some other things, you'll need to boot the system up, get a monitor and keyboard attached, and from the console type this command:

tail -f /var/log/syslog | tee /boot/logdump.txt

This will both print a running log to the screen that you can capture with a camera and save that log to a file on your flash device. When the server crashes again, take a picture of whatever the log has printed to the screen and post it here. Likely it will be a call trace, and even more likely it will be generic, pointing to a hardware-specific issue, but at least it will give us confirmation that that's the case.
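If you'd rather have each boot keep its own dump (so a hard power cycle never overwrites the previous capture), you can build a timestamped filename first. A small sketch, assuming the stock unRAID paths (/var/log/syslog for the live log, /boot for the flash drive):

```shell
# Build a timestamped log path on the flash drive; each boot then gets its
# own dump instead of overwriting a single /boot/logdump.txt.
DEST="/boot/logdump-$(date +%Y%m%d-%H%M%S).txt"
echo "$DEST"   # then run: tail -f /var/log/syslog | tee "$DEST"
```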
vb543 · Posted December 5, 2018

That's unfortunate to hear about Gigabyte. I was hoping a 'server'-grade motherboard from them wouldn't be as bad as their usual lineup. Memtest ran overnight and didn't find any issues. I'll run the command and post an update when the system hangs again. Thanks for your advice!
Fireball3 · Posted December 5, 2018

2 hours ago, jonp said:
Gigabyte just doesn't have the quality assurance/control that Asus, ASRock, and others have.

ASRock is a subsidiary of Asus. Asus and Gigabyte have about the same market share in motherboards, and their return rates are also similar. I guess it makes little difference which one you choose. I was a loyal Asus customer for several upgrade cycles, but when I started having trouble with their products I switched to Gigabyte.

2 hours ago, jonp said:
Common thread here is that gigabyte motherboards tend to be problematic.

If this is the return of experience of builds running unRAID, you have a point.

1 hour ago, vb543 said:
Memtest ran overnight and didn't find any issues.

You could also run a Linux live system and see if it locks up too. If not, it's something related to unRAID.
JorgeB · Posted December 5, 2018

1 hour ago, Fireball3 said:
ASRock is a subsidiary of Asus.

It was, but it hasn't been for a few years now.
jonp · Posted December 5, 2018

4 hours ago, Fireball3 said:
If this is the return of experience of builds running unRAID, you have a point.

That's an odd way to word it ;-). Basically, here's what it boils down to (all my personal opinion, FYI): I don't think vendors like Gigabyte do nearly as much testing on all the various ways their hardware can be used, compared to other brands. I especially don't think they test much with Linux (or KVM), virtualization in general, or PCI passthrough. That's my general belief because, in all the support cases I've seen and handled here, folks who use Gigabyte motherboards but DON'T use virtualization at all tend not to have issues. It's specifically when using VMs, and even more specifically when passing through PCI devices like GPUs, that these motherboards tend to exhibit some wonky behavior. That's not to say other vendors don't have this issue, but Gigabyte tends to be the common theme. There are also my personal experiences with their motherboards and GPUs, which we'll save for another day in another topic ;-).

Long story short: I don't trust their support and I don't like their products. Maybe in time that'll change, but I doubt it.
vb543 · Posted December 6, 2018

Looks like another crash occurred shortly after I left for work this morning. Please see the picture below and the attached log file. No virtual machines or Docker containers were running.

logdump.txt
vb543 · Posted December 7, 2018

The server has been stable for about 10 hours. This time the array is online, but I'm not running a parity check or preclear. Would that make a difference?
trurl · Posted December 7, 2018

12 minutes ago, vb543 said:
I'm not running a parity check

Were you running a parity check before, when it crashed? I notice your PSU has four 12 V rails, so not all of its power is available to the disks. A single-rail unit is the usual recommendation.
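To put rough numbers on that point: the figures below are generic assumptions (a typical 3.5-inch spin-up draw and a common quad-rail limit), not specs read off this particular PSU.

```shell
# Back-of-envelope 12 V budget for 15 drives on a quad-rail PSU.
# SPINUP_AMPS and RAIL_LIMIT are typical values, not this unit's ratings.
DRIVES=15        # 8 SAS + 4 SATA + 3 in the external enclosure
SPINUP_AMPS=2    # typical 12 V draw of a 3.5" disk during spin-up
RAIL_LIMIT=20    # common per-rail limit on quad-rail supplies

TOTAL=$((DRIVES * SPINUP_AMPS))
PER_RAIL_MAX=$((RAIL_LIMIT / SPINUP_AMPS))
echo "worst-case spin-up draw: ${TOTAL} A on 12 V"
echo "one ${RAIL_LIMIT} A rail overloads beyond ${PER_RAIL_MAX} drives"
```

If most of the drives land on one rail, simultaneous spin-up or a parity check can trip that rail's over-current protection even though the PSU's total wattage looks more than adequate.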
vb543 · Posted December 7, 2018

1 minute ago, trurl said:
Were you running a parity check before, when it crashed? I notice your PSU has four 12 V rails, so not all of its power is available to the disks. A single-rail unit is the usual recommendation.

I was indeed running parity checks during the previous crashes. However, they seemed to run for hours without any issues, so I didn't think much of it. I ended up with this PSU after trying to find something with two EPS connectors that Amazon could get to me quickly. If it really could be the cause, I'll look around for a different power supply.
vb543 · Posted December 8, 2018

So it ran just fine with the array enabled for 24 hours. At the 24-hour mark I started a parity check, and it failed six hours in. Last time it failed about four hours into the parity check. Any idea why? Is it still worth looking into the PSU, or is it more likely something else?
vb543 · Posted December 9, 2018

I was able to replicate the same results: fine for nearly a day, then I start a parity check and it freezes up within the hour. What's the next best step?
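One way to separate "parity check" from "sustained parallel disk I/O" as the trigger is to stream every array disk at once without running an actual check. A sketch only: the device list is a placeholder (not taken from this thread), and it defaults to /dev/zero so the script dry-runs harmlessly as written.

```shell
# Read all listed disks in parallel to mimic parity-check load.
# Set DISKS to the real array members, e.g. DISKS="/dev/sdb /dev/sdc /dev/sdd"
# (check lsblk first); it defaults to /dev/zero for a harmless dry run.
DISKS="${DISKS:-/dev/zero}"
for dev in $DISKS; do
  dd if="$dev" of=/dev/null bs=1M count=256 2>/dev/null &
done
wait
STATUS=$?
echo "parallel read pass finished (status $STATUS)"
```

If this reproduces the lockup while reading a single disk does not, that points at aggregate 12 V load or the controller under concurrent I/O rather than anything specific to the parity code path.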
vb543 · Posted December 9, 2018

Just had another crash with a few VMs running, but no parity check this time. I'm at a loss at this point.
This topic is now archived and is closed to further replies.