vb543 · Posted December 5, 2018

I've been using unRAID on an old Phenom X6 build for a while now without any issues. Over the last week I've updated my hardware, and since then the system completely locks up at random times: sometimes a few hours after power-on, sometimes after more than a day. When the freeze occurs, nothing responds. Connecting a monitor just displays a login prompt, and connecting a keyboard doesn't yield anything; the keyboard itself doesn't even light up when plugged in, forcing me to hard power cycle the system. Any suggestions? I'm currently letting a memtest run overnight, but I'm not sure how to proceed if that passes.

Build:
- Gigabyte GA-7PESH2
- Dual E5-2680
- 2x 16 GB DDR3 DIMMs
- StarTech PEXESATA32
- SuperNOVA G1 650W
- Mediasonic ProBox

I have eight drives connected via the onboard SAS headers, four via the onboard SATA headers, and three via the external Mediasonic enclosure. The enclosure was formerly connected via USB without any issues. The BIOS is also fully updated, for what it's worth, and I'm on the latest version of unRAID (updated shortly after the hardware upgrade). Thanks!
jonp · Posted December 5, 2018

The common thread here is that Gigabyte motherboards tend to be problematic. I literally just had to tell some other poor soul that this was likely his issue as well. Gigabyte just doesn't have the quality assurance/control that Asus, ASRock, and others have.

That said, if you'd like to try some other things, you'll need to boot the system up, get a monitor and keyboard attached, and from the console type this command:

tail -f /var/log/syslog | tee /boot/logdump.txt

This will both print a running log to the screen that you can capture with a camera and save that log to a file on your flash device. When the server crashes again, take a picture of whatever the log has printed to the screen and post it here. Likely it will be a call trace, and even more likely it will be generic, pointing to a hardware-specific issue, but at least it will give us confirmation that that's the case.
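If you'd rather have each boot keep its own dump (so a hard power cycle never overwrites the previous capture), you can build a timestamped filename first. A small sketch, assuming the stock unRAID paths (/var/log/syslog for the live log, /boot for the flash drive):

```shell
# Build a timestamped log path on the flash drive; each boot then gets its
# own dump instead of overwriting a single /boot/logdump.txt.
DEST="/boot/logdump-$(date +%Y%m%d-%H%M%S).txt"
echo "$DEST"   # then run: tail -f /var/log/syslog | tee "$DEST"
```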
vb543 · Posted December 5, 2018

That's unfortunate to hear about Gigabyte. I was hoping a 'server'-grade motherboard from them wouldn't be as bad as their usual lineup. Memtest ran overnight and didn't find any issues. I'll run the command and post an update when the system hangs again. Thanks for your advice!
Fireball3 · Posted December 5, 2018

2 hours ago, jonp said:
Gigabyte just doesn't have the quality assurance/control that Asus, ASRock, and others have.

ASRock is a subsidiary of Asus. Asus and Gigabyte have about the same market share in motherboards, and their return rates are also similar. I guess it makes little difference which one you choose. I was a loyal Asus customer for several upgrade cycles, but when I started having trouble with their products I switched to Gigabyte.

2 hours ago, jonp said:
Common thread here is that gigabyte motherboards tend to be problematic.

If this is the return of experience of builds running unRAID, you have a point.

1 hour ago, vb543 said:
Memtest ran overnight and didn't find any issues.

You could also run a Linux live system and see if it locks up too. If not, it's something related to unRAID.
JorgeB · Posted December 5, 2018

1 hour ago, Fireball3 said:
ASRock is a subsidiary of Asus.

It was, but it hasn't been for a few years now.
jonp · Posted December 5, 2018

4 hours ago, Fireball3 said:
If this is the return of experience of builds running unRAID, you have a point.

That's an odd way to word it ;-). Basically, here's what it boils down to (all my personal opinion, FYI): I don't think vendors like Gigabyte do nearly as much testing on all the various ways their hardware can be used, compared to other brands. I especially don't think they test much with Linux (or KVM), virtualization in general, or PCI passthrough. That's my general belief because, in all the support cases I've seen and handled here, folks who use Gigabyte motherboards but DON'T use virtualization at all tend not to have issues. It's specifically when using VMs, and even more specifically when passing through PCI devices like GPUs, that these motherboards tend to exhibit some wonky behavior. That's not to say other vendors don't have this issue, but Gigabyte tends to be the common theme. There are also my personal experiences with their motherboards and GPUs, which we'll save for another day in another topic ;-).

Long story short: I don't trust their support and I don't like their products. Maybe in time that'll change, but I doubt it.
vb543 · Posted December 6, 2018

Looks like another crash occurred shortly after I left for work this morning. Please see the picture below and the attached log file. No virtual machines or Docker containers were running.

logdump.txt
vb543 · Posted December 7, 2018

The server has been stable for about 10 hours. This time the array is online, but I'm not running a parity check or preclear. Would that make a difference?
trurl · Posted December 7, 2018

12 minutes ago, vb543 said:
I'm not running a parity check

Were you running a parity check before, when it crashed? I notice your PSU has four 12 V rails, so not all of its power is available to the disks. A single-rail unit is the usual recommendation.
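To put rough numbers on that point: the figures below are generic assumptions (a typical 3.5-inch spin-up draw and a common quad-rail limit), not specs read off this particular PSU.

```shell
# Back-of-envelope 12 V budget for 15 drives on a quad-rail PSU.
# SPINUP_AMPS and RAIL_LIMIT are typical values, not this unit's ratings.
DRIVES=15        # 8 SAS + 4 SATA + 3 in the external enclosure
SPINUP_AMPS=2    # typical 12 V draw of a 3.5" disk during spin-up
RAIL_LIMIT=20    # common per-rail limit on quad-rail supplies

TOTAL=$((DRIVES * SPINUP_AMPS))
PER_RAIL_MAX=$((RAIL_LIMIT / SPINUP_AMPS))
echo "worst-case spin-up draw: ${TOTAL} A on 12 V"
echo "one ${RAIL_LIMIT} A rail overloads beyond ${PER_RAIL_MAX} drives"
```

If most of the drives land on one rail, simultaneous spin-up or a parity check can trip that rail's over-current protection even though the PSU's total wattage looks more than adequate.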
vb543 · Posted December 7, 2018

1 minute ago, trurl said:
Were you running a parity check before, when it crashed? I notice your PSU has four 12 V rails, so not all of its power is available to the disks. A single-rail unit is the usual recommendation.

I was indeed running parity checks during the previous crashes. However, they seemed to run for hours without any issues, so I didn't think much of it. I ended up with this PSU after trying to find something with two EPS connectors that Amazon could get to me quickly. If it really could be the cause, I'll look around for a different power supply.
vb543 · Posted December 8, 2018

So it ran just fine with the array enabled for 24 hours. At the 24-hour mark I started a parity check, and it failed six hours in. Last time it failed about four hours into the parity check. Any idea why? Is it still worth looking into the PSU, or is it more likely something else?
vb543 · Posted December 9, 2018

I was able to replicate the same results: fine for nearly a day, then I start a parity check and it freezes up within the hour. What's the next best step?
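One way to separate "parity check" from "sustained parallel disk I/O" as the trigger is to stream every array disk at once without running an actual check. A sketch only: the device list is a placeholder (not taken from this thread), and it defaults to /dev/zero so the script dry-runs harmlessly as written.

```shell
# Read all listed disks in parallel to mimic parity-check load.
# Set DISKS to the real array members, e.g. DISKS="/dev/sdb /dev/sdc /dev/sdd"
# (check lsblk first); it defaults to /dev/zero for a harmless dry run.
DISKS="${DISKS:-/dev/zero}"
for dev in $DISKS; do
  dd if="$dev" of=/dev/null bs=1M count=256 2>/dev/null &
done
wait
STATUS=$?
echo "parallel read pass finished (status $STATUS)"
```

If this reproduces the lockup while reading a single disk does not, that points at aggregate 12 V load or the controller under concurrent I/O rather than anything specific to the parity code path.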
vb543 · Posted December 9, 2018

Just had another crash with a few VMs running, but no parity check this time. I'm at a loss at this point.
This topic is now archived and is closed to further replies.