I'm not exactly sure what's going on, but I've been experiencing random crashes lately and I've been trying to isolate a common factor.
HW:
E5-2697v2 x2
SuperMicro X9Dri LN4 Motherboard
128GB DDR3 ECC
Previous config:
Unraid 6.12.10 - rock solid, with no downtime other than version updates
What changed:
Added a Sparkle A380 Elf card
Upgraded to Unraid 7 Beta 2 to support the Arc card
What I've been finding is that the server will crash: no terminal response, and the IP stops responding. The hardware is still running, but the OS appears to be completely unresponsive. The added frustration is that logging isn't helping much. Logs are generated while the server is up, but I've yet to capture anything from the crash itself. I have logs set to mirror to flash, and syslog is pointed at the server IP.
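If the syslog target is the Unraid box itself, the receiver goes down along with everything else, so the last messages before a hang may never be written. One option is to point remote syslog at a second machine instead. Below is a rough sketch of a minimal UDP receiver in Python for that purpose. It is an untested idea, not a known fix. The port (5514), the log file name, and the function names are all placeholders I made up, and it assumes the server is configured to send syslog over UDP to that port.

```python
import socketserver
from datetime import datetime, timezone

# Hypothetical values: adjust to whatever the remote syslog setting points at.
LOG_PATH = "unraid-remote-syslog.log"
LISTEN_PORT = 5514  # unprivileged port; classic syslog uses 514, which needs root


def format_record(sender_ip: str, payload: bytes, now: datetime) -> str:
    """Turn one raw syslog datagram into a single timestamped log line."""
    text = payload.decode("utf-8", errors="replace").rstrip("\r\n\x00")
    return f"{now.isoformat()} {sender_ip} {text}"


class SyslogHandler(socketserver.BaseRequestHandler):
    """Appends every received datagram to LOG_PATH as soon as it arrives."""

    def handle(self):
        # For UDP servers, self.request is a (data, socket) pair.
        payload, _sock = self.request
        line = format_record(
            self.client_address[0], payload, datetime.now(timezone.utc)
        )
        # Open, append, and close per message so nothing sits in a buffer.
        with open(LOG_PATH, "a", encoding="utf-8") as log_file:
            log_file.write(line + "\n")


def serve(host: str = "0.0.0.0", port: int = LISTEN_PORT) -> None:
    """Listen for syslog datagrams until interrupted (Ctrl+C)."""
    with socketserver.UDPServer((host, port), SyslogHandler) as server:
        server.serve_forever()


# To start the receiver on the second machine, call: serve()
```

Because each line gets the receiver's own timestamp, the last entry before the silence gives an approximate time of the hang, even if the kernel never manages to log the cause.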
While the server is up, it transcodes successfully via Plex, and I'm also using Unmanic to bulk transcode some less important files from H.264 to H.265. I've played with the number of worker processes, deliberately trying to get it to crash, and it seems fine. Then, at random, I'll refresh a page and find it unresponsive: the server has crashed and needs a hard power cycle.
I've run a memtest, which came back clean, and again, before adding the Intel card and upgrading to 7b2 I had zero issues. I can obviously take out the Intel card and downgrade to 6.12.10, but I'd like to see if I can pinpoint what is going on.
The diagnostics files are attached. Any insight towards getting this working would be great, or at least towards isolating the cause so I know whether I need to downgrade, replace the GPU, etc.