JorgeB Posted February 19
Unfortunately there's nothing relevant logged, which usually points to a hardware issue. One thing you can try is to boot the server in safe mode with all Docker containers/VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning the other services back on one by one.
GatorMB Posted February 19 (Author)
11 hours ago, JorgeB said: "Unfortunately there's nothing relevant logged, this usually points to a hardware issue, one thing you can try is to boot the server in safe mode with all docker containers/VMs disabled, let it run as a basic NAS for a few days, if it still crashes it's likely a hardware problem, if it doesn't start turning on the other services one by one."
I'm starting to believe you are correct. I initially ran this box with a Supermicro X9SCL mobo and CPU and it ran perfectly. It just couldn't handle transcoding, and the mobo didn't have a slot for a GPU card. I upgraded the mobo to the Supermicro X11SSH-F and the CPU to the Xeon E3-1285 v6. RAM went from 32 to 64. All drives, PSU, cooling, and case stayed the same.
That's when I started having failures. I changed the RAM and still had the same issue. I changed the mobo and had the same issue. I changed the PSU from a 600 80+ White to an 850 80+ Gold. I added liquid cooling. I added a Tesla P4. Nothing has eliminated the problem. The only thing left is the CPU. I am waiting for a Xeon E3-1270 v6 to arrive in a few days. I'll swap it in and see if that helps. If not, I'm at a total loss as to what could be causing it!
Could it be BIOS related? I have the BMC connected but don't have the password, so will I need to reset it via the jumper? Then I can review it from a remote PC. I have link aggregation connected from the mobo to my ASUS GT-AC5300 router. I literally have no idea what else to try! I have the syslog going to root on the flash drive, but nothing seems to stand out to you or others... Do you think it could just be a bad CPU?
GatorMB Posted February 19 (Author)
I also notice that I occasionally get a failure on the GPU plugin in Unraid. Do I need to disable the onboard video now that I am running the Tesla P4? Or do I have a bad GPU as well?
JonathanM Posted February 20
12 hours ago, GatorMB said: "I have link aggregation connected from the mobo to my ASUS GT-AC5300 router."
Have you tried with just one cable? Are you connected to the correct ports? Apparently only 2 specific ports on that router support aggregation. I doubt it would cause what you are seeing; I'm just grasping at straws.
GatorMB Posted February 27 (Author)
New processor arrived today. It’s now in and I’m up and running. I will report back within 72 hours on whether it’s stable. Fingers crossed!
GatorMB Posted March 2 (Author)
System is stable. It was 100% a bad processor. Thanks everyone for all your help!
GatorMB Posted March 13 (Author)
Good morning! Here is the update... The server is 100% stable. It has not crashed at all since the CPU was swapped out. However, I did find another issue: the Nvidia Tesla P4 GPU gets very hot when transcoding 3 files in Tdarr. With the Tdarr container disabled, the server runs stable. I have ordered a fan for the GPU to help cool it. More to follow on the success of that! Again, huge thanks to JorgeB & JonathanM for all your help!
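For anyone who wants to confirm the GPU heat problem with numbers before and after adding a fan, here is a minimal sketch that polls the card's temperature through `nvidia-smi`. It assumes the NVIDIA driver (and therefore `nvidia-smi`) is installed and that a Python 3 interpreter is available on the host. The function names and the 85 C alert threshold are illustrative assumptions, not values from this thread or from NVIDIA's specifications for the Tesla P4.

```python
import subprocess
import time

# Assumption: illustrative alert threshold in degrees C, not an official limit.
ALERT_THRESHOLD_C = 85


def parse_temperatures(csv_output):
    """Parse nvidia-smi CSV output: one integer temperature per line, one line per GPU."""
    temps = []
    for line in csv_output.splitlines():
        line = line.strip()
        if line:  # skip blank lines
            temps.append(int(line))
    return temps


def read_gpu_temperatures():
    """Ask nvidia-smi for the current temperature of every GPU, in degrees C."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=temperature.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_temperatures(result.stdout)


def hot_gpus(temps, threshold=ALERT_THRESHOLD_C):
    """Return (gpu_index, temperature) pairs at or above the threshold."""
    return [(i, t) for i, t in enumerate(temps) if t >= threshold]


def monitor(interval_s=10, samples=6):
    """Print a reading every interval_s seconds, flagging any hot GPU."""
    for _ in range(samples):
        temps = read_gpu_temperatures()
        flagged = hot_gpus(temps)
        status = "HOT: {}".format(flagged) if flagged else "ok"
        print("{} temps={} {}".format(time.strftime("%H:%M:%S"), temps, status))
        time.sleep(interval_s)


if __name__ == "__main__":
    monitor()
```

Running it once with Tdarr transcoding and once with the container stopped gives a simple before/after comparison, and the same run can be repeated once the fan is fitted.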