August 7, 2025
I've been having a vexing issue with one of my servers hard locking abruptly and becoming completely unresponsive. I was actually working on it remotely just now when it suddenly dropped off the face of the earth. I'll have to confirm when I get home, but each time this has happened previously the system has been completely unresponsive to local input (keyboard/mouse), and all remote connection attempts (SSH, WebUI, etc.) fail even though it is still powered on.
I've done multiple rounds of memtest (pretty much each time it does this I check the RAM with an overnight test) and it always passes. I turned on remote logging to another device hoping to catch a hint of what's going on, but it seems to cut out so abruptly that nothing ever gets logged to indicate what the issue is. My last entry was:
2025-08-07T12:24:51+01:00 VOID webgui: Kometa: Could not download icon https://raw.githubusercontent.com/Kometa-Team/Kometa/nightly/docs/_static/logomark-color.png
That was me switching my Kometa container to Nightly, roughly five minutes before the system went down.
The system has a monitor hooked up, but if the monitor goes to sleep I can't wake the screen after the hard lock to see if anything was output to the display. I'll have to see if I can turn off any power saving features to keep it on all the time.
Any hints on likely culprits? I'm thinking it's got to be either the CPU or the motherboard. Quite frustrating given they're barely two years old (specs for VOID are in my signature).
You can see a month-old diagnostic here from my last thread about this server; nothing has changed since it was taken.
void-diagnostics-20250710-0925.zip
Edited December 2, 2025 by weirdcrap
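For anyone working from a remote syslog like the one described above, here is a minimal sketch of one way to find where the log went silent. It assumes every entry starts with an ISO 8601 timestamp in the same form as the sample entry (`2025-08-07T12:24:51+01:00 ...`); the function names are made up for illustration and are not part of any Unraid tooling.

```python
from datetime import datetime


def parse_timestamp(line):
    """Return the leading ISO 8601 timestamp of a log line, or None."""
    first_field = line.split(" ", 1)[0]
    try:
        return datetime.fromisoformat(first_field)
    except ValueError:
        # Line does not start with a timestamp (blank line, wrapped text, etc.)
        return None


def largest_gap(lines):
    """Return (gap, line_before, line_after) for the longest silence, or None.

    Assumption: all timestamps carry a UTC offset (or none of them do),
    otherwise subtracting them would raise a TypeError.
    """
    stamped = [(parse_timestamp(line), line.rstrip("\n")) for line in lines]
    stamped = [(ts, text) for ts, text in stamped if ts is not None]
    best = None
    for (prev_ts, prev_line), (next_ts, next_line) in zip(stamped, stamped[1:]):
        gap = next_ts - prev_ts
        if best is None or gap > best[0]:
            best = (gap, prev_line, next_line)
    return best
```

The line returned as `line_before` is the last thing the server managed to send before the lockup, and `line_after` is the first entry after it came back. With a hard lock this usually only confirms the time of the crash, since, as noted above, nothing useful tends to get logged.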
August 7, 2025 Community Expert
Recommend posting the diags, mostly to see if the hardware used has known issues.
August 7, 2025 Author
2 minutes ago, JorgeB said: Recommend posting the diags mostly to see if the hardware used has known issues.
Are there known issues with this CPU/MOBO combo? I've got a 12th gen in my other server with no issues, but it's paired with a Z790.
EDIT: I'm going to make a bootable Windows USB and try to run Intel's Processor Diagnostic Tool. I really want to figure out which piece of hardware is the problem so I'm not just buying random things in the hope that they resolve whatever the issue is.
Edited August 7, 2025 by weirdcrap
August 7, 2025
For what it's worth, I'm having the exact same problem. I'm on 7.1.4 and it's been going on for several versions. It used to happen once every few months, maybe. It's been much more frequent recently.
I wasn't sure exactly what the problem was, but I had a single 10Gb interface to it, so with the recent update adding WiFi support I added an IP to my mobo's wireless NIC just to confirm it wasn't that 10Gb interface. Sure enough, when it locks even the wireless IP is nonresponsive. I don't have a mouse/keyboard hooked up, but I do have a portable screen, and there's nothing indicative of any problem on the screen when it locks.
August 7, 2025 Author
1 minute ago, keiser said: For what it's worth I'm having the exact same problem. I'm on 7.1.4 and it's been going on for several versions. It used to happen once every few months, maybe. Been much more frequent recently. Wasn't sure exactly what the problem was but I had a single 10Gb interface to it so with the recent update adding WiFi support I added an IP to my mobo's wireless NIC just to confirm it wasn't that 10Gb interface, and sure enough when it locks even the wireless IP is nonresponsive. I don't have a mouse/keyboard hooked up, but I do have a portable screen and there's nothing indicative of any problem on the screen when it locks.
Mine is also about once a month. There doesn't seem to be any particular activity that causes the problem, and IIRC it was doing this on the earlier 7.1.x releases as well. This is the first time I've been actively connected to it when it went down. I usually get on Plex and only then notice the server is down. I think this is the third or fourth time now.
I'll report back if I can find any physical issues (like bad caps on the mobo) or if the Intel Processor Diagnostic Tool yields any results.
August 7, 2025
1 hour ago, weirdcrap said: Mine is also about once a month. Doesn't seem to be any particular activity that causes the problem and IIRC it was doing this on the earlier 7.1.x releases as well. This is the first time I've been actively connected to it when it goes down. I usually get on Plex and only then notice the server is down. I think this is the third or fourth time now. I'll report back if I can find any physical issues (like bad caps on the mobo) or if the Intel Processor Diagnostic tool yields any results.
Happened again for me, and I finally got a log message that might indicate something:
kernel: mce: [Hardware Error]: Machine check events logged
Might have a CPU issue.
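A minimal sketch of how a saved syslog could be checked for the same message. It assumes the log is plain text with one entry per line and matches only the `mce: [Hardware Error]` prefix quoted above; the function name is hypothetical.

```python
import re

# Matches the kernel machine-check prefix quoted in the post above.
MCE_PATTERN = re.compile(r"mce: \[Hardware Error\]")


def find_mce_lines(lines):
    """Return every log line that reports a machine check event."""
    return [line.rstrip("\n") for line in lines if MCE_PATTERN.search(line)]
```

For example, `find_mce_lines(open("syslog.txt"))` (the file name is a placeholder) returns the matching entries. Note that the "Machine check events logged" message only says that an event was recorded; on its own it does not identify which component reported the error.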
August 7, 2025 Author
40 minutes ago, keiser said: Happened again for me, and I finally got a log message that might indicate something: kernel: mce: [Hardware Error]: Machine check events logged. Might have a CPU issue.
Nice, I wish mine would crash more frequently than once a month. It makes it way harder to track down the problem. If I had to guess, mine is also probably CPU related.
Visually my mobo looks fine (no bulging/leaking caps or anything like that). I'm waiting for the Windows To Go USB to be created so I can try the processor test software from Intel.
I checked the server when I got home and it's the same as always: no screen output, everything's frozen, no connectivity.
August 8, 2025 Community Expert
12 hours ago, weirdcrap said: Are there known issues with this CPU/MOBO combo? I've got a 12th gen in my other server with no issues but it's paired with a Z790.
Nope, but memtest is only definitive if it finds errors. If you have multiple sticks, try running the server with just one; if the problem persists, try a different one. That will basically rule out bad RAM.
August 8, 2025 Author
I will give this a try. Thankfully I've got 64GB of RAM in there (I thought I only had 32), so I should still be able to keep all my services up.
I had zero luck with Windows To Go. Rufus made the USBs fine, but the first attempt just kept failing to load the user profile service. Thinking something had gone wrong in the imaging, I imaged it again, only to have it fail even earlier and suddenly reboot before I could even get to the login screen. I may just have to bite the bullet and install an SSD in here temporarily so I can run this damn processor testing software.
November 1, 2025 Author
This has taken a turn for the weird. I ran the server with one stick for over a month, then the other. It worked flawlessly.
I put both sticks back in a few weeks ago and it was working fine. I was running 7.2 RC2 and everything seemed great.
I installed 7.2 stable today and the server never came back up after the reboot. Upon investigating, it appears that the system passes POST but then gets stuck before the UnRAID OS loads (I never get to the picker screen for safe mode, GUI, etc.).
If I move the flash drive to another port, it will boot normally exactly once. The next time the server is rebooted, it hangs at the same spot. If I enter and then exit the BIOS without changing anything, it will boot normally without my having to move the flash drive around, but this is obviously not ideal for a headless server.
I tried going back down to one stick, then the other: same behavior. I've tried both sticks in every slot on the motherboard: same behavior. I even found some entirely different DDR4 RAM that I had in a drawer and forgot about. Same issue.
The motherboard has diagnostic LEDs, but whatever this issue is, it is not causing any of them to light up. I'm stumped.
I might take another crack at getting Windows on this thing somehow so I can try to run that processor diagnostic.
Any additional insights from the community? I'm beyond frustrated with it at this point. I hate just throwing money at a problem with no real idea whether it's going to fix it or not. But I still can't tell if this is a RAM thing (seems rather unlikely at this point), a CPU issue (you'd think I'd get some kind of error or log entry with the server running), or something with the motherboard.
EDIT: I forgot to mention I also tried a brand new flash drive freshly flashed with 7.2, just to make sure it wasn't something with my USB. Same behavior.
Edited November 1, 2025 by weirdcrap
November 1, 2025 Community Expert
23 minutes ago, weirdcrap said: If I move the flash drive to another port, it will boot normally exactly once. The next time you reboot the server it hangs at the same spot.
Look for a Fast Boot or similar option in the BIOS, and if one exists, disable it.
November 1, 2025
@weirdcrap do you have XMP profiles enabled on your memory? I was having the same issues earlier this year when running XMP. Memtest passed and everything worked fine, but every 20-30 days my server would lock up, disappear from the network, and after a reboot operate fine for another cycle.
After disabling XMP, my server has been up for almost three months; it would be even longer if it weren't for a power outage.
November 1, 2025 Author
31 minutes ago, fuzzydunlop said: @weirdcrap do you have XMP profiles enabled on your memory? I was having the same issues earlier this year when running XMP, memtest passed and everything worked fine, but every 20-30 days my server would lock up, disappear from the network and after reboot operate fine for another cycle. After disabling XMP, my server has been up for almost three months; would be even longer if it wasn't for a power outage.
Yeah, I have XMP disabled.
1 hour ago, JorgeB said: Look for a Fast Boot or similar option in the BIOS, and if one exists, disable it.
There were two: an MSI Fast Boot (off) and a regular Fast Boot (on). Turning the regular one off seems to have fixed it; I've rebooted a few times and it comes back up every time. Strange that it just suddenly became a problem, since I assume it's been on for years.
December 2, 2025 Author Solution
I'm going to mark this as solved. My only thought is that this had something to do with the specific kernel version(s) used in the 7.1.x releases, since on 7.2 the issue has completely gone away with no changes to the hardware.
December 16, 2025
On 12/2/2025 at 7:48 AM, weirdcrap said: I'm going to mark this as solved. My only thought is this had something to do with the specific kernel version(s) used in the 7.1.x releases as 7.2 the issue has completely gone away with no changes to the hardware.
Hey, glad to hear it's resolved for you! I'm still having the same issue on 7.2. I'll check on disabling the memory XMP and make sure Fast Boot is always off.
Edited December 16, 2025 by keiser