
Can't Boot Into Unraid / Boot Loop; What Additional Steps Can I Take?



Hey folks!

 

I have searched this support forum and Reddit... but I haven't been able to find any solutions to my issue, so I apologize if I missed anything!

 

Last night, I noticed that I could not access Plex, so I immediately checked my Server Dashboard WebUI and it failed to load the page. I went to my server and saw that it was "On", but there was no video output (though I could not remember if I had booted it into GUI Mode or not). At this point I made the decision to reset the server. 

 

Upon boot, I did get to the Unraid boot menu... but whichever of the four boot options I selected, it went to its usual wall of text (no errors, or anything that indicated a failure). However, after several seconds, the server reboots. This effectively creates a boot loop.

 

Here are the steps I have taken so far:

  • Tried other USB ports on the server == Issue Persists
  • Created a new USB from backup == Issue Persists
  • Created a fresh USB == Issue Persists 
  • Disconnected all drives and GPU == Issue Persists
  • Cleared CMOS; also played around with various BIOS settings == Issue Persists
  • Tried various known working RAM sticks in isolation == Issue Persists
    • Did Memtest for an hour OK
  • Tried new power supply == Issue Persists

 

Frustratingly, there are no errors or failures that I can spot before it decides to crash and reboot. The text scrolls very quickly, though, so it is hard to tell.

 

My hardware is as follows:

  • i5 6600K
  • Gigabyte GA-H170M-DS3H
  • I don't think any of the other hardware matters at this point, since it's all disconnected?

 

Is there anything else I can do to diagnose the issue? Are there any logs that I can generate somehow during the Unraid boot process?

 

 

I'm about to pull the trigger on a new CPU+Mobo... but I would like to hold off if possible, so I am asking here for some help :/

 

 

Thanks in advance!

 


Below is an image of the last output that is displayed prior to the system rebooting.

I also updated the BIOS on the Motherboard for good measure. No difference.

 

So I do see that there's a [Hardware Error]: Machine check events logged... but I have no idea how to interpret this or dig into it further? I have seen other threads where it appears that this error is generally inconsequential? I'm not sure.

 

[Image: UnraidLastOutput.png, the last console output before the reboot]

10 hours ago, ecybopooch said:

I have no idea how to interpret this

 

In Windows, they call this the Blue Screen of Death (BSOD).  That's basically what the MCE is.  Your system has a hardware problem.  Have you tried clearing the CPU fan/heatsink of dust?  It could be caused by poor cooling.  Maybe open the case and point a fan in there?  It sounds like you're on the right track of elimination.  If you're positive your PSU and RAM are good and you've disconnected non-essentials, then I'd start suspecting the CPU or main board.  Does the system fail at this point every boot?

 

I believe MCE errors that are correctable are written to /var/log/mcelog; otherwise the system halts and reboots.  With Unraid, that log is probably empty regardless because the OS runs in RAM and logs aren't retained between reboots.  
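If a raw machine-check status value does show up on screen (or in a captured log), the correctable vs. uncorrectable distinction above can be read straight from its flag bits. Here's a minimal Python sketch, assuming the value is an architectural IA32_MCi_STATUS register as laid out in Intel's Software Developer's Manual. The function name and the example value are made up for illustration; nothing here comes from the screenshot in this thread.

```python
# Architectural flag bits of an IA32_MCi_STATUS value (Intel SDM layout).
FLAGS = {
    63: "VAL",    # register holds a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was NOT corrected by hardware
    60: "EN",     # reporting for this error was enabled
    59: "MISCV",  # MCi_MISC holds additional information
    58: "ADDRV",  # MCi_ADDR holds the failing address
    57: "PCC",    # processor context may be corrupt
}


def decode_mci_status(status: int) -> dict:
    """Split a 64-bit status value into its flags and error codes (hypothetical helper)."""
    flags = [name for bit, name in FLAGS.items() if (status >> bit) & 1]
    return {
        "flags": flags,
        "mca_error_code": status & 0xFFFF,              # bits 15:0
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
        "uncorrected": "UC" in flags,
    }


# Illustrative value only, not taken from this system.
print(decode_mci_status(0xB200000000070005))
```

UC together with PCC is the sort of combination that would fit a reset rather than a quiet log entry. What the low 16 bits mean depends on the CPU, so look those up in the SDM rather than guessing.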

 

14 hours ago, dboonthego said:

 

In Windows, they call this the Blue Screen of Death (BSOD).  That's basically what the MCE is.  Your system has a hardware problem.  Have you tried clearing the CPU fan/heatsink of dust?  It could be caused by poor cooling.  Maybe open the case and point a fan in there?  It sounds like you're on the right track of elimination.  If you're positive your PSU and RAM are good and you've disconnected non-essentials, then I'd start suspecting the CPU or main board.  Does the system fail at this point every boot?

 

Thank you for your suggestions.

 

It certainly isn't an issue with the cooler. Even if it were mounted incorrectly, there's no way it would fail consistently at the same point every time.

With that said, I do plan on re-seating the CPU in the future to rule out any contact issues between the CPU's LGA pads and the socket pins.

 

 

Quote

I believe MCE errors that are correctable are written to /var/log/mcelog; otherwise the system halts and reboots.  With Unraid, that log is probably empty regardless because the OS runs in RAM and logs aren't retained between reboots.  

 

Yes, unless there's some additional way to tell the boot process to write logs/crashdump to the USB, I don't think there's any way to troubleshoot this further.
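One avenue that might be worth a try, with a big caveat: Linux has a netconsole feature that sends kernel messages over UDP to another machine, which would get around the logs living in RAM. This assumes the Unraid kernel includes netconsole and that the NIC comes up before the crash; I have not confirmed either. The option would go on the append line in syslinux/syslinux.cfg on the flash drive (see the kernel's netconsole documentation for the syntax). On the receiving PC, a minimal Python sketch of a listener could look like this. The function names and log file name are hypothetical; port 6666 is netconsole's default target port.

```python
import socket
from datetime import datetime


def format_line(payload: bytes, sender: str) -> str:
    """Turn one received datagram into a timestamped log line."""
    text = payload.decode("utf-8", errors="replace").rstrip("\n")
    stamp = datetime.now().isoformat(timespec="seconds")
    return f"{stamp} {sender} {text}"


def listen(port: int = 6666, logfile: str = "unraid-boot.log") -> None:
    """Receive UDP datagrams forever, printing and appending each one to logfile."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", port))
    with open(logfile, "a", encoding="utf-8") as out:
        while True:
            payload, (addr, _src_port) = sock.recvfrom(65535)
            line = format_line(payload, addr)
            print(line)
            out.write(line + "\n")
            out.flush()  # keep the file current in case the listener is stopped
```

Call listen() on the second machine before powering on the server, then stop it with Ctrl+C after the reboot. Whatever arrived last before the reset will be at the end of the log file.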

 

 

With all that said, I did run out and grab an i5-12400 and a DDR4 Z790 board. I was able to get things swapped out and running again pretty quickly. I will continue to work on diagnosing the issue with the old CPU and motherboard, but I'll probably take my time with that.
