
UnRAID is crashing my entire network


Go to solution Solved by gobigred,


Pardon my noobiness.

The flash drive on my server failed a couple weeks ago. I did not have a backup, so I had to start over from scratch. Everything was running fine for a couple weeks, but I'm having a weird issue now.

 

After starting my array, my entire network will go down. Sometimes it happens immediately, sometimes it happens after 6-8 hours. As soon as I disconnect the server's Ethernet cable, the network is restored immediately. The server also becomes unresponsive, and I am unable to access the GUI directly.

Here is what I have tried:
1. Starting in safe mode, no plugins or Docker. I do not have network issues in safe mode, but I haven't run it for an extended period of time.
2. Disabled auto-start on Dockers, slowly started Dockers to see if one of the Dockers was causing issues. I did this yesterday -- started a Docker, waited 2 hours, started another Docker. Everything seemed to run fine, but then everything crashed again last night.
3. Swapped the Ethernet cable for a different cable
4. Tried a different port on the switch
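
Since the crash can take anywhere from minutes to 6-8 hours to show up, it may help to log exactly when the server and the router stop answering, so the timing can be matched against the syslog. Below is a minimal sketch, meant to run on another machine on the same LAN. The host names, IP addresses, and ports in the usage comment are placeholders (assumptions, not values from this thread), and the function names are made up for illustration.

```python
import socket
import time
from datetime import datetime


def tcp_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def watch(targets, probe=tcp_reachable, interval=10.0, rounds=None, sleep=time.sleep):
    """Poll each (name, host, port) target and record every UP/DOWN change.

    Prints a timestamped line on each change and returns the list of
    (name, is_up) changes once `rounds` polls are done (None = run forever).
    """
    events = []
    last = {}
    done = 0
    while rounds is None or done < rounds:
        for name, host, port in targets:
            up = probe(host, port)
            if last.get(name) != up:
                stamp = datetime.now().isoformat(timespec="seconds")
                print(f"{stamp} {name} {'UP' if up else 'DOWN'}", flush=True)
                events.append((name, up))
                last[name] = up
        done += 1
        sleep(interval)
    return events


# Example call (placeholder addresses, replace with your own):
# watch([("tower", "192.168.1.50", 80), ("router", "192.168.1.1", 443)])
```

Redirecting the output to a file gives a simple timeline of when each device dropped off and came back.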

 

My network setup:

ISP - Starlink

Router - Unifi UDM SE

 

I attached diagnostics and my UnRAID network settings.
 

Tower-NetworkSettings.png

tower-diagnostics-20231206-0749.zip

Link to comment

Update: I ran the server in Safe Mode (no plugins or Dockers) for 24 hours. However, it just froze up and caused my network to crash again. I would think that rules out a plugin or Docker as the cause.

 

I ran a memtest and it passed. I'm now going to try:

  1. Deleting the network config file from the USB
  2. Changing the assigned IP address in the router
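
For step 1, a safer variant is to rename the file rather than delete it, so the old settings can be restored. This is only a sketch, run with the flash drive mounted on another computer. It assumes the network settings live in `config/network.cfg` on the flash drive and that Unraid falls back to defaults when that file is missing; neither is confirmed in this thread, and the function name is made up.

```python
import shutil
import time
from pathlib import Path


def set_aside_network_config(flash_root, name="network.cfg"):
    """Rename config/<name> on the flash drive to a timestamped backup.

    Returns the backup path, or None if the file does not exist.
    The config/network.cfg location is an assumption about the flash layout.
    """
    cfg = Path(flash_root) / "config" / name
    if not cfg.is_file():
        return None
    backup = cfg.with_name(f"{name}.{time.strftime('%Y%m%d-%H%M%S')}.bak")
    shutil.move(str(cfg), str(backup))
    return backup
```

Calling `set_aside_network_config("/path/to/flash")` leaves a `.bak` copy next to where the original was, which can be renamed back if the reset does not help.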
Link to comment

I just had (almost) the same issue a few days ago - around the same time as your initial post.
 

No USB failure in my case, just the server crash taking down the entire network (UDMP).

Which version of Unraid are you running?  --EDIT--  >> I see you're on 6.12.6 in the attached images.

I updated to 6.12.6 shortly before - I'm wondering if the update has something to do with the crash.

Edited by yyc321
Link to comment

@yyc321 I'll try a clean 6.12.5 install on a USB and see if I can get it started. A few more updates:

  1. Loaded the USB on an old server of mine, eliminating flash drive issues
  2. Tested each RAM stick individually, same issues continued
  3. Loaded a clean install of 6.12.5 on a flash drive, tried to boot and got to the login but then had the following error:
    SQUASHFS error: xz decompression failed, data probably corrupt
    SQUASHFS error: Failed to read block 0x9b623ec: -5
    SQUASHFS error: Unable to read fragment cache entry (9b623ec)

PXL_20231208_204715149.jpg
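
SQUASHFS errors like these mean the compressed system image could not be read back correctly. One way to check whether the files on the flash drive themselves are damaged is to hash them on a known-good computer. The sketch below assumes the flash drive carries `<file>.sha256` sidecar files next to the boot images, each starting with the hex digest. That layout is an assumption and not confirmed in this thread; the function names are made up.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_against_sidecars(flash_root):
    """Compare every <file> with its <file>.sha256 sidecar in flash_root.

    Returns {file name: True if the hash matches, else False}.
    Assumes the first whitespace-separated token in the sidecar is the digest.
    """
    results = {}
    for sidecar in sorted(Path(flash_root).glob("*.sha256")):
        target = sidecar.with_suffix("")  # "bzroot.sha256" -> "bzroot"
        if not target.is_file():
            results[target.name] = False
            continue
        tokens = sidecar.read_text().split()
        expected = tokens[0].lower() if tokens else ""
        results[target.name] = sha256_of(target) == expected
    return results
```

If every file matches on another machine but the server still reports decompression failures, that would point away from the flash drive and toward something in the server corrupting data as it is read or unpacked.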

Link to comment

Update here -- struggling.

 

My server started to just boot loop, no BIOS, no screen activity. It would just start booting, beep once, then shut down and restart. I assumed it was the motherboard.

 

1. Replaced the motherboard with a new one.

2. Replaced the flash drive with a new one with stock/trial UnRAID

3. Booted up, ran Memtest and it passed.

4. Tried to boot into UnRAID but it takes over 5 mins then freezes, seems to be a lot of errors (SQUASHFS errors...idk).

5. Unplugged all drives, tried to boot stock UnRAID. Same issue.

6. Cycled memory sticks around to eliminate them as the cause, same issue.

 

I don't know how to proceed. Maybe a PSU issue? Maybe a CPU issue? Server is less than a year old.

 

Probably correlated, but when I did get into the BIOS before the reset loop issues, the CPU temp was over 100C. Boot loop issues started after that. However, CPU runs fine with the new mobo.

 

Edit: CPU is 13600K, which supposedly has heat issues with my former motherboard (AsRock Z690 Steel Legend)

 

At a loss here, do I just scrap everything and start over with another new build?

Edited by gobigred
Link to comment
  • Solution

Update: Replaced RAM and PSU -- still can't start up UnRAID. I attached a screenshot of the startup; this looks like a CPU issue to my untrained eye. Can anyone confirm?

On the hardware side:

  1. Tried multiple flash drives
  2. Unplugged all hard drives
  3. Replaced MOBO
  4. Replaced RAM
  5. Replaced PSU

I think the only thing left is the CPU?

PXL_20231211_230102372.MP.jpg

Link to comment
