
Unraid crashing randomly every few days



In the past month or so, my Unraid has crashed about every 4-6 days.

It has been in use for about 6ish months, and only had these crashes in the past month or so.

 

WebUI does not respond to anything, and the only way to reboot it is by pulling the plug...

It has always happened while I am at work, so I don't know a specific time it happens.

 

I am running 1 VM: Windows Server 2019, with 2 GB RAM and 1 CPU core.

 

I am also running the following Docker containers: Jackett, Krusader, Radarr, Sonarr, CAdvisor, Bazarr, Syncthing, Netdata, Pihole-DoT-DoH, Plex, Unifi Controller (limited to 2 GB RAM), Tubesync.

 

Attached is a Diagnostics export.

 

Hope someone has a solution or is able to help.

goga-unraid-diagnostics-20210818-1657.zip


 

Server has not crashed yet, but today I got a notification from Fix Common Problems, telling me "Out Of Memory errors detected on your server"

 

Server has 32GB Ram installed.

 

The VM is running with 2 GB.

The memory usage of all running Docker containers added together comes to about 61003 MB.

 

According to https://www.linuxatemyram.com, my system should be fine (I think).

root@GOGA-UNRAID:~# free -m
              total        used        free      shared  buff/cache   available
Mem:          31987        9016         835         900       22135       21905
Swap:             0           0           0
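
To make the reading of the `free -m` output above concrete, here is a minimal Python sketch that parses the pasted text and compares `used`, `buff/cache` and `available`. The names `FREE_OUTPUT` and `parse_free` are made up for this illustration. The point is only that most of the "missing" memory is in `buff/cache`, and that `available` is the kernel's estimate of what new processes could still get; it is not proof that no single process or container ran out of memory.

```python
# Sample copied from the post above; values are in MiB because of `free -m`.
FREE_OUTPUT = """\
              total        used        free      shared  buff/cache   available
Mem:          31987        9016         835         900       22135       21905
Swap:             0           0           0
"""


def parse_free(text):
    """Return {row_label: {column_name: value}} from `free -m` style output."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    columns = lines[0].split()
    rows = {}
    for line in lines[1:]:
        label, *values = line.split()
        # The Swap row has fewer columns; zip() simply stops at the shorter one.
        rows[label.rstrip(":")] = dict(zip(columns, map(int, values)))
    return rows


mem = parse_free(FREE_OUTPUT)["Mem"]
cache_share = mem["buff/cache"] / mem["total"]
available_share = mem["available"] / mem["total"]

print(f"used by processes : {mem['used']} MiB")
print(f"buff/cache        : {mem['buff/cache']} MiB ({cache_share:.0%} of total)")
print(f"available         : {mem['available']} MiB ({available_share:.0%} of total)")
```

Note that `shared` (tmpfs and similar) is counted inside `buff/cache` and cannot simply be dropped, so `available` is the safer number to look at.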

 

And running dmesg | grep oom-killer shows the following output.

root@GOGA-UNRAID:~# dmesg | grep oom-killer
[129937.393367] cluster-Cluster invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
[256112.858195] mongod invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
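
The number in square brackets is seconds since boot, so the two lines above can be turned into "how long after boot did this happen". A small Python sketch, assuming the lines look exactly like the ones pasted here (`DMESG_LINES`, `OOM_RE` and `parse_oom_events` are names invented for the example):

```python
import re

# The two lines from the dmesg output in the post above.
DMESG_LINES = [
    "[129937.393367] cluster-Cluster invoked oom-killer: "
    "gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0",
    "[256112.858195] mongod invoked oom-killer: "
    "gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0",
]

# "[<seconds since boot>] <process name> invoked oom-killer: ..."
OOM_RE = re.compile(
    r"^\[\s*(?P<uptime>\d+\.\d+)\]\s+(?P<process>.+?) invoked oom-killer"
)


def parse_oom_events(lines):
    """Extract the invoking process and the uptime of each oom-killer line."""
    events = []
    for line in lines:
        match = OOM_RE.match(line)
        if match:
            uptime_s = float(match.group("uptime"))
            events.append(
                {
                    "process": match.group("process"),
                    "uptime_s": uptime_s,
                    "uptime_days": uptime_s / 86400,
                }
            )
    return events


events = parse_oom_events(DMESG_LINES)
for event in events:
    print(f"{event['process']}: about {event['uptime_days']:.1f} days after boot")
```

Keep in mind that the process named on the line is the one whose allocation triggered the OOM killer, not necessarily the one that used the most memory or the one that was killed. The full report in the syslog lists the per-process memory table and the victim.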

 

 

Attached is a new Diagnostics, also containing the syslog.

 

Hope someone is able to help solve this problem.

goga-unraid-diagnostics-20210822-1832.zip

  • 2 weeks later...
  • 2 months later...

So I have been quite quiet in this thread for the past month(s), but I have been testing some things. I just forgot to post my findings... So here we go:

 

It seemed that the server would only crash when I was taking a backup (using Duplicati or IperiusBackup), and ONLY (as far as I know) when I was backing up data on my cache drives in RAID1. Even then it only crashed about every 2-3 backups, so some backups ran fine.

So I stopped backing up the cache for a while, and it ran for 24+ days without crashing. But then it randomly crashed again some time after...

 

I tried replacing one of the SSDs in the RAID1. It did have a SMART error, but only due to old age. It was an old SSD anyway.

After the new SSD was working, I tried running a backup, and it crashed again within 2 days...

 

 

Now, I only have 6 SATA ports on my motherboard, but I need 7 in total (5 for the HDD array, 2 for the SSD cache). When I set up the RAID1 cache, I purchased a "PCIe to 2x SATA port adapter" (StarTech.com 2 Port SATA 6 Gbps PCI Express SATA Controller Card) and connected one of the HDDs to the PCIe card. Both SSDs were connected to the motherboard.

Yesterday I removed the PCIe card and am now using only one SSD as cache, to see if the PCIe card caused the crash somehow.

I am not sure, but I THINK it all slowly started after I started using the RAID1 cache.

 

Ran a full backup of the cache during the night, and the server is still running fine. I'll give it a couple of days and run a backup every night to see if it crashes again.

 

Let's see what happens in a few days.


Welp, it crashed again today.... So all the things I did, did not help at all :(

It crashed while I was not home, and I don't know the exact time of the crash.
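
One way to narrow down the crash time without being at home is a heartbeat file: something writes a timestamp every minute or so, and after the forced reboot the last entry tells you roughly when the server stopped responding. Below is a rough Python sketch of that idea. It is not something Unraid ships; `append_heartbeat`, `last_heartbeat` and the log path are hypothetical, and it assumes Python is available on the box and that the file lives somewhere that survives a reboot.

```python
import os
import time
from datetime import datetime, timezone


def append_heartbeat(log_path, now=None):
    """Append one UTC timestamp line and force it to disk before returning."""
    stamp = datetime.fromtimestamp(
        time.time() if now is None else now, tz=timezone.utc
    )
    line = stamp.isoformat(timespec="seconds")
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")
        fh.flush()
        # fsync so the entry is not lost in a write cache when the box hangs.
        os.fsync(fh.fileno())
    return line


def last_heartbeat(log_path):
    """Return the newest timestamp in the log, or None if there is none."""
    try:
        with open(log_path, encoding="utf-8") as fh:
            lines = [ln.strip() for ln in fh if ln.strip()]
    except FileNotFoundError:
        return None
    return datetime.fromisoformat(lines[-1]) if lines else None
```

Calling `append_heartbeat("heartbeat.log")` from a scheduled job once a minute and checking `last_heartbeat("heartbeat.log")` after the next hard reset would give the crash time to within about a minute. Writing every minute to a USB flash drive adds wear, so a path on another persistent disk may be the better choice.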

 

Last thing I'm gonna try is to downgrade it. See if that helps.

Downgrading to 6.9.1. I know that version ran for 60+ days without problems.

 

 

Anyone got any other ideas to try?

Edited by GoGa_M
  • 2 months later...
11 minutes ago, JorgeB said:

That usually points to a hardware issue. One thing you can try is to boot the server in safe mode with all Docker containers/VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning on the other services one by one.

I will try that, thanks a lot for the help.

 

But actually I did have the same issue before I moved to another server.

So the common thread would be the USB stick or one of the drives? Could a faulty drive cause this kind of issue?

Edited by Matt3ra