Unraid OS stuck

3dee · January 20, 2022

Hello everyone,

Today, and once last week, I checked my array in the morning and Unraid was more or less stuck; today was the second time. In the web interface, the Docker and Dashboard tabs didn't load at all, and the Main tab showed all my devices but nothing under "Array Operation" (such as reboot or shutdown). None of my Docker apps were responding. I had to cold-start the system to get it running again. I don't know where to start looking for the cause.

Before the shutdown I was able to pull diagnostics via SSH / WinSCP. They are untouched except for the mover logs I removed.

My CPU is an Intel Xeon E5-2630L V4 ES (Engineering Sample). All the other hardware information should be in the diagnostics.

I hope you can help me find the reason for my Unraid OS "crashes".

Thanks!

serverpc-diagnostics-20220113-0939.zip serverpc-diagnostics-20220120-0912.zip

Edited January 20, 2022 by 3dee

Squid · January 20, 2022

This appears to be the start of it

Jan 20 04:29:00 ServerPC kernel: BTRFS critical (device sdj1): corrupt leaf: root=2 block=3314483200 slot=95, unexpected item end, have 428459034 expect 13148

Are you running ECC memory? I would start investigating by downloading memtest from https://www.memtest86.com/ and setting it up on a separate boot stick. (The up-to-date versions will catch ECC errors, but due to licensing restrictions they can't be included with the base OS.)

This could also have been caused by filling the cache pool to 100% (btrfs does not respond well in that circumstance).

3dee · January 20, 2022

Yes, the RAM is REG ECC.

I will do memtest.

How can I prevent the cache from filling up? It actually was at 99% a few times. I'm using rTorrent with space pre-allocation, but it seems to ignore the 30GB minimum free space limit I set for my media share. Files are usually no larger than 5GB and never larger than 30GB.

Squid · January 20, 2022

One thing is to set the shares you're using for downloads to cache-prefer rather than cache-only, and to make sure the cache floor limit is set appropriately so the system knows when to overflow to the array. (Also, don't directly reference /mnt/cache in any path mappings; always use /mnt/user/... so that the rules are obeyed.)
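As a rough illustration of the "cache floor" idea, here is a minimal Python sketch that reports how full a mount point is and whether its free space has dropped below a chosen floor. The function name `free_space_report` and the floor value are hypothetical choices for this example, not Unraid settings; the only path taken from this thread is /mnt/cache. Free-space figures reported for btrfs pools should be treated as estimates.

```python
import shutil

GIB = 1024 ** 3  # bytes per GiB


def free_space_report(path, floor_bytes):
    """Return usage figures for `path` and whether free space is below the floor.

    `floor_bytes` is an illustrative threshold chosen by the caller; it is not
    read from any Unraid configuration.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    used_percent = 100 * usage.used / usage.total if usage.total else 0.0
    return {
        "path": path,
        "used_percent": round(used_percent, 1),
        "free_gib": round(usage.free / GIB, 1),
        "below_floor": usage.free < floor_bytes,
    }
```

On the server itself, something like `free_space_report("/mnt/cache", 10 * GIB)` could be run from a scheduled script to get a warning before the pool reaches 100%.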

You might also want to have a look at the system event log (BIOS) to see whether it contains anything noteworthy about the lockups.

3dee · January 21, 2022

IPMI did not log anything noteworthy, but the "Minimum free space" on my cache drive was 0, so I set it to 10GB now.

If it happens again, I will run memtest. Until then, I hope the Minimum free space setting was the reason for the issue.

Thanks so far!

JorgeB · January 21, 2022

Jan 13 06:47:17 ServerPC kernel: macvlan_broadcast+0x10e/0x13c [macvlan]
Jan 13 06:47:17 ServerPC kernel: macvlan_process_broadcast+0xf8/0x143 [macvlan]

Macvlan call traces are usually the result of having Docker containers with a custom IP address. Upgrading to v6.10 and switching to ipvlan might fix it (Settings -> Docker Settings -> Docker custom network type -> ipvlan; advanced view must be enabled, top right), or see below for more info.

https://forums.unraid.net/topic/70529-650-call-traces-when-assigning-ip-address-to-docker-containers/

See also here:

https://forums.unraid.net/bug-reports/stable-releases/690691-kernel-panic-due-to-netfilter-nf_nat_setup_info-docker-static-ip-macvlan-r1356/
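For anyone checking their own logs for the same symptoms, the following is a small Python sketch that scans syslog lines for the two kinds of kernel messages quoted in this thread: BTRFS critical/error messages and frames tagged `[macvlan]`. The function name `scan_syslog` and the pattern names are made up for this example, and the patterns only cover the messages shown above; they are not a complete list of what can go wrong.

```python
import re

# Patterns based only on the kernel messages quoted in this thread.
PATTERNS = {
    "btrfs": re.compile(r"BTRFS (critical|error)"),
    "macvlan": re.compile(r"\[macvlan\]"),
}


def scan_syslog(lines):
    """Group log lines by which known pattern they match."""
    hits = {name: [] for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits[name].append(line.rstrip("\n"))
    return hits


# Sample lines copied from the posts above.
sample = [
    "Jan 20 04:29:00 ServerPC kernel: BTRFS critical (device sdj1): corrupt leaf: "
    "root=2 block=3314483200 slot=95, unexpected item end, have 428459034 expect 13148",
    "Jan 13 06:47:17 ServerPC kernel: macvlan_broadcast+0x10e/0x13c [macvlan]",
]
for name, matched in scan_syslog(sample).items():
    print(name, len(matched))
```

To scan a real log, pass an open file object instead of the sample list, for example the syslog file extracted from a diagnostics zip (its exact path inside the zip is not stated in this thread).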

3dee · January 23, 2022

Yes, all my Docker containers are running on br0 with a fixed IP address. Are those macvlan call traces "bad"? Should I change all my containers to ipvlan?

Edit: I should read the links before asking. I will check that. Thanks!

Edited January 23, 2022 by 3dee

JorgeB · January 23, 2022

1 hour ago, 3dee said:

Are those macvlan call traces "bad"?

Yes, they can make Unraid crash.

3dee · February 2, 2022

Hey Guys,

So I upgraded to version 6.10 and changed the network type to ipvlan. None of my Docker containers were able to connect to the internet; only the local network was working.

So I changed back to macvlan and disabled VLAN tagging in the network settings. Macvlan traces still showed up every now and then, and last night my server crashed completely (kernel panic, I guess).

Is there a way I can get ipvlan working with my br0 docker containers? Or can I use macvlan with br0 containers without getting trace errors? I really don't need vlan tagging that much on my server.

JorgeB · February 2, 2022

I have never used it, but as I understand it, ipvlan should work the same as before for connectivity.

3dee · February 2, 2022

So I just switched again from the fully working macvlan setting to ipvlan and re-enabled Docker. Nothing else changed.

The issues are back - to name some of them:

binhex-teamspeak - Server not reachable through WAN IP, only LAN

binhex-rtorrentvpn - No torrent connects to its tracker (Tracker: [Couldn't resolve host name])

swag - None of the proxy sites is reachable via its subdomain (but the containers are reachable locally)

pi-hole - "Maximum number of concurrent DNS queries reached (max: 150)"

The web interfaces of containers (like owncloud or pi-hole) take 10-20 seconds to connect the first time

It seems like none of the containers can connect to the internet, but pinging a server from the container shell still works:


root@67a0fc2da785:/# ping google.com
PING google.com (172.217.168.206): 56 data bytes
64 bytes from 172.217.168.206: seq=0 ttl=113 time=10.286 ms
64 bytes from 172.217.168.206: seq=1 ttl=113 time=9.862 ms
64 bytes from 172.217.168.206: seq=2 ttl=113 time=9.996 ms
64 bytes from 172.217.168.206: seq=3 ttl=113 time=9.887 ms
64 bytes from 172.217.168.206: seq=4 ttl=113 time=9.919 ms
64 bytes from 172.217.168.206: seq=5 ttl=113 time=9.940 ms
64 bytes from 172.217.168.206: seq=6 ttl=113 time=9.972 ms
64 bytes from 172.217.168.206: seq=7 ttl=113 time=9.854 ms
64 bytes from 172.217.168.206: seq=8 ttl=113 time=9.853 ms
64 bytes from 172.217.168.206: seq=9 ttl=113 time=9.938 ms
^C
--- google.com ping statistics ---
10 packets transmitted, 10 packets received, 0% packet loss
round-trip min/avg/max = 9.853/9.950/10.286 ms
root@67a0fc2da785:/#

I will switch back to macvlan for now, but I would love to get ipvlan running since it may not crash my server. Diagnostics are attached.

serverpc-diagnostics-20220202-1914.zip
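Since ping resolves and reaches google.com from the container shell while rTorrent reports "Couldn't resolve host name", it may help to test name resolution and TCP connections separately. The Python sketch below does that for a single host. The function name `check_host`, the default port 443, and the injectable `resolve`/`connect` parameters are assumptions made for this example; it does not identify the cause, it only shows which step fails.

```python
import socket


def check_host(host, port=443, timeout=3.0,
               resolve=socket.getaddrinfo, connect=socket.create_connection):
    """Check DNS resolution and a TCP connect as two separate steps."""
    result = {"dns": False, "tcp": False, "error": None}

    # Step 1: name resolution only.
    try:
        infos = resolve(host, port, proto=socket.IPPROTO_TCP)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        result["error"] = f"DNS lookup failed: {exc}"
        return result
    result["dns"] = True

    # Step 2: TCP connect to the first resolved address.
    address = infos[0][4][0]
    try:
        with connect((address, port), timeout=timeout):
            result["tcp"] = True
    except OSError as exc:
        result["error"] = f"TCP connect to {address}:{port} failed: {exc}"
    return result
```

Run inside an affected container (if it has Python available), `check_host("google.com")` would show whether the failure is in the DNS lookup or in the outgoing TCP connection.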

hqueiroga · May 22, 2022

@3dee have you found a solution for your problem? I have the same problem as you: IPVLAN doesn't work for me, whereas MACVLAN works but eventually crashes my server.

This is so frustrating…

3dee · May 23, 2022

18 hours ago, hqueiroga said:

@3dee have you found a solution for your problem? I have the same problem as you: IPVLAN doesn't work for me, whereas MACVLAN works but eventually crashes my server.

This is so frustrating…

Nope, I don't use VLANs with my server anymore. The switch port is now in access mode. It has been stable since then.

hqueiroga · May 23, 2022

6 hours ago, 3dee said:

Nope, I don't use VLANs with my server anymore. The switch port is now in access mode. It has been stable since then.

This is new to me… So with access mode, do you still get to run different IP addresses? Do you have multiple network cards in your server? Sorry for all the questions… I'm trying to find an alternative solution here, otherwise I have no way to use multiple IP addresses. If I choose macvlan, it works perfectly but eventually crashes my server; if I choose ipvlan, it doesn't work.

3dee · May 24, 2022

Switchport mode access means that I'm not using VLAN tagging. I'm only using one Ethernet port with a single IP address for the server. Trunk mode makes my server crash about every week, ipvlan does not work for me either, and I'm not willing to change every Docker container from br0 to something else.

hqueiroga · May 24, 2022

Thanks for your reply @3dee. I think I need the same solution as you... which switch do you use for that?

3dee · May 24, 2022

3 hours ago, hqueiroga said:

Thanks for your reply @3dee. I think I need the same solution as you... which switch do you use for that?

Not sure what solution you mean. It's the "easiest" way of connecting the server to the network. The switch model is not relevant for this, you could also just use an unmanaged switch or your standard home router.
