jbeazies Posted March 12, 2019

I haven't made many posts because unraid has been near flawless for many years, but lately I've been having issues that I can't pinpoint, and unfortunately I have not done a great job of tracking my changes. Most of the changes were docker updates and downloads.

Unraid version: 6.6.7
Plugins: CA Auto updates, CA Backup, Community Applications, FCP, Preclear Disks, Server Layout, Statistics, Unassigned Devices
Dockers: binhex deluge, jacket, unificontroller, duckdns, letsencrypt, radar, openvpnas, plex, sabnzbd, sonar, ombi, lazylibrarian
VMs: ubuntu, WinServer 2016 essentials
*letsencrypt, lazylibrarian, ubuntu, ws2016 are stopped most of the time and not running
Hardware: Supermicro - X10SDV-TLN4F, Intel Xeon CPU D-1541, NVM/IOMMU Enabled, 32GB ECC Corsair Memory

Issue: Somewhat randomly (might take a day, might take several), I lose all connectivity to the WebGUI, Dockers, SSH, seemingly everything. What continues to work is the local terminal (via mouse/key). I can't pinpoint what I'm doing at the exact time, but generally I'm just watching plex or checking out the unifi controller, so I'm guessing it must be some process happening behind the scenes or potentially something with an automatic downloader. I can still log in as root on the local terminal, but sometimes I get further than at other times (e.g. sometimes ifconfig will run, other times it won't - it's one of the few Linux commands I know). Diagnostics never completes - it just stays at collecting... for hours. I managed to collect a syslog with a cp /var/log/syslog /boot/syslog command.

Troubleshooting: MemTest cleared 2 passes, no recent hardware changes, no SMART errors that I can tell, parity always passes - and I've run dozens of checks over the past couple of weeks due to the hard shutdowns.

Last night I thought I'd boot into Safe Mode. It was going smoothly until I happened to notice today, while I was VPN'd to my server via the unraid supervisor app on iOS, that the memory usage seemed to be at 76% before it crashed this time. I thought that was a bit unusual because I've never seen it reach that high, but it could have been a coincidence, idk. I appreciate any help/advice.

syslog.zip
Frank1940 Posted March 12, 2019

Check that every Docker and Plugin that collects data to be stored on the array has each storage area it uses assigned to one of the mount points under /mnt. If one of them is pointing to a place other than /mnt, then data is probably being stored on Unraid's RAM disk. The available RAM for this type of use is quite small, and bad things will usually happen when unRAID runs out of RAM. 🙄
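Frank's check can be roughed out in a few lines of Python. This is only a sketch: the function name and the example paths are made up for illustration, and it assumes you have already copied the host-side paths out of your container and plugin settings by hand. It does not read any Unraid or Docker configuration itself, and it does not resolve ".." segments or symlinks.

```python
from pathlib import PurePosixPath


def paths_outside_mnt(host_paths):
    """Return the host paths that are not located under /mnt."""
    mnt = PurePosixPath("/mnt")
    outside = []
    for raw in host_paths:
        path = PurePosixPath(raw)
        # A path passes only if /mnt is one of its ancestors.
        if mnt not in path.parents:
            outside.append(raw)
    return outside


# Hypothetical host paths, not taken from this thread.
example_paths = [
    "/mnt/user/appdata/plex",
    "/mnt/cache/downloads",
    "/tmp/transcode",
    "/downloads",
]

for suspect in paths_outside_mnt(example_paths):
    print(f"Not under /mnt, check this mapping: {suspect}")
```

Anything the sketch flags is worth a second look in the container or plugin settings, since per Frank's post it may be landing in RAM rather than on the array or cache.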
jbeazies Posted March 12, 2019 Author

Thanks for the feedback, Frank. I haven't noticed anything you mentioned out of place, but I'll give it another look over. I did find FCP pointing out that two of my dockers didn't have the Slave option, so I adjusted that to the recommended RW/Slave. The parity check has been running since yesterday and it's at 81%, but I just discovered that the CPU usage is at 100% and isn't moving. I ran top, and the process using my entire CPU is kworker/u32:1+events_power_efficient
jbeazies Posted March 24, 2019 Author (edited)

I thought I had this fixed, but the issue recurred last night with the same kworker process utilizing 100% of the CPU. The only thing I could do was a hard reset. I did as suggested, and everything is pointed to a /mnt directory in the docker and plugin configs. Here is something I've been able to catch

Edited March 25, 2019 by jbeazies
jbeazies Posted September 17, 2019 Author

Hey guys, I'm still having issues with my server becoming unresponsive to anything and getting stuck with a screen similar to this. I'm unable to access the system at all. The only thing I can capture is a photo of the screen.

The only commonality I've found is the Unifi-controller docker. If I enable it, it'll intermittently crash my unraid server with the symptoms below. When I originally observed this behavior, I thought I had it narrowed down to this specific docker after many hours of troubleshooting. At the time I thought it was because I had attempted to downgrade the controller version from v5.9 to the LTS branch (~v5.6). Since then I have completely removed the Unifi-controller docker and done a clean install with a clean config. Here is my current config - the same except for a newer version of unraid and the LTS version of unifi.

Unraid version: 6.7.2
Plugins: CA Auto updates, CA Backup, Community Applications, FCP, Preclear Disks, Server Layout, Statistics, Unassigned Devices
Dockers: binhex deluge, jacket, unificontroller, duckdns, letsencrypt, radar, openvpnas, plex, sabnzbd, sonar, ombi, lazylibrarian
Hardware: Supermicro - X10SDV-TLN4F, Intel Xeon CPU D-1541, NVM/IOMMU Enabled, 32GB ECC Corsair Memory

Symptoms:
-WebUI not responsive (took too long to respond, site can't be reached)
-Ping reply: Destination host unreachable
-SSH not responsive (Connection timed out)
-SMB share / network name not accessible via File Explorer (Network path was not found)

Any other suggestions? I can't grab any logs because the system is completely hung
Frank1940 Posted September 17, 2019

5 minutes ago, jbeazies said: Unraid version: 6.7.2 <<<< SNIP >>>> Any other suggestions? I can't grab any logs because the system is completely hung

I am not enough of a Guru to be able to figure out from that screenshot what is going on, but hopefully one of the true Gurus will be able to. However, you can now grab the syslog since you upgraded to 6.7.2! Go to Settings >>>> Syslog Server and set it up to 'Mirror syslog to flash:'. You can find out how to do this by using the built-in Unraid "Help" feature. It is the question-mark-in-a-circle icon on the Toolbar.
jbeazies Posted September 17, 2019 Author

Frank, I just happened to find these threads:

I too had the unificontroller set to custom br0 with a fixed IP. I'll work on changing those settings to host for the time being.
kingfetty Posted September 18, 2019

I also have an IP assigned to a docker on a br0 port, and it was causing crashes. I had been troubleshooting this for months; now that I've found this, I was able to stop the random crashes.
jbeazies Posted September 24, 2019 Author

@kingfetty - What did you do as a workaround? I still want to use static IP for my unifi-controller docker.
Hoopster Posted September 24, 2019

2 hours ago, jbeazies said: @kingfetty - What did you do as a workaround? I still want to use static IP for my unifi-controller docker.

If he did what I ended up doing, as documented in the thread of mine you linked, the solution was to create a VLAN for docker containers. I would get call traces when assigning an IP address on br0, but once I created a VLAN (br0.3) and assigned IP addresses on that VLAN to the docker containers, the call traces went away. It has been well over a year since I had a macvlan-related call trace or server lockup.

Your screenshots don't show any macvlan broadcast-related call traces, but in my case, after a few of them occurred, my server would eventually lock up and had to be manually rebooted via the power button.
bonienl Posted September 24, 2019

24 minutes ago, Hoopster said: I would get call traces when assigning an IP address on br0

Do you use VMs which connect over br0? My observation is that VM communication and Docker communication over the same interface (or VLAN) can collide once in a while, resulting in a broadcast-related call trace. I have separated VM and Docker communications onto different VLANs.
jbeazies Posted September 26, 2019 Author

Are there any instructions on creating a VLAN and assigning it to a docker container? Is it just a matter of turning VLAN on and putting the 169.x address into the custom br0 docker config?
bonienl Posted September 26, 2019

VLANs are created under network settings. For example, adding VLAN 10 to eth0 creates interface eth0.10 (or br0.10 when bridging is enabled). In the container configuration you need to assign custom network br0.10 to make use of the VLAN network (keep in mind this is a completely separate network from eth0/br0, and your router needs to support VLANs too).
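The naming rule bonienl describes can be written down as a small Python sketch. The function and its parameters are hypothetical and exist only to illustrate the pattern. It assumes the bridge for ethN is named brN, as in the eth0/br0 example above, and it uses the standard 802.1Q VLAN ID range of 1 to 4094.

```python
def vlan_interface(vlan_id, bridging_enabled, port=0):
    """Build the interface name for a VLAN, e.g. eth0.10 or br0.10."""
    # 802.1Q reserves IDs 0 and 4095, leaving 1..4094 usable.
    if not 1 <= vlan_id <= 4094:
        raise ValueError(f"VLAN ID out of range: {vlan_id}")
    # Assumption: with bridging enabled, ethN is exposed as brN.
    base = "br" if bridging_enabled else "eth"
    return f"{base}{port}.{vlan_id}"


print(vlan_interface(10, bridging_enabled=False))
print(vlan_interface(10, bridging_enabled=True))
```

The name it returns for the bridged case is the one that would be picked as the custom network in the container configuration.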
Hoopster Posted September 26, 2019

15 minutes ago, jbeazies said: Are there any instructions on creating and assigning VLAN to a docker container? Is it just setting VLAN on and putting the 169.x address into the custom br0 docker config?

Here's my VLAN 3 for Docker containers as defined in unRAID Network Settings (since bridging is enabled it becomes br0.3):

On the router side (UniFi USG), I also created the VLAN:

And in the config for a Docker container that I want to assign an address on the VLAN, I specify br0.3 as the Network Type and an IP address in the range (192.168.3.100...192.168.3.150) I assigned to that VLAN in the router:
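Before typing a fixed address into a container config, it can help to confirm that it falls inside the range set aside for the VLAN. Here is a minimal Python sketch using the standard ipaddress module. The function name is made up for illustration, and the default bounds are simply the range from Hoopster's post, so substitute your own.

```python
import ipaddress


def in_vlan_range(candidate, first="192.168.3.100", last="192.168.3.150"):
    """Return True if candidate lies within the inclusive range first..last."""
    ip = ipaddress.ip_address(candidate)
    low = ipaddress.ip_address(first)
    high = ipaddress.ip_address(last)
    # Address objects of the same IP version compare numerically.
    return low <= ip <= high


# Sample addresses chosen for illustration only.
for address in ("192.168.3.120", "192.168.3.99", "192.168.1.120"):
    verdict = "inside" if in_vlan_range(address) else "outside"
    print(f"{address} is {verdict} the VLAN range")
```

This only checks the arithmetic. Whether an address is already in use on the VLAN still has to be checked on the router.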
jbeazies Posted September 26, 2019 Author

Thanks all! This was primarily for my Unifi controller docker. I suppose I could slowly start migrating my other dockers over, but for now those remain on a combination of bridge/host network types. I'll monitor the status of the unifi docker and, fingers crossed, hope this resolves the issue.

Now that I think about it, I'm wondering if this had anything to do with my original issue. I set a static IP on my primary network via the USG for the controller docker. I also set the custom br0 within the unraid docker config to that same static IP. I wonder if this somehow caused the conflict, resulting in the unraid system locking up?