January 25, 2023

Hello,

The server randomly becomes unresponsive every few days: no HDMI output and no response over the network. It started about a month ago. I have tried reading the syslog before, but I couldn't figure out what the problem was. Tonight it happened again, but this time I noticed what look like kernel errors at Jan 25 07:04:36. I'm attaching the diagnostics and the syslog. I'm sure something is failing; whatever it is, hopefully it can wait another couple of weeks.

To deal with the random unresponsiveness, I have set up a smart plug driven by a script that pings the server for a while and, if there is no response, hard-resets the plug, which forces a reboot. Yes, it's not "clean", but it's the only way to avoid having the server down for hours when I'm away or asleep.

(Little note: I noticed the clock and the server name got reset this last time; to be fair, I think that was on me.)

Attachments: syslog-192.168.1.10.log, tower-diagnostics-20230125-1227.zip
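The ping-and-reset workaround described above could be sketched roughly as follows. This is not the poster's actual script: `HOST` is only assumed from the attached syslog filename, `MAX_FAILS` and `RETRY_DELAY` are made-up defaults, and `power_cycle_plug` is a hypothetical placeholder for whatever call the smart plug really needs.

```shell
#!/bin/bash
# Watchdog sketch: power-cycle a smart plug after several consecutive failed pings.
# All names and defaults below are illustrative assumptions, not details from the post.

HOST="${HOST:-192.168.1.10}"            # assumed from the attached syslog filename
MAX_FAILS="${MAX_FAILS:-5}"             # consecutive failed probes before a reset
RETRY_DELAY="${RETRY_DELAY:-10}"        # seconds to wait between failed probes
PING_CMD="${PING_CMD:-ping -c 1 -W 2}"  # overridable so the logic can be tried offline

# Hypothetical hook: replace the body with the command your plug understands.
power_cycle_plug() {
    echo "power-cycling plug for $HOST"
}

# Probe the host up to MAX_FAILS times. Return 0 as soon as one probe succeeds;
# otherwise call power_cycle_plug and return 1.
check_host() {
    local fails=0
    while [ "$fails" -lt "$MAX_FAILS" ]; do
        if $PING_CMD "$HOST" >/dev/null 2>&1; then
            return 0
        fi
        fails=$((fails + 1))
        sleep "$RETRY_DELAY"
    done
    power_cycle_plug
    return 1
}
```

Calling `check_host` from a scheduled job on a different machine (not the server itself) is one way to run it. Requiring several failures in a row avoids resetting the server over a single dropped packet. The `-W 2` timeout is the Linux `ping` form, in seconds; other platforms may interpret it differently.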
January 25, 2023, Community Expert
The latest crashes are related to the AMD GPU; try blacklisting the driver if you don't need it. There are also some OOM (out-of-memory) errors, so you should limit resources.
January 25, 2023, Author
11 minutes ago, JorgeB said: "Last crashes are related to the AMD GPU, try blacklisting the driver if you don't need it, there are also some OOM errors, you should limit resources."

Thank you so much for answering and looking into it. I will try to limit resources as you suggest. I am just surprised, because the only Docker container that should be noticeably active is Frigate; all the others are basically idle (and definitely should have been while the crashes were happening). The OOM errors might be because I changed the memory usage value in the Tips and Tweaks plugin, but to debug this issue I reset it to the default a couple of days ago. If the errors are more recent than that, then something is definitely wrong. Odd.

On the GPU side, I am using it for transcoding with Frigate (with a Coral USB), and falling back to CPU transcoding would be quite inefficient. Is there anything else I can do? It is a Ryzen 2400G, so it's an APU/iGPU; I don't know if that makes any difference with Unraid.
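To see whether the OOM errors predate or postdate the settings reset, and which processes were killed, the attached syslog can be scanned for the kernel's OOM-killer messages. The following is a minimal sketch; the function names `oom_events` and `oom_victims` are made up for illustration, and the patterns assume the usual kernel wording ("invoked oom-killer", "Out of memory: Killed process PID (name)").

```shell
#!/bin/bash
# Sketch: find OOM-killer activity in a syslog file.
# Function names are illustrative; patterns assume standard kernel log wording.

# Print every line that mentions the OOM killer, with its original timestamp.
oom_events() {
    grep -E 'invoked oom-killer|[Oo]ut of memory' "$1"
}

# Count how often each process name was killed, most frequent first.
oom_victims() {
    grep -oE 'Killed process [0-9]+ \([^)]*\)' "$1" \
        | sed -E 's/^Killed process [0-9]+ \((.*)\)$/\1/' \
        | sort | uniq -c | sort -rn
}
```

For example, `oom_events syslog-192.168.1.10.log` lists the events with their timestamps, and `oom_victims syslog-192.168.1.10.log` shows which process names were killed most often, which may help decide which container to cap first.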
January 25, 2023, Community Expert
If you need the GPU to transcode, you need the driver. You can still disable it for a few days, just to see whether it is causing the problems.
January 25, 2023, Author
1 minute ago, JorgeB said: "If you need the GPU to transcode you need the driver, you can still disabled it for a few days just to try and see if that is causing the problems or not."

Fair enough. Would you mind telling me how to do it? I cannot access the server physically at the moment as I'm on holiday; I hope it is something I can do remotely.
January 25, 2023, Community Expert
1 hour ago, bexem said: "I hope it is something I can do remotely."

Yep: https://wiki.unraid.net/Manual/Release_Notes/Unraid_OS_6.10.0#Linux_Kernel
January 25, 2023, Author
33 minutes ago, JorgeB said: "Yep: https://wiki.unraid.net/Manual/Release_Notes/Unraid_OS_6.10.0#Linux_Kernel"

Perfect, all done; now it's a matter of waiting. Thank you so much!

I've followed the steps, but when I run:

    lsmod | grep amdgpu

I still get this output:

    amdgpu 6705152 0
    gpu_sched 40960 1 amdgpu
    i2c_algo_bit 16384 1 amdgpu
    drm_ttm_helper 16384 1 amdgpu
    ttm 73728 2 amdgpu,drm_ttm_helper
    drm_display_helper 135168 1 amdgpu
    drm_kms_helper 159744 4 drm_display_helper,amdgpu
    drm 475136 7 gpu_sched,drm_kms_helper,drm_display_helper,amdgpu,drm_ttm_helper,ttm
    i2c_core 86016 6 drm_kms_helper,i2c_algo_bit,drm_display_helper,amdgpu,i2c_piix4,drm
    backlight 20480 4 video,drm_display_helper,amdgpu,drm

With:

    root@Tower:~# ls /boot/config/modprobe.d/
    amdgpu.conf
    root@Tower:~# cat /boot/config/modprobe.d/amdgpu.conf
    blacklist amdgpu

In the meantime I have disabled hardware acceleration in Frigate and Plex (the only two containers that had it).

I forgot to add: yes, I have rebooted. I also have the same file, amdgpu.conf, in /etc/modprobe.d/ with exactly the same content.

Edited January 25, 2023 by bexem (Update 2)
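The manual checks above (is the module loaded, and is the blacklist file where modprobe will read it) can be combined into a small helper. This is a sketch only; `module_loaded` and `blacklist_files` are hypothetical names, and the second argument to `module_loaded` exists just so the function can be pointed at a saved listing instead of the live `/proc/modules`.

```shell
#!/bin/bash
# Sketch: check whether a kernel module is loaded and where it is blacklisted.
# Function names are illustrative, not part of Unraid or kmod.

# Return 0 if module $1 is listed in a /proc/modules-style file
# ($2, defaulting to the live /proc/modules).
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Print each *.conf file under the given directories that contains
# a "blacklist <module>" line for module $1.
blacklist_files() {
    local mod="$1" dir
    shift
    for dir in "$@"; do
        grep -lsE "^[[:space:]]*blacklist[[:space:]]+${mod}[[:space:]]*$" "$dir"/*.conf || true
    done
}
```

On the server, `module_loaded amdgpu && echo "still loaded"` together with `blacklist_files amdgpu /boot/config/modprobe.d /etc/modprobe.d` reproduces the situation described in the post. One thing worth keeping in mind: according to the modprobe.d documentation, a `blacklist` line only stops a module from being loaded automatically through its aliases, so something that requests the module explicitly by name can still load it. Whether that is what is happening here is not established by the thread.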