bexem

Members
  • Posts

    10

bexem's Achievements

Noob (1/14)

3

Reputation

1

Community Answers

  1. Thank you for your reply! It doesn't cause any issues/flood, I just thought that "Errors only" would not send any other notification; I am a "no news is good news" type of person. Again, thank you for your work!
  2. Hi! I'm loving the plugin, it completely replaced my custom script. I'm only having a small issue with the notification settings: they don't seem to work. I still receive a summary message. Thank you!
  3. Sorry for the double post, but I might have found the reason why unRaid was randomly crashing: the cache drive would randomly disconnect itself, leaving unRaid unable to write the log file (the syslog server was set to a share which prefers the cache), which is why no errors were logged. I can’t say if it’s the drive itself or the SATA cable, but in the meantime I’ve removed the cache altogether, and I’ve already observed the same behaviour with the drive mounted as an external drive (can’t remember the name of the plug-in). I will replace the drive, as without a cache the system is understandably slower, but I’m wondering if there is a way to check, or be notified, when the drive has disconnected itself (it does “reconnect” automatically by itself)? Basically I’m planning to replace the SATA cable and keep the drive connected (doing nothing); if that was the issue, great, otherwise I’ll have to send the disk back. Again, sorry for the double post, but I wanted to share my experience in case other users encounter the same issue.
  4. Thank you for having a look! I did fix the OOM errors, I haven't seen any other since. I guess I'll need to try what you are suggesting...I just wish unraid gave me an error/reason for the crashes!
  5. Hello, the server is randomly crashing but I can't seem to find the reason (I might just be blind). I've been reading the logs but I cannot see any error. I did have some OOM errors days/weeks ago, but they were caused by two misconfigured containers; since I fixed their configuration I have had no more issues, and they never actually caused problems apart from the error in the log. In February I changed the whole system, going from AMD to Intel. The only components not changed are the DDR4 (and the drives); the RAM is not overclocked (I'd rather have a more stable but slower server). I also changed the flash drive and the power supply, and the server has its own UPS as well. I'm saying this so as not to rule out a hardware issue; I'll swap the RAM too if that is the issue. I'm attaching the diagnostics and the logs. I hope someone can help me identify what's going on. I've named each log with the time the crash happened; the 0430 one is the most recent. tower-diagnostics-20230829-0939.zip crash_0430.log crash_0102.log
  6. Perfect, all done, now it's a matter of waiting. Thank you so much! I've followed the steps, but when I run:

         lsmod | grep amdgpu

     I still have this as output:

         amdgpu 6705152 0
         gpu_sched 40960 1 amdgpu
         i2c_algo_bit 16384 1 amdgpu
         drm_ttm_helper 16384 1 amdgpu
         ttm 73728 2 amdgpu,drm_ttm_helper
         drm_display_helper 135168 1 amdgpu
         drm_kms_helper 159744 4 drm_display_helper,amdgpu
         drm 475136 7 gpu_sched,drm_kms_helper,drm_display_helper,amdgpu,drm_ttm_helper,ttm
         i2c_core 86016 6 drm_kms_helper,i2c_algo_bit,drm_display_helper,amdgpu,i2c_piix4,drm
         backlight 20480 4 video,drm_display_helper,amdgpu,drm

     With:

         root@Tower:~# ls /boot/config/modprobe.d/
         amdgpu.conf
         root@Tower:~# cat /boot/config/modprobe.d/amdgpu.conf
         blacklist amdgpu

     In the meantime I have disabled hardware acceleration on Frigate and Plex (the only two containers that had it). Forgot to add: yes, I have rebooted. I also have the same file "amdgpu.conf" in /etc/modprobe.d/ with the exact same content.
  7. Fair enough, would you mind telling me how to do it? I cannot access the server physically at the moment as I'm on holiday, so I hope it is something I can do remotely.
  8. Thank you so much for answering and looking into it. I will try to limit resources as you suggest. I am just surprised, as the only Docker container that should be more active than the others is Frigate; all the others are basically idle (and definitely should have been while the crashes were happening). The OOM errors might be because I changed the memory usage values in the Tips and Tweaks plug-in, but to debug this issue I reset them to default a couple of days ago. If you tell me the errors are more recent than that, then there is definitely something wrong. Odd. On the GPU side, I am using it for transcoding with Frigate (with a Coral USB), and going to CPU transcoding would be quite inefficient. Is there anything else I can do? It is a Ryzen 2400G, so it’s an APU/iGPU; I don’t know if that makes any difference with unraid.
  9. Hello, the server randomly becomes unresponsive every few days: no HDMI output and no answer from the network. It started about a month or so ago. I have tried to read the syslog in the past but I couldn't figure out what the problem was. Tonight it did it again, but I've noticed some (kernel?) errors at Jan 25 07:04:36. I'm attaching the diagnostics and the syslog. I'm sure something is failing; whatever it is, hopefully it can wait another couple of weeks. So far, to deal with the random unresponsiveness, I have had to connect the server to a smart plug controlled by a script that pings the server for a while; if there is no response, it power-cycles the plug, forcing a reboot. Yes, it's not "clean", but it's the only way to avoid having the server down for hours when I'm away or asleep. (Little note: I noticed the clock and the name of the server got reset this last time; to be fair, I think that was on me.) syslog-192.168.1.10.log tower-diagnostics-20230125-1227.zip
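
Post 3 asks whether a drive that drops off the bus can be detected. One possible approach, shown here only as a minimal sketch, is to test periodically whether the drive's device path still exists. The function name `check_device` and the idea of using a stable path (for example an entry under /dev/disk/by-id/) are assumptions for illustration and are not taken from the posts. How the warning is delivered is left open.

```sh
#!/bin/sh
# Sketch only: report whether a device path is currently present.
# The path passed in is an assumption; substitute the real entry for the drive.
check_device() {
    dev="$1"
    if [ -e "$dev" ]; then
        echo "OK: $dev is present"
        return 0
    fi
    echo "WARNING: $dev is missing (drive may have disconnected)"
    return 1
}
```

Because the drive in post 3 reconnects by itself, a check that runs every few minutes could miss a short dropout, so this would probably complement, not replace, looking through the kernel log for disconnect messages.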
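
In post 6, `lsmod | grep amdgpu` prints every row that mentions amdgpu, including modules that merely list it as a user, so the output is longer than the question being asked. A small sketch that answers only "is this exact module loaded?" is given below. The helper name `module_loaded` is hypothetical; it reads `lsmod`-style text on standard input and compares the first column only.

```sh
#!/bin/sh
# Sketch only: succeed if the first column of any input row equals
# the module name given as $1. Input is expected in lsmod format.
module_loaded() {
    awk -v mod="$1" '$1 == mod { found = 1 } END { exit !found }'
}
```

Usage would look like `lsmod | module_loaded amdgpu && echo "amdgpu is still loaded"`. In the output quoted in post 6, the row `amdgpu 6705152 0` is the one that shows the module itself is loaded.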
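
The watchdog described in post 9 (ping the server for a while, then power-cycle the smart plug) could be structured roughly as follows. This is a sketch under stated assumptions, not the script the poster actually uses: the names `probe` and `is_down` and the `PROBE` override are hypothetical, and the `ping -c 1 -W 2` options assume the Linux iputils version of `ping`. The step that switches the plug is deliberately left out because it depends on the plug's own interface.

```sh
#!/bin/sh
# Default probe: one ICMP echo request with a 2-second reply timeout
# (option meanings assume Linux iputils ping).
probe() {
    ping -c 1 -W 2 "$1" >/dev/null 2>&1
}

# Succeeds (exit status 0) only if all $2 probes of host $1 fail,
# sleeping $3 seconds after each failed attempt.
# PROBE may name a different reachability command.
is_down() (
    host="$1"; attempts="$2"; delay="$3"; i=0
    while [ "$i" -lt "$attempts" ]; do
        if "${PROBE:-probe}" "$host"; then
            exit 1   # the server answered, so it is not down
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    exit 0
)
```

A caller might run something like `is_down 192.168.1.10 5 10 && power_cycle_plug`, where the address is assumed from the attached syslog file name and `power_cycle_plug` stands in for whatever command controls the plug. Requiring several consecutive failures avoids rebooting the server over a single lost ping.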