September 1, 2021

Hello Everyone!

I put this system together not so long ago. Most of it is running, and I was able to troubleshoot the issues I hit along the way. But sometimes it crashes, and I can't figure out why. If anyone has an idea or can point me in some direction, I would be happy. 🙂

Symptoms: the webGUI doesn't load, shares are inaccessible, fans are running. Since I have Pihole running on it, and my desktop and phone point to it as their DNS server, those machines lose their internet connection as well. If I leave it as it is, it doesn't recover; after a cold reboot everything starts up fine (array, parity check, Docker containers).

Timing looks random. It usually happened in the early morning hours, but it has also happened at 11:40. Uptime is also random: sometimes a week, sometimes it doesn't even reach 24h.

System:
Acer Q67H2-AM mobo with i3-2120
6GB RAM: 2GB Kingmax + 4GB Crucial
PSU: Cooler Master Elite Power 500W
HDDs: 4x Samsung HD204UI 2TB

What I ruled out:
- no cache -> no mover
- no VM
- RAM: memtest86 found no errors

Syslog: doesn't show anything. (The crash happened on the 28th, between 8:03 and 12:32.) I tried to capture it with a local syslog server and by mirroring to the flash, but there is nothing suspicious.

Once, the stats tab was open in the browser on my desktop PC when the crash happened. I saw very little activity on everything: CPU, network and disk usage were barely visible on the graphs. The RAM showed the usual values, around 3 GB used, around 200-300 MB free, and the rest cached, which should be normal on a Linux-based system afaik.

From the Docker logfiles: I couldn't find all the logs, but I can post them, as I copied my whole appdata folder to my desktop after this crash. I checked the following containers: Jellyfin, Jackett, Lidarr, NginxProxyManager, Ombi, Qbittorrent, Radarr, Readarr, Sonarr. I found that the crash was between 8:03 and 12:30. The Ombi logfile had the last timestamp. Nothing suspicious was found, only normal scheduled tasks running successfully.
I suspect a PSU fault, but I would like to rule out software causes first. Currently I don't have a spare PSU to swap in and test. Any ideas are welcome. 🙂

thebrain-diagnostics-20210828-1246.zip syslog-127.0.0.1.log
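
For anyone trying to pin down a crash window like the 8:03 to 12:32 one described above, a small script can list the longest silent stretches in a captured syslog. This is only a rough sketch: it assumes the classic `Mon DD HH:MM:SS` line prefix, an English locale for month names, and a log that does not cross a year boundary. The function names and the 30-minute threshold are made up for illustration.

```python
from datetime import datetime, timedelta


def parse_timestamp(line, year=2021):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a classic syslog line.

    Classic syslog lines carry no year, so one is assumed (hypothetical default).
    Returns None for lines that do not start with such a stamp.
    """
    try:
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def largest_gaps(lines, top=3, year=2021, min_gap=timedelta(minutes=30)):
    """Return up to `top` (gap, last_seen, next_seen) tuples, longest gap first."""
    stamps = [
        stamp
        for stamp in (parse_timestamp(line, year) for line in lines)
        if stamp is not None
    ]
    # Compare every entry with the one that follows it.
    gaps = [
        (later - earlier, earlier, later)
        for earlier, later in zip(stamps, stamps[1:])
    ]
    gaps = [gap for gap in gaps if gap[0] >= min_gap]
    return sorted(gaps, key=lambda gap: gap[0], reverse=True)[:top]


def report(path, **kwargs):
    """Print the longest silent stretches found in the log file at `path`."""
    with open(path, errors="replace") as handle:
        for gap, last_seen, next_seen in largest_gaps(handle, **kwargs):
            print(f"{gap} of silence: {last_seen} -> {next_seen}")
```

Calling `report("syslog-127.0.0.1.log")` on the attached capture would list the longest quiet periods. A long gap that ends with fresh boot messages is a reasonable candidate for the hang, although a quiet server can also produce long gaps on its own.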
September 3, 2021
Author

I've ruled out one more thing since my last post. I had 2 containers on the br0 custom network (Pihole and Unbound). Yesterday I stopped them, and today my server crashed again. After the next restart I will stop some more to narrow down whether they are the problem, but I doubt I can find the root cause this way.

If anyone has the slightest idea, don't hesitate to share.
September 3, 2021

Hi, we seem to be in the same boat. Have you tried different versions? With 6.9.0 it ran for more than an hour and then became unresponsive. If you check the terminal, the cursor is still flashing, but there is nothing in the syslog. 6.10.1 only runs for minutes and then hangs.
September 5, 2021

On 9/3/2021 at 4:51 PM, Gibbo592 said:
Seems to be similar. I will try going back to 6.8.3 and see what happens.

Keep me updated. I'm not home until tomorrow night to try it. Are you using the My Servers plugin? I'm curious whether this started around the time it was released. They did have issues with the API at first.
September 5, 2021

Still no luck. I have tried all the available versions as a trial, from 6.8.3 to 6.10. They all hang randomly: the cursor keeps flashing but the system won't respond, only a reboot works, and there are no errors in the syslog.

I installed several different OSes (Windows 10, Ubuntu Server, TrueNAS, and currently Slackware 14.2) and they all run fine, so I don't believe it's a hardware problem. It all points to Unraid itself. I'm trying to find an up-to-date guide on building a custom kernel, so I can remove all the AMD stuff and keep it generic Intel and Nvidia.
September 6, 2021

On 9/5/2021 at 4:59 PM, Gibbo592 said:
Still no luck. I have tried all the available versions as a trial, from 6.8.3 to 6.10. They all hang randomly: the cursor keeps flashing but the system won't respond, only a reboot works, and there are no errors in the syslog. I installed several different OSes (Windows 10, Ubuntu Server, TrueNAS, and currently Slackware 14.2) and they all run fine, so I don't believe it's a hardware problem. It all points to Unraid itself. I'm trying to find an up-to-date guide on building a custom kernel, so I can remove all the AMD stuff and keep it generic Intel and Nvidia.

Have you run a memtest for a number of hours yet? That's one thing I still need to try. I'm also running out of options.

Edit: Sorry, just realized that you did do a memtest.

Edited September 6, 2021 by mkono87
September 7, 2021

No, I have a trial key running with nothing on it, just to try and figure out what's going on.
September 8, 2021

@mkono87 I added BOOT_IMAGE=/bzimage initrd=/bzroot acpi=off. It has been running for 10 hours so far. Still lots of time errors, but at least it is alive.

Edited September 10, 2021 by Gibbo592
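
After changing boot parameters like the ones quoted above, it can be worth confirming that the running kernel actually received them. On Linux the active command line is exposed in `/proc/cmdline`. The sketch below is a simplified check: the helper names are made up for illustration, and the parser does not handle quoted values that contain spaces.

```python
from pathlib import Path


def kernel_params(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None.

    Simplification: quoted values containing spaces are not handled.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def running_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    return Path(path).read_text().strip()


def acpi_disabled(cmdline):
    """True if the given command line contains acpi=off."""
    return kernel_params(cmdline).get("acpi") == "off"
```

On the server itself, `acpi_disabled(running_cmdline())` should return `True` once the change has taken effect. If it returns `False`, the edit probably did not reach the boot entry that was actually used.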
September 11, 2021

On 9/8/2021 at 4:42 PM, Gibbo592 said:
@mkono87 I added BOOT_IMAGE=/bzimage initrd=/bzroot acpi=off. It has been running for 10 hours so far. Still lots of time errors, but at least it is alive.

I realized that my BIOS was way out of date. I updated it, and it's been running for 3 days so far. I'm not in the clear yet, but it's a good sign. My appdata drive appears to have an XFS corruption error, so I need to repair that at some point too.