January 14, 2019 Hello, I hope someone here has an idea of what's going on, because I am all out! Every 6 to 48 hours, the server becomes unresponsive. By unresponsive, I mean:
1. It will not respond to pings
2. Shares are not accessible
3. Docker containers are not available
4. The host itself is unresponsive (I cannot even enter a username locally to take a look)
The only resolution I have found is a cold reboot of the server. If there is any diagnostic information that could help, please let me know and I will attach it. Thanks, DaemonHunter
January 14, 2019 Community Expert Go to Tools - Diagnostics and attach the complete diagnostics zip file to your next post. Do you have a monitor and keyboard attached? Have you done a memtest?
January 14, 2019 Author Yes, I have a keyboard/mouse attached. Both are unresponsive, as if the system locked up. I have not done a memtest yet. Would you recommend that as a start? Thanks, DaemonHunter steelmountain-diagnostics-20190114-1416.zip
January 14, 2019 Another test. Boot up with a monitor and keyboard attached and load up the console (not the GUI). Log in and then type the following command:
tail -f /var/log/syslog
This will start printing the system log to the screen. Then, when the system next crashes, take a picture of what's on that screen before rebooting.
January 14, 2019 41 minutes ago, DaemonHunter said: Yes I have a keyboard/mouse attached. Both are unresponsive as if system locked up. I have not done a memtest yet. Would you recommend that for start? Thanks, DaemonHunter steelmountain-diagnostics-20190114-1416.zip

You have a keyboard/mouse attached, but no monitor? If you can attach a monitor, it will help in troubleshooting.

Your diagnostics were taken after the last reboot, so there is nothing in the syslog related to what might have caused the previous crash. The syslog is stored in RAM, so it is reset on every reboot and everything logged before then is lost.

Try running 'tail -f /var/log/syslog' on the console or in an open terminal window (leave it open). This actively monitors the syslog, so the last few lines will still be on your console monitor/terminal window when the server locks up and requires a reboot. Before rebooting, take a picture of the monitor screen and post that. Perhaps there will be a clue in what was last logged before the lockup.

Have you noticed any call traces in previous syslogs? Do you have any dockers assigned custom IP addresses? I ask because this produced call traces/server lockups on my main server until I discovered a solution.

As trurl mentioned, a memtest will determine if you have any memory issues. Run it for at least 24 hours. It's an option in the unRAID boot menu, but you need a monitor on the server to see what is going on during a memtest. I don't think the version included with unRAID has file logging.

Edited January 14, 2019 by Hoopster
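Because the syslog lives in RAM and is lost on reboot, an alternative to photographing the screen is to mirror the log to storage that survives a cold reboot. The following is only a minimal sketch and not something posted in this thread: the helper name `persist_log` and the destination path are made up for illustration, and it assumes `/boot` is where the flash drive is mounted on your server. Frequent writes wear flash media, so treat this as a temporary diagnostic aid.

```shell
# Hypothetical helper: mirror a log file to a second file as it grows.
# Both default paths are assumptions; adjust them for your system.
persist_log() {
    local src="${1:-/var/log/syslog}"
    local dest="${2:-/boot/logs/syslog-persist.txt}"
    mkdir -p "$(dirname "$dest")"
    # -n +1 copies the file from its first line; -F keeps following it by name.
    tail -n +1 -F "$src" >> "$dest" 2>/dev/null &
    # Print the background PID so the caller can stop the copy later.
    echo "$!"
}

# Usage (not run automatically):
#   pid=$(persist_log)   # start mirroring
#   kill "$pid"          # stop mirroring
```

After the next lockup and reboot, the last lines of the mirrored file should show what was logged just before the hang, assuming they were written out before the system froze.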
January 14, 2019 Author Sorry, I guess I should read my response before sending it. Yes, I have a monitor attached as well. I am tailing the log files currently and will let you know the outcome when it locks up. @Hoopster I have not noticed any call traces in previous syslogs. All dockers are assigned static IPs. Thanks, DaemonHunter
January 19, 2019 Community Expert Maybe a coincidence, but a couple of those screens show a GPF (general protection fault) with the Plex transcoder. Which Plex are you using? Does it happen if you don't run Plex? What about that memtest?
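For anyone going through saved syslog files rather than screen photos, a quick search for common kernel crash signatures can narrow things down. This is a rough sketch under assumptions: the function name `find_crash_lines` is hypothetical, and the pattern list covers only a few well-known kernel messages, so an empty result does not prove the log is clean.

```shell
# Hypothetical helper: print numbered lines from a saved syslog that mention
# common kernel crash signatures (case-insensitive match).
find_crash_lines() {
    grep -n -i -E 'general protection|call trace|kernel panic|oops' "$1"
}

# Usage (the file name is an example):
#   find_crash_lines syslog_1.txt
```

The process name on a matching line (for example a transcoder binary) is often the first clue about which workload was running when the fault occurred.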
January 19, 2019 Author I am using linuxserver/plex. I will be running the memtest over the next 24 hours.
January 19, 2019 Author Could a docker container be causing it? The only recent change I have made is adding the H265ize docker container, but I have not had a chance to set it up. I noticed that it was running and using 80% of the CPU? Before the memtest, I think I'm going to delete this container and see if the issue persists. Is there a way to make sure everything that the container installed is removed?
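To check which containers are actually busy before deleting anything, the output of `docker stats` can be filtered by CPU percentage. A small sketch, assuming the Docker CLI is available on the host; the function name `busy_containers` and the default threshold of 50 are arbitrary choices for illustration.

```shell
# Hypothetical helper: read "name cpu%" pairs on stdin and print the
# containers whose CPU usage is at or above the given threshold.
busy_containers() {
    awk -v limit="${1:-50}" '{
        cpu = $2
        sub(/%/, "", cpu)              # strip the trailing percent sign
        if (cpu + 0 >= limit + 0) print $1, $2
    }'
}

# Usage (requires Docker; not run automatically):
#   docker stats --no-stream --format '{{.Name}} {{.CPUPerc}}' | busy_containers 50
```

Note that the percentage Docker reports can exceed 100% on multi-core systems, so the threshold is relative to a single core.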
January 19, 2019 Community Expert Not familiar with that container, but assuming it behaves like a container, if you just stop it then it shouldn't matter if you remove it or not since it is "contained".
February 12, 2019 Author UPDATE: I have run a memtest with no issues, and formatted and reinstalled the unRAID flash drive. The issue seems to be the Plex transcoder? I have posted files of what I am seeing in the syslog as well as a recent diagnostics. steelmountain-diagnostics-20190212-0031.zip syslog_1.txt syslog_2.txt syslog_3.txt syslog_4.txt syslog_5.txt syslog_6.txt syslog_7.txt syslog_8.txt syslog_9.txt
May 3, 2019 Hey mate, did you end up finding a fix? My server is doing the same thing, and I think Plex may have something to do with it as I've exhausted almost all other options.
May 3, 2019 Author I'm not sure if this is what did it, but I disabled the CPU governor in my BIOS and all my issues went away.
May 4, 2019 13 hours ago, DaemonHunter said: I'm not sure if this is what did it, but I disabled the CPU governor in my BIOS and all my issues went away.

Cheers. Do you know what this setting may be called on different boards? I have an ASRock board and can't find anything similar (I'm looking under Advanced > CPU Config > North Bridge Config, etc.).
July 19, 2019 I've been having this same issue. I have an AMD processor with an ASUS mobo. Where in the BIOS is the setting you are talking about?
July 19, 2019 Author In my case, it was the Cool'n'Quiet setting that needed to be disabled. Once that was disabled, everything started working fine.
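Cool'n'Quiet is AMD's CPU frequency scaling feature, so one way to see from the running system whether frequency scaling is active is to look at the Linux cpufreq entries in sysfs. This is a hedged sketch only: the function name `show_governors` is hypothetical, and on systems where scaling is disabled in the BIOS or no cpufreq driver is loaded, these files may simply be absent.

```shell
# Hypothetical helper: list the cpufreq scaling governor for each CPU core.
# The optional argument overrides the sysfs root (useful for testing).
show_governors() {
    local root="${1:-/sys/devices/system/cpu}"
    local found=0 f cpu
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$f" ] || continue        # skip if the glob matched nothing
        found=1
        cpu=${f#"$root"/}              # drop the root prefix
        cpu=${cpu%%/*}                 # keep only the "cpuN" component
        printf '%s %s\n' "$cpu" "$(cat "$f")"
    done
    [ "$found" -eq 1 ] || echo "no cpufreq entries found"
}

# Usage:
#   show_governors
```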
July 19, 2019 I had the same problem on a server. The solution was to move all dockers onto their own VLAN, separate from the host. If your network is able to support a VLAN, it's definitely worth a try.
November 15, 2019 On 7/19/2019 at 11:12 AM, dstark4 said: I had the same problem on a server, the solution to it was to move all dockers on to their own vlan separate from the host. If your network is able to support a vlan it's definitely worth a try.

Is there any documentation on this? Why did this fix it?