Haas360 Posted August 1, 2017
Hello, I have an Unraid system that seems to crash about once a week, at random times. When this happens, it takes my entire network down. I am an IT administrator, so I know it could be related to a broadcast storm, but the fact that Unraid also freezes locally on the monitor leads me to believe we have a bigger issue. I am about to purchase the product but won't unless we can fix this bug. Thanks!
Frank1940 Posted August 2, 2017
It seems to me that I have heard of a couple of other cases of this happening. What happens if you disconnect the Cat 5 cable from the server after the network lockup? Will this unlock the server? You should probably also install the 'Fix Common Problems' plugin and turn on its troubleshooting mode. Hopefully, it will catch something that provides some insight into what is going on.
Haas360 (Author) Posted August 2, 2017
Disconnecting the cable unfreezes my entire network instantly; however, the console on Unraid will NOT come back. I have also disabled C-states to see if that fixes anything. No difference so far. Something I have noticed: I have been running for 4 days with my W10 VM disabled, with no crashes so far. Might be related?
jonp Posted August 3, 2017
Please obtain your system diagnostics from the Tools > Diagnostics page in the webGui and post them here. This will give us more insight into your system configuration.
HereToStay Posted August 3, 2017
I just had this exact same thing happen to me the other day. I've been running this Unraid server for 4 or 5 years without incident (extremely stable). For whatever reason, it became unresponsive and took out my network. I didn't realize the two were connected until after I got it back up and running. I ended up having to take it down hard and run a parity check (8 hours), and everything seems fine now.
lionelhutz Posted August 4, 2017
8 hours ago, jonp said: "Please obtain your system diagnostics from the tools > diagnostics page in the webgui and post them here. This will give us more insight into your system configuration."
You can check this out while you're at it. I have had the same issue for a while now. I'm not convinced it's a crash, because a crash wouldn't flood the network. So I suspect it's some kind of issue with a process taking over all the resources to the point that nothing else will work. I have limited the Docker container resources and updated the motherboard BIOS a number of times, and it's finally been more stable, but I don't believe it's perfect yet. I have tried tailing the syslog locally to catch it, and there is nothing. Here is a previous thread where I reported this issue too.
Attachment: mediaserver-diagnostics-20170803-2147.zip
Haas360 (Author) Posted August 20, 2017
On 8/3/2017 at 11:46 AM, jonp said: "Please obtain your system diagnostics from the tools > diagnostics page in the webgui and post them here. This will give us more insight into your system configuration."
It seems many others are having issues with this as well. Here are the diags. Please respond ASAP.
Attachment: tower-diagnostics-20170820-0223.zip
hodkenneth Posted August 22, 2017
I have had this a few times myself. I thought it was a VM or a Docker container, so I tried disabling different ones, to no effect. It completely locks up the server. There are a few things that I've noticed looking at the logs: http://prntscr.com/gbhgis
There is a bunch of enables/disables and promiscuous/blocking state changes on boot-up. Maybe after a while something gets out of whack and locks a port? Also, I've noticed FTP keeps turning back to Enabled on every reboot, even after I change it to Disabled.
Attachment: cloud-diagnostics-20170822-0047.zip
Cullgale Posted August 22, 2017
Hi all, this seems to be a common theme, as I am having the same problems (PS: I am really new at this Unraid 6 and forum stuff). In my case I am loading a 6TB stack with movies and videos from a Patriot Valkyrie and other drives. The network link drops out for no apparent reason. I was getting call trace errors; these appear to have stopped but have been replaced with wrong CSRF token errors.
I recently loaded cAdvisor into the Docker summary and noted that although the Dashboard indicated <5% memory usage, the dial indicator in cAdvisor zoomed up to 95% (I have 16GiB), and when it hits 99% the network drops out, with a 50:50 chance of the server freezing (see the zip of the Word doc with screenshots, which also has the system overview). This also happens when watching 1080p video through Plex, but it is OK for 720p or lower. I am running Plex, Dolphin and cAdvisor in Docker. I would appreciate any advice or solutions.
Erratum: They say when all else fails, read the instructions. My Docker virtual disk is at /mnt/user/docker.img, which the manual says is not a valid location. Can I drag and drop this to the cache disk with Dolphin, or do I start from scratch by disabling Docker and setting it up again from the start?
Attachments: tower-syslog-20170822-1606 (1).zip, tower-diagnostics-20170822-1607.zip, Unraid 6 issues.zip
Cullgale Posted August 22, 2017
Well, I really mucked that up. Without waiting for advice, I disabled Docker, reset the Docker path to /mnt/cache/docker/docker.img, and deleted the previous image. Now I get these messages:
Warning: stream_socket_client(): unable to connect to unix:///var/run/docker.sock (Connection refused) in /usr/local/emhttp/plugins/dynamix.docker.manager/include/DockerClient.php on line 673
Couldn't create socket: [111] Connection refused
Warning: Invalid argument supplied for foreach() in /usr/local/emhttp/plugins/dynamix.docker.manager/include/DockerClient.php on line 853
And any attempt to install Plex, Dolphin or cAdvisor fails.
Haas360 (Author) Posted August 30, 2017
On 8/3/2017 at 11:46 AM, jonp said: "Please obtain your system diagnostics from the tools > diagnostics page in the webgui and post them here. This will give us more insight into your system configuration."
Hello sir. Can we get an update?
jonp Posted September 6, 2017
Just reviewed your logs and I can't seem to find any smoking gun. My guess is that these diagnostics come from a time when unRAID had booted but hadn't crashed. A few things on this issue:
#1: We have not been able to reproduce this internally. This is by far the biggest problem preventing us from further troubleshooting / diagnosis. If we can't recreate it, then we can't fix it. I think the reason we can't recreate it is that it's hardware-specific to either the network equipment or the server gear (or maybe both) being used.
#2: Could be an ARP storm. Log in to the webGui and check your network settings (click Help if need be). There is a note about ARP storms in the Help on the network settings page, and I believe that may be the cause of your issue. Try toggling this setting and see if that helps.
#3: If all else fails, attach a monitor and keyboard directly to the server and type "tail -f /var/log/syslog". When it crashes, take a picture of what's on the screen and post it here. This is the only way we can see what happened right before a complete server crash.
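Watching the syslog on a local monitor only helps if someone is at the console when the lockup happens. As a minimal sketch of an alternative, the helper below copies the last lines of a log to a second location so they can be read after a hard reset. It assumes the Unraid flash drive is mounted at /boot and is writable; the function name snapshot_syslog, the destination path, and the 30-second interval are all hypothetical choices for illustration, not Unraid features.

```shell
#!/bin/sh
# Hypothetical helper (not part of Unraid): save the last N lines of a log
# to a file on persistent storage so they survive a hard reset.
# Usage: snapshot_syslog SRC DEST [LINES]
snapshot_syslog() {
    src="$1"
    dest="$2"
    lines="${3:-200}"   # default to the last 200 lines
    # Write to a temporary file first so a crash in the middle of a write
    # cannot destroy the previous snapshot, then rename it into place and
    # flush buffers so the data actually reaches the disk.
    tail -n "$lines" "$src" > "$dest.tmp" && mv "$dest.tmp" "$dest" && sync
}

# Example loop, left commented out. The destination path is an assumption
# and the directory must already exist:
# while true; do
#     snapshot_syslog /var/log/syslog /boot/logs/syslog-last.txt 200
#     sleep 30
# done
```

Writing to the flash drive this often adds wear, so a loop like this is probably best run only while chasing the lockup and removed afterwards.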
lionelhutz Posted September 7, 2017
Well, after a server crashes and doesn't respond, it's rather hard to get the syslog.
#2 - I don't see any ARP storm settings on the Network settings page. Is this new in the beta versions?
#3 - For me this didn't produce any results. No new lines appeared before the crash/lockup.
brando56894 Posted September 7, 2017
I've noticed multiple times that tail stops following the log after a few hundred lines and has to be restarted.
Squid Posted September 7, 2017
Are you running the tail via SSH (PuTTY) or directly on the local monitor / keyboard?
brando56894 Posted September 7, 2017
Most of the time over SSH, sometimes locally. This isn't specific to unRAID; I've seen it on all flavors of Linux, and I think I've experienced it on FreeBSD also.
jonp Posted September 7, 2017
16 hours ago, lionelhutz said: "Well, after a server crashes and doesn't respond it's rather hard to get the syslog. #2 - I don't see any ARP storm settings on the Network settings page. Is this new in the beta versions??? #3 - For me this didn't produce any results. No new lines appeared before the crash/lockup."
It looks like I misspoke about the setting. I don't see it either after having reviewed a few test systems.
4 hours ago, brando56894 said: "I've noticed multiple times that tail stops following the log after a few hundred lines and has to be restarted."
I've never had this problem tailing a syslog locally using a monitor and keyboard.
rottenpotatoes Posted September 24, 2019
I'm sorry to resurrect an old thread, but I'm having this issue also. Should I continue here and post my diagnostics, screenshots, and logs, or should I open a new thread?
Frank1940 Posted September 27, 2019
I would start a new thread...
This topic is now archived and is closed to further replies.