limetech Posted August 24, 2015
Quoting goinsnoopin: Limetech, no Dockers or plugins are running; this is completely stock unraid 6.01. I am not sure how to define the "crash" other than that the webgui becomes unresponsive. If I go to tower, my web browser's wait icon just spins. Look at the attached htop screenshot (you may need to download and zoom in to read): https://www.dropbox.com/s/wuyb33vm4zgr1kt/unraidhtop.jpg?dl=0 Memory and CPU both show high utilization; maybe that is keeping the webgui from running properly? Check out the attached unraid_crash_ps-A.txt file, which shows the output of ps -A and ps -ef. For the first time I counted: there are 255 instances of smbd running. Why would this many processes be running? Thanks again for your assistance! Dan
Yes, a large number of smbd processes does not seem right. What is using the storage? Meaning, I'm guessing you have a Windows PC or other OS accessing unRaid shares. What exactly is it?
EGOvoruhk Posted August 24, 2015
My WebUI crashes about once every two or three months as well, and has done so for as long as I can remember. The shares still work, so I usually end up leaving it for a day or two, because it'll want to do a parity check once I force a restart.
goinsnoopin (Author) Posted August 24, 2015
Limetech, I have one single Windows 7 Professional 64-bit PC connecting to unraid via Samba. That Windows 7 PC is running SageTV, which records TV shows to unraid, and Plex, which I use to play back media on several devices: tablets, smart TVs, and a couple of Roku players. I am utilizing a cache drive, and the mover script moves things at 3:00am. Any suggestions on how to troubleshoot this?
limetech Posted August 24, 2015
One thing to check first. Please run a 'File system check' on your devices. Stop the array and then Start in Maintenance mode. You can then click each device on the Main page and click the 'Check' button under the Check Filesystem Status section. Assuming all file systems are good, the next thing would be to have a terminal window open with htop running. Then fire up the applications you're running on your Windows PC and see if you can observe any correlation between something being started and the number of 'smbd' processes created.
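One way to look for that correlation without watching htop continuously is to log the smbd process count at intervals and compare the timestamps with when each application was started. The following is only a minimal sketch, assuming a procps-style ps on the server console; the function names, log path and 60-second interval are illustrative and not anything unRaid provides.

```shell
#!/bin/sh
# Count lines on stdin that exactly match the given process name.
count_named() {
  # grep -c prints the number of matching lines; it exits non-zero when
  # the count is 0, so "|| true" keeps the function's status clean.
  grep -c -x -- "$1" || true
}

# Count running smbd processes ("ps -e -o comm=" prints one command name per line).
smbd_count() {
  ps -e -o comm= | count_named smbd
}

# Append "timestamp count" to a log file at a fixed interval.
# Both arguments are hypothetical defaults chosen for this sketch.
log_smbd_counts() {
  logfile="${1:-/tmp/smbd_count.log}"
  interval="${2:-60}"
  while :; do
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$(smbd_count)" >> "$logfile"
    sleep "$interval"
  done
}
```

Something like `log_smbd_counts /tmp/smbd_count.log 60 &` would start it in the background. A file under /tmp is presumably lost on a forced reboot, so a path on persistent storage may be a better choice if the log needs to survive a crash.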
goinsnoopin (Author) Posted August 25, 2015
Limetech, I checked each disk in the array and the cache drive, and the file system check reported no corruption on any of the drives. Earlier you asked about what was connecting to unraid. I thought of two other applications: Lightroom (I import photos directly to a photo share on unraid), and MCEBuddy, which I use to convert my SageTV recordings from MPEG-2 to MPEG-4. Please keep in mind that all of these applications, including Plex and SageTV mentioned earlier, were present and working with unraid 5 for well over a year. The crashes started as soon as I upgraded to unraid 6.01. Is there any other way to log what is spawning the smbd processes? I can try what you are suggesting, but the crashes seem random, and I don't know what sets them off. I can tell you that all but one of them occurred during the night after the mover script ran, as I always see that in my syslog just before the GUI becomes unresponsive. Dan
goinsnoopin (Author) Posted August 26, 2015
Limetech, I may have caught my server on its way to a crash. Normal memory usage (2GB on my server) is about 10 to 15% in htop. When I got home from work, htop was showing 40% usage, and as I watched it slowly increased (overall memory usage went from 500 MB to 760 MB in an hour or so). There were a bunch of smbd processes listed in htop. The webgui was very sluggish, but after a minute or so it refreshed. During that hour I closed Plex, SageTV, MCEBuddy and Lightroom on the Windows 7 PC and ended any associated services to ensure that it was not using any Samba connections, and the memory usage still climbed. I then shut down the Windows 7 PC for two hours or so; the memory usage did not climb further and stayed at 760 MB overall. Ultimately I tried to shut down unraid via the webgui, but it did not shut down, so I had to do a manual reboot. Is there any way to use smbtree to debug the Samba connection? Does unraid kill unused processes after a certain amount of time (my Win 7 PC was off for 2 hours and no processes were killed)? Does unraid limit the number of smbd processes? If not, can it, until we understand what is causing them to spawn? Thanks, Dan
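On the last two questions, Samba itself has global settings that bear on them, independent of whatever unraid ships as defaults: `max smbd processes` caps how many smbd processes may run at once (0 means no limit), and `deadtime` is the number of minutes of inactivity after which an idle connection with no open files is dropped (0 disables it). The fragment below is only a sketch with illustrative values. Where it would go is an assumption here (an extra Samba settings file such as /boot/config/smb-extra.conf, if this release supports one), and nothing in this thread confirms that either setting addresses the problem.

```ini
[global]
    # Illustrative cap; once reached, new connections are refused.
    max smbd processes = 50
    # Illustrative idle timeout in minutes for connections with no open files.
    deadtime = 15
```

A cap like this would at most contain the symptom while the cause of the spawning is tracked down; it would not explain why the processes accumulate.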
pickthenimp Posted August 27, 2015
Dan, I know we were having similar issues. FWIW mine has not locked up since upgrading to 6.1-RC6. Maybe give that a go since you are running barebones anyway?
goinsnoopin (Author) Posted August 27, 2015
Thanks for the info. Just curious, how long have you been running 6.1 RC6? For me the crash has occurred anywhere from 18 hours to 4 days after a reboot.
pickthenimp Posted August 27, 2015
I lied; I spoke too soon. Everything was running smoothly for 2.5 days (which was a record for me). Just went to check my uptime and noticed "Mover is running..." Now I can't hit any of my shares again. Bloody hell. I will update my post with new syslog info. I feel your pain.
goinsnoopin (Author) Posted August 28, 2015
Update: the latest crash shows two processes in htop that are each utilizing 100% of CPU. Judging by their start times, they started around the time that the mover script runs. Both processes have the same command:
/usr/local/sbin/shfs /mnt/user0 -disks 494 -o noatime,big_writes,allow_other
There are also 100+ smbd processes running. As usual, any suggestions are appreciated! Dan
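To tie a busy process to the mover window, it can help to list CPU-heavy processes together with their start times. This is a rough sketch, assuming a procps-style ps; the function names and the default threshold of 90 are illustrative. Note that ps reports %CPU as CPU time averaged over the life of the process, not the instantaneous figure htop shows, so the threshold is only a coarse filter.

```shell
#!/bin/sh
# Keep only input lines whose first field (%CPU) is at or above the threshold.
busy_procs() {
  awk -v min="$1" '$1 + 0 >= min + 0'
}

# List %CPU, PID, full start time and command line for busy processes.
# "lstart" prints the start time as a full date, which can be compared
# with the time the mover script is scheduled to run.
top_cpu() {
  ps -e -o pcpu= -o pid= -o lstart= -o args= | busy_procs "${1:-90}"
}
```

Calling `top_cpu 50` would print every process at or above 50% average CPU, one per line.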
bonienl Posted August 28, 2015
Have you been using v6.0rc13 in the past together with AFP? If yes, then this topic may help.
goinsnoopin (Author) Posted August 28, 2015
Bonienl, I went from unraid 5 directly to 6.01. I never installed any release candidates. Also, I have never used AFP. Dan
goinsnoopin (Author) Posted September 1, 2015
Still crashing: less than 24 hours this time, and nowhere near the mover script's run time. It's crashed now. Memory is only at 325 MB used, but htop keeps showing the following line as using 200% CPU. The command is a little different from the one a few posts ago:
/usr/local/sbin/shfs /mnt/user0 -disks 495 2048000000 -o noatime,big_writes,allow_other -o remember=330
There are also 50+ smbd processes running; more instances of smbd keep showing up and memory usage keeps increasing. When I rebooted my Windows 7 machine the memory freed up, but the same process above keeps indicating 200% CPU. I will manually reboot again so I can get the diagnostics file. Again, this never happened in Unraid 5. I am really frustrated. Dan
limetech Posted September 2, 2015
Please upgrade to 6.1 stable. Probably won't fix the issue but I can only debug that version.
goinsnoopin (Author) Posted September 2, 2015
Thanks Tom. I will need to do a manual reboot, which will kick off a parity check. I assume that I should let this complete before performing the upgrade? Unless I hear otherwise, that is what I will do. I didn't realize 6.1 stable was out. Dan
limetech Posted September 2, 2015
In general, to shut things down from the command line this should do it (logged in at console or telnet session):
/etc/rc.d/rc.docker stop
samba stop
sync
umount /mnt/*
mdcmd stop
If that doesn't do it you might have other programs hanging onto file handles, or VMs or plugins that are not shut down.
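If one of those steps hangs, the whole session can be tied up. The sketch below wraps the same five commands, in the same order, so that each gets a time limit and a failure stops the sequence. It assumes GNU coreutils `timeout` is available; `DRY_RUN`, `STEP_TIMEOUT` and the function name are hypothetical names for this example, and by default it only prints what it would run.

```shell
#!/bin/sh
# Run the shutdown steps from the post above in order, one per heredoc line.
# DRY_RUN=1 (the default in this sketch) prints the steps without executing them.
# STEP_TIMEOUT is the per-step limit in seconds (hypothetical default: 120).
run_steps() {
  while IFS= read -r step; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "would run: $step"
      continue
    fi
    echo "running: $step"
    # stdin is redirected so the step cannot consume the remaining list.
    # timeout(1) exits with status 124 when the limit is reached.
    timeout "${STEP_TIMEOUT:-120}" sh -c "$step" </dev/null
    rc=$?
    if [ "$rc" -ne 0 ]; then
      echo "step failed or timed out (exit $rc): $step" >&2
      return "$rc"
    fi
  done <<'EOF'
/etc/rc.d/rc.docker stop
samba stop
sync
umount /mnt/*
mdcmd stop
EOF
}
```

Running `DRY_RUN=0 run_steps` would execute the steps for real. Stopping at the first failure is a design choice here, so that later steps do not run against a half-stopped system. A time limit will not help if a process is stuck in uninterruptible kernel sleep, since it cannot be killed in that state.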
goinsnoopin (Author) Posted September 2, 2015
Upgraded to Unraid 6.1 this morning; everything went smoothly. I have my fingers crossed! I will report back. It's just a waiting game at this point. Thanks again.
goinsnoopin (Author) Posted September 3, 2015
Limetech, after 8 hours Unraid 6.1 crashed, with exactly the same symptoms as described in this thread for unraid 6.01. I attempted the manual shutdown you described. I typed the first command in a telnet console, and telnet locked up and the command never finished:
/etc/rc.d/rc.docker stop
I kept a log window open to see if it captured anything; attached is a screen capture of that window. At the end of the screen capture you will see 3 instances of telnet:
Instance #1: the rc.docker stop above that never completed.
Instance #2: I typed diagnostics and got the message "Starting diagnostics collection", but it never completed.
Instance #3: my telnet login this morning, just to see if I could connect to unraid and find out what was going on, since instances 1 and 2 locked up my telnet sessions.
It's interesting to note that unraid is still functioning in some manner, as the third instance this morning shows up in the log window. I will also include a .jpg screenshot of htop showing the 200% CPU usage.
https://www.dropbox.com/s/vjo0bs6bcjnhahz/unraid_logwindow.jpg?dl=0
https://www.dropbox.com/s/wuyb33vm4zgr1kt/unraidhtop.jpg?dl=0
Please note that prior to instance #1 above, I did run diagnostics in a telnet console and it completed, so I should have a diagnostics file for this crash. I have not manually rebooted yet (which I need to do to get this diagnostics file, unless someone can tell me an easy way to grab it or send it somewhere via the command line). I am always hesitant to manually reboot in case there is something that someone on the forums would like me to try in this crashed state. Again, any help is appreciated!
goinsnoopin (Author) Posted September 3, 2015
Interesting development: I left my unraid 6.1 server in the crashed state with a log window open, and additional items were written to the log many hours after the original crash. I am hoping this will provide additional details to work towards a solution and/or a better understanding of what is happening. Look at the log starting at 9:20 this morning. There is a lot going on, starting with the line below:
Sep 3 09:20:20 Tower kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
My telnet session stopped responding at 13:08. I had htop running, and CPU usage was no longer at 100%, more like 28%. I tried to close htop and it would not return to the command prompt. I attempted to open a new telnet session and it no longer connects. I have left Unraid in this state. Dan
Attachment: unraid_logwindow9_3_15.txt
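Since the telling log lines can show up hours after the GUI stops responding, it may be worth pulling kernel fault lines out of the syslog with their line numbers, so the first one can be located quickly. This is a small sketch, assuming the log is readable at /var/log/syslog; the function name is made up for the example and the pattern list is illustrative, not exhaustive.

```shell
#!/bin/sh
# Print kernel fault lines from a syslog stream on stdin, prefixed with
# their line numbers. "|| true" keeps the status clean when nothing matches.
kernel_faults() {
  grep -n -E 'kernel: (BUG:|Oops:|Call Trace:)' || true
}

# Example (path is an assumption): kernel_faults < /var/log/syslog
```

The line number of the first hit gives a starting point for reading the surrounding entries.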
goinsnoopin (Author) Posted September 9, 2015
I just saw the Unraid 6.1.2 changelog and am hoping that this version may address my smbd processes that are staying open. I just upgraded, so we will have to wait and see. Dan
pickthenimp Posted September 9, 2015
Good luck Dan. I still feel your pain.
bardsleyb Posted September 9, 2015
I have been having issues with the WebUI hanging and not responding as well. I can always telnet in though. My shares DO go offline when the WebUI hangs, so I cannot access any data at all. I also cannot get into any Docker apps that are running when the WebUI hangs. They just spin in the browser and eventually time out. Plex seems to die too and become inaccessible. I will bookmark this forum post and keep watching it. I have to force an unclean shutdown anywhere from every 8 hours to every 4 or 5 days as well. I will post diagnostic files here in this thread when/if it happens again. I will also be certain to install updates as soon as they come out from Limetech. I run quite a bit on my server as far as Dockers and VMs go, but I would be willing to stop almost all of them for debugging and finding a root cause for this issue. My wife would probably scream at me if I left the Plex Docker off, as this is our only form of watching TV now that I cut the cord. I am sure I could leech off a few friends' servers while we resolve this issue though. Best of luck to us all!
itimpi Posted September 9, 2015
Have you done the upgrade to 6.1.2, which includes a fix for GUI hangs and has been getting positive feedback on the fix? Submitting any reports for an earlier release would almost certainly result in being told to upgrade to 6.1.2 to see if it still happens.
bardsleyb Posted September 9, 2015
No, but this will be done as soon as I get home from work. I'm still running 6.1 right now. Thanks!
jonp Posted September 10, 2015
I'm actually going to close this thread as the issue here has been resolved. If folks are still experiencing a webUI issue after upgrading to 6.1.2, please open a new thread. Thanks!