December 1, 2015: Ever since I upgraded to v6 I have had pretty regular webGUI crashes. When this happens my shares are still available and I can telnet to the server and run commands. The only way to get the webGUI back is to reboot. Once I reboot, the webGUI is available for 15-30 minutes or so, then it crashes again. This weekend I upgraded my RAM from 2 GB to 4 GB in hopes of fixing this. No dice. The crashes seem to happen even when all of my Docker apps are stopped (sab, sick & plex). I have attached my syslog. Any help would be appreciated. FWIW, I never had a single unRAID crash on this hardware until I upgraded to v6. Thanks! JD
Attachment: syslog.txt
December 1, 2015 (Community Expert): Have you done a memtest? For v6, instead of attaching a syslog, it is preferred that you go to Tools - Diagnostics and attach the complete diagnostics zip.
December 1, 2015 (Author): Memtest in progress. So I should reboot (after the crash) and run a diag? Would that show the issue, though, since I will already have rebooted? I didn't know if it was similar to the syslog, which clears with a reboot.
December 1, 2015 (Community Expert): For your case, it would be better to get one after a crash. From the command line, type diagnostics and get the resulting diagnostics zip from your flash drive.
December 2, 2015 (Author): Memtest came back clear. Diag attached.
Attachment: thevault-diagnostics-20151201-1836.zip
December 2, 2015 (Community Expert): Was this diagnostic taken after a crash?
December 2, 2015 (Community Expert): ps shows the webGUI (emhttp) is running. Not sure what to suggest. Maybe someone else will have a look.
December 2, 2015: The webGUI is not really crashing; it becomes unresponsive due to an incomplete session transaction. What is your setting for the NTP server (see Settings -> Time and Date)? The syslog complains that the time is unsynchronized, and the webserver (emhttp) made an outgoing request to a Google domain, on which it gets stuck and becomes unresponsive:

192.168.0.29:80->66.249.67.224:33381 (CLOSE_WAIT)

Also, one of your Dockers failed installation:

level=error msg="Failed to load container 074a3d1af3167da4db9fab2dffc699cb3162acd68e9e794f77cf1cfa09c121a7
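For readers unfamiliar with the state named above: CLOSE_WAIT means the remote side has closed its end of the TCP connection, but the local application has not yet closed its own socket. A minimal Python sketch for spotting such connections follows. It assumes the input consists of lines in the same "local->remote (STATE)" form as the one quoted in the post; the function name find_close_wait is hypothetical.

```python
import re

# Matches connection lines in the "local->remote (STATE)" form quoted in the post.
CONN_RE = re.compile(
    r"(?P<local>\d{1,3}(?:\.\d{1,3}){3}:\d+)->"
    r"(?P<remote>\d{1,3}(?:\.\d{1,3}){3}:\d+)\s+\((?P<state>[A-Z_0-9]+)\)"
)


def find_close_wait(text):
    """Return (local, remote) pairs for every connection in CLOSE_WAIT."""
    return [
        (m.group("local"), m.group("remote"))
        for m in CONN_RE.finditer(text)
        if m.group("state") == "CLOSE_WAIT"
    ]


# The sample line is copied from the post; nothing else is assumed about the log.
sample = "192.168.0.29:80->66.249.67.224:33381 (CLOSE_WAIT)"
print(find_close_wait(sample))
```

A connection that stays in this list across repeated checks would be consistent with the stuck session described in the post, though it does not by itself prove the cause.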
December 2, 2015:
18:10:53 Array is started
18:10:56 Array is up, parity check started
18:10:59 Docker is started
18:11:02 Network address 172.17.42.1 is provided to a Docker container
18:11:03 Ntp sets up listener on that address (172.17.42.1)

But according to docker.log:

time="2015-12-01T18:11:02.711330656-08:00" level=error msg="Failed to load container 074a3d1af3167da4db9fab2dffc699cb3162acd68e9e794f77cf1cfa09c121a7: EOF"

So it looks like you have one Docker container, and it's not working: it could not be loaded, but a network address was assigned and NTP tried to use it. That seems like a problem. Try disabling Docker support and see whether the crashes stop. Then try repairing, replacing, or recreating the container, or docker.img itself.

Edit: bonienl beat me!
December 3, 2015 (Author): Disabled Docker and deleted all Docker images. Turned off NTP and rebooted. The webGUI locked up a few minutes after starting the array. New diag attached.
Attachment: thevault-diagnostics-20151202-2002.zip
December 3, 2015: Just like the previous diagnostics, there is absolutely NO evidence of a crash! The server appears to be running fine. Can you connect from a different machine? You telnetted in to it; did it appear crashed in the telnet session? Can you try connecting with another browser?

One minor issue, almost certainly not related: you have a share named Training and another named training. That's allowed in Linux but not in Windows, where it causes problems, and it is a source of confusion everywhere. I would decide which one you want, move the files out of the unwanted one, and delete it.
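The Training/training clash mentioned above can be checked for mechanically. The following is a minimal Python sketch, assuming the share names are already available as a list of strings; the function name case_collisions is hypothetical and the sample names other than Training and training are made up for illustration.

```python
from collections import defaultdict


def case_collisions(names):
    """Group names that differ only by letter case; return groups of size > 1."""
    groups = defaultdict(list)
    for name in names:
        # casefold() gives a case-insensitive key for comparison.
        groups[name.casefold()].append(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# "Movies" is an illustrative name, not taken from the thread.
print(case_collisions(["Training", "training", "Movies"]))
```

On a live server the list could come from something like os.listdir on the user share directory, but that path is an assumption and is not stated in the thread.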
December 3, 2015 (Author): No. That is the strange thing. All SMB shares and telnetting to the server work fine. The ONLY issue is that the webGUI is not responsive. Also, it does not give me a 404 error in my browser. It just hangs trying to reach emhttp on port 80, almost as if the IP/port is listening and available but the "webserver" doesn't serve pages. If I initiate a reboot via telnet, the server comes up fine. Once I start the array (even with no Dockers or plugins) it will function for a few minutes and then hang. I'm at a loss.
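The symptom described above (the port accepts connections but no page is ever served) can be told apart from a fully dead server with a small probe. This is a minimal Python sketch under stated assumptions: the function name probe_http and its three result labels are hypothetical, and the five-second timeout is an arbitrary choice.

```python
import socket


def probe_http(host, port=80, timeout=5.0):
    """Classify a web server as 'unreachable', 'silent', or 'responding'."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Connection refused, timed out, or no route: TCP never connected.
        return "unreachable"
    with sock:
        try:
            request = b"GET / HTTP/1.0\r\nHost: " + host.encode() + b"\r\n\r\n"
            sock.sendall(request)
            data = sock.recv(1)  # wait for the first byte of any reply
        except OSError:
            # Includes the timeout case: connected, but nothing came back.
            return "silent"
    return "responding" if data else "silent"
```

A result of "silent" matches what the Author reports: the TCP handshake completes, yet the server never answers the request.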
December 3, 2015: Set an NTP server, e.g. pool.ntp.org, and see if that makes a difference. Your syslog is complaining about being unsynchronized, so make sure you have the correct time. I'm not sure what the webserver is calling (the IP address points to Google), but this is the reason it stalls: this particular session is not terminated properly and the webserver is waiting indefinitely.
December 3, 2015: "this particular session is not terminated properly and the webserver is waiting indefinitely."

I think it would solve a whole bunch of issues if you could implement a watchdog of some sort to kickstart the webserver if it detects this sort of situation for more than a few minutes.
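The watchdog idea above can be sketched as a loop that restarts the webserver only after several consecutive failed health checks. This is a minimal Python illustration, not an unRAID feature: the probe and restart callables are hypothetical placeholders supplied by the caller, since the thread does not say how emhttp would be checked or restarted, and the defaults (three failures, 60 seconds apart) are arbitrary.

```python
import time


def watchdog(probe, restart, max_failures=3, interval=60.0,
             cycles=None, sleep=time.sleep):
    """Call restart() after max_failures consecutive failed probes.

    probe() returns True when the server answered; restart() is whatever
    action the caller supplies. Runs forever unless cycles is given.
    Returns the number of restarts performed.
    """
    failures = 0
    restarts = 0
    done = 0
    while cycles is None or done < cycles:
        done += 1
        if probe():
            failures = 0  # any success resets the streak
        else:
            failures += 1
            if failures >= max_failures:
                restart()
                restarts += 1
                failures = 0
        sleep(interval)
    return restarts
```

Requiring a streak of failures, rather than reacting to a single one, avoids restarting the server over a momentary stall.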
December 3, 2015: It would be easier and better if emhttp were replaced with a real web server. We wouldn't have the majority of these odd issues with bad implementations of single-threaded processes.
December 4, 2015: But that would mean totally rewriting the licensing code, which probably isn't going to happen in the near future.
December 4, 2015 (Author): "Set a NTP server, e.g. pool.ntp.org and see if that makes a difference. In your syslog it is complaining about being unsynchronized, make sure you have a correct time. Not sure what the webserver is calling (the IP address points to google) but this is the reason it stalls, this particular session is not terminated properly and the webserver is waiting indefinitely."

No dice. Still locked up. Should I pull another diag?
December 4, 2015: I would start from scratch: save your current flash contents, reformat the flash, and do a fresh installation! Your data isn't lost, and you can do a basic installation and rebuild your shares (see also these tutorial videos).
December 4, 2015 (Author): Reloaded the flash drive with the latest stable release. Everything appeared fine. 40% through the parity rebuild, emhttp stopped responding again. Same symptoms as before. The process shows as active in ps. I guess I will let it finish rebuilding parity tonight and reboot in the morning. This sucks.
December 4, 2015 (Author): One more thing that just came to me. When I upgraded to v6, I DID replace the original USB drive (2 GB), which had failed after 5 years, with a new 16 GB USB drive. Does the size of the USB drive matter? Should I replace the USB drive?