[Solved] WebGUI not available, shares, docker/remote console still work


Greetings!

 

The news of Xen support being deprecated pushed me to finally update from 6beta5a to 6beta15 so that I could move my Plex server from a VM to Docker (the old setup worked great and I've been busy, so I had just let it be).  I first tried a direct upgrade by replacing the main files on the flash and had a few issues, which I described in a separate post http://lime-technology.com/forum/index.php?topic=39568.0 for posterity.  I then did a clean installation, which is where the issue below came up.

 

Machine:

- MB: Gigabyte H97N-WIFI

- PROC: Intel i5-4570 (I think)

- RAM: 8GB memory

- 3x 3TB Seagate drives

- 1x 600GB old western digital drive mounted outside of array (go file) for my old XEN VMs (to be replaced with an SSD cache once I get things stable)

 

Software Info:

- UnRAID v6b15 Basic License

- Fresh installation

- No plugins

- Xen boot option

- No VMs running at the time, but I had a few domains configured (all stopped because I got my Plex Docker running)

- One Docker running, Plex Media Server

 

Problem

I got my Plex docker up and running successfully with my old Library copied over, so I stopped my old VM running Plex, kicked off a parity check and went to bed.  In the morning, Plex docker was still working, shares are available, SSH works, but no WebGUI.  Webpage Not Available.  Tried Chrome and IE, cleared cache, still no WebGUI.

 

Syslog attached:

- interesting bits most likely near the end

- worked on the machine on and off through the day, you can see some of that in the syslog I think

- line 1127 was when I left the machine for the night (failed notification authentication, decided to deal with it later)

 

Server is still running this way - I thought I'd ask for feedback first since it might not happen again and we'd miss the chance for diagnostics.  I'm relatively useless in Linux based systems but I can handle the command line if the directions are clear.  Let me know if there's anything else I can do for troubleshooting/diagnostics.  Thanks in advance for the help!

2014.04.27.syslog.zip

Link to comment

I ran into the same problem this morning, so I thought I could post in the same thread.

 

I can see the server in the router's list

I can access the shares

I can start the WebGUI for Transmission running in Docker

I can telnet to the server

I cannot open the unRAID WebGUI using the IP address

 

It worked just fine yesterday and early this morning, then it just stopped working. Weirdly, I think it happened when I tried to close Transmission via its WebGUI, but I'm not sure since I shut my computer down at the same time.
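
The pattern in both reports (shares and SSH or telnet respond, the WebGUI does not) can be confirmed from another machine on the network. Below is a minimal Python sketch of such a check. The port numbers are assumptions based on common defaults (22 for SSH, 23 for telnet, 80 for the WebGUI, 445 for SMB shares), and the `probe` helper is purely illustrative, not part of unRAID.

```python
import socket

# Assumed default ports; change them if your server is configured differently.
SERVICES = {"ssh": 22, "telnet": 23, "webgui": 80, "smb": 445}


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def probe(host, services=SERVICES):
    """Map each service name to True/False depending on whether its port answers."""
    return {name: port_is_open(host, port) for name, port in services.items()}
```

Calling `probe("192.168.1.10")` (a placeholder address) returns a dictionary such as one entry per service. If only the `webgui` entry is `False`, the server itself is up and the problem is limited to the web interface process.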

syslog.txt

Link to comment


Jim,

 

Can you try booting NOT in Xen mode?  Since you've moved to Docker now, no need to boot into Xen anymore, right?

Link to comment


No need for Xen, so I will try tonight.  Note that I haven't replicated the WebGUI failure since the first time, so it may be difficult to say if this resolved things or not.  Last night I rebooted the server to get the GUI back, did a bit of work on my dockers, then kicked off another parity check.  I thought perhaps the parity check may have had something to do with it, but it completed successfully last night and I still have GUI this morning.

 

One thing I didn't mention is that I put my docker image and docker appdata on disk1 in the array rather than on a cache drive (the eventual plan).  I don't know why I didn't put it on the disk I have mounted outside the array.  Could this have anything to do with it?  I thought perhaps the combination of dockers writing to the array and the parity check coinciding could have somehow helped trigger the GUI failure, but the parity check last night didn't have any impact.  Plex was doing more disk activity than normal the night of the failure though as it was re-linking all my media files to a new installation.

 

What's the best way to move my docker container and appdata to another disk?

 

Thanks,

James

Link to comment


I am speaking strictly from memory here, which has been known to fail me from time to time, but if I remember correctly, the GUI is one of the first things to go when your RAM is overloaded.  The kernel's out-of-memory killer looks through the list of processes and picks one to kill, based on a score that mostly reflects memory use rather than idle time.  emhttp can be one of the victims.

 

I had multiple dockers running concurrently causing the GUI to go down.  Only way to get it back was a reboot of the server and better distribute resources.
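
One way to test the out-of-memory theory is to search the syslog for the messages the kernel writes when the OOM killer runs. The sketch below is a rough Python filter for that. The exact wording of these messages differs between kernel versions, so the pattern is deliberately loose, and `find_oom_events` is a hypothetical helper name.

```python
import re

# Kernel messages that indicate the OOM killer ran. Wording varies between
# kernel versions ("Kill process" vs "Killed process"), hence the loose pattern.
OOM_PATTERN = re.compile(
    r"invoked oom-killer"
    r"|Out of memory: Kill(?:ed)? process (?P<pid>\d+) \((?P<name>[^)]+)\)"
)


def find_oom_events(lines):
    """Return (line_number, killed_process_or_None, text) for each OOM-related line."""
    events = []
    for number, line in enumerate(lines, start=1):
        match = OOM_PATTERN.search(line)
        if match:
            # "name" is None for the "invoked oom-killer" form of the message.
            events.append((number, match.group("name"), line.rstrip()))
    return events
```

It can be run against a saved syslog with `find_oom_events(open("syslog.txt"))`. If emhttp shows up as a killed process, memory pressure is the likely cause; an empty result suggests looking elsewhere.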

Link to comment


Interesting.  I'm a bit skeptical though that unraid + one plex docker would consume all 8GB.  I ran the same thing for months on end without issues previously albeit with Plex in a linux VM rather than docker.  Currently the dashboard tab is showing 14% memory usage.
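
For anyone who wants to check the memory figure from the command line rather than the dashboard, here is a minimal Python sketch that computes a usage percentage from the contents of `/proc/meminfo`. How the dashboard arrives at its own number is not known here, so the two may not match exactly. The function names are illustrative, and the fallback for kernels without a `MemAvailable` field is only an approximation.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number}.

    Most fields are in kilobytes; a few (such as HugePages_Total) are counts.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def used_percent(info):
    """Percentage of RAM in use, not counting reclaimable cache."""
    total = info["MemTotal"]
    if "MemAvailable" in info:
        available = info["MemAvailable"]
    else:
        # Older kernels lack MemAvailable; approximate it from the other fields.
        available = info["MemFree"] + info.get("Buffers", 0) + info.get("Cached", 0)
    return 100.0 * (total - available) / total
```

On the server this would be used as `used_percent(parse_meminfo(open("/proc/meminfo").read()))`. A single reading only describes the current moment, so a low value now does not rule out a spike during the night of the failure.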

Link to comment

I'm seeing WebGUI failures pretty regularly now, within a few hours the last couple times.  I'm also seeing some segfaults(!) and other weird things (I think) in the syslog, which could be related to docker. 

 

Chronology...

- I booted into regular (KVM) mode. 

- Everything seemed ok the next day, so I installed a couple more dockers.  Now have PMS, plexWatch, and BTSync (installed but not doing anything yet).

- At this point things seemed okay.

- Next day I tried moving my docker.img and configuration files (currently on disk1) to a disk outside the array.  Docker.img moved successfully, but I had an issue getting PMS to see the config folder on the other disk for some reason.  Reverted to config data on disk1.

- WebGUI failed again that night.  And now 1-2 times per day.

- Since moving my docker.img and playing with the config directory, I'm also now getting segfaults every now and again and other things that look weird in the syslog.  Some of them appear to be linked to plexWatch.

 

I attached a syslog where the segfaults were first noticed and a couple from today leading up to GUI failures.
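
To see whether the crashes cluster around one program, the kernel's crash lines in a syslog can be tallied by process name. The following Python sketch assumes the usual kernel formats `name[pid]: segfault at ...` and `traps: name[pid] general protection ...`; `tally_crashes` is a hypothetical helper, and a process name containing spaces would only be captured by its last word.

```python
import re
from collections import Counter

# Typical kernel crash lines:
#   kernel: php[1234]: segfault at 10 ip ... sp ... error 4 in libc-2.17.so[...]
#   kernel: traps: python[4321] general protection ip:... sp:... error:0 in ...
SEGFAULT = re.compile(r"(?P<name>[^\s\[\]:]+)\[\d+\]: segfault at ")
GPF = re.compile(r"traps: (?P<name>[^\s\[\]:]+)\[\d+\] general protection")


def tally_crashes(lines):
    """Count crash lines per (program name, crash kind)."""
    counts = Counter()
    for line in lines:
        for kind, pattern in (("segfault", SEGFAULT), ("general protection", GPF)):
            match = pattern.search(line)
            if match:
                counts[(match.group("name"), kind)] += 1
    return counts
```

Running `tally_crashes(open("syslog.txt")).most_common()` lists the worst offenders first. Crashes concentrated in one program point at that program or its container; crashes spread across unrelated programs point at something shared underneath them.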

 

I'm disabling Docker and will run overnight to see what happens.  Overall I'm not very impressed with the stability of b15 for me.  Even if it's issues related to the specific dockers, I'd much rather have a failure in a VM that's isolated from the main OS...

logs.zip

Link to comment

Even if it's issues related to the specific dockers, I'd much rather have a failure in a VM that's isolated from the main OS...

 

Ideally Docker is supposed to work that way. This recent rash of "Docker" issues impacting the WebGUI and other host processes is very much unexpected. Based on things Jonp has said in other threads, it might just be an issue with Docker 1.5, with the suggestion that Lime-Tech is planning to move to Docker 1.6 in future releases.

Link to comment
  • 4 weeks later...

Upgraded to RC3, hoping docker 1.6 would fix my problems.  Nope.

 

Still seeing issues

- webgui crashes

- segmentation faults

- monitor script errors in syslog

- php errors in syslog

- btrfs checksum issues crop up after a while, seen in the syslog when I attempt a scrub of the docker.img file.  I've tried deleting and regenerating the docker.img file a few times and the problems inevitably come back.  I attached a few syslogs for review.

 

syslog-20150521-191617.zip   

I think this one was a WebGUI failure, so I initiated a powerdown over SSH.  plexWatch looked to be a culprit (similar findings on b15), so I disabled that docker going forward, but still had issues.

 

syslog-20150526.zip

From today.  See back a couple days for btrfs checksum errors which popped into the syslog after doing a btrfs scrub on docker.img.  This was after redoing my docker.img a couple days ago.  More recently there was a general protection fault (I got a segfault email through notifications), but no other failures yet - still running.

 

My SSD is mounted outside the array from my go file (ext4).  It was previously an old spinner drive (had the same problems).

 

Thanks in advance.

 

Link to comment

Seem to be several problems, possibly unrelated.

 

* "IRQ 16: nobody cared" - this happens in almost all of your syslogs, but since it appears to only involve one USB controller with nothing connected (probably your front ports), it seems harmless.  It also seems unrelated to any other issue, since it occurs completely randomly, not at the same time as any other issue, and sometimes before and sometimes after, so not indirectly causing them.  Might as well ignore it.

 

* The other crashes (general protection faults, traps, panics) usually seem to involve the add-on software (php, python, perl, etc.), but not the same way each time; they seem random in time and behavior.  It sometimes looks like program bugs (double frees, wrong free, etc.), but often looks like compiler and linker issues.  I recommend running a LONG Memtest, from the boot menu, just to eliminate any memory issues.  Run it at least overnight.

 

* It will be interesting to see how long you can run in SAFE mode without issues, should tell us a lot.  However, I don't know if you can live without your Plex running, for several days!

 

* One other thing to try: keep a monitor open with a syslog tail running, and keep an eye on it, to see if you can associate any of the traps, general protection issues, panics, etc. with a specific command or action.
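
The syslog-tail suggestion can be approximated with a short Python sketch that follows a log file and flags only the lines of interest, so they are easier to match against whatever was being done at the time. The keyword list is an assumption drawn from the crash types named above, and `follow` and `is_suspicious` are illustrative names, not an existing tool.

```python
import time

# Assumed keywords, taken from the crash types discussed in this thread.
KEYWORDS = ("segfault", "general protection", "panic", "traps:")


def is_suspicious(line, keywords=KEYWORDS):
    """Return True if the line mentions any of the crash keywords."""
    lowered = line.lower()
    return any(word in lowered for word in keywords)


def follow(handle, poll_seconds=1.0, max_idle_polls=None):
    """Yield complete lines appended to an open file, similar to `tail -f`.

    Stops after max_idle_polls consecutive empty reads; runs forever if None.
    """
    idle = 0
    buffer = ""
    while max_idle_polls is None or idle < max_idle_polls:
        chunk = handle.readline()
        if not chunk:
            idle += 1
            time.sleep(poll_seconds)
            continue
        idle = 0
        buffer += chunk
        if buffer.endswith("\n"):  # only emit once the line is complete
            yield buffer
            buffer = ""
```

A typical use would be to open the log (assumed here to be `/var/log/syslog`), seek to the end with `handle.seek(0, 2)`, then loop over `follow(handle)` and print each line for which `is_suspicious(line)` is true, prefixed with `time.strftime("%H:%M:%S")`.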

Link to comment

Seem to be several problems, possibly unrelated.

...

 

Start up in SAFE-mode. Reproduce problem and then post a new syslog.

 

Thanks for the input.  To be honest, I've been hesitant to try safe mode largely because of my (family's) reliance on Plex.  But I think it's time to bite the bullet and try it out.  At the start shortly after moving from b6-b15 (clean install), I only had Plex running in a docker and NOTHING else (plugins or otherwise), and was still having issues.

 

todo list

* long memtest

* safe mode for a few days at least

* if all is good, slowly add things back in prioritized order and see if/when it starts breaking

 

Might take a few days to get started as free time is limited.

Link to comment

Forgot to mention that there's a small possibility that a BIOS update (if available) *might* fix the IRQ 16 issue, and possibly help with the crashing.  It's worth a shot, although somewhat unlikely.  Your BIOS does not look very old though, so there may not be a newer one available.

Link to comment

I design electronics hardware for a living, and I spend a lot of time giving the software developers a hard time about how all our problems are software issues.  Well, this time I'm eating my words because it appears I have a bad memory stick.  Tons of errors in memtest86+ immediately.  Currently running it with a borrowed stick from my desktop with no errors.  Seems pretty conclusive.

 

I'll continue on the borrowed ram for time being and report back if it solves my issues.

 

Thanks again for helping out.  I should've done this a long time ago, but since it ran for so long on beta6 I assumed everything must've been okay on the hardware. 

Link to comment
