Docker Crashing unRaid


SSMI


On my system Docker seems to be crashing unRAID, and I can't tell why. unRAID runs fine for at least a couple of weeks if I have no Docker containers running. Once I start one, unRAID will crash within the next few days; sometimes it is one day, sometimes about four. When it crashes the computer still has power, but the WebUI, Samba shares, and telnet all become unresponsive. To recover I have to cut the power and start it back up. When I do, neither of the Dockers will run again. This didn't happen back on one of the RCs I was running.

 

Once things are back up and running, the Docker images won't run again. Nothing else is corrupted or missing, though.

 

Here is my general config.

 

Parity 3TB WD HDD

Data 2x3TB WD HDD

Cache 256GB Crucial SSD

Flash 8GB

 

Shares

appdata - no cache - no nfs - yes smb

SSMI - yes cache - no nfs - yes smb

Media - no cache - yes nfs - yes smb

docker (autogenerated) - no cache - no nfs - no smb

 

Docker Settings

version: 1.6.2

image: /mnt/cache/docker.img

size: 10GB

 

Common Dockers

MariaDB

owncloud

deluge

 

If there is some other data I need to provide, let me know and tell me how to get it.

 

Thank you for the help.

Link to comment

I don't have it set up with those attached. It would take at least a few days to do so. Anything else I should try to get when I do the memtest?

 

Docker shouldn't be able to crash unRAID, should it? From what I understand it is similar to a virtual machine. Could the Docker install on my box be corrupted somehow?

Link to comment


The word "docker" is used in several different ways, and it might help to clarify what we mean.

 

There is the docker service that is built into unRAID to manage the docker containers. This is very unlikely to be corrupt, since all of the unRAID OS is unpacked fresh on each boot.

 

There is the docker img file, a "virtual disk" where the code and "internals" of all the docker containers are stored. While this might become corrupted, I think the most likely result would be dockers not starting, rather than things working OK for a while.

 

There are the files used by specific containers, such as config data, which are mapped to unRAID storage such as the cache drive. Corruption there would be most likely to affect specific containers.

 

What we are missing is any sort of diagnostic information from the time of the crash. Maybe you could keep the syslog up on the screen, updating until it happens, and then get a copy of that. Also see either of the help links in my sig for other ideas about capturing this.
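One way to do that, as a sketch (assuming the standard unRAID paths: /var/log/syslog for the log and /boot for the flash drive; verify on your own system):

```shell
# Keep the live syslog on the console so the last lines are visible
# when the box hangs (run at the local console or over telnet):
#   tail -f /var/log/syslog
#
# Also snapshot the tail of the log somewhere that survives the forced
# power cycle. On unRAID that would be /boot (the flash drive); /tmp is
# used here only so the commands run on any machine.
LOG=/var/log/syslog
DEST=/tmp/syslog-snapshot.txt

# Fall back to a generated file if syslog isn't readable/non-empty here:
[ -s "$LOG" ] || { LOG=/tmp/demo-syslog; printf 'line %s\n' 1 2 3 > "$LOG"; }

tail -n 200 "$LOG" > "$DEST"
echo "saved $(wc -l < "$DEST") lines to $DEST"
```

A snapshot on the flash drive is readable from any other machine after the forced power-off, which is exactly when the WebUI and telnet are gone.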

Link to comment

I'm also wondering whether the docker.img file is filling up completely due to wrong settings within the containers. You say you have to set the containers up again from scratch. Does changing the size of the .img file from 10GB to, say, 20GB prolong the life of the system before it crashes?
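One quick way to check how full the image actually is, as a sketch (assuming the loop-mounted image sits at the usual unRAID mount point of /var/lib/docker; the fallback is only so the commands run anywhere):

```shell
# Check how full the loop-mounted docker.img actually is. On unRAID the
# image is mounted at /var/lib/docker (an assumption - verify locally);
# falling back to / just lets this run on a non-unRAID machine.
MOUNT=/var/lib/docker
[ -d "$MOUNT" ] || MOUNT=/
USAGE=$(df -h "$MOUNT" | tail -n 1)
echo "$USAGE"
```

If the "Use%" column is at or near 100% shortly before a crash, that would point strongly at something writing inside the image.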

Link to comment


Yes, it might be useful to know what the volume mappings of all the containers are. If some things are not being mapped to unRAID storage like they should be, then they will be consuming space in docker.img.
Link to comment


I don't think knowing the mappings alone will tell you that. What matters is the settings of the application running within each container.

 

@SSMI  If, for example, you tell deluge to store its incomplete downloads at /incomplete, but you have not mapped /incomplete to an unRAID share, then it's going to store the incomplete files within the docker.img file. And that file is only 10G.
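A quick way to audit those mappings, as a sketch (the container name "deluge" is only an example; the guard keeps the commands runnable on machines without docker):

```shell
# Print each host-path -> container-path mapping for one container.
# Anything the container writes outside these mapped container paths
# stays inside docker.img. "deluge" is an example name - substitute
# your own container names.
if command -v docker >/dev/null 2>&1; then
  MAPPINGS=$(docker inspect \
    -f '{{range .Mounts}}{{.Source}} -> {{.Destination}}{{"\n"}}{{end}}' \
    deluge 2>&1) || true
else
  MAPPINGS="docker not available on this machine"
fi
[ -n "$MAPPINGS" ] || MAPPINGS="(no volume mappings found)"
echo "$MAPPINGS"
```

The same mappings should also be visible on each container's Edit page in the unRAID WebUI, but the command form works over telnet while the WebUI is up.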

Link to comment

There are a couple of containers I would like to be able to access all the shares. Do I need to do something special to allow this? Right now I am just including /mnt/user/ (I might have that wrong, as I'm at work) as a volume mapping.

 

Edit:

I do put those types of storage on volume mappings. When I set everything up again I will try a larger img size as well.

Link to comment

That mapping is fine. But (and I'm not saying this is the case) if you tell deluge, for example, to store its incomplete downloads somewhere other than within the mapping for /mnt/user, then it winds up storing them within the docker.img file, which will cause you some issues.

 

I'm going down this road because it's a fairly common mistake, and the size of the img file is only 10G (which, if everything is set up correctly, is plenty).

 

Link to comment

I understand. For deluge, everything is outside of the docker.img. I am pretty sure MariaDB is as well. I need to double-check owncloud because that one is new for me, but I was using a template from here on the forum, so it was probably set up correctly as well.

 

I will double-check them though, because maybe I was missing something.

Link to comment

Alright. I set everything up again with owncloud and MariaDB. Everything worked like it was supposed to for between one and a half and two days. Then it crashed again.

 

I had a monitor and keyboard plugged in and left it at the login screen. After the crash I can't access the WebUI, and on the physical console the login screen is gone. I will copy what was left as best I can.

 

[<ffffffff8111b671>] ? pin_remove+0x15/0xae
[<ffffffff8111b671>] drop_mountpoint+0x1a/0x2a
[<ffffffff8111b671>] pin_kill+0x6a/0xec
[<ffffffff8111b671>] ? wait_woken+0x7d/0x7d
[<ffffffff8111b671>] mnt_pin_kill+0x2e/0x30
[<ffffffff8111b671>] cleanup_mnt+0x39/0x74
[<ffffffff8111b671>] __cleanup_mnt+0xd/0xf
[<ffffffff8111b671>] task_work_run+0x93/0xae
[<ffffffff8111b671>] do_notify_resume+0x40/0x4e
[<ffffffff8111b671>] int_signal+0x12/0x17
---[ end trace 163b45793ad5a3d0 ]---
mdcmd (44): spindown 2
mdcmd (45): spindown 0
mdcmd (46): spindown 1
mdcmd (47): spindown 0
mdcmd (48): spindown 1
mdcmd (49): spindown 0
mdcmd (50): spindown 1
mdcmd (51): spindown 0
mdcmd (52): spindown 1
mdcmd (53): spindown 0
mdcmd (54): spindown 1

 

I entered that by hand, so there may be an error or two in it. Escape doesn't do anything, and Enter just scrolls a blank line. Any ideas what is causing this?

Link to comment
  • 2 weeks later...

So I am finishing up the setup. I completely cleared my unRAID box and started over from scratch. Now it shows the following under Docker:

 

Label: none  uuid: 8fd80e0f-fde6-4b81-8bbd-4acbabe2df41
Total devices 1 FS bytes used 627.23MiB
devid    1 size 20.00GiB used 3.04GiB path /dev/loop0

btrfs-progs v4.0.1

 

I think that has fixed things.  I will post again after a week or so of uptime.

Link to comment
