docker service


Hello y'all, this is my first time posting.

(I apologize in advance if this has been answered before. I used the search and thought I had found the solution, but the problem manifested itself again.)

 

I have had Unraid OS Plus for a short time now. I have had a lot of fun with it, and it has been a great learning experience.

However, on Sunday we had a power outage that lasted 4 hours and obviously drained the batteries on the UPS, which was not configured for auto shutdown. In the morning, neither my Docker service nor my VMs would start.

 

I looked for solutions and found that I should delete the Docker image and reinstall the apps.

I did that and it worked fine for 24 hours. Now, however, the service will not start again: it says the cache is read-only and the Docker image is full, but Docker is at 36%, the cache has 170 GB free, and no one has set it to read-only.

Here is the screenshot from the Fix Common Problems plugin and the zip file containing the diagnostic logs.

 

Again, when I click on the Docker tab I get this: Docker Service failed to start.

And my Windows 10 VM is nowhere to be found under VMs.

 

Please guide me on where to start troubleshooting this and what my next steps should be.

 

Thank you in advance,

Denis

unraid.png

tower-diagnostics-20200820-2257.zip

Link to comment

johnnie.black addressed the main thing, but here are some other things to consider.

 

8 hours ago, richland007 said:

power outage that lasted for 4 hours and obviously killed the batteries on the UPS which was not configured with auto shutdown

You shouldn't run for long on batteries. The point of having batteries is so you can cleanly shut down, not so you can keep running. Is your UPS compatible with the APCUPSd built into Unraid?

 

Why do you have 50G allocated to docker.img? Have you had problems filling it? 20G should be more than enough, but I see you are already using 18G. I suspect you have some application misconfigured and it is writing to a path that isn't mapped.

 

Also, your system share has files on disk1 instead of all on cache. Possibly this happened when you recreated docker.img while still having cache problems, or maybe you enabled dockers / VMs before you installed cache. Your dockers/VMs will keep those files open, so array disks can't spin down, and they will have their performance impacted by slower parity.
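A quick way to see which disks hold copies of a share is to look for its folder under each mount point. The following is only a sketch: the function name share_locations and its optional root argument are made up for illustration, and it assumes the usual /mnt/cache and /mnt/diskN layout.

```shell
# Sketch: list every disk that contains a folder for the given share.
# share_locations is a hypothetical helper, not an Unraid command.
# The second argument (default /mnt) exists only so the function can be
# pointed at a test directory.
share_locations() {
    share="$1"
    root="${2:-/mnt}"
    for d in "$root"/cache "$root"/disk*; do
        # Print the path only if this disk actually has the share folder
        [ -d "$d/$share" ] && echo "$d/$share"
    done
    return 0
}

# Example, run on the server:
#   share_locations system
```

If this prints both a cache path and a disk1 path for system, the share is split across cache and array as described above.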

Link to comment

@johnnie.black @trurl I greatly appreciate your feedback, as I am trying to better understand Unraid.

 

I followed the guide at the link above for balancing the cache drive with the btrfs balance start -dusage=75 /mnt/cache command, but it would not let me. I tried with -dusage=1 and that worked, all the way up to a whopping 4 :) but when I tried 5 it said there was no space available.
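For anyone following along, the stepped approach described here can be sketched as a small loop. btrfs allocates space in chunks, and a balance needs some unallocated space to work with, which is why low usage values tend to succeed first and free up room for higher ones. The function name balance_in_steps, the BTRFS and MOUNT variables, and the step values in the example are assumptions for illustration only.

```shell
# Sketch: raise the balance usage filter step by step.
# BTRFS and MOUNT are assumed variables; set BTRFS="echo btrfs" for a dry run
# that only prints the commands instead of running them.
BTRFS="${BTRFS:-btrfs}"
MOUNT="${MOUNT:-/mnt/cache}"

balance_in_steps() {
    for pct in "$@"; do
        echo "balance pass: -dusage=${pct}"
        # Stop at the first pass that fails (for example with "no space left")
        $BTRFS balance start "-dusage=${pct}" "$MOUNT" || return 1
    done
}

# Example, run on the server:
#   btrfs filesystem usage /mnt/cache     # compare allocated vs. used space
#   balance_in_steps 1 2 5 10 25 50 75
```

If a pass fails, deleting or moving some files off the cache and retrying from the lower values is the usual next step.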

 

So I am trying to remove some files from the cache disk, in the /downloads/completed folder, that Sonarr has not moved (I have no idea why it moves the majority but not all), to see if that will solve part of the problem. However, as stated in my original post, the system reports 107 GB of free space on the cache.

Truth be told, I am a bit unsure what the step-by-step path forward to a solution is.

What would one do? Steps 1, 2, 3 ... x

 

On 8/21/2020 at 7:53 AM, trurl said:

Why do you have 50G allocated to docker.img? Have you had problems filling it? 20G should be more than enough, but I see you are already using 18G. I suspect you have some application misconfigured and it is writing to a path that isn't mapped.

I suspect that the instance of Tdarr is not configured right, but I never used it. However, the first time I had to increase from 20G to 50G was right after I installed that app; when I stopped it, the issue did not recur.

 

On 8/21/2020 at 7:53 AM, trurl said:

You shouldn't run for long on batteries. The point of having batteries is so you can cleanly shut down, not so you can keep running. Is your UPS compatible with the APCUPSd built into Unraid?

The UPS I had was not APCUPSd compatible (it would recognize the UPS, but only 2 parameters would report, and incorrectly), and I am replacing it now with a genuine APC Smart UPS. On top of all that, I was not home when it happened, and it lasted for 4+ hours.

Question: will the built-in UPS feature take care of the shutdown process automatically if connected to a genuine APC UPS?

On 8/21/2020 at 7:53 AM, trurl said:

Also, your system share has files on disk1 instead of all on cache. Possibly this happened when you recreated docker.img while still having cache problems, or maybe you enabled dockers / VMs before you installed cache. Your dockers/VMs will keep those files open, so array disks can't spin down, and they will have their performance impacted by slower parity.

I have had a cache drive from the beginning, when I set up the system. However, I had the System share set to Cache: Preferred instead of Cache: Yes...

Will a simple change of that setting fix the issue, or do I have to do something special now?

 

Looking forward to your help and guidance on these issues, and a quick guide on how to get this up and running soon.

 

Thanks again

Denis

Link to comment
1 hour ago, richland007 said:

had the System share to use Cache: Preferred instead of Cache: Yes...

Prefer is the correct setting and Yes is most definitely the wrong setting. Probably 

On 8/21/2020 at 8:53 AM, trurl said:

this happened when you recreated docker.img while still having cache problems

After you get cache fixed, post new diagnostics and we can work on that.

Link to comment

OK gentlemen, I removed some files from the cache, ran the balance, and got a success message, although the free space seems to have decreased.

(I am having a hard time comprehending this.) I have turned off everything, VMs and Docker.

I ran Fix Common Problems and this is what I got: the Docker image might be full or corrupted. I did try to turn the Docker service on briefly and got "Docker Service failed to start", so I turned it back off.

What do we do next? Should I delete the Docker image again? What did this fix? I have a feeling that if I delete the image and start installing the apps again, the same thing will happen all over.

I am attaching the new Diagnostic zip file.

 

Thank you as always 

Denis

btrfsCache.png

FixCommonProb.png

tower-diagnostics-20200823-1144.zip

Link to comment

Good evening, gentlemen.

On 8/24/2020 at 3:49 AM, johnnie.black said:

Cache is balanced now but the syslog is being spammed with ACPI errors, see here for how to fix then reboot and post new diags.

I modified /boot/config/go to include rmmod acpi_power_meter and rebooted. The Docker service was set to start automatically. Before I started the array I checked, and there were no ACPI errors. When I started the array and the Docker service started automatically, I saw the ACPI errors start surfacing again. How is that correlated? I thought we were dealing with hardware/OS-level stuff, not Docker. And I made sure that the change to the /boot/config/go file, including the rmmod, had been saved.

So in other words, the error spam remains even though the config is modified to remove acpi_power_meter... how? It is beyond my understanding LOL
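For reference, the go file change described above can be made without adding the line twice. This is only a sketch: add_rmmod_line is a made-up helper name, GO_FILE is an assumed variable that defaults to the path from the post, and appending at the end of the file is an assumption about placement.

```shell
# Sketch: append the rmmod line to the go file only if it is not there yet.
# GO_FILE can be overridden to try this on a scratch copy first.
GO_FILE="${GO_FILE:-/boot/config/go}"

add_rmmod_line() {
    line="rmmod acpi_power_meter"
    # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    grep -qxF "$line" "$GO_FILE" 2>/dev/null || echo "$line" >> "$GO_FILE"
}

# After a reboot, check whether the module is still loaded:
#   lsmod | grep acpi_power_meter
```

If lsmod still lists the module after boot, it was loaded again later or the line did not run, which would match the errors coming back.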

 

On 8/23/2020 at 9:24 PM, trurl said:

Do you know how to examine the disks?

 

Do you have any VMs?

I did examine the disk, and yes, there is a VM I am trying to run in the system folder on disk1... is it supposed to be there? How do we move that and the others so that the spin-downs can occur?

 

Thank you again; waiting for the next troubleshooting steps.

Denis

PS: diagnostics zip included

disk1.png

disk1a.png

bootconfig.png

tower-diagnostics-20200825-2113.zip

Link to comment
5 hours ago, richland007 said:

before i  started the array i checked there were no ACPI errors when i started the array and the Docker service started automatically i saw the ACPI errors  started surfacing up again.

That is strange, and I have no idea why you're still getting spammed, but it's unrelated to your issues. At least Docker is back to working, correct?

Link to comment

Hello gentlemen,

Well, I was up and running for a couple of days and then back to square one. I tried to restart the PiHole Docker container and got error 403; it says it cannot write to the Docker image, which is either full or corrupted. And the damn log is still spammed by the ACPI error, although I have rmmod acpi_power_meter added to the go file. So I restarted the tower and Docker came back up. I removed PiHole (I wanted to anyhow), then executed rmmod acpi_power_meter again in the terminal, and the ACPI errors stopped. My array does not automatically start after a reboot, so the Docker service is not up and running at that point; it seems the rmmod acpi_power_meter command has to be executed after the Docker service has started for it to work... so how do we do that?

 

Cache gets full pretty quickly when downloading a lot at the same time and/or converting with HandBrake/Tdarr. What do I have to change to always have room in the cache? (I have the Mover scheduled to run hourly.) I blamed PiHole for the Docker hiccup this time, but I think previously it was either HandBrake or Tdarr that caused problems with the cache and Docker. Also, Sonarr/Radarr move about 90% of the completed downloads from the /cache/downloads/completed folder; I have no clue why they won't move 100% of them to the Media/movies share.

 

On 8/26/2020 at 9:49 AM, trurl said:

From the command line, what do you get with these?


ls -lah /mnt/cache/system/libvirt

ls -lah /mnt/disk1/system/libvirt

 

Here is what I am getting for the above commands... it looks like the same file but with different sizes. By the way, I used to have a VM that disappeared at the same time as my Docker containers the first time, and I tried to recreate it.

 

Thank you

Denis

 

 

disk1libvirt.png

cachelibvirt.png

Link to comment

That seems to indicate that the version on cache is the currently used one, as it should be. You can delete the system folder on disk1.

 

2 hours ago, richland007 said:

i have the Mover scheduled to run hourly

Running mover more frequently often doesn't help anything. It is impossible to move to the slower array as fast as you can write to the faster cache. Mover is really intended for idle time.

 

Your cache is really pretty large. Are you really writing hundreds of gig every day? You might consider writing some of that directly to the array.

 

2 hours ago, richland007 said:

can not write to docker image either full or corrupted.

Why is anything writing to docker.img? Normally you want your docker applications writing to mapped storage. Docker.img is really just for the executable code of your docker containers. The docker applications themselves should not be writing into it.

 

My docker.img is 20G. That is the size I usually recommend. I have 17 dockers running, and they use less than half of that 20G.

 

You have 50G allocated to docker.img. Shouldn't be necessary to have that much. Don't know if you have filled it or not. Diagnostics would tell.

 

Any application that is writing to a path that doesn't correspond to a container path that is mapped to a host path is writing into the docker.img. Common mistakes are specifying a different upper/lower case than that in the mappings, or writing to a relative path (not beginning with /).
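The two mistakes just described can be illustrated with a small check that compares a path set inside an application against the container paths in its mappings. This is a sketch only: is_mapped is a made-up helper, the example paths are hypothetical, and relative paths are simply treated as unmapped here.

```shell
# Sketch: succeed only if PATH_TO_CHECK sits under one of the mapped
# container paths. The comparison is case-sensitive, like the filesystem.
# Usage: is_mapped PATH_TO_CHECK MAPPED_CONTAINER_PATH...
is_mapped() {
    path="$1"
    shift
    case "$path" in
        /*) ;;              # absolute path, keep checking
        *)  return 1 ;;     # relative path, treated as unmapped
    esac
    for prefix in "$@"; do
        case "$path" in
            "$prefix"|"$prefix"/*) return 0 ;;
        esac
    done
    return 1
}

# Hypothetical example with container paths /config and /downloads:
#   is_mapped /downloads/completed/a.mkv /config /downloads   # succeeds
#   is_mapped /Downloads/completed/a.mkv /config /downloads   # fails: wrong case
#
# The real container paths can be listed with:
#   docker inspect --format '{{range .Mounts}}{{println .Destination}}{{end}}' <container>
# and per-container writable-layer sizes with:
#   docker ps -s
```

A container whose size in docker ps -s keeps growing is a likely candidate for writing into docker.img instead of mapped storage.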

Link to comment

Hey, thank you for the quick response...

1 hour ago, trurl said:

That seems to indicate that the version on cache is the currently used one, as it should be. You can delete the system folder on disk1

That seems to be the copy of my old VM that I lost. How do I use that to get my VM back? I have been browsing all over, but I keep reading about the old XML file, which I don't have.

Link to comment
