June 30, 2017 I started having issues with Docker containers failing to start, and recently Docker stopped working completely. I checked the logs and see a number of BTRFS errors on my cache drive. It is a single Samsung SSD, which has been running for ~6 months. I've tried to search for my issue and see it's somewhat common, but I couldn't find anything specifically explaining what the log entries mean or how to interpret them. I ran a scrub, which returned 0 errors, and an extended SMART report, which seemed to pass. I'm attaching my diagnostics in the hope that someone more knowledgeable can spare a few minutes to look at the logs and explain what the issue is or point me towards appropriate resources. I don't mind reformatting the cache drive and starting over, but I would like to know what caused the issue to prevent it from happening again. unraid-diagnostics-20170630-1506.zip Edited June 30, 2017 by doctor15 SOLVED
June 30, 2017 Community Expert
I didn't go through the complete syslog, but the errors appear limited to the loop device, so start by deleting and recreating your docker image. Also check your cache device for errors:
btrfs dev stats /mnt/cache
The output should be all zeros if all is well.
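For anyone who wants to automate the "all zeros" check above, here is a minimal sketch. It assumes the usual `[device].counter value` line layout printed by `btrfs dev stats`; the function name `nonzero_counters` and both sample texts are made up for illustration and are not taken from the attached diagnostics.

```python
import re

# One counter per line, e.g. "[/dev/sdc1].write_io_errs    0"
# (assumed layout of `btrfs dev stats` output).
STAT_RE = re.compile(
    r"^\[(?P<dev>[^\]]+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)\s*$",
    re.MULTILINE,
)


def nonzero_counters(dev_stats_text):
    """Return {(device, counter): value} for every counter that is not zero."""
    return {
        (m["dev"], m["counter"]): int(m["value"])
        for m in STAT_RE.finditer(dev_stats_text)
        if int(m["value"]) != 0
    }


# Illustrative sample texts (hypothetical values).
healthy = (
    "[/dev/sdc1].write_io_errs    0\n"
    "[/dev/sdc1].read_io_errs     0\n"
    "[/dev/sdc1].flush_io_errs    0\n"
    "[/dev/sdc1].corruption_errs  0\n"
    "[/dev/sdc1].generation_errs  0\n"
)
damaged = healthy.replace("corruption_errs  0", "corruption_errs  3")

for label, text in (("healthy", healthy), ("damaged", damaged)):
    bad = nonzero_counters(text)
    print(label, "-> all zeros" if not bad else f"-> non-zero counters: {bad}")
```

An empty result means every counter is zero. To use it on a live system you would pass in the text printed by the command above, for example captured with `subprocess.run(..., capture_output=True, text=True)`.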
June 30, 2017 Community Expert Also, your btrfs filesystem is fully allocated. This is bad news and can cause "not enough space" errors; see the thread below to try to fix it:
June 30, 2017 Author Oh, right, I forgot. I believe the problem actually started when one of my Docker containers started logging like crazy and blew up to 50gb. My Docker file was set to 100gb, but the actual disk had plenty of free space. I fixed it by moving the file to the array via terminal. Are you saying the disk is full, or that the file system inside Docker is full?
June 30, 2017
1 minute ago, doctor15 said: I believe the problem actually started when one of my Docker containers started logging like crazy and blew up to 50gb.
Do This:
2 minutes ago, doctor15 said: My Docker file was set to 100gb
In any but the absolute most extreme cases, 20GB should actually be enough (assuming everything is properly configured).
June 30, 2017 Author
14 minutes ago, johnnie.black said: Didn't go through the complete syslog but errors appear limited to the loop device, so start by deleting and recreating your docker image. Also check your cache device for errors: btrfs dev stats /mnt/cache Output should be all zeros if all is well.
Yup, all zeros.
12 minutes ago, johnnie.black said: Also your btrfs filesystem is fully allocated, this is bad news and can give not enough space errors, see thread below to try to fix it:
Ah, thanks for the help! Is there any obvious way to catch when this happens? I'm pretty surprised I could fill up a 500gb SSD when it's just used for cache + a 100gb Docker system. Am I better off just using XFS for my cache drive? I don't plan on setting up a cache pool.
June 30, 2017 Community Expert
6 minutes ago, doctor15 said: Ah, thanks for the help! Any obvious way to catch when this happens?
It can only happen if the device is completely filled. You can monitor it with:
btrfs fi show /mnt/cache
Label: none uuid: cd78c3da-0440-4cca-bbae-ed91c965df83
    Total devices 1 FS bytes used 243.88GiB
    devid 1 size 465.76GiB used 465.76GiB path /dev/sdc1
When allocated (used) is the same as or very close to the total size, it's time to run a balance. As you can see, all 465.76GiB are allocated, which is not good.
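The comparison described above (allocated "used" versus device "size" on the devid line) can be sketched as a small parser. This is only an illustration: the function name `allocation_report` and the 90% warning threshold are assumptions, not an Unraid or btrfs recommendation, and the sample text is the output quoted in this post.

```python
import re

# Per-device line of `btrfs fi show`, e.g.
# "devid 1 size 465.76GiB used 465.76GiB path /dev/sdc1"
DEVID_RE = re.compile(
    r"devid\s+\d+\s+size\s+([\d.]+)(\w+)\s+used\s+([\d.]+)(\w+)\s+path\s+(\S+)"
)

# Binary unit suffixes; an unknown suffix raises KeyError.
UNITS = {"B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}


def allocation_report(fi_show_text, threshold=0.9):
    """Return (path, allocated_fraction, needs_attention) for each device.

    threshold is a hypothetical cut-off chosen for this sketch.
    """
    report = []
    for m in DEVID_RE.finditer(fi_show_text):
        size = float(m.group(1)) * UNITS[m.group(2)]
        allocated = float(m.group(3)) * UNITS[m.group(4)]
        fraction = allocated / size
        report.append((m.group(5), fraction, fraction >= threshold))
    return report


# Output quoted in the post above.
sample = (
    "Label: none uuid: cd78c3da-0440-4cca-bbae-ed91c965df83\n"
    "    Total devices 1 FS bytes used 243.88GiB\n"
    "    devid 1 size 465.76GiB used 465.76GiB path /dev/sdc1\n"
)

for path, fraction, flagged in allocation_report(sample):
    note = " (consider running a balance)" if flagged else ""
    print(f"{path}: {fraction:.0%} allocated{note}")
```

Note that the check deliberately looks at the devid line rather than "FS bytes used": in the sample only 243.88GiB of data is stored, yet the device is fully allocated, which is exactly the situation a balance is meant to correct.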
June 30, 2017 Author
Cool, I'm back up and running. Thanks for the help and explanations!
# btrfs fi df /mnt/cache
Data, single: total=26.00GiB, used=24.48GiB
System, single: total=32.00MiB, used=16.00KiB
Metadata, single: total=2.00GiB, used=93.14MiB
GlobalReserve, single: total=16.00MiB, used=0.00B
Per recommendations, I've changed my Docker file size to 20gb. I'll try to be more vigilant keeping an eye on things.