abhi.ko Posted November 23, 2020 (edited)

None of my docker containers can be accessed. I only just noticed this, and I believe it started sometime this morning. Everything was working fine a couple of hours ago: I updated Plex to the latest beta server version this morning, and I was getting pushbullet notifications from radarr, tautulli, etc. as recently as 45 minutes ago. Now none of the containers can load because, from what I can see in the logs, the docker image is inaccessible (read-only). I tried stopping the docker service and restarting it, but that fails, saying the docker image location is read-only (excerpt from the log below). I have no idea how or why the permissions got changed. I am on 6.8.3 (stable).

Nov 23 11:29:42 Tower emhttpd: shcmd (1708): umount /var/lib/docker
Nov 23 11:30:09 Tower login[15119]: ROOT LOGIN on '/dev/pts/0'
Nov 23 11:34:46 Tower ool www[27234]: /usr/local/emhttp/plugins/dynamix/scripts/emcmd 'cmdStatus=Apply'
Nov 23 11:34:46 Tower emhttpd: Starting services...
Nov 23 11:34:46 Tower emhttpd: shcmd (1723): /etc/rc.d/rc.samba restart
Nov 23 11:34:48 Tower root: Starting Samba: /usr/sbin/smbd -D
Nov 23 11:34:48 Tower root: /usr/sbin/nmbd -D
Nov 23 11:34:48 Tower root: /usr/sbin/wsdd
Nov 23 11:34:48 Tower root: /usr/sbin/winbindd -D
Nov 23 11:34:48 Tower emhttpd: shcmd (1736): /usr/local/sbin/mount_image '/mnt/user/system/docker.img' /var/lib/docker 100
Nov 23 11:34:48 Tower root: truncate: cannot open '/mnt/cache/system/docker.img' for writing: Read-only file system
Nov 23 11:34:48 Tower kernel: BTRFS info (device loop2): using free space tree
Nov 23 11:34:48 Tower kernel: BTRFS info (device loop2): has skinny extents
Nov 23 11:34:48 Tower root: ERROR: unable to resize '/var/lib/docker': Read-only file system
Nov 23 11:34:48 Tower root: Resize '/var/lib/docker' of 'max'
Nov 23 11:34:48 Tower emhttpd: shcmd (1738): /etc/rc.d/rc.docker start
Nov 23 11:34:48 Tower root: starting dockerd ...
Nov 23 11:35:03 Tower emhttpd: shcmd (1740): umount /var/lib/docker

Will post diagnostics soon. In the meantime, can someone please provide some guidance as to why this happened and what caused it? Thanks.

Edited November 26, 2020 by abhi.ko
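For anyone seeing the same "Read-only file system" errors as in the syslog excerpt above, one quick check from the console is whether the kernel currently lists the cache mount as read-only. The following is only a sketch: is_readonly_mount is a hypothetical helper name (not an Unraid command), and it assumes the cache is mounted at /mnt/cache as in the log. A filesystem can also be healthy-looking here and still have problems, so this does not replace posting diagnostics.

```shell
#!/bin/sh
# Sketch: report whether a mount point is currently mounted read-only.
# is_readonly_mount is a hypothetical helper, not part of Unraid.
#
# Usage: is_readonly_mount MOUNTPOINT [MOUNTS_FILE]
# Exit status: 0 = read-only, 1 = read-write, 2 = mount point not found.
is_readonly_mount() {
    # /proc/mounts fields: device mountpoint fstype options dump pass
    awk -v target="$1" '
        $2 == target { opts = $4; found = 1 }   # last matching entry wins
        END {
            if (!found) exit 2
            n = split(opts, o, ",")
            for (i = 1; i <= n; i++) if (o[i] == "ro") exit 0
            exit 1
        }
    ' "${2:-/proc/mounts}"
}

# Example (assumes the cache is mounted at /mnt/cache):
#   is_readonly_mount /mnt/cache && echo "cache is mounted read-only"
```

The second argument exists only so the function can be pointed at a saved copy of the mount table instead of the live /proc/mounts.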
JorgeB Posted November 23, 2020

5 minutes ago, abhi.ko said: why this happened and what caused this?

Without the diags we can only guess, possibly a problem with the cache filesystem.
abhi.ko Posted November 23, 2020 (Author)

1 minute ago, JorgeB said: Without the diags we can only guess, possibly a problem with the cache filesystem.

Thanks @JorgeB. I'm trying to post it, but it has been downloading for a long time (>20 min), so I'm not sure if something is wrong. Hopefully it will be done soon. The funny thing is I haven't done anything to my cache drive recently. I did change the drive about 2-3 months back, and everything has been working fine until now.
JorgeB Posted November 23, 2020

If it hasn't downloaded them after 2 minutes, it won't. Get them on the console by typing "diagnostics".
abhi.ko Posted November 23, 2020 (Author)

10 minutes ago, JorgeB said: If it didn't download them after 2 minutes it won't, get them on the console by typing "diagnostics"

Thanks, got it.

tower-diagnostics-20201123-1240.zip
trurl Posted November 23, 2020

On mobile now so can't look at Diagnostics yet, but likely you corrupted docker.img by filling it. As further evidence, that syslog snippet shows you have it set to 100G when 20G should be more than enough. Making it larger won't fix anything; it will just make it take longer to fill.
JorgeB Posted November 23, 2020

There's a problem with the cache filesystem. You should back up and re-format, then restore the data. It's also best to recreate the docker image.
abhi.ko Posted November 23, 2020 (Author, edited)

1 hour ago, trurl said: On mobile now so can't look at Diagnostics yet, but likely you corrupted docker.img by filling it. Further evidence in that syslog snippet shows you want it at 100G when 20G should be more than enough. Making it larger won't fix anything it will just make it take longer to fill.

40 minutes ago, JorgeB said: There's a problem with the cache filesystem, you should backup and re-format, then restore the data and also best recreate the docker image.

Thank you @JorgeB and @trurl! How do I achieve this? I am planning to manually cp all the cache folders over to a location on the array. Is there a faster or better way to do this? I have the appdata and USB backed up as recently as yesterday via the CA Backup/Restore plugin, so those should hopefully not be corrupt.

How do I go about formatting an already assigned drive? Sorry for the questions, I just don't want to do something I think is right and make this problem worse.

As for recreating the docker image and reducing the size to 20G: do I just create a new image and reinstall all the dockers from my templates? The templates should still be available, right?

Edited November 23, 2020 by abhi.ko
trurl Posted November 23, 2020

Your system share where docker.img lives has some files on disk15. And as mentioned there is not any good reason to have 100G docker.img. Have you had problems filling it?

What do you get from the command line with this?

ls -lah /mnt/disk15/system
abhi.ko Posted November 23, 2020 (Author)

3 minutes ago, trurl said: Your system share where docker.img lives has some files on disk15. And as mentioned there is not any good reason to have 100G docker.img. Have you had problems filling it? What do you get from the command line with this? ls -lah /mnt/disk15/system

Thanks @trurl. I was typing a reply to both of you, but here is the output of the ls command:

root@Tower:~# ls -lah /mnt/disk15/system
total 0
drwxrwxrwx 3 root root 21 Nov 7 13:15 ./
drwxrwxrwx 12 nobody users 201 Nov 23 00:17 ../
drwxrwxrwx 2 root root 25 Nov 7 13:15 libvirt/
abhi.ko Posted November 23, 2020 (Author)

11 minutes ago, abhi.ko said: And as mentioned there is not any good reason to have 100G docker.img. Have you had problems filling it?

What do you mean by 'problems filling it'?
trurl Posted November 23, 2020

17 minutes ago, abhi.ko said: What do you mean by 'problems filling it'?

You can see how much of docker.img is used on the Dashboard, and we can also see it when you post diagnostics. Currently you have no docker.img, but when you recreate it you should set it to 20G, not 100G. I have 17 dockers and they use less than half of 20G.

The usual reason for filling docker.img is an app writing to a path that isn't mapped. One common mistake is specifying a path within an app that doesn't exactly match the mapped container path, including upper/lower case.

1 hour ago, abhi.ko said: Recreating the docker image
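The case-mismatch mistake described above can be illustrated with a small sketch. check_path_case is a hypothetical helper and the paths in the example are made up; it does not read any real container configuration. It only compares a path typed into an app against the container paths you list, and flags a path that matches a mapping only when case is ignored. Such a path is not covered by the mapping, so writes to it would land inside docker.img.

```shell
#!/bin/sh
# Sketch: compare a path configured inside an app with the container
# paths that are actually mapped. check_path_case is a hypothetical
# helper; it does not inspect Docker or Unraid templates.
#
# Usage: check_path_case APP_PATH MAPPED_CONTAINER_PATH...
# Prints "ok" (exit 0), "case-mismatch: <mapped path>" (exit 1),
# or "unmapped" (exit 2).
check_path_case() (
    app_path=$1
    shift
    lower_app=$(printf '%s' "$app_path" | tr '[:upper:]' '[:lower:]')
    mismatch=
    for mapped in "$@"; do
        mapped=${mapped%/}                      # ignore a trailing slash
        case $app_path in
            "$mapped"|"$mapped"/*) echo "ok"; exit 0 ;;
        esac
        lower_mapped=$(printf '%s' "$mapped" | tr '[:upper:]' '[:lower:]')
        case $lower_app in
            # Same path apart from upper/lower case: not covered by the mapping
            "$lower_mapped"|"$lower_mapped"/*) mismatch=$mapped ;;
        esac
    done
    if [ -n "$mismatch" ]; then
        echo "case-mismatch: $mismatch"
        exit 1
    fi
    echo "unmapped"
    exit 2
)

# Example with made-up paths:
#   check_path_case /Downloads/movies /config /downloads
```

Both the "case-mismatch" and "unmapped" results point at the same underlying problem: the app would be writing somewhere that is not backed by a host path.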
trurl Posted November 23, 2020

37 minutes ago, abhi.ko said: output of the ls command

That is libvirt so docker.img isn't on disk15. Do you actually have any VMs?
abhi.ko Posted November 23, 2020 (Author)

6 minutes ago, trurl said: Do you actually have any VMs?

Yes, a total of 4 VMs, but only one that I really care about: my hassos VM, and maybe the Win 10 VM. The rest are rarely run, so no big deal if they are gone. I just copied the contents of the domains and system shares over to a share on the array.
abhi.ko Posted November 23, 2020 (Author)

11 minutes ago, trurl said: You can see how much of docker.img is used on the Dashboard, and we can also see when you post diagnostics. Currently you have no docker.img, but when you recreate it you should set it to 20G, not 100G. I have 17 dockers and they use less than half of 20G. The usual reason for filling docker.img is an app writing to a path that isn't mapped. One common mistake is specifying a path with an app that doesn't exactly match the mapped container path in upper/lower case.

No, it never filled. Docker is not currently running, so I can't tell you what the usage is right now. I think it was a stupid mistake some years back when I changed the default size of the file to 100G without understanding what I was doing. But it has never filled up.
abhi.ko Posted November 23, 2020 (Author)

20 minutes ago, trurl said: That is libvirt so docker.img isn't on disk15.

That is interesting, since I have it mapped to the system share, which is set to cache prefer, and there is plenty of space left on the cache drive. So I'm not sure why there is another image on Disk15.
abhi.ko Posted November 25, 2020 (Author)

So an update on this: Thanks @JorgeB, I did what you suggested and everything seems to be working from a docker perspective. But my VMs are not starting up now, which is not a big deal for me because I have a snapshot of my hassos VM saved.

What I did:

1. Ran CA Backup & Restore to back up appdata and other files.
2. Copied the system and domains shares (they were set to Cache Prefer) manually to a location on the array. Did not copy over the data since there was nothing valuable to me in cache at the time, just some media files.
3. Stopped the array.
4. To format the cache SSD, changed the filesystem for the cache drive from btrfs to xfs and clicked Apply and Done. I am not planning to use a cache pool anytime soon, and I'm not sure whether that corruption was any fault of btrfs. (My knowledge of Linux filesystems is very limited, so this is just google/unraid wiki based.)
5. The cache drive showed up as needing to be formatted, so I ticked the checkbox and clicked Format.
6. Once done, restarted the array.
7. Restored appdata using CA Backup/Restore.
8. Restarted the server as suggested by the plugin.
9. Copied everything over other than the docker image.
10. Re-enabled docker in settings and set the image size to 20G. Thanks @trurl.
11. Rebuilt the docker image from my existing templates. All containers are now back up and running fine!
12. Re-enabled the VM service with the copied-over image, and the VMs showed up. But they are not starting, so I'm not sure what is wrong there. Could the filesystem change have caused that?

That's it for now. Dockers are all back and shares are intact, so I'm happy. I will have to try fixing the VMs again later; any suggestions here would be appreciated.
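The manual copy of the system and domains shares to the array, described above, could be sketched roughly as follows. This is not the exact procedure used in the thread: backup_share is a hypothetical helper, the example paths are assumptions, and it relies only on the standard cp -a and diff -r tools. Copying from /mnt/cache/... to a /mnt/diskN/... path, rather than mixing disk paths with /mnt/user paths, is generally the safer habit on Unraid.

```shell
#!/bin/sh
# Sketch: copy one share directory to a backup location and verify it.
# backup_share is a hypothetical helper; all paths are assumptions.
#
# Usage: backup_share SOURCE_DIR DEST_DIR
# Exit status: 0 = copied and verified, 1 = copy or verify failed,
#              2 = source directory missing.
backup_share() (
    src=$1
    dest=$2
    [ -d "$src" ] || { echo "missing source: $src" >&2; exit 2; }
    mkdir -p "$dest" || exit 1
    # "$src"/. copies the directory contents, dotfiles included;
    # -a preserves ownership, permissions and timestamps where possible
    cp -a "$src"/. "$dest"/ || exit 1
    # Byte-for-byte comparison; slow for large VM images but thorough
    diff -r "$src" "$dest" >/dev/null || { echo "verification failed" >&2; exit 1; }
    echo "copied and verified: $src -> $dest"
)

# Example with assumed paths (run with Docker and the VM service stopped):
#   backup_share /mnt/cache/system  /mnt/disk1/cache_backup/system
#   backup_share /mnt/cache/domains /mnt/disk1/cache_backup/domains
```

The verification step matters here because the source filesystem was already misbehaving; a copy that completes without errors is not necessarily a complete copy.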