audiocycle Posted December 29, 2019 (edited)
Hi, I noticed this afternoon that my usual Docker containers weren't running anymore. I tried starting them back up but I get an "execution error" message, so I thought maybe updating them would help. When I try updating, though, I get another message indicating that a certain file is read-only.
I haven't changed any settings on my server lately, so I don't get why this problem just appeared! I didn't get any notifications about something going wrong either. Fix Common Problems tells me to investigate the Docker settings because there is an issue with my Docker image, but I don't know what I should be looking at on that page.
Can anyone help me with this? You'll find all the error messages and the diagnostics file below. I am running unRaid v6.7.2 and have no issues in the WebGUI. I haven't restarted the server yet because I read that doing so erases info that could be helpful for troubleshooting.
Thanks!
gateway-diagnostics-20191228-2340.zip
Edited January 5, 2020 by audiocycle: solved
FreeMan Posted December 29, 2019
I would venture to say your docker image file is full or corrupted. Take a look at your container configs and make sure that nothing is set to log inside the docker image instead of to a share. Even though things have been running just fine for a while, if you've got something logging infrequently, but to the wrong place, it would eventually fill the docker.img file.
Someone who can actually interpret the diagnostics will be along soon to see if there's anything more specific, but that should be something to get you started.
audiocycle Posted December 29, 2019 (edited)
The only container that points to /var/lib/docker in its config is cadvisor, but I can't even remember the last time I had it running, or why I installed it, tbh. /var/lib/docker/ is the path for "Host Path 4", but I'm not familiar with its purpose.
Edited December 29, 2019 by audiocycle: typo
Squid Posted December 29, 2019
Your problem is caused by an underlying problem on the cache filesystem; the docker.img is getting remounted as read-only:
Dec 23 13:58:27 Gateway kernel: BTRFS warning (device sdk1): sdk1 checksum verify failed on 1037221888 wanted 5771E135 found 6AAAD160 level 0
Dec 23 13:58:27 Gateway kernel: print_req_error: I/O error, dev loop2, sector 0
Dec 23 13:58:27 Gateway kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 0, flush 1, corrupt 0, gen 0
Dec 23 13:58:27 Gateway kernel: BTRFS warning (device loop2): chunk 13631488 missing 1 devices, max tolerance is 0 for writeable mount
Dec 23 13:58:27 Gateway kernel: BTRFS: error (device loop2) in write_all_supers:3716: errno=-5 IO failure (errors while submitting device barriers.)
Dec 23 13:58:27 Gateway kernel: BTRFS info (device loop2): forced readonly
Since it's BTRFS, I'm going to bow out on the filesystem repair, as I know squat about it, but I do recommend XFS as the filesystem if you have no intent to upgrade to a cache pool.
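For readers who want to check their own syslog for the same pattern, here is a minimal sketch in Python. It only assumes the message layout seen in the excerpt above; the function names are made up for illustration, and other kernel versions may word these messages differently.

```python
import re

# Matches kernel BTRFS messages in the two layouts seen in the excerpt:
#   "kernel: BTRFS warning (device sdk1): ..."
#   "kernel: BTRFS: error (device loop2) in write_all_supers:3716: ..."
BTRFS_LINE = re.compile(
    r"kernel: BTRFS:? (?P<level>warning|error|info|critical) "
    r"\(device (?P<device>[^)]+)\):? ?(?P<message>.*)"
)


def find_btrfs_events(lines):
    """Return (level, device, message) tuples for every BTRFS kernel line."""
    events = []
    for line in lines:
        match = BTRFS_LINE.search(line)
        if match:
            events.append(
                (match.group("level"), match.group("device"), match.group("message").strip())
            )
    return events


def forced_readonly_devices(lines):
    """Return the sorted device names that the kernel remounted read-only."""
    return sorted(
        {device for _, device, message in find_btrfs_events(lines) if "forced readonly" in message}
    )


# Example use (the syslog path is an assumption; adjust it for your system):
#   with open("/var/log/syslog") as handle:
#       print(forced_readonly_devices(handle))
```

In the excerpt above, the device flagged this way is loop2, the loop device behind docker.img, while the checksum warning is on sdk1, the cache drive itself.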
audiocycle Posted December 29, 2019
@Squid Is there a way to change the filesystem of my cache drive without losing any data on it? I don't plan on switching to a cache pool anytime soon, unless someone strongly recommends it. As for BTRFS vs XFS, I can't say I know the difference, so I'll follow your recommendation!
I tried stopping the Docker service and turning it back on again, and my containers all seem to work now. Would anybody know if I can expect the issue to reappear? This feels like blind luck right now.
Squid Posted December 29, 2019
No, it requires a format, so you'd have to copy everything you want to keep onto the array, then switch the format and copy it back.
@johnnie.black would be able to advise on the actual issue with the cache drive, though (as stated, I don't like to advise on btrfs issues).
JorgeB Posted December 30, 2019
On 12/29/2019 at 2:28 AM, Squid said:
BTRFS warning (device sdk1): sdk1 checksum verify failed on 1037221888 wanted 5771E135 found 6AAAD160 level 0
A failed checksum means data got corrupted. You should run a scrub and make sure there are no uncorrectable errors. Running memtest is also a good idea.
audiocycle Posted January 3, 2020 (edited)
Thanks for all the advice guys.
@johnnie.black How do I run a scrub? A quick search tells me it's a Linux command, but I am not familiar at all with those.
I assume running Memtest on an SSD is done the same way as when testing RAM sticks? If so, I'll be fine with that part.
Is there a better way than using unBalance to move all of the cache drive contents to the array? I'm not sure how to approach making a copy of it without playing with disk shares, which seem to be generally discouraged.
Edited January 3, 2020 by audiocycle: more info
JorgeB Posted January 3, 2020
7 hours ago, audiocycle said: How do I run a scrub?
On the main GUI page, click on cache, then scroll down to the "Scrub Status" section.
7 hours ago, audiocycle said: Is there a better way than using unBalance to move all of the cache drive contents on the array?
You can just use the mover, first part of this procedure.
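For readers who prefer the command line to the GUI, a rough equivalent is sketched below in Python. It is not taken from the thread: it assumes the `btrfs` tool from btrfs-progs is installed, that the cache is mounted at /mnt/cache, and that the script runs as root. The function names are hypothetical.

```python
import subprocess


def scrub_commands(mount_point="/mnt/cache"):
    """Build the argv lists for a foreground scrub followed by a status query.

    /mnt/cache is an assumed mount point; pass your own if it differs.
    """
    return [
        # -B keeps the scrub in the foreground so the call returns when it finishes.
        ["btrfs", "scrub", "start", "-B", mount_point],
        ["btrfs", "scrub", "status", mount_point],
    ]


def run_scrub(mount_point="/mnt/cache"):
    """Run both commands and return (argv, return code, combined output) for each."""
    results = []
    for argv in scrub_commands(mount_point):
        completed = subprocess.run(argv, capture_output=True, text=True)
        results.append((argv, completed.returncode, completed.stdout + completed.stderr))
    return results


# Example use (needs root and a mounted btrfs filesystem):
#   for argv, code, output in run_scrub():
#       print(" ".join(argv), "-> exit", code)
#       print(output)
```

A non-zero return code from either command is a reason to read the captured output closely, in particular any count of uncorrectable errors in the status report.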
audiocycle Posted January 3, 2020
So the scrub didn't unveil any errors, but it also only scanned about 1 GB of the 1 TB drive, because that is all the data that remains after following the Mover procedure you linked. (Mover didn't work for me previously, I believe because I hadn't disabled the Docker service.)
libvirt.img in the System share is what didn't move. Is that normal?
The SMART extended self-test completed without error, but the raw value of SMART #202 (Percent lifetime remain) is 10, which sounds worrisome.
I have just rebooted my server with MemTest86 on a USB stick but, as I thought I remembered, it doesn't seem to be testing the SSD. Could that function be limited to newer revisions of MemTest? I'm using memtest 5.01, the most recent BIOS-compatible version.
JorgeB Posted January 4, 2020
Memtest is for RAM testing, not SSDs. If the scrub found no errors, the cache is fine for now. Keep an eye on it and do regular scrubs.
audiocycle Posted January 4, 2020
Ok, thanks for all your advice. Last question: any idea why libvirt.img in my System share wouldn't move when the Mover moved the rest of the share to the array?
JorgeB Posted January 5, 2020
Either the VM service was still running or there's already a libvirt.img in the same system folder on the array.
audiocycle Posted January 5, 2020
Ok, I'll see if a duplicate existed. Thanks again!
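The duplicate check mentioned above can be sketched in Python as follows. The layout is an assumption, not something stated in the thread: array disks mounted as /mnt/disk1, /mnt/disk2, and so on, the cache at /mnt/cache, and the image at system/libvirt/libvirt.img inside the share. The function names are hypothetical.

```python
from pathlib import Path


def candidate_roots(mnt="/mnt"):
    """List the assumed per-disk mount points: /mnt/disk<N> plus /mnt/cache."""
    base = Path(mnt)
    roots = sorted(base.glob("disk[0-9]*"))
    cache = base / "cache"
    if cache.is_dir():
        roots.append(cache)
    return roots


def find_copies(relative_path, roots):
    """Return every root-relative location where the file actually exists."""
    return [
        Path(root) / relative_path
        for root in roots
        if (Path(root) / relative_path).is_file()
    ]


# Example use (the relative path is an assumption; check your VM settings):
#   copies = find_copies("system/libvirt/libvirt.img", candidate_roots())
#   for path in copies:
#       print(path)
```

If more than one path is printed, the same file exists on both the cache and an array disk, which matches the second explanation given for why it was not moved.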