DaveDoesStuff Posted March 5, 2022 (edited)

Hi all,

Up all night with this one; any help would be appreciated. Basically, I had a corrupted btrfs filesystem on my 2x NVMe cache pool that revealed itself last night in the form of the pool going read-only. I wrestled with that until 3am and decided to just say F it and format the pool (after getting as much as possible off it). I also had an ASM1061 controller issue that I was unaware of and have since solved, but all things considered that appears to have been unrelated.

Got the pool formatted and everything moved back onto the cache (system/appdata/domains), and I have been able to get my VMs back up. Docker, however, was having none of it. I was using it in "directory mode", and since I knew not all files made it off the cache I decided to delete the directory via the GUI and rebuild (Settings > Docker > Directory field > Delete checkbox). But upon attempting to start Docker (I've tried both directory mode and btrfs image mode) it just errors out as per the below; no combination of options fixes it, even if I manually delete the directory via the terminal.

Docker settings page: (screenshot)

Directory mode error:

Mar 5 10:28:02 iBstorage emhttpd: shcmd (63791): /usr/local/sbin/mount_image '/mnt/user/system/docker/' /var/lib/docker 20
Mar 5 10:28:02 iBstorage root: mount: /var/lib/docker: mount(2) system call failed: No such file or directory.
Mar 5 10:28:02 iBstorage emhttpd: shcmd (63791): exit status: 32

BTRFS image error:

Mar 5 10:31:22 iBstorage root: Checksum: crc32c
Mar 5 10:31:22 iBstorage root: Number of devices: 1
Mar 5 10:31:22 iBstorage root: Devices:
Mar 5 10:31:22 iBstorage root: ID SIZE PATH
Mar 5 10:31:22 iBstorage root: 1 20.00GiB /mnt/cache/system/docker/docker.img
Mar 5 10:31:22 iBstorage kernel: BTRFS: device fsid dd9a73e4-509f-43d1-8455-685e0eb24740 devid 1 transid 5 /dev/loop3 scanned by udevd (9404)
Mar 5 10:31:22 iBstorage kernel: BTRFS info (device loop3): enabling free space tree
Mar 5 10:31:22 iBstorage kernel: BTRFS info (device loop3): using free space tree
Mar 5 10:31:22 iBstorage kernel: BTRFS info (device loop3): has skinny extents
Mar 5 10:31:22 iBstorage kernel: BTRFS info (device loop3): flagging fs with big metadata feature
Mar 5 10:31:22 iBstorage kernel: BTRFS info (device loop3): enabling ssd optimizations
Mar 5 10:31:22 iBstorage kernel: BTRFS info (device loop3): creating free space tree
Mar 5 10:31:22 iBstorage kernel: BTRFS info (device loop3): setting compat-ro feature flag for FREE_SPACE_TREE (0x1)
Mar 5 10:31:22 iBstorage kernel: BTRFS info (device loop3): setting compat-ro feature flag for FREE_SPACE_TREE_VALID (0x2)
Mar 5 10:31:22 iBstorage kernel: BTRFS info (device loop3): checking UUID tree
Mar 5 10:31:22 iBstorage root: mount: /var/lib/docker: mount(2) system call failed: No such file or directory.
Mar 5 10:31:22 iBstorage root: mount error

The Docker page itself displays this error after it fails to start:

Warning: stream_socket_client(): unable to connect to unix:///var/run/docker.sock (No such file or directory) in /usr/local/emhttp/plugins/dynamix.docker.manager/include/DockerClient.php on line 682
Couldn't create socket: [2] No such file or directory
Warning: Invalid argument supplied for foreach() in /usr/local/emhttp/plugins/dynamix.docker.manager/include/DockerClient.php on line 866

And the dashboard page shows the same thing at the bottom, all clearly the same error being screamed at me:

Warning: stream_socket_client(): unable to connect to unix:///var/run/docker.sock (No such file or directory) in /usr/local/emhttp/plugins/dynamix.docker.manager/include/DockerClient.php on line 682
Couldn't create socket: [2] No such file or directory

Diagnostics attached. I'd really appreciate any help possible on this; I've had basically no sleep, so I wouldn't be surprised if I'm doing something stupid.

ibstorage-diagnostics-20220305-1046.zip

Edited March 5, 2022 by DaveDoesStuff (additional info)
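Worth noting: both the directory-mode and image-mode attempts die on the identical mount(2) failure, which suggests the problem is below the image/directory layer rather than in either one. A quick filter over the syslog lines makes the shared symptom obvious (a sketch; the two log lines are just copied from the excerpts above):

```shell
# Both failure modes end in the same mount(2) error; filter the copied
# syslog lines to confirm they share the symptom.
cat <<'EOF' > /tmp/docker-mount.log
Mar 5 10:28:02 iBstorage root: mount: /var/lib/docker: mount(2) system call failed: No such file or directory.
Mar 5 10:31:22 iBstorage root: mount: /var/lib/docker: mount(2) system call failed: No such file or directory.
EOF
grep -c 'mount(2) system call failed' /tmp/docker-mount.log   # prints 2
```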
Squid Posted March 5, 2022

I suspect it's something with the btrfs on the cache drive, in which case @JorgeB should be able to assist. (Personally, I'm not a fan of btrfs if you're not going to be running a multi-device pool; xfs is better on a single device.)

Other thoughts, though: at some point you switched to a docker folder, and for some reason half of it wound up on the array, and mover is attempting to move it. You really should just delete that entire docker folder, as it's not necessary if you're now using an image.
Solution DaveDoesStuff Posted March 5, 2022 (edited)

8 minutes ago, Squid said:
I suspect it's something with the btrfs on the cache drive...

Thank you for the reply. It is a 2x NVMe pool; I ended up doing this after a massive cache failure a year ago. I may re-examine it if this happens again.

I switched from a docker img around the time 6.9.0 came out because I was getting excessive writes to the cache pool, and it did indeed fix that. Some folders ended up on the array during last night's/this morning's attempt to rescue shares from the cache pool (I moved system to the array; well, it moved 75% of it, the rest was corrupted so badly it would put the pool into read-only mode when I tried to move it). I'm not actually going back to an image; I only wanted to try it to see if it was any different. I had also totally nuked the docker directory before posting, both via the GUI and a good old rm -rf when the GUI failed to do it.

SOLUTION: On the plus side, I have actually resolved this with the single worst tool everyone has in their toolbox: a system restart (the first one since formatting the pool and nuking the docker directories). I have now been able to re-create the docker directory and get my apps back. Thanks for the help anyway. I will mark the subject as solved.

Edited March 5, 2022 by DaveDoesStuff
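For anyone finding this later, the full reset sequence amounted to roughly the following. This is a hedged sketch, not an official procedure: /mnt/user/system/docker is the directory-mode path from my setup, and the DRY_RUN guard makes it only print each command so you can review before running it for real:

```shell
# Dry-run sketch of the reset that worked: stop docker, remove the
# directory-mode folder, then reboot to clear leftover stale mount state.
# DRY_RUN=1 prints each command instead of executing it.
DRY_RUN=1
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "would run: $*"; else "$@"; fi
}
run /etc/rc.d/rc.docker stop          # Unraid's docker service script
run rm -rf /mnt/user/system/docker    # directory-mode folder (my path; check yours)
run reboot                            # the step that actually fixed it for me
```

Set DRY_RUN=0 only once you've confirmed the paths match your system.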
JorgeB Posted March 5, 2022

Good that it's solved, but this is a new filesystem and some data corruption was already detected:

Mar 5 03:27:57 iBstorage kernel: BTRFS info (device dm-6): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
Mar 5 03:27:57 iBstorage kernel: BTRFS info (device dm-6): bdev /dev/mapper/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0

Ryzen with overclocked RAM is known to corrupt data; max speed for that config should be set at 2133MT/s, not 3000MT/s.
DaveDoesStuff Posted March 5, 2022

7 minutes ago, JorgeB said:
Good that it's solved, but this is a new filesystem and some data corruption was already detected...

Thanks Jorge. I could be wrong with my sleep-deprived memory, but I think that was before I destroyed the pool; again, I could be wrong though. No idea how that RAM is running at that speed, it SHOULD be set to 2133. We had some blackouts here a while ago, so possibly the BIOS settings reset to defaults without me realising. I'll get that sorted this evening. Thank you for pointing it out!
JorgeB Posted March 5, 2022

12 minutes ago, DaveDoesStuff said:
I could be wrong with my sleep-deprived memory, but I think that was before I destroyed the pool

Likely it was, since it was probably detecting corruption before, and that is probably what caused the filesystem to corrupt in the first place. But this is new: after a format, the stats start over for that filesystem.
JorgeB Posted March 5, 2022

Once you set the RAM to the correct speed, you can reset the stats and then monitor to confirm there are no more errors; more info below:

https://forums.unraid.net/topic/46802-faq-for-unraid-v6/?do=findComment&comment=700582
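Concretely, the check-and-reset cycle looks like the sketch below. The commands are echoed rather than executed here as a safety guard (drop the leading echo to run them for real on the server), and /mnt/cache is assumed from the paths earlier in the thread:

```shell
# Check btrfs's per-device error counters, then zero them with -z so any
# future corruption shows up cleanly. "echo" is a dry-run guard: remove it
# to actually run these on the Unraid box.
MNT=/mnt/cache                        # pool mount point (adjust to yours)
echo btrfs device stats "$MNT"        # prints wr/rd/flush/corrupt/gen per device
echo btrfs device stats -z "$MNT"     # -z zeroes the counters after printing
```

After resetting, re-check following some uptime; counters staying at 0 would suggest the RAM speed change helped.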