August 20, 2021

Out of nowhere I have lost the ability to start any dockers besides netdata. There were no updates or random power outages beforehand, and everything was working fine with a server uptime of 31 days. When I tried to refresh the web interfaces of dockers that were running, I saw some SQL errors. I tried restarting the docker service and now nothing will start up. The last time I had issues I was advised to delete and recreate the docker image, which worked, so it's a fairly new image. I don't think any hardware is failing, as Fix Common Problems doesn't see anything. Any advice would be appreciated. Diagnostics are attached.

tower-diagnostics-20210819-2149.zip
August 20, 2021 Community Expert

Looks like something wrong with your cache pool and that must have broken user shares.

    Jul 20 00:14:00 Tower emhttpd: shcmd (35802): /sbin/btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/cache && /sbin/btrfs balance start -dconvert=raid1,soft -mconvert=raid1,soft /mnt/cache &
    Jul 20 00:14:00 Tower kernel: BTRFS error (device sdj1): balance: invalid convert data profile raid1

    Pool: cache
    Overall:
        Device size:                 953.87GiB
        Device allocated:            450.03GiB
        Device unallocated:          503.84GiB
        Device missing:                  0.00B
        Used:                        202.04GiB
        Free (estimated):            749.69GiB  (min: 749.69GiB)
        Free (statfs, df):           749.69GiB
        Data ratio:                       1.00
        Metadata ratio:                   1.00
        Global reserve:              512.00MiB  (used: 0.00B)
        Multiple profiles:                  no

                 Data      Metadata  System
    Id Path      single    single    single   Unallocated
    -- --------- --------- --------- -------- -----------
     2 /dev/sdj1 447.00GiB 3.00GiB   32.00MiB 503.84GiB
    -- --------- --------- --------- -------- -----------
       Total     447.00GiB 3.00GiB   32.00MiB 503.84GiB
       Used      201.15GiB 914.97MiB 96.00KiB

I will have to pass this off to @JorgeB, may be a few hours before he sees it.
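For anyone who wants to check the same thing on their own pool, here is a rough Python sketch that reads the "Data ratio" and "Metadata ratio" lines from `btrfs filesystem usage` text like the output above. The function names and the `SAMPLE` string are made up for illustration. A ratio of 1.00 means a single copy is stored. A ratio above 1.00 means more than one copy, which is what raid1 gives but also what `dup` on a single device gives, so treat the result as a hint only.

```python
import re


def parse_ratios(usage_text: str) -> dict:
    """Extract the data and metadata ratios from `btrfs filesystem usage` text."""
    ratios = {}
    for kind in ("Data", "Metadata"):
        # Case-sensitive search, so "Data ratio" does not match inside "Metadata ratio".
        match = re.search(rf"{kind} ratio:\s*([0-9.]+)", usage_text)
        if match:
            ratios[kind.lower()] = float(match.group(1))
    return ratios


def stores_multiple_copies(usage_text: str) -> bool:
    """Rough hint: True only if both data and metadata ratios are above 1.00."""
    ratios = parse_ratios(usage_text)
    return len(ratios) == 2 and all(value > 1.0 for value in ratios.values())


# Hypothetical sample, trimmed to the two lines that matter here.
SAMPLE = """\
Data ratio: 1.00
Metadata ratio: 1.00
"""

if __name__ == "__main__":
    print(parse_ratios(SAMPLE))
    print(stores_multiple_copies(SAMPLE))
```

With the numbers posted above, both ratios are 1.00, which matches the `single` profiles shown in the table.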
August 20, 2021 Community Expert

There's a problem with the pool: it's only using one device, and it looks like the 2nd one was never successfully added. That is unlikely to be related to your issue, but to fix it you can try this:

-Stop array
-Unassign cache1 (sdk currently)
-Start array
-Stop array
-Re-assign cache1
-Start array

and post new diags.
August 20, 2021 Community Expert

12 hours ago, trurl said:
    must have broken user shares

The reason I said that is that there is no /mnt/user in the df output.
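The same check can be done without reading df by eye. This is a minimal Python sketch, assuming a Linux-style `/proc/mounts` table where the second field of each line is the mount point; the function names are hypothetical and only `/mnt/user` comes from the thread.

```python
def mount_points(mounts_text: str) -> set:
    """Collect mount points from /proc/mounts-style text (second field per line)."""
    points = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            points.add(fields[1])
    return points


def user_shares_mounted(mounts_text: str, mount_point: str = "/mnt/user") -> bool:
    """Return True if the given mount point appears in the mount table text."""
    return mount_point in mount_points(mounts_text)


if __name__ == "__main__":
    # Assumes a Linux system that exposes /proc/mounts.
    with open("/proc/mounts") as handle:
        print(user_shares_mounted(handle.read()))
```

If this prints False while the array is started, user shares are not mounted, which is the symptom described here.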
August 20, 2021 Community Expert

41 minutes ago, trurl said:
    because no /mnt/user in df

Yes, missed that, but that's not because of the pool:

    Aug 18 23:08:20 Tower shfs: shfs: ../lib/fuse.c:1451: unlink_node: Assertion `node->nlookup > 1' failed.

It's this issue:

A reboot will fix it.
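To spot this kind of crash in a saved syslog, something like the following Python sketch could be used. The regular expression is written against the single log line quoted above, so it is an assumption that other shfs assertion failures follow the same layout. The function name and the `/var/log/syslog` path are illustrative, not taken from the diagnostics.

```python
import re

# Matches lines such as:
#   ... shfs: shfs: ../lib/fuse.c:1451: unlink_node: Assertion `node->nlookup > 1' failed.
ASSERTION = re.compile(r"shfs: .*Assertion `(?P<expr>[^']+)' failed")


def find_shfs_assertions(log_lines) -> list:
    """Return the failed assertion expressions reported by shfs, in log order."""
    hits = []
    for line in log_lines:
        match = ASSERTION.search(line)
        if match:
            hits.append(match.group("expr"))
    return hits


if __name__ == "__main__":
    # Assumed log location; adjust to wherever the syslog was saved.
    with open("/var/log/syslog", errors="replace") as handle:
        for expression in find_shfs_assertions(handle):
            print("shfs assertion failed:", expression)
```

Any hit means shfs has crashed and user shares are gone until the next reboot, as noted above.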
August 21, 2021 Author

18 hours ago, JorgeB said:
    There's a problem with the pool, it's only using one device, looks like the 2nd one was never successfully added, but unlikely to be related to your issue, to fix that you can try this:
    -Stop array
    -Unassign cache1 (sdk currently)
    -Start array
    -Stop array
    -Re-assign cache1
    -Start array
    and post new diags.

Thank you for your time JorgeB. Attached the new diagnostics after following your steps. I did not reboot just yet but will shortly after posting this.

tower-diagnostics-20210820-2115.zip