trurl Posted September 20, 2021 Share Posted September 20, 2021 21 hours ago, Squid said: partial explanation for no containers Because your appdata and system shares are on the array instead of all on pool where they belong. Quote Link to comment
Profezor Posted September 20, 2021 Author Share Posted September 20, 2021 (edited) On 9/19/2021 at 5:05 PM, Squid said: I just assumed the logs got cut off. Corruption exists on drive 4 (partial explanation for no containers), and if the server hasn't been powered off and cabling reseated, then the parity drive needs that and it also explains why the check is slow. Squid - Dockers Tab started after the Disk 4 system check and repair. You guys are a life saver. I guess I am not totally out of the woods yet. As most dockers are giving a Execution Error 403. I need some newer, better drives. I guess I should run a parity asap. galaxy-diagnostics-20210920-1756.zip Edited September 20, 2021 by Profezor Quote Link to comment
trurl Posted September 20, 2021 Share Posted September 20, 2021 1 hour ago, Profezor said: most dockers are giving a Execution Error 403 You will probably have to recreate docker.img. It was ridiculously large at 150G anyway. Looks like it was using 29G of that 150G, which makes me suspect you have some application writing to a path that isn't mapped. Try 20G when you recreate it. But while you have Docker disabled and before recreating it at 20G, also disable VM Manager, then run Mover so you can get appdata, domains, system shares moved off the array. Having them on the array will keep drives spun up since these always have open files, and will impact performance of dockers/VMs if they are on the slower parity array. https://wiki.unraid.net/Manual/Troubleshooting#How_do_I_recreate_docker.img.3F https://wiki.unraid.net/Manual/Troubleshooting#Restoring_your_Docker_Applications Quote Link to comment
Profezor Posted September 20, 2021 Author Share Posted September 20, 2021 4 hours ago, trurl said: You will probably have to recreate docker.img. It was ridiculously large at 150G anyway. Looks like it was using 29G of that 150G, which makes me suspect you have some application writing to a path that isn't mapped. Try 20G when you recreate it. But while you have Docker disabled and before recreating it at 20G, also disable VM Manager, then run Mover so you can get appdata, domains, system shares moved off the array. Having them on the array will keep drives spun up since these always have open files, and will impact performance of dockers/VMs if they are on the slower parity array. https://wiki.unraid.net/Manual/Troubleshooting#How_do_I_recreate_docker.img.3F https://wiki.unraid.net/Manual/Troubleshooting#Restoring_your_Docker_Applications Doing this now. Will report back. Quote Link to comment
Profezor Posted September 22, 2021 Author Share Posted September 22, 2021 Dockers visible mostly. Most dockers green lights (running), but database errors and execution errors. Delete docker.img again. Diagnostics attached galaxy-diagnostics-20210922-1339.zip Quote Link to comment
trurl Posted September 22, 2021 Share Posted September 22, 2021 Are these the latest diagnostics? Doesn't look like you did anything I said. On 9/20/2021 at 1:02 PM, trurl said: while you have Docker disabled and before recreating it at 20G, also disable VM Manager, then run Mover so you can get appdata, domains, system shares moved off the array. All these shares still have files on the array. And your docker.img is still 150G Quote Link to comment
Profezor Posted September 22, 2021 Author Share Posted September 22, 2021 I did everything you said. Perhaps I sent the wrong file. Quote Link to comment
Profezor Posted September 22, 2021 Author Share Posted September 22, 2021 (edited) Ran them again. And again, I did everything. Fixed Disk 4, used mover, shrank img to 20 down from 150, recreated img, etc. Dockers visible mostly. Most dockers green lights (running), but database errors and execution errors so they aren't functioning. galaxy-diagnostics-20210922-1545.zip Edited September 22, 2021 by Profezor Quote Link to comment
trurl Posted September 22, 2021 Share Posted September 22, 2021 Still doesn't look like you did anything, still corruption on disk4, docker.img is still 150G though it looks like you did configure it for 20G, which wouldn't take effect until you recreate it. Even those previous diagnostics had it reconfigured for 20G but it was still 150G. And appdata, domains, system still have files on the array. Doesn't even look like you ran mover. Let's break things down. Go to Settings and disable Docker. Leave it disabled for now. Then check filesystem on disk4 and post the output. Quote Link to comment
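For readers following along, the kind of syslog check described in this post can be sketched in a few lines of Python. This is only an illustration, not a tool used in the thread: the function names are hypothetical, and it assumes the kernel reports XFS problems with the phrase "Metadata corruption detected" after an `XFS (device):` prefix, as in the syslog excerpts quoted elsewhere in this thread.

```python
import re

# Assumed message shape: "... kernel: XFS (md4): Metadata corruption detected at ..."
XFS_CORRUPTION = re.compile(r"XFS \((?P<dev>[^)]+)\): Metadata corruption detected")

def corrupted_devices(syslog_text):
    """Return the sorted device names that logged XFS metadata corruption."""
    return sorted({m.group("dev") for m in XFS_CORRUPTION.finditer(syslog_text)})

def mentions(syslog_text, keyword):
    """Return the syslog lines containing keyword (case-insensitive)."""
    needle = keyword.lower()
    return [line for line in syslog_text.splitlines() if needle in line.lower()]

# Sample lines copied from the syslog excerpt quoted in this thread.
sample = """\
Sep 22 13:34:42 Galaxy kernel: XFS (md4): Ending clean mount
Sep 22 13:46:23 Galaxy kernel: XFS (md4): Metadata corruption detected at xfs_dinode_verify+0xa3/0x581 [xfs], inode 0x23f721790 dinode
Sep 22 13:46:23 Galaxy kernel: XFS (md4): Unmount and run xfs_repair
"""

print(corrupted_devices(sample))        # ['md4']
print(len(mentions(sample, "mover")))   # 0
```

An empty result from `mentions(text, "mover")` would only suggest that Mover was not logged in that file; it does not prove Mover never ran, since logging can be disabled or the log can be rotated.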
Profezor Posted September 23, 2021 Author Share Posted September 23, 2021 (edited) Not sure of your time zone. I am in Europe. I assure you I have done everything you have said TWICE. I can't tell you what the logs say as I am not that technical. I need the server back up now after a month. So I will happily do whatever you say again 🙂 Thanks. Off to Settings. Edited September 23, 2021 by Profezor Quote Link to comment
trurl Posted September 23, 2021 Share Posted September 23, 2021 3 hours ago, Profezor said: Not sure of your time zone. I am in Europe. Irrelevant. We see lots of different timezones. I am basing what I said completely on the contents of your diagnostics. How exactly are you getting those diagnostics? Are you sure you are giving us the current diagnostics? For example, from those latest diagnostics you posted:
20 hours ago, trurl said: still corruption on disk4
Though it is mounting (from logs/syslog.txt)
Sep 22 13:34:42 Galaxy emhttpd: shcmd (39): mount -t xfs -o noatime /dev/md4 /mnt/disk4
Sep 22 13:34:42 Galaxy kernel: XFS (md4): Mounting V5 Filesystem
Sep 22 13:34:42 Galaxy kernel: XFS (md4): Ending clean mount
later we see
Sep 22 13:46:23 Galaxy kernel: XFS (md4): Metadata corruption detected at xfs_dinode_verify+0xa3/0x581 [xfs], inode 0x23f721790 dinode
Sep 22 13:46:23 Galaxy kernel: XFS (md4): Unmount and run xfs_repair
Perhaps that is a recurring problem due to something like bad connections, and so we would see it again even after repair.
20 hours ago, trurl said: docker.img is still 150G though it looks like you did configure it for 20G
system/df.txt
Filesystem Size Used Avail Use% Mounted on
...
/dev/loop2 150G 29G 120G 20% /var/lib/docker
config/docker.cfg
DOCKER_ENABLED="yes"
DOCKER_IMAGE_FILE="/mnt/user/system/docker/docker.img"
DOCKER_IMAGE_SIZE="20"
20 hours ago, trurl said: appdata, domains, system still have files on the array
shares/appdata.cfg
shareUseCache="prefer" # Share exists on cache,disk1,disk3,disk4,disk5
shares/domains.cfg
shareUseCache="prefer" # Share exists on disk3,disk4
shares/system.cfg
shareUseCache="prefer" # Share exists on cache,disk3,disk4
There are some reasons that mover might not move those, but searching your syslog doesn't show mover ever being invoked. Quote Link to comment
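The comparisons made in this post can be reproduced from the diagnostics text with a short script. The sketch below is hypothetical and makes several assumptions: the function names are invented, `DOCKER_IMAGE_SIZE` is taken to be in the same G units that `df` prints, and the share config is assumed to carry a `# Share exists on ...` comment in the form quoted from the diagnostics.

```python
import re

def mounted_size_g(df_text, mount_point="/var/lib/docker"):
    """Return the size (in G) that df reports for mount_point, or None."""
    for line in df_text.splitlines():
        fields = line.split()
        # Columns: Filesystem Size Used Avail Use% Mounted-on
        if len(fields) == 6 and fields[5] == mount_point and fields[1].endswith("G"):
            return float(fields[1][:-1])
    return None

def configured_size_g(cfg_text):
    """Return the DOCKER_IMAGE_SIZE value from docker.cfg text, or None."""
    m = re.search(r'^DOCKER_IMAGE_SIZE="(\d+)"', cfg_text, re.MULTILINE)
    return float(m.group(1)) if m else None

def array_disks(share_cfg_text):
    """Return the array disks listed in a '# Share exists on ...' comment."""
    m = re.search(r"#\s*Share exists on\s+(\S+)", share_cfg_text)
    if not m:
        return []
    return [name for name in m.group(1).split(",") if name.startswith("disk")]

# Sample text copied from the excerpts in this post.
df_txt = """\
Filesystem Size Used Avail Use% Mounted on
/dev/loop2 150G 29G 120G 20% /var/lib/docker
"""
docker_cfg = 'DOCKER_ENABLED="yes"\nDOCKER_IMAGE_SIZE="20"\n'
appdata_cfg = 'shareUseCache="prefer"\n# Share exists on cache,disk1,disk3,disk4,disk5\n'

print(mounted_size_g(df_txt), configured_size_g(docker_cfg))  # 150.0 20.0
print(array_disks(appdata_cfg))  # ['disk1', 'disk3', 'disk4', 'disk5']
```

A mounted size that differs from the configured size is consistent with the image not having been recreated after the setting changed, and a non-empty disk list for a cache-preferred share is consistent with files still sitting on the array.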
trurl Posted September 23, 2021 Share Posted September 23, 2021 20 hours ago, trurl said: Go to Settings and disable Docker. Leave it disabled for now. Then check filesystem on disk4 and post the output. Also, go to User Shares, click Compute All button at the bottom, wait for the complete results. If you don't see the complete results after several minutes, refresh the page. After you get the complete results post a screenshot. Quote Link to comment
JorgeB Posted September 23, 2021 Share Posted September 23, 2021 Make sure you're running xfs_repair without the -n flag or nothing will be done. Quote Link to comment
Profezor Posted September 24, 2021 Author Share Posted September 24, 2021 (edited) I haven't forgotten. Apparently I can't reach the system remotely anymore, so I need to set up a monitor and then figure out how to get you the output from the Disk4 check. I did run the repair with the -n flag. I will not do that this time, as per JorgeB. MOVER - runs for hours and hours, but I guess it really isn't moving much, as you point out. BTW, I mentioned the timezone just to let you know that I am not ignoring this problem, I just may be sleeping 🙂 Thanks for your continued help. I gotta get this back online. Edited September 24, 2021 by Profezor accuracy Quote Link to comment
Profezor Posted September 25, 2021 Author Share Posted September 25, 2021 I was only able to take a photo of the disk 4 file system check. What next? Quote Link to comment
itimpi Posted September 25, 2021 Share Posted September 25, 2021 That looks as if it has managed to repair any corruption without any data loss. Quote Link to comment
Profezor Posted September 25, 2021 Author Share Posted September 25, 2021 Yes, it seems so. Waiting for further instructions from Squid as my last diagnostic report did not show the actions I took. Additionally, I still can’t use my dockers and Mover doesn’t seem to actually move much. Quote Link to comment
Profezor Posted September 25, 2021 Author Share Posted September 25, 2021 On 9/23/2021 at 3:08 PM, trurl said: Also, go to User Shares, click Compute All button at the bottom, wait for the complete results. If you don't see the complete results after several minutes, refresh the page. After you get the complete results post a screenshot. Quote Link to comment
trurl Posted September 25, 2021 Share Posted September 25, 2021 10 hours ago, Profezor said: only able to take a photo Why weren't you able to get a screenshot? 1 hour ago, Profezor said: waiting for further instructions from Squid as my last diagnostic report did not show the actions I took. Are you sure you aren't waiting for me? Post new diagnostics. Quote Link to comment
Squid Posted September 26, 2021 Share Posted September 26, 2021 6 hours ago, Profezor said: instructions from Squid Hope you're actually waiting for Trurl, as I'm very inconsistent in following threads, and am extremely busy with a CA project that is taking up all my time. 1 Quote Link to comment
Profezor Posted September 27, 2021 Author Share Posted September 27, 2021 On 9/25/2021 at 11:10 PM, trurl said: Why weren't you able to get a screenshot? Are you sure you aren't waiting for me? Post new diagnostics. Wasn't able to take a screenshot as I lost internet access on the server machine. Fixed now. Attached are the diagnostics taken after the Disk4 fix, with Docker disabled. Thanks galaxy-diagnostics-20210927-1633.zip Quote Link to comment
trurl Posted October 1, 2021 Share Posted October 1, 2021 Your screenshot of User Shares above shows appdata on cache and disks 1,3,4 but your diagnostics say there is also appdata on disk5. And the diagnostics also show you had filled log space with entries from mover. Nothing can move open files, and mover won't move duplicates. Looks like you have duplicates. Reboot if you haven't already to get log space cleared. Then go to the command line and post the results of ls -lah /mnt/cache/appdata and ls -lah /mnt/user0/appdata Quote Link to comment
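The duplicate check behind this request can also be sketched in Python. This is an illustrative sketch rather than anything posted in the thread: the function names are hypothetical, and it assumes that a "duplicate" means the same relative file path existing under both `/mnt/cache/appdata` (the pool copy) and `/mnt/user0/appdata` (the array-only view of the share).

```python
import os

def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found

def duplicate_files(pool_root, array_root):
    """Return the relative file paths present under both roots, sorted."""
    return sorted(relative_files(pool_root) & relative_files(array_root))

if __name__ == "__main__":
    # Paths taken from the commands requested in this post.
    for path in duplicate_files("/mnt/cache/appdata", "/mnt/user0/appdata"):
        print(path)
```

The script only lists candidates; deciding which copy to keep still needs a manual look at sizes and timestamps, since the two copies may differ.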
Profezor Posted October 1, 2021 Author Share Posted October 1, 2021 (edited) 11 hours ago, trurl said: Your screenshot of User Shares above shows appdata on cache and disks 1,3,4 but your diagnostics say there is also appdata on disk5. And the diagnostics also show you had filled log space with entries from mover. Nothing can move open files, and mover won't move duplicates. Looks like you have duplicates. Reboot if you haven't already to get log space cleared. Then go to the command line and post the results of ls -lah /mnt/cache/appdata and ls -lah /mnt/user0/appdata
root@Galaxy:~# ls -lah /mnt/cache/appdata
total 232K
drwxrwxrwx 1 nobody users 1.1K Aug 15 19:58 ./
drwxrwxrwx 1 nobody users 40 Mar 7 2021 ../
-rw-rw-rw- 1 terrence users 8.1K Sep 17 00:46 .DS_Store
-rw-rw-rw- 1 terrence users 368 Mar 15 2021 ._.DS_Store
-rw-r--r-- 1 nobody users 2.4K Feb 21 2021 .bashrc
drwxr-xr-x 1 nobody users 28 Mar 12 2021 .cache/
drwxr-xr-x 1 nobody users 42 Mar 12 2021 .config/
drwx------ 1 nobody users 10 Feb 21 2021 .local/
drwx------ 1 nobody users 10 Mar 12 2021 .pki/
-rw-r--r-- 1 nobody users 27K Mar 12 2021 .xorgxrdp.10.log
-rw-r--r-- 1 nobody users 163K Mar 12 2021 .xorgxrdp.10.log.old
drwxrwxrwx 1 nobody users 78 Apr 30 11:38 Grafana-Unraid-Stack/
drwxrwxrwx 1 nobody users 22 May 29 01:58 MusicBrainz-Picard/
drwxrwxrwx 1 nobody users 334 Sep 20 17:22 NginxProxyManager/
drwxrwxrwx 1 nobody users 102 Jun 28 01:02 bazarr/
drwxrwxr-x 1 nobody users 546 Sep 21 13:37 binhex-delugevpn/
drwxrwxr-x 1 nobody users 256 Aug 21 08:57 binhex-jellyfin/
drwxrwxr-x 1 nobody users 56 Feb 18 2021 binhex-krusader/
drwxrwxr-x 1 nobody users 416 Sep 22 13:40 binhex-lidarr/
drwxrwxr-x 1 nobody users 58 Apr 17 22:53 binhex-nginx/
drwxrwxr-x 1 nobody users 202 Jul 6 22:55 binhex-nzbhydra2/
drwxrwxr-x 1 nobody users 208 Jul 30 22:47 binhex-plex/
drwxrwxr-x 1 nobody users 332 Sep 22 13:39 binhex-radarr/
drwxrwxr-x 1 nobody users 246 Aug 17 07:19 binhex-sabnzbdvpn/
drwxrwxr-x 1 nobody users 388 Sep 22 13:40 binhex-sonarr/
drwxrwxrwx 1 nobody users 222 Sep 20 17:22 bitwarden/
drwxr-xr-x 1 root root 120 Jun 11 10:14 cloudflared/
drwxrwxrwx 1 nobody users 12 Jul 3 09:17 dupeGuru/
drwxrwxrwx 1 root root 82 Jul 10 14:46 macinabox/
drwxrwxrwx 1 nobody users 114 Jun 25 01:02 mariadb/
drwxrwxrwx 1 nobody users 160 Jun 21 01:02 nextcloud/
drwxrwxrwx 1 root root 28 Jun 21 13:08 onlyoffice/
drwxrwxrwx 1 nobody users 136 Jun 15 21:33 openvpn-as/
drwxrwxrwx 1 nobody users 36 Jun 30 16:40 organizrv2/
drwxrwxr-x 1 nobody users 50 May 2 12:29 overseerr/
drwxrwxrwx 1 root root 8 Apr 22 20:45 paperless-ng/
drwxrwxrwx 1 nobody users 58 May 28 20:19 papermerge/
drwxrwxrwx 1 nobody users 88 Sep 20 17:22 photoprism/
drwxrwxr-x 1 nobody users 154 Sep 22 13:40 prowlarr/
drwxrwxrwx 1 nobody users 0 Apr 29 21:57 rebuild-dndc/
drwxrwxrwx 1 nobody users 0 Aug 15 19:58 socials/
drwxr-xr-x 1 911 911 114 Jul 15 16:56 speedtest-tracker/
-rwxr-xr-x 1 nobody users 54 Feb 21 2021 startwm.sh*
drwxrwxrwx 1 nobody users 32 Jul 25 12:31 tailscale/
drwxr-xr-x 1 nobody users 40 Apr 2 11:24 tubesync/
drwxr-xr-x 1 nobody users 102 Aug 12 21:44 unmanic/
drwxrwxrwx 1 nobody users 40 Mar 21 2021 vm_custom_icons/
drwxrwxrwx 1 root root 264 Aug 17 13:21 windows11_uupdump/
drwxr-xr-x 1 sshd sshd 582 Apr 2 14:34 wordpress/
root@Galaxy:~# ls -lah /mnt/user0/appdata
total 0
drwxrwxrwx 1 nobody users 37 Aug 15 19:58 ./
drwxrwxrwx 1 nobody users 160 May 28 20:17 ../
drwxrwxrwx 1 nobody users 26 Apr 30 11:38 Grafana-Unraid-Stack/
drwxrwxrwx 1 nobody users 26 Sep 20 17:22 NginxProxyManager/
drwxrwxrwx 1 nobody users 25 Jun 28 01:02 bazarr/
drwxrwxr-x 1 nobody users 35 Sep 13 18:17 binhex-delugevpn/
drwxrwxr-x 1 nobody users 26 Feb 18 2021 binhex-krusader/
drwxrwxr-x 1 nobody users 39 Jul 30 22:47 binhex-plex/
drwxrwxrwx 1 nobody users 64 Sep 13 18:17 photoprism/
But now Disk 4 seems to have a big issue on top of all of this 😞 Edited October 1, 2021 by Profezor Quote Link to comment
trurl Posted October 1, 2021 Share Posted October 1, 2021 1 hour ago, Profezor said: now Disk 4 Post new diagnostics Quote Link to comment
Profezor Posted October 6, 2021 Author Share Posted October 6, 2021 (edited) On 10/1/2021 at 3:06 PM, trurl said: Post new diagnostics I am running a rebuild on Disk 4. It got to 92% with 1 hour to go, and now it has slowed so much that the estimate for the remaining 8% is 40 days. galaxy-diagnostics-20211006-1810.zip Edited October 6, 2021 by Profezor Quote Link to comment