Bait Fish Posted January 1, 2023 (edited) [Version: 6.11.5] Unraid, containers, and plugins. Seems to have had a BTRFS error on sdd1 this morning. Log attached. Gotta run to work now, so keeping this description short. Happy new year, and thanks for any help. homer-diagnostics-20230101-0456.zip
Bait Fish Posted January 1, 2023 Author Now that I'm back in the land of the living, here are the details I could not include in my haste this morning. These errors happened in the middle of the night on a system that had been humming along without any other concerns. Shares that had been there previously are no longer showing, and I'm sure this is a good part of the problem. Unraid Version: 6.11.5. Fix Common Problems greeted me with these issues when I checked on the server after coffee:

Unable to write to cache_nvme - Drive mounted read-only or completely full.
Unable to write to cache_ssd - Drive mounted read-only or completely full.
Unable to write to Docker Image - Docker Image either full or corrupted.

On the Docker tab in Unraid, these errors show:

Docker Containers
APPLICATION VERSION NETWORK PORT MAPPINGS (APP TO HOST) VOLUME MAPPINGS (APP TO HOST) AUTOSTART UPTIME
Warning: stream_socket_client(): unable to connect to unix:///var/run/docker.sock (Connection refused) in /usr/local/emhttp/plugins/dynamix.docker.manager/include/DockerClient.php on line 712
Couldn't create socket: [111] Connection refused
Warning: Invalid argument supplied for foreach() in /usr/local/emhttp/plugins/dynamix.docker.manager/include/DockerClient.php on line 898
Warning: stream_socket_client(): unable to connect to unix:///var/run/docker.sock (Connection refused) in /usr/local/emhttp/plugins/dynamix.docker.manager/include/DockerClient.php on line 712
Couldn't create socket: [111] Connection refused
Warning: Invalid argument supplied for foreach() in /usr/local/emhttp/plugins/dynamix.docker.manager/include/DockerClient.php on line 967
No Docker containers installed

And Unraid Settings > Docker mirrors these errors:

Enable Docker: Yes - One or more paths do not exist (view)
Docker vDisk location: /mnt/user/system/docker/docker.img - Path does not exist
Default appdata storage location: /mnt/user/appdata/ - Path does not exist
itimpi Posted January 2, 2023 Could be that the ‘fuse’ system that handles User Shares has crashed. We would need your system's diagnostics taken while the problem is occurring to be sure. The FCP issues are a bit concerning, so something else could be going on. Have you run a memtest on your system recently, in case it is a RAM-related issue?
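If the webGUI becomes unresponsive while it is happening, the diagnostics can also be collected from a terminal; a minimal sketch, assuming local console or SSH access (the command is built into Unraid and, as far as I recall, writes the archive to the logs folder on the flash drive):

diagnostics

The resulting homer-diagnostics-*.zip under /boot/logs can then be attached here.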
Bait Fish Posted January 2, 2023 Author I'll get a memtest going first thing tomorrow. Thanks for your guidance.
Bait Fish Posted January 3, 2023 Author The memory test passed. First boot into Unraid indicates the cache drive is missing. I'll check cables next. This SSD is only a couple/few months old.
trurl Posted January 3, 2023 4 minutes ago, Bait Fish said: First boot into unRAID indicates the cache drive is missing. I'll check cables next. Post new diagnostics after. Disable Docker and VM Manager in Settings till things are working well again. cache_nvme is showing (XFS) corruption, so check that filesystem. The docker and libvirt img files are both showing corruption; since the system share is on cache_nvme, fix that filesystem first, then you will probably have to recreate them. Corruption on cache_ssd (btrfs) may be more complicated to fix.
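For reference, a minimal sketch of that check from the console, with the array started in Maintenance mode; the device path below is only an example (an assumption), so substitute the actual cache_nvme partition shown on the Main tab:

xfs_repair -nv /dev/nvme0n1p1

The -n flag makes it a read-only check that reports problems without changing anything. The same check can usually be run from the webGUI by clicking the pool device on the Main tab and using the filesystem check option there, which saves having to work out the device name.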
trurl Posted January 3, 2023 14 minutes ago, Bait Fish said: The memory test passed. How long did you let it run?
Bait Fish Posted January 3, 2023 Author It took about 10 hours, 4 passes.
Bait Fish Posted January 3, 2023 Author (edited) I started it up a second time. The cache drive (sde) that showed missing this morning now showed present and ready; nothing was done but restarting 9 hours later. I shut it down and reseated the cables for cache sde, then started back up, and cache sde remained available. I saved diagnostics from this session and have uploaded them to this post as suggested above. I'll attempt repairs now.

Update: cache_nvme repair with the default -n option results in:

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being ignored because the -n option was used. Expect spurious inconsistencies which may be resolved by first mounting the filesystem to replay the log.
        - scan filesystem freespace and inode maps...
block (0,50297084-50297193) multiply claimed by cnt space tree, state - 2
block (0,48998093-48998203) multiply claimed by cnt space tree, state - 2
block (0,50379227-50379336) multiply claimed by cnt space tree, state - 2
block (0,49633120-49633230) multiply claimed by cnt space tree, state - 2
agf_freeblks 64128684, counted 64133581 in ag 0
agf_freeblks 97418327, counted 97438471 in ag 2
sb_icount 4368768, counted 4433280
sb_ifree 44699, counted 1263236
sb_fdblocks 340342643, counted 354388165
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
data fork in ino 58802146 claims free block 7583440
data fork in ino 58802146 claims free block 7583505
data fork in ino 59896646 claims free block 7574017
data fork in ino 59896646 claims free block 7574079
        - agno = 1
        - agno = 2
bad nblocks 10397115 for inode 2175513269, would reset to 10397118
bad nextents 207685 for inode 2175513269, would reset to 207683
        - agno = 3
bad CRC for inode 3227792998
bad CRC for inode 3227792998, would rewrite
would have cleared inode 3227792998
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
free space (0,48811889-48811997) only seen by one free space btree
free space (0,50494325-50494435) only seen by one free space btree
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
bad CRC for inode 3227792998, would rewrite
would have cleared inode 3227792998
bad nblocks 10397115 for inode 2175513269, would reset to 10397118
bad nextents 207685 for inode 2175513269, would reset to 207683
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
would rebuild directory inode 1239086149
Metadata corruption detected at 0x46e010, inode 0xc0643666 dinode
couldn't map inode 3227792998, err = 117
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected inode 3227798067, would move to lost+found
Phase 7 - verify link counts...
Metadata corruption detected at 0x46e010, inode 0xc0643666 dinode
couldn't map inode 3227792998, err = 117, can't compare link counts
No modify flag set, skipping filesystem flush and exiting.

homer-diagnostics-20230103-1458.zip
Bait Fish Posted January 3, 2023 Author Having trouble on mobile modifying the code box above. Scanned with the -nv flag per docs:

Phase 1 - find and verify superblock...
        - block cache size set to 3061336 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 417567 tail block 393702
ALERT: The filesystem has valuable metadata changes in a log which is being ignored because the -n option was used. Expect spurious inconsistencies which may be resolved by first mounting the filesystem to replay the log.
        - scan filesystem freespace and inode maps...
block (0,50297084-50297193) multiply claimed by cnt space tree, state - 2
block (0,48998093-48998203) multiply claimed by cnt space tree, state - 2
block (0,50379227-50379336) multiply claimed by cnt space tree, state - 2
block (0,49633120-49633230) multiply claimed by cnt space tree, state - 2
agf_freeblks 64128684, counted 64133581 in ag 0
agf_freeblks 97418327, counted 97438471 in ag 2
sb_icount 4368768, counted 4433280
sb_ifree 44699, counted 1263236
sb_fdblocks 340342643, counted 354388165
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
data fork in ino 58802146 claims free block 7583440
data fork in ino 58802146 claims free block 7583505
data fork in ino 59896646 claims free block 7574017
data fork in ino 59896646 claims free block 7574079
        - agno = 1
        - agno = 2
bad nblocks 10397115 for inode 2175513269, would reset to 10397118
bad nextents 207685 for inode 2175513269, would reset to 207683
        - agno = 3
bad CRC for inode 3227792998
bad CRC for inode 3227792998, would rewrite
would have cleared inode 3227792998
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
free space (0,48811889-48811997) only seen by one free space btree
free space (0,50494325-50494435) only seen by one free space btree
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 3
        - agno = 1
bad CRC for inode 3227792998, would rewrite
would have cleared inode 3227792998
bad nblocks 10397115 for inode 2175513269, would reset to 10397118
bad nextents 207685 for inode 2175513269, would reset to 207683
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - agno = 0
        - agno = 1
would rebuild directory inode 1239086149
        - agno = 2
        - agno = 3
Metadata corruption detected at 0x46e010, inode 0xc0643666 dinode
couldn't map inode 3227792998, err = 117
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected inode 3227798067, would move to lost+found
Phase 7 - verify link counts...
Metadata corruption detected at 0x46e010, inode 0xc0643666 dinode
couldn't map inode 3227792998, err = 117, can't compare link counts
No modify flag set, skipping filesystem flush and exiting.

XFS_REPAIR Summary    Tue Jan 3 15:25:14 2023

Phase       Start           End             Duration
Phase 1:    01/03 15:25:04  01/03 15:25:04
Phase 2:    01/03 15:25:04  01/03 15:25:04
Phase 3:    01/03 15:25:04  01/03 15:25:09  5 seconds
Phase 4:    01/03 15:25:09  01/03 15:25:10  1 second
Phase 5:    Skipped
Phase 6:    01/03 15:25:10  01/03 15:25:14  4 seconds
Phase 7:    01/03 15:25:14  01/03 15:25:14

Total run time: 10 seconds
trurl Posted January 4, 2023 35 minutes ago, Bait Fish said: No modify flag set Do it again without -n; if it asks for it, use -L.
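In other words, the repair pass is the same command without -n; a sketch, again using the example device name from above (an assumption, substitute your own):

xfs_repair -v /dev/nvme0n1p1

If it refuses to run because of a dirty log, starting the array normally once and then stopping it again usually lets the log replay. Only if the filesystem cannot be mounted at all would you fall back to zeroing the log, which can lose the most recent metadata changes:

xfs_repair -vL /dev/nvme0n1p1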
Bait Fish Posted January 4, 2023 Author I ran through a couple of repair sessions and tried to follow its directions. I pasted all the notes in this code block, including which flag I was using, typically keeping verbose on. I did not catch it telling me to run the -L option, so I have not done that.

Phase 1 - find and verify superblock...
        - block cache size set to 3061336 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 417567 tail block 393702
ERROR: The filesystem has valuable metadata changes in a log which needs to be replayed. Mount the filesystem to replay the log, and unmount it before re-running xfs_repair. If you are unable to mount the filesystem, then use the -L option to destroy the log and attempt a repair. Note that destroying the log may cause corruption -- please attempt a mount of the filesystem before doing this.

Stopped maint mode array. Started array. Stopped array. Started maint mode array. Rechecked with xfs_repair -nv:

Phase 1 - find and verify superblock...
        - block cache size set to 3061320 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 417613 tail block 417613
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
bad CRC for inode 3227792998
bad CRC for inode 3227792998, would rewrite
would have cleared inode 3227792998
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
bad CRC for inode 3227792998, would rewrite
would have cleared inode 3227792998
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - agno = 0
        - agno = 1
would rebuild directory inode 1239086149
        - agno = 2
        - agno = 3
Metadata corruption detected at 0x46e010, inode 0xc0643666 dinode
couldn't map inode 3227792998, err = 117
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected inode 3227798067, would move to lost+found
Phase 7 - verify link counts...
Metadata corruption detected at 0x46e010, inode 0xc0643666 dinode
couldn't map inode 3227792998, err = 117, can't compare link counts
No modify flag set, skipping filesystem flush and exiting.

XFS_REPAIR Summary    Tue Jan 3 16:05:11 2023

Phase       Start           End             Duration
Phase 1:    01/03 16:05:02  01/03 16:05:02
Phase 2:    01/03 16:05:02  01/03 16:05:02
Phase 3:    01/03 16:05:02  01/03 16:05:07  5 seconds
Phase 4:    01/03 16:05:07  01/03 16:05:08  1 second
Phase 5:    Skipped
Phase 6:    01/03 16:05:08  01/03 16:05:11  3 seconds
Phase 7:    01/03 16:05:11  01/03 16:05:11

Total run time: 9 seconds

Ran xfs_repair -v:

Phase 1 - find and verify superblock...
        - block cache size set to 3061320 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 417613 tail block 417613
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
bad CRC for inode 3227792998
bad CRC for inode 3227792998, will rewrite
cleared inode 3227792998
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 3
        - agno = 2
        - agno = 1
clearing reflink flag on inode 2149504425
clearing reflink flag on inode 1073935201
clearing reflink flag on inode 2149760794
clearing reflink flag on inode 3223369736
clearing reflink flag on inode 2193593298
clearing reflink flag on inode 3270882671
clearing reflink flag on inode 286726471
clearing reflink flag on inode 286743730
clearing reflink flag on inode 3307207359
clearing reflink flag on inode 1148980934
clearing reflink flag on inode 1148980936
clearing reflink flag on inode 1148980938
clearing reflink flag on inode 1148980939
clearing reflink flag on inode 1148980940
clearing reflink flag on inode 1148980941
clearing reflink flag on inode 1148980942
clearing reflink flag on inode 1151756820
clearing reflink flag on inode 1151756823
clearing reflink flag on inode 325929482
clearing reflink flag on inode 325929522
clearing reflink flag on inode 325929602
clearing reflink flag on inode 325929606
clearing reflink flag on inode 325929607
clearing reflink flag on inode 325929608
clearing reflink flag on inode 3376954786
clearing reflink flag on inode 1232940612
clearing reflink flag on inode 1232940615
clearing reflink flag on inode 1232940616
clearing reflink flag on inode 1232940617
clearing reflink flag on inode 379736619
clearing reflink flag on inode 380663459
clearing reflink flag on inode 380663497
clearing reflink flag on inode 380663502
clearing reflink flag on inode 380663503
clearing reflink flag on inode 380663510
clearing reflink flag on inode 380663511
clearing reflink flag on inode 380663520
clearing reflink flag on inode 380663521
clearing reflink flag on inode 380663522
clearing reflink flag on inode 380663523
clearing reflink flag on inode 380663524
clearing reflink flag on inode 380663525
clearing reflink flag on inode 380663540
clearing reflink flag on inode 380663541
clearing reflink flag on inode 3410317497
clearing reflink flag on inode 3410343811
clearing reflink flag on inode 3410343893
clearing reflink flag on inode 3410343927
clearing reflink flag on inode 3410649145
Phase 5 - rebuild AG headers and trees...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - agno = 0
        - agno = 1
rebuilding directory inode 1239086149
        - agno = 2
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...

XFS_REPAIR Summary    Tue Jan 3 16:07:43 2023

Phase       Start           End             Duration
Phase 1:    01/03 16:07:33  01/03 16:07:33
Phase 2:    01/03 16:07:33  01/03 16:07:33
Phase 3:    01/03 16:07:33  01/03 16:07:39  6 seconds
Phase 4:    01/03 16:07:39  01/03 16:07:40  1 second
Phase 5:    01/03 16:07:40  01/03 16:07:40
Phase 6:    01/03 16:07:40  01/03 16:07:43  3 seconds
Phase 7:    01/03 16:07:43  01/03 16:07:43

Total run time: 10 seconds
done

xfs_repair -nv:

Phase 1 - find and verify superblock...
        - block cache size set to 3061320 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 417613 tail block 417613
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 3
        - agno = 2
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

XFS_REPAIR Summary    Tue Jan 3 16:08:50 2023

Phase       Start           End             Duration
Phase 1:    01/03 16:08:41  01/03 16:08:41
Phase 2:    01/03 16:08:41  01/03 16:08:41
Phase 3:    01/03 16:08:41  01/03 16:08:46  5 seconds
Phase 4:    01/03 16:08:46  01/03 16:08:47  1 second
Phase 5:    Skipped
Phase 6:    01/03 16:08:47  01/03 16:08:50  3 seconds
Phase 7:    01/03 16:08:50  01/03 16:08:50

Total run time: 9 seconds

xfs_repair -v:

Phase 1 - find and verify superblock...
        - block cache size set to 3061320 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 417613 tail block 417613
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 3
        - agno = 1
        - agno = 2
Phase 5 - rebuild AG headers and trees...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...

XFS_REPAIR Summary    Tue Jan 3 16:10:45 2023

Phase       Start           End             Duration
Phase 1:    01/03 16:10:35  01/03 16:10:35
Phase 2:    01/03 16:10:35  01/03 16:10:35
Phase 3:    01/03 16:10:35  01/03 16:10:40  5 seconds
Phase 4:    01/03 16:10:40  01/03 16:10:41  1 second
Phase 5:    01/03 16:10:41  01/03 16:10:42  1 second
Phase 6:    01/03 16:10:42  01/03 16:10:45  3 seconds
Phase 7:    01/03 16:10:45  01/03 16:10:45

Total run time: 10 seconds
done
trurl Posted January 4, 2023 Start the array in Normal (not Maintenance mode) and post new diagnostics
Bait Fish Posted January 4, 2023 Author New diagnostics after starting array in normal mode homer-diagnostics-20230103-1639.zip
trurl Posted January 4, 2023 Looks like they are mounted now. Have you checked contents? What do you get from command line with this?

ls -lah /mnt/cache_nvme/system

Bait Fish Posted January 4, 2023 Author (edited)

root@Homer:~# ls -lah /mnt/cache_nvme/system
total 0
drwxrwxrwx 4 nobody users 35 Nov 16 2021 ./
drwxrwxrwx 5 nobody users 50 Jan 3 16:48 ../
drwxrwxrwx 2 nobody users 24 Oct 11 16:32 docker/
drwxrwxrwx 2 nobody users 25 Nov 16 2021 libvirt/
root@Homer:~#

Contents appear ok looking through the various directories.
trurl Posted January 4, 2023 What do you get from command line with this?

ls -lah /mnt/cache_nvme/system/docker

and this?

ls -lah /mnt/cache_nvme/system/libvirt

Bait Fish Posted January 4, 2023 Author

root@Homer:/mnt/cache_nvme/system# ls -lah /mnt/cache_nvme/system/docker
total 40G
drwxrwxrwx 2 nobody users 24 Oct 11 16:32 ./
drwxrwxrwx 4 nobody users 35 Nov 16 2021 ../
-rw-rw-rw- 1 nobody users 50G Jan 1 01:39 docker.img
root@Homer:/mnt/cache_nvme/system# ls -lah /mnt/cache_nvme/system/libvirt
total 104M
drwxrwxrwx 2 nobody users 25 Nov 16 2021 ./
drwxrwxrwx 4 nobody users 35 Nov 16 2021 ../
-rw-rw-rw- 1 nobody users 1.0G Dec 31 17:00 libvirt.img
root@Homer:/mnt/cache_nvme/system#
trurl Posted January 4, 2023 Looks like the img files are there, no guarantee they are usable though. docker.img is easy enough to recreate. Do you have a backup of libvirt.img?
Bait Fish Posted January 4, 2023 Author (edited) Uh oh. I have the VMs set to back themselves up automatically on a schedule, BUT I recall that the VM backup executed the morning the system went bad, so I may not have one. Edit: libvirt.img is not in the backup location... I do not recall making any backup of it elsewhere manually. Edit 2: and I did not have a location set in Appdata Backup/Restore for backing up libvirt.img.
trurl Posted January 4, 2023 Enable VM Manager and see what happens.
Bait Fish Posted January 4, 2023 Author VM Manager started without any obvious issues. Started a Win10 VM successfully. Seems good.
trurl Posted January 4, 2023 I guess you could try enabling Docker with that existing img
Bait Fish Posted January 4, 2023 Author Docker service started without complaints. Containers appear to have auto-started successfully as well! You mentioned BTRFS corruption; I haven't tried that yet. Should I tackle that next? And thanks for walking me through all this.
trurl Posted January 4, 2023 cache_ssd is mounted but only has 7.4G data. Is that the expected amount?
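For the btrfs pool, a minimal sketch of a read-only health check from the console, assuming the pool is mounted at /mnt/cache_ssd: the first command prints the error counters btrfs has recorded for each device, and the scrub re-reads everything and verifies checksums without modifying data (-B keeps it in the foreground until it finishes):

btrfs dev stats /mnt/cache_ssd
btrfs scrub start -B /mnt/cache_ssd

Any uncorrectable errors the scrub reports would point to files that need to be restored from backup rather than repaired in place.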