Shares Missing, Docker Service Failed. Suspect failed drive


Solved by trurl.


Now that I'm back in the land of the living, here are the details I could not include in my haste this morning. These errors happened in the middle of the night on a system that had been humming along with no other concerns previously.

 

Shares that had been there previously are not showing. I'm sure this is a good part of the problem.

 

Unraid Version: 6.11.5. 

 

Fix Common Problems greeted me with these issues when I checked on the server after coffee:

Unable to write to cache_nvme: Drive mounted read-only or completely full.
Unable to write to cache_ssd: Drive mounted read-only or completely full.
Unable to write to Docker Image: Docker Image either full or corrupted.

 

On the Docker tab in Unraid, these errors show.

Docker Containers
APPLICATION    VERSION    NETWORK    PORT MAPPINGS (APP TO HOST)    VOLUME MAPPINGS (APP TO HOST)    AUTOSTART    UPTIME    

Warning: stream_socket_client(): unable to connect to unix:///var/run/docker.sock (Connection refused) in /usr/local/emhttp/plugins/dynamix.docker.manager/include/DockerClient.php on line 712
Couldn't create socket: [111] Connection refused
Warning: Invalid argument supplied for foreach() in /usr/local/emhttp/plugins/dynamix.docker.manager/include/DockerClient.php on line 898

Warning: stream_socket_client(): unable to connect to unix:///var/run/docker.sock (Connection refused) in /usr/local/emhttp/plugins/dynamix.docker.manager/include/DockerClient.php on line 712
Couldn't create socket: [111] Connection refused
Warning: Invalid argument supplied for foreach() in /usr/local/emhttp/plugins/dynamix.docker.manager/include/DockerClient.php on line 967
No Docker containers installed
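
For anyone hitting the same wall: those PHP warnings just mean nothing is answering on /var/run/docker.sock, i.e. the Docker daemon itself never started, so the page has no container list to iterate. A quick way to confirm that from the Unraid terminal, a minimal sketch assuming a standard install:

ps -ef | grep -v grep | grep dockerd     # no output = the daemon is not running
docker info                              # in that state this also fails with "Cannot connect to the Docker daemon"
ls -l /var/run/docker.sock               # the socket file may not even exist yet

Once the underlying pool is healthy and Docker is re-enabled, both commands should respond normally.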

 

And Unraid Settings>Docker mirrors these errors.

Enable Docker: Yes (One or more paths do not exist)

Docker vDisk location: /mnt/user/system/docker/docker.img (Path does not exist)

Default appdata storage location: /mnt/user/appdata/ (Path does not exist)
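
For anyone following along: the "Path does not exist" messages line up with the cache pools not being mounted at all. A quick check from the terminal, a minimal sketch using the share and pool names from this box:

ls /mnt/                                  # cache_nvme and cache_ssd should be listed here when mounted
df -h /mnt/cache_nvme /mnt/cache_ssd      # shows whether they are mounted and how full they are
ls -la /mnt/user/system/docker/           # the vDisk path Docker is complaining about

If the pools are missing from /mnt, the Docker and appdata paths built on top of them will not exist either.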

4 minutes ago, Bait Fish said:

First boot into unRAID indicates the cache drive is missing. I'll check cables next.

Post new diagnostics after.

 

Disable Docker and VM Manager in Settings till things are working well again.

 

cache_nvme is showing (XFS) corruption, check filesystem

 

docker and libvirt img are both showing corruption. Since the system share is on cache_nvme, fix that filesystem; then you will probably have to recreate them.

 

Corruption on cache_ssd (btrfs) may be more complicated to fix.
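
For reference, the checks being suggested are run with the array started in Maintenance mode, either from the GUI (click the drive on the Main tab and use the filesystem check there) or from a terminal. A minimal sketch; the device paths below are assumptions, not confirmed against this system's diagnostics:

xfs_repair -nv /dev/nvme0n1p1       # read-only check of the XFS pool (cache_nvme); -n = no modify, -v = verbose
btrfs check --readonly /dev/sdX1    # read-only check of the btrfs pool (cache_ssd); replace sdX with the real device

Only drop the -n (xfs_repair -v) once you are happy to let it write fixes.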


I started it up a second time. The cache drive (sde) that showed missing this morning now showed present and ready. Nothing was done but restarting 9 hours later.

 

I shut it down and reseated the cables for cache sde.

 

Then I started back up, and cache sde remained available. I saved diagnostics from this session and have uploaded them to this post as suggested above.

 

I'll attempt repairs now.

 

Update:

Cache_nvme repair with the default -n option results in:

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
ignored because the -n option was used.  Expect spurious inconsistencies
which may be resolved by first mounting the filesystem to replay the log.
        - scan filesystem freespace and inode maps...
block (0,50297084-50297193) multiply claimed by cnt space tree, state - 2
block (0,48998093-48998203) multiply claimed by cnt space tree, state - 2
block (0,50379227-50379336) multiply claimed by cnt space tree, state - 2
block (0,49633120-49633230) multiply claimed by cnt space tree, state - 2
agf_freeblks 64128684, counted 64133581 in ag 0
agf_freeblks 97418327, counted 97438471 in ag 2
sb_icount 4368768, counted 4433280
sb_ifree 44699, counted 1263236
sb_fdblocks 340342643, counted 354388165
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
data fork in ino 58802146 claims free block 7583440
data fork in ino 58802146 claims free block 7583505
data fork in ino 59896646 claims free block 7574017
data fork in ino 59896646 claims free block 7574079
        - agno = 1
        - agno = 2
bad nblocks 10397115 for inode 2175513269, would reset to 10397118
bad nextents 207685 for inode 2175513269, would reset to 207683
        - agno = 3
bad CRC for inode 3227792998
bad CRC for inode 3227792998, would rewrite
would have cleared inode 3227792998
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
free space (0,48811889-48811997) only seen by one free space btree
free space (0,50494325-50494435) only seen by one free space btree
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
bad CRC for inode 3227792998, would rewrite
would have cleared inode 3227792998
bad nblocks 10397115 for inode 2175513269, would reset to 10397118
bad nextents 207685 for inode 2175513269, would reset to 207683
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
would rebuild directory inode 1239086149
Metadata corruption detected at 0x46e010, inode 0xc0643666 dinode
couldn't map inode 3227792998, err = 117
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected inode 3227798067, would move to lost+found
Phase 7 - verify link counts...
Metadata corruption detected at 0x46e010, inode 0xc0643666 dinode
couldn't map inode 3227792998, err = 117, can't compare link counts
No modify flag set, skipping filesystem flush and exiting.

 

homer-diagnostics-20230103-1458.zip


Having trouble on mobile modifying the code box above. I scanned with the -nv flag per the docs.


Phase 1 - find and verify superblock...
        - block cache size set to 3061336 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 417567 tail block 393702
ALERT: The filesystem has valuable metadata changes in a log which is being
ignored because the -n option was used.  Expect spurious inconsistencies
which may be resolved by first mounting the filesystem to replay the log.
        - scan filesystem freespace and inode maps...
block (0,50297084-50297193) multiply claimed by cnt space tree, state - 2
block (0,48998093-48998203) multiply claimed by cnt space tree, state - 2
block (0,50379227-50379336) multiply claimed by cnt space tree, state - 2
block (0,49633120-49633230) multiply claimed by cnt space tree, state - 2
agf_freeblks 64128684, counted 64133581 in ag 0
agf_freeblks 97418327, counted 97438471 in ag 2
sb_icount 4368768, counted 4433280
sb_ifree 44699, counted 1263236
sb_fdblocks 340342643, counted 354388165
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
data fork in ino 58802146 claims free block 7583440
data fork in ino 58802146 claims free block 7583505
data fork in ino 59896646 claims free block 7574017
data fork in ino 59896646 claims free block 7574079
        - agno = 1
        - agno = 2
bad nblocks 10397115 for inode 2175513269, would reset to 10397118
bad nextents 207685 for inode 2175513269, would reset to 207683
        - agno = 3
bad CRC for inode 3227792998
bad CRC for inode 3227792998, would rewrite
would have cleared inode 3227792998
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
free space (0,48811889-48811997) only seen by one free space btree
free space (0,50494325-50494435) only seen by one free space btree
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 3
        - agno = 1
bad CRC for inode 3227792998, would rewrite
would have cleared inode 3227792998
bad nblocks 10397115 for inode 2175513269, would reset to 10397118
bad nextents 207685 for inode 2175513269, would reset to 207683
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - agno = 0
        - agno = 1
would rebuild directory inode 1239086149
        - agno = 2
        - agno = 3
Metadata corruption detected at 0x46e010, inode 0xc0643666 dinode
couldn't map inode 3227792998, err = 117
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected inode 3227798067, would move to lost+found
Phase 7 - verify link counts...
Metadata corruption detected at 0x46e010, inode 0xc0643666 dinode
couldn't map inode 3227792998, err = 117, can't compare link counts
No modify flag set, skipping filesystem flush and exiting.

        XFS_REPAIR Summary    Tue Jan  3 15:25:14 2023

Phase		Start		End		Duration
Phase 1:	01/03 15:25:04	01/03 15:25:04
Phase 2:	01/03 15:25:04	01/03 15:25:04
Phase 3:	01/03 15:25:04	01/03 15:25:09	5 seconds
Phase 4:	01/03 15:25:09	01/03 15:25:10	1 second
Phase 5:	Skipped
Phase 6:	01/03 15:25:10	01/03 15:25:14	4 seconds
Phase 7:	01/03 15:25:14	01/03 15:25:14

Total run time: 10 seconds

 


I ran through a couple of repair sessions and tried to follow its directions. I pasted all the output below, including which flag I was using, typically keeping verbose on.

 

I did not catch it telling me to run the -L option, so I have not done that.

 

Phase 1 - find and verify superblock...
        - block cache size set to 3061336 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 417567 tail block 393702
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair.  If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.
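
Mounting the filesystem once is enough to replay that journal, so instead of reaching for -L I just let the array do it, as described next. The console equivalent would be roughly this (device and mountpoint are assumptions; -L stays a last resort):

mkdir -p /mnt/test
mount /dev/nvme0n1p1 /mnt/test      # mounting replays the XFS journal
umount /mnt/test
xfs_repair -nv /dev/nvme0n1p1       # re-check; the log alert should be gone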


Stopped the maintenance mode array. Started the array. Stopped the array. Started the array in maintenance mode again. Re-checked with xfs_repair -nv:

Phase 1 - find and verify superblock...
        - block cache size set to 3061320 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 417613 tail block 417613
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
bad CRC for inode 3227792998
bad CRC for inode 3227792998, would rewrite
would have cleared inode 3227792998
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
bad CRC for inode 3227792998, would rewrite
would have cleared inode 3227792998
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - agno = 0
        - agno = 1
would rebuild directory inode 1239086149
        - agno = 2
        - agno = 3
Metadata corruption detected at 0x46e010, inode 0xc0643666 dinode
couldn't map inode 3227792998, err = 117
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected inode 3227798067, would move to lost+found
Phase 7 - verify link counts...
Metadata corruption detected at 0x46e010, inode 0xc0643666 dinode
couldn't map inode 3227792998, err = 117, can't compare link counts
No modify flag set, skipping filesystem flush and exiting.

        XFS_REPAIR Summary    Tue Jan  3 16:05:11 2023

Phase		Start		End		Duration
Phase 1:	01/03 16:05:02	01/03 16:05:02
Phase 2:	01/03 16:05:02	01/03 16:05:02
Phase 3:	01/03 16:05:02	01/03 16:05:07	5 seconds
Phase 4:	01/03 16:05:07	01/03 16:05:08	1 second
Phase 5:	Skipped
Phase 6:	01/03 16:05:08	01/03 16:05:11	3 seconds
Phase 7:	01/03 16:05:11	01/03 16:05:11

Total run time: 9 seconds



Ran xfs_repair -v:

Phase 1 - find and verify superblock...
        - block cache size set to 3061320 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 417613 tail block 417613
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
bad CRC for inode 3227792998
bad CRC for inode 3227792998, will rewrite
cleared inode 3227792998
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 3
        - agno = 2
        - agno = 1
clearing reflink flag on inode 2149504425
clearing reflink flag on inode 1073935201
clearing reflink flag on inode 2149760794
clearing reflink flag on inode 3223369736
clearing reflink flag on inode 2193593298
clearing reflink flag on inode 3270882671
clearing reflink flag on inode 286726471
clearing reflink flag on inode 286743730
clearing reflink flag on inode 3307207359
clearing reflink flag on inode 1148980934
clearing reflink flag on inode 1148980936
clearing reflink flag on inode 1148980938
clearing reflink flag on inode 1148980939
clearing reflink flag on inode 1148980940
clearing reflink flag on inode 1148980941
clearing reflink flag on inode 1148980942
clearing reflink flag on inode 1151756820
clearing reflink flag on inode 1151756823
clearing reflink flag on inode 325929482
clearing reflink flag on inode 325929522
clearing reflink flag on inode 325929602
clearing reflink flag on inode 325929606
clearing reflink flag on inode 325929607
clearing reflink flag on inode 325929608
clearing reflink flag on inode 3376954786
clearing reflink flag on inode 1232940612
clearing reflink flag on inode 1232940615
clearing reflink flag on inode 1232940616
clearing reflink flag on inode 1232940617
clearing reflink flag on inode 379736619
clearing reflink flag on inode 380663459
clearing reflink flag on inode 380663497
clearing reflink flag on inode 380663502
clearing reflink flag on inode 380663503
clearing reflink flag on inode 380663510
clearing reflink flag on inode 380663511
clearing reflink flag on inode 380663520
clearing reflink flag on inode 380663521
clearing reflink flag on inode 380663522
clearing reflink flag on inode 380663523
clearing reflink flag on inode 380663524
clearing reflink flag on inode 380663525
clearing reflink flag on inode 380663540
clearing reflink flag on inode 380663541
clearing reflink flag on inode 3410317497
clearing reflink flag on inode 3410343811
clearing reflink flag on inode 3410343893
clearing reflink flag on inode 3410343927
clearing reflink flag on inode 3410649145
Phase 5 - rebuild AG headers and trees...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - agno = 0
        - agno = 1
rebuilding directory inode 1239086149
        - agno = 2
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...

        XFS_REPAIR Summary    Tue Jan  3 16:07:43 2023

Phase		Start		End		Duration
Phase 1:	01/03 16:07:33	01/03 16:07:33
Phase 2:	01/03 16:07:33	01/03 16:07:33
Phase 3:	01/03 16:07:33	01/03 16:07:39	6 seconds
Phase 4:	01/03 16:07:39	01/03 16:07:40	1 second
Phase 5:	01/03 16:07:40	01/03 16:07:40
Phase 6:	01/03 16:07:40	01/03 16:07:43	3 seconds
Phase 7:	01/03 16:07:43	01/03 16:07:43

Total run time: 10 seconds
done


xfs_repair -nv:

Phase 1 - find and verify superblock...
        - block cache size set to 3061320 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 417613 tail block 417613
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 3
        - agno = 2
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

        XFS_REPAIR Summary    Tue Jan  3 16:08:50 2023

Phase		Start		End		Duration
Phase 1:	01/03 16:08:41	01/03 16:08:41
Phase 2:	01/03 16:08:41	01/03 16:08:41
Phase 3:	01/03 16:08:41	01/03 16:08:46	5 seconds
Phase 4:	01/03 16:08:46	01/03 16:08:47	1 second
Phase 5:	Skipped
Phase 6:	01/03 16:08:47	01/03 16:08:50	3 seconds
Phase 7:	01/03 16:08:50	01/03 16:08:50

Total run time: 9 seconds


xfs_repair -v:

Phase 1 - find and verify superblock...
        - block cache size set to 3061320 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 417613 tail block 417613
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 3
        - agno = 1
        - agno = 2
Phase 5 - rebuild AG headers and trees...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...

        XFS_REPAIR Summary    Tue Jan  3 16:10:45 2023

Phase		Start		End		Duration
Phase 1:	01/03 16:10:35	01/03 16:10:35
Phase 2:	01/03 16:10:35	01/03 16:10:35
Phase 3:	01/03 16:10:35	01/03 16:10:40	5 seconds
Phase 4:	01/03 16:10:40	01/03 16:10:41	1 second
Phase 5:	01/03 16:10:41	01/03 16:10:42	1 second
Phase 6:	01/03 16:10:42	01/03 16:10:45	3 seconds
Phase 7:	01/03 16:10:45	01/03 16:10:45

Total run time: 10 seconds
done

 

 

root@Homer:~# ls -lah /mnt/cache_nvme/system
total 0
drwxrwxrwx 4 nobody users 35 Nov 16  2021 ./
drwxrwxrwx 5 nobody users 50 Jan  3 16:48 ../
drwxrwxrwx 2 nobody users 24 Oct 11 16:32 docker/
drwxrwxrwx 2 nobody users 25 Nov 16  2021 libvirt/
root@Homer:~# 

 

Contents appear OK looking through the various directories.
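
One more thing worth a glance, since the earlier -n passes mentioned a disconnected inode that would be moved to lost+found: anything xfs_repair could not reattach ends up at the root of the pool, named by inode number. A quick check, just a sketch:

ls -la /mnt/cache_nvme/lost+found/ 2>/dev/null || echo "no lost+found (nothing was orphaned)"

If it exists and has entries, the files can be identified by content and moved back by hand.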

root@Homer:/mnt/cache_nvme/system# ls -lah /mnt/cache_nvme/system/docker
total 40G
drwxrwxrwx 2 nobody users  24 Oct 11 16:32 ./
drwxrwxrwx 4 nobody users  35 Nov 16  2021 ../
-rw-rw-rw- 1 nobody users 50G Jan  1 01:39 docker.img
root@Homer:/mnt/cache_nvme/system# ls -lah /mnt/cache_nvme/system/libvirt
total 104M
drwxrwxrwx 2 nobody users   25 Nov 16  2021 ./
drwxrwxrwx 4 nobody users   35 Nov 16  2021 ../
-rw-rw-rw- 1 nobody users 1.0G Dec 31 17:00 libvirt.img
root@Homer:/mnt/cache_nvme/system# 

 


Uh oh. I have the VM auto-backup itself on a schedule, BUT I recall that the morning the system went bad, the VM backup executed.

 

I may not.

 

Edit: libvirt.img is not in the backup location... I do not recall making any backup of it elsewhere manually.

 

Edit 2: and I did not have a location set in Appdata Backup/Restore for backing up libvirt.img.

