[6.12.2] Array stop stuck on "Retry unmounting disk share(s)"


Solved by ljm42

I just upgraded to 6.12.2 as part of troubleshooting for another issue, and it seems to have caused an issue of its own. I can no longer fully stop the array once it is started. The status at the bottom left corner just repeatedly says "Retry unmounting disk share(s)" and never completes. In my logs I see this repeatedly:

Quote

Jul  3 12:05:39 Tower root: umount: /mnt/disk3: target is busy.
Jul  3 12:05:39 Tower emhttpd: shcmd (347): exit status: 32
Jul  3 12:05:39 Tower emhttpd: Retry unmounting disk share(s)...
Jul  3 12:05:44 Tower emhttpd: Unmounting disks...
Jul  3 12:05:44 Tower emhttpd: shcmd (348): umount /mnt/disk3
Jul  3 12:05:44 Tower root: umount: /mnt/disk3: target is busy.
Jul  3 12:05:44 Tower emhttpd: shcmd (348): exit status: 32
Jul  3 12:05:44 Tower emhttpd: Retry unmounting disk share(s)...
Jul  3 12:05:49 Tower emhttpd: Unmounting disks...
Jul  3 12:05:49 Tower emhttpd: shcmd (349): umount /mnt/disk3
Jul  3 12:05:49 Tower root: umount: /mnt/disk3: target is busy.
Jul  3 12:05:49 Tower emhttpd: shcmd (349): exit status: 32
Jul  3 12:05:49 Tower emhttpd: Retry unmounting disk share(s)...
Jul  3 12:05:54 Tower emhttpd: Unmounting disks...
Jul  3 12:05:54 Tower emhttpd: shcmd (351): umount /mnt/disk3
Jul  3 12:05:54 Tower root: umount: /mnt/disk3: target is busy.
Jul  3 12:05:54 Tower emhttpd: shcmd (351): exit status: 32
Jul  3 12:05:54 Tower emhttpd: Retry unmounting disk share(s)...
Jul  3 12:05:59 Tower emhttpd: Unmounting disks...
Jul  3 12:05:59 Tower emhttpd: shcmd (352): umount /mnt/disk3
Jul  3 12:05:59 Tower root: umount: /mnt/disk3: target is busy.
Jul  3 12:05:59 Tower emhttpd: shcmd (352): exit status: 32
Jul  3 12:05:59 Tower emhttpd: Retry unmounting disk share(s)...
Jul  3 12:06:04 Tower emhttpd: Unmounting disks...
Jul  3 12:06:04 Tower emhttpd: shcmd (353): umount /mnt/disk3
Jul  3 12:06:04 Tower root: umount: /mnt/disk3: target is busy.
Jul  3 12:06:04 Tower emhttpd: shcmd (353): exit status: 32
Jul  3 12:06:04 Tower emhttpd: Retry unmounting disk share(s)...
Jul  3 12:06:09 Tower emhttpd: Unmounting disks...
Jul  3 12:06:09 Tower emhttpd: shcmd (354): umount /mnt/disk3

 

My diagnostics are attached. How can I fix this?
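For reference while debugging this: a generic way to check whether a disk is actually still mounted during these retries is to read /proc/mounts (the `is_mounted` helper below is illustrative, not an Unraid command):

```shell
# Succeed if the given path appears as a mountpoint in /proc/mounts.
# (is_mounted is an illustrative helper name, not an Unraid command)
is_mounted() {
  awk -v m="$1" '$2 == m {found=1} END {exit !found}' /proc/mounts
}

is_mounted /mnt/disk3 && echo "still mounted" || echo "unmounted"
```

If the path is still mounted, `fuser -vm /mnt/disk3` (or `lsof /mnt/disk3`) should list the processes keeping the target busy.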

 

tower-diagnostics-20230703-1207 2.zip

Edited by sazrocks

I have seen this behavior too since upgrading to 6.12.0; it was never an issue on 6.11.5.

 

I haven't found the reason and am sorely tempted to go back to 6.11.5.

 

Because of this issue I am rebooting the server instead of stopping it, but that causes problems with unclean shutdowns. Again, this was never a problem before 6.12.0.


Just encountered the same issue. There is nothing running (no VMs, no Docker; the Open Files plugin shows nothing out of the ordinary) that should be preventing the array from stopping. I did just complete a scheduled parity check, but otherwise everything is normal. After about 20-30 minutes I'm also intermittently unable to reach the web GUI.

Edited by bar_foo
corrected detail about UI access

I have also noticed this behavior after upgrading to 6.12; I am currently on 6.12.2. I shut down my server to remove an unused cleared drive, and it came back up as unclean. While I wait for the parity check to finish, I am going to extend my shutdown timeouts for disks (90 to 120 seconds), VMs (60 to 90 seconds), and Docker (60 to 90 seconds) to see if that fixes it. However, according to the syslog.txt snippet below, it is my /mnt/cache_nvme device that keeps being busy and failing to unmount. Docker and libvirt appear to shut down gracefully, and the array disks unmount properly, as does the ZFS pool I created to test 6.12's new ZFS functionality. It retries unmounting cache_nvme like this repeatedly after everything else has stopped, until the log ends just before power-off. (Kveer is my server name.)

 

Jul  5 22:32:59 Kveer emhttpd: shcmd (1269849): umount /mnt/cache_nvme
Jul  5 22:32:59 Kveer root: umount: /mnt/cache_nvme: target is busy.
Jul  5 22:32:59 Kveer emhttpd: shcmd (1269849): exit status: 32
Jul  5 22:32:59 Kveer emhttpd: Retry unmounting disk share(s)...
Jul  5 22:33:04 Kveer emhttpd: Unmounting disks...
Jul  5 22:33:04 Kveer emhttpd: shcmd (1269850): umount /mnt/cache_nvme
Jul  5 22:33:04 Kveer root: umount: /mnt/cache_nvme: target is busy.
Jul  5 22:33:04 Kveer emhttpd: shcmd (1269850): exit status: 32
Jul  5 22:33:04 Kveer emhttpd: Retry unmounting disk share(s)...
Jul  5 22:33:09 Kveer emhttpd: Unmounting disks...
Jul  5 22:33:09 Kveer emhttpd: shcmd (1269851): umount /mnt/cache_nvme
Jul  5 22:33:09 Kveer root: umount: /mnt/cache_nvme: target is busy.
Jul  5 22:33:09 Kveer emhttpd: shcmd (1269851): exit status: 32
Jul  5 22:33:09 Kveer emhttpd: Retry unmounting disk share(s)...

 

In my case, the cache_nvme device is an M.2 NVMe drive in my server that is btrfs-mirrored to another one, and those two drives are set up as a cache pool. I have my system share there, as well as my syslog folder, domains, appdata, and Nextcloud docker storage share: basically, anything that I want to be fast and accessible and that doesn't need to be on the array.

 

Interestingly, I purposely don't have disk shares enabled or used at all. In the Global Share Settings, I have Enable Disk Shares set to "No". Maybe that's just an overlap in terminology?

 

The parity check should finish in the morning and after I get home from work tomorrow night I'll try to investigate further.

Edited by eggman9713
Added detail about disk share settings.

Hi. I would like to add to this thread. This issue seems very random indeed. On the 4th of July (two days before posting this) I disabled Docker in the web UI, since according to my logs from other days (with the same issue) it was Docker that kept the disk from unmounting. So I disabled Docker before trying to shut down the server, and to my surprise that worked, or so I thought. Then yesterday (the 5th) I tried the same thing again, but with no luck this time. The issue seems to be somewhat random. I will add two logs: one from yesterday and one from the 4th.

 

EDIT: For the time being I think I will go back to 6.11.5, since I don't even use any of the newer features like ZFS and haven't rearranged much of the UI. I hope this issue will be addressed and fixed fairly quickly.

tower-diagnostics-20230706-0047.zip tower-diagnostics-20230704-0055.zip

Edited by Brydezen
Added text about downgrading

Out of 8 shutdown/reboot sequences, only 1 was successful. I have dozens of php and other processes that refuse to end, and there is no killing them. Twice I let the thing sit for hours with nothing happening. I'm seriously getting tired of non-stop parity checks due to unclean shutdowns. Guess I'll manually install 6.11.5, since I came from 6.9.2 only 14 days ago.


UPDATE! please see my comment further down in this thread https://forums.unraid.net/topic/141479-6122-array-stop-stuck-on-retry-unmounting-disk-shares/#comment-1283203

 

 

-------------------------------

Original message:

-------------------------------

 

I hit this today when stopping my array.

 

Here is what worked for me, would appreciate if someone hitting this would confirm it works for them too.

 

To get into this state, stop the array. If you are having this issue you will see "retry unmounting shares" in the lower left corner.  Note: There are other reasons this message could happen (like if you left an SSH terminal open while cd'd into the array). This discussion assumes none of the usual suspects apply.

 

In a web terminal or SSH type 'losetup'. In my case it showed:

root@Tower:/etc/rc.d# losetup
NAME       SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE                           DIO LOG-SEC
/dev/loop1         0      0         1  1 /boot/bzfirmware                      0     512
/dev/loop2         0      0         1  0 /mnt/cache/system/docker/docker.img   0     512
/dev/loop0         0      0         1  1 /boot/bzmodules                       0     512

 

The problem is that docker.img is still mounted. Note that in my case it is on /dev/loop2.

 

Then run `/etc/rc.d/rc.docker status` to confirm that docker has stopped:

# /etc/rc.d/rc.docker status
status of dockerd: stopped

(It should be stopped, since you were in the process of stopping the array. But if Docker is still running, you can type `/etc/rc.d/rc.docker stop` and wait a bit, then run status again until it has stopped.)

 

Then to fix the problem, type:

umount /dev/loop2

(use whatever /dev/loopX docker.img is on, as noted above) 

 

Once that is unmounted, the array will automatically finish stopping.

 

We are looking into a fix for this, but it would help if we could reliably reproduce the problem (it has only happened to me once). If anyone is able to identify what it takes to make this happen I'd appreciate it.
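The steps above can be scripted. A rough sketch (the `loop_for` helper is made up for illustration; it just filters the `losetup` table read from stdin):

```shell
# Print the loop device whose backing file matches a pattern,
# reading the output of `losetup` on stdin.
# (loop_for is an illustrative helper, not an Unraid command)
loop_for() {
  awk -v pat="$1" '$0 ~ pat {print $1}'
}

# Intended use on the server, only after `/etc/rc.d/rc.docker status`
# reports that dockerd is stopped:
#   LOOP=$(losetup | loop_for 'docker.img')   # e.g. /dev/loop2
#   [ -n "$LOOP" ] && umount "$LOOP"
```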


The parity check after my last unclean shutdown finished with zero errors, as it always seems to whenever I have an unclean shutdown. That lends some credence to the idea that everything is actually stopping, and something is just holding up the disk unmounts. I did some testing this evening with the following results.

 

1. Stop array while watching the log. Everything shuts down as normal and the array stops quickly.

2. Reboot the server while the array is stopped. System comes back up, autostarts the array (I forgot to turn that off for testing), and it came up clean.

3. Stop array again while watching the log. Everything stops normally.

4. Start the array. Normal.

5. Reboot the server while the array is running, watching the log. Everything shuts down properly, and the system reboots normally and cleanly.

6. Now I shut down the server while the array is running, rather than rebooting it, because that is when it last happened (last night). The system shuts down normally, and when I power it back on, it comes up clean.

 

So now it seems to be behaving itself on my server. But I do have the logs from the last time it happened showing the behavior. I'll try to do some more testing in the next couple of days.

16 hours ago, ljm42 said:

I hit this today when stopping my array. [...] In a web terminal or SSH type 'losetup'. [...] The problem is that docker.img is still mounted. [...] Then to fix the problem, type `umount /dev/loop2` (use whatever /dev/loopX docker.img is on). Once that is unmounted, the array will automatically finish stopping.

I've been struggling with this issue, and this solved my problem. Thank you!

I noticed this issue kept happening since 6.12.0 (I'm currently running the latest version but still face this issue periodically).

 

This issue occurs for me when I attempt to stop the array. This time my Unraid server lost its internet connection (it failed to check for Docker updates and had no connection to Unraid Connect or the app store, although I'm not sure if that's related to this issue). I stopped the Docker and VM services to adjust the DNS settings. Changing DNS got the internet working on the server again, but when I tried to turn Docker and VMs back on, it kept telling me Docker failed to start, so I attempted a restart. This is when the issue occurred. I'm happy to provide more info if needed.

Edited by wicked_qa
10 hours ago, eggman9713 said:

So now it seems to be behaving itself on my server.

That is what makes it difficult to track this down : ) but we're working on a solution.

 

15 minutes ago, wicked_qa said:

I've been struggling with this issue, and this solved my problem. Thank you!

Thanks for confirming this resolved the issue with stopping the array; we're working on a solution.

 


I just had my array get stuck while stopping today. The Docker image remained mounted, but Docker had already stopped. Unmounting the /dev/loop2 device from the command line per ljm42's instructions allowed the array to finish stopping as normal, and it started back up clean (I didn't shut down or reboot the server). 


@ljm42 I have five unclean shutdown logs all referring to the same issue (the latest two already posted in this thread): it is unable to unmount the cache drive where the docker.img file lives. I have since downgraded back to 6.11.5 and everything works just as expected again. All of them seem to have stopped the docker service, but it is still unable to unmount for whatever reason.


I am also getting this issue. I did have to set my shutdown timeout to 300 when I was running 6.11, as I would get a parity check on occasion, but I never had a parity check all the time like this on the latest version. Should we be worried about our data? I have also attached diagnostics in case anything useful can be found.

 

I thought this was a stable update?

hammerthread-diagnostics-20230711-1210.zip

  • Solution

Update - if the array can't stop due to "Retry unmounting shares" in 6.12.0 - 6.12.2, the quick fix is to open a web terminal and type:

umount /var/lib/docker

The array should then stop, preventing an unclean shutdown.

 

(It is possible the array won't stop for other reasons, such as having a web terminal open to a folder on the array. Make sure to exit any web terminals or SSH sessions in this case)

 

We had a fix in the 6.12.3-rc3 prerelease; it is now in the 6.12.3 stable release, available here:

Update: no further confirmation testing is needed, this fix is confirmed. Thanks to everyone who helped track this down!
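For the cautious, the quick fix can be wrapped in a check so it only forces the unmount when the path really is still mounted (a generic sketch; `safe_umount` is an illustrative name, not an Unraid script):

```shell
# Unmount the given path only if it is still listed in /proc/mounts.
# (safe_umount is an illustrative helper, not part of Unraid)
safe_umount() {
  if grep -q " $1 " /proc/mounts; then
    umount "$1"
  else
    echo "$1 is not mounted"
  fi
}

# On the server: safe_umount /var/lib/docker
```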

7 hours ago, ljm42 said:

Update - if the array can't stop due to "Retry unmounting shares" in 6.12.0 - 6.12.2, the quick fix is to open a web terminal and type `umount /var/lib/docker`. [...] It would be helpful if some of the folks who have been having an issue stopping the array could upgrade and confirm the issue is resolved in this version.

I can try doing this now. I had this issue on 6.11.5 and tried all suggestions, but nothing seemed to work. I think the issue might be related to libvirt: I saw a log in that section of the settings, I think, where it kept throwing libvirt errors. All I was trying to do was get SMB working properly for a Synology after noticing files missing certain permissions. I'll reply back if this did indeed fix my issue.

22 minutes ago, jflad17 said:

I can try doing this now. I had this issue on 6.11.5 and tried all suggestions; nothing seemed to work. I think the issue might be related to libvirt. [...]

The update from 6.11.5 to 6.12.3-rc3 worked successfully, and it feels a lot faster too. Tomorrow I will try to stop the array and hopefully it works.

Edited by jflad17
15 hours ago, jflad17 said:

I can try doing this now. I had this issue on 6.11.5 and tried all suggestions; nothing seemed to work. I think the issue might be related to libvirt. [...]

 

The fix in 6.12.3-rc3 is specifically for the docker.img

 

If you think libvirt.img is preventing a stop, please see this comment: https://forums.unraid.net/topic/141479-6122-array-stop-stuck-on-retry-unmounting-disk-shares/#comment-1281063

Except you'll want to manually umount the libvirt.img rather than the docker.img

 

Then please take diagnostics before rebooting so I can see the logs showing that libvirt.img was preventing the array from stopping. Thanks!

On 7/12/2023 at 2:08 PM, ljm42 said:

It would be helpful if some of the folks who have been having an issue stopping the array could upgrade and confirm the issue is resolved in this version.

I just installed 6.12.3-rc3; I forgot to stop the array before the reboot, and sure enough it got stuck. But I was able to open a web terminal and use

umount /var/lib/docker

and it stopped the array and rebooted normally. After reboot I also stopped the array, and it didn't get stuck. Rebooted, and that was normal. Stopped the array again, and it stopped and started again normally. So far, the issue seems to be fixed. I'll try stopping and rebooting a couple of times in the next few days. I normally don't reboot my server for weeks at a time, nor stop the array that often. If 6.12.3 still isn't a stable release in the next week I'll provide an update.

On 7/12/2023 at 11:08 PM, ljm42 said:

Update - if the array can't stop due to "Retry unmounting shares" in 6.12.0 - 6.12.2, the quick fix is to open a web terminal and type `umount /var/lib/docker`. [...]

 

 

I can confirm this ❤️ - I updated a few days ago to 6.12.2 and had that exact issue since.

Pasting the command into the terminal allows the unmount to go through.

 

Will still wait for the fix in the stable branch.

  • 2 weeks later...

I ran into the problem with v6.12.3.

 

Ran into the problem yesterday and forced the shutdown, which I believe caused a parity sync error (which I have not fixed yet).

Problem has returned today.

 

The symptoms first started with not being able to check for Docker or plugin updates.

This pointed me to a separate DNS problem (e.g. I couldn't ping google.com from the terminal).

 

I also run Docker containers pretty heavily and wonder if that contributed to some of the problem.

 

The instructions above resolved the array shutdown issue, but the root cause of my needing to restart Docker remains a mystery:

root@Tower:~# losetup
NAME       SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE                           DIO LOG-SEC
/dev/loop1         0      0         1  1 /boot/bzfirmware                      0     512
/dev/loop2         0      0         1  0 /mnt/cache/system/docker/docker.img   0     512
/dev/loop0         0      0         1  1 /boot/bzmodules                       0     512
root@Tower:~# /etc/rc.d/rc.docker status
status of dockerd: stopped
root@Tower:~# umount /dev/loop2

 

tower-diagnostics-20230727-1517.zip

27 minutes ago, Jaybau said:

I ran into the problem with v6.12.3.

 

Something is holding your zpool open, but it doesn't appear to be the docker.img:

Jul 27 15:15:42 Tower emhttpd: Unmounting disks...
Jul 27 15:15:42 Tower emhttpd: shcmd (154432): /usr/sbin/zpool export cache
Jul 27 15:15:42 Tower root: cannot unmount '/mnt/cache/system': pool or dataset is busy

 

I'd suggest starting your own thread; I don't think you are hitting the issue this thread is about, and there are other things that are more concerning:

Jul 26 20:06:09 Tower kernel: critical medium error, dev sdh, sector 3317704632 op 0x0:(READ) flags 0x0 phys_seg 72 prio class 2
Jul 26 20:06:09 Tower kernel: md: disk0 read error, sector=3317704568

I'm not an expert on that but hopefully someone else can lend a hand.
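When an export fails with "pool or dataset is busy" like this, it can help to first see which mountpoints under the pool are still present, then point `fuser` at the busy one. A generic sketch (the `mounts_under` helper is illustrative, not a ZFS or Unraid command):

```shell
# Print all mountpoints under a given path prefix, from /proc/mounts.
# (mounts_under is an illustrative helper, not a ZFS or Unraid command)
mounts_under() {
  awk -v p="$1" 'index($2, p) == 1 {print $2}' /proc/mounts
}

# e.g. on the server:
#   mounts_under /mnt/cache        # which datasets are still mounted?
#   fuser -vm /mnt/cache/system    # who is holding the busy one?
```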

 

On 7/6/2023 at 8:01 PM, ljm42 said:

We are looking into a fix for this, but it would help if we could reliably reproduce the problem (it has only happened to me once). If anyone is able to identify what it takes to make this happen I'd appreciate it.

So I don't have the Docker hang issue (although it has randomly come up in the past, before I upgraded to 6.12.2); in my case it's "bzfirmware" and "bzmodules" that are hanging! The unmount workaround listed above (and my old one) doesn't work for these loops... I could do an unclean shutdown, but that's dumb too!

Ah shucks, I was hoping to update my production server to 6.12.x to solve a weird SMB/NGINX bug that randomly chokes transfer speeds...

