
[SOLVED] stuck unmounting while trying to stop array



Posted

I'm trying to stop my array. One disk has a problem (a lot of reallocated sectors), the parity check is done, and I have a new disk ready to replace it.

As usual, I'm in a hurry, but so far I don't think I've made any errors.

 

I pressed Stop on the main page of unRAID, and it is stuck unmounting.

I tried a powerdown over telnet: the prompt returned with no output. I tried again, with the same result.

 

Looking at the syslog:

Dec 18 12:19:30 Tower emhttp: _shcmd: shcmd (1135): exit status: 1 (Other emhttp)

Dec 18 12:19:30 Tower emhttp: Retry unmounting user share(s)... (Other emhttp)

Dec 18 12:19:34 Tower emhttp: shcmd (1136): umount /mnt/user >/dev/null 2>&1 (Other emhttp)

Dec 18 12:19:34 Tower emhttp: _shcmd: shcmd (1136): exit status: 1 (Other emhttp)

Dec 18 12:19:34 Tower emhttp: shcmd (1137): rmdir /mnt/user >/dev/null 2>&1 (Other emhttp)

Dec 18 12:19:34 Tower emhttp: _shcmd: shcmd (1137): exit status: 1 (Other emhttp)

Dec 18 12:19:34 Tower emhttp: Retry unmounting user share(s)... (Other emhttp)

Dec 18 12:19:35 Tower emhttp: shcmd (1138): umount /mnt/user >/dev/null 2>&1 (Other emhttp)

Dec 18 12:19:35 Tower emhttp: _shcmd: shcmd (1138): exit status: 1 (Other emhttp)

Dec 18 12:19:35 Tower emhttp: shcmd (1139): rmdir /mnt/user >/dev/null 2>&1 (Other emhttp)

Dec 18 12:19:35 Tower emhttp: _shcmd: shcmd (1139): exit status: 1 (Other emhttp)

Dec 18 12:19:35 Tower emhttp: Retry unmounting user share(s)... (Other emhttp)

Dec 18 12:19:39 Tower emhttp: shcmd (1140): umount /mnt/user >/dev/null 2>&1 (Other emhttp)

Dec 18 12:19:39 Tower emhttp: _shcmd: shcmd (1140): exit status: 1 (Other emhttp)

Dec 18 12:19:39 Tower emhttp: shcmd (1141): rmdir /mnt/user >/dev/null 2>&1 (Other emhttp)

Dec 18 12:19:39 Tower emhttp: _shcmd: shcmd (1141): exit status: 1 (Other emhttp)

Dec 18 12:19:39 Tower emhttp: Retry unmounting user share(s)... (Other emhttp)

Dec 18 12:19:40 Tower emhttp: shcmd (1142): umount /mnt/user >/dev/null 2>&1 (Other emhttp)

Dec 18 12:19:40 Tower emhttp: _shcmd: shcmd (1142): exit status: 1 (Other emhttp)

Dec 18 12:19:40 Tower emhttp: shcmd (1143): rmdir /mnt/user >/dev/null 2>&1 (Other emhttp)

Dec 18 12:19:40 Tower emhttp: _shcmd: shcmd (1143): exit status: 1 (Other emhttp)

Dec 18 12:19:40 Tower emhttp: Retry unmounting user share(s)... (Other emhttp)

Dec 18 12:19:44 Tower emhttp: shcmd (1144): umount /mnt/user >/dev/null 2>&1 (Other emhttp)

Dec 18 12:19:44 Tower emhttp: _shcmd: shcmd (1144): exit status: 1 (Other emhttp)

Dec 18 12:19:44 Tower emhttp: shcmd (1145): rmdir /mnt/user >/dev/null 2>&1 (Other emhttp)

Dec 18 12:19:44 Tower emhttp: _shcmd: shcmd (1145): exit status: 1 (Other emhttp)

Dec 18 12:19:44 Tower emhttp: Retry unmounting user share(s)... (Other emhttp)

Dec 18 12:19:45 Tower emhttp: shcmd (1146): umount /mnt/user >/dev/null 2>&1 (Other emhttp)

Dec 18 12:19:45 Tower emhttp: _shcmd: shcmd (1146): exit status: 1 (Other emhttp)

 

The log is still growing...

 

Over telnet, I ran:

lsof | grep /mnt

 

The output is:

root@Tower:~# lsof | grep /mnt

cache_dir  2206   root  cwd       DIR        9,1     104          2 /mnt/disk1

cache_dir  2207   root  cwd       DIR       9,10     128          2 /mnt/disk10

cache_dir  2209   root  cwd       DIR       9,11     104          2 /mnt/disk11

cache_dir  2211   root  cwd       DIR       9,12     104          2 /mnt/disk12

cache_dir  2214   root  cwd       DIR       9,14     104          2 /mnt/disk14

cache_dir  2215   root  cwd       DIR       9,15     104          2 /mnt/disk15

cache_dir  2217   root  cwd       DIR       9,16     104          2 /mnt/disk16

cache_dir  2218   root  cwd       DIR       9,17     104          2 /mnt/disk17

cache_dir  2223   root  cwd       DIR        9,3     104          2 /mnt/disk3

cache_dir  2226   root  cwd       DIR        9,4     128          2 /mnt/disk4

cache_dir  2228   root  cwd       DIR        9,5     104          2 /mnt/disk5

cache_dir  2230   root  cwd       DIR        9,6     104          2 /mnt/disk6

cache_dir  2232   root  cwd       DIR        9,8     104          2 /mnt/disk8

cache_dir  2233   root  cwd       DIR        9,9     104          2 /mnt/disk9

smbd      13486   root  cwd       DIR       0,14     216       1540 /mnt/user/Video

sleep     20331   root  cwd       DIR       9,16     104          2 /mnt/disk16

sleep     20333   root  cwd       DIR        9,3     104          2 /mnt/disk3

sleep     20335   root  cwd       DIR       9,17     104          2 /mnt/disk17

sleep     20337   root  cwd       DIR       9,12     104          2 /mnt/disk12

sleep     20339   root  cwd       DIR        9,5     104          2 /mnt/disk5

sleep     20342   root  cwd       DIR        9,8     104          2 /mnt/disk8

sleep     20344   root  cwd       DIR        9,4     128          2 /mnt/disk4

sleep     20345   root  cwd       DIR        9,1     104          2 /mnt/disk1

sleep     20348   root  cwd       DIR        9,9     104          2 /mnt/disk9

sleep     20349   root  cwd       DIR        9,6     104          2 /mnt/disk6

sleep     20359   root  cwd       DIR       9,11     104          2 /mnt/disk11

sleep     20361   root  cwd       DIR       9,15     104          2 /mnt/disk15

sleep     20366   root  cwd       DIR       9,14     104          2 /mnt/disk14

 

Now I have to kill something, but what, and HOW?

kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]

What are "-s sigspec", "-n signum", pid and jobspec?
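
For anyone with the same question: pid is the process ID (the second column of the lsof output above), jobspec is a shell job number such as %1, and the three signal options are just different ways of naming the signal to send. A minimal sketch, assuming bash and using a throwaway sleep process as a stand-in target (not one of the processes from this thread):

```shell
#!/bin/bash
# Start a harmless background process to act as the target.
sleep 300 &
pid=$!

# These three forms all send SIGTERM (signal 15), which is also the
# signal sent when no option is given:
#   kill -s TERM "$pid"    # -s sigspec: signal given by name
#   kill -n 15 "$pid"      # -n signum:  signal given by number
#   kill -TERM "$pid"      # -sigspec:   short form
kill -s TERM "$pid"

# For a process that died from a signal, wait returns 128 + signal number.
wait "$pid"
status=$?
echo "exit status: $status"

# kill -l translates a signal number (or such an exit status) into a name.
signame=$(kill -l "$status")
echo "signal: $signame"
```

"kill -9 pid" is the same mechanism with signal 9 (SIGKILL), which the target process cannot catch or ignore.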

 

Thanks

 

 

Posted

And now, even after a "cache_dirs -q" (I would rather use kill, but...)

 

root@Tower:/boot/packages# lsof | grep /mnt

cache_dir  2207   root  cwd       DIR       9,10     128          2 /mnt/disk10

smbd      13486   root  cwd       DIR       0,14     216       1540 /mnt/user/Video

 

Even though cache_dirs should be dead, it's still there.

 

And how do I kill smbd?

 

More info:

 

root@Tower:~# lsof /dev/md*

COMMAND    PID USER   FD   TYPE DEVICE SIZE NODE NAME

cache_dir 2207 root  cwd    DIR   9,10  128    2 /mnt/disk10

 

 

root@Tower:~# /usr/bin/fuser -mv /mnt/disk* /mnt/user/*

 

                    USER        PID ACCESS COMMAND

/mnt/disk10:         root       2207 ..c.. cache_dirs

 

/mnt/user/Files:     root      13486 ..c.. smbd

 

/mnt/user/User:      root      13486 ..c.. smbd

 

/mnt/user/Video:     root      13486 ..c.. smbd

 

/mnt/user/Work:      root      13486 ..c.. smbd

 

/mnt/user/Zardoz:    root      13486 ..c.. smbd

 

/mnt/user/vadeo:     root      13486 ..c.. smbd

 

root@Tower:~#

 

Please help....
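
With longer lsof listings it can help to boil the output down to one row per process. A minimal sketch, assuming the column layout shown above (COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME) and paths without spaces; holders is a hypothetical helper name, not an unRAID or lsof command:

```shell
#!/bin/bash
# Summarise lsof-style lines read from stdin: one "PID COMMAND COUNT"
# row for each process that has something open under /mnt.
holders() {
    awk '$NF ~ "^/mnt/" { count[$2 " " $1]++ }
         END { for (k in count) print k, count[k] }' | sort -n
}

# Sample input copied from the output above. On a live system one
# would pipe in real output instead:  lsof | holders
sample='cache_dir  2207   root  cwd       DIR       9,10     128          2 /mnt/disk10
smbd      13486   root  cwd       DIR       0,14     216       1540 /mnt/user/Video'

summary=$(printf '%s\n' "$sample" | holders)
echo "$summary"
```

The first column of each row is the PID to hand to kill.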

Posted

...

I have found the command "ps ux":

root@Tower:~# ps ux

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND

root         1  0.0  0.0    704   308 ?        Ss   Dec16   0:02 init

root         2  0.0  0.0      0     0 ?        S    Dec16   0:00 [kthreadd]

root         3  0.0  0.0      0     0 ?        S    Dec16   0:00 [migration/0]

root         4  0.0  0.0      0     0 ?        S    Dec16   0:00 [ksoftirqd/0]

root         5  0.0  0.0      0     0 ?        S    Dec16   0:00 [migration/1]

.../...

root     842  0.0  0.0      0     0 ?        S    Dec16   0:00 [scsi_eh_8]

root      1636  0.0  0.0   1688   592 ?        Ss   Dec16   0:00 /usr/sbin/syslogd -m0

root      1640  0.0  0.0   1632   420 ?        Ss   Dec16   0:00 /usr/sbin/klogd -c 3 -x

root      1679  0.0  0.0   1656   464 ?        Ss   Dec16   0:00 /usr/sbin/ifplugd -i eth0 -fwI -u0 -d10

root      1709  0.1  0.0   1656   232 ?        Ss   Dec16   4:53 /sbin/dhcpcd -d -t 30 -h Tower eth0

root      1741  0.0  0.0   1672   536 ?        Ss   Dec16   0:00 /usr/sbin/inetd

root      1751  0.0  0.0   1632   536 ?        Ss   Dec16   0:00 /usr/sbin/acpid

root      1758  0.0  0.0   1812   656 ?        S    Dec16   0:00 /usr/sbin/crond -l10

root      1772  0.0  0.0  72216  1648 ?        Sl   Dec16   0:31 /usr/local/sbin/emhttp

root      1803  0.0  0.0   2288   480 ?        S    Dec16   0:00 /bin/bash /boot/unmenu/uu

root      1804  0.0  0.0   1616   420 ?        S    Dec16   0:00 logger -tunmenu -plocal7.info -is

root      1807  0.0  0.0   2404  1352 tty1     Ss+  Dec16   0:00 -bash

root      1808  0.0  0.0   1632   520 tty2     Ss+  Dec16   0:00 /sbin/agetty 38400 tty2 linux

root      1809  0.0  0.0   1632   520 tty3     Ss+  Dec16   0:00 /sbin/agetty 38400 tty3 linux

root      1810  0.0  0.0   1636   524 tty4     Ss+  Dec16   0:00 /sbin/agetty 38400 tty4 linux

root      1811  0.0  0.0   1632   520 tty5     Ss+  Dec16   0:00 /sbin/agetty 38400 tty5 linux

root      1812  0.0  0.0   1632   520 tty6     Ss+  Dec16   0:00 /sbin/agetty 38400 tty6 linux

root      1813  0.0  0.0   6464  5488 ?        S    Dec16   0:11 awk -W re-interval -f ./unmenu.awk

root      1838  1.7  0.0      0     0 ?        S    Dec16  53:09 [mdrecoveryd]

.../...

root      1932  7.1  0.0      0     0 ?        S    Dec16 220:06 [unraidd]

root      2073  0.0  0.0      0     0 ?        S    Dec16   0:20 [reiserfs/0]

root      2074  0.0  0.0      0     0 ?        S    Dec16   0:09 [reiserfs/1]

root      2075  0.0  0.0      0     0 ?        S    Dec16   0:11 [reiserfs/2]

root      2076  0.0  0.0      0     0 ?        S    Dec16   0:22 [reiserfs/3]

root      2127  0.2  0.0  56312  4156 ?        Ssl  Dec16   6:45 /usr/local/sbin/shfs /mnt/user -o noatime,big_writes,allow_other,default_permissions

root      2207 17.3  0.0   4244  2616 ?        RN   Dec16 530:33 /bin/bash /boot/packages/cache_dirs -w -m 1 -M 10 -d 9999 -e Work -a -noleaf

root     13486  0.0  0.0  16108  4912 ?        S    Dec17   0:04 /usr/sbin/smbd -D

root     18122  0.0  0.0   1856   760 ?        Ss   12:47   0:00 in.telnetd: Merlin-XP.fritz.box

root     18123  0.0  0.0   2372  1348 pts/0    Ss   12:47   0:00 -bash

root     20291  0.0  0.0   2116   828 pts/0    R+   15:37   0:00 ps ux

root     23209  0.0  0.0      0     0 ?        ZN   06:47   0:00 [cat] <defunct>

root     25696 99.9  0.0   4312  2452 ?        RN   Dec17 1646:19 /bin/bash /boot/packages/cache_dirs -w -m 1 -M 10 -d 9999 -e Work -a -noleaf

root     25697  0.0  0.0      0     0 ?        ZN   Dec17   0:00 [tail] <defunct>

 

There are two PIDs for cache_dirs (2207 and 25696). I killed both with "kill -9 2207" and "kill -9 25696", and they did disappear from the next "ps ux".

 

Now, here is the new info:
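
A note on "kill -9": SIGKILL cannot be caught, so the process gets no chance to clean up. A gentler pattern is to send SIGTERM first and escalate only if the process is still there after a grace period. A minimal sketch, assuming bash; stop_pid and its 5-second default are hypothetical and not part of cache_dirs or unRAID:

```shell
#!/bin/bash
# stop_pid PID [GRACE_SECONDS]
# Send SIGTERM, wait up to GRACE_SECONDS for the process to exit,
# then fall back to SIGKILL. Returns 0 once the process is gone.
stop_pid() {
    local pid=$1 grace=${2:-5} i
    kill -TERM "$pid" 2>/dev/null || return 0    # already gone
    for ((i = 0; i < grace; i++)); do
        kill -0 "$pid" 2>/dev/null || return 0   # signal 0 only tests existence
        sleep 1
    done
    kill -KILL "$pid" 2>/dev/null                # last resort
    sleep 1
    ! kill -0 "$pid" 2>/dev/null
}

# Demonstration on a throwaway process that ignores SIGTERM,
# so the fallback to SIGKILL is actually exercised.
bash -c 'trap "" TERM; while :; do sleep 1; done' &
victim=$!
sleep 1
stop_pid "$victim" 2
result=$?
echo "stop_pid returned $result"
```

On the server this would have been called as, for example, "stop_pid 2207".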

 

root@Tower:~# /usr/bin/fuser -mv /mnt/disk* /mnt/user/*

 

                    USER        PID ACCESS COMMAND

/mnt/user/Files:     root      13486 ..c.. smbd

 

/mnt/user/User:      root      13486 ..c.. smbd

 

/mnt/user/Video:     root      13486 ..c.. smbd

 

/mnt/user/Work:      root      13486 ..c.. smbd

 

/mnt/user/Zardoz:    root      13486 ..c.. smbd

 

/mnt/user/vadeo:     root      13486 ..c.. smbd

 

root@Tower:~# lsof /dev/md*

root@Tower:~#

 

root@Tower:~# lsof | grep /mnt

smbd      13486   root  cwd       DIR       0,14     216       1540 /mnt/user/Video

root@Tower:~#

 

powerdown still doesn't do anything more.

 

The syslog is still growing every 5 seconds:

Dec 18 15:38:59 Tower emhttp: Retry unmounting user share(s)... (Other emhttp)

Dec 18 15:38:59 Tower emhttp: shcmd (10690): umount /mnt/user >/dev/null 2>&1 (Other emhttp)

Dec 18 15:38:59 Tower emhttp: _shcmd: shcmd (10690): exit status: 1 (Other emhttp)

Dec 18 15:38:59 Tower emhttp: shcmd (10691): rmdir /mnt/user >/dev/null 2>&1 (Other emhttp)

Dec 18 15:38:59 Tower emhttp: _shcmd: shcmd (10691): exit status: 1 (Other emhttp)

Dec 18 15:38:59 Tower emhttp: Retry unmounting user share(s)... (Other emhttp)

 

Please help

 

Posted

killall smbd nmbd

 

That should do it since all I see now is the smbd process accessing the user shares.

 

Joe L.

 

Thanks for your answer.

 

In the end, I did a single kill instead, which did the job:

kill -9 2127

 

because of this line in the "ps ux" output:

root      2127  0.2  0.0  56312  4156 ?        Ssl  Dec16   6:45 /usr/local/sbin/shfs /mnt/user -o noatime,big_writes,allow_other,default_permissions

 

and magically the server stopped:

 

root@Tower:~#

Broadcast message from root (Sun Dec 18 18:36:17 2011):

 

The system is going down for system halt NOW!

 

After the reboot there was no parity check, so I think it was a clean shutdown in the end (?).

 

What was this shfs task?

 

Can you explain a little what the killall would do in this case?
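
On the last question: killall sends a signal (SIGTERM unless told otherwise) to every process with the given name, so "killall smbd nmbd" would end the Samba processes that fuser showed holding the user shares, without needing their PIDs. Judging from its command line, shfs is the process that provides the /mnt/user mount, which would explain why killing it let the unmount go through. The name matching can be sketched as follows; pids_by_name is a hypothetical helper for illustration, not how killall is implemented:

```shell
#!/bin/bash
# Print the PIDs whose command name exactly matches one of the given names.
# Reads "PID COMMAND" lines on stdin, e.g. from:  ps -eo pid=,comm=
pids_by_name() {
    awk -v names="$*" '
        BEGIN { n = split(names, want, " ")
                for (i = 1; i <= n; i++) ok[want[i]] = 1 }
        ($2 in ok) { print $1 }
    '
}

# Sample process list; the smbd and shfs PIDs are the ones from this
# thread, the nmbd PID is made up for the example.
sample='13486 smbd
2127 shfs
13490 nmbd'

matched=$(printf '%s\n' "$sample" | pids_by_name smbd nmbd)
echo "$matched"

# On a live system the result would then be passed to kill, roughly:
#   kill $(ps -eo pid=,comm= | pids_by_name smbd nmbd)
```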

 

 

Archived

This topic is now archived and is closed to further replies.
