  • [6.12.13], first seen in [6.12.11]: NFS3 not working properly because nfsd and lockd are not registered with portmapper


    murkus
    • Minor

    NFS3 clients go through portmapper to access nfsd.

    Up to Unraid 6.12.10, nfsd and lockd were registered with portmapper, so NFS3 clients worked.

    Starting with 6.12.11, nfsd and lockd are not registered with portmapper, so NFS3 clients fail to mount shares from the Unraid nfsd.

    NFS4 clients do not go through portmapper to access nfsd, which is why NFS4 clients still work with 6.12.11.
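
    For illustration, the difference shows up in how the share is mounted (a sketch; "tower" and the share path are placeholders):

    # NFS3: the client first asks portmapper (port 111) where mountd and
    # nfsd live, so the mount fails when they are not registered.
    mount -t nfs -o vers=3 tower:/mnt/user/share /mnt/test

    # NFS4: the client connects straight to nfsd on port 2049; no
    # portmapper lookup is involved.
    mount -t nfs -o vers=4 tower:/mnt/user/share /mnt/test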

    To be precise: after rebooting Unraid 6.12.11, all necessary RPC daemons are registered with portmapper as long as the array has not been started. After starting the array, nfsd and nlockmgr are gone from portmapper (see rpcinfo on the console).
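
    A quick way to verify this from the console (the grep pattern is just illustrative):

    # List everything registered with portmapper and filter for the
    # NFS-related services.
    rpcinfo -p | grep -E 'nfs|nlockmgr|mountd'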

     

    Also, using NFS4 with Kodi does not seem to be a universal solution. It works for video files, but it doesn't work for many music files (NFS4: READ (path (null)) failed with NFS4ERR_OLD_STATEID(-5), NFS4: CLOSE (path (null)) failed with NFS4ERR_BAD_STATEID(-22)). This does NOT seem to be a result of lockmgr missing from portmapper, as NFS3 with Kodi and Unraid 6.12.10 works just fine. I would guess it is an implementation problem in the Kodi NFS4 client.

     

    Bottom line: I had to downgrade from 6.12.11 to 6.12.10 to be able to use Kodi with NFS3 and Unraid. This is a main use case for me. (I do not want to switch to SMB.)




    User Feedback

    Recommended Comments



    This is what I see on 6.12.11 on a fresh reboot with the array started:

    root@BackupServer:~# rpcinfo -p
       program vers proto   port  service
        100000    4   tcp    111  portmapper
        100000    3   tcp    111  portmapper
        100000    2   tcp    111  portmapper
        100000    4   udp    111  portmapper
        100000    3   udp    111  portmapper
        100000    2   udp    111  portmapper
        100024    1   udp  43852  status
        100024    1   tcp  58807  status
        100003    3   tcp   2049  nfs
        100003    4   tcp   2049  nfs
        100003    3   udp   2049  nfs
        100021    1   udp  43597  nlockmgr
        100021    3   udp  43597  nlockmgr
        100021    4   udp  43597  nlockmgr
        100021    1   tcp  33137  nlockmgr
        100021    3   tcp  33137  nlockmgr
        100021    4   tcp  33137  nlockmgr
        100005    1   udp  34110  mountd
        100005    1   tcp  45221  mountd
        100005    2   udp  57340  mountd
        100005    2   tcp  57893  mountd
        100005    3   udp  47250  mountd
        100005    3   tcp  56383  mountd

    I don't see what you are describing. How are you seeing that nfs and lockmgr are not registered?

    Link to comment
    6 hours ago, dlandon said:

    This is what I see on 6.12.11 on a fresh reboot with the array started: [rpcinfo output snipped; identical to the listing above]

    I don't see what you are describing. How are you seeing that nfs and lockmgr are not registered?

    On your server they are registered; all is good on your side.

    In mine they go away after starting the array. And it is reproducible on my server. I use rpcinfo to see what is registered, just like you did.

    Edited by murkus
    Link to comment

    I need to mention that I have set fixed port numbers in /etc/default/rpc for statd, mountd and lockd using a user script. I didn't touch /etc/default/nfs.

     

    Link to comment
    1 hour ago, murkus said:

    I need to mention that I have set fixed port numbers in /etc/default/rpc for statd, mountd and lockd using a user script. I didn't touch /etc/default/nfs.

     

    Why are you doing that?

    Link to comment

    If you restart nfsd does it work properly?

    /etc/rc.d/rc.nfsd restart

     

    Edit: Actually, I'm beginning to see the issue here. If you run this command before restarting nfsd, all should work:

    /etc/rc.d/rc.rpc restart
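
    Put together, the suggested recovery sequence would be (a sketch based on the two commands above):

    # Restart rpcbind first so the services can re-register, then
    # restart the NFS daemon.
    /etc/rc.d/rc.rpc restart
    /etc/rc.d/rc.nfsd restart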

     

    Link to comment

    RPCBIND_OPTS="-h <redacted>"
    RPC_STATD_PORT=950
    LOCKD_TCP_PORT=4045
    LOCKD_UDP_PORT=4045
    RPC_MOUNTD_PORT=635
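
    For context, a user script applying these values might look roughly like this (a minimal sketch, assuming the variables already exist uncommented in /etc/default/rpc; not necessarily the exact script in use, and note the final restart is exactly the step that trips the 6.12.11 bug discussed below):

    #!/bin/bash
    # Pin statd, lockd and mountd to fixed ports by rewriting the
    # corresponding variables in /etc/default/rpc.
    sed -i \
        -e 's/^RPC_STATD_PORT=.*/RPC_STATD_PORT=950/' \
        -e 's/^LOCKD_TCP_PORT=.*/LOCKD_TCP_PORT=4045/' \
        -e 's/^LOCKD_UDP_PORT=.*/LOCKD_UDP_PORT=4045/' \
        -e 's/^RPC_MOUNTD_PORT=.*/RPC_MOUNTD_PORT=635/' \
        /etc/default/rpc

    # Restart the RPC services and nfsd so the new ports take effect.
    /etc/rc.d/rc.rpc restart
    /etc/rc.d/rc.nfsd restart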

    Edited by murkus
    Link to comment

    There was a Kernel update in 6.12.11 that causes an issue with restarting nfsd.  The way you can solve this in your situation is to make the /etc/default/rpc file changes in the /flash/config/go file that is run before the first start of nfsd.

     

    We are looking at a fix for this, but are unsure of what we can do because we do not know which kernel commit is the culprit.  6.12.12 will have a change that might help, but there are no guarantees.

    Link to comment
    13 hours ago, dlandon said:

    There was a Kernel update in 6.12.11 that causes an issue with restarting nfsd.  The way you can solve this in your situation is to make the /etc/default/rpc file changes in the /flash/config/go file that is run before the first start of nfsd.

     

    We are looking at a fix for this, but are unsure of what we can do because we do not know which kernel commit is the culprit.  6.12.12 will have a change that might help, but there are no guarantees.

    Is this the likely cause of all the other NFS issues that have been posted too?

     

    Curious about the time frame for the fix release. Also, if this isn't fixed properly in 6.12.12, can we have the official workaround documented in the 6.12.12 release notes, please?

    Link to comment
    6 hours ago, warpspeed said:

    Is this the likely cause of all the other NFS issues that have been posted too?

    I don't think so.

     

    6 hours ago, warpspeed said:

    Curious about the time frame for the fix release. Also, if this isn't fixed properly in 6.12.12, can we have the official workaround documented in the 6.12.12 release notes, please?

    We are working on a 6.12.12 release.

     

    This is an extreme corner case and not something a normal user will do; in fact, it's probably not a very good idea unless you really know what you are doing.

    Link to comment

    I can't speak for the OP and this specific issue, but I had to roll back from the release as my NFS mounting became unstable, to the point that I had to restart Unraid/the array to get it fixed. Restarting the NFS service didn't really fix the issue, but to be fair, I didn't attempt to restart RPC at the time. Since rolling back, my NFS hasn't failed in the same way.

     

    Link to comment
    On 8/13/2024 at 7:01 PM, dlandon said:

    There was a Kernel update in 6.12.11 that causes an issue with restarting nfsd.  The way you can solve this in your situation is to make the /etc/default/rpc file changes in the /flash/config/go file that is run before the first start of nfsd.

     

    What is wrong with using a user script that runs "At Startup of Array"? This works with 6.12.10 and it sounds reasonable. I think user scripts are a good place to put such things. Anything that needs to be written to config files manually by logging in to the shell is doable for me, but I try to stay away from that, as such changes are less likely to carry over to a new install or across OS updates.

     

    I second the proposal to document this issue as a known problem with 6.12.11.

    Link to comment
    3 minutes ago, murkus said:

     

    What is wrong with using a user script that runs "At Startup of Array"? This works with 6.12.10 and it sounds reasonable. I think user scripts are a good place to put such things. Anything that needs to be written to config files manually by logging in to the shell is doable for me, but I try to stay away from that, as such changes are less likely to carry over to a new install or across OS updates.

     

    I second the proposal to document this issue as a known problem with 6.12.11.

    User Scripts is a good way to do this, but it runs too late in the process. Adding the script you use to change the config file to the '/flash/config/go' file will allow your changes to be applied on the first run of nfsd. The 'go' file is a script run early in the startup process, before the array is started. The issue is that when nfsd is restarted, the changes are not applied properly.
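
    As a sketch, the go file addition could look like this (assuming the port values posted earlier; the stock go file otherwise just starts emhttp):

    #!/bin/bash
    # /flash/config/go -- runs early at boot, before the array starts
    # and before the first start of nfsd.

    # Append the fixed RPC ports; the rc scripts source this file, so
    # these later assignments override any defaults above them.
    {
      echo 'RPC_STATD_PORT=950'
      echo 'LOCKD_TCP_PORT=4045'
      echo 'LOCKD_UDP_PORT=4045'
      echo 'RPC_MOUNTD_PORT=635'
    } >> /etc/default/rpc

    # Stock go file content: start the Unraid web UI.
    /usr/local/sbin/emhttp &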

    Link to comment

    Thanks for the explanation, but I doubt that this will solve the problem, for the following reasons:

    - the ports are registered correctly with rpcbind initially after bootup, so running my script earlier than it runs now will not help at all; everything is fine initially (the array is not yet started)

     

    - the ports are not registered correctly AFTER starting the array. My user script is set to run "At startup of array", which sounds like exactly the right point in the process for this to happen, but it doesn't work

    Link to comment
    23 minutes ago, murkus said:

    Thanks for the explanation, but I doubt that this will solve the problem, for the following reasons:

    - the ports are registered correctly with rpcbind initially after bootup, so running my script earlier than it runs now will not help at all; everything is fine initially (the array is not yet started)

     

    - the ports are not registered correctly AFTER starting the array. My user script is set to run "At startup of array", which sounds like exactly the right point in the process for this to happen, but it doesn't work

    It will work if the array is auto-started when booted. The issue is with restarting nfsd. If the array is started manually after rebooting, nfsd will not restart properly.

     

    Try updating to 6.12.13 and see if that addresses the issue.

    Link to comment
    16 minutes ago, dlandon said:

    It will work if the array is auto-started when booted. The issue is with restarting nfsd. If the array is started manually after rebooting, nfsd will not restart properly.

     

    Try updating to 6.12.13 and see if that addresses the issue.

    I just updated to 6.12.13. It does NOT solve the problem. The release notes are incorrect regarding this item:

    Fix: After stopping and then restarting the array, nfsd is not running

     

    It is nice that the issue was picked up, but we are not there yet. Sadly, I have to go back to 6.12.10.

    Edited by murkus
    Link to comment
    1 minute ago, dlandon said:

    I'm not seeing the same thing.  Please post a new diagnostics.

     

    I am sorry, but I cannot help with this: the diagnostics violate my data disclosure policy. I would only be able to provide them 1:1 to the developers.

    Link to comment

    After booting up 6.12.13 and starting the array, and realizing that nfsd was not working, I wanted to stop the array. I had no shares mounted and no sshd sessions on the machine, but the array wouldn't stop, hanging at "Retry unmounting...". That doesn't sound right either. "Open Files" only shows emhttpd using some /usr/local files and shfs using appdata files...

     

    I can see that nfsd is actually running; it just isn't registered with rpcbind.

     

    Link to comment
    11 minutes ago, murkus said:

     

    I am sorry, but I cannot help with this: the diagnostics violate my data disclosure policy. I would only be able to provide them 1:1 to the developers.

    PM the diagnostics to me.

    Link to comment
    5 minutes ago, murkus said:

    I can see that nfsd is actually running; it just isn't registered with rpcbind.

    Show the output of rpcinfo.

    Link to comment

    This is after bootup and before starting the array on 6.12.13:

     

       program version netid     address                service    owner
        100000    4    tcp6      ::.0.111               portmapper superuser
        100000    3    tcp6      ::.0.111               portmapper superuser
        100000    4    udp6      ::1.0.111              portmapper superuser
        100000    3    udp6      ::1.0.111              portmapper superuser
        100000    4    tcp       0.0.0.0.0.111          portmapper superuser
        100000    3    tcp       0.0.0.0.0.111          portmapper superuser
        100000    2    tcp       0.0.0.0.0.111          portmapper superuser
        100000    4    udp       redacted.0.111     portmapper superuser
        100000    3    udp       redacted.0.111     portmapper superuser
        100000    2    udp       redacted.0.111     portmapper superuser
        100000    4    local     /var/run/rpcbind.sock  portmapper superuser
        100000    3    local     /var/run/rpcbind.sock  portmapper superuser
        100024    1    udp       0.0.0.0.211.51         status     32
        100024    1    tcp       0.0.0.0.131.53         status     32
        100024    1    udp6      ::.164.85              status     32
        100024    1    tcp6      ::.215.169             status     32
        100003    3    tcp       0.0.0.0.8.1            nfs        superuser
        100003    4    tcp       0.0.0.0.8.1            nfs        superuser
        100003    3    udp       0.0.0.0.8.1            nfs        superuser
        100021    1    udp       0.0.0.0.222.77         nlockmgr   superuser
        100021    3    udp       0.0.0.0.222.77         nlockmgr   superuser
        100021    4    udp       0.0.0.0.222.77         nlockmgr   superuser
        100021    1    tcp       0.0.0.0.176.19         nlockmgr   superuser
        100021    3    tcp       0.0.0.0.176.19         nlockmgr   superuser
        100021    4    tcp       0.0.0.0.176.19         nlockmgr   superuser
        100021    1    udp6      ::.221.142             nlockmgr   superuser
        100021    3    udp6      ::.221.142             nlockmgr   superuser
        100021    4    udp6      ::.221.142             nlockmgr   superuser
        100021    1    tcp6      ::.134.99              nlockmgr   superuser
        100021    3    tcp6      ::.134.99              nlockmgr   superuser
        100021    4    tcp6      ::.134.99              nlockmgr   superuser
        100005    1    udp       0.0.0.0.154.105        mountd     superuser
        100005    1    tcp       0.0.0.0.185.193        mountd     superuser
        100005    1    udp6      ::.159.61              mountd     superuser
        100005    1    tcp6      ::.129.19              mountd     superuser
        100005    2    udp       0.0.0.0.154.118        mountd     superuser
        100005    2    tcp       0.0.0.0.163.35         mountd     superuser
        100005    2    udp6      ::.146.23              mountd     superuser
        100005    2    tcp6      ::.176.15              mountd     superuser
        100005    3    udp       0.0.0.0.170.236        mountd     superuser
        100005    3    tcp       0.0.0.0.177.37         mountd     superuser
        100005    3    udp6      ::.207.70              mountd     superuser
        100005    3    tcp6      ::.186.85              mountd     superuser

    Now after starting the array on 6.12.13:

       program version netid     address                service    owner
        100000    4    tcp6      ::.0.111               portmapper superuser
        100000    3    tcp6      ::.0.111               portmapper superuser
        100000    4    udp6      ::1.0.111              portmapper superuser
        100000    3    udp6      ::1.0.111              portmapper superuser
        100000    4    tcp       0.0.0.0.0.111          portmapper superuser
        100000    3    tcp       0.0.0.0.0.111          portmapper superuser
        100000    2    tcp       0.0.0.0.0.111          portmapper superuser
        100000    4    udp       redacted.0.111     portmapper superuser
        100000    3    udp       redacted.0.111     portmapper superuser
        100000    2    udp       redacted.0.111     portmapper superuser
        100000    4    local     /var/run/rpcbind.sock  portmapper superuser
        100000    3    local     /var/run/rpcbind.sock  portmapper superuser
        100024    1    udp       0.0.0.0.3.182          status     32
        100024    1    tcp       0.0.0.0.3.182          status     32
        100024    1    udp6      ::.3.182               status     32
        100024    1    tcp6      ::.3.182               status     32
        100005    1    udp       0.0.0.0.2.123          mountd     superuser
        100005    1    tcp       0.0.0.0.2.123          mountd     superuser
        100005    1    udp6      ::.2.123               mountd     superuser
        100005    1    tcp6      ::.2.123               mountd     superuser
        100005    2    udp       0.0.0.0.2.123          mountd     superuser
        100005    2    tcp       0.0.0.0.2.123          mountd     superuser
        100005    2    udp6      ::.2.123               mountd     superuser
        100005    2    tcp6      ::.2.123               mountd     superuser
        100005    3    udp       0.0.0.0.2.123          mountd     superuser
        100005    3    tcp       0.0.0.0.2.123          mountd     superuser
        100005    3    udp6      ::.2.123               mountd     superuser
        100005    3    tcp6      ::.2.123               mountd     superuser

    processes:

    ps -elf | grep nfs
    1 I root     12096     2  0  80   0 -     0 svc_re 16:48 ?        00:00:00 [nfsd]
    1 I root     12097     2  0  80   0 -     0 svc_re 16:48 ?        00:00:00 [nfsd]
    1 I root     12098     2  0  80   0 -     0 svc_re 16:48 ?        00:00:00 [nfsd]
    1 I root     12099     2  0  80   0 -     0 svc_re 16:48 ?        00:00:00 [nfsd]
    1 I root     12100     2  0  80   0 -     0 svc_re 16:48 ?        00:00:00 [nfsd]
    1 I root     12101     2  0  80   0 -     0 svc_re 16:48 ?        00:00:00 [nfsd]
    1 I root     12102     2  0  80   0 -     0 svc_re 16:48 ?        00:00:00 [nfsd]
    1 I root     12103     2  0  80   0 -     0 svc_re 16:48 ?        00:00:00 [nfsd]
    1 I root     16156     2  0  60 -20 -     0 rescue 16:53 ?        00:00:00 [nfsiod]


    I don't see NFS- or RPC-related messages in the system log that would point to a particular problem with registering nfsd.
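
    A reading note for the two listings above: the second one is indeed missing the nfs (100003) and nlockmgr (100021) rows entirely, matching the report. rpcinfo without -p prints universal addresses, where the last two dot-separated fields are the high and low bytes of the port; a small sketch for decoding them:

    # Decode the port from an rpcinfo universal address, e.g.
    # "0.0.0.0.8.1" -> 8*256 + 1 = 2049 (the standard nfs port).
    addr="0.0.0.0.8.1"
    hi=${addr%.*}; hi=${hi##*.}   # second-to-last field: high byte
    lo=${addr##*.}                # last field: low byte
    echo $(( hi * 256 + lo ))     # prints 2049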

    Link to comment




