New controller cards, new config, data is rebuilding, but lost /mnt/user


timekiller

So I finally took everyone's advice here and replaced my Marvell-based controller cards (IO Crest 16 Port) with two LSI 9201-16i cards. I also needed to shuffle some disks around, so when I installed the new cards I wound up having to do a new config and start a parity rebuild. It had been running for about 33 hours and everything was going great until about 30 minutes ago, when I got an error deleting a file. Investigation shows that I lost /mnt/user - "Transport endpoint is not connected".

 

Interestingly, /mnt/user0 is still connected and the array is accessible from there. Of course, all of my Docker containers and shares use /mnt/user, so the entire server is now effectively offline. I stopped all my Docker containers to hopefully avoid further issues there.
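
For anyone who wants to check for this state from a script: "Transport endpoint is not connected" is the text for errno ENOTCONN, which commonly shows up when the process behind a FUSE mount has gone away. Below is a minimal Python sketch of that check. The function name `mount_status` and its `stat` parameter are made up for illustration (the parameter only exists so the check can be exercised without a broken mount), and it assumes a plain `os.stat` on the path is enough to trigger the error.

```python
import errno
import os


def mount_status(path, stat=os.stat):
    """Classify a mount path as 'ok', 'disconnected', or 'missing'.

    'disconnected' means stat() failed with ENOTCONN, i.e. the
    "Transport endpoint is not connected" error from the post.
    """
    try:
        stat(path)
    except OSError as exc:
        if exc.errno == errno.ENOTCONN:
            return "disconnected"
        if exc.errno == errno.ENOENT:
            return "missing"
        raise  # anything else is unexpected, so let it surface
    return "ok"
```

Calling `mount_status("/mnt/user")` and `mount_status("/mnt/user0")` on the server described here would presumably report the first as disconnected and the second as ok, matching what was observed by hand.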

 

I assume a reboot will fix this, but 1) I'd like to know what happened here, and 2) I don't want to have to restart the parity rebuild. There is currently an estimated 9 hours left and it appears to be running fine.

 

Do I have any options beyond rebooting and starting over, or going 9 hours or more without my server?

 

Diagnostics attached

storage-diagnostics-20211119-0937.zip

Nov 19 09:28:27 Storage kernel: Call Trace:
Nov 19 09:28:27 Storage kernel: fh_getattr+0x45/0x5f [nfsd]
Nov 19 09:28:27 Storage kernel: fill_post_wcc+0x2c/0x94 [nfsd]
Nov 19 09:28:27 Storage kernel: fh_unlock+0x12/0x33 [nfsd]
Nov 19 09:28:27 Storage kernel: nfsd3_proc_rmdir+0x4a/0x4f [nfsd]
Nov 19 09:28:27 Storage kernel: nfsd_dispatch+0xb0/0x11e [nfsd]
Nov 19 09:28:27 Storage kernel: svc_process+0x3dd/0x546 [sunrpc]
Nov 19 09:28:27 Storage kernel: ? nfsd_svc+0x27e/0x27e [nfsd]
Nov 19 09:28:27 Storage kernel: nfsd+0xef/0x146 [nfsd]
Nov 19 09:28:27 Storage kernel: ? nfsd_destroy+0x57/0x57 [nfsd]
Nov 19 09:28:27 Storage kernel: kthread+0xe5/0xea
Nov 19 09:28:27 Storage kernel: ? __kthread_bind_mask+0x57/0x57
Nov 19 09:28:27 Storage kernel: ret_from_fork+0x22/0x30
Nov 19 09:28:27 Storage kernel: ---[ end trace 83e36f2bb8ca0fa2 ]---

 

nfsd is the NFS daemon.
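
The module name in square brackets at the end of each frame is what points at NFS here. As a rough illustration, the Python sketch below tallies which kernel modules appear in a call trace like the one above. The regular expression and the function name `modules_in_trace` are assumptions written for this log format, not part of any Unraid tooling, and frames without a bracketed module are counted under the label "core kernel".

```python
import re
from collections import Counter

# Matches a stack frame such as:
#   "... kernel: ? nfsd_svc+0x27e/0x27e [nfsd]"
# Group 1 is the symbol, group 2 the optional module name in brackets.
FRAME = re.compile(
    r"kernel:\s+(?:\?\s+)?(\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def modules_in_trace(lines):
    """Count call-trace frames per kernel module."""
    counts = Counter()
    for line in lines:
        match = FRAME.search(line)
        if match:
            counts[match.group(2) or "core kernel"] += 1
    return counts
```

Feeding it the syslog lines of a trace and looking at `modules_in_trace(lines).most_common()` makes it harder to skim past the dominant module.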

9 minutes ago, JorgeB said:
nfsd is the NFS daemon.

Yup, saw the call trace and my eyes skimmed right past the nfsd stuff - thanks!
