[v6.10.2] Randomly losing access to SMB shares via Active Directory


Summary:

I noticed some seemingly random issues with my SMB shares since updating from 6.9 to 6.10.2. The latest example is losing access to a media share within a span of 6 hours, with no changes made to any systems. It literally worked at 3am; I went to sleep, woke up, and found I had lost access to the SMB share. Bizarre stuff...

 

Log entries (identifying info removed):

 

[2022/06/10 12:02:55.282902,  0] ../../source3/auth/auth_util.c:1927(check_account)
Jun 10 12:02:55 unraid smbd[2715]:   check_account: Failed to convert SID S-###removed### to a UID (dom_user[DOMAIN\username])

 

Troubleshooting:

 

  • Left and rejoined the domain via the UNRAID web GUI
  • Verified AD health (Repadmin /replsummary and DCDiag /Test:DNS /e /v); all tests passed
  • Verified no errors in domain controller event logs
  • Rebooted client computer (standard Windows access denied error: \\unraid is not accessible. You might not have permission.. blah blah...)
  • Restarted UNRAID array (while leaving and rejoining AD in web GUI)

 

 

I'm fine with things breaking after changes or updates, but it's annoying when things randomly crap out like a jug of bad milk. I'm happy to take any suggestions or post any additional information/logs.

 

 

EDIT #1

After sinking more time into this issue, I believe I've narrowed it down to UNRAID not being able to communicate successfully with AD. The check_account entry appears with the same username (the same one I used to join AD in the web GUI) regardless of which account tries to access a share on the client end.

 

I also don't see the UNRAID server listed anywhere on the AD domain side, which tells me it probably isn't joining the domain correctly.

 

What are some command line things I can run from UNRAID to verify or reestablish the AD connection? Leaving, switching to workgroup and then rejoining AD on the web GUI doesn't seem to be getting it done for me.
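
One possible answer to the question above, as a minimal sketch and not an official Unraid procedure: the Samba tools on the server include read-only checks for the domain join. The helper names run_check and check_ad_join are made up for this example. wbinfo -p, wbinfo -t, wbinfo -P and net ads testjoin are standard Samba commands, but whether they pinpoint this particular failure is an assumption.

```shell
# Sketch: run read-only Samba checks and print PASS/FAIL for each.
# None of these commands modify the domain join.

run_check() {
    # $1 = label, remaining arguments = command to run
    local label="$1"
    shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS  $label"
    else
        echo "FAIL  $label"
        return 1
    fi
}

check_ad_join() {
    local failed=0
    run_check "winbindd is responding (wbinfo -p)"          wbinfo -p        || failed=1
    run_check "machine account secret works (wbinfo -t)"    wbinfo -t        || failed=1
    run_check "domain controller is reachable (wbinfo -P)"  wbinfo -P        || failed=1
    run_check "join is valid (net ads testjoin)"            net ads testjoin || failed=1
    return "$failed"
}
```

Running check_ad_join as root on the server prints one line per check. A FAIL on the secret or testjoin line would point at the machine account rather than at share permissions.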

 

EDIT #2

Since the AD connection is broken (but UNRAID doesn't seem to know this) I disabled AD and went to workgroup mode. This allowed my shares to be accessible again by setting all the shares to public, BUT it means I lost complete control over everything because it's 100% public and ignoring my previously working AD file security settings.

 

Needless to say this is a sucky Band-Aid solution and I hope someone can chime in with a proper way to correctly reestablish my AD link between UNRAID and my domain.

 

EDIT #3

 

Seems I was able to view and list the files, but I can't play or copy them without getting a permissions error. This is officially a crap "day off" that was supposed to be spent lazing around catching up on tv shows... 

Edited by CallOneTech
Link to comment

Glad to see this isn't just an isolated issue I'm having. I'm starting to think this is a legit bug in need of fixing...

 

I was hoping to hear a little more helpful feedback from folks that focus more on the AD integration features. Hopefully if we make enough noise we can get some official input on why the AD/file permissions side of things is so buggy.

 

My plan is to take another deep dive into this issue today and I'll try my best to share my findings in this thread until it gets the attention it needs for a resolution.

Link to comment

I'm having the same issue after upgrading to 6.10.2. Initially I thought it may be a permission issue and running the "new permissions" tool seemed to resolve it, but after about a week I am getting the same error when trying to access SMB shares.

 

I tried to leave/join the domain to see if that resolves it as well, but when I click on the leave button it thinks for a minute then nothing happens. The page reloaded but status is still "joined" and I still see the computer object in AD.

 

Reverting back to 6.9.2 gave me access back to my shares.

Link to comment

Well, so much for that plan... About 24 hours later all of my clients have started losing connection to the shares. It was never an issue before upgrading to 6.10.2, but even downgrading back to 6.9.2 hasn't fully resolved the issue.

 

I can also rule out it being a Windows specific issue, as both my Windows and Linux (via CIFS/autofs) clients lose connection with a permission denied error (or error 13 in the case of CIFS/Linux).

 

Other users seem to be having a similar issue with 6.10 here: 

 

 

And I am seeing the same behavior as this one ("wbinfo -i <user>" can't find domain and "wbinfo -u" lists the usernames I am trying), but I don't think they are on 6.10:

 

When I run "wbinfo -i <user>" I get this response, even though pinging the AD domain and the FQDN of the domain controller works fine and resolves to the right IP (only 1 DC).

-----

failed to call wbcGetpwnam: WBC_ERR_DOMAIN_NOT_FOUND
Could not get info for user <username>

-----
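
Since the pattern described here is "the user is listed by wbinfo -u but wbinfo -i fails", a small loop can show exactly which accounts are affected. This is only a sketch: find_unmapped_users is a made-up helper name, and the LOOKUP variable is a hypothetical override that lets the loop be tried without winbind.

```shell
# Sketch: read usernames on stdin and print those that cannot be resolved
# to a local passwd-style entry. By default each name is looked up with
# "wbinfo -i"; LOOKUP is a hypothetical override for testing.
find_unmapped_users() {
    local lookup="${LOOKUP:-wbinfo -i}"
    local user
    while IFS= read -r user; do
        [ -n "$user" ] || continue
        # $lookup is intentionally unquoted so "wbinfo -i" splits into two words
        if ! $lookup "$user" >/dev/null 2>&1 </dev/null; then
            echo "$user"
        fi
    done
}
```

Typical use would be: wbinfo -u | find_unmapped_users. An empty result means every listed user resolves; any names printed are the ones hitting the lookup failure.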

 

Restarting the samba service doesn't seem to matter. Leaving the domain still appears to be broken in my 6.9.2. Rebooting the entire Unraid server seems to resolve it, but I assume it will come back again after some time. However, this takes forever since I have to stop all my VMs, docker containers, the array, reboot, and reload everything. Not to mention it seems like samba and/or docker doesn't always seem to start correctly on the first reboot... but that's another issue.

 

I am able to get the clients access to the shares again if I create a matching username/password locally on Unraid, so something is up with the AD authentication in Unraid.

 

 

 

Edited by Brianara3
Link to comment

I never had an issue with AD in 6.9.x, but it's been nothing but issues since upgrading to 6.10.x. The odd thing is that it seems to be isolated to my backup user. I'm not sure what is different about this user that makes unraid hate it. I'm sure that unraid is the problem, as I can access files with the same user and machine on my TrueNAS shares. I'm willing to do some troubleshooting if anyone has any ideas. I'm seeing the same response to "wbinfo -i <user>" as Brianara3 for the problem user.

 

Edit: I was able to get the user working again by rebooting unraid, leaving the domain, and rejoining. We'll see how long it takes to break again. 

Edited by mouseskowitz
Link to comment

First, thank you to everyone for contributing to this thread so far. There is 100% something funky under the hood regarding the AD integration and samba in general.

 

I came to the same conclusion as Brianara3 and mouseskowitz about needing to reboot the server to rejoin AD.

 

However, restarting the server just for the sake of regaining the ability to leave and rejoin the domain isn't a proper fix. Without a clear cause as to why the connection with AD failed, we are just waiting for it to randomly break again.

 

Please post back on this thread the next time your connection breaks, and maybe we can start to identify a pattern. For all we know, this bug could be time-based, like a jug of milk.

Link to comment

I have been having the same problems as you all. I had a single user that was getting the same error message while my other users would authenticate without issue.

I found a solution that at least seems to fix the issue for me. After doing these steps the user can authenticate and I don't see the error message. Whether or not the issue will return, I don't know.

 

I found the following post on the Samba mailing list from back in January, in which a user stated that the smb.conf file should contain the following lines. The one generated by unraid did not match. (https://www.spinics.net/lists/samba/msg173243.html )

# Where mydomain.com matches your realm option

idmap config mydomain.com : range = 100000-999999
idmap config mydomain.com : backend = rid
idmap config * : range = 3001-7999
idmap config * : backend = tdb

I added this to the "/boot/config/smb-extra.conf" file, but you can also add it in the UI under "Samba extra configuration".

That post links to this page, which has more information about the change: https://wiki.samba.org/index.php/Idmap_config_rid
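
To confirm the extra lines were actually picked up, the merged configuration can be dumped and filtered. A minimal sketch: filter_idmap_lines is a made-up helper, and it assumes testparm -s prints the idmap options in its dump of the effective settings. As a side note, the Samba wiki page linked above writes the domain part of "idmap config" as the short (workgroup/NetBIOS) domain name, so it may be worth checking which name your setup expects.

```shell
# Sketch: keep only the idmap lines from a dump of the Samba configuration.
# Reads the dump on stdin so it can be fed from testparm or from a saved file.
filter_idmap_lines() {
    grep -i '^[[:space:]]*idmap config' | sed -E 's/^[[:space:]]+//'
}
```

For example: testparm -s 2>/dev/null | filter_idmap_lines. If the four lines from smb-extra.conf do not show up, Samba is not reading them.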

 

I suppose at this point you could reboot the server and it should be resolved, but I ended up just deleting the samba cache files.

I followed instructions from this post and changed it slightly for unraid.

https://serverfault.com/questions/476086/samba-winbind-user-resolution 

/etc/rc.d/rc.samba stop
rm -rf /var/lib/samba
mkdir -p /var/lib/samba/private
rm -rf /var/cache/samba
/etc/rc.d/rc.samba restart
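
The rm -rf steps above throw away Samba's state, and a later reply in this thread reports being locked out after trying this, so taking a copy first seems prudent. A hedged sketch: backup_samba_state is a made-up helper, the SAMBA_LIB and SAMBA_CACHE variables are hypothetical overrides, and the destination path in the usage note is only an example.

```shell
# Sketch: copy Samba's state directories somewhere safe before wiping them.
# Run it after "/etc/rc.d/rc.samba stop" and before the rm -rf lines.
backup_samba_state() {
    # $1 = destination directory for the copies
    local dest="$1"
    local lib="${SAMBA_LIB:-/var/lib/samba}"
    local cache="${SAMBA_CACHE:-/var/cache/samba}"
    mkdir -p "$dest" || return 1
    # Separate target names so the two "samba" directories do not merge
    if [ -d "$lib" ]; then
        cp -a "$lib" "$dest/lib-samba" || return 1
    fi
    if [ -d "$cache" ]; then
        cp -a "$cache" "$dest/cache-samba" || return 1
    fi
    echo "$dest"
}
```

Usage could look like: backup_samba_state "/mnt/user/backups/samba-state-$(date +%Y%m%d-%H%M%S)". If the reset makes things worse, stopping Samba and copying the directories back restores the previous state.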

 

After doing this my problem user was able to login. Just to be sure I rebooted the server and was still able to authenticate without issues.

I don't know if this will fix everyone's issues, but it's worth a try. I also don't know why these idmap options are now required, and only for certain users. There is probably some more information out there about this, but I don't feel like investigating anymore tonight.

 

I did a quick test just to verify this wasn't a fluke by removing the idmap options and restarting samba, and the user immediately failed. Redoing the steps above fixed it again.

Link to comment

My prediction was correct and it's broken again after 7 days. I tried @harpesichord's suggestion and now I'm totally locked out of the folder that was owned by the broken user. I've reverted the change and the names at least show up again for the owner and groups of the other shares. Changing the owner via the command line isn't fixing things like it did in the past, as the permissions aren't being given to the owner. Anyone know how to fix this?

Link to comment

UPDATE:

I don't recommend this, but I resolved mine by changing the owner of "/mnt" (which houses the disk1, disk2, ... diskN folders and the user folders) to my Domain User Account and setting the perms to 777. This allowed ALL machines across ALL servers, and my machines connected to them, to start working again.

Very dangerous change but a great stop gap in the meantime.

Link to comment
On 6/25/2022 at 7:22 PM, Stan464 said:

UPDATE:

I don't recommend this, but I resolved mine by changing the owner of "/mnt" (which houses the disk1, disk2, ... diskN folders and the user folders) to my Domain User Account and setting the perms to 777. This allowed ALL machines across ALL servers, and my machines connected to them, to start working again.

Very dangerous change but a great stop gap in the meantime.

I actually noticed on boot that unraid is setting all my shares to 0777, despite them being listed in unraid as private shares (0755).
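
A quick way to compare the on-disk modes with what the GUI claims is to list the mode and owner of each share directory. This is a sketch only: list_share_modes is a made-up helper, and /mnt/user as the default share root is an assumption about a stock Unraid layout. It relies on GNU stat.

```shell
# Sketch: print "mode owner:group path" for every directory under a share root.
list_share_modes() {
    # $1 = share root; defaults to /mnt/user (assumed Unraid layout)
    local root="${1:-/mnt/user}"
    stat -c '%a %U:%G %n' "$root"/*/
}
```

Running list_share_modes right after boot and again after the array has been up for a while would show whether something rewrites the modes.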

 

Active Directory reports users have correct permissions, but nada when it comes to accessing (denied error).

 

For clarity: my two AD servers run on Unraid. I've been meaning to move one onto Proxmox but haven't had time yet.

 

I'm wondering if it's the same for you? If so, it might be that unraid is not able to join the domain during startup. I know this was easily resolved in 6.9.3 by running "net join -U Administrator" over SSH... but this doesn't seem to resolve it in 6.10.2.

Edited by Darren Cook
Link to comment
5 hours ago, mouseskowitz said:

My DC is on a separate host and available during unraid boot.

So it's not that then. Thanks for clarifying.

 

I remember the last RC (6.9.0?) had issues with AD that were not picked up until the final release, when I flagged it. I have a feeling something was missed again?

Link to comment

Hello,

 

I have the same problem: some users can access the shares without a problem, while other users cannot and get errors: permission denied or wrong credentials :(

I applied the configuration from @harpesichord and rebooted. After the reboot, no users could connect to the shares. I've reverted it and deleted all the Samba config via the GUI.

 

Anyone with an idea?

 

thank you.

 

 

Link to comment

Well, here we are almost a month later and I am still having the same issue with 6.10.x not working with domain-joined SMB. I have even updated to 6.10.3 hoping that would improve things, but it hasn't.

 

I have tried configuring the items that @harpesichord suggested and that worked for *some* of the user accounts, but not others. I still get the same result: the user shows up in wbinfo -u, but wbinfo -i <user> returns an error. wbinfo -i <user> works fine for any users that are able to access shares, but not for the others.

 

I'm pulling my hair out with this one.....

 

Link to comment
2 hours ago, Brianara3 said:

I'm pulling my hair out with this one.....

 



Yeah, it's a strange one. I put in a delayed chmod script to let the shares work via 777 perms until it's resolved.

I have up to date backups so I'm not too worried about deletion in the short term.

Link to comment
  • 2 weeks later...

I have also been having the same problem: I am locked out of all the SMB shares on my UNRAID server. It had been working fine for years, but since the 6.10.2 upgrade access has been intermittent. Most times I am locked out, but occasionally I am able to get in. As yet it is not affecting all users, just some of them.

 

I note my log disk is 98% full, with most of the space taken by some very large syslog files.

 

These syslog files are constantly reporting many SMB errors - mostly refusing mount requests due to "smbd:   check_account: Failed to convert SID S-1-5-21-**********-**********-*********-1105 to a UID (dom_user[DOMAIN\username])" errors.
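
To see which accounts are responsible for the flood, the check_account lines can be tallied per SID and user. A minimal sketch: count_sid_failures is a made-up helper, and it assumes the log lines have the same shape as the ones quoted in this thread.

```shell
# Sketch: read syslog text on stdin and count check_account failures
# per "SID user" pair, most frequent first.
count_sid_failures() {
    grep 'check_account: Failed to convert SID' \
        | sed -E 's/.*Failed to convert SID ([^ ]+) to a UID \(dom_user\[([^]]*)\]\).*/\1 \2/' \
        | sort | uniq -c | sort -rn
}
```

For example: count_sid_failures < /var/log/syslog. If only one or two SIDs dominate the output, that matches the "some users, not all" pattern others have reported here.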

 

Link to comment

So I installed Unraid 2 days ago (version 6.10.3 2022-06-14) and joined it to our Windows domain.

 

AD initial owner: Administrators_xyz
AD initial group: Administrators_xyz

 

I can access the share with users from this group. Even if "everyone" has no rights.

 

I have created a group using the path described in the link.
https://www.linuxserver.io/blog/2015-07-20-how-to-active-directory-on-unraid-6

 

Then I assigned a test user to the ReadWrite group. 
Result: No authorisation. On the domain server, "Effective Access" says that the user has full rights.

 

If I add Full Access under Permissions for Everyone, the user immediately gets access to the share. 

Apparently the manually added groups do not work.
Yes, I logged the user off and back on after assigning the groups. :)

Link to comment
  • 1 month later...
On 6/10/2022 at 9:26 AM, CallOneTech said:

Seems I was able to view and list the files, but I can't play or copy them without getting a permissions error. This is officially a crap "day off" that was supposed to be spent lazing around catching up on tv shows... 

 

 

I have found that you should be able to change the file permissions in Windows Explorer by right-clicking and changing them there. I will say that, for some reason, with a larger dataset you might have to do folders individually. I tried setting permissions on a larger 1 TB dataset and it failed partway through for me, but doing folders one by one, or working on smaller (<100 GB) datasets, or on a smaller number of larger files, works fine.

Link to comment
