CallOneTech

Posts posted by CallOneTech

  1. First, thank you to everyone for contributing to this thread so far. There is 100% something funky under the hood regarding the AD integration and samba in general.

     

    I came to the same conclusion as Brianara3 and mouseskowitz about needing to reboot the server to rejoin AD.

     

    However, restarting the server just for the sake of regaining the ability to leave and rejoin the domain isn't a proper fix. Without a clear cause as to why the connection with AD failed, we are just waiting for it to randomly break again.

     

    Please post back on this thread the next time your connection breaks, and maybe we can start to identify a pattern. For all we know, this bug could be time-based, like a jug of milk.

  2. As the topic states, I'm trying to manually join and verify AD domain membership via the terminal. There seems to be a bug in the web GUI that doesn't allow you to leave a domain. I also have a feeling it is in some strange stuck state that could easily be proven or disproven with manual testing in the terminal.

     

    I have seen posts mentioning doing exactly what I am attempting with scripts, but I haven't seen any examples of exactly how it was done.

     

    I could always dig through the code to see how the web GUI button works, but I was really hoping to avoid doing that if possible.
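
For readers looking for a concrete starting point, here is a minimal sketch of checking and redoing the join from the terminal. It assumes the stock Samba tools `net` and `wbinfo` are on the PATH; whether the Unraid web GUI runs these same commands is not confirmed here. Python may not be installed on a stock Unraid system, so the commands in the lists can also be typed directly. The admin account name is hypothetical.

```python
import shutil
import subprocess

# Read-only checks: neither command changes the join state.
VERIFY_COMMANDS = [
    ["net", "ads", "testjoin"],  # is the machine account still valid in AD?
    ["wbinfo", "-t"],            # can winbind verify the trust secret?
]


def rejoin_commands(admin_user):
    """Return the leave/join pair for a domain admin (hypothetical account name).

    Both commands prompt for the password, so run them interactively.
    """
    return [
        ["net", "ads", "leave", "-U", admin_user],
        ["net", "ads", "join", "-U", admin_user],
    ]


def run(argv):
    """Run one command; return (exit_code, output), or (None, reason) if missing."""
    if shutil.which(argv[0]) is None:
        return None, f"{argv[0]} not found on PATH"
    done = subprocess.run(argv, capture_output=True, text=True)
    return done.returncode, (done.stdout + done.stderr).strip()


if __name__ == "__main__":
    # Only the read-only checks are executed automatically.
    for argv in VERIFY_COMMANDS:
        code, output = run(argv)
        print(" ".join(argv), "->", code, output)
```

A non-zero exit code from either check would suggest the join really is broken, at which point the leave/join pair from `rejoin_commands` can be run by hand.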

  3. Glad to see this isn't just an isolated issue I'm having. I'm starting to think this is a legit bug in need of fixing...

     

    I was hoping to hear a little more helpful feedback from folks that focus more on the AD integration features. Hopefully if we make enough noise we can get some official input on why the AD/file permissions side of things is so buggy.

     

    My plan is to take another deep dive into this issue today and I'll try my best to share my findings in this thread until it gets the attention it needs for a resolution.

  4. Summary:

    I noticed some seemingly random issues with my SMB shares since updating from 6.9 to 6.10.2. The latest example is losing access to a media share in the span of 6 hours after making no changes to any systems. It literally worked at 3am, I went to sleep, then woke up and found I had lost access to the SMB share. Bizarre stuff...

     

    Log entries (identifying info removed):

     

    [2022/06/10 12:02:55.282902,  0] ../../source3/auth/auth_util.c:1927(check_account)
    Jun 10 12:02:55 unraid smbd[2715]:   check_account: Failed to convert SID S-###removed### to a UID (dom_user[DOMAIN\username])
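
To check whether every failure names the same account, the `check_account` lines can be tallied per user. This is a rough sketch that runs against the two sample lines above; the regular expression is written to match this exact message format and is an assumption about how other entries look.

```python
import re
from collections import Counter

# Matches the smbd message shown above and captures the SID and the account.
FAILURE = re.compile(
    r"check_account: Failed to convert SID (?P<sid>\S+) to a UID "
    r"\(dom_user\[(?P<user>[^\]]+)\]\)"
)


def tally_failures(lines):
    """Count SID-to-UID conversion failures per domain account."""
    counts = Counter()
    for line in lines:
        match = FAILURE.search(line)
        if match:
            counts[match.group("user")] += 1
    return counts


sample = [
    "[2022/06/10 12:02:55.282902,  0] ../../source3/auth/auth_util.c:1927(check_account)",
    r"Jun 10 12:02:55 unraid smbd[2715]:   check_account: Failed to convert SID "
    r"S-###removed### to a UID (dom_user[DOMAIN\username])",
]

for user, count in tally_failures(sample).items():
    print(f"{user}: {count} failure(s)")
```

Feeding the function the lines of the real syslog instead of `sample` would show whether one account or many are affected.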

     

    Troubleshooting:

     

    • Left and rejoined the domain via the UNRAID web GUI
    • Verified AD health (Repadmin /replsummary and DCDiag /Test:DNS /e /v); all tests passed
    • Verified no errors in domain controller event logs
    • Rebooted client computer (standard Windows access denied error: \\unraid is not accessible. You might not have permission.. blah blah...)
    • Restarted UNRAID array (while leaving and rejoining AD in web GUI)

     

     

    I'm fine with things breaking after changes or updates, but it's annoying when things randomly crap out like a jug of bad milk. I'm happy to take any suggestions or post any additional information/logs.

     

     

    EDIT #1

    After sinking more time into this issue I believe I narrowed it down to UNRAID not being able to successfully communicate with AD. The check_account entry appears with the same username (the same one I used to join AD in the web GUI) regardless of which account tries to access a share on the client end.

     

    I also don't see the UNRAID server listed anywhere on my AD domain side, which tells me it probably isn't correctly joining the domain.

     

    What are some command line things I can run from UNRAID to verify or reestablish the AD connection? Leaving, switching to workgroup and then rejoining AD on the web GUI doesn't seem to be getting it done for me.
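
One way to test the specific failure in the log (SID to UID conversion) is to ask winbind directly. The sketch below only prints `wbinfo` commands to paste into the UNRAID terminal and runs nothing itself. The flags are standard Samba `wbinfo` options; the account name and SID are placeholders, and Python may not be installed on a stock UNRAID system.

```python
import shlex


def mapping_checks(account, sid):
    """Build wbinfo commands that exercise the name/SID/UID mapping path."""
    return [
        ["wbinfo", "-p"],           # is winbindd running and answering?
        ["wbinfo", "-n", account],  # resolve the account name to a SID
        ["wbinfo", "-S", sid],      # convert a SID to a UID (the step failing in the log)
        ["wbinfo", "-i", account],  # fetch the full user entry via winbind
    ]


# Placeholders only: substitute the real account and the SID from the log entry.
for argv in mapping_checks(r"DOMAIN\username", "S-REDACTED"):
    # shlex.join quotes the backslash so the line is safe to paste into a shell.
    print(shlex.join(argv))
```

If the name resolves to a SID but the SID does not convert to a UID, the problem is more likely in the ID mapping than in the join itself.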

     

    EDIT #2

    Since the AD connection is broken (but UNRAID doesn't seem to know this) I disabled AD and went to workgroup mode. This allowed my shares to be accessible again by setting all the shares to public, BUT it means I completely lost control over everything because it's 100% public and ignoring my previously working AD file security settings.

     

    Needless to say this is a sucky Band-Aid solution and I hope someone can chime in with a proper way to correctly reestablish my AD link between UNRAID and my domain.

     

    EDIT #3

     

    Seems I was able to view and list the files, but I can't play or copy them without getting a permissions error. This is officially a crap "day off" that was supposed to be spent lazing around catching up on tv shows... 

  5. I'm not sure if I would call this 100% fixed, but it's definitely a usable workaround if anyone with a similar issue is reading this thread in the future.

     

    I didn't mess with the RAM cache settings yet, but I did turn on Turbo Write.  Since enabling it I have been able to run a MS Explorer SMB drag and drop style transfer from another machine (2 huge writes total) without having either transfer fail.

     

    The Explorer based copy has slowed a bit.  The bulk of it happens between 80-100MB/s which tells me it is still RAM caching like crazy.  It still goes near 0B/s when flushing and if it is a single huge file it will completely bottom out at 0B/s.  The good news is these stalls don't last long enough to cause the transfers to fail anymore 🥳

     

    Turbo Write seems to allow it to flush fast enough and prevent the major stalls that I was seeing before.   My attached graph seems to verify this because the gaps between the bursts of array write activity are MUCH tighter than they were before.  I haven't timed it, but I assume it was stalling at 0 for ~60-90 seconds VS maybe 5-15 seconds with Turbo enabled.

     

    I assume I could reduce the stalls even more by changing the max RAM cache to 4GB as per @JorgeB, but I don't want to touch anything because it is finally running without me having to babysit.
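
For anyone who wants to look before touching anything, here is a read-only sketch that prints the kernel's current write-cache limits. It assumes the "max RAM cache" setting corresponds to the Linux `vm.dirty_*` sysctls, which this thread does not confirm; the files under `/proc/sys/vm` are standard on Linux.

```python
from pathlib import Path

# Standard Linux sysctls that bound how much unwritten data may sit in RAM.
KNOBS = [
    "dirty_ratio",
    "dirty_background_ratio",
    "dirty_bytes",
    "dirty_background_bytes",
]


def read_dirty_settings(base="/proc/sys/vm"):
    """Return the current value of each knob, or None if the file is absent."""
    settings = {}
    for name in KNOBS:
        path = Path(base) / name
        settings[name] = int(path.read_text()) if path.exists() else None
    return settings


if __name__ == "__main__":
    for name, value in read_dirty_settings().items():
        print(f"vm.{name} = {value}")
    # A 4GB cap expressed in bytes, the unit the *_bytes knobs use.
    print("4GB in bytes:", 4 * 1024**3)
```

Nothing is written, so running it does not change how the server behaves.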

     

    Big thanks to everyone for the helpful insights.

    unraid001.jpg

  6. I am in the process of moving ~15TB of data to a 28TB array and seem to be seeing some strange behavior. I believe it might be explained with the core function of how Linux manages memory, but I wanted to double check before going down a rabbit hole...

     

    Setup:

    • Unraid machine has 72GB of RAM with a 10Gbit NIC.
    • Caching is Disabled
    • Parity is Enabled
    • All Disks are Spinning

     

    Transfer Methods:

    • Windows Server 2012 drag and drop between a local disk and Unraid share (causes the stall)
    • MS Robocopy (works fine unless another machine uses drag and drop to cause a stall)

     

    Experienced Behavior:

     

    I attempt a copy using MS Explorer via drag and drop or copy and paste. The transfer begins at 150-180MB/sec. This part strikes me as odd because it seems to be the max read speed of the source disk, whereas I would expect the write speeds to be closer to ~50MB/sec with parity on and no Unraid cache drive.

     

    The only thing that makes sense is that Unraid is caching my transfer into a slice of the system's 72GB of RAM. This theory is also supported because the chart for array writing activity is near 0MB/s at the start of the SMB transfer (while the RAM cache is filling).

     

    The problem seems to be that once the ram cache fills, it begins to flush its data to the array at a much slower write speed. This process bogs down the whole system and causes any SMB writing from other machines in the network to drop to 0.

    I also see the array write speed chart spike up to maximum write speed. If all the SMB writes have stalled, this must be the ram cache trying to flush?

     

    It feels like I'm filling my ram cache with a fire hose and then trying to empty it with a straw. This behavior would be completely fine if I were transferring ~50GB or less... but not so much with the big load ups.
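
The fire hose and straw picture can be put into rough numbers. This is a back-of-the-envelope sketch: the 72GB and ~50MB/sec figures come from this post, while the cache limits are assumed shares of RAM (20% and 10% are common kernel defaults for `vm.dirty_ratio` and `vm.dirty_background_ratio`, not values confirmed for this machine).

```python
def flush_seconds(cached_gb, write_mb_per_s):
    """Time to drain `cached_gb` of buffered writes at `write_mb_per_s`."""
    return cached_gb * 1000 / write_mb_per_s


RAM_GB = 72          # from the setup list
ARRAY_MB_PER_S = 50  # the parity write speed estimated in this post

# Assumed dirty-page limits as a share of total RAM. The kernel applies these
# percentages to available memory, so treat the results as upper bounds.
for percent in (20, 10, 5):
    cached_gb = RAM_GB * percent / 100
    seconds = flush_seconds(cached_gb, ARRAY_MB_PER_S)
    print(f"{percent}% of RAM = {cached_gb:.1f}GB cached, ~{seconds:.0f}s to flush")
```

Under these assumptions a full 20% cache would need several minutes to drain at parity write speed, which is at least consistent with long stalls once the cache fills.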

     

    My MS Robocopy has been running like a champ for the last 15 hours. It copies slower than MS Explorer and that slower speed seems to avoid the stalling situation. However, I can instantly replicate the stall if I start a drag and drop transfer in MS Explorer from another machine.

     

    How can I better manage some settings to stop my transfers from stalling? I'm fine with the slower write speeds because all this stop and go nonsense is probably just as slow.

     

    Or do I have it completely wrong?

     
