Posts posted by ericswpark

  1. @itimpi thanks! I found the diagnostics that UnRAID saved before rebooting (attached below). Note that while the timestamp looks newer, that's because the diagnostics in the OP were generated and downloaded through my browser, which is in an earlier time zone. This file was captured before the one in the OP.

     

    I did skim through the logs, but I don't understand what they mean:

     

    Dec  3 01:15:55 dipper root: rmdir: failed to remove '/mnt/user': Device or resource busy
    Dec  3 01:15:55 dipper emhttpd: shcmd (1343364): exit status: 1
    Dec  3 01:15:55 dipper emhttpd: shcmd (1343366): rm -f /boot/config/plugins/dynamix/mover.cron
    Dec  3 01:15:55 dipper emhttpd: shcmd (1343367): /usr/local/sbin/update_cron
    Dec  3 01:15:55 dipper emhttpd: Retry unmounting user share(s)...
    Dec  3 01:15:58 dipper root: Status of all loop devices
    Dec  3 01:15:58 dipper root: /dev/loop1: [2049]:12 (/boot/previous/bzfirmware)
    Dec  3 01:15:58 dipper root: /dev/loop0: [2049]:10 (/boot/previous/bzmodules)
    Dec  3 01:15:58 dipper root: Active pids left on /mnt/*
    Dec  3 01:15:58 dipper root:                      USER        PID ACCESS COMMAND
    Dec  3 01:15:58 dipper root: /mnt/addons:         root     kernel mount /mnt/addons
    Dec  3 01:15:58 dipper root: /mnt/cache:          root     kernel mount /mnt/cache
    Dec  3 01:15:58 dipper root: /mnt/disk1:          root     kernel mount /mnt/disk1
    Dec  3 01:15:58 dipper root: /mnt/disk2:          root     kernel mount /mnt/disk2
    Dec  3 01:15:58 dipper root: /mnt/disk3:          root     kernel mount /mnt/disk3
    Dec  3 01:15:58 dipper root: /mnt/disks:          root     kernel mount /mnt/disks
    Dec  3 01:15:58 dipper root: /mnt/remotes:        root     kernel mount /mnt/remotes
    Dec  3 01:15:58 dipper root: /mnt/rootshare:      root     kernel mount /mnt/rootshare
    Dec  3 01:15:58 dipper root: /mnt/user:           root     kernel mount /mnt/user
    Dec  3 01:15:58 dipper root:                      root       1142 ..c.. rpcd_mdssvc
    Dec  3 01:15:58 dipper root:                      root       2998 ..c.. rpcd_mdssvc
    Dec  3 01:15:58 dipper root: Active pids left on /dev/md*
    Dec  3 01:15:58 dipper root: Generating diagnostics...
    Dec  3 01:16:00 dipper emhttpd: shcmd (1343368): /usr/sbin/zfs unmount -a
    Dec  3 01:16:00 dipper emhttpd: shcmd (1343369): umount /mnt/user
    Dec  3 01:16:00 dipper root: umount: /mnt/user: target is busy.
    Dec  3 01:16:00 dipper emhttpd: shcmd (1343369): exit status: 32
    Dec  3 01:16:00 dipper emhttpd: shcmd (1343370): rmdir /mnt/user
    Dec  3 01:16:00 dipper root: rmdir: failed to remove '/mnt/user': Device or resource busy
    Dec  3 01:16:00 dipper emhttpd: shcmd (1343370): exit status: 1
    Dec  3 01:16:00 dipper emhttpd: shcmd (1343372): rm -f /boot/config/plugins/dynamix/mover.cron
    Dec  3 01:16:00 dipper emhttpd: shcmd (1343373): /usr/local/sbin/update_cron
    Dec  3 01:16:00 dipper emhttpd: Retry unmounting user share(s)...

     

    I'm guessing PIDs 1142 and 2998 of the `rpcd_mdssvc` process are what's blocking the unmount, but I don't know what that process is. Google says it's related to Samba, but I thought UnRAID was supposed to terminate Samba before trying to unmount. Any ideas why that didn't happen here?
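
    If it happens again, this is roughly what I'd run from an SSH session while the unmount is stuck, to see what's still holding /mnt/user (basically the same info the shutdown script prints above, but live):

    fuser -vm /mnt/user     # list processes using anything under /mnt/user
    lsof /mnt/user          # same idea, with open file details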

    dipper-diagnostics-20231203-0115.zip

  2. Hi,

     

    After updating from 6.12.5 to 6.12.6, I issued a reboot, and when the server came back up it reported an unclean shutdown. I'm guessing something took too long to unmount and UnRAID had to force the reboot.

     

    I don't remember where the logs for the hard shutdown are stored – is there a way to recover them? I've attached a diagnostics zip taken after the reboot, but I'm not sure if it includes the logs from before the shutdown. Any advice on pinpointing the cause would be appreciated!
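
    My best guess is the logs folder on the flash drive, but I'm not sure that's still the right place:

    ls -l /boot/logs/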

    dipper-diagnostics-20231202-1120.zip

  3. Hi, is there a way to back up the Docker apps without stopping the containers? I know that might result in a partial backup as files are modified by the running containers, but the current process means about 30 minutes of downtime every week.
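
    Something like this is what I have in mind. Just a rough sketch, and the paths are assumptions about where appdata and the backup target live:

    # copy appdata while the containers keep running (files may be mid-write, as noted above)
    rsync -a --delete /mnt/cache/appdata/ /mnt/user/backups/appdata-live/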

  4. 1 hour ago, Michael_P said:

    Exactly the point I'm trying to make - even with high latency, you should be able to saturate your 100Mb link - most people that have issues are on 1Gb or higher links. If you're hell bent on it, I'd set up a lightweight VM

    Not really. When I said "there's no reason why SMB can't do so as well", I meant that in terms of encryption overhead. I also included the qualifier: "given improved latency."

     

    However, when you put SMB and SFTP next to each other, you can see why SMB doesn't perform very well over a VPN. SMB is a very chatty protocol, and the increased latency means slower transfer speeds and the general choppiness I'm experiencing. In terms of protocol overhead, SMB just can't compete with SFTP (or other protocols).

     

    I was hoping there would be some sort of Docker thing available, but I think I may have to resort to a VM, as you said. Thanks for the reply.

  5. 16 hours ago, Michael_P said:

     

    It could be if it's a low end processor, especially using SMB

    Tailscale's DERP servers are only used as a last resort, and when they are being used I don't fault the SMB speeds.

     

    The server is running a 4600G, and the client device is an M1 Pro. I don't think encryption overhead is to blame here.

     

    If SFTP over the same connection can do 10 MB/s, then there's no reason why SMB can't do so as well, given improved latency. But a VPN will always add latency, and SMB suffers over high-latency links.

     

    Anyway, I think we've gone off topic. I'm looking for a protocol or something else to use as an alternative to SMB, only for when I'm accessing the server away from home. UnRAID's built-in SFTP (and FTP) servers don't cut it because they only permit root login, which I can't give out to my users (and it's generally more tedious, since I have to go in and manually fix permissions after the fact). At this point, I don't think trying to improve something that's running up against SMB's protocol limits is going to help.

  6. 3 hours ago, Michael_P said:

     

    Define 'better speeds', what are you seeing now? What's the link's speed? What's the VPN server's specs?

     

    SMB over VPN tends to be slower, but it can also be plenty fast

    Sure! So on SMB over VPN, I'm getting around 1-5 MB/s on a good day, but with horrible latency. That means that file listings take a long time to load (around 10-20 seconds), transfers take a long time to start (around 30 seconds minimum), and the actual progress update is rather sporadic and "jumpy" (about once every 10-20 seconds).

     

    By contrast, on LAN (again, 50 ms ping on average), listings take less than 2 seconds, transfers start in less than 2 seconds, and progress updates come in rapidly (about once a second).

     

    I get 200 Mbps down/up (symmetrical), and the server has 100 Mbps down/up (also symmetrical). I know losses and overhead and all that, but I should be getting around 10 MB/s, not 1-5 MB/s. And I do get 10 MB/s if I transfer using UnRAID's built-in SFTP.
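
    For reference, that 10 MB/s figure comes from the slower end of the path (rough numbers):

    100 Mbps / 8 bits per byte = 12.5 MB/s raw
    12.5 MB/s x ~0.8 for protocol/VPN overhead = ~10 MB/s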

     

    Tailscale uses DERP relays in case a direct connection cannot be established, but most of my transfers are done over a direct link, so VPN server speed shouldn't be an issue. On restrictive networks I take the DERP relay overhead into account.

     

    I mean, 1-5 MB/s over VPN is also "fast", but compared to SFTP it's too slow and too high-latency to be usable. Sure, it works in a pinch, but actually navigating and accessing files on the NAS is painful, to the point where all of my family members complain whenever they have to do it away from home.

  7. When in an SSH session, I sometimes need to modify/move/create a lot of files and directories. Since I can only log in as the root user, this messes up the permissions.

    I know I can technically `chmod` them, but it would be way easier to switch to the `nobody` user with `su nobody`, make the necessary changes, then exit the session and be dropped right back to the root SSH shell.
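
    Something like this is what I'm after (a rough sketch; the -s flag is there because, if I remember right, nobody's shell is set to /bin/false):

    su -s /bin/bash nobody    # switch to the nobody user with a usable shell
    cd /mnt/user/some-share   # (path is just an example)
    mkdir new-folder          # anything created here is owned by nobody, not root
    exit                      # drops back to the root SSH shell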

  8. Basically title. When I'm at home, I get great performance. But when I'm away from my home network and access my server over a VPN, the added latency makes SMB slow to a crawl. Think fifteen seconds just to pull up a directory listing, or 30 seconds to transfer a 2 MB file.

     

    How much latency? Not much. When I'm at home, I get <100 ms latency. When I'm out and about, the latency varies depending on the connection, but I've noticed the horrible performance even with 200 ms latency.

     

    I wanted to use something like SFTP, but learned that UnRAID doesn't support SFTP unless you log in as the root user. That's a huge pain, because I have to manually fix permissions afterwards, and the other members of my family obviously can't be given the root account just to access files on the server.

     

    I previously tried Nextcloud, but found the experience very janky. The Nextcloud Docker tends to corrupt the DB, and having Nextcloud as a middleman still feels clunky compared to mounting the server directly using SMB/SSHFS.

     

    Anybody have any tips on alternative file access methods? Maybe an FTPS/SFTP server via Docker? How do you access files on your server when you're away from home, over a VPN with high latency?

  9. There is an error in the console during the update:

     

    Uncaught TypeError: progress_span[data[1]] is null
        progress_dots[data[1]]< https://dipper:1443/Docker:1514
        setInterval handler* https://dipper:1443/Docker:1514
        emit https://dipper:1443/webGui/javascript/dynamix.js?v=1680052794:15
        onmessage https://dipper:1443/webGui/javascript/dynamix.js?v=1680052794:15
        listen https://dipper:1443/webGui/javascript/dynamix.js?v=1680052794:15
        start https://dipper:1443/webGui/javascript/dynamix.js?v=1680052794:15
        openDocker https://dipper:1443/Docker:252
        updateContainer https://dipper:1443/plugins/dynamix.docker.manager/javascript/docker.js?v=1678007099:108
        i https://dipper:1443/webGui/javascript/dynamix.js?v=1680052794:51
        h https://dipper:1443/webGui/javascript/dynamix.js?v=1680052794:51
        t https://dipper:1443/webGui/javascript/dynamix.js?v=1680052794:51
        u https://dipper:1443/webGui/javascript/dynamix.js?v=1680052794:51
        updateContainer https://dipper:1443/plugins/dynamix.docker.manager/javascript/docker.js?v=1678007099:105
        action https://dipper:1443/plugins/dynamix.docker.manager/javascript/docker.js?v=1678007099:12
        dispatch https://dipper:1443/webGui/javascript/dynamix.js?v=1680052794:5
        handle https://dipper:1443/webGui/javascript/dynamix.js?v=1680052794:5
    Docker:1514:59

     

    Uncaught TypeError: progress_span[data[1]] is null
        <anonymous> https://dipper:1443/Docker:1518
        emit https://dipper:1443/webGui/javascript/dynamix.js?v=1680052794:15
        onmessage https://dipper:1443/webGui/javascript/dynamix.js?v=1680052794:15
        listen https://dipper:1443/webGui/javascript/dynamix.js?v=1680052794:15
        start https://dipper:1443/webGui/javascript/dynamix.js?v=1680052794:15
        openDocker https://dipper:1443/Docker:252
        updateContainer https://dipper:1443/plugins/dynamix.docker.manager/javascript/docker.js?v=1678007099:108
        i https://dipper:1443/webGui/javascript/dynamix.js?v=1680052794:51
        h https://dipper:1443/webGui/javascript/dynamix.js?v=1680052794:51
        t https://dipper:1443/webGui/javascript/dynamix.js?v=1680052794:51
        u https://dipper:1443/webGui/javascript/dynamix.js?v=1680052794:51
        updateContainer https://dipper:1443/plugins/dynamix.docker.manager/javascript/docker.js?v=1678007099:105
        action https://dipper:1443/plugins/dynamix.docker.manager/javascript/docker.js?v=1678007099:12
        dispatch https://dipper:1443/webGui/javascript/dynamix.js?v=1680052794:5
        handle https://dipper:1443/webGui/javascript/dynamix.js?v=1680052794:5
    Docker:1518:5

     

    Could this be related?

  10. Checking devtools, I see the webUI tries to establish a websocket connection. A GET request is sent to wss://dipper:1443/sub/docker (the 1443 port is because I have the webUI running on port 1443), and the response is 101 (Switching Protocols), which should mean the upgrade succeeded:

     

    [Screenshots: devtools captures, 2023-09-02 9:53 AM and 9:54 AM]

     

    The response tab shows data flowing in properly, which suggests the connection has succeeded. I have no idea why it won't update the window, then.

     

    [Screenshot: devtools response tab, 2023-09-02 9:55 AM]

     

    Any ideas on what is failing here? How does the WebUI communicate status updates from the Docker service when updating a container?
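
    For what it's worth, the endpoint can be poked outside the browser with plain curl by sending the upgrade headers by hand (rough sketch; -k is only because my cert is self-signed, and anything after the 101 comes back as raw websocket frames, so expect some framing bytes in the output):

    curl --http1.1 -ikN \
      -H "Connection: Upgrade" \
      -H "Upgrade: websocket" \
      -H "Sec-WebSocket-Version: 13" \
      -H "Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==" \
      https://dipper:1443/sub/docker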

     

  11. Users should be able to access the server over the built-in SSH/SFTP server.

     

    Currently, the `/etc/passwd` file is set so that shell access is disabled for regular users, and no home directory is assigned. Only the root user can use the built-in SSH/SFTP server.

     

    An option on the user settings page that enables shell access would let users modify and create files over SSH/SFTP, which is much preferable to using the root account and then manually checking that the correct permissions are applied. Plus, SFTP is much more reliable over a VPN than SMB.
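
    To illustrate what I mean (a made-up example; the username and home path are placeholders, and hand-editing /etc/passwd doesn't persist across reboots anyway):

    # current entry, roughly:
    #   someuser:x:1001:100::/:/bin/false
    # what an "allow shell access" toggle could set instead:
    someuser:x:1001:100::/mnt/user/homes/someuser:/bin/bash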

  12. Your phone doesn't support mDNS resolution. That's how devices on the local network "find" each other via hostname.

     

    Try connecting directly to your server using the IP assigned by the router. That should work on your phone.
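
    For example (the hostname and IP here are placeholders for whatever your setup uses):

    ping tower.local      # mDNS name: only works on devices that support mDNS
    ping 192.168.1.50     # router-assigned IP: works on any device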

     

    As an aside, I remember Google rolling out mDNS resolution to Android phones running 9.0 or above via a Google Play Services update, but I'm not sure whether they ever finished the rollout or pulled the update. If your phone is on a version below 9.0, it's best to update or move to a newer model.

  13. For some reason, attempting to log in via SSH/SFTP as a "regular user" (not root) results in the password prompt repeating, over and over again.

     

    I don't want to use the root account, as modifying and creating files will set the wrong permissions.

     

    Is there a config option I can change? Or should I file it as a feature request?
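
    In case it helps with diagnosing this, here's what I'd look at next (just guesses on my part; the username is a placeholder):

    grep '^someuser:' /etc/passwd     # what shell and home directory the user has
    grep -iE 'AllowUsers|Match|PasswordAuthentication' /etc/ssh/sshd_config    # any sshd-side restrictions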

  14. @03fc35ss's config above doesn't work with iOS clients. I still have to downgrade the `server min protocol` to `SMB3_02`.

     

    Here is my config as of now:

     

    # Server hardening
    # SMBv3 will break VLC iOS - use prerelease version to fix!
    # SMB3_11 for server min protocol breaks some clients (iOS)
    server min protocol = SMB3_02
    client ipc min protocol = SMB3_11
    server signing = mandatory
    client NTLMv2 auth = yes
    restrict anonymous = 2
    null passwords = no
    raw NTLMv2 auth = no
    smb encrypt = required
    client signing = required
    client ipc signing = required
    client smb encrypt = required
    server smb encrypt = required

     

  15. After upgrading to 6.12.3 today I found that the same two drives had died again. A physical inspection last time didn't turn up anything, but I decided to check again. I noticed that the drives that had "died" were connected to my HBA with one of those SAS to SATA cables, and the cable on the SATA end had gotten a bit bent as I built the NAS in a mini-ITX case.

     

    I replaced the entire cable and it seems like the problem has been fixed? I'll keep the old cable around, but as long as the two drives don't drop out again during future upgrades, I think it's safe to call this a cable issue. The missing drives didn't even show up in the SAS configuration utility when the suspected faulty cable was in use.

     

    moral of the story: change cables and don't build your NAS in a mini-ITX case

  16. When I clicked on "Stop" [array] to do some maintenance on the server, I noticed that it was stuck on "Retry unmounting disk share(s)...". So I checked in an SSH session and saw that the script for stopping the array was getting hung up on `/mnt/cache`:

     

    root@dipper:~# less /var/log/syslog
    root@dipper:~# lsof /mnt/cache
    COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF    NODE NAME
    bash    18466 root  cwd    DIR   0,42       84 7700350 /mnt/cache/appdata/swag/log/nginx
    tail    19201 root  cwd    DIR   0,42       84 7700350 /mnt/cache/appdata/swag/log/nginx
    tail    19201 root    3r   REG   0,42 27694298 7700846 /mnt/cache/appdata/swag/log/nginx/access.log.1

     

    So I manually killed the processes. The `tail` process died with a plain `kill` (SIGTERM), but I had to send a SIGKILL to the `bash` process because it just wouldn't quit.
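
    Roughly what I ran, with the PIDs from the lsof output above:

    kill 19201       # tail: exited on the default SIGTERM
    kill -9 18466    # bash: only went away with SIGKILL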

     

    I've also collected and attached diagnostics immediately after the array stopped.

     

    Is this something I should report to the Linuxserver thread (if there is one specifically for SWAG)? Or is there a problem with my configuration?

    dipper-diagnostics-20230701-0804.zip

  17. I happened to check the webGUI today and noticed that the parity drive... came back on its own? The syslogs just say a device connect or power on event occurred. The drive somehow just woke back up and decided to work normally.

     

    The other data disk is still missing, but I'm hoping that, with enough power fed to it, it will eventually recover itself like the parity drive did.

     

    Still very curious as to what's going on. It doesn't seem like a loose connection at all as the UDMA CRC values are all still at zero for the parity drive that returned. ¯\_(ツ)_/¯

  18. Hi everyone, after upgrading 6.11.5 -> 6.12.0 I was greeted with this lovely sight:

     

    [screenshot attached]

     

    Thank god for dual parity (and a separate backup server). Not worrying too much about the data really takes the strain off of surprises like these.

     

    However, I suspect that the drives aren't actually "dead", since they were working fine right up until the upgrade. It's also rather strange for two drives to "die" at the same time like that. My guess is that the cable that plugs into those two drives has come loose, either on the drive end or on the LSI card's end.

     

    Unfortunately, the server is located remotely, so I can't go and check it physically right now. I was wondering if anybody could spot any clues in the logs as to why the drives aren't coming up. I did find some messages saying "SATA link down", but they didn't say why, or I might've missed it. Failing that, I'll have to check it over the next time I get a chance to inspect it in person. Any ideas are appreciated! Diagnostics attached.
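
    For anyone digging through the diagnostics, this is roughly how I was searching the live syslog on my end:

    grep -iE 'sata link|ata[0-9]+|exception' /var/log/syslog | less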

     

     

    dipper-diagnostics-20230616-0845.zip

  19. Hi everyone, I'm trying to set up some subdomains on my server so that they're only accessible from machines connected via Tailscale. I got the configuration working on SWAG's side by adding the following lines to the server blocks I want to restrict:

     

     

    # Allow only Tailscale IP blocks
    allow 100.64.0.0/10;
    deny all;

     

    Unfortunately, when I check the logs, all connection attempts from Tailscale are logged as 172.19.0.1, the internal Docker network gateway, so the allow rule never matches and everything gets blocked.

     

    I'm guessing this is because the connection goes something like my computer -> Tailscale servers -> Tailscale plugin -> Docker network -> SWAG container. Is there a way to properly pass the Tailscale internal address to the SWAG container?
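
    In case it helps, a quick way to double-check that 172.19.0.1 is just the gateway of the Docker network SWAG sits on (the network name here is a placeholder; use whatever `docker network ls` shows for your setup):

    docker network inspect proxynet | grep -i gateway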
