mgutt

Report Comments posted by mgutt

  1. 24 minutes ago, TexasUnraid said:

    as a packet capture showed a lot of failed and re-sent packets

    I found your capture results here:

    https://forums.unraid.net/bug-reports/stable-releases/slow-smb-performance-r566/page/2/?tab=comments#comment-9639

     

    I repeated this test as follows. First I generated 200 random files on the server and downloaded them to my Windows 10 client:

    share_name="Music"
    mkdir "/mnt/cache/${share_name}/randomfiles"
    for n in {1..200}; do
        dd status=none if=/dev/urandom of="/mnt/cache/${share_name}/randomfiles/$( printf %03d "$n" ).bin" bs=4k count=$(( RANDOM % 5 + 1 ))
    done
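
A quick sanity check before capturing, as a sketch: it counts the generated files and confirms that each one is between 4 KiB and 20 KiB, which is what bs=4k with a count of 1 to 5 should produce. The function name check_randomfiles and its arguments are only made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: verify the generated test files.
# $1 = directory with the .bin files, $2 = expected number of files
check_randomfiles() {
    local dir="$1" expected="$2" count=0 size f
    for f in "$dir"/*.bin; do
        [ -f "$f" ] || continue
        size=$(wc -c < "$f")
        # bs=4k with count 1..5 gives 4096..20480 bytes
        if [ "$size" -lt 4096 ] || [ "$size" -gt 20480 ]; then
            echo "unexpected size: $f ($size bytes)"
            return 1
        fi
        count=$(( count + 1 ))
    done
    echo "$count files"
    [ "$count" -eq "$expected" ]
}
```

Usage would be something like: check_randomfiles "/mnt/cache/Music/randomfiles" 200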
    

    Then I started tcpdump as follows:

    tcpdump -s 200 -nn -i eth0 -w "/mnt/cache/tcpdump_$(date +'%Y%m%d_%H%M%S').cap" host 192.168.178.21 and port 445

    And then I uploaded the random files to a different path.

     

    Then I opened your dump and mine and used this filter to get only the SMB errors:

    smb2.error.context_count == 0

    And the results are completely different:

     

    1101603675_2021-01-1303_53_40.thumb.png.64cfd2daf5e6fb3a32166b9fb19f8c1d.png

     

    Then I reviewed your dump and found that you are not using Windows itself to copy your files, as your copy process is much more complex:

    1030223990_2021-01-1303_59_10.thumb.png.d1c6fb86610383562e5468f72f1030fc.png

     

    First it asks the server whether "FileGen 26139278788.bin" exists, which returns a "STATUS_NO_SUCH_FILE" error. Then it asks the server for "~vv1.tmp", which returns a "STATUS_OBJECT_NAME_NOT_FOUND" error. Then it creates this tmp file, and finally it renames it to "FileGen 26139278788.bin".

     

    According to my research, "~vv1.tmp" files are created by ViceVersa. Is this correct? What do I need to set in this app to emulate your situation?
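
The copy pattern seen in the dump (existence check, temp file, rename) can be roughly imitated with a small script, as a sketch. Only the temp name "~vv1.tmp" comes from the capture. The function name and arguments are made up, and whether a given SMB client puts exactly the same requests on the wire is an assumption that would have to be checked with another capture.

```shell
#!/bin/bash
# Sketch: copy a file via "exists check -> temp file -> rename".
# "~vv1.tmp" is the temp name seen in the dump; everything else is hypothetical.
copy_via_tmp() {
    local src="$1" dest_dir="$2"
    local name tmp
    name=$(basename -- "$src")
    tmp="$dest_dir/~vv1.tmp"
    # 1. Does the target already exist?
    if [ -e "$dest_dir/$name" ]; then
        echo "skip: $name exists"
        return 0
    fi
    # 2. Write to the temp file, 3. rename it to the final name
    cp -- "$src" "$tmp" && mv -- "$tmp" "$dest_dir/$name"
}
```

To generate traffic, the destination would be a mounted SMB share, called once per file in a loop.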

  2. 2 hours ago, TexasUnraid said:

    Multichannel memory?

    SMB Multichannel. It spreads a transfer over several connections, and with that over multiple CPU cores on the client AND the server. It is enabled by default in Windows. If the network adapters also support RDMA, the CPU load is very low. You can see this in this video at 11:00.

     

    Compared to other operating systems, Unraid adds overhead through FUSE/SHFS, which @limetech described on the first page of this bug report. He even explained that Unraid itself bypasses SHFS for VMs by replacing /mnt/user paths with direct-disk access paths like /mnt/cache.
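
As a sketch of that path replacement: the function below looks for a /mnt/user path on /mnt/cache and on the array disks and prints the first direct path it finds. The function name and the MNT_ROOT variable are made up (MNT_ROOT only exists so it can be tried outside of Unraid), and pools with a name other than "cache" are not covered.

```shell
#!/bin/bash
# Sketch: resolve a /mnt/user path to the disk or pool that really holds it.
# MNT_ROOT is hypothetical and defaults to /mnt.
direct_path() {
    local user_path="$1" root="${MNT_ROOT:-/mnt}"
    local rel="${user_path#"$root"/user/}"
    local candidate
    # Not a user-share path: return it unchanged
    if [ "$rel" = "$user_path" ]; then
        echo "$user_path"
        return 0
    fi
    for candidate in "$root"/cache "$root"/disk[0-9]*; do
        if [ -e "$candidate/$rel" ]; then
            echo "$candidate/$rel"
            return 0
        fi
    done
    # Not found on any checked disk
    return 1
}
```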

     

    I used the same trick to boost my Plex server:

    https://forums.unraid.net/topic/88999-unraid-tweaks-for-media-server-performance/?tab=comments#comment-898167

     

    Of course limetech still needs to find a way to optimize the SMB<>SHFS situation, but you already have multiple options to bypass SHFS yourself. You can find many ways in my guide:

    https://forums.unraid.net/topic/97165-smb-performance-tuning/

     

    Regarding the bug itself: It's only a guess, but as the SMB session count explodes for small files, I would say that something like a "chunk size" between SMB and SHFS does not fit. Sadly we can't help limetech here, as the SHFS mount command / flags are part of the Unraid source code. Another guess of mine is that the Samba process and the SHFS process(es) often run on the same CPU core, so Samba is not able to use one core exclusively.
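
The core guess could be checked roughly like this, as a sketch: the function reads the "processor" field (field 39) from /proc/PID/stat, which is the CPU the process last ran on. The function name is made up, and it needs a Linux procfs.

```shell
#!/bin/bash
# Sketch: print the CPU core a process last ran on (Linux procfs).
last_cpu() {
    local pid="$1" stat rest
    stat=$(cat "/proc/$pid/stat") || return 1
    # Drop "pid (comm) " first, as comm may contain spaces
    rest=${stat##*) }
    # "processor" is field 39 of the stat line; rest starts at field 3
    set -- $rest
    echo "${37}"
}
```

Assuming the processes are named smbd and shfs, a loop like `for p in $(pgrep -x smbd) $(pgrep -x shfs); do echo "$p $(last_cpu "$p")"; done`, repeated during a transfer, would show whether they keep landing on the same core.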

  3. It seems that German providers are too stingy to buy enough US traffic:

    https://telekomhilft.telekom.de/t5/Telefonie-Internet/Routing-zu-BitBucket-org-fehlerhaft-langsam/m-p/4246902#M1164467

     

    So the only solution is to host in Europe or to use Cloudfront (which is much more expensive). I think the best option would be OVH, as they offer unlimited traffic even in their smallest webhosting packages. And the optional CDN service is cheap: it costs only ~2 € per month. But it's limited to a maximum file size of 20 MB, so the download file would have to be split into multiple chunks to benefit from it.
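
Splitting and rejoining could look like this, as a sketch with the standard split and cat tools. The function names and the ".part" suffix are made up. The chunk size defaults to 20M (20 MiB for split), so a stricter limit would need a smaller value.

```shell
#!/bin/bash
# Sketch: split a release zip into chunks and rejoin them after download.
# Function names and the ".part" suffix are hypothetical.
split_for_cdn() {
    local file="$1" size="${2:-20M}"
    # -d: numeric suffixes, so the parts sort in order (.part00, .part01, ...)
    split -b "$size" -d "$file" "$file.part"
}

join_from_cdn() {
    local file="$1"
    # The shell expands the glob in sorted order, which matches the suffixes
    cat "$file".part* > "$file"
}
```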

     

    And my offer was serious, too. Feel free to test the speed:

    https://unraid.gutt.it/stable/unRAIDServer-6.8.3-x86_64.zip

     

    Other versions:

    https://unraid.gutt.it/

  4. I checked the network traffic with "Microsoft Network Monitor": the USB Creator downloads from s3.amazonaws.com with the IP 52.216.26.222. This seems to be a US AWS data center. As it's an encrypted HTTPS link, I'm not able to see more.

     

    EDIT: Ok, I checked the hex code of the USB creator (hacker style ^^) and found some download links:

    https://s3.amazonaws.com/dnld.lime-technology.com/creator_branches.json

    https://s3.amazonaws.com/dnld.lime-technology.com/stable/releases.json

    https://s3.amazonaws.com/dnld.lime-technology.com/stable/unRAIDServer-6.8.3-x86_64.zip

     

    I downloaded the last URL with my browser and it's as slow as with the USB Creator.

     

    I tested the same URL on my remote Unraid server, which is connected through a Vodafone cable connection (> 500 Mbit/s, used by 30% of all Germans), and it's even slower:

    1765275304_2020-11-2816_55_19.png.998b201585824290474ae416311c16a1.png

     

    Conclusion:
    This has nothing to do with my internet connection. It's related to AWS, and it seems that there is no CDN (Cloudfront) active.
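
To repeat such a measurement from any machine, a sketch with curl: it discards the download and prints the average speed in Mbit/s. The function names are made up. curl reports speed_download in bytes per second.

```shell
#!/bin/bash
# Sketch: measure the download speed of a URL without the USB Creator.
bytes_to_mbit() {
    # curl reports bytes per second; convert to Mbit/s
    awk -v bps="$1" 'BEGIN { printf "%.2f\n", bps * 8 / 1000000 }'
}

download_speed() {
    local url="$1" bps
    # -o /dev/null discards the body, -w prints the average speed
    bps=$(curl -s -o /dev/null -w '%{speed_download}' "$url") || return 1
    bytes_to_mbit "$bps"
}
```

Called with one of the zip URLs above, download_speed prints a single number that can be compared between different connections.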

  5. 2 hours ago, itimpi said:

    but the vast majority of people do not see this and get very fast speeds.

    How do you know that?

     

    My internet connection is from Deutsche Telekom (40% of all Germans use this provider). It costs more than other providers, but it rarely suffers from bandwidth problems (that's why I chose them).

     

    Is it possible that Unraid does not use the CDN services of AWS? The EU location is fast, while some international locations are slow (my download speed is ~70 Mbit/s), but there is no location with only 0.8 Mbit/s (the download speed of the USB Creator):

    https://cloudharmony.com/speedtest-for-aws

    520876409_2020-11-2816_00_05.png.cda95c7332cc1e49a2924503c8ca0790.png

     

    I stopped this speedtest and ran the USB Creator again, and it was extremely slow again:

    2132268114_2020-11-2816_01_12.png.4d707f66ee68a650e488b3eb262184a6.png

     

    What is the direct link to the Unraid image? I would like to test the download speed outside of the USB Creator. That way I could test it with different internet connections as well.

     

    Is this forum hosted on AWS, too? This would explain why it is sometimes so extremely slow.

  6. I have the same memory problem, but not because of the usual syslog errors. It's only because I enabled the mover logs and use XFS defragmentation, which adds a massive amount of debug lines to the log.

     

    And this bug is not related to the size of /var/log. It's related to the PHP memory limit. The limit itself is absolutely fine, but in /usr/local/emhttp/plugins/dynamix/include/Syslog.php this line is the problem:

    foreach (file($log) as $line) {

    This is a RAM killer, as file() reads the complete log file into RAM before any further command is executed. It can easily be solved by replacing it with:

      $fh = fopen($log, "r");
      while (($line = fgets($fh)) !== false) {

    Or even better (which limits the output to 1000 lines):

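      // Only output the last 1000 lines
      $i = 0;
      $line_count = intval(exec("wc -l " . escapeshellarg($log)));
      $fh = fopen($log, "r");
      while (($line = fgets($fh)) !== false) {
        $i++;
        // skip everything before the last 1000 lines
        if ($i <= $line_count - 1000) {
          continue;
        }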

    I tested this and now the RAM usage of my browser dropped by 1.6GB while viewing the syslog page and this is the first time I was able to open the syslogs through my smartphone which took ages before.
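
For comparison, the same "last 1000 lines only" idea on the command line, as a sketch: tail returns the end of the log without holding the whole file in memory. The function name is made up.

```shell
#!/bin/bash
# Sketch: show only the last N lines of a log (default 1000).
last_lines() {
    local log="$1" n="${2:-1000}"
    tail -n "$n" -- "$log"
}
```

For example: last_lines /var/log/syslog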

     

    The fixed file:

    Syslog.zip

     

    EDIT: Ok, I found the file in the repository:

    https://github.com/limetech/webgui/blob/master/plugins/dynamix/include/Syslog.php

     

    Will try to fix it ^^

     

    EDIT2: Yeah, my very first github pull request 😅

    https://github.com/limetech/webgui/pull/770

     

  7. I bought two new USB3 adapters from different brands through Amazon. Both use the same "ASM1051E" controller from ASMedia, but the second does not display the correct "iSerial":

    Bus 002 Device 003: ID 174c:55aa ASMedia Technology Inc. ASM1051E SATA 6Gb/s bridge, ASM1053E SATA 6Gb/s bridge, ASM1153 SATA 3Gb/s bridge
    Device Descriptor:
      bLength                18
      bDescriptorType         1
      bcdUSB               3.00
      bDeviceClass            0 
      bDeviceSubClass         0 
      bDeviceProtocol         0 
      bMaxPacketSize0         9
      idVendor           0x174c ASMedia Technology Inc.
      idProduct          0x55aa ASM1051E SATA 6Gb/s bridge, ASM1053E SATA 6Gb/s bridge, ASM1153 SATA 3Gb/s bridge
      bcdDevice            1.00
      iManufacturer           2 ASMedia
      iProduct                3 ASM105x
      iSerial                 1 8CHUDHEE            
    ...
    Bus 002 Device 002: ID 174c:55aa ASMedia Technology Inc. ASM1051E SATA 6Gb/s bridge, ASM1053E SATA 6Gb/s bridge, ASM1153 SATA 3Gb/s bridge
    Device Descriptor:
      bLength                18
      bDescriptorType         1
      bcdUSB               3.00
      bDeviceClass            0 
      bDeviceSubClass         0 
      bDeviceProtocol         0 
      bMaxPacketSize0         9
      idVendor           0x174c ASMedia Technology Inc.
      idProduct          0x55aa ASM1051E SATA 6Gb/s bridge, ASM1053E SATA 6Gb/s bridge, ASM1153 SATA 3Gb/s bridge
      bcdDevice            1.00
      iManufacturer           2 CnMemory
      iProduct                3 USB3-SATA-Device
      iSerial                 1 123456789012

    But this does not seem to matter, as both forward the original disk ID:

    524558955_2020-10-1918_11_52.png.83072598160e8fcf1543531a0bc5763c.png

     

    But the most important part: Both support forwarding the standby mode which solves the issue:

     

    1226266321_2020-10-1918_04_50.png.ff61ca9c43dfaecfc0eaddd575515710.png

     

    sdb=disk3=bus002device003

    sda=disk6=bus002device002

     

    Performance is good as well:

    143150904_2020-10-1921_15_11.png.b20f322d8a4ebcd7874874d4a5f48d15.png

     

    As I was not able to find out how to check the sleep state if the USB adapter does not support the command, I'm closing this bug report as it's not solvable. The user needs to use a different adapter.

  8. Dirty hack

     

    Add this to the Go file:

    # -------------------------------------------------
    # USB HDD sleep bug fix
    # https://forums.unraid.net/bug-reports/stable-releases/683-usb-hdds-randomly-spin-up-but-status-stays-unchanged-r1091/?tab=comments#comment-11087
    # -------------------------------------------------
    sed -i -- 's/by %s -A/by %s -i/g' /usr/local/sbin/emhttpd

    Place it before this block (which loads emhttpd):

     

    # -------------------------------------------------
    # Start the Management Utility
    # -------------------------------------------------
    /usr/local/sbin/emhttp &

     

    Note:

    • This will suppress displaying the temps!
    • And sadly it does not solve the icon issue. I think the active / inactive status is extracted from the "CHECK POWER MODE" status.
    • In very rare cases the USB disks still wake up. I need to investigate this further, but I think the USB adapter itself is the problem
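
A slightly more careful variant of the hack, as a sketch: it only patches if the expected string is really present (it could change with an Unraid update) and keeps a backup. The function name and the ".orig" suffix are made up. The sed expression is the same as above.

```shell
#!/bin/bash
# Sketch: apply the emhttpd patch only if the expected string is present.
# Keeps a backup; the ".orig" suffix is hypothetical.
patch_emhttpd() {
    local bin="${1:-/usr/local/sbin/emhttpd}"
    # -a: treat the binary as text while searching
    if ! grep -q -a -- 'by %s -A' "$bin"; then
        echo "pattern not found, nothing patched"
        return 1
    fi
    cp -p -- "$bin" "$bin.orig"
    sed -i -- 's/by %s -A/by %s -i/g' "$bin"
    echo "patched $bin"
}
```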
  9. Ok, caught it:

    top - 22:56:25 up 22:11,  2 users,  load average: 0.01, 0.02, 0.00
    Tasks: 257 total,   1 running, 250 sleeping,   0 stopped,   6 zombie
    %Cpu(s):  0.6 us,  0.1 sy,  0.0 ni, 89.4 id,  9.7 wa,  0.0 hi,  0.0 si,  0.0 st
    MiB Mem :  64358.4 total,  20504.6 free,   9912.9 used,  33940.9 buff/cache
    MiB Swap:      0.0 total,      0.0 free,      0.0 used.  53088.3 avail Mem 
    
      PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
     4683 root      20   0  283604   4132   3416 S   0.2   0.0   3:24.90 /usr/local/sbin/emhttpd
     5256 root      20   0  149776   8300   3752 S   0.2   0.0   0:18.49 nginx: worker process
    17144 root      20   0       0      0      0 I   0.2   0.0   0:01.73 [kworker/3:3-events_freezable]
    25275 root      20   0    7464   4676   3620 D   0.2   0.0   0:00.01 /usr/sbin/smartctl -n standby -A /dev/sdb
    25276 root      20   0       0      0      0 Z   0.2   0.0   0:00.01 [smartctl] <defunct>
    25278 root      20   0    7464   4580   3524 D   0.2   0.0   0:00.01 /usr/sbin/smartctl -n standby -A /dev/sda
    25279 root      20   0       0      0      0 Z   0.2   0.0   0:00.01 [smartctl] <defunct>
        1 root      20   0    2468   1684   1576 S   0.0   0.0   0:10.52 init
    

     

    Let's verify it.

     

    Spin down USB... takes long time, so it's spinning down

    mdcmd spindown 6

    Check status of SATA... no delay

    /usr/sbin/smartctl -n standby -A /dev/sdg
    smartctl 7.1 2019-12-30 r5022 [x86_64-linux-4.19.107-Unraid] (local build)
    Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
    
    Device is in STANDBY mode, exit(2)

    Check status of USB... takes very long time, so it's spinning up

    /usr/sbin/smartctl -n standby -A /dev/sda
    smartctl 7.1 2019-12-30 r5022 [x86_64-linux-4.19.107-Unraid] (local build)
    Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
    
    CHECK POWER MODE: incomplete response, ATA output registers missing
    CHECK POWER MODE not implemented, ignoring -n option
    === START OF READ SMART DATA SECTION ===
    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
      2 Throughput_Performance  0x0005   132   132   054    Pre-fail  Offline      -       96
      3 Spin_Up_Time            0x0007   163   163   024    Pre-fail  Always       -       416 (Average 389)
      4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       401
      5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
      7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
      8 Seek_Time_Performance   0x0005   128   128   020    Pre-fail  Offline      -       18
      9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       4336
     10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
     12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       59
     22 Helium_Level            0x0023   100   100   025    Pre-fail  Always       -       100
    192 Power-Off_Retract_Count 0x0032   099   099   000    Old_age   Always       -       1683
    193 Load_Cycle_Count        0x0012   099   099   000    Old_age   Always       -       1683
    194 Temperature_Celsius     0x0002   187   187   000    Old_age   Always       -       32 (Min/Max 25/55)
    196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
    197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
    198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
    199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

     

    Bug Fix

    "smartctl" ignores "-n standby" as the USB device does not support power checks:

    CHECK POWER MODE: incomplete response, ATA output registers missing
    CHECK POWER MODE not implemented, ignoring -n option

    The same message is displayed if the "-i" option is used, but this does not spin up the drive. So I suggest running this command before executing "-A" (which obtains the temps/errors etc):

    /usr/sbin/smartctl --nocheck standby -i /dev/sdb

    Sadly this won't solve the active / inactive icon, temps, etc as Unraid seems to rely on the correct answer of the "-A" option.
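
The "-i first" check could be wrapped like this, as a sketch: the function reads the output of the "-i" call from stdin and classifies it by the two messages shown in the outputs above. The state labels (standby / unknown / active) and the function name are made up.

```shell
#!/bin/bash
# Sketch: classify the output of "smartctl -n standby -i" read from stdin.
# The state names are hypothetical labels, not smartctl output.
power_state() {
    local out
    out=$(cat)
    case "$out" in
        *"Device is in STANDBY mode"*)        echo standby ;;
        *"CHECK POWER MODE not implemented"*) echo unknown ;;
        *)                                    echo active ;;
    esac
}
```

A caller would run something like `state=$(/usr/sbin/smartctl -n standby -i /dev/sdb | power_state)` and only execute "-A" if the state is "active". For "unknown" (the USB case) "-A" would be skipped so the disk is not woken up.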

     

    I was not able to find out if the disk is active or not. I tried to obtain the usb power:

    lsusb -v|egrep "^Bus|MaxPower"

    But it always returns "896mA", sleeping or not. Even fully disabling the USB port is not possible anymore, as "suspend" has been removed from recent kernels:

    https://unix.stackexchange.com/a/166601/101920

     

     

  10. I'm currently trying to observe this. These are the things I've found out so far:

     

    Status

     

    USB = sdb = disk3

    SATA = sdg = disk7

     

    USB

    smartctl --nocheck standby -i /dev/sdb
    
    smartctl 7.1 2019-12-30 r5022 [x86_64-linux-4.19.107-Unraid] (local build)
    Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
    
    CHECK POWER MODE: incomplete response, ATA output registers missing
    CHECK POWER MODE not implemented, ignoring -n option
    === START OF INFORMATION SECTION ===
    Model Family:     HGST Ultrastar DC HC520 (He12)
    Device Model:     HGST HUH721212ALE604
    Serial Number:    8CHUDHEE
    LU WWN Device Id: 5 000cca 26fd9a3b2
    Firmware Version: LEGNW3D0
    User Capacity:    12,000,138,625,024 bytes [12.0 TB]
    Sector Sizes:     512 bytes logical, 4096 bytes physical
    Rotation Rate:    7200 rpm
    Form Factor:      3.5 inches
    Device is:        In smartctl database [for details use: -P show]
    ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4
    SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
    Local Time is:    Fri Oct 16 21:32:04 2020 CEST
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled

    SATA

    smartctl --nocheck standby -i /dev/sdg
    smartctl 7.1 2019-12-30 r5022 [x86_64-linux-4.19.107-Unraid] (local build)
    Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
    
    Device is in STANDBY mode, exit(2)

     

    USB

    hdparm -C /dev/sdb
    
    /dev/sdb:
    SG_IO: bad/missing sense data, sb[]:  70 00 01 00 00 00 00 0a 00 00 00 00 00 1d 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
     drive state is:  unknown

    SATA

    hdparm -C /dev/sdg
    
    /dev/sdg:
     drive state is:  standby

    Neither command wakes up the disks.

    Monitoring

    - If the USB disks randomly spin up, both of them spin up. It's never only one of them.

    - inotifywait does not return any file access

     

    Does Unraid periodically execute a command which checks the HDD status or similar?

     

    Next step is this command to log all processes:

    top -b -c -d 5 > /mnt/cache/top.log

    Sadly "lastcomm" is not available because of "accton: Function not implemented", which should mean that process accounting is not enabled in the Linux kernel of Unraid.
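
Once the log has collected some data, it can be filtered like this, as a sketch: the awk script remembers the timestamp of each top snapshot and prints it in front of every line that contains a smartctl call. The function name is made up.

```shell
#!/bin/bash
# Sketch: list every smartctl call in the top log with its snapshot time.
smartctl_calls() {
    awk '
        /^top - /     { ts = $3 }        # "top - 22:56:25 up ..." -> 22:56:25
        /\/smartctl / { print ts, $0 }   # real calls only, not "[smartctl] <defunct>"
    ' "$1"
}
```

For example: smartctl_calls /mnt/cache/top.log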

  11. @sonic6

    You have many requests because of a PostgreSQL database. If you need it, there is nothing you can optimize. But I would say these writes happen too often:

    MODIFY /var/lib/docker/containers/ced93fa7c199b90a5414e103be9871a25cae5b8e3fdaab059b1517e34ad0150f/ ced93fa7c199b90a5414e103be9871a25cae5b8e3fdaab059b1517e34ad0150f-json.log

    Check your Docker dashboard (advanced view) for a container ID starting with "ced93fa7". Which container is it? Does it have a logging setting which you could disable?
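
The same lookup also works from the terminal, as a sketch: the function reads "ID NAME" lines from stdin and prints the name of the container whose ID starts with the given prefix. The function name is made up. The docker ps flags in the usage line are the regular ones.

```shell
#!/bin/bash
# Sketch: find the container whose full ID starts with a given prefix.
# Reads "ID NAME" lines from stdin.
container_by_prefix() {
    local prefix="$1" id name
    while read -r id name; do
        case "$id" in
            "$prefix"*) echo "$name" ;;
        esac
    done
}
```

Usage: docker ps -a --no-trunc --format '{{.ID}} {{.Names}}' | container_by_prefix ced93fa7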