Posts posted by bastl

  1. @duffbeer Trust me, the "EPYC tweak" is a custom edit made by the user and not a setting applied by Unraid. Usually only the following is needed, as reported in a couple of tutorials here on the forums. In Unraid itself you can only set it to emulate a QEMU64 CPU, not an EPYC or Skylake or whatever.

      <cpu mode='custom' match='exact' check='full'>
        <model fallback='forbid'>EPYC</model>
        <topology sockets='1' cores='7' threads='2'/>
        <cache level='3' mode='emulate'/>
        <feature policy='require' name='topoext'/>
        <feature policy='disable' name='monitor'/>
        <feature policy='require' name='hypervisor'/>
        <feature policy='disable' name='svm'/>
        <feature policy='disable' name='x2apic'/>
      </cpu>

    Forcing the VM to use specific CPU features can end up emulating features which aren't present in the physical CPU. In general these tweaks are needed for first- and second-gen Ryzen chips and Threadrippers. Maybe try removing them and test without. The only block Unraid sets by default is the following.

      <cpu mode='host-passthrough' check='none'>
        <topology sockets='1' cores='3' threads='2'/>
        <cache mode='passthrough'/>
        <feature policy='require' name='topoext'/>
      </cpu>

    Try it with the snippet above and adjust the core/thread counts so it matches your needs.
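
    If you want to generate that block instead of editing it by hand, here is a minimal sketch. The helper names `build_cpu_element` and `vcpu_count` are made up for illustration; the only thing it relies on is that sockets × cores × threads should equal the number of vCPUs assigned to the VM.

```python
import xml.etree.ElementTree as ET


def vcpu_count(sockets: int, cores: int, threads: int) -> int:
    """Number of vCPUs the topology describes (should match the VM's vCPU count)."""
    return sockets * cores * threads


def build_cpu_element(cores: int, threads: int = 2, sockets: int = 1) -> str:
    """Return a host-passthrough <cpu> block like the default snippet above.

    Hypothetical helper: it only rebuilds the XML shown in the post with
    different core/thread counts, nothing more.
    """
    if min(sockets, cores, threads) < 1:
        raise ValueError("sockets, cores and threads must all be >= 1")
    cpu = ET.Element("cpu", {"mode": "host-passthrough", "check": "none"})
    ET.SubElement(cpu, "topology", {
        "sockets": str(sockets),
        "cores": str(cores),
        "threads": str(threads),
    })
    ET.SubElement(cpu, "cache", {"mode": "passthrough"})
    ET.SubElement(cpu, "feature", {"policy": "require", "name": "topoext"})
    ET.indent(cpu, space="  ")  # pretty-print, needs Python 3.9+
    return ET.tostring(cpu, encoding="unicode")


if __name__ == "__main__":
    # 4 cores with 2 threads each describes 8 vCPUs
    print(vcpu_count(1, 4, 2))
    print(build_cpu_element(cores=4, threads=2))
```

    Paste the printed block into the VM's XML view in place of the existing `<cpu>` section.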

  2. You made some custom edits to the CPU section of the XML. Not sure if you found them on the Unraid forum or somewhere else. Usually this is done to get better performance and compatibility with Ryzen CPUs. I have never used such a long list of defined features myself, only a couple of them. As long as it works for you and the performance is OK, I see no issue here.

     

      <cpu mode='custom' match='exact' check='full'>
        <model fallback='forbid'>EPYC-IBPB</model>
        <vendor>AMD</vendor>
        <feature policy='require' name='x2apic'/>
        <feature policy='require' name='tsc-deadline'/>
        <feature policy='require' name='hypervisor'/>
        <feature policy='require' name='tsc_adjust'/>
        <feature policy='require' name='clwb'/>
        <feature policy='require' name='umip'/>
        <feature policy='require' name='stibp'/>
        <feature policy='require' name='arch-capabilities'/>
        <feature policy='require' name='ssbd'/>
        <feature policy='require' name='xsaves'/>
        <feature policy='require' name='cmp_legacy'/>
        <feature policy='require' name='perfctr_core'/>
        <feature policy='require' name='clzero'/>
        <feature policy='require' name='wbnoinvd'/>
        <feature policy='require' name='amd-ssbd'/>
        <feature policy='require' name='virt-ssbd'/>
        <feature policy='require' name='rdctl-no'/>
        <feature policy='require' name='skip-l1dfl-vmentry'/>
        <feature policy='require' name='mds-no'/>
        <feature policy='require' name='pschange-mc-no'/>
        <feature policy='disable' name='monitor'/>
        <feature policy='require' name='topoext'/>
        <feature policy='disable' name='svm'/>
      </cpu>

     

  3. @ernestp I had the same issue, caused by Kaspersky Internet Security installed inside the VM. By default KIS checks whether any virtualisation features are available and tries to use them. Any time I tried to change some settings in KIS, or even uninstall it, the VM crashed. I couldn't even disable this function. Changing the CPU mode to "host-model" or "custom" allowed me to disable this feature in Kaspersky. I guess in your case something like Hyper-V is enabled, sees some CPU features from the host system and tries to use them. Try disabling Hyper-V if you don't need it, or use the "custom" or "host-model" CPU mode.

  4. @luca2 You can use a modified 7-Zip version, for example. Either install that modified version or, like I did, only add the zstd codec to the mainline 7-Zip version you already have installed.

     

    https://github.com/mcmilk/7-Zip-zstd#zstandard-codec-plugin-for-mainline-7-zip

     

    You should also be able to decompress the file using tar:

     

    https://stackoverflow.com/questions/45355277/how-can-i-decompress-an-archive-file-having-tar-zst
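
    On a machine with the `zstd` command line tool installed, this can be scripted as in the rough sketch below. The function names and the file name `vdisk1.img.zst` are placeholders, not taken from the thread. A plain `.zst` file (like a compressed vdisk) goes through `zstd -d`, a `.tar.zst` archive goes through tar as in the link above.

```python
import subprocess
from pathlib import Path


def decompress_command(archive: str) -> list[str]:
    """Build the command line to unpack a .zst file or .tar.zst archive.

    Assumes the zstd (and, for tarballs, GNU tar) binaries are on PATH.
    """
    path = Path(archive)
    if path.name.endswith(".tar.zst"):
        # tar hands decompression to unzstd, then extracts the archive
        return ["tar", "--use-compress-program=unzstd", "-xvf", str(path)]
    if path.suffix == ".zst":
        # single compressed file: write the output next to it, without .zst
        return ["zstd", "-d", str(path), "-o", str(path.with_suffix(""))]
    raise ValueError(f"not a .zst file: {archive}")


def decompress(archive: str) -> None:
    """Run the command and raise if zstd/tar reports an error."""
    subprocess.run(decompress_command(archive), check=True)


if __name__ == "__main__":
    # placeholder file name, replace with your own backup file
    print(decompress_command("vdisk1.img.zst"))
```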

  5. 55 minutes ago, luca2 said:

    What is actually the .ZST file

    ZST files are compressed vdisk files, created if you're using the "Zstandard compression" option. Usually the vdisk file is copied first and compressed into a ZST file, and the vdisk copy is removed after compression. In your case there is an older backup in the same folder which isn't compressed. You have to clean up your backup destination first or you will end up with 2 big files, wasting space. Check the help!

     

    [Attached screenshot: grafik.png]

  6. @christopher2007 Have you already tried giving your server a fixed IP address (outside of the DHCP range, of course)? Right now it is set to use DHCP, and maybe a device on your network already uses that same IP or has it set as static. That could be a reason why the connection gets dropped. Just an idea. Even though I don't think DHCP is the issue here, it's worth a try. If you have any IP address conflicts you should see them in your router/DHCP server logs.

     

    Here is another thread where "carrier lost" is logged very often, but in a bonded NIC scenario with DHCP. A static IP was the solution in that case.

     

    I found a couple of other reports on the web. One example is a Raspberry Pi dropping its LAN connection with similar error logs, caused by an underpowered power supply. Can you check how many watts your server is pulling from the wall and check your PSU? I guess the PSU is strong enough, but maybe check whether all the cabling is OK. Some boards have an extra 4 or 6 pin connector which is needed if you have lots of PCIe slots and all are populated with devices like GPUs, which can pull so much power that the standard 4/8 pin 12V power delivery becomes unstable.

     

    Maybe try the card in another PCIe slot in case the slot is shared with other devices. Maybe the slot the card currently sits in has issues. Who knows.

     

    Have you ever had any instabilities with your server? Overclocked? Memory XMP profile? Cooling OK?

     

    It could also be a setting on the switch/router that limits the number of MAC addresses allowed on a single port and causes the dropout:

    https://stackoverflow.com/questions/17564620/what-will-make-carrier-changed-or-lost-on-linux

     

  7. @christopher2007 Also try setting the MTU size (Jumbo Frames) to 9000 on all devices (Unraid, client, switch) and see if that helps. On Unraid you can find it on the settings page for the specific interface, and on Windows you should be able to find the setting in the Device Manager entry for the card. There is either a "Jumbo Frames" entry you have to enable or an MTU size you can define.

  8. A quick search on the forum for the Asus card shows that it should work since Unraid version 6.7.

     

     

    Not sure if you have to increase the MTU size on the server and client side from, I think, the default of 1500 to something around 9000 to get better performance, but without high-speed storage like an SSD you won't see much of an increase compared to a 1 Gbit NIC. Remember that writing directly to the array only saturates one disk, and the drive is the limiting factor. On my last Unraid build I had an Aquantia 10G NIC on my board, but never used it because I never had a second client providing that speed, nor did I have a 10 Gbit switch.

     

    20 minutes ago, christopher2007 said:

    Is there a way to install a newer version?

    Usually the drivers come with the kernel itself or are added by Limetech in a newer Unraid build. Since people have reported Aquantia NICs working, I guess the current 6.8.3 should already include the driver, and according to your device list the atlantic driver is loaded:

    01:00.0 Ethernet controller [0200]: Aquantia Corp. AQC107 NBase-T/IEEE 802.3bz Ethernet Controller [AQtion] [1d6a:07b1] (rev 02)
    	Subsystem: ASUSTeK Computer Inc. Device [1043:8741]
    	Kernel driver in use: atlantic
    	Kernel modules: atlantic

    Some parts of your logs for the NIC:

    Jul 18 13:09:54 TS-Alt kernel: atlantic: link change old 1000 new 0
    Jul 18 13:09:54 TS-Alt kernel: br0: port 1(eth0) entered disabled state
    Jul 18 13:09:55 TS-Alt dhcpcd[1751]: br0: carrier lost
    Jul 18 13:09:55 TS-Alt dhcpcd[1751]: br0: deleting route to 192.168.0.0/24
    Jul 18 13:09:55 TS-Alt dhcpcd[1751]: br0: deleting default route via 192.168.0.1

    Not sure whether this is related to you changing some network configs earlier, but I guess it isn't an error that is dropping the connection. When you test with the ASUS NIC later, watch your logs to see whether "carrier lost" is reported again while transferring your backup.
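
    To make watching for this easier, here is a minimal sketch that pulls the link-drop events out of syslog lines. It only matches the two message patterns visible in the excerpt above ("carrier lost" and "link change old ... new 0"); the function name and the idea of reading `/var/log/syslog` are assumptions for illustration.

```python
import re

# Matches the atlantic driver message when the link speed drops to 0
LINK_DOWN = re.compile(r"link change old \d+ new 0\b")


def find_link_drops(lines):
    """Return (timestamp, message) tuples for lines that report a lost link.

    Assumes classic syslog lines that start with a 15 character
    timestamp such as 'Jul 18 13:09:54'.
    """
    drops = []
    for raw in lines:
        line = raw.strip()
        if "carrier lost" in line or LINK_DOWN.search(line):
            drops.append((line[:15], line[16:]))
    return drops


if __name__ == "__main__":
    # Assumed log location; adjust if your syslog lives elsewhere
    with open("/var/log/syslog", errors="replace") as log:
        for timestamp, message in find_link_drops(log):
            print(timestamp, "|", message)
```

    If the timestamps line up with the moment the backup stalls, that points at the link itself rather than at Macrium or the share settings.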

    31 minutes ago, christopher2007 said:

    And @bastl creepy, but I guess the geizhals link has betrayed me, hasn't it?

    I first saw that your timezone is the same as mine and then your geizhals link. Just putting 1 and 1 together 😁

  9. 4 minutes ago, johnnie.black said:

    Those values are normal for Seagate drives, more info here.

    WTF Seagate... So all his reported seek and read error values have 8 digits and that's just fine and normal? 😂

    Why is it even possible for a drive to report errors when there are actually no errors? Is this something Limetech can maybe patch, or is it purely a "some Seagate drives" thing?

  10. 6 minutes ago, christopher2007 said:

    Better use SMB shares in Windows over IP Address or over computer name?

    Depends on how you manage your local DNS settings. If you have DNS lookup issues on your network, direct connections via IP should work without DNS. In my current test I'm using Unraid's DNS name as the path for Macrium to back up to. It's still running and 300 gigs are already written.

     

    I found a couple of things in the SMART logs for your disks that you might have to look into.

     

    For sdb, for example, there are lots of read and seek errors:

    SMART Attributes Data Structure revision number: 10
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
      1 Raw_Read_Error_Rate     POSR--   100   065   044    -    188393
      3 Spin_Up_Time            PO----   091   089   000    -    0
      4 Start_Stop_Count        -O--CK   100   100   020    -    50
      5 Reallocated_Sector_Ct   PO--CK   100   100   010    -    0
      7 Seek_Error_Rate         POSR--   077   060   045    -    48327554
      9 Power_On_Hours          -O--CK   100   100   000    -    404

    These errors are shown for all your array disks. Only one of my 3 spinners shows a single seek error, and my disks are almost 6 years old now. It could be an issue with your controller as well. Maybe @johnnie.black can also have a look at your logs if he has time; he might have some hints for you.
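
    As the follow-up post further up says, values like these are reported to be normal for Seagate drives. A commonly cited but unofficial reading is that the 48-bit raw value packs two counters: the upper 16 bits are the actual error count and the lower 32 bits are the number of operations. That layout is an assumption here, not something confirmed in this thread, and the function name is made up. A small sketch of the decoding:

```python
def split_seagate_raw(raw_value: int) -> tuple[int, int]:
    """Split a 48-bit SMART raw value into (errors, operations).

    ASSUMPTION: upper 16 bits = error count, lower 32 bits = number of
    seeks/reads. This is a community interpretation for Seagate drives,
    not an official specification.
    """
    if not 0 <= raw_value < (1 << 48):
        raise ValueError("raw value does not fit into 48 bits")
    errors = raw_value >> 32
    operations = raw_value & 0xFFFFFFFF
    return errors, operations


if __name__ == "__main__":
    # Seek_Error_Rate raw value of sdb from the table above
    errors, seeks = split_seagate_raw(48327554)
    print(f"{errors} errors in {seeks} seeks")
```

    Under that reading the Seek_Error_Rate raw value of 48327554 is below 2^32, so the error part is 0 and the whole number is just a seek counter.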

  11. From the way you describe it, it looked to me like a networking issue on either the Windows client or Unraid itself. I currently have Macrium 7.2 Free installed and am testing a backup, with default settings, of a 1TB SSD (780GB filled) to a default Unraid share that doesn't use the cache. One small difference: my client is a VM on the same Unraid build, but with its own IP. The disks I'm backing up to have 1.5TB of free space. So far everything looks OK to me: 100 gigs have already been transferred and I'm able to browse other shares without any hiccups or slowdowns. I guess either your switch, your cables or Windows itself has some issues.

     

    Any extra virus scanners installed?

    Do you have another Windows client where you can test how stable the connection is during the backup?

    Do you have any IDS on your network analysing your LAN traffic?

  12. @christopher2007 From your screenshots everything looks OK to me: enough space to store 500GB, the default minimum free space of 0KB which, as you said, you tested, and the cache not in use. Did you look into your Unraid logs around the time the data transfer stopped? Maybe have a look at the SMART reports for the disks as well, to see if there are any errors logged. Next time the backup "freezes", pull your diagnostics from Unraid before restarting the server and post them. Also have a look at your disk temps while the backup job is running.

  13. @christopher2007 Just a small question: does Macrium have an option to split the created backup file? I ask because all the backup software I know has this option, and for most of them splitting is the default setting. The reason for this option is that in most cases users upload their backups to cloud storage or send them over the network to a NAS or some sort of server. Network traffic can be interrupted, and packets get dropped and have to be resent. Smaller files have a huge advantage in this case. Let's say your software splits the backup into 50MB chunks locally and starts to upload the first file. After it has finished, it checks whether the hash of the remote file matches the local one. If it doesn't, the software only re-uploads that 50MB file. This is way quicker than creating a huge 500GB file, waiting for the upload and then having to upload that huge file all over again.
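
    To illustrate the idea (this is not how Macrium or any particular backup tool implements it), here is a minimal sketch of splitting a file into chunks and working out which chunks would need to be sent again because the hash on the destination doesn't match. All function names and the chunk naming scheme are made up for the example, and the "remote" side is simply another directory, such as a mounted share.

```python
import hashlib
from pathlib import Path

CHUNK_SIZE = 50 * 1024 * 1024  # 50MB, as in the example above


def split_file(source: Path, dest_dir: Path, chunk_size: int = CHUNK_SIZE) -> list[Path]:
    """Write source into numbered chunk files inside dest_dir."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    chunks = []
    with source.open("rb") as src:
        index = 0
        while True:
            data = src.read(chunk_size)
            if not data:
                break
            chunk = dest_dir / f"{source.name}.{index:05d}"
            chunk.write_bytes(data)
            chunks.append(chunk)
            index += 1
    return chunks


def sha256_of(path: Path) -> str:
    """Hash a file in 1MB blocks so large chunks don't have to fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(block)
    return digest.hexdigest()


def chunks_to_resend(local_chunks: list[Path], remote_dir: Path) -> list[Path]:
    """Return the local chunks that are missing or differ on the remote side."""
    resend = []
    for chunk in local_chunks:
        remote = remote_dir / chunk.name
        if not remote.exists() or sha256_of(remote) != sha256_of(chunk):
            resend.append(chunk)
    return resend
```

    After an interrupted transfer only the chunks returned by `chunks_to_resend` have to be copied again, instead of the whole 500GB image.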

     

    Unraid doesn't know how large your file will be in the end and, depending on how you set up the share, it will store it on the first drive that has space. I guess reaching the limit of that drive and starting to move the file to another disk can cause an interruption which breaks the data stream from Macrium to the share.

  14. 1 hour ago, Masterbob221 said:

    IOMMU group 0:[1022:1482] 00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge

    [1022:1483] 00:01.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge

    [1022:57ad] 01:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse Switch Upstream

    [1022:57a3] 02:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge

    [1022:57a3] 02:03.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge

    [1022:57a3] 02:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge

    [1022:57a3] 02:05.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge

    [1022:57a4] 02:08.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge

    [1022:57a4] 02:09.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge

    [1022:57a4] 02:0a.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge

    [10de:2182] 03:00.0 VGA compatible controller: NVIDIA Corporation TU116 [GeForce GTX 1660 Ti] (rev a1)

    [10de:1aeb] 03:00.1 Audio device: NVIDIA Corporation TU116 High Definition Audio Controller (rev a1)

    [10de:1aec] 03:00.2 USB controller: NVIDIA Corporation Device 1aec (rev a1)

    [10de:1aed] 03:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU116 [GeForce GTX 1650 SUPER] (rev a1)

    [8086:24fb] 04:00.0 Network controller: Intel Corporation Dual Band Wireless-AC 3168NGW [Stone Peak] (rev 10)

    [8086:1539] 05:00.0 Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)

    [10ec:8125] 06:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller

    [1022:1485] 07:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP

    [1022:149c] 07:00.1 USB controller: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller

    [1022:149c] 07:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller

    [1022:7901] 08:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)

    [1022:7901] 09:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)

    What Ghost82 said is true in general, but all these devices are in the same group and would all have to be passed through together to make it work. In your case that will break a lot of things. You have a couple of options to separate the GPU into its own group. You can try a different PCIe slot and check if the groupings are better, or you can use the ACS Override option to split the groups. Most modern motherboard BIOSes also have an option to enable IOMMU, which can help to split up the groups.

     

    Don't try to pass all these devices to a VM, or Unraid will lose access to all of them (network, USB, SATA controller). You can even break things by only trying to use one device in a VM: the rest can lag out, crash or completely disappear from Unraid.
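
    To check quickly whether a slot change or ACS Override actually separated the GPU, you can read the groups straight from sysfs. This is a minimal sketch assuming a Linux host that exposes `/sys/kernel/iommu_groups`; the function names are made up, and `0000:03:00.0` is just the GPU address from the listing above with the PCI domain prefixed.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups") -> dict[int, list[str]]:
    """Map each IOMMU group number to the PCI addresses it contains."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        return groups  # IOMMU disabled or not a Linux host
    for group_dir in base.iterdir():
        if not group_dir.name.isdigit():
            continue
        devices_dir = group_dir / "devices"
        devices = sorted(d.name for d in devices_dir.iterdir()) if devices_dir.is_dir() else []
        groups[int(group_dir.name)] = devices
    return dict(sorted(groups.items()))


def shares_group_with(groups: dict[int, list[str]], pci_address: str) -> list[str]:
    """Return the other devices in the same group as pci_address."""
    for devices in groups.values():
        if pci_address in devices:
            return [d for d in devices if d != pci_address]
    return []


if __name__ == "__main__":
    groups = iommu_groups()
    for number, devices in groups.items():
        print(f"IOMMU group {number}: {', '.join(devices)}")
    # GPU address taken from the listing above
    print("Shares a group with the GPU:", shares_group_with(groups, "0000:03:00.0"))
```

    Ideally the last line only lists the GPU's own functions (03:00.1 to 03:00.3). Anything else in that list would have to go to the VM together with the GPU.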
