
Posts posted by IamSpartacus

  1. So I just did a test Mirror A -> B incremental run with the file I screenshotted above. The folder and file are identical on both servers, but it copies a new file over and I wind up with a duplicated directory and a second file with the changed character on the destination.

     

    [screenshot attachments showing the duplicated directory and file]

     

  2. 7 minutes ago, ich777 said:

    Not in my case. Can you create some files on your computer, zip them and send them over to me so that I can test it?
    Which character set are you using? As long as it is Unicode it should work.

     

    See here.  The files are identical on both source and destination.

     

    [screenshot attachments comparing the source and destination files]

  3. Does anyone have an issue with DirSyncPro when trying to copy files that have an apostrophe in the name? It seems like it can't read the character, so even if the file already exists on the destination it copies it over again, changing the ' to a ą and creating a second, different copy.
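
    One way to confirm whether this is an encoding mismatch rather than a DirSyncPro bug might be to compare the raw bytes of the filename on each side. A rough sketch; the paths and the "Don" filename fragment are just placeholders:

    # A plain ASCII apostrophe is byte 0x27, while a curly ’ is the UTF-8
    # sequence e2 80 99. If the destination shows different bytes for the
    # same name, the sync tool is re-encoding it. (Paths are hypothetical.)
    ls /mnt/user/media | grep "Don" | hexdump -C
    ls /mnt/remotes/backup/media | grep "Don" | hexdump -C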

  4. Could anyone explain why the first part of the VM backup process includes a copy of my previous backup .img file to a newly created .img file BEFORE my VM shuts down, and only then continues with the backup process? I'm just confused about what this first step is doing.

     

    2020-04-28 13:35:20 information: copy of backup of /mnt/user/data/backups/servers/athens/vm/SPE-DC1/20200427_0401_spe-dc1_vdisk1.img vdisk to /mnt/user/data/backups/servers/athens/vm/SPE-DC1/20200428_1335_spe-dc1_vdisk1.img starting.
    2020-04-28 13:42:24 information: copy of /mnt/user/data/backups/servers/athens/vm/SPE-DC1/20200427_0401_spe-dc1_vdisk1.img to /mnt/user/data/backups/servers/athens/vm/SPE-DC1/20200428_1335_spe-dc1_vdisk1.img complete.
    2020-04-28 13:42:24 information: skip_vm_shutdown is false. beginning vm shutdown procedure.

     

  5. 2 minutes ago, hugenbdd said:

    It only runs when scheduled. It then checks the % of cache used to determine if it should continue to move files.

     

    Example: Schedule set for 3AM with 65% cache used setting.

     

    At 3AM the mover script runs.

    Check the space used on the cache.

    If space used is greater than or equal to 65%, it continues to invoke the mover script.

    If space used is less than 65%, it exits and writes a log entry saying something like "mover not running as it has not reached the used threshold".

     

    Perfect!  That's exactly what I need.  Thank you!
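
    In other words, the logic hugenbdd describes boils down to something like this. A rough sketch, not the actual mover code; the 65% threshold is just the example value and the mover path is an assumption:

    #!/bin/bash
    # Invoked by the schedule; only proceeds if cache usage is at/above the threshold.
    THRESHOLD=65
    USED=$(df --output=pcent /mnt/cache | tail -1 | tr -dc '0-9')
    if [ "$USED" -ge "$THRESHOLD" ]; then
        /usr/local/sbin/mover    # assumed path; run the normal mover
    else
        logger "mover not running as it has not reached the used threshold"
    fi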

  6. Can the setting 'Only move at this threshold of used cache space' be used in conjunction with the scheduler?  Such that the mover is scheduled to only run once a day, but it will only run if the used space is above the threshold?  Or does this setting always invoke the mover the moment the threshold is reached?

  7. 11 minutes ago, johnnie.black said:

    That's odd, pv is usually much faster than rsync and closer to real device speed. If it's not working for you, you need to find another tool to test with; like I mentioned, rsync is not a good tool for benchmarking.

     

    Using cp it seems to be getting over 1GB/s, but it's hard to say for sure since there is no way to view real-time progress with cp.

     

    EDIT: I used nload to view current network usage on the NIC while doing a cp to an SMB mount and I get 18Gbps, so that is nice at least.
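
    For reference, a minimal sketch of getting a live progress readout on a plain copy, assuming pv is installed; the paths are placeholders:

    # Stream the file through pv for a live throughput/ETA display
    pv /mnt/cache/bigfile.img > /mnt/remotes/server2_share/bigfile.img

    # Or let rsync print a single overall progress line
    rsync -a --info=progress2 /mnt/cache/bigfile.img /mnt/remotes/server2_share/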

  8. 5 minutes ago, testdasi said:

    I'm a simpleton so rsync --progress or rsync --info=progress2

    And I use real data. It's not too hard to find a 40-50GB linux iso nowadays (or just use dd to create a really big test file and do rsync on it).

     

    [screenshot of the rsync transfer result]

     

    Nowhere near the speed the storage is capable of, but that may be an rsync limitation. And iowait was low during the transfer. Seems I need to find a better transfer method than rsync and then figure out why my network transfers are still slow, assuming they still are with that method.
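
    Roughly the kind of test testdasi describes, sketched out; the size, paths and peer address are placeholders:

    # Create a ~40GB test file once; random data avoids any compression skew
    dd if=/dev/urandom of=/mnt/cache/testfile bs=1M count=40960

    # Push it to the other server with one overall progress/throughput line
    rsync -a --info=progress2 /mnt/cache/testfile root@10.10.10.2:/mnt/cache/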

  9. 2 minutes ago, testdasi said:

    And what is your current directory when executing the dd command?

    I have found dd + dsync gives unrealistically low results. I think it's because dsync is sequential, so your 32068 blocks are done one by one, and the high iowait is the 32k times dd had to wait to confirm data has been written.

    In reality, and particularly with NVMe, things are done in parallel and/or in aggregation so it's a lot faster.

     

    Have you also done a test on your other server to see if it is capable of more than 250MB/s read?

     

    Do you recommend a different test, such as fio? Yes, internal dd tests on each side of the transfer (cache pool on one server, NVMe cache on the other) show each is capable of 1.8-2.0GB/s reads/writes to the drive.
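
    If fio ends up being the tool, a minimal sequential-write job might look like this; the directory, size and queue depth are arbitrary placeholders:

    # Sequential 1MiB writes with direct I/O, async engine, modest queue depth
    mkdir -p /mnt/cache/fio-test
    fio --name=seqwrite --directory=/mnt/cache/fio-test \
        --rw=write --bs=1M --size=10G \
        --ioengine=libaio --iodepth=16 --direct=1 \
        --numjobs=1 --group_reporting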

  10. 5 minutes ago, bonienl said:

    I am not sure how you are testing, but I have a cache pool of 4 devices in RAID10 mode, and can reach near 10Gb/s transfer speeds when copying files to or from the cache pool over SMB.

     

    These servers are direct-connected, so I don't really have any other way of testing other than rsync or some other internal transfer tool. I get poor speed whether I use NFS or SMB.

  11. 9 minutes ago, testdasi said:

    Details on test methodology please.

     

    Initially testing with the following command:

     


    dd bs=1M count=32068 if=/dev/zero of=test conv=fdatasync

     

    Then, testing from an unassigned disk to cache using rsync, iowait was much lower. I'm trying to test writes from one server to the next using a cache-enabled share, but I seem to be getting only 200-250MB/s across my network right now, which doesn't make much sense since it's a 40GbE connection and iperf3 shows it connected as such.

  12. I have two Unraid servers connected via direct-connect 40GbE NICs and I'm looking for advice on how best to tune NFS/SMB to get the fastest transfers possible between them. The storage on each end of the transfer is capable of 2.0GB/s in internal testing. If I can get even half that I'd be happy, but my initial testing is barely breaking 200-250MB/s. As you can see from the iperf3 testing below, connectivity is not the bottleneck.

     

    The only thing I've tried changing is the MTU, from 1500 to 9000 for this direct connection, but it makes no difference. I've been testing using rsync between the servers.

     

    [screenshot of iperf3 results]
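
    For reference, a hedged sketch of the basic link checks involved; the interface name and peer address are placeholders:

    # Set and verify jumbo frames on the 40GbE interface (eth2 is hypothetical)
    ip link set dev eth2 mtu 9000
    ip link show dev eth2 | grep mtu

    # Multi-stream iperf3 run to confirm the raw link is not the bottleneck
    iperf3 -c 10.10.10.2 -P 4 -t 30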

  13. I've been aware of btrfs pools in Unraid causing high iowait for a while, and up to this point I've avoided the issue by using a single NVMe drive formatted with XFS. But circumstances have changed and I'm exploring a cache pool of Intel S4600 480GB SSDs. I've been doing extensive testing with both the number of drives in the pool and the raid balance. It seems that as the number of drives in the pool increases, so does the amount of iowait; there appears to be a specific amount of iowait attached per drive. The raid balance does not seem to have any effect other than shortening/prolonging the iowait period depending on the balance (i.e. raid1/raid10 has a longer period of high iowait than raid0, obviously, since the write takes longer).

     

    I have tested these drives connected both to my onboard SATA controller (SuperMicro X11SCH-F motherboard with C246 chipset) and also connected to my LSI 9300-8e SAS controller.  There is zero difference.

     

    I'm curious if anyone has any insight on how to mitigate these iowait issues. My only solution at the moment appears to be using a RAID0 balance so that writes are very fast (i.e. 2.0GB/s with four S4600s) and the iowait only lasts, say, 10-15 seconds for a 20GB write. But that is obviously not sustainable unless I can ensure I never do large transfers to cache-enabled shares, which is kind of the whole point of having a cache.

     

     

    EDIT: I should note these tests were done using dd. Writes from an unassigned pool to cache do show much less iowait. I guess that would make sense, being that RAM is so much faster than storage.
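
    For anyone trying to reproduce this, one way to watch per-device wait and utilization while a transfer runs; the device names are placeholders:

    # Extended stats every 2 seconds; watch the avg-cpu %iowait line and the
    # per-device wait/%util columns for the pool members (names hypothetical)
    iostat -x 2 sdb sdc sdd sde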

  14. On 7/28/2019 at 12:31 PM, Siren said:

    Been a while since I got back to this thread. Was really busy and trying out some other possibilities to get 40g to work anywhere else. My goal isn't to turn this thread into a Mellanox tutorial, but I'll help out.

     

    Seems like you're just trying to run the exe via double-click. The exe doesn't work like that and is ONLY run from the command prompt/PowerShell (best to use command prompt). The steps below show how to find the necessary card info and get it working in Ethernet mode:

     

    A: I'm assuming you've already downloaded and installed BOTH WinOF & WinMFT for your card, and that you've already installed the card in your system. If not, head over to the Mellanox site and download + install them. WinOF should automatically update the firmware of the card.

     

    B: In this example, I'm using my card which I've already listed above. Again, YMMV if you have a different model.

     

    1. Run Command Prompt as Administrator. Navigate to where WinMFT is installed by default:

     

    
    cd C:\Program Files\Mellanox\WinMFT

     

    2.  Run the following command & save the info for later:

     

    
    mst status

     

    Your output should look something like this:

    
    MST devices: 
    ------------ 
         <device identifier>_pci_cr0 
         <device identifier>_pciconf<port number> 
    
    
    ##In my case: 
    
    MST devices: 
    ------------ 
         mt4099_pci_cr0 
         mt4099_pciconf0

     

    Note that any additional ports will be shown here as well.

     

    3. Query the card and port to check which mode it is using:

     

    
    mlxconfig -d <device identifier>_pciconf<port number> query
    
    ## in my case
      
    mlxconfig -d mt4099_pciconf0 query

     

    The output should look something like this:

    [screenshot of the mlxconfig query output]

    The value in green is the port type for the card. Note that just because 2 ports are listed doesn't mean I have 2 physical ports on the card; as I mentioned above in the thread, this is a single-port card. The port types are as follows:

    (1) = Infiniband

    (2) = Ethernet

    (3) = Auto sensing

     

    4. If your card is already in Ethernet mode, then you're good. If not, use the following command to change it:
     

    
    mlxconfig -d <device identifier>_pciconf<port number> set LINK_TYPE_P1=2 LINK_TYPE_P2=2
    
    ##In my case
    
    mlxconfig -d mt4099_pciconf0 set LINK_TYPE_P1=2 LINK_TYPE_P2=2

     

    Output:

    [screenshot of the set command output]

    Select Y and hit enter and the port type will change (note that the change will be under the New column).

    It'll then ask you to reboot the system for the change to take effect.

     

    Do so, then repeat step 3 to verify it has changed. You should get nearly the same output as mine above.

     

     

    Is it possible to complete this process in Unraid or some other Linux distro (one that can be run from a LiveCD)? I have no way of running Windows on my server currently and would love to get my ConnectX-3 working in Unraid.

     

    EDIT: I managed to install Windows onto a USB3 HDD and got the above steps working perfectly. Thanks!
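
    For anyone who still wants to avoid Windows entirely: the open-source mstflint package (packaged by most Linux distros) appears to offer an equivalent, though I have not verified it on Unraid itself. A rough sketch; the PCI address is a placeholder you would take from lspci:

    # Find the card's PCI address (e.g. 04:00.0)
    lspci | grep -i mellanox

    # Query the current link type, then set both ports to Ethernet (2)
    mstconfig -d 04:00.0 query
    mstconfig -d 04:00.0 set LINK_TYPE_P1=2 LINK_TYPE_P2=2
    # Reboot afterwards for the change to take effect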

  15. 3 minutes ago, itimpi said:

    It has to be UNRAID (i.e. all capitals) or it will not be found.

    The ‘df’ command output confirms that the flash drive is not mounted, and until that is resolved you will not get your network up, as the drivers and configuration need to be loaded from the flash drive.

     

    If you have a choice of USB ports then a USB2 one is preferred. Some BIOSes seem to be not too reliable in their handling of USB drives during the boot process, particularly when using a USB3 flash drive in a USB3 port.

     

    It just doesn't make any sense though. If I take my other IDENTICAL flash drive, the one with my current Unraid install that I've been running for 4 months, and plug it into the same port, the server boots fine.
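
    If it helps anyone else debugging this, a hedged sketch of checking and fixing the volume label from a Linux shell; the device name is a placeholder and fatlabel comes from dosfstools:

    # Identify the flash partition and its current FAT label (sdX1 is hypothetical)
    blkid /dev/sdX1

    # With the partition unmounted, set the all-caps label the boot process expects
    fatlabel /dev/sdX1 UNRAID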
