Posts posted by sjerisman

  1. @Iker - Thanks for the great plugin!

     

    I'd like to propose two possible enhancements for a future version:

    1. Have a refresh icon per pool.  Right now there is only a global refresh icon, which causes disks in the array to spin up if any of them are formatted as ZFS, even when I'm only interested in getting updates for a 'cache' pool at the moment.  A 'refresh' icon per pool would allow targeted refreshing: I could refresh just a 'cache' or 'UD' pool and let the array drives stay spun down.  Another possible option is to skip refreshing pools whose datasets are currently hidden.
    2. Would it be possible to add the 'written' property to the Snapshot Admin table view (see the example below)?  It can be queried with `zfs list -o written`, similar to how you are getting the other properties.  I usually find 'written' more meaningful than 'used' when looking at historical snapshots.  My understanding is that 'written' is how much space was written to a filesystem or snapshot since the previous snapshot, whereas 'used' is the amount of data unique to that snapshot (not referenced by any other snapshot), i.e. the amount that would be freed if it were deleted.  Both are useful, but 'written' is usually more relevant (at least to me).
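
    For illustration, this is the sort of query I mean ('pool/appdata' is just a placeholder dataset name):

    zfs list -t snapshot -o name,creation,used,written pool/appdata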
  2. @SimonF - Thanks for the great plugin!  I was able to fairly easily get it to do most of what I want.  I did have to research elsewhere how to set up the keyfiles for remote SSH access, but that was fairly easy to figure out and get working as well.

     

    I'm stuck on retention for the remote server though.  I understand that this is still not implemented on the source server side, but is there something I'm missing to be able to apply retention directly on the remote server?  What are other people doing to cleanup old snapshots that have been sent from another server?

     

    In my case, I have a BTRFS pool with some subvols on a 'source' UnRaid server that I want to keep ~3 days' worth of hourly snapshots on, and then send them to a BTRFS pool on a 'remote' UnRaid server.  The remote server should then keep 14 days of the hourly snapshots, but also take daily and weekly snapshots, each with their own retention periods.  So, I have one schedule on the source server to create the hourly snaps, tag them with 'Hourly', and send them to the remote server.  Then, I have two schedules on the remote server to create the daily and weekly snaps, tagged as 'Daily' and 'Weekly'.  I think retention of the hourly snaps on the source server is working OK, and retention of the daily and weekly snaps on the remote server seems to be working OK as well, but what should I do about the hourly snaps on the remote server?  I tried creating an hourly schedule on the remote server, basically just to handle the retention cleanup, but it doesn't seem to consider the received incremental snaps when it applies retention.  Is this just a bug, or does it purposely exclude snaps that were received from a different server?

     

    So, again... what are other people doing for retention on 'remote' servers?  Maybe custom scripts, or am I just missing something obvious?
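
    In case it helps the discussion, here is the kind of custom cleanup script I have in mind for the remote side.  This is only a rough sketch; the snapshot directory and the 'Hourly_' prefix are assumptions based on my own naming, not anything the plugin actually does:

    #!/bin/bash
    # Rough sketch: prune received hourly snapshots older than 14 days on the remote server.
    # SNAP_DIR and the 'Hourly_' prefix are assumptions, not the plugin's actual layout.
    SNAP_DIR="/mnt/backup_pool/snapshots"
    RETENTION_DAYS=14
    cutoff=$(date -d "-${RETENTION_DAYS} days" +%s)

    for snap in "$SNAP_DIR"/Hourly_*; do
        [ -d "$snap" ] || continue
        # btrfs records when the (received) snapshot subvolume was created locally
        created=$(btrfs subvolume show "$snap" | grep 'Creation time:' | sed 's/.*Creation time:[[:space:]]*//')
        [ -n "$created" ] || continue
        if [ "$(date -d "$created" +%s)" -lt "$cutoff" ]; then
            echo "Removing old snapshot: $snap (created $created)"
            btrfs subvolume delete "$snap"
        fi
    done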

     

    And, just because it is fresh in my mind... here are a few general usability observations:

    • It would be nice to be able to sort the list of snaps from newest to oldest instead of the current default of oldest to newest.
    • It would be nice to see the schedule's tag on the list view, maybe instead of the slot #.  Or, maybe allow a short description that could be shown there.
    • I keep accidentally clicking on the schedule's + icon, thinking I'm editing the schedule row.  I'm not sure why the + moves down from the subvol row to the schedule row once there are schedules.  Maybe leave the + on the subvol row and move the clock icon to where the + currently is on the schedule row?  Maybe change the clock icon to an 'edit' icon as well?
  3. I also noticed that SAS SSDs or HDDs that report 'Accumulated power on time, hours:minutes' incorrectly display the value as 'minutes xxxxxx' instead of 'hours xxxxxx' (or just 'xxxxxx').

     

    Example: 

    [screenshot]

     

    Raw data:

    > smartctl -a /dev/xxx

    ...
    === START OF READ SMART DATA SECTION ===
    SMART Health Status: OK
    
    Percentage used endurance indicator: 22%
    Current Drive Temperature:     31 C
    Drive Trip Temperature:        64 C
    
    Accumulated power on time, hours:minutes 27404:25
    ...

     

    27404 is the number of hours, not minutes.  Not sure if it is scraping the value wrong, or just displaying it oddly.
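
    For what it's worth, here's one way to pull just the hours out of the raw output (purely an illustration of the parsing; I don't know how the plugin actually scrapes it):

    smartctl -a /dev/xxx | awk -F'hours:minutes ' '/Accumulated power on time/ {split($2, t, ":"); print t[1]}'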

     

  4.  

    Hello,

     

    I have a bunch of SAS SSDs that currently only display drive temperature information in the SMART Attributes tab.

    The SSDs in question are: Toshiba PX04SVB096 and PX04SRB096, but I assume this applies to other SAS SSDs as well.

     

    It would be really nice if we could also see the 'Percentage used endurance indicator'.

     

    This value is available as `Percentage used endurance indicator: xx%` through either of these commands: 

    smartctl -a /dev/xxx
    smartctl -l ssd /dev/xxx

    Or, as `"scsi_percentage_used_endurance_indicator": xx` through the JSON command:

    smartctl -a -j /dev/xxx

     

    Examples:

    > smartctl -a /dev/xxx

    ...
    === START OF READ SMART DATA SECTION ===
    SMART Health Status: OK
    
    Percentage used endurance indicator: 10%
    ...

    > smartctl -l ssd /dev/xxx

    ...
    === START OF READ SMART DATA SECTION ===
    Percentage used endurance indicator: 10%

    > smartctl -a -j /dev/xxx

    ...
      "smart_status": {
        "passed": true
      },
      "scsi_percentage_used_endurance_indicator": 10,
      "temperature": {
        "current": 25,
        "drive_trip": 64
      },
    ...
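
    If the JSON route is easier, the value can also be pulled out directly with jq (assuming jq is available):

    > smartctl -a -j /dev/xxx | jq '.scsi_percentage_used_endurance_indicator'
    10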

     

  5. 22 hours ago, JTok said:

    Sorry, I was unclear. What I meant was that once you switch to the new structure, backups made under the old structure will need manually cleaned out until only new structure backups exist.

    Just a thought... What if any old backup files were converted to the new format after a plugin upgrade, or after opting into the new naming convention?  (i.e. read the existing filenames, strip the timestamps, and move the files into the new subfolders)

  6. 16 hours ago, nextgenpotato said:

    Thanks to @sjerisman as well.

    Compression looks promising

    Yep, no problem.

     

    Results are definitely looking promising.  More details on the other thread: 

     

    Hopefully these changes will be integrated into this GUI plugin sometime next week.

     

  7. @Squid - Sorry up front if this has already been asked, but any thoughts on an option to use zstd compression instead of gzip?

     

    Here are some quick tests I did on two of my systems that show much improved speed and slightly smaller output sizes:

     

    System 1:

    > cd /mnt/user/appdata
    > du -d 0 -h .
    1.6G    .
    
    > time tar -czf /mnt/user/UnRaidBackups/AppData.tar.gz *
    real    1m17.710s
    user    1m6.245s
    sys     0m6.219s
    
    > time tar --zstd -cf /mnt/user/UnRaidBackups/AppData.tar.zst *
    real    0m24.039s
    user    0m10.248s
    sys     0m5.330s
    
    > ls -lsah /mnt/user/UnRaidBackups/AppData.tar.*
    814M -rw-rw-rw- 1 root root 814M Jan 16 14:28 /mnt/user/UnRaidBackups/AppData.tar.gz
    783M -rw-rw-rw- 1 root root 783M Jan 16 14:20 /mnt/user/UnRaidBackups/AppData.tar.zst

    System 2:

    > cd /mnt/user/appdata
    > du -d 0 -h .
    8.9G    .
    
    > time tar -czf /mnt/user/UnRaidBackups/AppData.tar.gz *
    real    4m55.831s
    user    4m19.009s
    sys     0m27.770s
    
    > time tar --zstd -cf /mnt/user/UnRaidBackups/AppData.tar.zst *
    real    2m1.380s
    user    0m35.069s
    sys     0m23.054s
    
    > ls -lsah /mnt/user/UnRaidBackups/AppData.tar.*
    4.6G -rw-rw-rw- 1 root root 4.6G Jan 16 14:39 /mnt/user/UnRaidBackups/AppData.tar.gz
    4.4G -rw-rw-rw- 1 root root 4.4G Jan 16 14:34 /mnt/user/UnRaidBackups/AppData.tar.zst
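
    If CPU usage is ever a concern, the compression level and thread count can also be tuned when going through tar.  As far as I know the zstd CLI honors these environment variables, but that's worth double-checking against the zstd version shipped with UnRaid:

    ZSTD_CLEVEL=3 ZSTD_NBTHREADS=4 tar --zstd -cf /mnt/user/UnRaidBackups/AppData.tar.zst *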

     

  8. 13 hours ago, JTok said:

    Big fat thanks to @sjerisman for doing all the work to get a zstandard compression option added to the script!

    @JTok - Thanks for the shout out, and thanks for reviewing and releasing my changes so quickly!  I did this as much for my own benefit as anything else.  ;)

     

    13 hours ago, JTok said:

    This is going to be significantly faster than using the existing compression option, as well as much more efficient.

    Just to give everyone a bit of an idea of how much faster and more efficient this new inline compression option is, here are some results from one of my UnRaid servers:

     

    I currently have 4 (fairly small) VMs on this server (Win10, Win7, Arch Linux, and AsteriskNOW) running on an NVMe unassigned device and backing up directly to a dual-parity HDD array (bypassing the SSD cache) with TurboWrite enabled.  This is running on a 4th Gen quad-core CPU (i7-4770) with 32GB of DDR3 RAM.

    * The 4 VMs have a total of 60 GB of raw image files with a sparse allocation of 33 GB

    * The old post compression option took 55 minutes and produced a total of 17.15 GB of compressed .tar.gz files

    * The new inline zstd compression option took less than 3 minutes and produced a total of 16.83 GB of compressed .zst image files (using the default compression level of 3 and 2 compression threads)

    * Some of these VMs (Win10 and ArchLinux) have compressed file systems inside the vdisk images as well.  Without this, the vdisk sparse allocation would have been larger and the old compression code would have been even slower.

     

    OS           Size  Alloc  .tar.gz  .zst
    -----------  ----  -----  -------  ----
    Win10        30G   9.1G   6.7G     6.7G
    Win7         20G   18G    8.4G     8.2G
    ArchLinux    5.0G  2.0G   1.4G     1.4G
    AsteriskNOW  5.0G  3.9G   645M     573M

     

    I feel MUCH better now about having daily scheduled backups that go directly to the HDD array and allowing the disks to spin back down again so much more quickly!

     

     

     

  9. @JTok - I was able to do some more coding and testing on my open pull request: https://github.com/JTok/unraid-vmbackup/pull/23/files?utf8=✓&diff=split&w=1

     

    Additional changes:

    * I added seconds to the generated timestamps and logged messages for better granularity

    * I refactored the existing code that deals with removing old backup files (both time based as well as count based) to make it more consistent and easier to follow

    * I added support for removing old .zst archives (both time based as well as count based) using the refactored code above (rough illustration after this list)

    * I did a bunch of additional testing (including with snapshots on) and I think everything is working properly
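
    For anyone curious what the time-based vs. count-based cleanup looks like conceptually, here is a rough illustration (not the PR's actual code; the variable names are made up):

    # time based: delete .zst archives older than N days
    find "$backup_location" -maxdepth 1 -name '*.zst' -mtime +"$days_to_keep" -delete

    # count based: keep only the newest N archives, delete the rest
    ls -1t "$backup_location"/*.zst | tail -n +$((number_to_keep + 1)) | xargs -r -d '\n' rm -f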

     

    Let me know if you would like to see any other changes on this PR.  I can make a similar PR for the GUI/Plugin version after this one is approved and after I figure out how to properly test changes to the GUI/Plugin.

  10. And, I repeated the same Windows 7 'real' VM test one more time, but this time used the SSD cache tier as the destination instead of the HDD...

     

    With the old compression code, it took 1-2 minutes to copy the 18 GB image file from the NVMe UD over to the dual SSD cache, and then still took 13-14 minutes to compress it down to an 8.4 GB .tar.gz.  The compression step definitely seems CPU-bound (probably single-threaded) rather than I/O-bound in this test.

     

    With the new inline compression code, it still only took about 1-2 minutes to copy from the NVMe UD and compress (inline) over to the dual SSD cache, and still produced a slightly smaller 8.2 GB output file.  The CPU was definitely hit harder, and probably became the bottleneck (over I/O), but I'm really happy with these results and would gladly trade a few minutes of higher CPU usage for much lower disk I/O, much less disk wear, and much faster backups.

  11. 57 minutes ago, sjerisman said:

    My current test VM is a single vdisk Windows 10 image, 30GB raw, 8 GB sparse, and compresses down to about 6.2GB.  With the old compression turned on it took between 2-3 minutes to copy the 8 GB sparse vdisk over to the array (from a NVMe cache tier), and about 8 minutes to compress it.  With the new compression turned on, it took between 1-2 minutes to perform the inline compression over to the array and then it was done.  The CPU usage was marginally higher, but in either case the primary bottleneck was array I/O (turbo write enabled).  This is running on my test UnRaid server which has a 4th Gen quad-core CPU (with not much else going on) and a 2x 2.5" 500 GB disk single parity array (writes cap out at around 80-90 MB/s).  Obviously results will vary depending on other hardware configs and VM disks.  I have .log files if interested.

     

    And here is another test that is closer to real world (and even more impressive)...

     

    I took a 'real' Windows 7 VM with a 20 GB raw sparse img file (18 GB allocated) and ran it through the old and new compression code.

     

    With the old compression code, it took 3-4 minutes to copy the 18 GB image file from an NVMe UD over to an HDD dual-parity array, and then another 14-15 minutes to compress it down to an 8.4 GB .tar.gz.

     

    With the new inline compression code, it only took 2-3 minutes to copy from the NVMe UD and compress (inline) over to the HDD dual-parity array, with a slightly smaller 8.2 GB compressed output size.  The CPU usage was marginally higher than during the .tar.gz step, but it was over much more quickly.

  12. 5 hours ago, JTok said:

    Sorry, I wasn't clear. My tests were from an SSD cache array to the Parity Array. So that's also the bottleneck I was referring to.

    I was attempting to also point out that, to anyone interested in improving throughput, the biggest improvements will come from changing storage around to cut out the Parity Array. i.e. running the VMs on an NVMe unassigned device and backing up to an SSD cache array or vice versa.

    Yep, that makes sense.  I hadn't really considered backing up directly to the cache tier because for a lot of people that means their VMs and backups are on the same storage device(s) and it could fill up the cache quickly and add a lot of wear to the SSDs.  In thinking about it, I agree that backing up to a share that has cache: Yes, cache: Prefer, or cache: Only would definitely help with the I/O performance bottleneck.  But I think the script would still be doing things a bit inefficiently (including wearing out the cache faster) and would still be slower than inline compression.

     

    5 hours ago, JTok said:

    This seems viable, but there are some issues that I would need to handle related to backwards compatibility before switching compression algorithms outright.

    Yes, I agree that backwards compatibility would be nice to maintain.  My suggestion is to introduce a new option that, when set, performs inline zstd compression, but when not set, still behaves exactly like before.
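
    Something along these lines is what I have in mind (just a sketch on my part; the option and variable names are made up, not from the actual script):

    # sketch only -- option/variable names are made up, not the script's actual ones
    inline_zstd_compression="yes"   # new option; "no" (or unset) keeps the old behavior

    if [ "$inline_zstd_compression" = "yes" ]; then
        # compress each vdisk inline while copying it to the backup destination
        # (the level and thread count could themselves be options)
        zstd -"$zstd_level" -T"$zstd_threads" --sparse "$source" -o "$destination".zst
    else
        # old behavior: plain copy now, optional .tar.gz compression afterwards
        cp -av --sparse=always "$source" "$destination"
    fi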

     

    5 hours ago, JTok said:

    I'm going from memory here, so possibly the details are wrong, but I believe it came down to being able to turn the VM back on sooner.

    That makes sense.  I think a large part of the point of zstd, though, is that most modern processors can easily handle real-time compression.  So inline compression should technically end up faster, because the bottleneck is usually still I/O and the amount of data that actually needs to be written is reduced.

    5 hours ago, JTok said:

    Thanks for looking into this btw!

    Yep, no problem.

     

    Here is a preliminary pull request to your script only GitHub repo: https://github.com/JTok/unraid-vmbackup/pull/23/files?utf8=✓&diff=split&w=1

     

    I still need to finish some testing (i.e. I have no idea what will happen with snapshots enabled), as well as implement the removal of old .zst archives based on age or count limits, but the base code seems to be working well on my test system.

     

    My current test VM is a single vdisk Windows 10 image, 30GB raw, 8 GB sparse, and compresses down to about 6.2GB.  With the old compression turned on it took between 2-3 minutes to copy the 8 GB sparse vdisk over to the array (from a NVMe cache tier), and about 8 minutes to compress it.  With the new compression turned on, it took between 1-2 minutes to perform the inline compression over to the array and then it was done.  The CPU usage was marginally higher, but in either case the primary bottleneck was array I/O (turbo write enabled).  This is running on my test UnRaid server which has a 4th Gen quad-core CPU (with not much else going on) and a 2x 2.5" 500 GB disk single parity array (writes cap out at around 80-90 MB/s).  Obviously results will vary depending on other hardware configs and VM disks.  I have .log files if interested.

     

  13. On 1/8/2020 at 10:37 PM, JTok said:

    I looked into this a little today, but this is by no means conclusive. In my tests so far I/O has been the biggest bottleneck, not the compression algorithm or number of threads. So using the parity array vs the cache array, or an unassigned device, is probably going to have the biggest effect on performance.
    Honestly, all things being equal, I only saw about a 15-20% performance improvement with my test VM (though I understand that there could be more pronounced differences with other use cases). I tested using zstd, lbzip2, and pigz.

    That being said, since there are some performance improvements with a multi-threaded compression utility, I am looking into a good way to integrate something.
    I suspect, that at least initially, I will stick with pigz because of backwards compatibility issues. Though I may look into adding an option for the other two later on.

     

    I assume most people host their VM image files on faster storage (i.e. SSD or NVMe cache or unassigned devices) and write their backups to the array.  The I/O performance bottleneck is mostly going to be with the array.  Currently, the script copies the image files from source to destination and then afterwards compresses them.  This results in writing uncompressed image files to the array, then reading uncompressed image files from the array, compressing them in memory, and finally writing the compressed result back to the array.  (i.e. READ from cache -> WRITE to array -> READ from array -> COMPRESS in memory -> WRITE to array)

     

    So, what about an option to use inline (zstd) compression per image file and eliminating the entire post compression step?  This would mean that all reads go against the faster storage tier and are compressed in memory prior to writing to the slower array tier.  (READ from cache -> COMPRESS in memory -> WRITE to array).


    Something like this (possibly with options to tweak the compression level and number of threads):

    zstd -5 -T0 --sparse "$source" -o "$destination".zst

    instead of this (and the later tar/compression step): 

    cp -av --sparse=always "$source" "$destination"

     

    In my testing, this dramatically reduces I/O and backup durations, and even results in slightly smaller archives (depending on compression levels chosen).
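
    And restoring is just the reverse: decompress back to a sparse image file, which should be as simple as something like this (untested sketch on my part; "$restored_image" is only a placeholder):

    zstd -d --sparse "$destination".zst -o "$restored_image"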

     

    Or, is there a reason the image files have to first be copied and later compressed?

     

    I might be able to whip up a pull request if that would be helpful.

  14. I am just about finished building a custom enclosure for my first unRAID build.  I am naming this unRAID server 'Trogdor'... just because, no special reason.  This unRAID server is primarily for Plex (docker), Transmission (docker), Jackett (docker), Sonarr (docker), Radarr (docker), Nextcloud (docker), and potentially a small handful of VMs (1x VoIP PBX, 1-2x Windows 10).

     

    [photo]

     

    Background:

     

    I have been using a custom-built Ubuntu/ZFS based server (virtualized on Hyper-V) for 8+ years, but am finally switching over to unRAID.  While I really like ZFS overall, I find its upgrade options to be really frustrating and lacking for the home user.  I have heard good things about unRAID and figured I would give it a try.  I fell in love with the user interface and easy Docker integration, and I really like the design decisions around the array/cache/parity architecture.

     

    The enclosure for my old 'server' is a 3U SuperMicro case with 8x hotswap 3.5" drive bays and 2x 5.25" drive bays.  It is installed in a DIY half-height rack cabinet that I built by cutting an old networking-style rack in half and building an MDF case around it.  It has served pretty well, but I am trying to move away from noisy, power-hungry rack mount equipment to smaller, quieter, and more efficient equipment.

     

    The Design/Inspiration:

     

    I have always liked the looks of the Synology NAS boxes (I just don't really like the hardware, OS, or price), and wanted to come up with something similar.  The end result is a custom-built case that is 14" wide, 10" deep, and 8" tall.  It takes a microATX-sized motherboard and a TFX-sized power supply.  There are two 120mm fans in the front and three 92mm fans in the back (this is way more cooling than is actually needed).  The HDDs are mounted vertically in-between the front and rear fans with a 1/2" gap between each HDD.  I originally designed this enclosure around the 12-drive 'Plus' unRAID license (2x SSD, 10x HDD), but had to scale it back slightly (2-4x SSD, 8x HDD) due to a power supply mis-calculation (I wasn't properly accounting for the spin-up current).  I may be able to find a different power supply to get back to 10x HDD if needed down the road (the PSU needs to be TFX or Flex ATX).

     

    Currently installed hardware:

     

    • Motherboard: Dell Optiplex 9020 MT (MiniTower - microATX sized)
      • Required fabrication of a custom cable to use with a 'standard' PSU (also commercially available)
      • Info about other required modifications here
    • CPU: i7-4770 w/ Dell Optiplex 9020 MT heatsink/fan
    • RAM: 32GB DDR3 1600 MHz (4x8GB)
    • PSU: Dell 250W 80+ Gold TFX (PN: 0YJ1JT) (came from a Dell OptiPlex DT Desktop)
      • 17.8A on the 12V rail
      • (yes this might be a bit underpowered, but seems to work fine so far, and was really cheap... $10 shipped)
    • Storage Controller: PERC H310 8x SAS/SATA PCIe (flashed with IT firmware)
    • Boot drive: SanDisk Cruzer 16GB
    • Cache drives: 2x Samsung 840 EVO 120GB (will upgrade to 500GB 850 EVOs after retiring old server)
    • Parity drives: 2x 2TB Hitachi/Seagate 7200RPM
    • Data drives: 4x 2TB WD Greens (with head park times modified) (will add 2x more 2TB WD Greens after retiring old server)
    • Case fans:
      • 3x 92mm x 32mm: SUNAN PSD1209PLV2-A (came from old Dell desktops) 4-wire, 12V, 0.35A, 4.2W, 79 CFM, 4000 RPM, 42.1 dBA
      • 2x 120mm x 38mm: NMB-MAT 4715KL-04W-B56 (came from old Dell desktops) 4-wire, 12V, 1.3A, 15.6W, 129.9 CFM, 3600 RPM, ?? dBA

     

    I already had almost all of this hardware on hand (from previous builds or spare/used hand-me-downs), but did spend $10 for the PSU (eBay), $20 for the PERC H310 (eBay), and $65 to double the RAM from 16GB to 32GB (eBay).

     

    Power Consumption (measured at the wall) and Temperatures:

     

    • 'Idle' with 6x HDDs spun down: ~43W
    • 'Idle' with 6x HDDs spun up: 75-80W
    • Doing a parity check with all 6x HDDs spun up: 90-95W
    • The maximum while booting is about 160W while the drives are spinning up (yes, I know this isn't a very accurate way to measure this)

     

    About half-way through a parity check, the drives are still around 22C (21-24), and the CPU is around 30C.  Ambient temperature is currently about 20C.  I think I have way more cooling than needed.  ;)

     

    The Build:

     

    I started by cutting the sides and top/bottom from 1/2" MDF on a tablesaw.  The top is made up of three pieces; the two front/back edges are part of the enclosure, but the middle piece is removable for access to the insides.  The front/back panels were then cut to size from 3/16" hardboard on the tablesaw, then drilled, and the openings were cut out with a scroll saw.  I also drilled and screwed in stand-offs for the motherboard in the bottom before assembly.

     

    [photo]

     

    I then used glue and brad nails to attach the bottom and top edge pieces to the sides, and finally the front/back pieces.

     

    [photo]

     

    Before going further, I did a test fit to make sure everything actually (still) fits.

     

    [test-fit photos]

     

    Next up was using Bondo Body Filler (and a lot of sanding) to try and fill in the brad nail holes and a few low spots.

     

    [photo]

     

    Then, three coats of sandable spray primer/filler (with sanding in-between).

     

    [photo]

     

    And finally two coats of glossy black spray paint.  Unfortunately the imperfections really start to show here.  I need to learn how to do a better job of painting/prep.

     

    [photo]

     

    Then comes assembly... (yes it is pretty tight, but it does all fit!)

     

    [assembly photos]

     

    TODO:

    • Install a power indicator/button on the front (I have a nice blue illuminated one on order)
    • Add some USB ports to the front?
    • Make a dust cover for the front?
    • Make a custom fan controller to run the case fans slower/quieter

     

    Mistakes/Issues:

    • Everything is a bit tighter than I thought it would be.  I would probably make it slightly larger next time around (potentially including using a much larger ATX sized PSU).
    • The rear fans are a little too low and I had to notch one fan to be able to install the SAS card (this does help keep it in place though)
    • I mis-calculated the power supply requirements during initial design and had to ditch 2x intended HDDs from the design.
    • I am not very good with painting/prep. :(

     

    Let me know if there are any questions and I can try and post more details if needed.

     
