


Community Reputation

1 Neutral

About Ancan

  • Rank
    Advanced Member
  1. I'm keeping about 20 (four weekly and a bunch of daily). Perhaps I'm expecting too much in trying to imitate the setup I've built on the NetApps at work, but I expected a bit more from btrfs if that is the case.
  2. Thanks! Then I'll ignore these in the future and hope they go away after a reformat.
  3. Hi all, I've got an annoying issue with BTRFS where accessing the file system sometimes stalls, snapshot transfers take forever or never complete, and for two nights in a row I've found the Unraid box frozen in the morning. In the syslog, the only sign of anything abnormal is multiple "kernel: BTRFS info (device md6): the free space cache file (nnnnnnnn) is invalid, skip it" messages. It's always the same disk, which is the one I use to store backup snapshots of VMs/containers that are transferred from the cache drive nightly. There are no hardware-related timeouts or similar, which I'd expect to see if this were a hardware issue. I assume it's some file system issue, but "btrfs check" shows no errors, "scrub" finds no errors, and mounting the disk with the "clear_cache" option makes no difference. I'm currently evacuating the disk so I can reformat it, but wanted to ask if anyone has any idea what the problem might be and, if so, whether it's possible to repair. Edit: On 6.8.2 now, but I had been on the beta with the 5.x kernel until today.
  4. I also run my VMs and dockers on it, and it's mounted on the back of the motherboard, so there's no chance of any additional cooling. You learn new stuff all the time.
  5. Hi guys! I know there are probably a lot of LCC (load cycle count) questions, but isn't this one a bit strange? When moving from my Synology to Unraid, I bought two Seagate IronWolf ST4000VN008-2DR166 drives. They've been running for about two months now and already have 15500 and 30140 LCC. My two WD Reds that I moved over from the Syno are years old (the oldest is at 49k hours), but they have only 9000 and 5500. Does this seem normal?
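To put the numbers above side by side, here is a rough sketch that turns a load cycle count into an average rate per power-on hour. The helper name `lcc_per_hour` is made up for illustration, and "about two months" is assumed to mean 60 days of continuous uptime (1440 hours); the real power-on hours of the IronWolf drives are not given in the post.

```sh
#!/bin/sh
# Average load cycles per power-on hour.
# Usage: lcc_per_hour <load_cycle_count> <power_on_hours>
# (hypothetical helper, not part of any tool)
lcc_per_hour() {
    awk -v c="$1" -v h="$2" 'BEGIN { printf "%.1f\n", c / h }'
}

# Assumption: "about two months" = 60 days * 24 h = 1440 power-on hours.
lcc_per_hour 15500 1440    # first IronWolf
lcc_per_hour 30140 1440    # second IronWolf
lcc_per_hour 9000 49000    # oldest WD Red (49k hours, 9000 LCC per the post)
```

Under that assumption the IronWolfs average roughly 11 and 21 load cycles per hour, against about 0.2 per hour for the oldest WD Red. On systems with smartmontools installed, `smartctl -A` typically lists both the load cycle count and the power-on hours, so the assumed 1440 can be replaced with the real figure.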
  6. ...and here's an example of how I'm using it. This is my "daily-backup.sh", which is scheduled daily at 01:00. It snapshots and backs up all VMs and all directories under appdata.

      #!/bin/sh

      # VMs
      KEEP_SRC=48h
      KEEP_DST=14d
      SRC_ROOT=/mnt/cache/domains
      DST_ROOT=/mnt/disk6/backup/domains

      cd /mnt/disk6/backup

      # One snapshot/transfer per defined VM (blank lines from virsh are dropped)
      virsh list --all --name | sed '/^$/d' | while read -r VM; do
          /mnt/disk6/backup/snapback.sh -c -s "$SRC_ROOT/$VM" -d "$DST_ROOT" -ps "$KEEP_SRC" -pd "$KEEP_DST" -t daily
      done

      # AppData
      KEEP_SRC=48h
      KEEP_DST=14d
      SRC_ROOT=/mnt/cache/appdata
      DST_ROOT=/mnt/disk6/backup/appdata

      # One snapshot/transfer per directory under appdata
      for APP in "$SRC_ROOT"/*/; do
          /mnt/disk6/backup/snapback.sh -c -s "$APP" -d "$DST_ROOT" -ps "$KEEP_SRC" -pd "$KEEP_DST" -t daily
      done
  7. Hi all,

     I just spent the day creating a fairly simple script for creating snapshots and transferring them to another location, and thought I'd throw it in here in case someone can use it or improve on it. Note that it's use-at-your-own-risk. It could probably use more sanity checks and certainly more error handling, but it's a good start I think. I'm new to btrfs as well, so I hope I've not missed anything fundamental about how these snapshots work.

     The background is that I wanted something that performs hot backups of my VMs, which live on the cache disk, and then moves the snapshots to the safety of the array. That's more or less what this does, with a few more bells and whistles:

     - It optionally handles retention on both the primary and secondary storage, deleting expired snapshots.
     - Snapshots can be "tagged" with a label, and the purging of expired snapshots only affects snapshots with this tag, so you can have different retention for daily, weekly and so on.
     - The default location for the snapshots created on the primary storage is a ".snapshots" directory alongside the subvolume you are protecting. This can be changed, but no check is currently performed that it's on the same volume as the source subvolume.

     To use it there are some prerequisites:

     - Naturally, both the source and destination volumes must be btrfs.
     - Everything you want to protect must be converted to a btrfs subvolume if it isn't one already.
     - Since there's no way to manage btrfs subvolumes that span multiple disks in unRAID, the source and destination must be specified by disk path (/mnt/cache/..., /mnt/diskN/...).

     Note that this is a very abrupt way to protect VMs, with no VSS integration or other means of flushing the guest OS file system. It's however no worse than what I've been doing at work with NetApp/VMware for years, and I've yet to see a rollback that didn't work out just fine there.

     Below is the usage header quoted, and the actual script is attached.
     Example of usage:

     ./snapback.sh --source /mnt/cache/domains/pengu --destination /mnt/disk6/backup/domains --purge-source 48h --purge-destination 2w --tag daily

     This will create a snapshot of the virtual machine "pengu" under /mnt/cache/domains/.snapshots, named something like pengu.2019-10-27@1828.daily. It will then transfer this snapshot to /mnt/disk6/backup/domains/pengu.2019-10-27@1828.daily. The transfer will be incremental or full depending on whether a symbolic link called "pengu.last" exists in the snapshot directory. This link always points to the latest snapshot created for this subvolume. Any "daily" snapshots on the source will be deleted if they are older than 48 hours, and any older than two weeks will be deleted from the destination.

     # snapback.sh
     #
     # A.Candell 2019
     #
     # Mandatory arguments
     # --source | -s
     #     Subvolume that should be backed up
     #
     # --destination | -d
     #     Where the snapshots should be backed up to.
     #
     # Optional arguments:
     #
     # --snapshot-location | -s
     #     Override primary storage snapshot location. Default is a directory called ".snapshots" that is located beside the source subvolume.
     #
     # --tag | -t
     #     Add a "tag" on the snapshot names (for example for separating daily, weekly).
     #     This string is appended to the end of the snapshot name (after the timestamp), so make it easy to parse and reduce the risk of
     #     mixing it up with the subvolume name.
     #
     # --create-destination | -c
     #     Create destination directory if missing
     #
     # --force-full | -f
     #     Force a full transfer even if a ".last" snapshot is found
     #
     # --purge-source <maxage> | -ps <maxage>
     #     Remove all snapshots older than maxage (see below) from snapshot directory. Only snapshots with the specified tag are affected.
     #
     # --purge-destination <maxage> | -pd <maxage>
     #     Remove all snapshots older than maxage (see below) from destination directory. Only snapshots with the specified tag are affected.
     #
     # --verbose | -v
     #     Verbose mode
     #
     # --whatif | -w
     #     Only echoes commands, not executing them.
     #
     # Age format:
     #     A single letter suffix can be added to the <maxage> arguments to specify the unit used.
     #     NOTE: If no suffix is specified, hours are assumed.
     #     s = seconds (5s = 5 seconds)
     #     m = minutes (5m = 5 minutes)
     #     h = hours (5h = 5 hours)
     #     d = days (5d = 5 days)
     #     w = weeks (5w = 5 weeks)

     snapback.sh
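For anyone wondering how the age format described in the header could be parsed, here is a minimal sketch in POSIX sh. The function name `age_to_seconds` is hypothetical and this is not taken from the attached script, which may well do it differently; it only follows the documented rules (single-letter suffix, hours assumed when there is none).

```sh
#!/bin/sh
# Convert a <maxage> value such as "48h" or "2w" into seconds.
# Hypothetical helper for illustration; not the code from snapback.sh.
age_to_seconds() {
    age=$1
    case $age in
        *[smhdw])
            unit=${age#"${age%?}"}   # last character is the unit
            num=${age%?}             # everything before it is the number
            ;;
        *)
            unit=h                   # no suffix: hours are assumed
            num=$age
            ;;
    esac
    # Reject anything that is not a plain run of digits.
    case $num in
        ''|*[!0-9]*) echo "invalid age: $age" >&2; return 1 ;;
    esac
    case $unit in
        s) mult=1 ;;
        m) mult=60 ;;
        h) mult=3600 ;;
        d) mult=86400 ;;
        w) mult=604800 ;;
    esac
    echo $(( num * mult ))
}

age_to_seconds 48h   # the --purge-source value from the example
age_to_seconds 2w    # the --purge-destination value from the example
```

With the result in seconds, an expiry check becomes a plain integer comparison against the snapshot's age, for instance the difference between `date +%s` and a timestamp derived from the snapshot name.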
  8. Hi, I just finished converting all my drives to btrfs, for no other reason than that I want to use the snapshot feature. Mainly I want to perform hot incremental backups of my (cache-stored) VMs and send the snapshots to a backup directory on the array. So I was a bit disappointed when I saw that I can't use btrfs features on the user shares. Creating a subvolume on a user share just gives the error "ERROR: not a btrfs filesystem". I guess the driver providing the unified namespace doesn't pass these things through. Are there any workarounds for this, other than using a dedicated disk for the backups and accessing it directly? Thanks, Anders
  9. As I wrote, I've already moved the disk to another slot and the issue follows the parity disk, so it shouldn't be cable or power related. On 6.7.3rc4 there's no problem: I've been running unbalance for hours now without a single hiccup. On 6.8 it's fine for a while, then the device resets all the time. It might be related to the kernel and not Unraid per se. Anyway, this isn't directly related to this thread, so I'll continue the discussion elsewhere if needed. Thanks for the heads-up on the f/w. I will try to upgrade now. It's my only controller, so I hope nothing goes wrong.
  10. Here you go. I upgraded to 6.8, and ran some jobs until the error started again. nasse-diagnostics-20191015-0508.zip
  11. I'm not that lucky with 6.8rc. Transfer rates got really bad, and I got *lots* of these on the parity drive whenever I stress the array in any way:

      Oct 14 08:19:32 Nasse kernel: sd 10:0:5:0: attempting task abort! scmd(0000000009d51915)
      Oct 14 08:19:32 Nasse kernel: sd 10:0:5:0: [sdg] tag#2081 CDB: opcode=0x12 12 01 00 00 fe 00
      Oct 14 08:19:32 Nasse kernel: scsi target10:0:5: handle(0x000b), sas_address(0x4433221107000000), phy(7)
      Oct 14 08:19:32 Nasse kernel: scsi target10:0:5: enclosure logical id(0x500605b005524f40), slot(0)
      Oct 14 08:19:32 Nasse kernel: sd 10:0:5:0: task abort: SUCCESS scmd(0000000009d51915)
      Oct 14 08:19:32 Nasse kernel: sd 10:0:5:0: Power-on or device reset occurred

      I moved the parity drive to another slot and the issue followed it, so I was afraid the drive was broken, but after going back to 6.7.3rc4 all is back to normal and transfer speeds are good again.
  12. So, in that >4TB space, what's the parity going to be compared to? There's no other data to perform a checksum with. "Let's see, I take the bit from this disk, and XOR it with..., well, nothing at all, and then see if it's still the same value"? If you do a checksum on the unused space, and for some reason a zero has turned into a one, a parity check won't catch that anyway, because 1 XOR nothing is still 1. Edit: OK, the check probably works this way, and then it'll put zeroes on the extra space: start with zero, XOR in the value from each data disk, then compare/update against the parity disk. Sorry for being stubborn. It's no biggie really, but I still can't grasp the reason and it bugs me.
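The procedure guessed at in the edit above can be sketched in a few lines of shell. This assumes plain single-disk XOR parity exactly as described in the post; the function name `parity_of` is made up for illustration and nothing here is taken from Unraid's actual implementation.

```sh
#!/bin/sh
# Expected parity for one position: start with zero, then XOR in the
# value found at that position on every data disk that reaches it.
# Hypothetical helper, assuming simple XOR parity.
parity_of() {
    p=0
    for v in "$@"; do
        p=$(( p ^ v ))
    done
    echo "$p"
}

# Position covered by three data disks (one bit each):
parity_of 1 0 1

# Position beyond the end of every data disk: no disk contributes,
# so the accumulator never leaves its starting value of zero.
parity_of
```

Under this model a position past the largest data disk still has a well-defined expected parity of zero, so a bit on the parity disk that has flipped to one there would show up as a mismatch.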
  13. Perhaps I'm thinking about this wrong, but the "new" area wouldn't have any parity since it's calculated at writes. At the very least, zeroing the parity disk at installation ought to remove the need to check this unused space all the time.
  14. While my array consists of 4TB disks, I bought an 8TB disk to use as the parity drive, so I won't have to juggle everything around if I buy a larger data disk in the future. When the parity check is running, one would think it would be enough to check just the area up to the size of the largest data drive, but it always does the full 8TB, which of course takes twice the time. Does anyone know if there's a reason for this?
  15. Hit this thread looking for info on the exact same message I got today. For me the shares still seemed to be up, and I could connect via SSH. The web GUI and the hosted VMs were dead though. I haven't run memtest yet, but plan to. Otherwise, I've found out there are some stubborn issues with Ryzen on Linux, which might or might not be fixed by limiting the C-states the CPU is allowed to enter, or by disabling C-states entirely. Hopefully a fresh Linux kernel would help as well, but the outlook for that isn't good since the latest beta is still on the old 4.19 LTS.