FatherChon

Members · 6 posts


  1. Thanks guys, that makes sense. I'm still a bit confused as to why the file showed up in lsof but didn't exist when I actually ls'd it, but I'll chalk that up to weirdness, since that was the first VM I created and I changed the cache type after it was created. I've nuked the VM anyway, because its data was pulled from another server. I should have checked it for corruption, but didn't think of that until now. I also had the mover scheduled, so maybe that helped cause the issue: the file was moved at the filesystem level, but the system was still writing to wherever the old file handle pointed. So, in case anyone else runs into this issue, here's what happened:
     - Fresh install of Unraid
     - Added all disks and built the array; set up btrfs, encryption, and a mover schedule
     - Created the first VM with the defaults ("domains" share with cache-prefer)
     - Noticed it was on cache and would exceed the cache drive's size
     - Changed domains to cache-no and changed the disks that domains lives on
     - Changed the vdisk to a different /mnt/disk# in the VM config
     - Started up the VM; vdisk1 wrote to cache and filled up the filesystem
     - (I'm not 100% sure this is how it happened; everything after this point is an assumption)
     - The mover probably ran and started moving vdisk1.img
     - The file pointer moved off cache to disk1, but all the space was still utilized on cache?
     - The VM was still running, writing to the cache disk without a visible file
     - A reboot cleared up the space?
     - The VM started back up after the reboot, writes happened during the evening, and the same thing happened again
     Still gonna give it a bit more time before I purchase, to make sure all of my VMs work properly, because I need Kubernetes in my lab. I'd like to be sure this is fully solved before I plunk down some cash.
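The "moved at the filesystem level but still being written to" theory is consistent with basic POSIX unlink semantics, and it can be demonstrated in a few lines of shell (hypothetical /tmp path, nothing Unraid-specific): space held by an open-but-unlinked file stays allocated, and writes keep succeeding, until the last descriptor closes.

```shell
# Sketch of the "visible in lsof but not in ls" symptom above.
exec 3> /tmp/vdisk_demo.img        # a process opens the image for writing
rm /tmp/vdisk_demo.img             # the "mover" removes it from this path
echo "guest write" >&3             # the write still lands via the open fd
ls /tmp/vdisk_demo.img 2>/dev/null || echo "not found by ls"
# lsof would still list the file here, marked (deleted)
exec 3>&-                          # only now is the space actually freed
```

This prints `not found by ls` even though the write succeeded, and df would keep counting the space until the descriptor closes, which matches the observation that a reboot (killing the writer) is what cleared it up.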
  2. So it happened again last night while I was sleeping. It was a gradual increase, around 500 MB/minute. The root cause was a Kubernetes job that was writing to disk:
     - du still shows 62G
     - df showed 100% full
     I ran lsof and looked through all of the open files, and found that the vdisk1.img I created for a Kubernetes VM (which I set to use the 12TB drive, so it's supposed to live on disk1) was also on cache. This is a disk for some machine-learning datasets I have. It looks like this vdisk was also living on /mnt/cache behind the scenes, yet if I do an ls /mnt/cache/domains/kubernetes01/vdisk1.img, no file is found.
     shfs 17140 root 150r REG 0,33 2199023255552 1346821 /mnt/disk1/domains/kubernetes01/vdisk1.img
     shfs 17140 root 152w REG 0,45 131730452480 829434 /mnt/cache/domains/kubernetes01/vdisk1.img
     shfs 17140 17141 shfs root 150r REG 0,33 2199023255552 1346821 /mnt/disk1/domains/kubernetes01/vdisk1.img
     shfs 17140 17141 shfs root 152w REG 0,45 131730452480 829434 /mnt/cache/domains/kubernetes01/vdisk1.img
     I'm assuming that shfs was caching writes (reads too?) for this img, but this seems to be the only file that was getting cached. The rest of the imgs for my VMs are not on cache and don't show up like this. This was also the first VM that I created, and I had to manually change the domains share to cache-no and set the included disks to disk 1/2. Is there a caching mechanism for VMs that are running on other disks? I'm not seeing any options in the VM settings to enable or disable this.
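If lsof isn't handy, the same double-open symptom can be hunted down straight from /proc. This is a hypothetical helper (Linux-specific; the function name is made up) that reports descriptors still pointing at deleted files under a given mount, roughly what `lsof /mnt/cache | grep deleted` would show:

```shell
# Walk every process's fd table and report links whose target was a file
# under $mnt that has since been unlinked; the kernel appends " (deleted)"
# to the readlink target for such descriptors.
find_deleted_open() {
  mnt=$1
  for fd in /proc/[0-9]*/fd/*; do
    tgt=$(readlink "$fd" 2>/dev/null) || continue
    case $tgt in
      "$mnt"/*" (deleted)") echo "$fd: $tgt" ;;
    esac
  done
}
# e.g. find_deleted_open /mnt/cache
```

Each hit names the process (via the /proc/PID path) holding the space, so you can decide whether to restart it or wait for it to finish.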
  3. Trim plugin enabled. The cache SSDs are actually connected to the onboard SATA, which is an Intel controller, but the data drives are all on the LSI. I ran out of slots on the 2208, so the cache drives are in the chassis. Now that I've found some of the btrfs options, I may give a defrag and then a sync a shot if I see the same behaviour. So far I have not seen it in the data pools; VM storage has been fine and all the other drives match up nicely.
     root@unraid:~# btrfs filesystem df /mnt/cache
     Data, RAID1: total=69.00GiB, used=63.81GiB
     System, RAID1: total=32.00MiB, used=16.00KiB
     Metadata, RAID1: total=2.00GiB, used=65.20MiB
     GlobalReserve, single: total=22.61MiB, used=0.00B
     root@unraid:~# df -h /mnt/cache
     Filesystem        Size  Used Avail Use% Mounted on
     /dev/mapper/sdc1  280G   64G  214G  24% /mnt/cache
     root@unraid:~# du -sh /mnt/cache
     62G  /mnt/cache
     root@unraid:~# btrfs filesystem du -s /mnt/cache
     Total     Exclusive  Set shared  Filename
     61.76GiB  61.76GiB   16.00KiB    /mnt/cache
     root@unraid:~# btrfs filesystem usage /mnt/cache
     Overall:
       Device size:         558.92GiB
       Device allocated:    142.06GiB
       Device unallocated:  416.86GiB
       Device missing:        0.00B
       Used:                127.75GiB
       Free (estimated):    213.62GiB  (min: 213.62GiB)
       Data ratio:               2.00
       Metadata ratio:           2.00
       Global reserve:       22.62MiB  (used: 0.00B)
     Data,RAID1: Size:69.00GiB, Used:63.81GiB
       /dev/mapper/sdb1  69.00GiB
       /dev/mapper/sdc1  69.00GiB
     Metadata,RAID1: Size:2.00GiB, Used:65.23MiB
       /dev/mapper/sdb1   2.00GiB
       /dev/mapper/sdc1   2.00GiB
     System,RAID1: Size:32.00MiB, Used:16.00KiB
       /dev/mapper/sdb1  32.00MiB
       /dev/mapper/sdc1  32.00MiB
     Unallocated:
       /dev/mapper/sdb1  208.43GiB
       /dev/mapper/sdc1  208.43GiB
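For what it's worth, the numbers in that usage report are internally consistent once the RAID1 data ratio of 2 is applied (every byte is stored twice, so 127.75GiB raw "Used" is roughly 2x the ~64GiB of data plus metadata). The "Free (estimated)" line can be reproduced by hand from the other figures:

```shell
# Reproduce btrfs's "Free (estimated)" from the usage report above:
# unallocated raw space divided by the data ratio (RAID1 keeps two copies),
# plus the slack inside the already-allocated data chunks. Values in GiB.
awk 'BEGIN {
  unallocated = 416.86   # Device unallocated
  ratio       = 2.0      # Data ratio (RAID1)
  data_total  = 69.00    # Data chunk size
  data_used   = 63.81    # Data actually used
  printf "%.2f\n", unallocated / ratio + (data_total - data_used)
}'
# prints 213.62, matching the "Free (estimated)" line
```

So at this point df and btrfs agree with each other; the earlier 100%-full state was something else entirely.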
  4. My VMs' disks are running on the 1TB SSD that is in the pool, not on the cache drive. I'm not sure if KVM creates additional files on the cache drive, but the VMs are using raw images as well. All of my docker containers are running on the cache, though. I do have CoW enabled for the shares. Right now there is a 2G difference between what df and du show; I'm keeping an eye on it, so we'll see if this changes. I've got 9 days left on my trial, and other than this issue I'm impressed, but when it filled up the cache drive it caused most of my docker containers to crash. I'll see if I can get an extension, but if I can't solve this then I'll unfortunately be moving back to FreeNAS. So far loving Unraid though, other than this.
  5. I had tried that with no luck. Rebooted the host and the cache dropped back down to the expected size. I'm gonna start graphing this on a separate monitoring system to see if it's gradual or if it just suddenly hits 100%.
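A minimal sketch of that graphing idea (the function name, paths, and interval are placeholders): append a timestamped df-vs-du sample for the mount, so a gradual leak shows up as a steadily widening gap between the two numbers.

```shell
# Log one "df vs du" sample line for a mount (GNU coreutils assumed).
log_usage() {
  m=$1
  printf '%s df=%s du=%s\n' "$(date -u +%FT%TZ)" \
    "$(df -B1 --output=used "$m" | tail -n 1 | tr -d ' ')" \
    "$(du -sB1 "$m" 2>/dev/null | cut -f1)"
}
# e.g. from cron: */5 * * * * log_usage /mnt/cache >> /var/log/cache-usage.log
```

Plotting the two columns against each other distinguishes "something is slowly writing real files" (both rise together) from "space is leaking with no files to show for it" (df rises, du doesn't).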
  6. So I've got a fresh Unraid 6.5.3 build on the trial. I've got two Intel DC S3500 300GB SSDs for cache in btrfs RAID1 that appear to fill up completely with no explanation. All of my shares are set to cache=no, and the drives are LUKS encrypted with btrfs. I've got 2 Ubuntu VMs running and about 12 docker containers. When I run a du to show what is actually on the system, only 53GB is used:
     root@unraid:/mnt/cache# du -sh /mnt/cache
     53G  /mnt/cache
     root@unraid:/mnt/cache# du -sh /mnt/cache/*
     2.8G  /mnt/cache/appdata
     19G   /mnt/cache/domains
     31G   /mnt/cache/system
     When I run a df, it shows 100%:
     root@unraid:/mnt/cache# df -h
     Filesystem        Size  Used Avail Use% Mounted on
     rootfs             63G  758M   63G   2% /
     tmpfs              32M  596K   32M   2% /run
     devtmpfs           63G     0   63G   0% /dev
     tmpfs              63G     0   63G   0% /dev/shm
     cgroup_root       8.0M     0  8.0M   0% /sys/fs/cgroup
     tmpfs             128M  1.5M  127M   2% /var/log
     /dev/sda1          15G  215M   15G   2% /boot
     /dev/loop0        7.5M  7.5M     0 100% /lib/modules
     /dev/loop1        4.5M  4.5M     0 100% /lib/firmware
     /dev/mapper/md1    11T  9.6T  1.5T  88% /mnt/disk1
     /dev/mapper/md2    11T  9.6T  1.4T  88% /mnt/disk2
     /dev/mapper/md3   7.3T  4.6T  2.8T  63% /mnt/disk3
     /dev/mapper/md4   7.3T  4.6T  2.8T  63% /mnt/disk4
     /dev/mapper/sdc1  280G  278G  128K 100% /mnt/cache
     shfs               37T   29T  8.2T  78% /mnt/user0
     shfs               37T   29T  8.2T  78% /mnt/user
     I tried running a btrfs scrub and looked in each directory for any massive files, but the amount of space used doesn't match up. Uptime is ~2 days. Any ideas? I've manually run the mover and it doesn't make a difference.
     Build:
     - Intel S2600GZ mobo
     - 128GB DDR3 ECC RAM
     - Intel E5-2670
     - LSI 2208 in IT mode to connect all drives
     Data drives:
     - 4x 12TB Seagate IronWolf Pro
     - 2x 8TB WD Reds
     - 2x 5TB Toshiba SAS Enterprise
     - 1x Crucial MX500 1TB for VMs
     Cache:
     - 2x Intel DC S3500 300GB SSDs
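One quick way to put a number on the du-vs-df mismatch described above (the function name and mount argument are illustrative): compute the gap directly. A large, growing delta with no files to account for it usually points at open-but-deleted files still held by a process, or at btrfs chunk/metadata overhead rather than actual data.

```shell
# Report the gap between what df accounts for and what du can find
# under a mount (GNU coreutils assumed; -x keeps du on one filesystem).
dudf_delta() {
  m=$1
  used_df=$(df -B1 --output=used "$m" | tail -n 1 | tr -d ' ')
  used_du=$(du -sxB1 "$m" 2>/dev/null | cut -f1)
  echo "df=$((used_df / 1048576))MiB du=$((used_du / 1048576))MiB delta=$(( (used_df - used_du) / 1048576 ))MiB"
}
# e.g. dudf_delta /mnt/cache
```

Run before and after stopping the VMs and docker: if the delta collapses when a particular service stops, that service is the one holding the phantom space.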