thefarelkid Posted May 6

This started happening almost a year ago, but it hadn't happened for a long while until now. I do get docker.img warnings above 70% when I update my dockers, but that's temporary and goes back to normal once the updates complete. The cache disk utilization at 100% is very confusing, though, as my cache disk usually sits around 60%. Diagnostics attached. Thank you to anyone who can help.

gemini-diagnostics-20240506-0540.zip
JorgeB Posted May 6

The cache floor is set quite high for the system and appdata shares, but the pool still seems to have some free space; where are you seeing that it's full? Also, ZFS is detecting data corruption; post the output of zpool status -v.
thefarelkid Posted May 6 (Author)

Thanks for your help, Jorge. Neither the main pool nor the cache is usually very full: the main pool is around 47% and the cache around 60%. I also just noticed I don't have a scrub schedule for my cache pool. Oops.

root@Gemini:~# zpool status -v
  pool: cache
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
config:

        NAME           STATE     READ WRITE CKSUM
        cache          ONLINE       0     0     0
          mirror-0     ONLINE       0     0     0
            nvme1n1p1  ONLINE       0     0     0
            nvme2n1p1  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        /mnt/cache/system/sanoid/cache_appdata/sanoid.conf
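[Editor's note: the missing scrub schedule mentioned above is normally set per pool in the Unraid web UI; as a sketch only, a root cron entry could also cover it. The pool name and timing below match this thread, but verify the zpool binary path on your system before using it.]

```shell
# Hypothetical root crontab entry: scrub the 'cache' pool at 02:00
# on the 1st of each month. Unraid's own per-pool scrub schedule in
# the web UI is the usual place to configure this instead.
0 2 1 * * /usr/sbin/zpool scrub cache
```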
JorgeB Posted May 6

2 minutes ago, thefarelkid said: "Main Pool is around 47% and cache is 60%."

So where were you seeing this?

4 hours ago, thefarelkid said: "But cache disks utilization at 100%"

2 minutes ago, thefarelkid said: "/mnt/cache/system/sanoid/cache_appdata/sanoid.conf"

This file should be deleted or restored from a backup, and if more corruption is found, I recommend running memtest.
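[Editor's note: since the corrupted file is just a sanoid config that can be recreated, one recovery path, sketched here using the path and pool name from the zpool output above, is to remove the file, clear the pool's error counters, and scrub so ZFS re-verifies everything. Verify the path before running.]

```shell
# Remove (or restore from backup) the file zpool flagged as permanently corrupt
rm /mnt/cache/system/sanoid/cache_appdata/sanoid.conf

# Clear the recorded error counters on the pool
zpool clear cache

# Scrub so ZFS re-reads every block and drops the stale error entry
zpool scrub cache

# Check scrub progress and confirm the error list is empty afterwards
zpool status -v cache
```

If the scrub reports new checksum errors afterwards, that supports JorgeB's suggestion to run memtest.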
thefarelkid Posted May 6 (Author)

That's what I see for disk utilization normally. But overnight I'll get notifications that read:

Unraid Cache disk disk utilization
Alert [GEMINI] - Cache disk is low on space (100%)
Description: PCIe_SSD_21010751200007 (nvme1n1)
Priority: alert

I got that one at 1:11am, and it was the last one. The first arrived at 12:41am at 71%, and they climbed from there. I will attempt to restore the sanoid.conf file and run a memtest to be sure. I'm trying to research what the share cache floor means, but not getting far.
JorgeB Posted May 6

17 minutes ago, thefarelkid said: "But overnight I'll get notifications that read: Unraid Cache disk disk utilization Alert [GEMINI] - Cache disk is low on space (100%)"

That suggests it really is getting full, and then the mover probably moves some data off. You'll want to track that down, because the cache hitting 100% full can cause other issues.
thefarelkid Posted May 6 (Author)

That was my thought, and you're right about other issues too. This morning the cache pool had returned to normal, but I couldn't get the docker service to start without a full restart. Maybe I could have if I knew how to do it from the command line. Back when this was happening more regularly, the GUI would crash as well. The only thing I can think of that would write that much to the cache is something like SABnzbd, but it hadn't had any activity since 11 last night. Could it be a backup of my HomeAssistant VM? No; those backups are only ~400MB and don't start until 2:00am. I'm stumped for now. Is there a way to log the disk utilization by service?
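[Editor's note: there's no built-in per-service usage log, but a small cron-able helper, sketched below with a hypothetical function name, can snapshot per-share usage every few minutes so an overnight spike can be traced back to the directory that grew.]

```shell
#!/bin/sh
# log_usage: append a timestamped usage snapshot (MB per top-level
# directory) to a log file. Run it from cron every few minutes and
# diff the entries around the time of the 100% alert.
log_usage() {
  target_dir="$1"   # e.g. /mnt/cache
  log_file="$2"     # e.g. /var/log/cache-usage.log
  {
    date '+%Y-%m-%d %H:%M:%S'
    du -sm "$target_dir"/*/ 2>/dev/null   # size in MB for each share
  } >> "$log_file"
}
```

On a ZFS cache specifically, zpool list and zfs list give a similar per-dataset view with less I/O than du.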
thefarelkid Posted May 7 (Author, marked as Solution)

I think I solved it. I remembered that I have Veeam running on a Windows PC that backs up every night at 12:30, and last night's job was apparently quite large. The target for that backup is a share that uses the cache for speed, with the mover moving the data off afterwards. But since it's an overnight backup anyway, I'll move the target to a share that doesn't touch the cache at all. Thanks for helping me discover my other issues, though; I definitely need to sort out my ZFS problems.