December 15, 2025

Hoping somebody smarter than me can help me out. I am usually able to do my own research and find my answer around the web, but I'm afraid I might be in over my head with ZFS.

I have two SSDs set up as a mirrored ZFS pool; the drives show up as 'Cache' and 'Cache 2'. I noticed yesterday that I was having lots of random application errors and started digging into what was going on. When I finally checked the "Pool Status", I had ~140 errors. I did some research and determined that a scrub was the best path forward. Ran it, still had errors. Ran it again, and all errors seemed to be resolved. Crossed my fingers and hoped that was the end of it. I also ran extended SMART tests on both drives; both came back with no errors.

Today when I woke up I again had over 100 data errors on the ZFS pool. I ran a scrub, but still see 3 data errors.

I see the following under ZFS pool status:

  pool: cache
 state: ONLINE
status: One or more devices has experienced an error resulting in data corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 24K in 00:13:07 with 1 errors on Mon Dec 15 09:07:47 2025
config:

    NAME           STATE     READ WRITE CKSUM
    cache          ONLINE       0     0     0
      mirror-0     ONLINE       0     0     0
        /dev/sdb1  ONLINE       0     0 19.6K
        /dev/sdd1  ONLINE       0     0 19.6K

errors: 3 data errors, use '-v' for a list

My question is: how do I recover from whatever is going on? What steps should I be taking?

tower-smart-20251215-0848.zip
December 15, 2025 Author

Apologies, grabbed the wrong file before.

tower-diagnostics-20251215-0950.zip
December 15, 2025

There are also filesystem issues with disk3:

Dec 15 07:00:54 Tower kernel: XFS (md3p1): Metadata corruption detected at xfs_dinode_verify+0x398/0x6f0, inode 0x180000081 dinode
Dec 15 07:00:54 Tower kernel: XFS (md3p1): Unmount and run xfs_repair

My recommendation is still to run memtest first.
December 15, 2025 Author

Unfortunately I think you may be correct. I found the GUI-based memory test in the App Store, ran it, and was instantly flooded with errors. I removed one stick of memory, booted back up, and tested again, and everything looks to be operating normally now. Thanks for your help. (I also fixed the filesystem errors on disk 3; thanks for that catch as well.)
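
For anyone landing here with a similar pool status: both mirror members in the first post show the same CKSUM count (19.6K each) while SMART is clean, and in this thread the cause turned out to be bad RAM rather than the SSDs. Below is a rough Python sketch of that check, not an official tool. It assumes leaf devices are listed with a `/dev/` prefix as in the output above, and it treats the `K`/`M` suffixes as multiples of 1024, which is an assumption (the exact value does not matter for the comparison). The function names are made up for illustration.

```python
# Approximate multipliers for abbreviated zpool counters (assumed 1024-based).
_SUFFIXES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def parse_count(token):
    """Convert a counter such as '0' or '19.6K' to an approximate integer."""
    token = token.strip()
    suffix = token[-1].upper() if token else ""
    if suffix in _SUFFIXES:
        return int(round(float(token[:-1]) * _SUFFIXES[suffix]))
    return int(token)


def cksum_by_device(status_text):
    """Return {device: cksum_count} for rows whose name starts with /dev/."""
    counts = {}
    for line in status_text.splitlines():
        parts = line.split()
        # Expected columns: NAME STATE READ WRITE CKSUM
        if len(parts) == 5 and parts[0].startswith("/dev/"):
            counts[parts[0]] = parse_count(parts[4])
    return counts


def shared_cause_suspected(counts):
    """True when two or more devices report the same nonzero CKSUM count.

    This is only a hint that something common to all devices (RAM,
    controller, cabling, power) may be at fault; it is not a diagnosis.
    """
    values = list(counts.values())
    return len(values) >= 2 and values[0] > 0 and len(set(values)) == 1


SAMPLE = """\
    NAME           STATE     READ WRITE CKSUM
    cache          ONLINE       0     0     0
      mirror-0     ONLINE       0     0     0
        /dev/sdb1  ONLINE       0     0 19.6K
        /dev/sdd1  ONLINE       0     0 19.6K
"""

if __name__ == "__main__":
    counts = cksum_by_device(SAMPLE)
    if shared_cause_suspected(counts):
        print("Same checksum error count on every device: test the RAM first.")
```

If the check returns true, running a memory test before replacing drives or restoring from backup is the cheaper first step, which is what was recommended and confirmed in this thread.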