October 26, 2021 I recently started noticing my cache filling up and the used space not going back down. I have the mover set to run every day at 3am, yet it seems like some data is staying on my cache pool. I enabled logging on the mover, and when I ran it I saw this pop up:

Oct 25 23:25:19 Installation03 kernel: BTRFS warning (device sdb1): csum failed root 5 ino 21216272 off 4594712576 csum 0xad6138ce expected csum 0xb6089ce0 mirror 2

I'm assuming data got corrupted when I was copying it across the network, but I'm not sure if my cache SSD is going bad (it's less than 8 months old) or if it's memory related. Any help would be greatly appreciated.
October 26, 2021 Author Sorry. I was posting from mobile last night and my phone hates downloading zip files. I have attached the diags below. installation03-diagnostics-20211026-0708.zip
October 26, 2021 Community Expert Btrfs is detecting data corruption; you should start by running memtest.
October 26, 2021 Author I'm trying to launch memtest from my unRAID USB, but when I select memtest, it says ok, then reboots and goes back to the unRAID bootloader.
October 26, 2021 Community Expert Memtest only works with CSM/legacy boot; it won't work with UEFI boot.
October 26, 2021 Author Never mind. I had to download a UEFI version and boot that. How long should I let it run for? I'm doing the default 4 passes right now.
October 26, 2021 Community Expert Ideally 24 hours, but if there's a findable problem it will usually take a couple of hours at most.
October 26, 2021 Author So, I'm highly confident that it's going to be memory related. Pass 1 on memtest already had 29 errors before test 9 was complete. Unfortunately, my build doesn't support ECC memory, and I'm not really in a position to buy an all-new mobo/CPU/memory. Does anyone have any recommendations for DDR4-2133 memory for my build? In the meantime, I think I'm going to test each stick individually to see whether it's just one stick that's bad.
October 26, 2021 Community Expert 23 minutes ago, HemiStormtrooper said: "I'm highly confident that it's going to be memory related. Pass 1 on memtest already has 29 errors before test 9 was complete."

Definitely the case, since anything other than 0 errors is too many.
October 26, 2021 Author So, since these corrupt files are on my cache, how do I go about removing them once memtest is finished running?
October 26, 2021 Community Expert Run a scrub; any corrupt files will be listed in the syslog, and you can delete those. Note, though, that with bad RAM some files might have been corrupted in memory before they were written, in which case the checksum was calculated from the already-bad data and will match it, so there could still be undetected corruption. The same goes for any XFS data written to the array while the RAM was failing.
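
For anyone following along, here is a rough sketch of how the checksum warnings could be pulled out of the syslog after a scrub, so the affected inodes can be listed in one place. It only matches the "csum failed" line format quoted in the first post; a scrub may log its errors in a different format, so treat the pattern as an assumption and adjust it to whatever your syslog actually shows. The function name `corrupt_inodes` is made up for this example.

```python
import re
from collections import defaultdict

# Matches kernel lines shaped like the one quoted in the first post:
#   BTRFS warning (device sdb1): csum failed root 5 ino 21216272 off ...
# Assumption: other btrfs messages (e.g. scrub reports) may use a different
# layout and will simply not match this pattern.
CSUM_FAILED = re.compile(
    r"BTRFS warning \(device (?P<device>[^)]+)\): csum failed "
    r"root (?P<root>\d+) ino (?P<ino>\d+) off (?P<off>\d+)"
)


def corrupt_inodes(lines):
    """Count csum failures per (device, root, inode) in the given log lines."""
    hits = defaultdict(int)
    for line in lines:
        match = CSUM_FAILED.search(line)
        if match:
            key = (match["device"], int(match["root"]), int(match["ino"]))
            hits[key] += 1
    return dict(hits)


if __name__ == "__main__":
    # The log line from the first post, used here as sample input.
    sample = [
        "Oct 25 23:25:19 Installation03 kernel: BTRFS warning (device sdb1): "
        "csum failed root 5 ino 21216272 off 4594712576 csum 0xad6138ce "
        "expected csum 0xb6089ce0 mirror 2",
    ]
    for (device, root, ino), count in corrupt_inodes(sample).items():
        print(f"device={device} root={root} inode={ino} failures={count}")
```

To run it against a real log, pass an open file instead of the sample list, for example `corrupt_inodes(open("/var/log/syslog"))` (the log path is an assumption about your system). An inode number can then usually be mapped back to a file name with `btrfs inspect-internal inode-resolve <inode> <mountpoint>`, where the mount point of the cache pool depends on your setup.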