uek2wooF Posted February 20, 2020
I keep getting btrfs corruption on the NVMe cache disk after about 3 days of uptime. dmesg shows lots of btrfs errors. I also get 30-50 errors on parity checks (one 4 TB WD Red with XFS for the array and one for parity). Memtest has been running for 18 hours with no errors. Any ideas?
The first time the cache disk crashed I removed it as cache and rsynced it to another SSD I had lying around (a handful of unimportant files wouldn't copy). I reformatted the NVMe as btrfs, rsynced the stuff back over, and added it back as cache.
Ryzen 9 3900X, 64 GB RAM, 1 TB NVMe, 2x 4 TB WD Red
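Editor's sketch: to put a number on "lots of btrfs errors", the kernel log can be tallied per device and severity. This is a minimal sketch, not something posted in the thread. It assumes the usual kernel message prefix `BTRFS <level> (device <name>)`; the names `count_btrfs_problems` and `read_dmesg` are made up for illustration.

```python
import re
import subprocess
from collections import Counter

# Matches kernel log lines such as:
#   BTRFS error (device nvme0n1p1): ...
#   BTRFS warning (device nvme0n1p1: state EA): ...
# The message format is an assumption based on typical kernel output.
BTRFS_LINE = re.compile(r"BTRFS (\w+) \(device ([^):]+)")

# Severities worth counting; "info" lines are routine mount messages.
PROBLEM_LEVELS = {"warning", "error", "critical"}


def count_btrfs_problems(log_text):
    """Return a Counter keyed by (device, level) for problem-level lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = BTRFS_LINE.search(line)
        if match and match.group(1) in PROBLEM_LEVELS:
            level, device = match.group(1), match.group(2)
            counts[(device, level)] += 1
    return counts


def read_dmesg():
    """Capture the kernel log as text; this usually needs root."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `count_btrfs_problems(read_dmesg())` on the server gives the tally. Errors reported against a loop device rather than the NVMe partition would usually point at an image file mounted from the cache (docker.img, for example) instead of the cache filesystem itself.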
Chess Posted February 20, 2020
I had this with my cache, which was 2x 512 GB Toshiba SSDs, on my 3900X when I first set it up and was testing. I had my RAM running at 3600 MHz, and the system was not stable. What do you have your RAM running at? Try backing it off to 2667 and see if things settle down. Also turn off PBO in the BIOS and see if that helps. I was able to get my system stable at 3200 MHz RAM speed, but 3600 just never worked: system lockups, and my cache kept getting corrupted.
Also, update your BIOS if it is not already on the latest. Pull your diags and post them so we can take a look.
uek2wooF (Author) Posted February 20, 2020
I couldn't get my 3600 RAM to run at 3600 at all; I had to drop to 3200. I will drop it more, but shouldn't I be seeing errors from memtest? 21 hours now, no errors. I will upload some configs when I am done memtesting; I'm going to let it go a little longer. I will try to find PBO too before booting back up. Thanks for the reply. This is so frustrating. (BTW, ASRock Taichi X570 is the mobo.)
JorgeB Posted February 20, 2020
3 hours ago, uek2wooF said: 64 gb ram
4 DIMMs @ 3200 is an overclock, see here for more info: https://forums.unraid.net/topic/46802-faq-for-unraid-v6/page/2/?tab=comments#comment-543490
uek2wooF (Author) Posted February 21, 2020
I didn't realize that. I have dropped the RAM speed to 2666, and since then I've gotten a clean parity check for the first time; I even copied 40 GB to the array first to make sure there was some activity. No cache errors yet. I am keeping an eye on it. Thanks!
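Editor's sketch: a test copy like the 40 GB one above can double as a corruption check if the source and the copy are hashed afterwards. This is only an illustration under stated assumptions, not what the poster ran; `sha256_of` and `copy_and_verify` are hypothetical helper names.

```python
import hashlib
import shutil


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks so large files do not fill RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, destination):
    """Copy source to destination, then re-read both and compare hashes."""
    shutil.copyfile(source, destination)
    return sha256_of(source) == sha256_of(destination)
```

One caveat: right after a copy the re-read may be served from the page cache rather than the disk, so a match is weaker evidence than a mismatch. Repeating the comparison after a reboot, or on files much larger than RAM, makes the check more meaningful.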
Chess Posted February 21, 2020
If you don't need 64 GB of RAM you could run at a higher RAM speed with just 2 DIMMs, but at the end of the day the difference in performance is really not that great, especially for a server.
uek2wooF (Author) Posted February 21, 2020
Getting btrfs errors trying to remove a docker container. Could it be that btrfs just sucks? Is ext4 OK for the cache drive?
JonathanM Posted February 21, 2020
Just now, uek2wooF said: Is ext4 ok for the cache drive?
Not an option. Go to Tools > Diagnostics and attach the zip file to your next post if you want assistance.
uek2wooF (Author) Posted February 21, 2020
tower-diagnostics-20200221-1449.zip
BRiT Posted February 21, 2020
Never had any issues with XFS on a single-drive cache pool.
uek2wooF (Author) Posted February 21, 2020
This looks like a corrupt docker image, maybe. Should I just rm docker.img? How do I create a new one?
uek2wooF (Author) Posted February 21, 2020
Created a new docker image and reinstalled some containers. Forgot my settings for the private docker net I had set up, so now everything is broken. Good times.
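Editor's sketch: the lost network settings are the kind of thing that can be written down before the image is deleted, since `docker network inspect <name>` prints them as JSON. A minimal sketch of pulling out the subnet and gateway follows. It assumes the standard inspect fields (`Name`, `Driver`, `IPAM.Config`); the function names and the network name `mynet` in the usage note are hypothetical.

```python
import json
import subprocess


def summarize_networks(inspect_json):
    """Extract name, driver, and subnet/gateway pairs from inspect output."""
    summary = []
    for network in json.loads(inspect_json):
        # IPAM or its Config list can be missing or null on some networks.
        ipam_configs = (network.get("IPAM") or {}).get("Config") or []
        summary.append({
            "name": network.get("Name"),
            "driver": network.get("Driver"),
            "subnets": [
                {"subnet": cfg.get("Subnet"), "gateway": cfg.get("Gateway")}
                for cfg in ipam_configs
            ],
        })
    return summary


def inspect_network(name):
    """Run `docker network inspect` for one network and return raw JSON."""
    result = subprocess.run(
        ["docker", "network", "inspect", name],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Saving the result of `summarize_networks(inspect_network("mynet"))` somewhere outside docker.img keeps enough detail to recreate the network with the same subnet and gateway later.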
uek2wooF (Author) Posted February 22, 2020
Everything seems to be fixed for now. If I have more problems I will try XFS on cache next, I guess.
JorgeB Posted February 22, 2020
That could still be a result of the previous issues. Btrfs gets corrupted quickly with bad RAM, and if it keeps getting corrupted without an apparent reason, that can serve as a good warning that there are still hardware issues.
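Editor's sketch: one way to use btrfs as that early warning is to watch the per-device error counters printed by `btrfs device stats`. The sketch below parses that output and returns only the counters above zero. It assumes the usual `[/dev/...].counter_name  N` line format and that the cache pool is mounted at `/mnt/cache`; both function names are made up for illustration.

```python
import re
import subprocess

# Expected line shape (an assumption based on typical btrfs-progs output):
#   [/dev/nvme0n1p1].corruption_errs  0
STAT_LINE = re.compile(
    r"^\[(?P<device>.+?)\]\.(?P<counter>\w+)\s+(?P<value>\d+)$"
)


def nonzero_counters(stats_text):
    """Return {(device, counter): value} for every counter above zero."""
    found = {}
    for line in stats_text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match and int(match.group("value")) > 0:
            key = (match.group("device"), match.group("counter"))
            found[key] = int(match.group("value"))
    return found


def read_device_stats(mount_point="/mnt/cache"):
    """Run `btrfs device stats` on the (assumed) cache mount point."""
    result = subprocess.run(
        ["btrfs", "device", "stats", mount_point],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

The counters are cumulative, so errors from the earlier RAM problems stay visible until they are reset (`btrfs device stats -z`). After a reset, any counter that climbs again suggests the hardware issue is still there.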