
BTRFS cache pool corrupt errors



Hey, I had to force a shutdown yesterday because stopping the array was stuck at the "Syncing Filesystems..." message. Before doing so, I double-checked with `lsof /mnt` that no files were being accessed. After turning the system back on, I started receiving cache corrupt and checksum invalid errors. I ran a scrub twice to check for errors, but none were reported, yet I'm still receiving the corrupt error messages. I'm attaching the diagnostics below; any help is greatly appreciated. Happy to provide any further information needed. Thanks!

titan-diagnostics-20210726-2311.zip

23 hours ago, JorgeB said:

That's a strange one; if there's corruption, it should be found by scrubbing. I would suggest backing up, re-formatting the pool and restoring the data.

Just went through the above steps, and it looks like my Cache 0 is still having corruption issues. I'm guessing I need to replace my cache drive and there's nothing else that can be done? Scrub still doesn't report any errors either. Please find the latest diagnostics below. Appreciate your help, thanks!

titan-diagnostics-20210727-2331.zip

10 minutes ago, JorgeB said:

What would make more sense is that this issue has nothing to do with the previous shutdown and is instead a hardware problem, like bad RAM. I just noticed your RAM is overclocked, which is known to corrupt data with some Ryzen servers. Set it to the max supported speed and try again; if there are still errors after that, run memtest.

 

Yeah, I turned off the XMP profile on the motherboard this morning, so the RAM is now running at 2133 MT/s, and I've been monitoring the cache for corrupt errors. I have received a couple of errors so far. I will try to run memtest; I couldn't do it this morning because the system kept restarting when I selected the memtest option in UEFI mode, and after switching to CSM it couldn't even boot from the USB device.


Thanks @JorgeB and @itimpi, I appreciate your help in this thread. It looks like one of my 4 RAM sticks was bad, so I've removed it and am running the system on the remaining 3; memtest reported no errors in this setup. Hopefully the data corruption errors won't occur anymore. It would have been very helpful if I had come across the user script for monitoring the cache pool for errors when I was getting started with Unraid. It isn't really advertised or well known unless you look for the specific error posts in the forum.
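
For anyone landing here later, the kind of check a monitoring script performs can be sketched roughly as below. This is not the user script mentioned above, only a minimal illustration in Python: it parses the per-device counters printed by `btrfs device stats` and reports any that are non-zero. The function names are made up for this example, and the mount point `/mnt/cache` is an assumption (a pool named "cache"); adjust it to your own pool.

```python
import re
import subprocess

# Matches counter lines such as "[/dev/sdb1].corruption_errs  3"
STAT_LINE = re.compile(r"^\[(?P<device>[^\]]+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)$")


def parse_device_stats(text):
    """Turn `btrfs device stats` output into {(device, counter): value}."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match:
            stats[(match["device"], match["counter"])] = int(match["value"])
    return stats


def nonzero_counters(stats):
    """Keep only the counters that have recorded at least one error."""
    return {key: value for key, value in stats.items() if value != 0}


def read_device_stats(mount_point="/mnt/cache"):
    """Run `btrfs device stats` for the pool (mount point is an assumption)."""
    result = subprocess.run(
        ["btrfs", "device", "stats", mount_point],
        capture_output=True,
        text=True,
        check=True,  # raise if the command fails, e.g. wrong mount point
    )
    return result.stdout
```

On the server itself, something like `nonzero_counters(parse_device_stats(read_device_stats()))` returns an empty dict while all counters are zero. A scheduled job could send a notification whenever it is not empty, which would have flagged the errors in this thread early on.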

