BTRFS Filesystem errors



I've had quite a few issues with my Unraid server over the last few weeks. First, drive 1 pooped itself and became read-only. I managed to get everything off it and onto other disks before removing the drive and resetting the pool without it.

 

After this was completed, I ran a scrub on all the disks (15/16 disks now) and found 5 files that were not correctable. I have since replaced/removed the files in favour of good copies. Hopefully this is fixed.

 

I felt it was prudent to run a filesystem check on all the drives today, since I've had so many problems, and now I'm getting a few errors on 3 of the 15 disks. What would be the best way to fix this? Should I do the same as I did with drive 1 (pull the data off, place it onto the good disks, and reformat the disk)? Or can I simply pull the disk, format it, and reinstall and let it rebuild from parity?

 

 

Part 2 of this issue: after reading through the log, it appears the errors are isolated to my first 8 drives, which are all on one controller. Could the HBA be the cause of all these issues lately? Would it be a good idea to get it replaced?

Edited by Brandon87
30 minutes ago, Brandon87 said:

Or can I simply pull the disk, format it, and reinstall and let it rebuild from parity?

Parity doesn't contain any files. It contains an image of the drive, filesystem included. Rebuilding will recreate file corruption exactly as it is right now. If you format the emulated drive, parity will be updated to reflect that, and rebuilding will put that empty filesystem back on the drive.
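To make the point above concrete, here is a minimal sketch of single-parity reconstruction. It assumes parity is a plain byte-wise XOR across the data disks, and it uses tiny byte strings as stand-ins for whole drives; the function names are made up for illustration and nothing here is Unraid's actual code.

```python
from functools import reduce

def xor_parity(disks):
    """Byte-wise XOR across equally sized disk images (single-parity sketch)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))

def rebuild(missing_index, disks, parity):
    """Reconstruct one disk from parity plus every other disk."""
    others = [d for i, d in enumerate(disks) if i != missing_index]
    return xor_parity(others + [parity])

# Three toy "disks" of equal size; disk 1 already holds a corrupt file.
disks = [b"good-data-A1", b"c0rrupt-f!le", b"good-data-C3"]
parity = xor_parity(disks)

# Rebuilding disk 1 returns the same bytes, corruption included.
rebuilt = rebuild(1, disks, parity)
assert rebuilt == disks[1]

# Formatting the emulated disk updates parity, so a rebuild returns the
# freshly formatted (here: all-zero) contents, not the old files.
formatted = bytes(len(disks[1]))
disks_after = [disks[0], formatted, disks[2]]
parity_after = xor_parity(disks_after)
assert rebuild(1, disks_after, parity_after) == formatted
```

Parity only knows bytes, not files, so it has no way to tell a good file from a corrupt one.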

 

If you want analysis specific to your hardware and current situation, you will need to attach the complete, unmodified diagnostics zip file to your next post.

checksum error at logical 801567584256 on dev /dev/mapper/md8

These are checksum errors, i.e., data corruption. The SAS2LP is known to corrupt data in some cases (causing persistent sync errors, usually 5); other than that, bad RAM is also a likely culprit. Either way, you have an underlying hardware problem. I would suggest replacing the controllers with LSI HBAs in any case, since the SAS2LP is not recommended, and running memtest.
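To see whether the errors cluster on particular devices (for example, the disks behind one controller), you could tally them per device from the syslog. The sketch below is only an illustration: the regular expression is modelled on the single log line quoted above, other kernel versions may word the message differently, and every sample line except the first is invented.

```python
import re
from collections import Counter

# Pattern modelled on the one log line quoted in this thread (assumption:
# other kernel versions may format the message differently).
CHECKSUM_RE = re.compile(r"checksum error at logical (\d+) on dev ([^\s,]+)")

def count_checksum_errors(log_lines):
    """Return a Counter mapping device path -> number of checksum errors."""
    counts = Counter()
    for line in log_lines:
        match = CHECKSUM_RE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts

# The first line is from the thread; the rest are hypothetical examples.
sample = [
    "checksum error at logical 801567584256 on dev /dev/mapper/md8",
    "checksum error at logical 123456789012 on dev /dev/mapper/md8, extra info",
    "checksum error at logical 210987654321 on dev /dev/mapper/md3",
    "some unrelated log line",
]

for device, errors in sorted(count_checksum_errors(sample).items()):
    print(f"{device}: {errors}")
```

If the affected devices all sit on the same controller, that supports the controller theory, but errors spread across every disk would point more towards RAM.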

16 minutes ago, Brandon87 said:

I’ve tried to do the memtest but it seems every time I try to select it, the whole system restarts and just goes back to the boot screen. Ideas?

If the built-in memtest won't run, I'd advise trying to get another version of memtest running. The version in Unraid is an older open-source release; there is a newer program available, but Unraid can't package it with the OS. You'll need to prepare a separate USB stick for it.

https://www.memtest86.com/

1 hour ago, Brandon87 said:

I’ve tried to do the memtest but it seems every time I try to select it, the whole system restarts and just goes back to the boot screen. Ideas?

You may have to temporarily switch your BIOS to boot the flash drive in Legacy (non-EFI) mode.
