Ishtar Posted December 14, 2023

Hey guys, I noticed that my services went down today, and when I reviewed the syslog here's what it says:

Dec 14 07:59:39 IshtarCommander kernel: BTRFS error (device nvme0n1p1): block=198824738816 write time tree block corruption detected
Dec 14 07:59:39 IshtarCommander kernel: BTRFS: error (device nvme0n1p1) in btrfs_commit_transaction:2494: errno=-5 IO failure (Error while writing out transaction)
Dec 14 07:59:39 IshtarCommander kernel: BTRFS info (device nvme0n1p1: state E): forced readonly

Diagnostics are also attached. I've seen the same sort of error from other users, but the solution seems to be handled case by case, so I haven't made any changes or taken any action yet. Coincidentally, there was a power outage after the event, but I don't think it was the culprit at all; I have a UPS running and the NUT plugin configured. Could someone point me in the right direction to repair the corrupted pool?

As for backups, unfortunately my appdata is backed up by Duplicacy to the same share (silly me), but fixing that is going to be my next project.

ishtarcommander-diagnostics-20231214-2217.zip
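For reference, this is roughly how I filtered the relevant lines out of a saved copy of the syslog. The /tmp path and the grep pattern are just my own quick approach, not anything official:

```shell
# Sample syslog excerpt (same lines as in the post; the file path is illustrative)
cat <<'EOF' > /tmp/syslog.sample
Dec 14 07:59:39 IshtarCommander kernel: BTRFS error (device nvme0n1p1): block=198824738816 write time tree block corruption detected
Dec 14 07:59:39 IshtarCommander kernel: BTRFS: error (device nvme0n1p1) in btrfs_commit_transaction:2494: errno=-5 IO failure (Error while writing out transaction)
Dec 14 07:59:39 IshtarCommander kernel: BTRFS info (device nvme0n1p1: state E): forced readonly
EOF

# Keep only error/warning lines; the "forced readonly" line is info-level,
# so it gets filtered out
grep -E 'BTRFS.*(error|warning)' /tmp/syslog.sample
```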
JorgeB Posted December 14, 2023

This error usually means a hardware issue, most often bad RAM, but with the current kernel there have been some possible false positives, so I would recommend running memtest, and if nothing is found, trying zfs instead.
Ishtar Posted December 14, 2023 (edited)

22 minutes ago, JorgeB said:
This error usually means a hardware issue, most often bad RAM, but with the current kernel there have been some possible false positives, so I would recommend running memtest and if nothing is found try zfs instead.

I've run memtest on all sticks and didn't see any issues with the RAM.

Re: trying ZFS, I can possibly do that, but wouldn't that wipe the data in the pool?

Edited December 14, 2023 by Ishtar
JorgeB Posted December 14, 2023

You'd need to back up the data, format the pool as zfs, and restore the data.
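In sketch form, the backup step would just be a recursive copy of the pool's contents to somewhere on the array. The paths below are /tmp stand-ins so the commands are safe to run anywhere; on the real system the source would be the pool mountpoint (e.g. /mnt/cache, as an example) and the destination an array share. `rsync -a` works equally well in place of `cp -a`:

```shell
# Stand-in for the pool and the backup target (real paths would be the pool
# mountpoint and an array share)
mkdir -p /tmp/pool_demo/appdata /tmp/pool_backup
echo "config" > /tmp/pool_demo/appdata/settings.conf

# -a preserves permissions, ownership, and timestamps, which matters for appdata
cp -a /tmp/pool_demo/. /tmp/pool_backup/

# Verify the copy landed
ls /tmp/pool_backup/appdata
```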
Ishtar Posted December 14, 2023

7 minutes ago, JorgeB said:
You'd need to backup, format the pool zfs and restore the data.

I'm sorry, but… how do I perform a backup of the pool? It's not mounted, so the shares aren't available.
JorgeB Posted December 14, 2023

The previous log snippet showed the pool going read-only, not unmountable. Reboot and post new diagnostics after the array starts.
Ishtar Posted December 14, 2023

New diagnostics attached after the reboot and array start.

ishtarcommander-diagnostics-20231215-0718.zip
JorgeB Posted December 15, 2023 (Solution)

Try:

btrfs rescue zero-log /dev/nvme0n1p1

Then restart the array.
Ishtar Posted December 15, 2023

That seems to have fixed the issue. Thanks @JorgeB ❤️ I'm already in the process of restructuring my backup solution so that issues like this are easier to recover from.
Ishtar Posted December 15, 2023

Ran a BTRFS scrub and it found 32 uncorrectable errors. Will update this thread once I'm able to fix them.
Ishtar Posted December 18, 2023

Just an update on fixing the uncorrectable errors. Most of the affected files were Plex thumbnails, so I just deleted them and let Plex rebuild them. Unfortunately, some of the corruption is located in the docker image, so I'll have to delete the image and reinstall the containers. It shouldn't be that complicated, but it'll be a weekend project for me. Will keep this post updated as usual.
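For anyone wondering how I found which files were affected: the scrub's checksum errors in the syslog include the file path, which can be pulled out with a bit of sed. The sample line below is illustrative only; the real inode/offset/path values on my system were different:

```shell
# Illustrative scrub error line (values are made up; real lines show up in
# dmesg/syslog when a scrub hits checksum errors)
cat <<'EOF' > /tmp/scrub.sample
BTRFS warning (device nvme0n1p1): checksum error at logical 198824738816 on dev /dev/nvme0n1p1, physical 8472576, root 5, inode 257, offset 0, length 4096, links 1 (path: appdata/plex/Metadata/thumb.jpg)
EOF

# Extract just the affected file path so it can be deleted or restored
sed -n 's/.*(path: \([^)]*\)).*/\1/p' /tmp/scrub.sample
```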
itimpi Posted December 18, 2023

On 12/18/2023 at 1:07 PM, Ishtar said:
so I'll have to delete the image and re-install the containers. It shouldn't be that complicated but it'll be a weekend project for me.

This is actually very easy and much quicker than you might think. Follow the instructions to recreate the docker.img file, then reinstall your containers with their previous settings intact via Apps -> Previous Apps.