Corrupt leaf error on NVMe cache drive - steps to fix?



I'm suddenly seeing this error repeated in the log, and it's causing my main VM to stutter and behave badly.

 

ANDRAS4 kernel: BTRFS critical (device nvme0n1p1): corrupt leaf: root=5 block=113983488 slot=109 ino=1122227 file_offset=413696, invalid type for file extent, have 129 expect range [0, 2]

 

Log is attached. 

 

The vdisks (domains) are on the cache drive, as is the appdata folder for Docker. It's an NVMe SSD that I bought new about two months ago. Is it failing? If so, is there a way to safely move domains + appdata to the array?
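In case it helps, my rough plan for getting the data off (not tried yet, so please correct me if there's a better way) would be to stop the VMs and Docker first, then copy the shares to an array disk with rsync. /mnt/cache and /mnt/disk1 are what I assume the mount points are:

# copy the vdisks and docker appdata from the cache to an array disk (paths assumed)
rsync -avh --progress /mnt/cache/domains/ /mnt/disk1/domains/
rsync -avh --progress /mnt/cache/appdata/ /mnt/disk1/appdata/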

andras4-syslog-20191014-1851.zip

Link to comment
1 hour ago, jpowell8672 said:

 

Thank you.

 

The disk was mountable, and I was able to recover some of the data stored on it. Luckily, my daily-use VM image was intact (though it's the backup image, which is a few days out of date, so I have to start over in The Witcher 2 😩). I was also able to recover the appdata for Syncthing. Sadly, the rest is cooked.
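For anyone who finds this later, this is roughly what the recovery looked like, reconstructed from memory, so treat it as a sketch rather than a recipe (the destination path is just an example, and -o usebackuproot needs a reasonably recent kernel; older ones use -o recovery):

# mount the damaged filesystem read-only, falling back to an older tree root
mkdir -p /mnt/recovery
mount -o ro,usebackuproot /dev/nvme0n1p1 /mnt/recovery
rsync -avh /mnt/recovery/domains/ /mnt/disk1/rescue/domains/
umount /mnt/recovery

# for anything the mount couldn't reach, btrfs restore can sometimes pull files off the unmounted device
btrfs restore /dev/nvme0n1p1 /mnt/disk1/rescue/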

 

I'm wondering if it's an issue with the disk at all, or if it's an(other) issue with BTRFS. Based on some threads I'm reading, people tend to choose XFS for SSDs after suffering repeated failures similar to this one. Once I was finished with all the recovery options that worked, I was able to reformat the disk as XFS with no errors. Not going to put anything on it without testing it extensively, though.
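My idea of "testing it extensively" is mostly to watch the NVMe health counters before and after loading it up with writes. Assuming smartmontools and nvme-cli are available, something like:

# overall health summary (media errors, available spare, percentage used, etc.)
smartctl -a /dev/nvme0
# the same counters via nvme-cli
nvme smart-log /dev/nvme0n1
# and keep an eye on kernel messages mentioning the device
dmesg | grep -i nvme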

Link to comment
10 minutes ago, johnnie.black said:

If a btrfs filesystem gets corrupted multiple times without an apparent reason, it can point to a hardware problem, since btrfs is much more susceptible than xfs to bad RAM, for example.

This is the first time it's happened to me, but I'll run memtest86 tonight just to make sure there's nothing wrong. Regardless, perhaps XFS is a better option for me because it's more resilient. The only advantage of BTRFS I can use is TRIM, and if I'm not mistaken, my SSD does garbage collection in its firmware.
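Before I settle on that, I'll at least check whether the drive advertises discard support to the kernel at all; if lsblk shows non-zero DISC-GRAN / DISC-MAX values, the drive itself accepts TRIM commands:

# show the discard (TRIM) capabilities the kernel sees for the device
lsblk --discard /dev/nvme0n1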

Link to comment
1 minute ago, cyberspectre said:

This is the first time it's happened to me, but I'll run memtest86 tonight just to make sure there's nothing wrong. Regardless, perhaps XFS is a better option for me because it's more resilient. The only advantage of BTRFS I can use is TRIM, and if I'm not mistaken, my SSD does garbage collection in its firmware.

On the cache, XFS is only an option if you have a single drive. The moment you want the cache to be multi-drive, BTRFS becomes the only option available.

Link to comment
18 hours ago, johnnie.black said:

Since you're running a Ryzen CPU, make sure to respect the maximum RAM speeds for your configuration; Ryzen systems are known in some cases to corrupt data with overclocked RAM.

 

(attached image: 1stgen.png)

Could you elaborate? Is that some setting in UnRaid? My memory is rated at 3000 but the default profile in the BIOS is 2133, so I leave it at that. 

 

By the way, if I set "use cache disk" on a share to "yes," why does the mover move everything OFF the cache disk?

Link to comment
On 10/15/2019 at 3:21 AM, cyberspectre said:

I'm wondering if it's an issue with the disk at all, or if it's an(other) issue with BTRFS. Based on some threads I'm reading, people tend to choose XFS for SSDs after suffering repeated failures similar to this one. Once I was finished with all the recovery options that worked, I was able to reformat the disk as XFS with no errors. Not going to put anything on it without testing it extensively, though.

 

I see you quoted my thread ... I had a similar issue last year, but it was due to corruption on a StarTech PCI 2.5" HDD/SSD hotswap slot I was using. I removed the slot, connected the drive directly to my motherboard, and no more corruption. I guess the StarTech adapter had poor QC ... I had two of them and both exhibited this issue after extended use.

Link to comment
On 10/15/2019 at 11:49 PM, cyberspectre said:

Could you elaborate? Is that some setting in UnRaid? My memory is rated at 3000 but the default profile in the BIOS is 2133, so I leave it at that. 

 

 

You need to enable the XMP 2.0 profile in the BIOS to run the memory at 3000 MHz. 3000 MHz isn't a standard JEDEC speed, so it's configured via XMP (Extreme Memory Profile). Technically you paid for a 3000 MHz kit, so it should be stable at that speed, but if not, you can always fall back to the default 2133, which is fine. I doubt you'll notice any performance difference in unRAID.
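If you want to double-check what speed the sticks are actually running at from within unRAID, dmidecode should report both the rated and the configured speed (assuming it's included in your build):

# type 17 = Memory Device; shows rated speed and configured memory speed
dmidecode --type 17 | grep -i speed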

Link to comment
