BTRFS error: corruption

Hello!

 

I am running a RAID1 SATA SSD Cache pool and I am getting some BTRFS errors:

Jan 29 23:57:12 Turing kernel: BTRFS warning (device sdd1): csum failed root 5 ino 3599 off 2499960832 csum 0x60341ddd expected csum 0x88e58ce3 mirror 2
Jan 29 23:57:12 Turing kernel: BTRFS error (device sdd1): bdev /dev/sdd1 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
Jan 29 23:57:12 Turing kernel: BTRFS warning (device sdd1): csum failed root 5 ino 3599 off 2499964928 csum 0x1470dccc expected csum 0x8188ffff mirror 2
Jan 29 23:57:12 Turing kernel: BTRFS error (device sdd1): bdev /dev/sdd1 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
Jan 29 23:57:12 Turing kernel: BTRFS warning (device sdd1): csum failed root 5 ino 3599 off 2499960832 csum 0x60341ddd expected csum 0x88e58ce3 mirror 1
Jan 29 23:57:12 Turing kernel: BTRFS error (device sdd1): bdev /dev/sde1 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
Jan 29 23:57:12 Turing kernel: BTRFS warning (device sdd1): csum failed root 5 ino 3599 off 2499964928 csum 0x1470dccc expected csum 0x8188ffff mirror 1
Jan 29 23:57:12 Turing kernel: BTRFS error (device sdd1): bdev /dev/sde1 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
Jan 29 23:57:12 Turing kernel: BTRFS warning (device sdd1): csum failed root 5 ino 3599 off 2499960832 csum 0x60341ddd expected csum 0x88e58ce3 mirror 2
Jan 29 23:57:12 Turing kernel: BTRFS error (device sdd1): bdev /dev/sdd1 errs: wr 0, rd 0, flush 0, corrupt 3, gen 0
Jan 29 23:57:12 Turing kernel: BTRFS warning (device sdd1): csum failed root 5 ino 3599 off 2499964928 csum 0x1470dccc expected csum 0x8188ffff mirror 2
Jan 29 23:57:12 Turing kernel: BTRFS error (device sdd1): bdev /dev/sdd1 errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
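Editor's note: in the log above, both mirrors report the same computed checksum for the same inode and offset. One possible reading (an interpretation, not a diagnosis) is that both copies hold identical data that was already wrong when it was written, rather than one device returning bad data on its own. A minimal Python sketch for grouping such lines per block follows; the function names and the `SAMPLE` text are illustrative and not part of any Unraid or btrfs tool.

```python
import re
from collections import defaultdict

# Matches the "csum failed" kernel warnings quoted in the post.
CSUM_RE = re.compile(
    r"csum failed root (?P<root>\d+) ino (?P<ino>\d+) off (?P<off>\d+) "
    r"csum (?P<got>0x[0-9a-f]+) expected csum (?P<want>0x[0-9a-f]+) "
    r"mirror (?P<mirror>\d+)"
)


def summarize_csum_failures(log_text):
    """Group failures by (inode, offset) and record the checksum each mirror produced."""
    blocks = defaultdict(dict)
    for m in CSUM_RE.finditer(log_text):
        key = (int(m["ino"]), int(m["off"]))
        blocks[key][int(m["mirror"])] = m["got"]
    return dict(blocks)


def same_bad_data_on_all_mirrors(mirror_csums):
    """True if more than one mirror was read and all produced the same wrong checksum."""
    return len(mirror_csums) > 1 and len(set(mirror_csums.values())) == 1


# Two lines taken from the log above (timestamps and hostname trimmed).
SAMPLE = """\
BTRFS warning (device sdd1): csum failed root 5 ino 3599 off 2499960832 csum 0x60341ddd expected csum 0x88e58ce3 mirror 2
BTRFS warning (device sdd1): csum failed root 5 ino 3599 off 2499960832 csum 0x60341ddd expected csum 0x88e58ce3 mirror 1
"""
```

Calling `summarize_csum_failures(SAMPLE)` yields one entry per affected block, and `same_bad_data_on_all_mirrors` can then be applied to each entry to see whether the mirrors agree with each other while disagreeing with the stored checksum.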

 

I had issues with this pool (and another one) some time ago, with similar-looking logs, so I decided to format both drives about 2 days ago. I recreated the cache pool and moved the data back, and less than 48 hours later I got the errors above.

 

I would be thankful if someone has an idea or can point me in the right direction!
(Diagnostics are attached)

 

turing-diagnostics-20230130-0205.zip

Solved by Simom

  • Community Expert

One of the pool devices dropped offline in the past:

 

Jan 28 05:43:28 Turing kernel: BTRFS info (device nvme1n1p1): bdev /dev/nvme1n1p1 errs: wr 364810, rd 272, flush 32498, corrupt 169, gen 0

 

See here for more info and better pool monitoring.
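Editor's note: this is not the script linked above. It is a minimal Python sketch of the general idea behind pool monitoring, assuming the per-device counters come from `btrfs device stats <mountpoint>` in the usual `[device].counter value` form. The function name and the sample text are illustrative; the sample values simply mirror the `corrupt` counters from the first post.

```python
import re

# Assumed line format of `btrfs device stats <mountpoint>`, for example:
#   [/dev/sdd1].corruption_errs  4
STAT_RE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)\s*$")


def nonzero_counters(stats_output):
    """Return {(device, counter): value} for every counter that is not zero."""
    found = {}
    for line in stats_output.splitlines():
        m = STAT_RE.match(line.strip())
        if m and int(m["value"]) > 0:
            found[(m["dev"], m["counter"])] = int(m["value"])
    return found


# Illustrative sample only, not captured from the poster's server.
SAMPLE_STATS = """\
[/dev/sdd1].write_io_errs    0
[/dev/sdd1].read_io_errs     0
[/dev/sdd1].flush_io_errs    0
[/dev/sdd1].corruption_errs  4
[/dev/sdd1].generation_errs  0
[/dev/sde1].write_io_errs    0
[/dev/sde1].read_io_errs     0
[/dev/sde1].flush_io_errs    0
[/dev/sde1].corruption_errs  2
[/dev/sde1].generation_errs  0
"""
```

A scheduled job could feed the real command output into `nonzero_counters` and send a notification whenever the result is not empty.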

  • Author

Hey, thanks for the response. You are kind of right, but that error is from another cache pool (my NVMe one). I am currently working on clearing the pool you mentioned, so I can scrub and format it.

As far as I understand, the error you mentioned shouldn't be connected to the one I posted about, because the devices are part of different pools. Or am I missing something?

  • Community Expert
4 minutes ago, Simom said:

As far as I understand the error you mentioned shouldn't be connected to the one I posted

Correct, I missed that you had two pools. Unexpected csum errors can be the result of RAM issues, so I suggest running memtest. I still recommend the monitoring script for both pools.

  • Author

Alright, I will try that! As I said, I had a similar issue some time ago, but I have swapped the CPU, MoBo and RAM since then. Nevertheless, thanks for the help!

  • Community Expert
31 minutes ago, JorgeB said:

unexpected csum errors can be the result of RAM issues

 

23 minutes ago, Simom said:

swapped CPU, MoBo and RAM

In which case, the errors could have been the result of previous bad RAM.

 

In any case,

32 minutes ago, JorgeB said:

suggest running memtest

 

  • Author
18 hours ago, trurl said:

In which case, the errors could have been the result of previous bad RAM.

I think I didn't clearly state my previous troubleshooting steps, and this is leading to some confusion. So, in order:

  • I had some csum errors like the ones in my first post
  • I swapped systems with new CPU, MoBo and RAM
  • unassigned both drives, formatted them and created a new pool
  • not even 48h later I get a new csum error

As I created a new pool, this still might be a problem with my current RAM, but not with my old one, or am I missing something?

 

Memtest is running, nothing found so far. Any advice on how long I should leave it running (I read 24-48 hours somewhere)?

  • Community Expert
3 minutes ago, Simom said:

not even 48h later I get a new csum error

That suggests there's still a problem, usually RAM related.

  • Author

Just wanted to get back to this:
memtest has been running for over 72 hours without finding anything.

Are there any other troubleshooting steps that I can try?

  • Community Expert

Remove one of the RAM sticks, scrub the pool, and if no errors are found, reset the filesystem stats. Then work normally; if new errors appear, try with just the other stick.
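Editor's note: the scrub and stats reset can be run from the command line. Below is a minimal Python sketch, assuming the pool is mounted at `/mnt/cache` (an assumption; substitute the real mount point). It uses `btrfs scrub start -B`, which stays in the foreground until the scrub finishes, and `btrfs device stats -z`, which prints the counters and then resets them to zero. The function names are illustrative.

```python
import subprocess


def scrub_and_reset_commands(mountpoint):
    """Build the two command lines; nothing is executed here."""
    return [
        ["btrfs", "scrub", "start", "-B", mountpoint],   # -B: wait until the scrub is done
        ["btrfs", "device", "stats", "-z", mountpoint],  # -z: print counters, then zero them
    ]


def run_scrub_then_reset(mountpoint):
    """Run the scrub and reset the counters only if the scrub exits cleanly."""
    scrub_cmd, reset_cmd = scrub_and_reset_commands(mountpoint)
    scrub = subprocess.run(scrub_cmd, capture_output=True, text=True)
    print(scrub.stdout)
    if scrub.returncode != 0:
        # Treat any non-zero exit status as "do not reset": keep the counters for diagnosis.
        return False
    reset = subprocess.run(reset_cmd, capture_output=True, text=True)
    print(reset.stdout)
    return reset.returncode == 0
```

The exit status alone may not reveal errors that the scrub corrected, so it is worth reading the printed scrub summary before relying on the reset. Usage would be `run_scrub_then_reset("/mnt/cache")`, run as root on the server.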

  • Author

Thanks for the quick response! I will try that and see how it goes.

  • 11 months later...
On 2/4/2023 at 2:16 AM, Simom said:

Thanks for the quick response! I will try that and see how it goes.

Hey, I am having similar issues with a RAID 1 config. How did you manage to move the data from the cache to the array and back again? I am planning to reformat the NVMe disks.

  • Community Expert

Nothing can move open files. Disable Docker and VM Manager in Settings. Dynamix File Manager will let you work directly with the disks and pools on the server.

  • Author
  • Solution

Just realized that I never followed up on this:
I switched from macvlan to ipvlan for my Docker containers and that seems to have fixed it. No crashes and no corruption since then.

(I guess the macvlan issue led to kernel panics, which led to the file corruption, but I am no expert.)

 

p.s.

I also read that there have been changes to macvlan.

3 hours ago, trurl said:

Nothing can move open files. Disable Docker and VM Manager in Settings. Dynamix File Manager will let you work directly with the disks and pools on the server.

So after I disable Docker and VM Manager, I can move the files from my cache to the array? And after formatting the cache, I can move the files back and use the VMs and Docker without additional settings?

  • Author

In theory, yes. You should also make sure that no one else is accessing the files over SMB, NFS or anything else before you start moving them.
But I would strongly advise checking that the file system runs as expected before moving important data back to the cache.
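Editor's note: one way to gain some confidence before formatting, and again after moving data back, is to compare checksums of the source files and their copies. The following is a minimal Python sketch; the function names are illustrative and any paths passed in (for example a cache share and its copy on an array disk) are up to the reader.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """SHA-256 of a file, read in chunks so large vdisks do not fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root):
    """Map each file's path relative to root to its SHA-256 digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_trees(source, copy):
    """Return relative paths that are missing from the copy or differ in content."""
    src, dst = tree_digests(source), tree_digests(copy)
    return sorted(rel for rel, digest in src.items() if dst.get(rel) != digest)
```

An empty list from `compare_trees(source, copy)` means every file under `source` has an identical counterpart under `copy`. It does not check files that exist only in the copy, and it should be run while Docker and VM Manager are disabled so that files are not changing underneath it.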

8 hours ago, Simom said:

In theory, yes. You should also make sure that no one else is accessing the files over SMB, NFS or anything else before you start moving them.
But I would strongly advise checking that the file system runs as expected before moving important data back to the cache.

OK. Also, do I need to redirect the VM vdisk directory and other settings for the VM, or can I just hit the play button afterwards?
