BTRFS warning csum failed



Hey,

 

Been getting these warnings in my log recently. 

Quote

Mar 22 08:43:03 Tower kernel: BTRFS warning (device dm-3): csum failed root 5 ino 5608093 off 347480064 csum 0x42dde7c7 expected csum 0x69d41968 mirror 1

I think that dm-3 is my cache drive (the whole array is BTRFS, though), which is a 1 TB Samsung 970 EVO Plus.
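For anyone reading along, the warning line can be split into its fields. This is only a rough sketch based on the single log line quoted above; the regular expression and the `parse_csum_warning` helper are my own illustration, not part of any btrfs tool. As far as I understand the message, `root` is the subvolume tree ID, `ino` the inode number of the affected file, `off` the byte offset inside that file, `csum` the checksum computed from the data that was read, `expected csum` the checksum stored on disk, and `mirror` the copy that was read.

```python
import re

# Pattern for the "csum failed" warning format shown in the quoted log line.
# Assumption: other kernel versions may word this message differently.
CSUM_RE = re.compile(
    r"BTRFS warning \(device (?P<device>[^)]+)\): csum failed "
    r"root (?P<root>\d+) ino (?P<ino>\d+) off (?P<off>\d+) "
    r"csum (?P<csum>0x[0-9a-fA-F]+) expected csum (?P<expected>0x[0-9a-fA-F]+) "
    r"mirror (?P<mirror>\d+)"
)


def parse_csum_warning(line):
    """Return the fields of a csum warning as a dict, or None if the line does not match."""
    match = CSUM_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the purely numeric fields; the checksums stay as hex strings.
    for key in ("root", "ino", "off", "mirror"):
        fields[key] = int(fields[key])
    return fields


line = (
    "Mar 22 08:43:03 Tower kernel: BTRFS warning (device dm-3): csum failed "
    "root 5 ino 5608093 off 347480064 csum 0x42dde7c7 "
    "expected csum 0x69d41968 mirror 1"
)
info = parse_csum_warning(line)
print(info["device"], info["ino"], info["off"])
```

The inode number is the useful part here, since it identifies the affected file on that filesystem.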

 

I have no idea what this means. A few days ago I had a scary event: without thinking, I unplugged the USB keyboard that was assigned to my Windows 10 VM while it was running. This crashed the VM, and the whole VM page became unresponsive. After a reboot I could start the VM again, but my docker.img was gone (no containers in the Docker tab). I recreated my Docker containers and ran a parity check (0 errors). Since then I have not encountered any further problems, except for these csum errors in the log. I am not 100% sure that they started after that event, but it seems very likely.

 

Any ideas?

tower-diagnostics-20200322-1105.zip

Edited by esaru
27 minutes ago, johnnie.black said:

Checksum failures are usually a sign of data corruption. Start by looking at this to see if your RAM is overclocked, which is a known source of data corruption with Ryzen.

Well, it's set in the BIOS to XMP profile 1, which is 3200 MHz, the "stock" rating of my RAM sticks (Vengeance 2x8 GB 3200 MHz CL16). I did, however, have some memory stability issues in the past, which I thought I had ironed out; I ran a one-pass memory test and it found no problems. I noticed there was a recent BIOS update that addresses RAM compatibility, and I just flashed it.

 

So, I'm going to make sure that my RAM runs well. What about the corruption: is there anything I should do about it? What are the implications of this corruption?

 

Thanks!

50 minutes ago, esaru said:

the "stock" rating of my ram sticks

It's still an overclock, except for 3rd gen Ryzen with only 2 DIMMs.

 

50 minutes ago, esaru said:

What are the implications of this corruption?

If the corruption is happening on reads, the data on disk is intact and the errors will stop once the hardware problem is resolved. Note, however, that there could still be undetected corruption that happened at write time, since in that case the checksum was computed from the already-corrupted data and will match it.
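To illustrate the difference between the two cases, here is a toy sketch. It is not how btrfs is implemented: it uses `zlib.crc32` from the Python standard library as a stand-in checksum (btrfs defaults to CRC32C, which is a different polynomial), and the `store`, `verify` and `flip_bit` helpers are made up for this example.

```python
import zlib


def store(data):
    """Simulate a write: the checksum is computed over whatever bytes arrive."""
    return {"data": bytes(data), "csum": zlib.crc32(data)}


def verify(block, data_as_read):
    """Simulate a read: recompute the checksum and compare it with the stored one."""
    return zlib.crc32(data_as_read) == block["csum"]


def flip_bit(data, index=0):
    """Return a copy of data with one bit flipped, standing in for a RAM error."""
    corrupted = bytearray(data)
    corrupted[index] ^= 0x01
    return bytes(corrupted)


original = b"movie frame data"

# Case 1: corruption on read. The stored block is fine, but the bytes seen
# after reading have a flipped bit, so the checksums disagree.
good_block = store(original)
read_detected = not verify(good_block, flip_bit(good_block["data"]))

# Case 2: corruption before the write. The checksum is computed over the
# already-bad bytes, so a later read verifies cleanly.
bad_block = store(flip_bit(original))
write_detected = not verify(bad_block, bad_block["data"])

print("read-time corruption detected:", read_detected)
print("write-time corruption detected:", write_detected)
```

In the second case the checksum cannot help, which is why fixing the underlying hardware problem matters more than clearing the logged errors.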

42 minutes ago, johnnie.black said:

It's still an overclock, except for 3rd gen Ryzen with only 2 DIMMs.

 

If the corruption is happening on reads it will be fixed once the hardware problem is resolved, but note that there could still be undetected corruption that happened at write time, since the checksum will match the corrupted data.

Not sure what that means for me. I do have a 3rd gen Ryzen with 2 sticks of RAM. I didn't really understand all the information in the link you provided earlier, such as the table with ranks. It says 3200 MHz is officially supported for both single and dual rank, 2 of 2 and 2 of 4(??). All I know is that I have an X570 Aorus Elite motherboard with 4 RAM slots, in which I put 2x8 GB of 3200 MHz RAM.

 

About scrub, I ran it through the Unraid GUI and got the following result:

Error summary: csum=1 Corrected: 0 Uncorrectable: 1 Unverified: 0

 

How do I run scrub to find out which specific files are corrupted?

Checked the log after scrubbing: it's just one random movie that is corrupted. I'm gonna go ahead and delete it.
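In case the log only gives an inode number (like the `ino` value in the csum warning), here is a rough sketch of looking the file up by walking the filesystem. The `find_paths_by_inode` helper is hypothetical and written just for this post. Keep in mind that btrfs inode numbers are only unique within one subvolume, and that `btrfs inspect-internal inode-resolve` should be the more direct way to do this on a btrfs mount.

```python
import os
import tempfile


def find_paths_by_inode(top, ino):
    """Walk `top` and return the paths of files whose inode number equals `ino`."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so that symlinks are not followed to other filesystems.
                if os.lstat(path).st_ino == ino:
                    matches.append(path)
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
    return matches


# Small self-contained demo in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.bin")
    with open(sample, "wb") as fh:
        fh.write(b"x")
    demo_ino = os.stat(sample).st_ino
    demo_matches = find_paths_by_inode(tmp, demo_ino)

print(demo_matches)
```

On a real system the call would look something like `find_paths_by_inode("/mnt/cache", 5608093)`, where `/mnt/cache` is my assumption for the mount point of the affected pool. On a large share the walk can take a while.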

 

Once again, thanks for helping. 

 

 
