esaru Posted March 22, 2020 (edited)
Hey, I've been getting these warnings in my log recently:
Quote: Mar 22 08:43:03 Tower kernel: BTRFS warning (device dm-3): csum failed root 5 ino 5608093 off 347480064 csum 0x42dde7c7 expected csum 0x69d41968 mirror 1
I think dm-3 is my cache drive (the whole array is BTRFS, though), which is a 1 TB Samsung 970 EVO Plus. I have no idea what this means.
A few days ago I had a scary event. Without thinking, I removed the USB keyboard that was assigned to my Windows 10 VM while it was running. This crashed the VM, and the whole VM page became unresponsive. After a reboot I could start the VM again, but my docker.img was gone (no containers in the Docker tab). I recreated my Docker containers and ran a parity check (0 errors). Since then I have not encountered any further problems, except for these csum errors in the log. I am not 100% sure that they started after that event, but it seems very likely.
Any ideas?
Attachment: tower-diagnostics-20200322-1105.zip
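For readers who want to pull the useful fields out of a warning like the one quoted above, here is a minimal Python sketch. The regular expression is written against the exact wording of that one log line, so treat it as an assumption that other kernel versions use the same wording. The function name `parse_csum_warning` is made up for illustration. In this message, `root` is the subvolume tree ID (5 is the top-level tree), `ino` is the inode number of the affected file, `off` is the byte offset inside that file, and `mirror` is the copy that was read.

```python
import re

# Pattern for the "csum failed" warning exactly as quoted in the post above.
# Assumption: other kernel versions may word this message differently.
CSUM_RE = re.compile(
    r"BTRFS warning \(device (?P<device>[^)]+)\): csum failed "
    r"root (?P<root>-?\d+) ino (?P<ino>\d+) off (?P<off>\d+) "
    r"csum (?P<found>0x[0-9a-fA-F]+) "
    r"expected csum (?P<expected>0x[0-9a-fA-F]+) "
    r"mirror (?P<mirror>\d+)"
)


def parse_csum_warning(line):
    """Return the fields of a 'csum failed' line as a dict, or None if no match."""
    match = CSUM_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the purely numeric fields; the checksums stay as hex strings.
    for key in ("root", "ino", "off", "mirror"):
        fields[key] = int(fields[key])
    return fields


line = (
    "Mar 22 08:43:03 Tower kernel: BTRFS warning (device dm-3): csum failed "
    "root 5 ino 5608093 off 347480064 csum 0x42dde7c7 "
    "expected csum 0x69d41968 mirror 1"
)
info = parse_csum_warning(line)
print(info["device"], info["root"], info["ino"], info["off"])
```

The inode number is the field that later lets you map the warning back to a file name.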
JorgeB Posted March 22, 2020
Checksum failures are usually a sign of data corruption. Start by looking at this to see if your RAM is overclocked, which is a known source of data corruption with Ryzen.
esaru (Author) Posted March 22, 2020
27 minutes ago, johnnie.black said: Checksum failures are usually a sign of data corruption, start by looking at this to see if your RAM is overclocked, which is a known source of data corruption with Ryzen.
Well, it's set in the BIOS to XMP profile 1, which is 3200 MHz, the "stock" rating of my RAM sticks (Vengeance 2x8 GB 3200 MHz CL16). I did have some memory stability issues in the past which I thought I had ironed out; I ran a one-pass memory test and found no problems. I noticed there was a recent BIOS update that addresses RAM compatibility, and I just flashed it. So, I'm going to make sure that my RAM runs well.
What about the corruption, is there anything I should do about that? What are the implications of this corruption?
Thanks!
JorgeB Posted March 22, 2020
50 minutes ago, esaru said: the "stock" rating of my ram sticks
It's still an overclock, except for 3rd gen Ryzen with only 2 DIMMs.
50 minutes ago, esaru said: What are the implications of this corruption?
If the corruption is happening on reads, it will be fixed once the hardware problem is resolved. Note, however, that there could still be undetected corruption that happened at write time: in that case the checksum was computed from the already-corrupted data, so it will match.
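The read-versus-write distinction above can be illustrated with a small Python sketch. This is a toy model, not how Btrfs is implemented: `zlib.crc32` stands in for the filesystem checksum (Btrfs defaults to CRC-32C, a different polynomial), and `store`, `verify`, and `flip_bit` are hypothetical helpers made up for this example.

```python
import zlib


def store(data):
    """Simulate a checksumming write: the checksum is computed from whatever
    bytes reach the filesystem, corrupted or not."""
    return {"data": data, "csum": zlib.crc32(data)}


def verify(block):
    """Simulate a checksumming read: recompute and compare."""
    return zlib.crc32(block["data"]) == block["csum"]


def flip_bit(data, index=0):
    """Return a copy of data with one bit flipped, as bad RAM might do."""
    damaged = bytearray(data)
    damaged[index] ^= 0x01
    return bytes(damaged)


original = b"movie frame data"

# Case 1: the data is stored intact, then damaged on the way back (read path).
good = store(original)
read_back = {"data": flip_bit(good["data"]), "csum": good["csum"]}
detected_on_read = not verify(read_back)

# Case 2: the data is damaged in memory before the checksum is computed.
bad = store(flip_bit(original))
detected_on_write = not verify(bad)

print("read-time corruption detected: ", detected_on_read)
print("write-time corruption detected:", detected_on_write)
```

In case 2 the stored data differs from the original, yet verification passes, which is why a clean scrub cannot rule out damage that happened before the write.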
JorgeB Posted March 22, 2020
Forgot to say: if/when the hardware problem is resolved you can run a scrub. It will identify the corrupt files, which will then need to be deleted or restored from backups.
esaru (Author) Posted March 22, 2020 (edited)
42 minutes ago, johnnie.black said: It's still an overclock, except for 3rd gen Ryzen with only 2 DIMMs. If the corruption is happening on reads it will be fixed once the hardware problem is resolved, but note that there could still be undetected corruption that happened at write time, since the checksum will match the corrupted data.
I'm not sure what that means for me. I do have 3rd gen Ryzen and 2 sticks of RAM. I didn't really understand all the information in the link you provided before, such as the table with ranks. It says 3200 MHz is officially supported for both single and dual rank, 2 of 2 and 2 of 4(??). All I know is that I have an X570 Aorus Elite motherboard with 4 RAM slots, where I put 2x8 GB of 3200 MHz RAM.
About scrub, I ran it through the Unraid GUI and got the following result:
Error summary: csum=1
Corrected: 0
Uncorrectable: 1
Unverified: 0
How do I run scrub to find out which specific files are corrupted?
Update: I checked the log after scrubbing, and it's just a random movie that is corrupted. I'm going to go ahead and delete it.
Once again, thanks for helping.
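If the log only gives an inode number (the `ino` field in the csum warning) and not a path, one way to find the file is to walk the filesystem and compare inode numbers, much like `find -inum`. The sketch below assumes you start the walk at the root of the affected subvolume, since Btrfs inode numbers are only unique within a subvolume. The function name `find_paths_by_inode` is made up for illustration. The btrfs tools also provide `btrfs inspect-internal inode-resolve` for the same lookup.

```python
import os
import tempfile


def find_paths_by_inode(root, inode):
    """Return every file path under root whose inode number equals inode."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.lstat(path).st_ino == inode:
                    matches.append(path)
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
    return matches


# Self-contained demo on a temporary directory instead of a real pool.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "movie.mkv")
    with open(sample, "wb") as handle:
        handle.write(b"x")
    wanted = os.lstat(sample).st_ino
    found = find_paths_by_inode(tmp, wanted)

print(found == [sample])
```

On a real system the call would look like `find_paths_by_inode("/mnt/cache", 5608093)`, where `/mnt/cache` is only an assumed mount point; use whichever filesystem the warning's device actually belongs to. Hard links can produce more than one match.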
JorgeB Posted March 22, 2020
32 minutes ago, esaru said: I do have 3rd gen Ryzen, 2 sticks of ram
Then 3200 MHz is fine and supported, assuming no failing DIMMs.
esaru (Author) Posted March 22, 2020
22 minutes ago, johnnie.black said: Then 3200 MHz is fine and supported, assuming no failing DIMMs.
Okay. I suppose a memtest is in order?
JorgeB Posted March 22, 2020
It won't hurt, especially if you get more similar errors in the future.