Maddeen Posted December 22, 2021
Hi, it seems that I have some problems with my cache pool. Here is a quick excerpt of what's in the log; I have also attached the complete diagnostics.
Dec 22 08:45:41 v1ew-s0urce emhttpd: read SMART /dev/sde
Dec 22 08:45:41 v1ew-s0urce emhttpd: read SMART /dev/sdf
Dec 22 08:47:42 v1ew-s0urce emhttpd: shcmd (427885): /usr/local/sbin/mover &> /dev/null &
Dec 22 08:47:42 v1ew-s0urce kernel: btrfs_print_data_csum_error: 23 callbacks suppressed
Dec 22 08:47:42 v1ew-s0urce kernel: BTRFS warning (device sdd1): csum failed root 5 ino 143720 off 1500127232 csum 0xf5ce0976 expected csum 0xe221992a mirror 2
Dec 22 08:47:42 v1ew-s0urce kernel: btrfs_dev_stat_print_on_error: 23 callbacks suppressed
Dec 22 08:47:42 v1ew-s0urce kernel: BTRFS error (device sdd1): bdev /dev/sdg1 errs: wr 0, rd 0, flush 0, corrupt 758, gen 0
Dec 22 08:47:42 v1ew-s0urce kernel: BTRFS warning (device sdd1): csum failed root 5 ino 143720 off 1500127232 csum 0xf5ce0976 expected csum 0xe221992a mirror 1
Dec 22 08:47:42 v1ew-s0urce kernel: BTRFS error (device sdd1): bdev /dev/sdd1 errs: wr 0, rd 0, flush 0, corrupt 895, gen 0
Dec 22 08:47:42 v1ew-s0urce kernel: BTRFS warning (device sdd1): csum failed root 5 ino 143720 off 1500127232 csum 0xf5ce0976 expected csum 0xe221992a mirror 2
Dec 22 08:47:42 v1ew-s0urce kernel: BTRFS error (device sdd1): bdev /dev/sdg1 errs: wr 0, rd 0, flush 0, corrupt 759, gen 0
Dec 22 08:47:42 v1ew-s0urce shfs: copy_file: /mnt/cache/Filme/movie1.mkv /mnt/disk1/Filme/movie1.mkv (5) Input/output error
Dec 22 08:47:42 v1ew-s0urce kernel: BTRFS warning (device sdd1): csum failed root 5 ino 143720 off 1500127232 csum 0xf5ce0976 expected csum 0xe221992a mirror 2
Dec 22 08:47:42 v1ew-s0urce kernel: BTRFS error (device sdd1): bdev /dev/sdg1 errs: wr 0, rd 0, flush 0, corrupt 760, gen 0
Dec 22 08:47:42 v1ew-s0urce kernel: BTRFS warning (device sdd1): csum failed root 5 ino 143720 off 1500127232 csum 0xf5ce0976 expected csum 0xe221992a mirror 1
Dec 22 08:47:42 v1ew-s0urce kernel: BTRFS error (device sdd1): bdev /dev/sdd1 errs: wr 0, rd 0, flush 0, corrupt 896, gen 0
Dec 22 08:47:42 v1ew-s0urce kernel: BTRFS warning (device sdd1): csum failed root 5 ino 143720 off 1500127232 csum 0xf5ce0976 expected csum 0xe221992a mirror 2
Dec 22 08:47:42 v1ew-s0urce kernel: BTRFS error (device sdd1): bdev /dev/sdg1 errs: wr 0, rd 0, flush 0, corrupt 761, gen 0
Dec 22 08:47:42 v1ew-s0urce kernel: BTRFS warning (device sdd1): csum failed root 5 ino 143720 off 1500127232 csum 0xf5ce0976 expected csum 0xe221992a mirror 2
Dec 22 08:47:42 v1ew-s0urce kernel: BTRFS error (device sdd1): bdev /dev/sdg1 errs: wr 0, rd 0, flush 0, corrupt 762, gen 0
Dec 22 08:47:42 v1ew-s0urce kernel: BTRFS warning (device sdd1): csum failed root 5 ino 143720 off 1500127232 csum 0xf5ce0976 expected csum 0xe221992a mirror 1
Dec 22 08:47:42 v1ew-s0urce kernel: BTRFS error (device sdd1): bdev /dev/sdd1 errs: wr 0, rd 0, flush 0, corrupt 897, gen 0
Dec 22 08:47:42 v1ew-s0urce kernel: BTRFS warning (device sdd1): csum failed root 5 ino 143720 off 1500127232 csum 0xf5ce0976 expected csum 0xe221992a mirror 2
Dec 22 08:47:42 v1ew-s0urce kernel: BTRFS error (device sdd1): bdev /dev/sdg1 errs: wr 0, rd 0, flush 0, corrupt 763, gen 0
Dec 22 08:47:43 v1ew-s0urce kernel: BTRFS warning (device sdd1): csum failed root 5 ino 143721 off 2492567552 csum 0xf4c565ea expected csum 0x5b93949e mirror 1
Dec 22 08:47:43 v1ew-s0urce kernel: BTRFS error (device sdd1): bdev /dev/sdg1 errs: wr 0, rd 0, flush 0, corrupt 764, gen 0
Dec 22 08:47:43 v1ew-s0urce shfs: copy_file: /mnt/cache/Filme/movie2.mkv /mnt/disk1/Filme/movie2.mkv (5) Input/output error
Dec 22 08:47:43 v1ew-s0urce kernel: BTRFS info (device sdd1): read error corrected: ino 152985 off 863924224 (dev /dev/sdd1 sector 44853048)
Dec 22 08:47:43 v1ew-s0urce shfs: copy_file: /mnt/cache/Filme/movie3.mkv /mnt/disk1/Filme/movie3.mkv (5) Input/output error
Dec 22 08:47:44 v1ew-s0urce shfs: copy_file: /mnt/cache/unraid_backup/flashdrivebackup/previous/bzroot /mnt/disk3/unraid_backup/flashdrivebackup/previous/bzroot (5) Input/output error
Dec 22 08:47:54 v1ew-s0urce emhttpd: read SMART /dev/sdb
Dec 22 09:19:19 v1ew-s0urce emhttpd: spinning down /dev/sdb
What's the best way to proceed here? I read some related topics, but I'm unsure how to handle it. And due to the "fragility" of the Unraid setup, I won't do anything that's not verified by people who know how to handle it. Last time I tried something by myself, it ended in a complete loss of my cache pool and I was not able to rebuild anything, even though, in my little world, the whole point of a pool was to be able to rebuild when a problem occurs. But I was proven wrong 😞
Hopefully someone can help me out of this. Thanks in advance.
v1ew-s0urce-diagnostics-20211222-1040.zip
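For readers skimming the log excerpt above, a small script can condense it. The following is only a sketch in Python, assuming the two kernel message formats exactly as they appear in this post; `summarize` and `SAMPLE` are names made up for illustration and are not part of Unraid or btrfs-progs. It collects which mirrors failed the checksum for each inode/offset pair and the latest `corrupt` counter reported per device.

```python
import re
from collections import defaultdict

# "csum failed" warning, in the form shown in the syslog excerpt above.
CSUM_RE = re.compile(
    r"BTRFS warning \(device (?P<fs>\w+)\): csum failed root (?P<root>\d+) "
    r"ino (?P<ino>\d+) off (?P<off>\d+) csum (?P<got>0x[0-9a-f]+) "
    r"expected csum (?P<want>0x[0-9a-f]+) mirror (?P<mirror>\d+)"
)
# Per-device error counter line, in the form shown in the same excerpt.
STATS_RE = re.compile(
    r"BTRFS error \(device \w+\): bdev (?P<dev>\S+) errs: wr (?P<wr>\d+), "
    r"rd (?P<rd>\d+), flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def summarize(log_text):
    """Return (failed_mirrors, corrupt_counts) for a chunk of syslog text.

    failed_mirrors maps (inode, offset) -> set of mirror numbers that failed.
    corrupt_counts maps device path -> last "corrupt" counter seen.
    """
    failed_mirrors = defaultdict(set)
    corrupt_counts = {}
    for line in log_text.splitlines():
        m = CSUM_RE.search(line)
        if m:
            key = (int(m["ino"]), int(m["off"]))
            failed_mirrors[key].add(int(m["mirror"]))
            continue
        m = STATS_RE.search(line)
        if m:
            # Later lines overwrite earlier ones, so the newest value wins.
            corrupt_counts[m["dev"]] = int(m["corrupt"])
    return dict(failed_mirrors), corrupt_counts


# A few lines copied from the post above.
SAMPLE = """\
Dec 22 08:47:42 v1ew-s0urce kernel: BTRFS warning (device sdd1): csum failed root 5 ino 143720 off 1500127232 csum 0xf5ce0976 expected csum 0xe221992a mirror 2
Dec 22 08:47:42 v1ew-s0urce kernel: BTRFS error (device sdd1): bdev /dev/sdg1 errs: wr 0, rd 0, flush 0, corrupt 758, gen 0
Dec 22 08:47:42 v1ew-s0urce kernel: BTRFS warning (device sdd1): csum failed root 5 ino 143720 off 1500127232 csum 0xf5ce0976 expected csum 0xe221992a mirror 1
Dec 22 08:47:42 v1ew-s0urce kernel: BTRFS error (device sdd1): bdev /dev/sdd1 errs: wr 0, rd 0, flush 0, corrupt 895, gen 0
Dec 22 08:47:42 v1ew-s0urce kernel: BTRFS warning (device sdd1): csum failed root 5 ino 143720 off 1500127232 csum 0xf5ce0976 expected csum 0xe221992a mirror 2
Dec 22 08:47:42 v1ew-s0urce kernel: BTRFS error (device sdd1): bdev /dev/sdg1 errs: wr 0, rd 0, flush 0, corrupt 759, gen 0
"""

if __name__ == "__main__":
    mirrors, counts = summarize(SAMPLE)
    for (ino, off), failed in sorted(mirrors.items()):
        print(f"inode {ino} offset {off}: failed on mirrors {sorted(failed)}")
    for dev, count in sorted(counts.items()):
        print(f"{dev}: corrupt={count}")
```

In the sample, both mirrors of the same block fail the checksum, so no good copy is left to read. That would be consistent with the Input/output errors the mover reports for the same files.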
Solution JorgeB Posted December 22, 2021
Btrfs is detecting data corruption; start by running memtest.
Maddeen Posted December 22, 2021 (edited)
Thanks @JorgeB, and please forgive my ignorance, but how do I do that? Just right from the Unraid console? I searched the forums for a guide but couldn't find one. Google doesn't help me either... so it seems to be a very common test, but sadly not for me.
JorgeB Posted December 22, 2021
Maddeen said: "but how do I do that?"
It's an option in the boot menu (requires CSM/legacy boot to work).
Maddeen Posted December 24, 2021
@JorgeB Thanks. I started it and it immediately looks horrible 🙈🥺 (see screenshot). I'll run the test to the end to see what comes next. But from this screen, I can't see which RAM module is defective, or can I?
And honestly, I don't get it. Unraid runs smoothly. VMs run smoothly. Access to all files, even the ones that showed up in the log file, works. But copying the files from the cache drive to the normal drives failed, and the screenshot shows so many errors that (IMHO) the server shouldn't even start. Maybe you can explain these logical "connections" to me; I can't understand it. 🙈
Have a wonderful Christmas.
P.S. I'll upload a screenshot of the final result as soon as the memtest is over.
JorgeB Posted December 24, 2021
Maddeen said: "I'll run the test to the end to see,"
Not much point; you can run it with just one DIMM at a time to see if you can find the culprit.
Maddeen Posted December 24, 2021
Ahh, OK, so there will be no summary at the end that helps identify the faulty module. Thank you. Is it possible that the XMP feature causes these errors? I activated XMP to get better speeds with my gaming VM.
JorgeB Posted December 24, 2021
Maddeen said: "Is it possible that the XMP feature causes this errors?"
XMP is basically overclocking, so you should avoid that, especially with Ryzen, but I checked before and the RAM was at 2133 MT/s, so XMP was disabled.
Maddeen Posted December 27, 2021 (edited)
@JorgeB - Thanks again. Today I received my new RAM. Memtest found no errors, and I could successfully start the "move" process to bring my files from the cache to my normal HDDs. Sadly, that does not seem to have solved all of my problems. I'm still getting these errors when I start a backup of my appdata and libvirt. What can I do now? I attached a new diagnostics zip to this post. Thanks in advance.
Dec 27 19:05:54 v1ew-s0urce CA Backup/Restore: Using command: cd '/mnt/cache/appdata/' && /usr/bin/tar -cvaf '/mnt/user/unraid_backup/appdatabackup/[email protected]/CA_backup.tar.gz' * >> /var/lib/docker/unraid/ca.backup2.datastore/appdata_backup.log 2>&1 & echo $! > /tmp/ca.backup2/tempFiles/backupInProgress
Dec 27 19:06:57 v1ew-s0urce kernel: BTRFS warning (device sdd1): csum failed root 5 ino 20152 off 541986816 csum 0xcbd848d6 expected csum 0x8941f998 mirror 2
Dec 27 19:06:57 v1ew-s0urce kernel: BTRFS error (device sdd1): bdev /dev/sdg1 errs: wr 0, rd 0, flush 0, corrupt 883, gen 0
Dec 27 19:06:57 v1ew-s0urce kernel: BTRFS warning (device sdd1): csum failed root 5 ino 20152 off 541986816 csum 0x7e36f534 expected csum 0x8941f998 mirror 1
Dec 27 19:06:57 v1ew-s0urce kernel: BTRFS error (device sdd1): bdev /dev/sdd1 errs: wr 0, rd 0, flush 0, corrupt 1044, gen 0
Dec 27 19:06:57 v1ew-s0urce kernel: BTRFS warning (device sdd1): csum failed root 5 ino 20152 off 541986816 csum 0xcbd848d6 expected csum 0x8941f998 mirror 2
Dec 27 19:06:57 v1ew-s0urce kernel: BTRFS error (device sdd1): bdev /dev/sdg1 errs: wr 0, rd 0, flush 0, corrupt 884, gen 0
Dec 27 19:10:49 v1ew-s0urce CA Backup/Restore: Backup Complete
Dec 27 19:10:49 v1ew-s0urce CA Backup/Restore: Verifying backup
Dec 27 19:10:49 v1ew-s0urce CA Backup/Restore: Using command: cd '/mnt/cache/appdata/' && /usr/bin/tar --diff -C '/mnt/cache/appdata/' -af '/mnt/user/unraid_backup/appdatabackup/[email protected]/CA_backup.tar.gz' > /var/lib/docker/unraid/ca.backup2.datastore/appdata_backup.log & echo $! > /tmp/ca.backup2/tempFiles/verifyInProgress
Dec 27 19:11:04 v1ew-s0urce kernel: BTRFS warning (device sdd1): csum failed root 5 ino 20152 off 541986816 csum 0xcbd848d6 expected csum 0x8941f998 mirror 2
Dec 27 19:11:04 v1ew-s0urce kernel: BTRFS error (device sdd1): bdev /dev/sdg1 errs: wr 0, rd 0, flush 0, corrupt 885, gen 0
Dec 27 19:11:04 v1ew-s0urce kernel: BTRFS warning (device sdd1): csum failed root 5 ino 20152 off 541986816 csum 0x7e36f534 expected csum 0x8941f998 mirror 1
Dec 27 19:11:04 v1ew-s0urce kernel: BTRFS error (device sdd1): bdev /dev/sdd1 errs: wr 0, rd 0, flush 0, corrupt 1045, gen 0
Dec 27 19:11:04 v1ew-s0urce kernel: BTRFS warning (device sdd1): csum failed root 5 ino 20152 off 541986816 csum 0xcbd848d6 expected csum 0x8941f998 mirror 2
Dec 27 19:11:04 v1ew-s0urce kernel: BTRFS error (device sdd1): bdev /dev/sdg1 errs: wr 0, rd 0, flush 0, corrupt 886, gen 0
Dec 27 19:11:52 v1ew-s0urce Docker Auto Update: Community Applications Docker Autoupdate running
v1ew-s0urce-diagnostics-20211227-1914.zip
JorgeB Posted December 27, 2021
That's expected: it is data corruption resulting from running the server with bad RAM. You can run a scrub on the pool; it will list all corrupt files in the syslog, and those files need to be deleted or restored from backups.
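The scrub described above can also be started from a terminal. The following is a minimal sketch in Python, assuming the pool is mounted at /mnt/cache (the path seen in the logs) and that the `btrfs` command-line tool is on the PATH. The helper names are hypothetical, and nothing is executed until `run` is called explicitly.

```python
import subprocess

# Assumption: the cache pool is mounted here, as the log paths above suggest.
POOL_MOUNT = "/mnt/cache"


def scrub_command(mount=POOL_MOUNT):
    """Build the scrub command; -B keeps it in the foreground until it finishes."""
    return ["btrfs", "scrub", "start", "-B", mount]


def stats_command(mount=POOL_MOUNT):
    """Build the command that prints the per-device error counters."""
    return ["btrfs", "device", "stats", mount]


def inode_resolve_command(inode, mount=POOL_MOUNT):
    """Build the command that maps an inode number from the syslog to a path."""
    return ["btrfs", "inspect-internal", "inode-resolve", str(inode), mount]


def run(cmd):
    """Run one of the commands above and return its standard output as text."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


# Example (not executed here; needs root and a mounted btrfs pool):
#   print(run(scrub_command()))
#   print(run(inode_resolve_command(20152)))  # inode taken from the log above
#   print(run(stats_command()))
```

Resolving an inode this way is only a convenience for finding the affected file; the scrub itself is what reports the corrupt files in the syslog.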
Maddeen Posted December 27, 2021 (edited)
Thanks @JorgeB, but how do I do that? I just found another thread where you replied, but sadly neither you nor the thread starter said how to do this.
JorgeB Posted December 27, 2021
On the main page, click on the first pool member, then: (screenshot)
Maddeen Posted December 27, 2021
That did the trick. 6 files were corrupt, luckily 5 within the Krusader docker and one on the cache drive itself. I deleted them and ran another scrub: all fine. The backup also worked fine. Thank you very much @JorgeB
May I ask you one more question? While searching for the corrupt files to delete, I noticed that I can't browse to /mnt/diskx or /mnt/cache. But I'm pretty sure that the last time I browsed (months ago) I could easily browse directly to the mount points. Did anything change?
Thanks again, and have a great and happy new year!
ChatNoir Posted December 28, 2021
Maddeen said: "May I ask you one more question? While searching for the corrupt files to delete, I noticed that I can't browse to /mnt/diskx or /mnt/cache. But I'm pretty sure that the last time I browsed (months ago) I could easily browse directly to the mount points. Did anything change?"
From an external machine via SMB? Is disk share enabled? (Settings / Global Share Settings)
Maddeen Posted December 28, 2021
@ChatNoir - no, straight via the Krusader docker, running in Unraid itself. Last time I could just browse to mnt and then see the directories for cache, disk 1, disk 2, disk 3, etc. But now, maybe since 6.9.2, the folder mnt is completely empty, and I only found the shares in /media/.
JorgeB Posted December 28, 2021
You should still be able to browse /mnt/disk#; nothing changed.