Womabre Posted July 5, 2021
This weekend my log got spammed with errors about the loop2 device. After some reading I decided to recreate the docker.img. Halfway through reinstalling all my containers I got errors, so I did some further digging and ran a BTRFS check on my cache pool. I also ran an extended SMART test, which reported no errors. Can someone assist me in how to proceed further?
[1/7] checking root items
[2/7] checking extents
extent item 395649875968 has multiple extent items
ref mismatch on [395649875968 872448] extent item 1, found 2
backref disk bytenr does not match extent record, bytenr=395649875968, ref bytenr=395650023424
backref bytes do not match extent backref, bytenr=395649875968, ref bytes=872448, backref bytes=7913472
backpointer mismatch on [395649875968 872448]
extent item 395656859648 has multiple extent items
ref mismatch on [395656859648 2101248] extent item 1, found 2
backref disk bytenr does not match extent record, bytenr=395656859648, ref bytenr=395657936896
backref bytes do not match extent backref, bytenr=395656859648, ref bytes=2101248, backref bytes=17575936
backpointer mismatch on [395656859648 2101248]
extent item 1702685908992 has multiple extent items
ref mismatch on [1702685908992 475136] extent item 1, found 3
backref disk bytenr does not match extent record, bytenr=1702685908992, ref bytenr=1702686289920
backref bytes do not match extent backref, bytenr=1702685908992, ref bytes=475136, backref bytes=49152
backref disk bytenr does not match extent record, bytenr=1702685908992, ref bytenr=1702686339072
backref bytes do not match extent backref, bytenr=1702685908992, ref bytes=475136, backref bytes=16384
backpointer mismatch on [1702685908992 475136]
extent item 1703009673216 has multiple extent items
ref mismatch on [1703009673216 421888] extent item 1, found 2
backref disk bytenr does not match extent record, bytenr=1703009673216, ref bytenr=1703010017280
backref bytes do not match extent backref, bytenr=1703009673216, ref bytes=421888, backref bytes=90112
backpointer mismatch on [1703009673216 421888]
ERROR: errors found in extent allocation tree or chunk allocation
[3/7] checking free space cache
[4/7] checking fs roots
root 5 inode 6963467 errors 1000, some csum missing
root 5 inode 25937011 errors 800, odd csum item
ERROR: errors found in fs roots
Opening filesystem to check...
Checking filesystem on /dev/mapper/sdh1
UUID: 21bd917c-3bff-4b16-8083-3cc37e866bc0
found 830958428160 bytes used, error(s) found
total csum bytes: 485124368
total tree bytes: 1190313984
total fs tree bytes: 431308800
total extent tree bytes: 155336704
btree space waste bytes: 215474656
file data blocks allocated: 1717250732032
 referenced 826999590912
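For anyone reading a check log like the one above, a small shell helper can boil it down to the two things that matter most at a glance: how many ERROR lines there are and how many distinct inodes are flagged. This is only a sketch and not something from the original post; the function name summarize_check is made up for illustration, and it does nothing but read text on stdin.

```shell
#!/bin/bash
# Summarise `btrfs check` output supplied on stdin.
# summarize_check is a hypothetical helper name, not a btrfs-progs command.
# Prints the number of lines starting with "ERROR:" and the number of
# distinct "root N inode M" entries that carry an error code.
summarize_check() {
    awk '
        /^ERROR:/ { errors++ }
        /^root [0-9]+ inode [0-9]+ errors/ { inodes[$2 "/" $4] = 1 }
        END {
            for (k in inodes) n++
            printf "error_lines=%d\n", errors
            printf "affected_inodes=%d\n", n
        }
    '
}
```

Assuming the pool is not mounted (array in Maintenance Mode, as in this thread), it could be fed with `btrfs check --readonly /dev/mapper/sdh1 2>&1 | summarize_check`. The `2>&1` is there because the ERROR lines are written to stderr. The read-only mode only reports problems and does not try to repair anything.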
trurl Posted July 5, 2021
Go to Tools - Diagnostics and attach the complete Diagnostics ZIP file to your NEXT post in this thread.
Womabre Posted July 5, 2021
Hi trurl. Here are the diagnostics.
breedveld-diagnostics-20210705-2313.zip
trurl Posted July 5, 2021
Sorry, please start the array and post new diagnostics in your NEXT post.
Womabre Posted July 5, 2021
Sorry, I still had the array in Maintenance Mode. Below are the diagnostics after a normal start. I also noticed that I now get this message:
Unmountable disk present:
Cache • Samsung_SSD_860_QVO_1TB_S4CZNF0M744639K (sdh)
Cache 2 • ADATA_SU800_2I4820059015 (sdi)
breedveld-diagnostics-20210705-2323.zip
Womabre Posted July 5, 2021
Just ran the following commands I found here:
blkid
btrfs fi show 21bd917c-3bff-4b16-8083-3cc37e866bc0
Maybe create a new config with the drives?
root@Breedveld:~# blkid
/dev/loop0: TYPE="squashfs"
/dev/loop1: TYPE="squashfs"
/dev/sda1: LABEL_FATBOOT="UNRAID" LABEL="UNRAID" UUID="272C-EBE2" BLOCK_SIZE="512" TYPE="vfat"
/dev/sdb1: UUID="9535ccd7-a35f-4ae7-9b81-1df827d3ff81" TYPE="crypto_LUKS" PARTUUID="43b637af-1d86-4885-9238-6deded95ffc5"
/dev/sdc1: UUID="1b5ff8ec-dc95-4366-ae24-2aab29dbc19d" TYPE="crypto_LUKS" PARTUUID="dd35aaa3-1958-4f9f-95ae-dc2590c42fe9"
/dev/sdf1: UUID="48a9fda0-8361-4d1e-a7f9-8797feb7d36d" TYPE="crypto_LUKS" PARTUUID="13e10a64-25b3-422f-b4a1-a404e7b9fd4f"
/dev/sdh1: UUID="6cf73bfd-fbca-434d-bc43-6882086e40b3" TYPE="crypto_LUKS"
/dev/sdg1: UUID="d45aaeb9-8d4c-434c-a338-05fda58744b8" TYPE="crypto_LUKS" PTTYPE="atari" PARTUUID="1d83dbb0-8657-4d63-8459-ac8a1cf9073b"
/dev/sdi1: UUID="a4f92928-fd20-4a35-b000-a5a2911ef80d" TYPE="crypto_LUKS"
/dev/md1: UUID="48a9fda0-8361-4d1e-a7f9-8797feb7d36d" TYPE="crypto_LUKS"
/dev/md2: UUID="d45aaeb9-8d4c-434c-a338-05fda58744b8" TYPE="crypto_LUKS" PTTYPE="atari"
/dev/md3: UUID="9535ccd7-a35f-4ae7-9b81-1df827d3ff81" TYPE="crypto_LUKS"
/dev/md4: UUID="1b5ff8ec-dc95-4366-ae24-2aab29dbc19d" TYPE="crypto_LUKS"
/dev/mapper/md1: UUID="410acbe9-5e05-4cbb-a6cc-7468a1594335" BLOCK_SIZE="512" TYPE="xfs"
/dev/mapper/md2: UUID="0edff695-7439-4bf4-afab-714828a33068" BLOCK_SIZE="512" TYPE="xfs"
/dev/mapper/md3: UUID="999f938a-e3d4-406e-adee-29ffc84a11a5" BLOCK_SIZE="512" TYPE="xfs"
/dev/mapper/md4: UUID="7220097f-6376-4713-9962-48d664aed857" BLOCK_SIZE="512" TYPE="xfs"
/dev/mapper/sdh1: UUID="21bd917c-3bff-4b16-8083-3cc37e866bc0" UUID_SUB="c9064e5a-ba80-448a-9440-19cd64187136" BLOCK_SIZE="4096" TYPE="btrfs"
/dev/mapper/sdi1: UUID="21bd917c-3bff-4b16-8083-3cc37e866bc0" UUID_SUB="fbcdd800-48d6-4c13-a209-a3ed25321280" BLOCK_SIZE="4096" TYPE="btrfs"
/dev/sdd1: PARTUUID="1646878a-800f-4ba7-a893-713aef59900d"
/dev/sde1: PARTUUID="8095e8cc-a2ca-4ff0-9d3f-a22f526a1f51"
root@Breedveld:~# btrfs fi show 21bd917c-3bff-4b16-8083-3cc37e866bc0
Label: none uuid: 21bd917c-3bff-4b16-8083-3cc37e866bc0
    Total devices 2 FS bytes used 773.91GiB
    devid 1 size 931.50GiB used 866.03GiB path /dev/mapper/sdh1
    devid 2 size 953.85GiB used 866.03GiB path /dev/mapper/sdi1
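The useful bit buried in that blkid listing is that both unlocked cache devices report the same btrfs filesystem UUID (they differ only in UUID_SUB), which is what you would expect from two members of one pool. A rough sketch for pulling just that out of the output follows; btrfs_members is a made-up helper name, and it only filters text on stdin.

```shell
#!/bin/bash
# List btrfs devices from `blkid` output (stdin) as "FS_UUID DEVICE" pairs,
# sorted so that members of the same pool end up next to each other.
# btrfs_members is a hypothetical helper name used for illustration.
btrfs_members() {
    grep 'TYPE="btrfs"' \
        | sed -E 's/^([^:]+):.* UUID="([^"]+)".*/\2 \1/' \
        | sort
}
```

On the server this could be run as `blkid | btrfs_members`. With the output posted above it would list /dev/mapper/sdh1 and /dev/mapper/sdi1 under the same UUID, 21bd917c-3bff-4b16-8083-3cc37e866bc0.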
JorgeB Posted July 6, 2021
There are some recovery options here. The btrfs restore option is the most likely to work in this case. You should also run memtest, since data corruption was being detected in the pool.
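For reference, the btrfs restore route could look roughly like the sketch below. It is not taken from the linked recovery options; the wrapper name restore_pool and both arguments are placeholders. It assumes the LUKS device is already unlocked, the pool is not mounted, and the destination is on a different, healthy filesystem with enough free space.

```shell
#!/bin/bash
# Copy files off an unmountable btrfs device with `btrfs restore`.
# Usage: restore_pool DEVICE DEST_DIR
# restore_pool is a hypothetical wrapper; DEVICE and DEST_DIR are placeholders.
restore_pool() {
    local dev="$1" dest="$2"
    if [ -z "$dev" ] || [ -z "$dest" ]; then
        echo "usage: restore_pool DEVICE DEST_DIR" >&2
        return 2
    fi
    mkdir -p "$dest" || return 1
    # Dry run first (-D): lists what would be restored without writing files.
    btrfs restore -D -v "$dev" "$dest" || return 1
    # Real run: -i carries on past errors instead of stopping at the first one.
    btrfs restore -v -i "$dev" "$dest"
}
```

A call might look like `restore_pool /dev/mapper/sdh1 /mnt/disk1/restore`, where the destination path is just an example. The dry run is there so that you can see whether restore finds the expected files before anything is written.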
Womabre Posted July 7, 2021
The btrfs restore was successful, and after that a repair got the drives mountable again. I've been running memtest for the last 14 hours. Everything seems fine so far. I will keep it running until it at least hits the 24h mark.
itimpi Posted July 7, 2021
I thought the maximum officially supported RAM speed on that CPU without over-clocking was 3200MHz? I could be wrong about that, but it might be worth checking if you have any stability issues.
JorgeB Posted July 7, 2021
10 minutes ago, itimpi said: "I thought the maximum officially supported RAM speed on that CPU without over-clocking was 3200MHz?"
It is, and running Ryzen with overclocked RAM is known to cause data corruption; see here for more info.
ChatNoir Posted July 7, 2021
Had the same reaction earlier, but the initial diagnostics show the RAM at 2667 if I remember correctly.
itimpi Posted July 7, 2021
2 hours ago, ChatNoir said: "Had the same reaction earlier but the initial diagnostics show the RAM at 2667 if I remember correctly."
I assumed that the speed shown in the memtest display is what the RAM is set to run at? Are you suggesting that display may not be accurate?
tjb_altf4 Posted July 8, 2021
3 hours ago, itimpi said: "I assumed that the speed shown in the memtest display is what the RAM is set to run at? Are you suggesting that display may not be accurate?"
I think you are looking at CPU clock, not RAM clock.
Womabre Posted July 8, 2021
Just checked all the BIOS settings. No overclocking is enabled anywhere. I think the frequency you see in memtest is the CPU clock. Memtest has been running for quite a while now with no errors. I'm thinking maybe my SSD is dying...
Squid Posted July 8, 2021
The memory is currently running at 2666:
Speed: 2666 MT/s
Manufacturer: Samsung
Serial Number: 03CF9BAE
Asset Tag: Not Specified
Part Number: M391A4G43MB1-CTD
Rank: 2
Configured Memory Speed: 2666 MT/s
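Fields like the ones quoted above are what dmidecode prints for each memory module, so the configured speed can be checked from the command line without rebooting into the BIOS or memtest. A minimal sketch follows, assuming the field is labelled "Configured Memory Speed" as in the quoted output (other dmidecode versions may label it differently); configured_mem_speeds is a made-up helper name.

```shell
#!/bin/bash
# Print the distinct "Configured Memory Speed" values found in
# `dmidecode` memory output supplied on stdin.
# configured_mem_speeds is a hypothetical helper name.
configured_mem_speeds() {
    sed -n 's/^[[:space:]]*Configured Memory Speed:[[:space:]]*//p' \
        | grep -v '^Unknown' \
        | sort -u
}
```

On the server it could be used as `dmidecode --type memory | configured_mem_speeds`, run as root. A single line of output means all populated modules are configured for the same speed.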
Womabre Posted July 8, 2021
Just restarted the server again. Everything was looking OK, so I started recreating my docker.img. After a few minutes the log was getting spammed with errors, and the btrfs file system switched to read-only.
breedveld-diagnostics-20210708-1540.zip
JorgeB Posted July 8, 2021
There's still corruption on the pool filesystem. I would recommend re-creating it: back up, format, restore data.
JorgeB Posted July 8, 2021
Forgot to mention, take a look here for better pool monitoring, so you're immediately notified if it finds new data corruption.
Womabre Posted July 10, 2021
I have everything running smoothly again. Thanks everyone for all the help, it was very useful! 🙂
On 7/8/2021 at 6:08 PM, JorgeB said: "Forgot to mention, take a look here for better pool monitoring, so you're immediately notified if it finds new data corruption."
Thanks! I set up that script to run hourly.
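The monitoring script referred to above is not reproduced in this thread. As a rough idea of what such a check can look like, the sketch below reads the per-device error counters from `btrfs device stats` and reports any that are non-zero. It is an illustration, not the linked script: the function names and the /mnt/cache default mount point are assumptions.

```shell
#!/bin/bash
# Print every non-zero counter from `btrfs device stats` output on stdin.
# Returns 0 when all counters are zero, 1 otherwise.
# Both function names here are hypothetical.
nonzero_stats() {
    awk 'NF == 2 && $2 + 0 != 0 { print; bad = 1 } END { exit bad }'
}

# Check one mounted pool; /mnt/cache is an assumed default mount point.
check_pool() {
    local mnt="${1:-/mnt/cache}" stats bad
    stats="$(btrfs device stats "$mnt")" || {
        echo "could not read btrfs stats for $mnt" >&2
        return 2
    }
    if bad="$(printf '%s\n' "$stats" | nonzero_stats)"; then
        echo "OK: no btrfs errors recorded on $mnt"
    else
        echo "WARNING: btrfs errors recorded on $mnt"
        echo "$bad"
        return 1
    fi
}
```

Run hourly from cron or a similar scheduler, `check_pool /mnt/cache` returns non-zero as soon as any counter (for example corruption_errs) is above zero, and that exit status can be used to trigger whatever notification you prefer.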