April 18, 2024: I had an SSD go bad in my cache pool last week, so I swapped in a new drive and I think I got everything fixed. Today I'm getting BTRFS errors and I'm not sure exactly what to do. Diagnostics attached: storage-diagnostics-20240418-1054.zip
April 18, 2024, Community Expert: The syslog has rotated, so we cannot see the beginning of the problem. Reboot and post new diags after array start.
April 18, 2024, Author: @JorgeB here are fresh diagnostics after a reboot. One thing I noticed is that my entire cache pool now says unmountable, when it was working fine last night. Any ideas? storage-diagnostics-20240418-1501.zip
April 19, 2024, Community Expert: Type:
btrfs rescue zero-log /dev/sdc1
Then re-start the array and post new diags.
April 19, 2024, Author: @JorgeB I ran the command in the terminal and received this response:
Clearing log on /dev/sdc1, previous log_root 4906986635264, level 0
I stopped and restarted the array, and it looks like my cache pool has come back online. Here are the new diagnostics: storage-diagnostics-20240419-0838.zip
April 19, 2024, Community Expert: Run a correcting scrub on the pool and post the results. P.S. You should change the Docker network to ipvlan, since there are macvlan call traces.
April 19, 2024, Author: @JorgeB Here are the results from the correcting scrub:
UUID: bbc56f07-1a5f-4d7b-b019-a515d7eb35aa
Scrub started: Fri Apr 19 08:48:42 2024
Status: finished
Duration: 0:39:21
Total to scrub: 1.26TiB
Rate: 563.20MiB/s
Error summary: csum=8
Corrected: 0
Uncorrectable: 8
Unverified: 0
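For readers who want to check a scrub summary like the one above from a script, here is a minimal sketch that pulls the counters out of the text. It only assumes the field names visible in the post (`Error summary`, `Corrected`, `Uncorrectable`, `Unverified`); the function names `parse_scrub_counts` and `needs_manual_cleanup` are made up for illustration and are not part of btrfs-progs or Unraid.

```python
import re

# Summary text as quoted in the post above (fields on one line here).
SUMMARY = (
    "UUID: bbc56f07-1a5f-4d7b-b019-a515d7eb35aa "
    "Scrub started: Fri Apr 19 08:48:42 2024 Status: finished "
    "Duration: 0:39:21 Total to scrub: 1.26TiB Rate: 563.20MiB/s "
    "Error summary: csum=8 Corrected: 0 Uncorrectable: 8 Unverified: 0"
)


def parse_scrub_counts(summary):
    """Return the numeric counters found in a scrub summary as a dict."""
    counts = {}
    for key in ("Corrected", "Uncorrectable", "Unverified"):
        match = re.search(rf"\b{key}:\s*(\d+)", summary)
        if match:
            counts[key.lower()] = int(match.group(1))
    # "Error summary:" may carry one or more name=value pairs, e.g. csum=8.
    errors = re.search(r"Error summary:\s*((?:\w+=\d+\s*)+)", summary)
    if errors:
        for name, value in re.findall(r"(\w+)=(\d+)", errors.group(1)):
            counts[name] = int(value)
    return counts


def needs_manual_cleanup(counts):
    """True when the scrub left errors it could not repair on its own."""
    return counts.get("uncorrectable", 0) > 0


if __name__ == "__main__":
    counts = parse_scrub_counts(SUMMARY)
    print(counts)
    print("manual cleanup needed:", needs_manual_cleanup(counts))
```

In this thread the scrub found 8 checksum errors and corrected none of them, so the affected files have to be identified and dealt with by hand.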
April 19, 2024, Community Expert: Look in the syslog for a list of corrupt file(s). Those should be deleted or restored from a backup; then re-run a scrub to confirm there aren't any more errors.
April 19, 2024, Author: @JorgeB looks like it's a couple of pieces of Plex and Jellyfin metadata, unless this is the wrong area to look at:
Apr 19 08:49:34 Storage kernel: BTRFS warning (device sdc1): checksum error at logical 19965161472 on dev /dev/sdc1, physical 10267930624, root 5, inode 686348049, offset 4096, length 4096, links 1 (path: appdata/jellyfin/data/metadata/People/E/Eamon Sheehan/folder.jpg)
Apr 19 08:49:34 Storage kernel: BTRFS error (device sdc1): bdev /dev/sdc1 errs: wr 335783, rd 191311, flush 1, corrupt 29971, gen 0
Apr 19 08:49:34 Storage kernel: BTRFS error (device sdc1): unable to fixup (regular) error at logical 19965161472 on dev /dev/sdc1
Apr 19 08:49:34 Storage kernel: BTRFS warning (device sdc1): checksum error at logical 19965161472 on dev /dev/sdb1, physical 9206771712, root 5, inode 686348049, offset 4096, length 4096, links 1 (path: appdata/jellyfin/data/metadata/People/E/Eamon Sheehan/folder.jpg)
Apr 19 08:49:34 Storage kernel: BTRFS error (device sdc1): bdev /dev/sdb1 errs: wr 0, rd 0, flush 0, corrupt 6, gen 0
Apr 19 08:49:34 Storage kernel: BTRFS error (device sdc1): unable to fixup (regular) error at logical 19965161472 on dev /dev/sdb1
Apr 19 08:49:34 Storage kernel: BTRFS warning (device sdc1): checksum error at logical 19965165568 on dev /dev/sdc1, physical 10267934720, root 5, inode 686348049, offset 8192, length 4096, links 1 (path: appdata/jellyfin/data/metadata/People/E/Eamon Sheehan/folder.jpg)
Apr 19 08:49:34 Storage kernel: BTRFS error (device sdc1): bdev /dev/sdc1 errs: wr 335783, rd 191311, flush 1, corrupt 29972, gen 0
Apr 19 08:49:34 Storage kernel: BTRFS error (device sdc1): unable to fixup (regular) error at logical 19965165568 on dev /dev/sdc1
Apr 19 08:49:34 Storage kernel: BTRFS warning (device sdc1): checksum error at logical 19965165568 on dev /dev/sdb1, physical 9206775808, root 5, inode 686348049, offset 8192, length 4096, links 1 (path: appdata/jellyfin/data/metadata/People/E/Eamon Sheehan/folder.jpg)
Apr 19 08:49:34 Storage kernel: BTRFS error (device sdc1): bdev /dev/sdb1 errs: wr 0, rd 0, flush 0, corrupt 7, gen 0
Apr 19 08:49:34 Storage kernel: BTRFS error (device sdc1): unable to fixup (regular) error at logical 19965165568 on dev /dev/sdb1
Apr 19 08:48:46 Storage kernel: BTRFS warning (device sdc1): checksum error at logical 2367467520 on dev /dev/sdc1, physical 1293725696, root 5, inode 812538915, offset 20480, length 4096, links 1 (path: appdata/PlexMediaServer/Library/Application Support/Plex Media Server/Media/localhost/8/4b1112dba0e382f5a87080425e1a7ac0d711dec.bundle/Contents/GoP-0.xml)
Apr 19 08:48:46 Storage kernel: BTRFS warning (device sdc1): checksum error at logical 2367467520 on dev /dev/sdb1, physical 199012352, root 5, inode 812538915, offset 20480, length 4096, links 1 (path: appdata/PlexMediaServer/Library/Application Support/Plex Media Server/Media/localhost/8/4b1112dba0e382f5a87080425e1a7ac0d711dec.bundle/Contents/GoP-0.xml)
Apr 19 08:48:46 Storage kernel: BTRFS error (device sdc1): bdev /dev/sdc1 errs: wr 335783, rd 191311, flush 1, corrupt 29969, gen 0
Apr 19 08:48:46 Storage kernel: BTRFS error (device sdc1): bdev /dev/sdb1 errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
Apr 19 08:48:46 Storage kernel: BTRFS error (device sdc1): unable to fixup (regular) error at logical 2367467520 on dev /dev/sdc1
Apr 19 08:48:46 Storage kernel: BTRFS error (device sdc1): unable to fixup (regular) error at logical 2367467520 on dev /dev/sdb1
Apr 19 08:48:46 Storage kernel: BTRFS warning (device sdc1): checksum error at logical 2367471616 on dev /dev/sdc1, physical 1293729792, root 5, inode 812538915, offset 24576, length 4096, links 1 (path: appdata/PlexMediaServer/Library/Application Support/Plex Media Server/Media/localhost/8/4b1112dba0e382f5a87080425e1a7ac0d711dec.bundle/Contents/GoP-0.xml)
Apr 19 08:48:46 Storage kernel: BTRFS error (device sdc1): bdev /dev/sdc1 errs: wr 335783, rd 191311, flush 1, corrupt 29970, gen 0
Apr 19 08:48:46 Storage kernel: BTRFS warning (device sdc1): checksum error at logical 2367471616 on dev /dev/sdb1, physical 199016448, root 5, inode 812538915, offset 24576, length 4096, links 1 (path: appdata/PlexMediaServer/Library/Application Support/Plex Media Server/Media/localhost/8/4b1112dba0e382f5a87080425e1a7ac0d711dec.bundle/Contents/GoP-0.xml)
Apr 19 08:48:46 Storage kernel: BTRFS error (device sdc1): bdev /dev/sdb1 errs: wr 0, rd 0, flush 0, corrupt 5, gen 0
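Picking the affected files out of a long syslog by eye is error-prone, since each bad 4 KiB block is reported once per mirror. The sketch below shows one way to reduce the "checksum error" lines to a unique list of paths. It assumes the log has one entry per line in the format quoted in this post; `corrupt_paths` is a hypothetical helper name, and the paths are relative to the pool root, exactly as the kernel prints them.

```python
import re

# Two warning lines and one counter line copied from the post above.
SAMPLE_LINES = [
    "Apr 19 08:49:34 Storage kernel: BTRFS warning (device sdc1): checksum error "
    "at logical 19965161472 on dev /dev/sdc1, physical 10267930624, root 5, "
    "inode 686348049, offset 4096, length 4096, links 1 "
    "(path: appdata/jellyfin/data/metadata/People/E/Eamon Sheehan/folder.jpg)",
    "Apr 19 08:49:34 Storage kernel: BTRFS error (device sdc1): bdev /dev/sdc1 "
    "errs: wr 335783, rd 191311, flush 1, corrupt 29971, gen 0",
    "Apr 19 08:48:46 Storage kernel: BTRFS warning (device sdc1): checksum error "
    "at logical 2367467520 on dev /dev/sdb1, physical 199012352, root 5, "
    "inode 812538915, offset 20480, length 4096, links 1 "
    "(path: appdata/PlexMediaServer/Library/Application Support/Plex Media Server"
    "/Media/localhost/8/4b1112dba0e382f5a87080425e1a7ac0d711dec.bundle/Contents/GoP-0.xml)",
]

# Captures the reporting device and the file path from a checksum warning.
CSUM_RE = re.compile(
    r"checksum error at logical \d+ on dev (\S+),.*\(path: (.*)\)\s*$"
)


def corrupt_paths(lines):
    """Map each corrupt file path to the set of devices that reported it."""
    found = {}
    for line in lines:
        match = CSUM_RE.search(line)
        if match:
            device, path = match.groups()
            found.setdefault(path, set()).add(device)
    return found


if __name__ == "__main__":
    for path, devices in sorted(corrupt_paths(SAMPLE_LINES).items()):
        print(f"{path}  (reported on: {', '.join(sorted(devices))})")
```

Run against the full excerpt above, this should collapse the checksum warnings to just two files, the Jellyfin `folder.jpg` and the Plex `GoP-0.xml`.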
April 19, 2024, Community Expert: You should delete those files if possible, though I don't know whether that will affect Plex.
April 19, 2024, Author: @JorgeB I deleted those files, re-ran a repair scrub, and it came back clean. I also moved Docker to ipvlan. Anything else that I need to do or watch for? Appreciate the help as always!
April 19, 2024, Community Expert: I would recommend resetting the current stats and monitoring the pool for future issues, since it can be much harder to resolve a problem if a bad or dropped device goes undetected for some time.
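As a rough illustration of what "monitoring the pool" can look like, the sketch below reads the per-device error counters that the kernel logs in `bdev ... errs:` lines (the format quoted earlier in this thread) and lists the devices whose counters are non-zero. The sample lines are copied from the thread; `latest_counters` and `devices_with_errors` are hypothetical helper names. With btrfs-progs the counters are normally reset with `btrfs device stats -z` on the pool's mount point, which the thread does not name.

```python
import re

# Counter lines copied from the syslog excerpt in this thread, oldest first.
SAMPLE_LINES = [
    "Apr 19 08:48:46 Storage kernel: BTRFS error (device sdc1): bdev /dev/sdc1 "
    "errs: wr 335783, rd 191311, flush 1, corrupt 29969, gen 0",
    "Apr 19 08:48:46 Storage kernel: BTRFS error (device sdc1): bdev /dev/sdb1 "
    "errs: wr 0, rd 0, flush 0, corrupt 4, gen 0",
    "Apr 19 08:49:34 Storage kernel: BTRFS error (device sdc1): bdev /dev/sdc1 "
    "errs: wr 335783, rd 191311, flush 1, corrupt 29972, gen 0",
    "Apr 19 08:49:34 Storage kernel: BTRFS error (device sdc1): bdev /dev/sdb1 "
    "errs: wr 0, rd 0, flush 0, corrupt 7, gen 0",
]

FIELDS = ("wr", "rd", "flush", "corrupt", "gen")
ERRS_RE = re.compile(
    r"bdev (\S+) errs: wr (\d+), rd (\d+), flush (\d+), corrupt (\d+), gen (\d+)"
)


def latest_counters(lines):
    """Return the most recent counter values seen for each device."""
    latest = {}
    for line in lines:
        match = ERRS_RE.search(line)
        if match:
            device = match.group(1)
            values = (int(v) for v in match.groups()[1:])
            latest[device] = dict(zip(FIELDS, values))  # later lines win
    return latest


def devices_with_errors(latest):
    """List devices that have at least one non-zero counter."""
    return sorted(dev for dev, c in latest.items() if any(c.values()))


if __name__ == "__main__":
    counters = latest_counters(SAMPLE_LINES)
    for device in devices_with_errors(counters):
        print(device, counters[device])
```

After a reset, any device that shows up in this list again is worth investigating promptly, which is the point of the advice above.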
April 20, 2024, Author: @JorgeB I woke up this morning to some more BTRFS errors. I've attached a new set of diagnostics: storage-diagnostics-20240420-0758.zip
April 21, 2024, Community Expert: The filesystem went read-only, which suggests that it still has other issues. Rebooting should make it read/write again; I then suggest backing up and re-creating the filesystem to make sure it doesn't happen again.
April 21, 2024, Author: @JorgeB does that mean I should format the cache drives and start over? Or how does a person recreate the filesystem?
April 22, 2024, Community Expert: (Quoting hermy65: "or how does a person recreate the filesystem?") You can format the pool using the GUI: change the filesystem to a different one and you can then format. Of course, you need to back up the pool first.
April 22, 2024, Author: @JorgeB thanks. Any reason to replace sdc1, since that's the one that keeps having errors?
April 22, 2024, Community Expert: I would replace/swap the cables, if that hasn't been done yet, but for now I don't see a reason to suspect a device problem.
April 23, 2024, Author: @JorgeB I got the cache pool re-formatted, etc. I just noticed some more BTRFS errors; attached are my diagnostics: storage-diagnostics-20240423-1113.zip Also, not sure if it's related, but after rebuilding my cache pool my VMs tab no longer works. It doesn't load any of the VMs I had or let me create VMs; it's just a blank page. (Edited April 23, 2024 by hermy65)
April 23, 2024, Community Expert: There appears to be a problem with libvirt.img, which is kind of strange since it looks like it's new. I also have no idea what this is about:
Apr 22 21:25:42 Storage root: initializing /etc/libvirt
Apr 22 21:25:42 Storage kernel: loop3: detected capacity change from 0 to 6000
Apr 22 21:25:42 Storage kernel: EXT4-fs (loop3): mounted filesystem with ordered data mode. Quota mode: disabled.
Apr 22 21:25:42 Storage kernel: ext4 filesystem being mounted at /etc/libvirt- supports timestamps until 2038 (0x7fffffff)
Apr 22 21:25:42 Storage kernel: EXT4-fs (loop3): unmounting filesystem.
Apr 22 21:25:42 Storage emhttpd: shcmd (1368): /etc/rc.d/rc.libvirt start
I've never seen libvirt try to mount an ext4 loop device after mounting the btrfs loop device, so I have no idea what's going on there, but if the image is new, try deleting and recreating it.
April 23, 2024, Author: @JorgeB OK, I deleted libvirt.img and then stopped and started the array. The VM page works and one of my VMs came back, so I should be good with that one. Were the BTRFS errors all pointing at the libvirt issue, or was that something else I need to take care of?
April 23, 2024, Community Expert: (Quoting hermy65: "Were the BTRFS errors all pointing at the libvirt issue") Yes.
April 23, 2024, Author: @JorgeB Thanks for all of the assistance here, hopefully I am finally past all of these problems!!