January 18

I found my Docker applications not responding, and I believe it is because my cache filled up. There were lots of messages in the log about I/O errors and "BTRFS error (device loop2)", which I googled and found to be related to docker.img (I have a docker folder, if that is relevant). That seems to make sense, because I did fill up my cache pool.

I tried to stop the array and it is stuck while trying to unmount disks. There are a lot of these messages in the log, and they seem to be endless. I was still able to get system diagnostics while it was doing this, so I have attached them.

I am wondering what I can do. Is it possible to interrupt it besides pulling the power?

Jan 18 04:21:44 Tower emhttpd: Unmounting disks...
Jan 18 04:21:44 Tower emhttpd: shcmd (10713): umount /mnt/disk3
Jan 18 04:21:44 Tower root: umount: /mnt/disk3: target is busy.
Jan 18 04:21:44 Tower emhttpd: shcmd (10713): exit status: 32
Jan 18 04:21:44 Tower emhttpd: shcmd (10714): umount /mnt/cache
Jan 18 04:21:44 Tower root: umount: /mnt/cache: target is busy.
Jan 18 04:21:44 Tower emhttpd: shcmd (10714): exit status: 32
Jan 18 04:21:44 Tower emhttpd: Retry unmounting disk share(s)...
Jan 18 04:21:47 Tower kernel: I/O error, dev loop2, sector 75904 op 0x0:(READ) flags 0x1000 phys_seg 4 prio class 0
Jan 18 04:21:47 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 809, flush 0, corrupt 0, gen 0
Jan 18 04:21:47 Tower kernel: I/O error, dev loop2, sector 180736 op 0x0:(READ) flags 0x1000 phys_seg 4 prio class 0
Jan 18 04:21:47 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 810, flush 0, corrupt 0, gen 0
Jan 18 04:21:49 Tower emhttpd: Unmounting disks...
Jan 18 04:21:49 Tower emhttpd: shcmd (10715): umount /mnt/disk3
Jan 18 04:21:49 Tower root: umount: /mnt/disk3: target is busy.
Jan 18 04:21:49 Tower emhttpd: shcmd (10715): exit status: 32
Jan 18 04:21:49 Tower emhttpd: shcmd (10716): umount /mnt/cache
Jan 18 04:21:49 Tower root: umount: /mnt/cache: target is busy.
Jan 18 04:21:49 Tower emhttpd: shcmd (10716): exit status: 32
Jan 18 04:21:49 Tower emhttpd: Retry unmounting disk share(s)...
Jan 18 04:21:52 Tower kernel: I/O error, dev loop2, sector 75904 op 0x0:(READ) flags 0x1000 phys_seg 4 prio class 0
Jan 18 04:21:52 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 811, flush 0, corrupt 0, gen 0
Jan 18 04:21:52 Tower kernel: I/O error, dev loop2, sector 180736 op 0x0:(READ) flags 0x1000 phys_seg 4 prio class 0
Jan 18 04:21:52 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 812, flush 0, corrupt 0, gen 0
Jan 18 04:21:54 Tower emhttpd: Unmounting disks...
Jan 18 04:21:54 Tower emhttpd: shcmd (10717): umount /mnt/disk3
Jan 18 04:21:54 Tower root: umount: /mnt/disk3: target is busy.
Jan 18 04:21:54 Tower emhttpd: shcmd (10717): exit status: 32
Jan 18 04:21:54 Tower emhttpd: shcmd (10718): umount /mnt/cache
Jan 18 04:21:54 Tower root: umount: /mnt/cache: target is busy.
Jan 18 04:21:54 Tower emhttpd: shcmd (10718): exit status: 32
Jan 18 04:21:54 Tower emhttpd: Retry unmounting disk share(s)...
Jan 18 04:21:57 Tower kernel: I/O error, dev loop2, sector 75904 op 0x0:(READ) flags 0x1000 phys_seg 4 prio class 0
Jan 18 04:21:57 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 813, flush 0, corrupt 0, gen 0
Jan 18 04:21:57 Tower kernel: I/O error, dev loop2, sector 180736 op 0x0:(READ) flags 0x1000 phys_seg 4 prio class 0
Jan 18 04:21:57 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 814, flush 0, corrupt 0, gen 0
Jan 18 04:21:59 Tower emhttpd: Unmounting disks...
Jan 18 04:21:59 Tower emhttpd: shcmd (10719): umount /mnt/disk3
Jan 18 04:21:59 Tower root: umount: /mnt/disk3: target is busy.
Jan 18 04:21:59 Tower emhttpd: shcmd (10719): exit status: 32
Jan 18 04:21:59 Tower emhttpd: shcmd (10720): umount /mnt/cache
Jan 18 04:21:59 Tower root: umount: /mnt/cache: target is busy.
Jan 18 04:21:59 Tower emhttpd: shcmd (10720): exit status: 32
Jan 18 04:21:59 Tower emhttpd: Retry unmounting disk share(s)...
Jan 18 04:22:01 Tower kernel: buffer_io_error: 3 callbacks suppressed
Jan 18 04:22:01 Tower kernel: Buffer I/O error on dev md2p1, logical block 0, async page read
Jan 18 04:22:01 Tower kernel: Buffer I/O error on dev md5p1, logical block 0, async page read
Jan 18 04:22:01 Tower kernel: Buffer I/O error on dev md1p1, logical block 0, async page read
Jan 18 04:22:01 Tower kernel: I/O error, dev loop2, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Jan 18 04:22:01 Tower kernel: Buffer I/O error on dev loop2, logical block 0, async page read
Jan 18 04:22:01 Tower kernel: Buffer I/O error on dev md4p1, logical block 0, async page read
Jan 18 04:22:01 Tower kernel: Buffer I/O error on dev md3p1, logical block 0, async page read
Jan 18 04:22:01 Tower kernel: Buffer I/O error on dev md3p1, logical block 1, async page read
Jan 18 04:22:01 Tower kernel: Buffer I/O error on dev md3p1, logical block 2, async page read
Jan 18 04:22:01 Tower kernel: Buffer I/O error on dev md3p1, logical block 3, async page read
Jan 18 04:22:01 Tower kernel: Buffer I/O error on dev md3p1, logical block 4, async page read
Jan 18 04:22:03 Tower kernel: I/O error, dev loop2, sector 75904 op 0x0:(READ) flags 0x1000 phys_seg 4 prio class 0
Jan 18 04:22:03 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 815, flush 0, corrupt 0, gen 0
Jan 18 04:22:03 Tower kernel: I/O error, dev loop2, sector 180736 op 0x0:(READ) flags 0x1000 phys_seg 4 prio class 0
Jan 18 04:22:03 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 816, flush 0, corrupt 0, gen 0
Jan 18 04:22:04 Tower emhttpd: Unmounting disks...

tower-diagnostics-20260118-0420.zip
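In the BTRFS lines of the log excerpt, the `rd` counter climbs from 809 to 816 in about 16 seconds while `wr`, `flush`, `corrupt`, and `gen` stay at 0. The following is a minimal Python sketch for pulling those counters out of a saved syslog, assuming the lines follow the format shown in the excerpt. The function names are illustrative only and are not part of Unraid or any btrfs tool.

```python
import re

# Matches kernel lines such as:
#   BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 809, flush 0, corrupt 0, gen 0
BTRFS_ERRS = re.compile(
    r"BTRFS error \(device (?P<dev>[^)\s]+)\): bdev \S+ errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def btrfs_error_counters(log_text):
    """Return {device: [counter dicts in log order]} for every BTRFS errs line."""
    history = {}
    for line in log_text.splitlines():
        match = BTRFS_ERRS.search(line)
        if match is None:
            continue  # not a BTRFS counter line (umount retries, buffer errors, ...)
        counters = {k: int(v) for k, v in match.groupdict().items() if k != "dev"}
        history.setdefault(match.group("dev"), []).append(counters)
    return history


def read_error_growth(log_text, device):
    """Growth of the read-error counter between the first and last sample."""
    samples = btrfs_error_counters(log_text).get(device, [])
    if not samples:
        return 0
    return samples[-1]["rd"] - samples[0]["rd"]


# Three lines copied from the log in the post above
SAMPLE = """\
Jan 18 04:21:44 Tower root: umount: /mnt/cache: target is busy.
Jan 18 04:21:47 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 809, flush 0, corrupt 0, gen 0
Jan 18 04:22:03 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 816, flush 0, corrupt 0, gen 0
"""

growth = read_error_growth(SAMPLE, "loop2")
```

A steadily growing `rd` value with everything else at zero only shows that reads from the loop device keep failing; it does not by itself identify the cause.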
January 19 Author

I asked Unraid to shut down and it actually managed to do so. I guess it managed to unmount the disks.

However, when I brought the array back online, disk 4 appeared to have major filesystem corruption. It is unmountable, and when I ran a filesystem check in maintenance mode (xfs_repair -n) it showed a lot of problems. I guess I will have to rebuild the disk onto a new one.

Edited January 19 by unburt
January 19 Community Expert

Rebuilding the disk will not fix the filesystem issues; run xfs_repair without -n, or use the GUI to try to fix it.
January 19 Author

Doesn't rebuilding the disk rewrite every sector of the disk, filesystem and all? So, while it's not a direct fix for the filesystem corruption, it would resolve the issue, if I understand correctly.

I realize that asking how long xfs_repair will take is a very open-ended question, but I'm concerned that it may take days and days trying to correct the errors. I know that a rebuild takes approximately 1 day for my 16TB drives, so I'm comfortable with that amount of waiting. Is there any way to guess how long xfs_repair will take relative to a rebuild?

I've attached a new diagnostics file that I generated after several restarts of the server, just poking around trying to gauge the health of the system before I bring the array online.

tower-diagnostics-20260119-1309.zip

Edited January 19 by unburt
January 19 Community Expert

5 minutes ago, unburt said:
Doesn't rebuilding the disk rewrite every sector of the disk, filesystem and all? So, while it's not a direct fix for the filesystem corruption, it would resolve the issue if I understand correctly.

When a disk has filesystem corruption, the rebuilt disk will typically have exactly the same corruption, assuming parity is in sync.
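The reasoning behind that reply can be illustrated with a toy single-parity model. This is only a sketch, assuming plain byte-wise XOR parity over equally sized blocks; the disk contents and function names are made up for illustration, and a real array works on whole devices rather than short byte strings.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks (toy single-parity model)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_disks, parity):
    """Reconstruct the missing disk from the other data disks plus parity."""
    return xor_blocks(list(surviving_disks) + [parity])


# Hypothetical contents; all three blocks have the same length
disk1 = b"good data 1"
disk2 = b"good data 2"
disk4_corrupt = b"c0rrupt f5!"  # stands in for a damaged filesystem

# Parity stayed in sync while the damage was written, so it encodes the bad bytes
parity = xor_blocks([disk1, disk2, disk4_corrupt])

# Rebuilding disk 4 reproduces exactly what parity describes, corruption included
rebuilt = rebuild([disk1, disk2], parity)
```

In this model `rebuilt` equals `disk4_corrupt` byte for byte: parity has no notion of a filesystem, so it cannot tell valid metadata from damaged metadata.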
January 19 Community Expert

6 minutes ago, unburt said:
I realize that asking how long xfs_repair will take is a very open ended question.

Typically it takes a few seconds or minutes.
January 19 Author

16 minutes ago, JorgeB said:
When a disk has filesystem corruption, the rebuilt disk will typically have exactly the same corruption, assuming parity is in sync.

16 minutes ago, JorgeB said:
Typically it takes a few seconds or minutes.

Oh, I see. That makes sense. I guess I will do the filesystem repair then. Thank you for the advice.
January 19 Community Expert

10 hours ago, JorgeB said:
use the GUI to try and fix it.

It is easy to get the command line wrong, even more so for an array disk.

Post new diagnostics afterwards.
January 19 Author

I am now trying to proceed using the GUI, but I am not sure if I'm going about this correctly. It suggests I should mount the filesystem; how do I do that?

I have started the array in maintenance mode.

I clicked on the disabled disk's name ("Disk 4") and was brought to its settings page at http://192.168.1.152/Main/Device?name=disk4

In the "Check Filesystem Status" section, I clicked the button to check the filesystem. It output a lot of identified errors in the box labeled "xfs_repair status".

The button changed to something like "fix" (I forgot the exact wording) and I clicked it. It output the following in the "xfs_repair status" box:

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to be replayed. Mount the filesystem to replay the log, and unmount it before re-running xfs_repair. If the filesystem is a snapshot of a mounted filesystem, you may need to give mount the nouuid option. If you are unable to mount the filesystem, then use the -L option to destroy the log and attempt a repair. Note that destroying the log may cause corruption -- please attempt a mount of the filesystem before doing this.

The button has now changed to say "ZERO LOG" with the note "Dirty log detected." written beside it, and the following warning beneath: "Note: While there is some risk, if it is not possible to first mount the filesystem to clear the log, zeroing it is the only option to try and repair the filesystem, and in most cases it results in little or no data loss."

I'm stuck on how to mount the filesystem.

I have tried stopping the array (exiting Maintenance Mode) and starting the array normally. It started up, and Disk 4's contents were emulated. Disk 4 does not seem to get mounted, due to the existing errors. Then I stopped the array again and started it in Maintenance Mode once more to check the xfs_repair status and see if perhaps any new output appeared. Nothing changed; the same text as before remained. What should I do?

I've attached a screenshot in case it helps illustrate what I'm seeing. The upper portion shows my array devices and the "x" icon beside Disk 4. The lower portion shows the status page of Disk 4 and my progress so far with xfs_repair.

Edited January 19 by unburt
January 19 Community Expert

Unraid has already failed to mount the filesystem, so you need to select the option to zero the log. After doing that, restart the array in normal mode and the drive should mount.
January 20 Author

Zeroing the log and fixing the errors seems to have completed successfully. However, Disk 4 is still disabled with the 'x' icon. I've attached new diagnostics to this post.

I rebooted the server, started it in Maintenance Mode, and did another filesystem check on that disk. It says there are no errors.

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 4
        - agno = 11
        - agno = 14
        - agno = 3
        - agno = 6
        - agno = 5
        - agno = 8
        - agno = 9
        - agno = 1
        - agno = 10
        - agno = 13
        - agno = 12
        - agno = 2
        - agno = 7
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

tower-diagnostics-20260120-0911.zip
January 20 Community Expert

22 minutes ago, unburt said:
Disk 4 is still disabled with the 'x' icon.

You have to rebuild it.
January 20 Community Expert

We usually prefer diagnostics taken with the array started; when no disks are mounted, we can't see anything about your filesystems.
January 20 Community Expert

1 hour ago, unburt said:
Zeroing the log and fixing the errors seemed to have completed successfully. However, Disk 4 is still disabled with the 'x' icon

You fixed the corrupt filesystem on the emulated drive. To clear a red ‘x’ (disabled) state, a rebuild is needed.
January 20 Community Expert

1 hour ago, unburt said:
did another filesystem check on that disk. It says there are no errors.

It is not clear from your check output whether there were no errors, and it also says nothing was fixed on that pass, since it ran in no-modify mode.

It might be a good idea to start the array in Normal (not Maintenance) mode and post new diagnostics, so we can make sure the disabled/emulated disk is mounting before you attempt a rebuild.
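The two xfs_repair outputs posted in this thread can be told apart by fixed phrases they contain. The helper below is a minimal sketch, assuming those phrases appear verbatim as they do in the posts; `summarize_xfs_repair` is a hypothetical name, and the function only does string matching, so it cannot say whether a check actually found errors.

```python
def summarize_xfs_repair(output):
    """Flag two conditions that appear verbatim in the outputs posted in this thread."""
    return {
        # Printed when xfs_repair stops because the log still needs to be replayed
        "dirty_log": "valuable metadata changes in a log" in output,
        # Printed by a check-only pass (xfs_repair -n), which never writes to the disk
        "no_modify": "No modify flag set" in output,
    }


# Excerpts copied from the posts above
DIRTY_LOG_RUN = (
    "Phase 1 - find and verify superblock...\n"
    "ERROR: The filesystem has valuable metadata changes in a log which needs to be replayed.\n"
)
CHECK_ONLY_RUN = (
    "Phase 7 - verify link counts...\n"
    "No modify flag set, skipping filesystem flush and exiting.\n"
)

dirty = summarize_xfs_repair(DIRTY_LOG_RUN)
check = summarize_xfs_repair(CHECK_ONLY_RUN)
```

A run flagged `no_modify` reports what it would do but changes nothing, which is why a clean-looking pass in that mode is not proof that a repair happened.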
January 20 Author

1 hour ago, trurl said:
We usually prefer diagnostics taken with the array started, since we can't see anything about your filesystems since no disks are mounted.

Sorry, I did not realize the information would be lacking the way I did it. I have started the array and taken another diagnostics file.

I also have the output from earlier, when I ran the xfs_repair fix (after I zeroed the log).

tower-diagnostics-20260120-1225.zip
xfs_repair disk 4 fix after zero log.txt
January 20 Community Expert

Emulated disk4 mounted, should be OK to rebuild. Looks like repair ended up with some lost+found.
January 20 Author

Thanks for looking at it with me. So, to rebuild the disk I will:

1. Write down the model and serial number of the disk.
2. Unassign the disk.
3. Start the array.
4. Stop the array.
5. Reassign the disk (using the model/serial number to be sure I'm assigning the correct disk).
6. Start the array, at which point the rebuild will begin.

Does that sound right?
January 20 Author

Thanks! I started a rebuild. Hopefully I'm back to normal in about a day (16TB drives).
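As a rough sanity check on the "about a day" estimate, the average throughput it implies can be worked out directly. This is a back-of-the-envelope sketch, assuming decimal units (1 TB = 1,000,000 MB) and a constant rate over the whole disk, which real drives do not sustain; the function name is illustrative.

```python
def average_rate_mb_s(capacity_tb, hours):
    """Average MB/s needed to cover capacity_tb in the given hours (decimal units)."""
    return capacity_tb * 1_000_000 / (hours * 3600)


# 16 TB in 24 hours works out to roughly 185 MB/s on average
rate = average_rate_mb_s(16, 24)
```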
January 21 Author

My rebuild completed, and it seems like the array is back to normal. Thank you for the help!

tower-diagnostics-20260121-1432.zip