xjumper84 Posted January 20 (edited)

I use this box for Frigate and Plex in Docker containers. A week ago I ran a parity check and everything was good. Then a 4TB drive started showing a ton of errors, so I pulled it, replaced it with a 10TB drive, and rebuilt the array. After the rebuild no shares show up, even though all the files are physically on the drives. In a terminal, the drive I rebuilt reports "structure needs cleaning", and the other drives give "cannot access 'folder name': Input/output error". I had a billion errors during the rebuild. I still have the drive I took out. Diagnostics attached. Any help would be appreciated. Thank you.

nas-diagnostics-20240119-1721.zip
JorgeB Posted January 20

Looks like a controller issue: there are errors on all connected disks, which can happen with onboard Ryzen SATA controllers under heavy load:

Jan 19 04:57:31 NAS kernel: md: disk0 read error, sector=1728810592
Jan 19 04:57:31 NAS kernel: md: disk3 read error, sector=1728810784
Jan 19 04:57:31 NAS kernel: md: disk1 read error, sector=1728810520
Jan 19 04:57:31 NAS kernel: md: disk2 read error, sector=1728810504

So the disk4 rebuild would not have been successful. Reboot and post new diagnostics after starting the array.
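A quick way to see that the read errors span every disk (which points at the controller or cabling rather than any one drive) is to tally them per device. A minimal sketch, fed with sample lines like the excerpt above; on a live box you would feed it /var/log/syslog instead:

```shell
# Count "md: diskN read error" lines per disk. Several disks erroring at
# the same moment usually implicates the shared controller, not the drives.
count_read_errors() {
  awk '/md: disk[0-9]+ read error/ {
         for (i = 1; i <= NF; i++)
           if ($i ~ /^disk[0-9]+$/) n[$i]++
       }
       END { for (d in n) print d, n[d] }' "$@" | sort
}

# Sample input mimicking the syslog excerpt above:
count_read_errors <<'EOF'
Jan 19 04:57:31 NAS kernel: md: disk0 read error, sector=1728810592
Jan 19 04:57:31 NAS kernel: md: disk3 read error, sector=1728810784
Jan 19 04:57:31 NAS kernel: md: disk1 read error, sector=1728810520
Jan 19 04:57:31 NAS kernel: md: disk2 read error, sector=1728810504
EOF
```

On a real diagnostics zip you would run it against the syslog file inside it, e.g. `count_read_errors logs/syslog` (path is an assumption; the layout varies by Unraid version).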
xjumper84 Posted January 20

It had been trying to reboot for almost 10 minutes, stuck on disk5 with 'target is busy', so I did a hard power-off and checked that the SATA cables were seated at the drives and the motherboard. The reboot came up fine; the array wants to do a parity check, which I paused so I could create the diagnostics file (attached).

Question: if Frigate was running while the drive was rebuilding, could that have caused issues with parity, since Frigate was saving camera files to the other disks?

nas-diagnostics-20240120-0817.zip
xjumper84 Posted January 20

I'm now seeing a lot of errors on the disk I just rebuilt. Should I stop the parity check, unmount the drive, and run xfs_repair?

Jan 20 08:18:58 NAS kernel: XFS (md4): Metadata corruption detected at xfs_buf_ioend+0xac/0x386 [xfs], xfs_inode block 0x15c3e0ef0 xfs_inode_buf_verify
Jan 20 08:18:58 NAS kernel: XFS (md4): Unmount and run xfs_repair
Jan 20 08:18:58 NAS kernel: XFS (md4): First 128 bytes of corrupted metadata buffer:
Jan 20 08:18:58 NAS kernel: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
(the remaining 112 bytes of the buffer dump are likewise all zeros)
Jan 20 08:18:58 NAS kernel: XFS (md4): metadata I/O error in "xfs_imap_to_bp+0x50/0x70 [xfs]" at daddr 0x15c3e0ef0 len 32 error 117

The same "Metadata corruption detected / Unmount and run xfs_repair / metadata I/O error … error 117" sequence then repeats from 08:19:11 through 08:23:15 for inode blocks 0x173f2aae8, 0x174129f48, 0x18141b988, 0x18bb364e8, 0x16ffeca48, and 0x15c3e0ef0 again, with every dumped metadata buffer all zeros. Interleaved with those:

Jan 20 08:19:37 NAS kernel: ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 20 08:19:37 NAS kernel: ata9.00: configured for UDMA/133
Jan 20 08:21:48 NAS ntpd[1447]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized
Jan 20 08:23:15 NAS kernel: ata10: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 20 08:23:15 NAS kernel: ata10.00: configured for UDMA/133
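The recurring "error 117" in those traces is the errno behind the "structure needs cleaning" message from the first post: on Linux, errno 117 is EUCLEAN, which XFS returns when it detects on-disk corruption. A quick one-liner to confirm the mapping (using Python's strerror, since there is no standard shell utility for errno lookups):

```shell
# errno 117 on Linux is EUCLEAN -> "Structure needs cleaning",
# the same message the terminal was printing for the rebuilt disk.
python3 -c 'import os; print(os.strerror(117))'
# -> Structure needs cleaning
```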
trurl Posted January 20

Those log entries are not in your most recently posted diagnostics. Post new diagnostics.
xjumper84 Posted January 20

New diagnostics attached.

nas-diagnostics-20240120-0846.zip
trurl Posted January 20

24 minutes ago, xjumper84 said:
unmount the drive and run xfs_repair?

Those ideas are missing some details. Check the filesystem on disk4. Be sure to use the webUI and not the command line. Capture the output and post it.
trurl Posted January 20

15 hours ago, xjumper84 said:
I still have the drive I took out.

Can you mount this disk as an Unassigned Device?
xjumper84 Posted January 20

4 minutes ago, trurl said:
Check filesystem on disk4. Be sure to use the webUI and not the command line. Capture the output and post it.

The xfs_repair -n output is in the attached text file.

xfs_repair -n disk4.txt
xjumper84 Posted January 20 (edited)

11 minutes ago, trurl said:
Can you mount this disk as an Unassigned Device?

Yes, but I'll need to power off to physically install it. Should I do that now, or attempt the xfs_repair first?
itimpi Posted January 20

15 hours ago, xjumper84 said:
I had a billion errors during the rebuild. I still have the drive I took out.

Is this the rebuild onto the 10TB drive? If so, I would not expect the rebuilt drive to have usable content.
xjumper84 Posted January 20 (edited)

5 minutes ago, itimpi said:
Is this the rebuild onto the 10TB drive? If so I would not expect the rebuilt drive to have usable content.

The billion errors were on disks 1-3. The rebuild went onto the 10TB drive (which is disk4).
trurl Posted January 20

7 minutes ago, xjumper84 said:
The billion errors were on disks 1-3. The rebuild went onto the 10TB drive.

Every bit of all the other disks must be read reliably in order to reliably rebuild every bit of the new disk.

56 minutes ago, xjumper84 said:
Question - if frigate was running while the drive was attempting to rebuild, would that have caused issues with parity, because Frigate was saving camera files to other disks?

Ordinarily this is OK because parity is updated in real time, so parity would still be valid after rebuilding the data disk. On the other hand, it means parity is no longer valid for the original disk. The results of the check don't look good. Let's see what can be read from the original as an Unassigned Device. It's not entirely clear there was anything actually wrong with that disk, since "a ton of errors" is not very specific and we don't have any diagnostics from then.
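The "every bit of all other disks must be reliably read" point follows directly from how single-parity works: parity is the XOR of all data disks, and a missing disk is reconstructed by XOR-ing parity with the surviving disks, so one bad read elsewhere lands directly in the rebuilt disk. A toy sketch with one made-up byte per disk:

```shell
# Toy illustration of single-parity rebuild (one byte per "disk").
# Parity is the XOR of all data disks; a missing disk is reconstructed
# by XOR-ing parity with the remaining disks.
d1=0xA5; d2=0x3C; d3=0xF0
parity=$(( d1 ^ d2 ^ d3 ))

rebuilt_d2=$(( parity ^ d1 ^ d3 ))       # clean reads -> correct data back
printf 'correct rebuild: 0x%02X\n' "$rebuilt_d2"    # matches d2 = 0x3C

bad_d1=$(( d1 ^ 0x01 ))                  # a single flipped bit on ANOTHER disk...
bad_rebuild=$(( parity ^ bad_d1 ^ d3 ))  # ...corrupts the rebuilt disk instead
printf 'rebuild after a read error: 0x%02X\n' "$bad_rebuild"
```

Which is why read errors on disks 1-3 during the rebuild mean the freshly rebuilt disk4 contains garbage at those stripes, even though disk4's own hardware is fine.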
xjumper84 Posted January 20 (edited)

2 minutes ago, trurl said:
Every bit of all other disks must be reliably read to reliably rebuild every bit of the new disk. ... Let's see what can be read from the original as an Unassigned Device.

Understood.

Edit: the errors showed on md1, md2, and md3 (same as shown in the picture before the last reboot), reading something similar to:

metadata I/O error in "xfs_imap_to_bp+0x50/0x70 [xfs]" ... len 32 error 5

So I'll power off the machine, connect the drive, and start up. Please refresh my memory: with the extra drive connected, will the array still start, and will I then be able to mount the original 4TB drive in Unassigned Devices?
xjumper84 Posted January 20

OK. I re-added the original disk4 and it showed up in Unassigned Devices. I mounted it and all my files are there. What is the proper way to proceed?

I would think (please let me know if this is correct): unmount the drive from Unassigned Devices, stop the array, swap the 10TB drive for the original 4TB drive, and restart the array. Check the files to make sure all is good, then run a parity check to ensure parity is good. Once it is: in Unassigned Devices, format the 10TB drive to XFS, copy the 4TB drive's contents to the 10TB drive, and unmount the 10TB. Then stop the array, swap the 4TB drive for the 10TB drive, set up the 4TB drive in Unassigned Devices, and restart the array?
trurl Posted January 20

1 minute ago, xjumper84 said:
unmount the drive from Unassigned Devices. Stop the array, change the 10tb drive to the original 4tb drive and then restart the array. Check the files to make sure all good, then do a parity check, ensure we're good there.

No, this is not correct. The only way to get the original back in the array is New Config and rebuild parity.

2 minutes ago, xjumper84 said:
Then once we are, in unassigned devices - format the 10tb drive to xfs, copy the 4tb drive to the 10tb drive, unmount the 10tb. stop the array, change the 4tb drive to the 10tb drive, setup the 4tb drive in unassigned devices and restart the array?

And all of this is wrong too. If you replace the drive, it will just be rebuilt again.
trurl Posted January 20

Just now, trurl said:
The only way to get the original back in the array is New Config and rebuild parity.

You should disable Docker and the VM Manager in Settings until you get everything working right again. Stop anything that might write to your array; then New Config might be the safest approach, since only the parity disk would be written, and parity contains none of your data.
trurl Posted January 20

13 minutes ago, xjumper84 said:
disk4 back, it showed up in Unassigned Devices

Post new diagnostics so we can check the health of that disk.
xjumper84 Posted January 20

New diagnostics attached.

nas-diagnostics-20240120-0951.zip
xjumper84 Posted January 20

11 minutes ago, trurl said:
No this is not correct. The only way to get the original back in the array is New Config and rebuild parity. And all of this is wrong too. If you replace the drive it will rebuilt again.

6 minutes ago, trurl said:
You should disable Docker and VM Manager in Settings until you get everything working right again. Stop anything that might write to your array, then New Config might be the safest approach, since only the parity disk would be written, and parity contains none of your data.

OK, so I should:

1. Go to Settings and set Enable Docker to No - this keeps the Docker containers I have, right?
2. Go to Settings and set Enable VMs to No - this also keeps the VM files I already have.

Then:

3. Stop the array.
4. Swap the 10TB out and the 4TB in.
5. Start the array with no parity check.
6. Go to Tools -> New Config.
7. Rebuild parity and wait - no Dockers running, nothing copied in or out until it finishes.

Is that correct?
trurl Posted January 20

Jan 20 09:29:06 NAS unassigned.devices: Mounting partition 'sdi1' at mountpoint '/mnt/disks/WD-WCC4E4TSK55Y'...

sdi has:

ID# ATTRIBUTE_NAME          FLAGS   VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K  200   200   051    -    177
...
197 Current_Pending_Sector  -O--CK  200   200   000    -    3
...
200 Multi_Zone_Error_Rate   ---R--  200   200   000    -    12

You should click on each of your WD disks to get to its settings and add attributes 1 and 200 for monitoring. They all look good right now, unlike that disk.

7 minutes ago, xjumper84 said:
go to settings set enable docker to no - this will keep the docker containers I have though right? go to settings, set enable VM's to no - this will also keep the VM files I already have.

This will just disable the Docker and VM Manager services, so none of them can run. They will not go away. Do that much while we consider how to proceed.
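If you also want to watch those attributes from the console, a small sketch that filters SMART output down to IDs 1, 197, and 200. The sample text below mirrors the excerpt above; on the live system you would pipe in `smartctl -A /dev/sdi` instead (device name as in the log, adjust to your box):

```shell
# Print ID, attribute name, and raw value for the three attributes of
# interest on WD drives: 1 (Raw_Read_Error_Rate), 197
# (Current_Pending_Sector), 200 (Multi_Zone_Error_Rate).
watch_attrs() {
  awk '$1 == 1 || $1 == 197 || $1 == 200 { print $1, $2, $NF }' "$@"
}

# Sample rows copied from the smartctl excerpt above:
watch_attrs <<'EOF'
ID# ATTRIBUTE_NAME          FLAGS   VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K  200   200   051    -    177
197 Current_Pending_Sector  -O--CK  200   200   000    -    3
200 Multi_Zone_Error_Rate   ---R--  200   200   000    -    12
EOF
```

Non-zero raw values on 197 and 200, as on this disk, are the ones worth alerting on; the webUI per-disk attribute monitoring trurl describes does the same thing automatically.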
trurl Posted January 20

Another possible approach is to try the rebuild again. We could do that first, see how it goes, and keep the original for later. I am leaning towards this approach. Anybody else @JorgeB @itimpi have any opinions or other ideas?
xjumper84 Posted January 20

18 minutes ago, trurl said:
You should click on each of your WD disks to get to its settings, and add attributes 1 and 200 for monitoring. They all look good right now, unlike that disk.

I checked the other WD disks; they are all zeros on IDs 1, 197, and 200. Docker and VMs are set to No.
trurl Posted January 20

1 minute ago, xjumper84 said:
I checked the other WD disks, they are all zeros on IDs 1, 197 and 200.

As I said, but did you

21 minutes ago, trurl said:
add attributes 1 and 200 for monitoring
xjumper84 Posted January 20

17 minutes ago, trurl said:
Another possible approach is to try the rebuild again. We could do that first, see how it goes, keep the original for later.

Could I copy the data off the original 4TB drive currently in Unassigned Devices to my desktop, just to have a good copy of the data, then erase the rebuilt data on disk4, run a parity check to ensure parity is good, and then copy the data back to disk4?