
Drive failed, rebuilt and now no shares show up


Solved by trurl


I use this box for Frigate and Plex in Docker containers.

 

A week ago I did a parity check. Everything was good.

 

I had a 4TB drive start to show a ton of errors, so I took it out, replaced it with a 10TB drive, and rebuilt the array. After the rebuild it doesn't show any shares, but all the files are physically on the drives.

 

When I look at the disks in a terminal, it says "structure needs cleaning" on the drive I rebuilt. I also get "cannot access 'folder name': Input/Output error" on the other drives.
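For example, this is the kind of thing I see (paths and folder names illustrative):

root@NAS:~# ls /mnt/disk4
ls: cannot open directory '/mnt/disk4': Structure needs cleaning
root@NAS:~# ls '/mnt/disk1/folder name'
ls: cannot access '/mnt/disk1/folder name': Input/output error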

 

I had a billion errors during the rebuild. I still have the drive I took out. 

 

Diagnostics attached. Any help would be appreciated. Thank you :)

nas-diagnostics-20240119-1721.zip

Edited by xjumper84
Link to comment

Looks like a controller issue: there are errors on all connected disks, which can happen with onboard Ryzen SATA controllers under heavy load:

 

Jan 19 04:57:31 NAS kernel: md: disk0 read error, sector=1728810592
Jan 19 04:57:31 NAS kernel: md: disk3 read error, sector=1728810784
Jan 19 04:57:31 NAS kernel: md: disk1 read error, sector=1728810520
Jan 19 04:57:31 NAS kernel: md: disk2 read error, sector=1728810504

 

So the disk4 rebuild would not have been successful. Reboot and post new diagnostics after the array starts.
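If the GUI becomes unresponsive, diagnostics can also be generated from a terminal; the zip is saved on the flash drive:

# from the web terminal or SSH; output lands in /boot/logs
diagnostics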

Link to comment

It's been trying to reboot for almost 10 minutes. It's stuck unmounting disk5; it says 'target is busy'.
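Is there a way to see what's holding it? I'm guessing something like this would show the offending processes:

# my guess at the tools - list what has files open under the stuck mount point
fuser -vm /mnt/disk5
lsof +f -- /mnt/disk5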

 

I did a hard power off and checked that the SATA cables were seated at the drives and the motherboard.

 

The reboot came up fine, and the array wants to do a parity check - I paused it to create the diagnostics file, attached.

 

Question: if Frigate was running while the drive was rebuilding, would that have caused issues with parity, since Frigate was saving camera files to other disks?

20240120_080416.jpg

nas-diagnostics-20240120-0817.zip

Link to comment

I'm now seeing a lot of errors on the disk I just rebuilt - should I stop the parity check, unmount the drive, and run xfs_repair? (My guess at the commands is sketched after the log below.)

 

Jan 20 08:18:58 NAS kernel: XFS (md4): Metadata corruption detected at xfs_buf_ioend+0xac/0x386 [xfs], xfs_inode block 0x15c3e0ef0 xfs_inode_buf_verify
Jan 20 08:18:58 NAS kernel: XFS (md4): Unmount and run xfs_repair
Jan 20 08:18:58 NAS kernel: XFS (md4): First 128 bytes of corrupted metadata buffer:
Jan 20 08:18:58 NAS kernel: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Jan 20 08:18:58 NAS kernel: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Jan 20 08:18:58 NAS kernel: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Jan 20 08:18:58 NAS kernel: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Jan 20 08:18:58 NAS kernel: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Jan 20 08:18:58 NAS kernel: 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Jan 20 08:18:58 NAS kernel: 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Jan 20 08:18:58 NAS kernel: 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Jan 20 08:18:58 NAS kernel: XFS (md4): metadata I/O error in "xfs_imap_to_bp+0x50/0x70 [xfs]" at daddr 0x15c3e0ef0 len 32 error 117
[The same three-line corruption message and all-zero 128-byte buffer dump repeat through 08:23:15, each ending in "error 117", for inode blocks 0x173f2aae8, 0x174129f48, 0x18141b988, 0x18bb364e8, 0x16ffeca48, and 0x15c3e0ef0 again. The only other lines in the interval:]

Jan 20 08:19:37 NAS kernel: ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 20 08:19:37 NAS kernel: ata9.00: configured for UDMA/133
Jan 20 08:21:48 NAS  ntpd[1447]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized
Jan 20 08:23:15 NAS kernel: ata10: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 20 08:23:15 NAS kernel: ata10.00: configured for UDMA/133
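If it comes to that, here's my guess at the xfs_repair sequence - please correct me if the device name or flags are off (I gather newer Unraid releases use md4p1 rather than md4):

# array started in Maintenance Mode so md4 is not mounted; device name is my assumption
xfs_repair -n /dev/md4    # dry run: report problems, change nothing
xfs_repair /dev/md4       # actual repair
xfs_repair -L /dev/md4    # only if it refuses because of a dirty log (can lose recent metadata)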

 

Link to comment
5 minutes ago, itimpi said:

Is this the rebuild onto the 10TB drive? If so, I would not expect the rebuilt drive to have usable content.

The billion errors were on disks 1-3. The rebuild went onto the 10TB drive (which is disk4).

Edited by xjumper84
Link to comment
7 minutes ago, xjumper84 said:

The billion errors were on disks 1-3. The rebuild went onto the 10TB drive. 

Every bit of all other disks must be reliably read to reliably rebuild every bit of the new disk.

 

56 minutes ago, xjumper84 said:

Question: if Frigate was running while the drive was rebuilding, would that have caused issues with parity, since Frigate was saving camera files to other disks?

Ordinarily this is OK because parity is updated in real time, so parity would still be valid after rebuilding the data disk. On the other hand, it means parity is no longer valid for that original disk.
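To illustrate with single parity: each parity bit is just the XOR of the corresponding bits across the data disks, so any write to a data disk updates parity immediately. A toy one-byte sketch (values made up):

# parity = XOR of the data disks (single parity)
printf 'parity = %02x\n' $(( 0xA5 ^ 0x3C ^ 0x5F ))   # -> c6
# a write changes disk2's byte from 0x3C to 0x70; parity is recomputed on the fly
printf 'parity = %02x\n' $(( 0xA5 ^ 0x70 ^ 0x5F ))   # -> 8a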

 

The results of the check don't look good.

 

Let's see what can be read from the original as an Unassigned Device. Not entirely clear there was anything actually wrong with that disk since "a ton of errors" is not very specific and we don't have any diagnostics from then.

Link to comment
2 minutes ago, trurl said:

Every bit of all other disks must be reliably read to reliably rebuild every bit of the new disk.

 

Ordinarily this is OK because parity is updated in real time, so parity would still be valid after rebuilding the data disk. On the other hand, it means parity is no longer valid for that original disk.

 

The results of the check don't look good.

 

Let's see what can be read from the original as an Unassigned Device. Not entirely clear there was anything actually wrong with that disk since "a ton of errors" is not very specific and we don't have any diagnostics from then.

Understood.

 

Edit: the errors showed on md1, md2, and md3 (the same as shown in the picture before the last reboot) and read something like: "metadata I/O error in "xfs_imap_to_bp+0x50/0x70 [xfs]" ... len 32 error 5"

 

 

So I'll power off the machine, connect the drive, and start up. Please refresh my memory: with the new drive added, the array won't start? And then I'll be able to add the original 4TB drive in Unassigned Devices?

Edited by xjumper84
Link to comment

OK. I re-added the original disk4; it showed up in Unassigned Devices. I mounted it and all my files are there.

 

What are the proper steps to proceed?

 

I would think (and please let me know if this is correct): unmount the drive from Unassigned Devices, stop the array, change the 10TB drive back to the original 4TB drive, and restart the array. Check the files to make sure all is good, then do a parity check to ensure we're good there.

 

Then, once we are: in Unassigned Devices, format the 10TB drive to XFS, copy the 4TB drive to the 10TB drive, and unmount the 10TB. Stop the array, change the 4TB drive to the 10TB drive, set up the 4TB drive in Unassigned Devices, and restart the array?

Link to comment
1 minute ago, xjumper84 said:

I would think (and please let me know if this is correct): unmount the drive from Unassigned Devices, stop the array, change the 10TB drive back to the original 4TB drive, and restart the array. Check the files to make sure all is good, then do a parity check to ensure we're good there.

No, this is not correct. The only way to get the original back in the array is New Config and rebuild parity.

 

2 minutes ago, xjumper84 said:

Then, once we are: in Unassigned Devices, format the 10TB drive to XFS, copy the 4TB drive to the 10TB drive, and unmount the 10TB. Stop the array, change the 4TB drive to the 10TB drive, set up the 4TB drive in Unassigned Devices, and restart the array?

And all of this is wrong too. If you replace the drive, it will rebuild again.

Link to comment
Just now, trurl said:

The only way to get the original back in the array is New Config and rebuild parity.

You should disable Docker and VM Manager in Settings until you get everything working right again. Stop anything that might write to your array, then New Config might be the safest approach, since only the parity disk would be written, and parity contains none of your data.

Link to comment
11 minutes ago, trurl said:

No, this is not correct. The only way to get the original back in the array is New Config and rebuild parity.

 

And all of this is wrong too. If you replace the drive, it will rebuild again.

  

6 minutes ago, trurl said:

You should disable Docker and VM Manager in Settings until you get everything working right again. Stop anything that might write to your array, then New Config might be the safest approach, since only the parity disk would be written, and parity contains none of your data.

 

OK - so I should:

Go to Settings, set Enable Docker to No - this will keep the Docker containers I have, though, right?

Go to Settings, set Enable VMs to No - this will also keep the VM files I already have.

 

Then:

Stop the array.

Switch the 10TB out.

Add the 4TB in.

Start the array with no parity check.

Go to Tools -> New Config.

Then rebuild parity and wait. Don't let any Docker containers run, and don't copy anything in or out until it finishes.

 

Is that correct?

Link to comment
Jan 20 09:29:06 NAS unassigned.devices: Mounting partition 'sdi1' at mountpoint '/mnt/disks/WD-WCC4E4TSK55Y'...

sdi has

ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    177
...
197 Current_Pending_Sector  -O--CK   200   200   000    -    3
...
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    12
 

You should click on each of your WD disks to get to its settings, and add attributes 1 and 200 for monitoring. They all look good right now, unlike that disk.
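The same attributes can also be checked from a terminal with smartctl, e.g. (device letter taken from the mount line above):

smartctl -A /dev/sdi | grep -E 'Raw_Read_Error_Rate|Current_Pending_Sector|Multi_Zone_Error_Rate'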

 

7 minutes ago, xjumper84 said:

Go to Settings, set Enable Docker to No - this will keep the Docker containers I have, though, right?

Go to Settings, set Enable VMs to No - this will also keep the VM files I already have.

This will just disable the Docker and VM Manager services, so none of these can run. They will not go away.

 

Do that much while we consider how to proceed.

 

Link to comment
18 minutes ago, trurl said:
Jan 20 09:29:06 NAS unassigned.devices: Mounting partition 'sdi1' at mountpoint '/mnt/disks/WD-WCC4E4TSK55Y'...

sdi has

ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    177
...
197 Current_Pending_Sector  -O--CK   200   200   000    -    3
...
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    12
 

You should click on each of your WD disks to get to its settings, and add attributes 1 and 200 for monitoring. They all look good right now, unlike that disk.

 

This will just disable the Docker and VM Manager services, so none of these can run. They will not go away.

 

Do that much while we consider how to proceed.

 

 

I checked the other WD disks; they all show zeros for attributes 1, 197, and 200. Docker and VMs are set to No.

Link to comment
17 minutes ago, trurl said:

Another possible approach is to try the rebuild again. We could do that first, see how it goes, keep the original for later. I am leaning towards this approach.

 

Anybody else @JorgeB @itimpi have any opinion or other ideas?

 

 

Could I copy the data off the original 4TB drive currently in Unassigned Devices to my desktop, just to have a good copy of the data, then erase the rebuilt data on disk4, run a parity check to ensure parity is good, and then copy the data back to disk4?
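I'm picturing something like this for the copy (the desktop share mount point is a guess on my part):

# old 4TB is mounted by Unassigned Devices; the destination mount is hypothetical
rsync -avh --progress /mnt/disks/WD-WCC4E4TSK55Y/ /mnt/remotes/desktop_backup/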

Link to comment
