February 6, 2025
I got an error on one of my disks. I replaced it with a new one and rebuilt the array successfully. After one day I started getting this:

Feb 6 00:00:33 nedio-server Docker Auto Update: Installing Updates for borgmatic immich mariadb Minio netdata nextcloud
Feb 6 00:00:33 nedio-server kernel: XFS (dm-2): Metadata corruption detected at xfs_buf_ioend+0xa9/0x384, xfs_inode block 0xe4343410 xfs_inode_buf_verify
Feb 6 00:00:33 nedio-server kernel: XFS (dm-2): Unmount and run xfs_repair
Feb 6 00:00:33 nedio-server kernel: XFS (dm-2): First 128 bytes of corrupted metadata buffer:
Feb 6 00:00:33 nedio-server kernel: 00000000: b4 cb fc 73 d1 4d 06 7b e3 4c 72 74 f9 72 b9 42 ...s.M.{.Lrt.r.B
Feb 6 00:00:33 nedio-server kernel: 00000010: f1 19 35 c4 a7 ce b4 66 36 dc 1f f6 88 e1 ea ad ..5....f6.......
Feb 6 00:00:33 nedio-server kernel: 00000020: 13 28 1e 6a a1 1c e5 46 6c 9f ff 94 7a 5d e1 12 .(.j...Fl...z]..
Feb 6 00:00:33 nedio-server kernel: 00000030: 9b 97 0b 2c 47 92 6d 19 f3 84 fa d6 22 0b a3 3e ...,G.m....."..>
Feb 6 00:00:33 nedio-server kernel: 00000040: 4b 51 0d 41 a3 8e 59 f9 09 13 ce 3f c1 a2 57 87 KQ.A..Y....?..W.
Feb 6 00:00:33 nedio-server kernel: 00000050: f9 c6 30 b4 bc 30 65 b1 0f 86 aa 06 6f 35 57 cc ..0..0e.....o5W.
Feb 6 00:00:33 nedio-server kernel: 00000060: c6 33 3c 33 9c 11 cc 2c d3 fc 97 10 61 ea 1f 7c .3<3...,....a..|
Feb 6 00:00:33 nedio-server kernel: 00000070: bc 50 a2 fe bf b9 23 36 b3 60 bc 01 60 c0 c3 d7 .P....#6.`..`...
Feb 6 00:00:33 nedio-server kernel: XFS (dm-2): metadata I/O error in "xfs_imap_to_bp+0x52/0x74" at daddr 0xe4343410 len 32 error 117
Feb 6 00:00:34 nedio-server emhttpd: read SMART /dev/sdd
Feb 6 00:00:41 nedio-server emhttpd: read SMART /dev/sde
Feb 6 00:00:45 nedio-server kernel: XFS (dm-2): Metadata corruption detected at xfs_buf_ioend+0xa9/0x384, xfs_inode block 0x1467bfac8 xfs_inode_buf_verify
Feb 6 00:00:45 nedio-server kernel: XFS (dm-2): Unmount and run xfs_repair

I have stopped the array. Now I can't start it. Please help. My data is only partially backed up. 😱
nedio-server-diagnostics-20250206-0928.zip
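For anyone reading a similar syslog, the affected device and block numbers can be pulled out of the kernel messages with a short script. This is only a sketch: the regular expression is an assumption based on the two "Metadata corruption detected" lines quoted above, and the names `CORRUPTION_RE` and `find_corruption` are made up for illustration.

```python
import re

# Assumed pattern, modelled on the kernel lines quoted in the post above.
CORRUPTION_RE = re.compile(
    r"XFS \((?P<dev>[^)]+)\): Metadata corruption detected at \S+, "
    r"(?P<kind>\w+) block (?P<block>0x[0-9a-fA-F]+)"
)


def find_corruption(lines):
    """Return (device, structure, block number) for each corruption report."""
    hits = []
    for line in lines:
        match = CORRUPTION_RE.search(line)
        if match:
            hits.append(
                (match.group("dev"), match.group("kind"), int(match.group("block"), 16))
            )
    return hits


# Three lines copied from the log in the post above.
sample = [
    "Feb 6 00:00:33 nedio-server kernel: XFS (dm-2): Metadata corruption detected at "
    "xfs_buf_ioend+0xa9/0x384, xfs_inode block 0xe4343410 xfs_inode_buf_verify",
    "Feb 6 00:00:33 nedio-server kernel: XFS (dm-2): Unmount and run xfs_repair",
    "Feb 6 00:00:45 nedio-server kernel: XFS (dm-2): Metadata corruption detected at "
    "xfs_buf_ioend+0xa9/0x384, xfs_inode block 0x1467bfac8 xfs_inode_buf_verify",
]

for dev, kind, block in find_corruption(sample):
    print(f"{dev}: corrupted {kind} buffer at block {block:#x}")
```

Both reports point at inode buffers on the same device (dm-2), which matches the kernel's advice to unmount and run xfs_repair on that filesystem.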
February 6, 2025 Author
I can't start the array in maintenance mode...

So currently I still have the previous HDD (3TB), which was mounted as Disk 3, unplugged from the Unraid server, and the new HDD (4TB), which I used to replace the HDD at Disk 3 of the array. I precleared the 4TB HDD and then installed it in the Disk 3 slot. The array rebuild completed without errors. But now the array does not start (even if I check the maintenance mode box). Please advise what I should do.

Edited February 6, 2025 by Nedio Server Unraid
February 6, 2025 Community Expert
5 minutes ago, Nedio Server Unraid said: "I can't start the array in maintenance mode"

Why not?
February 6, 2025 Author
The log that I am getting is:

Feb 6 13:31:19 nedio-server emhttpd: cmdStart: already started

Although on the main page the array looks like it is not started, with dropdown boxes for disk selection. Should I restart the server maybe?
February 6, 2025 Author
OK, after rebooting the server I was able to start the array in maintenance mode and clicked the Check File System button on disk 3. The Unraid spinner has been running for 25 minutes. Here is the log output:

Feb 6 14:28:53 nedio-server rc.nginx: Reloading Nginx server daemon configuration...
Feb 6 14:28:54 nedio-server nginx: 2025/02/06 14:28:54 [alert] 11134#11134: worker process 25079 exited on signal 6
Feb 6 14:28:56 nedio-server nginx: 2025/02/06 14:28:56 [alert] 11134#11134: worker process 25363 exited on signal 6
Feb 6 14:28:57 nedio-server nginx: 2025/02/06 14:28:57 [alert] 11134#11134: worker process 25511 exited on signal 6
Feb 6 14:29:06 nedio-server nginx: 2025/02/06 14:29:06 [alert] 11134#11134: worker process 25549 exited on signal 6
Feb 6 14:29:07 nedio-server ool www[15128]: /usr/local/emhttp/plugins/dynamix/scripts/xfs_check 'start' '/dev/mapper/md3p1' 'ST4000NE001-2MA101_WS25BT0Z' '-n'
Feb 6 14:29:07 nedio-server nginx: 2025/02/06 14:29:07 [alert] 11134#11134: worker process 26094 exited on signal 6
Feb 6 14:29:08 nedio-server nginx: 2025/02/06 14:29:08 [alert] 11134#11134: worker process 26284 exited on signal 6
...
Feb 6 14:31:02 nedio-server nginx: 2025/02/06 14:31:02 [alert] 11134#11134: worker process 33870 exited on signal 6
Feb 6 14:31:04 nedio-server nginx: 2025/02/06 14:31:04 [alert] 11134#11134: worker process 33950 exited on signal 6
Feb 6 14:31:06 nedio-server nginx: 2025/02/06 14:31:06 [alert] 11134#11134: worker process 34135 exited on signal 6
Feb 6 14:31:06 nedio-server publish: curl to arraymonitor failed
Feb 6 14:31:08 nedio-server nginx: 2025/02/06 14:31:08 [alert] 11134#11134: worker process 34242 exited on signal 6
Feb 6 14:31:10 nedio-server nginx: 2025/02/06 14:31:10 [alert] 11134#11134: worker process 34321 exited on signal 6
...
Feb 6 14:34:48 nedio-server nginx: 2025/02/06 14:34:48 [alert] 11134#11134: worker process 48053 exited on signal 6
Feb 6 14:34:49 nedio-server nginx: 2025/02/06 14:34:49 [alert] 11134#11134: worker process 48151 exited on signal 6
Feb 6 14:39:14 nedio-server emhttpd: spinning down /dev/sdf
Feb 6 14:39:14 nedio-server emhttpd: spinning down /dev/nvme0n1
Feb 6 14:39:15 nedio-server emhttpd: sdspin /dev/nvme0n1 down: 25
Feb 6 14:43:48 nedio-server emhttpd: spinning down /dev/sdj
Feb 6 14:43:48 nedio-server emhttpd: spinning down /dev/sdh
Feb 6 14:43:48 nedio-server emhttpd: spinning down /dev/sdg
Feb 6 14:43:48 nedio-server emhttpd: spinning down /dev/sdd
Feb 6 14:43:48 nedio-server emhttpd: spinning down /dev/sde
Feb 6 14:43:48 nedio-server emhttpd: spinning down /dev/sdb
Feb 6 14:43:48 nedio-server emhttpd: spinning down /dev/sdi
Feb 6 14:44:17 nedio-server emhttpd: spinning down /dev/sdc

Should I wait more? I do not see xfs_check in htop.

Edited February 6, 2025 by Nedio Server Unraid
February 6, 2025 Author
I've run it. Got this:

user@nedio-server:~# xfs_repair -v /dev/mapper/md3p1
Phase 1 - find and verify superblock...
        - block cache size set to 4590072 entries
Phase 2 - using internal log
        - zero log...
Log inconsistent (didn't find previous header)
failed to find log head
zero_log: cannot find log head/tail (xlog_find_tail=5)
ERROR: The log head and/or tail cannot be discovered. Attempt to mount the
filesystem to replay the log or use the -L option to destroy the log and
attempt a repair.
February 6, 2025 Author
Done. Here are the last lines of the console output:

Format log to cycle 2069032622.
cache_purge: shake on cache 0x4d67c0 left 1 nodes!?
cache_purge: shake on cache 0x4d67c0 left 1 nodes!?
cache_zero_check: refcount is 1, not zero (node=0x14bf180fb010)

XFS_REPAIR Summary    Thu Feb 6 16:03:17 2025

Phase       Start            End              Duration
Phase 1:    02/06 16:00:31   02/06 16:00:31
Phase 2:    02/06 16:00:31   02/06 16:01:00   29 seconds
Phase 3:    02/06 16:01:00   02/06 16:01:36   36 seconds
Phase 4:    02/06 16:01:36   02/06 16:01:36
Phase 5:    02/06 16:01:36   02/06 16:01:37   1 second
Phase 6:    02/06 16:01:37   02/06 16:01:37
Phase 7:    02/06 16:01:37   02/06 16:01:37

Total run time: 1 minute, 6 seconds
done
user@nedio-server:~#

Thanks for helping me out, JorgeB. I am a bit nervous, hoping I did not lose my data. What's next?

Edited February 6, 2025 by Nedio Server Unraid
February 6, 2025 Community Expert
Start the array in normal mode; the disk should mount. If it does, check its contents, and also look for a lost+found folder.
February 6, 2025 Author
It mounted. The array has started. But there is a lot in the lost+found directory on disk 3... 😨 Also, disk 3 now shows only 49 GB used, whereas I remember having 2+ TB there.

BTW: is there a way to know from the diagnostic how much data was on that HDD before XFS restore?

This disk (4TB) is the one that was rebuilt from parity when I replaced the originally failing 3TB HDD. I did not try to check or repair XFS on that 3TB disk. Do you think it is safe to plug the 3TB HDD back in, start the array in maintenance mode, and try to check the XFS file system there? Could it restore more data?
February 6, 2025 Community Expert Solution
6 minutes ago, Nedio Server Unraid said: "is there a way to know from the diagnostic how much data was on that HDD before XFS restore?"

Since the array was not started in normal mode when the diagnostics were taken, there is no way to know anything about the filesystems on any of the disks.

Leave the array as it currently is. If you have a spare port, plug the original disk in and see if it will mount as an Unassigned Device. Either way, post new diagnostics.
February 9, 2025 Author
OK, I have plugged in the original (3TB) disk. It shows up in the unassigned devices. Should I start the array (with the new 4TB disk) before posting the diagnostics? Should I try to mount the original disk outside of the array?
February 9, 2025 Author
I have started the array in maintenance mode (so that I can pass the encryption passphrase to Unraid). Then I pressed the [mount] button next to the original disk in unassigned devices. Here is the log I see:

Feb 9 14:46:56 nedio-server unassigned.devices: Mounting partition 'sdb1' at mountpoint '/mnt/disks/WD-WMC4N0F887V2'...
Feb 9 14:46:56 nedio-server emhttpd: shcmd (4857): /usr/sbin/cryptsetup luksOpen '/dev/sdb1' 'WD-WMC4N0F887V2'
Feb 9 14:46:58 nedio-server nginx: 2025/02/09 14:46:58 [alert] 11265#11265: worker process 90202 exited on signal 6
Feb 9 14:46:58 nedio-server unassigned.devices: Mount cmd: /sbin/mount -t 'xfs' -o rw,relatime '/dev/mapper/WD-WMC4N0F887V2' '/mnt/disks/WD-WMC4N0F887V2'
Feb 9 14:46:58 nedio-server kernel: XFS (dm-8): Mounting V5 Filesystem f185aae9-b6c9-4eb8-b7dc-23ae79ee4411
Feb 9 14:46:58 nedio-server kernel: XFS (dm-8): Starting recovery (logdev: internal)
Feb 9 14:46:59 nedio-server kernel: XFS (dm-8): Ending recovery (logdev: internal)
Feb 9 14:47:00 nedio-server unassigned.devices: Successfully mounted '/dev/mapper/WD-WMC4N0F887V2' on '/mnt/disks/WD-WMC4N0F887V2'.
Feb 9 14:47:00 nedio-server unassigned.devices: Device '/dev/sdb1' is not set to be shared.
Feb 9 14:47:00 nedio-server nginx: 2025/02/09 14:47:00 [alert] 11265#11265: worker process 90521 exited on signal 6
February 9, 2025 Community Expert
Check the contents of the old disk. If they are correct and more complete, you can copy them to the array.
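One way to judge whether the old disk is "more complete" is to list the files that exist on it but are missing, or have a different size, on the rebuilt disk. The sketch below is illustrative only: `/mnt/disks/WD-WMC4N0F887V2` is the mountpoint shown in the log above, `/mnt/disk3` is an assumed path for the array disk, and the helper names are hypothetical. It compares paths and sizes only, not file contents.

```python
import os


def relative_files(root):
    """Map each regular file's path (relative to root) to its size in bytes."""
    files = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.isfile(full):  # skips broken symlinks
                files[os.path.relpath(full, root)] = os.path.getsize(full)
    return files


def missing_or_different(source_root, target_root):
    """List files under source_root that are absent or differently sized in target_root."""
    source = relative_files(source_root)
    target = relative_files(target_root)
    return sorted(path for path, size in source.items() if target.get(path) != size)


if __name__ == "__main__":
    OLD_DISK = "/mnt/disks/WD-WMC4N0F887V2"  # mountpoint taken from the log above
    ARRAY_DISK = "/mnt/disk3"                # assumption: where the rebuilt disk 3 is mounted
    for path in missing_or_different(OLD_DISK, ARRAY_DISK):
        print(path)
```

The script only reads from both disks, so it can be run before deciding what to copy.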
February 9, 2025 Author
I am considering running a full parity check on my array first, with corrections written. I have somewhat lost faith in it. Once I know my array is in good shape, I will try to read from the original disk and write to the array. Please let me know if any part of this plan is a bad idea. 🙏 Then I will set up a proper 3-2-1 backup urgently...
February 10, 2025 Community Expert
I would first look at recovering the data from that disk, then you can run a check.
February 10, 2025 Author
My array was messed up:

Duration: 12 hours, 49 minutes, 35 seconds. Average speed: 86.6 MB/s
Finding 487755460 errors

All of that is supposed to be corrected now. I will start copying the data from the original disk to the array.
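As a sanity check on those numbers, duration multiplied by average speed gives the amount of data read from each disk during the check. The sketch below assumes the reported MB/s uses decimal megabytes (10^6 bytes); the variable names are illustrative.

```python
def to_seconds(hours, minutes, seconds):
    """Convert an h/m/s duration into seconds."""
    return hours * 3600 + minutes * 60 + seconds


duration_s = to_seconds(12, 49, 35)   # reported duration of the parity check
avg_speed_mb_s = 86.6                 # reported average speed
# Assumption: 1 MB = 10**6 bytes, so 10**6 MB = 1 TB.
data_read_tb = duration_s * avg_speed_mb_s / 1_000_000

print(f"{duration_s} s at {avg_speed_mb_s} MB/s is about {data_read_tb:.2f} TB")
# prints "46175 s at 86.6 MB/s is about 4.00 TB"
```

Roughly 4 TB matches the size of the new 4TB disk, which suggests the check ran across the full disk rather than stopping early.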
March 6, 2025 Author
Managed to copy all my data from the old disk 🍀 and have set up the backup ⛑️. A slightly bitter aftertaste remains: why the hell my XFS file system got corrupted - still no answer 🤷‍♂️.
March 6, 2025 Community Expert
59 minutes ago, Nedio Server Unraid said: "why the hell my XFS file system got corrupted - still no answer"

That can be difficult to answer. Typically it's caused by an unclean shutdown or a flush issue, but if it keeps happening it may indicate an underlying problem.