Matheew Posted November 8, 2021 (edited)

Hello unRAID Community!

I had the misfortune of having one of my disks fail over the weekend. The series of events:

OLD Disk 3 shows up as disabled in the unRAID GUI with a red cross and is reported as failed by SMART.

The array is stopped and OLD Disk 3 is replaced with NEW Disk 3 in the chassis.

The array is started and the data rebuild begins. Shortly after, unRAID reports read errors on Disk 2. Everything still seems fine: the array looks OK and the rebuild appears to be progressing normally.

Some time later during the rebuild, Disk 2 shows up both as a green disk in the array and as an Unassigned Device with the option to mount it. At this stage I touch nothing and let the rebuild finish. See the link for how it looked with the Unassigned Device. Note that the picture was taken at a later stage, when Disk 2 had failed and NEW Disk 3 had already been rebuilt. Disk 2 was green during the rebuild. (And yes, I know my disks are too hot at the moment, but that has nothing to do with this topic; I don't question the drive failure itself or the reason behind it.) https://imgur.com/a/1aknzNP

When the rebuild of NEW Disk 3 is done, it turns green in the array. I SSH in, go to /mnt/disk3/movies/ and ls gives me "Structure needs cleaning".

The unRAID Forums tell me this is caused by XFS corruption, so I start the array in maintenance mode and do an XFS repair.

After stopping the array from maintenance mode, the Start array button is gone and the only buttons visible are shutdown and reboot. This seems to be a bug. I reboot the server without first saving diagnostics logs, so I do not have them.

After the reboot it is possible to start the array. Running ls in /mnt/disk3/movies/ now works, but all data previously on OLD Disk 3 is gone; the disk is practically empty.

Shortly after this, Disk 2 gets disabled with a red cross, but SMART shows Pass. XFS repair gets stuck on:

Phase 1 - find and verify superblock...
couldn't verify primary superblock - not enough secondary superblocks with matching geometry !!!
attempting to find secondary superblock...
...found candidate secondary superblock...
unable to verify superblock, continuing...
....found candidate secondary superblock...
unable to verify superblock, continuing...

I'm left with loss of data on two disks. I basically have three questions here:

Why did I lose all the data on Disk 3? I rebuilt it and repaired XFS, so why is the data gone? Did the weird behavior shown by Disk 2 actually mean that I lost it during the rebuild, and that having only one parity disk led to data loss?

What do I do with the currently disabled Disk 2? Should I rebuild it? If it actually failed during the rebuild of NEW Disk 3, I guess the data is gone on that one as well?

I still have the OLD Disk 3 (I do not know if it is broken or not). Can I add it as an Unassigned Device and decrypt it in an attempt to retrieve data from it?

Thanks in advance!

Edited November 8, 2021 by Matheew
trurl Posted November 8, 2021

7 minutes ago, Matheew said: "without first saving diagnostics logs, so I do not have them"

Get new Diagnostics and attach to your NEXT post in this thread.
Matheew Posted November 8, 2021

6 minutes ago, trurl said: "Get new Diagnostics and attach to your NEXT post in this thread."

Certainly, see the attached file. Thanks for the insanely quick reply.

unraid-diagnostics-20211108-1559.zip
JorgeB Posted November 8, 2021

26 minutes ago, Matheew said: "reports read errors with Disk 2, everything is still fine, the array looks OK and the rebuild is doing fine."

The diags you posted are from after rebooting, so we can't see exactly what happened, but the rebuild was not doing fine. You only have one parity drive, so if there were errors on another disk during the rebuild, the rebuilt disk will be corrupt. From the description it looks like disk2 dropped offline, so there would be a lot of corruption.
Matheew Posted November 8, 2021 (edited)

3 hours ago, JorgeB said: "The diags you posted are from after rebooting, so we can't see exactly what happened, but the rebuild was not doing fine. You only have one parity drive, so if there were errors on another disk during the rebuild, the rebuilt disk will be corrupt. From the description it looks like disk2 dropped offline, so there would be a lot of corruption."

Thanks for the reply. Then it is as I feared. I suspect there is no reason to rebuild Disk 2, since the file system is corrupted?

If I accept the loss of the data on Disk 2 and wish to replace it with a new disk, what is the best way to go about it? I do not wish to simply replace the disk in the array with a new one and put further strain on the other disks by rebuilding Disk 2, only to find it corrupted and then empty just as before.

Thanks once again in advance!

Edited November 8, 2021 by Matheew
JorgeB Posted November 8, 2021

If disk2 is the unassigned 4TB Seagate, you should be able to mount it outside the array. If it looks fine, you can do a new config with it. Note that disk3 will likely have some corruption due to the read errors during the rebuild.
Matheew Posted November 8, 2021 (edited)

43 minutes ago, JorgeB said: "If disk2 is the unassigned 4TB Seagate, you should be able to mount it outside the array. If it looks fine, you can do a new config with it. Note that disk3 will likely have some corruption due to the read errors during the rebuild."

Hi! How would a new config work in this scenario? Why would Disk 3 have corruption after running an XFS repair?

Also, if I did a new config, I would have to do a parity check again, correct? Would this put strain on the disks as well?

Edited November 8, 2021 by Matheew
JorgeB Posted November 8, 2021

30 minutes ago, Matheew said: "Why would Disk 3 have corruption after running a XFS repair?"

The problem was not xfs_repair, it was the rebuild:

4 hours ago, JorgeB said: "you only have one parity drive, so if there were errors on another disk during the rebuild, the rebuilt disk will be corrupt"
Matheew Posted November 9, 2021

11 hours ago, JorgeB said: "The problem was not xfs_repair, it was the rebuild:"

I see, so what is the recommended action here? Reformat the NEW Disk 3?
JorgeB Posted November 9, 2021

Yes, if you can restore the data from backups.
Matheew Posted November 9, 2021 (edited)

29 minutes ago, JorgeB said: "Yes, if you can restore the data from backups."

I'm sorry if I'm slow here, but I'm not following. Even if I can't restore the data from backups, the NEW Disk 3 is still corrupted and needs formatting either way? I'm quite confused about how to proceed here without messing things up more than I already have.

Edited November 9, 2021 by Matheew
JorgeB Posted November 9, 2021

51 minutes ago, Matheew said: "Even if I can't restore the data from backups the NEW Disk 3 is still corrupted and needs formatting either way?"

Some (or a lot) of the data on disk3 is likely corrupt. If you have backups, then yes, delete everything and restore, so you know everything is OK. If you don't, some corruption might be better than no data at all. You should really read up on how parity in Unraid works; things would make more sense. https://wiki.unraid.net/Parity#How_parity_works
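
To make the single-parity point above concrete, here is a minimal Python sketch of an XOR-based rebuild. It is only an illustration, not Unraid's actual code: the 4-byte "disks", their contents, and the helper name xor_blocks are made up, and the model of Disk 2 returning zeros after dropping offline is an assumption about how a failed read might look. It shows that a missing disk can be recomputed only when every other disk reads back correctly.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Toy 4-byte "disks"; the contents are hypothetical.
disk1 = bytes([0x10, 0x20, 0x30, 0x40])
disk2 = bytes([0x0A, 0x0B, 0x0C, 0x0D])
disk3 = bytes([0xDE, 0xAD, 0xBE, 0xEF])

# Single parity is the XOR of all data disks.
parity = xor_blocks([disk1, disk2, disk3])

# Disk 3 fails and is rebuilt from parity plus every other data disk.
rebuilt_ok = xor_blocks([parity, disk1, disk2])

# Same rebuild, but disk 2 drops offline halfway through.
# Assumption: the unreadable half comes back as zeros.
disk2_bad = disk2[:2] + bytes(2)
rebuilt_bad = xor_blocks([parity, disk1, disk2_bad])

print("clean rebuild matches original:", rebuilt_ok == disk3)
print("rebuild with disk 2 errors matches original:", rebuilt_bad == disk3)
print("bytes rebuilt before disk 2 dropped are intact:", rebuilt_bad[:2] == disk3[:2])
```

In this toy model, everything rebuilt after Disk 2 stopped returning good data is wrong, and xfs_repair can only tidy up the filesystem structures it finds on top of that data; it cannot recover the original bytes.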