Jan Kuźniak Posted July 20, 2021 (edited)

Hello! I've got an odd one today. I've got a server with 1 parity, 1 cache, and 4 data drives (2x 2TB, 1x 3TB, and 1x 6TB). Here's what happened:
- One of the old 2TB drives (Disk 2) had write errors and got disabled.
- When I went to rebuild it, the other 2TB drive (Disk 1) died hard (read errors, "missing" in unRaid).
- I can read the data from the disabled drive when I mount it; I trust it and assume the original problem was a loose cable.

I am (understandably) unable to start the array in normal mode, but I would love to temporarily "force" it to start by clearing the "disabled" flag on Disk 2 without rebuilding it, so that I can emulate Disk 1 and recover its data from parity. Is there a way to do that (e.g. by manually editing a config file / database), or am I being too optimistic and have basically just lost the Disk 1 data?

Thanks in advance for any insights, even if they only confirm my worry that an out-of-unRaid recovery is my only option.

Edited July 20, 2021 by Jan Kuźniak: fixed units to TB (brainfreeze)
trurl Posted July 20, 2021

Go to Tools -> Diagnostics and attach the complete diagnostics ZIP file to your NEXT post in this thread.
Jan Kuźniak Posted July 20, 2021

Right, sorry for omitting this one; I blame the "my NAS broke" panic mode. Please find the diagnostics attached.

nas-diagnostics-20210720-1822.zip
JorgeB Posted July 20, 2021

Disk 1 isn't even returning a valid SMART report, which is not a good sign, but replace/swap the cables and post new diags.
Jan Kuźniak Posted July 20, 2021 (edited)

Unfortunately, I am confident that Disk 1 is dead. I did the usual cable/port juggle (including the power cable), booted the machine with just this drive attached (to rule out power delivery problems), etc. It did show up once, early in the process, but only briefly, and it disappeared for good on the following attempts.

The original question remains: is there a way to force Disk 2 from disabled back to enabled (without rebuilding) so that I can start the unRaid array and recover Disk 1 using parity?

Edited July 20, 2021 by Jan Kuźniak
JorgeB Posted July 20, 2021

There's a way to do that. You'll need a new disk to replace Disk 1, ideally of the same size, but it will only work if parity is still valid; if so, I can post the instructions.
Jan Kuźniak Posted July 20, 2021

Yes, a new disk will not be a problem. If you wouldn't mind posting the instructions, that would be an invaluable help.
JorgeB Posted July 20, 2021

- Tools -> New Config -> Retain current configuration: All -> Apply
- Check all assignments and assign any missing disk(s) if needed, including the new Disk 1; the replacement disk must be the same size or larger than the old one.
- IMPORTANT: check both "Parity is already valid" and "Maintenance mode", then start the array. (The GUI will still warn that data on the parity disk(s) will be overwritten; this is normal, as the warning doesn't account for the checkbox. Nothing will be overwritten as long as it's checked.)
- Stop the array.
- Unassign Disk 1.
- Start the array (in normal mode now). Ideally the emulated disk will mount and its contents will look correct; if it doesn't mount, run a filesystem check on the emulated disk.
- If the emulated disk mounts and the contents look correct, stop the array.
- Re-assign Disk 1 and start the array to begin the rebuild.
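Not part of JorgeB's instructions, but one way to sanity-check what unRaid thinks of each slot before and after the New Config is the md driver's status dump. A minimal sketch, assuming stock unRaid where `/usr/local/sbin/mdcmd status` prints key=value pairs including a `rdevStatus.N` line per slot (values like DISK_OK or DISK_NP_DSBL); captured output is used below so the parsing is visible:

```shell
# Extract one slot's state from mdcmd-style output on stdin.
status_of_slot() {
  # $1 = slot number
  grep "^rdevStatus.$1=" | cut -d= -f2
}

# Captured sample output; on a live server you would instead pipe in
# the real dump: /usr/local/sbin/mdcmd status | status_of_slot 1
sample="rdevStatus.0=DISK_OK
rdevStatus.1=DISK_NP_DSBL
rdevStatus.2=DISK_OK"

echo "$sample" | status_of_slot 1   # prints DISK_NP_DSBL for this sample
```

The exact status strings are an assumption about the md driver's vocabulary; the point is only that the slot states are inspectable from the console, not just the GUI.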
Jan Kuźniak Posted July 20, 2021

Thanks JorgeB, I'll give it a shot. I was thinking unRaid wouldn't let me start the array with Disk 1 replaced and Disk 2 disabled; I'm glad that's an option. Keep your fingers crossed and I'll keep you posted.
Jan Kuźniak Posted July 24, 2021

Hi guys, reporting from the front line. JorgeB: your method worked, and didn't work, at the same time. I was indeed able to start the array in maintenance mode while marking parity as valid, get unRaid to accept Disk 2, and have it show that it's emulating Disk 1 once I removed the replacement drive. Unfortunately, it can't mount the emulated disk, and the disk doesn't show up in the devices, so I can't even run a filesystem check on it. Am I missing something obvious? See screenshot below.
JorgeB Posted July 24, 2021

25 minutes ago, Jan Kuźniak said: so I can't even run a filesystem check on it.

You can still run a filesystem check, using the console or the GUI. For the latter, first (with the array stopped) click on the disk and set its filesystem to xfs.
JorgeB Posted July 24, 2021

Just now, JorgeB said: and set filesystem to xfs.

Assuming it was xfs; I see you also have a btrfs filesystem.
Jan Kuźniak Posted July 24, 2021

Yes, that disk was XFS. And here's an odd one: it won't let me change the FS, the "Apply" button is grayed out... Also, when I start the array in maintenance mode, the "Check Filesystem Status" section is missing. I'm tempted to format the spare drive to XFS and redo the New Config trick you posted at first; maybe that would fool the array into emulating the original FS? (Double-checking with you guys before I irreversibly fuck something up. ;-))
JorgeB Posted July 24, 2021

Use the console: start the array in maintenance mode and type:

xfs_repair -v /dev/md1
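To expand slightly on that one-liner: xfs_repair also has a read-only mode that is worth running first, and /dev/md1 is unRaid's md device for array slot 1, which keeps parity updated during the repair (repairing the raw /dev/sdX device would invalidate parity). A sketch of the sequence, with a guard so it only touches a device that actually exists:

```shell
# Sketch only: meant for the unRaid console with the array started in
# maintenance mode. /dev/md1 = array slot 1; adjust the number to match
# the disk you are repairing.
DEV=/dev/md1

if [ -e "$DEV" ]; then
  xfs_repair -n "$DEV"   # -n: read-only check, reports problems only
  xfs_repair -v "$DEV"   # actual repair, verbose
else
  echo "$DEV not present; run this on the unRaid console instead"
fi
```

If xfs_repair stops with a dirty-log warning it will suggest the -L flag itself; that zeroes the log and can lose recent metadata, so it's a last resort.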
Jan Kuźniak Posted July 24, 2021

Seems to have worked, you're my hero! I'll attempt recovering the data and let you know how that went.
LoftyDwarf Posted July 31, 2021

I have the exact same problem: one disabled and one failed drive. I followed the instructions here for my failed disk, but xfs_repair fails with error 117. I saw a lot of file names I recognized during the repair attempt, and it put a lot of things into lost+found, so I know at least some of my data is there, but it won't actually rebuild a mountable filesystem.
Jan Kuźniak Posted August 1, 2021

In my case I was able to rebuild the filesystem, but all the files ended up in "lost+found" (I just hope I recovered everything; it's sometimes hard to tell what's missing). I'm guessing that means the first, slightly-failed disk had some minor corruption / was out of sync, and parity was able to recover almost everything. Unfortunately I can't comment further on your case.
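For anyone else facing a lost+found full of numbered inodes, the stock `file` utility is handy for guessing what each recovered entry actually is, which makes sorting them back into shares less painful. A small demo using a throwaway directory; on the array the real path would be something like /mnt/disk1/lost+found (an assumption about the mount layout):

```shell
# Demo in a temp dir; substitute the recovered disk's lost+found path
# (e.g. /mnt/disk1/lost+found) on the actual server.
LF=/tmp/lf-demo/lost+found
mkdir -p "$LF"
printf 'hello recovered file\n' > "$LF/1234"   # stand-in for a recovered inode

# 'file' guesses each entry's type (text, JPEG, tar archive, ...):
find "$LF" -type f -exec file {} \;

# Tally of recovered entries:
find "$LF" -type f | wc -l
```

From there, grouping by type (videos, photos, documents) narrows down which share each file came from.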
trurl Posted August 1, 2021

On 7/31/2021 at 12:48 AM, LoftyDwarf said: exact same problem

Start your own thread with your diagnostics.