jrhamilt Posted April 15, 2023
I've got a unique problem here, and hopefully some big brains here can help out. I'm having problems with Disk 2 and Disk 7. I know this is typically a "data's lost, move on" situation. Fortunately, I believe all of the critical data is backed up to the cloud, so this should just be an effort to avoid a massive download.
Two days ago, Disk 7 disappeared. It looked like a drive disconnect. I reseated the drive, it came online, and it rebuilt perfectly with no errors. Unfortunately, today, when I was writing some data to the array, Disk 2 started reporting "Current_Pending_Sector" errors. My take is that this drive is probably toast, and while I was trying to figure things out, Disk 7 disappeared again. I got it reconnected, but now I'm hesitating... I'm pretty certain that the contents of Disk 7 are stable and correct. I had temporarily removed it as a target from the shares, so I don't think anything has changed on that drive...
Screenshot below is of the current state.
My fear is that if I start rebuilding the array and try to rebuild from parity on top of Disk 7, I'm going to be really hosed.
I have 2 disks coming... Assuming that we can figure something out to get Disk 7 forced back in, here is the plan for those 2 drives. I plan to use 1 to replace Disk 2; I think that's the first priority. The other replaces Disk 7 (and I may connect it somewhere else). The third step is to take the drive that *was* Disk 7 and really run it through its paces to make sure it's good. If it's good, it goes back in as Parity 2...
So, questions:
1) Is there a way to confirm that there was no writing to Disk 7 (or the emulated Disk 7) after we started hitting the errors on Disk 2? Does that even matter?
2) Is there a way to confirm that Disk 7 was up when I removed it from the "share target"? (I'm pretty sure this was the case...)
3) Is there a way to force Disk 7 in without rebuilding it?
4) Does the plan for the drive movement make sense to you?
Thanks so much for the help!
whitenas-diagnostics-20230414-1950.zip
jrhamilt (Author) Posted April 15, 2023
Maybe it would be better to put a new "cleared" drive in place of Disk 7 and "try" the rebuild first? I don't know. I'm going to be out of town for the week, so I won't be able to physically touch things for a bit...
And here's the setting that I changed before Disk 7 went weird. I assume that this would prevent writes to the disk. (I don't have any dockers that would write to it either; they were all disabled.)
JorgeB Posted April 15, 2023
7 hours ago, jrhamilt said: "1) Is there a way to confirm that there was no writing to Disk 7 (or the emulated Disk 7) after we started hitting the errors on Disk 2? Does that even matter?"
If there were any writes to the emulated disk7 after it got disabled, parity will no longer be in sync if the disk is force-enabled. It will never be 100% in sync anyway due to filesystem mount/unmount, but if those are the only changes it's usually recoverable. The problem is if you don't know whether there were writes. Was the disk disabled for long? If you're not sure, it might still be worth a try.
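To illustrate why writes to an emulated disk matter, here is a toy sketch assuming a simple single-parity model in which parity is the byte-wise XOR of all data disks. The disk contents, sizes, and function names are made up for illustration; this is not Unraid's actual implementation, only the general idea behind the point above.

```python
from functools import reduce
from operator import xor

def parity_of(blocks):
    """XOR equally sized blocks together, byte by byte (single-parity model)."""
    return bytes(reduce(xor, column) for column in zip(*blocks))

def rebuild_missing(surviving_blocks, parity):
    """Reconstruct the one missing block from the surviving blocks plus parity."""
    return parity_of(list(surviving_blocks) + [parity])

# Toy array: three data "disks" of 4 bytes each (hypothetical contents).
disk1 = bytes([0x10, 0x20, 0x30, 0x40])
disk2 = bytes([0xAA, 0xBB, 0xCC, 0xDD])
disk7 = bytes([0x01, 0x02, 0x03, 0x04])
parity = parity_of([disk1, disk2, disk7])

# Case 1: parity is in sync with the physical disk7, so disk2 rebuilds exactly.
rebuilt_ok = rebuild_missing([disk1, disk7], parity)

# Case 2: one byte is written to the *emulated* disk7 while the physical
# disk is offline. Parity is updated; the physical disk7 is not.
emulated_disk7 = bytes([0x01, 0xFF, 0x03, 0x04])
parity_after_write = parity_of([disk1, disk2, emulated_disk7])

# Force-enabling the stale physical disk7 and then rebuilding disk2
# yields wrong data exactly where the emulated disk was written.
rebuilt_bad = rebuild_missing([disk1, disk7], parity_after_write)
bad_offsets = [i for i, (a, b) in enumerate(zip(rebuilt_bad, disk2)) if a != b]
```

In this model the damage is confined to the offsets that were written while the disk was emulated, which is consistent with the remark that small mount/unmount changes are usually recoverable.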
jrhamilt (Author) Posted April 15, 2023
The disk wasn't disabled for too long, and I was watching everything. A couple of "quiescent" hours. But it's a lot of data and there are other users, though typically they wouldn't be writing anything there. All of the dockers were disabled, and the mover was disabled. I suppose I did turn the array back on, and it took me a little while (again, a couple of hours) to realize that Disk 7 was offline...
Thinking it through, if there were any changes to the emulated disk 7 and we wind up rebuilding Disk 2, then I've effectively corrupted Disk 2 wherever it rebuilds in those areas (because disk 7 isn't what parity says it is).
I could remove disk 7 and check file modification times? But that doesn't help by itself. I need to check the modification times on the "emulated" disk 7 and compare them to the real disk 7. That would require me to start the array and "read" the contents of the emulated disk 7 (enough to get the file properties). Can I start the array "read only" in order to check that out? If the file mod times on Disk 7 "emulated" and Disk 7 "actual" match, is that sufficient?
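The comparison described above could be sketched roughly as follows, assuming both the emulated disk 7 and the physical disk 7 can be mounted (ideally read-only) at two different paths. The function names and result keys are hypothetical, and matching sizes and modification times would only be a heuristic, not proof that parity is in sync.

```python
import os

def collect_file_stats(root):
    """Map each file's path (relative to root) to (size, whole-second mtime)."""
    stats = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)  # lstat so broken symlinks do not raise
            stats[os.path.relpath(full, root)] = (st.st_size, int(st.st_mtime))
    return stats

def diff_trees(emulated_root, physical_root):
    """Report files that exist on only one side or differ in size/mtime."""
    emu = collect_file_stats(emulated_root)
    phys = collect_file_stats(physical_root)
    common = set(emu) & set(phys)
    return {
        "only_emulated": sorted(set(emu) - set(phys)),
        "only_physical": sorted(set(phys) - set(emu)),
        "changed": sorted(p for p in common if emu[p] != phys[p]),
    }
```

A call such as `diff_trees("/mnt/disk7", "/mnt/old_disk7")` (both paths are assumptions, not taken from the thread) returning three empty lists would suggest that no files were added, removed, or modified on the emulated disk, though filesystem metadata can still differ from mounting alone.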
jrhamilt (Author) Posted April 15, 2023
I've removed the "potentially good" disk 7 and have a new disk in its place. I'm starting the array in maintenance mode and trying to rebuild to Disk 7... Disk 2 is starting to throw read errors... Not sure what to do. Should I let it rebuild? (I still have the old disk 7, which I think / hope is good...) Need help!
jrhamilt (Author) Posted April 15, 2023
Since this is getting so many errors (up to 18,000), I'm thinking that I will let it finish. Then, is there a way for me to put the old "probably good" drive back in the array, force Unraid to consider that drive and parity to be correct, and rebuild on top of Disk 2? Then I'll have a "Disk 2 rebuild" and a "Disk 7 rebuild", plus the old Disk 2 (if it still runs), the old Disk 7, and my online backup, and from there it just is what it is... And when I find failed files, I've got options for trying to recover them...
JorgeB Posted April 16, 2023
You could have tried to force enable disk7. You still can, though parity will be more out of sync. I can post instructions if you want to try it.
jrhamilt (Author) Posted April 16, 2023
3 hours ago, JorgeB said: "You could have tried to force enable disk7, still can though parity will be more out sync, can post instructions if you want to try it."
I would like to try it, and would appreciate your instructions. At the end of my previous rebuild, it noted that disk 7 is invalid. I believe that, with it marked as invalid, there is no way I can successfully turn the array on, and I don't think that a "check disk" on disk 2 is going to do anything good.
I think the steps are something like this: put the other disk back in (the old, probably good disk 7), force Unraid to treat the previous configuration (current bad disk 2, old probably good disk 7) as good (new config? parity correct check mark? start in maintenance? something like that), then shut down, pull Disk 2 out, and start the array. Assign a new (blank) 4 TB drive as Disk 2. Start in maintenance mode. Sync. I think that series of steps will be as good as I can get it... But again, I would like instructions from the professional.
I don't need to use the array while we do any of this. It's probably better that it's not used. (I've downloaded the super critical data from my backups, so having this down for a week isn't horrible.)
Here is the final result from the main page.
JorgeB Posted April 16, 2023
This will only work if parity is still valid:
-Tools -> New Config -> Retain current configuration: All -> Apply
-Check all assignments and assign any missing disk(s) if needed, including old disk7 and a new disk2 (or use the old one for now to see if it can still be emulated). The replacement disk should be the same size or larger than the old one.
-IMPORTANT - Check both "parity is already valid" and "maintenance mode" and start the array. (Note that the GUI will still show that data on parity disk(s) will be overwritten. This is normal, as the warning doesn't account for the checkbox, but parity won't be overwritten as long as the box is checked.)
-Stop array
-Unassign disk2
-Start array (in normal mode now), and post new diags
jrhamilt (Author) Posted April 16, 2023
Hmmm - if I unassign and then start the array, we're not done, right? That's just going to emulate Disk 2 for now. Correct? Then I have to rebuild on Disk 2? Is the point to check the diags before we commit to rebuilding on 2?
Here's the screen after the start in maintenance...
JorgeB Posted April 16, 2023
5 minutes ago, jrhamilt said: "if I unassign and then start the array, we're not done, right? That's just going to emulate Disk 2 for now. Correct?"
Correct. If it mounts, you just need to rebuild, ideally to a new disk. If it doesn't, post new diags.
jrhamilt (Author) Posted April 16, 2023
That doesn't look right... That's starting the array as "unassigned". I stopped the array as soon as the diags and screenshot were complete... What now?
whitenas-diagnostics-20230416-0851.zip
jrhamilt (Author) Posted April 17, 2023
So, thinking that not much could be lost by trying a sync, I assigned a new drive to the disk2 slot and started a rebuild. All of that looked pretty normal, but I did wind up getting some read errors from the "good disk 7". I didn't write from Disk 2 to Disk 7, so this one should be unmolested, but I guess I'm realizing at this point that both drives really are most likely bad... I don't have a good feel for why the reads failed... I can't get a diagnostic right now, but this is where it's at...
JorgeB Posted April 17, 2023
First check the filesystem on the emulated disk2 to see whether the filesystem corruption is fixable and the disk is worth rebuilding.
jrhamilt (Author) Posted April 17, 2023
Can't do that from maintenance mode, right? I need to actually mount the drives with a normal startup, right?
I did finish the rebuild of Disk 2 (with the 1151 read errors from Disk 7, per above). Does that change anything? Do I just run the check on the filesystem itself, since it's not emulated?
JorgeB Posted April 17, 2023
Just now, jrhamilt said: "Can't do that from maintenance mode, right?"
Can only do it in maintenance mode.
jrhamilt (Author) Posted April 17, 2023
Hmmm. The filesystem type for Disk 2 is "auto", so there is no check disk section. I think all the drives were XFS. Should I "set" disk 2 to XFS first? Is that a doable thing?
jrhamilt (Author) Posted April 17, 2023
Here are some screenshots of the disk 2 details and the overall array status. The system's default filesystem is XFS.
JorgeB Posted April 17, 2023
25 minutes ago, jrhamilt said: "Should I "set" disk 2 to XFS first? Is that a doable thing?"
Yes, with the array stopped, click on disk2 and change the fs to xfs.
jrhamilt (Author) Posted April 17, 2023
Here are the results.
JorgeB Posted April 17, 2023
Run it again without -n, or nothing will be done, and if it asks for -L, use it.
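For readers following along from a terminal rather than the GUI, the two passes described above (`-n` for a report-only check, then a real repair, adding `-L` only if requested) might be assembled like this. This is a minimal sketch: the helper names are hypothetical, and the device path in the usage note is an assumption that is not taken from the thread.

```python
import subprocess

def xfs_repair_command(device, dry_run=True, zero_log=False):
    """Build an xfs_repair argument list.

    dry_run=True adds -n (check only, nothing is modified).
    zero_log=True adds -L (zero the log); only for an actual repair,
    and only when xfs_repair itself asks for it.
    """
    if dry_run and zero_log:
        raise ValueError("zeroing the log makes no sense in a check-only run")
    cmd = ["xfs_repair"]
    if dry_run:
        cmd.append("-n")
    if zero_log:
        cmd.append("-L")
    cmd.append(device)
    return cmd

def run_repair(cmd):
    """Run the command and return (exit code, combined output text)."""
    done = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return done.returncode, done.stdout + done.stderr
```

For example, `xfs_repair_command("/dev/md2")` gives the check-only command, and `xfs_repair_command("/dev/md2", dry_run=False)` the repair; `/dev/md2` is a placeholder device name, so confirm the correct device for the disk slot before running anything.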
jrhamilt (Author) Posted April 17, 2023
Done. I don't think it asked for -L.
itimpi Posted April 17, 2023
When you start the array in normal mode that drive should now mount OK.
jrhamilt (Author) Posted April 17, 2023
It seems like, even with the read errors from the "original disk 7" and the potential for bad sectors on Disk 2 from the rebuild, this is the best path forward, right? Now, what's to be done with Disk 7?
When I bring the disk online, I plan to have the backup service try to restore all the files it backed up in place, overwriting things as it goes. Then I will do some other checks on the data that wasn't backed up.
I also plan to add a second parity. Thoughts on when I should do that?
JorgeB Posted April 17, 2023
Did the disk2 rebuild finish with errors on disk7, or did you cancel it? If you didn't reboot, post new diags.