July 29, 2016

EDIT: The problem has changed slightly, so I'm moving the original post below the line and modifying the title.

When I start the array, Disk 1 shows as unmountable. Running xfs_repair from SSH shows:

# xfs_repair -v /dev/md1
Phase 1 - find and verify superblock...
        - block cache size set to 3025568 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 17489 tail block 17484
ERROR: The filesystem has valuable metadata changes in a log which needs to be replayed. Mount the filesystem to replay the log, and unmount it before re-running xfs_repair. If you are unable to mount the filesystem, then use the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount of the filesystem before doing this.

Right now the array is started in Maintenance Mode. I don't know what to do next.

----------------------------------------------------------

Hi all,

I was having a problem with drives not being seen by my motherboard, which turned out to be a backplane issue in my old Norco RPC-4220. Details here

I replaced my 4220 with a new 4224. Prior to moving hardware, I booted unRAID to double check my drive assignments, et al. During that time I had Disk 2 go bad on me. Because of the instability of my backplanes (see thread above) I figured I'd move to the new case, pop my spare in, and rebuild.

So I moved all my hardware over and added an LSI 9211-8i to handle the extra bays. I put the LSI in the x16 slot and moved the 2x AOC-SASLP-MV8 over.

This is when things got weird.

1) Upon boot, I tried to update my number of devices displayed in the Dynamix GUI from 20 to 24. This crashed the web server. <-- this happens every time

2) I SSH'd into the machine and rebooted. The machine then hangs after the MV8s run through their drive spinup. <-- this is happening on reboots consistently. Cold boots come up fine.

After looping through 1 and 2 a couple of times, I decided to leave the number alone for now and get the spare drive added to the array for the rebuild.

3) Replaced Disk 2 in the GUI, started the array, and immediately Disks 1, 2, and 16 came up as unmountable. The GUI was telling me I needed to format the drives. (I did not do this.) Also, it told me a rebuild was in progress.

4) I cancelled the rebuild and powered down. Upon reboot, Disk 2 was unassigned (expected), but now Disk 16 is reported as missing. The log file is empty except for one line complaining about needing to move a certain amount of bytes and another memory error. I should have written this down. So I have no log file to attach here.

I have rebooted and am running memtest while I get some sleep and see what replies show up here.

At this point, I have no idea what I should be doing. I don't know how much data is lost. And I have no idea what my server is doing. Help!

I should point out that all 3 disks that were unmountable in step 3 are on the MV8s. None of them are on the added LSI.

Thanks,
Spall
July 29, 2016 Author

Update:

I solved problem 1 above by booting into safe mode. I was able to assign the number of devices. I don't know if this is a bug in the latest Dynamix GUI?

Problem 2 was solved by moving the controllers to different PCIe slots. Putting the MV8s first seems to fix that issue.

However, when I try to bring the array online, the GUI hangs. I am currently sitting in safe mode without the array started. I have attached diagnostics. Any help is appreciated. I am at a loss on how to proceed further.

Thanks!
-Spall

spock-diagnostics-20160729-1152.zip
July 30, 2016 Author

Restarted the array in Maintenance Mode. Only disk 1 was showing as unmountable. I let the rebuild take place (which hopefully was a good course of action) and disk 2 is showing as fine now. So it seems my only problem that I'm aware of at the moment is that disk 1 won't mount and I cannot start the array normally. I have updated the original post above.
July 30, 2016 Community Expert

An unmountable disk shouldn't prevent you from starting the array. Post a screenshot and a new diagnostic.
July 30, 2016 Author

Trurl, I guess I used the wrong wording. I can start the array, but disk 1 is unmountable. I was using the idea that an unmountable disk meant the array is not "normal". FYI, I'm still booted into safe mode. Info requested is attached. Thanks for taking a look.

spock-diagnostics-20160730-1300.zip
July 30, 2016 Community Expert

I believe your only option is running xfs_repair in maintenance mode with the -L option.
July 30, 2016 Author

@dikkiedirk: I remember reading about that, but I thought that was limited to using the SM AOC-SASLP-MV8 and AOC-SAS2LP-MV8 in a mixed environment. Are you suggesting that adding the LSI HBA would have caused a problem?

@johnnie.black: When I read about -L, it says: "Force Log Zeroing. Forces xfs_repair to zero the log even if it is dirty (contains metadata changes). When using this option the filesystem will likely appear to be corrupt, and can cause the loss of user files and/or data." I don't really understand what I would be getting out the other side of that and what my next step would be, should I go that route.
July 30, 2016 Community Expert

"I don't really understand what I would be getting out the other side of that and what my next step would be, should I go that route."

Hopefully you'd get a mounted disk in the end. Since the disk doesn't mount, using -L, while not ideal, is your only option unless someone else has a better idea. If you want to play it safer, rebuild the disk to a spare and then run xfs_repair on it.
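For anyone following along, the order that the xfs_repair error message itself recommends (attempt a mount to replay the log first, and use -L only if the mount fails) can be sketched as a small shell function. This is a minimal sketch, not an unRAID tool: the names `run` and `repair_with_log_fallback`, the `DRY_RUN` switch, and the mount point `/mnt/xfs_replay` are all made up for illustration, and it assumes the array is in Maintenance Mode so that /dev/md1 is not mounted. By default it only prints the commands it would run.

```shell
#!/bin/bash
# Sketch only: the function names, DRY_RUN, and the mount point are illustrative
# assumptions, not part of unRAID or xfsprogs.
# DRY_RUN defaults to 1, so commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Try to replay the XFS log by mounting; fall back to -L only if the mount fails.
repair_with_log_fallback() {
    local dev="$1" mnt="${2:-/mnt/xfs_replay}"
    run mkdir -p "$mnt" || return 1
    if run mount -t xfs "$dev" "$mnt"; then
        # A successful mount replays the log, so it does not have to be destroyed.
        run umount "$mnt" || return 1
        run xfs_repair -v "$dev"
    else
        # Last resort: zero the log, accepting that recent metadata changes may be lost.
        run xfs_repair -v -L "$dev"
    fi
}
```

Calling `repair_with_log_fallback /dev/md1` prints the planned commands; setting `DRY_RUN=0` would actually run them. In this thread the disk already refused to mount, so the -L branch is the one that applied.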
July 30, 2016 Author

If I rebuild to a spare, what am I potentially getting here, since I just rebuilt disk 2? Is parity actually valid at this point? Would disk 1 even rebuild? Am I rebuilding garbage?

I like the idea of safer, but I just don't have a good handle on what is actually or potentially going to happen with the data on drive 1. Hell, at this point I don't even know what was on drive 1. I need to think about that moving forward: a content inventory.

Thanks for the input so far.
July 30, 2016 Community Expert

The idea of rebuilding to a spare is that it would then give you two copies of the disk that you could try to recover files from. Then if the xfs_repair on one disk didn't go well, you would still have the other. What you might then try differently to recover files from the other, I don't know.
August 1, 2016 Author

Thanks everyone. I'll try rebuilding and then repairing disk 1 and see what happens. I'll most likely start that tomorrow, but I'll update the results here when it finishes.
August 3, 2016 Author

Hey guys,

Rebuilt the disk. Did an xfs_repair -L, and everything seems good. I've been spot checking files and have found no issues. I don't know if anything is missing. I need to find a good solution for keeping an inventory. Any suggestions?

At any rate, is there anything else I should be checking? Or should I just be happy I'm this far?

Oh! Also, I assume running a parity check at this point would be good, yes?

Thanks!
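On the inventory question, one simple approach is a per-disk list of files with checksums, saved somewhere other than the disk it describes. The following is a minimal sketch under stated assumptions: the function names `make_inventory` and `verify_inventory` are made up for illustration, it relies on GNU find, sort, xargs, and md5sum as found on a typical Linux system, and it is not something proposed by anyone in the thread.

```shell
#!/bin/bash
# Sketch only: function names and paths are illustrative assumptions.

# Write "checksum  ./relative/path" lines for every file under DIR into OUT.
# Keep OUT outside DIR so the inventory does not list itself.
make_inventory() {
    local dir="$1" out="$2"
    ( cd "$dir" && find . -type f -print0 | sort -z | xargs -0 -r md5sum ) > "$out"
}

# Re-check the files under DIR against a saved inventory.
# INV must be an absolute path because the check runs from inside DIR.
# Returns non-zero if any file is missing or its checksum differs.
verify_inventory() {
    local dir="$1" inv="$2"
    ( cd "$dir" && md5sum -c --quiet "$inv" )
}
```

For example, `make_inventory /mnt/disk1 /boot/inventory-disk1.md5` could be run periodically (both paths are placeholders), and `verify_inventory /mnt/disk1 /boot/inventory-disk1.md5` after a repair would list any files that are missing or changed.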
August 3, 2016 Community Expert

A non-correcting parity check is often recommended after rebuilding a data disk so the rebuild can be checked without changing parity.
August 4, 2016 Author

Done! Completed with 0 errors. All data intact as far as I can tell.

Thanks again everyone. Marking as solved.