JorgeB, posted February 11, 2019: Yes, that's expected now. When the rebuild finishes, try xfs_repair on both disks, but as mentioned you'll likely need a file recovery utility to recover anything. P.S. Unrelated, but the SSD needs a new SATA cable.
nickro8303 (author), posted February 11, 2019: 4 minutes ago, johnnie.black said: "Yes, that's expected now, when the rebuild finishes try xfs_repair on both, but like mentioned you'll likely need to use a file recovery util to recover anything." OK, I'll give that a shot when the rebuild is done and post an update to this thread. Thanks again for all the help you and Trurl have given me. 5 minutes ago, johnnie.black said: "P.S. Unrelated, but the SSD needs a new SATA cable." I've already replaced that cable a few times. Again, I think failing SATA ports on my motherboard are what caused these issues in the first place. That SSD and cable are brand new.
nickro8303 (author), posted February 12, 2019 (edited February 12, 2019): The data rebuild on disk 4 finished. Disks 2 and 4 are still showing up as "Unmountable: No file system". Should I go ahead with xfs_repair? Attachment: tower-diagnostics-20190211-2039.zip
JorgeB, posted February 12, 2019: 5 hours ago, nickro8303 said: "Should I go ahead with xfs_repair?" Yes. xfs_repair will likely need to search for a secondary superblock. Let it run; it can take some time.
nickro8303 (author), posted February 12, 2019: OK, I finished xfs_repair on both disk 2 and disk 4. The repair on disk 2 failed, but disk 4 finished with the info below. It looks like I need to start the array, stop it, go back to maintenance mode, and rerun xfs_repair? Attachment: tower-diagnostics-20190212-1100.zip
JorgeB, posted February 12, 2019: Usually you need to use -L when this happens.
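For readers following along, the sequence discussed in the posts above (check first, then repair, then -L only if the repair asks for it) could be sketched roughly as below. This is a sketch under assumptions, not the exact commands used in the thread: it assumes the array is started in maintenance mode and that array disk N is exposed as /dev/mdN, which can differ between Unraid releases. The helper name `xfs_cmd` is hypothetical, and it only prints the commands so that nothing is modified until you run one yourself.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) an xfs_repair command for an array disk.
# Assumption: disk N is exposed as /dev/mdN while the array is in maintenance mode.
xfs_cmd() {
    disk="$1"   # array disk number, e.g. 4
    mode="$2"   # check | repair | zero-log
    case "$mode" in
        # -n: no-modify mode, only reports what would be repaired
        check)    echo "xfs_repair -n /dev/md${disk}" ;;
        # -v: verbose output during an actual repair
        repair)   echo "xfs_repair -v /dev/md${disk}" ;;
        # -L: zero the log; a last resort that can discard recent metadata changes
        zero-log) echo "xfs_repair -L /dev/md${disk}" ;;
        *)        echo "usage: xfs_cmd <disk> check|repair|zero-log" >&2
                  return 1 ;;
    esac
}

# Print the three candidate commands for disk 4.
xfs_cmd 4 check
xfs_cmd 4 repair
xfs_cmd 4 zero-log
```

Printing instead of executing is deliberate here: -L discards the filesystem log, so it is worth reading the command and the previous xfs_repair output before running it.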
nickro8303 (author), posted February 12, 2019: Well, the second scan finished. I'm not sure what to make of the results, though. Attachment: tower-diagnostics-20190212-1138.zip
JorgeB, posted February 12, 2019: I've never seen that before. Try running xfs_repair again, but it likely won't be fixable.
nickro8303 (author), posted February 12, 2019: Honestly, I'm not worried about recovering data at this point. I really just want to get the array back to stable working order. I have backups of important files, and everything else can be re-downloaded. Should I just go ahead and format the drives and try adding them back to the array?
trurl, posted February 12, 2019: 7 minutes ago, nickro8303 said: "Should I just go ahead and format the drives and try adding them back to the array?" You must format the drives while they are still in the array. Stop the array, click on the drive to get to its page, and change its filesystem. When you start the array, it will be reformatted to the new filesystem. Repeat to get it back to the original filesystem. You can do both drives at the same time.
nickro8303 (author), posted February 12, 2019 (edited February 12, 2019): Well, I went ahead and ran another repair, and it looks like it finished successfully this time. Attachment: tower-diagnostics-20190212-1206.zip
JorgeB, posted February 12, 2019: It should mount now. The question is, will there be any data there? Start the array and check.
nickro8303 (author), posted February 12, 2019: 5 minutes ago, johnnie.black said: "It should mount now, the question is, will there be any data there? Start the array and check." There are a ton of folders in the lost+found folder on that disk. Some of them look like they have good data, some are empty, and some have files with random numbers as names. I'm not even sure what to do with all this. I don't know what the file structure was on the drive before it failed, so I'm not sure how to go about restoring this data.
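One way to get a first overview of a lost+found tree like the one described above is to count what is actually in it before moving anything. The snippet below is only a sketch: the function name `summarize_lost_found` is hypothetical, the example path assumes Unraid's usual per-disk mount point, and it relies on the `-mindepth` and `-empty` options found in GNU find. It reads the tree and changes nothing.

```shell
#!/bin/sh
# Hypothetical helper: summarize a lost+found tree without modifying it.
summarize_lost_found() {
    dir="$1"
    # Directories below $dir that contain nothing at all
    empty_dirs=$(find "$dir" -mindepth 1 -type d -empty | wc -l)
    # Regular files anywhere below $dir
    files=$(find "$dir" -type f | wc -l)
    # $(( )) normalizes any padding that wc may add to its output
    echo "empty_dirs=$((empty_dirs)) files=$((files))"
}

# Example (assumed path; adjust to the affected disk):
# summarize_lost_found "/mnt/disk4/lost+found"
```

For the files named with random numbers, running the standard `file` utility on a few of them reports the detected content type, which can help decide which recovered folders are worth keeping.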
nickro8303 (author), posted February 12, 2019: So disk 4 keeps failing to allow writes. I've rerun xfs_repair a few times, which temporarily fixes it, but it just goes right back to the "Unable to write to disk" error. I'm not sure if this means the drive has failed and needs to be replaced, or what. I'm still waiting on new SATA cables, which should arrive on Friday. Attachment: tower-diagnostics-20190212-1624.zip
JorgeB, posted February 13, 2019: There's recurring filesystem corruption. I suggest you copy any data there to other disk(s) and re-format the disk.
nickro8303 (author), posted February 18, 2019: I'm at my wits' end with this server. I've now replaced all the SATA cables and the motherboard, and I'm still getting failing drives and communication errors. Disk 4 keeps showing up as Unmountable, and disk 3 is now showing a red X. I really don't know what to do at this point. I'm ready to just trash the whole thing and start over. Attachment: tower-diagnostics-20190218-0829.zip
JorgeB, posted February 18, 2019: The diags were taken after rebooting, so there's not much info. Try running xfs_repair again on disk 4, but it appears there's a missing or damaged superblock, which is unusual.
nickro8303 (author), posted February 18, 2019 (edited February 18, 2019): xfs_repair finished. I really don't understand what could be causing all these errors. I get that disk 4 is corrupt, but how would that affect the other disks in the array? Why do other disks randomly go red X? How do I get back to the point where it's stable? I could replace disk 4 completely, but will that solve the other issues that keep recurring? Attachment: tower-diagnostics-20190218-1003.zip
JorgeB, posted February 18, 2019: Filesystem corruption and disabled disks are two completely different things. Did you re-format the disk? On 2/13/2019 at 7:43 AM, johnnie.black said: "There's recurring filesystem corruption, I suggest you copy any data there to other disk(s) and re-format the disk."
nickro8303 (author), posted February 18, 2019 (edited February 18, 2019): I did reformat once, and it keeps going back to Unmountable. I did it again, though, so we'll see if it sticks. Maybe I should just replace the drive and be done with it? How do I troubleshoot drives being disabled? First it was the parity drive and now disk 3. I've gone through doing a New Config, but it's happened three times now. Attachment: tower-diagnostics-20190218-1127.zip
nickro8303 (author), posted February 18, 2019 (edited February 18, 2019: updated with diags): I did the New Config again and started the array. Parity is rebuilding. Disk 3 is back up and disk 4 mounted. Should I let the rebuild finish? Attachment: tower-diagnostics-20190218-1147.zip
JorgeB, posted February 18, 2019: Yes, and if another disk gets disabled again, grab the diags before rebooting.
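The earlier reply in this thread noted that diagnostics taken after a reboot contain little useful information, so saving the log before rebooting is what makes it usable. As a rough illustration only, a log file could be copied to persistent storage along these lines. The function name `save_log` is hypothetical, and the paths in the commented example (/var/log/syslog and /boot/logs) are assumptions about a typical Unraid install; the diagnostics zip from the web GUI remains what the forum asks for.

```shell
#!/bin/sh
# Hypothetical helper: copy a log file to a persistent directory
# under a timestamped name, and print the path of the copy.
save_log() {
    src="$1"        # log file to preserve
    dest_dir="$2"   # directory that survives a reboot
    mkdir -p "$dest_dir" || return 1
    out="$dest_dir/syslog-$(date +%Y%m%d-%H%M%S).txt"
    cp "$src" "$out" && echo "$out"
}

# Example (assumed Unraid paths):
# save_log /var/log/syslog /boot/logs
```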
nickro8303 (author), posted February 19, 2019: Parity rebuild finished with a ton of errors and disk 3 has been disabled. Attachment: tower-diagnostics-20190218-2222.zip
JorgeB, posted February 19, 2019: The disk that dropped is on a Marvell controller. Those are known to drop disks, so I suggest you replace it.
nickro8303 (author), posted February 19, 2019: I replaced the controller and everything seems to be stable for now. I'm just waiting for parity to sync again. Hopefully there won't be any more issues going forward, since I've replaced the motherboard, SATA cables, and controller. Attachment: tower-diagnostics-20190219-1138.zip