gideva Posted July 21, 2017
Just changed one of the disks that failed. After a couple of hours I started the parity check, and this is what I have... Any suggestions? Thanks
JorgeB Posted July 21, 2017
Since it's correcting a large number of sync errors, it's normal for it to slow down a lot. Are so many sync errors expected?
trurl Posted July 21, 2017
1 hour ago, gideva said: Just changed one of the disks that failed. After a couple of hours I started the parity check, and this is what I have...
Why are you doing a parity check instead of a data rebuild? Post your diagnostics.
SSD Posted July 21, 2017
2 hours ago, gideva said: Just changed one of the disks that failed. After a couple of hours I started the parity check, and this is what I have... Any suggestions? Thanks
How did you change the disk? A disk rebuild? Assuming yes, I am not sure why you'd have sync errors. Did the rebuild proceed at normal speed? If so, that would be a good sign. Did you open the case AFTER the rebuild?
Looking at the number of sync errors: 154,000+. That sounds like a lot but isn't for 550G of processed space. Each block, I think, is 8 sectors (4K), so 154,000 errors is only about 600M. Way less than 550G! About 0.1%. So most of the disk is verifying, but some is not.
The fact that it is so slow is not due to the sync errors. There are too few of them to slow it down that much. Even if every block were bad, it would not be this slow.
So either the rebuild went flawlessly and there is now some connectivity / HBA issue that is causing performance and data accuracy problems, or the rebuild was suffering from the same issues and we are seeing them continue into the parity check. Parity is being updated to match the rebuilt disk. Probably a bad thing.
Stop this, answer my questions, and post a diagnostic file.
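The estimate in the post above can be checked with a few lines of Python. This is only a sketch of that arithmetic: it takes the post's own guess that each sync error covers one 4 KiB block (8 sectors of 512 bytes), and it assumes "550G" means decimal gigabytes. The names below are illustrative.

```python
# Rough check of the sync-error estimate from the post above.
# Assumption (the poster's own guess): one sync error = one 4 KiB block.
SYNC_ERRORS = 154_000
BLOCK_BYTES = 8 * 512            # 8 sectors of 512 bytes = 4096 bytes
PROCESSED_BYTES = 550 * 10**9    # "550G" read as decimal gigabytes (assumption)


def mismatch_stats(errors, block_bytes, processed_bytes):
    """Return (mismatched bytes, mismatched fraction of the processed space)."""
    bad_bytes = errors * block_bytes
    return bad_bytes, bad_bytes / processed_bytes


bad_bytes, fraction = mismatch_stats(SYNC_ERRORS, BLOCK_BYTES, PROCESSED_BYTES)
print(f"{bad_bytes / 10**6:.0f} MB mismatched, {fraction:.2%} of processed space")
# prints "631 MB mismatched, 0.11% of processed space"
```

Measured in binary units the same figure is roughly 602 MiB, which matches the "600M" and "about 0.1%" quoted in the post.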
gideva Posted July 21, 2017 Author
Hi there, and many thanks for the quick reply... Coming to the answers:
1) It was a rebuild and all seemed fine. The only weird thing was that after the rebuild one of the other disks (I cannot remember which) was marked as unreadable and the system was asking to format it. After I rebooted the system all was normal.
2) I did not open the case.
I got your post just before leaving home, so I had the chance to stop everything but not to get the diagnostic file and post it. As soon as I am back home (on Sunday) I will post everything. Thank you again
trurl Posted July 21, 2017
5 hours ago, gideva said: Just changed one of the disks that failed. After a couple of hours I started the parity check, and this is what I have...
So are you saying the rebuild completed in only 2 hours and then you started a parity check?
gideva Posted July 21, 2017 Author
Probably a bit more, but it did not take very long... This is my first rebuild!
bonienl Posted July 21, 2017
It is more likely your rebuild failed before finishing due to a disk dropping offline.
trurl Posted July 21, 2017
3 hours ago, bonienl said: It is more likely your rebuild failed before finishing due to a disk dropping offline.
And now parity isn't likely to give a valid rebuild. We will have to hope to recover files from the original disk.
gideva, you should have asked for advice before doing anything. It's best to try to understand why a disk failed before deciding what to do about it.
gideva Posted July 22, 2017 Author
Great... now what? Did I just lose some data, or is the whole array compromised? What should I do now so as not to make the situation even worse? You are right, I should have asked before...
bonienl Posted July 22, 2017
Go to Tools -> Diagnostics and get the diagnostics zip file. This will help to understand the current situation better.
gideva Posted July 22, 2017 Author
I will be back home tomorrow and will do it right away... Once again, thanks for your help
gideva Posted July 23, 2017 Author
Here it is... I hope it will help you to help me.
monstruo-diagnostics-20170723-1511.zip
JorgeB Posted July 23, 2017 (edited)
The server came up from an unclean shutdown. Besides that, you're having issues with your SAS2LP: disable VT-d if you don't need it, and try a different slot (you should anyway, since it's in an x4 slot, especially if it's a DMI-shared slot).
Since the rebuild happened before this log starts, there's no way to tell whether it was successful, but the sync errors and the unclean shutdown suggest it wasn't.
Edited July 23, 2017 by johnnie.black
JorgeB Posted July 23, 2017
Forgot to mention the filesystem corruption on disk4: you need to run xfs_repair on it. If that was the rebuilt disk, it may also be a clue that the rebuild wasn't 100% successful, but it may also be a consequence of the SAS2LP issues. The same goes for some of the sync errors.
gideva Posted July 23, 2017 Author
All noted... but not completely understood (you are talking with a newbie)... Just to be sure I do it correctly this time:
1) I have to disable VT-d in the BIOS.
2) The change of slot is not really clear...
3) I will run xfs_repair on disk 4 (that was the failed disk).
Once I have done all this, shall I run the parity check again and see whether it is OK?
Thanks for your help and patience
Beppe
JorgeB Posted July 23, 2017
16 minutes ago, gideva said: 2) The change of slot is not really clear...
Move the SAS2LP to one of the top two PCIe slots.
17 minutes ago, gideva said: 3) Will run xfs_repair on disk 4 (that was the failed disk)
It's likely that the filesystem corruption is the result of an incomplete rebuild. If so, data on that disk will also be corrupt. Assuming you don't have checksums to figure out which files are bad, is the old disk still readable?
gideva Posted July 23, 2017 Author
The old disk is dead, apparently. Will try again tomorrow.
gideva Posted July 24, 2017 Author
All done, but when I try to run xfs_repair on disk 4 it does not work! It gives an error message like: ERROR: The filesystem has valuable metadata changes.....
JorgeB Posted July 24, 2017
If the disk doesn't mount, you need to use -L.
gideva Posted July 24, 2017 Author
So I have to use xfs_repair -L /dev/md1? Please try to be patient with me...
JorgeB Posted July 24, 2017
Not md1, md4 (disk4), and only if the disk is not currently mounting; it was on the last diags.
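Because the md1 / md4 mix-up above is an easy one to make, here is a small sketch that builds the command line from the array disk number instead of typing the device by hand. It only prints the command and does not run anything. The helper names are hypothetical, and the /dev/mdN naming is taken from this thread. The thread itself only confirms -L and /dev/md4; the -n option (xfs_repair's no-modify check mode) is added here as an assumption that a read-only check may be wanted first. Note that -L zeroes the filesystem log, so the pending metadata changes mentioned in the error message are discarded.

```python
import shlex
from typing import List


def md_device(disk_number: int) -> str:
    """Map an array disk number to its md device, e.g. disk4 -> /dev/md4.

    Assumes the /dev/mdN naming used in this thread.
    """
    if disk_number < 1:
        raise ValueError("array disks are numbered from 1")
    return f"/dev/md{disk_number}"


def xfs_repair_command(disk_number: int, zero_log: bool = False,
                       dry_run: bool = False) -> List[str]:
    """Build (but do not run) an xfs_repair argument list for one array disk."""
    if zero_log and dry_run:
        raise ValueError("choose either a no-modify check (-n) or log zeroing (-L)")
    cmd = ["xfs_repair"]
    if dry_run:
        cmd.append("-n")   # no-modify mode: report problems only
    if zero_log:
        cmd.append("-L")   # zero the log; pending metadata changes are lost
    cmd.append(md_device(disk_number))
    return cmd


# disk4 was the rebuilt disk in this thread
print(shlex.join(xfs_repair_command(4, dry_run=True)))   # xfs_repair -n /dev/md4
print(shlex.join(xfs_repair_command(4, zero_log=True)))  # xfs_repair -L /dev/md4
```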
gideva Posted July 24, 2017 Author
Yes, sorry, I meant md4... (I copied and pasted...)
gideva Posted July 24, 2017 Author
OK, done... What is next?
JorgeB Posted July 24, 2017
If the old disk still works, compare the files, or copy everything you can from it.
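If the old disk can still be read, the comparison suggested above could be sketched as follows. This is one possible approach, not a tool from the thread: it hashes every file under the old disk's mount point with SHA-256 and reports files that are missing from, or differ on, the rebuilt disk. The function names and the two root paths passed on the command line are hypothetical, and a read error on a failing disk will surface as an OSError that this sketch does not handle.

```python
import hashlib
import sys
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large media files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(old_root, new_root):
    """Return (missing, different): relative paths of files under old_root
    that are absent from new_root, or present with different content."""
    old_root, new_root = Path(old_root), Path(new_root)
    missing, different = [], []
    for old_file in sorted(old_root.rglob("*")):
        if not old_file.is_file():
            continue
        rel = old_file.relative_to(old_root)
        new_file = new_root / rel
        if not new_file.is_file():
            missing.append(str(rel))
        elif sha256_of(old_file) != sha256_of(new_file):
            different.append(str(rel))
    return missing, different


if __name__ == "__main__" and len(sys.argv) == 3:
    # Both arguments are mount points you choose yourself (hypothetical paths).
    missing, different = compare_trees(sys.argv[1], sys.argv[2])
    print("missing on rebuilt disk:", missing)
    print("content differs:", different)
```

Files listed as different are the ones worth copying back from the old disk, since the rebuilt copy cannot be trusted.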