assur191 Posted October 8, 2020

Hello All,

I have an array of 22 drives and 1 parity drive, all xfs formatted. Simultaneously, Disk 2 is showing "Unmountable: no filesystem" and Disk 17 is showing disabled, with contents emulated.

I would like to try to restore Disk 2 and also replace Disk 17. I would appreciate any feedback on the order in which I should resolve these issues, or what approaches I might take.

I have attached a syslog I just captured after stopping and restarting the array. Please let me know any additional information I can provide.

Attachment: syslog.txt
ChatNoir Posted October 8, 2020

The syslog is good, but the full diagnostics would be better for giving proper advice. Please attach them to your next post (Tools/Diagnostics).
assur191 Posted October 8, 2020

Thank you, ChatNoir. I have attached my diagnostics here.

Attachment: tower-diagnostics-20201008-0840.zip
trurl Posted October 8, 2020

13 hours ago, assur191 said: I have an array of 22 drives and 1 parity drive
With that many you should consider dual parity.

13 hours ago, assur191 said: Disk 2 is showing "Unmountable: no filesystem"
Disk3 is the unmountable disk according to those diagnostics.

13 hours ago, assur191 said: Disk 17 is showing disabled
Not getting SMART for disk17. Do you have some reason to think it is actually bad and not just a bad connection? Syslog indicated corruption on disk17 also before it became disabled, but since the emulated disk is mounting maybe it is OK.

You could check connections and see if we can get SMART for disk17. If it is OK you could rebuild to the same disk, but rebuilding to a new disk and keeping the original is always a good approach too: if the original disk is good, you might be able to recover something from it if there are any problems with the rebuild.

I would be inclined to do the rebuild first so you at least get back to parity protection, then repair the disk3 filesystem after that.

Also, you should turn off mover logging in Scheduler. Those entries are not anonymized, and unless you are trying to diagnose a mover problem it is best not to log them. The syslog is also easier to read without them.

And I see you have Marvell controllers; those might be the root of your trouble.
JorgeB Posted October 8, 2020

40 minutes ago, trurl said: you have Marvell controllers
Yep, two SASLP; you should definitely get rid of those. It's not clear the disk is OK, so you need to reboot and post new diags. Even so, those controllers can cause more trouble when there are errors because the driver crashes.

Also make sure scheduled parity checks are set to non-correcting.
assur191 Posted October 9, 2020

Thanks all for your feedback. I rebooted the server and checked connections, and now I'm seeing this message:

Unraid array errors: 08-10-2020 22:41
Notice [TOWER] - array turned good
Array has 0 disks with read errors

The drive is now showing "healthy" under SMART, where it was "error" before. Is there any way I can just re-enable the drive without rebuilding, then fix the filesystem on drive 3?

I have attached the new diagnostics here.

Also, regarding the Marvell controllers, are there any suggestions on what I should replace them with?

Attachment: tower-diagnostics-20201009-1018.zip
JorgeB Posted October 9, 2020

20 minutes ago, assur191 said: The drive is now showing "healthy" under SMART,
It's not that healthy; in fact it appears to be failing. You can confirm by running an extended SMART test.

21 minutes ago, assur191 said: then fix the filesystem on drive 3?
You can do that now.

21 minutes ago, assur191 said: Also, regarding the Marvell controllers, are there any suggestions on what I should replace them with?
Any LSI with a SAS2008/2308/3008/3408 chipset in IT mode, e.g., 9201-8i, 9211-8i, 9207-8i, 9300-8i, 9400-8i, etc., and clones such as the Dell H200/H310 and IBM M1015. The clones need to be crossflashed.
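For readers following along: the extended SMART test mentioned above can also be started from a terminal with `smartctl`. The sketch below only builds the two commands as argument lists and prints them; it does not run anything. The device path `/dev/sdX` is a placeholder and the helper names are made up for illustration, so substitute the device that actually corresponds to disk17.

```python
import shlex


def extended_test_cmd(device):
    """Build the command that starts a long (extended) SMART self-test."""
    return ["smartctl", "-t", "long", device]


def report_cmd(device):
    """Build the command that prints all SMART info, including the self-test log."""
    return ["smartctl", "-a", device]


if __name__ == "__main__":
    dev = "/dev/sdX"  # placeholder, not a real device name from the thread
    for cmd in (extended_test_cmd(dev), report_cmd(dev)):
        # Print the command line for copy/paste instead of executing it.
        print(shlex.join(cmd))
```

The extended test runs in the background on the drive and can take many hours on a large disk; the report command is what you would run afterwards to read the result.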
trurl Posted October 9, 2020

37 minutes ago, assur191 said: Is there any way I can just re-enable the drive without rebuilding
You should rebuild unless you have some good reason to suspect the rebuild will not produce a good result. A disabled disk is out of sync with the array, and rebuilding will get it back in sync.

The alternative is to rebuild parity instead to get the array back in sync, but it is the disabled disk that is out of sync, not parity.

See this recent post for more details.
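As a rough illustration of why a disabled disk ends up out of sync (not from the thread): single parity is commonly described as a byte-wise XOR across the data disks. The toy model below assumes that scheme, uses three tiny made-up "disks", and shows that a write made while a disk is disabled lands in parity only, so the emulated contents differ from what is physically on the disk until it is rebuilt.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three toy "data disks" (made-up contents) and their single parity.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)

# Disk index 1 is disabled. Its physical contents stay as they were.
stale = data[1]
others = [data[0], data[2]]

# Emulated contents come from the other disks plus parity.
emulated = xor_blocks(others + [parity])
assert emulated == stale  # still in sync: nothing written yet

# A write to the emulated disk only updates parity.
new_contents = b"\xaa\xbb"
parity = xor_blocks([parity, stale, new_contents])

# Now the emulated disk and the physical disk disagree.
emulated = xor_blocks(others + [parity])
assert emulated == new_contents
assert emulated != stale

# Rebuilding writes the emulated contents back to a physical disk.
rebuilt = emulated
print(rebuilt.hex())
```

Rebuilding the disk takes the emulated contents as the truth; rebuilding parity instead would take the stale physical disk as the truth and discard anything written while it was disabled.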
assur191 Posted October 9, 2020

Thanks again for the responses. I will go ahead and rebuild disk 17 and repair the filesystem on disk 3. However, I just want to confirm that I should first rebuild 17, then repair 3. Will that order result in the least amount of lost data?
JorgeB Posted October 10, 2020

12 hours ago, assur191 said: Will that order result in the least amount of lost data?
I would repair the filesystem first, since that should be quick and makes the data on that disk available now, then rebuild. Neither order should be more or less risky than the other, though.
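For reference, a filesystem repair on an xfs array disk is typically a check-only pass with `xfs_repair -n` followed by a real repair without `-n`. The sketch below only builds those command lines. It assumes the array is started in maintenance mode and that disk N is exposed as `/dev/mdN`, which is an assumption about the device naming on this Unraid version, not something stated in the thread; the helper name is hypothetical.

```python
def xfs_repair_cmd(disk_number, check_only=True):
    """Build an xfs_repair command line for an array disk.

    Assumption: the array disk is exposed as /dev/md<N>. Verify the
    actual device name on your system before running anything.
    """
    device = f"/dev/md{disk_number}"
    cmd = ["xfs_repair"]
    if check_only:
        cmd.append("-n")  # no-modify mode: report problems, change nothing
    cmd += ["-v", device]  # -v: verbose output
    return cmd


if __name__ == "__main__":
    # First a read-only check of disk3, then the actual repair.
    print(" ".join(xfs_repair_cmd(3, check_only=True)))
    print(" ".join(xfs_repair_cmd(3, check_only=False)))
```

Running the check-only pass first shows what the repair would change before committing to it.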