mikamap Posted February 21, 2022 (edited)

Hi,

On my Unraid server, I currently have 2 parity drives (6TB and 4TB) and 7 data disks (3TB or 4TB). Yesterday I was getting SMART errors from a disk in my array, so I ordered 2 new 4TB Seagate IronWolf drives and replaced the drive with one of the new ones. Everything seemed to work fine. Then the parity sync was halted and 2 other drives started reporting errors (1 parity drive and 1 other data drive). The 2 drives with errors were disabled, and the one I tried to replace was shown as "Emulated".

My mistake, I think, was to unassign the disabled drives and start the array: now I cannot start it again, since it is missing 3 disks.

What I'm thinking:
1- Put the drive with the errors back in the array (since the data should already be there)
2- Do a "New Config", since the 3 drives should have the data on them.

Edit: I also plan to change the PSU and all the SATA cables, as I have read that some of these errors (CRC) are caused by bad SATA cables or fluctuating power.

Is there a good chance it will work? Should I first try to back up the data I have at the moment to an external drive before doing this?

Thank you

Edited February 22, 2022 by mikamap
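For readers who want to check the CRC angle mentioned in the post above, here is a minimal Python sketch that reads the CRC error counter out of `smartctl -A` output. It assumes smartmontools is installed and that the drive reports the counter as attribute ID 199 (commonly named `UDMA_CRC_Error_Count`, though names vary by vendor). The sample text and the function names are made up for illustration and are not taken from the poster's diagnostics.

```python
import subprocess

# Illustrative sample only (NOT from the poster's diagnostics), shaped like
# the attribute table printed by `smartctl -A /dev/sdX`.
SAMPLE_SMART_OUTPUT = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0022   031   040   000    Old_age   Always       -       31 (0 18 0 0 0)
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       12
"""


def parse_smart_attributes(text):
    """Return {attribute_id: (name, raw_value)} for rows with an integer raw value."""
    attrs = {}
    for line in text.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(parts) >= 10 and parts[0].isdigit():
            try:
                raw = int(parts[9])
            except ValueError:
                continue  # skip raw values that are not plain integers
            attrs[int(parts[0])] = (parts[1], raw)
    return attrs


def crc_error_count(text):
    """Raw value of attribute 199, or None if the drive does not report it."""
    entry = parse_smart_attributes(text).get(199)
    return entry[1] if entry else None


def read_smart(device):
    """Run smartctl on a device such as /dev/sdb (usually needs root)."""
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True, check=False
    )
    return result.stdout


if __name__ == "__main__":
    print(crc_error_count(SAMPLE_SMART_OUTPUT))
```

The raw CRC counter is cumulative and does not reset, so a non-zero value alone says little; what matters is whether it keeps increasing after cables or ports are changed.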
JorgeB Posted February 22, 2022

Please post the diagnostics.
mikamap Posted February 22, 2022

These are the diagnostics from when 2 drives were disabled because of errors and 1 was INVALID because the parity sync was not successful.

tower-diagnostics-20220221-1646.zip
JorgeB Posted February 22, 2022

The problem was caused by issues with the onboard SATA controller. This is quite common with some Ryzen boards, especially under load; a BIOS update might help, or you can use an add-on controller.

As for the current situation, first you need to reboot to clear the controller problem, then you can re-enable the disabled disks and try to rebuild again, though the same thing might happen if the controller errors out again. We assume the disabled disks are OK, but since there's no SMART report for any of them it's best to reboot now and post new diags. Also keep the old disk intact for now in case it's needed.
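As a rough illustration of how a controller problem can show up in the logs, the Python sketch below counts ATA error messages per port in syslog text. The sample lines follow the usual Linux kernel message format but are invented for this example; they are not from the diagnostics posted in this thread, and the list of error markers is an assumption, not an exhaustive set.

```python
import re
from collections import Counter

# Illustrative sample only (NOT from the posted diagnostics).
SAMPLE_SYSLOG = """\
Feb 21 16:30:01 Tower kernel: ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x4090000 action 0xe frozen
Feb 21 16:30:01 Tower kernel: ata3: hard resetting link
Feb 21 16:30:07 Tower kernel: ata5.00: failed command: READ DMA EXT
Feb 21 16:30:07 Tower kernel: ata5: hard resetting link
Feb 21 16:30:12 Tower kernel: ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
"""

# Matches "kernel: ata3: ..." and "kernel: ata3.00: ..." and captures the port.
LINE_RE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)")

# Assumed set of substrings that indicate trouble on a port.
ERROR_MARKERS = ("exception Emask", "failed command", "hard resetting link")


def count_ata_errors(log_text):
    """Count error-looking kernel messages per ATA port (ata1, ata2, ...)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match and any(marker in match.group(2) for marker in ERROR_MARKERS):
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for port, count in sorted(count_ata_errors(SAMPLE_SYSLOG).items()):
        print(port, count)
```

Errors appearing on several ports at around the same time would be consistent with a shared cause such as the controller or power, rather than a single failing disk. Mapping each `ataN` port back to a physical drive is not covered by this sketch.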
mikamap Posted February 22, 2022

Here are the new diagnostics. As I said in the first post, I tried to reassign the disabled drives, so I started the array without the disks. Now 3 of them show as "new drive".

I also forgot to mention in the first post that I currently have 6 disks plugged into the motherboard SATA ports, and the other 3 plus the cache already plugged into a SATA controller. If it's a better solution to plug all of them into SATA controllers, I'll purchase more.

So the plan:
- Update the motherboard driver
- Add a SATA controller for the drives

And since 3 drives show as new, from what I have read, I need to do a "New Config", right?

tower-diagnostics-20220222-0743.zip
JorgeB Posted February 22, 2022 (Solution)

Old disk4 does show some recent issues; the other ones look fine. I would suggest doing this: keep old disk4 intact for now and reconnect the new one, then:

-Tools -> New Config -> Retain current configuration: All -> Apply
-Check all assignments and assign any missing disk(s) if needed, including the new disk4; the replacement disk should be the same size or larger than the old one
-IMPORTANT - Check both "parity is already valid" and "maintenance mode" and start the array (note that the GUI will still show that data on the parity disk(s) will be overwritten; this is normal as it doesn't account for the checkbox, but it won't be as long as it's checked)
-Stop array
-Unassign disk4
-Start array (in normal mode now); ideally the emulated disk will now mount and contents look correct. If it doesn't, you should run a filesystem check on the emulated disk
-If the emulated disk mounts and contents look correct, stop the array
-Re-assign disk4 and start array to begin the rebuild.
mikamap Posted February 22, 2022

Thank you, I will try that. I really appreciate the quick response.
mikamap Posted February 22, 2022 (edited)

Everything worked great. The Parity-Sync / Data-Rebuild started; I just need to wait a few hours now.

I just saw that my VM didn't start and I get the message "Libvirt Service failed to start" on that tab, but I'll check back after the process is finished.

Thank you.

Edited February 22, 2022 by mikamap
mikamap Posted February 22, 2022

Not sure if everything will be OK afterwards. The parity sync is at 67% but there are a lot of errors. All those disks seem to be connected to the onboard SATA. I'll order another controller quickly to replace it.
JorgeB Posted February 22, 2022

Just now, mikamap said:
Not sure if everything will be ok after.

It won't. I would need to see the diags to confirm, but as mentioned above it's most likely the same controller issue; it tends to be worse under load, like during a parity check or rebuild. You can cancel the current rebuild and try again later, ideally without using the onboard controller.
mikamap Posted February 22, 2022

Here are the diagnostics. I just canceled the sync, stopped the array, and shut down the server. I will try to find a controller tonight and launch it again tonight or tomorrow.

Thank you

tower-diagnostics-20220222-1420.zip
JorgeB Posted February 22, 2022

Yep, same controller issue.
mikamap Posted February 23, 2022

Happy to tell you that after adding a new SATA controller and moving all data disks to it, I could finish the parity sync without errors. Only the cache disk is on the onboard SATA controller for the moment.

@JorgeB Thank you so much for your help and support.