belupig
Posted September 8, 2024 (edited)

Hi everyone,

My old disk 6 (8 TB) was failing, so I replaced it with a new precleared 14 TB disk. The new disk precleared without errors. Here are the steps I did:

1.) preclear replacement disk
2.) shutdown server
3.) replace data drive with new precleared drive
4.) start server
5.) server detects missing disk, select new disk in drop down
6.) start array
7.) rebuild

I am unable to access some folders, as they appear to be 0 KB. I then noticed that 600k sync errors had been detected, and disk 6 (the new disk) shows a very high raw value (21475164165) for SMART attribute 188 (command timeout).

Please see the attached diagnostics. Any help is much appreciated!

unraid-diagnostics-20240908-0858.zip

Edited September 11, 2024 by belupig: Add Solved to title
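The raw value quoted above looks alarming as a single number, but SMART raw values are 48 bits wide and some drives pack several small counters into them. The thread does not say which vendor made the disk; on a number of Seagate drives, attribute 188 is commonly described as three packed 16-bit counters. A minimal sketch under that assumption (the function name is made up for illustration) splits the value into its three words:

```python
def split_raw_48bit(raw):
    """Split a 48-bit SMART raw value into three 16-bit words (high, mid, low).

    Assumption: the drive packs three counters into the raw field.
    Whether that applies to the disk in this thread is not confirmed.
    """
    if not 0 <= raw < (1 << 48):
        raise ValueError("SMART raw values are 48 bits wide")
    high = (raw >> 32) & 0xFFFF
    mid = (raw >> 16) & 0xFFFF
    low = raw & 0xFFFF
    return high, mid, low


raw_188 = 21475164165  # value reported in the post
print(hex(raw_188))              # 0x500050005
print(split_raw_48bit(raw_188))  # (5, 5, 5)
```

If the packed interpretation applies, the value would mean a handful of timeouts rather than billions, which would be consistent with an intermittent connection problem rather than a dying disk.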
belupig (Author)
Posted September 8, 2024

I have shut down my server and connected disk 6 to a different SATA power cable and SATA data cable. After starting the array, I resumed the parity check from 20% and did not get any sync errors. The error count had reset to 0 because I rebooted the server. I was still unable to navigate to certain folders on the array.

I then cancelled the parity check so that it would start from 0% rather than 20%. As soon as the parity check started at 0%, a high number of sync errors appeared: 791334, marked CORRECTED (not just detected). I hit pause. I believe I wrote the corrections into my parity and may have caused some data loss (it ran for a couple of minutes).

I am lost on what to do next to prevent further data loss.

Should I replace new disk 6 with old disk 6 and rebuild parity? I would make sure the write-corrections option is checked, so Unraid writes to parity. The old disk may have enough life left to do that. Then reformat new disk 6 and rebuild new disk 6 using the now-intact parity.
Kilrah
Posted September 8, 2024 (edited)

5 hours ago, belupig said:
  1.) preclear replacement disk
  2.) shutdown server
  3.) replace data drive with new precleared drive
  4.) start server
  5.) server detects missing disk, select new disk in drop down
  6.) start array
  7.) rebuild

That cannot be what you've done, since your screenshots show parity being rebuilt, not the replaced disk.

7 minutes ago, belupig said:
  Should I replace new disk 6 with old disk 6 and rebuild parity? I would make sure the write-corrections option is checked, so Unraid writes to parity. The old disk may have enough life left to do that.

If that drive is still good, that's an option. Or let the current parity rebuild finish, then mount the old drive as unassigned and copy to the new drive.

Edited September 8, 2024 by Kilrah
belupig (Author)
Posted September 8, 2024 (edited)

1 hour ago, Kilrah said:
  That cannot be what you've done, since your screenshots show parity being rebuilt, not the replaced disk.
  If that drive is still good, that's an option. Or let the current parity rebuild finish, then mount the old drive as unassigned and copy to the new drive.

Hi Kilrah,

Thank you for the response.

Are you referring to the screenshot from the original post or the screenshots from the follow-up post? In the original post, the data has been fully rebuilt onto disk 6. I only noticed there were a lot of sync errors on the main page, as indicated in the first post's screenshot, and new disk 6 has a very high raw value (21475164165) for attribute 188 (command timeout). This can be confirmed in the diagnostics.

After I rebooted the system, I cancelled the parity check because I wanted to start checking from 0%. I probably missed unchecking "write correction to parity disk" when I started the parity check, so it wrote the corrections to my parity disk. I believe this means data loss, as I assume the data on disk 6 is incorrect. I noticed I had screwed up a couple of minutes in and paused the parity check.

Based on the second post's screenshot, I believe I am 5.03 GB in so far. Therefore I think using old disk 6 to rebuild parity would mean fewer writes to parity, compared with letting the current parity check finish (using the possibly corrupt data from disk 6).

There have been minor data changes between when I took old disk 6 out and now. Would this cause major issues for the overall array? Or would it only mean I lose the data written between taking disk 6 offline and now?

Here is my plan; please let me know if these steps are correct.

Plan A:
1.) Cancel parity check
2.) Shut down server
3.) Replace new disk 6 with old disk 6
4.) Boot server and run a corrective parity check (this should write the correct data back to the parity disk)
5.) After the rebuild/corrective parity write is done, shut down server
6.) Replace old disk 6 with new disk 6
7.) Reformat new disk 6
8.) Use the dropdown menu to select the newly formatted disk 6 as part of the array and rebuild

Or Plan B:
1.) Run the corrective parity rebuild as suggested. This will use the corrupted new disk 6 data to overwrite parity.
2.) Mount old disk 6 as an unassigned disk
3.) Stop array and reformat new disk 6
4.) Copy old disk 6 to new disk 6

Wouldn't this require another parity rebuild, as the data on disk 6 would have changed?

Is plan B a higher-risk approach if my old disk 6 fails? I ask because I would be overwriting 100% of my parity disk with data from the corrupted disk 6 first.

Thanks!

Edited September 8, 2024 by belupig
Kilrah (Solution)
Posted September 9, 2024 (edited)

8 hours ago, belupig said:
  Are you referring to the screenshot from the original post or the screenshots from the follow-up post? In the original post, the data has been fully rebuilt onto disk 6. I only noticed there were a lot of sync errors on the main page, as indicated in the first post's screenshot

Both screenshots show a parity check in progress with sync errors. A disk rebuild will not show any sync errors; only a parity check does. In your first post you listed the steps you did, but they did not mention starting a parity check. The second post mentions stopping parity checks, but again nothing says why or when one was even started, so what the screenshots show doesn't match the description of what was done.

Your syslog is full of spam from XFS corruption, so it is hard to read. I'm assuming it's for that new disk 6, but you cropped the main page screenshots, which would have shown whether it was unmountable.

8 hours ago, belupig said:
  Wouldn't this require another parity rebuild, as the data on disk 6 would have changed?

No. Once parity is built from the array, it just gets updated as you format the drive and copy data to it, as it always does.

8 hours ago, belupig said:
  Is plan B a higher-risk approach if my old disk 6 fails? I ask because I would be overwriting 100% of my parity disk with data from the corrupted disk 6 first.

Parity is useless at this point anyway. IMO B is safer: should your old disk 6 fail partway through the copy, what has already been copied will be safe, whereas with plan A you get nothing if the parity build fails to complete.

Edited September 9, 2024 by Kilrah
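The points above about parity can be illustrated with a toy model. This is a minimal sketch, assuming single parity behaves as a byte-wise XOR across the data disks; the 4-byte "disks", the function names, and the corruption pattern are all made up for illustration and are not taken from the diagnostics.

```python
from functools import reduce
from operator import xor


def compute_parity(disks):
    """Byte-wise XOR across all data disks (toy model of single parity)."""
    return bytearray(reduce(xor, column) for column in zip(*disks))


def count_sync_errors(disks, parity):
    """Count positions where stored parity disagrees with the data disks."""
    expected = compute_parity(disks)
    return sum(stored != want for stored, want in zip(parity, expected))


def write_byte(disks, parity, disk_index, offset, value):
    """Write one data byte and patch parity in the same step."""
    old = disks[disk_index][offset]
    disks[disk_index][offset] = value
    parity[offset] ^= old ^ value


# Hypothetical tiny disks; index 2 stands in for disk 6.
good_disk6 = bytes([0xAA, 0xBB, 0xCC, 0xDD])
disks = [bytearray([1, 2, 3, 4]),
         bytearray([16, 32, 48, 64]),
         bytearray(good_disk6)]
parity = compute_parity(disks)

# Two bytes on "disk 6" change without parity being updated.
disks[2][1] ^= 0xFF
disks[2][3] ^= 0x0F
errors_before = count_sync_errors(disks, parity)

# A correcting check rewrites parity to match whatever the disks now hold.
parity = compute_parity(disks)
errors_after = count_sync_errors(disks, parity)

# Rebuilding "disk 6" from that parity only reproduces the corrupted bytes.
rebuilt = compute_parity([disks[0], disks[1], parity])
rebuild_is_corrupt = bytes(rebuilt) != good_disk6

# Copying good data back through normal writes keeps parity in step.
for offset, value in enumerate(good_disk6):
    write_byte(disks, parity, 2, offset, value)
errors_final = count_sync_errors(disks, parity)

print(errors_before, errors_after, rebuild_is_corrupt, errors_final)
```

In this model the corrected parity is only as good as the data it was computed from, which is why it cannot restore disk 6, while ordinary writes made afterwards never leave parity out of sync.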
belupig (Author)
Posted September 9, 2024 (edited)

19 hours ago, Kilrah said:
  Both screenshots show a parity check in progress with sync errors. A disk rebuild will not show any sync errors; only a parity check does. In your first post you listed the steps you did, but they did not mention starting a parity check. The second post mentions stopping parity checks, but again nothing says why or when one was even started, so what the screenshots show doesn't match the description of what was done.
  Your syslog is full of spam from XFS corruption, so it is hard to read. I'm assuming it's for that new disk 6, but you cropped the main page screenshots, which would have shown whether it was unmountable.
  No. Once parity is built from the array, it just gets updated as you format the drive and copy data to it, as it always does.
  Parity is useless at this point anyway. IMO B is safer: should your old disk 6 fail partway through the copy, what has already been copied will be safe, whereas with plan A you get nothing if the parity build fails to complete.

I did not manually initiate a parity check, but I do have a parity check scheduled every night between 2 and 7 AM. However, my screenshot shows an elapsed time of 24 minutes, so that does not add up.

Please see the uncropped screenshot attached; it should help identify whether the new disk is unmountable.

Do you see any potential issues with the plan below?

Plan C (plan B with the steps rearranged):
1.) Stop array and reformat new disk 6
2.) Shut down server and attach old disk 6
3.) Power on server, stop array. Copy old disk 6 to new disk 6.
4.) Shut down server, remove old disk 6
5.) Power on server. Choose new disk 6 manually if required and start array.
6.) Run a corrective parity check to rebuild the parity drive.

Edit: In step 3, to copy and check, use the rsync command via the terminal.

Thanks again Kilrah.

Edited September 10, 2024 by belupig
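Step 3 of Plan C calls for copying and then checking the result. The thread uses rsync for this; as a separate, independent check, the two trees can also be compared by content hash once the copy has finished. The following is a minimal sketch, not the method used in the thread: the function names are made up, and the demo runs on throwaway temporary directories rather than real mount points.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(source, dest):
    """List files under source that are missing from dest or differ in content."""
    problems = []
    for src_file in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src_file.relative_to(source)
        dst_file = dest / rel
        if not dst_file.is_file():
            problems.append(f"missing: {rel.as_posix()}")
        elif file_digest(src_file) != file_digest(dst_file):
            problems.append(f"differs: {rel.as_posix()}")
    return problems


# Demo on temporary directories standing in for the old and new disk.
with tempfile.TemporaryDirectory() as tmp:
    old_disk = Path(tmp) / "old_disk6"
    new_disk = Path(tmp) / "new_disk6"
    (old_disk / "media").mkdir(parents=True)
    (new_disk / "media").mkdir(parents=True)
    (old_disk / "media" / "a.bin").write_bytes(b"intact")
    (new_disk / "media" / "a.bin").write_bytes(b"intact")
    (old_disk / "media" / "b.bin").write_bytes(b"intact")
    (new_disk / "media" / "b.bin").write_bytes(b"damaged")
    (old_disk / "c.bin").write_bytes(b"only on the old disk")
    problems = compare_trees(old_disk, new_disk)

for line in problems:
    print(line)
```

On a real system, `compare_trees` would be pointed at wherever the old and new disks are mounted. An empty result means every file on the old disk exists on the new one with identical content; it does not report extra files that exist only on the new disk.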
belupig (Author)
Posted September 11, 2024

Update on this: it all happened because of a loose SATA cable to the motherboard. I was able to copy old disk 6 to new disk 6 using rsync, and my server is back. Currently correcting parity.