Juniper Posted June 28, 2020

Unraid 6.8.3, 6 drives: 5 × 8 TB data drives, 1 × 10 TB parity drive, no cache at this time. Unraid user for about 3 weeks.

I got a new parity drive (Seagate IronWolf 10 TB) and precleared it with the "Preclear Disks" plugin (1 pass). The test showed no problems. Then I ran the parity swap procedure step by step as described in https://wiki.unraid.net/UnRAID_6/Storage_Management#Parity_Swap. The parity swap read the old parity drive (Seagate Exos 8 TB) and copied the parity to the new drive; afterwards the old parity drive was added to the array as a data drive. It finished successfully, no errors were reported, and I was able to access all my shares just like before.

Afterwards I copied ~3 TB more data to the array, then started a parity check. The "write corrections to parity" box was checked by default, I think. It ran 19.5 hours and reported 3.8 million errors when it finished. During the parity check I only read from the array (watched videos from it) and didn't change or copy anything on it.

The SMART reports of the drives were fine before the parity check, just the known CRC errors on the drives I once had problems with in my Win 10 PC. After the parity check I ran short SMART tests on all drives: no errors.

I have no idea what happened. The display in "Main" always showed "Parity is valid" before and after the parity swap, and also before and after the parity check. The temperature is always around 30-33 °C on all drives, regardless of activity, so the parity drive could not have gotten too hot; the drives sit in the cages from my old Antec 900 case (with new, powerful fans).

The syslog shows the parity check starting at 1 am, running without any entries for ~13 hours, and then the errors starting:

Jun 27 01:02:17 Schiethucken kernel: md: recovery thread: check P ...
Jun 27 03:40:01 Schiethucken root: mover: cache not present, or only cache present
Jun 27 10:47:40 Schiethucken webGUI: Successful login user root from 192.168.1.152
Jun 27 14:02:53 Schiethucken kernel: md: recovery thread: P corrected, sector=15628053640
Jun 27 14:02:53 Schiethucken kernel: md: recovery thread: P corrected, sector=15628054664
Jun 27 14:02:53 Schiethucken kernel: md: recovery thread: P corrected, sector=15628055688
[... 96 more "P corrected" entries with the same timestamp, sectors increasing by 1024 each, up to the line below ...]
Jun 27 14:02:53 Schiethucken kernel: md: recovery thread: P corrected, sector=15628155016
Jun 27 14:02:53 Schiethucken kernel: md: recovery thread: stopped logging

Is the new parity drive maybe defective? It survived the preclear (1 pass, which ran 2 days), and its short SMART test after the parity check shows no errors. Is there anything I can do to fix this? Maybe run more tests on the new parity drive, e.g. another preclear with more passes?

I installed the "Parity Check Tuning" plugin, but I checked the log: that was installed hours after the errors had started, so it could not have caused them.

Attached are the diagnostics and screenshots of the disk configuration and the parity check info. Thank you very much for reading this!

schiethucken-diagnostics-20200627-2209.zip
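In case it helps anyone reading along: the corrected-sector entries in a syslog like the one above can be summarized with a rough Python sketch (the filename is just an assumed path to a saved copy of the syslog, not anything Unraid provides):

import re

# collect the sectors from all "P corrected" lines in a saved syslog
sectors = []
with open("syslog.txt") as f:          # assumed path, adjust as needed
    for line in f:
        m = re.search(r"P corrected, sector=(\d+)", line)
        if m:
            sectors.append(int(m.group(1)))

if sectors:
    print(len(sectors), "corrected stripes logged")
    print("first sector:", sectors[0], "last sector:", sectors[-1])
    if len(sectors) > 1:
        print("spacing:", sectors[1] - sectors[0], "sectors")

On the log above this reports the errors starting at sector 15628053640 with a spacing of 1024 sectors, which becomes relevant further down the thread.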
JorgeB Posted June 28, 2020

For some as yet unknown reason the parity swap doesn't sync the new section of parity for some users, so millions of errors are found on the next check. Just run another check, and if there are no more errors you're fine. Unfortunately the diags don't cover the parity swap. I'm still trying to find out why this happens to some users; I could never reproduce it.
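For background, a correcting parity check conceptually recomputes parity for every stripe from the data disks and rewrites the parity block wherever it doesn't match, counting one sync error per corrected stripe. A minimal single-parity (XOR) sketch of that idea, not Unraid's actual md driver:

from functools import reduce

def xor_blocks(a, b):
    # byte-wise XOR of two equal-length blocks
    return bytes(x ^ y for x, y in zip(a, b))

def check_stripe(data_blocks, parity_block):
    # recompute single (XOR) parity from the data disks' blocks for one stripe
    expected = reduce(xor_blocks, data_blocks)
    if expected != parity_block:
        return expected, 1   # corrected parity block, one sync error counted
    return parity_block, 0

# two "data disks", parity left stale (all zeros), like a never-synced region:
parity, errors = check_stripe([b"\x01\x02", b"\x10\x20"], b"\x00\x00")
print(errors, parity.hex())   # -> 1 1122

If a whole region of the parity disk was never written during the swap, every stripe in that region mismatches, which is exactly the "millions of errors" pattern described here.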
JorgeB Posted June 28, 2020

4 hours ago, Juniper said: Afterwards the old parity drive was added as a data drive to the array.

Just to be clear, you mean it was used to replace a disabled data disk, correct?
Juniper Posted June 28, 2020 Author

Before I bought the new drive the array had 5 × 8 TB drives and one 3 TB drive. Then I bought the 10 TB IronWolf and ran the parity swap to make the IronWolf the new parity drive and to replace the 3 TB drive with the old 8 TB parity drive. Sorry for not mentioning that earlier; I was worried about the 3.8 million errors and somehow forgot.

As part of the parity swap I took the 3 TB drive out of the array. The parity swap then copied the parity from the old 8 TB parity drive to the new 10 TB drive and rebuilt the data of the removed drive onto the old parity drive, effectively turning it into a data drive. That is what I meant by "afterwards the old parity drive was added as a data drive to the array".

It makes sense now: the new parity drive is 2 TB larger than the old one, so I ended up with 2 TB of unsynced parity. That sounds so much better than 3.8 million errors on the new drive. Thank you very much for your help! I'll start another parity check and report back.
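A quick back-of-the-envelope check of that explanation, assuming 512-byte sectors and one reported sync error per 1024-sector (512 KiB) stripe, which is what the sector spacing in the syslog suggests (an assumption, not checked against the md driver):

old_parity = 15_628_053_168 * 512   # 8 TB drive (15,628,053,168 sectors), about 8.0e12 bytes
new_parity = 10_000_000_000_000     # nominal 10 TB, approximate
stripe = 1024 * 512                 # 512 KiB per reported sync error (assumption)
print((new_parity - old_parity) / stripe)   # roughly 3.8 million

first_error = 15_628_053_640        # first corrected sector in the syslog
print(first_error - 15_628_053_168) # 472 sectors, i.e. just past the old 8 TB parity size

So a 2 TB tail of unsynced parity lines up with the ~3.8 million reported errors, and (ignoring the small partition offset) the errors also start right about where the old 8 TB parity ended.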
itimpi Posted June 28, 2020

8 minutes ago, Juniper said: It makes sense now: the new parity drive is 2 TB larger than the old one, so I ended up with 2 TB of unsynced parity. That sounds so much better than 3.8 million errors on the new drive.

It is MEANT to zero the remainder of the parity drive, so this is actually a symptom of a bug. I do not know whether the trigger has been identified and whether it is now fixed for the 6.9.x series of releases.
JorgeB Posted June 28, 2020

1 minute ago, itimpi said: It is MEANT to zero the remainder of the parity drive, so this is actually a symptom of a bug.

And it should do that. I could never reproduce the problem to create a bug report, but this has happened to multiple users, so maybe it only happens with some specific configuration, or they are doing something different during the procedure.
Juniper Posted June 28, 2020 Author

Parity check finished: 0 errors. Problem solved. Thank you guys very much for taking the time to help me!

My configuration seems to help trigger the bug, so I would be happy to help test it. I don't have another drive big enough to do more parity swaps under a different Unraid version, but maybe there is something else you can think of that I could do to help.
JorgeB Posted June 29, 2020

14 hours ago, Juniper said: I don't have another drive big enough to do more parity swaps under a different Unraid version, but maybe there is something else you can think of that I could do to help.

The best way you could help would be to repeat the procedure, but that's a lot to ask. I took note of your array config and am going to compare it with other users who had the same issue; maybe there is something in common. Though I'm still more inclined to think this is caused by the user doing something different during the swap, I can't think of what that would be.
Juniper Posted June 29, 2020 Author

Is there a way to swap parity from the 10 TB drive to one of the 8 TB drives? If so, I can check which of the data drives holds the least data, copy its contents to the other drives, and once it is empty try the swap backwards, then repeat it with the same configuration. When I checked yesterday I couldn't find a way to replace a drive, data or parity, with a smaller one, and the 10 TB IronWolf is the largest drive I own. If there is a way, I'll do it.

The original configuration was 5 × 8 TB drives and one 3 TB drive. I could empty an 8 TB drive, then follow "shrink the array", add the 3 TB drive again, and try a parity swap replacing the 10 TB IronWolf with the previously emptied 8 TB drive.
JorgeB Posted June 29, 2020

You'd need to manually shrink the array by doing a new config with the old 3 TB disk (copying back any data written since that is missing from it), re-sync parity to the 8 TB disk, and then repeat the parity swap. But like I mentioned, it's a lot to ask.
Juniper Posted June 29, 2020 Author

I'll try that. :) Will look stuff up and report back on progress.
JorgeB Posted June 29, 2020

18 minutes ago, Juniper said: I'll try that.

That would be cool, but since I don't know your backup situation, make sure you don't put your data at risk because of this.
Juniper Posted June 29, 2020 Author

I checked: there is only about 1.2 TB on the former 8 TB parity drive now, and I have identified exactly what is on it. I'll copy it back to my Win 10 PC, which should take about 1.5 to 2 days; the whole project will take about a week. I'll report back when everything is copied. Once it's done we'll know more, and I can then copy the remaining data to the array from all the old drives I still have lying around somewhere. Then everything will be safe and sound on the array.
Juniper Posted June 29, 2020 Author

Copying done. I removed the former 8 TB parity drive from the array following "Remove Drives Then Rebuild Parity" at https://wiki.unraid.net/Shrink_array, and the array is now rebuilding parity onto it; that will probably take about 15-19 h. Unraid accepted assigning the former parity drive back as the parity drive without complaining, but it says it wants to rebuild 10 TB of parity, even though the data drives are all 8 TB. Let's see what happens.
Juniper Posted June 30, 2020 Author

Unfortunately the parity rebuild on the original 8 TB parity disk complained. The log says:

Jun 30 06:38:09 Schiethucken kernel: sdb: rw=1, want=15628053176, limit=15628053168
Jun 30 06:38:09 Schiethucken kernel: attempt to access beyond end of device

How can I rebuild the parity on the smaller drive?

schiethucken-diagnostics-20200630-0803.zip
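For what it's worth, those numbers line up with the array still carrying the 10 TB parity size. A quick check, assuming 512-byte sectors:

limit = 15_628_053_168   # device size reported by the kernel, in sectors
want  = 15_628_053_176   # sector the parity sync tried to write
print(limit * 512)       # 8001563222016 bytes, i.e. an 8 TB disk
print(want - limit)      # 8 sectors beyond the end of the device

The sync is writing past the end of the 8 TB disk because the array still thinks parity is 10 TB sized, which matches the "wants to rebuild 10 TB of parity" observation above.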
JorgeB Posted June 30, 2020

Unassign parity, start/stop array, re-assign parity.
JorgeB Posted June 30, 2020

Isn't the old 3 TB disk missing from the array?
Juniper Posted June 30, 2020 Author

I wanted to add the 3 TB drive once the parity sync is done, and then run parity again: first make sure this works, then add the 3 TB drive, then rebuild parity once more.

Thank you! I tried your suggestion: unassign the parity drive, start/stop the array, then reassign it and let the parity sync run. This time it worked; parity is building now at 8 TB. When it's finished I'll add the 3 TB drive and let it rebuild again. I'll let you know when it's done.
JorgeB Posted June 30, 2020

13 minutes ago, Juniper said: I wanted to add the 3 TB drive once the parity sync is done.

That would mean the disk gets cleared; any data on it will be deleted.
Juniper Posted June 30, 2020 Author

I'll stop the array now and add the drive. It's empty anyway, and parity will be invalid and rebuilding once it's in regardless, so I can save time by just adding it now.

The 3 TB drive has been added, after a couple of stop/starts and adding the disks step by step. It's now rebuilding parity with all drives present.
Juniper Posted July 1, 2020 Author

Parity rebuild finished. Config back to how it was before I did the parity swap. Ready to start the test. Please let me know what I should do now.

schiethucken-diagnostics-20200701-0034.zip
JorgeB Posted July 1, 2020

Just repeat the procedure exactly like last time, run a parity check after it's done, and post new diags. Do it all without rebooting (or, if you need to reboot at any point, save diags before doing so).
Juniper Posted July 1, 2020 Author

My array is running Unraid 6.8.3. Should I update to a newer version first, or just do it the same as last time with the same Unraid version?
JorgeB Posted July 1, 2020

Use the current one, but it should be the same with the new beta since AFAIK there aren't any changes that would affect a parity swap.