nikiforos Posted January 28, 2023

Hello everyone,

thank you for your time. I hope you will be able to help me with my situation!

I have been running my Unraid server for roughly three years now without any issues. Until a couple of weeks ago...

A couple of weeks ago I got a message that one of my array disks had an error and could not be read from. Since I had been contemplating expanding my storage anyway, I did not spend much time looking into the original error and just bought a bigger hard drive to replace the one with the error. To do that, I

1) shut the server down (cleanly)
2) added the new drive to an empty slot in my case
3) pre-cleared the new disk (no issues/errors)
4) replaced the disks in the "Array Devices"
5) started the array and let the new disk rebuild

Everything seemed fine after that and I thought the problem was dealt with. Sadly, after the next scheduled parity check, I got an error message that both of my parity drives have errors. Over 1000 each. So I decided to rebuild the parity from the ground up. I'm hoping this wasn't a fatal mistake...

To rebuild the parity drives, I stopped the array and swapped the two parity drives with each other. After starting the array, Unraid started rebuilding the "new" parity drives. I also turned off Docker and the VM manager, as I thought it would be best to minimize data being written to the drives while the parity was being rebuilt.

Once the parity was freshly rebuilt, I manually started a parity check, as I wanted to make sure that everything worked fine. It did not! Again the parity drives reported over 1000 errors each. I now got the option to start a "Read Check", which I did. It will take about 4 days though.

I attached a diagnostics zip file, which I downloaded just now. I'm hoping someone will find useful information in it. I certainly have no clue what to look for.

Could you please help me with my next steps? Should I run tests on the two parity drives, or should I wait for the "Read Check" to finish? Did I mess up, or can the disks/data be salvaged?

Thank you very much for your support!!

Greetings from Vienna,
Nick

unraidserver-diagnostics-20230128-1727.zip
trurl Posted January 28, 2023 (Solution)

Which data disk did you replace?

Disk12 has little if any data. Is that expected?

Both parity disks and both cache disks have disconnected. Looks like those are all on this controller:

    02:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset SATA Controller [1022:43c8] (rev 01)
        Subsystem: ASRock Incorporation Device [1849:43c8]
        Kernel driver in use: ahci
        Kernel modules: ahci

Both parity disks are disabled and the cache pool is unmountable, but maybe that will come back when the disks do.

You should always double check connections when inside the case.

Do any of your other disks show SMART warnings (thumbs down) on the Dashboard page? No point in doing a read check unless you just want to exercise and test those other disks.

Unrelated, but appdata has files on the array.

Shut down, check connections, power and SATA, both ends, including splitters. Reboot, start the array, and post new diagnostics.
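One way to check which disks sit behind a given controller is to resolve each disk's sysfs path and pull out the controller's PCI address. The following is a minimal Python sketch, assuming a Linux system where `/sys/block/sdX` resolves to a path containing that address. The function names and the sample path in the comment are illustrative and not taken from the posted diagnostics. Note that sysfs prefixes the PCI domain, so the `02:00.1` shown by `lspci` would appear as `0000:02:00.1`.

```python
import glob
import os
import re
from collections import defaultdict

# PCI address as it appears in sysfs paths, e.g. 0000:02:00.1
PCI_ADDR = re.compile(r"[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]")


def controller_of(sysfs_path):
    """Return the last PCI address found in a resolved sysfs path, or None.

    Hypothetical example of such a path:
    /sys/devices/pci0000:00/0000:00:01.3/0000:02:00.1/ata3/host2/.../block/sdb
    The last address is the device closest to the disk, i.e. the controller.
    """
    matches = PCI_ADDR.findall(sysfs_path)
    return matches[-1] if matches else None


def disks_by_controller(pattern="/sys/block/sd*"):
    """Group disk names (sda, sdb, ...) by the PCI address of their controller."""
    groups = defaultdict(list)
    for link in sorted(glob.glob(pattern)):
        addr = controller_of(os.path.realpath(link))
        if addr is not None:
            groups[addr].append(os.path.basename(link))
    return dict(groups)


if __name__ == "__main__":
    for addr, disks in disks_by_controller().items():
        print(addr, " ".join(disks))
```

If several disks that drop out together all share one address, that points at the controller or its cabling rather than at the individual disks.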
nikiforos Posted January 28, 2023 (Author)

Hello, thank you for your reply. I replaced data disk 5 (Z2JMJMZT). I will do as you suggested and report back. Thank you!
trurl Posted January 28, 2023

13 minutes ago, trurl said: "Disk12 has little if any data. Is that expected?"

13 minutes ago, trurl said: "Do any of your other disks show SMART warnings (thumbs down) on the Dashboard page?"
nikiforos Posted January 28, 2023 (Author)

- It is fine that Disk 12 has hardly any data. I had planned on using it only for a specific share.
- All the disks have green thumbs up on the dashboard (see screenshot).
- I shut down the server, opened it, checked all the connections, moved the hard drives away from the motherboard SATA connection onto a PCIe card (no room to move the cache drives too), rebooted and started the array.

The cache drives have fixed themselves; the parity drives seem to still have the same issue. I attached a new diagnostics file.

Thanks again!

unraidserver-diagnostics-20230128-1921.zip
trurl Posted January 28, 2023

11 minutes ago, nikiforos said: "parity drives seem to still have the same issue"

No, they are not disconnected, just disabled, and will be until rebuilt.
nikiforos Posted January 28, 2023 (Author)

So I should rebuild the parity again? Do I have to use the same trick as before, swapping the two drives to force a rebuild, or is there a more elegant solution?
nikiforos Posted January 28, 2023 (Author)

Also, is my thinking correct that I should keep Docker and the VMs inactive during the rebuild (to minimize writing to the disks), or is it fine to activate them?
trurl Posted January 28, 2023

55 minutes ago, nikiforos said: "use the same trick"

What you did will work, but the standard way to rebuild a drive onto itself, whether parity or data disk, is described here: https://wiki.unraid.net/Manual/Storage_Management#Rebuilding_a_drive_onto_itself
trurl Posted January 28, 2023

54 minutes ago, nikiforos said: "keep Docker and the VMs inactive during the rebuild (to minimize writing to the disks)"

Reads/writes of the array will slow the rebuild, and the rebuild will slow reads/writes of the array.
nikiforos Posted January 28, 2023 (Author)

Ok. Thank you! I started the rebuild. I will update you in about a week when it is done! Have a nice weekend.
trurl Posted January 28, 2023

11 minutes ago, nikiforos said: "about a week when it is done!"

Typically 2-3 hours per TB of largest parity disk, so it should only be 2 days unless you have port multipliers.
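The rule of thumb above is simple arithmetic and can be sketched in a few lines of Python. The function name is made up for illustration, and the 16 TB parity size is a hypothetical value, since the thread does not state the actual size of the parity disks.

```python
def rebuild_hours_range(parity_tb, hours_per_tb=(2.0, 3.0)):
    """Return (low, high) rebuild time in hours for a parity disk of parity_tb TB.

    hours_per_tb is the "typically 2-3 hours per TB" rule of thumb from the thread.
    """
    low, high = hours_per_tb
    return parity_tb * low, parity_tb * high


if __name__ == "__main__":
    size_tb = 16  # hypothetical parity size, not taken from the thread
    low, high = rebuild_hours_range(size_tb)
    print(f"{size_tb} TB parity: about {low:.0f} to {high:.0f} hours "
          f"({low / 24:.1f} to {high / 24:.1f} days)")
```

Only the largest parity disk matters for this estimate, because the rebuild has to run across its full capacity.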
nikiforos Posted January 28, 2023 (Author)

8 minutes ago, trurl said: "Typically 2-3 hours per TB of largest parity disk, so should only be 2 days unless you have port multipliers"

Hm... I don't know what "port multipliers" are, so I expect I don't have them. The parity is being rebuilt at roughly 32 MB/sec and is expected to take another 4 days and 20 hours. I will then start a manual parity check, which also takes 3-4 days.
trurl Posted January 28, 2023

1 minute ago, nikiforos said: "parity is being rebuilt at roughly 32MB/sec"

That's about 1/4 the speed I get, and I know many others get as much or better. Probably these controllers are to blame:

    09:00.0 SATA controller [0106]: Marvell Technology Group Ltd. 88SE9215 PCIe 2.0 x1 4-port SATA 6 Gb/s Controller [1b4b:9215] (rev 11)
        Subsystem: Marvell Technology Group Ltd. 88SE9215 PCIe 2.0 x1 4-port SATA 6 Gb/s Controller [1b4b:9215]
        Kernel driver in use: ahci
        Kernel modules: ahci
    0b:00.0 SATA controller [0106]: Marvell Technology Group Ltd. 88SE9215 PCIe 2.0 x1 4-port SATA 6 Gb/s Controller [1b4b:9215] (rev 11)
        Subsystem: Marvell Technology Group Ltd. 88SE9215 PCIe 2.0 x1 4-port SATA 6 Gb/s Controller [1b4b:9215]
        Kernel driver in use: ahci
        Kernel modules: ahci
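To put the reported speed next to the "2-3 hours per TB" rule of thumb, the two can be converted into each other. This is a small Python sketch that assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes); Unraid's own display may round or count differently, and the function names are illustrative.

```python
def hours_per_tb(speed_mb_s):
    """Hours needed to process 1 TB (10**12 bytes) at speed_mb_s (10**6 bytes/s)."""
    if speed_mb_s <= 0:
        raise ValueError("speed must be positive")
    seconds = 1e12 / (speed_mb_s * 1e6)
    return seconds / 3600


def speed_for_hours_per_tb(hours):
    """Sustained speed in MB/s that corresponds to a given hours-per-TB figure."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return 1e12 / (hours * 3600) / 1e6


if __name__ == "__main__":
    print(f"32 MB/s  -> {hours_per_tb(32):.1f} hours per TB")
    print(f"128 MB/s -> {hours_per_tb(128):.1f} hours per TB")
    print(f"2 h/TB   -> {speed_for_hours_per_tb(2):.0f} MB/s")
    print(f"3 h/TB   -> {speed_for_hours_per_tb(3):.0f} MB/s")
```

Under these assumptions, 32 MB/s works out to roughly 8.7 hours per TB, while 2-3 hours per TB corresponds to about 93-139 MB/s, which is consistent with the "about 1/4 the speed" remark.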
trurl Posted January 28, 2023

Looks like both parity disks are on that first Marvell, so that is also possibly the reason those disks are getting dropped.