aurevo Posted September 15, 2019

Hello everybody,

This morning my UnRAID told me that a hard drive was missing. After restarting the machine I was able to reassign the hard drive, but the parity sync is extremely slow:

Total size: 8 TB
Elapsed time: 7 minutes
Current position: 237 MB (0.0 %)
Estimated speed: 504.4 KB/sec
Estimated finish: 183 days, 3 hours, 13 minutes

Is the hard disk possibly defective? What else can I check? I don't want the rebuild to run at this speed.
trurl Posted September 15, 2019

Since this isn't a bug report, I have copied your "report" to this General Support forum. If, in future, you think you may actually have a bug, please follow the guidelines: https://forums.unraid.net/bug-reports/stable-releases/report-guidelines-r68/

Now, on to your support questions. Go to Tools - Diagnostics and attach the complete diagnostics zip file to your next post.
aurevo Posted September 15, 2019

A quick question: how long does it usually take for the diagnostics to complete? I just waited 50 minutes, is that normal? I have now aborted it; if it can really take that long, I will restart it now.
Squid Posted September 15, 2019

5 minutes ago, aurevo said: I just waited 50 minutes, is that normal?

Due to a design consideration, most things started from the webUI will never *appear* to complete if they don't finish within 120 seconds. Diagnostics is one of them, and processes that can run longer than 120 seconds are slowly being reconfigured to operate slightly differently.

In the meantime, just go to a terminal and type in

diagnostics

The file will get saved to the flash drive (logs folder).
aurevo Posted September 15, 2019

58 minutes ago, Squid said: Due to a design consideration, most things started from the webUI will never *appear* to complete if they don't finish within 120 seconds. Diagnostics is one of them, and processes that can run longer than 120 seconds are slowly being reconfigured to operate slightly differently. In the meantime, just go to a terminal and type in diagnostics. The file will get saved to the flash drive (logs folder).

Okay, that's pretty weird. I have attached the last two logs. Thanks in advance for the help.

tower-diagnostics-20190913-1411.zip
tower-diagnostics-20190915-1244.zip
trurl Posted September 15, 2019

Looks like most of your problems are disk connection issues. Check power and SATA connections, both ends, all disks, then try again and post a new diagnostic.
aurevo Posted September 18, 2019

On 9/15/2019 at 11:24 PM, trurl said: Looks like most of your problems are disk connection issues. Check power and SATA connections, both ends, all disks, then try again and post a new diagnostic.

So, I ordered new cables for all drives and installed them. Trying to replace the possibly defective 8TB drive with a 6TB drive, I hit New Config. Can I undo this step? If I start the array now, I will lose parity and the data from that 8TB disk, won't I? Parity was valid before I hit New Config.

I think it is better to ask before taking the next step; the server stays off until I get your answer.
itimpi Posted September 18, 2019

I could not see an 8TB data disk in the diagnostics you posted! Which is the 8TB drive that has failed?
JorgeB Posted September 18, 2019

1 hour ago, aurevo said: Trying to replace the possibly defective 8TB drive with a 6TB drive,

Can't replace an existing array disk with a smaller one. SMART looks OK for disk3, though what kind of disk is this:

Device Model: OOS8000G
Serial Number: 00000000

If you really want to replace it you can use the invalid slot command, but you need another 8TB (or larger) disk.
aurevo Posted September 18, 2019

24 minutes ago, itimpi said: I could not see an 8TB data disk in the diagnostics you posted! Which is the 8TB drive that has failed?

The first 8TB hard disk is Parity Disk 1; the second 8TB hard disk disappeared from the configuration during operation. I reinstalled it and restarted the server, and then it was there again, but the Parity Check / Rebuild process was as slow as mentioned in the original post.

Then I exchanged all the data and power cables and rebooted the server with the 6TB drive in place of the 8TB hard drive, because I thought I could replace it with a smaller one. After that I created a new config. This is the current state.

What is the best way to keep the parity (I can reinstall the old 8TB HDD), and how do I get the config back?

20 minutes ago, johnnie.black said: Can't replace an existing array disk with a smaller one. SMART looks OK for disk3, though what kind of disk is this: Device Model: OOS8000G Serial Number: 00000000 If you really want to replace it you can use the invalid slot command, but you need another 8TB (or larger) disk.

If it looks okay, I will try to build the parity with this disk enabled. But as mentioned above, how do I revert the new config change?
JorgeB Posted September 19, 2019

I would recommend using the invalid slot command if you want to use a new disk. Rebuilding on top of the old one might be a mistake; instead, run an extended SMART test on it and, if all is OK, re-sync parity.
aurevo Posted September 19, 2019

1 hour ago, johnnie.black said: I would recommend using the invalid slot command if you want to use a new disk. Rebuilding on top of the old one might be a mistake; instead, run an extended SMART test on it and, if all is OK, re-sync parity.

I'm not sure I understand you correctly. My approach would be like this: I start with the new config and reassign the disks to the same slots as before the new config. Then I set the disk I want to remove to unassigned, is that right?

But if I start the array at this point, it will automatically rebuild parity, so I lose the parity data that is currently on the parity drive, and with it the contents of the hard drive I removed earlier. So I lose both the data and the parity needed to restore the removed hard disk, or am I seeing this wrong?
JorgeB Posted September 19, 2019

Assuming all disks are available and disk3 is indeed OK, and passes the extended SMART test, after the new config you just need to assign all disks as before and start the array to begin parity sync, you can even trust parity and then run a correcting check.

If I misunderstood and there is one missing disk then this isn't the way to go, in that case the invalid slot command would be the way to go, but you need a new disk of the same size to replace the missing one.
aurevo Posted September 19, 2019

15 minutes ago, johnnie.black said: Assuming all disks are available and disk3 is indeed OK, and passes the extended SMART test, after the new config you just need to assign all disks as before and start the array to begin parity sync, you can even trust parity and then run a correcting check. If I misunderstood and there is one missing disk then this isn't the way to go, in that case the invalid slot command would be the way to go, but you need a new disk of the same size to replace the missing one.

Okay, so first I do the extended SMART test, and if it's okay, I'll assign the disks as before. Is the parity data then used to restore the data to disk 3, or is the parity data overwritten, as the message suggests (starting the array will overwrite the data, or something like that)?

The following hard disks are in the system:

8TB (Parity 1)
3TB
3TB
8TB (this is the disk that disconnected during operation)
4TB
2TB
6TB
6TB
6TB

After reinstalling the disk I wanted to restart the parity process, and it was that slow. At the moment the disk is removed, but I could reinstall it.

What do I do now to recover the data that was on the disk? I have a new config, but before the new config the parity should still have been valid.
JorgeB Posted September 19, 2019

38 minutes ago, aurevo said: Okay, so first I do the extended SMART test, and if it's okay, I'll assign the disks as before.

Correct. If the SMART test is successful, assign all disks as before, check "parity is already valid" and start the array. If all disks mount correctly, run a correcting parity check; if not, post new diags.
aurevo Posted September 19, 2019

2 hours ago, johnnie.black said: Correct. If the SMART test is successful, assign all disks as before, check "parity is already valid" and start the array. If all disks mount correctly, run a correcting parity check; if not, post new diags.

Just to be sure: the hard disk assignment must be exactly the same as before, and no additional hard disk may be added until parity is restored, right?

Does it make a difference if the ports have changed (for example from sdc to sdf)? I changed ports and cables yesterday.
JorgeB Posted September 19, 2019

5 minutes ago, aurevo said: The hard disk assignment must be exactly the same as before, and no additional hard disk may be added until parity is restored, right?

Yes, use only the same disks, though data disk order is not important with single parity. Which SATA port each disk is connected to doesn't matter.
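Editor's illustration of why data disk order does not matter with single parity. This is a toy sketch that assumes single parity is a plain bytewise XOR across the data disks; the function name `xor_parity` and the three tiny "disks" are hypothetical and only stand in for real drives.

```python
from functools import reduce


def xor_parity(disks):
    """Bytewise XOR across equally sized byte strings (toy model of single parity)."""
    # zip(*disks) walks the disks column by column; each column is a tuple of ints.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))


# Three hypothetical 3-byte "data disks" (assumption: all the same size).
disk1 = bytes([0x12, 0x34, 0x56])
disk2 = bytes([0xAB, 0xCD, 0xEF])
disk3 = bytes([0x0F, 0xF0, 0x55])

parity = xor_parity([disk1, disk2, disk3])

# XOR is commutative, so shuffling the data disks gives the same parity.
reordered = xor_parity([disk3, disk1, disk2])

# A missing disk can be recomputed from parity plus the remaining disks.
rebuilt_disk3 = xor_parity([parity, disk1, disk2])

print(parity == reordered, rebuilt_disk3 == disk3)
```

The same property is why a disk set that differs from the original (an added or missing disk) would no longer match the existing parity.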
aurevo Posted September 19, 2019

3 hours ago, johnnie.black said: Yes, use only the same disks, though data disk order is not important with single parity. Which SATA port each disk is connected to doesn't matter.

I just did what you suggested, and the process started normally. After a few minutes the page refreshed (I don't know why), the array was stopped again, and the message about the missing encryption key came up. I re-entered the password and restarted the array with the option "parity correction". Now it's running again, but it started all over again.

Speed shortly after start:

Total size: 8 TB
Elapsed time: 1 minute
Current position: 9.40 GB (0.1 %)
Estimated speed: 103.9 MB/sec
Estimated finish: 21 hours, 22 minutes
Sync errors corrected: 0
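Editor's illustration: the two status readouts in this thread can be sanity-checked with simple arithmetic, remaining bytes divided by current speed. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 KB = 10^3 bytes); how the webUI rounds or averages its figures is not stated in the thread, so the results only approximate the displayed estimates. The function name `estimated_finish` is hypothetical.

```python
from datetime import timedelta

# Decimal unit sizes (an assumption; the webUI's exact units are not stated).
KB, MB, GB, TB = 1e3, 1e6, 1e9, 1e12


def estimated_finish(total_bytes, done_bytes, bytes_per_sec):
    """Time left if the current speed were sustained for the rest of the sync."""
    remaining = total_bytes - done_bytes
    return timedelta(seconds=remaining / bytes_per_sec)


# First readout: 237 MB done at 504.4 KB/sec.
slow = estimated_finish(8 * TB, 237 * MB, 504.4 * KB)

# Second readout: 9.40 GB done at 103.9 MB/sec.
fast = estimated_finish(8 * TB, 9.40 * GB, 103.9 * MB)

print("slow sync:", slow.days, "days")
print("fast sync:", round(fast.total_seconds() / 3600, 1), "hours")
```

At roughly 100 MB/sec an 8 TB sync finishes in under a day, while a rate in the hundreds of KB/sec stretches to about half a year, which is why the first readout pointed to a hardware or connection problem rather than normal behaviour.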
JorgeB Posted September 19, 2019

That's normal with encryption: the first time, it doesn't know it needs the key. Speed seems OK; if there are any issues, grab and post new diags.
aurevo Posted October 11, 2019

Hello,

the problems did not recur for a long time. Today I couldn't use one of my VMs and wanted to restart the server. I shut it down cleanly via Power down and didn't disconnect the power. After the reboot Disk 3 (OOS8000G_00000000 - 8 TB (sdk)) was only emulated.

Following your last tips I had exchanged all cables and also checked the power plugs. If I see it correctly, the hard disk is still recognized by the system, isn't it?

Attached are the diagnostics from directly after the restart.

tower-diagnostics-20191011-2118.zip
John_M Posted October 11, 2019

57 minutes ago, aurevo said: If I see it correctly, the hard disk is still recognized by the system, isn't it?

It was recognised but it has dropped off line. Is it connected to a SATA port multiplier?

Oct 11 21:18:23 Tower kernel: ata9: softreset failed (1st FIS failed)
Oct 11 21:18:23 Tower kernel: ata9: limiting SATA link speed to 3.0 Gbps
Oct 11 21:18:28 Tower kernel: ata9: softreset failed (device not ready)
Oct 11 21:18:28 Tower kernel: ata9: reset failed, giving up
Oct 11 21:18:28 Tower kernel: ata9.00: disabled
Oct 11 21:18:28 Tower kernel: sdk: detected capacity change from 8001563222016 to 0
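Editor's illustration: lines like the ones quoted above can be picked out of a syslog with a small script. This is only a sketch, not an Unraid tool; the regular expression, the keyword list and the function name `ata_problems` are assumptions chosen to match the six lines quoted in this post.

```python
import re

# The six kernel lines quoted in the post above.
LOG = """\
Oct 11 21:18:23 Tower kernel: ata9: softreset failed (1st FIS failed)
Oct 11 21:18:23 Tower kernel: ata9: limiting SATA link speed to 3.0 Gbps
Oct 11 21:18:28 Tower kernel: ata9: softreset failed (device not ready)
Oct 11 21:18:28 Tower kernel: ata9: reset failed, giving up
Oct 11 21:18:28 Tower kernel: ata9.00: disabled
Oct 11 21:18:28 Tower kernel: sdk: detected capacity change from 8001563222016 to 0
"""

# Captures the ATA port ("ata9", with an optional ".00" device suffix) and the message.
ATA_LINE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)")

# Hypothetical keyword list: phrases that suggest a link or reset problem.
PROBLEM_WORDS = ("failed", "limiting SATA link speed", "giving up", "disabled")


def ata_problems(log_text):
    """Group suspicious kernel messages by ATA port."""
    problems = {}
    for line in log_text.splitlines():
        match = ATA_LINE.search(line)
        if match and any(word in match.group(2) for word in PROBLEM_WORDS):
            problems.setdefault(match.group(1), []).append(match.group(2))
    return problems


problems = ata_problems(LOG)
for port, messages in problems.items():
    print(port, len(messages), "suspicious messages")
```

All flagged messages here belong to a single port, which fits the observation that only one disk drops off line.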
JorgeB Posted October 12, 2019

You should avoid Marvell controllers; Marvell + SATA port multiplier is a double no-no.
aurevo Posted October 12, 2019

13 hours ago, John_M said: It was recognised but it has dropped off line. Is it connected to a SATA port multiplier?

Oct 11 21:18:23 Tower kernel: ata9: softreset failed (1st FIS failed)
Oct 11 21:18:23 Tower kernel: ata9: limiting SATA link speed to 3.0 Gbps
Oct 11 21:18:28 Tower kernel: ata9: softreset failed (device not ready)
Oct 11 21:18:28 Tower kernel: ata9: reset failed, giving up
Oct 11 21:18:28 Tower kernel: ata9.00: disabled
Oct 11 21:18:28 Tower kernel: sdk: detected capacity change from 8001563222016 to 0

Yes, it is connected to a multiplier. It is a card with a Marvell 88SE9215 controller and an ASM1062 from ASMedia.

2 hours ago, johnnie.black said: You should avoid Marvell controllers; Marvell + SATA port multiplier is a double no-no.

Can I find this information anywhere on this forum? And which controller should I use instead? Then I would replace it. It is hard to find a controller without Marvell in Germany.
JorgeB Posted October 12, 2019

43 minutes ago, aurevo said: And which controller should I use instead? Then I would replace it.

Any LSI with a SAS2008/2308/3008/3408 chipset in IT mode, e.g., 9201-8i, 9211-8i, 9207-8i, 9300-8i, 9400-8i, etc. and clones, like the Dell H200/H310 and IBM M1015; these latter ones need to be crossflashed.
aurevo Posted October 12, 2019

1 hour ago, johnnie.black said: Any LSI with a SAS2008/2308/3008/3408 chipset in IT mode, e.g., 9201-8i, 9211-8i, 9207-8i, 9300-8i, 9400-8i, etc. and clones, like the Dell H200/H310 and IBM M1015; these latter ones need to be crossflashed.

Then I will try to buy one of these controllers and send the other one back. I had a look, and the controllers are probably only available at a good price used.

But how is it that only one hard disk was affected each time? Three times it was exactly the same hard disk, and for one month it worked.