Scott Balkum, posted April 2:
The system has been fine for a while. I power it off when it won't be in use for long periods. It had been powered up for about a week and I was moving data over. Everything was fine. It started a parity check, so I stopped copying data and just let it run. This morning it had amassed millions of errors and had dropped 4 drives as unavailable. I shut it all down and was able to get 2 back, but the other 2 would not come back. I'm running dual parity, so I kicked off a rebuild, and now a 3rd drive is throwing millions of errors and the rebuild estimate is 400+ days. It has been running for many hours. What should I be doing differently? I'm a little worried about the data. I'm a bit of a data hoarder and this may be my calling to learn to stop, but for now, what can I do?
Attachment: tower-diagnostics-20240401-2355.zip
JorgeB, posted April 2:
Looks more like a controller issue. Please note that we don't recommend controllers with SATA port multipliers, as they have a tendency to drop disks. For now, reboot to see the damage, in case there is some, and post new diags after array start.
Scott Balkum (Author), posted April 2:
2 hours ago, JorgeB said: "for now reboot to see the damage, in case there is some, and post new diags after array start."
I need to let it finish rebuilding, right? Before I started the rebuild, all the data appeared to be there.
Kilrah, posted April 2:
Is it running normally now? Because if it's still estimating days, there's no point. The state will be a mess regardless. You should never start any parity/rebuild operation before making sure the hardware is working properly.
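To put the 400+ day estimate in perspective, here is a rough back-of-the-envelope sketch in Python. The 3 TB disk size is an assumption (the thread mentions 3TB drives but does not say what size is being rebuilt), and the 100 MB/s "healthy" rate is purely an illustrative figure, not a measurement from this system.

```python
SECONDS_PER_DAY = 86_400

def implied_rate_bytes_per_s(disk_bytes: float, estimate_days: float) -> float:
    """Average throughput implied by a rebuild-time estimate."""
    return disk_bytes / (estimate_days * SECONDS_PER_DAY)

def rebuild_hours(disk_bytes: float, rate_bytes_per_s: float) -> float:
    """Time in hours to process the whole disk at a given sustained rate."""
    return disk_bytes / rate_bytes_per_s / 3600

DISK_BYTES = 3e12          # assumption: a 3 TB disk
ESTIMATE_DAYS = 400        # from the first post ("400+ days")
HEALTHY_RATE = 100e6       # assumption: illustrative 100 MB/s, not measured

implied_rate = implied_rate_bytes_per_s(DISK_BYTES, ESTIMATE_DAYS)
healthy_hours = rebuild_hours(DISK_BYTES, HEALTHY_RATE)

print(f"Implied rate: {implied_rate / 1e3:.1f} kB/s")
print(f"At the assumed healthy rate: {healthy_hours:.1f} hours")
```

Under those assumptions the estimate works out to roughly 87 kB/s, several orders of magnitude below what a working disk and controller would sustain, which is why an estimate measured in days points to a hardware problem rather than a rebuild worth waiting for.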
JorgeB, posted April 2:
1 hour ago, Scott Balkum said: "I need to let it finish rebuilding right?"
Not much point.
Scott Balkum (Author), posted April 2:
OK, I stopped it and restarted. It has come up and I am copying some data off of it. It shows 2 drives emulated and the rest online, but there are errors copying some data, so I have lost some of it for sure. Sounds like I need to get what I can off of it, then completely rebuild Unraid, ditch the cheap SATA controllers, and probably get rid of the old 3TB drives.
JorgeB, posted April 2:
You can post new diags after array start if you want us to take a look.
Scott Balkum (Author), posted April 2:
Since I haven't written any data, is there any way I can reconnect the 2 drives that vanished and came back? I feel like if I can get those back, the errors from the other disks won't be a problem.
JorgeB, posted April 2:
You could do a new config, but note that you will need to run a correcting parity check afterwards, since there will be a few sync errors, and that will take about the same time as a rebuild.
Scott Balkum (Author), posted April 2:
Here are the diags.
Attachment: tower-diagnostics-20240402-1049.zip
JorgeB, posted April 2:
More disk errors, for now only on disk12, but this one could be an actual disk problem. Cancel the rebuild and run an extended SMART test on that disk.
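For anyone who prefers a terminal, an extended SMART test can also be started with `smartctl` from smartmontools. The following is a minimal Python sketch that only builds and prints the commands. `/dev/sdX` is a placeholder, not the actual device behind disk12, and the helper names `smart_commands` and `run` are hypothetical.

```python
import subprocess

def smart_commands(device: str) -> dict[str, list[str]]:
    """Build smartctl command lines for one device (nothing is executed here)."""
    return {
        # Start the extended (long) self-test; it runs on the drive itself.
        "start_extended_test": ["smartctl", "-t", "long", device],
        # Show the self-test log, where the result appears once the test ends.
        "selftest_log": ["smartctl", "-l", "selftest", device],
        # Full SMART report, including the attribute table.
        "full_report": ["smartctl", "-a", device],
    }

def run(cmd: list[str]) -> str:
    """Run one command and return its text output (needs root and smartmontools)."""
    return subprocess.run(cmd, capture_output=True, text=True, check=False).stdout

commands = smart_commands("/dev/sdX")  # placeholder device name
for name, cmd in commands.items():
    print(name, "->", " ".join(cmd))
```

To actually start the test, pass one of the command lists to `run`, for example `run(commands["start_extended_test"])`. The test can take hours on a large disk, so check the self-test log afterwards for the result.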
Scott Balkum (Author), posted April 3:
Attached is the SMART report for the drive. It doesn't look good.
Attachment: WDC_WD30EZRS-00J99B0_WD-WCAWZ0131800-20240403-1013.txt
JorgeB, posted April 3:
It doesn't, and the SMART test has been failing for a while now. Some data loss may be unavoidable.
Scott Balkum (Author), posted April 3:
Is there a best way to bring the 2 disconnected drives back in and accept the data on them as accurate?
JorgeB (Solution), posted April 3:
If you mean to then rebuild disk12, there's a way, but it will only work if parity is still valid. It won't hurt to try though:
- Tools -> New Config -> Retain current configuration: All -> Apply
- Check all assignments and assign any missing disk(s) if needed, including the new disk12. The replacement disk should be the same size as or larger than the old one.
- IMPORTANT: Check both "parity is already valid" and "maintenance mode" and start the array. (Note that the GUI will still show that data on the parity disk(s) will be overwritten. This is normal, as it doesn't account for the checkbox, but parity won't be overwritten as long as the box is checked.)
- Stop array
- Unassign disk12
- Start array (in normal mode now) and post new diags.
Scott Balkum (Author), posted April 3:
Disks 10 and 20 are currently unmountable because they had disconnected before disk 12 showed all the errors. Can I put those 2 disks back in as 10 and 20 so that Unraid sees the data on them as valid? Is that what doing the new config will do?
JorgeB, posted April 3:
I assumed they were mounting; the rebuild should have stopped/skipped if there were errors on other disks. If they are not mounting, they will still not mount after the new config, but the filesystem may be fixable. You can run a filesystem check on both before attempting the new config.
Scott Balkum (Author), posted April 3:
OK, I did all of those steps. One of my disconnected drives could be repaired; the other could not. But bringing up the array with the new config and the 1 saved drive allowed it to come up with dual parity working properly, protecting the 2 bad drives. I am now able to copy data off without failures. Thank you so much for your efficient and willing help. I sent some beer money, as much as I could. I know a little helps more than nothing. Thank you.
JorgeB, posted April 4:
Glad to hear, and thanks for the beer money.