(SOLVED) Rebuild after drive replacement is taking AGES


Jax


Hi,

 

This all began after initiating a drive replacement due to errors I was receiving on disk 6.

I installed a new disk (4TB, replacing the 3TB that had failed) and the rebuild seemed to start OK, but when I checked this morning, I saw that it's going to take a year to complete at the current rate:

 

[Screenshot: rebuild progress and estimated time to completion]

 

I am also seeing that the log is pegged at 100% after only 7 hours of system uptime:

 

[Screenshot: log usage at 100%]

 

The syslog (attached) is full of REISERFS errors related to md6.

 

Any ideas as to what could be going on here?

Any help or direction provided would be greatly appreciated... thanks!

 

 

On 10/3/2019 at 2:25 PM, trurl said:

How did you determine the original disk was bad?

 

Couldn't be mounted or read in Unraid - unable to perform a SMART scan... now that I have it out I can do some more checking.

I've had drives fail before and this didn't appear to be any different. So since I had the spare on hand, I swapped it out to ask questions later. 

 

Full diagnostic attached.

 

 

2 hours ago, Jax said:

Couldn't be mounted or read in Unraid - unable to perform a SMART scan

These could all be caused by a bad connection, though of course it could also be a bad disk.

 

2 hours ago, Jax said:

So since I had the spare on hand, I swapped it out to ask questions later. 

Rebuilding to a spare is actually the best way even if the original is still good, since it keeps the original as a backup in case of problems.

 

Looks like it's having problems communicating with multiple disks. Reseat controller, check connections, both ends, including power. Power splitters are also a good suspect with multiple disk problems.

 

The filesystem problems might not be real if it can't read all the disks to calculate the rebuild. Maybe it will clear up if you get all the disks connected again.

34 minutes ago, Jax said:

Should I just cancel the rebuild to power down and check the connections?

It's still stuck at 48.2% for the past 10 hours, but I don't want to do something that could result in data loss.

No point in continuing with a rebuild that can't be producing the correct results, since all of the disks must be read reliably to reliably rebuild a disk.
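To illustrate why one unreliable read spoils the rebuild, here is a minimal sketch assuming simple single-parity XOR across the data disks. It is an illustration only, not Unraid's actual code, and the disk contents and helper name (`xor_blocks`) are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical 4-byte stripes on three data disks (illustrative values only).
disk1 = bytes([0x10, 0x20, 0x30, 0x40])
disk2 = bytes([0x01, 0x02, 0x03, 0x04])
disk3 = bytes([0xAA, 0xBB, 0xCC, 0xDD])  # the disk that will be replaced

# Parity is the XOR of all data disks.
parity = xor_blocks([disk1, disk2, disk3])

# Rebuilding disk3 means XOR-ing parity with every surviving disk.
rebuilt_ok = xor_blocks([parity, disk1, disk2])

# Simulate a flaky connection: disk2 returns one wrong byte during the rebuild.
disk2_misread = bytes([0x01, 0xFF, 0x03, 0x04])
rebuilt_bad = xor_blocks([parity, disk1, disk2_misread])

print("clean rebuild matches original:", rebuilt_ok == disk3)
print("rebuild with a misread matches original:", rebuilt_bad == disk3)
```

Under this assumption, every byte of the rebuilt disk depends on the same byte of every other disk, so a bad read from any one of them silently ends up in the rebuilt data. That would also be consistent with the filesystem errors seen on the rebuilt disk.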


Reseated all power and signal cables and all disks and the rebuild completed in a reasonable amount of time:

 

[Screenshot: completed rebuild]

 

Putting the original "failed" drive through its paces on the bench - so far so good.

Will also look into the SAS controller issue, as I wasn't aware there was one... thanks for the replies!
