(Solved) 6.4.1 Parity errors, Disabled drive, SMART errors



For the first time, a monthly parity check returned errors and a drive is now showing as Disabled. I am unable to read SMART attributes on this drive although the SMART status is green. A couple of other drives have some SMART reallocated sector errors but have been holding steady for some time now.

 

My plan of action is to first move the data off the emulated/disabled drive and then remove the drive from the array, as I have plenty of spare capacity on the remaining drives.

 

Should I upgrade to 6.5 first? Any steps I should be mindful of? Do my logs indicate what happened to cause the Disabled drive? I'm assuming simple failure from an old drive.

 


Rebooted and the SMART attributes from disk8 are now available. However Disk9 is now reporting as unmountable with no file system.

 

When I was moving files from disk8, I first copied to disk3 with no problems. When I was moving a folder to disk9, a few files copied before the read/write errors hit; then disk9 became inaccessible via MC and all its files disappeared. There were only about 80G of files, nothing irreplaceable, but now I am worried about the other drives. If possible, it would be nice to recover at least a file list so I know which files were on that drive beforehand. Actually, it would be nice to do that for all the drives before proceeding. Is there an easy way to get a full file list?
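One way to capture per-disk file lists before touching anything else is a short loop over the standard Unraid per-disk mount points. This is a minimal sketch; it assumes disks are mounted at /mnt/disk1, /mnt/disk2, etc., and it writes the lists to /tmp here for safety. On a real server, /boot/ is a better destination, since the flash drive survives array problems:

```shell
#!/bin/sh
# Save a file listing for every mounted array disk.
# /mnt/disk* are the standard Unraid per-disk mount points (assumption).
OUT=/tmp/filelists            # on a real server, /boot/filelists is safer
mkdir -p "$OUT"
for d in /mnt/disk*/; do
    [ -d "$d" ] || continue   # skip cleanly when no disks are mounted
    find "$d" -type f > "$OUT/$(basename "$d").txt"
done
```

Each disk then gets its own text file (e.g. filelists/disk9.txt), which can later be diffed against what xfs_repair recovers.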

tower-diagnostics-20180403-0732.zip

Link to comment

Disk8 is also failing and needs to be replaced; disk9 seems fine, it looks like a filesystem problem only.

 

Since you have two failing disks, IMO your best way forward would be to do a new config with a new parity drive and a new disk8 (or without one if you don't need the space), run xfs_repair on disk9, let parity re-sync, and then connect the old disk8 with the UD plugin and try to copy any important data.
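For the last step, once the old disk8 is mounted with the Unassigned Devices plugin, the copy itself can be done with rsync. A sketch, where both paths are examples and not your actual mount points, so adjust them to your setup:

```shell
#!/bin/sh
# Copy salvageable data from the old (failing) disk8 after Unassigned
# Devices has mounted it. Both paths below are assumptions/examples.
SRC=/mnt/disks/old_disk8      # UD mount point (assumption)
DST=/mnt/disk3/from_disk8     # any array disk with free space (assumption)
if [ -d "$SRC" ]; then
    mkdir -p "$DST"
    # -a preserves attributes, -v is verbose, -P shows progress and
    # lets an interrupted copy resume partial files
    rsync -avP "$SRC/" "$DST/"
fi
```

rsync is preferable to a plain cp here because a failing drive often throws read errors mid-copy, and rsync can be re-run to pick up where it left off.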

 

P.S.: disk5 is not currently failing but has a lot of reallocated sectors; keep an eye on it or preemptively replace it.


Thank you, Johnnie. So the parity drive is the other failing drive?

 

So I should set up a new config with a new parity drive first and then run xfs_repair on disk9 as part of the array?  Or try xfs_repair first, with disk9 in array?  I was thinking to use disk9 as the new parity drive if I can recover and move the files. Sounds like it would be better to get a new drive for parity before attempting xfs_repair?

43 minutes ago, mfort312 said:

So the parity drive is the other failing drive?

Yes

 

43 minutes ago, mfort312 said:

So I should set up a new config with a new parity drive first and then run xfs_repair on disk9 as part of the array?  Or try xfs_repair first, with disk9 in array?

Either way will work, but since there's a disabled disk you can't unassign the failing parity drive, so do the new config first.


OK, good news: I managed to copy everything off the disabled disk8, while it was still being emulated, to a drive outside the Unraid array.

 

Next, to fix disk9, I started in maintenance mode and from a terminal attempted:

 

xfs_repair -v /dev/md9

ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair.  If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.
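For reference, the mount-to-replay step the error message asks for looks roughly like this. The mount point is just an example, and on Unraid starting the array normally performs the equivalent mount:

```shell
#!/bin/sh
# Let XFS replay its journal by mounting the filesystem once, then
# unmounting cleanly, as xfs_repair's error message suggests.
# Guarded so it only runs where the device exists and we are root.
if [ -b /dev/md9 ] && [ "$(id -u)" = 0 ]; then
    mkdir -p /mnt/tmp9
    mount -t xfs /dev/md9 /mnt/tmp9 && umount /mnt/tmp9
fi
```

If the mount itself fails (as it did here, with an unmountable filesystem), that is exactly the situation where the message says -L becomes the fallback.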

I tried mounting and unmounting, but re-running xfs_repair gave the same error, so back in maintenance mode I next tried:

 

xfs_repair -vL /dev/md9

ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.

After it finished, I stopped and restarted the array in normal mode and, bingo, there were all my missing files. lost+found had only a few files from the failed MC copy yesterday; everything else is in its place. I am now copying everything from disk9 to a drive outside the Unraid array.

 

With disk9 freed up, I will use it to replace the failing parity drive, and then work on replacing disk5.

 

 

Disk5's SMART status looks similar to, if not worse than, the failing parity drive's. How can I tell the difference between a drive that's currently failing and one that's still hanging on?

 

 

Thanks again for your help and advice.

 

3 hours ago, mfort312 said:

Disk5's SMART status looks similar to, if not worse than, the failing parity drive's. How can I tell the difference between a drive that's currently failing and one that's still hanging on?

Parity has pending sectors; disk5 doesn't, at least not in the report.
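To compare those attributes directly from the command line, something like this works. A sketch only: smartctl ships with Unraid, but the device name below is a placeholder, so substitute your actual parity disk and disk5 devices:

```shell
#!/bin/sh
# Print the SMART attributes most relevant to impending failure.
# Current_Pending_Sector > 0 means unreadable sectors waiting to be
# remapped -- the usual sign a drive is actively failing -- whereas a
# stable Reallocated_Sector_Ct means past damage already remapped.
check_sectors() {
    smartctl -A "$1" | grep -E 'Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable'
}
# /dev/sdf is an example device name -- substitute your own drives.
if command -v smartctl >/dev/null 2>&1 && [ -b /dev/sdf ]; then
    check_sectors /dev/sdf || true   # grep returns nonzero if none match
fi
```

Watching whether Current_Pending_Sector climbs between monthly parity checks is a reasonable way to decide when "hanging on" turns into "replace now".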

 

3 hours ago, mfort312 said:

Also, what will happen with my Docker apps (on cache drive) and User Shares with a New Config? Will I need to rebuild them?

No, not as long as the cache drive remains the same.

