3560freak

Rebuilt parity after unclean shutdown and disabled disk, am I in trouble? [OS build 6.6.7]


I've been an UnRaid user for some time and have been very lucky, with few issues over the last 6-8 years (this community has been a tremendous help in the past). However, my lack of patience and incomplete understanding of the processes may have gotten the best of me this time.

 

I logged into my server recently and saw that a data disk [DISK 1] was disabled (browser notices show that the disk was disabled over 30 days ago and was being emulated).  I powered down the server, checked the connections, then followed a post that showed how to re-enable the disk and tried that.  After the disk "rebuilt itself" back into the array, I noticed that the UnRaid OS upgrade to ver 6.6.7 was available, so I proceeded to install it.  Once it completed it requested a restart, but the system would not restart; it immediately started throwing tons of read errors on the Parity disk (a different disk from the one that had previously been disabled).  No matter what I tried, it would not restart; it simply said STOPPING UNRAID for hours.  So I powered the system down manually.  On bringing it back up I knew I would have to run a parity check again because of the unclean shutdown.  However, shortly after I started the parity check, data DISK 1 disabled itself once again while the parity check continued.  The check has now completed (it shows 168 errors on the Parity drive, which I assume come from bad data pulled from DISK 1 before it disabled itself), but I am not sure whether any data has been overwritten or lost.

 

What can I do at this point to get all drives back up, or is it likely that I have lost a lot of unrecoverable data?

 

 

Thanks for any suggestions.

2 hours ago, 3560freak said:

(browser notices show that the disk was disabled over 30 days ago and was being emulated). 

You should enable email notifications. Also, please post the diagnostics: Tools -> Diagnostics

 

 

 

 


Parity is failing, and disk1 appears to be suffering from a connection issue. Replace the cables on disk1 and post new diags.

19 hours ago, johnnie.black said:

You should enable email notifications

^ This probably would have allowed you to fix a single problem when it occurred instead of getting to a place where you had multiple problems.

On 4/13/2019 at 1:52 AM, johnnie.black said:

Parity is failing, and disk1 appears to be suffering from a connection issue. Replace the cables on disk1 and post new diags.

Unfortunately, with the HP MicroServer Gen 8 that I am using, I can't easily replace the cabling.  I did, however, reseat all drives in their respective caddies/cage and unplug/replug the SATA breakout cable.  Updated diags attached.

 

On another note, should I be running individual SMART reports on the drives every so often?

tower2-diagnostics-20190414-0925.zip

4 hours ago, 3560freak said:

On another note, should I be running individual SMART reports on the drives every so often?

As long as you have Notifications enabled, Unraid will inform you if any of the important attributes change.

On 4/15/2019 at 2:50 AM, johnnie.black said:

Disk1 looks fine; you could swap its position with another disk just to rule out a slot issue.

 

Thanks for your help.  Based on this updated log you still see that the PARITY is failing/needs to be replaced, correct?


There were media errors, i.e., bad or failing sectors on the parity disk. These errors can sometimes be a one-off, though they rarely are. You can give it a second chance if it passes an extended SMART test, but if there are any more errors, replace it.
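
To illustrate what "bad or failing sectors" looks like in a SMART report, here is a minimal Python sketch that scans an attribute table in the layout printed by `smartctl -A` and picks out non-zero sector counters. The sample table is made up for illustration and is not taken from this server's diagnostics; the attribute names are the ones commonly seen on ATA drives and are an assumption, since they vary by vendor.

```python
# Attribute names commonly tied to bad or failing sectors on ATA drives.
# Assumption: names differ between vendors, so adjust this set for your disks.
WATCHED = {"Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable"}


def sector_warnings(smartctl_output):
    """Return {attribute_name: raw_value} for watched attributes with a non-zero raw value."""
    warnings = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns and begin with a numeric ID;
        # this skips the header row and any blank lines.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in WATCHED and raw.isdigit() and int(raw) > 0:
            warnings[name] = int(raw)
    return warnings


# Hypothetical sample in the usual attribute-table layout (not real data).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   071   071   000    Old_age   Always       -       21540
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       8
"""

if __name__ == "__main__":
    print(sector_warnings(SAMPLE))
```

On a live system the same function could be fed the text captured from `smartctl -A` for the parity disk. An empty result only means the watched counters are zero; it does not prove the disk is healthy.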

On 4/17/2019 at 1:44 AM, johnnie.black said:

There were media errors, i.e., bad or failing sectors on the parity disk. These errors can sometimes be a one-off, though they rarely are. You can give it a second chance if it passes an extended SMART test, but if there are any more errors, replace it.

 

Ran the extended SMART test on the PARITY drive and it reported back no errors.  Then did a parity check (which completed successfully) but the status on the MAIN page still shows ERRORS as 1, causing the system to respond as READ ONLY.  How do I get UnRaid to drop the error marker and allow writes to the system again?


If you are referring to the Errors column in Main, that would not cause Read Only. Post new Diagnostics. 


You have problems communicating with parity and disk2, possibly bad connections or a controller issue. Your docker image is also corrupt. After you get the connection issue corrected and a good parity check, you will have to delete and recreate the docker image, then reinstall your dockers using the Previous Apps feature on the Apps page.


Strange: there's a simultaneous read error on two disks, in different but very close sectors, and it's reported as a media error. That usually indicates a disk problem, but in this case I'm not so sure. Something weird is going on with that server.

6 hours ago, johnnie.black said:

Strange: there's a simultaneous read error on two disks, in different but very close sectors, and it's reported as a media error. That usually indicates a disk problem, but in this case I'm not so sure. Something weird is going on with that server.

Maybe power supply voltage sag?

23 minutes ago, jonathanm said:

Maybe power supply voltage sag?

Possible. It could also be the miniSAS cable; it's just strange that it's being reported as a media error.


Well, I did another reboot/reseating of the drives and a parity check, and everything came up as normal... but after about 12 hours DISK1 is showing errors again.  I guess I will begin the search for a new backplane cabling harness.  I've got the server running off of an APC (1500VA / 900W) UPS, so with such a relatively low-power microserver it's likely not a voltage sag.
