Added 2nd Parity - now getting a variety of errors


Solved by JorgeB.

Hello - I decided to add a second parity drive this weekend. No real need, I do not have a large array but wanted to learn and set it up "just in case".

 

I also added another disk but it is not online. I had the case open and had pre-cleared the disk some time ago, so I figured I would put it in but leave it unassigned. I hate going into the server as I always seem to knock something loose.

All went well and all was good for 2-3 days, but then I started getting a growing number of error messages and warnings. See attached logs.

I can also hear "disk noise" which does not sound good and I never hear this kind of noise so a bit worried. It sounds pretty horrific.

I have shut down the server and checked all connections, which seem fine.

 

On reboot, the second parity drive no longer shows up. All other drives seem fine. Both the new "suspect" parity 2 and my spare data drive show up in unassigned devices.

I am trying to decide next steps. Should I:

1) Reassign the suspect drive back to Parity 2 and let it "rebuild" (I assume that is what will happen)?
2) Run an extended SMART test on the suspect drive? I am not sure how to do that on an unassigned drive, though.
3) Run a parity check (with no corrections)?
4) Something else?
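
For option 2, one possible route is `smartctl` from smartmontools, which addresses a disk by its device node and so should not care whether the disk is assigned to the array. The sketch below only builds the command lines and does not execute anything. `/dev/sdX` is a placeholder for the suspect drive's actual device node, and the function name is illustrative.

```python
import shlex

def smart_commands(device: str) -> dict[str, list[str]]:
    """Build smartctl invocations for one disk (nothing is executed here)."""
    return {
        # Start an extended (long) self-test; it runs inside the drive itself.
        "start_extended": ["smartctl", "-t", "long", device],
        # Show the self-test log, including past results.
        "selftest_log": ["smartctl", "-l", "selftest", device],
        # Full SMART report (attributes, error log, self-test log).
        "full_report": ["smartctl", "-a", device],
    }

if __name__ == "__main__":
    # "/dev/sdX" is a placeholder, not a real device name.
    for name, cmd in smart_commands("/dev/sdX").items():
        print(f"{name}: {shlex.join(cmd)}")
```

An extended test can take many hours on a large drive, so the self-test log is the place to check back for the result.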

It is a brand new drive - so it should be ok - but I know there is a chance it is bad. And the sounds it is making do not sound good. Think gravel in a tin can. 🙂

But from reading the forums, I gather that many times it is a poor connection or something else?

I attach logs/diagnostics and the "brief" SMART report for the drive in question.

Thoughts on next steps? One of the reasons I added the second parity was that I do have some old drives, so I wanted an additional safety net... Thanks!

 

zack-unraid-diagnostics-20230124-2213.zip zack-unraid-smart-20230124-2211.zip


Thank you - much appreciated. The sound coming from my server seconds that diagnosis.

 

I had a spare disk and have started to try to add that in as my new "second parity" but have gotten several errors already.

 

Jan 25 07:21:34 Zack-unRAID kernel: md: disk2 read error, sector=2930266240
Jan 25 07:21:34 Zack-unRAID kernel: md: disk2 read error, sector=2930266248
Jan 25 07:21:34 Zack-unRAID kernel: md: disk2 read error, sector=2930266256
Jan 25 07:21:34 Zack-unRAID kernel: md: disk2 read error, sector=2930266264
Jan 25 07:21:34 Zack-unRAID kernel: md: disk2 read error, sector=2930266272
Jan 25 07:21:34 Zack-unRAID kernel: md: disk2 read error, sector=2930266280
Jan 25 07:21:34 Zack-unRAID kernel: md: disk2 read error, sector=2930266288
Jan 25 07:21:34 Zack-unRAID kernel: md: disk2 read error, sector=2930266296
Jan 25 07:21:34 Zack-unRAID kernel: md: disk2 read error, sector=2930266304
Jan 25 07:21:34 Zack-unRAID kernel: md: disk2 read error, sector=2930266312
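
Before deciding which drive to test, it helps to tally which device the kernel is actually blaming. Here is a minimal sketch that groups these `md:` read errors by the disk name in the message. The regular expression is an assumption based only on the lines pasted above, and the sample reuses three of them.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   "Jan 25 07:21:34 Zack-unRAID kernel: md: disk2 read error, sector=2930266240"
READ_ERROR = re.compile(r"md: (disk\d+) read error, sector=(\d+)")

def summarize_read_errors(lines):
    """Return {disk: (error_count, first_sector, last_sector)}."""
    sectors = defaultdict(list)
    for line in lines:
        match = READ_ERROR.search(line)
        if match:
            sectors[match.group(1)].append(int(match.group(2)))
    return {disk: (len(s), min(s), max(s)) for disk, s in sectors.items()}

sample = [
    "Jan 25 07:21:34 Zack-unRAID kernel: md: disk2 read error, sector=2930266240",
    "Jan 25 07:21:34 Zack-unRAID kernel: md: disk2 read error, sector=2930266248",
    "Jan 25 07:21:34 Zack-unRAID kernel: md: disk2 read error, sector=2930266312",
]
print(summarize_read_errors(sample))
# → {'disk2': (3, 2930266240, 2930266312)}
```

Every error in the excerpt names `disk2`, and the sectors form one contiguous run, which points at a single region of a single disk rather than at the drive being added.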



I am now a bit freaked out.

1) Do I stop trying to add a second parity and run extended tests on all drives? Can I run the extended tests on all drives at the same time?
2) Can I run extended tests while it is rebuilding the new second parity (I imagine this would slow the parity rebuild)?
3) Complete the parity build for the second parity and then run extended tests on all drives?
4) Other options?

 

I like to learn stuff but am now a bit frustrated with myself, as adding a second parity has not been as straightforward as I thought. Getting a failed new drive can happen, but I doubt the second drive I added would also be bad? It passed pre-clear a year or so ago and was then put in storage offline.

Thanks!

1 hour ago, itimpi said:

@TexasDave Since you have just added some new drives, are you sure your PSU is up to driving them? Also, are you using power splitters?

 

I am not sure how to tell, but I am running a 650W power supply, which I think would be enough? My system specs are in the signature. Do I just add up the CPU, drives and GPU, and the total should be under 650W?

No power splitters...

EDIT: Found a power supply calculator and all looks ok?

https://outervision.com/b/nBNNuj


Thanks!

Edited by TexasDave
24 minutes ago, JorgeB said:

It's logged as a disk problem, run an extended SMART test on disk2.

 

So stop the current parity rebuild and focus on the extended smart scan on the disk in question?

Is this correct: As I have one "good" parity and my other disks look ok, I should be OK while I figure this out?

Thanks!


@JorgeB - thanks again for your help, much appreciated.

 

I ran extended smart scans on all disks in the array. Results as follows and attached:

  • Disk 1 (WD-WX51DA47619X): Passed
  • Disk 2 (WD-WCC4N0CNYXPC): Passed (but one previous error?)
  • Disk 3 (WD-WX61DC896552): Passed
  • Disk 4 (WD-WX21D65NV0T0): Passed
  • Cache 1 (S21JNXBG430473F): Passed
  • Cache 2 (S21JNXAG541010H): Passed
  • Parity 1 (WD-WX51DA47619X): Passed
  • Possible Parity 2 (WD-WX11DC8PPZ2F): Passed
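
Since each result is keyed by a serial number, the list can be given a quick sanity check. This minimal sketch uses the entries exactly as posted above and flags any serial that appears under more than one slot. The helper name is hypothetical.

```python
from collections import defaultdict

# Entries transcribed from the list above: (slot, serial, extended test result).
results = [
    ("Disk 1", "WD-WX51DA47619X", "Passed"),
    ("Disk 2", "WD-WCC4N0CNYXPC", "Passed"),
    ("Disk 3", "WD-WX61DC896552", "Passed"),
    ("Disk 4", "WD-WX21D65NV0T0", "Passed"),
    ("Cache 1", "S21JNXBG430473F", "Passed"),
    ("Cache 2", "S21JNXAG541010H", "Passed"),
    ("Parity 1", "WD-WX51DA47619X", "Passed"),
    ("Possible Parity 2", "WD-WX11DC8PPZ2F", "Passed"),
]

def duplicate_serials(rows):
    """Return {serial: [slots]} for serials listed under more than one slot."""
    slots = defaultdict(list)
    for slot, serial, _result in rows:
        slots[serial].append(slot)
    return {serial: names for serial, names in slots.items() if len(names) > 1}

print(duplicate_serials(results))
# → {'WD-WX51DA47619X': ['Disk 1', 'Parity 1']}
```

Two slots sharing one serial suggests one of the entries was copied across, so it is worth confirming each serial against the attached reports before relying on the list.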

I think my interpretation above is correct? I am now going to try to add Parity 2 and hopefully no issues. Any comments welcome and appreciated - Thanks!

zack-unraid Extended SMART Scan all Disks.zip zack-unraid-diagnostics-20230126-0647.zip

1 minute ago, JorgeB said:

Yes, previous one failed but last one passed, these errors can be intermittent, you can try adding parity2 again but any more errors from that disk and I would replace it.

 

That disk (Disk 2) - with a previous error - is a disk in my data array. It looks like the second parity drive was fine and I did not have to stop the other day. I stupidly assumed the errors were coming from the new parity 2 drive.

But glad I stopped and ran an extended smart scan on all drives. I will keep an eye on that data drive (Disk 2).

I have googled/searched and the consensus appears to be that running a monthly parity check is as good as or better than running extended scans on a regular basis. Thoughts?

Finally - the drive that failed (my first new parity) was pulled. I ran an extended SMART scan on it in a caddy on a Windows machine and it is coming up fine. It is the one that was making horrible noises and seemed to have a ton of errors. I may try to run some scans on it - via a caddy - in unRAID at some point. But SMART scans should be the same regardless of OS? I attach that scan again.

Many thanks. I will update and (hopefully) close this out once the rebuild on the new parity 2 is complete.

zack-unraid-smart-20230124-2211.zip

9 minutes ago, JorgeB said:

For all purposes it's the same.

 

 

I would agree as far as disk health is concerned, but using a scheduled monthly parity check has the additional advantage of checking that parity is still in-sync with the data drives as it is possible for them to get out-of-sync for other reasons than just disk issues.
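
The difference can be seen in a toy model. This is a simplified sketch that assumes single parity computed as a byte-wise XOR across the data disks. It does not model the second parity drive, which uses a different calculation. The block contents and function names are made up for illustration.

```python
from functools import reduce

def xor_parity(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR across equally sized blocks, one block per data disk."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def parity_in_sync(blocks: list[bytes], parity: bytes) -> bool:
    """True when the stored parity still matches the current data blocks."""
    return xor_parity(blocks) == parity

# Toy 4-byte "disks"; a real check reads every sector of every drive.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xff\x00\xff\x00"]
parity = xor_parity(data)
print(parity_in_sync(data, parity))  # True

# A write that reaches a data disk but never updates parity: no disk is
# unhealthy and a SMART test has nothing to report, yet parity is stale.
data[1] = b"\x10\x20\x30\x41"
print(parity_in_sync(data, parity))  # False
```

A SMART self-test only exercises one drive in isolation, so it has no way to notice the second case. Only a comparison across all drives does.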


OK - a mix of "good news" and "bad news".... @JorgeB and @itimpi thank you for helping me get this far and sorting my disk issue!

Good News

Array and disks are back up and healthy from what I can see.

  1. Disk 2 was bad (as noted)
  2. I moved my Movies to disk 1, TV to disk 3, everything else to disk 4...
  3. Shrunk array and removed bad disk. Been meaning to shrink it anyways....

All seems fine....

 

Bad News

My dockers have all disappeared. I suspect this is because the docker information was on cache and in trying to move everything off disk 2, this may have caused issues. I also seem to have moved stuff from cache onto disk 4.

 

When I went to start up my dockers, I got the message that "/mnt/cache/appdata" was missing, so I changed this to "/mnt/user/appdata" and the dockers started. But it looks like stuff got out of sync. I started adding the dockers back in; with the "restore" function this is not too bad.

But in checking the dockers, the first few were behaving weirdly and did not have app data. For instance, Nginx Proxy Manager lost my login. I restored appdata using the plugin and it fixed NPM. But some dockers like NZBget and NZBHydra2 (which I route through binhex delugeVPN) are no longer accessible from the web UI. CrashPlan Pro will not allow me to log in with my account details on unRAID, but logging in works fine on the external code42 website with my credentials.

Before I try to add the rest of my dockers - any ideas? Like I said, I think I have "good" app data on /mnt/disk4/appdata (and /mnt/user/appdata) but do not want to keep going without some kind of sanity check.
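
One possible sanity check is a read-only comparison of what each appdata location actually holds. It is sketched below under the assumption that each container keeps its files in its own top-level folder. The two paths are the ones mentioned in this post, and the function names are illustrative. Nothing is modified.

```python
from pathlib import Path

def container_dirs(appdata: Path) -> dict[str, int]:
    """Map each top-level folder under appdata to its file count."""
    if not appdata.is_dir():
        return {}
    return {
        child.name: sum(1 for p in child.rglob("*") if p.is_file())
        for child in sorted(appdata.iterdir())
        if child.is_dir()
    }

def compare_appdata(a: Path, b: Path) -> list[tuple[str, int, int]]:
    """Rows of (folder, files_in_a, files_in_b) for folders seen in either."""
    left, right = container_dirs(a), container_dirs(b)
    names = sorted(left.keys() | right.keys())
    return [(n, left.get(n, 0), right.get(n, 0)) for n in names]

if __name__ == "__main__":
    rows = compare_appdata(Path("/mnt/disk4/appdata"), Path("/mnt/cache/appdata"))
    for name, on_disk4, on_cache in rows:
        flag = "" if on_disk4 == on_cache else "  <-- differs"
        print(f"{name:30} disk4={on_disk4:6} cache={on_cache:6}{flag}")
```

A folder that exists in one location but is empty or missing in the other would explain a container starting up with no settings.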

But I may have messed up and may have to re-configure all of my dockers which would kind of bite....

Diagnostics attached. Thanks!

zack-unraid-diagnostics-20230130-1104.zip


Yes - I have restored my appdata backup. I am slowly adding back my dockers (I only have 12 or so) and seem to be making progress.

I am having issues with a few (NPM and CrashPlan Pro) and will circle back to them, but the others seem to be picking up the backed-up information.

I think I am just going to have to add the dockers again....

 
