Parity Disk in Error State Disk dsbl


I woke up to this email from my server for my 6TB parity drive. 

 

Event: Unraid Parity Disk Error

Subject: Alert [TOWER] - Parity disk in error state (disk dsbl)

 

The GUI shows the disk as having:

 

22,756,061,247,961 reads

18,446,744,073,704,421,376 writes

808 errors
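
A side note on the write counter: the value shown sits just below 2^64, which is what a small negative number looks like when it is displayed as an unsigned 64-bit integer. That is only an assumption about how the counter is stored, not something confirmed in this thread, but the arithmetic is easy to check. A minimal Python sketch (the helper name as_signed64 is made up for illustration):

```python
# Assumption: the GUI counter is an unsigned 64-bit integer that has wrapped.
U64_MODULUS = 2**64


def as_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit value as a two's-complement signed one."""
    if not 0 <= value < U64_MODULUS:
        raise ValueError("value does not fit in 64 bits")
    return value - U64_MODULUS if value >= 2**63 else value


writes_shown = 18_446_744_073_704_421_376  # figure reported by the GUI
print(as_signed64(writes_shown))  # -5130240
```

If that reading is right, the huge number reflects a counter that went wrong after the errors started rather than a real count of writes.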

 

My assumption is that the drive is toast so I'm going to order another drive, but I have a few questions.

 

1. Is the safest option to stop the array until the new drive is delivered and installed? I only have one parity drive, so another drive failing would mean data loss.

2. Does the new drive need to go through pre-clear?

 

Thanks!

Link to comment

It could just be a case of the drive dropping offline for some reason, and the drive is actually fine. If you post your system’s diagnostics zip file (obtained via Tools -> Diagnostics) in its current state, we should be able to determine if this is the case. If it has dropped offline, then diagnostics taken after power cycling the server should give a better idea of whether the drive really has problems.

Link to comment

SMART for parity looks OK, but it looks like the disk was disconnected as sdk and reconnected as sdo.

 

Your syslog goes back a few months and it was all good until now. Anything you can think of that might have disturbed the connections?

 

It doesn't look like any SMART tests have been run on that disk, so you might try an extended SMART test on it. If it passes, you can rebuild to that same disk.
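
For anyone who prefers the command line to the GUI, the extended test can also be started with smartctl. A minimal Python sketch that only builds the command lines, using the standard smartctl options -t long, -l selftest and -H. The function names and the /dev/sdX placeholder are made up for illustration. Check the actual device letter first, since this disk came back under a different one after reconnecting.

```python
import shutil
import subprocess

# Standard smartctl options; the dictionary keys are illustrative names only.
ACTIONS = {
    "start_extended": ["-t", "long"],    # start an extended (long) self-test
    "selftest_log": ["-l", "selftest"],  # show the self-test log
    "health": ["-H"],                    # overall health assessment
}


def smartctl_args(action: str, device: str) -> list[str]:
    """Build the smartctl argument list for one of the actions above."""
    return ["smartctl", *ACTIONS[action], device]


def run(action: str, device: str) -> str:
    """Run smartctl (needs root and smartmontools installed) and return stdout."""
    if shutil.which("smartctl") is None:
        raise RuntimeError("smartctl not found on PATH")
    result = subprocess.run(smartctl_args(action, device),
                            capture_output=True, text=True, check=False)
    return result.stdout


# /dev/sdX is a placeholder, not a real device from this thread.
print(" ".join(smartctl_args("start_extended", "/dev/sdX")))
```

The test runs inside the drive itself, so the script only starts it. Progress and the final result have to be read back later, for example with the selftest_log action.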

Link to comment

No changes other than normal updates. It's a SuperMicro rack mount chassis with a backplane, so I can't see how any connection issues would affect a single drive.

 

Don't these point to a drive issue though? You know more than me, but I thought I'd ask so I can understand how it works.

 

Nov 19 02:15:24 Tower kernel: md: disk0 read error, sector=4971446528
Nov 19 02:15:24 Tower kernel: md: disk0 read error, sector=4971446536
Nov 19 02:15:24 Tower kernel: md: disk0 read error, sector=4971446544
Nov 19 02:15:24 Tower kernel: md: disk0 read error, sector=4971446552
Nov 19 02:15:24 Tower kernel: md: disk0 read error, sector=4971446560

 

Nov 19 02:16:51 Tower kernel: md: disk0 write error, sector=4971446528
Nov 19 02:16:51 Tower kernel: md: disk0 write error, sector=4971446536
Nov 19 02:16:51 Tower kernel: md: disk0 write error, sector=4971446544
Nov 19 02:16:51 Tower kernel: md: disk0 write error, sector=4971446552
Nov 19 02:16:51 Tower kernel: md: disk0 write error, sector=4971446560
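
To make the pattern in those lines easier to see, here is a rough Python sketch that pulls the sector numbers out of the excerpts above and compares them. The regular expression and function names are my own, and the script only summarises the pattern. It does not by itself say whether the cause is the drive or the connection.

```python
import re
from collections import defaultdict

# The ten syslog lines quoted above.
LOG = """\
Nov 19 02:15:24 Tower kernel: md: disk0 read error, sector=4971446528
Nov 19 02:15:24 Tower kernel: md: disk0 read error, sector=4971446536
Nov 19 02:15:24 Tower kernel: md: disk0 read error, sector=4971446544
Nov 19 02:15:24 Tower kernel: md: disk0 read error, sector=4971446552
Nov 19 02:15:24 Tower kernel: md: disk0 read error, sector=4971446560
Nov 19 02:16:51 Tower kernel: md: disk0 write error, sector=4971446528
Nov 19 02:16:51 Tower kernel: md: disk0 write error, sector=4971446536
Nov 19 02:16:51 Tower kernel: md: disk0 write error, sector=4971446544
Nov 19 02:16:51 Tower kernel: md: disk0 write error, sector=4971446552
Nov 19 02:16:51 Tower kernel: md: disk0 write error, sector=4971446560
"""

PATTERN = re.compile(r"md: (disk\d+) (read|write) error, sector=(\d+)")


def sectors_by_kind(text: str) -> dict:
    """Group the failing sector numbers by (disk, 'read' or 'write')."""
    grouped = defaultdict(list)
    for disk, kind, sector in PATTERN.findall(text):
        grouped[(disk, kind)].append(int(sector))
    return dict(grouped)


def is_contiguous(sectors: list, step: int = 8) -> bool:
    """True if each sector is exactly `step` after the previous one."""
    return all(b - a == step for a, b in zip(sectors, sectors[1:]))


found = sectors_by_kind(LOG)
reads = found[("disk0", "read")]
writes = found[("disk0", "write")]
print(reads == writes)       # True: the writes failed on the same sectors
print(is_contiguous(reads))  # True: one unbroken run, 8 sectors apart
```

So the excerpt shows a single contiguous run of sectors that failed on read and then, about a minute and a half later, failed again on write.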

 

I'm not sure what to do about the ReiserFS.

Link to comment

The extended SMART test is in progress. It's been on 10% for about 45 minutes, and I'm not sure if that is normal. Does the array need to be started for it to run the test?

 

I was actually just reading the wiki article on it.  I read this part and thought I might just do it as my drives begin to need replacing.  Some are quite old.

 

"At this point, there is NO general recommendation as to converting existing Reiser drives, UNLESS you are having a known Reiser-related issue. Some feel it is a good idea to begin converting existing drives to XFS, but others do not think it is necessary, and may be an over-reaction to the previous now-fixed issues. At any rate, it does seem wise to consider a slow migration strategy, as drives are added."

Edited by Spyderturbo007
Link to comment

Morning all. It says "Completed without error". I'm attaching the SMART report and new diagnostics. Thoughts on what to do next? One weird thing is that if I click on Show, next to SMART self-test history, it says "No self-tests have been logged. (To run self-tests, use: smartctl -t)".

WDC_WD60EFRX-68MYMN1_WD-WX51D6422029-20201119-1515.txt

tower-diagnostics-20201120-0805.zip

Edited by Spyderturbo007
Link to comment

Thanks for the help.   Parity rebuild is in progress and is estimated to take 1 day 11 hours.  I'll report back when it's finished.
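
As a rough sanity check on that estimate, the implied average rebuild speed can be worked out in a few lines of Python. This assumes the 6TB drive is 6 x 10^12 bytes (decimal terabytes) and that the estimate covers the whole disk. The function name is made up for illustration.

```python
def average_rate_mb_s(capacity_bytes: int, days: int, hours: int) -> float:
    """Average throughput in MB/s needed to cover the capacity in the given time."""
    seconds = (days * 24 + hours) * 3600
    return capacity_bytes / seconds / 1e6


# Assumption: 6 TB means 6 * 10**12 bytes, rebuilt over 1 day 11 hours.
rate = average_rate_mb_s(6 * 10**12, days=1, hours=11)
print(f"{rate:.1f} MB/s")  # 47.6 MB/s
```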

 

I assume I should refrain from using the array while the parity rebuild is in progress? I don't want to lose any data if another disk fails.

 

I'm also thinking a second parity drive would be a good idea for a situation like this in the future, but I thought I'd ask for your opinions.

Link to comment
