[Solved] Disk Disabled


One of my disks is showing as disabled. It has a red sphere next to it and shows 1 read, 0 writes and 0 errors.

I also got this message in unmenu: Aug 17 18:29:45 Media emhttp: shcmd (69): killall -HUP smbd

 

Running a S.M.A.R.T test results in the following error.

 

Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)

 

Is it time to get a new disk? Not really sure of the best way forward.

Link to comment

I just switched two disks around, and the same disk (checked by serial number) has a red sphere next to it still; presumably that rules out any issues with cables?

Nope. The red ball has nothing to do with the current health or status of the disk: once unRAID has failed a disk, the red ball won't go away until the drive slot is rebuilt from the rest of the disks. Try getting a SMART report on the failed drive now that you've switched the disks.
Link to comment

Still got the same error:

 

Smart Short Test of /dev/sdc will take from several minutes to an hour or more.

smartctl -t short -d ata /dev/sdc 2>&1

smartctl 5.40 2010-10-16 r3189 [i486-slackware-linux-gnu] (local build)

Copyright © 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

 

Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)

 

A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

Link to comment

A quick way to check if the disk is still online is to run something like

fdisk /dev/sd?

in a console/telnet session where ? corresponds to the device you want to check.  If fdisk successfully finds the disk then it IS online so immediately use the 'q' option to quit without making changes.  If the disk has dropped offline then fdisk will give an error message saying it cannot find the device.
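If you would rather not open an interactive fdisk session at all, a rough non-interactive version of the same check could look like the sketch below. The function name disk_online is made up for illustration, and it only tests whether the device node exists as a block device, which is a weaker check than fdisk actually opening the disk.

```shell
# Hypothetical helper: report whether a device node exists as a block device.
# Returns 0 if it does, 1 otherwise. It does not prove the disk answers commands.
disk_online() {
    dev="$1"
    if [ -b "$dev" ]; then
        echo "$dev: present"
        return 0
    fi
    echo "$dev: no such block device (disk may have dropped offline)"
    return 1
}
```

For example, `disk_online /dev/sdc && fdisk -l /dev/sdc` would print the partition table only if the device node is there; `fdisk -l` just lists and exits, so nothing on the disk is changed.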

Link to comment

"unable to open /dev/sd2" So the disk is not online.

 

I tried this for multiple disks and they all returned the same error. It occurred to me you may have meant /dev/md?, so I tried that too and have attached the results.

No, I DID mean to use the /dev/sd? devices, as these are the physical devices while /dev/md? are logical ones. However, the ? part will not be a number; it will be a letter. You need to look in the unRAID GUI to see which sd? device corresponds to a particular disk. Note that these assignments can change between boots (although in practice they rarely do), so you always need to check via the GUI to be sure of which device is assigned.
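The GUI is the reference here, but as a cross-check from the console, the serial number can usually be matched to a device letter through the udev symlinks. This is only a sketch: it assumes your system populates /dev/disk/by-id, and find_dev_by_serial is a hypothetical helper name, not an unRAID command.

```shell
# Hypothetical helper: print the device node that a by-id entry containing
# SERIAL points to. The second argument is optional and only exists so the
# function can be tried against a scratch directory.
find_dev_by_serial() {
    serial="$1"
    by_id_dir="${2:-/dev/disk/by-id}"
    for link in "$by_id_dir"/*"$serial"*; do
        # Skip the unexpanded pattern when nothing matched.
        [ -e "$link" ] || continue
        # Skip partition entries such as ...-part1; we want the whole disk.
        case "$link" in *-part[0-9]*) continue ;; esac
        readlink -f "$link"
        return 0
    done
    return 1
}
```

Called as `find_dev_by_serial <serial>`, with the serial number exactly as the GUI shows it, this prints the matching /dev/sd? node, or returns non-zero if no entry matches, which would itself suggest the disk is not currently detected.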

Link to comment

fdisk /dev/sdc returns the following:

 

WARNING: DOS-compatible mode is deprecated. It's strongly recommended to switch off the mode (command 'c') and change display units to sectors (command 'u').

You can ignore any warning like that.  As long as fdisk started and then gave you the option to quit that at least means the disk was detected and is still online.  If the disk drops offline then fdisk will tell you the device does not exist.

Link to comment

Before you proceed, you need to check that the disk you ran the smartctl command on is the disk that is marked as not-writable.

 

Do the model and serial number in the smartctl report match those of the failed drive?

 

Every time you re-start unRAID the /dev/sdX device names are re-assigned. If you really had a failed disk then the current /dev/sdc would NOT be the same disk, but a different disk in your server (one that is still working). No device name would be assigned to a failed disk (one that is not responding at all).

 

 

I am assuming you've re-started unRAID several times by now to re-seat the cables and swap disks. A dead drive would be assigned no device name; a functional drive that is off-line only because a write to it failed would still be assigned a device.

 

Link to comment

That SMART report shows no obvious problems on the disk (assuming that you have checked it is the correct drive).

 

You should be able to:

1. Stop the array.
2. Set the drive to unassigned.
3. Start the array, and it should start OK saying that there is a missing drive.
4. Stop the array and reassign the drive. unRAID should now indicate that it will rebuild the drive.
5. Start the array and the rebuild will start.
6. When the rebuild completes, do a non-correcting parity check to check there are no errors.

If the rebuild fails, then there is a chance of data loss.  Ways to minimise this are:

  • At the moment unRAID is simulating the drive, so you can copy the data to another location before starting the rebuild.
  • If you have another spare drive of a suitable size then you could rebuild onto that, putting the current drive aside while the rebuild is in progress.  If the rebuild works then the removed disk can be put through a pre_clear cycle to check it out and prepare it for potential use in the unRAID array.  If the rebuild fails, the removed drive is kept unchanged so that data recovery can be attempted from it (this will normally recover at least 99% of the data if the drive has not physically failed).
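For the first option, copying from the simulated drive can be done from the console. The sketch below is a minimal illustration, not an official procedure: copy_disk is a hypothetical helper, and the destination path in the usage note is an assumption you would replace with a disk that has enough free space.

```shell
# Hypothetical helper: copy everything under SRC into DST, preserving
# ownership, permissions and timestamps (cp -a). DST is created if missing.
copy_disk() {
    src="$1"
    dst="$2"
    [ -d "$src" ] || { echo "source $src not found" >&2; return 1; }
    mkdir -p "$dst" || return 1
    # "$src"/. copies the contents of SRC, including hidden files.
    cp -a "$src"/. "$dst"/
}
```

With the array started and the disabled disk being simulated at /mnt/disk2, something like `copy_disk /mnt/disk2 /mnt/disk3/disk2_backup` would do it, where /mnt/disk3/disk2_backup is only an example destination. Copy disk to disk rather than through a user share, so the files do not end up back on the disk you are trying to empty.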

Link to comment

Before you proceed, you need to check that the disk you ran the smartctl command on is the disk that is marked as not-writable.

Do the model and serial number in the smartctl report match those of the failed drive?

Yes I checked before running it:

DISK_DSBL /dev/md2 /mnt/disk2 /dev/sdc WDC_WD20EARS-00MVWB0_WD-WMAZA4840105

 

I am assuming you've re-started unRAID several times by now to re-seat the cables and swap disks. A dead drive would be assigned no device name; a functional drive that is off-line only because a write to it failed would still be assigned a device.

Yes, of course. I have also swapped the drive into a different hotswap bay in case there were any cable issues.

 

 

I will give that a go now. Is there an easy way to move the files from this drive onto another in the array? I have enough free space.

Link to comment

After a busy weekend, I have followed the suggested steps and my array is now back online, with all drives showing a green light next to them. I completed a non-correcting parity check and it found 0 sync errors. However, the parity drive is showing 768 errors. Should this be of concern?

Link to comment

Yes. I would get SMART reports on at least the parity drive, and probably on all the drives, to see if anything changed after the parity check.
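A quick way to collect those reports from the console, so they can be compared or attached, is sketched below. The helper name smart_reports and the SMARTCTL variable are made up for illustration; `smartctl -a` is the standard "print all SMART information" option, and the device list is whatever the GUI currently shows for your disks.

```shell
# Path to smartctl; overridable (hypothetical variable) for dry runs.
SMARTCTL="${SMARTCTL:-smartctl}"

# Hypothetical helper: write one SMART report per device into OUTDIR.
# Usage: smart_reports OUTDIR /dev/sdb /dev/sdc ...
smart_reports() {
    outdir="$1"
    shift
    mkdir -p "$outdir" || return 1
    for dev in "$@"; do
        name=$(basename "$dev")
        # smartctl's exit status is a bitmask, so non-zero does not always
        # mean the report is missing; note it and carry on.
        "$SMARTCTL" -a "$dev" > "$outdir/smart_$name.txt" 2>&1 ||
            echo "smartctl exited non-zero for $dev" >&2
    done
}
```

For instance, `smart_reports /boot/smart /dev/sdb /dev/sdc` would leave smart_sdb.txt and smart_sdc.txt in /boot/smart (the output folder is just an example location).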
Link to comment
