
How to safely remove bad drive from array



Hi everyone.

I am on Unraid version 6.9.2. Recently one of my drives failed (red cross mark beside it) and it won't turn on at all. How can I safely remove it? I don't have a replacement drive of the same size; I have a bigger drive that I will add later. I'm not sure if I can just replace the bad drive with the new bigger one and have parity rebuild safely.

 

I have checked all the FAQs, but they all advise moving data to other drives, which I can't do since the drive is already bad.

 

Thanks in advance for any suggestion. 

26 minutes ago, munimisu said:

checked all FAQs, but they all advise to move data to other drives

Where do you see that? I am very skeptical that any of our FAQs suggest anything like that.

 

Moving to other drives in the array is NOT the recommendation. In fact, I always specifically recommend NOT doing that.

 

Lots of questions we have can be answered if you Attach Diagnostics to your NEXT post in this thread.

2 hours ago, trurl said:

What do you mean and how do you know?

 

Please see the screenshot. When I hover over the red X, the tooltip says to click to spin up the drive; when I do that, the drive still has the same red X beside it and its contents are not displayed when I check the disk in Explorer.

 

I have not done anything yet that would impact the drive. The array is still up and running in the same situation. The only change I have made is excluding the bad drive (disk 5 in my array) from all shares.

 

Thanks a lot for helping. 

Screenshot 2022-07-27 122146.jpg


That wiki is indeed about removing a disk, but it isn't really about removing a disabled disk. Normally you want to recover the data by rebuilding the disk.

 

Doesn't look like there is anything wrong with the disk itself. Syslog indicates problems communicating with multiple disks, but since you have single parity only one could be disabled.

 

Probably a controller or power problem.

 

Disk5 is disabled, but SMART looks OK except for a ridiculous number of CRC errors (connection problems). An extended test passed, but that was some time ago.
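If you want to check that counter yourself from the Unraid terminal, here is a minimal sketch (assuming the disk currently shows up as /dev/sdX; substitute the real device letter from the Main page):

# Print the SMART attribute table; attribute 199 (UDMA_CRC_Error_Count)
# is the CRC/communication error counter that climbs with cabling problems.
smartctl -A /dev/sdX | grep -iE 'id#|crc'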

 

Emulated disk5 mounts but doesn't seem to have much data on it, if any.

 

Is disk5 supposed to be empty?

 

 


Thank you for checking the logs. 

 

I don't recall what the status of disk5 was before I noticed the issue. It was never set up to be empty. It's showing 27GB of data on it (which is pretty low, since this drive has been in the array since the beginning).

 

Can you please advise what I can do now?

 

 

2 minutes ago, trurl said:

Did you ever reformat the disk?

 

No, I never reformatted the drive after adding it to the array.

 

Here is the result:

root@homeNAS:~# ls -lah /mnt/disk5
total 0
drwxrwxrwx  2 nobody users   6 Jul  7 11:41 ./
drwxr-xr-x 15 root   root  300 Jul  7 11:31 ../

 

12 minutes ago, munimisu said:

Here is the result

So the disk is empty. Did you move the data off of it?

 

7 minutes ago, JorgeB said:

Disk was already disabled at boot

How did you decide that? The syslog goes back 3 weeks.

 

Also, what about these in syslog?

Jul 20 04:53:37 homeNAS kernel: sd 4:0:0:0: attempting task abort!scmd(0x00000000d321d4dd), outstanding for 7226 ms & timeout 7000 ms
Jul 20 04:53:37 homeNAS kernel: sd 4:0:0:0: [sdc] tag#9294 CDB: opcode=0x85 85 06 20 00 d8 00 00 00 00 00 4f 00 c2 00 b0 00
Jul 20 04:53:37 homeNAS kernel: scsi target4:0:0: handle(0x001c), sas_address(0x500c04f2f3186921), phy(33)
Jul 20 04:53:37 homeNAS kernel: scsi target4:0:0: enclosure logical id(0x500c04f2f3186900), slot(0) 
Jul 20 04:53:37 homeNAS kernel: sd 4:0:0:0: device_block, handle(0x001c)
Jul 20 04:53:38 homeNAS kernel: sd 4:0:0:0: task abort: SUCCESS scmd(0x00000000d321d4dd)
Jul 20 04:53:38 homeNAS kernel: sd 4:0:0:0: device_unblock and setting to running, handle(0x001c)
Jul 20 04:53:38 homeNAS emhttpd: read SMART /dev/sdc
Jul 20 04:53:40 homeNAS emhttpd: spinning down /dev/sdk
Jul 20 04:53:43 homeNAS emhttpd: spinning down /dev/sdd
Jul 20 04:54:07 homeNAS kernel: sd 4:0:9:0: attempting task abort!scmd(0x00000000867a2e94), outstanding for 7444 ms & timeout 7000 ms
Jul 20 04:54:07 homeNAS kernel: sd 4:0:9:0: [sdk] tag#9322 CDB: opcode=0x85 85 06 20 00 d8 00 00 00 00 00 4f 00 c2 00 b0 00
Jul 20 04:54:07 homeNAS kernel: scsi target4:0:9: handle(0x001b), sas_address(0x500c04f2f3186920), phy(32)
Jul 20 04:54:07 homeNAS kernel: scsi target4:0:9: enclosure logical id(0x500c04f2f3186900), slot(1) 
Jul 20 04:54:07 homeNAS kernel: sd 4:0:9:0: device_block, handle(0x001b)
Jul 20 04:54:08 homeNAS kernel: sd 4:0:9:0: task abort: SUCCESS scmd(0x00000000867a2e94)
Jul 20 04:54:08 homeNAS kernel: sd 4:0:9:0: device_unblock and setting to running, handle(0x001b)
Jul 20 04:54:08 homeNAS emhttpd: read SMART /dev/sdk
Jul 20 04:54:10 homeNAS emhttpd: spinning down /dev/sdh
Jul 20 04:54:10 homeNAS emhttpd: spinning down /dev/sde
Jul 20 04:54:47 homeNAS kernel: sd 4:0:3:0: attempting task abort!scmd(0x000000003c610d58), outstanding for 7043 ms & timeout 7000 ms
Jul 20 04:54:47 homeNAS kernel: sd 4:0:3:0: [sde] tag#9280 CDB: opcode=0x85 85 06 20 00 d8 00 00 00 00 00 4f 00 c2 00 b0 00
Jul 20 04:54:47 homeNAS kernel: scsi target4:0:3: handle(0x0015), sas_address(0x500c04f2f3186919), phy(25)
Jul 20 04:54:47 homeNAS kernel: scsi target4:0:3: enclosure logical id(0x500c04f2f3186900), slot(5) 
Jul 20 04:54:47 homeNAS kernel: sd 4:0:3:0: task abort: SUCCESS scmd(0x000000003c610d58)
Jul 20 04:54:50 homeNAS emhttpd: read SMART /dev/sde
Jul 20 04:55:07 homeNAS kernel: sd 4:0:6:0: attempting task abort!scmd(0x0000000019c137e4), outstanding for 7348 ms & timeout 7000 ms
Jul 20 04:55:07 homeNAS kernel: sd 4:0:6:0: [sdh] tag#9310 CDB: opcode=0x85 85 06 20 00 d8 00 00 00 00 00 4f 00 c2 00 b0 00
Jul 20 04:55:07 homeNAS kernel: scsi target4:0:6: handle(0x0018), sas_address(0x500c04f2f318691d), phy(29)
Jul 20 04:55:07 homeNAS kernel: scsi target4:0:6: enclosure logical id(0x500c04f2f3186900), slot(7) 
Jul 20 04:55:07 homeNAS kernel: sd 4:0:6:0: task abort: SUCCESS scmd(0x0000000019c137e4)

 

 

6 minutes ago, munimisu said:

I did exclude this disk from all shares. Didn't do anything manually.  

That wouldn't affect any files already on the disk. Was the disk ever shown as unmountable? That would cause Unraid to list it as a disk to be formatted along with any other disks that actually needed formatting, but you would have to agree to format them.

8 minutes ago, trurl said:

That wouldn't affect any files already on the disk. Was the disk ever shown as unmountable? That would cause Unraid to list it as a disk to be formatted along with any other disks that actually needed formatting, but you would have to agree to format them.

I didn't do any formatting recently; the server had been in the same state for many months until this recent disk issue.

 

3 minutes ago, trurl said:

How long ago was this disk installed?

I think it's been around 3 years.

11 minutes ago, munimisu said:

this recent disk issue

As noted, the disk had been disabled for at least 3 weeks, maybe longer.

 

Do you have Notifications set up to alert you by email or other agent as soon as a problem is detected?

 

Do you really want to remove the disk, or do you just want to enable it again? Either way is a rebuild: rebuild disk5, or rebuild parity without it.
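For background on why both options are "a rebuild", here is a toy illustration only (made-up sample bytes, not Unraid's actual code): single parity is a bitwise XOR across the data disks, so the missing disk can be recomputed from parity plus the surviving disks, or parity can be recomputed over the remaining disks if the slot is dropped.

d1=0xA5; d2=0x3C; d5=0x5F                  # sample bytes from three data disks
parity=$(( d1 ^ d2 ^ d5 ))                 # what the parity disk stores for this position
printf 'rebuild disk5 from parity:    0x%02X\n' $(( parity ^ d1 ^ d2 ))   # equals d5
printf 'rebuild parity without disk5: 0x%02X\n' $(( d1 ^ d2 ))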

22 minutes ago, trurl said:

As noted, the disk had been disabled for at least 3 weeks, maybe longer.

 

Do you have Notifications set up to alert you by email or other agent as soon as a problem is detected?

Also worth noting that with single parity and one disk disabled you have been running with no redundancy all that time. 

25 minutes ago, trurl said:

Do you have Notifications set up to alert you by email or other agent as soon as a problem is detected?

 

Unfortunately, no. I periodically check whether there are any issues or updates pending. I will set up notifications.

 

25 minutes ago, trurl said:

Do you really want to remove the disk, or do you just want to enable it again? Either way is a rebuild: rebuild disk5, or rebuild parity without it.

I don't want to remove the disk if there is no issue with it.

Can you advise whether I've got this right:

1. Stop array
2. Unassign disk5
3. Rebuild parity without disk5
4. After the parity build is complete, stop array
5. Assign disk5 again
6. Rebuild parity

 

13 minutes ago, trurl said:

Also worth noting that with single parity and one disk disabled you have been running with no redundancy all that time. 

Noted. I have a disk to add as a second parity; I've just been putting it off. I will do that now and set up notifications as well. Thanks!

5 minutes ago, munimisu said:

1. Stop array
2. Unassign disk5
3. Rebuild parity without disk5
4. After the parity build is complete, stop array
5. Assign disk5 again
6. Rebuild parity

No reason to rebuild parity twice. And you can't rebuild parity without the disk unless you do a New Config, so the plan needs some correction.

 

Do you want to see if there is anything on the physical disk?
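If you do want to look at the physical disk, here is a minimal sketch from the terminal (assuming the drive is reachable again and shows up as /dev/sdX with its data partition at /dev/sdX1; substitute the real device, and keep the mount read-only so nothing on the disk is changed):

mkdir -p /mnt/check                  # temporary mount point, outside the array's own mounts
mount -o ro /dev/sdX1 /mnt/check     # read-only mount of the physical partition
ls -lah /mnt/check                   # check whether any files are actually present
umount /mnt/check                    # unmount when done

(The Unassigned Devices plugin is the usual GUI route for this kind of read-only check.)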

