Drive replacement


FreeMan


I know I've done this before, and I'm quite certain I've asked about it, too. The only reference I could find in any of my posts (I did look through about half of them), though, was from 2014, so I want to be sure nothing's changed since then. Also, I've been around long enough to know that I know just enough to be dangerous, and I can easily shoot myself in the foot if I'm not careful.

 

I need to replace a tired, old, yet functional drive with a newer, larger one. I've run a pre-clear on the new drive as a test - I know that step is unnecessary except for the comfort of knowing the drive won't die within its first 48 hours - so the drive's installed and ready to be used.

 

The Wiki states:

 

To replace a failed disk or disks:

  1. Stop the array.
  2. Power down the unit.
  3. Replace the failed disk(s) with new one(s).
  4. Power up the unit.
  5. Assign the replacement disk(s) using the Unraid webGui.
  6. Click the checkbox that says "Yes I want to do this" and then click Start.

 

The comment made on that 2014 post I found stated:

  1. Stop the array and unassign the drive to be replaced.
  2. Start the array so it shows a missing drive.
  3. Stop the array and assign the new (replacement) drive to the slot where the drive was missing.
  4. Start the array and wait while UnRAID rebuilds the drive.
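
For what it's worth, the rebuild can also be watched from a shell once it's running. A minimal sketch, assuming Unraid's mdcmd tool (the same one that shows up in the syslog excerpts later in this thread) and its mdResync* status fields; the webGui Main page reports the same numbers:

# Snapshot of rebuild/parity-sync progress; wrap in `watch -n 60`
# to poll. The mdResync* field names are an assumption on my part.
/usr/local/sbin/mdcmd status | grep -E 'mdResync(Pos|Size)?='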

 

They are, essentially, the same set of instructions. My only hesitation is that the Wiki says it's for a failed disk. I believe that steps 1 & 2 of the 2014 instructions are there to convince unRAID that the disk has failed, and from there it continues as per the Wiki.

 

Correct?

20 minutes ago, ChatNoir said:

If you go this route, I would personally do a parity check before replacing the drive, to be sure that the data used for the rebuild is good.

Isn't a parity check pointless, as it will calculate parity using the contents of the failed drive, which is now emulated... using parity? Kind of like checking whether the answer in the back of the book is correct using a photocopy of the answer in the back of the book.
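
The circularity is easy to demonstrate with single parity, which is just the XOR of the data disks. A toy sketch with made-up byte values (nothing Unraid-specific):

# Three "data disks", one byte each -- made-up values.
d1=0xA5 d2=0x3C d3=0x5F
p=$(( d1 ^ d2 ^ d3 ))            # parity = XOR of all data disks

# A failed disk 2 is "emulated" from parity plus the survivors:
e2=$(( p ^ d1 ^ d3 ))
printf 'real d2=0x%X  emulated d2=0x%X\n' $(( d2 )) "$e2"

# A parity check that reads the emulated disk recomputes
# d1 ^ (p ^ d1 ^ d3) ^ d3, which is always p -- it cannot disagree.
printf 'parity=0x%X  recomputed=0x%X\n' "$p" $(( d1 ^ e2 ^ d3 ))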

1 hour ago, FreeMan said:

I need to replace a tired, old, yet functional drive with a newer, larger one.

 

54 minutes ago, calvinandh0bbes said:

Isn't a parity check pointless, as it will calculate parity using the contents of the failed drive, which is now emulated... using parity?

Technically it isn't a parity check when the drive has already failed; it's a read check, for the reasons you state.

 

However, in this thread, we are discussing replacing a GOOD drive, and yes, a full parity check with zero errors is crucial before pulling a good drive to be upgraded.

 

So, if the drive in question is dead, there's no point in doing a check; rebuilding the drive will do the exact same thing, hopefully getting the array protected again.

If the drive is still good, valid parity is required to successfully replace it, so a check before removing the drive is warranted.
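
If anyone wants to kick that pre-replacement check off from a shell instead of the webGui, something like the following should do it. The check/NOCORRECT arguments to mdcmd are my assumption from how the tool is commonly described, so treat this as a sketch and verify before relying on it:

# Start a read-only (non-correcting) parity check -- assumed syntax:
/usr/local/sbin/mdcmd check NOCORRECT

# When it finishes, look for a zero sync-error count before pulling
# the drive (field name assumed):
/usr/local/sbin/mdcmd status | grep -i sbSyncErrs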

37 minutes ago, jonathanm said:

If the drive is still good, valid parity is required to successfully replace it, so a check before removing the drive is warranted.

The monthly parity check completed just 2 days ago, and the drive being replaced is full enough that I know nothing's being written to it, so I'm not concerned about losing data there. I believe I've added a grand total of one file to the array since that check completed, so I'm comfortable that parity is good.

 

Otherwise, the process is correct?

  1. Stopped the array.
  2. Selected the new drive in place of the existing drive for disk 5.
  3. Got a red X saying the disk was missing, and the selection reverted to "no drive".
  4. Repeated steps 2 & 3 several times.
  5. Left it set to "no drive".
  6. Started the array (should probably have selected "maintenance mode", but didn't).
  7. Stopped the array.
  8. It hung.

[screenshot of the array-stop status message]

It's been saying this for about the last 10 minutes.

The dashboard shows "Array (Stopped)" and lists Parity, Disk 1-4, 6-8 (as expected).

The server is not responding by name, but it is accessible by IP address.

 

As always when the server's not responding properly: nas-diagnostics-20200904-2053.zip

 

Should I reboot, which seems to be about my only option, or is there something else I'm not aware of? I'd probably set auto-start to off for this boot, just to save a minute or two.

 


There is also this in syslog when you try to assign the new disk:

Sep  4 16:24:30 NAS kernel: mdcmd (6): import 5 sde 64 7814026532 0 HGST_HUS728T8TALE6L4_VDKY234M
Sep  4 16:24:30 NAS kernel: md: import disk5: lock_bdev error: -13
Sep  4 16:24:30 NAS kernel: md: import_slot: 5 missing

I don't recall seeing that lock_bdev before, but maybe I have just never had a reason to look for it.
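
For reference, -13 is -EACCES: the md driver couldn't take an exclusive lock on the block device, which usually means some other process still had it open. A couple of stock-Linux commands to see what's holding it (sde taken from the log above; the partition name is assumed):

# Any process with the raw device or its first partition open:
fuser -v /dev/sde /dev/sde1 2>&1
lsof /dev/sde /dev/sde1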

10 minutes ago, trurl said:

Can you get to the Docker page?

"Array must be started to view Docker containers", so that's a no.

 

I have shut down all SSH connections, Windows Explorer windows, etc. I had my Kodi box running, but I've rebooted the server many times in the past with it up, and that didn't present any issues.

 

The odd thing is that I stopped the array with no issues in order to remove the disk 5 assignment. It's only after I brought it back up and then tried stopping it again that it refuses to stop.

 

It's had this on the status bar of the browser window since I attempted the shutdown:

[screenshot of the browser status bar]

 

Also, the WebGUI stopped responding by name and is currently only responding via IP. I don't know if that's an important symptom, but I presume it'll go away once I get it restarted.
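
Since the stop appears wedged on unmounting, it may be worth checking what still has the array mounts busy before resorting to a reboot. These are plain Linux tools, nothing Unraid-specific; /mnt/user and /mnt/disk5 are the usual Unraid mount points:

# Processes holding anything open under the user shares or the
# emulated disk:
fuser -vm /mnt/user 2>&1
fuser -vm /mnt/disk5 2>&1

# lsof restricted to that one filesystem gives more detail:
lsof +f -- /mnt/disk5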


OK, I finally got frustrated and hit the "Reboot" button on the Array Operations tab. I figured the worst that could happen was that it wouldn't do anything. I'd tried to set the array not to auto-start before doing that, but the setting wouldn't save, so I left it to auto-start.

 

Upon clicking the reboot button, it commenced shutting down. After about 2 minutes, I got the login page. After logging in, it showed that the array was up, the Dockers had all started, and everything was hunky-dory with disk5 being emulated.

 

I stopped the array, assigned the new 8TB drive to disk5 and clicked start. It's now happily rebuilding disk5. Not sure what the malfunction was, but I'm on the right track now. Another 20 or so hours and I'll have an extra 4TB of space and a nice crispy-new drive to use.

8 hours ago, trurl said:

I don't recall seeing that lock_bdev before, but maybe I have just never had a reason to look for it.

Yes, that's not normal; there was some issue with that disk. There's also this later:

Sep  4 16:44:45 NAS kernel: sd 9:0:3:0: [sde] tag#2701 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x00
Sep  4 16:44:45 NAS kernel: sd 9:0:3:0: [sde] tag#2701 CDB: opcode=0x88 88 00 00 00 00 00 00 00 00 00 00 00 00 20 00 00
Sep  4 16:44:45 NAS kernel: print_req_error: I/O error, dev sde, sector 0

Not sure what caused it, though. UD (Unassigned Devices) was spinning down that disk, but I would think that's unrelated. If all is good for now, ignore it.
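
Given the sector-0 read error, a quick SMART query of that drive seems worthwhile if it acts up again. smartctl ships with Unraid as far as I know; sde is taken from the log above:

# Overall health verdict plus the full attribute table:
smartctl -H -A /dev/sde

# The usual early-warning attributes:
smartctl -A /dev/sde | grep -Ei 'realloc|pending|uncorrect'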

