[SOLVED] Replace Empty Data Drive With New Larger Drive



Background:

 

My media unRAID has 26 spinning disks and 2 SSDs in it, all driven by an LSI 9217-8i (SAS2308). One miniSAS port feeds a 24 port Supermicro backplane and the other a 12 port Supermicro backplane (both in my Supermicro CSE-847 enclosure). I have 2 x 16TB for dual parity, 2 x 16TB and 22 x 10TB for the data array, 1 x 2TB SSD for cache, and 1 x 1TB SSD for some VMs and Docker containers.

 

The 4 x 16TB drives were implemented just over a month ago using a new config with a full parity sync (since 2 of them were my new parity drives). Alas, the full parity sync with this many drives ran at only 30MB/s until it got past the 10TB mark. The sync speed increased to about 110MB/s once only the 2 x 16TB data drives were left. The complete parity build took over 5 days.

 

For Prime Day I picked up 2 more new 16TB drives. I had re-used my old 10TB parity drives as additional data drives, but now wanted to move them to my backup unRAID system. Only 1 of the 2 x 10TB IronWolf drives had data on it. After a successful preclear of the 2 new 16TB drives, I decided to replace only 1 at a time, even though I have dual parity. Alas, the data rebuild took close to 6 days, again running at 25-30MB/s while all 26 spinning disks were being utilized and 110MB/s once only the 16TB drives were in use.

 

Questions:

 

I'm now ready to replace the 2nd 10TB drive (which is empty) with the last 16TB drive. Since this 10TB drive is still empty, I was thinking it might be faster to replace it with the precleared 16TB via the drive swap procedure, but at the last step, before starting the array, selecting the box for 'Parity is already valid'. Is this a viable (and safe) way to replace an empty drive with a new larger drive?

 

Assuming it is viable and safe, should the new 16TB drive be assigned with just the zero/signature, or should it be formatted (to XFS) prior to adding it to the slot vacated by the empty 10TB?

 

Hopefully this will let me avoid a 'parity rebuild' on an empty drive. Thanks in advance for any assistance!

 

PS: If anyone has any suggestions on how to improve the parity sync/rebuild speed, that would be appreciated as well.

 

 

6 hours ago, AgentXXL said:

I'm now ready to replace the 2nd 10TB drive (which is empty) with the last 16TB drive. Since this 10TB drive is still empty, I was thinking it might be faster to replace it with the precleared 16TB via the drive swap procedure, but at the last step, before starting the array, selecting the box for 'Parity is already valid'. Is this a viable (and safe) way to replace an empty drive with a new larger drive?

No, unless the existing drive was also cleared and never formatted.

On 10/26/2020 at 4:03 AM, JorgeB said:

No, unless the existing drive was also cleared and never formatted.

I went ahead and put the replacement 16TB drive in and it's now rebuilding, estimating 6 days to complete... this is the part that I don't like or understand. Originally my Supermicro CSE-847 came with a Dell H310 HBA, which is LSI SAS2008 based. I read posts both here and on other forums stating that these cards are a little 'long in the tooth' for dealing with a large number of physical spinning disks.

 

To try and alleviate this I picked up a new 9217-8i (LSI SAS2308 based) to replace it, hoping that would improve the speed of parity checks/rebuilds and overall performance. Alas, I'm seeing no change to read or write performance, as evidenced by the time to do a parity check/rebuild. Once it completes I'll try another run of the DiskSpeed docker to see if my tunable values need any tweaking. But back to the original question.

 

The empty 10TB drive was precleared successfully and had the preclear signature. It was formatted by unRAID when added to the array as a new data disk, but was still empty other than the [up to] 70GB that unRAID seems to reserve. The new 16TB drive was also successfully precleared. Since both drives are filled with zeroes, it should be a trivial task to replace the 10TB with the 16TB, as parity should be unaffected.
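To illustrate my reasoning (a quick Python sketch, not unRAID's actual code): P parity is a plain XOR across the data disks, and XOR with zeros is a no-op, so swapping a fully zeroed disk for a larger fully zeroed disk shouldn't change parity at all. Tiny byte strings stand in for whole drives here:

from functools import reduce

PARITY_SIZE = 4  # stand-in for the 16TB parity size

def parity(disks, size=PARITY_SIZE):
    # XOR all data disks together; disks smaller than parity count as zeros.
    padded = [d.ljust(size, b'\x00') for d in disks]
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*padded))

data_disk = b'\x12\x34\x56\x78'  # a data disk with content
old_10tb  = b'\x00\x00\x00'      # smaller drive, fully zeroed
new_16tb  = b'\x00\x00\x00\x00'  # larger drive, also fully zeroed

# Parity is identical before and after the swap.
assert parity([data_disk, old_10tb]) == parity([data_disk, new_16tb])

(Q, the second parity, is Reed-Solomon rather than XOR, but zeros contribute nothing there either. And of course this only holds for regions that truly are zero, which I assume is why the 'cleared and never formatted' caveat applies: formatting writes filesystem metadata.)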

 

Even if the 10TB drive had some data on it, once the rebuild got past that point the rest of the drive should still be zeroed. It should be relatively simple to just accept that parity is valid, which is why I thought the 'Parity is already valid' checkbox exists. What's the use case for this checkbox?

 

Regardless, it's moot now that I've gone ahead and replaced the drive and let unRAID 'rebuild' an empty drive. What a colossal waste of time and resources, IMO. While I truly love unRAID for its simplicity and ease of adding plugins/dockers/VMs, small issues like this should be easier to resolve. I realize the plethora of hardware configs might make certain features difficult to implement, so for now I'll accept it.

 

15 hours ago, JorgeB said:

That's a long time, 16TB should still be done in 24 hours or so, posting the diags might give some clues.

Thanks... diagnostics attached. It's been rebuilding the empty drive for almost 2 days already and is showing about 4 days left. I probably should have investigated this sooner, as the initial parity sync with the new 16TB drives also took almost 6 days total, and the same for the 1st of these 2 new 16TB drives - about 6 days. Note that I have occasionally paused the parity sync/rebuild when I needed a little more write performance for new data being added to the server; I know that increases the time it takes to complete.
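For what it's worth, here's the rough arithmetic behind the '24 hours or so' figure versus what I'm actually seeing (a simple sketch that ignores the normal slowdown on inner tracks and my pauses):

TB = 1e12  # drive sizes are decimal terabytes

def rebuild_days(size_bytes, speed_mb_per_s):
    # total bytes / (bytes per second) / (seconds per day)
    return size_bytes / (speed_mb_per_s * 1e6) / 86400

print(f"{rebuild_days(16 * TB, 180):.1f} days at 180MB/s")  # ~1.0 day
print(f"{rebuild_days(16 * TB, 30):.1f} days at 30MB/s")    # ~6.2 days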

 

As an aside, is there a tutorial on how to use the various files in the diagnostics? I've occasionally looked at them myself, but have really only concentrated on the syslog. I've seen others mention that you check several things initially, which is why you ask for the full diagnostics and not just the syslog. I'm feeling fairly confident with unRAID these days, so I'd like to learn how to do this kind of troubleshooting myself.

 

Thanks again for the assistance!

 

animnas-diagnostics-20201029-1457.zip

15 hours ago, JorgeB said:

You should check the current HBA link speed, but assuming that is good, then since the CPU isn't very fast (the parity check is single threaded) and it's a large array, that is most likely the main issue.

I've had a quick read through of that thread and am at least somewhat appeased by the fact that this is affecting others as well. One notable thing is that others report a single core (thread) at 100% utilization while a parity check/rebuild is underway. As you can see from the attached pic, that doesn't seem to be the case on my system. Any thoughts on why? Is the larger number of drives (26 spinning, 2 SSDs) the bottleneck?

 

[Attached image: SupermicroCSE-847ParityCPU.jpg]

 

I also looked through my syslog and see this:

 

Oct 27 02:54:54 AnimNAS kernel: pci 0000:03:00.0: 8.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s x4 link at 0000:00:1c.0 (capable of 63.008 Gb/s with 8 GT/s x8 link)

 

Is this what you're referring to? I also note that the 0000:00:1c.0 device is in the same IOMMU group as the LSI card. Perhaps splitting the LSI card into its own IOMMU group might help? The LSI HBA is installed in a x8 slot on the motherboard, but that message indicates it's only running at a 2.5 GT/s x4 (PCIe 1.0) link.
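To double-check the negotiated link I'm reading it straight from sysfs (a quick sketch; 0000:03:00.0 is the HBA address from the syslog line above, and the same info appears under LnkSta in lspci -vv):

from pathlib import Path

dev = Path("/sys/bus/pci/devices/0000:03:00.0")  # HBA address from the log

for attr in ("current_link_speed", "current_link_width",
             "max_link_speed", "max_link_width"):
    print(f"{attr}: {(dev / attr).read_text().strip()}")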

 

[Attached image: LSI-IOMMU-Group.jpg]

 

Thanks again!

9 hours ago, AgentXXL said:

limited by 2.5 GT/s x4 link at 0000:00:1c.0

This will be the main issue. The HBA is capped at 1GB/s theoretical max bandwidth, and usable max is around 80% of that, so 800MB/s. On the diags posted you have 25 drives reading at 31MB/s each; that's close to the 800MB/s expected limit.

 

Ideally you'd want a PCIe 2.0 or 3.0 slot for the HBA. Your board should have a couple of x8 PCIe 2.0 slots; if you use one of those it will quadruple the current bandwidth.
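For reference, the back-of-the-envelope numbers (a rough sketch; PCIe 1.x/2.0 lose 20% of the raw bit rate to 8b/10b encoding):

def pcie_usable_gbps(gt_per_s, lanes):
    # GT/s x lanes x 0.8 (8b/10b encoding) = usable Gb/s
    return gt_per_s * lanes * 0.8

gen1_x4 = pcie_usable_gbps(2.5, 4)  # current link:  8 Gb/s = 1 GB/s
gen2_x8 = pcie_usable_gbps(5.0, 8)  # x8 PCIe 2.0: 32 Gb/s = 4 GB/s, 4x more

# ~80% of that 1 GB/s is realistically usable, shared by 25 spinning disks:
print(f"{1000 * 0.8 / 25:.0f} MB/s per drive")  # ~32MB/s, matching the diags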

1 hour ago, JorgeB said:

This will be the main issue. The HBA is capped at 1GB/s theoretical max bandwidth, and usable max is around 80% of that, so 800MB/s. On the diags posted you have 25 drives reading at 31MB/s each; that's close to the 800MB/s expected limit.

 

Ideally you'd want a PCIe 2.0 or 3.0 slot for the HBA. Your board should have a couple of x8 PCIe 2.0 slots; if you use one of those it will quadruple the current bandwidth.

Yes, that's what I've determined as well. The motherboard (a Supermicro X8DTN+) has 2 x8 slots and 1 x4 slot. Only one of the x8 slots is connected to the 5520 chipset whereas the other one is connected to the ICH10 PCH. I will be moving the card to the other x8 slot once the parity rebuild completes.

 

If I could pause the parity rebuild and have it start up again after a system shutdown/reboot, I'd try this right away, but it appears that's not possible. I don't want to have to start the parity rebuild from scratch, especially if moving the card to the other slot doesn't change the performance. So I'll (im)patiently wait for the rest of the parity rebuild to complete... still estimating just over the 3 day mark.

 

11 minutes ago, AgentXXL said:

Only one of the x8 slots is connected to the 5520 chipset whereas the other one is connected to the ICH10 PCH. I will be moving the card to the other x8 slot once the parity rebuild completes.

Both x8 slots are connected to the chipset; though it's an older FSB design, it's still much better than the PCIe x4 slot that uses the DMI.

 

Also make sure the HBA and expander are connected using dual link (two cables).

 

Restarting from the beginning using a x8 PCIe 2.0 slot with dual link should be faster than waiting 3 days, should...

13 hours ago, JorgeB said:

Both x8 slots are connected to the chipset; though it's an older FSB design, it's still much better than the PCIe x4 slot that uses the DMI.

 

Also make sure the HBA and expander are connected using dual link (two cables).

 

Restarting from the beginning using a x8 PCIe 2.0 slot with dual link should be faster than waiting 3 days, should...

 

The Tylersburg 5520 only supports PCIe 2.0. The Supermicro CSE-847 model I have only supports low-profile cards, and I only have the one LSI 9217-8i, which only has 2 x SFF-8087 miniSAS ports, so I can't do dual link cabling. I had tried dual link cabling before using a 9201-16i with 4 miniSAS ports, but it's a full-height card so I couldn't put the chassis cover back on. My testing showed that dual link made minimal difference at the time, but it may help now that I've reset and reconfigured the motherboard BIOS (details later in this post). I may give it another try after the rebuild completes, but if it works I'll need to purchase a PCIe slot extender so I can lay the 9201-16i horizontally in the chassis (allowing me to put the cover back on). Here's the board layout:

 

[Attached image: x8DTN_BoardLayout.jpg]

 

Regardless, I wanted to try the other x8 slot. However, upon opening the chassis, I noticed that the HBA was actually installed in the x4 slot (physical slot 5, bus 3). OOOPS! Not sure how I did that, but I then moved the HBA to the 1st x8 slot (physical slot 6, bus 9). In this config unRAID no longer sees my drives; the HBA itself is seen and the link speed is x4. I captured diagnostics while in this config and will start comparing the syslogs. Here's the link speed info from the 1st x8 slot:

 

Oct 31 16:52:56 AnimNAS kernel: pci 0000:09:00.0: 32.000 Gb/s available PCIe bandwidth, limited by 5 GT/s x8 link at 0000:00:03.0 (capable of 63.008 Gb/s with 8 GT/s x8 link)

 

I then moved the HBA to the 2nd x8 slot (physical slot 4, bus 8) and decided to look through the motherboard BIOS. I loaded the BIOS defaults, then went through disabling items that didn't need to be enabled, like the floppy controller, the serial ports and the option ROM support for the PCI-X slots. Yes, this motherboard is so old it has both PCIe and PCI-X slots.


After re-configuring the BIOS I restarted with the HBA still in the 2nd x8 slot (slot 4). This time the logs show that it also negotiated a link speed of x4, but now the drives were seen and passed through to unRAID. I captured the diagnostics with this config also.


Oct 31 17:03:16 AnimNAS kernel: pci 0000:08:00.0: 32.000 Gb/s available PCIe bandwidth, limited by 5 GT/s x8 link at 0000:00:05.0 (capable of 63.008 Gb/s with 8 GT/s x8 link)

 

So at least re-loading the BIOS defaults and the changes I made now have the HBA running at x4. I've started the array and am seeing about 90MB/s for the rebuild, which has reduced the estimated time to 2 days, so at least that's some movement forward. I'll spend some time comparing the syslogs to see why the drives aren't seen in the other x8 slot, even though the HBA is (and also at a x4 link speed).
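A quick sanity check that 90MB/s really does work out to the new 2 day estimate (rough math, again ignoring the inner-track slowdown):

TB = 1e12
seconds = 16 * TB / 90e6              # bytes / (bytes per second)
print(f"{seconds / 86400:.1f} days")  # ~2.1 days, matching unRAID's estimate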

 

Note: After reloading and reconfiguring the motherboard BIOS I did retry the HBA in slot 6. It still negotiated a x4 link speed but the drives were not seen or passed through to unRAID. Hopefully comparing the syslogs will reveal why the drives aren't seen when the HBA is in slot 6.

 

9 hours ago, AgentXXL said:

LSI 9217-8i, which only has 2 x SFF-8087 miniSAS ports, so I can't do dual link cabling.

That is all you need for dual link.

 

9 hours ago, AgentXXL said:

This time the logs show that it also negotiated a link speed of x4

Where are you seeing this?

9 hours ago, AgentXXL said:

limited by 5 GT/s x8 link

This is showing an x8 PCIe 2.0 link, as it should be.

8 hours ago, JorgeB said:

That is all you need for dual link.

If I only had one backplane in my Supermicro CSE-847... but this is a 36 bay unit with a 24 port backplane at the front (same as a CSE-846) and a 12 port backplane (same as a CSE-836) in the rear, so dual link on both backplanes requires 4 ports. I thought about re-using my Dell H310 in the other slot, which would give me the 2 extra ports needed for dual link, but that'll have to wait as the rebuild is already halfway done.

 

Quote

Where are you seeing this?

In the syslog; the line that I copied into the post. But I just realized that since my board is only PCIe 2.0, the max possible is 5 GT/s per lane. The message unRAID reported says 32Gb/s out of a possible ~63Gb/s, which would only be possible with PCIe 3.0 or higher, not on PCIe 2.0. My mistake...

 

Quote

This is showing an x8 PCIe 2.0 link, as it should be.

 

Yes, now that it's a new day and I see my mistake, I agree that it's running a x8 PCIe 2.0 link. Since I'm only seeing 90MB/s in actual use, it confused my muddled brain.

 

So for now I'm running at the fastest I can given the age of the motherboard. I'm still puzzled as to how the heck I installed the 9217-8i in the x4 slot. I'm sure the Dell H310 was in the 1st x8 slot, but obviously I goofed when replacing it with the new 9217-8i.

 

Thanks again for your help (and patience!) with my issues.

 

 

14 hours ago, AgentXXL said:

this is a 36 bay unit with a 24 port backplane at the front (same as a CSE-846) and a 12 port backplane (same as a CSE-836) in the rear, so dual link on both backplanes requires 4 ports.

Ahh, missed that. In that case yes, though you could run dual link to the front backplane and then cascade from it to the back one; that should still give better performance, though probably not much of a difference with PCIe 2.0. PCIe 3.0 would be required for the slot itself not to be a bottleneck with dual link. Still worth trying.

