Should I Build my Array Now Before my SATA Controller Comes?


Synaptix


First time unRAIDer here. I'm anxious to get started on building my array and transferring my files over the wire. I have most of my components, including HDDs, but the only thing I'm missing is my 8-port SATA controller card and hot swap bays. According to my supplier, they won't be here for another week or two.

 

Since I already have my HDDs, would it be fine to connect my HDDs to my onboard SATA ports, start building the array, transfer my files, and then when the controller comes, unplug the drives from the onboard SATA ports and connect them to the controller card?

 

I would think this should work since I'm not RAIDing anything, and the file system remains the same.


I'd recommend preclearing your disks as the first step. It would not hurt to do that, IMO.

 

You just want to mount them safely and provide sufficient cooling.

 

If that is a challenge without the drive cages, I would wait.

 

Depending on the size, it could take ~30-55 hours for one cycle (I run one cycle and personally think it is enough).
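To see where numbers in that range come from, here's a rough back-of-envelope sketch (my own illustration, not taken from the preclear script): one cycle is roughly three full passes over the disk (preread, write, post-read), so the time follows from capacity and an assumed average sequential speed.

```python
# Rough estimate of one preclear cycle: three full passes over the disk
# (preread, write, post-read). The 120 MB/s average is an illustrative
# assumption for a typical spinner across the whole platter.

def preclear_hours(capacity_tb, avg_mb_per_s=120, passes=3):
    """Estimate hours for one preclear cycle at a given average speed."""
    total_bytes = capacity_tb * 1e12 * passes
    seconds = total_bytes / (avg_mb_per_s * 1e6)
    return seconds / 3600

# A 4 TB drive averaging ~120 MB/s:
print(round(preclear_hours(4), 1))   # -> 27.8 hours
```

Bigger disks or slower average speeds push the figure toward the upper end of that 30-55 hour range.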

 

Take your time. Measure twice, cut once. Post back if you have questions along the way.


I'd recommend preclearing your disks as the first step. It would not hurt to do that, IMO.

 

You just want to mount them safely and provide sufficient cooling.

 

If that is a challenge without the drive cages, I would wait.

 

That shouldn't be an issue. So connect the drives to the mobo, click Start Array, and click Format? That preclears them?


There is a thread on the preclear plugin. You should read about that.

 

The plugin runs one of two versions of the preclear script. One version, the original, was written by Joe L. and works great. But the final "leg", the post-read and verify, uses a pretty inefficient way of doing the read and verify that takes twice as long as it needs to. I say it is inefficient, but it is done entirely in shell script, and I believe that is the only way to accomplish the task in pure shell. So I made a tweak that includes an option to use a custom program to remove the inefficiency. Many users use and prefer that version. My version requires placing a small executable called readvz on your flash.

 

I personally don't use the plugin (old habits die hard), but it is well supported and you should find instructions for it easily.

 

You need to boot unRAID to use the plugin, but you don't need to have an array defined (at least I don't think you do).

 

Post back if you have trouble and I can help.

 

Depending on your memory, you can run a few preclears in parallel. I don't know the exact number, but 4 should be safe.


One more thing ... there is a lot to know before doing your build. If you don't know about preclear, I expect there is a lot you don't know. I suggest you review the wiki and "sticky" threads and get familiar.

 

(Look in the plugin support subforum for the preclear plugin. It is one of the top stickies. There are a number of others you might want to read about as well. The unassigned devices plugin is very useful to those new to Linux.)


BTW - the preclear script does two things. First, it tests the disk by running a preread, a write, and then a post-read. I won't go into the whys and wherefores, but this works in conjunction with the drive's SMART system to identify problems. If the disk has bad or marginal sectors, it will tend to highlight this fact and let you RMA defective disks before putting valuable data on them.

 

The other thing is it leaves the disk with a "preclear signature". This is recognized by unRAID and allows instant adding of a disk to an array. Otherwise unRAID needs to clear the disk when you add it - a lengthy process that does not include the safety checking above.

 

You don't need the preclear signature when defining a new array, but you do need it for adding a disk to an existing array. But the reason you'd be preclearing is for the testing function.


Since I already have my HDDs, would it be fine to connect my HDDs to my onboard SATA ports, start building the array, transfer my files, and then when the controller comes, unplug the drives from the onboard SATA ports and connect them to the controller card?

 

I'm just wondering why, because I would use up the onboard ports first, in preference to those on a controller card, nearly every time?

 


Since I already have my HDDs, would it be fine to connect my HDDs to my onboard SATA ports, start building the array, transfer my files, and then when the controller comes, unplug the drives from the onboard SATA ports and connect them to the controller card?

 

I'm just wondering why, because I would use up the onboard ports first, in preference to those on a controller card, nearly every time?

Absolutely, unless there is some known issue with the mobo ports.

Since I already have my HDDs, would it be fine to connect my HDDs to my onboard SATA ports, start building the array, transfer my files, and then when the controller comes, unplug the drives from the onboard SATA ports and connect them to the controller card?

 

I'm just wondering why, because I would use up the onboard ports first, in preference to those on a controller card, nearly every time?

Absolutely, unless there is some known issue with the mobo ports.

 

It might just be because I'm OCD about ordering lol. I have six onboard SATA ports, 2 of which are SATA3 and 4 are SATA2. One of the SATA3 I'm using for my cache drive. My hot swap bays that are coming will be 5 in 3. I don't really want to have 4 of the HDDs using mobo SATA and 1 of the HDDs on a controller card.

 

Technically, I could just use all 5 (4 x SATA2, 1 x SATA3) for a hot swap bay, but I'm wondering if that'll have any impact on drive speeds. It's my understanding that platter drives (namely 5400 and 7200 RPM) will never go above 300 MB/s anyway, which is the speed of SATA2.

 

Would I be fine to just have one of my hot swap bays on all mobo SATA ports even if one of them is SATA3?


Since I already have my HDDs, would it be fine to connect my HDDs to my onboard SATA ports, start building the array, transfer my files, and then when the controller comes, unplug the drives from the onboard SATA ports and connect them to the controller card?

 

I'm just wondering why, because I would use up the onboard ports first, in preference to those on a controller card, nearly every time?

Absolutely, unless there is some known issue with the mobo ports.

 

It might just be because I'm OCD about ordering lol. I have six onboard SATA ports, 2 of which are SATA3 and 4 are SATA2. One of the SATA3 I'm using for my cache drive. My hot swap bays that are coming will be 5 in 3. I don't really want to have 4 of the HDDs using mobo SATA and 1 of the HDDs on a controller card.

 

Technically, I could just use all 5 (4 x SATA2, 1 x SATA3) for a hot swap bay, but I'm wondering if that'll have any impact on drive speeds. It's my understanding that platter drives (namely 5400 and 7200 RPM) will never go above 300 MB/s anyway, which is the speed of SATA2.

 

Would I be fine to just have one of my hot swap bays on all mobo SATA ports even if one of them is SATA3?

 

There are actually advantages to having different drives on different controllers from a performance perspective.

 

You should have no problem using all of your ports.

 

SATA1 is 150 Gb/sec

SATA2 is 300 Gb/sec

SATA3 is 600 Gb/sec

 

Hard drives (non-SSD) can surpass the SATA1 standard - but the drive would have to be fast, and even then not by a lot. Using a SATA1 port would not be bad.

 

No spinner can surpass the SATA2 standard, but SSDs can.

 

SATA3 is recommended for SSDs.

 

More insidious restrictions can be introduced on a controller. For example, each port might be able to provide SATA2 speed, but if all ports are running in parallel, you might hit a maximum that is more restrictive. Think of it like water coming into your house. You have faucets that can each put out 1 gallon/minute. Your whole house can take in 5 gallons/minute. You turn on that 6th faucet and everyone's water slows down. Using two controllers is like having 2 water lines - some faucets on one and some on the other. You can run 5 faucets off one, and 5 faucets off the other - all at full speed.
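The water analogy above can be sketched numerically. The figures here are hypothetical (a controller with ~300 MB/s ports behind an uplink that can only move ~1000 MB/s in total), just to show the shape of the bottleneck:

```python
# Illustration of the shared-uplink bottleneck described above.
# Hypothetical numbers: each port can do SATA2-class ~300 MB/s, but the
# controller's uplink caps aggregate throughput at 1000 MB/s.

def per_drive_throughput(active_drives, port_limit_mb=300, uplink_limit_mb=1000):
    """Each drive gets the full port speed until the uplink saturates."""
    if active_drives == 0:
        return 0.0
    return min(port_limit_mb, uplink_limit_mb / active_drives)

for n in (1, 3, 4, 8):
    print(n, "drives ->", per_drive_throughput(n), "MB/s each")
# Per-drive speed drops below 300 MB/s once more than 3 drives run at once.
```

Splitting the same drives across two controllers (or the motherboard plus a card) is the "second water line": each group saturates its own uplink independently.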

 

Generally speaking, the fastest and best ports on the server are the motherboard ports. A good controller comes close, and can, in some situations, exceed the speed / total capacity of the motherboard ports. But you're talking about a maximum speed. The real-world performance change is so negligible it's not even worth noting. If you are looking for absolute performance - look at RAID5 or RAID6. Much faster than unRAID. But unRAID has a number of advantages, including a very low risk of losing the entire array, use of different-sized drives, and easy growth over time.

 

So short answer - there is no problem using all of your SATA ports. Spread your disks across them. I have found some very slight issues preclearing on some controllers, and I prefer to preclear on the motherboard. (So I know which of my drive cages are hooked to motherboard ports, and will often swap an existing drive to a secondary controller to do a preclear there.)

 

Direct your OCD at sharing the load across the controllers if you must, but really, this is nothing to be concerned about in this day and age. Back in the PCI days it was different - you could really bottleneck your system. But you'd really have to work at it to do that with modern motherboards and PCIe controllers.


Come on, what's a couple of decimal places between friends?  :)

 

SATA1 is 1.50 Gb/sec

SATA2 is 3.00 Gb/sec

SATA3 is 6.00 Gb/sec

 

Thanks for all the great info. I think I'm going to do just that then with one of my hot swap bays. Since I have 5 free onboard SATA ports, one of my bays will use all the mobo ports and the others will use the controller cards.

 

My preclear will be done on four drives tomorrow, and I'll begin to preclear the next two when they're finished. Once the first four are finished preclearing, is it safe to set one to Parity, and begin building the array and moving data over to the drives?

 

Once the controller cards and hot swap bays come, I'll be doing a bunch of rearranging and cleaning up the wiring in my case. There's a big chance that the ports the HDDs are currently connected to will be changed; this shouldn't affect the array once it's built, correct?


Correct -- you can freely move the drives to different ports.

 

In fact, you can build the initial array with less than all of your drives; do the initial parity sync; and then you can add more drives without the need to rebuild parity as long as the drives have been pre-cleared (actually, with v6 that's not even necessary -- UnRAID will automatically clear the drive before adding it if you haven't pre-cleared it).

 


The feature of unRaid that clears disks is not nearly as good as the community preclear.

 

The community preclear will do a very effective job of testing out the drive, and if the drive is looking shaky, gives you the opportunity to RMA it before it gets into the array. With unRaid's built-in clear, it clears the disk and puts it in the array - that's it.

 

There are a lot of different load strategies, but the one I like best is setting up the array with parity protection and then starting to load it over the LAN. It puts your server to the test, and if there are loose connections (see later in this post), it will tend to highlight that fact. While you still have the source disks, it tends to be a no-risk load. Once all the data is loaded, a parity check is run, and the SMART reports all look good, you are ready to trust the array.

 

But many people have some new disks to prime the array, and plan to copy data to the array from existing disks, and then add the copied disks to the array to receive data from other existing disks. There may be a couple of cycles of that. There is no problem with doing this so long as the data on the array is protected and the copied data is verified. But it is better if the source data is retained as a backup.

 

The last category is those that are planning to migrate a lot of data. Load times in the days to perhaps 10 days are fine, but if you've got enough data that it is going to take weeks or more, the load can be significantly shortened by delaying the introduction of parity. (You can assume 40 MB/sec for a quick and dirty estimate of load time.) Parity slows down write speeds, and also makes loading 2 or more disks in the array simultaneously very inefficient. If you configure the array before adding parity, you can load it 2x faster over your LAN. And if you temporarily install the disks you want to copy from into the server, you can go a lot faster, copying data 3x as fast, and loading disks 4 at a time could be up to 12x faster.

If you are going to start reusing disks you've already copied, things get more complicated, because that data is in a new array that is not parity protected. You really need to install parity and check parity, as well as make sure the array disks are healthy, before doing that. So you can do a fast initial load of part of the array, then put parity in place, and then do the remainder of the load. The disadvantage is you're not exercising your array and burning it in to the same degree.
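The 40 MB/sec rule of thumb above turns into a load-time estimate with simple arithmetic. A quick sketch (the speed is an assumed average, not a measured one):

```python
# Quick-and-dirty load-time estimate using the 40 MB/s rule of thumb
# for parity-protected writes over the LAN. Capacity in TB.

def load_days(data_tb, mb_per_s=40):
    """Days needed to copy data_tb terabytes at a sustained mb_per_s."""
    seconds = data_tb * 1e12 / (mb_per_s * 1e6)
    return seconds / 86400

print(round(load_days(10), 1))   # 10 TB at 40 MB/s -> 2.9 days
print(round(load_days(50), 1))   # 50 TB at 40 MB/s -> about two weeks
```

Once the estimate crosses into multiple weeks, that's the point where delaying parity (or installing the source disks locally) starts paying off.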

 

Again, the best strategy is to set it up with parity, copy the data, and save the originals as backup.

 

One more thing: the MOST common problem people have with the NAS portion of their unRaid server is drives dropping from the array (appearing to fail) due to bad or loose connections. This is particularly stressful if the user opens the case, exchanges the disk (thinking it failed), and then, while rebuilding it, another disk does the same thing because a cable got nudged while swapping out the first disk.

 

I applaud your investment in drive cages, which makes the second problem all but disappear. But the initial install is still rife with risk of shaky connections that can rear their ugly heads days, weeks, or months later - and with Murphy's assistance, at the least opportune times. I like to use locking SATA cables wherever possible, and where not possible, I make sure I am using cables that have a firm friction connection. That SATA cable that worked great in your old server may be electrically fine, but if the connector has loosened over time and slides on and off the SATA terminal effortlessly, do not use it. Also, you don't want any of the connections to have tension on them. They will come loose. Careful attention to the cabling of your cages will pay off with trouble-free (or at least less troublesome) operation.

 

The adventure begins. Be neat. Take your time. Have fun!


The load strategy you mentioned with loading the data over the LAN is what I plan on doing. Once I build the array, I plan on enabling user shares, making folders, and copying (not moving) the data from the main rig to my unRAID server over the LAN.

 

My next question is, once I assign the drives and click "Start", I know that Parity Sync will begin. Is it safe to copy (again, not move) the data over to the unRAID server while Parity Sync is still occurring, or should I wait until Parity Sync completes before moving the data over?

 

Now all this could be null and void: because I've never done a Parity Sync, and because there's no data on the drives (all zeroes), Parity Sync might just take a split second.

 

 

Edit: I just read that if I enable parity protection now, the write speeds are going to be much slower. What I'm considering doing now is to begin copying the data over the LAN, and once that finishes, enable parity protection. If something messes up, I will still have my original data on my main rig.


The load strategy you mentioned with loading the data over the LAN is what I plan on doing. Once I build the array, I plan on enabling user shares, making folders, and copying (not moving) the data from the main rig to my unRAID server over the LAN.

 

That is the best with a new server that you are burning in.

 

My next question is, once I assign the drives and click "Start", I know that Parity Sync will begin. Is it safe to copy (again, not move) the data over to the unRAID server while Parity Sync is still occurring, or should I wait until Parity Sync completes before moving the data over?

 

Safe, yes. Advised, no. It will slow down both operations. It will be much faster to do the parity build, and then do the copy afterwards.

 

Now all this can be null and void because I've never done a Parity Sync and because there's no data on the drives (all zeroes), Parity Sync might just take a split second.

 

In a perfect world, if you entered three pre-cleared disks (or even one precleared parity), unRAID would optimize the process and have it complete instantly. But it is such a rare use case that I would very much doubt that is in place (I've never tried it). You'd have to wait for it to build. It would not be any faster or slower than if the disks contained data.

 

Edit: I just read that if I enable parity protection now, the write speeds are going to be much slower. What I'm considering doing now is to begin copying the data over the LAN, and once that finishes, enable parity protection. If something messes up, I will still have my original data on my main rig.

 

Would not blame you. But, I would suggest saving several hundred gig to be copied AFTER the parity is in place. You really do want to make sure the server is behaving properly, and doing all of your copies with no parity is not that helpful. I'd copy some of the data, build parity, then copy the remaining data (300G +/-) to the array, then do your first parity check. After the parity check, check all the SMART reports, and if all looks good, and parity had no sync errors, you'd be feeling pretty good.


Now all this can be null and void because I've never done a Parity Sync and because there's no data on the drives (all zeroes), Parity Sync might just take a split second.

 

In a perfect world, if you entered three pre-cleared disks (or even one precleared parity), unRAID would optimize the process and have it complete instantly. But it is such a rare use case that I would very much doubt that is in place (I've never tried it). You'd have to wait for it to build. It would not be any faster or slower than if the disks contained data.

This exact scenario was recently discussed here. Looks like you could just check the parity valid box. Note this only applies if all disks are clear; formatted disks are not clear disks.

 

Now all this can be null and void because I've never done a Parity Sync and because there's no data on the drives (all zeroes), Parity Sync might just take a split second.

 

In a perfect world, if you entered three pre-cleared disks (or even one precleared parity), unRAID would optimize the process and have it complete instantly. But it is such a rare use case that I would very much doubt that is in place (I've never tried it). You'd have to wait for it to build. It would not be any faster or slower than if the disks contained data.

This exact scenario was recently discussed here. Looks like you could just check the parity valid box. Note this only applies if all disks are clear, formatted disks are not clear disks.

 

My disks are precleared which I assume means they're cleared. When you say formatted, I'm thinking in terms of formatted file system (NTFS, FAT32, XFS, etc.)

 

You can also enable turbo write, best of both worlds, disadvantage is that all disks will spin up for writes, but IMO not a big deal for the initial copy.

 

I did not even know this existed, I'll look into this as well.

 

Would not blame you. But, I would suggest saving several hundred gig to be copied AFTER the parity is in place. You really do want to make sure the server is behaving properly, and doing all of your copies with no parity is not that helpful. I'd copy some of the data, build parity, then copy the remaining data (300G +/-) to the array, then do your first parity check. After the parity check, check all the SMART reports, and if all looks good, and parity had no sync errors, you'd be feeling pretty good.

 

Would turbo write allow faster write speeds when the Parity Sync has already completed? I should also mention that I have a 500 GB cache drive as well.


This exact scenario was recently discussed here. Looks like you could just check the parity valid box. Note this only applies if all disks are clear, formatted disks are not clear disks.

 

Didn't think of that (you can teach an old dog new tricks).

 

But I would not recommend doing it this way. Building parity is something you want to know works on your server. No shortcuts here!

 

You can also enable turbo write, best of both worlds, disadvantage is that all disks will spin up for writes, but IMO not a big deal for the initial copy.

 

I thought I read that this speedup was debatable. TBH, I have never experimented with turbo-write.

 

But sort of same as above, I'd like to burn in the server doing most/all writes the way writes are normally done. If he also wants to test out with turbo-write - that's good too.

 

Cheers!


... Note this only applies if all disks are clear, formatted disks are not clear disks.

My disks are precleared which I assume means they're cleared. When you say formatted, I'm thinking in terms of formatted file system (NTFS, FAT32, XFS, etc.)

Exactly. It seems like a lot of people don't have the understanding of "formatted" that you do, and that has sometimes led them into trouble.

