Parity prior to Data move


J.Nerdy


I have spent the evening reading and haven't found a clear answer.

 

I am in the process of waiting for my preclear to complete on 16TB (4x 4TB Reds).  I plan on moving 6TB of data to the array when it is ready.

 

Is it wiser to activate parity, pass a check, and copy the data slowly (for disk protection), or to copy the data first and then calculate parity?  How much is array performance degraded while parity is being calculated across it?

 

 


You can have both: go to Settings -> Disk Settings -> Tunable (md_write_method) and select "reconstruct write" (aka turbo write).

 

You'll be protected, and write speed should be at or very close to gigabit.  You may want to turn it off after the initial copy, as it has the disadvantage of requiring that all disks be spun up for writes.


You can have both: go to Settings -> Disk Settings -> Tunable (md_write_method) and select "reconstruct write" (aka turbo write).

 

You'll be protected, and write speed should be at or very close to gigabit.  You may want to turn it off after the initial copy, as it has the disadvantage of requiring that all disks be spun up for writes.

 

ummmm...

 

Yaaaaasssssssssssss! 

 

Cheers, mate.

 

Happy Holidays.

 


You can have both: go to Settings -> Disk Settings -> Tunable (md_write_method) and select "reconstruct write" (aka turbo write).

 

You'll be protected, and write speed should be at or very close to gigabit.  You may want to turn it off after the initial copy, as it has the disadvantage of requiring that all disks be spun up for writes.

 

Sorry for the double post:

 

I was thinking that this should be set for just the parity drive... but that seems counter-intuitive... should all drives (array and parity) use reconstruct write?


You can have both: go to Settings -> Disk Settings -> Tunable (md_write_method) and select "reconstruct write" (aka turbo write).

 

You'll be protected, and write speed should be at or very close to gigabit.  You may want to turn it off after the initial copy, as it has the disadvantage of requiring that all disks be spun up for writes.

 

Sorry for the double post:

 

I was thinking that this should be set for just the parity drive... but that seems counter-intuitive... should all drives (array and parity) use reconstruct write?

 

It's not a per-drive setting => it's for the whole system.  When you set "reconstruct write", writes to the array only require two disk write operations, instead of the four operations (two reads plus two writes) used by the normal read-modify-write process.

 


Also, I wouldn't expect the reconstruct write method to be quite as fast as writes without parity, but it will indeed be MUCH faster than a standard write to the parity-protected array.  Reconstruct writes still require extra disk operations -- reads of all the disks except the one being written to and the parity drive, then writes to those two disks.  Writing without parity enabled simply writes to the disk you're copying the data to.

 

Nevertheless, I'd still do it with reconstruct writes instead of using an unprotected array.

 


Also, I wouldn't expect the reconstruct write method to be quite as fast as writes without parity,...

 

It should be on a 4-disk server, and it can be on bigger servers as long as there are no controller bottlenecks.  Parity check speed is a good indication: turbo write will never be faster than that.

 

I write at gigabit speed to my biggest servers with 22 disks (except when writing to the last cylinders of a disk).


Thank you.

 

Brain is mush between holidays, FAQs, documentation and forums. 

 

Also, using "high-water" should keep the data away from the inner cylinders... so that should keep writes from bottoming out near the end of the transfer.

 

I am thinking of mounting the source disk in the unRAID box and cp'ing, as opposed to copying over the network via Samba.  That should reduce some of the overhead and give a boost in speed (or will the parity-protected write be the bottleneck, as opposed to network I/O?).


Also, I wouldn't expect the reconstruct write method to be quite as fast as writes without parity,...

 

It should be on a 4-disk server, and it can be on bigger servers as long as there are no controller bottlenecks.  Parity check speed is a good indication: turbo write will never be faster than that.

 

I write at gigabit speed to my biggest servers with 22 disks (except when writing to the last cylinders of a disk).

 

Is this with all 1TB/platter disks (or SSDs)? ... or perhaps large sequential writes (where the buffers on the disks are likely to already have the next read data available most of the time)?  I'm sure none of your systems have controller bottlenecks -- you've done an amazing amount of detailed testing on that (THANKS, by the way -- that info is very useful).

 


Thank you.

 

Brain is mush between holidays, FAQs, documentation and forums. 

 

Also, using "high-water" should keep the data away from the inner cylinders... so that should keep writes from bottoming out near the end of the transfer.

 

I am thinking of mounting the source disk in the unRAID box and cp'ing, as opposed to copying over the network via Samba.  That should reduce some of the overhead and give a boost in speed (or will the parity-protected write be the bottleneck, as opposed to network I/O?).

 

Not sure if this would be appreciably faster -- although Johnnie's experience indicates that you CAN saturate a Gb network with turbo writes (I've not seen that, but my test systems don't have all 1TB/platter disks).  In any event, I'd think Gb speed is plenty.

 


Is this with all 1TB/platter disks (or SSDs)? ... or perhaps large sequential writes (where the buffers on the disks are likely to already have the next read data available most of the time)?  I'm sure none of your systems have controller bottlenecks -- you've done an amazing amount of detailed testing on that (THANKS, by the way -- that info is very useful).

 

I'm referring to my HDD servers; some are 1TB-platter only, some still have some 667GB-platter disks, but as long as I'm not writing to the end of a disk I get constant (or very close to) gigabit speed.  Because all disks are accessed at once, turbo write has the same bandwidth limits as a parity check or disk rebuild, so if a server has an 80MB/s parity check, turbo write will never be faster than that.
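As a back-of-the-envelope sketch of that ceiling (the figures below are made-up examples, not measurements from any hardware in this thread):

```python
# Turbo write reads every disk concurrently, so it is capped the same way a
# parity check is: by the slowest concurrently-accessed disk, or by the network.
def turbo_write_ceiling(parity_check_mb_s: float, network_mb_s: float) -> float:
    """Rough upper bound on sustained turbo-write throughput, in MB/s."""
    return min(parity_check_mb_s, network_mb_s)

GIGABIT_PAYLOAD = 112  # roughly what 1 Gb/s Ethernet delivers in MB/s of payload

print(turbo_write_ceiling(180, GIGABIT_PAYLOAD))  # fast disks: network-bound -> 112
print(turbo_write_ceiling(80, GIGABIT_PAYLOAD))   # 80 MB/s parity check caps it -> 80
```

In other words, a server whose parity check runs faster than ~112 MB/s can saturate gigabit with turbo write; one that checks at 80 MB/s cannot.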


Also, I wouldn't expect the reconstruct write method to be quite as fast as writes without parity,...

 

It should be on a 4-disk server, and it can be on bigger servers as long as there are no controller bottlenecks.  Parity check speed is a good indication: turbo write will never be faster than that.

 

I write at gigabit speed to my biggest servers with 22 disks (except when writing to the last cylinders of a disk).

 

3 data, 1 parity (all 4TB Reds) and a cache pool (480GB Intel 530 SATA, 512GB OCZ PCIe).

 

Unfortunately... it's the onboard controller for now (AMD 950).  This was my first server build, so I wanted to use equipment on hand.

 

My wishlist has Supermicro and Skylake E3s on it, though... and a solid HBA to match.

 

 


I plan on keeping cache pool offline until bulk data is transferred.

 

One thing I am struggling with is disk vs. user shares: I see a lot of experienced forum members championing per-disk assignment.  I love the idea of shares spanning volumes, as long as files are not chunked across disks (streaming drops due to spin-up / platter scan).

 

Setting up the hierarchy seems slightly confusing.  The parent directory is set up via shares, and then the children are cascading folders within the share, correct?


Read about split levels => that's how you can control what is allowed to be on a different disk.  If properly set, it works VERY well, but it's often misunderstood.

 

User shares are one of the key features of UnRAID -- allowing a large disk pool to "look" like a single very large storage device.    Some folks prefer to use user shares as effectively "read only" -- i.e. doing all their writes directly to the disk shares, but nevertheless enabling user shares so that when they read the data it's all combined into a single share.

 

Just depends on what you feel most comfortable with.
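One way to picture the split-level idea is as a rule for which part of a path is allowed to spread across disks.  The sketch below is a simplified toy reading of the concept -- the function name and exact semantics are assumptions for illustration, not Unraid's actual allocator:

```python
# Toy illustration of "split level": directory levels up to the split level may
# be created on different disks, while everything below a level-N directory
# stays together on one disk.
def allocation_key(relative_path: str, split_level: int) -> str:
    """Files sharing the same key must land on the same disk (toy rule)."""
    parts = relative_path.strip("/").split("/")
    # Only the first `split_level` directory levels may split across disks;
    # the subtree below them is the unit that stays together.
    return "/".join(parts[:split_level])

# With split level 2 on a TV share laid out as Series/Season/Episode.mkv,
# each season may go to its own disk, but a season's episodes stay together:
a = allocation_key("ShowA/Season01/ep01.mkv", 2)
b = allocation_key("ShowA/Season01/ep02.mkv", 2)
c = allocation_key("ShowA/Season02/ep01.mkv", 2)
assert a == b   # same season -> same disk
assert a != c   # different season -> may be a different disk
```

The practical upshot is the one garycase describes: set the level so the things you stream together (a movie's files, a season's episodes) share a key, and they will never be chunked across disks.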

 


Read about split levels => that's how you can control what is allowed to be on a different disk.  If properly set, it works VERY well, but it's often misunderstood.

 

User shares are one of the key features of UnRAID -- allowing a large disk pool to "look" like a single very large storage device.    Some folks prefer to use user shares as effectively "read only" -- i.e. doing all their writes directly to the disk shares, but nevertheless enabling user shares so that when they read the data it's all combined into a single share.

 

Just depends on what you feel most comfortable with.

 

I am digging into the "unofficial manual" and split levels right now.  The primary benefit of disk shares seems to be granular control of file/data allocation.  If I can accomplish that intelligently through split levels, then user shares seem the best of both worlds.

 

Also, thank you both:  @garycase, @johnnie.black - great community in these forums from what I see.

 

Happy Holidays.


I plan on keeping cache pool offline until bulk data is transferred.

 

One thing I am struggling with is disk vs. user shares: I see a lot of experienced forum members championing per-disk assignment.  I love the idea of shares spanning volumes, as long as files are not chunked across disks (streaming drops due to spin-up / platter scan).

 

Setting up the hierarchy seems slightly confusing.  The parent directory is set up via shares, and then the children are cascading folders within the share, correct?

 

User shares offer much more flexibility than disk shares.  If you don't want to mess with split levels, you can keep it simple by restricting a user share to one drive.  After you've written a significant amount of data to that drive, simply change the setting to another drive.

When you restrict a share to a given drive, that restriction only applies to writing data, not reading it.  So if you have a share called "Movies" that was once restricted to drive 1, and you later changed that to drive 2, it will now only write to drive 2, but the end user accessing the Movies share over the network will see all the files on both drives.

If you don't want to actively manage it that closely, allow the share to span multiple drives, but keep split level turned completely off.  Set the share to use "most free" and files go to the drive with the most free space, and the file tree is never split -- so a single movie will have all of its files copied to the same drive, the one that has the most free space.

Finally, you can always move files from one drive to another manually to free up space on one drive and fill in on another.  Simply telnet in and use the Midnight Commander file browser that is built into Unraid.  I use this all the time to move things around.  The shares don't care what disk something is on, so you are free to rearrange your data all you want.
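The manual rearranging described above -- moving a file between disk shares while the user share keeps seeing it at the same path -- can be sketched like this (the helper name and paths are illustrative; Midnight Commander does the same thing interactively):

```python
# Move a file from one data disk to another, preserving its share-relative
# path so the merged user-share view (/mnt/user/<path>) is unaffected.
import pathlib
import shutil

def move_between_disks(rel_path: str, src_disk: str, dst_disk: str, mnt="/mnt"):
    """e.g. move_between_disks("Movies/Film/Film.mkv", "disk1", "disk2")."""
    src = pathlib.Path(mnt, src_disk, rel_path)
    dst = pathlib.Path(mnt, dst_disk, rel_path)
    dst.parent.mkdir(parents=True, exist_ok=True)  # recreate the folder tree on the target disk
    shutil.move(str(src), str(dst))                # frees space on src_disk
```

Because the user share is just the union of the same relative path across all disks, the file appears unchanged to anyone browsing the share over the network.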


Solid suggestions.

 

After reading about split levels, allocation method, minimums and disk assignment... I think I can make user shares work for me:

 

Allocation:  Most Free

Min Free: 50GB

Split Level: based on share

 

Movies > Genres > Movie Folder > Movie - split level 2 (disk 1, disk 2)

TV > Series > Season > Shows - Split level 2 (disk 2, disk 3)

Photos > Albums > Pictures  Split level 2 (disk 4)

Music > Albums > Songs  (disk 4)

Documents > Folder assignment > Folder assignment > Docs (disk 4)

Backups > Sources > Dates > images  (disk 4)

 

I will keep each type as a top level "share" and disseminate down from there.

 

Or at least that is what I am leaning to
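A toy model of the "Most Free" + minimum-free plan above -- the skip-a-disk-below-the-floor behavior is an assumption for illustration, not Unraid's exact logic:

```python
# Pick the disk with the most free space, skipping any disk that the write
# would push below the minimum-free floor. Sizes are in GB for simplicity.
MIN_FREE_GB = 50

def pick_disk(free_gb: dict, file_gb: float, min_free_gb: float = MIN_FREE_GB):
    """free_gb maps disk name -> free space; returns the target disk, or None."""
    candidates = {d: f for d, f in free_gb.items() if f - file_gb >= min_free_gb}
    if not candidates:
        return None  # no disk can take the file without breaching the floor
    return max(candidates, key=candidates.get)  # most free space wins

disks = {"disk1": 1200, "disk2": 900, "disk3": 60}
print(pick_disk(disks, 5))           # disk1: the most free space
print(pick_disk({"disk3": 60}, 20))  # None: would leave under 50GB free
```

The minimum-free floor is what keeps a nearly full disk from being chosen for a file that won't fit comfortably.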

 


I would not recommend using most free unless the split level and quantity of files written mostly avoid constant disk switching.

 

That's less of an issue if you're writing, say, multiple TV seasons and each season goes to one disk based on the split level; but if it's constantly switching from disk to disk you'll get about half the write speed, due to the parity writes overlapping.

 

See the examples below, writing over gigabit with turbo write; the actual write speed is the speed reported for the parity disk.

[Screenshot: most_free.png]

[Screenshot: highwater.png]


Thanks j.b.

 

My initial assumption was high-water... but then I thought most-free would promote disk "equanimity" (evenly filled disks).  Thinking it through, though, it would certainly increase disk-switch frequency, since data will be written frequently and in small amounts (episodic vs. bulk).

 

High Water

Min Free: 50GB

 

Movies > Genres > Movie Folder > Movie - split level 2 (disk 1, disk 2)

TV > Series > Season > Shows - Split level 2 (disk 2, disk 3)

Photos > Albums > Pictures  Split level 2 (disk 4)

Music > Albums > Songs  (disk 4)

Documents > Folder assignment > Folder assignment > Docs (disk 4)

Backups > Sources > Dates > images  (disk 4)

 

I find this absurdly interesting. Thanks for the help.

 

 

