Seagate’s first shingled hard drives now shipping: 8TB for just $260



... the timings we have seen so far are assumed because pre_clear is writing zeros!??

 

No ... WHAT's being written doesn't matter.  It's the fact it's written sequentially and quick enough that the drive's firmware recognizes that a full band is being written.    When this happens, the persistent cache isn't used, and thus there's no need for the later moving of the data and full band re-writes that cause such pitiful write performance on the shingled drives.

 

It's NOT any different than writing anything else to the disk IF the "anything else" is being written sequentially and the writes are close enough together in time that the drive recognizes the full band writes.    I'd think there's a chance that drive rebuilds will fall into this category -- but I don't know that for a fact.    If they do, they'll probably happen in a reasonable time frame (< 24 hrs);  if not, they may take the 9+ days that the Storage Review testing indicated.
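To put rough numbers on how big the difference is between the two write paths, here's a back-of-envelope sketch. The throughput figures are illustrative assumptions, not measured values:

```python
TB = 1e12  # bytes

def rebuild_hours(capacity_bytes, throughput_mb_s):
    """Hours to write a full drive at a sustained throughput (MB/s)."""
    return capacity_bytes / (throughput_mb_s * 1e6) / 3600

# Sequential full-band writes, persistent cache bypassed (assumed ~150 MB/s):
fast = rebuild_hours(8 * TB, 150)
# Writes funneled through the persistent cache with band rewrites (assumed ~10 MB/s):
slow = rebuild_hours(8 * TB, 10)

print(f"full-band path: {fast:.1f} h; cached path: {slow / 24:.1f} days")
```

With those assumed rates the full-band path finishes in well under a day, while the cached path lands right around the 9+ days Storage Review saw.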

 

 


Thanks for the Clarity! Excellent response, as always! Sometimes I feel like I'm back at School when you guys get going on this forum!!  ;)

 

If they happen to read this thread, can we get some input from someone who knows how unRAID does a drive rebuild!? Gurus? LT Staff?


<Sigh>  Did you READ what I said?

 

... assuming all drives in the array are the same size.

CLEARLY if the drives are smaller, then the math is different.

 

I read, just did not agree with the premise.

 

The size of the array is not a function of the size of the disks or the number of disks. The size of the array is a function of the amount of data a person has. Whether he uses 2T, 3T, 4T, 6T, or 8T parity results in spreading the data over a different number of disks. When setting up a new array, the larger the parity, the fewer same-size disks are required, the longer the parity build/parity check/disk rebuilds, and the more expensive the redundancy. The smaller the parity, the more same-size disks are required, the shorter the parity build/parity check/disk rebuilds, and the less expensive the redundancy. If the parity is too small to support the array size in the server slots, a larger parity is required.
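That tradeoff is easy to sketch numerically; the 24 TB data figure and the drive sizes below are arbitrary examples, not anyone's actual array:

```python
import math

def total_disks(data_tb, drive_tb):
    """Data disks needed to hold data_tb, plus one parity disk of the same size."""
    return math.ceil(data_tb / drive_tb) + 1

# 24 TB of data spread over different (uniform) drive sizes:
for size_tb in (2, 3, 4, 6, 8):
    print(f"{size_tb} TB drives -> {total_disks(24, size_tb)} disks total")
```

Same data, anywhere from 13 drives down to 4 depending on the size you standardize on.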

 

But when upsizing an array, the larger parity will not necessarily reduce the drive count much if at all for a period of time.

 

The other issue is the amount of data on a single disk. We all know that a 2nd drive failure causes you to lose 2 full disks of data. If you have it backed up, maybe not a concern. But if not, losing 2 2T drives is a very different thing than losing 2 8T drives!

 

If 50T drives were available at a competitive $/T, would you buy them? I'm thinking not. At some point, the drives are too big and the increased size is not useful any more. I'm not sure where that happens - for me I'd want my data split over at least 8 disks so that I was not putting too many eggs in one basket, and not going overboard in buying a $$$ parity.

 

The other consideration is room to grow. A larger parity size will leave more open slots and therefore more room to grow.

 

So there are a lot of variables in deciding your parity size and no one-size-fits-all (pun intended :)) solution.

 

The zeroing time could be used as a guide to the time it would take to rebuild the 8TB drive, without other limiters. I'd expect it to be within 125% of this.

 

I'd agree with this IF (and this is a big IF) UnRAID writes the data during a rebuild in a manner that allows the drive to always recognize that full band writes are being done, so it will skip the use of the persistent cache and the band rewrites that would entail.    If that's not the case, the rebuild will be FAR longer than the zeroing time [Writing zeroes sequentially throughout the entire disk clearly bypasses the persistent cache -- that's clear from the timings we've already seen posted here].

 

The parity build is quite snappy on my array - starting at about 120 MB/sec. I expect that is fast enough, but I yield that it is a high probability, not a certainty. A test case would be needed to confirm.


The size of the array is not a function of the size of the disks or the number of disks. The size of the array is a function of the amount of data a person has.

 

Not a function of the size and number of disks???  Whatever are you measuring as "size"?  Most folks I suspect equate "size" with "capacity" -- which was certainly what I was referring to ... and for that the "size" of an N-disk array is (N-1) * (size of the disks).    So an 8-disk array using 8TB disks = 7x8 = 56TB.    It doesn't matter a whit how much data the user has -- it's still a 56TB array.    How much data somebody has (or expects to acquire) may very well influence what size array he builds ... but the actual size of the array is purely a function of the size and number of disks !!

 

 

The other issue is the amount of data on a single disk. We all know that a 2nd drive failure causes you to lose 2 full disks of data. If you have it backed up, maybe not a concern. But if not, losing 2 2T drives is a very different thing than losing 2 8T drives!

 

Clearly the size of modern drives is why many of us are "itching" for dual fault-tolerance ... just as virtually all data centers now use RAID-6 (or some variant thereof, e.g. 60) instead of RAID-5.

 

But the likelihood of a dual drive failure is higher as the drive count increases ... so using smaller drives to limit the potential data loss is also increasing the likelihood of that loss occurring.
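A toy probability model makes that tension concrete. Everything here is a made-up assumption for illustration: the 5% annual failure rate, the rebuild windows, and the drive counts (roughly equal total capacity from many small drives vs a few big ones):

```python
def second_failure_prob(remaining_drives, afr, rebuild_days):
    """P(at least one surviving drive fails during the rebuild window)."""
    p_one = afr * rebuild_days / 365.0  # per-drive failure prob in the window
    return 1 - (1 - p_one) ** remaining_drives

afr = 0.05  # assumed 5% annual failure rate, same for all drives
# 16 x 4T (15 survivors, assumed 1-day rebuild) vs 8 x 8T (7 survivors, 2-day rebuild):
many_small = second_failure_prob(remaining_drives=15, afr=afr, rebuild_days=1)
few_big = second_failure_prob(remaining_drives=7, afr=afr, rebuild_days=2)

print(f"16x4T exposure: {many_small:.3%}   8x8T exposure: {few_big:.3%}")
```

Under these assumptions the two effects largely offset: more drives raise the chance of a second failure, but bigger drives extend the rebuild window, so the exposures end up in the same ballpark while the loss per event doubles.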

 

... and of course any data you don't want to lose should always be backed up anyway.

 

 

As for the "cost of protection" => it's certainly true that the parity disk costs more as you use larger disk sizes ... which will make the total cost of the disks for a given array size slightly higher;  but the cost of the SYSTEM will almost certainly be lower, since you'll require fewer SATA ports; fewer drive bays;  less power;  and a smaller chassis.

 

It's true that parity checks and drive rebuilds will take longer with larger drives, although it's not a linear function unless the drives have the same areal density, since the larger drives tend to have much higher areal densities and thus faster data rates.  If the fastest possible parity check is your goal, then just build a system using 1TB drives with 1TB/platter areal density (like the 1TB WD Reds).    Of course a maxed-out Pro license using those drives would have the same capacity as a Plus license with five 6TB drives; use five times as much power;  run MUCH warmer; and require 20 additional drive bays and SATA ports  :)    [And if the 6TB drives are WD Reds, they'll also be faster, since they have 20% higher areal density than the smaller drives.]    Of course using 1TB drives would also cost FAR more than using larger drives, since the cost/TB is much higher when you're only buying 1TB drives (a 1TB WD Red is currently $65 at Newegg).
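As a quick sanity check on that non-linearity: a full-drive pass takes capacity divided by average throughput, and the average throughputs below are rough assumptions (denser platters move more data per revolution):

```python
def check_hours(capacity_tb, avg_mb_s):
    """Hours for a full-drive pass at an assumed average throughput (MB/s)."""
    return capacity_tb * 1e12 / (avg_mb_s * 1e6) / 3600

small = check_hours(1, 100)   # 1 TB drive, assumed ~100 MB/s average
large = check_hours(6, 140)   # 6 TB drive, assumed ~140 MB/s (higher areal density)

print(f"1 TB: {small:.1f} h   6 TB: {large:.1f} h   ratio: {large / small:.1f}x")
```

Six times the capacity, but well under six times the check time, because the bigger drive's higher data rate claws some of it back.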

 

 

 

Not a function of the size and number of disks???  Whatever are you measuring as "size"?  Most folks I suspect equate "size" with "capacity" -- which was certainly what I was referring to ... and for that the "size" of an N-disk array is (N-1) * (size of the disks).    So an 8-disk array using 8TB disks = 7x8 = 56TB.    It doesn't matter a whit how much data the user has -- it's still a 56TB array.    How much data somebody has (or expects to acquire) may very well influence what size array he builds ... but the actual size of the array is purely a function of the size and number of disks !!
A purely semantic argument. The size of a person's array has everything to do with the amount of space they need, not with the size of the disks they buy. It's not like a person with a 10 drive build is going to be choosing between buying 10 4T drives and 10 8T drives. :o Yes, at the end of the day adding the capacities of all of their data disks together will be the array's capacity, but the space they need is not driven by (N-1) * (size of a disk). We are not automatons! Look at your own array - is its current size a function of the space you need or the size of the largest disks? Would doubling your parity disk size suddenly have your parity disk protecting twice the data? When you upsize parity do you upsize every disk in the array also?

 

As for the "cost of protection" => Yes, the larger drives cost more, but they're also "protecting" twice as much data, assuming all drives in the array are the same size.
Read that quote carefully. The larger parity (in this mythical array of all same sized disks) protects the data in the array which is sized based on the user's need for space. So the larger parity is protecting only half the number of disks, not twice the amount of data.

 

...many of us are "itching" for dual fault-tolerance
We're in agreement on 2 parity disks being a good thing. Although I will quickly say that the incidence of data loss here due to multi-drive failure is tiny (only one case I know of) compared to the incidence of people just doing the wrong thing and losing data. Closing the gaps that have historically led to data loss is therefore more important for the masses. But dual parity helps the informed users, like you and me, lower the risk of loss.

 

But the likelihood of a dual drive failure is higher as the drive count increases ... so using smaller drives to limit the potential data loss is also increasing the likelihood of that loss occurring.
I hugely question that assumption that you ask to be accepted as fact. Different drives have different failure rates - some well over 20%. Choosing the wrong disks, even if you have fewer of them, will still not yield nearly as reliable an array as one built from smaller but less failure-prone drives. This is a huge issue for me with the new Seagate 8TB (and also with WD-branded drives, which have not fared too well in the studies I have read). But the 8T SMRs are new technology and the jury is still out on their long-term reliability. I'd stack the reliability of an 8 x 4T HGST build against that of a 4 x 8T Seagate build any day.

 

As for the "cost of protection" => it's certainly true that the parity disk costs more as you use larger disk sizes ... which will make the total cost of the disks for a given array size slightly higher;  but the cost of the SYSTEM will almost certainly be lower, since you'll require fewer SATA ports; fewer drive bays;  less power;  and a smaller chassis.
I tend to view the cost of the system (sans disks) as a sunk cost. It makes no sense for a person that builds an 8-disk build to buy a new case, etc. so they can use smaller disks (although there are some innovative ways to very inexpensively grow an array beyond its current disk limit). That person is pushed towards the higher capacity drives, even if uneconomic. If you have a case that supports more disks you can grow the capacity over time and be more selective about when to increase parity size. In a growing array, over time, the higher sunk cost pays for itself by allowing the owner to use disks longer and defer buying larger disks until they overcome the $/T of their smaller-capacity cousins. Higher areal density only gets you so far with unRAID performance. And things like parity builds and checks are typically gated by the smallest disks in the array until their size is passed.

 

... the "size" of an N-disk array is (N-1) * (size of the disks) ... It's true that parity checks and drive rebuilds will take longer with larger drives, although it's not a linear function unless the drives have the same areal density, since the larger drives tend to have much higher areal densities and thus faster data rates.
You argue from the perspective of an array with identical capacity disks. I have never had an array with all the same size disks. Virtually everyone that has had an array for more than a couple years will have a mix of capacities. Arguing as though a person's array uses either this size disk or that size disk ignores the reality that the normal use case is a mixed array.

 

My other point is that there is a point at which larger drive sizes become a hindrance and not an advantage (i.e., my 50T array example). You kinda skipped over that. Thoughts?


The "graceful" multi-drive failure is what attracted me to unRAID.  Unlike a striped array, if a few sectors go bad across a few drives, the whole thing doesn't crash and burn.  The good drives can still be read individually and maybe you can limp along the bad drives enough to get most of your data off.  It's not like you're cruising along with 80tb of data one second and 0tb the next.


The "graceful" multi-drive failure is what attracted me to unRAID.  Unlike a striped array, if a few sectors go bad across a few drives, the whole thing doesn't crash and burn.  The good drives can still be read individually and maybe you can limp along the bad drives enough to get most of your data off.  It's not like you're cruising along with 80tb of data one second and 0tb the next.

Agree completely.  The single biggest feature I like about unRAID.

The "graceful" multi-drive failure is what attracted me to unRAID.  Unlike a striped array, if a few sectors go bad across a few drives, the whole thing doesn't crash and burn.  The good drives can still be read individually and maybe you can limp along the bad drives enough to get most of your data off.  It's not like you're cruising along with 80tb of data one second and 0tb the next.

Agree completely.  The single biggest feature I like about unRAID.

 

I agree but please keep the thread on topic. You have to at least mention 8T SMR! :)


Clearly we've left the road with regard to the topic of the 8TB SMR's  :)

 

I think there's enough "food for thought" in the last couple exchanges for anyone interested to make up their own minds.

 

One note for Brian:  Yes, I DO buy all drives of the same size for my arrays.    If I upsized my parity to a 6TB drive, I'd simply buy a stack of 6TB drives for the array and relegate the old drives to backup usage.  It's silly to buy a 6TB drive and then use a bunch of smaller drives (in that case you would indeed be overpaying for your parity protection).

 

My next backup server will likely be a bunch of 8TB SMRs

 

I think what started this back-and-forth was you disagreeing that the "size of an array is a function of the number and size of the disks"  ==>  I still find it incredible you don't believe that !!    It's certainly true that for a user "...  the space they need is not driven by (N-1) * (size of a disk) " ==> but it is absolutely true that the size of the array they build is indeed (N-1) * (size of a disk) for an array using all disks of the same size ... and even if the disks aren't the same size the size is still driven by the size and number of the disks ... the math is just a bit different.
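The general rule, same-size or mixed, is easy to state in code. This is a sketch of unRAID's capacity arithmetic as described in this exchange; the mixed drive complement is an arbitrary example:

```python
def array_capacity(disk_sizes_tb):
    """unRAID-style capacity: the largest disk serves as parity; the rest hold data."""
    disks = sorted(disk_sizes_tb, reverse=True)
    return sum(disks[1:])  # everything except the parity disk

# Uniform case: 8 disks of 8 TB -> (N-1) * size = 56 TB
uniform = array_capacity([8] * 8)

# Mixed case: the math is "just a bit different" -- sum the data disks
mixed = array_capacity([6, 6, 6, 6, 4, 4, 3, 3])

print(f"uniform: {uniform} TB   mixed: {mixed} TB")
```

Either way, capacity falls straight out of the number and sizes of the disks; how much data the owner actually has never enters the formula.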

 

 

 


 

If I upsized my parity to a 6TB drive, I'd simply buy a stack of 6TB drives for the array and relegate the old drives to backup usage.  It's silly to buy a 6TB drive and then use a bunch of smaller drives (in that case you would indeed be overpaying for your parity protection).

 

I only buy the drives I need at the time. If I upgrade parity I buy 1 data drive. I rarely if ever buy more than 2 drives at a time.

My current array has 4 6TB drives, 2 4TB drives, and 2 3TB drives. At $200-$300 a pop for the 6TBs, I'll upgrade them as I need them.

 

I don't find it silly to do it this way. I find it cost effective since drives get larger and prices drop.

 

I'm interested in the 8TB SMR (there I've said it) as archive backup drives.

I can rsync a spindle to a backup drive, and also keep a dated incremental. 


I don't actually buy ALL of the drives in the same batch, but I do replace all of them within a short time span (within 90 days).  Primarily so they come from different manufacturing runs, as I've had my share of "bad batches" over the years.

 

I am also very interested in the 8TB SMR units -- as I noted, for my next backup server.    I've also considered building another small media server to replicate my entire media collection for use at a vacation home or in an RV ... a mini-ITX chassis with 6 of these would have a very nice storage capacity => and the array would essentially be read-only after the initial load.

 


Gary, don't know anyone that upsizes their arrays like you. My impression is it is high risk. Buying the same model so close to each other. It just tempts the fates!

 

I am more like Weebo. I tend to buy drives a pair at a time as they go on good sales. Mostly HGST, but I did buy a pair of Seagate 4Ts that have worked nicely so far. When the array gets bloated with (comparatively) small drives, I tend to move them out. A pair of 4T drives swapped out 8 1Ts, and a pair of 6Ts replaced 6 2Ts. My rule of thumb is the replacement drives need to be at least 3x bigger than the disks they replace, and the old disks need to have 5 years on them.

 

The next such replacement will be the 3Ts. I think I'll wait for the 10s or 12s to swap them out. I don't see that happening for a while.

 

The 8T SMRs are a possibility. Still waiting for more experiences with the drives to be documented. I am currently not hurting for space so in no rush. But I have a pair of HGST 7200 NAS drives ready to RAID0 when/if I go that route. And I just bought a pair of HGST 6T in preparation for 10 or 12T parity in the future.

 

But I could also see myself stopping at 6T for a while.


The zeroing time could be used as a guide to the time it would take to rebuild the 8TB drive, without other limiters. I'd expect it to be within 125% of this.

 

I'd agree with this IF (and this is a big IF) UnRAID writes the data during a rebuild in a manner that allows the drive to always recognize that full band writes are being done, so it will skip the use of the persistent cache and the band rewrites that would entail.    If that's not the case, the rebuild will be FAR longer than the zeroing time [Writing zeroes sequentially throughout the entire disk clearly bypasses the persistent cache -- that's clear from the timings we've already seen posted here].

 

Hopefully this will prove to be the case -- I suspect either pkn or jtown will do this experiment one of these days and let us know  :)  [Or anyone else who decides to buy a few of these to experiment with.]

I've got one of these on order that's supposed to ship in the next couple of days, and I plan on using it as my parity drive. Once I get it and run a preclear cycle, I'll be able to start a rebuild with the 8TB drive as parity. My data array is currently 1x4TB, 3x3TB, and 1x2TB, so that should give us a pretty good idea of how long a data rebuild will take with an array of varying drive sizes. I'll report my results once I start the rebuild in ~2 weeks.


The zeroing time could be used as a guide to the time it would take to rebuild the 8TB drive, without other limiters. I'd expect it to be within 125% of this.

 

I'd agree with this IF (and this is a big IF) UnRAID writes the data during a rebuild in a manner that allows the drive to always recognize that full band writes are being done, so it will skip the use of the persistent cache and the band rewrites that would entail.    If that's not the case, the rebuild will be FAR longer than the zeroing time [Writing zeroes sequentially throughout the entire disk clearly bypasses the persistent cache -- that's clear from the timings we've already seen posted here].

 

Hopefully this will prove to be the case -- I suspect either pkn or jtown will do this experiment one of these days and let us know  :)  [Or anyone else who decides to buy a few of these to experiment with.]

I've got one of these on order that's supposed to ship in the next couple of days, and I plan on using it as my parity drive. Once I get it and run a preclear cycle, I'll be able to start a rebuild with the 8TB drive as parity. My data array is currently 1x4TB, 3x3TB, and 1x2TB, so that should give us a pretty good idea of how long a data rebuild will take with an array of varying drive sizes. I'll report my results once I start the rebuild in ~2 weeks.

 

How will this be different than what jtown has already done?

 

February 15, 2015, 04:03:50 AM -> estimated speed 44 MB/sec
February 15, 2015, 05:02:28 PM -> 2.03 TB of 8 TB (25%), 108.29 MB/sec, est. finish 919 minutes
February 16, 2015, 02:20:58 AM -> 5.81 TB of 8 TB (73%), 140.39 MB/sec, est. finish 260 minutes
February 16, 2015, 07:47:08 AM -> 7.95 TB of 8 TB (99%), 95.84 MB/sec, est. finish 9 minutes


The rebuilds you're describing won't be useful at all with regards to this discussion.  The question is how long a rebuild would take for one of the 8TB shingled drives ... i.e. would the use of the persistent cache and the need to do extensive band rewrites come into play; or would the rebuild generate writes quickly enough that they would all be recognized as full band writes, and thus skip the persistent cache and band rewrites.    There's a BIG (actually HUGE) difference in the potential rebuild times depending on the answer to that.    I suspect -- based on the pretty good zeroing times that have been shown with pre-clear -- that the rebuilds will do the same thing.    But the Storage Review article that tested this in a RAID-5 environment showed dramatically slower write times ... so the proof will be when somebody actually does it.

 

Rebuilding a standard PMR drive with the shingled drive as parity has no impact on this => we already know that reads aren't an issue with shingled drives.

 

If you're referring to the parity sync ... then yes, that will also be interesting to see.  Given that it's computationally almost identical to a rebuild, that may indeed answer the question r.e. the rebuilds ... EXCEPT that with the drive complement you have there won't be any other drives involved for the last full half of the drive.

 


If you're referring to the parity sync ... then yes, that will also be interesting to see.  Given that it's computationally almost identical to a rebuild, that may indeed answer the question r.e. the rebuilds ... EXCEPT that with the drive complement you have there won't be any other drives involved for the last full half of the drive.

 

As c3 pointed out, jtown already did a parity sync with one of these 8TB drives as the parity drive:

February 15, 2015, 04:03:50 AM -> estimated speed 44 MB/sec
February 15, 2015, 05:02:28 PM -> 2.03 TB of 8 TB (25%), 108.29 MB/sec, est. finish 919 minutes
February 16, 2015, 02:20:58 AM -> 5.81 TB of 8 TB (73%), 140.39 MB/sec, est. finish 260 minutes
February 16, 2015, 07:47:08 AM -> 7.95 TB of 8 TB (99%), 95.84 MB/sec, est. finish 9 minutes

 

Based on the time stamps it's pretty obvious that the entire sync took less than 2 days, which strongly suggests the drive is skipping the persistent cache and just doing full band writes during the sync.
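For reference, the elapsed time and average rate implied by those status posts can be computed straight from the quoted timestamps (using the 7.95 TB position of the final snapshot):

```python
from datetime import datetime

fmt = "%B %d, %Y, %I:%M:%S %p"
start = datetime.strptime("February 15, 2015, 04:03:50 AM", fmt)
last = datetime.strptime("February 16, 2015, 07:47:08 AM", fmt)  # 99% done

elapsed_s = (last - start).total_seconds()
hours = elapsed_s / 3600
avg_mb_s = 7.95e12 / (elapsed_s * 1e6)  # bytes written so far / seconds elapsed

print(f"~{hours:.1f} h to 99%, averaging ~{avg_mb_s:.0f} MB/s")
```

About 28 hours to the 99% mark at roughly 80 MB/s sustained, which is nowhere near the multi-day cached-path times.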


I got faster preclear times on a SATA port.  About 76 hours per cycle. ...

I think I've figured out why your preclearing was so much faster than mine - I was preclearing two drives simultaneously, both connected via a single USB 3.0 to PCIe card. Here is a little table with the read speeds I got yesterday for various configurations of three drives and two cards:

 

        sda    sdb    sdc
Card1   145    ---    ---
Card2   ---    100    100

Card1   145    ---    ---
Card2   ---    ---    145

Card1   100    100    ---
Card2   ---    ---    145

Card1    70     70     70
Card2   ---    ---    ---

Numbers are approximate read speeds, in MB/s, during the first 2% of prereading.

Card1 is Vantec UGT-PC341 4-port USB 3.0 PCIe x1

Card2 is Mediasonic HP1-U34F 4-port USB 3.0 PCIe x1

 

sda, sdb, sdc are all Seagate 8TB USB 3.0 externals.

 

Mobo is Supermicro H8DME-2.


An interesting review of these drives from an Amazon customer. With pics and stats.

 

http://www.amazon.com/gp/aw/review/B00QX0ZGO6/RNCS7U6K0383B/ref=mw_dp_cr?cursor=1&sort=rd

 

Good stuff, thanks for posting.

 

I still saw slowdowns back to the 3MB/s range even with one transfer but they didn't last as long and they were not as frequent.

 

I found this particularly interesting and telling. This phenomenon was more pronounced with 2 simultaneous transfers, but also occurred with even a single transfer. This, I believe, is a by-product of the non-sequential write activity inherent in writes even to a very unfragmented disk. But I believe this tendency will increase with normal disk creation of new files and deletion of old ones. On a highly fragmented disk I think performance will degrade significantly. That appears to be the Achilles heel. The huge cache which Gary alludes to may help, but here is evidence the slowdown is being seen even with a relatively new and unfragmented disk.

 

I continue to see the parity disk as a cautionary area. If all of your disks are unfragmented and performing sequential writes, the parity writes may hum along sequentially also, but if any of your protected disks are subject to a lot of data turnover, parity updates when writing to that disk may very well become a problem. But only time will tell. This is cautionary and not absolute.

 

If I were wanting to transition to using these 8T drives, I would likely go with the RAID0 and follow instructions like the following ...

 

- Buy an Areca RAID card  (e.g., ARC-120x, ARC-123x, ARC-126x, ARC-1280)

- Empty or buy 2 identical 4T fast disks

- Create an 8T RAID0 array

- Compare your 8T RAID0 device size to the device size of the big 8T drives (you really need to know if the RAID0 size is the same size or larger)

- If the RAID0 is slightly smaller, you need to create a small HPA on the 8T drive to make it the same size as, or smaller than, the big boy (I consider this unlikely. I have 2 3T drives in a RAID0 and that was significantly larger than a new 6T drive)

- Then install the big 8T disk outside the array

- Copy static-est data from fast-est PMR disks to the 8T beast with no parity overhead. Fill it up so you'll be able to free up 8T of faster storage in your array. Assume this data will be retained forever.

- Unmount the 8T drive outside the array

- Do a new config including the RAID0 as parity, the new 8T disk as part of the array

- Build parity onto the 8T RAID0 array

- Compare the MD5s of the data copied to the 8T drive (see RFS to XFS instructions)

- Once satisfied all copied properly, delete the files that were copied to the 8T disk from the smaller disks

- Going forward your new writes will be to the smaller disks and your parity will be as fast as possible
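For the "compare the MD5s" step, something like this minimal sketch would do. The paths and the overall approach are illustrative; the RFS-to-XFS instructions referenced above presumably have their own procedure:

```python
import hashlib
import os

def md5_of(path, bufsize=1 << 20):
    """MD5 of one file, read in chunks to keep memory use flat on huge media files."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(bufsize), b""):
            h.update(chunk)
    return h.hexdigest()

def tree_md5s(root):
    """Map each file's root-relative path to its MD5."""
    sums = {}
    for dirpath, _, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            sums[os.path.relpath(full, root)] = md5_of(full)
    return sums

def mismatches(src_root, dst_root):
    """Files that are missing from dst or whose checksums differ from src."""
    src, dst = tree_md5s(src_root), tree_md5s(dst_root)
    return sorted(p for p in src if dst.get(p) != src[p])
```

Run `mismatches("/mnt/disk1", "/mnt/disk_8t")` (hypothetical mount points) and only delete from the smaller disks once it returns an empty list.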

 

It would be nice to be able to reuse the 4T parity (assuming that is your current parity size) and an identical model 4T data disk after copying the data disk to the 8T disk, to become the RAID0 parity disks. I can see an optimization to make that possible.

 

I actually have a pair of 4T disks set aside to either expand my array or for this exact purpose in the future. If someone will post the size of the 8T disk, I can create an Areca RAID0 parity and post its size. From myMain, refer to "Detail View", and look at the column called "Size (k)" for that big 8T drive.


@bjp999 I like that idea for transitioning to those drives in a production environment. Don't think it impacts my scenario though.

 

I'll continue to use 3TB WD Reds in my main server and Seagates in backup, for now. I'll see how these Seagates perform in others' main environments and in my backup scenario, but there's no need to transition my WDs yet as they still have 1.8 yrs of warranty left and are spinning along quite nicely.

 

As for my 3 Seagates here - I imagine it's going to take "a loooong time" to do a continuous backup of my 12TB of data on the main server (unless, because they are zeroed, the backup application I use writes the continuous backup sequentially??), and of course if it's continuous I get a nice sequential parity write too. It's unlikely I'll have much regular file turnover - just weekly filling every Sunday - so I'm hoping all is OK. Proof, as they say, is in the pudding!! :-)

