Seagate 8TB Shingled Drives in UnRAID



... You'd have to be nuts to use a shingled drive in an array like this.

 

You apparently haven't read about the mitigations Seagate has incorporated into these drives to offset the potential issues with shingled technology.    These are outlined in some detail in the 2nd post in this thread -- the most relevant fact vis-à-vis typical UnRAID usage is "... If you're writing a large amount of sequential data, you'll end up with very little use of the persistent cache, since the drives will recognize that you're writing all of the sectors in each of the shingled zones.  There may be a few cases where this isn't true - but those will be written to the persistent cache, and it's unlikely you'll ever fill it."

 

So the performance "wall" you'll hit with shingled drives simply isn't likely with typical UnRAID usage.  Note that both danioj and ashman70 have VERY large arrays and have been using these drives for well over a year and had NO issues with them.  There are many other users who have had similar experiences -- and virtually NO reports of any significant write performance issues.

 

 

... Every time sustained writes fill that up, the drive locks in a busy cycle until it flushes the cache in a slow stream of read-modify-write cycles to the shingled storage area.

 

Correct => this is the performance "wall" I referred to.  But the simple fact is that actual users of UnRAID who are using these drives have NOT found this to be an issue.    Remember that most UnRAID users write a lot of LARGE media files ... and these files simply won't be using the persistent cache, but will be written directly to the shingled area, so there's no performance "hit" with these files.
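To make the distinction concrete, here is a toy model of the behaviour described above. It is only a sketch: the zone size and persistent cache size are made-up illustrative numbers (not Seagate specifications), the function name is hypothetical, and the model ignores the fact that a real drive drains its cache during idle time.

```python
# Toy model of a drive-managed SMR disk. All sizes are illustrative
# assumptions, not Seagate specifications.
ZONE_MB = 256        # hypothetical shingled zone size
CACHE_MB = 20_000    # hypothetical persistent cache size


def cache_used_after(writes, cache_mb=CACHE_MB, zone_mb=ZONE_MB):
    """Return (cache_mb_used, stalled) after a list of (kind, size_mb) writes.

    "sequential" writes are assumed to fill whole zones and bypass the
    cache, except for a trailing partial zone. "random" writes always
    land in the cache. Idle-time flushing is not modelled.
    """
    used = 0
    for kind, size_mb in writes:
        if kind == "sequential":
            used += size_mb % zone_mb  # only the partial tail zone is cached
        elif kind == "random":
            used += size_mb
        else:
            raise ValueError(f"unknown write kind: {kind}")
        if used >= cache_mb:
            return cache_mb, True  # cache full: the performance "wall"
    return used, False


media_files = [("sequential", 40_000)] * 3   # three 40 GB media files
scattered = [("random", 4)] * 6_000          # 24 GB of small scattered writes

print(cache_used_after(media_files))  # (192, False)
print(cache_used_after(scattered))    # (20000, True)
```

Under these assumptions, large media files barely touch the cache, while sustained small scattered writes are what eventually fill it.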

 

 

... Just because you can do it, doesn't mean that you necessarily should.

 

Agree -- and if these were priced the same as PMR drives, I'd definitely recommend using the non-shingled units.  But many folks are price sensitive -- and at $233.88 for an 8TB Seagate Archive drive vs. $310.97 for an 8TB WD Red drive (current Amazon prices for both) it can make a significant difference in the cost of populating an array => e.g. $770 difference if you're buying 10 drives.
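The arithmetic behind that figure, as a small sketch using the two prices quoted above (the function name is just for illustration):

```python
from decimal import Decimal

ARCHIVE_PRICE = Decimal("233.88")  # 8TB Seagate Archive, price quoted above
RED_PRICE = Decimal("310.97")      # 8TB WD Red, price quoted above


def savings(drive_count):
    """Price difference for drive_count drives, in dollars."""
    return (RED_PRICE - ARCHIVE_PRICE) * drive_count


print(savings(1))   # 77.09
print(savings(10))  # 770.90
```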

 

... and of course there's NO difference in read performance.

 

 

Link to comment

I regularly do a single write of 100-120 GB of data at a time with no speed issues at all. With "reconstruct write" on I nearly always max out the Gigabit connection; it occasionally drops from 113MB/s to 90MB/s.  I have 11 of these shingled drives in the server, 2 of them parity.

Link to comment


I don't know how you guys use unRAID, but I move all my files at night via SyncBackPro.  While write speed is important, read speed is actually more important to me in my use. And I have 20TB+ on unRAID.

 


 

 

Link to comment
  • 2 weeks later...

Just a note on the new brackets: I haven't received mine yet, nor seen a picture of the new design. So they may or may not exist in reality. I'll report back when/if I receive them.

 

Finally received the new Fractal Node 304 brackets that have been redesigned to fit these drives.

Not sure what I was expecting, but all they did was add another mounting hole so the drive can be attached at three points instead of two.

Seeing how my 8TB drive was held in place by rubber bands up until now, I'll call it an improvement!

 

Old bracket, drive can only be mounted to the two holes on the right:

1585r85.jpg

 

New bracket, with the extra mounting hole at top left:

jud83r.jpg

 

 

 

 

 

Link to comment
  • 3 weeks later...

So I have 4 of these disks in my setup (4x 8TB Seagate ST8000AS0002), plus 2x 8TB WD Red disks and 3x 4TB disks (Seagate and WD).

But from the first week of my build (now 2 months old) there were problems with disk1 in my array (one of the 8TB Seagate disks).

The disk was disabled after many sector errors, so I RMA'd it, ordered a new one (it was on my wishlist anyway) and replaced the broken disk with the newly ordered one. All was well for a couple of days, and then disk1 was broken again (remember, this was the new replacement disk). The same thing happened: sector errors, so the drive was disabled. I re-added the disk and it was rebuilt from parity.

Of course the problem returned after a few days. So the status at this moment: a broken disk sent to Seagate, and an 8TB WD disk ordered to replace it. The replacement Seagate disk will be added later.

So you can say my luck with these disks is not joyful at this moment...

Link to comment

And back to the same question: how do you confirm the machine is good, and the same for the cabling? I have not had my machine lock up or crash in years, but my HDD did die a while back. Your "cable" theory is great, but how do you test it? Certainly not by buying another 8TB drive, changing the cable and waiting to see if it dies...

Link to comment

If you can't obtain a new, high quality SATA cable, and check it's working OK, then really I think there's little hope for your problem solving abilities.

 

But, hey, I've only 23 years of computer technician experience, what would I know?  ::)

 

You don't seem to follow. If the cable works, or seems to work, but isn't quite 100.00000%, then how would you know? I mean, I have cables that transfer data 24/7 fine, but your suggestion is that cables may be at fault...  Maybe it's too complicated a question for your brain the size of a planet ;)  Do you sit and copy 200TB of data through that cable and, if no errors are thrown, call that 100%? Same with power cables...  And with power cables you also have the effect of whatever else is drawing on the PSU at the same time, perhaps dropping voltages by a tiny percentage.

It's all well and good to make grand statements like "maybe it's the power/data cables", but when asked to explain your process for checking, you just imply the reader is stupid. That ISN'T what I have come to expect at these forums.

And actually I've been building computers for over 27 years, but I would never suggest that means I know everything. It's all "willy waving" to me...  ;)

 

I actually thought MAYBE you were on to something suggesting that the guy had cable issues... until you responded just now...

Link to comment

 

Disk was disabled and many sector errors.  Same happening sector errors so the drive was disabled.

 

I'd put money on there being nothing at all wrong with either disk, and it's actually a bad SATA cable or power cable.

 

That was what I was thinking, but I replaced the SATA cables and the complete PSU (now a Corsair 550), and after that didn't work I also moved the disk from the SATA controller to the onboard SATA controller. Both gave the same result (disabled disk after sector errors).

Link to comment

 

Disk was disabled and many sector errors.  Same happening sector errors so the drive was disabled.

 

I'd put money on there being nothing at all wrong with either disk, and it's actually a bad SATA cable or power cable.

 

That was what i was thinking, but i replaced the Sata cables (also the complete psu (now have a corsair 550) and also (after that didn't work) changed from the sata controller to the onboard sata controller. both the same result (disabled disk after sector errors).

 

Did you replace the SATA cable with a high-quality locking cable?    Also, with 9 disks 550w SHOULD be enough, but is this one of Corsair's better power supplies or the low-end CX series?  If your 3rd disk has this same issue, I'd replace the power supply with a higher quality unit that also has a bit more capacity (650w would be PLENTY).

 

Meanwhile, I have to agree with your earlier conclusion ...

... So you can say my luck with these disks is not joyful at this moment...

Link to comment

... BTW, this is off-topic, but I'll add my nickel's worth r.e. the off-topic arguments r.e. how to determine if it's a bad cable, etc.

 

=>  I agree it's worth checking both the cables and the power supply -- especially after the 2nd virtually identical failure.  As for whether the drive is having issues, or the interface to the drive is perhaps at fault -- look at the SMART data and see if it's reallocated any sectors (or marked them as pending).  This would indicate bad sectors identified by the drive.

 

=>  After you've replaced the cables, you CAN try to rebuild onto the same drive.  Just Stop the array; unassign the drive; Start the array so it shows as missing;  then Stop the array again and assign the drive back to that slot; and then Start the array.

 

=>  r.e. "... how would you diagnose which cable was "bad".. " ==>  Not sure why this is an issue.  You simply need to trace which cable goes to the slot the drive is in.  If the drives are directly connected, that's trivial;  if they're in a multi-bay hot-swap case like a Norco, then you may need to check the manual (and the cable itself may be an SFF cable instead of a single SATA cable).

 

=>  r.e. how many years you've been doing this.  Largely irrelevant.  I think the quality of your experiences and what you've learned from them is more important than the # of years -- at least once you're past a few years of "apprenticeship".  FWIW I've been building computers for over 41 years (since my first Altair in 1975), and have worked with them for over 53 years.

 

Link to comment

 

Disk was disabled and many sector errors.  Same happening sector errors so the drive was disabled.

 

I'd put money on there being nothing at all wrong with either disk, and it's actually a bad SATA cable or power cable.

 

Without logs and smartctl, putting money on anything is speculative.

 

Drives are very clear about frontend (cable/interface) vs backend (media/head/servo) problems. Specific counters, like 199 (UDMA CRC error count), would indicate a cable problem. But other counters, like 5, 196, 197 and 198 (reallocated, pending and offline-uncorrectable sectors), are not cable related.
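As a rough sketch of that triage, the snippet below pulls raw values out of `smartctl -A`-style attribute rows and reports which side the non-zero counters point to. The function names are hypothetical, the grouping of IDs simply follows the post above, and the sample rows use an invented pending-sector count purely for illustration. It also assumes the raw value column starts with a plain integer, which is not true for every drive.

```python
# Attribute IDs named in the post above; the grouping follows that post.
INTERFACE_IDS = {199}            # UDMA_CRC_Error_Count
MEDIA_IDS = {5, 196, 197, 198}   # reallocated / pending / uncorrectable


def parse_attributes(report):
    """Map attribute ID -> raw value from smartctl attribute rows."""
    raw = {}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            raw[int(fields[0])] = int(fields[9])
    return raw


def diagnose(raw):
    """Return which side of the drive the non-zero counters point to."""
    findings = []
    if any(raw.get(i, 0) > 0 for i in INTERFACE_IDS):
        findings.append("interface")
    if any(raw.get(i, 0) > 0 for i in MEDIA_IDS):
        findings.append("media")
    return findings or ["none"]


# Invented sample rows (the pending-sector count of 8 is made up).
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
"""

print(diagnose(parse_attributes(SAMPLE)))  # ['media']
```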

 

That was what I was thinking, but I replaced the SATA cables and the complete PSU (now a Corsair 550), and after that didn't work I also moved the disk from the SATA controller to the onboard SATA controller. Both gave the same result (disabled disk after sector errors).

 

That's a lot of part swapping without a log check or smart report.

Link to comment

 

Disk was disabled and many sector errors.  Same happening sector errors so the drive was disabled.

 

I'd put money on there being nothing at all wrong with either disk, and it's actually a bad SATA cable or power cable.

 

That was what i was thinking, but i replaced the Sata cables (also the complete psu (now have a corsair 550) and also (after that didn't work) changed from the sata controller to the onboard sata controller. both the same result (disabled disk after sector errors).

 

Did you replace the SATA cable with a high-quality locking cable?    Also, with 9 disks 550w SHOULD be enough, but is this one of Corsair's better power supplies or the low-end CX series?  If your 3rd disk has this same issue, I'd replace the power supply with a higher quality unit that also has a bit more capacity (650w would be PLENTY).

 

Meanwhile, I have to agree with your earlier conclusion ...

... So you can say my luck with these disks is not joyfull at this moment...

 

The Corsair is an RMX550, the one with the modular cables.

I didn't replace it with a high-quality cable; I have a set of SATA cables (the same as for all the other disks).

I don't have any SMART status at this moment because I wanted to get a new disk ASAP; however, I had an error in the log. I googled that error (I don't have it available right now) and the results suggested that the disk was really broken.

Link to comment


 

If you want to get to the bottom of your problem you need to approach it systematically and the key to that is SMART.

 

Link to comment

Here is the SMART report for the disk. I just received word from Seagate that the disk I already sent was indeed broken, and they have just sent me a new one.

This is the SMART data for the disk (2 of 2) that I didn't send to Seagate (I will if it fails again).

 

smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.30-unRAID] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Archive HDD
Device Model:     ST8000AS0002-1NA17Z
Serial Number:    Z840PHNF
LU WWN Device Id: 5 000c50 0925478dd
Firmware Version: AR17
User Capacity:    8,001,563,222,016 bytes [8.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5980 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Fri Dec  2 12:57:26 2016 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (  0)  The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (    0) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (  1) minutes.
Extended self-test routine
recommended polling time:        ( 945) minutes.
Conveyance self-test routine
recommended polling time:        (  2) minutes.
SCT capabilities:              (0x30b5) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   117   099   006    Pre-fail  Always       -       153884704
  3 Spin_Up_Time            0x0003   091   091   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       15
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   081   060   030    Pre-fail  Always       -       4452444632
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       977
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       15
183 Runtime_Bad_Block       0x0032   099   099   000    Old_age   Always       -       1
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   061   052   045    Old_age   Always       -       39 (Min/Max 21/43)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       266
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       400
194 Temperature_Celsius     0x0022   039   048   000    Old_age   Always       -       39 (0 21 0 0 0)
195 Hardware_ECC_Recovered  0x001a   117   099   000    Old_age   Always       -       153884704
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       932 (48 171 0)
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       109234797504
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       198835488429

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%       975         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

 

If the disk fails again, I will post the errors I receive; maybe that will explain it a little better.

Link to comment

And here is the output of the extended test. Everything is still OK at this moment, and the new WD disk does its job well.

 

 

smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.30-unRAID] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Archive HDD
Device Model:     ST8000AS0002-1NA17Z
Serial Number:    Z840PHNF
LU WWN Device Id: 5 000c50 0925478dd
Firmware Version: AR17
User Capacity:    8,001,563,222,016 bytes [8.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5980 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Dec  3 19:26:46 2016 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (  0)  The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (    0) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (  1) minutes.
Extended self-test routine
recommended polling time:        ( 945) minutes.
Conveyance self-test routine
recommended polling time:        (  2) minutes.
SCT capabilities:              (0x30b5) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   118   099   006    Pre-fail  Always       -       198727088
  3 Spin_Up_Time            0x0003   091   091   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       15
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   081   060   030    Pre-fail  Always       -       4460112493
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1007
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       15
183 Runtime_Bad_Block       0x0032   099   099   000    Old_age   Always       -       1
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   061   052   045    Old_age   Always       -       39 (Min/Max 21/43)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       268
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       402
194 Temperature_Celsius     0x0022   039   048   000    Old_age   Always       -       39 (0 21 0 0 0)
195 Hardware_ECC_Recovered  0x001a   118   099   000    Old_age   Always       -       198727088
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       962 (18 143 0)
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       112011471096
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       199789103661

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      1005         -
# 2  Short offline       Completed without error       00%       975         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
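For anyone comparing the two reports, a small sketch of how the counters moved between them. The values are copied from the two smartctl outputs in this thread; converting LBAs to bytes assumes the counters are in 512-byte logical sectors, which matches the reported logical sector size but is not confirmed anywhere in the output itself.

```python
LOGICAL_SECTOR_BYTES = 512  # "Sector Sizes: 512 bytes logical" in the reports

# Raw values copied from the two reports (977 h and 1007 h power-on time).
first = {
    "Power_On_Hours": 977,
    "Total_LBAs_Written": 109234797504,
    "Total_LBAs_Read": 198835488429,
}
second = {
    "Power_On_Hours": 1007,
    "Total_LBAs_Written": 112011471096,
    "Total_LBAs_Read": 199789103661,
}


def deltas(before, after):
    """Difference of each counter between two reports."""
    return {name: after[name] - before[name] for name in before}


change = deltas(first, second)
for name in ("Total_LBAs_Written", "Total_LBAs_Read"):
    terabytes = change[name] * LOGICAL_SECTOR_BYTES / 1e12
    print(f"{name}: {terabytes:.2f} TB in {change['Power_On_Hours']} h")
```

The counters that point to media or cable trouble (5, 197, 198, 199) read 0 in both reports, so there is no delta to compute for them.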

 

Link to comment

I have 6 of these set up in a 40TB array with single parity.

I'm looking to set up 2-3 IP cameras for home security. Does anyone see an issue with using the SMR drives, and unRAID in general, for an NVR use case?

 

I'd be a bit concerned about the constant writes at random locations from 3 different video streams -- depending on the specific distribution of the writes this may fill the persistent cache ... and if that happens performance will drop drastically -- and could effectively "freeze" the system.    No way to know for sure except to try it -- IP cameras are generally not very high bandwidth video, so it may not be an issue at all.    One easy way to avoid any issue would be to have each camera record on a different drive; and to replace your parity drive with a standard PMR drive [IronWolf or a WD Red].
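To put a rough number on "not very high bandwidth", here is a back-of-the-envelope sketch. The 4 Mbit/s per camera is purely an assumed bitrate for illustration (check your cameras' actual stream settings), and the function name is hypothetical.

```python
def nvr_write_load(cameras, mbit_per_camera):
    """Return (MB/s, GB/day) for continuous recording, in decimal units."""
    mb_per_s = cameras * mbit_per_camera / 8     # 8 bits per byte
    gb_per_day = mb_per_s * 86_400 / 1_000       # 86,400 seconds per day
    return mb_per_s, gb_per_day


# 3 cameras at an assumed 4 Mbit/s each.
print(nvr_write_load(3, 4))  # (1.5, 129.6)
```

Under that assumption the sustained rate is small; the open question raised above is the write pattern (several interleaved streams), not the raw throughput.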

 

Link to comment
