
Smart failure



Hi, could someone help me interpret this SMART failure? Is this a "replace the disk right now" problem or more of an early warning sign?

 

Following advice gleaned from other help topics, I've run a short SMART test (no errors) and a long SMART test (an error occurred). I've attached the SMART log.

 

I've recently retrieved an older (circa 2017) 8TB Seagate external drive from my parents' house. I think it has sat unused for some time. I've shucked it and put it into my array. When I started the array, Unraid cleared the drive without complaint, and I then formatted it with xfs, again without complaint. I have just started trying to use it (by copying some files on the command line from /mnt/disk1 to this disk) and Unraid has reported an error and disabled the disk.

 

I have also just started using a (new to me) Fujitsu D2607-A21 flavor LSI 9211 HBA card with IT firmware, purchased from a Hong Kong eBay seller. Perhaps that is related; the card hasn't proven itself yet.

 

As I understand it, these two errors are what Unraid is warning me about, but I don't know enough to judge how serious they are.

 

ATA Error Count: 2
    CR = Command Register [HEX]
    FR = Features Register [HEX]
    SC = Sector Count Register [HEX]
    SN = Sector Number Register [HEX]
    CL = Cylinder Low Register [HEX]
    CH = Cylinder High Register [HEX]
    DH = Device/Head Register [HEX]
    DC = Device Command Register [HEX]
    ER = Error register [HEX]
    ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 2 occurred at disk power-on lifetime: 4639 hours (193 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: WP at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 00 00 ff ff ff 4f 00      23:37:42.990  WRITE FPDMA QUEUED
  61 00 00 ff ff ff 4f 00      23:37:42.979  WRITE FPDMA QUEUED
  60 00 e8 ff ff ff 4f 00      23:37:42.964  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00      23:37:42.964  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00      23:37:42.963  READ FPDMA QUEUED

Error 1 occurred at disk power-on lifetime: 4639 hours (193 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: WP at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 00 00 ff ff ff 4f 00      23:37:37.612  WRITE FPDMA QUEUED
  61 00 00 ff ff ff 4f 00      23:37:37.608  WRITE FPDMA QUEUED
  61 00 00 ff ff ff 4f 00      23:37:37.604  WRITE FPDMA QUEUED
  60 00 60 ff ff ff 4f 00      23:37:37.591  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00      23:37:37.591  READ FPDMA QUEUED
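
For anyone reading the log above, here is a minimal sketch of how the register values can be decoded by hand. It assumes the usual 28-bit layout (SN, CL, CH, plus the low nibble of DH) and the standard ATA status bit names; the function names are made up for illustration and are not part of smartctl. Note that 0x0fffffff is the largest value a 28-bit LBA can hold (about 128 GiB at 512 bytes per sector), so on an 8TB drive it is probably a placeholder in this summary log rather than the real failing sector, though that is an inference and not something the log states.

```python
# Hypothetical helpers for decoding the registers printed in the SMART error log.

def decode_lba28(sn, cl, ch, dh):
    """Combine SN, CL, CH and the low nibble of DH into a 28-bit LBA."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn

# Standard ATA status register bits (DSC is obsolete in newer spec revisions).
STATUS_BITS = {
    0x80: "BSY",
    0x40: "DRDY",
    0x20: "DF",
    0x10: "DSC",
    0x08: "DRQ",
    0x01: "ERR",
}

def decode_status(st):
    """Return the names of the status bits that are set."""
    return [name for bit, name in STATUS_BITS.items() if st & bit]

# Values taken from the "registers were" line: ER=40 ST=51 SC=00 SN=ff CL=ff CH=ff DH=0f
lba = decode_lba28(0xFF, 0xFF, 0xFF, 0x0F)
status = decode_status(0x51)

# Power-on lifetime of 4639 hours split into days and hours.
days, hours = divmod(4639, 24)

# Powered_Up_Time wraps when a 32-bit millisecond counter overflows.
wrap_days = 2**32 / (1000 * 60 * 60 * 24)

print(hex(lba), lba)         # 0xfffffff 268435455
print(status)                # ['DRDY', 'DSC', 'ERR']
print(days, hours)           # 193 7
print(round(wrap_days, 3))   # 49.71
```

The status value 0x51 only says that the drive was ready and reported an error; the error register (0x40) is what smartctl labels "WP" here because the failing command was a write.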
 

 

 

ST8000DM004-2CX188_WG800K2R-20210320-1529.txt

tower-diagnostics-20210320-1702.zip

18 hours ago, unburt said:

8TB Seagate external drive

 

The big problem I have with those Barracuda Compute drives is that they're designed only for very light workloads. Apart from the fact that they run quite warm in their fanless plastic boxes, their use case as only occasionally used backup drives is almost ideal. But if you shuck them and install them in a server that's powered 24/7 you're really operating them outside of their design envelope. The fact that they use SMR recording technology[1] is not a major concern for me, but the fact that even a monthly parity check will exceed their Workload Rate Limit[2] of 55 TB/year[3] by a considerable margin is. They are not even intended to be powered continuously (2400 hours/year[3], or a duty cycle of approximately 27%), though I would accept that being powered on but spun down is not as wearing as spinning continuously, especially as they are likely to stay much cooler than inside their plastic enclosures.
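
The arithmetic behind those two figures can be sketched as follows. This is a rough estimate only: it assumes each parity check reads the full 8 TB of the drive once, that checks run twelve times a year, and it ignores all other reads and writes. The constant and function names are hypothetical.

```python
# Back-of-the-envelope check of the workload and duty-cycle figures.

DRIVE_TB = 8                       # capacity of the shucked drive
WORKLOAD_LIMIT_TB_PER_YEAR = 55    # Workload Rate Limit from the datasheet [3]
POWER_ON_HOURS_PER_YEAR = 2400     # rated power-on hours from the datasheet [3]
HOURS_PER_YEAR = 365 * 24          # 8760

def parity_check_tb_per_year(drive_tb, checks_per_year=12):
    """Data read per year if every parity check reads the whole drive once (assumption)."""
    return drive_tb * checks_per_year

yearly_tb = parity_check_tb_per_year(DRIVE_TB)
ratio = yearly_tb / WORKLOAD_LIMIT_TB_PER_YEAR
duty_cycle = POWER_ON_HOURS_PER_YEAR / HOURS_PER_YEAR

print(yearly_tb)                 # 96
print(round(ratio, 2))           # 1.75
print(round(duty_cycle * 100))   # 27
```

Under those assumptions, monthly parity checks alone come to 96 TB/year, roughly 1.75 times the rated limit, before any normal use of the disk is counted.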

 

References:

[1] https://www.seagate.com/gb/en/internal-hard-drives/cmr-smr-list/

[2] https://www.seagate.com/gb/en/support/kb/annualized-workload-rate-005902en/

[3] https://www.seagate.com/www-content/datasheets/pdfs/3-5-barracudaDS1900-10-1802US-en_US.pdf


You make a great point about the workloads that can be expected from the drive that I have. I promise I endeavoured to use it in a very light-duty way (probably contravening some other NAS/unraid principles though). I intended to exclude it from other shares and to write 8TB of seldom accessed files to it now and never touch it again. Basically, I could have kept it as an external drive but it seemed "cleaner" to me to get it inside my PC case rather than having the enclosure sit on a nearby shelf.

 

Perhaps mounting it as an unassigned disk would be an improvement, since it would then not be included in the monthly parity checks.

 

P.S. I really appreciate your citations. With your comments and the cited articles, the issues were much easier for me to understand.

