I have 4x 18TB WD181KFGX drives in my Synology DS1019+ that have been running fine for about a year. I decided to move them (and 3x 8TB Red Pros) to my Unraid server to consolidate all my drives in one place (I plan to sell the Synology, as my storage demands have increased).
The 8TB drives moved over flawlessly and have been running for about a month with zero hiccups. I pulled the first 18TB drive to move it over and Unraid flagged a SMART error. I checked the drive in several other SMART test environments and it came back clean, so I put it back in the Syno, rebuilt the array, and everything was fine. Then I pulled a different 18TB drive from the Syno and put it in the Unraid box... it too threw SMART errors. Wut?
Fine... I then ran it through Unraid's SMART tests (both short and extended) and it came back clean. Why is Unraid showing it as having errors if the SMART tests come back clean? And why would it only be happening to the 18TB drives?
Here's the log for the second drive from Unraid:
ATA Error Count: 3
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 3 occurred at disk power-on lifetime: 7465 hours (311 days + 1 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 43 00 00 00 00 00 Error: ICRC, ABRT at LBA = 0x00000000 = 0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
61 20 18 e0 0b 7d 40 08 2d+23:21:11.619 WRITE FPDMA QUEUED
61 20 10 20 13 81 40 08 2d+23:21:11.617 WRITE FPDMA QUEUED
61 20 08 20 11 81 40 08 2d+23:21:11.617 WRITE FPDMA QUEUED
61 20 00 e0 0b 81 40 08 2d+23:21:11.617 WRITE FPDMA QUEUED
61 20 f0 c0 0a 81 40 08 2d+23:21:11.617 WRITE FPDMA QUEUED
Error 2 occurred at disk power-on lifetime: 7465 hours (311 days + 1 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 43 00 00 00 00 00 Error: ICRC, ABRT at LBA = 0x00000000 = 0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
61 60 78 e0 10 7d 40 08 2d+23:19:43.782 WRITE FPDMA QUEUED
61 20 38 60 08 81 40 08 2d+23:19:43.778 WRITE FPDMA QUEUED
61 20 30 c0 06 81 40 08 2d+23:19:43.778 WRITE FPDMA QUEUED
61 20 28 c0 05 81 40 08 2d+23:19:43.778 WRITE FPDMA QUEUED
61 20 20 a0 00 81 40 08 2d+23:19:43.778 WRITE FPDMA QUEUED
Error 1 occurred at disk power-on lifetime: 7417 hours (309 days + 1 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 43 00 00 00 00 00 Error: ICRC, ABRT at LBA = 0x00000000 = 0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
61 20 78 c0 cf 80 40 08 23:20:30.907 WRITE FPDMA QUEUED
61 20 b0 c0 dd 80 40 08 23:20:30.902 WRITE FPDMA QUEUED
61 20 a8 c0 dc 80 40 08 23:20:30.902 WRITE FPDMA QUEUED
61 20 a0 c0 db 80 40 08 23:20:30.902 WRITE FPDMA QUEUED
61 40 98 a0 d9 80 40 08 23:20:30.902 WRITE FPDMA QUEUED
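For anyone reading the log above: all three entries show the same ER value of 84, which the log itself labels as "ICRC, ABRT". Here is a minimal Python sketch of how that hex value breaks down into flags. The bit masks are my assumption of the standard ATA Error register layout (only the commonly seen bits are listed), and `decode_error_register` is just a hypothetical helper name, not part of smartctl or Unraid.

```python
# Assumed ATA Error register bit masks (commonly used bits only).
ERROR_BITS = {
    0x80: "ICRC",  # interface CRC error on the link between host and drive
    0x40: "UNC",   # uncorrectable data error
    0x10: "IDNF",  # requested address not found
    0x04: "ABRT",  # command aborted
}


def decode_error_register(hex_value):
    """Return the names of the flags set in an ER value given as a hex string."""
    value = int(hex_value, 16)
    return [name for mask, name in ERROR_BITS.items() if value & mask]


# ER value taken from the three log entries above.
print(decode_error_register("84"))  # ['ICRC', 'ABRT']
```

So 0x84 is 0x80 (ICRC) plus 0x04 (ABRT), which matches the text smartctl prints next to the registers.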
And here's the attribute page from the first drive (that's back in the Syno):
Final question: what can I do to fix this? I am already RMA'ing one of the drives in hopes that the replacement is fresh and doesn't trigger any SMART warnings. But I don't want to have to RMA every drive unless absolutely necessary. Especially since this seems to be an Unraid-only issue. Suggestions?