SMART disk test results - differing results - Completed: servo/seek failure

fishface · November 18, 2022

I purchased 3 used 6TB hard drives for my second unRAID server.

Before setting them up, I ran disk diagnostics on all 3 HGST HUS726060ALE610 drives using HDDScan, CrystalDiskInfo, and my Linux desktop.

All 3 apps gave the same results for the short SMART test. One drive had some UltraDMA CRC errors, which are most likely down to a suspect cable.

I then installed the drives in a new trial version of unRAID and ran the short SMART test again. One of the drives continually fails the test with the following error: "Completed: servo/seek failure".

The disk appears to work fine, but which SMART test should I believe, and why does unRAID SMART test find this issue? Is it doing different SMART tests to other apps?

JorgeB · November 18, 2022

7 minutes ago, fishface said:

why does unRAID SMART test

There's no Unraid SMART test. SMART tests are done by the disk; Unraid only starts a test and shows the results.
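
As a rough illustration of that split, assuming smartmontools is installed: the host asks the drive to run a test with `smartctl -t short /dev/sdX` and later reads the drive's own self-test log with `smartctl -l selftest /dev/sdX`. The Python sketch below only parses text in the layout smartctl typically prints for ATA drives. The sample log is invented for illustration (it is not output from this thread), and `parse_selftest_log` is a hypothetical helper name.

```python
import re

# Illustrative sample of `smartctl -l selftest /dev/sdX` output for an ATA drive.
# The values are made up for this sketch, not taken from the thread.
SAMPLE_LOG = """\
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: servo/seek failure 80%     30012         -
# 2  Short offline       Completed without error       00%     30001         -
"""

# One log entry: "# N", test type, status text, remaining %, power-on hours, first error LBA.
ENTRY = re.compile(
    r"^#\s*(?P<num>\d+)\s{2,}"
    r"(?P<test>.+?)\s{2,}"
    r"(?P<status>.+?)\s+"
    r"(?P<remaining>\d{1,3})%\s+"
    r"(?P<hours>\d+)\s+"
    r"(?P<lba>\S+)\s*$"
)


def parse_selftest_log(text):
    """Return one dict per self-test entry found in smartctl's text output."""
    entries = []
    for line in text.splitlines():
        m = ENTRY.match(line)
        if not m:
            continue  # header lines and anything unrecognised are skipped
        entries.append({
            "num": int(m["num"]),
            "test": m["test"],
            "status": m["status"],
            "remaining_pct": int(m["remaining"]),
            "hours": int(m["hours"]),
            # The drive itself decides the status; the host only reads it back.
            "passed": m["status"] == "Completed without error",
        })
    return entries


if __name__ == "__main__":
    for e in parse_selftest_log(SAMPLE_LOG):
        print(e["num"], e["test"], "->", e["status"], f"({e['remaining_pct']}% remaining)")
```

Any tool that reads this log from the same drive should see the same entries, since the log is stored on the disk rather than by the host.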

fishface · November 18, 2022

Ah, yes I forgot about that.

Strange how it passes with the other apps. I'm getting the disk replaced anyway, but I did copy ~200GB to it and it appears fine.

I guess unRAID has a better interpretation of the SMART output or something, bit of a puzzle.

Of course, there is a chance the drive failed when being physically moved from one host to another, not that it was dropped or anything.

Edited November 18, 2022 by fishface

fishface · November 18, 2022

Also seems odd, in the diagnostic I ran it says SMART Passed, but only gets 20% through the test and then "Completed: servo/seek failure".

Also, the disk has a blue square. I cannot see what the blue square signifies in the Main disk view; the hover text just says "New device".

I've started a pre-clear on it.

smartctl 7.3 2022-02-28 r5338 [x86_64-linux-5.19.14-Unraid] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: HGST Ultrastar 7K6000
Device Model: HGST HUS726060ALE610
Serial Number: Omitted
LU WWN Device Id: 5 000cca 242c0fb1c
Firmware Version: APGNT517
User Capacity: 6,001,175,126,016 bytes [6.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: In smartctl database 7.3/5405
ATA Version is: ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Fri Nov 18 08:00:44 2022 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is: Unavailable
APM feature is: Disabled
Rd look-ahead is: Enabled
Write cache is: Enabled
DSN feature is: Unavailable
ATA Security is: Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

Edited November 18, 2022 by fishface

JorgeB · November 18, 2022

52 minutes ago, fishface said:

Also seems odd, in the diagnostic I ran it says SMART Passed,

That's normal. The SMART overall-health self-assessment only fails if there's a "failing NOW" attribute, and that's why it's basically meaningless.
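
To make that concrete: the attribute table printed by `smartctl -A` has a WHEN_FAILED column, and a currently failing attribute is marked `FAILING_NOW` there. Below is a minimal Python sketch that scans for such rows, assuming the usual ATA column layout. The sample rows are invented for illustration, and `parse_attributes` and `failing_now` are hypothetical helper names. Self-test results are not part of this table, so they have to be read separately from the self-test log.

```python
# Illustrative sample of the attribute table from `smartctl -A /dev/sdX`.
# The rows are made up for this sketch, not taken from the thread.
SAMPLE_ATTRIBUTES = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       12
"""


def parse_attributes(text):
    """Parse attribute rows into dicts; non-attribute lines are ignored."""
    rows = []
    for line in text.splitlines():
        # Ten whitespace-separated columns; RAW_VALUE may itself contain spaces.
        parts = line.split(None, 9)
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        rows.append({
            "id": int(parts[0]),
            "name": parts[1],
            "value": int(parts[3]),
            "thresh": int(parts[5]),
            "when_failed": parts[8],
            "raw": parts[9].strip(),
        })
    return rows


def failing_now(rows):
    """Names of attributes the drive currently reports as failing."""
    return [r["name"] for r in rows if r["when_failed"] == "FAILING_NOW"]


if __name__ == "__main__":
    rows = parse_attributes(SAMPLE_ATTRIBUTES)
    print("Failing now:", failing_now(rows) or "none")
    # CRC errors show up only as a raw count, without failing the attribute.
    crc = next(r for r in rows if r["id"] == 199)
    print("UDMA CRC error count:", crc["raw"])
```

In this sample nothing is flagged as failing now even though the CRC error count is non-zero, which is the kind of case where the overall verdict still reads PASSED.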

JonathanM · November 18, 2022

5 hours ago, fishface said:

there is a chance the drive failed when being physically moved from one host to another,

More likely the power wiring is sketchy.

fishface · November 19, 2022

Ok, that is something to look into, thanks.

This server currently has no data on it, I'm testing and playing around with it before I fully commit it.

I'm running a pre-clear on it, with the array stopped, and so far (165MB/s to 234MB/s) no issues, it's been running for only a few hours so not long in, time will tell.

My other rig is a low-powered 3-core AMD CPU; this one has an Intel Core i7 2600 @ 3.4GHz, which is a considerable step up from the old rig. I only run a few dockers, so my requirements are quite low.

What I have noticed, as I'm running 2 pre-clears at the same time, is that each pre-clear is using around 9% CPU with dd, and the CPU usage shown in unRAID is ~25-40%. I was a little surprised at the CPU usage, as I had read before that a faster CPU has little impact on pre-clear times. But if each dd is using 9%, then a faster CPU may improve pre-clear times a little.

Either way, these newer-to-me HGST drives are more or less twice as fast as the previous drives. I benchmarked them against one of my older 2TB HGST drives, and this is being borne out in the pre-clear numbers I'm seeing.

itimpi · November 21, 2022

If you are going to use the SMART tests as an indication of drive health then you need to pass both the short and extended tests for a drive. It is not at all unusual for a drive to pass the short test but fail the extended test which is more thorough.

fishface · November 22, 2022

I have run the extended test. 2 of the 3 drives passed; the 3rd, the already suspect drive, failed at 10% like before with a servo failure, so RMA.

My brother-in-law, who was here this weekend, was until very recently a firmware engineer for the hard drive division at Seagate and then Western Digital. He said there are servo track sectors of a sort, and if those sectors start going bad it can impact the servo. Of course, the servo itself could be going faulty as well.

The suspect drive started to get sector errors for the last 98% of the pre-clear; the other drives passed.
