snowmirage Posted August 20, 2018

I have a collection of 20 assorted 2TB disks I have slowly added to over the last few years while expanding different storage server projects. None of them have been heavily used by me, and many were refurbished drives when I purchased them. They've been in storage for about a year and a half while I built a new system and moved.

Finally building my new Unraid server (6.5.3), I started running the preclear script provided with the Preclear plugin from Community Applications. I was expecting a few of the disks to have some bad sectors, or even a few with SMART errors warning of impending doom, but so far 9 out of the 20 2TB drives have failed the script: 3 failed during the pre-read verification and 6 failed while zeroing the disk. 11 of the tests are still running.

All the 2TB drives are connected via an IBM M1015 + HP SAS expander. I've read through several posts about this preclear script and what it does, thinking I had missed something that would explain at least some of these failures other than failing drives, but I'm coming up empty-handed. Before I toss hundreds of dollars in the bin and spend hundreds more to replace them: does it appear from the logs that all these drives have really gone bad?

I've attached the preclear log below. I'll also attach the full diagnostics log once it finishes collecting the info (it seems to be taking a while...).

PHOENIX-preclear.disk-20180820-1311.zip
gfjardim Posted August 20, 2018

Hi @snowmirage, thanks for reporting this. Your log is gold and it will take some time to analyze it all, but I saw some weird things in the script. I'll make some changes to it so I can get more information, OK? I'll let you know when to update it so we can debug this.
snowmirage Posted August 20, 2018

Fantastic, thanks!
snowmirage Posted August 20, 2018

If there's anything else I can grab for you from the system, just let me know; happy to.
gfjardim Posted August 20, 2018

@snowmirage, please update your plugin and start new sessions for the drives that failed. Let's see what's happening.
snowmirage Posted August 20, 2018

plugin: updating: preclear.disk.plg
plugin: downloading: https://raw.githubusercontent.com/gfjardim/unRAID-plugins/master/archive/preclear.disk-2018.08.20.txz ... done
plugin: downloading: https://raw.githubusercontent.com/gfjardim/unRAID-plugins/master/archive/preclear.disk-2018.08.20.md5 ... done
Verifying package libevent-2.1.8-x86_64-1.txz.
Installing package libevent-2.1.8-x86_64-1.txz:
PACKAGE DESCRIPTION:
# libevent (event loop library)
#
# libevent is meant to replace the event loop found in event driven
# network servers. An application just needs to call event_dispatch()
# and then add or remove events dynamically without having to change the
# event loop. The libevent API provides a mechanism to execute a
# callback function when a specific event occurs on a file descriptor or
# after a timeout has been reached.
#
# Homepage: http://libevent.org
#
Executing install script for libevent-2.1.8-x86_64-1.txz.
Package libevent-2.1.8-x86_64-1.txz installed.
Verifying package tmux-2.6-x86_64-1.txz.
Installing package tmux-2.6-x86_64-1.txz:
PACKAGE DESCRIPTION:
# tmux (terminal multiplexer)
#
# tmux is a terminal multiplexer. It enables a number of terminals
# (or windows) to be accessed and controlled from a single terminal.
# tmux is intended to be a simple, modern, BSD-licensed alternative to
# programs such as GNU screen.
#
# Homepage: http://tmux.github.io/
#
Executing install script for tmux-2.6-x86_64-1.txz.
Package tmux-2.6-x86_64-1.txz installed.
Verifying package ncurses-6.1_20180324-x86_64-1.txz.
Installing package ncurses-6.1_20180324-x86_64-1.txz:
PACKAGE DESCRIPTION:
# ncurses (CRT screen handling and optimization package)
#
# The ncurses (new curses) library is a free software emulation of
# curses in System V Release 4.0, and more. It uses terminfo format,
# supports pads and color and multiple highlights and forms characters
# and function-key mapping, and has all the other SYSV-curses
# enhancements over BSD curses.
#
# Homepage: https://invisible-island.net/ncurses/
#
Executing install script for ncurses-6.1_20180324-x86_64-1.txz.
Package ncurses-6.1_20180324-x86_64-1.txz installed.
Verifying package utempter-1.1.6-x86_64-2.txz.
Installing package utempter-1.1.6-x86_64-2.txz:
PACKAGE DESCRIPTION:
# utempter (utmp updating library and utility)
#
# The utempter package provides a utility and shared library that
# allows terminal applications such as xterm and screen to update
# /var/run/utmp and /var/log/wtmp without requiring root privileges.
#
Executing install script for utempter-1.1.6-x86_64-2.txz.
Package utempter-1.1.6-x86_64-2.txz installed.
+==============================================================================
| Installing new package /boot/config/plugins/preclear.disk/preclear.disk-2018.08.20.txz
+==============================================================================
Verifying package preclear.disk-2018.08.20.txz.
Installing package preclear.disk-2018.08.20.txz:
PACKAGE DESCRIPTION:
Package preclear.disk-2018.08.20.txz installed.
-----------------------------------------------------------
preclear.disk has been installed.
This plugin requires Dynamix webGui to operate
Copyright 2015-2017, gfjardim
Version: 2018.08.20
-----------------------------------------------------------
plugin: updated

@gfjardim The update looks like it was successful. Starting preclear on all the disks again now; I'll post the logs when they complete or fail again.
snowmirage Posted August 21, 2018

I've noticed that the tests appear to be progressing much faster. On the 2TB HDDs I'm seeing pre-read rates of ~100 MB/s; I'm fairly sure it was significantly less before this update. Tests are still running, of course, but I'll be sure to report back tomorrow evening when some finish.
snowmirage Posted August 21, 2018

Tests are still rolling; they all seem to have gotten much farther, and in less time, so far. Posting a current screenshot and current logs.

PHOENIX-preclear.disk-20180821-0957.zip
snowmirage Posted August 21, 2018

Hmm, that's strange: it looks like the device assignments changed, though I'm sure I have made no hardware changes. sdb was previously one of the Samsung EVO SSDs, and now it's one of the 2TB HDDs (ST2000DM001-1CH164_Z2F0MHL8).
itimpi Posted August 21, 2018

You should never make any assumptions about which sdX device name is assigned to which drive. They are assigned dynamically by Linux during the boot process as the drives come online, so although they tend to stay the same, this is not guaranteed.
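itimpi's point is easy to check directly: Linux keeps stable, serial-number-based symlinks under /dev/disk/by-id, and resolving them shows which sdX node each physical drive currently owns. A minimal sketch, assuming a udev-managed Linux system such as Unraid (the directory may be absent on minimal or containerized systems):

```shell
# Map persistent drive IDs to their current (boot-dependent) sdX nodes.
# /dev/disk/by-id is populated by udev on most Linux distributions.
if [ -d /dev/disk/by-id ]; then
  for link in /dev/disk/by-id/*; do
    case "$link" in *-part*) continue ;; esac   # skip partition entries
    printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
  done
else
  echo "no /dev/disk/by-id on this system"
fi
```

This is why it is safer to match preclear reports to physical drives by serial number rather than by sdX letter.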
snowmirage Posted August 21, 2018

Thanks, that makes sense. I'll keep an eye on this today and post back as soon as the tests finish, which will likely be sometime this evening.
gfjardim Posted August 21, 2018

You probably have 3 drives with read errors: WD-XXXXXXX79570, ST2000DM001-1CH164_XXXXXHL8 and KingDian_S280-240GB_XXXXXXXXX0135. Send me your diagnostics file to be sure.
snowmirage Posted August 21, 2018

Current diagnostics file attached. Thanks for the help, it's much appreciated!

phoenix-diagnostics-20180821-1142.zip
gfjardim Posted August 21, 2018

ST2000DM001-1CH164_XXXXXHL8 has medium errors; please look at the disk's S.M.A.R.T. report to see if it's dying:

Aug 21 06:59:00 phoenix kernel: sd 17:0:0:0: [sdb] tag#30 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Aug 21 06:59:00 phoenix kernel: sd 17:0:0:0: [sdb] tag#30 Sense Key : 0x3 [current]
Aug 21 06:59:00 phoenix kernel: sd 17:0:0:0: [sdb] tag#30 ASC=0x11 ASCQ=0x0
Aug 21 06:59:00 phoenix kernel: sd 17:0:0:0: [sdb] tag#30 CDB: opcode=0x28 28 00 22 dc 1b 38 00 00 08 00
Aug 21 06:59:00 phoenix kernel: print_req_error: critical medium error, dev sdb, sector 584850232
Aug 21 06:59:00 phoenix kernel: Buffer I/O error on dev sdb, logical block 73106279, async page read

The KingDian S280-240GB also has problems:

Aug 20 16:46:28 phoenix kernel: ata6.00: exception Emask 0x0 SAct 0x8000000 SErr 0x0 action 0x0
Aug 20 16:46:28 phoenix kernel: ata6.00: irq_stat 0x40000008
Aug 20 16:46:28 phoenix kernel: ata6.00: failed command: READ FPDMA QUEUED
Aug 20 16:46:28 phoenix kernel: ata6.00: cmd 60/40:d8:00:c4:79/05:00:01:00:00/40 tag 27 ncq dma 688128 in
Aug 20 16:46:28 phoenix kernel: res 41/40:40:00:c4:79/00:05:01:00:00/40 Emask 0x409 (media error) <F>
Aug 20 16:46:28 phoenix kernel: ata6.00: status: { DRDY ERR }
Aug 20 16:46:28 phoenix kernel: ata6.00: error: { UNC }
Aug 20 16:46:28 phoenix kernel: ata6.00: configured for UDMA/133
Aug 20 16:46:28 phoenix kernel: sd 6:0:0:0: [sdj] tag#27 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Aug 20 16:46:28 phoenix kernel: sd 6:0:0:0: [sdj] tag#27 Sense Key : 0x3 [current]
Aug 20 16:46:28 phoenix kernel: sd 6:0:0:0: [sdj] tag#27 ASC=0x11 ASCQ=0x4
Aug 20 16:46:28 phoenix kernel: sd 6:0:0:0: [sdj] tag#27 CDB: opcode=0x28 28 00 01 79 c4 00 00 05 40 00
Aug 20 16:46:28 phoenix kernel: print_req_error: I/O error, dev sdj, sector 24757248
Aug 20 16:46:28 phoenix kernel: ata6: EH complete

Same for the WDC WD20EADS-00S:

Aug 20 20:31:24 phoenix kernel: mpt2sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
Aug 20 20:31:24 phoenix kernel: mpt2sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
Aug 20 20:31:24 phoenix kernel: mpt2sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
Aug 20 20:31:24 phoenix kernel: mpt2sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
Aug 20 20:31:24 phoenix kernel: mpt2sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
Aug 20 20:31:24 phoenix kernel: sd 17:0:13:0: [sdt] tag#2 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Aug 20 20:31:24 phoenix kernel: sd 17:0:13:0: [sdt] tag#2 Sense Key : 0x3 [current]
Aug 20 20:31:24 phoenix kernel: sd 17:0:13:0: [sdt] tag#2 ASC=0x11 ASCQ=0x0
Aug 20 20:31:24 phoenix kernel: sd 17:0:13:0: [sdt] tag#2 CDB: opcode=0x28 28 00 92 d3 de 18 00 00 08 00
Aug 20 20:31:24 phoenix kernel: print_req_error: critical medium error, dev sdt, sector 2463358488
Aug 20 20:31:24 phoenix kernel: Buffer I/O error on dev sdt, logical block 307919811, async page read
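For anyone reading logs like these, the decisive lines are the "print_req_error: critical medium error" ones, since they name the device and the failing sector. A small sketch that pulls those fields out of a saved syslog (the two sample lines are copied from the excerpts above; the pattern is an illustration, not a general syslog parser):

```shell
# Extract "device sector" pairs from kernel medium-error lines.
extract_bad_sectors() {
  grep -o 'critical medium error, dev sd[a-z]*, sector [0-9]*' \
    | sed 's/.*dev \(sd[a-z]*\), sector \([0-9]*\)/\1 \2/'
}

extract_bad_sectors <<'EOF'
Aug 21 06:59:00 phoenix kernel: print_req_error: critical medium error, dev sdb, sector 584850232
Aug 20 20:31:24 phoenix kernel: print_req_error: critical medium error, dev sdt, sector 2463358488
EOF
# prints:
# sdb 584850232
# sdt 2463358488
```

Note that the "logical block" numbers in the Buffer I/O lines differ from the sector numbers only by a factor of 8: the kernel reports 512-byte sectors while the buffer layer here uses 4KB blocks (584850232 / 8 = 73106279, matching the log above).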
snowmirage Posted August 21, 2018

Thank you, thank you, thank you! Seeing those examples will give me a much better idea what to look for myself. So far all but 8 of the 26 drives have finished, and so far all the rest look good. I'll post the logs again when everything is finished, but here's the latest so far in case you want to check something.

PHOENIX-preclear.disk-20180821-1246.zip
Frank1940 Posted August 21, 2018

Just had a failure while testing the plugin (ver. 2018.08.21) on a disk that had been precleared successfully many, many times before without a failure. Let me know if you need additional information. Do you want me to try the 2018.08.21a version to see if there is still a problem?

ROSE-preclear.disk-20180821-1455.zip
gfjardim Posted August 21, 2018

Please try the new 2018.08.21b version I just uploaded.
snowmirage Posted August 22, 2018

Testing of the last of my drives finished overnight; here are the logs from preclear.

PHOENIX-preclear.disk-20180822-0656.zip
gfjardim Posted August 22, 2018

@snowmirage, I only saw 3 errors, is that right?
snowmirage Posted August 22, 2018

That's correct: other than the 3 failed drives you already pointed out, all the other drives passed.
gfjardim Posted August 22, 2018

As I said before, those failed drives appear to be defective. Take a look at their SMART parameters and dispose of those with bad status. Thank you for posting your logs.
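For checking the SMART parameters mentioned above, the attributes that most directly indicate a dying disk are the reallocated, pending, and uncorrectable sector counts. A sketch of what to look for (the attribute table below is hypothetical sample output for illustration; on a real system you would feed the filter `smartctl -A /dev/sdX` output from smartmontools):

```shell
# Filter a smartctl attribute table down to the sector-health attributes.
# Non-zero raw values (last column) on any of these are a bad sign.
sector_health() {
  grep -E 'Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable'
}

# Hypothetical sample output for illustration only:
sector_health <<'EOF'
  5 Reallocated_Sector_Ct   0x0033   095   095   036    Pre-fail  Always       -       212
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1201
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       16
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       16
EOF
```

A drive whose Current_Pending_Sector or Reallocated_Sector_Ct count keeps growing across a preclear pass is generally not worth re-adding to an array.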
snowmirage Posted August 22, 2018

Rgr that, already ordered replacements. Thanks again for the help.
Frank1940 Posted August 22, 2018

The test run is now completed using ver. 2018.08.21b, and it precleared the disk successfully this time. I have attached the preclear diagnostics should you want to review them. Are there any other tests you might want me to run?

ROSE-preclear.disk-20180822-1218.zip
gfjardim Posted August 22, 2018

Not for now, @Frank1940. Everything appears to be running OK now. Thanks for the help.
Archived
This topic is now archived and is closed to further replies.