snowmirage Posted August 20, 2018

I have a collection of 20 assorted 2TB disks I have slowly added to over the last few years while expanding different storage server projects. None of them have been heavily used by me, and many were refurbished drives when I purchased them. They've been in storage for about a year and a half while I built a new system and moved.

Finally building my new Unraid server (6.5.3), I started running the preclear script provided with the Preclear plugin from Community Applications. I was expecting a few of the disks to have some bad sectors, or even a few with SMART errors warning of impending doom, but so far 9 out of the 20 2TB drives have failed the script: 3 failed during the pre-read verification and 6 failed while zeroing the disk. 11 of the tests are still running.

All the 2TB drives are connected via an IBM M1015 + HP SAS expander. I've read through several posts about this preclear script and what it does, thinking I had missed something that would explain at least some of these failures other than failing drives, but I'm coming up empty-handed. Before I toss hundreds of dollars in the bin and spend hundreds more to replace them: does it appear from the logs that all these drives have really gone bad?

I've attached the preclear log below. I'll also attach the full diagnostics log once it finishes collecting the info (it seems to be taking a while...).

PHOENIX-preclear.disk-20180820-1311.zip
gfjardim Posted August 20, 2018

Hi @snowmirage, thanks for reporting this. Your log is gold and it will take some time to analyze it all, but I saw some weird things in the script. I'll make some changes to it so I can get more information, OK? I'll let you know when to update it so we can debug this.
snowmirage Posted August 20, 2018

Fantastic, thanks!
snowmirage Posted August 20, 2018

If there's anything else I can grab for you from the system, just let me know; happy to.
gfjardim Posted August 20, 2018

@snowmirage, please update your plugin and start new sessions for the drives that failed. Let's see what's happening.
snowmirage Posted August 20, 2018

plugin: updating: preclear.disk.plg
plugin: downloading: https://raw.githubusercontent.com/gfjardim/unRAID-plugins/master/archive/preclear.disk-2018.08.20.txz ... done
plugin: downloading: https://raw.githubusercontent.com/gfjardim/unRAID-plugins/master/archive/preclear.disk-2018.08.20.md5 ... done
Verifying package libevent-2.1.8-x86_64-1.txz.
Installing package libevent-2.1.8-x86_64-1.txz:
PACKAGE DESCRIPTION:
# libevent (event loop library)
#
# libevent is meant to replace the event loop found in event driven
# network servers. An application just needs to call event_dispatch()
# and then add or remove events dynamically without having to change the
# event loop. The libevent API provides a mechanism to execute a
# callback function when a specific event occurs on a file descriptor or
# after a timeout has been reached.
#
# Homepage: http://libevent.org
#
Executing install script for libevent-2.1.8-x86_64-1.txz.
Package libevent-2.1.8-x86_64-1.txz installed.
Verifying package tmux-2.6-x86_64-1.txz.
Installing package tmux-2.6-x86_64-1.txz:
PACKAGE DESCRIPTION:
# tmux (terminal multiplexer)
#
# tmux is a terminal multiplexer. It enables a number of terminals
# (or windows) to be accessed and controlled from a single terminal.
# tmux is intended to be a simple, modern, BSD-licensed alternative to
# programs such as GNU screen.
#
# Homepage: http://tmux.github.io/
#
Executing install script for tmux-2.6-x86_64-1.txz.
Package tmux-2.6-x86_64-1.txz installed.
Verifying package ncurses-6.1_20180324-x86_64-1.txz.
Installing package ncurses-6.1_20180324-x86_64-1.txz:
PACKAGE DESCRIPTION:
# ncurses (CRT screen handling and optimization package)
#
# The ncurses (new curses) library is a free software emulation of
# curses in System V Release 4.0, and more. It uses terminfo format,
# supports pads and color and multiple highlights and forms characters
# and function-key mapping, and has all the other SYSV-curses
# enhancements over BSD curses.
#
# Homepage: https://invisible-island.net/ncurses/
#
Executing install script for ncurses-6.1_20180324-x86_64-1.txz.
Package ncurses-6.1_20180324-x86_64-1.txz installed.
Verifying package utempter-1.1.6-x86_64-2.txz.
Installing package utempter-1.1.6-x86_64-2.txz:
PACKAGE DESCRIPTION:
# utempter (utmp updating library and utility)
#
# The utempter package provides a utility and shared library that
# allows terminal applications such as xterm and screen to update
# /var/run/utmp and /var/log/wtmp without requiring root privileges.
#
Executing install script for utempter-1.1.6-x86_64-2.txz.
Package utempter-1.1.6-x86_64-2.txz installed.
+==============================================================================
| Installing new package /boot/config/plugins/preclear.disk/preclear.disk-2018.08.20.txz
+==============================================================================
Verifying package preclear.disk-2018.08.20.txz.
Installing package preclear.disk-2018.08.20.txz:
PACKAGE DESCRIPTION:
Package preclear.disk-2018.08.20.txz installed.
-----------------------------------------------------------
preclear.disk has been installed.
This plugin requires Dynamix webGui to operate
Copyright 2015-2017, gfjardim
Version: 2018.08.20
-----------------------------------------------------------
plugin: updated

@gfjardim The update looks like it was successful. Starting preclear on all the disks again now; I'll post the logs when they complete or fail again.
snowmirage Posted August 21, 2018

I've noticed that the tests appear to be progressing much faster. On the 2TB HDDs I'm seeing pre-read rates of ~100 MB/s; I'm fairly sure it was significantly less before this update. Tests are still running, of course, but I'll be sure to report back tomorrow evening when some finish.
snowmirage Posted August 21, 2018

Tests are still rolling; they all seem to have gotten much farther, and in less time, so far. Posting a current screenshot and current logs.

PHOENIX-preclear.disk-20180821-0957.zip
snowmirage Posted August 21, 2018

Hmm, that's strange: it looks like the device assignments changed, though I'm sure I have made no hardware changes. sdb was previously one of the Samsung EVO SSDs, and now it's one of the 2TB HDDs (ST2000DM001-1CH164_Z2F0MHL8).
itimpi Posted August 21, 2018

You should never make any assumptions about which sdX device name is assigned to which drive. They are assigned dynamically by Linux during the boot process as the drives come online, so although they tend to stay the same, this is not guaranteed.
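itimpi's point is easy to check directly: Linux keeps stable, serial-number-based symlinks under /dev/disk/by-id, and resolving them shows which sdX node each physical drive currently owns. A minimal sketch, assuming a udev-managed Linux system such as Unraid (the directory may be absent on minimal or containerized systems):

```shell
# Map persistent drive IDs to their current (boot-dependent) sdX nodes.
# /dev/disk/by-id is populated by udev on most Linux distributions.
if [ -d /dev/disk/by-id ]; then
  for link in /dev/disk/by-id/*; do
    case "$link" in *-part*) continue ;; esac   # skip partition entries
    printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
  done
else
  echo "no /dev/disk/by-id on this system"
fi
```

This is why it is safer to match preclear reports to physical drives by serial number rather than by sdX letter.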
snowmirage Posted August 21, 2018

Thanks, that makes sense. I'll keep an eye on this today and post back as soon as the tests finish, which will likely be sometime this evening.
gfjardim Posted August 21, 2018

You probably have 3 drives with read errors: WD-XXXXXXX79570, ST2000DM001-1CH164_XXXXXHL8 and KingDian_S280-240GB_XXXXXXXXX0135. Send me your diagnostics file to be sure.
snowmirage Posted August 21, 2018

Current diagnostics file attached. Thanks for the help, it's much appreciated!

phoenix-diagnostics-20180821-1142.zip
gfjardim Posted August 21, 2018

ST2000DM001-1CH164_XXXXXHL8 has medium errors; please look at the disk's S.M.A.R.T. report to see if it's dying:

Aug 21 06:59:00 phoenix kernel: sd 17:0:0:0: [sdb] tag#30 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Aug 21 06:59:00 phoenix kernel: sd 17:0:0:0: [sdb] tag#30 Sense Key : 0x3 [current]
Aug 21 06:59:00 phoenix kernel: sd 17:0:0:0: [sdb] tag#30 ASC=0x11 ASCQ=0x0
Aug 21 06:59:00 phoenix kernel: sd 17:0:0:0: [sdb] tag#30 CDB: opcode=0x28 28 00 22 dc 1b 38 00 00 08 00
Aug 21 06:59:00 phoenix kernel: print_req_error: critical medium error, dev sdb, sector 584850232
Aug 21 06:59:00 phoenix kernel: Buffer I/O error on dev sdb, logical block 73106279, async page read

The KingDian S280-240GB also has problems:

Aug 20 16:46:28 phoenix kernel: ata6.00: exception Emask 0x0 SAct 0x8000000 SErr 0x0 action 0x0
Aug 20 16:46:28 phoenix kernel: ata6.00: irq_stat 0x40000008
Aug 20 16:46:28 phoenix kernel: ata6.00: failed command: READ FPDMA QUEUED
Aug 20 16:46:28 phoenix kernel: ata6.00: cmd 60/40:d8:00:c4:79/05:00:01:00:00/40 tag 27 ncq dma 688128 in
Aug 20 16:46:28 phoenix kernel: res 41/40:40:00:c4:79/00:05:01:00:00/40 Emask 0x409 (media error) <F>
Aug 20 16:46:28 phoenix kernel: ata6.00: status: { DRDY ERR }
Aug 20 16:46:28 phoenix kernel: ata6.00: error: { UNC }
Aug 20 16:46:28 phoenix kernel: ata6.00: configured for UDMA/133
Aug 20 16:46:28 phoenix kernel: sd 6:0:0:0: [sdj] tag#27 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Aug 20 16:46:28 phoenix kernel: sd 6:0:0:0: [sdj] tag#27 Sense Key : 0x3 [current]
Aug 20 16:46:28 phoenix kernel: sd 6:0:0:0: [sdj] tag#27 ASC=0x11 ASCQ=0x4
Aug 20 16:46:28 phoenix kernel: sd 6:0:0:0: [sdj] tag#27 CDB: opcode=0x28 28 00 01 79 c4 00 00 05 40 00
Aug 20 16:46:28 phoenix kernel: print_req_error: I/O error, dev sdj, sector 24757248
Aug 20 16:46:28 phoenix kernel: ata6: EH complete

Same for the WDC WD20EADS-00S:

Aug 20 20:31:24 phoenix kernel: mpt2sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
Aug 20 20:31:24 phoenix kernel: mpt2sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
Aug 20 20:31:24 phoenix kernel: mpt2sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
Aug 20 20:31:24 phoenix kernel: mpt2sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
Aug 20 20:31:24 phoenix kernel: mpt2sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
Aug 20 20:31:24 phoenix kernel: sd 17:0:13:0: [sdt] tag#2 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Aug 20 20:31:24 phoenix kernel: sd 17:0:13:0: [sdt] tag#2 Sense Key : 0x3 [current]
Aug 20 20:31:24 phoenix kernel: sd 17:0:13:0: [sdt] tag#2 ASC=0x11 ASCQ=0x0
Aug 20 20:31:24 phoenix kernel: sd 17:0:13:0: [sdt] tag#2 CDB: opcode=0x28 28 00 92 d3 de 18 00 00 08 00
Aug 20 20:31:24 phoenix kernel: print_req_error: critical medium error, dev sdt, sector 2463358488
Aug 20 20:31:24 phoenix kernel: Buffer I/O error on dev sdt, logical block 307919811, async page read
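For anyone reading logs like these, the decisive lines are the "print_req_error: critical medium error" ones, since they name the device and the failing sector. A small sketch that pulls those fields out of a saved syslog (the two sample lines are copied from the excerpts above; the pattern is an illustration, not a general syslog parser):

```shell
# Extract "device sector" pairs from kernel medium-error lines.
extract_bad_sectors() {
  grep -o 'critical medium error, dev sd[a-z]*, sector [0-9]*' \
    | sed 's/.*dev \(sd[a-z]*\), sector \([0-9]*\)/\1 \2/'
}

extract_bad_sectors <<'EOF'
Aug 21 06:59:00 phoenix kernel: print_req_error: critical medium error, dev sdb, sector 584850232
Aug 20 20:31:24 phoenix kernel: print_req_error: critical medium error, dev sdt, sector 2463358488
EOF
# prints:
# sdb 584850232
# sdt 2463358488
```

Note that the "logical block" numbers in the Buffer I/O lines differ from the sector numbers only by a factor of 8: the kernel reports 512-byte sectors while the buffer layer here uses 4KB blocks (584850232 / 8 = 73106279, matching the log above).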
snowmirage Posted August 21, 2018

Thank you, thank you, thank you! Seeing those examples will give me a much better idea what to look for myself. So far all but 8 of the 26 drives have finished, and so far all the rest look good. I'll post the logs again when everything is finished, but here's the latest so far in case you want to check something.

PHOENIX-preclear.disk-20180821-1246.zip
Frank1940 Posted August 21, 2018

Just had a failure while testing the plugin (ver. 2018.08.21) on a disk that had been precleared successfully many, many times before without a failure. Let me know if you need additional information. Do you want me to try the 2018.08.21a version to see if there is still a problem?

ROSE-preclear.disk-20180821-1455.zip
gfjardim Posted August 21, 2018

Please try the new 2018.08.21b version I just uploaded.
snowmirage Posted August 22, 2018

Testing of the last of my drives finished overnight; here are the logs from preclear.

PHOENIX-preclear.disk-20180822-0656.zip
gfjardim Posted August 22, 2018

@snowmirage, I only saw 3 errors, is that right?
snowmirage Posted August 22, 2018

That's correct: other than the 3 failed drives you already pointed out, all the other drives passed.
gfjardim Posted August 22, 2018

As I said before, those failed drives appear to be defective. Take a look at their SMART parameters and dispose of those with bad status. Thank you for posting your logs.
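For checking the SMART parameters mentioned above, the attributes that most directly indicate a dying disk are the reallocated, pending, and uncorrectable sector counts. A sketch of what to look for (the attribute table below is hypothetical sample output for illustration; on a real system you would feed the filter `smartctl -A /dev/sdX` output from smartmontools):

```shell
# Filter a smartctl attribute table down to the sector-health attributes.
# Non-zero raw values (last column) on any of these are a bad sign.
sector_health() {
  grep -E 'Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable'
}

# Hypothetical sample output for illustration only:
sector_health <<'EOF'
  5 Reallocated_Sector_Ct   0x0033   095   095   036    Pre-fail  Always       -       212
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1201
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       16
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       16
EOF
```

A drive whose Current_Pending_Sector or Reallocated_Sector_Ct count keeps growing across a preclear pass is generally not worth re-adding to an array.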
snowmirage Posted August 22, 2018

Rgr that, already ordered replacements. Thanks again for the help.
Frank1940 Posted August 22, 2018

The test run is now completed using ver. 2018.08.21b, and it precleared the disk successfully this time. I have attached the preclear diagnostics should you want to review them. Are there any other tests you might want me to run?

ROSE-preclear.disk-20180822-1218.zip
gfjardim Posted August 22, 2018

Not for now, @Frank1940. Everything appears to be running OK now. Thanks for the help.
Archived
This topic is now archived and is closed to further replies.