January 12, 2022

Hi guys,

I had a drive report an error state and it was disabled. I tried to run an extended SMART test, but it failed several times with "Interrupted (host reset)" before completing without error overnight. It took a few hours longer than an extended test I ran on another drive recently, so I am not sure whether that means anything.

Num  Test_Description   Status                     Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline   Completed without error    00%        377              -
# 2  Extended offline   Interrupted (host reset)   90%        362              -
# 3  Extended offline   Interrupted (host reset)   90%        361              -
# 4  Extended offline   Interrupted (host reset)   90%        360              -
# 5  Extended offline   Interrupted (host reset)   90%        360              -
# 6  Short offline      Completed without error    00%        360              -

I have attached the extended SMART report for the drive and the system diagnostics.

tower-smart-20220112-1117.zip
tower-diagnostics-20220112-1153.zip

I am currently pre-clearing a spare drive to replace this one in case it needs to be replaced. Some other forum answers suggest that I should replace the "failed" drive regardless and run a preclear on it to be sure (once the array has been rebuilt). Is this the recommended practice?

Any help understanding the SMART report would be great. Thanks!
January 12, 2022 Community Expert
SMART attributes for the disabled disk look OK, and it did pass the extended test. According to syslog, emulated disk3 mounts, though that would have been easier to see if you had taken diagnostics with the array started. It is always safer to rebuild to a spare and keep the original just as it is in case there are problems with the rebuild, but it should be OK to rebuild to the same disk.
January 12, 2022 Author
Thanks trurl, I started the array and attached a fresh diagnostic report. Is there any way to figure out what caused the error in the first place? I just feel a bit blind about what I can do to improve things going forward.
tower-diagnostics-20220112-1405.zip
January 13, 2022 Community Expert
Do you have syslog or diagnostics from when the drive became disabled? Connection problems are much more common than bad disks. I didn't see anything in syslog about that, but I assume you rebooted after it happened.
January 13, 2022 Author
I did actually think to grab one at the time. Here it is.
tower-diagnostics-20220111-1654.zip
January 13, 2022 Community Expert
Connection problems on multiple disks. Maybe power cables or splitters?
January 13, 2022 Author
They are all on the same power cable, so maybe, but I haven't reseated the power for the drives that didn't fail and they are coming online. Where in the diagnostics can I see the connection problems?
January 13, 2022 Community Expert
In syslog, starting here and continuing to the end:
Jan 10 16:24:35 Tower kernel: ata1: SATA link down (SStatus 0 SControl 300)
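To see how widespread messages like the one above are, a small script can tally them per ATA port. This is only a minimal sketch: the regular expression is written against the single line quoted in this thread, the function name `count_link_down` is made up for illustration, and other kernels may word the message differently.

```python
import re
from collections import Counter

# Matches kernel messages shaped like the one quoted in this thread:
#   Jan 10 16:24:35 Tower kernel: ata1: SATA link down (SStatus 0 SControl 300)
LINK_DOWN = re.compile(r"kernel: (ata\d+): SATA link down")


def count_link_down(lines):
    """Count 'SATA link down' messages per ATA port (ata1, ata2, ...)."""
    counts = Counter()
    for line in lines:
        match = LINK_DOWN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# The only real log line available here is the one from the post above.
sample = [
    "Jan 10 16:24:35 Tower kernel: ata1: SATA link down (SStatus 0 SControl 300)",
]
print(count_link_down(sample))  # Counter({'ata1': 1})
```

For a real log, pass an open file instead of the sample list, for example `count_link_down(open(path))`, where `path` points at the syslog extracted from the diagnostics zip. Several ports showing link drops around the same time would be consistent with a shared cause such as a power cable or splitter rather than one bad disk.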
January 15, 2022 Author
Hey Trurl,

I successfully rebuilt to the replacement drive. Such a relief to have it all safe again. Thank you!

I'm now in the process of pre-clearing the 'failed' drive and noticed something strange in the log:

Jan 14 23:26:22 preclear_disk_VGKR9MMG_12539: Command: /usr/local/emhttp/plugins/preclear.disk/script/preclear_disk.sh --cycles 1 --no-prompt /dev/sdg
Jan 14 23:26:22 preclear_disk_VGKR9MMG_12539: Preclear Disk Version: 1.0.22
Jan 14 23:26:22 preclear_disk_VGKR9MMG_12539: S.M.A.R.T. info type: default
Jan 14 23:26:22 preclear_disk_VGKR9MMG_12539: S.M.A.R.T. attrs type: default
Jan 14 23:26:22 preclear_disk_VGKR9MMG_12539: Disk size: 8001563222016
Jan 14 23:26:22 preclear_disk_VGKR9MMG_12539: Disk blocks: 1953506646
Jan 14 23:26:22 preclear_disk_VGKR9MMG_12539: Blocks (512 bytes): 15628053168
Jan 14 23:26:22 preclear_disk_VGKR9MMG_12539: Block size: 4096
Jan 14 23:26:22 preclear_disk_VGKR9MMG_12539: Start sector: 13
Jan 14 23:26:24 preclear_disk_VGKR9MMG_12539: Pre-read: pre-read verification started (1/5)....
Jan 14 23:26:24 preclear_disk_VGKR9MMG_12539: Pre-Read: dd if=/dev/sdg of=/dev/null bs=2097152 skip=0 count=8001563222016 conv=noerror iflag=nocache,count_bytes,skip_bytes
Jan 15 00:32:09 preclear_disk_VGKR9MMG_12539: Pre-Read: progress - 10% read @ 193 MB/s
Jan 15 01:39:43 preclear_disk_VGKR9MMG_12539: Pre-Read: progress - 20% read @ 194 MB/s
Jan 15 02:49:39 preclear_disk_VGKR9MMG_12539: Pre-Read: progress - 30% read @ 187 MB/s
Jan 15 04:02:42 preclear_disk_VGKR9MMG_12539: Pre-Read: progress - 40% read @ 173 MB/s
Jan 15 05:19:15 preclear_disk_VGKR9MMG_12539: Pre-Read: progress - 50% read @ 171 MB/s
Jan 15 06:40:09 preclear_disk_VGKR9MMG_12539: Pre-Read: progress - 60% read @ 160 MB/s
Jan 15 08:07:13 preclear_disk_VGKR9MMG_12539: Pre-Read: progress - 70% read @ 146 MB/s
Jan 15 09:42:45 preclear_disk_VGKR9MMG_12539: Pre-Read: progress - 80% read @ 135 MB/s
Jan 15 11:30:19 preclear_disk_VGKR9MMG_12539: Pre-Read: progress - 90% read @ 116 MB/s

Why would a drive start at sector 13? I am also curious why my drives always slow down as they get further into the pre-read cycle.

Edited January 15, 2022 by Simplify
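The size figures in the log above are consistent with one another, and the progress timestamps can be turned into an average read rate per 10% step. The sketch below does both, using only values copied from the log. The year 2022 is taken from the post dates, and the function names are made up for illustration. The log's own "MB/s" figures may be computed differently (for instance as an instantaneous rate, or in different units), so the two sets of numbers need not match exactly.

```python
from datetime import datetime

DISK_SIZE_BYTES = 8001563222016  # "Disk size" from the preclear log

# (timestamp, percent read) pairs copied from the Pre-Read lines.
# The 0% entry is the time the pre-read started.
PROGRESS = [
    ("2022 Jan 14 23:26:24", 0),
    ("2022 Jan 15 00:32:09", 10),
    ("2022 Jan 15 01:39:43", 20),
    ("2022 Jan 15 02:49:39", 30),
    ("2022 Jan 15 04:02:42", 40),
    ("2022 Jan 15 05:19:15", 50),
    ("2022 Jan 15 06:40:09", 60),
    ("2022 Jan 15 08:07:13", 70),
    ("2022 Jan 15 09:42:45", 80),
    ("2022 Jan 15 11:30:19", 90),
]


def block_count(size_bytes, block_size):
    """Number of whole blocks of block_size bytes in size_bytes."""
    return size_bytes // block_size


def interval_rates(progress, size_bytes):
    """Average rate in MB/s (10**6 bytes) between consecutive progress points."""
    fmt = "%Y %b %d %H:%M:%S"
    rates = []
    for (t0, p0), (t1, p1) in zip(progress, progress[1:]):
        seconds = (datetime.strptime(t1, fmt) - datetime.strptime(t0, fmt)).total_seconds()
        bytes_read = size_bytes * (p1 - p0) / 100
        rates.append(bytes_read / seconds / 1e6)
    return rates


print(block_count(DISK_SIZE_BYTES, 4096))  # 1953506646, matches "Disk blocks"
print(block_count(DISK_SIZE_BYTES, 512))   # 15628053168, matches "Blocks (512 bytes)"

for (_, pct), rate in zip(PROGRESS[1:], interval_rates(PROGRESS, DISK_SIZE_BYTES)):
    print(f"up to {pct}%: {rate:.0f} MB/s average")
```

Each 10% step takes longer than the one before it, so the per-step average falls steadily across the run, which is the same slowdown the log reports.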
January 15, 2022 Community Expert
1 minute ago, Simplify said: "I am also curious why my drives always slowdown as they get further into the pre-read cycle?"
Drives always get slower as the heads move towards the inner tracks. I believe that this is because the inner tracks are shorter and so hold fewer sectors per disk rotation.
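The explanation above can be illustrated with a simplified model: if the platter spins at a constant speed and bits are packed at roughly the same linear density on every track, the sequential rate is proportional to the track radius. All numbers below (radii, RPM, bit density) are hypothetical values chosen for illustration, not specifications of the drive in this thread, and real drives change density in discrete zones rather than smoothly.

```python
import math


def track_rate_mb_s(radius_mm, rpm, bits_per_mm):
    """Sequential rate for one track: bits per revolution times revolutions per second."""
    bits_per_rev = 2 * math.pi * radius_mm * bits_per_mm  # track length x linear density
    revs_per_second = rpm / 60
    return bits_per_rev * revs_per_second / 8 / 1e6  # bits/s -> MB/s


# Hypothetical geometry and density, for illustration only.
OUTER_RADIUS_MM = 46
INNER_RADIUS_MM = 21
RPM = 7200
BITS_PER_MM = 60_000

outer = track_rate_mb_s(OUTER_RADIUS_MM, RPM, BITS_PER_MM)
inner = track_rate_mb_s(INNER_RADIUS_MM, RPM, BITS_PER_MM)
print(f"outer track: {outer:.0f} MB/s, inner track: {inner:.0f} MB/s")
print(f"inner/outer ratio: {inner / outer:.2f}")
```

In this model the inner-to-outer rate ratio equals the ratio of the two radii, whatever bit density is assumed, so a drive that reads from the outside in gets slower as the read progresses.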
January 17, 2022 Author
Any ideas on why the drive might be starting at sector 13? It is still doing the pre-clear, but if anything else comes up I will post that too.