Rob_Dingen Posted July 21, 2011

Hi,

I have a problem preclearing one disk, a WD20EARS 2TB.

Parity - Hitachi, MB sata 0
Disk 1 - Hitachi, MB sata 1
Disk 2 - Hitachi, MB sata 2
Disk 3 - WD, MB sata 3
Disk 4 - WD, AOC-SASLP
Disk 5 - WD, AOC-SASLP

The 3 Hitachi disks are already precleared. Disks 4 and 5 are preclearing now, but Disk 3 does not respond to the preclear command. It is visible in unMENU.

Setup
Softw v : unRAID 4.7
MB : X8SIL-F
Proc : I3 540
Mem : 4G Kingston
SATA Contr : AOC-SASLP-MV8 softw v 21
3x HDD Hitachi 5K3000 2TB
3x HDD WD 20EARS 2TB
Case : NORCO 2420

Attached are the syslog for the disk and the SMART info.

Rob

Smart_disk.txt
Syslog_disk.txt
SSD Posted July 21, 2011

Looks like you might have a cabling issue with that drive. You should replace (or at least reseat both ends of) the SATA cable to that disk. Try that and see if the problems stop.
Rob_Dingen (Author) Posted July 21, 2011

Hi,

Reseating both ends is not possible because I use a NORCO 2420 case, but I can try to reseat the connectors on the MB. Once the preclear on the other disks finishes, I can move the disk to the AOC-SASLP-MV8 and try again.

Rob
Rob_Dingen (Author) Posted July 21, 2011

Hi,

Disk 4, which was preclearing, also quit at 54%.

Jul 21 21:17:18 Tower kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
Jul 21 21:17:18 Tower kernel: 00 00 00 d7
Jul 21 21:17:18 Tower kernel: sd 0:0:0:0: [sdb] ASC=0x0 ASCQ=0x0 (Drive related)
Jul 21 21:17:18 Tower kernel: sd 0:0:0:0: [sdb] CDB: cdb[0]=0x28: 28 00 e8 e0 81 20 00 00 08 00 (Drive related)
Jul 21 21:17:18 Tower kernel: end_request: I/O error, dev sdb, sector 3907027232 (Errors)
Jul 21 21:17:18 Tower kernel: ata1: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00 (Drive related)
Jul 21 21:17:18 Tower kernel: ata1: status=0x41 { DriveReady Error } (Errors)
Jul 21 21:17:18 Tower kernel: ata1: error=0x04 { DriveStatusError } (Errors)
Jul 21 21:17:18 Tower kernel: ata1: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00 (Drive related)
Jul 21 21:17:18 Tower kernel: ata1: status=0x41 { DriveReady Error } (Errors)
Jul 21 21:17:18 Tower kernel: ata1: error=0x04 { DriveStatusError } (Errors)
Jul 21 21:17:18 Tower kernel: ata1: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00 (Drive related)
Jul 21 21:17:18 Tower kernel: ata1: status=0x41 { DriveReady Error } (Errors)
Jul 21 21:17:18 Tower kernel: ata1: error=0x04 { DriveStatusError } (Errors)
Jul 21 21:17:18 Tower kernel: ata1: translated ATA stat/err 0x41/04 to SCSI SK/AS (Drive related)

What's going on?

Rob
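A side note on reading the log above: the CDB line and the end_request line describe the same failed read. Opcode 0x28 is a SCSI READ(10), which carries a 4-byte big-endian starting block address in bytes 2-5 and a 2-byte block count in bytes 7-8. The following is only a minimal sketch in Python for checking that by hand; the function name `decode_read10` is made up for illustration and is not part of unRAID or the preclear script.

```python
def decode_read10(cdb_hex):
    """Return (starting LBA, block count) for a READ(10) CDB given as hex bytes.

    Hypothetical helper for illustration only.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")    # bytes 2-5: logical block address
    count = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length in blocks
    return lba, count


# CDB copied from the syslog excerpt in the post above
lba, count = decode_read10("28 00 e8 e0 81 20 00 00 08 00")
print(lba, count)  # 3907027232 8
```

The decoded address equals the sector number in the "I/O error, dev sdb, sector 3907027232" line, so the kernel was reading 8 sectors starting at that block when the drive aborted the command.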
vca Posted July 21, 2011

Your smart log includes this:

SMART Self-test log structure revision number 1
Num  Test_Description  Status                         Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline     Completed: electrical failure  30%        4                -
# 2  Short offline     Interrupted (host reset)       90%        4                -
# 3  Short offline     Interrupted (host reset)       90%        4                -
# 4  Short offline     Interrupted (host reset)       90%        4                -

which I don't recall seeing before. At a guess it looks like the drive is having power issues (electrical failure), so maybe you have an issue with your power supply or with the power connector? Do you have any power splitters? They have been a source of grief for many on this forum.

Regards,
Stephen
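For anyone who wants to scan a saved SMART report for rows like the one above, here is a rough Python sketch. It assumes the self-test log rows look like the ones quoted in this post, and it treats any status beginning with "Completed:" (as opposed to "Completed without error") as a failure; that rule, the list of test descriptions in the regex, and the name `failed_self_tests` are assumptions made for illustration, not part of smartctl.

```python
import re

# One row of the self-test log: "# N  <test>  <status>  NN%  <hours>  <LBA or ->"
ROW = re.compile(
    r"^#\s*(\d+)\s+(Short offline|Extended offline|Conveyance offline)\s+"
    r"(.+?)\s+(\d+)%\s+(\d+)\s+(\S+)$"
)


def failed_self_tests(log_text):
    """Return (test number, status, first error LBA or None) for failed tests."""
    failures = []
    for line in log_text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue  # header lines and anything unrecognised are skipped
        num, _desc, status, _remaining, _hours, lba = match.groups()
        # Assumption: "Completed: <reason>" marks a test that ran and failed
        if status.startswith("Completed:"):
            failures.append((int(num), status, None if lba == "-" else int(lba)))
    return failures


SAMPLE = """\
# 1 Short offline Completed: electrical failure 30% 4 -
# 2 Short offline Interrupted (host reset) 90% 4 -
"""
print(failed_self_tests(SAMPLE))
```

Interrupted tests are deliberately not reported as failures here, since a host reset says more about the system than about the drive.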
Rajahal Posted July 22, 2011

Are you supplying power to both of the molex connectors on each backplane in your Norco 4220? According to Norco you should only have to connect one of them, but some users have reported in the past that connecting both fixes odd power-related issues like the ones you are seeing. I think it is worth a try. Try it with just one backplane at first (so that you don't have to go buy a bunch of new power splitters) and see if it makes any difference.
Rob_Dingen (Author) Posted July 23, 2011

Update on the problem.

I reconnected all SATA cables and power cables, and ran an extra set of power cables from the PSU to the molex connector of the Norco backplane. After a reboot I still got a disk error. I then removed Disk 3 and rebooted, and now everything is running smoothly. Disk 4 is preclearing again and I hope it will finish this time.

I did some further testing on Disk 3 (WD 20EARS) and I think it was DOA: it doesn't spin up in another computer.

Rob
Rob_Dingen (Author) Posted July 23, 2011

OK, the preclear stopped again, during the pre-read at 54%.

Part of the syslog:

Jul 24 02:17:14 Tower kernel: sd 0:0:0:0: [sdb] ASC=0x0 ASCQ=0x0 (Drive related)
Jul 24 02:17:14 Tower kernel: sd 0:0:0:0: [sdb] CDB: cdb[0]=0x28: 28 00 e8 e0 80 98 00 00 08 00 (Drive related)
Jul 24 02:17:14 Tower kernel: end_request: I/O error, dev sdb, sector 3907027096 (Errors)
Jul 24 02:17:14 Tower kernel: ata1: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00 (Drive related)
Jul 24 02:17:14 Tower kernel: ata1: status=0x41 { DriveReady Error } (Errors)
Jul 24 02:17:14 Tower kernel: ata1: error=0x04 { DriveStatusError } (Errors)
Jul 24 02:17:14 Tower kernel: ata1: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00 (Drive related)
Jul 24 02:17:14 Tower kernel: ata1: status=0x41 { DriveReady Error } (Errors)
Jul 24 02:17:14 Tower kernel: ata1: error=0x04 { DriveStatusError } (Errors)
Jul 24 02:17:14 Tower kernel: ata1: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00 (Drive related)
Jul 24 02:17:14 Tower kernel: ata1: status=0x41 { DriveReady Error } (Errors)
Jul 24 02:17:14 Tower kernel: ata1: error=0x04 { DriveStatusError } (Errors)
Jul 24 02:17:14 Tower kernel: ata1: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00 (Drive related)
Jul 24 02:17:14 Tower kernel: ata1: status=0x41 { DriveReady Error } (Errors)
Jul 24 02:17:14 Tower kernel: ata1: error=0x04 { DriveStatusError } (Errors)
Jul 24 02:17:14 Tower kernel: ata1: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00 (Drive related)
Jul 24 02:17:14 Tower kernel: ata1: status=0x41 { DriveReady Error } (Errors)
Jul 24 02:17:14 Tower kernel: ata1: error=0x04 { DriveStatusError } (Errors)
Jul 24 02:17:14 Tower kernel: ata1: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00 (Drive related)
Jul 24 02:17:14 Tower kernel: ata1: status=0x41 { DriveReady Error } (Errors)
Jul 24 02:17:14 Tower kernel: ata1: error=0x04 { DriveStatusError } (Errors)
Jul 24 02:17:14 Tower kernel: sd 0:0:0:0: [sdb] Result: hostbyte=0x00 driverbyte=0x08 (System)
Jul 24 02:17:14 Tower kernel: sd 0:0:0:0: [sdb] Sense Key : 0xb [current] [descriptor] (Drive related)
Jul 24 02:17:14 Tower kernel: Descriptor sense data with sense descriptors (in hex):

I'm lost.

Rob

HDParm_2289.txt
Short_smart_test_2289.txt
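The repeated status=0x41 / error=0x04 pairs in the log above are ATA register values, and the words in braces are the kernel's labels for the bits that are set. As a rough aid for reading them, here is a small Python sketch. The bit tables are deliberately partial, they use the register bit names from the ATA specification rather than the kernel's labels, and the names `STATUS_BITS`, `ERROR_BITS`, `SENSE_KEYS` and `decode_bits` are invented for this example.

```python
# Partial ATA status register bits (ATA specification names)
STATUS_BITS = {0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x08: "DRQ", 0x01: "ERR"}

# Partial ATA error register bits
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT"}

# A few SCSI sense keys; 0xb is the one shown in the log
SENSE_KEYS = {0x3: "MEDIUM ERROR", 0x4: "HARDWARE ERROR", 0xB: "ABORTED COMMAND"}


def decode_bits(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for bit, name in table.items() if value & bit]


print(decode_bits(0x41, STATUS_BITS))  # ['DRDY', 'ERR']
print(decode_bits(0x04, ERROR_BITS))   # ['ABRT']
print(SENSE_KEYS[0xB])                 # ABORTED COMMAND
```

Read this way, the log says the drive was ready but reported an error (DRDY plus ERR), the error was a command abort (ABRT, which the kernel prints as DriveStatusError), and the kernel translated that into SCSI sense key 0xb. On its own that does not distinguish a failing drive from a cabling or power problem.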
vca Posted July 28, 2011

You might have another bad drive. Note:

197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 1

and:

SMART Self-test log structure revision number 1
Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline     Completed: read failure  90%        39               2112414936
# 2  Short offline     Completed: read failure  90%        39               2112414936

In theory your drive only has one bad sector that is queued to be remapped, but I had a brand new WD20EARS that made a similar claim (though I didn't save a copy of the smart report as I usually do). When I tried to do a full smart test it failed to complete the test (though it did not report anything else), so I took it out of my unRAID box, put it in my Windows PC and ran the WD Tools on it. The quick test failed to complete, and after I cancelled the extended test (as it was taking too long) it reported that the drive had too many bad sectors. As in your case, I initially discovered the problem during a preclear.

From my notes:

Apr25 - Apr 27 the replacement drive WMA ZA3 782 259 also failed. It appeared to have got part way through the writing zeroes phase of the first pre-clear pass when it took out the unraid server (probably too many error messages). When I put it in my desktop, CrystalDisk initially showed a normal smart report. Then when I tried to run the WD Diagnostic, the quick test ran very slowly and I stopped it after two hours and tried the long test overnight. The next morning it was still running and was now saying about 400 hours left, so I stopped the test and got the report, which said FAIL, "08-Too many bad sectors detected". Now crystal disk no longer finds this drive.

Regards,
Stephen
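If you want to pull the raw value out of an attribute line like the Current_Pending_Sector one quoted above, a minimal Python sketch follows. It assumes the usual ten-column smartctl attribute layout (ID, name, flag, value, worst, threshold, type, updated, when-failed, raw) and a raw value that starts with a plain integer. Watching attributes 5 and 198 alongside 197 is a common habit rather than something taken from this thread, and the names `parse_attribute`, `needs_attention` and `WATCHED_IDS` are hypothetical.

```python
# Assumption: sector-health attributes worth flagging when their raw value is non-zero
WATCHED_IDS = {5, 197, 198}


def parse_attribute(line):
    """Split one smartctl attribute row into a dict of its main fields."""
    fields = line.split()
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "value": int(fields[3]),
        "worst": int(fields[4]),
        "thresh": int(fields[5]),
        "raw": int(fields[9]),  # fails if the raw column is not a plain integer
    }


def needs_attention(attr):
    """True when a watched attribute reports a non-zero raw count."""
    return attr["id"] in WATCHED_IDS and attr["raw"] > 0


# Attribute line copied from the post above
attr = parse_attribute("197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 1")
print(attr["name"], attr["raw"], needs_attention(attr))  # Current_Pending_Sector 1 True
```

Note that the normalized value (200) is still far above the threshold (000) here, which is why the drive does not flag itself as failing even though one sector is pending.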