Joe L. Posted February 25, 2011

Quote:
Well, I stopped the current run of preclear and went into the BIOS to check and this is what it shows (* = current setting)
OnChip SATA Channel - Enabled* Disabled
OnChip SATA Type - Native IDE* RAID AHCI Legacy IDE IDE->AHCI
SATA IDE Combined Mode - Enabled* Disabled
SATA-III Mode - Auto* Force Max Gen2

You want AHCI, not "Native IDE". You'll know you have it right when all the disks are showing as /dev/sdX.
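Joe's rule of thumb (all disks showing as /dev/sdX) can be checked in a script rather than by eye. The following is only a minimal sketch in Python: the function names are made up for illustration, and it assumes that a controller left in a legacy IDE mode shows its disks with hdX names, which the thread implies but does not state outright.

```python
import re

# Whole-disk names as libata/SCSI presents them: sda, sdb, ..., sdaa.
SD_DISK = re.compile(r"^sd[a-z]+$")
# Whole-disk names as the legacy IDE driver presents them: hda, hdb, ...
# (assumption for this sketch, not taken from the thread).
HD_DISK = re.compile(r"^hd[a-z]+$")


def legacy_named_disks(names):
    """Return the names that look like legacy IDE disks (hdX), sorted."""
    return sorted(n for n in names if HD_DISK.match(n))


def all_disks_are_sd(names):
    """True if at least one sdX disk is present and no hdX disk is."""
    has_sd = any(SD_DISK.match(n) for n in names)
    return has_sd and not legacy_named_disks(names)
```

On the server itself the list of names could come from something like `os.listdir("/sys/block")`; entries such as loop or md devices match neither pattern and are ignored.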
vexhold Posted February 25, 2011

Changing it to AHCI made the system not boot. I went back into the BIOS and saw all drives reading as IDE, and the only drive I could select to boot from was one of the data drives, no longer the USB. I am beginning to hate this board.
larson Posted February 25, 2011

Hi

I just ran preclear twice on a disk that I am a bit suspicious of (it may have had a hit or two before I started using it). Here are the results of the two runs. Same disk, run twice in a row without a reboot.

I am aware that my system runs a bit hot and am going to look into it (but most of the time the disks are spun down).

I see a bunch of Hardware_ECC_Recovered in the first run. Could that be physical damage that has now been recovered? Should I run it more times to look for changes?

/Lars Olof

========================================================================1.5
== invoked as: ./preclear_disk.sh /dev/sdc
== ST31000340AS 9QJ13D2F
== Disk /dev/sdc has been successfully precleared
== with a starting sector of 64
== Ran 1 cycle
==
== Using :Read block size = 8225280 Bytes
== Last Cycle's Pre Read Time : 3:57:16 (70 MB/s)
== Last Cycle's Zeroing time : 3:24:03 (81 MB/s)
== Last Cycle's Post Read Time : 7:17:10 (38 MB/s)
== Last Cycle's Total Time : 14:39:30
==
== Total Elapsed Time 14:39:30
==
== Disk Start Temperature: 36C
==
== Current Disk Temperature: -->43<--C,
==
============================================================================
** Changed attributes in files: /tmp/smart_start_sdc /tmp/smart_finish_sdc
ATTRIBUTE NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS RAW_VALUE
Raw_Read_Error_Rate = 119 115 6 ok 234687296
Spin_Retry_Count = 100 100 97 near_thresh 4
End-to-End_Error = 100 100 99 near_thresh 0
Airflow_Temperature_Cel = 57 64 45 In_the_past 43
Temperature_Celsius = 43 36 0 ok 43
Hardware_ECC_Recovered = 51 26 0 ok 234687296
No SMART attributes are FAILING_NOW

0 sectors were pending re-allocation before the start of the preclear.
0 sectors were pending re-allocation after pre-read in cycle 1 of 1.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
0 sectors are pending re-allocation at the end of the preclear,
the number of sectors pending re-allocation did not change.
0 sectors had been re-allocated before the start of the preclear.
0 sectors are re-allocated at the end of the preclear,
the number of sectors re-allocated did not change.
============================================================================

========================================================================1.5
== invoked as: ./preclear_disk.sh /dev/sdc
== ST31000340AS 9QJ13D2F
== Disk /dev/sdc has been successfully precleared
== with a starting sector of 64
== Ran 1 cycle
==
== Using :Read block size = 8225280 Bytes
== Last Cycle's Pre Read Time : 3:32:13 (78 MB/s)
== Last Cycle's Zeroing time : 3:24:03 (81 MB/s)
== Last Cycle's Post Read Time : 7:18:57 (37 MB/s)
== Last Cycle's Total Time : 14:16:16
==
== Total Elapsed Time 14:16:16
==
== Disk Start Temperature: 43C
==
== Current Disk Temperature: -->42<--C,
==
============================================================================
** Changed attributes in files: /tmp/smart_start_sdc /tmp/smart_finish_sdc
ATTRIBUTE NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS RAW_VALUE
Raw_Read_Error_Rate = 116 119 6 ok 102094578
Spin_Retry_Count = 100 100 97 near_thresh 4
End-to-End_Error = 100 100 99 near_thresh 0
Airflow_Temperature_Cel = 58 57 45 In_the_past 42
Temperature_Celsius = 42 43 0 ok 42
No SMART attributes are FAILING_NOW

0 sectors were pending re-allocation before the start of the preclear.
0 sectors were pending re-allocation after pre-read in cycle 1 of 1.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
0 sectors are pending re-allocation at the end of the preclear,
the number of sectors pending re-allocation did not change.
0 sectors had been re-allocated before the start of the preclear.
0 sectors are re-allocated at the end of the preclear,
the number of sectors re-allocated did not change.
============================================================================
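When comparing several preclear runs, it can help to pull the "Changed attributes" rows into a structure rather than reading them by eye. The sketch below is one possible way to do that in Python; `ChangedAttr` and `parse_changed_attr` are hypothetical names, and the row layout (name, "=", new value, old value, threshold, status, raw value) is assumed from the reports posted above rather than from the preclear script itself.

```python
from typing import NamedTuple, Optional


class ChangedAttr(NamedTuple):
    name: str
    new_val: int
    old_val: int
    threshold: int
    status: str
    raw_value: int


def parse_changed_attr(line: str) -> Optional[ChangedAttr]:
    """Parse one 'NAME = NEW OLD THRESH STATUS RAW' row, or return None."""
    name, sep, rest = line.partition("=")
    fields = rest.split()
    # Rulers, '== ...' banner lines and the column header do not have
    # exactly five fields after the first '=', so they are rejected here.
    if not sep or not name.strip() or len(fields) != 5:
        return None
    try:
        new_val, old_val, threshold = (int(f) for f in fields[:3])
        raw_value = int(fields[4])
    except ValueError:
        return None
    return ChangedAttr(name.strip(), new_val, old_val, threshold,
                       fields[3], raw_value)
```

With a row parsed, the headroom of an attribute is simply `new_val - threshold`; for the Spin_Retry_Count row above that is 100 - 97 = 3, which is presumably why the report labels it near_thresh even though the value has not moved.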
jeff.lebowski Posted February 25, 2011

Quote:
Changing it to AHCI made the system not boot. Went back into BIOS and saw all drives reading as IDE and the only drive I could select to boot from was one of the data drives and no longer the USB. I am beginning to hate this board.

That sounds like the problem I had. Double-check that the boot device is still your USB drive. It may have changed.
prostuff1 Posted February 25, 2011

Quote:
Well, I stopped the current run of preclear and went into the BIOS to check and this is what it shows (* = current setting)

Quote:
OnChip SATA Channel - Enabled* Disabled

This setting is fine.

Quote:
OnChip SATA Type - Native IDE* RAID AHCI Legacy IDE IDE->AHCI

This should be AHCI.

Quote:
SATA IDE Combined Mode - Enabled* Disabled

This should be disabled.
vexhold Posted February 25, 2011

Well, I had tried those two settings before, but I only checked the BIOS, saw all the drives appearing as IDE, and it would not boot. I have now fixed the no-boot part, and when I go into unRAID they all read as they should. What a weird motherboard.

Thanks guys. Starting preclear.... again.
larson Posted February 25, 2011

Quote:
Thanks guys. Starting preclear.... again.

If you want to avoid stopping your jobs (and also run several at the same time), you should look into the "screen" package in UnMenu. It lets you start virtual sessions and keeps them running even if your telnet session drops for some reason.

Just saying...

/Lars Olof
Joe L. Posted February 25, 2011

Quote:
If you want to avoid stopping your jobs (and also run several at the same time), you should look into the "screen" package in UnMenu. It lets you start virtual sessions and keeps them running even if your telnet session drops for some reason. Just saying... /Lars Olof

Unfortunately, even "screen" will not help when you are changing BIOS options.
larson Posted February 25, 2011

I know, Joe. :-) I was just referring to his kid wantin' the PC for gaming earlier. :-)
Joe L. Posted February 26, 2011

I figure some of you might get a kick out of this...

I purchased a 2TB drive just before Christmas and had never unpacked it. I did not have a spare port in the server to connect it to, and I did not need the space. I just put the disk on the shelf as a spare. It is/was a 2TB ST32000542AS drive...

Recently I ordered some of the very cheap 2-port SATA disk controllers from eBay. They arrived the other day from China, and I finally installed the 2TB drive in my newer server.

The good news: the 2-port disk controller seems to work just fine, and 6 ports for under $25 is a pretty good deal. It connects at 3Gb/s.

Feb 25 16:11:56 Tower2 kernel: ata9: SATA max UDMA/133 abar m8192@0xfe9fe000 port 0xfe9fe100 irq 16
Feb 25 16:11:56 Tower2 kernel: ata9: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Feb 25 16:11:56 Tower2 kernel: ata9.00: ATA-8: ST32000542AS, CC34, max UDMA/133
Feb 25 16:11:56 Tower2 kernel: ata9.00: 3907029168 sectors, multi 0: LBA48 NCQ (depth 31/32)
Feb 25 16:11:56 Tower2 kernel: ata9.00: configured for UDMA/133

The bad news... the Seagate disk is not doing so well in its initial pre-clear.
In the syslog is a constant series of messages like this:

Feb 25 18:55:09 Tower2 kernel: ata9.00: error: { UNC }
Feb 25 18:55:09 Tower2 kernel: ata9.00: configured for UDMA/133
Feb 25 18:55:09 Tower2 kernel: ata9: EH complete
Feb 25 18:55:13 Tower2 kernel: ata9.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
Feb 25 18:55:13 Tower2 kernel: ata9.00: irq_stat 0x48000000
Feb 25 18:55:13 Tower2 kernel: ata9.00: failed command: READ FPDMA QUEUED
Feb 25 18:55:13 Tower2 kernel: ata9.00: cmd 60/08:00:b8:93:ad/00:00:00:00:00/40 tag 0 ncq 4096 in
Feb 25 18:55:13 Tower2 kernel: res 41/40:08:b8:93:ad/00:00:00:00:00/00 Emask 0x409 (media error) <F>
Feb 25 18:55:13 Tower2 kernel: ata9.00: status: { DRDY ERR }
Feb 25 18:55:13 Tower2 kernel: ata9.00: error: { UNC }
Feb 25 18:55:13 Tower2 kernel: ata9.00: configured for UDMA/133
Feb 25 18:55:13 Tower2 kernel: ata9: EH complete
Feb 25 18:55:17 Tower2 kernel: ata9.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
Feb 25 18:55:17 Tower2 kernel: ata9.00: irq_stat 0x48000000
Feb 25 18:55:17 Tower2 kernel: ata9.00: failed command: READ FPDMA QUEUED
Feb 25 18:55:17 Tower2 kernel: ata9.00: cmd 60/08:00:b8:93:ad/00:00:00:00:00/40 tag 0 ncq 4096 in
Feb 25 18:55:17 Tower2 kernel: res 41/40:08:b8:93:ad/00:00:00:00:00/00 Emask 0x409 (media error) <F>
Feb 25 18:55:17 Tower2 kernel: ata9.00: status: { DRDY ERR }
Feb 25 18:55:17 Tower2 kernel: ata9.00: error: { UNC }
Feb 25 18:55:17 Tower2 kernel: ata9.00: configured for UDMA/133
Feb 25 18:55:17 Tower2 kernel: ata9: EH complete
Feb 25 18:55:20 Tower2 kernel: ata9.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
Feb 25 18:55:20 Tower2 kernel: ata9.00: irq_stat 0x48000000
Feb 25 18:55:20 Tower2 kernel: ata9.00: failed command: READ FPDMA QUEUED
Feb 25 18:55:20 Tower2 kernel: ata9.00: cmd 60/08:00:b8:93:ad/00:00:00:00:00/40 tag 0 ncq 4096 in
Feb 25 18:55:20 Tower2 kernel: res 41/40:08:b8:93:ad/00:00:00:00:00/00 Emask 0x409 (media error) <F>
Feb 25 18:55:20 Tower2 kernel: ata9.00: status: { DRDY ERR }
Feb 25 18:55:20 Tower2 kernel: ata9.00: error: { UNC }
Feb 25 18:55:20 Tower2 kernel: ata9.00: configured for UDMA/133
Feb 25 18:55:20 Tower2 kernel: sd 9:0:0:0: [sdj] Unhandled sense code
Feb 25 18:55:20 Tower2 kernel: sd 9:0:0:0: [sdj] Result: hostbyte=0x00 driverbyte=0x08
Feb 25 18:55:20 Tower2 kernel: sd 9:0:0:0: [sdj] Sense Key : 0x3 [current] [descriptor]
Feb 25 18:55:20 Tower2 kernel: Descriptor sense data with sense descriptors (in hex):
Feb 25 18:55:20 Tower2 kernel: 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Feb 25 18:55:20 Tower2 kernel: 00 ad 93 b8
Feb 25 18:55:20 Tower2 kernel: sd 9:0:0:0: [sdj] ASC=0x11 ASCQ=0x4
Feb 25 18:55:20 Tower2 kernel: sd 9:0:0:0: [sdj] CDB: cdb[0]=0x28: 28 00 00 ad 93 b8 00 00 08 00
Feb 25 18:55:20 Tower2 kernel: end_request: I/O error, dev sdj, sector 11375544
Feb 25 18:55:20 Tower2 kernel: Buffer I/O error on device sdj, logical block 1421943
Feb 25 18:55:20 Tower2 kernel: ata9: EH complete

The initial smart report looked like this (Note... 0 Power_on_hours, no re-allocated sectors, no sectors pending re-allocation):

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 100 100 006 Pre-fail Always - 28166
3 Spin_Up_Time 0x0003 100 100 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 4
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 100 253 030 Pre-fail Always - 74
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 0
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 4
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 100 253 000 Old_age Always - 0
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 068 068 045 Old_age Always - 32 (Lifetime Min/Max 26/32)
194 Temperature_Celsius 0x0022 032 040 000 Old_age Always - 32 (0 26 0 0)
195 Hardware_ECC_Recovered 0x001a 100 100 000 Old_age Always - 28166
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 20
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 20
199 UDMA_CRC_Error_Count 0x003e 200 253 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 141815525146627
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 0
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 2407

The disk has been pre-clearing for about 2 hours now...
The smart report shows this:

ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 072 070 006 Pre-fail Always - 9462713
3 Spin_Up_Time 0x0003 100 100 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 4
5 Reallocated_Sector_Ct 0x0033 094 094 036 Pre-fail Always - 282
7 Seek_Error_Rate 0x000f 100 253 030 Pre-fail Always - 4727
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 2
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 4
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 001 001 000 Old_age Always - 2204
188 Command_Timeout 0x0032 099 099 000 Old_age Always - 8590065666
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 069 064 045 Old_age Always - 31 (Lifetime Min/Max 26/36)
194 Temperature_Celsius 0x0022 031 040 000 Old_age Always - 31 (0 26 0 0)
195 Hardware_ECC_Recovered 0x001a 048 046 000 Old_age Always - 9462713
197 Current_Pending_Sector 0x0012 095 095 000 Old_age Always - 209
198 Offline_Uncorrectable 0x0010 095 095 000 Old_age Offline - 209
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 12854837116933
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 0
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 3544051280

What I find interesting is not that there are 209 sectors pending re-allocation, but that 282 have already been re-allocated? I cannot figure out how that has happened, since I've not written to the disk at all. The pre-read has not even read 1% of the disk.
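The comparison between the initial report and the two-hour report can be done mechanically. Below is a rough Python sketch, assuming the ten-column attribute table layout shown in these reports; `parse_smart_rows` and `raw_deltas` are illustrative names, and only three rows from each report are reproduced as sample input.

```python
def parse_smart_rows(text):
    """Map attribute name -> leading integer of RAW_VALUE for each table row."""
    raw = {}
    for line in text.splitlines():
        fields = line.split()
        # A data row starts with a numeric ID and has the raw value in the
        # tenth column (trailing notes such as "(Lifetime ...)" are ignored).
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            raw[fields[1]] = int(fields[9])
    return raw


def raw_deltas(before, after):
    """Return {name: after - before} for attributes whose raw value changed."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] != before[name]
    }


# Three rows copied from the initial report and from the two-hour report.
INITIAL = """\
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 20
"""
AFTER_2H = """\
5 Reallocated_Sector_Ct 0x0033 094 094 036 Pre-fail Always - 282
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 2
197 Current_Pending_Sector 0x0012 095 095 000 Old_age Always - 209
"""

deltas = raw_deltas(parse_smart_rows(INITIAL), parse_smart_rows(AFTER_2H))
```

For these rows the result is 282 more re-allocated sectors and 189 more pending sectors over 2 power-on hours, which is the jump Joe describes.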
================================================================== 1.7
= unRAID server Pre-Clear disk /dev/sdj
= cycle 1 of 1, partition start on sector 63
= Disk Pre-Read in progress: 0% complete
= ( 4,935,168,000 bytes of 2,000,398,934,016 read )
= Disk Temperature: 31C, Elapsed Time: 2:16:22

I've got nothing to lose if I let the preclear run its course. One thing is for sure: by the time it is done, I'll be looking at an RMA.

My big mistake... putting the disk on the shelf rather than installing it right away... so much for returning it to newegg. Now it will have to go back to Seagate...

In the time it has taken me to write this post I now have

5 Reallocated_Sector_Ct 0x0033 092 092 036 Pre-fail Always - 333
197 Current_Pending_Sector 0x0012 095 095 000 Old_age Always - 238
198 Offline_Uncorrectable 0x0010 095 095 000 Old_age Offline - 238

I wish I knew how it is figuring out what to put in the sectors it has already re-allocated. I've still never written to the disk at all. It is still less than 1% through the pre-read, having read about 6 GB out of the 2000 GB.

= Disk Pre-Read in progress: 0% complete
= ( 6,580,224,000 bytes of 2,000,398,934,016 read )

Ouch...

Edit: It has been 10 hours. It is now 1% through the pre-read. I've still not written to the drive at all....
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 072 070 006 Pre-fail Always - 38629196
3 Spin_Up_Time 0x0003 100 100 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 4
5 Reallocated_Sector_Ct 0x0033 065 065 036 Pre-fail Always - 1440
7 Seek_Error_Rate 0x000f 100 253 030 Pre-fail Always - 18372
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 10
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 4
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 001 001 000 Old_age Always - 8932
188 Command_Timeout 0x0032 099 099 000 Old_age Always - 30065229831
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 073 064 045 Old_age Always - 27 (Lifetime Min/Max 26/36)
194 Temperature_Celsius 0x0022 027 040 000 Old_age Always - 27 (0 26 0 0)
195 Hardware_ECC_Recovered 0x001a 048 046 000 Old_age Always - 38629196
197 Current_Pending_Sector 0x0012 081 081 000 Old_age Always - 794
198 Offline_Uncorrectable 0x0010 081 081 000 Old_age Offline - 794
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 66559108186125
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 0
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 3572622799

Still have no idea how it is re-allocating sectors... it cannot know what to put in the re-allocated sector, since no writes have been made to the drive at all.

Joe L.
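The syslog excerpt earlier in this post encodes the failing sector in the libata "cmd" and "res" lines, and it can be cross-checked against the "end_request" line. The Python sketch below assumes the usual libata field order (command/feature:nsect:lbal:lbam:lbah, followed by the same five fields for the high-order bytes); `decode_lba` is a made-up helper name, not part of any tool mentioned in the thread.

```python
import re

# Matches the taskfile part of a libata "cmd" or "res" line, e.g.
#   cmd 60/08:00:b8:93:ad/00:00:00:00:00/40
TASKFILE = re.compile(
    r"\b(?:cmd|res) ([0-9a-f]{2})/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})/([0-9a-f]{2})"
)


def decode_lba(line):
    """Return the 48-bit LBA encoded in a libata cmd/res line, or None."""
    m = TASKFILE.search(line)
    if m is None:
        return None
    # Assumed order inside each group: feature, nsect, lbal, lbam, lbah.
    _, _, lbal, lbam, lbah = (int(b, 16) for b in m.group(2).split(":"))
    _, _, hob_lbal, hob_lbam, hob_lbah = (
        int(b, 16) for b in m.group(3).split(":")
    )
    low = lbal | (lbam << 8) | (lbah << 16)
    high = hob_lbal | (hob_lbam << 8) | (hob_lbah << 16)
    return (high << 24) | low


CMD_LINE = "ata9.00: cmd 60/08:00:b8:93:ad/00:00:00:00:00/40 tag 0 ncq 4096 in"
lba = decode_lba(CMD_LINE)
```

The bytes b8, 93, ad read in reverse give 0xad93b8 = 11375544, the sector named in the "end_request: I/O error" line. Dividing by 8 (a 4096-byte buffer block over 512-byte sectors) gives 1421943, the logical block in the "Buffer I/O error" line. The drive is retrying the same sector each time, which is consistent with a media error rather than a cabling fault.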
larson Posted February 26, 2011

Joe.

Excuse an amateur in the field of disk farming, but is this not the time to stop this run and put the disk in another, known-good slot for testing? You are running a new unknown disk on a new unknown controller with new unknown cabling. Isn't it better to have at least something known good in the equation?

Sorry if I didn't read it thoroughly and you already did it.

/Lars Olof
papnikol Posted February 26, 2011

larson might have a point. Although I could not think of a practical solution, I have a possible conspiracy-theory explanation: is it possible that this drive has actually been used before, returned, and then sent out again as new? That would explain your problems...
Joe L. Posted February 26, 2011

Quote:
larson might have a point. Although I could not think of a practical solution, I have a possible conspiracy-theory explanation: is it possible that this drive has actually been used before, returned, and then sent out again as new? That would explain your problems...

I would doubt it had been used before I put it in service, since the initial SMART report showed zero hours of run-time. Now, it might have been drop-kicked by the shipping company on its trip from newegg to my house. Used, no... Abused, possibly.

It is now at 15 hours of run-time. I do not know if anybody has seen a disk slowly fail like this, so I'm just letting it run. It is still VERY SLOWLY reading the disk. After 15 hours it is still only at 1% in the pre-read. I figure in a few more hours the re-allocated sectors will have reached the failure threshold and the disk will be considered as failing SMART tests.

As for suspecting the new disk controller card: no... I do not suspect it at all. Nothing on it would cause the SMART data and everything else to continuously report media errors. Cable problems would show as timeouts or ICRC errors. Once the SMART data shows FAILING_NOW I'll swap cables around, but for now, as far as I can see, the disk controller is working exactly as it should.

(Oh yes, the disk made a lot of clicking noises when I first powered the server on after installing it... even before it was being accessed by the Linux OS. I had a bad feeling even then.)

Joe L.
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 071 070 006 Pre-fail Always - 57557914
3 Spin_Up_Time 0x0003 100 100 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 4
5 Reallocated_Sector_Ct 0x0033 046 046 036 Pre-fail Always - 2238
7 Seek_Error_Rate 0x000f 100 253 030 Pre-fail Always - 27314
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 15
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 4
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 001 001 000 Old_age Always - 13219
188 Command_Timeout 0x0032 100 098 000 Old_age Always - 42950328330
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 073 064 045 Old_age Always - 27 (Lifetime Min/Max 26/36)
194 Temperature_Celsius 0x0022 027 040 000 Old_age Always - 27 (0 26 0 0)
195 Hardware_ECC_Recovered 0x001a 048 046 000 Old_age Always - 57557914
197 Current_Pending_Sector 0x0012 072 072 000 Old_age Always - 1166
198 Offline_Uncorrectable 0x0010 072 072 000 Old_age Offline - 1166
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 209435490254866
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 0
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 3591162722
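To put "VERY SLOWLY" into numbers, the progress line from Joe's earlier post (4,935,168,000 bytes of 2,000,398,934,016 read after 2:16:22) can be extrapolated. This is a back-of-the-envelope Python sketch that assumes the read rate stays constant, which a failing disk will not guarantee; the function names are illustrative.

```python
def parse_hms(text):
    """Convert an 'H:MM:SS' string to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def projected_days(bytes_read, bytes_total, elapsed):
    """Project the full pre-read duration in days at the observed rate."""
    seconds = parse_hms(elapsed) * bytes_total / bytes_read
    return seconds / 86400


# Figures taken from the preclear status box posted after about 2 hours.
rate = 4_935_168_000 / parse_hms("2:16:22")  # bytes per second
days = projected_days(4_935_168_000, 2_000_398_934_016, "2:16:22")
print(f"{rate / 1e6:.2f} MB/s, roughly {days:.0f} days for the pre-read alone")
```

That works out to about 0.6 MB/s and a bit over 38 days for the pre-read, against the 70 to 78 MB/s pre-read speeds reported for the healthy 1TB disk earlier in the thread.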
Joe L. Posted February 26, 2011

The disk finally died...

Interestingly, it stopped responding to SMART reports, or just about anything. I powered down, swapped it to a different disk controller (for those who suspected it might be a disk controller thing), and after powering up got this smart report:

ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 072 070 006 Pre-fail Always - 66340428
3 Spin_Up_Time 0x0003 100 100 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 5
5 Reallocated_Sector_Ct 0x0033 036 036 036 Pre-fail Always FAILING_NOW 2625
7 Seek_Error_Rate 0x000f 100 253 030 Pre-fail Always - 31627
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 18
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 5
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 001 001 000 Old_age Always - 15308
188 Command_Timeout 0x0032 099 098 000 Old_age Always - 60130459662
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 072 064 045 Old_age Always - 28 (Lifetime Min/Max 28/28)
194 Temperature_Celsius 0x0022 028 040 000 Old_age Always - 28 (0 26 0 0)
195 Hardware_ECC_Recovered 0x001a 049 046 000 Old_age Always - 66340428
197 Current_Pending_Sector 0x0012 068 068 000 Old_age Always - 1350
198 Offline_Uncorrectable 0x0010 068 068 000 Old_age Offline - 1350
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 10423885627416
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 0
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 3599756433

I don't like that it re-allocated the sectors without them being written to... To me, that is a mistake in the disk's firmware.

I'll just go through the RMA process now.

Joe L.
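The FAILING_NOW flag in the last report follows from the normalized columns: Reallocated_Sector_Ct dropped from 100 to 036, which is its threshold. A small Python sketch of that comparison is below. It reflects how the WHEN_FAILED column is commonly described (value at or below threshold means failing now, worst at or below threshold means failed in the past) and is an assumption for illustration, not code from smartctl or preclear.

```python
def when_failed(value, worst, thresh):
    """Classify an attribute from its normalized VALUE, WORST and THRESH."""
    if value <= thresh:
        return "FAILING_NOW"
    if worst <= thresh:
        return "In_the_past"
    return "-"


# Reallocated_Sector_Ct across the reports in this thread (THRESH 036).
history = [(100, 100), (94, 94), (65, 65), (46, 46), (36, 36)]
states = [when_failed(value, worst, 36) for value, worst in history]
```

Only the final pair (036 036 against a threshold of 036) trips the flag, even though the raw count had already passed 2,000 re-allocated sectors in the report before it.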
papnikol Posted February 26, 2011

Come to think of it now, isn't that supposed to be one of the Seagate drives that have to have their firmware updated because of the "click of death" issue? If it came with the old firmware, that might be your problem.

I hope you now understand how important and useful preclear is, Joe (just kidding).
SSD Posted February 26, 2011

Quote:
I don't like that it re-allocated the sectors without them being written to... To me, that is a mistake in the disk's firmware.

I believe the drive is supposed to do pre-emptive reallocations when it has trouble reading data but is ultimately successful. Without such a feature, a parity check would almost never result in a reallocated sector.

Am I missing something?
Joe L. Posted February 26, 2011

Quote:
I believe the drive is supposed to do pre-emptive reallocations when it has trouble reading data but is ultimately successful. Without such a feature, a parity check would almost never result in a reallocated sector. Am I missing something?

Yes, a "feature" of the "md" driver is to re-construct the failed "read" (by use of the other disks in the array), supply it to the OS program requesting the data, and then write the same sector back to the disk. It is that "write" that allows the sector pending re-allocation to be re-allocated.

The block of code is in unraid.c

    /* If we're trying to read a failed disk, then we must read
     * parity and all the "other" disks and compute it.
     */
    if ((col->read_bi || (failed && sh->col[failed_num].read_bi)) &&
        !buff_uptodate(col) && !buff_locked(col)) {
            if (disk_valid( col)) {
                    dprintk("Reading col %d (sync=%d)\n", i, syncing);
                    set_buff_locked( col);
                    locked++;
                    set_bit(MD_BUFF_READ, &col->state);
            }
            else if (uptodate == disks-1) {
                    dprintk("Computing col %d\n", i);
                    compute_block(sh, i); /* also sets it Uptodate */
                    uptodate++;

                    /* if failed disk is enabled, write it */
                    if (disk_enabled( col)) {
                            dprintk("Writing reconstructed failed col %d\n", i);
                            set_buff_locked( col);
                            locked++;
                            set_bit(MD_BUFF_WRITE, &col->state);
                    }

                    /* this stripe is also now in-sync */
                    if (syncing)
                            set_bit(STRIPE_INSYNC, &sh->state);
            }
    }

Without a subsequent write, or a successful read, I see no way a sector can be re-allocated with the correct contents.
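The reconstruct-then-write-back behaviour Joe describes can be illustrated with a toy model. The Python sketch below is not the unRAID driver: it assumes a single XOR parity block per stripe, represents each disk's block as a bytes object, and uses made-up names (`xor_blocks`, `read_with_repair`) purely to show the order of operations.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def read_with_repair(disks, parity, failed_index):
    """Rebuild an unreadable block from parity plus the other disks,
    then write it back to the failed position (the step that gives a
    drive the chance to remap a pending sector)."""
    others = [blk for i, blk in enumerate(disks) if i != failed_index]
    rebuilt = xor_blocks(others + [parity])
    disks[failed_index] = rebuilt  # the write-back
    return rebuilt


# Three data "disks" with one block each, plus their parity block.
data = [b"\x01\x02", b"\x10\x20", b"\xaa\x55"]
parity = xor_blocks(data)

damaged = list(data)
damaged[1] = None  # pretend disk 1 returned a read error for this block
recovered = read_with_repair(damaged, parity, 1)
```

In this model the caller gets the rebuilt block and the failed position is rewritten with the same data. During a standalone preclear pre-read there is no parity and no write-back, which is why Joe finds the growing re-allocation count puzzling.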
subwars Posted February 28, 2011

Hi, I have just run preclear for the first time on a couple of new 2TB drives. I had them running simultaneously, and the reports differ a fair bit, especially the G-Sense value; the drive must have been dropped at some point, by the looks of it? It still says it passed, though. Anyway, here's a copy-paste from unmenu.

drive 1 - 26hrs 39mins to clear

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 100 100 051 Pre-fail Always - 8
2 Throughput_Performance 0x0026 252 252 000 Old_age Always - 0
3 Spin_Up_Time 0x0023 068 068 025 Pre-fail Always - 9971
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 3
5 Reallocated_Sector_Ct 0x0033 252 252 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 252 252 051 Old_age Always - 0
8 Seek_Time_Performance 0x0024 252 252 015 Old_age Offline - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 33
10 Spin_Retry_Count 0x0032 252 252 051 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 252 252 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 3
181 Program_Fail_Cnt_Total 0x0022 100 100 000 Old_age Always - 154883
191 G-Sense_Error_Rate 0x0022 100 100 000 Old_age Always - 29
192 Power-Off_Retract_Count 0x0022 252 252 000 Old_age Always - 0
194 Temperature_Celsius 0x0002 064 050 000 Old_age Always - 30 (Lifetime Min/Max 25/50)
195 Hardware_ECC_Recovered 0x003a 100 100 000 Old_age Always - 0
196 Reallocated_Event_Count 0x0032 252 252 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 252 252 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 252 252 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0036 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x002a 100 100 000 Old_age Always - 7
223 Load_Retry_Count 0x0032 252 252 000 Old_age Always - 0
225 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 3

drive 2 - 27hrs 18mins to clear

ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 100 100 051 Pre-fail Always - 52
2 Throughput_Performance 0x0026 252 252 000 Old_age Always - 0
3 Spin_Up_Time 0x0023 068 068 025 Pre-fail Always - 9927
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 3
5 Reallocated_Sector_Ct 0x0033 252 252 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 252 252 051 Old_age Always - 0
8 Seek_Time_Performance 0x0024 252 252 015 Old_age Offline - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 33
10 Spin_Retry_Count 0x0032 252 252 051 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 252 252 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 3
181 Program_Fail_Cnt_Total 0x0022 100 100 000 Old_age Always - 30707
191 G-Sense_Error_Rate 0x0022 100 100 000 Old_age Always - 254
192 Power-Off_Retract_Count 0x0022 252 252 000 Old_age Always - 0
194 Temperature_Celsius 0x0002 064 054 000 Old_age Always - 30 (Lifetime Min/Max 24/46)
195 Hardware_ECC_Recovered 0x003a 100 100 000 Old_age Always - 0
196 Reallocated_Event_Count 0x0032 252 252 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 252 252 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 252 252 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0036 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x002a 100 100 000 Old_age Always - 13
223 Load_Retry_Count 0x0032 252 252 000 Old_age Always - 0
225 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 3
icon123 Posted February 28, 2011

If these are the results that I get after (2) preclears on a new 2 TB EARS drive, should I RMA it?

** Changed attributes in files: /tmp/smart_start_sdb /tmp/smart_finish_sdb
ATTRIBUTE NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS RAW_VALUE
Raw_Read_Error_Rate = 143 100 51 ok 5916
Temperature_Celsius = 132 133 0 ok 18
No SMART attributes are FAILING_NOW

0 sectors were pending re-allocation before the start of the preclear.
0 sectors were pending re-allocation after pre-read in cycle 1 of 1.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
0 sectors are pending re-allocation at the end of the preclear,
the number of sectors pending re-allocation did not change.
0 sectors had been re-allocated before the start of the preclear.
0 sectors are re-allocated at the end of the preclear,
the number of sectors re-allocated did not change.

** Changed attributes in files: /tmp/smart_start_sdg /tmp/smart_finish_sdg
ATTRIBUTE NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS RAW_VALUE
Raw_Read_Error_Rate = 166 143 51 ok 7198
Seek_Error_Rate = 100 200 0 ok 0
Temperature_Celsius = 132 131 0 ok 18
No SMART attributes are FAILING_NOW

0 sectors were pending re-allocation before the start of the preclear.
0 sectors were pending re-allocation after pre-read in cycle 1 of 1.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
0 sectors are pending re-allocation at the end of the preclear,
the number of sectors pending re-allocation did not change.
0 sectors had been re-allocated before the start of the preclear.
0 sectors are re-allocated at the end of the preclear,
the number of sectors re-allocated did not change.

Thanks.
prostuff1 Posted February 28, 2011

No, they are fine.
icon123 Posted February 28, 2011

Were you responding to me or subwars?
prostuff1 Posted February 28, 2011

Quote:
Were you responding to me or subwars?

Both.
cal87 Posted March 1, 2011

I have a Seagate 1.5TB drive that I was using in my DirecTV HD DVR. The video was freezing frequently. I replaced it with an EVDS drive, and there were no more freezing issues. I ran the Seagate through preclear and got the following results.

Disk Temperature: 31C, Elapsed Time: 18:35:57
========================================================================1.7
== ST31500341AS 9VS09RZ8
== Disk /dev/sdm has been successfully precleared
== with a starting sector of 63
============================================================================
** Changed attributes in files: /tmp/smart_start_sdm /tmp/smart_finish_sdm
ATTRIBUTE NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS RAW_VALUE
Raw_Read_Error_Rate = 102 115 6 ok 3967947
Spin_Retry_Count = 100 100 97 near_thresh 2
End-to-End_Error = 100 100 99 near_thresh 0
High_Fly_Writes = 1 1 0 near_thresh 284
Airflow_Temperature_Cel = 69 75 45 near_thresh 31
Temperature_Celsius = 31 25 0 ok 31
Hardware_ECC_Recovered = 49 22 0 ok 3967947
No SMART attributes are FAILING_NOW

0 sectors were pending re-allocation before the start of the preclear.
0 sectors were pending re-allocation after pre-read in cycle 1 of 1.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
0 sectors are pending re-allocation at the end of the preclear,
the number of sectors pending re-allocation did not change.
44 sectors had been re-allocated before the start of the preclear.
44 sectors are re-allocated at the end of the preclear,
the number of sectors re-allocated did not change.

Is this drive OK? I am not planning to use it in my array. HTPC perhaps.
Joe L. Posted March 2, 2011

Quote:
Is this drive ok? I am not planning to use it in my array. HTPC perhaps.

Looks OK to me.