jbuszkie

Members
  • Posts

    696
Everything posted by jbuszkie

  1. OK, I'll try reseating the drives and swapping the cables around. The drives are being cleared in a separate machine, so it's not my main UnRaid machine. Interestingly, even on this separate machine I'm seeing much slower reads on the post-read. I'm still baffled by this. I'm getting about 40MB/s calculated while the test says 84MB/s:
       Post Read in progress on /dev/sda: 75% complete.
       ( 1,501,936,128,000 of 2,000,398,934,016 bytes read ) at 84.3 MB/s
       Disk Temperature: 35C, Using Block size of 8,225,280 Bytes
       Next report at 100%
       Calculated Read Speed: 40 MB/s
       Elapsed Time of current cycle: 10:15:27
       Total Elapsed time: 22:31:51
     All three remaining drives exhibit this. The pre-read and the zeroing were both fast:
       Pre Read finished on /dev/sdc
       ( 2,000,388,096,000 of 2,000,398,934,016 bytes read)
       Pre Read Elapsed Time: 6:15:27
       Total Elapsed Time: 6:15:32
       Disk Temperature: -->41<--C, Using Block size of 8,225,280 Bytes
       Calculated Read Speed - 88 MB/s
       Zeroing Disk /dev/sdc Done.
       Zeroing Elapsed Time: 5:55:17
       Total Elapsed Time: 12:10:52
       Disk Temperature: -->42<--C,
       Calculated Write Speed: 93 MB/s
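The gap between the displayed rate and the calculated rate in the post above can be checked by hand. The following is a minimal sketch, assuming the script's "Calculated Read Speed" is simply bytes read divided by elapsed time and that MB means 1,000,000 bytes; the helper names are made up for illustration and are not part of the preclear script.

```python
def hms_to_seconds(hms):
    """Convert an 'H:MM:SS' elapsed-time string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def average_mb_per_s(bytes_read, elapsed):
    """Average throughput in MB/s (assumption: 1 MB = 1,000,000 bytes)."""
    return bytes_read / hms_to_seconds(elapsed) / 1_000_000


# Figures copied from the post: post-read on /dev/sda, pre-read on /dev/sdc.
post_read = average_mb_per_s(1_501_936_128_000, "10:15:27")
pre_read = average_mb_per_s(2_000_388_096_000, "6:15:27")

print(f"post-read average: {post_read:.1f} MB/s")
print(f"pre-read average:  {pre_read:.1f} MB/s")
```

Under those assumptions the averages land close to the 40 MB/s and 88 MB/s "calculated" figures, which suggests the 84.3 MB/s on the progress line is some other measurement (possibly an instantaneous rate) rather than the average over the phase.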
  2. Joe, I'm trying to preclear 4 2T samsung drives. 3 are chugging along but one failed right after zeroing. I grabbed the first smart report as the drive, now, is un responsive. There are some errors reported. SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x002f 100 100 051 Pre-fail Always - 0 2 Throughput_Performance 0x0026 252 252 000 Old_age Always - 0 3 Spin_Up_Time 0x0023 066 066 025 Pre-fail Always - 10438 4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 2 5 Reallocated_Sector_Ct 0x0033 252 252 010 Pre-fail Always - 0 7 Seek_Error_Rate 0x002e 252 252 051 Old_age Always - 0 8 Seek_Time_Performance 0x0024 252 252 015 Old_age Offline - 0 9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 4 10 Spin_Retry_Count 0x0032 252 252 051 Old_age Always - 0 11 Calibration_Retry_Count 0x0032 252 252 000 Old_age Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 2 181 Unknown_Attribute 0x0022 252 252 000 Old_age Always - 0 191 G-Sense_Error_Rate 0x0022 252 252 000 Old_age Always - 0 192 Power-Off_Retract_Count 0x0022 252 252 000 Old_age Always - 0 195 Hardware_ECC_Recovered 0x003a 100 100 000 Old_age Always - 0 196 Reallocated_Event_Count 0x0032 252 252 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 252 252 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0030 252 252 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x0036 100 100 000 Old_age Always - 7 200 Multi_Zone_Error_Rate 0x002a 252 252 000 Old_age Always - 0 223 Load_Retry_Count 0x0032 252 252 000 Old_age Always - 0 225 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 2 SMART Error Log Version: 1 ATA Error Count: 7 (device log contains only the most recent five errors) CR = Command Register [HEX] FR = Features Register [HEX] SC = Sector Count Register [HEX] SN = Sector Number Register [HEX] CL = Cylinder Low Register 
[HEX] CH = Cylinder High Register [HEX] DH = Device/Head Register [HEX] DC = Device Command Register [HEX] ER = Error register [HEX] ST = Status register [HEX] Powered_Up_Time is measured from power on, and printed as DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes, SS=sec, and sss=millisec. It "wraps" after 49.710 days. Error 7 occurred at disk power-on lifetime: 3 hours (0 days + 3 hours) When the command that caused the error occurred, the device was active or idle. After command completion occurred, registers were: ER ST SC SN CL CH DH -- -- -- -- -- -- -- 84 51 00 00 4f c2 00 Error: ABRT Commands leading to the command that caused the error were: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name -- -- -- -- -- -- -- -- ---------------- -------------------- b0 d0 01 00 4f c2 00 08 00:00:13.498 SMART READ DATA b0 d0 01 00 4f c2 00 08 00:00:13.498 SMART READ DATA b0 da 00 00 4f c2 00 08 00:00:13.498 SMART RETURN STATUS b0 da 00 00 4f c2 00 08 00:00:13.497 SMART RETURN STATUS ec 00 00 00 00 00 00 08 00:00:13.497 IDENTIFY DEVICE Error 6 occurred at disk power-on lifetime: 3 hours (0 days + 3 hours) When the command that caused the error occurred, the device was active or idle. After command completion occurred, registers were: ER ST SC SN CL CH DH -- -- -- -- -- -- -- 84 51 00 00 4f c2 00 Error: ABRT Commands leading to the command that caused the error were: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name -- -- -- -- -- -- -- -- ---------------- -------------------- b0 d1 01 01 4f c2 00 08 00:00:10.913 SMART READ ATTRIBUTE THRESHOLDS [OBS-4] b0 d0 01 00 4f c2 00 08 00:00:10.913 SMART READ DATA b0 da 00 00 4f c2 00 08 00:00:10.912 SMART RETURN STATUS b0 da 00 00 4f c2 00 08 00:00:10.912 SMART RETURN STATUS ec 00 00 00 00 00 00 08 00:00:10.912 IDENTIFY DEVICE Error 5 occurred at disk power-on lifetime: 2 hours (0 days + 2 hours) When the command that caused the error occurred, the device was active or idle. 
After command completion occurred, registers were: ER ST SC SN CL CH DH -- -- -- -- -- -- -- 84 51 00 00 4f c2 00 Error: ABRT Commands leading to the command that caused the error were: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name -- -- -- -- -- -- -- -- ---------------- -------------------- b0 d0 01 00 4f c2 00 08 00:00:08.413 SMART READ DATA b0 d0 01 00 4f c2 00 08 00:00:08.413 SMART READ DATA b0 da 00 00 4f c2 00 08 00:00:08.413 SMART RETURN STATUS b0 da 00 00 4f c2 00 08 00:00:08.412 SMART RETURN STATUS ec 00 00 00 00 00 00 08 00:00:08.412 IDENTIFY DEVICE Error 4 occurred at disk power-on lifetime: 1 hours (0 days + 1 hours) When the command that caused the error occurred, the device was active or idle. After command completion occurred, registers were: ER ST SC SN CL CH DH -- -- -- -- -- -- -- 84 51 00 00 00 00 00 Commands leading to the command that caused the error were: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name -- -- -- -- -- -- -- -- ---------------- -------------------- ec 00 00 00 00 00 00 08 00:00:07.044 IDENTIFY DEVICE 60 d1 00 18 ea e6 40 08 00:00:07.044 READ FPDMA QUEUED 60 d1 00 18 e9 e6 40 08 00:00:07.044 READ FPDMA QUEUED 60 d1 00 18 e8 e6 40 08 00:00:07.044 READ FPDMA QUEUED 60 d1 00 18 e7 e6 40 08 00:00:07.044 READ FPDMA QUEUED Error 3 occurred at disk power-on lifetime: 1 hours (0 days + 1 hours) When the command that caused the error occurred, the device was active or idle. 
After command completion occurred, registers were: ER ST SC SN CL CH DH -- -- -- -- -- -- -- 84 51 00 00 4f c2 00 Error: ABRT Commands leading to the command that caused the error were: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name -- -- -- -- -- -- -- -- ---------------- -------------------- b0 d1 01 01 4f c2 00 08 00:00:06.651 SMART READ ATTRIBUTE THRESHOLDS [OBS-4] b0 d0 01 00 4f c2 00 08 00:00:06.650 SMART READ DATA b0 da 00 00 4f c2 00 08 00:00:06.650 SMART RETURN STATUS b0 da 00 00 4f c2 00 08 00:00:06.649 SMART RETURN STATUS ec 00 00 00 00 00 00 08 00:00:06.649 IDENTIFY DEVICE SMART Self-test log structure revision number 1 No self-tests have been logged. [To run self-tests, use: smartctl -t] SMART Selective Self-Test Log Data Structure Revision Number (0) should be 1 SMART Selective self-test log data structure revision number 0 Warning: ATA Specification requires selective self-test log data structure revision number = 1 SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS 1 0 0 Completed [00% left] (0-65535) 2 0 0 Not_testing 3 0 0 Not_testing 4 0 0 Not_testing 5 0 0 Not_testing Selective self-test flags (0x0): After scanning selected spans, do NOT read-scan remainder of disk. If Selective self-test is pending on power-up, resume after 0 minute delay. I'll try power cycling once the other drives finish (about 2 hours or so) and see if the drive comes back.. and grab another SMART report... but my guess is this drive might be a dud! Do you concur?
  3. Was your unraid array on the same system, and was it up? Mine's on the same system (but most of the time it wasn't doing anything!). So if you have 1GB and I have 1GB, then maybe it's the processor speed? I think mine's at 900MHz. Maybe I'll try to bump it up and see what I get for speeds. Maybe I'll just hack the script to do only the post-read. I'd hate to wait another two days just for a speed experiment! Jim. And I have the jumper installed.
  4. Wow, it took ~52 hours for the 2T to finish. I added a second disk (also 2T) to the mix, so that probably added some time! Results are posted in the pre-clear results thread. No questions, just FYI. Now on to replacing my parity drive and removing two 500GB drives (one is PATA). And since when did a 500GB drive become small to me? Thanks again for the great script! Jim
  5. Wow the 2T disks take a while.. Disk1 == Disk /dev/sdh has been successfully precleared == == Ran 1 preclear-disk cycle == == Using :Read block size = 8225280 Bytes == Last Cycle's Pre Read Time : 7:41:11 (72 MB/s) == Last Cycle's Zeroing time : 9:42:22 (57 MB/s) == Last Cycle's Post Read Time : 35:29:32 (15 MB/s) == Last Cycle's Total Time : 52:54:19 == == Total Elapsed Time 52:54:19 == == Disk Start Temperature: 33C == == Current Disk Temperature: 36C, == ============================================================================ S.M.A.R.T. error count differences detected after pre-clear note, some 'raw' values may change, but not be an indication of a problem 58c58 < 7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0 --- > > 7 Seek_Error_Rate 0x002e 100 253 000 Old_age Always - 0 63c63 < 193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 83 --- > > 193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 821 ============================================================================ Disk 2 Date: Sun Jun 13 22:59:32 EDT 2010 ============================================================================ == == Disk /dev/sdb has been successfully precleared == == Ran 1 preclear-disk cycle == == Using :Read block size = 8225280 Bytes == Last Cycle's Pre Read Time : 7:29:39 (74 MB/s) == Last Cycle's Zeroing time : 8:24:39 (66 MB/s) == Last Cycle's Post Read Time : 35:33:46 (15 MB/s) == Last Cycle's Total Time : 51:29:23 == == Total Elapsed Time 51:29:23 == == Disk Start Temperature: 26C == == Current Disk Temperature: 32C, == ============================================================================ S.M.A.R.T. error count differences detected after pre-clear note, some 'raw' values may change, but not be an indication of a problem 19,20c19,20 < Offline data collection status: (0x80) Offline data collection activity < was never started. 
--- > > Offline data collection status: (0x84) Offline data collection activity > > was suspended by an interrupting command from host. 54c54 < 1 Raw_Read_Error_Rate 0x002f 100 253 051 Pre-fail Always - 0 --- > > 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0 63c63 < 193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 19 --- > > 193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 760 67c67 < 199 UDMA_CRC_Error_Count 0x0032 200 253 000 Old_age Always - 0 --- > > 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0 ============================================================================ I went from 72MB/s down to 15MB/s for the Post read! 52 hours!! Granted I was running two disks at once. I wonder if it would have gone faster if I had more memory. Or Maybe boosted the cpu speed. Right now I have one 1GB and running at 900MHz (under clocked to save power) It also seems like the 2 port syba 1x PCIe sata card is slightly faster than the on board (sdb was on that card). Maybe because the rest of the array is on the on board as well. Jim
  6. That's what I thought and was hoping for... Now I'm going to have to wait ~6 extra hours for the test to finish! Jim
  7. Joe, I'm currently pre-clearing a new 2T disk. The post-read is going way slower than the pre-read or zeroing phase. I averaged about 70MB/s for the first two phases and now I'm getting about 30MB/s (the display still says 85-95MB/s, which is also weird). Is this normal with the new zero checking? Thanks, Jim
  8. This may be a stupid question, but how do I get minicom to work? I looked in the forum and searched for a Slackware minicom package, but that bore no fruit! Is there another serial port communication package that's built in or available as a package? Will I have to run a full Slackware distribution? Thanks, Jim
  9. I think I figured out what was happening with the speed of my tests. It turns out the disks are slower when they are formatted. Not sure why, but if I "clear" the disk and wipe out the partitions, the test is much faster. It uses a much smaller block size when the disk is formatted than when it's clear.
     fdisk (formatted):
       root@Tower2:/boot/scripts# fdisk -l /dev/sda
       Disk /dev/sda: 1500.3 GB, 1500301910016 bytes
       1 heads, 63 sectors/track, 46512336 cylinders
       Units = cylinders of 63 * 512 = 32256 bytes
       Disk identifier: 0x00000000
       Device Boot Start End Blocks Id System
       /dev/sda1 2 46512336 1465138552+ 83 Linux
       Partition 1 does not end on cylinder boundary.
     fdisk after clearing:
       root@Tower2:/boot/scripts# fdisk -l /dev/sda
       Disk /dev/sda: 1500.3 GB, 1500301910016 bytes
       255 heads, 63 sectors/track, 182401 cylinders
       Units = cylinders of 16065 * 512 = 8225280 bytes
       Disk identifier: 0x00000000
       Device Boot Start End Blocks Id System
       /dev/sda1 1 182402 1465138552+ 0 Empty
       Partition 1 does not end on cylinder boundary.
     The smaller block size makes the test run much slower. Have you seen this? It makes my reads drop from 90ish MB/s to about 30-40MB/s. Why does fdisk report differently? Does the smaller block size make the test any better, as in more thrashing? This explains why I had two very different speeds! Jim
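The two "Units" values in the fdisk output above follow directly from the geometry fdisk reports. The sketch below only reproduces that arithmetic; whether the preclear script actually takes its read block size from this unit is the post's guess, not something shown here, and the function name is hypothetical.

```python
SECTOR_BYTES = 512
DISK_BYTES = 1_500_301_910_016  # "Disk /dev/sda: 1500.3 GB" from the post


def cylinder_bytes(heads, sectors_per_track):
    """Size of one fdisk 'cylinder' unit for the reported geometry."""
    return heads * sectors_per_track * SECTOR_BYTES


formatted_unit = cylinder_bytes(heads=1, sectors_per_track=63)
cleared_unit = cylinder_bytes(heads=255, sectors_per_track=63)

print("formatted unit:", formatted_unit, "bytes,",
      DISK_BYTES // formatted_unit, "whole cylinders")
print("cleared unit:  ", cleared_unit, "bytes,",
      DISK_BYTES // cleared_unit, "whole cylinders")
print("ratio:", cleared_unit // formatted_unit)
```

The cleared-disk unit equals the 8,225,280-byte block size that preclear prints elsewhere in these posts, and it is 255 times larger than the formatted-disk unit, so a read loop sized by the smaller unit would issue far more, far smaller reads.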
  10. There is a program called screen. It allows you to disconnect from and reconnect to a particular session. I have that set up and it works nicely. There is a post somewhere that describes what you need; you might be able to search the forums for it. Also, if you have mail set up, you can use the mail parameters and it will e-mail you the differences as well. Jim
  11. If you didn't get any results when preclear finished, then your disk is good! Also, the SMART results you posted look great. If you had issues you would have seen output along the lines of:
       S.M.A.R.T. error count differences detected after pre-clear
       note, some 'raw' values may change, but not be an indication of a problem
       57c57
       < 1 Raw_Read_Error_Rate 0x000f 099 099 051 Pre-fail Always - 5005
       ---
       > 1 Raw_Read_Error_Rate 0x000f 099 099 051 Pre-fail Always - 5264
       66c66
       < 13 Read_Soft_Error_Rate 0x000e 099 099 000 Old_age Always - 4648
       ---
       > 13 Read_Soft_Error_Rate 0x000e 099 099 000 Old_age Always - 4912
       69c69
       < 187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 4952
       ---
       > 187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 9596
       71c71
       < 190 Airflow_Temperature_Cel 0x0022 071 067 000 Old_age Always - 29 (Lifetime Min/Max 29/33)
       ---
       > 190 Airflow_Temperature_Cel 0x0022 068 067 000 Old_age Always - 32 (Lifetime Min/Max 29/33)
       74c74
       < 197 Current_Pending_Sector 0x0012 092 092 000 Old_age Always - 331
       ---
       > 197 Current_Pending_Sector 0x0012 100 092 000 Old_age Always - 0
       78c78
       < 201 Soft_Read_Error_Rate 0x000a 097 097 000 Old_age Always - 228
       ---
       > 201 Soft_Read_Error_Rate 0x000a 100 097 000 Old_age Always - 0
     You probably got a "pre-clear was successful" message or something like that. Enjoy the new drive!
  12. There is something weird going on. I ran the test on both disks again concurrently, and this time I got the same results as running a single disk. I can't reproduce the 25MB/s or even the 34MB/s. Maybe I should just be happy I'm getting the faster speeds! Maybe I was running some modified version of the script that ran slow? Bizarre.
  13. After I ran it a second time by itself, it came in at about 17.8 hours, so I feel better. I wouldn't have thought that running two disks would have THAT much of an effect! Maybe I'll boost the memory speed and CPU speed. Maybe that will help concurrent pre-clears. I've got it crippled to lower the power draw.
  14. I ran the disk one more time. This is what I got: S.M.A.R.T. error count differences detected after pre-clear note, some 'raw' values may change, but not be an indication of a problem 57c57 < 1 Raw_Read_Error_Rate 0x000f 099 099 051 Pre-fail Always - 5005 --- > 1 Raw_Read_Error_Rate 0x000f 099 099 051 Pre-fail Always - 5264 66c66 < 13 Read_Soft_Error_Rate 0x000e 099 099 000 Old_age Always - 4648 --- > 13 Read_Soft_Error_Rate 0x000e 099 099 000 Old_age Always - 4912 69c69 < 187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 4952 --- > 187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 9596 71c71 < 190 Airflow_Temperature_Cel 0x0022 071 067 000 Old_age Always - 29 (Lifetime Min/Max 29/33) --- > 190 Airflow_Temperature_Cel 0x0022 068 067 000 Old_age Always - 32 (Lifetime Min/Max 29/33) 74c74 < 197 Current_Pending_Sector 0x0012 092 092 000 Old_age Always - 331 --- > 197 Current_Pending_Sector 0x0012 100 092 000 Old_age Always - 0 78c78 < 201 Soft_Read_Error_Rate 0x000a 097 097 000 Old_age Always - 228 --- > 201 Soft_Read_Error_Rate 0x000a 100 097 000 Old_age Always - 0 ============================================================================ The Current_Pending_Sector didn't go up.. But neither did the Reallocated_Sectors?? what Happened to the 331 previous pending? Also the Raw_Read_Error_Rate and the Read_Soft_Error_Rate both went up.. but not as much as the first time. However the Reported_Uncorrect almost doubled. 
I also noted a bunch of errors in the syslog from the first time I ran the test (with both disks going) Here is a snippit of the error: Aug 23 16:20:49 Tower2 kernel: ata1.00: exception Emask 0x0 SAct 0x7 SErr 0x0 action 0x0 Aug 23 16:20:49 Tower2 kernel: ata1.00: irq_stat 0x40000008 Aug 23 16:20:49 Tower2 kernel: ata1.00: cmd 60/00:08:18:fa:a1/02:00:62:00:00/40 tag 1 ncq 262144 in Aug 23 16:20:49 Tower2 kernel: res 41/40:60:b8:fa:a1/85:01:62:00:00/40 Emask 0x409 (media error) Aug 23 16:20:49 Tower2 kernel: ata1.00: status: { DRDY ERR } Aug 23 16:20:49 Tower2 kernel: ata1.00: error: { UNC } Aug 23 16:20:49 Tower2 kernel: ata1.00: configured for UDMA/133 Aug 23 16:20:49 Tower2 kernel: ata1: EH complete Aug 23 16:20:49 Tower2 kernel: sd 1:0:0:0: [sda] 2930277168 512-byte hardware sectors (1500302 MB) Aug 23 16:20:49 Tower2 kernel: sd 1:0:0:0: [sda] Write Protect is off Aug 23 16:20:49 Tower2 kernel: sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00 Aug 23 16:20:49 Tower2 kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA Aug 23 16:20:52 Tower2 kernel: ata1.00: exception Emask 0x0 SAct 0x7 SErr 0x0 action 0x0 Aug 23 16:20:52 Tower2 kernel: ata1.00: irq_stat 0x40000008 Aug 23 16:20:52 Tower2 kernel: ata1.00: cmd 60/00:08:18:fa:a1/02:00:62:00:00/40 tag 1 ncq 262144 in Aug 23 16:20:52 Tower2 kernel: res 41/40:60:b8:fa:a1/85:01:62:00:00/40 Emask 0x409 (media error) Aug 23 16:20:52 Tower2 kernel: ata1.00: status: { DRDY ERR } Aug 23 16:20:52 Tower2 kernel: ata1.00: error: { UNC } Aug 23 16:20:52 Tower2 kernel: ata1.00: configured for UDMA/133 Aug 23 16:20:52 Tower2 kernel: ata1: EH complete Aug 23 16:20:52 Tower2 kernel: sd 1:0:0:0: [sda] 2930277168 512-byte hardware sectors (1500302 MB) Aug 23 16:20:52 Tower2 kernel: sd 1:0:0:0: [sda] Write Protect is off Aug 23 16:20:52 Tower2 kernel: sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00 Aug 23 16:20:52 Tower2 kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support 
DPO or FUA Aug 23 16:20:55 Tower2 kernel: ata1.00: exception Emask 0x0 SAct 0x7 SErr 0x0 action 0x0 Aug 23 16:20:55 Tower2 kernel: ata1.00: irq_stat 0x40000008 Aug 23 16:20:55 Tower2 kernel: ata1.00: cmd 60/00:08:18:fa:a1/02:00:62:00:00/40 tag 1 ncq 262144 in Aug 23 16:20:55 Tower2 kernel: res 41/40:60:b8:fa:a1/85:01:62:00:00/40 Emask 0x409 (media error) Aug 23 16:20:55 Tower2 kernel: ata1.00: status: { DRDY ERR } Aug 23 16:20:55 Tower2 kernel: ata1.00: error: { UNC } Aug 23 16:20:55 Tower2 kernel: ata1.00: configured for UDMA/133 Aug 23 16:20:55 Tower2 kernel: ata1: EH complete Aug 23 16:20:55 Tower2 kernel: sd 1:0:0:0: [sda] 2930277168 512-byte hardware sectors (1500302 MB) Aug 23 16:20:55 Tower2 kernel: sd 1:0:0:0: [sda] Write Protect is off Aug 23 16:20:55 Tower2 kernel: sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00 Aug 23 16:20:55 Tower2 kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA Aug 23 16:20:57 Tower2 kernel: ata1.00: exception Emask 0x0 SAct 0x7 SErr 0x0 action 0x0 Aug 23 16:20:57 Tower2 kernel: ata1.00: irq_stat 0x40000008 Aug 23 16:20:57 Tower2 kernel: ata1.00: cmd 60/00:08:18:fa:a1/02:00:62:00:00/40 tag 1 ncq 262144 in Aug 23 16:20:57 Tower2 kernel: res 41/40:60:b8:fa:a1/85:01:62:00:00/40 Emask 0x409 (media error) Aug 23 16:20:57 Tower2 kernel: ata1.00: status: { DRDY ERR } Aug 23 16:20:57 Tower2 kernel: ata1.00: error: { UNC } Aug 23 16:20:57 Tower2 kernel: ata1.00: configured for UDMA/133 Aug 23 16:20:57 Tower2 kernel: ata1: EH complete Aug 23 16:20:57 Tower2 kernel: sd 1:0:0:0: [sda] 2930277168 512-byte hardware sectors (1500302 MB) Aug 23 16:20:57 Tower2 kernel: sd 1:0:0:0: [sda] Write Protect is off Aug 23 16:20:57 Tower2 kernel: sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00 Aug 23 16:20:57 Tower2 kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA Aug 23 16:21:00 Tower2 kernel: ata1.00: exception Emask 0x0 SAct 0x7 SErr 0x0 action 0x0 Aug 23 16:21:00 Tower2 
kernel: ata1.00: irq_stat 0x40000008 Aug 23 16:21:00 Tower2 kernel: ata1.00: cmd 60/00:08:18:fa:a1/02:00:62:00:00/40 tag 1 ncq 262144 in Aug 23 16:21:00 Tower2 kernel: res 41/40:60:b8:fa:a1/85:01:62:00:00/40 Emask 0x409 (media error) Aug 23 16:21:00 Tower2 kernel: ata1.00: status: { DRDY ERR } Aug 23 16:21:00 Tower2 kernel: ata1.00: error: { UNC } Aug 23 16:21:00 Tower2 kernel: ata1.00: configured for UDMA/133 Aug 23 16:21:00 Tower2 kernel: ata1: EH complete Aug 23 16:21:00 Tower2 kernel: sd 1:0:0:0: [sda] 2930277168 512-byte hardware sectors (1500302 MB) Aug 23 16:21:00 Tower2 kernel: sd 1:0:0:0: [sda] Write Protect is off Aug 23 16:21:00 Tower2 kernel: sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00 Aug 23 16:21:00 Tower2 kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA Aug 23 16:21:03 Tower2 kernel: ata1.00: exception Emask 0x0 SAct 0x7 SErr 0x0 action 0x0 Aug 23 16:21:03 Tower2 kernel: ata1.00: irq_stat 0x40000008 Aug 23 16:21:03 Tower2 kernel: ata1.00: cmd 60/00:08:18:fa:a1/02:00:62:00:00/40 tag 1 ncq 262144 in Aug 23 16:21:03 Tower2 kernel: res 41/40:60:b8:fa:a1/85:01:62:00:00/40 Emask 0x409 (media error) Aug 23 16:21:03 Tower2 kernel: ata1.00: status: { DRDY ERR } Aug 23 16:21:03 Tower2 kernel: ata1.00: error: { UNC } Aug 23 16:21:03 Tower2 kernel: ata1.00: configured for UDMA/133 Aug 23 16:21:03 Tower2 kernel: sd 1:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08 Aug 23 16:21:03 Tower2 kernel: sd 1:0:0:0: [sda] Sense Key : 0x3 [current] [descriptor] Aug 23 16:21:03 Tower2 kernel: Descriptor sense data with sense descriptors (in hex): Aug 23 16:21:03 Tower2 kernel: 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00 Aug 23 16:21:03 Tower2 kernel: 62 a1 fa b8 Aug 23 16:21:03 Tower2 kernel: sd 1:0:0:0: [sda] ASC=0x11 ASCQ=0x4 Aug 23 16:21:03 Tower2 kernel: end_request: I/O error, dev sda, sector 1654782648 Aug 23 16:21:03 Tower2 kernel: Buffer I/O error on device sda, logical block 206847831 Aug 23 16:21:03 Tower2 kernel: 
Buffer I/O error on device sda, logical block 206847832 Aug 23 16:21:03 Tower2 kernel: Buffer I/O error on device sda, logical block 206847833 Aug 23 16:21:03 Tower2 kernel: Buffer I/O error on device sda, logical block 206847834 Aug 23 16:21:03 Tower2 kernel: Buffer I/O error on device sda, logical block 206847835 Aug 23 16:21:03 Tower2 kernel: Buffer I/O error on device sda, logical block 206847836 Aug 23 16:21:03 Tower2 kernel: Buffer I/O error on device sda, logical block 206847837 Aug 23 16:21:03 Tower2 kernel: Buffer I/O error on device sda, logical block 206847838 Aug 23 16:21:03 Tower2 kernel: Buffer I/O error on device sda, logical block 206847839 Aug 23 16:21:03 Tower2 kernel: ata1: EH complete Aug 23 16:21:03 Tower2 kernel: sd 1:0:0:0: [sda] 2930277168 512-byte hardware sectors (1500302 MB) Aug 23 16:21:03 Tower2 kernel: sd 1:0:0:0: [sda] Write Protect is off Aug 23 16:21:03 Tower2 kernel: sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00 Aug 23 16:21:03 Tower2 kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA The full syslog is attached, except for a big chunk in the middle that I had to cut out to make the attachment the right size. There seem to have been far fewer errors the second time around. Is this still an RMA candidate, or do you think this might be a motherboard error? (It's new too.) I'm running one more cycle. Thanks, Jim
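The numbers in the syslog excerpt above can be cross-checked against each other. This is a minimal sketch using only values copied from the log; the 4096-byte buffer block size is an assumption inferred from the 8:1 ratio between the sector number and the logical block number, and reading the last four dumped sense bytes as a big-endian sector address is likewise an interpretation that happens to fit the log.

```python
# Values copied from the syslog excerpt in the post.
SENSE_TAIL_HEX = "62 a1 fa b8"   # last four bytes of the sense data dump
REPORTED_SECTOR = 1654782648     # "end_request: I/O error, dev sda, sector ..."
FIRST_BAD_BLOCK = 206847831      # first "Buffer I/O error ... logical block ..."
TOTAL_SECTORS = 2930277168       # "2930277168 512-byte hardware sectors"

SECTOR_BYTES = 512
BLOCK_BYTES = 4096  # assumption: 4 KiB buffer blocks (8 sectors per block)

# Read the four hex bytes as one big-endian integer.
lba = int(SENSE_TAIL_HEX.replace(" ", ""), 16)

# Convert the 512-byte sector address to a buffer block number.
block = lba * SECTOR_BYTES // BLOCK_BYTES

# Fraction of the way through the disk where the read failed.
position = lba / TOTAL_SECTORS

print("sector from sense data:", lba)
print("buffer block:", block)
print(f"position on disk: {position:.1%}")
```

Every repeated exception in the excerpt carries the same `res` bytes, and the decoded address matches the one sector named in the `end_request` line, so the retries all appear to be hitting the same spot on the disk rather than scattered errors.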
  15. I just ran 2 disks for a single cycle. One disk was fine; the other, not so much. Do you agree that this might be an RMA candidate? I'm running a second cycle to be sure.
       S.M.A.R.T. error count differences detected after pre-clear
       note, some 'raw' values may change, but not be an indication of a problem
       57c57
       < 1 Raw_Read_Error_Rate 0x000f 100 100 051 Pre-fail Always - 0
       ---
       > 1 Raw_Read_Error_Rate 0x000f 099 099 051 Pre-fail Always - 5005
       66c66
       < 13 Read_Soft_Error_Rate 0x000e 100 100 000 Old_age Always - 0
       ---
       > 13 Read_Soft_Error_Rate 0x000e 099 099 000 Old_age Always - 4648
       69c69
       < 187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
       ---
       > 187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 4952
       71c71
       < 190 Airflow_Temperature_Cel 0x0022 070 070 000 Old_age Always - 30 (Lifetime Min/Max 30/30)
       ---
       > 190 Airflow_Temperature_Cel 0x0022 068 067 000 Old_age Always - 32 (Lifetime Min/Max 30/33)
       74c74
       < 197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
       ---
       > 197 Current_Pending_Sector 0x0012 092 092 000 Old_age Always - 331
       78c78
       < 201 Soft_Read_Error_Rate 0x000a 253 253 000 Old_age Always - 0
       ---
       > 201 Soft_Read_Error_Rate 0x000a 097 097 000 Old_age Always - 228
       ============================================================================
  16. But why the big discrepancy in the cycle time? 17 hours vs 28 and 30 hours? I wouldn't think 7200 vs 5400 RPM would make that much of a difference. I'm running a single drive again now to see how much difference one drive vs two makes. Jim
  17. What was your cycle time for a 1.5T disk? It seemed like yours was in the 17 hour time frame from your screen capture? I would hope that I would get closer to that rather than the 28 hour time frame. Oh.. And the zeroing took ~5 hours.
  18. I wasn't doing anything else with the array. It was stopped. I was getting parity check speeds of 90-100MB/s (parity sync was about 50-60MB/s) with the two drives when I tested that. That's why I would expect to get something similar with the pre-read. Maybe I'll try some dd commands. The preclear cycle took about 28 hours for one disk and 30 hours for the other. One was fine, and the longer one had some SMART errors, which I'll post in the other thread. Jim
  19. Joe, I just got a new unraid MB and CPU and I'm currently testing it with two new Samsung 1.5T drives. I'm preclearing both and I'm getting nowhere near the speeds you are. If yours was a PCI-based system, mine is a new PCIe-based system. I only have the two drives attached. The syslog says they are running at 3.0Gbps, but they are both going at a rate of about 25% every 4 hours for the pre-read. Even when I did just one drive I was getting 2GB/min, or about 34MB/s. I would expect a lot better than that! Right now I'm getting about 25.6MB/s. Am I missing something? In the log I see:
       Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: smartctl version 5.38 [i486-slackware-linux-gnu] Copyright (C) 2002-8 Bruce Allen
       Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: Home page is http://smartmontools.sourceforge.net/
       Aug 21 23:17:43 Tower2 preclear_disk-start[14626]:
       Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: === START OF INFORMATION SECTION ===
       Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: Device Model: SAMSUNG HD154UI
       Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: Serial Number: S1Y6J1KS743788
       Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: Firmware Version: 1AG01118
       Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: User Capacity: 1,500,301,910,016 bytes
       Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: Device is: In smartctl database [for details use: -P show]
       Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: ATA Version is: 8
       Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: ATA Standard is: ATA-8-ACS revision 3b
       Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: Local Time is: Fri Aug 21 23:17:43 2009 EDT
       Aug 21 23:17:43 Tower2 preclear_disk-start[14626]:
       Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: ==> WARNING: May need -F samsung or -F samsung2 enabled; see manual for details.
       Aug 21 23:17:43 Tower2 preclear_disk-start[14626]:
       Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: SMART support is: Available - device has SMART capability.
       Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: SMART support is: Enabled
       Aug 21 23:17:43 Tower2 preclear_disk-start[14626]:
     What is this -F samsung? I'm running in AHCI mode, set in the BIOS. Anything else I'm missing?
  20. Each cycle is just about 12 hours. I'm in no immediate rush, so I just kicked off another cycle. An interesting addition to the script might be to save the SMART data after every cycle so we can see when the events happened. When I ran the first 3 cycles, I couldn't tell whether the events happened in the 1st, 2nd, or 3rd cycle. Jim
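The per-cycle idea in the post above could look something like the following. It is only a sketch and not part of the preclear script: it assumes a SMART attribute report (for example the text printed by `smartctl -A`) has been saved to a string after each cycle, and it compares the raw values of two such snapshots. The function names are hypothetical, and the sample lines are the Reallocated_Sector_Ct and Reallocated_Event_Count rows quoted in one of the other posts on this page.

```python
import re

# One attribute row: ID, name, flag, value, worst, thresh, type, updated,
# when_failed, then the raw value (only its leading integer is captured).
ATTRIBUTE_ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+\d+\s+\d+\s+\d+"
    r"\s+\S+\s+\S+\s+\S+\s+(\d+)"
)


def parse_raw_values(report):
    """Map attribute name -> raw value for every attribute row in a report."""
    values = {}
    for line in report.splitlines():
        match = ATTRIBUTE_ROW.match(line)
        if match:
            values[match.group(2)] = int(match.group(3))
    return values


def changed_raw_values(before, after):
    """Return {name: (old, new)} for attributes whose raw value changed."""
    old, new = parse_raw_values(before), parse_raw_values(after)
    return {
        name: (old[name], new[name])
        for name in old
        if name in new and old[name] != new[name]
    }


BEFORE = """\
  5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
"""
AFTER = """\
  5 Reallocated_Sector_Ct 0x0033 199 199 140 Pre-fail Always - 5
196 Reallocated_Event_Count 0x0032 199 199 000 Old_age Always - 1
"""

changes = changed_raw_values(BEFORE, AFTER)
for name, (old_raw, new_raw) in changes.items():
    print(f"{name}: {old_raw} -> {new_raw}")
```

Comparing consecutive snapshots this way would show in which cycle a counter moved, which is the information the post says was missing after running three cycles back to back.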
  21. Done! I started a new thread that can be devoted to just questions about the results of the script. Hopefully all the gurus will monitor that thread too! Thanks again, Joe, for a great script! Results discussion thread can be found here
  22. In an effort to keep the Preclear script thread focused on questions about the script itself, I've started another thread here to discuss the results. The preclear thread is peppered with both result questions and script questions and is now 15 pages long, so I think a separate thread is warranted. So I'll start it off... If it stays at 5, in my opinion, no problem. If it increases over time, then you might want to use the RMA process. Odds are good it will stabilize. I have one 250Gig drive that has had 100 relocated sectors since the first time I ran smartctl on it. That number has never changed on that disk. I'd say, download the new version of preclear_disk.sh and run another set of test cycles and see if it shows an increase in re-allocated sectors. (the new version stress-tests the drive more. The old one had a bug that prevented the random cylinders from being read in addition to the linear read that was properly occurring) If the number stays at 5, fine, if not another test cycle might be in order. At that point you have all the evidence you need if an RMA is warranted. You might want to start a thread with your preclear experience. It will allow the questions about the output to all be in one spot. Joe L. OK, I ran one more full cycle with the new version of the script and I got no reallocated sector changes. Should I run it once more, or do you think I'm good now and can put the disk into service? So: first 3 cycles, 5 reallocated sectors; 4th cycle, no more reallocated sectors. Jim
  23. After running 3 iterations on my new 1TB green disk I had:
       < 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
       ---
       > 5 Reallocated_Sector_Ct 0x0033 199 199 140 Pre-fail Always - 5
       64c64
       < 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
       ---
       > 196 Reallocated_Event_Count 0x0032 199 199 000 Old_age Always - 1
     Are 5 reallocated sectors anything to worry about? I was hoping for 0! This was still running on the old version of the script. Maybe I should try the new version. (I started my test the morning before Joe posted the new version!) I did start a cycle again on a different controller (one cycle this time, and still the old script). Another thought: should we start a new thread for preclear disk result questions and keep this thread for questions and comments about the functionality of preclear? Jim
  24. If you do use the mail programs listed in the posts above (from unraid_notify and its mail offshoot), you will have to use the -m [email protected] command line parameter. It will not default to the e-mail address in the unraid_notify.config file. Maybe we can change the mail script to handle "root" as a recipient someday. I'm still hoping that brianbone will split the package into separate mail and unraid_notify packages! Jim
  25. The thirst for adventure is outweighed by my thirst for more disk space! My cache drive is filling up because there is no room left on the array!