jbuszkie

Members
  • Posts: 696

Everything posted by jbuszkie

  1. I ran the disk one more time. This is what I got:

     S.M.A.R.T. error count differences detected after pre-clear
     note, some 'raw' values may change, but not be an indication of a problem
     57c57
     < 1 Raw_Read_Error_Rate 0x000f 099 099 051 Pre-fail Always - 5005
     ---
     > 1 Raw_Read_Error_Rate 0x000f 099 099 051 Pre-fail Always - 5264
     66c66
     < 13 Read_Soft_Error_Rate 0x000e 099 099 000 Old_age Always - 4648
     ---
     > 13 Read_Soft_Error_Rate 0x000e 099 099 000 Old_age Always - 4912
     69c69
     < 187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 4952
     ---
     > 187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 9596
     71c71
     < 190 Airflow_Temperature_Cel 0x0022 071 067 000 Old_age Always - 29 (Lifetime Min/Max 29/33)
     ---
     > 190 Airflow_Temperature_Cel 0x0022 068 067 000 Old_age Always - 32 (Lifetime Min/Max 29/33)
     74c74
     < 197 Current_Pending_Sector 0x0012 092 092 000 Old_age Always - 331
     ---
     > 197 Current_Pending_Sector 0x0012 100 092 000 Old_age Always - 0
     78c78
     < 201 Soft_Read_Error_Rate 0x000a 097 097 000 Old_age Always - 228
     ---
     > 201 Soft_Read_Error_Rate 0x000a 100 097 000 Old_age Always - 0
     ============================================================================

     The Current_Pending_Sector count didn't go up, but neither did the Reallocated_Sectors. What happened to the 331 sectors that were previously pending? Also, the Raw_Read_Error_Rate and the Read_Soft_Error_Rate both went up, but not as much as the first time. However, the Reported_Uncorrect count almost doubled.

     I also noted a bunch of errors in the syslog from the first time I ran the test (with both disks going). Here is a snippet of the errors:

     Aug 23 16:20:49 Tower2 kernel: ata1.00: exception Emask 0x0 SAct 0x7 SErr 0x0 action 0x0
     Aug 23 16:20:49 Tower2 kernel: ata1.00: irq_stat 0x40000008
     Aug 23 16:20:49 Tower2 kernel: ata1.00: cmd 60/00:08:18:fa:a1/02:00:62:00:00/40 tag 1 ncq 262144 in
     Aug 23 16:20:49 Tower2 kernel: res 41/40:60:b8:fa:a1/85:01:62:00:00/40 Emask 0x409 (media error)
     Aug 23 16:20:49 Tower2 kernel: ata1.00: status: { DRDY ERR }
     Aug 23 16:20:49 Tower2 kernel: ata1.00: error: { UNC }
     Aug 23 16:20:49 Tower2 kernel: ata1.00: configured for UDMA/133
     Aug 23 16:20:49 Tower2 kernel: ata1: EH complete
     Aug 23 16:20:49 Tower2 kernel: sd 1:0:0:0: [sda] 2930277168 512-byte hardware sectors (1500302 MB)
     Aug 23 16:20:49 Tower2 kernel: sd 1:0:0:0: [sda] Write Protect is off
     Aug 23 16:20:49 Tower2 kernel: sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
     Aug 23 16:20:49 Tower2 kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
     Aug 23 16:20:52 Tower2 kernel: ata1.00: exception Emask 0x0 SAct 0x7 SErr 0x0 action 0x0
     Aug 23 16:20:52 Tower2 kernel: ata1.00: irq_stat 0x40000008
     Aug 23 16:20:52 Tower2 kernel: ata1.00: cmd 60/00:08:18:fa:a1/02:00:62:00:00/40 tag 1 ncq 262144 in
     Aug 23 16:20:52 Tower2 kernel: res 41/40:60:b8:fa:a1/85:01:62:00:00/40 Emask 0x409 (media error)
     Aug 23 16:20:52 Tower2 kernel: ata1.00: status: { DRDY ERR }
     Aug 23 16:20:52 Tower2 kernel: ata1.00: error: { UNC }
     Aug 23 16:20:52 Tower2 kernel: ata1.00: configured for UDMA/133
     Aug 23 16:20:52 Tower2 kernel: ata1: EH complete
     Aug 23 16:20:52 Tower2 kernel: sd 1:0:0:0: [sda] 2930277168 512-byte hardware sectors (1500302 MB)
     Aug 23 16:20:52 Tower2 kernel: sd 1:0:0:0: [sda] Write Protect is off
     Aug 23 16:20:52 Tower2 kernel: sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
     Aug 23 16:20:52 Tower2 kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
     Aug 23 16:20:55 Tower2 kernel: ata1.00: exception Emask 0x0 SAct 0x7 SErr 0x0 action 0x0
     Aug 23 16:20:55 Tower2 kernel: ata1.00: irq_stat 0x40000008
     Aug 23 16:20:55 Tower2 kernel: ata1.00: cmd 60/00:08:18:fa:a1/02:00:62:00:00/40 tag 1 ncq 262144 in
     Aug 23 16:20:55 Tower2 kernel: res 41/40:60:b8:fa:a1/85:01:62:00:00/40 Emask 0x409 (media error)
     Aug 23 16:20:55 Tower2 kernel: ata1.00: status: { DRDY ERR }
     Aug 23 16:20:55 Tower2 kernel: ata1.00: error: { UNC }
     Aug 23 16:20:55 Tower2 kernel: ata1.00: configured for UDMA/133
     Aug 23 16:20:55 Tower2 kernel: ata1: EH complete
     Aug 23 16:20:55 Tower2 kernel: sd 1:0:0:0: [sda] 2930277168 512-byte hardware sectors (1500302 MB)
     Aug 23 16:20:55 Tower2 kernel: sd 1:0:0:0: [sda] Write Protect is off
     Aug 23 16:20:55 Tower2 kernel: sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
     Aug 23 16:20:55 Tower2 kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
     Aug 23 16:20:57 Tower2 kernel: ata1.00: exception Emask 0x0 SAct 0x7 SErr 0x0 action 0x0
     Aug 23 16:20:57 Tower2 kernel: ata1.00: irq_stat 0x40000008
     Aug 23 16:20:57 Tower2 kernel: ata1.00: cmd 60/00:08:18:fa:a1/02:00:62:00:00/40 tag 1 ncq 262144 in
     Aug 23 16:20:57 Tower2 kernel: res 41/40:60:b8:fa:a1/85:01:62:00:00/40 Emask 0x409 (media error)
     Aug 23 16:20:57 Tower2 kernel: ata1.00: status: { DRDY ERR }
     Aug 23 16:20:57 Tower2 kernel: ata1.00: error: { UNC }
     Aug 23 16:20:57 Tower2 kernel: ata1.00: configured for UDMA/133
     Aug 23 16:20:57 Tower2 kernel: ata1: EH complete
     Aug 23 16:20:57 Tower2 kernel: sd 1:0:0:0: [sda] 2930277168 512-byte hardware sectors (1500302 MB)
     Aug 23 16:20:57 Tower2 kernel: sd 1:0:0:0: [sda] Write Protect is off
     Aug 23 16:20:57 Tower2 kernel: sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
     Aug 23 16:20:57 Tower2 kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
     Aug 23 16:21:00 Tower2 kernel: ata1.00: exception Emask 0x0 SAct 0x7 SErr 0x0 action 0x0
     Aug 23 16:21:00 Tower2 kernel: ata1.00: irq_stat 0x40000008
     Aug 23 16:21:00 Tower2 kernel: ata1.00: cmd 60/00:08:18:fa:a1/02:00:62:00:00/40 tag 1 ncq 262144 in
     Aug 23 16:21:00 Tower2 kernel: res 41/40:60:b8:fa:a1/85:01:62:00:00/40 Emask 0x409 (media error)
     Aug 23 16:21:00 Tower2 kernel: ata1.00: status: { DRDY ERR }
     Aug 23 16:21:00 Tower2 kernel: ata1.00: error: { UNC }
     Aug 23 16:21:00 Tower2 kernel: ata1.00: configured for UDMA/133
     Aug 23 16:21:00 Tower2 kernel: ata1: EH complete
     Aug 23 16:21:00 Tower2 kernel: sd 1:0:0:0: [sda] 2930277168 512-byte hardware sectors (1500302 MB)
     Aug 23 16:21:00 Tower2 kernel: sd 1:0:0:0: [sda] Write Protect is off
     Aug 23 16:21:00 Tower2 kernel: sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
     Aug 23 16:21:00 Tower2 kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
     Aug 23 16:21:03 Tower2 kernel: ata1.00: exception Emask 0x0 SAct 0x7 SErr 0x0 action 0x0
     Aug 23 16:21:03 Tower2 kernel: ata1.00: irq_stat 0x40000008
     Aug 23 16:21:03 Tower2 kernel: ata1.00: cmd 60/00:08:18:fa:a1/02:00:62:00:00/40 tag 1 ncq 262144 in
     Aug 23 16:21:03 Tower2 kernel: res 41/40:60:b8:fa:a1/85:01:62:00:00/40 Emask 0x409 (media error)
     Aug 23 16:21:03 Tower2 kernel: ata1.00: status: { DRDY ERR }
     Aug 23 16:21:03 Tower2 kernel: ata1.00: error: { UNC }
     Aug 23 16:21:03 Tower2 kernel: ata1.00: configured for UDMA/133
     Aug 23 16:21:03 Tower2 kernel: sd 1:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08
     Aug 23 16:21:03 Tower2 kernel: sd 1:0:0:0: [sda] Sense Key : 0x3 [current] [descriptor]
     Aug 23 16:21:03 Tower2 kernel: Descriptor sense data with sense descriptors (in hex):
     Aug 23 16:21:03 Tower2 kernel: 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
     Aug 23 16:21:03 Tower2 kernel: 62 a1 fa b8
     Aug 23 16:21:03 Tower2 kernel: sd 1:0:0:0: [sda] ASC=0x11 ASCQ=0x4
     Aug 23 16:21:03 Tower2 kernel: end_request: I/O error, dev sda, sector 1654782648
     Aug 23 16:21:03 Tower2 kernel: Buffer I/O error on device sda, logical block 206847831
     Aug 23 16:21:03 Tower2 kernel: Buffer I/O error on device sda, logical block 206847832
     Aug 23 16:21:03 Tower2 kernel: Buffer I/O error on device sda, logical block 206847833
     Aug 23 16:21:03 Tower2 kernel: Buffer I/O error on device sda, logical block 206847834
     Aug 23 16:21:03 Tower2 kernel: Buffer I/O error on device sda, logical block 206847835
     Aug 23 16:21:03 Tower2 kernel: Buffer I/O error on device sda, logical block 206847836
     Aug 23 16:21:03 Tower2 kernel: Buffer I/O error on device sda, logical block 206847837
     Aug 23 16:21:03 Tower2 kernel: Buffer I/O error on device sda, logical block 206847838
     Aug 23 16:21:03 Tower2 kernel: Buffer I/O error on device sda, logical block 206847839
     Aug 23 16:21:03 Tower2 kernel: ata1: EH complete
     Aug 23 16:21:03 Tower2 kernel: sd 1:0:0:0: [sda] 2930277168 512-byte hardware sectors (1500302 MB)
     Aug 23 16:21:03 Tower2 kernel: sd 1:0:0:0: [sda] Write Protect is off
     Aug 23 16:21:03 Tower2 kernel: sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
     Aug 23 16:21:03 Tower2 kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

     The full syslog is attached, except for a big chunk in the middle that I had to cut out to get the attachment down to the right size. It seems like there were a lot fewer errors the second time around. Is this still an RMA candidate, or do you think this might be an MB error? (It's new too.) I'm running one more cycle. Thanks, Jim
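The failing sector reported by end_request can be cross-checked against the sense data in the same log. A minimal sketch, assuming the four bytes on the second sense line (`62 a1 fa b8`) hold the failing LBA in big-endian order; the function name is made up for illustration and is not part of any tool mentioned in the thread:

```python
def lba_from_hex_bytes(hex_bytes: str) -> int:
    """Interpret space-separated hex bytes as one big-endian integer."""
    raw = bytes.fromhex(hex_bytes.replace(" ", ""))
    return int.from_bytes(raw, "big")


# Bytes copied from the "Descriptor sense data" lines in the syslog excerpt.
sense_lba = lba_from_hex_bytes("62 a1 fa b8")

# Sector number copied from the "end_request: I/O error" line.
reported_sector = 1654782648

print(sense_lba, sense_lba == reported_sector)
```

The same bytes appear, in a different order, in the `res 41/40:60:b8:fa:a1/85:01:62:00:00/40` field, which suggests that every retry in the excerpt is hitting the same sector rather than a spread of them.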
  2. I just ran 2 disks through a single cycle. One disk was fine; the other, not so much. Do you agree that this might be an RMA candidate? I'm running a second cycle to be sure.

     S.M.A.R.T. error count differences detected after pre-clear
     note, some 'raw' values may change, but not be an indication of a problem
     57c57
     < 1 Raw_Read_Error_Rate 0x000f 100 100 051 Pre-fail Always - 0
     ---
     > 1 Raw_Read_Error_Rate 0x000f 099 099 051 Pre-fail Always - 5005
     66c66
     < 13 Read_Soft_Error_Rate 0x000e 100 100 000 Old_age Always - 0
     ---
     > 13 Read_Soft_Error_Rate 0x000e 099 099 000 Old_age Always - 4648
     69c69
     < 187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
     ---
     > 187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 4952
     71c71
     < 190 Airflow_Temperature_Cel 0x0022 070 070 000 Old_age Always - 30 (Lifetime Min/Max 30/30)
     ---
     > 190 Airflow_Temperature_Cel 0x0022 068 067 000 Old_age Always - 32 (Lifetime Min/Max 30/33)
     74c74
     < 197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
     ---
     > 197 Current_Pending_Sector 0x0012 092 092 000 Old_age Always - 331
     78c78
     < 201 Soft_Read_Error_Rate 0x000a 253 253 000 Old_age Always - 0
     ---
     > 201 Soft_Read_Error_Rate 0x000a 097 097 000 Old_age Always - 228
     ============================================================================
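Reading raw-value changes out of a diff like the one in the post above gets tedious across several cycles. A rough sketch of one way to tabulate them, assuming the diff text has one `<` or `>` attribute row per line, as smartctl and diff normally print it; the function name and the two-attribute sample are illustrative only:

```python
def raw_value_changes(diff_text: str) -> dict:
    """Map attribute name -> (raw before, raw after) for rows that changed."""
    before, after = {}, {}
    for line in diff_text.splitlines():
        parts = line.split()
        # Expected layout: marker, ID, name, flag, value, worst, thresh,
        # type, updated, when_failed, raw value, optional trailing notes.
        if len(parts) < 11 or parts[0] not in ("<", ">"):
            continue
        if not (parts[1].isdigit() and parts[10].isdigit()):
            continue
        target = before if parts[0] == "<" else after
        target[parts[2]] = int(parts[10])
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }


SAMPLE = """\
57c57
< 1 Raw_Read_Error_Rate 0x000f 100 100 051 Pre-fail Always - 0
---
> 1 Raw_Read_Error_Rate 0x000f 099 099 051 Pre-fail Always - 5005
74c74
< 197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
---
> 197 Current_Pending_Sector 0x0012 092 092 000 Old_age Always - 331
"""

for name, (old, new) in raw_value_changes(SAMPLE).items():
    print(f"{name}: {old} -> {new}")
```

Only the raw column is compared here; rows where just the normalized value changed are ignored, which may or may not be what you want.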
  3. But why the big discrepancy in cycle time: 17 hours vs. 28 and 30 hours? I wouldn't think 7200 vs. 5400 RPM would make that much of a difference. I'm running a single drive again now to see how much difference one drive vs. two makes. Jim
  4. What was your cycle time for a 1.5T disk? From your screen capture, it looked like yours was in the 17-hour range. I would hope to get closer to that than to the 28-hour range. Oh, and the zeroing took ~5 hours.
  5. I wasn't doing anything else with the array; it was stopped. When I tested the two drives, I was getting parity check speeds of 90-100 MB/s (parity sync was about 50-60 MB/s). That's why I would expect to get something similar with the pre-read. Maybe I'll try some dd commands. The preclear cycle took about 28 hours for one disk and 30 hours for the other. One was fine, and the one that took longer had some SMART errors, which I'll post in the other thread. Jim
  6. Joe, I just got a new unRAID MB and CPU and I'm currently testing them with two new Samsung 1.5T drives. I'm preclearing both and I'm getting nowhere near the speeds you are. I believe yours was a PCI-based system; mine is a new PCIe-based system. I only have the two drives attached. The syslog says they are running at 3.0 Gb/s, but they are both going at a rate of about 25% every 4 hours for the pre-read. Even when I did just one drive I was getting 2 GB/min, or ~34 MB/s. I would expect a lot better than that! Right now I'm getting about 25.6 MB/s. Am I missing something? In the log I see:

     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: smartctl version 5.38 [i486-slackware-linux-gnu] Copyright (C) 2002-8 Bruce Allen
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: Home page is http://smartmontools.sourceforge.net/
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]:
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: === START OF INFORMATION SECTION ===
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: Device Model: SAMSUNG HD154UI
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: Serial Number: S1Y6J1KS743788
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: Firmware Version: 1AG01118
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: User Capacity: 1,500,301,910,016 bytes
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: Device is: In smartctl database [for details use: -P show]
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: ATA Version is: 8
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: ATA Standard is: ATA-8-ACS revision 3b
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: Local Time is: Fri Aug 21 23:17:43 2009 EDT
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]:
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: ==> WARNING: May need -F samsung or -F samsung2 enabled; see manual for details.
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]:
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: SMART support is: Available - device has SMART capability.
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: SMART support is: Enabled
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]:

     What is this -F samsung? I'm running in AHCI mode, set in the BIOS. Anything else I'm missing?
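The rates quoted in the post above can be sanity-checked with a little arithmetic. This is only a sketch: it takes the capacity from the "User Capacity" line in the log, reads "2 GB/min" as binary gigabytes (an assumption, since the post does not say), and uses helper names invented for the example:

```python
CAPACITY_BYTES = 1_500_301_910_016  # "User Capacity" from the smartctl output


def rate_bytes_per_s(bytes_done: float, seconds: float) -> float:
    """Average transfer rate over an interval."""
    return bytes_done / seconds


def hours_for_full_pass(total_bytes: float, rate: float) -> float:
    """Time to read or write the whole disk once at a steady rate."""
    return total_bytes / rate / 3600


# Two drives at once: about 25% of the disk every 4 hours during the pre-read.
preread_rate = rate_bytes_per_s(0.25 * CAPACITY_BYTES, 4 * 3600)
print(f"pre-read: {preread_rate / 1e6:.1f} MB/s ({preread_rate / 2**20:.1f} MiB/s)")
print(f"one full read pass: {hours_for_full_pass(CAPACITY_BYTES, preread_rate):.1f} h")

# One drive alone: 2 GB per minute, taken here as 2 GiB.
single_rate = rate_bytes_per_s(2 * 2**30, 60)
print(f"single drive: {single_rate / 2**20:.1f} MiB/s")
```

At 25% per 4 hours, a single read pass over the disk works out to 16 hours, so a cycle with a pre-read, a zeroing pass, and a post-read lands in the range of the 28 to 30 hour cycle times reported elsewhere in these posts.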
  7. Each cycle takes just about 12 hours. I'm in no immediate rush, so I just kicked off another cycle. An interesting addition to the script might be to save the SMART data after every cycle so we can see when the events happened. When I ran the first 3 cycles, I couldn't tell whether the events happened in the 1st, 2nd, or 3rd cycle. Jim
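The per-cycle snapshot idea could look something like the sketch below. It is not how preclear_disk.sh works, and the function name, file naming, and `runner` parameter are all hypothetical; the only real command used is `smartctl -a <device>`:

```python
import subprocess
from pathlib import Path


def save_smart_snapshot(device, cycle, out_dir, runner=subprocess.run):
    """Write the output of `smartctl -a <device>` to one file per cycle.

    `runner` defaults to subprocess.run and exists so the function can be
    exercised without a real disk. No check=True: smartctl uses its exit
    status as a set of flag bits, so a nonzero status is not always a failure.
    """
    result = runner(["smartctl", "-a", device], capture_output=True, text=True)
    path = Path(out_dir) / f"smart_{Path(device).name}_cycle{cycle}.txt"
    path.write_text(result.stdout)
    return path


# Hypothetical usage after each cycle finishes (needs root and a real device):
# save_smart_snapshot("/dev/sdd", cycle=2, out_dir="/boot/preclear_reports")
```

With one file per cycle, comparing cycle N against cycle N+1 with diff would show which cycle an event first appeared in. The output directory in the usage comment is an invented path.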
  8. Done! I started a new thread that can be devoted to just questions about the results of the script. Hopefully all the gurus will monitor that thread too! Thanks again, Joe, for a great script! Results discussion thread can be found here
  9. In an effort to keep the Preclear script thread focused on questions about the script itself, I've started another thread here to discuss the results. The preclear thread is peppered with both result questions and questions about the script, and it is now 15 pages long! So I thought a separate thread was warranted. I'll start it off...

     If it stays at 5, in my opinion, no problem. If it increases over time, then you might want to use the RMA process. Odds are good it will stabilize. I have one 250Gig drive that has had 100 relocated sectors since the first time I ran smartctl on it. That number has never changed on that disk. I'd say, download the new version of preclear_disk.sh and run another set of test cycles and see if it shows an increase in re-allocated sectors. (The new version stress-tests the drive more. The old one had a bug that prevented the random cylinders from being read in addition to the linear read that was properly occurring.) If the number stays at 5, fine; if not, another test cycle might be in order. At that point you have all the evidence you need if an RMA is warranted. You might want to start a thread with your preclear experience. It will allow the questions about the output to all be in one spot. Joe L.

     OK, I ran one more full cycle with the new version of the script and got no reallocated sector changes. Should I run it once more, or do you think I'm good now and can put the disk into service? So: first 3 cycles, 5 reallocated sectors; 4th cycle, no more reallocated sectors. Jim
  10. After running 3 iterations on my new 1TB green disk I had:

     < 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
     ---
     > 5 Reallocated_Sector_Ct 0x0033 199 199 140 Pre-fail Always - 5
     64c64
     < 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
     ---
     > 196 Reallocated_Event_Count 0x0032 199 199 000 Old_age Always - 1

     Are 5 reallocated sectors anything to worry about? I was hoping for 0! This was still running on the old version of the script. Maybe I should try the new version. (I started my test the morning before Joe posted the new version!) I did start another cycle on a different controller (one cycle this time, and still the old script). Another thought: should we start a new thread for preclear disk result questions and keep this thread for questions/comments about the functionality of preclear? Jim
  11. If you use the mail programs listed in the posts above (from unraid_notify and its mail offshoot), you will have to use the -m [email protected] command line parameter. It will not default to the e-mail address in the unraid_notify.config file. Maybe we can change the mail script to handle "root" as a recipient someday. I'm still hoping that brianbone will split the package into separate mail and unraid_notify packages! Jim
  12. The thirst for adventure is outweighed by my thirst for more disk space! My cache drive is filling up because there is no room left on the array!
  13. Whew! 56 hours of testing later, my replacement 1.5T disk passed 3 cycles! Thanks again for a great script! It helped catch that the first disk was bad!

     ===========================================================================
     = unRAID server Pre-Clear disk /dev/sdd
     = cycle 3 of 3
     = Disk Pre-Clear-Read completed DONE
     = Step 1 of 10 - Copying zeros to first 2048k bytes DONE
     = Step 2 of 10 - Copying zeros to remainder of disk to clear it DONE
     = Step 3 of 10 - Disk is now cleared from MBR onward. DONE
     = Step 4 of 10 - Clearing MBR bytes for partition 2,3 & 4 DONE
     = Step 5 of 10 - Clearing MBR code area DONE
     = Step 6 of 10 - Setting MBR signature bytes DONE
     = Step 7 of 10 - Setting partition 1 to precleared state DONE
     = Step 8 of 10 - Notifying kernel we changed the partitioning DONE
     = Step 9 of 10 - Creating the /dev/disk/by* entries DONE
     = Step 10 of 10 - Testing if the clear has been successful. DONE
     = Disk Post-Clear-Read completed DONE
     Elapsed Time: 56:18:58
     ============================================================================
     ==
     == Disk /dev/sdd has been successfully precleared
     ==
     ============================================================================
     S.M.A.R.T. error count differences detected after pre-clear
     note, some 'raw' values may change, but not be an indication of a problem
     19,20c19,20
     < Offline data collection status: (0x80) Offline data collection activity
     < was never started.
     ---
     > Offline data collection status: (0x84) Offline data collection activity
     > was suspended by an interrupting command from host.
     54c54
     < 1 Raw_Read_Error_Rate 0x002f 100 253 051 Pre-fail Always - 0
     ---
     > 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
     ============================================================================
  14. So you are saying you are also known as google.com? Awww.. come on! Where's your thirst for adventure! Think of it as a challenge! I dare you to try! Just kidding! Yeah.. Adding something to the display_progress() function might be easy enough.. I'll tinker once I have some time.. but my guess is that I'll be done with the parity upgrade and forget about this until the next time! Oh.. and it is well commented.. good job!
  15. Nice! Let me rephrase.. how hard would it be for you to modify this script to e-mail updates as the test is progressing :-) I've already got unraid notify working.. Maybe I will dig into that.. someday.. For now I'm having to connect back into the session every couple of hours.. :-) hmm.. redirect the output to tee and send the contents to bashmail.. hhmm... all with variables to disable it... Jim
  16. Dumb question here. I got my replacement drive and it's on its second pass (18 hours for the first pass!). Since this is my new parity drive, do I really gain anything by running this script, other than testing the drive? It's going to have to write all the parity anyway when I assign it, correct? Oh, and how hard would it be to modify this script to e-mail me updates as the test is progressing? Jim
  17. Yeah... It's WD not Maxtor like I said in my original post.. My replacement drive should arrive today! Jim
  18. Oh... and btw... Thanks for the excellent script!!! It probably saved me a lot of headaches!
  19. Any tests you want me to try before I RMA the drive? It happily went through the pre-read and the zeroing out without reporting an error. I imagine it would have shown some problem when it did the smart diff, right?
  20. Here is my hdparm info and smart test info. Interestingly.. After the power cycle my disk 2 was "missing" I power cycled again and it came back? Now I just have to see if disk 2 is on the same controller as my new disk.. /dev/sdd: ATA device, with non-removable media Model Number: WDC WD15EADS-00H7B0 Serial Number: WD-WCAUP0018631 Firmware Revision: 05.00K05 Transport: Serial, SATA 1.0a, SATA II Extensions, SATA Rev 2.5 Standards: Supported: 8 7 6 5 Likely used: 8 Configuration: Logical max current cylinders 16383 16383 heads 16 16 sectors/track 63 63 -- CHS current addressable sectors: 16514064 LBA user addressable sectors: 268435455 LBA48 user addressable sectors: 2930277168 device size with M = 1024*1024: 1430799 MBytes device size with M = 1000*1000: 1500301 MBytes (1500 GB) Capabilities: LBA, IORDY(can be disabled) Queue depth: 32 Standby timer values: spec'd by Standard, with device specific minimum R/W multiple sector transfer: Max = 16 Current = 16 Recommended acoustic management value: 128, current value: 254 DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5 udma6 Cycle time: min=120ns recommended=120ns PIO: pio0 pio1 pio2 pio3 pio4 Cycle time: no flow control=120ns IORDY flow control=120ns Commands/features: Enabled Supported: * SMART feature set Security Mode feature set * Power Management feature set * Write cache * Look-ahead * Host Protected Area feature set * WRITE_BUFFER command * READ_BUFFER command * NOP cmd * DOWNLOAD_MICROCODE Power-Up In Standby feature set * SET_FEATURES required to spinup after power up SET_MAX security extension Automatic Acoustic Management feature set * 48-bit Address feature set * Device Configuration Overlay feature set * Mandatory FLUSH_CACHE * FLUSH_CACHE_EXT * SMART error logging * SMART self-test * General Purpose Logging feature set * 64-bit World wide name * {READ,WRITE}_DMA_EXT_GPL commands * Segmented DOWNLOAD_MICROCODE * SATA-I signaling speed (1.5Gb/s) * SATA-II signaling speed (3.0Gb/s) * 
Native Command Queueing (NCQ) * Host-initiated interface power management * Phy event counters DMA Setup Auto-Activate optimization * Software settings preservation * SMART Command Transport (SCT) feature set * SCT Long Sector Access (AC1) * SCT LBA Segment Access (AC2) * SCT Error Recovery Control (AC3) * SCT Features Control (AC4) * SCT Data Tables (AC5) unknown 206[12] (vendor specific) unknown 206[13] (vendor specific) Security: Master password revision code = 65534 supported not enabled not locked not frozen not expired: security count supported: enhanced erase 412min for SECURITY ERASE UNIT. 412min for ENHANCED SECURITY ERASE UNIT. Logical Unit WWN Device Identifier: 50014ee2ad0035dd NAA : 5 IEEE OUI : 14ee Unique ID : 2ad0035dd Checksum: correct root@Tower:~# root@Tower:~# root@Tower:~# smartctl -a -d ata /dev/sdd smartctl version 5.38 [i486-slackware-linux-gnu] Copyright (C) 2002-8 Bruce Allen Home page is http://smartmontools.sourceforge.net/ === START OF INFORMATION SECTION === Device Model: WDC WD15EADS-00H7B0 Serial Number: WD-WCAUP0018631 Firmware Version: 05.00K05 User Capacity: 1,500,301,910,016 bytes Device is: Not in smartctl database [for details use: -P showall] ATA Version is: 8 ATA Standard is: Exact ATA specification draft version not indicated Local Time is: Thu May 14 13:33:22 2009 GMT+5 SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x84) Offline data collection activity was suspended by an interrupting command from host. Auto Offline Data Collection: Enabled. Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. Total time to complete Offline data collection: (40500) seconds. Offline data collection capabilities: (0x7b) SMART execute Offline immediate. 
Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported. SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability: (0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time: ( 2) minutes. Extended self-test routine recommended polling time: ( 255) minutes. Conveyance self-test routine recommended polling time: ( 5) minutes. SCT capabilities: (0x303f) SCT Status supported. SCT Feature Control supported. SCT Data Table supported. SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x002f 139 139 051 Pre-fail Always - 14844 3 Spin_Up_Time 0x0027 100 253 021 Pre-fail Always - 0 4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 9 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0 7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0 9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 0 10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 7 192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 1 193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 9 194 Temperature_Celsius 0x0022 127 121 000 Old_age Always - 25 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 195 195 000 Old_age Always - 1311 198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0 200 Multi_Zone_Error_Rate 0x0008 100 253 000 Old_age Offline - 0 SMART Error Log Version: 1 No Errors Logged SMART 
Self-test log structure revision number 1 No self-tests have been logged. [To run self-tests, use: smartctl -t] SMART Selective self-test log data structure revision number 1 SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS 1 0 0 Not_testing 2 0 0 Not_testing 3 0 0 Not_testing 4 0 0 Not_testing 5 0 0 Not_testing Selective self-test flags (0x0): After scanning selected spans, do NOT read-scan remainder of disk. If Selective self-test is pending on power-up, resume after 0 minute delay.

     Now, do I have to be concerned about the Current_Pending_Sector number? It seems like that should be 0 for a new, good drive. Could a bad controller have any effect on that number?
  21. After the reboot it seemed to be running happily; I was at 98% of the pre-read. OK, change that: I guess it isn't happy. I'm getting more of those errors during the zeroing. Here is a snippet of the log. The snippet starts close to the end of the pre-read and captures the start of the zeroing. I'm in the middle of a power cycle (done remotely, so I may not get it back). I'll have to look to see whether the very first pass of this test behaved well. I'll post the SMART results when the computer reboots.
  22. I am running this on my new 1.5TB Maxtor Green. It did one full pass that seemed to work. On the second pass, it didn't finish. I looked in /tmp for the SMART logs, but they appear to have been deleted. Where should I look to see what happened? I remember seeing something about not being able to do something with the MBR. My PuTTY session got killed when I rebooted. I'm going to try again and see if it was just some sort of fluke. My syslog file is 500 MB, with a whole ton of these:

     May 14 04:48:23 Tower kernel: end_request: I/O error, dev sdd, sector 2930137216
     May 14 04:48:23 Tower kernel: sd 6:0:0:0: [sdd] Result: hostbyte=0x04 driverbyte=0x00

     And I see this as well:

     May 14 04:48:22 Tower kernel: sd 6:0:0:0: [sdd] Result: hostbyte=0x04 driverbyte=0x00
     May 14 04:48:22 Tower kernel: end_request: I/O error, dev sdd, sector 2930043648
     May 14 04:48:22 Tower kernel: __ratelimit: 78016 callbacks suppressed
     May 14 04:48:22 Tower kernel: Buffer I/O error on device sdd, logical block 366255456
     May 14 04:48:22 Tower kernel: lost page write due to I/O error on sdd
     May 14 04:48:22 Tower kernel: Buffer I/O error on device sdd, logical block 366255457
     May 14 04:48:22 Tower kernel: lost page write due to I/O error on sdd
     May 14 04:48:22 Tower kernel: Buffer I/O error on device sdd, logical block 366255458
     May 14 04:48:22 Tower kernel: lost page write due to I/O error on sdd
     May 14 04:48:22 Tower kernel: Buffer I/O error on device sdd, logical block 366255459
     May 14 04:48:22 Tower kernel: lost page write due to I/O error on sdd
     May 14 04:48:22 Tower kernel: Buffer I/O error on device sdd, logical block 366255460
     May 14 04:48:22 Tower kernel: lost page write due to I/O error on sdd
     May 14 04:48:22 Tower kernel: Buffer I/O error on device sdd, logical block 366255461
     May 14 04:48:22 Tower kernel: lost page write due to I/O error on sdd
     May 14 04:48:22 Tower kernel: Buffer I/O error on device sdd, logical block 366255462
     May 14 04:48:22 Tower kernel: lost page write due to I/O error on sdd
     May 14 04:48:22 Tower kernel: Buffer I/O error on device sdd, logical block 366255463
     May 14 04:48:22 Tower kernel: lost page write due to I/O error on sdd
     May 14 04:48:22 Tower kernel: Buffer I/O error on device sdd, logical block 366255464
     May 14 04:48:22 Tower kernel: lost page write due to I/O error on sdd
     May 14 04:48:22 Tower kernel: Buffer I/O error on device sdd, logical block 366255465
     May 14 04:48:22 Tower kernel: lost page write due to I/O error on sdd
     May 14 04:48:22 Tower kernel: sd 6:0:0:0: [sdd] Result: hostbyte=0x04 driverbyte=0x00
     May 14 04:48:22 Tower kernel: end_request: I/O error, dev sdd, sector 2930044672

     Is there anything I should look for? Jim
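The sector and logical block numbers in the log excerpt above line up if the "Buffer I/O error" blocks are 4096 bytes each, which is an assumption based on the numbers themselves rather than anything the log states. A small sketch, with the total sector count taken from the hdparm output for /dev/sdd elsewhere in this listing and helper names invented for the example:

```python
SECTOR_SIZE = 512               # bytes per sector
BLOCK_SIZE = 4096               # assumed size of a "logical block" in the log
TOTAL_SECTORS = 2_930_277_168   # "LBA48 user addressable sectors" from hdparm


def sector_to_block(sector: int) -> int:
    """Logical block that contains the given 512-byte sector."""
    return sector * SECTOR_SIZE // BLOCK_SIZE


def position_percent(sector: int, total_sectors: int) -> float:
    """How far into the disk a sector lies, as a percentage."""
    return 100.0 * sector / total_sectors


failing_sector = 2_930_043_648  # from the "end_request: I/O error" line
print(sector_to_block(failing_sector))
print(f"{position_percent(failing_sector, TOTAL_SECTORS):.3f}% into the disk")
print(f"{(TOTAL_SECTORS - failing_sector) * SECTOR_SIZE / 2**20:.0f} MiB from the end")
```

The first line of output matches logical block 366255456 in the log, and the position shows that these errors sit within the last few hundred megabytes of the drive.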