internetfriend

Members
  • Posts

    116

Everything posted by internetfriend

  1. Hey Lionelhutz, I think it was the power supply in the previous PC the server was in. It's in an all-new setup now, so the troublesome hardware is gone; what's left is just the drives that may have been messed up by it. I'll keep an eye on the drive and see what happens.
  2. So last month I had an issue where disk 2, my parity drive and my cache drive went nuts with reallocation errors. Disk 2 needed to be RMA'd, and the cache and parity drives seemed to correct themselves after a preclear. I just ran my monthly parity check, and the parity drive threw 533 errors on the disk but 0 errors in the parity summary. SMART report is below. I'm trying to figure out whether I should RMA it and, if not, what the next steps are. Should I run another preclear on the disk, or just keep an eye on it and see if it sorts itself out?

     smartctl -a -d ata /dev/sdd (parity)
     smartctl 5.39.1 2010-01-28 r3054 [i486-slackware-linux-gnu] (local build)
     Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

     === START OF INFORMATION SECTION ===
     Device Model:     WDC WD20EARS-00MVWB0
     Serial Number:    WD-WCAZA1020507
     Firmware Version: 51.0AB51
     User Capacity:    2,000,398,934,016 bytes
     Device is:        Not in smartctl database [for details use: -P showall]
     ATA Version is:   8
     ATA Standard is:  Exact ATA specification draft version not indicated
     Local Time is:    Thu Mar  1 07:19:18 2012 PST
     SMART support is: Available - device has SMART capability.
     SMART support is: Enabled

     === START OF READ SMART DATA SECTION ===
     SMART overall-health self-assessment test result: PASSED

     General SMART Values:
     Offline data collection status:  (0x84) Offline data collection activity
                                             was suspended by an interrupting command from host.
                                             Auto Offline Data Collection: Enabled.
     Self-test execution status:      (   0) The previous self-test routine completed
                                             without error or no self-test has ever been run.
     Total time to complete Offline
     data collection:                (40500) seconds.
     Offline data collection
     capabilities:                    (0x7b) SMART execute Offline immediate.
                                             Auto Offline data collection on/off support.
                                             Suspend Offline collection upon new command.
                                             Offline surface scan supported.
                                             Self-test supported.
                                             Conveyance Self-test supported.
                                             Selective Self-test supported.
     SMART capabilities:            (0x0003) Saves SMART data before entering power-saving mode.
                                             Supports SMART auto save timer.
     Error logging capability:        (0x01) Error logging supported.
                                             General Purpose Logging supported.
     Short self-test routine
     recommended polling time:        (   2) minutes.
     Extended self-test routine
     recommended polling time:        ( 255) minutes.
     Conveyance self-test routine
     recommended polling time:        (   5) minutes.
     SCT capabilities:              (0x3035) SCT Status supported.
                                             SCT Feature Control supported.
                                             SCT Data Table supported.

     SMART Attributes Data Structure revision number: 16
     Vendor Specific SMART Attributes with Thresholds:
     ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
       1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
       3 Spin_Up_Time            0x0027   165   162   021    Pre-fail  Always       -       6716
       4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       656
       5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
       7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
       9 Power_On_Hours          0x0032   088   088   000    Old_age   Always       -       9464
      10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
      11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
      12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       93
     192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       29
     193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       2530
     194 Temperature_Celsius     0x0022   120   115   000    Old_age   Always       -       30
     196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
     197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       7
     198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
     199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
     200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       1

     SMART Error Log Version: 1
     No Errors Logged

     SMART Self-test log structure revision number 1
     Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
     # 1  Short offline       Completed: read failure       90%      9082         3511387976

     SMART Selective self-test log data structure revision number 1
      SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
         1        0        0  Not_testing
         2        0        0  Not_testing
         3        0        0  Not_testing
         4        0        0  Not_testing
         5        0        0  Not_testing
     Selective self-test flags (0x0):
       After scanning selected spans, do NOT read-scan remainder of disk.
     If Selective self-test is pending on power-up, resume after 0 minute delay.

     Thank you!
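For anyone who would rather not eyeball a report like this, here is a minimal Python sketch that pulls the raw values out of the attribute table and lists the sector-related ones that are above zero. The choice of which attributes to watch, and the names `WATCHED`, `parse_attributes` and `nonzero_watched`, are assumptions made for this illustration; they are not part of smartctl or unRAID, and a non-zero value is only something to look at, not a verdict.

```python
import re

# Attributes watched in this sketch (an assumption, not an official list).
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")

# One attribute row: ID, name, flag, value, worst, thresh, type, updated, when_failed, raw.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+(0x[0-9a-fA-F]+)\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)


def parse_attributes(report):
    """Map attribute name -> raw value for every attribute row found in the text."""
    attrs = {}
    for line in report.splitlines():
        match = ROW.match(line)
        if match:
            attrs[match.group(2)] = int(match.group(10))
    return attrs


def nonzero_watched(attrs):
    """Return only the watched attributes whose raw value is above zero."""
    return {name: attrs[name] for name in WATCHED if attrs.get(name, 0) > 0}


# Three rows copied from the 1 March report above.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       7
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
"""

print(nonzero_watched(parse_attributes(SAMPLE)))
```

With the sample rows, only `Current_Pending_Sector` (raw value 7) is reported.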
  3. Haha, I didn't even notice it was the same post by you. The Pi actually just released today, but their website got totally destroyed by traffic. I'll wait for batch 2.
  4. Check out this thread; this is the one we're all excited about. http://lime-technology.com/forum/index.php?topic=17771.0
  5. My biggest issue is that once I got it set up I realized I didn't really have a use for it. I already keep a browser open with tabs for unraid, SAB, sickbeard and Couchpotato, and the only one I really need to refer to is SAB, to see if something downloaded. I use a harmony remote for the TV itself, so I don't need it on a laptop or anything. Viewing it on a tablet would be cool, but it doesn't work too well in the android browser. My uses aside though, the unmenu plugin and the app itself work great!
  6. Works great in 4.7. I haven't upgraded to the beta yet and wanted to use this. Thank you!
  7. Follow-up! I'm guessing the PSU was giving out, like a few of you suspected. A preclear of the parity drive and the cache drive cleared the pending sectors on the new hardware. The parity sync went without issue and a parity check passed with 0 errors. The only remaining thing mymain flags for the cache drive is "soft_read_error_rate=1", but I'm assuming that's OK since all sectors are corrected. Syslog attached for closure. Thanks everyone for your help! Fingers crossed I don't come back on 3/1 with something when the next parity check hits. syslog-2012-02-19.txt
  8. Yeah, it was. I'm doing a final preclear on the cache drive now to see if I can correct that pesky bad sector. I'm hoping/assuming most of these came from a poor PSU and not actually bad drives, but I'm still wondering if I should RMA that parity drive. I guess I'll just keep an eye on it; I still have 1.5 years of warranty on it.
  9. So check it out, here are the results of my parity drive preclear.

     ========================================================================1.13
     == WDC WD20EARS-00MVWB0    WD-WCAZA1020507
     == Disk /dev/sdd has been successfully precleared
     == with a starting sector of 64
     ============================================================================
     ** Changed attributes in files: /tmp/smart_start_sdd  /tmp/smart_finish_sdd
                     ATTRIBUTE   NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS  RAW_VALUE
               Seek_Error_Rate =   100     200            0        ok      0
           Temperature_Celsius =   124     125            0        ok      26
      No SMART attributes are FAILING_NOW

      0 sectors were pending re-allocation before the start of the preclear.
      4 sectors were pending re-allocation after pre-read in cycle 1 of 1.
      0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
      0 sectors are pending re-allocation at the end of the preclear,
         the number of sectors pending re-allocation did not change.
      0 sectors had been re-allocated before the start of the preclear.
      0 sectors are re-allocated at the end of the preclear,
         the number of sectors re-allocated did not change.

     Here's a new SMART report:

     smartctl -a -d ata /dev/sdd (--)
     smartctl 5.39.1 2010-01-28 r3054 [i486-slackware-linux-gnu] (local build)
     Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

     === START OF INFORMATION SECTION ===
     Device Model:     WDC WD20EARS-00MVWB0
     Serial Number:    WD-WCAZA1020507
     Firmware Version: 51.0AB51
     User Capacity:    2,000,398,934,016 bytes
     Device is:        Not in smartctl database [for details use: -P showall]
     ATA Version is:   8
     ATA Standard is:  Exact ATA specification draft version not indicated
     Local Time is:    Thu Feb 16 23:40:25 2012 PST
     SMART support is: Available - device has SMART capability.
     SMART support is: Enabled

     === START OF READ SMART DATA SECTION ===
     SMART overall-health self-assessment test result: PASSED

     General SMART Values:
     Offline data collection status:  (0x84) Offline data collection activity
                                             was suspended by an interrupting command from host.
                                             Auto Offline Data Collection: Enabled.
     Self-test execution status:      (   0) The previous self-test routine completed
                                             without error or no self-test has ever been run.
     Total time to complete Offline
     data collection:                (40500) seconds.
     Offline data collection
     capabilities:                    (0x7b) SMART execute Offline immediate.
                                             Auto Offline data collection on/off support.
                                             Suspend Offline collection upon new command.
                                             Offline surface scan supported.
                                             Self-test supported.
                                             Conveyance Self-test supported.
                                             Selective Self-test supported.
     SMART capabilities:            (0x0003) Saves SMART data before entering power-saving mode.
                                             Supports SMART auto save timer.
     Error logging capability:        (0x01) Error logging supported.
                                             General Purpose Logging supported.
     Short self-test routine
     recommended polling time:        (   2) minutes.
     Extended self-test routine
     recommended polling time:        ( 255) minutes.
     Conveyance self-test routine
     recommended polling time:        (   5) minutes.
     SCT capabilities:              (0x3035) SCT Status supported.
                                             SCT Feature Control supported.
                                             SCT Data Table supported.

     SMART Attributes Data Structure revision number: 16
     Vendor Specific SMART Attributes with Thresholds:
     ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
       1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
       3 Spin_Up_Time            0x0027   165   162   021    Pre-fail  Always       -       6741
       4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       633
       5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
       7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
       9 Power_On_Hours          0x0032   088   088   000    Old_age   Always       -       9156
      10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
      11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
      12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       90
     192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       28
     193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       2439
     194 Temperature_Celsius     0x0022   123   115   000    Old_age   Always       -       27
     196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
     197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
     198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       7
     199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
     200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       19

     SMART Error Log Version: 1
     No Errors Logged

     SMART Self-test log structure revision number 1
     Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
     # 1  Short offline       Completed: read failure       90%      9082         3511387976

     SMART Selective self-test log data structure revision number 1
      SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
         1        0        0  Not_testing
         2        0        0  Not_testing
         3        0        0  Not_testing
         4        0        0  Not_testing
         5        0        0  Not_testing
     Selective self-test flags (0x0):
       After scanning selected spans, do NOT read-scan remainder of disk.
     If Selective self-test is pending on power-up, resume after 0 minute delay.

     I'm torn on whether I should RMA this drive. I live close to the warranty center so I don't mind the wait, but I'd rather not waste my time if this is a perfectly good drive that had some goofy random sectors. Or should I treat any bad sectors as grounds for an RMA if I can?
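Since the RMA question really comes down to how the numbers move over time, here is a small Python sketch that lines up the raw values from the three /dev/sdd reports in these posts (15 Feb, 16 Feb and 1 Mar 2012) and prints what changed between consecutive reports. The values are copied from the reports; the names `SNAPSHOTS` and `changes` are made up for this example, and the sketch only shows the movement without saying what counts as RMA-worthy.

```python
# Raw values copied from the three smartctl reports for /dev/sdd in these posts.
SNAPSHOTS = [
    ("2012-02-15", {"Reallocated_Sector_Ct": 0, "Current_Pending_Sector": 0,
                    "Offline_Uncorrectable": 6, "Multi_Zone_Error_Rate": 23}),
    ("2012-02-16", {"Reallocated_Sector_Ct": 0, "Current_Pending_Sector": 0,
                    "Offline_Uncorrectable": 7, "Multi_Zone_Error_Rate": 19}),
    ("2012-03-01", {"Reallocated_Sector_Ct": 0, "Current_Pending_Sector": 7,
                    "Offline_Uncorrectable": 0, "Multi_Zone_Error_Rate": 1}),
]


def changes(snapshots):
    """Return (date, attribute, old, new) for each raw value that differs
    from the previous snapshot, in date order then attribute-name order."""
    out = []
    for (_, prev), (date, cur) in zip(snapshots, snapshots[1:]):
        for name in sorted(cur):
            if cur[name] != prev.get(name):
                out.append((date, name, prev.get(name), cur[name]))
    return out


for date, name, old, new in changes(SNAPSHOTS):
    print(f"{date}  {name}: {old} -> {new}")
```

Reallocated_Sector_Ct stays at 0 in all three reports, so it never shows up in the output; the pending count is the one that moves between 16 Feb and 1 Mar.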
  10. Periodic forum posts are a good decision not only from a support perspective but also from a business perspective. A very large selling point of UNRAID is the support users receive, both officially and unofficially. I've recommended UNRAID to a few friends who ended up using something else because they think it's "dead" software, since updates and communication have dropped off over the past few months. It's obviously not true, but some people are definitely assuming it is. Even a Twitter account that's updated weekly could be a good idea if you're socially inclined...?
  11. So I went hardcore and threw the parity drive into a preclear. Even though the pending sectors took care of themselves I'm not too happy with having any busted sectors in the first place. If it goes up after the preclear I'll RMA. 2 disks down, 1 to go, no data lost (yet!)
  12. So after copying about 1.5TB back onto the drive, the parity drive was used heavily and the pending sectors were handled. Here is a revised SMART report. 87 errors are reported on the unraid page. Should I still RMA this drive? If not, should I just keep doing parity checks until the errors are zero? Thanks!

     smartctl -a -d ata /dev/sdd (parity)
     smartctl 5.39.1 2010-01-28 r3054 [i486-slackware-linux-gnu] (local build)
     Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

     === START OF INFORMATION SECTION ===
     Device Model:     WDC WD20EARS-00MVWB0
     Serial Number:    WD-WCAZA1020507
     Firmware Version: 51.0AB51
     User Capacity:    2,000,398,934,016 bytes
     Device is:        Not in smartctl database [for details use: -P showall]
     ATA Version is:   8
     ATA Standard is:  Exact ATA specification draft version not indicated
     Local Time is:    Wed Feb 15 01:23:53 2012 PST
     SMART support is: Available - device has SMART capability.
     SMART support is: Enabled

     === START OF READ SMART DATA SECTION ===
     SMART overall-health self-assessment test result: PASSED

     General SMART Values:
     Offline data collection status:  (0x84) Offline data collection activity
                                             was suspended by an interrupting command from host.
                                             Auto Offline Data Collection: Enabled.
     Self-test execution status:      (   0) The previous self-test routine completed
                                             without error or no self-test has ever been run.
     Total time to complete Offline
     data collection:                (40500) seconds.
     Offline data collection
     capabilities:                    (0x7b) SMART execute Offline immediate.
                                             Auto Offline data collection on/off support.
                                             Suspend Offline collection upon new command.
                                             Offline surface scan supported.
                                             Self-test supported.
                                             Conveyance Self-test supported.
                                             Selective Self-test supported.
     SMART capabilities:            (0x0003) Saves SMART data before entering power-saving mode.
                                             Supports SMART auto save timer.
     Error logging capability:        (0x01) Error logging supported.
                                             General Purpose Logging supported.
     Short self-test routine
     recommended polling time:        (   2) minutes.
     Extended self-test routine
     recommended polling time:        ( 255) minutes.
     Conveyance self-test routine
     recommended polling time:        (   5) minutes.
     SCT capabilities:              (0x3035) SCT Status supported.
                                             SCT Feature Control supported.
                                             SCT Data Table supported.

     SMART Attributes Data Structure revision number: 16
     Vendor Specific SMART Attributes with Thresholds:
     ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
       1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
       3 Spin_Up_Time            0x0027   165   162   021    Pre-fail  Always       -       6733
       4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       632
       5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
       7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
       9 Power_On_Hours          0x0032   088   088   000    Old_age   Always       -       9110
      10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
      11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
      12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       90
     192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       28
     193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       2430
     194 Temperature_Celsius     0x0022   122   115   000    Old_age   Always       -       28
     196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
     197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
     198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       6
     199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
     200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       23

     SMART Error Log Version: 1
     No Errors Logged

     SMART Self-test log structure revision number 1
     Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
     # 1  Short offline       Completed: read failure       90%      9082         3511387976

     SMART Selective self-test log data structure revision number 1
      SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
         1        0        0  Not_testing
         2        0        0  Not_testing
         3        0        0  Not_testing
         4        0        0  Not_testing
         5        0        0  Not_testing
     Selective self-test flags (0x0):
       After scanning selected spans, do NOT read-scan remainder of disk.
     If Selective self-test is pending on power-up, resume after 0 minute delay.
  13. I believe going forward ALL drives will only have a 2-year warranty. There are going to be some old drives that still have 5 years advertised, and that would be honored of course. Sounds like the Best Buy drives are old stock, which means you should grab one while you can.
  14. Update: Disk 2 was RMA'd; the new disk was precleared and passed the test, so I added it to the array. There was a data rebuild to it, but since the previous disk was empty my files aren't on there and potentially corrupted. Right now I'm copying all the files back onto disk 2. I still have the issue of the parity drive throwing errors; my pending sector count actually went up by 1 when I did the rebuild. How do I go about forcing it to refresh? Will a simple parity check do this, or would I need to pull it out of the array, preclear it and then reintroduce it if it fixes itself?
  15. Argh, what a pain! I think this is the first time I've been frustrated that I got something better than what I bought when I made a return, haha. Thanks Joe, I'll need to consider my options.
  16. Sorry for the slight hijack, but I just sent in a drive for RMA and got a 2.5TB back too, haha. My issue is that I'm on 4.7, I only have drives up to 2TB, and in my situation my parity drive ALSO needs to be replaced. Is it possible to upgrade to Beta5, drop the new 2.5TB drive in as a data drive, and then later upgrade to a 2.5TB or 3TB parity drive? Will it "unlock" the remaining 500GB I can't use with a 2TB parity drive, or will it just not allow me to mount it as a data drive at all?
  17. Do you have extra space in your server to add the new drive and keep the old one in too? If so, you can keep your array running with the old drive and run the preclear on the new one at the same time. The preclear script will refuse to run on any HD in the array anyway, so even if you take the old drive out and replace it with the new one, as long as you don't assign the new one to the array you can preclear just fine.
  18. New server is assembled: upgraded to a light-duty AMD Athlon 4850e processor and mobo combo in an old CMStacker, and more importantly a Corsair HX650 PSU. I had a Corsair 620 watt PSU in my desktop I wanted to test with, but it has 3 (!) rails, so while it's good for my 3-disk, 1-CD-ROM desktop, it's not so good for a big server. I have submitted an RMA request for Disk2 since it's the big offender on reallocated sectors. Once I get the new disk, I'll preclear it, load it into the array and copy the data onto it. Then I'll tackle the parity drive and the not-too-important cache drive. Lionelhutz, the issue is that the PSU mounts upside down in the ML110, so the PSU couldn't draw air in since its inlet was flush against the top of the case. Doesn't matter now though. Anyone want a used HP ML110 that can handle loads of up to 6 disks but not 7?
  19. Perhaps you already ruled this out, but I get similar performance when I just use Windows to manage the file copying. When you do that, the data has to be copied twice: once to the PC and then back to the server onto the new disk. I usually average 35 MB/s, so getting about half that when I copy from one disk to another seems correct to me. My anecdotal evidence, anyway.
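The "about half" figure follows from a very simple model, sketched below in Python: if every byte has to cross the network twice (server to PC, then PC back to server) over the same link, the effective disk-to-disk rate is roughly the single-pass rate divided by two. The function name `effective_rate` is made up for this example, and the model ignores caching, overlap between the read and write streams, and disk limits, so treat it as a rough estimate only.

```python
def effective_rate(single_pass_mb_per_s, passes=2):
    """Rough disk-to-disk rate when the same data crosses the link `passes` times.

    Simplifying assumption for this sketch: the passes share one link and do
    not overlap, so the rates simply divide.
    """
    return single_pass_mb_per_s / passes


# 35 MB/s is the single-pass average quoted in the post above.
print(effective_rate(35.0))  # 17.5
```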
  20. Hey Dave, you can probably use the steps I had to go through on my end with disk2. I'm having similar issues, but more severe, haha. Thread below. http://lime-technology.com/forum/index.php?topic=18237.0
  21. Preclear finished on disk2. Funny, because after the zero-write all the pending sectors had been rewritten (0 pending), but by the end of the test the count jumped to 373. Sounds like it's time to RMA? If SMART still reports PASSED, will they take the disk regardless? I'm getting a new PSU today and plan to migrate everything into a new machine and do testing from a stronger platform. As long as I reassign the drives in the correct order I shouldn't have any migration issues, right? Results below. Thanks!

     Disk Temperature: 30C, Elapsed Time: 38:22:31
     ========================================================================1.13
     == WDC WD20EARS-00MVWB0    WD-WCAZA1050439
     == Disk /dev/sde has been successfully precleared
     == with a starting sector of 63
     ============================================================================
     ** Changed attributes in files: /tmp/smart_start_sde  /tmp/smart_finish_sde
                     ATTRIBUTE   NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS  RAW_VALUE
           Raw_Read_Error_Rate =   188     198           51        ok      15935
         Reallocated_Sector_Ct =   199     200          140        ok      30
           Temperature_Celsius =   120     122            0        ok      30
       Reallocated_Event_Count =   171     200            0        ok      29
        Current_Pending_Sector =   199     200            0        ok      373
      No SMART attributes are FAILING_NOW

      132 sectors were pending re-allocation before the start of the preclear.
      211 sectors were pending re-allocation after pre-read in cycle 1 of 1.
      0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
      373 sectors are pending re-allocation at the end of the preclear,
         a change of 241 in the number of sectors pending re-allocation.
      0 sectors had been re-allocated before the start of the preclear.
      30 sectors are re-allocated at the end of the preclear,
         a change of 30 in the number of sectors re-allocated.
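To make the summary at the bottom of that preclear report easier to follow, here is a short Python sketch that extracts the pending and re-allocated sector counts from the summary sentences and recomputes the changes the report states (241 and 30). The report text is copied from the post with the long sentences trimmed at the comma; the function names `pending_history` and `reallocated_history` are hypothetical helpers for this example, not part of the preclear script.

```python
import re

# Summary sentences from the disk2 preclear report above (trimmed at the comma).
REPORT = """\
132 sectors were pending re-allocation before the start of the preclear.
211 sectors were pending re-allocation after pre-read in cycle 1 of 1.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
373 sectors are pending re-allocation at the end of the preclear,
0 sectors had been re-allocated before the start of the preclear.
30 sectors are re-allocated at the end of the preclear,
"""


def pending_history(report):
    """Pending-sector counts in the order the report lists them."""
    pattern = r"^\s*(\d+) sectors (?:were|are) pending re-allocation"
    return [int(n) for n in re.findall(pattern, report, re.M)]


def reallocated_history(report):
    """Re-allocated sector counts at the start and end of the preclear."""
    pattern = r"^\s*(\d+) sectors (?:had been|are) re-allocated"
    return [int(n) for n in re.findall(pattern, report, re.M)]


pending = pending_history(REPORT)
realloc = reallocated_history(REPORT)
print("pending:", pending, "change:", pending[-1] - pending[0])
print("re-allocated:", realloc, "change:", realloc[-1] - realloc[0])
```

The pending history comes out as 132, 211, 0, 373: the zero-write cleared the pending sectors, and the count climbed again by the end of the run, which is the jump described in the post.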
  22. Turns out the PSU isn't proprietary, but it's mounted upside down compared to standard PSUs, so any PSU worth its salt with a fan blows air against a piece of aluminum. I'm going to order my parts and do an upgrade before I finish this up, and do preclears in the meantime; this thing is a P4 that really is on its last legs. And the PSU is a no-name clunker at 350 watts. On paper it should be fine, but I'm guessing it just can't cut it anymore due to age.
  23. Sorry, poor choice of words. If I do a preclear on the disk, it's going to involve me taking it out of the array and deleting everything on it, meaning I'd have to recalculate parity. So I copied everything off; pending sectors went UP by about 5 on disk2. I'm going to shut down the server, see whether the PSU is a standard one or some proprietary HP thing, and try a swap. From there I'll do a preclear on disk2 and see how it fares. Anything bad about my next steps? Thanks all!
  24. Sounds like I need to fast-track my new build then. I'm currently still in the process of moving everything off disk 2. If I do a preclear it's going to toast the parity validity (I think). Will unraid just ask if I want to rebuild from parity, and should I do that, or should I just copy everything back over normally and then build a NEW parity?