Mettbrot Posted May 6, 2013

Hi there. I recently purchased the Pro version to add more drives to the array and ran 3 HDDs + parity just fine. Suddenly the parity drive turned red and got disabled. I attached the syslog. Here is the SMART status; I ran a couple of tests:

smartctl 5.40 2010-10-16 r3189 [i486-slackware-linux-gnu] (local build)
Copyright © 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     ST3000DM001-9YN166
Serial Number:    Z1F0E70H
Firmware Version: CC4C
User Capacity:    3,000,592,982,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Tue May 7 00:38:13 2013 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 ( 584) seconds.
Offline data collection
capabilities:                    (0x73) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 255) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x3085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   114   099   006    Pre-fail  Always       -       64856304
  3 Spin_Up_Time            0x0003   092   092   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       1012
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   066   060   030    Pre-fail  Always       -       17199648463
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       2557
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       419
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   094   094   000    Old_age   Always       -       6
190 Airflow_Temperature_Cel 0x0022   061   049   045    Old_age   Always       -       39 (Min/Max 21/39)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       47
193 Load_Cycle_Count        0x0032   093   093   000    Old_age   Always       -       15448
194 Temperature_Celsius     0x0022   039   051   000    Old_age   Always       -       39 (0 16 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       138
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       12940736464244
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       227063173794212
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       174954067296341

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      2549         -
# 2  Extended offline    Completed without error       00%         5         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Could it be that this is just caused by a crappy cable or the power supply not having enough power? It's a rather old computer with a PCI SATA expansion card (4 ports).

Thank you

Attachment: syslog.txt
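For anyone who wants to screen a report like the one above automatically, here is a minimal Python sketch. It parses smartctl attribute rows and reports which counters from a small watch list are non-zero. The watch list and its "media"/"link" labels are my own illustrative assumptions, not anything unRAID or smartmontools defines; the sample rows are copied from the report in this post. The idea is that a non-zero UDMA_CRC_Error_Count alongside zero reallocated and pending sectors usually points at the data link (cable or connector) rather than the platters.

```python
import re

# Attribute rows copied from the smartctl report in this thread.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
190 Airflow_Temperature_Cel 0x0022   061   049   045    Old_age   Always       -       39 (Min/Max 21/39)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       138
"""

# ID, NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW (leading integer only)
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)

# Hypothetical watch list: attribute name -> rough category (an assumption for illustration).
WATCH = {
    "Reallocated_Sector_Ct": "media",
    "Current_Pending_Sector": "media",
    "Offline_Uncorrectable": "media",
    "UDMA_CRC_Error_Count": "link",
}


def parse_attributes(text):
    """Return {attribute_name: raw_value} for every row that looks like an attribute line."""
    attrs = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            attrs[match.group(2)] = int(match.group(9))
    return attrs


def nonzero_watched(attrs):
    """Return {name: (raw_value, category)} for watched attributes with a non-zero raw value."""
    return {
        name: (raw, WATCH[name])
        for name, raw in attrs.items()
        if name in WATCH and raw > 0
    }


if __name__ == "__main__":
    for name, (raw, category) in nonzero_watched(parse_attributes(SAMPLE)).items():
        print(f"{name}: raw={raw} ({category})")
```

On the rows above, only the CRC counter is flagged, which fits the cable theory. Note that this counter never resets, so what matters after a cable swap is whether it keeps increasing.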
garycase Posted May 6, 2013

I'd reseat (unplug/replug) both the power and data cables to the drive -- better yet, replace the data cable with a new locking cable. It's hard to say whether the power supply could be an issue, since you didn't post its specs; but with only 4 drives it's generally unlikely.

Reseat the cables; then see if the status changes. In addition, post the specifics of your system (motherboard, CPU, memory, PSU specs, and the make/model of the add-in controller card).
Mettbrot Posted May 7, 2013 (Author)

Intel Pentium 4 with 2.35 GHz
OEM motherboard MS-6583 from MSI
2 GB DDR1 RAM
250W power supply
S-ATA RAID controller, 4 channel, with Silicon Image 3114 chip
3x 3TB hard drive
1x 2TB hard drive

What is the best procedure to detect which drive is which? Do I disconnect one at a time and start the server to see which one is missing? The 3 3TB HDDs are of the same type and I have no record :-( Although this time I will create one^^
Joe L. Posted May 7, 2013

Quote (Mettbrot): "The 3 3TB HDDs are of the same type and I have no record :-( Although this time I will create one^^"

I have no doubt you will. Since unRAID uses disk model/serial numbers, that is what you should also track.
garycase Posted May 7, 2013

As Joe noted, UnRAID displays the serial number for each drive on the Web GUI ... so you can note the serial number for each; then look on the drives and determine which is which [you'll likely have to remove the drives to read the serial number].
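If you have console access, the same model/serial record can be pulled without the Web GUI. The Python sketch below assumes a Linux system where udev populates /dev/disk/by-id with symlinks whose names contain the model and serial number (for example something like ata-ST3000DM001-9YN166_Z1F0E70H); whether a particular unRAID release exposes that directory is an assumption on my part. The function name map_ids_to_devices is made up for this example.

```python
import os


def map_ids_to_devices(by_id_dir="/dev/disk/by-id"):
    """Map each whole-disk by-id name (model + serial) to the device node it points at.

    Returns an empty dict if the directory does not exist on this system.
    """
    mapping = {}
    if not os.path.isdir(by_id_dir):
        return mapping
    for name in sorted(os.listdir(by_id_dir)):
        if "-part" in name:
            continue  # skip partition entries such as ...-part1
        path = os.path.join(by_id_dir, name)
        if not os.path.islink(path):
            continue
        mapping[name] = os.path.realpath(path)  # resolve the symlink, e.g. to /dev/sdb
    return mapping


if __name__ == "__main__":
    # Print one line per disk; save this as your written record.
    for name, device in map_ids_to_devices().items():
        print(f"{device}\t{name}")
```

This only tells you which serial number belongs to which device node. Matching a serial number to a physical drive still means reading the label on the drive itself, as described above.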
Mettbrot Posted May 7, 2013 (Author)

OK, I noted everything and changed the SATA cable of the parity drive (it was different from the other 3 so I swapped it). Rebuilding parity now. Any notes on the hardware or is everything ok?
Joe L. Posted May 7, 2013

Quote (Mettbrot): "OK, I noted everything and changed the SATA cable of the parity drive (it was different from the other 3 so I swapped it). Rebuilding parity now. Any notes on the hardware or is everything ok?"

The power supply might be a bit on the small side for 4 drives and an older, power-hungry Pentium.
garycase Posted May 7, 2013

Agree the PSU is a bit small -- if you get random issues when the drives are spinning up or during parity checks, this is probably the first thing I'd change. [Go with a good 80+ certified 400W unit if you replace it ... you don't want too large a unit, as then you're running outside of its most efficient operating range.]

You probably don't have any choice on this system, but PCI SATA cards are much slower than motherboard ports or PCIe x4 (or x8) cards. Something to remember when you eventually decide to upgrade the motherboard/CPU. The performance difference probably doesn't matter for most purposes; but parity checks will take appreciably longer than they would on a faster controller.
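To put a rough number on the PCI point, here is a back-of-the-envelope Python sketch. It assumes all four drives hang off the PCI card, that the card sits on a conventional 32-bit/33 MHz PCI bus (theoretical peak of about 133 MB/s, shared by everything on the bus), and that a parity check reads all drives at the same time for the whole run. Real-world PCI throughput is lower than the theoretical peak, and the 2TB drive drops out early, so treat the result as an order-of-magnitude estimate only. The function name is made up for this example.

```python
# Theoretical peak of conventional PCI: 32 bits x 33 MHz, about 133 MB/s for the whole bus.
PCI_PEAK_BYTES_PER_S = 133_000_000


def estimate_parity_check_hours(largest_drive_bytes, drives_on_bus,
                                bus_bytes_per_s=PCI_PEAK_BYTES_PER_S):
    """Estimate parity-check duration when the shared bus is the only bottleneck.

    Simplification: every drive is read concurrently, so each one gets an equal
    share of the bus, and the check ends when the largest drive has been read.
    """
    per_drive_bytes_per_s = bus_bytes_per_s / drives_on_bus
    return largest_drive_bytes / per_drive_bytes_per_s / 3600


if __name__ == "__main__":
    # 3 TB parity drive (capacity taken from the smartctl report), 4 drives on the card.
    hours = estimate_parity_check_hours(3_000_592_982_016, 4)
    print(f"Estimated parity check time: about {hours:.0f} hours")
```

With these assumptions each drive gets roughly 33 MB/s, so a full pass over a 3TB drive lands at about a day. The same drives can each stream considerably faster than that on motherboard or PCIe ports, which is why the check takes appreciably longer here.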
Mettbrot Posted May 7, 2013 (Author)

Thanks, I know. I started with this old system, knowing that it would some day reach its capacity. But I am really impressed by how much unRAID can get out of such an old system - it runs very smoothly :-) When the third drive is filled I will eventually invest in a completely new system, but until that day I am happy with it (hoping it was the SATA cable, not the PSU :-P )
garycase Posted May 7, 2013

Sounds like it was just the cable. For only 4 drives, your PSU is probably fine -- UnRAID doesn't tax the CPU much, so the PSU won't come near its rated output anyway. I suspect if you measured your power consumption with a Kill-a-Watt you'd find it's not much over 100W. If the PSU was going to be a problem, you'd notice it during boot or during spin-ups when doing a parity check.
Mettbrot Posted May 7, 2013 (Author)

Thank you very much! You guys have been a great help!!
Archived
This topic is now archived and is closed to further replies.