September 17, 2011 Okay, I was watching The Other Guys streamed to my iPhone via Air Video. I closed the app to make a phone call and then couldn't get it running again. I checked the server through the GUI: unreachable. I went to the actual console and couldn't get the root screen. I subsequently restarted and a parity check was invoked. I'm at 62% with 300487295 sync errors. Should I be concerned? I was not writing data to the server, just streaming. Attached is the syslog. syslog.txt
September 17, 2011 Run SMART tests on your drives, starting with the parity drive. I don't see anything in your syslog except the presence of sync errors.
September 17, 2011 Author Just run the SMART test through unMenu and post? Or should I run extended tests? Here is the SMART report for parity. parity.txt
September 17, 2011 I think that is enough...

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always       -       72
  2 Throughput_Performance  0x0026   252   252   000    Old_age   Always       -       0
  3 Spin_Up_Time            0x0023   075   064   025    Pre-fail  Always       -       7752
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       788
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   252   252   051    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0024   252   252   015    Old_age   Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       1552
 10 Spin_Retry_Count        0x0032   252   252   051    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       1
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       97
181 Unknown_Attribute       0x0022   100   100   000    Old_age   Always       -       14258524
191 G-Sense_Error_Rate      0x0022   001   001   000    Old_age   Always       -       3499182
192 Power-Off_Retract_Count 0x0022   252   252   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0002   064   059   000    Old_age   Always       -       27 (Lifetime Min/Max 17/41)
195 Hardware_ECC_Recovered  0x003a   100   100   000    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   252   252   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   252   252   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   252   252   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0036   086   086   000    Old_age   Always       -       7729
200 Multi_Zone_Error_Rate   0x002a   001   001   000    Old_age   Always       -       110068
223 Load_Retry_Count        0x0032   100   100   000    Old_age   Always       -       1
225 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       799

CRC errors may have been caused by cabling. Reseat or replace the parity drive cable. Run another SMART test to see if you get more.
September 17, 2011 Author Okay, sounds good. I ran it on parity and it looks good, nothing failing. I'm just worried that I may have a failing parity drive. The CRC errors came from previous bad cables.
September 17, 2011 Multi-zone error rate doesn't look good, and neither does G-sense error rate, especially in combination with the parity sync errors. Run the SMART tests; that way you will have a history. Have you completed a parity check?

For comparison, here is one of my failed drives (early in the failure):

Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always       -       1453
  2 Throughput_Performance  0x0026   252   252   000    Old_age   Always       -       0
  3 Spin_Up_Time            0x0023   067   066   025    Pre-fail  Always       -       10205
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       212
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   252   252   051    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0024   252   252   015    Old_age   Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       532
 10 Spin_Retry_Count        0x0032   252   252   051    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   252   252   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       16
181 Program_Fail_Cnt_Total  0x0022   252   252   000    Old_age   Always       -       0
191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always       -       1
192 Power-Off_Retract_Count 0x0022   252   252   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0002   064   061   000    Old_age   Always       -       31 (Lifetime Min/Max 20/39)
195 Hardware_ECC_Recovered  0x003a   100   100   000    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   252   252   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       15
198 Offline_Uncorrectable   0x0030   252   252   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0036   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       1
223 Load_Retry_Count        0x0032   252   252   000    Old_age   Always       -       0
225 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       213
September 17, 2011 Author Yeah, the multi-zone error rate is rather high, but I have 2 other drives, including this one, with high multi-zone errors. I'm at 70 percent on the parity check. I have an Antec EA650 power supply. Could that be the culprit?
September 17, 2011 Author The parity check completed with an absurd number of sync errors. I replaced the cable to the parity drive and began another parity check. Within the first minute I was getting sync errors... is my parity drive going, or a regular data drive? I'm running a memory test as we speak. If that turns up nothing, I'll be ordering a WD Black drive to replace the Samsung.
September 17, 2011 Likely your parity drive. Post your other SMART tests as well, as a .zip.

200 Multi_Zone_Error_Rate   0x002a   001   001   000    Old_age   Always       -       110068
191 G-Sense_Error_Rate      0x0022   001   001   000    Old_age   Always       -       3499182

The normalized values (001 in both rows) are your current status. They start at 100, and 000 indicates failure.
September 17, 2011 Sometimes they start at 100, other times at 200, and yet others at any value the manufacturer wishes. The starting value could easily be "1". For something like a G-Force error (drive subjected to high G-force), it could easily be 1 for never having been dropped and 0 if it has. There is no in-between value that is meaningful.
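To make the point about normalized values concrete, here is a minimal Python sketch that parses attribute rows in the layout posted above and applies the usual SMART rule that an attribute counts as failed only when VALUE is at or below THRESH. The names `SmartAttr`, `parse_attributes`, and `failed_attributes` are made up for this illustration and are not part of unMenu or smartctl; the sample rows are copied from the parity drive report earlier in the thread.

```python
from typing import List, NamedTuple


class SmartAttr(NamedTuple):
    attr_id: int
    name: str
    value: int   # normalized current value
    worst: int   # lowest normalized value recorded
    thresh: int  # vendor failure threshold
    raw: str     # vendor-specific raw value, kept as text


def parse_attributes(report: str) -> List[SmartAttr]:
    """Parse whitespace-separated attribute rows; skip headers and other text."""
    attrs = []
    for line in report.splitlines():
        parts = line.split()
        # A data row has 10+ columns and starts with a numeric attribute ID.
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        attrs.append(SmartAttr(
            attr_id=int(parts[0]),
            name=parts[1],
            value=int(parts[3]),
            worst=int(parts[4]),
            thresh=int(parts[5]),
            raw=" ".join(parts[9:]),
        ))
    return attrs


def failed_attributes(attrs: List[SmartAttr]) -> List[SmartAttr]:
    """Attributes whose normalized value has reached the vendor threshold."""
    return [a for a in attrs if a.value <= a.thresh]


# Rows copied from the parity drive report posted earlier in this thread.
SAMPLE = """\
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate 0x002f 100 100 051 Pre-fail Always - 72
200 Multi_Zone_Error_Rate 0x002a 001 001 000 Old_age Always - 110068
191 G-Sense_Error_Rate 0x0022 001 001 000 Old_age Always - 3499182
"""

if __name__ == "__main__":
    for attr in parse_attributes(SAMPLE):
        status = "FAILED" if attr in failed_attributes([attr]) else "ok"
        print(f"{attr.name}: value={attr.value} thresh={attr.thresh} {status}")
```

With these rows nothing is reported as failed, because a value of 001 is still above a threshold of 000. That matches the caution above: a very low normalized value is not a failure verdict on its own, so the raw counts and their trend are worth watching as well.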
September 17, 2011 Author I have attached SMART reports for all 3 disks in my array. I went a step further and took the parity drive out of the device selection, started the array, stopped the array, and then added the parity drive back. It's rebuilding parity, and after running the SMART test on parity again, its multi-zone error rate has increased. disk1.txt disk2.txt disk3.txt parity.txt
September 17, 2011 Agreed that most of the time these values mean nothing to the end user. The only reason I said the G-force started at 100 was because my identical drive (HD204UI) started at 100.
September 17, 2011 The other drives look good for the most part: few raw read errors, etc. One of the things I try to do is run a SMART test monthly to get a basic trend on all parameters. unMenu's SMART history also helps; the installation package is found here: http://lime-technology.com/forum/index.php?topic=3146.0
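For anyone saving the monthly reports by hand, the trend check can be sketched in a few lines of Python: compare the raw values of two saved reports and list the attributes whose counts went up. This is only an illustration and not how unMenu's SMART history works internally; `raw_values` and `raw_increases` are hypothetical helper names, and the "newer" Multi_Zone_Error_Rate figure below is invented for the example (the thread only says the count increased, not by how much).

```python
from typing import Dict


def raw_values(report: str) -> Dict[str, int]:
    """Map attribute name to raw value for rows whose raw value starts with an integer."""
    values = {}
    for line in report.splitlines():
        parts = line.split()
        # Data rows start with a numeric ID; column 10 is the start of RAW_VALUE.
        if len(parts) >= 10 and parts[0].isdigit() and parts[9].isdigit():
            values[parts[1]] = int(parts[9])
    return values


def raw_increases(older: str, newer: str) -> Dict[str, int]:
    """Return how much each raw value grew between two reports (growth only)."""
    old, new = raw_values(older), raw_values(newer)
    return {
        name: new[name] - old[name]
        for name in new
        if name in old and new[name] > old[name]
    }


# Older rows are copied from the parity report in this thread.
OLDER = """\
199 UDMA_CRC_Error_Count 0x0036 086 086 000 Old_age Always - 7729
200 Multi_Zone_Error_Rate 0x002a 001 001 000 Old_age Always - 110068
"""

# Hypothetical later report: the 110500 figure is made up for illustration.
NEWER = """\
199 UDMA_CRC_Error_Count 0x0036 086 086 000 Old_age Always - 7729
200 Multi_Zone_Error_Rate 0x002a 001 001 000 Old_age Always - 110500
"""

if __name__ == "__main__":
    print(raw_increases(OLDER, NEWER))  # {'Multi_Zone_Error_Rate': 432}
```

A CRC count that stays flat after a cable swap and a multi-zone count that keeps climbing would show up here as exactly one changed attribute, which is the kind of pattern a monthly history makes visible.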
September 18, 2011 Is this something that is tracked automatically when you run them? Or do you save all the SMART tests? Edit: I see. I had unMenu installed already from when I first built the server but never used the SMART history. I downloaded it and PHP, and it's pretty swank. I've got a couple of old hard drives in the server, so it'll be nice to have some warning. For others who were wondering, SMART History is here: http://lime-technology.com/forum/index.php?topic=3146.0 and PHP can be downloaded through the packages in unMenu.
September 18, 2011 Author I have had unMenu installed, but nothing popped up about a drive failing. I checked the SMART reports monthly, but nothing showed any indication of a drive failing, just that the multi-zone error rate was climbing. It stinks, but I ordered the WD Black drive as a replacement. I'm going to try to get a new drive from Samsung on Monday. Has anyone had luck getting a new drive back from Samsung? And thanks for the help trying to diagnose the problem with the parity drive.
September 18, 2011 Yeah... I RMA'd a HD154UI with no problems. Sent the next day after receipt by Samsung. I have a HD204UI in process as well.
September 18, 2011 Author Thanks for your help... I will be RMA'ing at work.