July 28, 2014
Hi everyone,

Recently while downloading I can't get above 3MB/s, and I'm noticing my copies over shares are bound to similar speeds. I've attached some SMART logs (I noticed an error on one drive, but the rest seem to be OK). My parity check came up clean. I need to RMA this drive for sure, but I'm concerned about the write speeds dropping so low. Can anyone help me diagnose why? I see firmware updates available, and I should probably upgrade after I complete my latest backup.

smartctl 6.2 2013-07-26 r3841 [i686-linux-3.9.11p-unRAID] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST3000DM001-1CH166
Serial Number:    Z1F37YED
LU WWN Device Id: 5 000c50 063af85f3
Firmware Version: CC27
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Mon Jul 28 02:38:42 2014 EDT

==> WARNING: A firmware update for this drive may be available,
see the following Seagate web pages:
http://knowledge.seagate.com/articles/en_US/FAQ/207931en
http://knowledge.seagate.com/articles/en_US/FAQ/223651en

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 241) Self-test routine in progress...
                                        10% of test remaining.
Total time to complete Offline data collection: ( 584) seconds.
Offline data collection capabilities: (0x73) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine recommended polling time:      (   1) minutes.
Extended self-test routine recommended polling time:   ( 343) minutes.
Conveyance self-test routine recommended polling time: (   2) minutes.
SCT capabilities:              (0x3085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   115   099   006    Pre-fail Always  -           99424032
  3 Spin_Up_Time            0x0003   095   094   000    Pre-fail Always  -           0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age  Always  -           656
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail Always  -           0
  7 Seek_Error_Rate         0x000f   045   045   030    Pre-fail Always  -           4372311308766
  9 Power_On_Hours          0x0032   094   094   000    Old_age  Always  -           5649
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail Always  -           0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age  Always  -           24
183 Runtime_Bad_Block       0x0032   099   099   000    Old_age  Always  -           1
184 End-to-End_Error        0x0032   099   099   099    Old_age  Always  FAILING_NOW 1
187 Reported_Uncorrect      0x0032   100   100   000    Old_age  Always  -           0
188 Command_Timeout         0x0032   100   100   000    Old_age  Always  -           0 0 0
189 High_Fly_Writes         0x003a   098   098   000    Old_age  Always  -           2
190 Airflow_Temperature_Cel 0x0022   061   059   045    Old_age  Always  -           39 (Min/Max 35/40)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age  Always  -           0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age  Always  -           24
193 Load_Cycle_Count        0x0032   089   089   000    Old_age  Always  -           23164
194 Temperature_Celsius     0x0022   039   041   000    Old_age  Always  -           39 (0 22 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age  Always  -           0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age  Offline -           0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age  Always  -           0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age  Offline -           2210h+37m+19.068s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age  Offline -           10399554392
242 Total_LBAs_Read         0x0000   100   253   000    Old_age  Offline -           100095612467

SMART Error Log Version: 1
ATA Error Count: 1
    CR = Command Register [HEX]
    FR = Features Register [HEX]
    SC = Sector Count Register [HEX]
    SN = Sector Number Register [HEX]
    CL = Cylinder Low Register [HEX]
    CH = Cylinder High Register [HEX]
    DH = Device/Head Register [HEX]
    DC = Device Command Register [HEX]
    ER = Error register [HEX]
    ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 1 occurred at disk power-on lifetime: 5380 hours (224 days + 4 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 00 00 00 00  Error: UNC at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 c0 00 00 e0 00   2d+17:33:23.675  READ DMA
  ca 00 08 10 57 00 e0 00   2d+17:32:50.848  WRITE DMA
  c8 00 08 10 57 00 e0 00   2d+17:32:50.848  READ DMA
  ca 00 18 f8 56 00 e0 00   2d+17:32:50.817  WRITE DMA
  ca 00 08 f0 56 00 e0 00   2d+17:32:50.792  WRITE DMA

SMART Self-test log structure revision number 1
Num  Test_Description  Status                         Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline     Self-test routine in progress  10%        5649             -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl 6.2 2013-07-26 r3841 [i686-linux-3.9.11p-unRAID] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST3000DM001-1CH166
Serial Number:    Z1F37XZN
LU WWN Device Id: 5 000c50 063afa0e4
Firmware Version: CC27
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Mon Jul 28 02:38:47 2014 EDT

==> WARNING: A firmware update for this drive may be available,
see the following Seagate web pages:
http://knowledge.seagate.com/articles/en_US/FAQ/207931en
http://knowledge.seagate.com/articles/en_US/FAQ/223651en

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever been run.
Total time to complete Offline data collection: ( 592) seconds.
Offline data collection capabilities: (0x73) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine recommended polling time:      (   1) minutes.
Extended self-test routine recommended polling time:   ( 335) minutes.
Conveyance self-test routine recommended polling time: (   2) minutes.
SCT capabilities:              (0x3085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   117   099   006    Pre-fail Always  -           137024568
  3 Spin_Up_Time            0x0003   094   094   000    Pre-fail Always  -           0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age  Always  -           294
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail Always  -           0
  7 Seek_Error_Rate         0x000f   075   060   030    Pre-fail Always  -           4332122580
  9 Power_On_Hours          0x0032   094   094   000    Old_age  Always  -           5647
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail Always  -           0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age  Always  -           24
183 Runtime_Bad_Block       0x0032   099   099   000    Old_age  Always  -           1
184 End-to-End_Error        0x0032   100   100   099    Old_age  Always  -           0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age  Always  -           0
188 Command_Timeout         0x0032   100   100   000    Old_age  Always  -           0 0 0
189 High_Fly_Writes         0x003a   100   100   000    Old_age  Always  -           0
190 Airflow_Temperature_Cel 0x0022   060   058   045    Old_age  Always  -           40 (Min/Max 29/42)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age  Always  -           0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age  Always  -           24
193 Load_Cycle_Count        0x0032   088   088   000    Old_age  Always  -           24582
194 Temperature_Celsius     0x0022   040   042   000    Old_age  Always  -           40 (0 23 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age  Always  -           0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age  Offline -           0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age  Always  -           0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age  Offline -           2185h+17m+17.850s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age  Offline -           9994957320
242 Total_LBAs_Read         0x0000   100   253   000    Old_age  Offline -           104455669833

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline     Completed without error  00%        5647             -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl 6.2 2013-07-26 r3841 [i686-linux-3.9.11p-unRAID] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST3000DM001-1CH166
Serial Number:    Z1F40YWL
LU WWN Device Id: 5 000c50 065298f6e
Firmware Version: CC27
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Mon Jul 28 02:38:53 2014 EDT

==> WARNING: A firmware update for this drive may be available,
see the following Seagate web pages:
http://knowledge.seagate.com/articles/en_US/FAQ/207931en
http://knowledge.seagate.com/articles/en_US/FAQ/223651en

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever been run.
Total time to complete Offline data collection: ( 584) seconds.
Offline data collection capabilities: (0x73) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine recommended polling time:      (   1) minutes.
Extended self-test routine recommended polling time:   ( 320) minutes.
Conveyance self-test routine recommended polling time: (   2) minutes.
SCT capabilities:              (0x3085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   118   099   006    Pre-fail Always  -           191814184
  3 Spin_Up_Time            0x0003   096   094   000    Pre-fail Always  -           0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age  Always  -           23
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail Always  -           0
  7 Seek_Error_Rate         0x000f   069   060   030    Pre-fail Always  -           34419506415
  9 Power_On_Hours          0x0032   094   094   000    Old_age  Always  -           5631
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail Always  -           0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age  Always  -           23
183 Runtime_Bad_Block       0x0032   099   099   000    Old_age  Always  -           1
184 End-to-End_Error        0x0032   100   100   099    Old_age  Always  -           0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age  Always  -           0
188 Command_Timeout         0x0032   100   100   000    Old_age  Always  -           0 0 0
189 High_Fly_Writes         0x003a   100   100   000    Old_age  Always  -           0
190 Airflow_Temperature_Cel 0x0022   062   060   045    Old_age  Always  -           38 (Min/Max 34/40)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age  Always  -           0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age  Always  -           23
193 Load_Cycle_Count        0x0032   084   084   000    Old_age  Always  -           32784
194 Temperature_Celsius     0x0022   038   040   000    Old_age  Always  -           38 (0 23 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age  Always  -           0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age  Offline -           0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age  Always  -           0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age  Offline -           3578h+59m+15.178s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age  Offline -           26039179496
242 Total_LBAs_Read         0x0000   100   253   000    Old_age  Offline -           92557252833

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline     Completed without error  00%        5631             -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

log.txt

July 28, 2014
Are you getting those slow speeds when you're (a) writing to a share; (b) writing to a specific drive; or (c) no matter which drive you write to? Also, are ALL of the drives spinning up when you write, or just the drive you're writing to along with the parity drive?

July 28, 2014 (Author)
All drives are spun up.
(a) Writing to the share is slow, but it could be luck.
(b) Writing to my disk 1 starts off at 70MB/s for a brief second (Windows Explorer issues?) and then slows down to somewhere between 15-30MB/s, which is acceptable (no cache drive). Disk 2 starts off fast for 2s as well, and then drops down to 2-5MB/s when writing directly to it; sometimes it stalls to 0KB/s or 800KB/s.
I'm thinking perhaps this is just a bad drive, but the SMART report still says 'healthy' (even though it does have one thing listed as FAILED, I guess; I'm not sure how to read it). Reads are still great from both.
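
On reading the attribute table: the overall PASSED line is a separate self-assessment, while each attribute row carries its own WHEN_FAILED column, which smartctl fills in (for example with FAILING_NOW) when that attribute's normalized value has reached its threshold. A minimal Python sketch of picking those rows out, assuming the whitespace-separated row layout of the `smartctl` output posted above; `failing_attributes` is a hypothetical helper name, not part of smartmontools.

```python
def failing_attributes(table_text):
    """Return (id, name, when_failed) for rows whose WHEN_FAILED column is not '-'."""
    flagged = []
    for line in table_text.splitlines():
        fields = line.split()
        # A data row has at least 10 columns and starts with a numeric attribute ID;
        # this skips the "ID# ATTRIBUTE_NAME ..." header and any blank lines.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        when_failed = fields[8]  # 9th column in the layout shown in this thread
        if when_failed != "-":
            flagged.append((int(fields[0]), fields[1], when_failed))
    return flagged


# Three rows copied from the first drive's report in this thread.
sample = """\
  5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
184 End-to-End_Error 0x0032 099 099 099 Old_age Always FAILING_NOW 1
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
"""
print(failing_attributes(sample))  # [(184, 'End-to-End_Error', 'FAILING_NOW')]
```

Run against the three reports, only the first drive (serial Z1F37YED) would be flagged, since it is the only one with a non-empty WHEN_FAILED entry.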

July 28, 2014
It's always important to post the unRAID version with these types of reports. And if it applies to a beta, it should be posted in the beta's announcement thread. If this relates to 6b6, you'll see this issue has been reported. I suggest you post there, pointing to this thread, to let LT know that a number of users are seeing this behavior. Are you only seeing this when it is accessed over the network?

July 28, 2014 (Author)
Doh! Version: 5.0.3. I just tried with rsync --progress (in a telnet session), and I get the same results.
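
The local rsync test takes the network out of the picture. A comparable per-disk check can be sketched in Python, assuming Python is available on the server (which may not be the case on a stock unRAID install). The function name `write_throughput_mib_s` and its parameters are hypothetical; pointing it at each disk's mount point in turn (for example /mnt/disk1 and /mnt/disk2, an assumption about the mount layout) would give numbers to compare.

```python
import os
import tempfile
import time


def write_throughput_mib_s(directory, total_mib=64, chunk_mib=1):
    """Write about total_mib MiB of zeros into `directory` and return MiB/s."""
    chunk = b"\0" * (chunk_mib * 1024 * 1024)
    written = 0
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            while written < total_mib * 1024 * 1024:
                f.write(chunk)
                written += len(chunk)
            f.flush()
            os.fsync(f.fileno())  # wait until the data has left the page cache
        elapsed = max(time.perf_counter() - start, 1e-9)
    finally:
        os.remove(path)  # always clean up the scratch file
    return (written / (1024 * 1024)) / elapsed


if __name__ == "__main__":
    # The temp directory is only a stand-in so the example runs anywhere.
    rate = write_throughput_mib_s(tempfile.gettempdir(), total_mib=16)
    print(f"{rate:.1f} MiB/s")
```

A small write like this mostly measures caches; a larger `total_mib` gives a figure closer to the sustained rate described in the thread.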

July 28, 2014
I'd suggest upgrading to 5.0.5 to see if it happens there, and if so, reporting it in the 5.0.5 announcement thread. It sounds very much like the issue I and others are seeing in 6b6, and could be the same bug. A syslog is always a good idea, BTW.

July 28, 2014
All drives spinning may mean you're writing to a failed drive (which means you're actually writing to every drive EXCEPT the failed drive, which is very slow). Look at the Web GUI and see if any of your drives have a red dot by them (in particular drive 2).
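
Why a failed drive drags every other drive into each write can be shown with a toy model. This is a sketch assuming a single XOR parity disk, which is how unRAID's parity is usually described; the disk contents and the helper `xor_blocks` are made up for illustration. With one disk missing, its data only exists as the XOR of parity and all remaining data disks, so every access to it touches all of them.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data disks, one 4-byte block each.
disk1 = b"\x01\x02\x03\x04"
disk2 = b"\x10\x20\x30\x40"
disk3 = b"\xaa\xbb\xcc\xdd"
parity = xor_blocks([disk1, disk2, disk3])

# If disk2 has failed, its block is rebuilt from parity plus EVERY other data disk.
rebuilt_disk2 = xor_blocks([parity, disk1, disk3])
print(rebuilt_disk2 == disk2)  # True
```

In a healthy array a write only needs the target disk and the parity disk, which is why all drives spinning during a write is a useful hint.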

July 28, 2014
This is a long shot, but what the heck... I've had slow performance writing to the array if too much system memory was used by cache. I think this can happen if the cache pressure setting is too low (a plugin or script may alter it). Try:

    echo 3 > /proc/sys/vm/drop_caches

and see if speed improves (it may only improve for a short time if it is indeed what I'm talking about).
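
For context, writing 3 to /proc/sys/vm/drop_caches asks the kernel to drop the page cache as well as dentries and inodes, and the cache pressure setting mentioned here lives in /proc/sys/vm/vfs_cache_pressure. Before dropping anything, it can help to see how much memory the cache actually holds. A rough Python sketch, assuming the usual `Name: value kB` layout of /proc/meminfo; `cache_fraction` is a hypothetical helper and the sample numbers are illustrative, not taken from this server.

```python
def cache_fraction(meminfo_text):
    """Fraction of MemTotal currently held by the page cache ('Cached' field)."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])  # values are reported in kB
    return values["Cached"] / values["MemTotal"]


# Illustrative numbers only.
sample = "MemTotal: 4000000 kB\nMemFree: 200000 kB\nCached: 3000000 kB\n"
print(cache_fraction(sample))  # 0.75
```

On a live Linux system the input would come from `open("/proc/meminfo").read()`. A high fraction is normal on a file server, so this only gives context for the before/after comparison suggested in the post.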

July 28, 2014 (Author)
Quote: "All drives spinning may mean you're writing to a failed drive (which means you're actually writing to every drive EXCEPT the failed drive, which is very slow). Look at the Web GUI and see if any of your drives have a red dot by them (in particular drive 2)."
No red dots. All of them are green.

July 28, 2014 (Author)
Quote: "This is a long shot, but what the heck... I've had slow performance writing to the array if too much system memory was used by cache. I think this can happen if the cache pressure setting is too low (a plugin or script may alter it). Try: echo 3 > /proc/sys/vm/drop_caches and see if speed improves (it may only improve for a short time if it is indeed what I'm talking about)."
No luck with that, I'm afraid.

July 28, 2014
It would help if you provided a syslog so we can see whether you are getting error reports for any drive. I have also encountered situations where the GUI shows all drives as green until I stop the array, and only at that point shows that a drive has dropped offline. It takes a reboot to recover the drive in this situation. Quite why the drive is not flagged with a red icon in the GUI, I am not sure.

July 28, 2014 (Author)
Enclosed is a syslog. I'm not sure why I'm not getting the red dot failure.
log.txt

July 29, 2014
That syslog appears to be after a fresh reboot (is that correct?). To be able to use it to diagnose a problem, we need one that covers the period when the problem is happening. As an aside, you are loading a number of plugins. Have you checked if the problem also occurs if you use the 'Safe Boot' option to start without any plugins?

July 29, 2014 (Author)
That's right -- this is a relatively fresh boot. The problem occurs immediately on start up (any copy), so that should be OK. I'll try a safe boot and report back. I'm not sure whether to blame the HDD or a piece of software yet.

July 29, 2014
Looks like you have several user plugins installed as system plugins (in /boot/plugins) instead of as user plugins (in /boot/config/plugins), and you have SAB installing twice, once from each place. This is probably unrelated to your problem, but you should probably clean this up. System plugins are kept separate from user plugins because the system plugins are installed first. Normally the system plugin folder should be reserved for alternate webGUIs.
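
A quick way to spot a plugin present in both places is to compare the two directory listings. This is only a sketch: it assumes a duplicated plugin uses the same file name in both folders (which may not hold for every plugin), and `duplicated_plugins` is a hypothetical helper, not an unRAID tool.

```python
import os


def duplicated_plugins(system_dir, user_dir):
    """Return file names that appear in both plugin directories, sorted."""
    def names(directory):
        # A missing directory simply contributes no names.
        return set(os.listdir(directory)) if os.path.isdir(directory) else set()

    return sorted(names(system_dir) & names(user_dir))


if __name__ == "__main__":
    # Paths as named in the post above.
    print(duplicated_plugins("/boot/plugins", "/boot/config/plugins"))
```

Anything it lists is a candidate for removal from one of the two folders, after checking which copy is actually wanted.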

August 1, 2014 (Author)
It's the first drive in that report, the one that has End-to-End_Error (attribute 184) listed as "FAILING_NOW".