tr0910 Posted August 6, 2015

I have a 3 TB Seagate showing 128 errors in the error column of the GUI. It passes parity checks successfully, but I expect it is going bad. I tried to run a SMART report, but it doesn't want to work. The drive is attached to an M1015 controller in a Norco 4224 chassis.

ST3000DM001-9YN166_W1F0MED2 (sdl) 2930266532

root@Server1:~# smartctl -a -d ata /dev/sdl
smartctl 6.2 2013-07-26 r3841 [i686-linux-3.9.11p-unRAID] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

Read Device Identity failed: Invalid argument

A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

==========

root@Server1:~# smartctl -a -d ata -T permissive /dev/sdl
smartctl 6.2 2013-07-26 r3841 [i686-linux-3.9.11p-unRAID] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

Read Device Identity failed: Invalid argument

=== START OF INFORMATION SECTION ===
Device Model:     [No Information Found]
Serial Number:    [No Information Found]
Firmware Version: [No Information Found]
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   [No Information Found]
Local Time is:    Thu Aug 6 09:29:56 2015 CDT
SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 82-83 don't show if SMART supported.
SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 85-87 don't show if SMART is enabled.

A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

==========

root@Server1:~# smartctl -a -d ata -T verypermissive /dev/sdl
smartctl 6.2 2013-07-26 r3841 [i686-linux-3.9.11p-unRAID] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

Read Device Identity failed: Invalid argument

=== START OF INFORMATION SECTION ===
Device Model:     [No Information Found]
Serial Number:    [No Information Found]
Firmware Version: [No Information Found]
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   [No Information Found]
Local Time is:    Thu Aug 6 09:30:05 2015 CDT
SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 82-83 don't show if SMART supported.
SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 85-87 don't show if SMART is enabled.
                  Checking to be sure by trying SMART RETURN STATUS command.
SMART support is: Unknown - Try option -s with argument 'on' to enable it.

Read SMART Data failed: Invalid argument

=== START OF READ SMART DATA SECTION ===
Error SMART Status command failed: Invalid argument
SMART overall-health self-assessment test result: UNKNOWN!
SMART Status, Attributes and Thresholds cannot be read.

Read SMART Error Log failed: Invalid argument

Read SMART Self-test Log failed: Invalid argument

Selective Self-tests/Logging not supported

root@Server1:~#
HellDiverUK Posted August 7, 2015

Well, it is an ST3000DM001, probably the worst drive of modern times. They're nearly as bad as the old IBM 75GXP...
tr0910 Posted August 7, 2015

I have had one failure out of 20 disks. I like their speed. They haven't treated me badly yet. But why can't I run a SMART report?
tr0910 Posted August 7, 2015

Get rid of the "-d ata"

$ smartctl -a -T verypermissive /dev/sdl

That did it. I think I will pull it and rebuild the array then put it into another machine and pre-clear the daylights out of it. (wonder why My-Main from unMenu runs it with the -d ata parameter)

Serial Number:    W1F0MED2
LU WWN Device Id: 5 000c50 0524a68f1
Firmware Version: CC9E
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Fri Aug 7 09:34:58 2015 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status: (0x00)
    Offline data collection activity was never started.
    Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0)
    The previous self-test routine completed without error or no self-test has ever been run.
Total time to complete Offline data collection: ( 575) seconds.
Offline data collection capabilities: (0x73)
    SMART execute Offline immediate.
    Auto Offline data collection on/off support.
    Suspend Offline collection upon new command.
    No Offline surface scan supported.
    Self-test supported.
    Conveyance Self-test supported.
    Selective Self-test supported.
SMART capabilities: (0x0003)
    Saves SMART data before entering power-saving mode.
    Supports SMART auto save timer.
Error logging capability: (0x01)
    Error logging supported.
    General Purpose Logging supported.
Short self-test routine recommended polling time: ( 1) minutes.
Extended self-test routine recommended polling time: ( 333) minutes.
Conveyance self-test routine recommended polling time: ( 2) minutes.
SCT capabilities: (0x3081)
    SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   117   099   006    Pre-fail  Always   -           129471864
  3 Spin_Up_Time            0x0003   092   092   000    Pre-fail  Always   -           0
  4 Start_Stop_Count        0x0032   095   095   020    Old_age   Always   -           6141
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always   -           200
  7 Seek_Error_Rate         0x000f   085   060   030    Pre-fail  Always   -           396581271
  9 Power_On_Hours          0x0032   078   078   000    Old_age   Always   -           19956
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always   -           0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always   -           96
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always   -           0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always   -           0
187 Reported_Uncorrect      0x0032   087   087   000    Old_age   Always   -           13
188 Command_Timeout         0x0032   100   100   000    Old_age   Always   -           0 0 0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always   -           0
190 Airflow_Temperature_Cel 0x0022   063   045   045    Old_age   Always   In_the_past 37 (Min/Max 24/45)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always   -           0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always   -           85
193 Load_Cycle_Count        0x0032   063   063   000    Old_age   Always   -           74034
194 Temperature_Celsius     0x0022   037   055   000    Old_age   Always   -           37 (0 10 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always   -           0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline  -           0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always   -           0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline  -           2548h+18m+43.797s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline  -           23074907530402
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline  -           3995737547186

SMART Error Log Version: 1
ATA Error Count: 13 (device log contains only the most recent five errors)
    CR = Command Register [HEX]
    FR = Features Register [HEX]
    SC = Sector Count Register [HEX]
    SN = Sector Number Register [HEX]
    CL = Cylinder Low Register [HEX]
    CH = Cylinder High Register [HEX]
    DH = Device/Head Register [HEX]
    DC = Device Command Register [HEX]
    ER = Error register [HEX]
    ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes, SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 13 occurred at disk power-on lifetime: 19773 hours (823 days + 21 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time   Command/Feature_Name
  -- -- -- -- -- -- -- --   ----------------  --------------------
  60 00 00 ff ff ff 4f 00   44d+07:28:25.683  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00   44d+07:28:25.656  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00   44d+07:28:25.646  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00   44d+07:28:25.620  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00   44d+07:28:25.593  READ FPDMA QUEUED

Error 12 occurred at disk power-on lifetime: 19756 hours (823 days + 4 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time   Command/Feature_Name
  -- -- -- -- -- -- -- --   ----------------  --------------------
  60 00 10 ff ff ff 4f 00   43d+14:23:04.071  READ FPDMA QUEUED
  60 00 20 ff ff ff 4f 00   43d+14:23:04.065  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00   43d+14:22:57.187  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00   43d+14:22:57.184  READ FPDMA QUEUED
  60 00 18 ff ff ff 4f 00   43d+14:22:50.309  READ FPDMA QUEUED

Error 11 occurred at disk power-on lifetime: 19755 hours (823 days + 3 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time   Command/Feature_Name
  -- -- -- -- -- -- -- --   ----------------  --------------------
  60 00 00 ff ff ff 4f 00   43d+13:52:23.939  READ FPDMA QUEUED
  60 00 38 ff ff ff 4f 00   43d+13:52:19.501  READ FPDMA QUEUED
  60 00 40 ff ff ff 4f 00   43d+13:52:19.471  READ FPDMA QUEUED
  60 00 90 ff ff ff 4f 00   43d+13:52:19.469  READ FPDMA QUEUED
  60 00 70 ff ff ff 4f 00   43d+13:52:19.468  READ FPDMA QUEUED

Error 10 occurred at disk power-on lifetime: 19742 hours (822 days + 14 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time   Command/Feature_Name
  -- -- -- -- -- -- -- --   ----------------  --------------------
  60 00 00 ff ff ff 4f 00   43d+00:45:47.175  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00   43d+00:45:47.098  READ FPDMA QUEUED
  60 00 10 ff ff ff 4f 00   43d+00:45:41.793  READ FPDMA QUEUED
  60 00 38 ff ff ff 4f 00   43d+00:45:39.292  READ FPDMA QUEUED
  60 00 38 ff ff ff 4f 00   43d+00:45:39.286  READ FPDMA QUEUED

Error 9 occurred at disk power-on lifetime: 19742 hours (822 days + 14 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time   Command/Feature_Name
  -- -- -- -- -- -- -- --   ----------------  --------------------
  60 00 08 ff ff ff 4f 00   42d+23:59:45.230  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00   42d+23:59:45.226  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00   42d+23:59:39.302  READ FPDMA QUEUED
  60 00 10 ff ff ff 4f 00   42d+23:59:39.295  READ FPDMA QUEUED
  60 00 10 ff ff ff 4f 00   42d+23:59:33.293  READ FPDMA QUEUED

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

root@Server1:~#
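A report this long is easier to judge if the handful of attributes usually associated with failing media are pulled out on their own. The following is a minimal sketch, assuming the attribute rows have the ten-column layout shown in the report above; the function name `key_attrs` and the choice of attributes 5, 187, 197 and 198 are illustrative assumptions, not something the posters in this thread used.

```shell
#!/bin/sh
# key_attrs (hypothetical helper): read smartctl attribute rows on stdin and
# print NAME=RAW_VALUE for the attributes most often tied to bad sectors.
# Assumed column layout, as in the report above:
#   ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
key_attrs() {
    # NF >= 10 skips short rows (e.g. the selective self-test table, whose
    # fifth span also starts with "5"); the ID is matched as an exact string.
    awk 'NF >= 10 && $1 ~ /^(5|187|197|198)$/ { print $2 "=" $10 }'
}
```

It could be used as `smartctl -A /dev/sdl | key_attrs`. Fed the rows from the report above, it would list Reallocated_Sector_Ct=200 and Reported_Uncorrect=13, with the pending and offline-uncorrectable counts both at 0.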
c3 Posted August 10, 2015

I believe the smartctl parameters can be configured, so the -d ata was added and can be removed. The drive reported errors recently (within weeks) after 2+ years of service. Warranty it or trash it, unless you have the time and patience to test it to death.
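The "recently, after 2+ years" reading can be checked against the numbers in the SMART report. This is a small sketch of the arithmetic using the Power_On_Hours raw value and the power-on lifetime of the latest logged error; the variable names are made up for illustration.

```shell
#!/bin/sh
# Values copied from the SMART report earlier in the thread.
power_on_hours=19956     # attribute 9 (Power_On_Hours), raw value
last_error_hours=19773   # "Error 13 occurred at disk power-on lifetime"

# Powered-on hours that passed between the latest error and the report.
hours_since_error=$((power_on_hours - last_error_hours))

# Total powered-on time in whole days (integer division drops the remainder).
days_in_service=$((power_on_hours / 24))

echo "hours since last error: $hours_since_error"
echo "days powered on: $days_in_service"
```

That gives 183 powered-on hours since the last error (under eight days of running time) and 831 powered-on days, a little over two years. Power-on hours only count time the drive was running, so the calendar age may be longer.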
SSD Posted August 10, 2015

tr0910 said:
Get rid of the "-d ata"
$ smartctl -a -T verypermissive /dev/sdl
That did it. I think I will pull it and rebuild the array then put it into another machine and pre-clear the daylights out of it. (wonder why My-Main from unMenu runs it with the -d ata parameter)

In myMain, if you click on the hyperlinked last 4 digits of the serial number for the drive in question, a drive settings page will be displayed. Set the smartopt to -A as shown in the screenshot below. That will cause myMain to stop using the "-d ata". The -T option you are using is not required.
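To find out which device type smartctl accepts for a given disk before changing any settings, each candidate can be tried in turn. This is only a sketch: the function name `probe_smart_type` is hypothetical, the candidate list (`auto`, `sat`, `ata`) is an assumption, and it relies on smartctl returning a non-zero exit status when the identify step fails. The thread itself only confirms that dropping "-d ata" worked for this drive on the M1015.

```shell
#!/bin/sh
# probe_smart_type (hypothetical helper): print the first -d device type for
# which `smartctl -i` succeeds on the given device, or return 1 if none does.
#   auto = smartctl's own detection (same as leaving -d off)
#   sat  = SCSI-to-ATA translation, commonly used for SATA disks on SAS HBAs
#   ata  = direct ATA, the type that failed in this thread
probe_smart_type() {
    dev=$1
    for t in auto sat ata; do
        # Discard the report itself; only the exit status matters here.
        if smartctl -i -d "$t" "$dev" >/dev/null 2>&1; then
            echo "$t"
            return 0
        fi
    done
    return 1
}
```

For example, `probe_smart_type /dev/sdl` would print the first type that worked, which could then be passed to smartctl with `-d` or simply left out if it is `auto`.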
tr0910 Posted August 10, 2015

SSD said:
In myMain, if you click on the hyperlinked last 4 digits of the serial number for the drive in question, a drive settings page will be displayed. Set the smartopt to -A as shown in the screenshot below. That will cause myMain to stop using the "-d ata". The -T option you are using is not required.

Awesome, I'll change that. But first I need to get disk 23 to show up in the myMain Smart view. This is also very strange: disk 23 shows up in the default myMain view, but not when I go to the Smart view. All disks up to 22 show, but not 23.
This topic is now archived and is closed to further replies.