April 3, 2011 I am running 4.6 and my server rebooted unexpectedly. I checked the syslog, but it looks like it was restarted when the server was turned back on. Is there any way to find out why, or what was going on when this server rebooted? This isn't the first time, and I was copying data to a share at the time. Thanks, Neil
April 3, 2011 Something like this is almost always hardware related. Your PSU could be bad or underpowered, caps on the motherboard could be bulging, or it could be any one of a dozen other problems. Has this machine been running perfectly for some period of time and this has just started happening? If so, have you done anything recently with the box (e.g., added a drive, moved it from location A to location B)? Or is this a new build? My first thought would be to unplug and replug everything (PSU connections, SATA cables, controller cards, etc.). If it continues to reboot more or less randomly, it might require some spare parts to swap in and out to try to isolate it.
April 3, 2011 Author This is a fairly new build. I would say about 3 months and during that time I think it has done this 3 times. Right now it is doing the parity check and the box is really running poorly. I am seeing this in the syslog:

Apr 3 19:27:11 Storage kernel: ata5.00: configured for UDMA/33
Apr 3 19:27:11 Storage kernel: ata5.01: configured for UDMA/133
Apr 3 19:27:11 Storage kernel: ata5.00: device reported invalid CHS sector 0
Apr 3 19:27:11 Storage kernel: ata5: EH complete
Apr 3 19:27:37 Storage kernel: ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Apr 3 19:27:37 Storage kernel: ata5.00: BMDMA stat 0x64
Apr 3 19:27:37 Storage kernel: ata5.00: failed command: READ DMA EXT
Apr 3 19:27:37 Storage kernel: ata5.00: cmd 25/00:18:0f:ef:27/00:01:63:00:00/e0 tag 0 dma 143360 in
Apr 3 19:27:37 Storage kernel: res 51/40:00:eb:ef:27/40:00:63:00:00/00 Emask 0x9 (media error)
Apr 3 19:27:37 Storage kernel: ata5.00: status: { DRDY ERR }
Apr 3 19:27:37 Storage kernel: ata5.00: error: { UNC }
Apr 3 19:27:37 Storage kernel: ata5.00: configured for UDMA/33
Apr 3 19:27:37 Storage kernel: ata5.01: configured for UDMA/133
Apr 3 19:27:37 Storage kernel: ata5: EH complete
Apr 3 19:27:53 Storage kernel: ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Apr 3 19:27:53 Storage kernel: ata5.00: BMDMA stat 0x64
Apr 3 19:27:53 Storage kernel: ata5.00: failed command: READ DMA EXT
Apr 3 19:27:53 Storage kernel: ata5.00: cmd 25/00:18:0f:ef:27/00:01:63:00:00/e0 tag 0 dma 143360 in
Apr 3 19:27:53 Storage kernel: res 51/40:00:eb:ef:27/40:00:63:00:00/00 Emask 0x9 (media error)
Apr 3 19:27:53 Storage kernel: ata5.00: status: { DRDY ERR }
Apr 3 19:27:53 Storage kernel: ata5.00: error: { UNC }
Apr 3 19:27:53 Storage kernel: ata5.00: configured for UDMA/33
Apr 3 19:27:53 Storage kernel: ata5.01: configured for UDMA/133
Apr 3 19:27:53 Storage kernel: ata5: EH complete
Apr 3 19:28:11 Storage kernel: ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Apr 3 19:28:11 Storage kernel: ata5.00: BMDMA stat 0x64
Apr 3 19:28:11 Storage kernel: ata5.00: failed command: READ DMA EXT
Apr 3 19:28:11 Storage kernel: ata5.00: cmd 25/00:18:0f:ef:27/00:01:63:00:00/e0 tag 0 dma 143360 in
Apr 3 19:28:11 Storage kernel: res 51/40:00:eb:ef:27/40:00:63:00:00/00 Emask 0x9 (media error)
Apr 3 19:28:11 Storage kernel: ata5.00: status: { DRDY ERR }
Apr 3 19:28:11 Storage kernel: ata5.00: error: { UNC }
Apr 3 19:28:11 Storage kernel: ata5.00: configured for UDMA/33
Apr 3 19:28:11 Storage kernel: ata5.01: configured for UDMA/133
Apr 3 19:28:11 Storage kernel: ata5: EH complete
Apr 3 19:28:25 Storage kernel: ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Apr 3 19:28:25 Storage kernel: ata5.00: BMDMA stat 0x64
Apr 3 19:28:25 Storage kernel: ata5.00: failed command: READ DMA EXT
Apr 3 19:28:25 Storage kernel: ata5.00: cmd 25/00:18:0f:ef:27/00:01:63:00:00/e0 tag 0 dma 143360 in
Apr 3 19:28:25 Storage kernel: res 51/40:00:eb:ef:27/40:00:63:00:00/00 Emask 0x9 (media error)
Apr 3 19:28:25 Storage kernel: ata5.00: status: { DRDY ERR }
Apr 3 19:28:25 Storage kernel: ata5.00: error: { UNC }
Apr 3 19:28:25 Storage kernel: ata5.00: configured for UDMA/33
Apr 3 19:28:26 Storage kernel: ata5.01: configured for UDMA/133
Apr 3 19:28:26 Storage kernel: sd 3:0:0:0: [sdg] Unhandled sense code
Apr 3 19:28:26 Storage kernel: sd 3:0:0:0: [sdg] Result: hostbyte=0x00 driverbyte=0x08
Apr 3 19:28:26 Storage kernel: sd 3:0:0:0: [sdg] Sense Key : 0x3 [current] [descriptor]
Apr 3 19:28:26 Storage kernel: Descriptor sense data with sense descriptors (in hex):
Apr 3 19:28:26 Storage kernel: 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Apr 3 19:28:26 Storage kernel: 63 27 ef eb
Apr 3 19:28:26 Storage kernel: sd 3:0:0:0: [sdg] ASC=0x11 ASCQ=0x4
Apr 3 19:28:26 Storage kernel: sd 3:0:0:0: [sdg] CDB: cdb[0]=0x28: 28 00 63 27 ef 0f 00 01 18 00
Apr 3 19:28:26 Storage kernel: end_request: I/O error, dev sdg, sector 1663561707
Apr 3 19:28:26 Storage kernel: ata5: EH complete
Apr 3 19:28:26 Storage emhttp: disk_temperature: ATTR_Temperature_Celsius not found
Apr 3 19:28:26 Storage kernel: md: disk2 read error
Apr 3 19:28:26 Storage kernel: handle_stripe read error: 1663561640/2, count: 1
Apr 3 19:28:26 Storage kernel: md: disk2 read error
Apr 3 19:28:26 Storage kernel: handle_stripe read error: 1663561648/2, count: 1
Apr 3 19:28:26 Storage kernel: md: disk2 read error
Apr 3 19:28:26 Storage kernel: handle_stripe read error: 1663561656/2, count: 1
Apr 3 19:28:26 Storage kernel: md: disk2 read error
Apr 3 19:28:26 Storage kernel: handle_stripe read error: 1663561664/2, count: 1
Apr 3 19:28:26 Storage kernel: md: disk2 read error
Apr 3 19:28:26 Storage kernel: handle_stripe read error: 1663561672/2, count: 1
Apr 3 19:28:26 Storage kernel: md: disk2 read error
Apr 3 19:28:26 Storage kernel: handle_stripe read error: 1663561680/2, count: 1
Apr 3 19:28:26 Storage kernel: md: disk2 read error
Apr 3 19:28:26 Storage kernel: handle_stripe read error: 1663561688/2, count: 1
Apr 3 19:28:26 Storage kernel: md: disk2 read error
Apr 3 19:28:26 Storage kernel: handle_stripe read error: 1663561696/2, count: 1
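Editor's note: the last four bytes of the sense data in the log above (63 27 ef eb) appear to be the failing LBA written as a big-endian hex number. A minimal sketch, assuming that reading of the descriptor, converts them to decimal so they can be compared with the sector in the end_request line. The function name is hypothetical and not part of any tool mentioned in this thread.

```python
def lba_from_sense_bytes(hex_bytes: str) -> int:
    """Interpret space-separated hex bytes as one big-endian integer."""
    raw = bytes.fromhex(hex_bytes.replace(" ", ""))
    return int.from_bytes(raw, "big")


# Bytes copied from the "Descriptor sense data" lines in the syslog above.
sector = lba_from_sense_bytes("63 27 ef eb")
print(sector)  # 1663561707, the sector named in the end_request line
```

The same value also shows up in the "res" line of the failed command (eb:ef:27 ... 63), which suggests every retry was hitting the same unreadable sector.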
April 4, 2011 Post the entire syslog. Zip it if needed. At least one of your drives is flaky. Media errors (UNC errors) are uncorrectable/unreadable sectors on a physical disk. They will be marked for re-allocation, or may have already been re-allocated. Get a "SMART" report for the drive:
smartctl -d ata -a /dev/sdg
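Editor's note: to see which ATA device is reporting the UNC errors described here, the syslog can be tallied mechanically. This is only a rough sketch: it assumes kernel lines shaped like the ones posted earlier in this thread ("kernel: ata5.00: error: { UNC }"), and the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   Apr 3 19:27:37 Storage kernel: ata5.00: error: { UNC }
UNC_LINE = re.compile(r"kernel: (ata\d+\.\d+): error: \{ UNC \}")


def count_unc_errors(syslog_text: str) -> Counter:
    """Count uncorrectable-read (UNC) errors per ATA device."""
    return Counter(UNC_LINE.findall(syslog_text))


sample = (
    "Apr 3 19:27:37 Storage kernel: ata5.00: error: { UNC }\n"
    "Apr 3 19:27:37 Storage kernel: ata5: EH complete\n"
    "Apr 3 19:27:53 Storage kernel: ata5.00: error: { UNC }\n"
)
for device, hits in count_unc_errors(sample).items():
    print(device, hits)
```

Running it over the whole syslog instead of the three sample lines would show whether more than one port is involved.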
April 4, 2011 Author

root@Storage:~# smartctl -d ata -a /dev/sdg
smartctl 5.39.1 2010-01-28 r3054 [i486-slackware-linux-gnu] (local build)
Copyright © 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model: ST32000542AS
Serial Number: 5XW1CTY2
Firmware Version: CC34
User Capacity: 2,000,398,934,016 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Sun Apr 3 21:30:15 2011 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status: (0x00) Offline data collection activity was never started. Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run.
Total time to complete Offline data collection: ( 663) seconds.
Offline data collection capabilities: (0x73) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. No Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported. General Purpose Logging supported.
Short self-test routine recommended polling time: ( 1) minutes.
Extended self-test routine recommended polling time: ( 255) minutes.
Conveyance self-test routine recommended polling time: ( 2) minutes.
SCT capabilities: (0x103f) SCT Status supported. SCT Feature Control supported. SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 115 085 006 Pre-fail Always - 191956925
3 Spin_Up_Time 0x0003 100 100 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 453
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 31
7 Seek_Error_Rate 0x000f 065 060 030 Pre-fail Always - 3215323
9 Power_On_Hours 0x0032 096 096 000 Old_age Always - 3714
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 17
183 Runtime_Bad_Block 0x0032 060 060 000 Old_age Always - 40
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 040 040 000 Old_age Always - 60
188 Command_Timeout 0x0032 100 057 000 Old_age Always - 184686411819
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 068 043 045 Old_age Always In_the_past 32 (10 76 34 29)
194 Temperature_Celsius 0x0022 032 057 000 Old_age Always - 32 (0 17 0 0)
195 Hardware_ECC_Recovered 0x001a 046 037 000 Old_age Always - 191956925
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 8091718386471
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 2557532092
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 681364335

SMART Error Log Version: 1
ATA Error Count: 60 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes, SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 60 occurred at disk power-on lifetime: 3712 hours (154 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 00 ff ff ff ef 00 03:38:51.582 READ DMA EXT
27 00 00 00 00 00 e0 00 03:38:51.541 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 00 03:38:51.500 IDENTIFY DEVICE
ef 03 42 00 00 00 a0 00 03:38:51.489 SET FEATURES [set transfer mode]
27 00 00 00 00 00 e0 00 03:38:51.461 READ NATIVE MAX ADDRESS EXT

Error 59 occurred at disk power-on lifetime: 3712 hours (154 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 00 ff ff ff ef 00 03:38:47.391 READ DMA EXT
27 00 00 00 00 00 e0 00 03:38:47.350 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 00 03:38:47.310 IDENTIFY DEVICE
ef 03 42 00 00 00 a0 00 03:38:47.298 SET FEATURES [set transfer mode]
27 00 00 00 00 00 e0 00 03:38:47.270 READ NATIVE MAX ADDRESS EXT

Error 58 occurred at disk power-on lifetime: 3712 hours (154 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 00 ff ff ff ef 00 03:38:43.568 READ DMA EXT
25 00 00 ff ff ff ef 00 03:38:42.459 READ DMA EXT
25 00 08 ff ff ff ef 00 03:38:42.450 READ DMA EXT
25 00 f8 ff ff ff ef 00 03:38:42.407 READ DMA EXT
25 00 08 ff ff ff ef 00 03:38:42.407 READ DMA EXT

Error 57 occurred at disk power-on lifetime: 3712 hours (154 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 00 ff ff ff ef 00 03:36:30.077 READ DMA EXT
25 00 08 ff ff ff ef 00 03:36:30.076 READ DMA EXT
25 00 08 ff ff ff ef 00 03:36:30.076 READ DMA EXT
25 00 08 ff ff ff ef 00 03:36:30.076 READ DMA EXT
25 00 08 ff ff ff ef 00 03:36:30.076 READ DMA EXT

Error 56 occurred at disk power-on lifetime: 3712 hours (154 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 00 ff ff ff ef 00 03:34:43.198 READ DMA EXT
27 00 00 00 00 00 e0 00 03:34:43.157 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 00 03:34:43.117 IDENTIFY DEVICE
ef 03 42 00 00 00 a0 00 03:34:43.100 SET FEATURES [set transfer mode]
27 00 00 00 00 00 e0 00 03:34:43.077 READ NATIVE MAX ADDRESS EXT

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
April 4, 2011 sdg is showing problems:
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 31
It is likely that ata5=sdg=disk2, but in order to confirm we need the entire syslog. 31 reallocated sectors is not a show stopper, but if the number increases the drive needs to be replaced.
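Editor's note: the attribute row quoted here can be split into fields to separate the normalized VALUE and THRESH from the raw count. A minimal sketch, assuming the ten-column layout that smartctl -a printed in this thread (ID, name, flag, value, worst, thresh, type, updated, when_failed, raw); the function name is hypothetical.

```python
def parse_smart_row(row: str) -> dict:
    """Split one smartctl attribute row into its main fields."""
    fields = row.split()
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "value": int(fields[3]),   # normalized value, higher is better
        "worst": int(fields[4]),
        "thresh": int(fields[5]),  # drive flags the attribute at or below this
        "raw": int(fields[9]),     # first token of RAW_VALUE
    }


row = parse_smart_row(
    "5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 31"
)
print(row["name"], "raw:", row["raw"], "value:", row["value"], "thresh:", row["thresh"])
```

The normalized value (100) is still well above the threshold (36), which is why the overall health test reports PASSED even though the raw count of reallocated sectors is already non-zero.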
April 4, 2011 Author Attached is the syslog. Sorry for all the g-d damned hacker attempts! Script kiddies ruin the internet. syslog-2011-04-04.txt.zip
April 4, 2011 Author
model name : Intel® Core i5 CPU 750 @ 2.67GHz
Mem: 8300232
PSU: Antec EarthWatts EA750 750W Continuous Power ATX12V version 2.3 SLI Certified CrossFire Ready 80 PLUS Certified Active PFC ...
Thanks, Neil
April 4, 2011 How many and what type of HBAs do you have? There seem to be problems with 2 drives.
ATA5.00 = sdi
sdg = disk2 = ATA3.00
April 7, 2011 Author /dev/sde ST32000542AS_5XW1D19C /mnt/disk1 /dev/sdf ST32000542AS_5XW18K0D /mnt/disk2 /dev/sdg ST32000542AS_5XW1CTY2 /mnt/disk3 /dev/sdh ST32000542AS_5XW1BAFV /mnt/disk4 /dev/sdd 00M_WD-WCAZA1003122 /mnt/disk5 /dev/sdc 00M_WD-WCAZA1055519 /mnt/disk6 /dev/sdb 00M_WD-WCAZA1055748
April 7, 2011 Author pci-0000:00:1f.5-scsi-1:0:0:0 host5 (sdi) OCZ-VERTEX_TSOEM7IPS6G0R4D7AJP6 Cache drive for now!
April 7, 2011 Has the server been up for three days? What is the current value for sdg:
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 31
April 7, 2011 Author I guess so. After the random reboot I reported on Sunday, I turned it back on and it has been running ever since.
April 7, 2011 What is the current value of Reallocated_Sector_Ct for sdg? Get a new SMART report. I think the cache drive may be causing the problem; it is throwing a lot of errors in the syslog. That may be because it is an SSD. I have no experience with SSDs in unRAID. Check the cables of sdg. Let me look at a new syslog to see if the errors are continuing.
April 7, 2011 Author

smartctl -a -d ata /dev/sdg
smartctl 5.39.1 2010-01-28 r3054 [i486-slackware-linux-gnu] (local build)
Copyright © 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model: ST32000542AS
Serial Number: 5XW1CTY2
Firmware Version: CC34
User Capacity: 2,000,398,934,016 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Wed Apr 6 22:29:46 2011 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status: (0x00) Offline data collection activity was never started. Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run.
Total time to complete Offline data collection: ( 663) seconds.
Offline data collection capabilities: (0x73) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. No Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported. General Purpose Logging supported.
Short self-test routine recommended polling time: ( 1) minutes.
Extended self-test routine recommended polling time: ( 255) minutes.
Conveyance self-test routine recommended polling time: ( 2) minutes.
SCT capabilities: (0x103f) SCT Status supported. SCT Feature Control supported. SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 114 085 006 Pre-fail Always - 78973165
3 Spin_Up_Time 0x0003 100 100 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 466
5 Reallocated_Sector_Ct 0x0033 099 099 036 Pre-fail Always - 67
7 Seek_Error_Rate 0x000f 065 060 030 Pre-fail Always - 3327570
9 Power_On_Hours 0x0032 096 096 000 Old_age Always - 3787
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 17
183 Runtime_Bad_Block 0x0032 060 060 000 Old_age Always - 40
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 040 040 000 Old_age Always - 60
188 Command_Timeout 0x0032 100 057 000 Old_age Always - 184686411819
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 072 043 045 Old_age Always In_the_past 28 (10 76 35 24)
194 Temperature_Celsius 0x0022 028 057 000 Old_age Always - 28 (0 17 0 0)
195 Hardware_ECC_Recovered 0x001a 039 037 000 Old_age Always - 78973165
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 31971736552300
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 2576703228
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 2463417362

SMART Error Log Version: 1
ATA Error Count: 60 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes, SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 60 occurred at disk power-on lifetime: 3712 hours (154 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 00 ff ff ff ef 00 03:38:51.582 READ DMA EXT
27 00 00 00 00 00 e0 00 03:38:51.541 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 00 03:38:51.500 IDENTIFY DEVICE
ef 03 42 00 00 00 a0 00 03:38:51.489 SET FEATURES [set transfer mode]
27 00 00 00 00 00 e0 00 03:38:51.461 READ NATIVE MAX ADDRESS EXT

Error 59 occurred at disk power-on lifetime: 3712 hours (154 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 00 ff ff ff ef 00 03:38:47.391 READ DMA EXT
27 00 00 00 00 00 e0 00 03:38:47.350 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 00 03:38:47.310 IDENTIFY DEVICE
ef 03 42 00 00 00 a0 00 03:38:47.298 SET FEATURES [set transfer mode]
27 00 00 00 00 00 e0 00 03:38:47.270 READ NATIVE MAX ADDRESS EXT

Error 58 occurred at disk power-on lifetime: 3712 hours (154 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 00 ff ff ff ef 00 03:38:43.568 READ DMA EXT
25 00 00 ff ff ff ef 00 03:38:42.459 READ DMA EXT
25 00 08 ff ff ff ef 00 03:38:42.450 READ DMA EXT
25 00 f8 ff ff ff ef 00 03:38:42.407 READ DMA EXT
25 00 08 ff ff ff ef 00 03:38:42.407 READ DMA EXT

Error 57 occurred at disk power-on lifetime: 3712 hours (154 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 00 ff ff ff ef 00 03:36:30.077 READ DMA EXT
25 00 08 ff ff ff ef 00 03:36:30.076 READ DMA EXT
25 00 08 ff ff ff ef 00 03:36:30.076 READ DMA EXT
25 00 08 ff ff ff ef 00 03:36:30.076 READ DMA EXT
25 00 08 ff ff ff ef 00 03:36:30.076 READ DMA EXT

Error 56 occurred at disk power-on lifetime: 3712 hours (154 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 00 ff ff ff ef 00 03:34:43.198 READ DMA EXT
27 00 00 00 00 00 e0 00 03:34:43.157 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 00 03:34:43.117 IDENTIFY DEVICE
ef 03 42 00 00 00 a0 00 03:34:43.100 SET FEATURES [set transfer mode]
27 00 00 00 00 00 e0 00 03:34:43.077 READ NATIVE MAX ADDRESS EXT

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
April 7, 2011 Do you have a UPS? Is the BIOS set to restart when power is restored? Could you have had a power failure? Is that the entire syslog?
April 7, 2011 Author That is the entire syslog. It is on a 2000VA UPS. Other devices on this UPS did not reboot. The BIOS is set to restart on power failure. Later this week I am going to replace the SSD with a Western Digital Black drive so I have more space and can utilize the cache with more of my shares. When I do this I will double-check my BIOS settings... Thanks, Neil
April 7, 2011 sdg is failing. The Reallocated_Sector_Ct has gone from 31 to 67 in 3 days. I suggest you replace it ASAP. Then you can preclear it 4 or 5 times and see whether the count stops increasing or the drive dies. Hopefully you can RMA it. EDIT: based on the power-on hours I'm guessing you can RMA. They may give you an issue, saying that there are thousands of spare sectors. That is why you need to preclear it to death.
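Editor's note: the jump from 31 to 67 can be turned into a rough rate using the Power_On_Hours values from the two SMART reports posted in this thread (3714 and 3787). This is simple arithmetic rather than a prediction, it assumes the drive's hour counter is accurate, and the function name is invented for illustration.

```python
def reallocations_per_day(count_a: int, hours_a: int,
                          count_b: int, hours_b: int) -> float:
    """Average growth in reallocated sectors per 24 power-on hours."""
    return (count_b - count_a) / (hours_b - hours_a) * 24


# Reallocated_Sector_Ct and Power_On_Hours from the April 3 and April 6 reports.
rate = reallocations_per_day(31, 3714, 67, 3787)
print(f"{rate:.1f} reallocated sectors per day")  # 36 sectors over 73 hours
```

A count that keeps climbing between reports, as it does here, is the pattern the reply above treats as a reason to replace the drive.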
April 7, 2011 Author Funny, it is an SSD. With no moving parts it seems hardly likely that it is the issue, but I believe you! I only use it for a few shares. I am going to use a bigger traditional drive as soon as my others arrive.
April 7, 2011 Author Oh, sdg. Well F- me. Can you direct me to the best way to replace that drive in unRAID? I will order another one ASAP to replace it. Neil