January 2, 2013 My tower was rebooted, and after the reboot it said "SAMBA is STOPPED, Shared drives will not be visible on the LAN". I read some advice somewhere to delete the /etc/samba/private/secrets.tdb file, which would then be recreated. I'm not sure whether this was a good idea, but I moved it (mv) to /etc/samba/private/secrets.bak so that it was backed up and could be restored later. Or so I thought: after the reboot the file was recreated, but my .bak was gone. Anyway, the situation now is that all disks are showing as new; see the attached screenshot and syslog. Can anyone offer any advice as to what happened, or what my options for recovery are? I assume that the contents of my files are all OK on the individual disks. syslog.zip
January 2, 2013 /etc/.... is on a RAM disk. Reboot and whatever you saved there will be gone. If you wish to preserve files, you need to save them either on the USB flash drive (/boot/....) or on a hard drive somewhere (if you have a cache drive, for example).
January 3, 2013 Author I pretty much realised that the moment I saw the file was gone. Would my issue likely have been caused by me deleting the secrets file? I am not quite sure of my options now. If I don't get any advice on how to get unRAID seeing all the disks OK again, my only thought is to start a fresh install with 1 disk, manually copy the files from another disk to it, then add that blank disk to the array, and so on. Not a pretty thought for 20+ TB of data! I really hope there is an easier way?
January 5, 2013 Author "See my sig to disable all add-ons. Show an image of the stock unRAID main page." I took the USB flash disk out and plugged it into my PC, but there is no /config/plugins folder. There is a /packages folder in the root of the USB. There is no /extra folder either. I don't quite understand what using the stock go file means. If I click the unRAID Main option from unMENU, is this the page you wanted to see? If so, it's attached.
January 5, 2013 Author Also attached here are the last syslog from when the system was working and the first one from when the system was bad on boot; not sure if these help? I am starting to get a bit desperate and wondering whether my files are actually OK on the disks or not. Can anyone advise some commands, or what else I can do, to verify that my files are actually OK on the disks? syslog-20121231-224449.zip syslog-20121231-230401.zip
January 5, 2013 The stock go file is shown in my sig. Start the array. DO NOT click anything else. Are the disks mounted correctly or do they appear as unformatted?
January 5, 2013 "See my sig to disable all add-ons. Show an image of the stock unRAID main page." "I took the usb flash disk out and plugged into my PC but there is no /config/plugins folder? There is a /packages folder in the root of the USB There is also no /extra folder either." The "plugins" and "extra" folders are in the 5.X beta/rc series. They are not in 4.7 or prior unRAID. Everything looks fine. You might want to perform a checkdisk/scandisk while you have the flash drive in your PC. For all the drives to be "blue", it is as if the config/super.dat file is corrupted or missing (or you typed "initconfig", responded with "Yes", and set a new disk configuration). When you next start the array, parity will be initially calculated, as if it is a new disk configuration. All your data should be fine. HOWEVER... there is a bug in the 4.7 unRAID where, when a super.dat file is missing, the disks get their MBR re-written based on your current preferences (4k-aligned, or un-aligned). If you start the array and all (or any) of the disks show as unformatted, DO NOT FORMAT THEM. You may need to point their MBR back to the correct partition start. (The data is still there, but you are not pointing to it properly, therefore the disk cannot be mounted and appears unformatted in the display. It really is formatted, just not mounted properly.) Joe L.
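To make the "partition start" point concrete, here is a minimal Python sketch of how the start sector of the first partition can be read out of a 512-byte MBR. The offsets (partition table at byte 446, 32-bit little-endian start LBA at byte 8 of the entry, 0x55AA signature at byte 510) are the standard MBR layout. The mapping of sector 63 to "unaligned" and sector 64 to "4k-aligned" is my understanding of what unRAID 4.7 writes and should be treated as an assumption, as should the helper names, which are made up for illustration. The demo only builds synthetic sectors in memory and never touches a real disk.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446   # first partition entry in a classic MBR
START_LBA_OFFSET = 8           # little-endian 32-bit start sector within the entry


def first_partition_start(mbr: bytes) -> int:
    """Return the starting sector (LBA) of the first MBR partition entry."""
    if len(mbr) < SECTOR_SIZE:
        raise ValueError("need a full 512-byte sector")
    if mbr[510:512] != b"\x55\xaa":
        raise ValueError("missing MBR boot signature 0x55AA")
    offset = PARTITION_TABLE_OFFSET + START_LBA_OFFSET
    (start,) = struct.unpack_from("<I", mbr, offset)
    return start


def describe_alignment(start: int) -> str:
    # Assumption: 63 and 64 are the two start sectors unRAID 4.7 uses.
    if start == 63:
        return "unaligned (sector 63)"
    if start == 64:
        return "4k-aligned (sector 64)"
    return f"unexpected start sector {start}"


def make_mbr(start: int) -> bytes:
    """Build a synthetic MBR for demonstration; no real disk is touched."""
    sector = bytearray(SECTOR_SIZE)
    struct.pack_into("<I", sector, PARTITION_TABLE_OFFSET + START_LBA_OFFSET, start)
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


if __name__ == "__main__":
    for start in (63, 64):
        print(describe_alignment(first_partition_start(make_mbr(start))))
```

On a real system the 512 bytes would come from opening the disk device read-only and reading its first sector. If the start sector reported there does not match where the filesystem actually begins, the disk shows as unformatted even though the data is intact, which is the situation described in the post above.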
January 7, 2013 Author I started the array and it carried out its parity check. There were quite a few parity sync fixes required, and a second parity sync completed with no sync errors, so all good there. I am still left with my original problem: "SAMBA is STOPPED, Shared drives will not be visible on the LAN". Oddly, one folder shows up when I go to \\tower, although only a few of the files are actually available. I have been through most of what I could find searching for this error but have not found a solution that works. If I try to start Samba from the Array Management screen, nothing happens.
January 9, 2013 Author Just noticed those errors against disk4 as well; should I be worried about those?
January 15, 2013 Author I managed to get my server back working; in the end I just decided to re-install unRAID. It was pretty painless, so it was the easy option. I now have just the errors remaining. Below is the SMART report. I don't have any experience of analysing SMART reports, but I'm assuming that

SMART overall-health self-assessment test result: FAILED! Drive failure expected in less than 24 hours. SAVE ALL DATA.

and

  1 Raw_Read_Error_Rate 0x002f 049 049 051 Pre-fail Always FAILING_NOW 164785

are not good news? Time to change the disk?

Statistics for /dev/sdi WDC_WD20EARS-00M_WD-WCAZA3072211

smartctl -a -d ata /dev/sdi
smartctl 5.39.1 2010-01-28 r3054 [i486-slackware-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD20EARS-00MVWB0
Serial Number:    WD-WCAZA3072211
Firmware Version: 51.0AB51
User Capacity:    2,000,398,934,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Tue Jan 15 22:17:54 2013 Local time zone must be set--see zic m
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!
Drive failure expected in less than 24 hours. SAVE ALL DATA.
See vendor-specific Attribute list for failed Attributes.

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed without error or no self-test has ever been run.
Total time to complete Offline data collection: (36960) seconds.
Offline data collection capabilities: (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine recommended polling time:      (   2) minutes.
Extended self-test routine recommended polling time:   ( 255) minutes.
Conveyance self-test routine recommended polling time: (   5) minutes.
SCT capabilities:              (0x3035) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   049   049   051    Pre-fail  Always   FAILING_NOW 164785
  3 Spin_Up_Time            0x0027   171   167   021    Pre-fail  Always       -       6441
  4 Start_Stop_Count        0x0032   098   098   000    Old_age   Always       -       2463
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       1
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   078   078   000    Old_age   Always       -       16075
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       81
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       29
193 Load_Cycle_Count        0x0032   173   173   000    Old_age   Always       -       81098
194 Temperature_Celsius     0x0022   120   098   000    Old_age   Always       -       30
196 Reallocated_Event_Count 0x0032   199   199   000    Old_age   Always       -       1
197 Current_Pending_Sector  0x0032   197   197   000    Old_age   Always       -       1259
198 Offline_Uncorrectable   0x0030   200   197   000    Old_age   Offline      -       2
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   108   083   000    Old_age   Offline      -       24702

SMART Error Log Version: 1
ATA Error Count: 216 (device log contains only the most recent five errors)
  CR = Command Register [HEX]
  FR = Features Register [HEX]
  SC = Sector Count Register [HEX]
  SN = Sector Number Register [HEX]
  CL = Cylinder Low Register [HEX]
  CH = Cylinder High Register [HEX]
  DH = Device/Head Register [HEX]
  DC = Device Command Register [HEX]
  ER = Error register [HEX]
  ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes, SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 216 occurred at disk power-on lifetime: 15837 hours (659 days + 21 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 a0 2f 78 ef  Error: UNC at LBA = 0x0f782fa0 = 259534752
  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 00 28 2f 78 ef 08      00:51:43.096  READ DMA
  ec 00 00 00 00 00 a0 08      00:51:43.056  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 08      00:51:43.056  SET FEATURES [set transfer mode]

Error 215 occurred at disk power-on lifetime: 15837 hours (659 days + 21 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 a0 2f 78 ef  Error: UNC at LBA = 0x0f782fa0 = 259534752
  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 00 28 2f 78 ef 08      00:51:40.198  READ DMA
  ec 00 00 00 00 00 a0 08      00:51:40.158  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 08      00:51:40.158  SET FEATURES [set transfer mode]

Error 214 occurred at disk power-on lifetime: 15837 hours (659 days + 21 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 a0 2f 78 ef  Error: UNC at LBA = 0x0f782fa0 = 259534752
  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 00 28 2f 78 ef 08      00:51:37.324  READ DMA
  c8 00 00 28 26 78 ef 08      00:51:36.440  READ DMA

Error 213 occurred at disk power-on lifetime: 15837 hours (659 days + 21 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 18 45 76 ef  Error: UNC at LBA = 0x0f764518 = 259409176
  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 00 28 44 76 ef 08      00:50:01.012  READ DMA
  c8 00 00 28 3b 76 ef 08      00:50:00.415  READ DMA

Error 212 occurred at disk power-on lifetime: 9849 hours (410 days + 9 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 e0 b0 c6 ee  Error: UNC 8 sectors at LBA = 0x0ec6b0e0 = 247902432
  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 e0 b0 c6 ee 08  39d+22:19:14.365  READ DMA
  c8 00 08 50 d7 c6 ee 08  39d+22:19:14.349  READ DMA

SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure        90%            15940  142919840
# 2  Short offline       Completed without error        00%            15884  -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
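For anyone wondering how the "UNC at LBA = ..." figures in the error log relate to the register dump on the same row, here is a small Python sketch. It assumes the 28-bit LBA layout used by READ DMA (command c8): the low four bits of DH are the top of the address, followed by CH, CL and SN. The function name is made up for illustration, and the register values are copied from Error 216 in the report.

```python
def lba28_from_registers(sn: int, cl: int, ch: int, dh: int) -> int:
    """Combine ATA task-file registers into a 28-bit LBA.

    Only the low nibble of DH carries address bits; the high nibble
    holds mode/device flags and is masked off.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Register row from Error 216: ER ST SC SN CL CH DH
row = "40 51 00 a0 2f 78 ef"
er, st, sc, sn, cl, ch, dh = (int(field, 16) for field in row.split())

lba = lba28_from_registers(sn, cl, ch, dh)

if __name__ == "__main__":
    # Zero-padded hex followed by decimal, matching how smartctl prints it.
    print(format(lba, "#010x"), lba)
```

The same function reproduces the other entries as well, for example registers 18 45 76 ef for Error 213, so the repeated UNC (uncorrectable read) errors cluster around a few specific sectors rather than being spread at random.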
January 16, 2013 Replace drive ASAP. It is dead.
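As a rough illustration of why the report reads as a failed drive, the Python sketch below applies the usual smartctl rule: an attribute counts as failed when its normalized VALUE is at or below its THRESH (attributes with a threshold of 000 never fail this way). The four rows are copied from the attribute table in the report; the helper names and the choice of which rows to include are mine, and the parser assumes the plain ten-column layout shown there.

```python
# Four rows copied from the attribute table in the SMART report.
ATTRIBUTE_ROWS = """\
  1 Raw_Read_Error_Rate     0x002f   049   049   051    Pre-fail  Always   FAILING_NOW 164785
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       1
197 Current_Pending_Sector  0x0032   197   197   000    Old_age   Always       -       1259
198 Offline_Uncorrectable   0x0030   200   197   000    Old_age   Offline      -       2
"""


def parse_rows(text):
    """Split smartctl attribute rows into dicts (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10 or not fields[0].isdigit():
            continue  # skip headers and blank lines
        rows.append({
            "id": int(fields[0]),
            "name": fields[1],
            "value": int(fields[3]),
            "worst": int(fields[4]),
            "thresh": int(fields[5]),
            "when_failed": fields[8],
            "raw": fields[9],
        })
    return rows


def failing_attributes(rows):
    """Names of attributes whose normalized VALUE is at or below THRESH."""
    return [r["name"] for r in rows
            if r["thresh"] > 0 and r["value"] <= r["thresh"]]


if __name__ == "__main__":
    rows = parse_rows(ATTRIBUTE_ROWS)
    print(failing_attributes(rows))
    pending = next(r for r in rows if r["id"] == 197)
    print("pending sectors:", pending["raw"])
```

Raw_Read_Error_Rate sits at 049 against a threshold of 051, which is what triggers FAILING_NOW. The 1259 pending sectors do not trip a threshold (theirs is 000), but together with the read failure in the extended self-test they point the same way.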
This topic is now archived and is closed to further replies.