talmania Posted September 6, 2011 Posted September 6, 2011 I noticed today that my drive 5 was offline and had the red DISK_DSBL next to it. I took the array down, removed drive 5 from the array, powered off, reseated the drive, powered back on and reassigned the drive to slot five. When I tried to start the array, it looked like it was going to mount, then jumped back to the default screen. I figured the drive was truly having issues, so I removed it, shut down, installed a new drive and assigned it to the slot, and UNRAID noticed that the disabled drive had been replaced. However, when I checked "I'm sure I want to do this" and started the array, it stopped and gave the message "New Parity disk installed". My parity disk is a 2TB WD EARS and the disk I replaced (also an EARS) was a Hitachi 2TB. Size reported is the same. I'm running 4.7 final and have attached my syslog. Help! Thanks so much! syslog.txt
talmania Posted September 7, 2011 Author Posted September 7, 2011 Ok so I spent last night looking at things and remembered a previous problem I had here: http://lime-technology.com/forum/index.php?topic=11886.0 It's a similar issue now: I've tried to start the array multiple times (without the disabled drive and with it) and every time it looks like it's starting to mount, then stops. I've looked over the syslog and I don't see a tell-tale entry from the previous issue like:
Mar 23 22:38:21 Deed emhttp: get_fstype: open /dev/sdh1: No such file or directory
I do see a lock-dev error. Will post an updated syslog soon but I'm stumped on why I'm having these issues. Thanks!
talmania Posted September 7, 2011 Author Posted September 7, 2011 A partial syslog tail when trying to start the array from a ready state. partial_syslog.txt
talmania Posted September 9, 2011 Author Posted September 9, 2011 Anyone? Where's Joe when a guy needs him! I'm approaching this as follows: I've swapped back in the brand new Hitachi and will attempt to preclear it before adding it to the array (I just added it directly before) in hopes of avoiding the "replace parity" message when trying to start the array. I'm also going to buy a 2TB EARS on the minuscule chance that the issue is that Unraid thinks the unformatted Hitachi is larger than the EARS that's currently the parity drive.
talmania Posted September 10, 2011 Author Posted September 10, 2011 Finished the preclear of the Hitachi and after adding it successfully as disk 5 and then starting the array I again got the message "new parity installed". I'm stuck again.
mbryanr Posted September 10, 2011 Posted September 10, 2011 A similar situation to yours happened here: http://lime-technology.com/forum/index.php?topic=13974.msg132405#msg132405 Also, do you have your original device assignments? Post them here.
talmania Posted September 11, 2011 Author Posted September 11, 2011 Thanks for the response--I've reseated every cable in my Norco 2040 and just got back from buying a new EARS, tried it with the exact same result. It recognizes a disabled disk has been replaced but after starting it states "new parity disk installed". I've made zero changes to the drive order other than to try different brand new drives in the same disk 5 slot. I'm attaching a new syslog where I did notice the following: Interesting about the sdj and sdj1. Not entirely sure what it means though.
Sep 10 20:08:49 Deed emhttp: writing mbr on disk 5 (/dev/sdj) with partition 1 offset 63
Sep 10 20:08:49 Deed emhttp: re-reading /dev/sdj partition table
Sep 10 20:08:49 Deed kernel: sdj: sdj1
Sep 10 20:08:50 Deed emhttp: mdcmd: write: No such device or address
Sep 10 20:08:50 Deed kernel: mdcmd (44): start RECON_DISK
Sep 10 20:08:50 Deed kernel: md: do_run: lock_rdev error: -6
syslog_2.txt
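For anyone reading along: the "-6" in the lock_rdev line is most likely a negative errno value, which would line up with the "No such device or address" message logged by mdcmd one line earlier. The Python sketch below only decodes the number; it assumes the kernel is reporting a standard errno here, and the helper name describe_kernel_error is made up for illustration.

```python
import errno
import os


def describe_kernel_error(code):
    """Translate a negative kernel return code such as -6 into (errno name, message)."""
    number = abs(code)
    # errno.errorcode maps numeric values to symbolic names on this platform.
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


name, message = describe_kernel_error(-6)
print(name, "-", message)
```

On Linux this should print the ENXIO name along with the same wording that shows up in the mdcmd line, which suggests the driver could not open the device it was told to use for disk 5.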
talmania Posted September 12, 2011 Author Posted September 12, 2011 A couple of screen shots... after more research tonight I found a couple of posts referencing that I should be able to start up the array with one drive missing so I unassigned it, rebooted and still can't start the array. Here's what I see...
mbryanr Posted September 12, 2011 Posted September 12, 2011 Alright...I just had something very similar happen to me. Although the disk cables and slots were the same...they were assigned by unRAID in a different location (disk6, disk7). Anyways, I had the original disk configuration and was able to reassign the disks to the correct slots. I saw the same entry in your syslog...stale configuration. Do you have the original disk assignments with serial numbers?
talmania Posted September 13, 2011 Author Posted September 13, 2011 Thanks for the reply---by original config do you mean original as in first ever config? I've grown from 3-4 disks to now almost 20. Or do you mean the config right before removing the disabled disk? I'm not sure it matters since I don't have any disk config other than the previous syslog linked above.
talmania Posted September 13, 2011 Author Posted September 13, 2011 Maybe we're onto something here... From the old syslog:
Mar 22 07:52:15 Deed emhttp: Device inventory:
Mar 22 07:52:15 Deed emhttp: pci-0000:00:1f.2-scsi-0:0:0:0 host3 (sdr) WDC_WD20EARS-00MVWB0_WD-WCAZA0399136
Mar 22 07:52:15 Deed emhttp: pci-0000:00:1f.2-scsi-1:0:0:0 host4 (sds) ST31500341AS_9VS09K8D
Mar 22 07:52:15 Deed emhttp: pci-0000:00:1f.2-scsi-2:0:0:0 host5 (sdt) ST31500341AS_9VS1ELPJ
Mar 22 07:52:15 Deed emhttp: pci-0000:00:1f.2-scsi-3:0:0:0 host6 (sdu) ST31500341AS_9VS1Z549
Mar 22 07:52:15 Deed emhttp: pci-0000:00:1f.2-scsi-4:0:0:0 host7 (sdv) WDC_WD3000GLFS-01F8U0_WD-WXL508054114
Mar 22 07:52:15 Deed emhttp: pci-0000:01:00.0-sas-phy0:1-0x0000000000000000:0-lun0 host0 (sdb) ST31500541AS_6XW0PY4L
Mar 22 07:52:15 Deed emhttp: pci-0000:01:00.0-sas-phy1:1-0x0100000000000000:1-lun0 host0 (sdc) ST31500541AS_6XW0JM2Q
Mar 22 07:52:15 Deed emhttp: pci-0000:01:00.0-sas-phy2:1-0x0200000000000000:2-lun0 host0 (sdd) Hitachi_HDS5C30_ML0220F30KUV9D
Mar 22 07:52:15 Deed emhttp: pci-0000:01:00.0-sas-phy3:1-0x0300000000000000:3-lun0 host0 (sde) WDC_WD20EARS-00_WD-WMAZA0987578
Mar 22 07:52:15 Deed emhttp: pci-0000:01:00.0-sas-phy4:1-0x0400000000000000:4-lun0 host0 (sdf) WDC_WD20EARS-00_WD-WCAZA1239135
Mar 22 07:52:15 Deed emhttp: pci-0000:01:00.0-sas-phy5:1-0x0500000000000000:5-lun0 host0 (sdg) WDC_WD20EARS-00_WD-WCAZA1241990
Mar 22 07:52:15 Deed emhttp: pci-0000:01:00.0-sas-phy6:1-0x0600000000000000:6-lun0 host0 (sdh) WDC_WD20EARS-00_WD-WCAZA1263685
Mar 22 07:52:15 Deed emhttp: pci-0000:01:00.0-sas-phy7:1-0x0700000000000000:7-lun0 host0 (sdi) ST32000542AS_5XW1NRNQ
Mar 22 07:52:15 Deed emhttp: pci-0000:02:00.0-sas-phy0:1-0x0000000000000000:0-lun0 host2 (sdj) WDC_WD20EADS-00S2B0_WD-WCAVY5770676
Mar 22 07:52:15 Deed emhttp: pci-0000:02:00.0-sas-phy1:1-0x0100000000000000:1-lun0 host2 (sdk) WDC_WD20EADS-00_WD-WCAVY5771098
Mar 22 07:52:15 Deed emhttp: pci-0000:02:00.0-sas-phy2:1-0x0200000000000000:2-lun0 host2 (sdl) ST32000542AS_5XW2K2X7
Mar 22 07:52:15 Deed emhttp: pci-0000:02:00.0-sas-phy3:1-0x0300000000000000:3-lun0 host2 (sdm) ST31500341AS_9VS01DQK
Mar 22 07:52:15 Deed emhttp: pci-0000:02:00.0-sas-phy4:1-0x0400000000000000:4-lun0 host2 (sdn) WDC_WD20EARS-00_WD-WCAZA2829716
Mar 22 07:52:15 Deed emhttp: pci-0000:02:00.0-sas-phy5:1-0x0500000000000000:5-lun0 host2 (sdo) Hitachi_HDS5C30_ML0220F30LBM0D
Mar 22 07:52:15 Deed emhttp: pci-0000:02:00.0-sas-phy6:1-0x0600000000000000:6-lun0 host2 (sdp) WDC_WD20EARS-00_WD-WMAZA1059076
Mar 22 07:52:15 Deed emhttp: pci-0000:02:00.0-sas-phy7:1-0x0700000000000000:7-lun0 host2 (sdq) WDC_WD20EARS-00_WD-WMAZA1010557

And the latest....now to compare. If they are off should I modify to correct or ?
Sep 10 20:08:27 Deed emhttp: Device inventory:
Sep 10 20:08:27 Deed emhttp: pci-0000:00:1f.2-scsi-0:0:0:0 host3 (sdr) WDC_WD20EARS-00MVWB0_WD-WCAZA0399136
Sep 10 20:08:27 Deed emhttp: pci-0000:00:1f.2-scsi-1:0:0:0 host4 (sds) ST31500341AS_9VS09K8D
Sep 10 20:08:27 Deed emhttp: pci-0000:00:1f.2-scsi-2:0:0:0 host5 (sdt) ST31500341AS_9VS1ELPJ
Sep 10 20:08:27 Deed emhttp: pci-0000:00:1f.2-scsi-3:0:0:0 host6 (sdu) ST31500341AS_9VS1Z549
Sep 10 20:08:27 Deed emhttp: pci-0000:00:1f.2-scsi-4:0:0:0 host7 (sdv) WDC_WD3000GLFS-01F8U0_WD-WXL508054114
Sep 10 20:08:27 Deed emhttp: pci-0000:01:00.0-sas-phy0:1-0x0000000000000000:0-lun0 host0 (sdb) ST31500541AS_6XW0PY4L
Sep 10 20:08:27 Deed emhttp: pci-0000:01:00.0-sas-phy1:1-0x0100000000000000:1-lun0 host0 (sdc) ST31500541AS_6XW0JM2Q
Sep 10 20:08:27 Deed emhttp: pci-0000:01:00.0-sas-phy2:1-0x0200000000000000:2-lun0 host0 (sdd) Hitachi_HDS5C30_ML0220F30KUV9D
Sep 10 20:08:27 Deed emhttp: pci-0000:01:00.0-sas-phy3:1-0x0300000000000000:3-lun0 host0 (sde) WDC_WD20EARS-00_WD-WMAZA0987578
Sep 10 20:08:27 Deed emhttp: pci-0000:01:00.0-sas-phy4:1-0x0400000000000000:4-lun0 host0 (sdf) WDC_WD20EARS-00_WD-WCAZA1239135
Sep 10 20:08:27 Deed emhttp: pci-0000:01:00.0-sas-phy5:1-0x0500000000000000:5-lun0 host0 (sdg) WDC_WD20EARS-00_WD-WCAZA1241990
Sep 10 20:08:27 Deed emhttp: pci-0000:01:00.0-sas-phy6:1-0x0600000000000000:6-lun0 host0 (sdh) WDC_WD20EARS-00_WD-WCAZA1263685
Sep 10 20:08:27 Deed emhttp: pci-0000:01:00.0-sas-phy7:1-0x0700000000000000:7-lun0 host0 (sdi) ST32000542AS_5XW1NRNQ
Sep 10 20:08:27 Deed emhttp: pci-0000:02:00.0-sas-phy0:1-0x0000000000000000:0-lun0 host2 (sdj) WDC_WD20EARS-00MVWB0_WD-WCAZA8408520
Sep 10 20:08:27 Deed emhttp: pci-0000:02:00.0-sas-phy1:1-0x0100000000000000:1-lun0 host2 (sdk) WDC_WD20EADS-00_WD-WCAVY5771098
Sep 10 20:08:27 Deed emhttp: pci-0000:02:00.0-sas-phy2:1-0x0200000000000000:2-lun0 host2 (sdl) ST32000542AS_5XW2K2X7
Sep 10 20:08:27 Deed emhttp: pci-0000:02:00.0-sas-phy3:1-0x0300000000000000:3-lun0 host2 (sdm) ST31500341AS_9VS01DQK
Sep 10 20:08:27 Deed emhttp: pci-0000:02:00.0-sas-phy4:1-0x0400000000000000:4-lun0 host2 (sdn) WDC_WD20EARS-00_WD-WCAZA2829716
Sep 10 20:08:27 Deed emhttp: pci-0000:02:00.0-sas-phy5:1-0x0500000000000000:5-lun0 host2 (sdo) Hitachi_HDS5C30_ML0220F30LBM0D
Sep 10 20:08:27 Deed emhttp: pci-0000:02:00.0-sas-phy6:1-0x0600000000000000:6-lun0 host2 (sdp) WDC_WD20EARS-00_WD-WMAZA1059076
Sep 10 20:08:27 Deed emhttp: pci-0000:02:00.0-sas-phy7:1-0x0700000000000000:7-lun0 host2 (sdq) WDC_WD20EARS-00_WD-WMAZA1010557
Sep 10 20:08:27 Deed emhttp: get_fstype: open /dev/sdj1: No such file or directory
Sep 10 20:08:27 Deed emhttp: get_fstype: open /dev/sdh1: No such file or directory
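Comparing two inventories like these by eye is error-prone, so here is a rough Python sketch of how the comparison could be automated. It assumes the emhttp inventory lines keep the "port hostN (sdX) model_serial" layout seen in the two syslogs; the function names and the shortened OLD/NEW samples (three entries copied from each log) are only for illustration.

```python
import re

# One emhttp "Device inventory" entry: port path, hostN, (sdX), model_serial.
ENTRY = re.compile(r"emhttp: (pci-\S+) host\d+ \((sd[a-z]+)\) (\S+)")


def parse_inventory(text):
    """Map each controller port path to (linux_device, model_serial)."""
    return {m.group(1): (m.group(2), m.group(3)) for m in ENTRY.finditer(text)}


def changed_ports(old_text, new_text):
    """Return {port: (old_entry, new_entry)} for every port whose entry differs."""
    old, new = parse_inventory(old_text), parse_inventory(new_text)
    return {
        port: (old.get(port), new.get(port))
        for port in sorted(set(old) | set(new))
        if old.get(port) != new.get(port)
    }


# Three entries copied from each syslog; a real run would read the full files.
OLD = """
Mar 22 07:52:15 Deed emhttp: pci-0000:01:00.0-sas-phy7:1-0x0700000000000000:7-lun0 host0 (sdi) ST32000542AS_5XW1NRNQ
Mar 22 07:52:15 Deed emhttp: pci-0000:02:00.0-sas-phy0:1-0x0000000000000000:0-lun0 host2 (sdj) WDC_WD20EADS-00S2B0_WD-WCAVY5770676
Mar 22 07:52:15 Deed emhttp: pci-0000:02:00.0-sas-phy1:1-0x0100000000000000:1-lun0 host2 (sdk) WDC_WD20EADS-00_WD-WCAVY5771098
"""
NEW = """
Sep 10 20:08:27 Deed emhttp: pci-0000:01:00.0-sas-phy7:1-0x0700000000000000:7-lun0 host0 (sdi) ST32000542AS_5XW1NRNQ
Sep 10 20:08:27 Deed emhttp: pci-0000:02:00.0-sas-phy0:1-0x0000000000000000:0-lun0 host2 (sdj) WDC_WD20EARS-00MVWB0_WD-WCAZA8408520
Sep 10 20:08:27 Deed emhttp: pci-0000:02:00.0-sas-phy1:1-0x0100000000000000:1-lun0 host2 (sdk) WDC_WD20EADS-00_WD-WCAVY5771098
"""

for port, (before, after) in changed_ports(OLD, NEW).items():
    print(port, before, "->", after)
```

For the two inventories posted above, the only port whose drive differs is the one behind sdj, where the old WD20EADS (the original disk 5) was swapped for the new EARS. Every other port reports the same serial number in both logs.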
mbryanr Posted September 13, 2011 Posted September 13, 2011 If they are off you will want to correct. I haven't had a chance to review. Also, IIRC - Joe L posted a command line method to determine which drive is the parity drive (as it doesn't have a file system). You could search for those commands if you can't find the parity drive.
talmania Posted September 13, 2011 Author Posted September 13, 2011 I'm still confused. My guess is that the /sdX assignments are done by Linux and the drive assignments are done by Unraid right? If so, I don't have all the pieces since I don't know what the disk device assignments looked like before (see the above Unraid screen shot). Am I screwed? Joe do you check your PM's? Thanks anyone in advance for any help--I sincerely appreciate it.
mbryanr Posted September 13, 2011 Posted September 13, 2011 Alright...before proceeding get confirmation that the steps are here (and that this is the problem) - but there is a method. The important thing is not to assign parity since you don't know which is your parity drive.

If you've lost your disk configuration, safest way to proceed is as follows:

1. Then, we need to force a new disk configuration to get disk 8 back in the array and to force it to rebuild parity. You do that by typing initconfig while the array is stopped. Answer "Yes" on the command line (capital "Y", lower case "es") to its prompt. Then back on the management web-page, refresh it and all the disks should show as "blue". Then start the array. It will completely rebuild parity. <----unRAID 4.7. Do Step 1 ONLY if Parity Drive is known.
1. Go to Utils and click 'New config'. Check the 'Yes I want to do this' box, then click Apply. <<<<--5 beta only!
2. Go to Main and start assigning your drives. Do not assign Parity. Whichever drive you think is Parity, just don't assign.
3. Now you have all your data disks and cache disk assigned, and they all have a blue dot - that's ok (and Parity is set to "unassigned"). Click Start and array will start and attempt to mount all the data drives.
4. If any disk did not mount (that is, appears 'unformatted'), well you have a problem: perhaps that is the actual parity disk?
5. You can spot check the files on the disks to assure yourself everything looks good.
6. Now Stop array and assign your Parity disk.
7. Click Start and you should see a parity sync start up.

Variations on the theme:

a) Suppose you don't know which physical disk is Parity? In this case assign all your hard drives to data disk slots (do NOT assign a Parity disk). Click Start and the one that comes up 'unformatted' is your parity disk (now you know which one it is). Repeat steps above except at step 3 now you know which is Parity so go ahead and assign it.
b) Suppose you lost the config, but you know that Parity is valid, so you want to skip the lengthy re-sync. In this case, once you know which disk is Parity, and you have it and all other disks assigned, just prior to clicking the 'Start' button you can type this command in a telnet window:

Also on the "identifying which is the parity disk" question: ... Parity is XOR'd across all of the data drives plus the parity drive, and in certain circumstances with an even number of drives, it is possible to have a pattern of bits in the early sectors that could closely approximate the start of a Reiser file system, enough to fool a file system identifying tool. What you can assume is that the parity drive is the largest drive, or one of them if there are multiples of the largest size. Then you could individually try to mount a Reiser file system on each drive, then check for good directories, and that should clearly identify which drives are data drives.

You really should have a backup of your flash drive, and a printout or screen capture or notes of your drive assignments, showing a table of drive serial numbers matched to unRAID disk numbers.
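To make the "parity is XOR'd across all of the data drives" remark concrete, here is a toy Python model of single-parity XOR. It is not unRAID's actual code: the three tiny byte strings stand in for whole disks and the helper name xor_blocks is invented. It shows why one missing data disk can be rebuilt from the others plus parity, and why parity has to be computed against the right set of disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three pretend data disks, three bytes each.
data = [b"\x10\x20\x30", b"\x0f\x0e\x0d", b"\xaa\xbb\xcc"]

# Parity is the XOR of every data disk.
parity = xor_blocks(data)

# Lose the middle disk, then rebuild it from the survivors plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])

# XOR across all data disks plus parity is zero at every byte position.
print(xor_blocks(data + [parity]) == bytes(3))
```

The same arithmetic explains the caution in the steps above: if a data disk is assigned as parity by mistake, a parity sync would overwrite it with the XOR of the other disks.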
talmania Posted September 13, 2011 Author Posted September 13, 2011 Thanks again for the help--I truly appreciate it. My concern is that I'm almost positive of what my parity drive is but Unraid isn't (at least from the message given). In summary: I logged into the unraid menu one night with the array running as expected and noticed disk 5 in DISK_DSBL state. I stopped the array and replaced the disk, but when trying to bring the array back online Unraid stated it detected a new parity drive. By proceeding with the steps below, am I essentially chancing it that my data is there? If drive 5 is truly bad and my array has no idea what my parity drive is, aren't I screwed? Should I attempt to bring the original disk 5 (the one that started this whole mess) back into the array in the steps listed below, or leave it out and proceed as listed, adding in parity at the very end? I will absolutely keep printouts and notes going forward. I'm embarrassed I haven't and I should know better. Thanks!
mbryanr Posted September 13, 2011 Posted September 13, 2011 "By proceeding with the steps below, am I essentially chancing it that my data is there? If drive 5 is truly bad and my array has no idea what my parity drive is, aren't I screwed? Should I attempt to bring the original disk 5 (the one that started this whole mess) back into the array in the steps listed below, or leave it out and proceed as listed, adding in parity at the very end? I will absolutely keep printouts and notes going forward. I'm embarrassed I haven't and I should know better. Thanks!"

I'm really a novice with this, and learning as we go, so:

1. Let's take this one step at a time....run a smart test on the original drive 5. We need that back to recover all your data.
2. You got a REDBALL because a write failed; it could have been any number of things. I should have asked for the smart test earlier. If you replace it with a new drive, then it will not be able to recalculate parity.
3. If you follow the steps to initconfig, where you have the parity disk correctly assigned, and get disk 5 back into the array, your data will be protected.

That is a ways away, so let's get step 1 done.
talmania Posted September 14, 2011 Author Posted September 14, 2011 You're an expert compared to me!! Regardless I really appreciate the help. I've lost at least 6 months off my life worrying about my data. Here's the output of the smart report:

smartctl 5.39.1 2010-01-28 r3054 [i486-slackware-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Caviar Green family
Device Model: WDC WD20EADS-00S2B0
Serial Number: WD-WCAVY5770676
Firmware Version: 01.00A01
User Capacity: 2,000,398,934,016 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Tue Sep 13 17:56:56 2011 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x85) Offline data collection activity was aborted by an interrupting command from host. Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run.
Total time to complete Offline data collection: (41700) seconds.
Offline data collection capabilities: (0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported. General Purpose Logging supported.
Short self-test routine recommended polling time: ( 2) minutes.
Extended self-test routine recommended polling time: ( 255) minutes.
Conveyance self-test routine recommended polling time: ( 5) minutes.
SCT capabilities: (0x303f) SCT Status supported. SCT Feature Control supported. SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
3 Spin_Up_Time 0x0027 147 145 021 Pre-fail Always - 9633
4 Start_Stop_Count 0x0032 099 099 000 Old_age Always - 1843
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0
9 Power_On_Hours 0x0032 091 091 000 Old_age Always - 7269
10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 22
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 6
193 Load_Cycle_Count 0x0032 189 189 000 Old_age Always - 35731
194 Temperature_Celsius 0x0022 123 107 000 Old_age Always - 29
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0

SMART Error Log Version: 1
Warning: ATA error count 6427 inconsistent with error log pointer 1
ATA Error Count: 6427 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes, SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 6427 occurred at disk power-on lifetime: 6546 hours (272 days + 18 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 00 00 00 00 00
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
e0 00 00 00 00 00 00 08 48d+21:57:33.705 STANDBY IMMEDIATE
e0 00 00 00 00 00 00 08 48d+21:57:33.704 STANDBY IMMEDIATE
e0 00 00 00 00 00 00 08 48d+21:57:33.704 STANDBY IMMEDIATE
e0 00 00 00 00 00 00 08 48d+21:57:33.704 STANDBY IMMEDIATE
e0 00 00 00 00 00 00 08 48d+21:57:33.704 STANDBY IMMEDIATE

Error 6426 occurred at disk power-on lifetime: 6546 hours (272 days + 18 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 00 00 00 00 00
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
e0 00 00 00 00 00 00 08 48d+21:57:33.704 STANDBY IMMEDIATE
e0 00 00 00 00 00 00 08 48d+21:57:33.704 STANDBY IMMEDIATE
e0 00 00 00 00 00 00 08 48d+21:57:33.704 STANDBY IMMEDIATE
e0 00 00 00 00 00 00 08 48d+21:57:33.704 STANDBY IMMEDIATE
e0 00 00 00 00 00 00 08 48d+21:57:33.704 STANDBY IMMEDIATE

Error 6425 occurred at disk power-on lifetime: 6546 hours (272 days + 18 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 00 00 00 00 00
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
e0 00 00 00 00 00 00 08 48d+21:57:33.704 STANDBY IMMEDIATE
e0 00 00 00 00 00 00 08 48d+21:57:33.704 STANDBY IMMEDIATE
e0 00 00 00 00 00 00 08 48d+21:57:33.704 STANDBY IMMEDIATE
e0 00 00 00 00 00 00 08 48d+21:57:33.704 STANDBY IMMEDIATE
e5 00 00 00 00 00 00 08 48d+21:57:33.703 CHECK POWER MODE

Error 6424 occurred at disk power-on lifetime: 6546 hours (272 days + 18 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 00 00 00 00 00
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
e0 00 00 00 00 00 00 08 48d+21:57:33.704 STANDBY IMMEDIATE
e0 00 00 00 00 00 00 08 48d+21:57:33.704 STANDBY IMMEDIATE
e0 00 00 00 00 00 00 08 48d+21:57:33.704 STANDBY IMMEDIATE
e5 00 00 00 00 00 00 08 48d+21:57:33.703 CHECK POWER MODE
e0 00 00 00 00 00 00 08 48d+21:57:19.071 STANDBY IMMEDIATE

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 7269 -
# 2 Short offline Completed without error 00% 7218 -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
I know enough to look at reallocated sectors and pending sectors, and it looks like both are zero. This is the original disk 5 that was reported DISK_DSBL.
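The same quick check can be scripted. The Python sketch below pulls the raw values of a few commonly watched attributes out of a smartctl attribute table. It assumes the ten-column layout shown in this report; the WATCH set and the function name raw_values are illustrative choices, and the sample rows are copied from the report above.

```python
# Attributes most often checked first when a disk is suspected of failing.
WATCH = {"Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable"}


def raw_values(report):
    """Return {attribute_name: raw_value} for watched rows of a smartctl -A style table."""
    found = {}
    for line in report.splitlines():
        fields = line.split()
        # Rows look like: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCH:
            found[fields[1]] = int(fields[9])
    return found


SAMPLE = """
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
9 Power_On_Hours 0x0032 091 091 000 Old_age Always - 7269
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0
"""

for name, value in raw_values(SAMPLE).items():
    print(name, value)
```

All three watched counters are zero in this report. That fits the view that the drive itself is probably fine, though it does not by itself explain why the write that disabled the disk failed.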
mbryanr Posted September 14, 2011 Posted September 14, 2011 Very likely good, which means that your data is good. I read some on the ATA error that shows on your drive; it also looks to be harmless: http://www.ozzu.com/hosting-forum/possible-failure-t62392.html As long as the data drives are good, you should be able to recalculate parity. Search the wiki for initconfig, and find the exact steps. Remember to unassign the parity drive.
talmania Posted September 14, 2011 Author Posted September 14, 2011 Woohoo!! I think I'm almost there! The disk 5 took a bit longer than the others to mount and you can see the reads/writes. Not sure what to make of that.... The data shares are now present and the only issue I'm seeing that I don't think I should see is an unformatted drive message below. I'm not moving forward with assigning the parity---I'm concerned I shouldn't be seeing that option:
talmania Posted September 14, 2011 Author Posted September 14, 2011 OMG...I swear the fear of losing data makes me completely incompetent. I now see that disk 18 is reporting as unformatted--I'm positive it wasn't that way before. Not sure what's going on but investigating....
talmania Posted September 14, 2011 Author Posted September 14, 2011 After looking it over I'm almost certain there was no data on drive 18 yet (thankfully). I just reassigned the parity drive and it's doing a parity sync now.
mbryanr Posted September 14, 2011 Posted September 14, 2011 Good deal! Post another syslog. The only concern was that Disk 18 wasn't empty, but was your parity drive. 4. If any disk did not mount (that is, appears 'unformatted'), well you have a problem: perhaps that is the actual parity disk? a) Suppose you don't know which physical disk is Parity? In this case assign all your hard drives to data disk slots (do NOT assign a Parity disk). Click Start and the one that comes up 'unformatted' is your parity disk (now you know which one it is). Repeat steps above except at step 3 now you know which is Parity so go ahead and assign it.
talmania Posted September 14, 2011 Author Posted September 14, 2011 Parity sync complete and Unraid reports parity is valid! What a relief! Looking over the syslog it looks like mover initialized and some directories were recreated that must have been on 18. I've trimmed that part out in order to get the syslog down below 192k (along with a bit of the beginning). Regardless from a quick glance it looks like things are back to normal!! syslog4.txt
mbryanr Posted September 14, 2011 Posted September 14, 2011 Looks clean. Check your disk folders to ensure everything is there. You will have to set up your user shares again. Mark as solved when you are happy with the results. Quick question: Do you have unRAID set to "MBR: 4K-aligned"?
talmania Posted September 14, 2011 Author Posted September 14, 2011 Thanks so much mbryanr--please check your PM's when you get a chance. My user shares are already present and while I haven't looked through everything (20 something terabytes!!) I have looked in several locations and things appear completely normal. I don't have MBR 4k-aligned enabled. I'm using jumpers on all my EARS. Brings up a good question about moving to the next version of Unraid--do I need to make a change to 4k aligned? Can that switch be made to an existing array?