talmania

[SOLVED!!!] Replacing disabled drive now wants to replace parity?

26 posts in this topic

I noticed today that my drive 5 was offline and had the red DISK_DSBL next to it.  Took the array down, removed drive 5 from the array, powered off, reseated the drive, powered back on and reassigned the drive to slot five.  When trying to start the array it looked like it was going to mount then jumped back to the default screen.

 

So I figured the drive was truly having issues, so I removed it, shut down, installed a new drive, and assigned it to the slot. UNRAID noticed that the disabled drive had been replaced. However, when I clicked "I'm sure I want to do this" and started the array, it stopped and gave the message "New Parity disk installed".  My parity disk is a 2TB WD EARS and the disk I replaced (also an EARS) was a Hitachi 2TB.  Size reported is the same.

 

I'm running 4.7 final and have attached my syslog.  Help!  Thanks so much!

syslog.txt


Ok so I spent last night looking at things and remembered a previous problem I had here:

 

http://lime-technology.com/forum/index.php?topic=11886.0

 

Similar issue now: I've tried to start the array multiple times (both with and without the disabled drive) and every time it looks like it's starting to mount, then stops.

 

I've looked over the syslog and I don't see a tell-tale entry from the previous issue like:

 

Mar 23 22:38:21 Deed emhttp: get_fstype: open /dev/sdh1: No such file or directory

 

I do see a lock_rdev error.  Will post an updated syslog soon, but I'm stumped as to why I'm having these issues.  Thanks!


Anyone????

 

Where's Joe when a guy needs him!  ;D

 

I'm approaching this as follows:

 

I've swapped the brand new Hitachi back in and will preclear it before adding it to the array (I just added it directly before), in hopes of avoiding the "replace parity" message when trying to start the array.

 

I'm also going to buy a 2TB EARS on the minuscule chance that the issue is that Unraid thinks the unformatted Hitachi is larger than the EARS that's currently the parity drive.

 


Finished the preclear of the Hitachi and after adding it successfully as disk 5 and then starting the array I again got the message "new parity installed". 

 

I'm stuck again.


Thanks for the response--I've reseated every cable in my Norco 2040 and just got back from buying a new EARS, tried it with the exact same result.  It recognizes a disabled disk has been replaced but after starting it states "new parity disk installed".

 

I've made zero changes to the drive order other than to try different brand new drives in the same disk 5 slot.  I'm attaching a new syslog where I did notice the following:  Interesting about the sdj and sdj1.  Not entirely sure what it means though.

 

Sep 10 20:08:49 Deed emhttp: writing mbr on disk 5 (/dev/sdj) with partition 1 offset 63

Sep 10 20:08:49 Deed emhttp: re-reading /dev/sdj partition table

Sep 10 20:08:49 Deed kernel:  sdj: sdj1

Sep 10 20:08:50 Deed emhttp: mdcmd: write: No such device or address

Sep 10 20:08:50 Deed kernel: mdcmd (44): start RECON_DISK

Sep 10 20:08:50 Deed kernel: md: do_run: lock_rdev error: -6

syslog_2.txt
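
For anyone combing through a long syslog for the same symptoms, here is a minimal Python sketch that pulls out the kinds of lines quoted in this thread. The pattern list and the function name `scan_syslog` are my own illustration, not an official list of unRAID error messages, so treat it as a starting point only.

```python
import re

# Patterns based on log lines quoted in this thread. Illustrative only,
# not an exhaustive catalogue of unRAID errors.
PATTERNS = {
    "missing partition": re.compile(
        r"get_fstype: open (/dev/\w+): No such file or directory"
    ),
    "lock_rdev error": re.compile(r"lock_rdev error: (-?\d+)"),
    "mdcmd write failure": re.compile(r"mdcmd: write: (.+)$"),
}


def scan_syslog(lines):
    """Return (label, detail, line) tuples for every line matching a pattern."""
    hits = []
    for line in lines:
        line = line.rstrip("\n")
        for label, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                hits.append((label, match.group(1), line))
    return hits


# Sample input: three of the lines posted above.
sample = [
    "Sep 10 20:08:49 Deed emhttp: writing mbr on disk 5 (/dev/sdj) with partition 1 offset 63",
    "Sep 10 20:08:50 Deed emhttp: mdcmd: write: No such device or address",
    "Sep 10 20:08:50 Deed kernel: md: do_run: lock_rdev error: -6",
]

for label, detail, _ in scan_syslog(sample):
    print(f"{label}: {detail}")
```

On Linux, errno 6 is ENXIO, whose standard message is "No such device or address", so the -6 from lock_rdev and the mdcmd write message look like the same failure reported twice. That reading is an inference on my part, not something confirmed in this thread.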


A couple of screen shots...

 

After more research tonight I found a couple of posts saying that I should be able to start the array with one drive missing, so I unassigned it and rebooted, and I still can't start the array.  Here's what I see...

 

Capture.PNG

 

Capture2.PNG


Alright...I just had something very similar happen to me.  Although the disk cables and slots were the same...they were assigned by unRAID in a different location (disk6, disk7). 

 

Anyways, I had the original disk configuration and was able to reassign the disks to the correct slots.

 

I saw the same entry in your syslog...stale configuration.  Do you have the original disk assignments with serial numbers?


Thanks for the reply---by original config do you mean original as in first ever config?  I've grown from 3-4 disks to now almost 20.  Or do you mean the config right before removing the disabled disk?

 

I'm not sure it matters since I don't have any disk config other than the previous syslog linked above.


Maybe we're onto something here...

 

From the old syslog:

 

Mar 22 07:52:15 Deed emhttp: Device inventory:
Mar 22 07:52:15 Deed emhttp: pci-0000:00:1f.2-scsi-0:0:0:0 host3 (sdr) WDC_WD20EARS-00MVWB0_WD-WCAZA0399136
Mar 22 07:52:15 Deed emhttp: pci-0000:00:1f.2-scsi-1:0:0:0 host4 (sds) ST31500341AS_9VS09K8D
Mar 22 07:52:15 Deed emhttp: pci-0000:00:1f.2-scsi-2:0:0:0 host5 (sdt) ST31500341AS_9VS1ELPJ
Mar 22 07:52:15 Deed emhttp: pci-0000:00:1f.2-scsi-3:0:0:0 host6 (sdu) ST31500341AS_9VS1Z549
Mar 22 07:52:15 Deed emhttp: pci-0000:00:1f.2-scsi-4:0:0:0 host7 (sdv) WDC_WD3000GLFS-01F8U0_WD-WXL508054114
Mar 22 07:52:15 Deed emhttp: pci-0000:01:00.0-sas-phy0:1-0x0000000000000000:0-lun0 host0 (sdb) ST31500541AS_6XW0PY4L
Mar 22 07:52:15 Deed emhttp: pci-0000:01:00.0-sas-phy1:1-0x0100000000000000:1-lun0 host0 (sdc) ST31500541AS_6XW0JM2Q
Mar 22 07:52:15 Deed emhttp: pci-0000:01:00.0-sas-phy2:1-0x0200000000000000:2-lun0 host0 (sdd) Hitachi_HDS5C30_ML0220F30KUV9D
Mar 22 07:52:15 Deed emhttp: pci-0000:01:00.0-sas-phy3:1-0x0300000000000000:3-lun0 host0 (sde) WDC_WD20EARS-00_WD-WMAZA0987578
Mar 22 07:52:15 Deed emhttp: pci-0000:01:00.0-sas-phy4:1-0x0400000000000000:4-lun0 host0 (sdf) WDC_WD20EARS-00_WD-WCAZA1239135
Mar 22 07:52:15 Deed emhttp: pci-0000:01:00.0-sas-phy5:1-0x0500000000000000:5-lun0 host0 (sdg) WDC_WD20EARS-00_WD-WCAZA1241990
Mar 22 07:52:15 Deed emhttp: pci-0000:01:00.0-sas-phy6:1-0x0600000000000000:6-lun0 host0 (sdh) WDC_WD20EARS-00_WD-WCAZA1263685
Mar 22 07:52:15 Deed emhttp: pci-0000:01:00.0-sas-phy7:1-0x0700000000000000:7-lun0 host0 (sdi) ST32000542AS_5XW1NRNQ
Mar 22 07:52:15 Deed emhttp: pci-0000:02:00.0-sas-phy0:1-0x0000000000000000:0-lun0 host2 (sdj) WDC_WD20EADS-00S2B0_WD-WCAVY5770676
Mar 22 07:52:15 Deed emhttp: pci-0000:02:00.0-sas-phy1:1-0x0100000000000000:1-lun0 host2 (sdk) WDC_WD20EADS-00_WD-WCAVY5771098
Mar 22 07:52:15 Deed emhttp: pci-0000:02:00.0-sas-phy2:1-0x0200000000000000:2-lun0 host2 (sdl) ST32000542AS_5XW2K2X7
Mar 22 07:52:15 Deed emhttp: pci-0000:02:00.0-sas-phy3:1-0x0300000000000000:3-lun0 host2 (sdm) ST31500341AS_9VS01DQK
Mar 22 07:52:15 Deed emhttp: pci-0000:02:00.0-sas-phy4:1-0x0400000000000000:4-lun0 host2 (sdn) WDC_WD20EARS-00_WD-WCAZA2829716
Mar 22 07:52:15 Deed emhttp: pci-0000:02:00.0-sas-phy5:1-0x0500000000000000:5-lun0 host2 (sdo) Hitachi_HDS5C30_ML0220F30LBM0D
Mar 22 07:52:15 Deed emhttp: pci-0000:02:00.0-sas-phy6:1-0x0600000000000000:6-lun0 host2 (sdp) WDC_WD20EARS-00_WD-WMAZA1059076
Mar 22 07:52:15 Deed emhttp: pci-0000:02:00.0-sas-phy7:1-0x0700000000000000:7-lun0 host2 (sdq) WDC_WD20EARS-00_WD-WMAZA1010557

 

And the latest....now to compare.  If they are off should I modify to correct or ????

 

 

Sep 10 20:08:27 Deed emhttp: Device inventory:
Sep 10 20:08:27 Deed emhttp: pci-0000:00:1f.2-scsi-0:0:0:0 host3 (sdr) WDC_WD20EARS-00MVWB0_WD-WCAZA0399136
Sep 10 20:08:27 Deed emhttp: pci-0000:00:1f.2-scsi-1:0:0:0 host4 (sds) ST31500341AS_9VS09K8D
Sep 10 20:08:27 Deed emhttp: pci-0000:00:1f.2-scsi-2:0:0:0 host5 (sdt) ST31500341AS_9VS1ELPJ
Sep 10 20:08:27 Deed emhttp: pci-0000:00:1f.2-scsi-3:0:0:0 host6 (sdu) ST31500341AS_9VS1Z549
Sep 10 20:08:27 Deed emhttp: pci-0000:00:1f.2-scsi-4:0:0:0 host7 (sdv) WDC_WD3000GLFS-01F8U0_WD-WXL508054114
Sep 10 20:08:27 Deed emhttp: pci-0000:01:00.0-sas-phy0:1-0x0000000000000000:0-lun0 host0 (sdb) ST31500541AS_6XW0PY4L
Sep 10 20:08:27 Deed emhttp: pci-0000:01:00.0-sas-phy1:1-0x0100000000000000:1-lun0 host0 (sdc) ST31500541AS_6XW0JM2Q
Sep 10 20:08:27 Deed emhttp: pci-0000:01:00.0-sas-phy2:1-0x0200000000000000:2-lun0 host0 (sdd) Hitachi_HDS5C30_ML0220F30KUV9D
Sep 10 20:08:27 Deed emhttp: pci-0000:01:00.0-sas-phy3:1-0x0300000000000000:3-lun0 host0 (sde) WDC_WD20EARS-00_WD-WMAZA0987578
Sep 10 20:08:27 Deed emhttp: pci-0000:01:00.0-sas-phy4:1-0x0400000000000000:4-lun0 host0 (sdf) WDC_WD20EARS-00_WD-WCAZA1239135
Sep 10 20:08:27 Deed emhttp: pci-0000:01:00.0-sas-phy5:1-0x0500000000000000:5-lun0 host0 (sdg) WDC_WD20EARS-00_WD-WCAZA1241990
Sep 10 20:08:27 Deed emhttp: pci-0000:01:00.0-sas-phy6:1-0x0600000000000000:6-lun0 host0 (sdh) WDC_WD20EARS-00_WD-WCAZA1263685
Sep 10 20:08:27 Deed emhttp: pci-0000:01:00.0-sas-phy7:1-0x0700000000000000:7-lun0 host0 (sdi) ST32000542AS_5XW1NRNQ
Sep 10 20:08:27 Deed emhttp: pci-0000:02:00.0-sas-phy0:1-0x0000000000000000:0-lun0 host2 (sdj) WDC_WD20EARS-00MVWB0_WD-WCAZA8408520
Sep 10 20:08:27 Deed emhttp: pci-0000:02:00.0-sas-phy1:1-0x0100000000000000:1-lun0 host2 (sdk) WDC_WD20EADS-00_WD-WCAVY5771098
Sep 10 20:08:27 Deed emhttp: pci-0000:02:00.0-sas-phy2:1-0x0200000000000000:2-lun0 host2 (sdl) ST32000542AS_5XW2K2X7
Sep 10 20:08:27 Deed emhttp: pci-0000:02:00.0-sas-phy3:1-0x0300000000000000:3-lun0 host2 (sdm) ST31500341AS_9VS01DQK
Sep 10 20:08:27 Deed emhttp: pci-0000:02:00.0-sas-phy4:1-0x0400000000000000:4-lun0 host2 (sdn) WDC_WD20EARS-00_WD-WCAZA2829716
Sep 10 20:08:27 Deed emhttp: pci-0000:02:00.0-sas-phy5:1-0x0500000000000000:5-lun0 host2 (sdo) Hitachi_HDS5C30_ML0220F30LBM0D
Sep 10 20:08:27 Deed emhttp: pci-0000:02:00.0-sas-phy6:1-0x0600000000000000:6-lun0 host2 (sdp) WDC_WD20EARS-00_WD-WMAZA1059076
Sep 10 20:08:27 Deed emhttp: pci-0000:02:00.0-sas-phy7:1-0x0700000000000000:7-lun0 host2 (sdq) WDC_WD20EARS-00_WD-WMAZA1010557
Sep 10 20:08:27 Deed emhttp: get_fstype: open /dev/sdj1: No such file or directory
Sep 10 20:08:27 Deed emhttp: get_fstype: open /dev/sdh1: No such file or directory
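
Comparing two inventories by eye is error-prone, so here is a rough Python sketch that does the diff. The regular expression is my guess at the line layout based only on the lines quoted above, and `parse_inventory` / `diff_inventories` are hypothetical helper names. It keys on the controller port path, since the sdX letters are assigned by Linux and are not guaranteed to stay put.

```python
import re

# Matches the "Device inventory" entries quoted above:
#   <port path> hostN (sdX) <model_serial>
INVENTORY_RE = re.compile(r"emhttp: (pci-\S+) host\d+ \((sd[a-z]+)\) (\S+)$")


def parse_inventory(lines):
    """Map each controller port path to (device name, model_serial)."""
    inventory = {}
    for line in lines:
        match = INVENTORY_RE.search(line.strip())
        if match:
            port, device, ident = match.groups()
            inventory[port] = (device, ident)
    return inventory


def diff_inventories(old, new):
    """Return (port, old entry, new entry) for every port that differs."""
    changes = []
    for port in sorted(set(old) | set(new)):
        if old.get(port) != new.get(port):
            changes.append((port, old.get(port), new.get(port)))
    return changes


# Three entries from each listing above (sdi, sdj, sdk).
old_log = [
    "Mar 22 07:52:15 Deed emhttp: pci-0000:01:00.0-sas-phy7:1-0x0700000000000000:7-lun0 host0 (sdi) ST32000542AS_5XW1NRNQ",
    "Mar 22 07:52:15 Deed emhttp: pci-0000:02:00.0-sas-phy0:1-0x0000000000000000:0-lun0 host2 (sdj) WDC_WD20EADS-00S2B0_WD-WCAVY5770676",
    "Mar 22 07:52:15 Deed emhttp: pci-0000:02:00.0-sas-phy1:1-0x0100000000000000:1-lun0 host2 (sdk) WDC_WD20EADS-00_WD-WCAVY5771098",
]
new_log = [
    "Sep 10 20:08:27 Deed emhttp: pci-0000:01:00.0-sas-phy7:1-0x0700000000000000:7-lun0 host0 (sdi) ST32000542AS_5XW1NRNQ",
    "Sep 10 20:08:27 Deed emhttp: pci-0000:02:00.0-sas-phy0:1-0x0000000000000000:0-lun0 host2 (sdj) WDC_WD20EARS-00MVWB0_WD-WCAZA8408520",
    "Sep 10 20:08:27 Deed emhttp: pci-0000:02:00.0-sas-phy1:1-0x0100000000000000:1-lun0 host2 (sdk) WDC_WD20EADS-00_WD-WCAVY5771098",
]

changes = diff_inventories(parse_inventory(old_log), parse_inventory(new_log))
for port, before, after in changes:
    print(port)
    print("  was:", before)
    print("  now:", after)
```

Going through the two full listings above line by line, the only entry that differs is the sdj port, where the WD20EADS (serial WD-WCAVY5770676) has been replaced by the new WD20EARS. Every other port shows the same drive in both logs.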


If they are off you will want to correct. I haven't had a chance to review.

 

Also, IIRC - Joe L posted a command line method to determine which drive is the parity drive (as it doesn't have a file system).  You could search for those commands if you can't find the parity drive.


I'm still confused.  My guess is that the /dev/sdX assignments are done by Linux and the drive assignments are done by Unraid, right?

 

If so, I don't have all the pieces since I don't know what the disk device assignments looked like before (see the above Unraid screen shot).

 

Am I screwed? 

 

Joe, do you check your PMs?

 

Thanks anyone in advance for any help--I sincerely appreciate it.


Alright...before proceeding, get confirmation that these are the right steps (and that this is the problem), but there is a method. The important thing is not to assign parity, since you don't know which drive is your parity drive.

 

If you've lost your disk configuration, safest way to proceed is as follows:

1. (unRAID 4.7: do this step ONLY if the parity drive is known.) We need to force a new disk configuration to get disk 8 back in the array and to force it to rebuild parity.

You do that by typing

initconfig

while the array is stopped. Answer "Yes" on the command line (capital "Y", lower case "es") to its prompt.

Then, back on the management web page, refresh it and all the disks should show as "blue". Then start the array. It will completely rebuild parity.

1. (unRAID 5 beta only!) Go to Utils and click 'New config'.  Check the 'Yes I want to do this' box, then click Apply.

2. Go to Main and start assigning your drives.  Do not assign Parity.  Whichever drive you think is Parity, just don't assign.

3. Now you have all your data disks and cache disk assigned, and they all have a blue dot - that's ok (and Parity is set to "unassigned"). Click Start and array will start and attempt to mount all the data drives.

4. If any disk did not mount (that is, appears 'unformatted'), well you have a problem: perhaps that is the actual parity disk?

5. You can spot check the files on the disks to assure yourself everything looks good.

6. Now Stop array and assign your Parity disk.

7. Click Start and you should see a parity sync start up.

 

Variations on the theme:

a) Suppose you don't know which physical disk is Parity?  In this case assign all your hard drives to data disk slots (do NOT assign a Parity disk).  Click Start and the one that comes up 'unformatted' is your parity disk (now you know which one it is).  Repeat steps above except at step 3 now you know which is Parity so go ahead and assign it.

 

b) Suppose you lost the config, but you know that Parity is valid, so you want to skip the lengthy re-sync. In this case, once you know which disk is Parity, and you have it and all other disks assigned, just prior to clicking the 'Start' button you can type this command in a telnet window:

 

Also, on the "identifying which is the parity disk" question:

... Parity is XOR'd across all of the data drives plus the parity drive, and in certain circumstances with an even number of drives, it is possible to have a pattern of bits in the early sectors that could closely approximate the start of a Reiser file system, enough to fool a file system identifying tool.  What you can assume is that the parity drive is the largest drive, or one of them if there are multiples of the largest size.  Then you could individually try to mount a Reiser file system on each drive, then check for good directories, and that should clearly identify which drives are data drives.
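
To make the XOR point concrete, here is a toy Python sketch of single parity. It works on a few made-up bytes per "disk" rather than real sectors, and it does not model unRAID's actual on-disk layout. The helper name `xor_blocks` and the sample values are mine.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three tiny pretend data disks (made-up contents).
data_disks = [
    b"\x10\x20\x30\x40",
    b"\x0f\x0e\x0d\x0c",
    b"\xaa\xbb\xcc\xdd",
]

# Parity is the XOR of all data disks.
parity = xor_blocks(data_disks)

# XORing every data disk plus parity gives all zeros.
assert xor_blocks(data_disks + [parity]) == bytes(4)

# Lose disk index 1, then rebuild it from the survivors plus parity.
survivors = [data_disks[0], data_disks[2], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == data_disks[1]
print("rebuilt disk matches the original:", rebuilt == data_disks[1])
```

The rebuild only works if the right disk is treated as parity and every other disk is readable. That is why the steps above insist on not assigning parity until you are sure which disk it is.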

 

You really should have a backup of your flash drive, and a printout or screen capture or notes of your drive assignments, showing a table of drive serial numbers matched to unRAID disk numbers.


Thanks again for the help--I truly appreciate it. 

 

My concern is that I'm almost positive of what my parity drive is but Unraid isn't (at least from the message given).  In summary:

 

Logged into unraid menu one night with array running as expected and noticed disk 5 in Disk_DSBL state.  Stopped array and replaced disk but when trying to bring the array back online Unraid stated it detected a new parity drive.

 

By proceeding with the steps below, am I essentially chancing it that my data is there?  If drive 5 is truly bad and my array has no idea what my parity drive is, aren't I screwed?

 

Should I attempt to bring the original disk 5 (the one that started this whole mess) back into the array in the steps listed below or leave it out and proceed as listed, adding in parity at the very end?

 

I will absolutely keep printouts and notes going forward.  I'm embarrassed I haven't and I should know better.

 

Thanks!


I'm really a novice with this and learning as we go, so:

1.  Let's take this one step at a time. Run a SMART test on the original drive 5. We need that drive back to recover all your data.

2.  You got a REDBALL because a write failed; it could have been caused by any number of things. I should have asked for the SMART test earlier. If you replace it with a new drive, then it will not be able to recalculate parity.

3.  If you follow the steps to initconfig, with the parity disk correctly assigned, and get disk 5 back into the array, your data will be protected.  That is a ways away, so let's get step 1 done first.

 


You're an expert compared to me!!   ;D

 

Regardless I really appreciate the help.  I've lost at least 6 months off my life worrying about my data.   :D

 

Here's the output of the smart report:

 

smartctl 5.39.1 2010-01-28 r3054 [i486-slackware-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Green family
Device Model:     WDC WD20EADS-00S2B0
Serial Number:    WD-WCAVY5770676
Firmware Version: 01.00A01
User Capacity:    2,000,398,934,016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Tue Sep 13 17:56:56 2011 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x85)	Offline data collection activity
				was aborted by an interrupting command from host.
				Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
				without error or no self-test has ever 
				been run.
Total time to complete Offline 
data collection: 		 (41700) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
				Auto Offline data collection on/off support.
				Suspend Offline collection upon new
				command.
				Offline surface scan supported.
				Self-test supported.
				Conveyance Self-test supported.
				Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
				power-saving mode.
				Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
				General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 255) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x303f)	SCT Status supported.
				SCT Feature Control supported.
				SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
 1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
 3 Spin_Up_Time            0x0027   147   145   021    Pre-fail  Always       -       9633
 4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       1843
 5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
 7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
 9 Power_On_Hours          0x0032   091   091   000    Old_age   Always       -       7269
10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       22
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       6
193 Load_Cycle_Count        0x0032   189   189   000    Old_age   Always       -       35731
194 Temperature_Celsius     0x0022   123   107   000    Old_age   Always       -       29
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
Warning: ATA error count 6427 inconsistent with error log pointer 1

ATA Error Count: 6427 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 6427 occurred at disk power-on lifetime: 6546 hours (272 days + 18 hours)
 When the command that caused the error occurred, the device was active or idle.

 After command completion occurred, registers were:
 ER ST SC SN CL CH DH
 -- -- -- -- -- -- --
 04 61 00 00 00 00 00

 Commands leading to the command that caused the error were:
 CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
 -- -- -- -- -- -- -- --  ----------------  --------------------
 e0 00 00 00 00 00 00 08  48d+21:57:33.705  STANDBY IMMEDIATE
 e0 00 00 00 00 00 00 08  48d+21:57:33.704  STANDBY IMMEDIATE
 e0 00 00 00 00 00 00 08  48d+21:57:33.704  STANDBY IMMEDIATE
 e0 00 00 00 00 00 00 08  48d+21:57:33.704  STANDBY IMMEDIATE
 e0 00 00 00 00 00 00 08  48d+21:57:33.704  STANDBY IMMEDIATE

Error 6426 occurred at disk power-on lifetime: 6546 hours (272 days + 18 hours)
 When the command that caused the error occurred, the device was active or idle.

 After command completion occurred, registers were:
 ER ST SC SN CL CH DH
 -- -- -- -- -- -- --
 04 61 00 00 00 00 00

 Commands leading to the command that caused the error were:
 CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
 -- -- -- -- -- -- -- --  ----------------  --------------------
 e0 00 00 00 00 00 00 08  48d+21:57:33.704  STANDBY IMMEDIATE
 e0 00 00 00 00 00 00 08  48d+21:57:33.704  STANDBY IMMEDIATE
 e0 00 00 00 00 00 00 08  48d+21:57:33.704  STANDBY IMMEDIATE
 e0 00 00 00 00 00 00 08  48d+21:57:33.704  STANDBY IMMEDIATE
 e0 00 00 00 00 00 00 08  48d+21:57:33.704  STANDBY IMMEDIATE

Error 6425 occurred at disk power-on lifetime: 6546 hours (272 days + 18 hours)
 When the command that caused the error occurred, the device was active or idle.

 After command completion occurred, registers were:
 ER ST SC SN CL CH DH
 -- -- -- -- -- -- --
 04 61 00 00 00 00 00

 Commands leading to the command that caused the error were:
 CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
 -- -- -- -- -- -- -- --  ----------------  --------------------
 e0 00 00 00 00 00 00 08  48d+21:57:33.704  STANDBY IMMEDIATE
 e0 00 00 00 00 00 00 08  48d+21:57:33.704  STANDBY IMMEDIATE
 e0 00 00 00 00 00 00 08  48d+21:57:33.704  STANDBY IMMEDIATE
 e0 00 00 00 00 00 00 08  48d+21:57:33.704  STANDBY IMMEDIATE
 e5 00 00 00 00 00 00 08  48d+21:57:33.703  CHECK POWER MODE

Error 6424 occurred at disk power-on lifetime: 6546 hours (272 days + 18 hours)
 When the command that caused the error occurred, the device was active or idle.

 After command completion occurred, registers were:
 ER ST SC SN CL CH DH
 -- -- -- -- -- -- --
 04 61 00 00 00 00 00

 Commands leading to the command that caused the error were:
 CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
 -- -- -- -- -- -- -- --  ----------------  --------------------
 e0 00 00 00 00 00 00 08  48d+21:57:33.704  STANDBY IMMEDIATE
 e0 00 00 00 00 00 00 08  48d+21:57:33.704  STANDBY IMMEDIATE
 e0 00 00 00 00 00 00 08  48d+21:57:33.704  STANDBY IMMEDIATE
 e5 00 00 00 00 00 00 08  48d+21:57:33.703  CHECK POWER MODE
 e0 00 00 00 00 00 00 08  48d+21:57:19.071  STANDBY IMMEDIATE

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      7269         -
# 2  Short offline       Completed without error       00%      7218         -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
   1        0        0  Not_testing
   2        0        0  Not_testing
   3        0        0  Not_testing
   4        0        0  Not_testing
   5        0        0  Not_testing
Selective self-test flags (0x0):
 After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

 

I know enough to look at reallocated sectors and pending sectors, and it looks like they are zero.  This is the original disk 5 that was reported DISK_DSBL.
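
For checking a stack of these reports, the same eyeball test can be sketched in Python. This assumes the ten-column attribute table layout shown in the report above. The watch list is my own choice: the first two names are the ones mentioned in this post, and Offline_Uncorrectable is an extra I added. `parse_attributes` and `flag_problems` are hypothetical helper names.

```python
# Attributes to watch. The first two are the ones mentioned in this post;
# Offline_Uncorrectable is an extra added for illustration.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")


def parse_attributes(report_lines):
    """Map attribute name to integer raw value from the attribute table."""
    attrs = {}
    for line in report_lines:
        fields = line.split()
        # Attribute rows have 10 columns: ID, name, flag, value, worst,
        # thresh, type, updated, when_failed, raw value.
        if len(fields) >= 10 and fields[0].isdigit() and fields[2].startswith("0x"):
            try:
                attrs[fields[1]] = int(fields[9])
            except ValueError:
                continue  # raw value is not a plain integer, skip it
    return attrs


def flag_problems(attrs, watched=WATCHED):
    """Return the watched attributes whose raw value is above zero."""
    return {name: attrs[name] for name in watched if attrs.get(name, 0) > 0}


# Rows copied from the report above.
sample = [
    "ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE",
    "  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0",
    "197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0",
    "198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0",
]

problems = flag_problems(parse_attributes(sample))
print("attributes needing a closer look:", problems or "none")
```

A clean result here only says those counters are zero. It does not explain the ATA error count in the error log section of the report, which needs a separate look.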


Very likely good, which means that your data is good.

I read up a bit on the ATA error that shows on your drive; it also looks to be harmless:

http://www.ozzu.com/hosting-forum/possible-failure-t62392.html

 

As long as the data drives are good - you should be able to recalculate parity.

Search the wiki for initconfig and find the exact steps. Remember to unassign the parity drive.

 

 


Woohoo!!  I think I'm almost there!

 

The disk 5 took a bit longer than the others to mount and you can see the reads/writes.  Not sure what to make of that....

 

The data shares are now present and the only issue I'm seeing that I don't think I should see is an unformatted drive message below.  I'm not moving forward with assigning the parity---I'm concerned I shouldn't be seeing that option:

 

postinitconfig.PNG


OMG...I swear the fear of losing data makes me completely incompetent.  I now see that disk 18 is reporting as unformatted--I'm positive it wasn't that way before.  Not sure what's going on but investigating....


After looking it over I'm almost certain there was no data on drive 18 yet (thankfully), so I just reassigned the parity drive and it's doing a parity sync now.


Good deal!

Post another syslog.

The only concern was that Disk 18 wasn't empty, but was your parity drive.

4. If any disk did not mount (that is, appears 'unformatted'), well you have a problem: perhaps that is the actual parity disk?

a) Suppose you don't know which physical disk is Parity?  In this case assign all your hard drives to data disk slots (do NOT assign a Parity disk).  Click Start and the one that comes up 'unformatted' is your parity disk (now you know which one it is).  Repeat steps above except at step 3 now you know which is Parity so go ahead and assign it.


Parity sync complete and Unraid reports parity is valid! What a relief!

 

Looking over the syslog it looks like mover initialized and some directories were recreated that must have been on 18.  I've trimmed that part out in order to get the syslog down below 192k (along with a bit of the beginning).  Regardless from a quick glance it looks like things are back to normal!!

 

syslog4.txt


Looks clean.  Check your disk folders to ensure everything is there.  You will have to set up your user shares again.  Mark as solved when you are happy with the results. ;D

 

Quick question: Do you have unRAID set to "MBR: 4K-aligned"?


Thanks so much mbryanr--please check your PMs when you get a chance.

 

My user shares are already present and while I haven't looked through everything (20 something terabytes!!) I have looked in several locations and things appear completely normal.

 

I don't have MBR 4k-aligned enabled.  I'm using jumpers on all my EARS.
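
For what it's worth, the alignment arithmetic behind the jumper can be sketched in Python. The start sector of 63 comes from the "partition 1 offset 63" syslog line earlier in this thread. Everything else is an assumption on my part: 512-byte logical sectors, 4096-byte physical sectors on the EARS drives, a jumper that shifts addressing by one sector, and a 4K-aligned layout that starts the partition at sector 64. The function name `is_4k_aligned` is made up for illustration.

```python
LOGICAL_SECTOR = 512    # bytes per logical sector (assumed)
PHYSICAL_SECTOR = 4096  # bytes per physical sector on the EARS drives (assumed)


def is_4k_aligned(start_sector, jumper_shift=0):
    """True if the partition's first byte falls on a physical sector boundary."""
    byte_offset = (start_sector + jumper_shift) * LOGICAL_SECTOR
    return byte_offset % PHYSICAL_SECTOR == 0


# Partition starting at sector 63, no jumper.
print("sector 63, no jumper:  ", is_4k_aligned(63))
# Partition starting at sector 63 with the jumper's assumed +1 shift.
print("sector 63, with jumper:", is_4k_aligned(63, jumper_shift=1))
# Partition starting at sector 64 (assumed 4K-aligned layout), no jumper.
print("sector 64, no jumper:  ", is_4k_aligned(64))
```

Under those assumptions a jumpered drive with the partition at sector 63 and an unjumpered drive with the partition at sector 64 both end up aligned, while combining the two would not be. This is only the arithmetic and does not say anything about whether an existing array can be switched over.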

 

Brings up a good question about moving to the next version of Unraid--do I need to make a change to 4k aligned?  Can that switch be made to an existing array?
