Multiple issues with drive upgrades



NOTE:  Thread renamed for clarity

 

I'm running unRAID Server Pro 5.0.4.  I just precleared two 4TB drives using preclear_disk.sh and used one of them to replace my existing 3TB parity drive.  While rebuilding parity on the new drive, four of the existing 21 drives showed errors.  I shut down and checked all cables and controllers to make sure everything was seated.  When I powered it back up, everything showed green except a blue ball for the parity drive.  I restarted the parity rebuild and eventually hit the same problem.  This recurred one or two more times until the number of drives showing errors increased to seven.

 

I shut down and rechecked all of my connections again.  This time everything came up green, except that drive 1 now shows as unformatted (drive 1 was one of the drives that had been showing errors during the parity rebuild).  I reverted to version 5.0-rc16c, which I had been using prior to 5.0.4, and the same thing happened.  I'm back on 5.0.4 and I tried reinstalling the original 3TB parity drive, which is the same size as or larger than any other drive in the array.  The original parity drive shows up as the wrong drive, drive 1 is still unformatted, and I cannot start the array because it reports an invalid configuration.  I reinstalled the 4TB parity drive and it shows up with an orange ball, with drive 1 still unformatted.  I cannot rebuild the data on drive 1 with the 4TB parity drive because it never successfully rebuilt parity.  The only way to do this is with the original 3TB parity drive.
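
To spell out why a half-built parity drive is no use for a rebuild, here's a rough sketch of how single parity works.  It assumes plain XOR parity across the data disks, with one byte per disk standing in for a whole drive.  The function name xor_all and the byte values are made up for illustration and are not part of unRAID.

```shell
# Illustration only: XOR all integer arguments together and print the result.
xor_all() {
    acc=0
    for v in "$@"; do
        acc=$(( acc ^ v ))
    done
    echo "$acc"
}

# Made-up bytes standing in for the same offset on three data disks.
disk1=60; disk2=165; disk3=15

# Parity is the XOR of every data disk at that offset.
parity=$(xor_all "$disk1" "$disk2" "$disk3")

# Losing disk1: XOR the parity with the surviving disks to get it back.
rebuilt=$(xor_all "$parity" "$disk2" "$disk3")
echo "parity=$parity rebuilt_disk1=$rebuilt"
```

With valid parity the missing byte comes back exactly.  If the parity byte was never written, the same XOR just produces garbage for that offset, which is why only the original 3TB parity drive can help here.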

 

I have another drive of the same size as drive 1 that can be used to rebuild the data.  I need to know if there's a way to restore the original parity drive so I can rebuild the data to a different drive.

 

I'm also wondering if it's feasible to clone the parity drive from the original 3TB drive to the new 4TB drive using dd or some other method.  That way the system would see the correct parity drive.  My main concern using this method would be that I'm not sure that the configuration sees the parity as being valid.  If I can convince it that it is valid then I perhaps stand a chance of rebuilding the lost data.
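
For anyone weighing in on the dd idea, this is roughly what I have in mind.  It's only a sketch that I've pictured running against image files: the function name clone_and_verify and both arguments are placeholders, and whether unRAID would accept the result as valid parity is a separate open question.

```shell
# Sketch only: copy SRC onto DST block-for-block, then confirm the copied
# region matches.  SRC and DST are placeholder paths (image files here).
clone_and_verify() {
    src=$1
    dst=$2
    # conv=notrunc leaves anything in DST beyond the length of SRC untouched.
    dd if="$src" of="$dst" bs=65536 conv=notrunc 2>/dev/null || return 1
    # wc -c is fine for a regular file; a real block device would need its
    # size from something like blockdev --getsize64 instead.
    size=$(wc -c < "$src")
    # Compare only as many bytes as SRC holds.
    cmp -n "$size" "$src" "$dst"
}
```

Since dd overwrites the target without asking, I'd want to triple-check which device is which before pointing this at real drives.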


I put the array in maintenance mode and ran the following from the command line:

 

reiserfsck --check /dev/md1

 

The check failed immediately with an error message about block 2 being bad, or something to that effect, basically telling me that the drive is bad.  I'm running the SeaTools long test on the drive to confirm its status.  I just checked the warranty status and it expires on February 7, 2014, so I'm almost hoping the drive is bad so I can get it replaced under warranty.

 

I did a spot check of the TV shows I have archived and noted which episodes of each show were missing.  I can recover pretty much all of them, so the only things missing are some movies on Blu-Ray and DVD, none of which are a monumental loss.  I still wouldn't mind being able to recover the data using the original parity drive, but I don't even know if that's possible at this point.  If it isn't, I'll just have to bite the bullet and replace the drive and rerun parity from scratch.

 

Just a quick question - drive md1 corresponds to data disk 1, correct?  I'm not sure if I actually checked disk 1 or the parity disk using this command.

 

Here's my setup:

 

unRAID Server Pro 5.0.4

Gigabyte F2A85XM-HD3 motherboard

AMD A4 6300 CPU

2x4GB Mushkin PC3-10666 RAM

two Supermicro AOC-SASLP-MV8 controllers with 0.21 firmware

Intel PCI-e gigabit NIC

Supermicro SC846TQ 24-bay server case

Corsair HX850 PSU

 

syslog.txt


I decided to bite the bullet and go ahead and reformat the drive.  I tried using a utility that allows me to view reiserfs drives in Windows and the drive is showing up blank.

 

I started the array with the drive installed and tried to format it.  It appeared to start OK, but when I hit refresh it just reverted back to the original screen that showed the drive as being unformatted.  I tried reformatting it several more times with the same results. 

 

I swapped out the questionable 1.5TB drive with one of the new 4TB drives I had already precleared.  The other 4TB drive is installed as the new parity drive.  Numerous attempts to start the array and format the new drive failed so I'm back to square one.

 

My next plan is to delete the existing configuration and start from scratch.  Theoretically, the existing data drives should remain unaffected and the new blank data drive should format and then start building parity.

 

What's the recommended way to erase the current configuration?  I have several shares set up as well as a static IP.  Is it OK to restore one or more of the .cfg files back to the configuration folder on the flash drive so I don't have to reconfigure everything?
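
Whatever the answer turns out to be, I'd copy the .cfg files somewhere safe before wiping anything.  A minimal sketch of that backup step follows.  The function name backup_cfg is made up, and the paths in the example comment assume the flash drive is mounted at /boot with the configuration folder at /boot/config.

```shell
# backup_cfg SRC_DIR DEST_DIR: copy every .cfg file from SRC_DIR into DEST_DIR.
backup_cfg() {
    src_dir=$1
    dest_dir=$2
    mkdir -p "$dest_dir" || return 1
    for f in "$src_dir"/*.cfg; do
        [ -e "$f" ] || continue   # glob matched nothing
        cp -p "$f" "$dest_dir"/ || return 1
    done
}

# Example (paths are assumptions): backup_cfg /boot/config /boot/config_backup
```

This only preserves the files; which of them are safe to copy back after creating a new configuration is still the open question.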


OK, I think I've got it figured out.  I forgot there was an option on the Settings tab to wipe the current configuration and create a new one.  I knew there was a simple way to do it but I forgot what it was and there's little to no documentation I could find on anything newer than version 4.7.

 

I also discovered I could specify that the parity was valid even after creating the new configuration.  This allowed me to create the new configuration with the original parity drive and replace the flaky drive with a new one of the same capacity.  It's currently clearing the drive and I expect to be able to rebuild the drive using the original parity drive when I get home from work this evening. 

 

If all goes well I'll upgrade the original parity drive to one of the 4TB drives and then swap out the old 3TB parity drive and the 2nd new 4TB drive for two of the existing smaller data drives that are getting long in the tooth.  I'm keeping my fingers crossed in any case.

 

What's crazy is that I rarely have drive issues with unRAID until I try to upgrade the configuration.


Well, that didn't work quite like I expected.  Here's what transpired:

 

Installed 4TB drives for both parity and disk 1.  Disk 1 would not format after countless attempts.

 

Swapped out disk 1 for old 3TB parity drive, created new configuration, and formatted it with no problems.

 

Started the array and initiated a parity rebuild.  It worked fine for a while and then the parity drive red balled and one of the drives (drive 8) showed multiple errors.  I mapped drive 8 from my PC and it only listed a couple of files.

 

I rebooted and all drives showed green once again.  I went through this scenario a couple of times with the same result (i.e., parity drive red balled and drive 8 showed multiple errors). 

 

Just before going to bed, I started the array and initiated the parity rebuild once again and let it run all night.  When I checked it this morning there were several drives with errors in addition to drive 8.  I stopped the array and all of the drives with errors were shown as not installed with red balls.  I mapped a couple of the drives and they were all showing as empty.

 

Rebooted the array and all drives that were shown as not installed were now included back in the array and had green balls.  I mapped one of the questionable drives and all of the contents now appeared to be intact.

 

I've run the SeaTools long test on several of the suspect drives and so far every one has passed.  One of them was a WD green drive (2TB WDEARX) so I haven't checked that one yet with the WD Lifeguard diagnostics.

 

I recently upgraded my motherboard, CPU, and memory so my next plan is to swap the new hardware for the old and see what happens.  Several of the suspect drives are getting long in the tooth so I will probably have to think about replacing them if they fail again with the old motherboard and CPU.


If it occurs again I will do so.  Right now it's running a parity synch with the old motherboard, CPU, and RAM and so far it hasn't displayed a single error at 18% completion.  The new motherboard setup would have at least shown failures with drive 8 at this point in the parity synch.

 

I wasn't all that happy with the physical alignment of the new Gigabyte motherboard with respect to the Supermicro chassis.  The 1st PCI-e x16 slot was just slightly off with respect to the slot opening in the case, causing me to stress the Supermicro SATA controller slightly to get things lined up.  Installing the controller in the slot turned out to be quite a chore as well.  I'm thinking that the way the controller was being mounted may have caused some intermittent connections or some other issue that was causing the problem. 

 

For now, I'm keeping my fingers crossed that it makes it through the parity synch with no further incidents.  If it does I'll take another shot at trying to format the other 4TB drive.  The parity synch won't be complete until sometime tomorrow morning so I'm in a holding pattern until something happens or it completes parity synch.


Parity synch completed with no errors showing on any drive.  I swapped out my last remaining 750GB drive for the 2nd 4TB drive and started the array.  I hit refresh to see what was going on and it wouldn't respond.  I checked the server and all of the drive activity lights were flashing so I assumed that it was doing a data rebuild on the new drive.  I left my web browser open and tried to reconnect with the server but it just hung there.  I left it for a while and when I checked back it had finally connected and was showing a data rebuild in progress on the new drive. 

 

It appears that the new hardware was causing the problem.  I just need to determine if the hardware is defective or if flexing the SATA controller was the root cause of my problems.  Once I get all of my drives updated I'll experiment with it and see what I can find out.  For now, it's all good.

 

One parting thought -  The main reason I swapped out the motherboard was for the eight onboard SATA ports and two PCI-e x16 slots to house the two Supermicro 8-port SATA controllers.  The Supermicro server chassis holds up to 24 drives so I wanted the ability to populate all available drive bays.

 

Both my old motherboard and the new one each have two PCI-e x16 slots, one x1 slot and one PCI slot.  The 8-port SATA controllers occupy the two x16 slots and the x1 slot currently holds an Intel gigabit NIC.  I have a Promise SATA4 PCI controller to handle the remaining two drive slots, but I also have a Silicon Image SIL3132 PCI-e x1 2-port SATA controller I could use instead.

 

The question boils down to what setup gives me the most benefit - going with the Intel NIC and the PCI SATA controller or the onboard Realtek NIC and the PCI-e SATA controller.

 

Any thoughts or suggestions?  I went with the Gigabyte board because it had eight onboard SATA ports, thereby eliminating the need for the extra SATA controller that occupied a slot.  This allowed me to have 24 SATA ports as well as the Intel NIC with the option to use the PCI slot if I ever need an additional SATA port for a cache drive.
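
To put rough numbers on the PCI versus PCI-e part of the question, here's a back-of-the-envelope sketch.  It assumes theoretical peak figures of 133 MB/s for a 32-bit/33MHz PCI bus (shared by everything on it) and 250 MB/s for a PCI-e 1.x x1 lane, two 3TB drives on the controller, and the bus being the only bottleneck.  The function name est_minutes is made up, and real throughput will be lower than these peaks.

```shell
# est_minutes SIZE_GB BUS_MBPS DRIVES
# Rough minutes to read one drive end to end when DRIVES drives share a
# bus with BUS_MBPS of total bandwidth (integer math, decimal units).
est_minutes() {
    size_mb=$(( $1 * 1000 ))
    per_drive=$(( $2 / $3 ))
    echo $(( size_mb / per_drive / 60 ))
}

est_minutes 3000 133 2   # PCI controller: prints 757 (about 12.6 hours)
est_minutes 3000 250 2   # PCI-e x1 controller: prints 400 (about 6.7 hours)
```

On paper that's nearly double the time for a parity check or rebuild whenever the two drives on the PCI controller are the slowest link.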


That's pretty much what I was thinking as well.  I went ahead and switched back to the onboard Realtek NIC and replaced the Promise SATA controller with a 2-port Silicon Image SIL3132 PCI-e controller.  I've got two more 3TB Seagate drives on the way so now I won't have to swap them out with two of my existing drives.  I had been trying to avoid using the ports connected to the Promise controller for fear it would take forever to perform a parity check or data rebuild if it caused a severe bottleneck.  If I can get the issues resolved with the Gigabyte motherboard setup I may put it back into service so I can use the Intel NIC.

 

Thanks for your input.
