
HELP!! unRAID 5.0.4 with 1 drive unformatted, parity drive 2 errors



don't get me wrong, i do have backups of what i consider the *very* most important folders (i.e. "Projects", "Design", "Photos", etc.), but some others, such as "Captures", which contains 6-8TB of raw video footage, aren't entirely backed up...although i *did* have it all uploaded into my bitcasa cloud drive (as explained a few posts ago) until they so royally screwed me and everyone else like me who fell for their criminal bait and switch routine.

 

but yeah, it's time to re-group, forget about my lost time and disappointment and do it better.

btw, where did you see the 8TB Seagate Backup Plus drive for $250? best price i could find is $284 at NothingButSavings.com (which i don't really recognize) and $299 at B&H or NewEgg (which sound like more reliable sources).

Link to comment

I didn't say you could buy the Backup Plus for $250 ... I said you could get bare 8TB drives for that  :)

http://www.bhphotovideo.com/c/product/1107004-REG/seagate_st8000as0002_archive_hdd_8tb_sata.html/prm/alsVwDtl

 

Not actually available yet, but should be shipping in about a month.    The Backup Plus units are available now, so that may be a better choice for you at the moment.

 

I do think you're likely to recover the vast majority of your data -- possibly even all of it.    But I'd do this very methodically ... and do the 2-disk-at-a-time checks I outlined to validate the disks.

 

Link to comment

interesting drive you linked to...nice find! i registered to be contacted when they ship...they seem to be built for backup applications, which is nice.

 

well, i think you did convince me to go the route you so thoroughly suggested...seems like i will have to try the 2 drives at a time method to have the best chance of walking away unscathed...after that i might just build a slightly more powerful unRAID with fewer disks (maybe 10 instead of 15) and learn how to take full advantage of the new features built into v6 of unRAID.

 

i'll report back when i have anything more to add...but the chance of drive cage #3 having an internal circuit-board tear seems like a very very good guess!

Link to comment

thanks for that link, dgaschk...hope i won't have to rebuild any drives, but if i do, this will be a life-saver.

 

Since you are using Dynamix, after you have set all the drives correctly under the Array Devices tab, you will have to switch to the 'Array Operations' tab before you run the command line on the console or telnet session!  That is where the 'Start Array' button is on Dynamix.

Link to comment

That's correct -- hopefully that won't be necessary in your case.

 

Another question ... r.e. your comment:

 

... but the chance of drive cage #3 having an internal circuit-board tear seems like a very very good guess!

 

Does this mean that all of your troublesome drives (parity, #13, and #14) were indeed in the same drive cage?

 

 

Link to comment

so far the only drives that showed as either unformatted or missing (Parity, 13 and 14) are in drive cage 3, yes...so i think you may have hit the nail on the head...i may just order the same drive cage again and do a swap, as i am pretty confident that your intuition was right on with this...further experiments will tell...maybe tomorrow it would be a good first experiment to put those 5 disks directly on the cables and see what happens?

 

and this hypothesis is further strengthened by the fact that i put the server in its upright position again before i left the studio, closed the case, and then remotely rebooted the machine half an hour ago...i attached a screen-cap of the WebGUI, and all is green again...very encouraging!

[Attached image: unRAID-Main_07.jpg]

Link to comment

... Note that if you do the "New Config" with "Trust Parity" and there have been ANY writes to the system during your various previous attempts to resolve this, there will be a LOT of sync errors when you do a parity check.

 

But that's okay.    If the array otherwise looks okay (all drives green and nothing showing as unformatted), then I'd just do a normal (correcting) parity check and let it "correct" all of the sync errors -- this is really the same as doing a new parity sync.    When that's done, just do another parity check to confirm you now have no sync errors -- and your array will be fully recovered !!
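To illustrate why stray writes show up as sync errors and why a correcting check ends up equivalent to a fresh sync, here is a minimal sketch. It assumes single parity computed as a byte-wise XOR across the data disks; the disk contents and the function names are made up for illustration and are not unRAID's actual code.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized data blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def count_sync_errors(blocks, parity):
    """Count byte positions where stored parity disagrees with the data."""
    expected = xor_parity(blocks)
    return sum(1 for e, p in zip(expected, parity) if e != p)


# Three toy "data disks" of four bytes each (illustrative values only).
disks = [bytes([1, 2, 3, 4]), bytes([5, 6, 7, 8]), bytes([9, 10, 11, 12])]
parity = xor_parity(disks)

# A write that lands on a data disk while parity is not being updated
# leaves the stored parity stale for the bytes that changed.
disks[1] = bytes([5, 6, 0, 0])
stale_errors = count_sync_errors(disks, parity)

# A correcting check recomputes parity from the data, as a new sync would.
parity = xor_parity(disks)
remaining_errors = count_sync_errors(disks, parity)

print(stale_errors, remaining_errors)
```

In this toy case the two overwritten bytes are exactly the two positions reported as sync errors, and none remain after parity is recomputed.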

 

But if the array does NOT look okay (any drive not green or a drive shows as unformatted) then do NOT do a parity check -- instead follow the procedure we noted earlier for rebuilding a drive after a New Config ... and before even doing that try what I noted about reading the data from another system and/or recovering it with Reiserfsck.

 

By the way ... the damage to cage #3 was almost certainly due to shipping the system with the drives in the cage.

 

Link to comment

yes, i am well aware by now that i caused this damage by not packaging the drives separately...another expensive lesson learned!

 

yesterday i biked all the way across Berlin to a store that actually had a molex to sata-power adapter, in hope that i could then power the drives after taking them out of the cage, only to find that i actually should have purchased *two* of said adapter...back on the bike, across town, then back to the office where my unRAID is located, only to find that the SATA ends that plug into the back of the drive cage will *not* fit directly into the back of the drives, so that after 2 frantic hours of running and biking i had to accept the fact that i would not be able to bring this system back online until i replace the actual drive cage(s)...not a single store in Berlin appears to have any in stock, and even on Amazon.de none of the cages are immediately available to ship out.

 

i do have another question: i noticed that the sata cables were manually numbered, but that those numbers did not necessarily correspond to how they were plugged into the back of the cages...if i remember correctly, the top cage had - maybe 1, 2, 4, 3, 2, then the next cage 1, 3, 4, 2, 4, and faulty cage #3 had 1, 2, 3, 4, 3, or something in that vein...point being, they were *not* numbered in order...does this matter? i mean, obviously everything was working just as intended for at least a few years since i did the last hardware update, so i know it must just be something that is more a violation of my OCD tendencies than a real problem...i am pretty sure that 8 cables come from an 8-channel SATA card in one of the PCIe slots, and i can't say for sure until i get back there whether there is another PCIe card that controls the other 7 drives or whether the mobo sata ports are used, but i may be able to check later, if i go in today...kinda getting a bit burned out on this unRAID problem and may work from home today to not have to look at it for a day.

Link to comment

With v5 the order of the SATA data cables doesn't matter -- the drives are tracked by their serial number, so as long as they're all connected you'll be fine.
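As a rough sketch of why cable order is irrelevant when drives are identified by serial number: the slot assignment is a lookup keyed on the serial, so the port a drive appears on never enters into it. The function name, the port names, and the serial numbers below are all hypothetical and only illustrate the idea; they are not taken from unRAID.

```python
def resolve_slots(assignments, detected):
    """Match array slots to detected drives by serial number.

    assignments: slot name -> drive serial (hypothetical stored config)
    detected:    port name -> drive serial (hypothetical scan result)
    Returns (slot -> port, list of slots whose drive was not found).
    """
    port_by_serial = {serial: port for port, serial in detected.items()}
    located, missing = {}, []
    for slot, serial in assignments.items():
        if serial in port_by_serial:
            located[slot] = port_by_serial[serial]
        else:
            missing.append(slot)
    return located, missing


# Made-up serials, purely for illustration.
assignments = {"parity": "SER-A", "disk1": "SER-B", "disk2": "SER-C"}

# The same three drives, first cabled "in order", then with cables swapped.
in_order = {"port0": "SER-A", "port1": "SER-B", "port2": "SER-C"}
swapped = {"port0": "SER-C", "port1": "SER-A", "port2": "SER-B"}

print(resolve_slots(assignments, in_order))
print(resolve_slots(assignments, swapped))
```

Both scans locate every slot; only the port recorded for each slot differs. A slot is reported missing only when its serial does not show up on any port at all.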

 

SATA power connectors are VERY standard -- I'm surprised they fit the cages but not the drives.  Are you sure you had them oriented correctly?  [They're keyed, so will only plug in in the proper direction]

 

In any event, since you've isolated the problem to a single defective drive cage, simply replacing that cage is the best way to resolve this => although it can be frustrating to wait for the new cage.    I'm surprised there's not a supplier in Berlin that has these.

 

Link to comment

yeah, trust me when i say that i stared at the connectors from all angles for at least 15 mins, trying to get them plugged in...they are keyed (L-shaped), but it almost seems like on the cable ends the copper contacts are on the opposite side of the horizontal middle "lip" than they are on the drives' female counter-part, when the L is in the same direction...i thought that to be very curious as well, as i know how standardized sata connectors are.

 

good to know about the order of cables not mattering! i began to worry about having to pull the whole system apart to rectify this.

i'll do another search now to see whether maybe one of the stores has stock, even though their website says that they don't...it happens.

Link to comment

yeah, trust me when i say that i stared at the connectors from all angles for at least 15 mins, trying to get them plugged in...they are keyed (L-shaped), but it almost seems like on the cable ends the copper contacts are on the opposite side of the horizontal middle "lip" than they are on the drives' female counter-part, when the L is in the same direction...i thought that to be very curious as well, as i know how standardized sata connectors are.

Sounds like you got the opposite cable - to go from a sata power coming off the supply to a molex.  (But I would think that the molex end would also have been the reversed gender.)

Link to comment

no, the power adapters i got were fine...it was the SATA cable ends that i already had and used with the backplane cages that were not directly attachable to the drives.

 

i ordered a cage that was available via Amazon Prime for delivery tomorrow, just to get me going again, so let's see where i am by the weekend.

Link to comment

i ordered a cage that was available via Amazon Prime for delivery tomorrow, just to get me going again, so let's see where i am by the weekend.

 

That should do the trick -- the only issue will be that UnRAID will "think" you have some failed drives, since the status has been updated on the flash drive during your various earlier attempts.    But simply doing a "New Config" will fix that -- you can do it with the "Trust Parity" option if you're fairly sure there haven't been any writes at all.

 

I'd be inclined to do the following (VERY carefully) ...

 

(1)  Do a New Config and assign all of the data drives but NOT the parity drive (be CERTAIN you know which drive is parity so you don't make a mistake in the assignments).    Then Start the array and see if everything looks okay -- i.e. there are no "unformatted" drives; and all drives are green.    IF that's the case, then browse your array a bit to confirm your data looks good on all drives (but do NOT do any writes).

 

If #1 looks fine, then you can simply Stop the array; assign parity; and then Start the array and let it do a new parity sync.    When that completes, do a parity check to confirm it went well ... and you're done.

 

(2)  If #1 had any issues -- i.e. an "unformatted" drive -- then redo the New Config, but this time use the "Trust Parity" option -- which will give you the ability to attempt a drive rebuild for the bad (unformatted) drive.    Note that if by any chance you have more than one bad drive, you can't do this ... so you'll need to do #3 below.

 

(3)  If you had more than one drive "bad" when trying #1, you'll need to revert to my earlier notes r.e. trying to read the drives externally and/or doing a Reiserfsck and attempting to repair them.
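The three cases above can be summarized as a small decision function. This is only a sketch of the logic described in this post: the status strings and the function name are invented for illustration, and nothing here talks to an actual unRAID system.

```python
def next_step(statuses):
    """Pick the next recovery step after a New Config WITHOUT parity assigned.

    statuses: data disk name -> "ok" or any other string (e.g. "unformatted").
    The status values are hypothetical labels, not unRAID output.
    """
    bad = sorted(disk for disk, status in statuses.items() if status != "ok")
    if not bad:
        # Case (1): all drives green, nothing unformatted.
        return "assign parity, run a new parity sync, then a parity check"
    if len(bad) == 1:
        # Case (2): exactly one bad drive can be rebuilt from parity.
        return "redo New Config with Trust Parity and rebuild " + bad[0]
    # Case (3): single parity cannot rebuild more than one drive.
    return "read the drives externally and/or attempt a reiserfsck repair"


print(next_step({"disk13": "ok", "disk14": "ok"}))
print(next_step({"disk13": "unformatted", "disk14": "ok"}))
print(next_step({"disk13": "unformatted", "disk14": "unformatted"}))
```

In every branch the earlier caveat still applies: do not write to the array while checking whether the data looks good.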

 

Based on the discussions we've had in this thread, I think there's an excellent chance that #1 will work just fine and you'll be completely recovered shortly after installing the new cage  :)

Link to comment

jeeze, now i am really confused (as if i wasn't before)...while removing drive cage #3 and unplugging the SATA cables from the back, i found that the front-most SATA cable had the actual horizontal prong *from* the case stuck in the horizontal slit of the male end of the cable...it had broken out from the female plug of the case, which would explain the flakiness of the connection(s).

 

*but* after i unplugged everything from the old cage and inserted the new cage, then connected all the cabling again, transferred all 5 drives and booted up the server again, it came back with a very unexpected condition, which really baffles me...it appears as if the cage i removed held the *first* five disks (1-5), not the last 5 (10-14 + parity)...see the attached screen-cap.

 

i must have gotten something really mixed up here, or maybe i inadvertently disconnected cage #1 in the process?

 

ok, back to the drawing board...i'm gonna have to take another careful look at what's going on, but just wanted to give a quick update of where i am, frazzling as it is.

 

[Attached image: unRAID-Main_08.jpg]

Link to comment

Take a deep breath!!!!  Stop for a moment.  If you look at the screen shots you have posted, most of them will give you all the information that you need: the serial number of every drive in the system.  Look at them.  You should be able to quickly identify which drives are in which cages.  (I have put little paste-on stickers, made from copy paper and transparent tape, on each opening identifying the disk number of the drive inside.  This way when I have a problem, I know exactly what slot to open to get to that disk.)
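The same bookkeeping can be kept as a small table instead of (or alongside) stickers. The sketch below groups a list of problem disks by cage so it is obvious whether they share one. The cage numbers follow what has been reported in this thread (disks 1-5 in one cage; parity, disk 13 and disk 14 together in another); the dictionary and function names are hypothetical, and any drive not listed simply comes back as "unknown".

```python
from collections import defaultdict

# Disk -> cage, as far as this thread establishes it. Fill in the rest from
# the serial numbers shown in the WebGUI screen shots.
cage_of = {
    "disk1": 1, "disk2": 1, "disk3": 1, "disk4": 1, "disk5": 1,
    "parity": 3, "disk13": 3, "disk14": 3,
}


def group_by_cage(disks, cage_of):
    """Group disk names by the cage they sit in ("unknown" if not recorded)."""
    groups = defaultdict(list)
    for disk in disks:
        groups[cage_of.get(disk, "unknown")].append(disk)
    return dict(groups)


suspects = group_by_cage(["parity", "disk13", "disk14"], cage_of)
print(suspects)
```

If all the troublesome disks land under a single key, that cage is the first one to inspect.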

Link to comment

you are completely correct, Frank1940...that's what i did and i was indeed mistaken all along...what i thought to be cage 3 was actually cage 1 with disks 1-5...i was also able to find the reason why they all showed as missing...the new backplane is capable of housing both SAS and SATA storage, and when i plugged in the 5 SATA cable ends in the rather dark cavity that is my case, i instinctually plugged them all into the yellow females of the cage, not even seeing that there were an equal number of black ones in-between...just re-plugged them all and booted up the server again, and i am now seeing all drives in green, parity in blue.

 

*but* since this obviously doesn't solve my problem with the earlier disappearance and unformatted status of drives 14 and 13 respectively, i have to assume that cage 3 also got damaged during transport, and when i replace cage 3, i might as well replace cage 2, just to be safe...i messed up by not asking my friend who packaged this system to package the drives separately, so i must now pay the price of buying 3 backplanes...a €315 mistake...bummer, but in light of the fact that the data is likely to be ok, it's the *much* smaller bummer.

[Attached image: unRAID-Main_09.jpg]

Link to comment

At the moment, you seem to have all of the drives looking fine (except parity -- and that's likely just due to the initial "fiddling" you did with the system).

 

Since drives #13, #14, and parity are in the same cage, THAT is the cage you want to replace -- I wouldn't replace the others unless you have issues with drives in them as well.

 

At this point, I'd be sure your new cage replaces the cage that held those drives;  boot again to confirm all the drives look green; and then do the New Config I noted earlier -- i.e. all the data drives, but no parity.    My expectation is that you will then see ALL of your data with no problem if you scan through the disks.    As long as that's true, just Stop the array; assign parity; and Start it back up and let it do a parity sync.

 

Replacing the other cages certainly won't hurt anything -- but since all of your problems were with drives in the same cage, it's likely that's the only one that actually got damaged in shipping.

 

Link to comment

well, like i said, when i removed the front-most SATA cable from cage 1 (which i thought to be cage 3) the entire horizontal lip that was part of the front-most female SATA receptor of the drive cage came out with the male end of the cable, so cage 1 was bad too (just hasn't given me any problems just yet), and we know that cage 3 has issues from the previous problems it exhibited...so since i already had to replace cage 1 and know that i'll also have to replace cage 3, i decided to also replace cage 2 while i'm at it...chances aren't exactly slim that if cage 1 and 3 were bad, that 2 might have problems now or coming up as well...just trying to be vigilant about this, now that i am pulling this thing apart.

 

as a matter of fact, considering how many drives i've had to replace over the past 2-3 years due to the array marking them as faulty bc of too many write errors (fairly new drives even), i wonder whether damage to one or more of the cages hasn't already happened during one of the previous 2 cross-continental journeys, both of which were also done with all drives in the cages...never again, but now it's time to fix this thing and do it *right*!

Link to comment

I missed your note r.e. the lip coming out -- so yes, I absolutely agree that replacing all 3 cages is a good idea.  Clearly 2 are bad, so you may as well be pro-active and replace the 3rd (especially given its previous shipping history).

 

I suspect you won't ship this system with the drives in the cages again  :)

 

Link to comment
