
I've had lots of drive issues the last few weeks, unlucky or underlying problem?


JustinChase


I am going to go purchase another new drive, and replace disk4 which is just spitting up errors no matter which cable or port it's plugged into.  The parity sync will take over a month at the rate it's able to go with this disk problem.  Hopefully, this will be the last disk I have to purchase for a good long while, and will allow the parity sync to finish in a reasonable time.

 

If that happens, and if the server continues to run without any more issues for a few days (fingers crossed), I'll chalk it up to the PS and the video card.  If I continue to have issues, I'll swap power supplies and see how that goes.

 

I didn't have any issues with any of this until I moved back from Mexico with the server in the car.  Then I got my first red-ball, and it's been a shitstorm of problems ever since.  Probably a month at least.

Link to comment

Okay, I really don't know how to proceed now.

 

I just went and purchased a new 3TB disk (Best Buy didn't have any 4TB in stock).

 

I removed disk4 which was constantly resetting the connection, and was causing the parity rebuild to take from 19 to 49 days, depending on how slow the process was going, from 500KB/sec to 1MB/sec.
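For a rough sense of how rates like these turn into weeks, here is a minimal sketch of the arithmetic. The 2 TB "remaining" figure is an assumption for illustration only (the post does not say how much of the sync was left), and `eta_days` is a hypothetical helper name, not anything from unRAID.

```python
SECONDS_PER_DAY = 86_400
TB = 10**12  # decimal terabyte, as drive vendors count it
MB = 10**6
KB = 10**3


def eta_days(remaining_bytes: float, bytes_per_sec: float) -> float:
    """Days needed to process remaining_bytes at a steady rate."""
    if bytes_per_sec <= 0:
        raise ValueError("rate must be positive")
    return remaining_bytes / bytes_per_sec / SECONDS_PER_DAY


# Assumed workload: 2 TB still to process (illustrative, not from the post).
for rate in (500 * KB, 1 * MB):
    print(f"{rate / MB:.1f} MB/s -> {eta_days(2 * TB, rate):.1f} days")
```

At the roughly 60 MB/sec a healthy link can sustain, the same amount of data would take hours rather than weeks, which is why a single misbehaving disk dominates the estimate.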

 

But now unRAID thinks there are too many bad/failed/missing disks to start the array, presumably because it considers both parity and the drive I just replaced to be bad.

 

I see 2 terrible options.

 

1. Put the old/failing disk back and wait a month for parity to rebuild.

2. Do a new config, and lose the 1.7TB of data that was on the failing disk.

 

Ideas?

Link to comment

That might have been the problem.  Everything probably got bounced around inside the case.
Link to comment

Not as bad as you might think, it was sitting on 4 pillows, but still, not great.

 

Regardless, I can't undo that.  All I can do is try to fix it from here, which is going VERY poorly!

Link to comment

Are you able to mount the 4TB drive under Windows using the Linux driver (I forget the exact name of the tool off the top of my head)? If so, you could start up with the new 3TB drive, pull the data off the old drive outside of unRAID, and then copy it back to the array.

 

Link to comment

good idea, I had a similar thought.

 

I decided to try one more thing.  I hooked up the old 2TB drive that was acting up, the one I had removed.  I used the brand new SATA cable from the new 3TB drive, and a SATA power to old-style 4-prong power adapter to hook it up to one of the spare power cables I have in the case (I don't have any more SATA power cables available).

 

I figured I'd just copy anything important off of that drive, even slowly, to make sure I didn't lose anything too important.

 

That worked, and I'm currently copying at about 60MB/sec to one of the array drives with sufficient space on it.

 

I assume at some point that will come to a crawl again, but I've copied all the stuff that would be hard to replace, so if/when it slows way down, I'm okay with losing what's left on it, and re-downloading what I lose.  Hopefully it works okay for the next 12 hours, and I can get it all off of there, but we'll see.  syslog already shows some 'unable to read from sector xxxxx' errors, so the drive is clearly on its last legs.
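A copy like this can be sketched as a loop that keeps going when an individual file hits a read error instead of aborting the whole run. This is only a minimal illustration, not what was actually used in this thread; `salvage_tree` is a hypothetical helper and the mount points in the usage comment are assumed names.

```python
import shutil
from pathlib import Path


def salvage_tree(src: Path, dst: Path) -> list:
    """Copy every file under src into dst, skipping files that fail.

    Returns the source paths that could not be copied.
    """
    failed = []
    for path in sorted(src.rglob("*")):
        if not path.is_file():
            continue
        target = dst / path.relative_to(src)
        try:
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copies data plus timestamps
        except OSError as exc:
            # A bad sector usually surfaces as an I/O error on read.
            print(f"skipped {path}: {exc}")
            target.unlink(missing_ok=True)  # drop any partial copy
            failed.append(path)
    return failed


# Hypothetical usage (both paths are assumptions, adjust to your mounts):
# failed = salvage_tree(Path("/mnt/olddisk"), Path("/mnt/disk3/salvaged"))
```

Copying the hard-to-replace folders first, as described above, still makes sense: the list of failed paths then only covers data that can be re-downloaded.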

 

Once I have to abandon that drive, I'll have to just start with a new config, and let it rebuild parity from what exists at the time.  If/when that happens, I'll do as you suggest and put it in a Windows machine to see if I can get any more data off of it.  Even slowly, it's probably still faster than downloading it all again.

 

Fingers crossed.

Link to comment

Good luck. :)

 

Link to comment

Corsair RM Series 1000 Watt ATX/EPS 80PLUS Gold-Certified Power Supply - CP-9020062-NA RM1000

 

I am running 20 hard drives and haven't had any issues, and it is only $149 and it is modular...  1000W should easily handle the 12 drives..

 

I just ordered the 750 Watt version of that PS for $110.  I don't think I'll ever go past 12 drives, since I don't have much/any more room, and honestly, 12 drives has been enough of a PITA for me so far.  I'll probably just put bigger ones in as necessary, if I decide to even keep that damn thing.

 

It looks like it only comes with 2 cables with 3 SATA power connectors each, meaning I can only power 6 drives with the cables it comes with.  Did you order more of these cables, or does it come with more, or can I use the cables on my current modular PS (assuming it's not these cables that are the problem in the first place)?

 

It should be here on Saturday.  We'll see how things go then.

Link to comment
"can I use the cables on my current modular PS"

Unless you are comfortable with the terms volt-ohm meter, continuity tester, and repinning, NO!!!

 

Modular power supplies do NOT follow any sort of standard, because there isn't one for the power supply end of the cable. The connectors may look and fit the same, but I know for a fact they aren't all pinned the same. You could get lucky, but with your current track record with Murphy, I wouldn't count on it.

Link to comment

Okay, the parity check finished, and both the parity drive and disk 4 reported errors.  I've rebooted a few times today, trying to fix a different issue with Tom, so there are currently no errors reported.

 

I've attached the SMART report for disk 4 and the parity disk.  It looks to me like they both have pending sectors, which I assume I need to fix somehow.  I believe a pre-clear will do that, but I don't have a 4TB drive to swap for the parity, so I'll have to shut down the server for a couple days to pre-clear the parity drive, if that's my only course of action.

 

Is there any other way to check/fix these drives?
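For keeping an eye on the counters mentioned above, a small sketch that pulls the sector-related raw values out of the attribute table printed by `smartctl -A /dev/sdX`. The sample text is made up for illustration and is NOT the attached report; `watched_raw_values` is a hypothetical helper, and the column layout is assumed to be the usual ten-column ATA attribute table.

```python
WATCHED = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
}


def watched_raw_values(smart_text: str) -> dict:
    """Return {attribute_name: raw_value} for the watched attributes."""
    found = {}
    for line in smart_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID; RAW_VALUE is column 10.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            found[fields[1]] = int(fields[9])
    return found


# Made-up sample in the usual `smartctl -A` layout (illustrative only).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       8
"""

for name, raw in watched_raw_values(SAMPLE).items():
    print(f"{name}: {raw}")
```

Running this against saved reports before and after a full write pass would show whether the pending count actually drops or the sectors end up reallocated instead.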

Link to comment

I do have one, but not sure how good it is.. :)

 

I think when you write to the location of the pending sector it corrects it.  So in theory if you did a full parity build, it should correct the parity drive.  From what I read, it means the block has unstable contents and a write would correct it.  I have only had one and the preclear fixes it, but I would assume the build would write to every block as well and correct it, but you would be without the parity protection during the rebuild.

Link to comment

Archived

This topic is now archived and is closed to further replies.
