[Solved] Can't transfer a big file to one of my shares.


dgaschk-

 

You said a single drive can be rebuilt if all other disks are readable.

 

Is a disk readable if it has pending sectors?

A pending sector cannot be read.

 

Then you go on to say, "the rebuild of disk4 might work." Unfortunately, that advice came after I had already started preclearing disk2. Or are you saying that it still might be possible to rebuild disk4 even though I started preclearing disk2?

Yes.

I'm still on step 2 of 10 in the preclearing process (copying zeros to remainder of disk to clear it). So I'll let it finish up.

 

Can someone please tell me how to recover from this sort of disaster?

 

Is there a way to list files on my server by disk? I know what's located in any given share, but how do I find out what folders and files are stored on each disk?

 

Does anyone else know anything about dd and could tell me how to use it?

 

Try to rebuild disk4 first. It may work. If not then we'll worry about dd.

Link to comment

Sorry to keep bugging you, but I'm still struggling to understand exactly what steps to follow.

 

Are you saying to go ahead and finish the preclear of disk2, then rebuild the array and hope that disk4 is rebuilt?

 

Or, should I finish disk2 preclear, rebuild array, then preclear disk4 and rebuild the array again?

 

 

Link to comment

Yes, before I took it offline and started preclearing disk2, parity was valid.

 

Several weeks ago I had performance issues. Based on that, I received the advice in this thread to run SMART tests on disk4. The results showed that disk4 had pending sectors. I was told to preclear this drive and then rebuild my array. The actual steps are detailed earlier in this thread.

 

This solved my performance problems, copying files to the server was no longer really slow.

 

The same performance issues reappeared several days ago. So based on the information in this thread I ran SMART reports on all my drives again. I discovered that disk2 and disk4 both reported pending sectors.

 

At this point I should have just waited for complete instructions on how to handle this problem. Instead, I went ahead and started the preclear process on disk2. Now I'm trying to figure out how to proceed and if I can actually recover the data from these drives.

 

Any help would be greatly appreciated.

 

Link to comment

You have no real choice here.

Re-assign disk4 and disk2, and let unRAID attempt to rebuild disk2.  (Re-assign disk4 only first and start the array with disk2 missing.  Then stop once more, assign disk2, and let unRAID reconstruct disk2.)

 

It may have an issue (and stop the reconstruction) when it gets to the un-readable sector on disk4, I'm really not sure.  If you are lucky, it will run to completion.

 

You are correct.  Running the pre-clear on both disks at the same time prevents reconstruction of either.  (And, if it makes you feel any better, it very securely wipes the disks clean, leaving them unreadable even by the most talented of data recovery professionals.)

 

Joe L.

 

 

Link to comment

Follow Joe's advice. You have no other choice except to forget about the contents of disk2.

 

I would suspect your power supply. There were 2 clear cases posted here of power supplies causing WD drives to show pending sector issues similar to what you're experiencing. The posters could replicate it with the old supply, and replacing the power supply with a quality unit made everything happy again.

 

You also have a supply with dual 12V outputs. You want a quality single 12V output supply. A Corsair is always a good choice. A good 400W or 500W would be fine for 6 drives.

 

Link to comment

At this point, only disk2 is unassigned. And, it is currently getting precleared. I haven't done anything to disk4.

 

I was planning to reassign disk2 and then rebuild my array. Are you suggesting that I should unassign disk4 first, before rebuilding?

 

Sorry for the confusion... I just want to make sure I don't screw this up.

Link to comment

If disk4 is still assigned, you should be able to access the disk2 contents by starting the array.  It will be emulated by parity and the other data disks even though unassigned.

 

Once you can start the array with just disk2 missing, and once its preclear has completed, you can re-assign disk2 and let unRAID reconstruct the contents onto it.

 

Joe L.
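
To illustrate why one missing disk can be emulated but two unreadable ones cannot, here is a minimal sketch of parity on a single byte position. It assumes plain XOR parity across the data disks; the variable names and the three byte values are made up for the example and are not taken from this array.

```shell
# Hypothetical contents of the same byte position on three data disks.
d1=$(( 0x5A ))
d2=$(( 0xC3 ))
d4=$(( 0x0F ))

# Parity holds the XOR of that byte across every data disk.
parity=$(( d1 ^ d2 ^ d4 ))

# With disk2 missing, its byte is recovered from parity plus the surviving disks.
rebuilt_d2=$(( parity ^ d1 ^ d4 ))

printf 'parity=0x%02X rebuilt_d2=0x%02X\n' "$parity" "$rebuilt_d2"
```

The recovery line needs a good value for every other disk. If the matching sector on disk4 cannot be read, there are two unknowns and only one parity value, so that position of disk2 cannot be reconstructed.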

Link to comment
And if you can see your data, I would immediately copy any high value data on the array to another physical disk, preferably to another machine until you can stabilize the array.
Link to comment

Is there a way to identify the specific files stored on disk2 and/or disk4? I know where the files are in regards to the shares, but not the drives.

 

When I first built my server, I only had 1 TB drives. Eventually, I started swapping them out for 2TB drives. Disk2 was primarily for my ripped movies, but after running out of space, added movies were probably stored on other drives.

 

How do I identify the movie files that were stored on disk2 and/or disk4?
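
One possible way to get such a list from a telnet session, sketched under the assumption that each data disk is mounted on its own at /mnt/diskN (for example /mnt/disk2) alongside the merged user shares. The function name `list_disk_files` and the output file in the usage note are made up for illustration.

```shell
# List every regular file below one disk's mount point, sorted by path.
# Assumption: data disks are mounted individually, e.g. /mnt/disk2.
list_disk_files() {
    disk_root="$1"
    if [ ! -d "$disk_root" ]; then
        echo "not a directory: $disk_root" >&2
        return 1
    fi
    find "$disk_root" -type f | sort
}
```

Running `list_disk_files /mnt/disk2 > /boot/disk2_files.txt` would save the list to the flash drive so it survives a reboot. While disk2 is being emulated, every path listed still has to be read through parity and the other disks.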

Link to comment

I'm currently backing up everything important on disk2 and disk4 to drives on other computers. This could take a while. Most of the time the copy fails. After waiting several minutes for a file to copy, I get an error message saying "There is a problem accessing \\tower\disk2". Then I'll pick the 'Try Again' button and it comes back with "Invalid file handle". Hitting 'Try Again' will restart the transfer. It usually works 1 out of 6 tries.

 

Meanwhile, I'm still waiting for the disk2 preclear to finish. It's only 58% through step 2 (copying zeros to remaining disk to clear it). Last night before going to bed, it was at 52 or 54%. Total elapsed time is 46 hours 37 minutes. I don't think it took nearly this long the last time I precleared a 2TB drive. However, since this is a WD WD20EARX drive, I used the -A option. Will that account for the longer time?

Link to comment

No, look in the syslog.  The disk is probably experiencing errors and constantly being reset by the OS.  The "-A" option would not affect the speed in any way.
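
A rough sketch of how to look for such errors, assuming the live syslog is at /var/log/syslog and that disk trouble shows up as kernel lines mentioning an ATA exception, a failed command, a link reset, or an I/O error. The function name and the exact patterns are illustrative guesses, not an exhaustive list.

```shell
# Print syslog lines that commonly accompany a struggling disk.
# $1 is the log file to scan (assumed default: /var/log/syslog).
disk_error_lines() {
    log_file="${1:-/var/log/syslog}"
    grep -E 'ata[0-9]+(\.[0-9]+)?: (exception|failed command|hard resetting link)|I/O error' "$log_file"
}
```

Something like `disk_error_lines | tail -n 20` would show the most recent matches. Many repeats against the same ataN port or the same device name would point at that one drive or its cabling.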
Link to comment

No. What does the SMART report for disk2 show?

Link to comment

I'm now at 64% for step 2 of 10.

 

The syslog is attached to this post. It looks like there are a lot of errors, but I don't know how to interpret them.

 

The SMART status report now shows a reduction in current pending sectors. It's gone down from 74 to 21. The raw read error rate has gone up a lot. What does that mean? The SMART report still shows offline uncorrectable at 1 and multi zone error rate at 1.

 

It looks like my syslog was too big to attach to this post, so I've provided this link:

 

http://we.tl/2aaDdRomjc

smart_disk2-2013-02-08.txt

Link to comment

Still waiting on the pre-clear to finish for Disk2. Last I checked last night, it was around 75% complete on task 10 of 10 (post-read). I monitor it with a telnet session (PuTTY) and had run it with 'screen'.  When I checked it this morning, my telnet session wasn't running. Luckily, I have unMenu running in another browser tab and could see that I am now 91% complete on the post-read. Otherwise, I wouldn't know where it was, because even when I reran the PuTTY telnet session and tried 'Ctrl-A n' to toggle through my different screens, the pre-clear progress would not appear. Screen wasn't even running, so I typed "screen" and the info message displayed, and then 'Ctrl-A n' just returned the message "No Other Window".

 

So, I'm curious... how do you display the progress of a pre-clear after losing your telnet session? I thought that was the whole purpose or advantage to screen. I must have done something wrong (obviously).

Link to comment

Not sure what you did here, but the preclear, if still running, must be associated with a session somehow. When you rerun PuTTY you can't just immediately try to toggle through your different screens. You have to run screen again with the -r switch to re-attach to an existing screen session. Here is a good screen tutorial
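
For reference, a small sketch of that re-attach step. `screen -ls` lists the existing sessions and `screen -r <id>` re-attaches to one; the helper below only picks the detached session ids out of the `screen -ls` listing. The helper name and the session name `preclear_disk2` are made up for illustration.

```shell
# Read `screen -ls` output on stdin and print the ids of detached sessions.
detached_sessions() {
    awk '/\(Detached\)/ { print $1 }'
}

# Typical flow (shown as comments so nothing runs by accident):
#   screen -S preclear_disk2         # start a named session, launch preclear inside it
#   (detach with Ctrl-A d, or lose the telnet connection)
#   screen -ls | detached_sessions   # find the session id
#   screen -r preclear_disk2         # re-attach and see the progress display again
```

If the session is still listed as attached after a dropped connection, `screen -d -r` detaches it from the dead terminal first and then re-attaches.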

Link to comment

Ah yes, that's what I forgot... screen -r. Thanks for reminding me. Next time. Anyhow, the pre-clear of disk2 finally finished. Total time 101:29:29. A pre-clear marathon.

 

The post pre-clear SMART status report shows:

 

current pending sectors = 10

offline uncorrectable = 1

multizone error rate = 11

ata error count = 18

 

overall-health self-assessment = passed.
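
As a sketch, most of these counters can be pulled out of the attribute table that `smartctl -A /dev/sdX` prints, assuming the usual ten-column layout with the attribute name in column 2 and the raw value in column 10. The function name is made up, and the ATA error count is left out because it comes from the error log section of the report rather than the attribute table.

```shell
# Read `smartctl -A` output on stdin and print name=raw_value
# for the three attributes discussed in this post.
smart_summary() {
    awk '$2 == "Current_Pending_Sector" ||
         $2 == "Offline_Uncorrectable"  ||
         $2 == "Multi_Zone_Error_Rate"  { print $2 "=" $10 }'
}
```

Used as `smartctl -A /dev/sde | smart_summary`, where /dev/sde is only a guess at the device name for disk2 based on the attachment name; saving the output after each preclear makes it easy to see whether the pending count is actually falling.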

 

The full SMART status report for disk2 is attached to this post.

 

Should I try running preclear on this disk again?

 

Meanwhile disk4 also has 1 pending sector. Eventually, I'll need to deal with that.

 

At the beginning of this episode, I ordered another 2 TB drive, a Western Digital WD20EURS. It arrived a few days ago. I haven't done anything with it yet. I could replace one of the drives that are having these pending sector problems, but based on the advice here, these drives might be OK. I could just save this drive as a backup. Or, I could add it to my array since I have room for 2 more drives in this case.

 

I also ordered a replacement power supply, a Corsair 500W PSU that seems to be one of the popular PSUs recommended in another thread on the Lime Tech forum. I'm still not sure if my existing PSU is causing these problems, and I'm not sure if there is a foolproof way to assess the status of my PSU. Anyhow, I'm planning to swap it out and see if the problems disappear. I'm expecting it to arrive early this coming week.

 

I could also run memtest to check the status of my RAM.

 

I'm still copying critical files off the server. So cannot move forward until I have these files backed up.

 

What would the experts here recommend, and in what order should I proceed?

smart-disk2_sde_2013-02-10.txt

Link to comment

How do I get the pending sectors down to zero?

 

I'm dealing with 2 drives with pending sectors, so I'm trying to find a strategy for moving forward.

 

Should I run another preclear on disk2?

Should I run preclear on disk4, the other drive with pending sectors?

 

Should I wait until I install my replacement PSU?

 

Should I try replacing the old disk2 with my newly purchased drive? If so, what can I do with this disk2 drive that still passes as a healthy drive? I'd like to get it replaced, but I'm not sure I can prove it is defective.

 

Sorry for all the questions, but I'm still not sure what steps will get me up and running again.

 

My server is still running, but I'm careful not to write anything to it.

Link to comment

Current pending sectors RAW-VALUE must be zero in order for the parity system to work. Do not use a disk with any pending sectors. Pending sectors will prevent the reconstruction of a failed drive.

 

How do I do this with 2 drives with pending sectors?

 

Do I just keep running preclear until they are down to zero???

Link to comment

The bigger issue is WHY the sectors are un-readable AFTER being re-written (and re-allocated).

 

If there is a constant stream of un-readable sectors AFTER being precleared, then the sectors must have been re-written in the "write" phase (either re-written in place or re-allocated) and then identified as un-readable (the checksum at the end of the sector not matching the contents of the sector) during the post-read phase.

 

The most likely cause is a gradual surface degradation of one of the platters.  (The dust particles in the drive keep the disk head from reading properly, and the surface gets worse as dust abrades it constantly.)  Or, it could easily be a noisy power supply voltage feeding the drive (either 5V or 12V) that causes the electronics to fail a read of a sector randomly.

 

If you put a drive through a preclear cycle and it has sectors pending re-allocation when done, odds are it should be RMA'd.  If you have more than one drive with this issue, I'd look at the power supply (including splitters, drive racks, trays, etc.) to make sure there are not too many poor quality (high resistance) connections.

 

If you wish to exercise the disk thoroughly, and it is NOT assigned to your array, try

badblocks -c 1024 -b 65536  -o /boot/badblocks_out.txt -svn /dev/sdX

It is the non-destructive read/write test.  It MUST NOT be used on a disk assigned to the unRAID array.  It will take a LONG time but will print its progress as it works.    Run it on the system console or under "screen" so it continues if your PC terminates its telnet session.

 

Joe L.

Link to comment

Will badblocks help me identify if it's a drive problem or PSU/electronics problem?

 

My new PSU should arrive any day. After installing, should I rerun the preclear on disk2? If it ends up with zero pending sectors, then maybe we've determined it was a bad PSU. If it still has pending sectors, then I get an RMA and swap it out with my spare.

 

Then, before I try rebuilding, I need to deal with the pending sectors of disk4.

Link to comment
