Failed disk advice needed please



So I had a disk red-ball. I had enough free space on the other drives in the array to copy all the data off the protected (emulated) disk. During the copy, 2 files failed to transfer with errors. It turns out one of the other disks in the array started having unrecoverable read errors, and its health now shows some sectors that need to be reallocated. So I wanted to ask the following. I am not going to rebuild the failed disk; as far as unRAID is concerned there is no data on it anyway. I know my parity is now bad given that one disk has read errors, so what should I do to get back to good? I need a new disk, as I don't have enough free space to copy off the data from the now-failing disk. Do I get a disk pre-cleared, do a new config with all the data drives (including the bad one) in the array and parity not yet assigned, copy all the data off the now-failing disk, and once done remove the bad disk and then add parity for a rebuild?

Link to comment
Do I get a disk pre-cleared, do a new config with all the data drives (including the bad one) in the array and parity not yet assigned, copy all the data off the now-failing disk, and once done remove the bad disk and then add parity for a rebuild?


Seems like your best option in this case.
Link to comment
  • 2 weeks later...

I thought I would do the change a slightly different way, but unRAID didn't like it. I did a two-pass preclear on a disk, then formatted it as XFS and mounted it with Unassigned Devices (UD). I copied the data off the bad drive and only lost two files to read errors. I thought I would be able to drop the new disk into the array, but it came up as unmountable and needing a format. So did I do something wrong, or can I not put a disk into the array that unRAID hasn't formatted itself? When I put the UD-formatted disk in the array I did a new config and had no parity assigned at the time. I have the bad disk mounted in UD and will copy the files back into the array again.

Link to comment
51 minutes ago, mejutty said:

I thought I would be able to drop the new disk into the array, but it came up as unmountable and needing a format. So did I do something wrong, or can I not put a disk into the array that unRAID hasn't formatted itself?

 

UD partitions a disk with a different starting sector than unRAID, so it won't mount in the array.

 

You can manually partition the disk using 64 as the starting sector; then you can format it with UD and it will be recognized as an array disk later.

Link to comment

I like the way you approached the recovery. When I have a disk fail, the first thing I try to do is copy all critical data from the simulated disk to a safe place. A cache disk is one good target, and workstation disks are another. Empty space on my array is also an option, but I prefer to access the array as little as possible while it is unprotected. This more or less guarantees that if things go bad, I will have rescued those specific files. (When a disk fails, you are dependent on every other disk in the array. So if you have 12 disks, you are roughly 11x more likely to lose data than with a single regular disk. Still long odds, but if a disk drops out of an aging array, your chances of another failure go up hugely. A second drive dropping makes the data on the first disk AND the second disk inaccessible, so copying the data off in priority order is important.) When my critical files are safe, I feel more confident doing a rebuild.

 

But disks that drop out of an array have seldom actually failed, and they can often be mounted and have data extracted from them. This may be a technique for restoring the couple of files that could not be recovered.

 

I have been searching for years for a foolproof way to partition and format a disk just like unRAID does. Johnnie, if you have a set of commands for that, I would appreciate them and will even test and verify. But I have typically booted up an unRAID USB stick with a fresh configuration, added the disks I want to format, and let unRAID format them for me. Once done, I can boot back into my full configuration and my disks are formatted "the right way" for later inclusion in an array.
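
The 12-disk estimate earlier in this post can be made a bit more concrete. Below is a minimal sketch, assuming every surviving disk has the same independent chance of failing during the unprotected window. The function name `survivor_risk` and the 1% figure are hypothetical, chosen only for illustration; they are not measured failure rates.

```shell
# survivor_risk P N
#   P: assumed chance that one disk fails during the unprotected window
#      (hypothetical value, not a measured failure rate)
#   N: total number of disks in the array, including the one already down
# Prints the chance that at least one of the N-1 remaining disks fails,
# assuming independent failures: 1 - (1 - P)^(N-1)
survivor_risk() {
    awk -v p="$1" -v n="$2" 'BEGIN { printf "%.4f\n", 1 - (1 - p) ^ (n - 1) }'
}

survivor_risk 0.01 12   # 12-disk array, 1% assumed per-disk risk
```

For small per-disk probabilities the result stays close to (N-1) times P, which is where the "11x" rule of thumb comes from. Disks of the same age in one array are probably not independent, so the real figure for an aging array may well be worse.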

 

Link to comment
29 minutes ago, bjp999 said:

I have been searching for years for a foolproof way to partition and format a disk just like unRAID does. Johnnie, if you have a set of commands for that, I would appreciate them and will even test and verify

 

You can use these. The disk should be cleared first, i.e., without any partitions.

 

For GPT disks (>2TB) use:

sgdisk -o -a 64 -n 1:64:0 /dev/sdX

You can also use GPT for 2TB and smaller disks and unRAID will accept those partitions, but to use MBR like unRAID does, you can use sfdisk:

 

EDIT: while unRAID accepts GPT partitions on smaller disks, it's best to use MBR for all 2TB and smaller disks, just in case.

 

sfdisk /dev/sdX

 

Then type the following lines in order, pressing Enter at the end of each line:

 

64
write
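
After partitioning, it may be worth confirming where the partition starts before copying any data onto the disk. The sketch below is read-only and assumes a Linux system where sysfs exposes the partition start (in 512-byte units) at `/sys/block/<disk>/<partition>/start`. The helper name `starts_at_64` is hypothetical, and `sdX` is a placeholder just as in the commands above.

```shell
# starts_at_64 START_FILE
# Reads a partition's start sector from a sysfs "start" file and reports
# whether it is 64, the starting sector given in this post.
# Exit status: 0 = starts at 64, 1 = some other sector, 2 = file unreadable.
starts_at_64() {
    start_file="$1"
    if [ ! -r "$start_file" ]; then
        echo "cannot read $start_file" >&2
        return 2
    fi
    start=$(cat "$start_file")
    if [ "$start" -eq 64 ]; then
        echo "start sector $start: matches the layout described here"
    else
        echo "start sector $start: not 64, the array may report the disk as unmountable"
        return 1
    fi
}

# Example (replace sdX with the real device name):
#   starts_at_64 /sys/block/sdX/sdX1/start
```

This only checks the starting sector; it does not verify the partition type or the filesystem.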

 

 

Edited by johnnie.black
Link to comment
1 hour ago, bjp999 said:

I like the way you approached the recovery. When I have a disk fail, the first thing I try to do is copy all critical data from the simulated disk to a safe place. A cache disk is one good target, and workstation disks are another. Empty space on my array is also an option, but I prefer to access the array as little as possible while it is unprotected. This more or less guarantees that if things go bad, I will have rescued those specific files. (When a disk fails, you are dependent on every other disk in the array. So if you have 12 disks, you are roughly 11x more likely to lose data than with a single regular disk. Still long odds, but if a disk drops out of an aging array, your chances of another failure go up hugely. A second drive dropping makes the data on the first disk AND the second disk inaccessible, so copying the data off in priority order is important.) When my critical files are safe, I feel more confident doing a rebuild.

But disks that drop out of an array have seldom actually failed, and they can often be mounted and have data extracted from them. This may be a technique for restoring the couple of files that could not be recovered.

I have been searching for years for a foolproof way to partition and format a disk just like unRAID does. Johnnie, if you have a set of commands for that, I would appreciate them and will even test and verify. But I have typically booted up an unRAID USB stick with a fresh configuration, added the disks I want to format, and let unRAID format them for me. Once done, I can boot back into my full configuration and my disks are formatted "the right way" for later inclusion in an array.

 

The best approach is to have backups of anything you can't afford to lose. If you still want to copy files from the emulated disk (which will actually read from all the other disks, as mentioned), don't write them back to the array; write them somewhere else, such as cache, an unassigned disk, or another computer. That way you avoid changing parity, and you can then decide how best to proceed if a different disk has a problem.
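
A copy like this can be scripted so that it keeps going past read errors and records which files failed, instead of stopping at the first bad file. The following is a minimal sketch using only `find`, `mkdir` and `cp`. The function name `copy_with_log` and the example paths are assumptions for illustration, so check the actual mount points on your own server before running anything.

```shell
# copy_with_log SRC DST FAILED_LIST
# Copies every regular file under SRC to the same relative path under DST.
# A file that cannot be copied (for example because of a read error) is
# skipped and its relative path is appended to FAILED_LIST.
# Note: file names containing newlines are not handled by this sketch.
copy_with_log() {
    src="$1"; dst="$2"; failed="$3"
    : > "$failed"                      # start with an empty failure list
    ( cd "$src" && find . -type f ) | while IFS= read -r f; do
        mkdir -p "$dst/$(dirname "$f")"
        cp -p "$src/$f" "$dst/$f" 2>/dev/null || echo "$f" >> "$failed"
    done
}

# Example with hypothetical paths (emulated disk to an unassigned disk):
#   copy_with_log /mnt/disk8 /mnt/disks/rescue /boot/failed_files.txt
```

The resulting list shows exactly which files still need to be restored from a backup or from the original disk if it can be mounted.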

Link to comment
3 hours ago, trurl said:

The best approach is to have backups of anything you can't afford to lose. If you still want to copy files from the emulated disk (which will actually read from all the other disks, as mentioned), don't write them back to the array; write them somewhere else, such as cache, an unassigned disk, or another computer. That way you avoid changing parity, and you can then decide how best to proceed if a different disk has a problem.

 

I think we're preaching to the choir and saying similar things.

 

Regarding backups, you are 100% right about the need, but many people (me included) back up mainly unique works (photos, home movies, documents, etc.). Non-unique works can be recovered or reconstructed (e.g., by rescanning Blu-rays), but that would require quite a lot of time and energy. So the choice of what a person backs up remains a somewhat complex formula of risk, cost, time, and criticality. In general, with a few exceptions, people don't have mirror backups of everything. That makes recovery of the primary storage a priority for everyone, with speed and convenience taking a back seat to avoiding data loss, IMO.

 

I tend to assume that a person does not have good backups, having learned that this is the most likely case. People who do usually say so when they post, and their posts lack that desperate tone.

 

Talking about risk, below are 7 things users should think about to lower the risk of loss. Whether that impacts backup strategy is up to each user, but if you do the following, I'd estimate the risk of loss is reduced by 90%, probably more like 99%:

1 - Using locking cables and drive cages. These help avoid the most common problems that users have, which we see nearly daily.

2 - Learning enough about how unRAID works to recover from common issues

3 - Being smart shoppers and buying drives with good reliability ratings

4 - When bumps in the road occur, posting in the forums to get confirmation of recovery steps BEFORE putting two bullets in each foot

5 - Understanding the SMART attributes and monitoring them frequently (at least turning on the notifications). The thing people don't quite get is that the attributes are like a thermometer. If you are running a high fever, it doesn't usually mean you just need Tylenol. Too often people think SMART issues are incidents to be overcome, and not symptoms of bigger problems. My rule of thumb: newer drives have incidents, older drives have symptoms.

6 - Staggering disk purchases. I tend to buy 2 identical drives at the same time so my drives have diversity in age and capacity.

7 - Incrementally rotating out old disks and replacing them with fresh ones. At 40,000+ power-on hours [for 24x7 users] you really should be thinking about replacement. If a drive gets there or beyond with SMART attributes in good shape, you made a great purchase and should be thinking about transitioning it out and using it for backups. It's probably pretty small compared to current "sweet spot" sizes. My personal approach is that, together with the buying pattern of 2 at a time, I pull out the two oldest (and smallest) drives and replace them with two new and bigger drives. If I do that every year or two, I'm keeping my disks fresh and adding new space. I personally think a person's array should be larger (more slots) to allow this type of steady-state replenishment plan. Those with no expansion capability who try to maintain monolithic drive sizes in search of the fastest parity checks are costing themselves money and working against a steady-state array maintenance process.
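
The power-on-hours figure in item 7 can be read from the SMART attribute table. Here is a small sketch that assumes smartmontools is installed and that the drive reports an ATA-style attribute table via `smartctl -A`, with `Power_On_Hours` in the second column and the raw value in the tenth. The helper names `power_on_hours` and `past_threshold` are hypothetical, and NVMe or SAS drives report this differently.

```shell
# power_on_hours
# Reads `smartctl -A` output on stdin and prints the raw Power_On_Hours
# value. "+ 0" keeps only the leading number, since some drives report
# raw values such as "12345h+10m".
power_on_hours() {
    awk '$2 == "Power_On_Hours" { print $10 + 0; exit }'
}

# past_threshold HOURS [LIMIT]
# Succeeds (exit 0) if HOURS has reached LIMIT. LIMIT defaults to 40000,
# the figure suggested in item 7.
past_threshold() {
    hours="$1"
    limit="${2:-40000}"
    [ "$hours" -ge "$limit" ]
}

# Example (needs root; /dev/sdX is a placeholder):
#   hours=$(smartctl -A /dev/sdX | power_on_hours)
#   past_threshold "$hours" && echo "sdX: $hours hours, plan a replacement"
```

If the attribute is missing, `power_on_hours` prints nothing, so it is worth checking that the value is non-empty before comparing it.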

Link to comment

And if you aren't going to do this:

11 minutes ago, bjp999 said:

Learning enough about how unRAID works to recover from common issues

 

Then you absolutely must do this:

13 minutes ago, bjp999 said:

When bumps in the road occur, posting in the forums to get confirmation of recovery steps BEFORE putting two bullets in each foot

 

I have seen way too many people with a problem that unRAID is well designed to recover from, but they turn it into something much worse because they tried to fix it without knowing or asking what to do.

Link to comment

I'll reply to some of the questions. trurl, if you read my original post, I had a failed disk as well as a disk that started throwing read errors and unrecoverable reallocated sectors. Disk 8 was the failed disk in my array, and I copied the simulated disk 8 data off onto other disks. I knew that a rebuild from parity was not going to work given the errors. The next part was using a precleared disk formatted in UD, onto which I copied the data from disk 2, the one with the sector read errors. All 3TB of data copied across apart from 2 files. At this stage I had not done anything with the array, knowing I was going to end up needing a parity rebuild. So once disk 2's data was copied off, I did a new config and auto-assigned the cache and parity. Then I assigned all the other data disks, minus disk 8 (which had no data), with disk 2 replaced by the UD-formatted disk holding all of disk 2's data. As pointed out, unRAID did not like the fact that the partition's starting sector was not 64. This might be something UD could build in, as I see what I did as a valid recovery. Anyway, bad disk 2 is mounted in UD and I am now copying the data off it back into the array, which is parity protected again now that parity has rebuilt. So I had 1 disk fail and 1 disk start throwing bad sectors, and I only lost 2 files, which I know will not copy across but which I can obtain again.

Link to comment
