Make it impossible to format an emulated disk and/or do not allow replacing a disk with itself



What happened:

This is the second time I have noticed a user destroying their data (see another topic). Their disk lost its connection, so it was dropped from the array, and at the same time the disk's partition was corrupt.

 

First problem:

As a result, the user had an emulated disk without a partition (with a defective partition table), and the GUI offered to format the disk.

 

Second problem:

The user formatted the disk because they thought this was a necessary step to rebuild the data. Instead, it reset the file system and updated the parity accordingly.

 

Third problem:

Their caching rules applied. The mover moved new data to the freshly formatted, emulated disk. Important sectors holding data were now overwritten (the chances of recovering the data with third-party software shrank).

 

Fourth problem:

The user started a data rebuild onto the same disk that had been disconnected before (the hardware problem, such as a defective cable or SATA port, had been solved). After the rebuild finished, all of its original data had been replaced by a freshly formatted file system plus the data moved from the cache.

 

What should happen instead:

a) Do not offer to format an emulated disk. Instead, advise the user to repair the partition table.

b) Do not allow a "defective" disk to be replaced with itself. That way it is impossible to destroy the original data.
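
Proposal (a) could be sketched roughly like this. This is only an illustration of the rule, not Unraid code: `DiskState`, its fields, and `format_option` are hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass
class DiskState:
    """Hypothetical view of an array slot; not an actual Unraid structure."""
    name: str
    emulated: bool   # disk is disabled and its contents are emulated from parity
    mountable: bool  # partition table / file system could be read


def format_option(disk: DiskState) -> str:
    """Decide what the GUI should offer for an array slot."""
    if disk.mountable:
        return "none"            # nothing to format or repair
    if disk.emulated:
        return "advise_repair"   # never offer format on an emulated disk
    return "offer_format"        # a real, physically present, unformatted disk


print(format_option(DiskState("disk1", emulated=True, mountable=False)))
```

With this rule, an emulated disk with a broken partition table would lead to repair advice instead of the format checkbox.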

Link to comment
59 minutes ago, mgutt said:

a) Do not offer to format an emulated disk. Instead, advise the user to repair the partition table.

This sounds like it might be a good idea. It would not be trivial to implement, but I might see if I can work out a Pull Request that implements it as an alternative popup to the one currently shown when you elect to format a drive. I wonder, however, if anyone can come up with a Use Case where it WOULD make sense to allow formatting an emulated drive?

 

1 hour ago, mgutt said:

b) Do not allow a "defective" disk to be replaced with itself. That way it is impossible to destroy the original data.

 

There is frequently a need to do this, because it is often not the disk itself but something else that caused the drive to be disabled, so this is not such a good idea. I also think this would be much harder to detect in a sensible manner.

Link to comment
2 hours ago, mgutt said:

a) Do not offer to format an emulated disk. Instead, advise the user to repair the partition table.

 

Don't you have to tick the checkbox for formatting, after which there is also a further popup stating that "Formatting is never part of a recovery"?

Link to comment
28 minutes ago, Squid said:

Don't you have to tick the checkbox for formatting, after which there is also a further popup stating that "Formatting is never part of a recovery"?

Yes, but if we can make it even more proof against users taking the wrong action because they did not properly read (or understand) the popup, it would be a desirable change. To be honest, I do not see a really good reason why an emulated drive would ever need to be formatted, but the difficulty is in how to stop it (or at least make it harder than it is).

Link to comment
2 hours ago, Squid said:

Formatting is never part of a recovery

 

I have seen users stumble over that several times. Recovery vs. Rebuild vs. Formatting. To understand the problem one needs to think like an average user, like me.

 

A disk fails -> Panic. 

 

Fumbling with SATA cables, rebooting, swapping disks, rebooting, etc. At some point the old disk shows up again and the GUI offers to format it. The text is blue. Blue is good.


Many users, especially new users, see parity as a backup disk. They think formatting the data disk is no problem. They think parity still holds the data. They do not understand that formatting a data disk updates the parity as well. I have seen that several times.

 

Whatever can be done to harden that process and make it more foolproof is highly appreciated. An emulated disk, even if it has been recognized again or reported good after a reboot, should never be offered the formatting option. I second the request from @mgutt.

 

Link to comment

Slightly off-topic here, but I feel the Main tab UI towards the bottom could do with some work.

There's a lot of very important info that feels a little like it's been bolted together over time, without much thought to cohesiveness.

 

Maybe something like a dedicated information box - maybe it's empty most of the time, but gets populated with important info when something changes:

"Encryption key is missing"
"Disk X has failed, and is being emulated"

"Formatting will update parity information. Previously emulated data will be lost"

etc...

Edited by -Daedalus
Link to comment
7 hours ago, hawihoney said:

 

I have seen users stumble over that several times. Recovery vs. Rebuild vs. Formatting. To understand the problem one needs to think like an average user, like me.

 

A disk fails -> Panic. 

 

Fumbling with SATA cables, rebooting, swapping disks, rebooting, etc. At some point the old disk shows up again and the GUI offers to format it. The text is blue. Blue is good.


Many users, especially new users, see parity as a backup disk. They think formatting the data disk is no problem. They think parity still holds the data. They do not understand that formatting a data disk updates the parity as well. I have seen that several times.

 

Whatever can be done to harden that process and make it more foolproof is highly appreciated. An emulated disk, even if it has been recognized again or reported good after a reboot, should never be offered the formatting option. I second the request from @mgutt.

 

 

Hi there, I don't know if I am allowed to write something here, but this was exactly my case. I panicked and thought: thank god I have a parity disk, where all my recovery data is stored.

 

I thought formatting the disk, then adding it as a new, freshly formatted one and starting the data rebuild was the way to go. I had never heard that formatting actually updates the parity??

 

I thought, and I think most people think the same, that the parity disk is only updated when a parity sync is started and completed; basically, that the parity disk is only touched while a parity sync is active. I scheduled this weekly, every Sunday, so I thought: maybe I lose all the data I wrote to my disks this week, but everything before last Sunday is stored on my parity.

 

It would be nice to have that information somewhere, so that you know that corrections are always written to the parity disk, and not only when a parity sync is started. But after that, this one question pops up in my head: Why do I even have to schedule parity sync then?

 

 

 

 

Link to comment
11 minutes ago, EricM said:

Why do I even have to schedule parity sync then?

Technically you don't. But if you look at the wording, nowhere does it say "sync". It says "Check", and it is simply a check (usually run monthly during off hours) to make sure that everything is good to go and a drive can be rebuilt if the need arises.

Link to comment
Just now, Squid said:

Technically you don't. But if you look at the wording, nowhere does it say "sync". It says "Check", and it is simply a check (usually run monthly during off hours) to make sure that everything is good to go and a drive can be rebuilt if the need arises.

 

So let's say even though I never ran a parity check, my parity should still be up to date?

 

So now I get what I have misunderstood for 2 years, but I still think I am not the only one thinking like this.

Link to comment
3 hours ago, EricM said:

now I get what I have misunderstood for 2 years

I think there may be several things misunderstood.

 

Unraid parity is updated in real time whenever any write occurs on the array. So, parity should always be in sync and ready to rebuild a failed disk. Real-time parity updates are the reason writing to the parity array is slower than writing to a single disk.

 

Except for simple file reads, everything is a write. Writing a file, moving a file, copying a file, deleting a file. These are all writes. And formatting a disk is also a write. It writes an empty filesystem to the disk, and parity is updated.
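
To make "every write updates parity" concrete, here is a minimal sketch. It assumes single parity computed as a plain bytewise XOR across the data disks; the names `xor_bytes` and `write_block` are made up for the illustration and are not Unraid's driver code, and a block of zeros merely stands in for an empty filesystem.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def write_block(disks, parity, disk_index, new_data):
    """Write new_data to one disk and return the parity updated in the same step."""
    old_data = disks[disk_index]
    # Remove the old block's contribution from parity, then add the new one.
    new_parity = xor_bytes(xor_bytes(parity, old_data), new_data)
    disks[disk_index] = new_data
    return new_parity


disks = [b"\x11\x22", b"\x0f\xf0", b"\xaa\x55"]

# Initial parity is the XOR of all data disks.
parity = b"\x00\x00"
for d in disks:
    parity = xor_bytes(parity, d)

# "Formatting" disk 1 is just another write, so parity follows immediately.
parity = write_block(disks, parity, 1, b"\x00\x00")
print(parity.hex())
```

After the write, parity only reflects what is on the disks now; the old contents of disk 1 are no longer recoverable from it.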

 

Also, parity doesn't contain any of your data. Obviously parity can't possibly have the capacity to be a backup of any and all disks in the array.

 

Parity by itself can recover nothing. Parity PLUS all other disks are required to rebuild a missing disk.

 

And, parity is not a substitute for backups. You must always have another copy of anything important and irreplaceable.
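
A small sketch of why parity alone recovers nothing, and why formatting an emulated disk is fatal. It again assumes single parity as a bytewise XOR over all data disks; `xor_blocks` and the sample contents are invented for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR across any number of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"cat!", b"dogs", b"bird"]
parity = xor_blocks(data)

# Disk 1 drops out: its contents are emulated from parity PLUS all other disks.
emulated = xor_blocks([parity, data[0], data[2]])
print(emulated)

# Formatting the emulated disk writes an empty filesystem "through" parity
# (zeros stand in for the empty filesystem here).
formatted = b"\x00\x00\x00\x00"
parity = xor_blocks([formatted, data[0], data[2]])

# A rebuild now faithfully reproduces the formatted disk, not the old data.
rebuilt = xor_blocks([parity, data[0], data[2]])
print(rebuilt)
```

The rebuild does exactly what it is supposed to do; it just rebuilds the state that parity describes after the format.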

Link to comment
3 hours ago, EricM said:

I had never heard that formatting actually updates the parity??

The pop up you would have got when you ticked the format box would have explicitly told you that was going to happen.

 

Having said that I understand that in a panic people have been known to blindly click the OK option without carefully reading the warning message, so making it even harder to ignore makes sense.

Link to comment
3 hours ago, EricM said:

But after that, this one question pops up in my head: Why do I even have to schedule parity sync then?

Parity is always updated in real time, which is one of the reasons having parity drives slows down write performance.

 

The scheduled parity check is basically a housekeeping task to confirm that parity DOES agree with the current data drives, because if it ever gets out of sync for any reason, that compromises recovery from any future drive failure. Scheduling it weekly seems excessive; monthly or quarterly checks are much more common.
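
Conceptually, such a check can be sketched as follows. This is a simplified illustration that assumes single XOR parity; `parity_check` is a hypothetical helper, not the actual Unraid implementation.

```python
def parity_check(disks, parity):
    """Return the byte offsets where stored parity disagrees with the data disks."""
    mismatches = []
    for offset in range(len(parity)):
        expected = 0
        for disk in disks:
            expected ^= disk[offset]  # every disk is read, even an empty one
        if expected != parity[offset]:
            mismatches.append(offset)
    return mismatches


disks = [bytes([1, 2, 3, 4]), bytes([5, 6, 7, 8]), bytes([0, 0, 0, 0])]
good_parity = bytes(a ^ b ^ c for a, b, c in zip(*disks))

# Simulate parity that went stale at offset 2 (e.g. after an unclean shutdown).
stale_parity = bytearray(good_parity)
stale_parity[2] ^= 0xFF

print(parity_check(disks, good_parity))
print(parity_check(disks, bytes(stale_parity)))
```

The check writes no user data; it only reads every disk end to end and reports where parity and data disagree.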

Link to comment
3 hours ago, EricM said:

So let's say even though I never ran a parity check, my parity should still be up to date?

Yes, although that requires some assumptions about the stability of your equipment. If you have an unclean shutdown, you probably have at least a few bytes that didn't get updated properly. That's why an unclean shutdown is always followed by a mandatory parity check on the next boot.

 

Another issue is a drive that is hardly ever read going bad silently in the background. Because Unraid spins down drives until they are needed, you can conceivably go months between spinning up a drive, and you wouldn't know it was bad until it was too late. A regularly scheduled parity check ensures that the full capacity of all drives is able to be read error free, ensuring that an unexpected drive failure should be able to be rebuilt. Even completely "empty" drives are fully used end to end in the parity computation, so having one completely empty drive fail while rebuilding a different disk will cause data loss.

Link to comment