• [6.8.2] Array Start "Bug" or What?


    Joseph
    • Closed

    Hello unRAIDers,

     

    I think there might be a bug with how unRAID starts the array after a disk has been compromised. My apologies if this has already been addressed. Full disclosure: what I did would be considered reckless by any rational person who values their data, so don't try this on a box with data you care about.

     

    Steps to reproduce:

     

    1. Clean shut down of unRAID v6.8.2

    2. Pull a drive out and reformat/repartition it for temporary use on another computer.

    3. When you're ready, put the same drive back into the array and power on. On the main page, unRAID doesn't note that something has drastically changed with the drive and gives the "green light" that the array is ok to start -- No warnings or anything to indicate there's an issue.

    4. Not willing to take any chances, I pulled the drive out and let the array start with the missing drive.

    5. When I stopped the array and added the drive back in, unRAID then understood that the drive contents needed to be rebuilt from parity.

     

    I doubt anyone would do anything as boneheaded as this, and I'm not sure what would have happened had I started the array at Step 3; but maybe there should be some sort of check or a message to the user that the data on the disk has been compromised and needs to be rebuilt?

     

    Anyways, hope this helps.

     




    User Feedback

    Recommended Comments

    Problem here is that you did this with the system stopped / shutdown.  unRaid can't tell that you've messed around with the drive unless it does a parity check.   As far as it's concerned there is no problem.  (And because of how unRaid works with no striping, then effectively there is no problem).

     

    Had you started the array, everything would go on its merry way.  When you did a correcting parity check, the system would have fixed itself reflecting the current contents of that drive.

     

    If one of the data disks died before a correcting check was done, however, the rebuild would run to completion, but the calculated data would be completely wrong. I.e., you would trash the rebuilt drive.
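
To make that failure mode concrete, here is a minimal sketch. It assumes single parity behaves like a byte-wise XOR across the data disks, which is a simplification and not Unraid's actual code. The disk contents and the function names `xor_parity` and `rebuild` are made up for illustration.

```python
from functools import reduce

def xor_parity(disks):
    """Single parity (assumed model): byte-wise XOR across all given disks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))

def rebuild(parity, surviving_disks):
    """Reconstruct one missing disk from parity plus every surviving disk."""
    return xor_parity([parity, *surviving_disks])

# Toy array: three data disks of four bytes each (illustrative values only).
disk1 = b"AAAA"
disk2 = b"BBBB"
disk3 = b"CCCC"
parity = xor_parity([disk1, disk2, disk3])

# disk2 is reformatted on another machine while the array is shut down.
disk2_changed = b"\x00\x00\x00\x00"

# Parity is now stale. If disk3 dies before a correcting check, the rebuild
# still completes, but it produces the wrong bytes.
bad_rebuild = rebuild(parity, [disk1, disk2_changed])

# After a correcting check, parity matches the new contents of disk2,
# so a later rebuild of disk3 comes out right.
parity_corrected = xor_parity([disk1, disk2_changed, disk3])
good_rebuild = rebuild(parity_corrected, [disk1, disk2_changed])

print(bad_rebuild == disk3, good_rebuild == disk3)
```

The rebuild itself never reports an error in either case; it only combines whatever parity and the surviving disks currently hold.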

    Link to comment

    Thanks Squid for the info. So, to make sure I understand: in my case, even though the drive was returned to unRAID formatted as OS X Journaled, had I started the array in step 3, unRAID would have continued to 'work' and, once a parity check was done, it would have restored that drive's partition to XFS and rebuilt its contents from parity?

    Link to comment

    It probably would have said unmountable and offered to format it.  Either way, you would still have needed to do a correcting parity check to get everything in sync.

     

     

    Link to comment

    Ok, thank you. I was concerned that, based on the 'new contents' of the physical disk, it would have destroyed the virtual contents held by parity and the data that used to be on the disk would then be forever lost... which is why I thought it might need to be reported as a 'bug', annoyance, or whatever. It would still be nice if there was some way that unRAID could recognize when a disk has drastically changed prior to startup and warn the user. But thank you again for looking into it.

    Link to comment
    22 minutes ago, Joseph said:

    I was concerned based on the 'new contents' of the physical disk, it would have destroyed the virtual contents held by parity and the data that used to be on the disk would then be forever lost...

    That's correct. A correcting parity check would have updated parity to reflect what was now on the disk instead of what was there before, so that parity would once again be useable to recover from a disk failure. All original content would be gone, just like you intended by erasing the disk.
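
A small sketch of the two possible outcomes for the tampered disk itself, again assuming a simplified XOR model of single parity rather than Unraid's real implementation. The byte strings and the helper `xor_bytes` are invented for the example.

```python
def xor_bytes(*blocks):
    """Byte-wise XOR of equal-length byte strings."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            out[i] ^= value
    return bytes(out)

original = b"DATA"      # what the disk held before it was pulled
other_disk = b"MORE"    # a second, untouched data disk
parity = xor_bytes(original, other_disk)

reformatted = b"HFS+"   # what the disk holds after its trip to the Mac

# Path A: start the array with the disk in place and run a correcting check.
# Parity is recomputed from what is on the disks now, so the old data is gone.
parity_after_check = xor_bytes(reformatted, other_disk)
recovered_a = xor_bytes(parity_after_check, other_disk)

# Path B: start without the disk so it counts as missing, then re-add it.
# The disk is rebuilt from the untouched parity and the other disk.
recovered_b = xor_bytes(parity, other_disk)

print(recovered_a, recovered_b)
```

Path B is what happened in steps 4 and 5 of the report: parity was never updated, so it still described the original contents.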

     

    If you didn't want the data erased, why would you format the disk, inside or outside of Unraid?

     

    Your scenario of pulling a data drive to temporarily use it for something else doesn't make sense.

    Link to comment
    Quote

    That's correct. A correcting parity check would have updated parity to reflect what was now on the disk instead of what was there before

     

    So my first assumption was correct: starting unRAID at Step 3 would have killed the contents forever.

     

    My reason for raising this as an issue is that unRAID would have known something was wrong, and alerted the user, if it had been a different disk. But because it was the same drive, it thought all was well. Perhaps that's by design, but some kind of warning would have been nice. IDK.

     

    Quote

    Your scenario of pulling a data drive to temporarily use it for something else doesn't make sense.

    LOL @jonathanm...TRUE! It doesn't make sense and I hope no one would ever try it -- especially on a production box. But since this was on a test box, had I lost the contents, it would not have really mattered.

     

    FWIW, I needed an extra 4TB of space on a Mac and didn't have any drives lying around large enough to complete the project. (Whatever data I had on that drive has been restored, since I made sure unRAID understood that drive was 'lost' and it rebuilt the contents from parity.)

     

    My apologies to everyone if I've wasted time on a non-issue, but it seemed like a way to improve the product to "save the user from themselves" for those of us who suffer from 1D10T errors.

    Link to comment

    As noted, if you had started the array, it would have said the disk was unmountable. That would have told you something was wrong. But of course, you already knew something was wrong.

     

    How about this scenario instead: you put the disk into another Linux computer and actually write something to it. In that case it will still mount, but it is out of sync with parity. If you didn't know you had to rebuild parity, you could have just continued on, and if you never corrected parity, you wouldn't be able to rebuild another disk.

     

    Short of a parity check, how can Unraid know you haven't done anything to any of your array disks while it was shut down? And even then, it won't know why parity is out of sync. How could it?

    Link to comment
    22 minutes ago, Joseph said:

    it seemed like a way to improve the product to "save the user from themselves" for those of us who suffer from 1D10T errors.

    Well, to be blunt, if you try to 1D10T proof everything, you will lose functionality and performance, and waste developer time that could be better spent elsewhere.

     

    I suppose the best way to handle your specific issue is a warning message when you start the array, similar to the question the ticket counter agent asks when you present your baggage: "Has anyone tampered with your bags without your knowledge?"

    Link to comment
    22 minutes ago, trurl said:

    Short of a parity check, how can Unraid know you haven't done anything to any of your array disks when it was shutdown? And even then, it won't know why parity is out-of-sync. How could it?

    Perhaps it would know the file system is no longer the same as it once was, and/or that the power-on hours from SMART have changed since it was last shut down? IDK
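
A rough sketch of what such a check might look like. This is purely hypothetical: nothing in this thread says Unraid stores or compares these values, and `DiskSnapshot`, `changes_since_shutdown`, the serial number, and the sample figures are all invented for illustration. It assumes the filesystem type and the SMART power-on hours (attribute 9) have already been read by some other means.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DiskSnapshot:
    """Hypothetical per-disk record saved at clean shutdown."""
    serial: str
    filesystem: str        # e.g. "xfs"
    power_on_hours: int    # SMART attribute 9 raw value

def changes_since_shutdown(saved, current):
    """Return human-readable warnings for differences worth flagging."""
    warnings = []
    if current.filesystem != saved.filesystem:
        warnings.append(
            f"{saved.serial}: filesystem changed from "
            f"{saved.filesystem} to {current.filesystem}"
        )
    if current.power_on_hours > saved.power_on_hours:
        extra = current.power_on_hours - saved.power_on_hours
        warnings.append(
            f"{saved.serial}: powered on for {extra} h while the array was shut down"
        )
    return warnings

# Made-up values: same drive, different filesystem, 36 extra hours of use.
saved = DiskSnapshot("EXAMPLE-SERIAL", "xfs", 12000)
current = DiskSnapshot("EXAMPLE-SERIAL", "hfsplus", 12036)
warnings = changes_since_shutdown(saved, current)
for line in warnings:
    print(line)
```

A check like this would only be a hint. A disk used elsewhere for less than an hour, or written to without changing its filesystem type, could slip past both comparisons, and a real version would need some tolerance for the time the drive spends powered on during boot.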

    Edited by Joseph
    Link to comment
    9 minutes ago, jonathanm said:

    I suppose the best way to handle your specific issue is a warning message when you start the array, similar to the question the ticket counter agent asks when you present your baggage, has anyone tampered with your bags without your knowledge?

    And of course, everyone would ignore it. Or complain about it.

    Link to comment
    7 minutes ago, jonathanm said:

    I suppose the best way to handle your specific issue is a warning message when you start the array...

    That's kinda what I was thinking. But like you said, developer time might be better spent on other more important issues. 

    Link to comment
    4 minutes ago, trurl said:

    And of course, everyone would ignore it. Or complain about it.

    True, if it's generic and isn't specific to an actual change in the array.

    Link to comment



