• Parity disk change while invalid can cause errors if a different-size device is used


    JorgeB
    • Minor

    This bug has likely existed for some time. I guess it's a corner case, but a user ran into it today.

     

    How to reproduce:

     

    Say all your data disks are 2TB. Upgrade parity to a larger disk, e.g. 3TB, start the array and cancel the parity sync. Stop the array, replace the 3TB parity with a 2TB disk, and start the array. The parity sync will start again but will still show the old 3TB size as the total parity size (not as the size of the disk itself). It will then error out when it runs past the actual end of the parity device, with an error similar to this one:

     

    May 28 19:04:44 Tower9 kernel: attempt to access beyond end of device
    May 28 19:04:44 Tower9 kernel: sdc: rw=1, want=976773176, limit=976773168
    May 28 19:04:44 Tower9 kernel: md: disk0 write error, sector=976773104
    May 28 19:04:44 Tower9 kernel: attempt to access beyond end of device
    May 28 19:04:44 Tower9 kernel: sdc: rw=1, want=976773184, limit=976773168
    May 28 19:04:45 Tower9 kernel: md: disk0 write error, sector=976773112
    May 28 19:04:45 Tower9 kernel: md: recovery thread: exit status: -4
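The numbers in the log can be read off directly. Below is a minimal Python sketch, assuming (this is not stated in the report) that `want` and `limit` are counted in 512-byte sectors, as the Linux block layer normally reports them. `overruns` and `SECTOR_BYTES` are names made up for this illustration.

```python
import re

# Assumption: the kernel reports want/limit in 512-byte sectors.
SECTOR_BYTES = 512

# The two "beyond end of device" lines from the log above.
LOG = """\
May 28 19:04:44 Tower9 kernel: sdc: rw=1, want=976773176, limit=976773168
May 28 19:04:44 Tower9 kernel: sdc: rw=1, want=976773184, limit=976773168
"""

PATTERN = re.compile(r"(\w+): rw=\d+, want=(\d+), limit=(\d+)")


def overruns(log_text):
    """Return (device, sectors past the end, device size in bytes) per log line."""
    results = []
    for match in PATTERN.finditer(log_text):
        device = match.group(1)
        want = int(match.group(2))
        limit = int(match.group(3))
        results.append((device, want - limit, limit * SECTOR_BYTES))
    return results


for device, past_end, size_bytes in overruns(LOG):
    print(f"{device}: request ends {past_end} sectors past a {size_bytes}-byte device")
```

Under that assumption the two writes end 8 and 16 sectors past the end of sdc, which is what you would expect if the sync is still working from a parity size larger than the device that is actually installed.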

     

    This results in the parity disk being disabled, and the user will need to sync it again.

     

    I guess there will also be a problem if a smaller disk is used first and then replaced with a larger one: parity will likely be reported as valid, but it won't be synced past the end of the smaller device.

     

     

     

     

     




    User Feedback

    Recommended Comments

    Quote

    I guess there will also be a problem if a smaller disk is used first and then replaced with a larger one: parity will likely be reported as valid, but it won't be synced past the end of the smaller device.

    I think I remember something like that happening, where the first parity check after a "successful" parity build ends up with thousands of errors. That could be nasty if a user happens to add a larger data disk in that state before parity is truly correct.


    Did a quick test, mostly out of curiosity, and if a small device is used first it's as I suspected: the parity sync finishes successfully and parity is reported as valid, but it's only synced up to the size of the original device, so past that point it will be out of sync (unless the disk was cleared).
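Why the outcome depends on whether the replacement disk was cleared can be shown with a toy model. This is only a sketch of single XOR parity on small byte arrays, not Unraid's md driver. The function names and the sizes are invented for illustration.

```python
def expected_parity(data_disks, parity_len):
    """XOR of all data disks, zero-padded up to the parity length."""
    parity = bytearray(parity_len)
    for disk in data_disks:
        for i, value in enumerate(disk):
            parity[i] ^= value
    return parity


def partial_sync(parity_disk, data_disks, sync_len):
    """Rewrite only the first sync_len bytes of parity; the rest is left untouched."""
    good = expected_parity(data_disks, len(parity_disk))
    parity_disk[:sync_len] = good[:sync_len]


def count_mismatches(parity_disk, data_disks):
    """Number of parity bytes a later parity check would flag as wrong."""
    good = expected_parity(data_disks, len(parity_disk))
    return sum(1 for have, want in zip(parity_disk, good) if have != want)


# Two equally sized data disks (hypothetical contents).
data = [bytes([1, 2, 3, 4]), bytes([5, 6, 7, 8])]

OLD_PARITY_LEN = 4  # size remembered from the first, smaller parity device
NEW_PARITY_LEN = 6  # real size of the larger replacement device

# Replacement disk that still holds old, non-zero contents.
used = bytearray([0xFF] * NEW_PARITY_LEN)
partial_sync(used, data, OLD_PARITY_LEN)
stale_used = count_mismatches(used, data)

# Replacement disk that was cleared (all zeros) beforehand.
cleared = bytearray(NEW_PARITY_LEN)
partial_sync(cleared, data, OLD_PARITY_LEN)
stale_cleared = count_mismatches(cleared, data)

print("mismatches on used disk:", stale_used)
print("mismatches on cleared disk:", stale_cleared)
```

In this model the area past the old parity size should be all zeros because no data disk reaches that far, so a cleared disk happens to be correct there while a previously used one is not.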

     

    10 hours ago, jonathanm said:

    I think I remember something like that happening

     

    I also remember some cases that could have resulted from this bug. There were also cases where multiple users reported a similar issue (parity completely out of sync after a certain point) after doing a parity-swap, but I can't see how that relates directly to this, so it's likely a different corner case/bug.

     

     




