• SQLite Data Corruption testing


    limetech

    tldr: Starting with 6.8.0-rc2, please visit Settings/Disk Settings and change 'Tunable (scheduler)' to 'none'.  Then run with your SQLite DB files located on array disk shares and report whether your databases still become corrupted.
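
For anyone who wants to confirm which scheduler is actually active after changing the setting, here is a rough sketch in Python. It assumes the usual Linux sysfs layout, where `/sys/block/<device>/queue/scheduler` lists the available schedulers and marks the active one in square brackets (for example `mq-deadline kyber [none]`). The function names are made up for illustration, and this is not part of Unraid itself.

```python
from pathlib import Path
from typing import Dict, Optional


def active_scheduler(sysfs_text: str) -> Optional[str]:
    """Return the scheduler marked with [brackets] in a sysfs scheduler line."""
    tokens = sysfs_text.split()
    for token in tokens:
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    # Some devices report a single bare name (typically "none") with no brackets.
    if len(tokens) == 1:
        return tokens[0]
    return None


def report_schedulers(sys_block: str = "/sys/block") -> Dict[str, Optional[str]]:
    """Map each block device name to its active scheduler (assumed sysfs layout)."""
    result = {}
    for sched_file in sorted(Path(sys_block).glob("*/queue/scheduler")):
        device = sched_file.parent.parent.name  # .../<device>/queue/scheduler
        result[device] = active_scheduler(sched_file.read_text())
    return result


if __name__ == "__main__":
    for device, scheduler in report_schedulers().items():
        print(f"{device}: {scheduler}")
```

If the change took effect, the array disks should show `none` here.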

     

    When we first started looking into this issue one of the first things I ran across was this monster topic:
    https://bugzilla.kernel.org/show_bug.cgi?id=201685

    and related patch discussion:
    https://patchwork.kernel.org/patch/10712695/


    This bug is very similar to what we're seeing.  In addition, Unraid 6.6.7 is on the last of the 4.18 kernels (4.18.20), Unraid 6.7 is on the 4.19 kernel, and 6.8 is currently on 5.3.  The SQLite DB corruption bug also only started happening with 4.19, so I don't think this is a coincidence.

    Looking at the 5.3 code, the patch above is not present; however, I ran across a later commit that reverted that patch and solved the bug a different way:
    https://www.spinics.net/lists/linux-block/msg34445.html

    That set of changes is in 5.3 code.

    I'm thinking perhaps their "fix" is not properly handling some I/O pattern that SQLite via md/unraid is generating.

     

    Before I go off and revert the kernel to 4.18.20, please test if setting the scheduler to 'none' makes any difference in whether databases become corrupted.
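
When reporting results, it helps to check the database files directly rather than waiting for an application to complain. Below is a minimal sketch using Python's standard `sqlite3` module and SQLite's `PRAGMA integrity_check`. The function name is hypothetical, and the file is opened read-only on the assumption that the application using it is stopped (a database in WAL mode may not open cleanly this way).

```python
import sqlite3
from pathlib import Path
from typing import List


def integrity_problems(db_path: str) -> List[str]:
    """Return a list of problems found by PRAGMA integrity_check (empty if OK)."""
    # Build a file: URI and open read-only so the check cannot modify the file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.DatabaseError as exc:
        # Badly damaged files can fail before the check even runs.
        return [str(exc)]
    finally:
        conn.close()
    messages = [row[0] for row in rows]
    # SQLite reports a single row containing "ok" when nothing is wrong.
    return [] if messages == ["ok"] else messages
```

An empty list means SQLite found nothing wrong; anything else is worth including in your report along with the scheduler setting you were running.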




    User Feedback

    Recommended Comments



    Makes sense. It's going to take a while to check the backups and make sure they are good, but I think I can find a good one.

     

    Thank you so much for the help!!!!!!!!!!!!!!!!!!!!!


    Makes sense. It's going to take a while to check the backups, but I know that I'll find one that is good.

     

     

    Thank you soooo much for the help!!!!!!!!!!!!!!


    Okay, so from what I'm reading here, it is safe to upgrade to 6.8.2?

     

    Thank you guys for the work - seriously, sorry I had to stop contributing to the thread but it was very frustrating!


    Soooo I know it's been quite a while and seems to have been fixed, but I'm running into this exact same problem with SQLite DB corruption. In my case I'm running the frigate (NVR) docker container.

     

    Is there a possibility that this issue is back? I just now changed all my volume mappings to address the disk directly instead of /mnt/user/appdata based on a hunch before reading this. I also disabled secondary storage now (so appdata should only sit on one NVME SSD, no secondary storage on the array).

     

    Also, I'm running ZFS on the NVMe drive, which I first suspected as the culprit.

    Was there any working configuration that I could try? Also, did this really only happen to SQLite DBs, or was other data affected as well? I'm just migrating everything back to unRaid after running Synology for a few months (used Synology because of their integrated NVR). I was about to migrate my InfluxDB as well, but after seeing these corruption issues I was afraid to do so.

    I'm currently on unRaid 6.12.2





