• SQLite DB Corruption testers needed


    limetech
    • Closed

    9/17/2019 Update: may have got to the bottom of this.  Please try 6.7.3-rc3 available on the next branch.

    9/18/2019 Update: 6.7.3-rc4 is available to address Very Slow Array Concurrent Performance.

     


    Trying to get to the bottom of this...  First, we have not been able to reproduce it, which is odd because it implies there may be some kind of hardware/driver dependency with this issue.  Nevertheless I want to start a series of tests, which I know will be painful for some since every time DB corruption occurs, you have to go through a lengthy rebuild process.  That said, we would really appreciate anyone's input during this time.

     

    The idea is that we are only going to change one thing at a time.  We can either start with 6.6.7 and start updating stuff until it breaks, or we can start with 6.7.2 and revert stuff until it's fixed.  Since my best guess at this point is that the issue is either with the Linux kernel, Docker, or something we have misconfigured (not one of a hundred other packages we updated), we are going to start with the 6.7.2 code base and see if we can make it work.

     

    But actually, the first stab at this is not reverting anything, but rather first updating the Linux kernel to the latest 4.19 patch release which is 4.19.60 (6.7.2 uses kernel 4.19.55).  In skimming the kernel change logs, nothing jumps out as a possible fix, however I want to first try the easiest and least impactful change: update to latest 4.19 kernel.

     

    If this does not solve the problem (which I expect it won't), then we have two choices:

     

    1) update to latest Linux stable kernel (5.2.2) - we are using 5.2 kernel in Unraid 6.8-beta and so far no one has reported any sqlite DB corruption, though the sample set is pretty small.  The downside with this is, not all out-of-tree drivers yet build with 5.2 kernel and so some functionality would be lost.

     

    2) downgrade docker from 18.09.06 (version in 6.7.2) to 18.06.03-ce (version in 6.6.7).

    [BTW the latest Docker release 19.03.00 was just published today - people gripe about our release numbers, try making sense of Docker release numbers haha]

     

    If neither of those steps succeeds then ... well let's hope one of them does succeed.

     

    To get started, first make a backup of your flash via Main/Flash/Flash Backup, and then switch to the 'next' branch via the Tools/Upgrade OS page.  There you should see version 6.7.3-rc1.

     

    As soon as a couple people report corruption I'll publish an -rc2, probably with reverted Docker.

    Edited by limetech




    User Feedback



    I wonder if this isn't caused by some bug in GCC, GLIBC, or another library that changed in the newer Unraid versions, considering that reverting Docker versions has done nothing.

    Link to comment

    I've been debugging this solo, but came across this thread and wanted to share my experiences in case they might offer some pointers. Unraid has been bulletproof for me until now – I'd love to help you guys with this if I can!

     

    I've been having regular corruption over the past week or two. It first occurred on 6.7.0, and has definitely become worse on 6.7.2.

     

    A couple of days back I created two new shares – Plex AppData and Sonarr AppData, configured both to prefer the Cache (but exist on disk1 only), pointed both Plex and Sonarr there via /mnt/user/[app-specific share]/ and that seems to have potentially fixed the issue – the corruption was nightly before. I'll keep watching.

     

    The sqlite3 .dump and re-ingest trick has never worked for me when the databases have corrupted.
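
For readers unfamiliar with the dump and re-ingest trick mentioned above, here is a minimal sketch. It uses Python's standard `sqlite3` module and `Connection.iterdump()` as a stand-in for the `sqlite3` command-line `.dump`; the function name and both paths are hypothetical. As noted above, it can fail outright when the source database is badly damaged, in which case `sqlite3.DatabaseError` is raised.

```python
import os
import sqlite3


def dump_and_reingest(src_path, dst_path):
    """Copy a SQLite database by dumping it to SQL text and replaying it.

    src_path and dst_path are hypothetical paths chosen by the caller.
    Raises sqlite3.DatabaseError if the source cannot be read.
    """
    if os.path.exists(dst_path):
        # Refuse to replay the dump on top of an existing file.
        raise FileExistsError(dst_path)

    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        # iterdump() yields the SQL statements that recreate the database.
        script = "\n".join(src.iterdump())
        dst.executescript(script)
    finally:
        src.close()
        dst.close()
```

Stop the container that owns the database before running anything like this, and work on a copy of the file rather than the original.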

     

    I also noticed that when Sonarr's db hosed itself, Plex's performance for the TV library became abysmally slow – often timing out. Music and Movies were unaffected, and still loaded. As soon as I stopped the Sonarr container, Plex performance recovered.

     

    I've attached my diagnostics in case it's useful.

     

     

    tower-diagnostics-20190801-1307.zip

    Edited by JamieK
    Link to comment

    So far so good... There's also a noticeable speed benefit (though that's to be expected I guess).

     

    Note that I'm effectively running my databases on the cache drive only, with fallback to disk1 if it gets too full. This does mean you lose parity on them, though.

    Link to comment
    Quote

    update to latest Linux stable kernel (5.2.2) - we are using 5.2 kernel in Unraid 6.8-beta and so far no one has reported any sqlite DB corruption, though the sample set is pretty small.  The downside with this is, not all out-of-tree drivers yet build with 5.2 kernel and so some functionality would be lost.

    I am wondering when we can expect an RC version to test a 5.x.x kernel. I am curious whether such an update would fix my problem with parity checks and network connectivity (tg3 driver) and the slowness of accessing shares (fuse) with direct I/O on. Turning direct I/O off restored speeds, but it would be interesting to see how a new kernel behaves.

    Link to comment

    I upgraded from 6.7.2 to 6.7.3-rc2 this morning.  I have continued to have corruption up until now, about every 2 days, with both Plex and Sonarr.

     

    I am currently running my app data from one disk.  

     

    I will let you know what happens. 

    Link to comment

    One week on, I'm seeing no issues using the disk1/cache-preferred setup I described above. Running an integrity check on both Sonarr and Plex's databases returns no errors; and everything seems stable.
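
A minimal sketch of the kind of integrity check described above, assuming Python's standard `sqlite3` module and SQLite's `PRAGMA integrity_check`. The function name is hypothetical, and the database path is whatever your container uses. The file is opened read-only so the check itself cannot modify it.

```python
import sqlite3
from pathlib import Path


def integrity_errors(db_path):
    """Return a list of problems found in the database, empty if it is healthy."""
    # Open read-only via a file: URI so a missing file is not silently created.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    try:
        conn = sqlite3.connect(uri, uri=True)
    except sqlite3.DatabaseError as exc:
        return [str(exc)]
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.DatabaseError as exc:
        # Badly damaged files can fail before the check produces any rows.
        return [str(exc)]
    finally:
        conn.close()
    messages = [row[0] for row in rows]
    # A healthy database reports the single row "ok".
    return [] if messages == ["ok"] else messages
```

An empty list corresponds to the "returns no errors" result reported above. Run it with the owning container stopped so the file is not being written during the check.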

    Link to comment

    So I have used unraid in the past and stopped. Decided to go back to it, so booted up the machine and upgraded to the latest stable and got the Plex db corruption. At the time I was transferring some files from my Windows PC to the unraid media folder, there was one stream going on at the time of transfer. My Plex library is set to auto update with partial scan selected as well. There wasn't any crashing of any sort, just the error in the Plex dashboard, oh and I don't use a cache drive.

     

    So I deleted the db files and rebuilt from scratch since it was a small library anyway, and I've unchecked auto update & partial scan and so far there's no corruption issues. I'll try again tonight to confirm it.

     

    Here's my diagnostics file.

     

    Update: I upgraded to 6.7.3-rc2 and re-enabled Plex auto update & partial scan. While streaming something from Plex and also transferring some media files over, this time I did not encounter any problems.

    unraid-diagnostics-20190807-2303.zip

    Edited by mraudi
    Added diagnostics file
    Link to comment

    Anyone still having this issue, please try the following:

     

    -Disable Plex auto scan.

    -Copy a few GBs of media to the array, to a share that will be scanned by Plex, but make sure it doesn't go to the same disk where the Plex appdata is located.

    -Start another large copy to the array, media or any other files, as long as it's writing directly to the array. If possible, make it go to a different disk from the first media copy and the Plex appdata, or try different combinations.

    -While the copy is going on, start a Plex library scan, and see if doing it while there's array activity more easily triggers the database corruption.
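
For anyone who wants to approximate this test without Plex, the sketch below writes a large file in one thread while inserting rows into a SQLite database in another, then runs `PRAGMA integrity_check`. This is a generic stand-in, not the Plex scan itself: the function name, table name, sizes, and both paths are assumptions. Point `db_path` and `bulk_path` at locations on different array disks to mirror the steps above.

```python
import os
import sqlite3
import threading


def scan_during_bulk_copy(db_path, bulk_path, bulk_mib=64, rows=5000):
    """Insert rows into a SQLite file while another thread writes a large file.

    Returns the rows reported by PRAGMA integrity_check afterwards;
    a healthy database yields ["ok"].
    """
    def bulk_writer():
        # Simulates the large copy to the array.
        chunk = os.urandom(1024 * 1024)
        with open(bulk_path, "wb") as fh:
            for _ in range(bulk_mib):
                fh.write(chunk)
            fh.flush()
            os.fsync(fh.fileno())

    writer = threading.Thread(target=bulk_writer)
    writer.start()

    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, payload BLOB)"
        )
        for i in range(rows):
            conn.execute("INSERT INTO items (payload) VALUES (?)", (os.urandom(256),))
            if i % 100 == 0:
                conn.commit()  # Frequent commits, loosely like a library scan.
        conn.commit()
        result = [row[0] for row in conn.execute("PRAGMA integrity_check")]
    finally:
        conn.close()
        writer.join()
    return result
```

A check run immediately after the writes may be served from the page cache, so a clean result here does not rule out on-disk corruption; re-checking after the array has been stopped and started would be a stricter test.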

     

     

    Edited by johnnie.black
    Link to comment

    If I move appdata to the cache drive, how do I stop the mover moving it off?

     

    Edit:

    The usual way in shares, sorry.

    Edited by TheBuz
    Being a moron
    Link to comment

    Not sure if this helps, but my Kodi databases and Sonarr, which I have running on a physical Windows machine, also started corrupting around the same time everybody was having problems here. Can SQLite push updates on its own?

    Edited by TheBuz
    Link to comment
    6 hours ago, TheBuz said:

    Not sure if this helps, but my Kodi databases and Sonarr, which I have running on a physical Windows machine, also started corrupting around the same time everybody was having problems here. Can SQLite push updates on its own?

    Hi

     

    Are your Kodi and Sonarr databases located on your Unraid server shares, and are you accessing them through the network?

    If so, are you accessing them through a user share or a regular disk share?

     

     

    Link to comment
    17 minutes ago, simalex said:

    Hi

     

    Are your Kodi and Sonarr databases located on your Unraid server shares, and are you accessing them through the network?

    If so, are you accessing them through a user share or a regular disk share?

     

     

     

    No, these are completely independent from Unraid or any network share. It's been happening on a few of my setups and test machines.

     

    Which makes me think something more widespread is going on with SQLite.

     

    But it could just be a coincidence...

     

    Link to comment

    I'm running the 6.7.3-rc2 build, and found the Plex database corrupt this morning.  The last two times I was able to do the procedure to dump to a .sql file and repair.  I was not able to this morning.  I had to roll back Plex to a database that was three days old.

     

    My Appdata is on disk1.  Here are the diagnostics.  

     

     

    swissarmy-diagnostics-20190810-1707.zip

    Link to comment

    I had corruption in Sonarr and SABnzbd also.  Had to repair the databases just a few minutes ago.  Since I've already uploaded diags twice today, I didn't do it this time.   

    Link to comment

    First time post so please go easy.

     

    I've also got this issue after years of trouble-free usage of Unraid. Every time I think it's fixed, it lasts a couple days and then corrupts again. I tried to downgrade from 6.7.2 to 6.6.7 but the only downgrade option I have is to 6.7.0.

    Link to comment
    3 hours ago, liamuk said:

    downgrade from 6.7.2 to 6.6.7 but the only downgrade option I have is to 6.7.0.

    6.6.7 is available for download. Just backup your config folder, install 6.6.7, and restore config.

    Link to comment

    Throwing my hat in the ring: I just got my Unraid box BACK running (had to switch out the motherboard). Same shares / same Unraid / same hard drives as before.

     

    • Booted up Unraid (after it being offline for about 2 months); all my SQLite dockers had corruption.
    • Went ahead and updated to latest RC today (6.7.3-rc2)
    • Tried to recover SQLite databases... NO DICE
    • Rebuilt PLEX / SONARR / RADARR FROM SCRATCH..... (ugh...)
    • Created a dedicated appdata under /mnt/disk1/specialappdata/

     

    It's been running for 8 hours with no corruption.  If I see anything I will update here.  Is there any more detailed info you need?

    Link to comment
    6 hours ago, trurl said:

    6.6.7 is available for download. Just backup your config folder, install 6.6.7, and restore config.

    Thanks @trurl for this. Downgrade worked and now seems stable with no more corrupted databases so far.

    Link to comment

    I'm wanting to throw my hand (and server) into helping with this but am a little confused on what I should start with. lol

    I've been having the corruption issues for some time now, but it seems to be only affecting Radarr (it was affecting Sonarr also, but that stopped when I went to binhex's image). With Radarr it happens after a matter of hours on both linuxserver's and binhex's images.
    I've tried a whole new setup of Radarr and the same thing happens.

    I want to help resolve this as it's quite painful; and I have some free time on my hands.

    Link to comment

    Hi all, following Badams' statement in a previous post:

    9 hours ago, Badams said:

    (it was affecting Sonarr also, but that stopped when I went to binhex's image..)

    I started wondering if the actual docker image used could also be related to the corruption issue (e.g. an old version of any number of libraries).

     

    I myself am using the following docker images

     

    Sonarr : linuxserver

    Plex : limetech

     

    and have had the corruption issue in both Sonarr & Plex.

    From what is available in Community Apps, we have the following options for docker images:

    Sonarr : linuxserver or binhex

    Plex : binhex (or binhex plexpass for those that have a Plex Pass account), linuxserver, or plexinc (can't seem to find limetech anymore)

     

    Can anyone else that does not have the SQLite corruption issue verify which docker image they are using?

     

    I have downgraded and am now working on 6.6.7 and am now thinking of upgrading to 6.7.2 just to run a test with the other available docker images.

     

     

    Link to comment
    12 hours ago, Badams said:

    I'm wanting to throw my hand (and server) into helping with this but am a little confused on what I should start with. lol

    I've been having the corruption issues for some time now, but it seems to be only affecting Radarr (it was affecting Sonarr also, but that stopped when I went to binhex's image). With Radarr it happens after a matter of hours on both linuxserver's and binhex's images.
    I've tried a whole new setup of Radarr and the same thing happens.

    I want to help resolve this as it's quite painful; and I have some free time on my hands.

    Interesting you say that, because I just rebuilt earlier, and the only docker image that now has SQLite corruption is my Radarr image...

     

     

    2019-08-12 11_26_15-Tower_Docker.png

    2019-08-12 11_26_34-Tower_Docker.png

    tower-diagnostics-20190812-1630.zip

    Link to comment

    I couldn't test Plex since I'm remote, but is this typically what the corruption looks like in the log file (retry db?)

    2019-08-12 11_44_37-Log for_ binhex-plex.png

    Link to comment



    Guest
    This is now closed for further comments
