Downgraded back to 6.6.7 due to Sqlite corruption


I think with complex issues like this, we need to be scientific and methodical instead of having anyone and everyone report problems and tell each other to try this or that.

 

How about this - for anyone who reports the problem, also report:

  1. What CPU? How much RAM? Array config?
  2. Roughly how large is your collection? I think file count, even a rough estimate, is more important here.
  3. Have you set your appdata to /mnt/cache (or for those without cache, /mnt/disk1)? If you haven't, we'll ignore you.
  4. Do you have a link between Sonarr and Plex? If yes, have you disabled it? If you haven't, we'll ignore you.
  5. Do you have automatic library update on change / partial change? If yes, have you set it to hourly? If you haven't, we'll ignore you.
  6. This is more controversial. Can you rebuild your db from scratch?
  7. <add more points as things progress>

The key idea is to get all affected users within a sufficiently small boundary that a clear pattern can emerge from all the noise. Perhaps we should ask limetech to start a separate topic, with the first post updated with the details for each user reporting the issue.

 

I know the "we'll ignore you" seems harsh, but adding noisy info can be worse than not having the info. And to be brutally honest, if you can't be bothered to help yourself, we can't be bothered to help you.

 

 

Now comes the hypothesizing:

Reading through this topic again, here is how I would summarize it:

  1. The issue affects a minority of users and not others
  2. The db corruption appears to have no clear pattern (and is not reproducible by those not affected, e.g. limetech)
  3. Having a cache disk and setting Plex appdata to /mnt/cache seems to help some users but not others
  4. Cutting the link between Plex and Sonarr seems to help
  5. Reducing Plex library scan frequency seems to help

 

Based on the above 5 points, could it be that the affected users already have existing corruption with their db?

  • That would kind of explain (1) and (2), i.e. why Limetech can't reproduce the issue: hardware idiosyncrasies aside, the key difference is that they would either start from a good db or rebuild a brand new db, which naturally makes it good.
  • That would also explain why (4) and (5) help because fewer interactions reduce the probability of accessing the bad portion of the db.
  • It's harder to explain (3), unless (3) was the cause of the original db corruption. Setting the db on /mnt/cache skips shfs, which otherwise can be a bit resource hungry. Perhaps the slower performance causes some writes to be done out of order, which can corrupt the db.
  • This fits even more with why 6.6.7 is good but 6.7.0 isn't. Perhaps all the security fixes slow things down just enough to pass the threshold that would lead to corruption.
Link to comment

There is another topic on the subject but I'm giving up on that one:

 

We got to the point where OP was convinced that appdata/plex residing on a single-disk btrfs cache device and mapped using path:

/mnt/cache/appdata/plex

was stable.

What would be helpful is a) others to confirm, and then b) after confirming this exact config is stable, simply change path to:

/mnt/user/appdata/plex

 

This will tell me if simply passing I/O through 'shfs' layer is introducing this issue.  If it starts to fail, then there are other tests to try.

Link to comment
  1. ASUSTeK Computer INC. P6T DELUXE V2, Version Rev 1.xx -

    Intel® Core™ i7 CPU 920 @ 2.67GHz - 12GB of RAM, 5 8TB HDs (2par, 3 default array)

  2. New install... I moved 3 movies, 2 DVRed shows... and 1k pictures
  3. Set to /mnt/disk1/appdata/binhex-plex/. Previously it was set to the default, which was causing SQL issues a few times a day. Currently at 24hrs and no corruption.
  4. Only Plex on the system, this is NEW install. (also, Community Applications, Fix Common Problems and Unassigned Devices)
  5. Set to do hourly.. I played with this setting but saw no change to corruption frequency.
  6. Did a number of times.

 

Well... update.

DB corrupted again.

Edited by dimitriz
Link to comment
36 minutes ago, runraid said:

@jonathanm /mnt/disk3/appdata/plex

First, make sure plex is not set to auto start.

Assuming your appdata folder will fit on your cache drive, then yes, you can just use the mover.

Set the appdata share to Cache: Prefer.

Before you run the mover, you MUST be sure there are no open files in the appdata tree, as mover won't touch open files. The easiest way to do that is to be sure the Docker service is not running, so there is no Docker item visible in the GUI list of pages. Settings, Docker, enable Docker: NO.

 

After the mover is done, enable the docker service. Before starting plex, edit the config path in the plex docker so instead of /mnt/disk3/appdata/plex it's /mnt/cache/appdata/plex

 

Verify stability in that state before moving on, as that is the premise behind the test.
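
As a rough illustration of the open-files precaution above, here is a minimal Python sketch that lists files under a directory tree which some process still has open, by reading the Linux /proc filesystem. The function name open_files_under is hypothetical and is not part of Unraid or the mover; it has to run as root to see other users' processes, and stopping the Docker service as described remains the simpler route.

```python
import os


def open_files_under(root, proc_dir="/proc"):
    """Return sorted paths under `root` that some process has open (Linux only)."""
    root = os.path.realpath(root)
    found = set()
    for pid in os.listdir(proc_dir):
        if not pid.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        fd_dir = os.path.join(proc_dir, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission to inspect it
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor was closed while we were scanning
            if target == root or target.startswith(root + os.sep):
                found.add(target)
    return sorted(found)
```

Calling open_files_under("/mnt/disk3/appdata") and getting an empty list back would suggest nothing is holding files in that tree at that moment; it is only a snapshot, so a container started afterwards can still open files.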

Link to comment

It's been a week now since I added a cache pool, moved the appdata to cache only and set direct io to "no"... I keep randomly checking the consistency of the database, and it's still coming back "ok".  Cautiously optimistic.  I've added a number of videos and recorded some via DVR, no problems as of yet.
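
For anyone who wants to run the same kind of consistency check, here is a minimal sketch using Python's built-in sqlite3 module and SQLite's PRAGMA integrity_check. The function name check_sqlite_integrity is hypothetical, and the path you pass in (for Plex, the library database under the appdata folder) is an assumption that depends on your container. It is probably safest to run it with the container stopped.

```python
import sqlite3
from pathlib import Path


def check_sqlite_integrity(db_path):
    """Return the messages from PRAGMA integrity_check (["ok"] when healthy)."""
    # Open read-only through a file: URI so the check cannot modify the database.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute("PRAGMA integrity_check;").fetchall()
    except sqlite3.DatabaseError as exc:
        # Badly damaged files can fail before the check even starts,
        # e.g. "database disk image is malformed".
        return [str(exc)]
    finally:
        conn.close()
    return [row[0] for row in rows]
```

A healthy database returns a single "ok" row; anything else is a list of problems SQLite found, or the error it raised while trying.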

 

I am on a new install, built the machine about 4 weeks ago.

 

 

What CPU? How much RAM? Array config?

-Asrock H370ITX/ac with i3-8300

-32GB DDR4 (2x 16GB sticks)

-6x SATA 4TB Seagate Constellation Drives (double parity array, 16TB useable)

-2x Intel 660p 512GB NVMe (BTRFS pool, raid 1)

 

Roughly how large is your collection? I think file count, even a rough estimate, is more important here.

- approximately 26500 files in plex collection across 5 libraries

 

Have you set your appdata to /mnt/cache (or for those without cache, /mnt/disk1)? If you haven't, we'll ignore you.

-Was set to /mnt/user/appdata originally.  This corrupted in less than 24 hours usually.

-I moved appdata and system all to one drive, mapped appdata in plex docker to /mnt/disk2/appdata.  This was better, would last about 72 hours.

-I then moved the transcode folder to a tmpfs drive of 24GB max since I noticed corruptions mainly around larger DVR recording nights (5 or more shows).  This lasted 5 days before I saw corruption.

-I added the cache array (didn't have it before) and then moved the appdata and system entirely to cache only.  I set the docker to /mnt/cache/appdata.  I also, at the same time, set the direct io from "auto" to "no".  It has been 1 week now with no corruption.  I'm not trusting it at all.

 

Do you have a link between Sonarr and Plex? If yes, have you disabled it? If you haven't, we'll ignore you.

I only use plex.  The goal was to switch everything to unraid, but this issue has hindered that effort greatly.  Until this is stable, I have only plex and syncthing dockers.

 

Do you have automatic library update on change / partial change? If yes, have you set it to hourly? If you haven't, we'll ignore you.

Yes.  All check boxes are checked in plex except for music in automatic updates.

library scan interval is 24 hours.

 

This is more controversial. Can you rebuild your db from scratch?

This was a from scratch build.  I've run Plex for quite some time, never ever saw corruption before coming to unraid 4 weeks ago.  Still on trial due to this issue alone.  If I rebuild again it will not be on unraid, I'll probably go back to Debian or OMV, this is mainly due to the annoyances and monotony with redoing DVR.

 

<add more points as things progress> 

I am mapping over /dev/dri to the container for intel gpu decoding in plex.  Not sure who else is doing this, not sure if it's related.

Edited by Abzstrak
spelling
Link to comment

Jun 20, 2019 23:17:05.918 [0x148cc63f1700] DEBUG - Skipping over directory '1993/Cin1993', as nothing has changed; removing 16 media items from map.
Jun 20, 2019 23:17:05.980 [0x148cc63f1700] WARN - Scanning the location /media/Pictures did not complete
Jun 20, 2019 23:17:05.980 [0x148cc63f1700] DEBUG - Since it was an incomplete scan, we are not going to whack missing media.
Jun 20, 2019 23:17:06.077 [0x148cc63f1700] DEBUG - Refreshing section 1 of type: 13
Jun 20, 2019 23:17:06.437 [0x148d582c9700] DEBUG - Refreshing 0 IDs.
Jun 20, 2019 23:17:08.104 [0x148cc63f1700] ERROR - SQLITE3:(nil), 11, database corruption at line 64873 of [bf8c1b2b7a]
Jun 20, 2019 23:17:08.104 [0x148cc63f1700] ERROR - SQLITE3:(nil), 11, statement aborts at 19: [select distinct(metadata_items.id)  from metadata_items  where metadata_items.library_section_id=1 and metadata_items.metadata_type in (14) and (length(user_thumb_url)=0 or length(user
Jun 20, 2019 23:17:08.104 [0x148cc63f1700] ERROR - LibraryUpdater: exception updating libraries; pausing updates briefly before retrying: sqlite3_statement_backend::loadRS: database disk image is malformed
Jun 20, 2019 23:17:14.922 [0x148d584ca700] DEBUG - Completed: [192.168.1.121:49230] -2 GET /player/proxy/poll?deviceClass=pc&protocolVersion=3&protocolCapabilities=timeline%2Cplayback%2Cnavigation%2Cmirror%2Cplayqueues&timeout=1 (6 live) TLS GZIP 20001ms 5 bytes (pipelined: 48)
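
To pull lines like these out of a long Plex log, a small Python sketch along the following lines could help. The two patterns are taken from the excerpt above; the line layout (timestamp, thread id in brackets, level, message) is assumed from the same excerpt, and the name find_corruption_events is hypothetical.

```python
import re

# Layout assumed from the excerpt: "Jun 20, 2019 23:17:08.104 [0x...] ERROR - message"
LINE_RE = re.compile(
    r"^(\w{3} \d{1,2}, \d{4} \d{2}:\d{2}:\d{2}\.\d{3}) \[0x[0-9a-f]+\] (\w+) - (.*)$"
)

# Signatures seen in the excerpt above; other SQLite errors may look different.
CORRUPTION_PATTERNS = (
    re.compile(r"database disk image is malformed"),
    re.compile(r"SQLITE3:.*database corruption at line \d+"),
)


def find_corruption_events(lines):
    """Return (timestamp, level, message) for lines matching a corruption signature."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if not match:
            continue  # not a regular log line
        timestamp, level, message = match.groups()
        if any(p.search(message) for p in CORRUPTION_PATTERNS):
            events.append((timestamp, level, message))
    return events
```

Feeding it the open log file, e.g. find_corruption_events(open(path)), gives the timestamps of each hit, which makes it easier to line corruption up with scans or DVR recordings.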

Link to comment
8 hours ago, Abzstrak said:

It's been a week now since I added a cache pool, moved the appdata to cache only and set direct io to "no"... I keep randomly checking the consistency of the database, and it's still coming back "ok".  Cautiously optimistic.  I've added a number of videos and recorded some via DVR, no problems as of yet.

Now, the critical question becomes, are you willing to make changes, one at a time, and attempt to force the corruption issue to return? It would be very helpful for troubleshooting to have a system where you can make it unstable at will.

 

If so, the first thing to test is the mapped path for the plex config, change it from /mnt/cache/appdata to /mnt/user/appdata. Make NO other simultaneous changes.

Link to comment
3 hours ago, jonathanm said:

Now, the critical question becomes, are you willing to make changes, one at a time, and attempt to force the corruption issue to return? It would be very helpful for troubleshooting to have a system where you can make it unstable at will.

 

If so, the first thing to test is the mapped path for the plex config, change it from /mnt/cache/appdata to /mnt/user/appdata. Make NO other simultaneous changes.

Perhaps, but I don't think 1 week is sufficient to determine stability.  I don't trust it yet

Link to comment
2 minutes ago, Abzstrak said:

Perhaps, but I don't think 1 week is sufficient to determine stability.  I don't trust it yet

Well, for purposes of troubleshooting, can you establish a metric with which you will be comfortable? I.E, if the longest period of time between corruption events was 5 days before this change, would you be happy it was stable at double or triple that number?
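
To make that metric concrete, here is a small Python sketch: take the timestamps of past corruption events, find the longest gap between two consecutive ones, and multiply it by a safety factor. The function name stability_target and the default factor of 2 are illustrative assumptions, not an agreed standard.

```python
from datetime import datetime, timedelta


def stability_target(corruption_times, factor=2):
    """Return how long a config should run clean before it is called stable."""
    ordered = sorted(corruption_times)
    # Gaps between consecutive corruption events.
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        raise ValueError("need at least two corruption events to measure a gap")
    return max(gaps) * factor
```

With a longest gap of 5 days, a factor of 2 gives a 10-day target and a factor of 3 gives 15 days.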

 

I understand you don't want to do unnecessary work, or provide results that aren't accurate, but at some point you have to make the call if you wish to further help with troubleshooting.

Link to comment

Ok, trying to move the plex appdata to /mnt/cache did something weird. I had to start over. I lost a few years of history :/ 

 

I'm on /mnt/cache now but it'll have to regenerate the database. Plex will need to scan the library and download all metadata again. Fresh start 😞 

Link to comment
11 minutes ago, runraid said:

Ok, trying to move the plex appdata to /mnt/cache did something weird. I had to start over. I lost a few years of history :/ 

 

I'm on /mnt/cache now but it'll have to regenerate the database. Plex will need to scan the library and download all metadata again. Fresh start 😞 

 

You can use trakt.tv to keep track of your watch status.

 

If you moved the appdata folder using root and not preserving attributes, you get problems.

How did you move it?

Link to comment

@saarg before I knew about "mover" I had done a "cp" to cache a while back via ssh as root.

 

Last night I started the move with "mover" but the existing files screwed it up. At that point I was in a mixed state.

 

I'm just starting fresh. I just want to get past this corruption and will do whatever it takes now.

Link to comment
42 minutes ago, runraid said:

@saarg before I knew about "mover" I had done a "cp" to cache a while back via ssh as root.

 

Last night I started the move with "mover" but the existing files screwed it up. At that point I was in a mixed state.

 

I'm just starting fresh. I just want to get past this corruption and will do whatever it takes now.

That sounds like the reason for the issues: some of the old files were already in the cache folder with root permissions. For future use, cp -a preserves the attributes.

Probably best to just start over from scratch.
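
For reference, this is roughly what an attribute-preserving copy looks like when written out in Python. It is only a sketch of the idea behind cp -a, not a replacement for it: shutil.copytree keeps modes and timestamps, and the extra loop re-applies owner and group, which only fully works when run as root. The function name copy_tree_preserving is hypothetical.

```python
import os
import shutil


def copy_tree_preserving(src, dst):
    """Copy src to dst keeping mode, timestamps and, where permitted, owner/group."""
    # copytree uses copy2 by default, which preserves mode bits and timestamps.
    shutil.copytree(src, dst, symlinks=True)

    # Collect the root plus every directory and file below it.
    paths = [src]
    for dirpath, dirnames, filenames in os.walk(src):
        paths.extend(os.path.join(dirpath, name) for name in dirnames + filenames)

    for source_path in paths:
        target_path = os.path.join(dst, os.path.relpath(source_path, src))
        info = os.lstat(source_path)
        try:
            # Re-apply the original owner and group to the copy.
            os.chown(target_path, info.st_uid, info.st_gid, follow_symlinks=False)
        except PermissionError:
            pass  # not running as root; ownership stays with the current user
```

A plain copy as root without this step leaves every file owned by root, which is the kind of mismatch described above.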

Link to comment
On 6/15/2019 at 11:27 AM, runraid said:

@Brian H. your experience mirrors mine. #2, moving to /mnt/disk only buys me about a week before things corrupt — and it seems all SQLite db in multiple containers corrupt at once. 

 

7 hours ago, runraid said:

I had done a "cp" to cache a while back via ssh as root.

 

To assist @limetech in solving this, is it fair to say that your previous quote about /mnt/disk/ not working is no longer valid?

Link to comment

   I have never experienced Corruption. Unraid Version 6.7.0

1. M/B: Supermicro X11SSH-CTF Version 1.01 - s/n: ZM17BS015333 BIOS: American Megatrends Inc. Version 2.2. Dated: 05/23/2018

   CPU: Intel® Xeon® CPU E3-1220 v5 @ 3.00GHz HVM: Enabled IOMMU: Enabled Cache: 128 KiB, 128 KiB, 1024 KiB, 8192 KiB

   Memory: 16 GiB DDR4 Single-bit ECC (max. installable capacity 64 GiB)

   Network: eth0: 1000 Mbps, full duplex, mtu 1500
   eth1: 1000 Mbps, full duplex, mtu 1500

   Kernel: Linux 4.19.41-Unraid x86_64

   OpenSSL: 1.1.1b

 

2. 59,031 Files total. Family Videos Pictures and other files.  Using 6.13TBs

3. Been using /mnt/cache/appdata since I first installed a docker, almost a year ago.

4. Binhex-Plexpass, with subscription. Just for more info: Tautulli docker and Unifi docker. That's all the dockers I currently have installed.

5. Automatic library update: I had it set to daily, and my time window was 3am to 5am.

6. No as my DB is currently fine. I am only trying to assist by providing more information.

    Dual parity: 1x 4TB WD Red and 1x 8TB Red Pro. Still working on the other 8TB Red Pro.

3 disks in pool: 1x 3TB WD Red and 2x 4TB WD Reds.

   I can only hope this information will help.

Edited by AinsWorth
Link to comment
3 hours ago, runraid said:

F#%* Because I upgraded to 6.7.1-rc2 I can't downgrade below 6.7.0. I can no longer use plex. I have no idea what to do now. This is a very bad experience. I'm seeing more people suffering from this today.

The GUI only lets you go back one step, so in that sense you are right.  However, you can always do this manually by downloading the ZIP file of the release you want from the Unraid site and then extracting all the bz* type files, overwriting the ones on the flash drive.
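
If you prefer to script that manual step, a minimal Python sketch could look like the following. It extracts only the top-level bz* members of the release ZIP into a target directory, overwriting files of the same name. The function name extract_bz_files is hypothetical, and the target directory is whatever path your flash drive is mounted at (commonly /boot on a running Unraid server, but treat that as an assumption and check first). Back up the flash drive before overwriting anything.

```python
import shutil
import zipfile
from pathlib import Path, PurePosixPath


def extract_bz_files(zip_path, flash_dir):
    """Extract top-level bz* members of the ZIP into flash_dir, overwriting."""
    flash_dir = Path(flash_dir)
    written = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            member = PurePosixPath(info.filename)
            # Only plain files at the top level whose name starts with "bz".
            if info.is_dir() or len(member.parts) != 1:
                continue
            if not member.name.startswith("bz"):
                continue
            with archive.open(info) as source, open(flash_dir / member.name, "wb") as target:
                shutil.copyfileobj(source, target)
            written.append(member.name)
    return sorted(written)
```

The returned list shows which files were replaced, so you can confirm nothing else on the flash drive was touched before rebooting.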

Link to comment
7 hours ago, runraid said:

@limetech @Squid I started fresh. I moved to /mnt/cache/appdata/plex and it’s been scanning media since this morning. The database is already corrupt. I’m rolling back unraid versions, the current version is too unstable to be used for many of us. 

Are you sure that you don't have any bad hardware? From what I have seen, you are the one who has the most corruption of sqlite databases.

Did you post your hardware config?

And have you run memtest?

Link to comment
