testdasi Posted June 20, 2019

I think with complex issues like this, we need to be scientific and methodical instead of having anyone and everyone report the problem and tell each other to try this or that. How about this: anyone who reports the problem also reports:

1. What CPU? How much RAM? Array config?
2. Roughly how large is your collection? I think file count, even a rough estimate, is more important here.
3. Have you set your appdata to /mnt/cache (or, for those without cache, /mnt/disk1)? If you haven't, we'll ignore you.
4. Do you have a link between Sonarr and Plex? If yes, have you disabled it? If you haven't, we'll ignore you.
5. Do you have automatic library update on change / partial change? If yes, have you set it to hourly? If you haven't, we'll ignore you.
6. This is more controversial: can you rebuild your db from scratch?
<add more points as things progress>

The key idea is to get all affected users within a sufficiently small boundary that a clear pattern can emerge from all the noise. Perhaps limetech should open a separate topic, with the first post updated with the details for each user reporting the issue.

I know the "we'll ignore you" seems harsh, but adding noisy info can be worse than not having the info. And to be brutally honest, if you can't be bothered to help yourself, we can't be bothered to help you.

Now comes the hypothesizing. Reading through this topic again, here is how I would summarize it:

1. The issue affects a minority of users and not others.
2. The db corruption appears to have no clear pattern (and is not reproducible by those not affected, e.g. limetech).
3. Having a cache disk and setting Plex appdata to /mnt/cache seems to help some users but not others.
4. Cutting the link between Plex and Sonarr seems to help.
5. Reducing Plex library scan frequency seems to help.

Based on the above 5 points, could it be that the affected users already have existing corruption in their db? That would kinda explain (1) and (2), i.e. why limetech can't reproduce the issue: hardware idiosyncrasies aside, the key difference is that they would either start from a good db or rebuild a brand new db, which naturally makes it good. It would also explain why (4) and (5) help, because fewer interactions reduce the probability of accessing the bad portion of the db. It's harder to explain (3), unless (3) was the cause of the original db corruption. Setting the db on /mnt/cache skips shfs, which can otherwise be a bit resource hungry. Perhaps the slower performance causes some writes to be done out of order, which can corrupt the db. This also fits with why 6.6.7 is good but 6.7.0 isn't: perhaps all the security fixes slow things down just enough to pass the threshold that leads to corruption.
limetech Posted June 20, 2019

There is another topic on this subject, but I'm giving up on that one. We got to the point where the OP was convinced that appdata/plex residing on a single-disk btrfs cache device and mapped using the path /mnt/cache/appdata/plex was stable. What would be helpful is a) others confirming that, and then b) after confirming this exact config is stable, simply changing the path to /mnt/user/appdata/plex. This will tell me whether simply passing I/O through the 'shfs' layer is introducing the issue. If it starts to fail, then there are other tests to try.
runraid Posted June 20, 2019

Thank you very much for the update @limetech. What's the best way to move data to /mnt/cache/appdata/plex? Do we simply do a cp? I thought I read somewhere we should use the "mover". Just wanted to confirm, then I'll do this test.
JonathanM Posted June 20, 2019

19 minutes ago, runraid said: What's the best way to move data to /mnt/cache/appdata/plex? Do we simply do a cp?

What is the path currently?
runraid Posted June 20, 2019

@jonathanm /mnt/disk3/appdata/plex
dimitriz Posted June 20, 2019 (edited)

ASUSTeK Computer INC. P6T DELUXE V2, Rev 1.xx - Intel® Core™ i7 CPU 920 @ 2.67GHz - 12GB of RAM, 5x 8TB HDs (2 parity, 3 in the default array). New install. I moved 3 movies, 2 DVRed shows, and 1k pictures.

Set to: /mnt/disk1/appdata/binhex-plex/. Previously it was set to the default, which was causing SQL issues a few times a day. Currently at 24hrs and no corruption. Only Plex on the system; this is a NEW install (plus Community Applications, Fix Common Problems, and Unassigned Devices).

Set to do hourly. I played with this setting a number of times but saw no change in corruption frequency.

Well... update: the DB corrupted again.

Edited June 21, 2019 by dimitriz
JonathanM Posted June 20, 2019

36 minutes ago, runraid said: @jonathanm /mnt/disk3/appdata/plex

First, make sure Plex is not set to auto start. Assuming your appdata folder will fit on your cache drive, then yes, you can just use the mover. Set the appdata share to Cache: Prefer. Before you run the mover, you MUST be sure there are no open files in the appdata tree, as the mover won't touch open files. The easiest way to do that is to make sure the Docker service is not running, so there is no Docker item visible in the GUI list of pages: Settings, Docker, Enable Docker: No. After the mover is done, enable the Docker service. Before starting Plex, edit the config path in the Plex docker so that instead of /mnt/disk3/appdata/plex it's /mnt/cache/appdata/plex. Verify stability in that state before moving on, as that is the premise behind the test.
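If you want to double-check that nothing is holding files open before running the mover, a minimal sketch using lsof (assuming lsof is available on your box; the path in the example is illustrative, match it to your own appdata location):

```shell
# open_count DIR — count files currently held open anywhere under DIR.
# The mover skips open files, so this should print 0 before you run it.
open_count() {
    lsof +D "$1" 2>/dev/null | tail -n +2 | wc -l
}

# Example (with the Docker service stopped, this should print 0):
#   open_count /mnt/disk3/appdata
```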
Abzstrak Posted June 21, 2019 (edited)

It's been a week now since I added a cache pool, moved the appdata to cache only, and set Direct IO to "no". I keep randomly checking the consistency of the database, and it's still coming back "ok". Cautiously optimistic. I've added a number of videos and recorded some via DVR; no problems as of yet. I am on a new install, built the machine about 4 weeks ago.

What CPU? How much RAM? Array config?
- Asrock H370ITX/ac with i3-8300
- 32GB DDR4 (2x 16GB sticks)
- 6x SATA 4TB Seagate Constellation drives (double parity array, 16TB usable)
- 2x Intel 660p 512GB NVMe (btrfs pool, RAID 1)

Roughly how large is your collection?
- Approximately 26,500 files in the Plex collection across 5 libraries.

Have you set your appdata to /mnt/cache (or for those without cache, /mnt/disk1)?
- It was set to /mnt/user/appdata originally. This usually corrupted in less than 24 hours.
- I moved appdata and system all to one drive and mapped appdata in the Plex docker to /mnt/disk2/appdata. This was better; it would last about 72 hours.
- I then moved the transcode folder to a tmpfs drive of 24GB max, since I noticed corruption mainly on larger DVR recording nights (5 or more shows). This lasted 5 days before I saw corruption.
- I added the cache array (didn't have it before) and then moved appdata and system entirely to cache only. I set the docker to /mnt/cache/appdata. At the same time, I also set Direct IO from "auto" to "no". It has been 1 week now with no corruption. I'm not trusting it at all.

Do you have a link between Sonarr and Plex?
- I only use Plex. The goal was to switch everything to Unraid, but this issue has hindered that effort greatly. Until this is stable, I have only the Plex and Syncthing dockers.

Do you have automatic library update on change / partial change?
- Yes. All the check boxes are checked in Plex except music in automatic updates. The library scan interval is 24 hours.

Can you rebuild your db from scratch?
- This was a from-scratch build. I've run Plex for quite some time and never once saw corruption before coming to Unraid 4 weeks ago. I'm still on the trial due to this issue alone. If I rebuild again it will not be on Unraid; I'll probably go back to Debian or OMV, mainly because of the annoyance and monotony of redoing DVR.

<add more points as things progress>
- I am mapping /dev/dri through to the container for Intel GPU decoding in Plex. Not sure who else is doing this, or whether it's related.

Edited June 21, 2019 by Abzstrak (spelling)
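For anyone else who wants to spot-check their database the same way, a minimal sketch using SQLite's built-in check (the db path in the example is the usual location under a binhex-plex appdata mapping, but that's an assumption; adjust it to yours, and stop the container first so nothing is writing):

```shell
# check_db FILE — run SQLite's built-in integrity check.
# A healthy database prints a single line: ok
check_db() {
    sqlite3 "$1" "PRAGMA integrity_check;"
}

# Typical use (path is an example, match your appdata mapping):
#   docker stop binhex-plex
#   check_db "/mnt/cache/appdata/binhex-plex/Plex Media Server/Plug-in Support/Databases/com.plexapp.plugins.library.db"
#   docker start binhex-plex
```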
dimitriz Posted June 21, 2019

Jun 20, 2019 23:17:05.918 [0x148cc63f1700] DEBUG - Skipping over directory '1993/Cin1993', as nothing has changed; removing 16 media items from map.
Jun 20, 2019 23:17:05.980 [0x148cc63f1700] WARN - Scanning the location /media/Pictures did not complete
Jun 20, 2019 23:17:05.980 [0x148cc63f1700] DEBUG - Since it was an incomplete scan, we are not going to whack missing media.
Jun 20, 2019 23:17:06.077 [0x148cc63f1700] DEBUG - Refreshing section 1 of type: 13
Jun 20, 2019 23:17:06.437 [0x148d582c9700] DEBUG - Refreshing 0 IDs.
Jun 20, 2019 23:17:08.104 [0x148cc63f1700] ERROR - SQLITE3:(nil), 11, database corruption at line 64873 of [bf8c1b2b7a]
Jun 20, 2019 23:17:08.104 [0x148cc63f1700] ERROR - SQLITE3:(nil), 11, statement aborts at 19: [select distinct(metadata_items.id) from metadata_items where metadata_items.library_section_id=1 and metadata_items.metadata_type in (14) and (length(user_thumb_url)=0 or length(user
Jun 20, 2019 23:17:08.104 [0x148cc63f1700] ERROR - LibraryUpdater: exception updating libraries; pausing updates briefly before retrying: sqlite3_statement_backend::loadRS: database disk image is malformed
Jun 20, 2019 23:17:14.922 [0x148d584ca700] DEBUG - Completed: [192.168.1.121:49230] -2 GET /player/proxy/poll?deviceClass=pc&protocolVersion=3&protocolCapabilities=timeline%2Cplayback%2Cnavigation%2Cmirror%2Cplayqueues&timeout=1 (6 live) TLS GZIP 20001ms 5 bytes (pipelined: 48)
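When a db reaches the "database disk image is malformed" state, one thing worth trying before a full rebuild is dumping whatever SQLite can still read into a fresh file. This is a sketch, not a guaranteed fix: badly damaged rows are simply lost, so back the original file up first.

```shell
# salvage_db BAD FRESH — dump everything readable from BAD and
# re-import it into a brand-new database file FRESH.
salvage_db() {
    sqlite3 "$1" ".dump" | sqlite3 "$2"
}

# Example:
#   salvage_db com.plexapp.plugins.library.db rebuilt.db
```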
JonathanM Posted June 21, 2019

8 hours ago, Abzstrak said: it's been a week now here since I added a cache pool, moved the appdata to cache only and set it direct io to "no"... I keep randomly checking the consistency of the database, it's still coming back "ok". Cautiously optimistic.

Now the critical question becomes: are you willing to make changes, one at a time, and attempt to force the corruption issue to return? It would be very helpful for troubleshooting to have a system where you can make it unstable at will. If so, the first thing to test is the mapped path for the Plex config: change it from /mnt/cache/appdata to /mnt/user/appdata. Make NO other simultaneous changes.
Abzstrak Posted June 21, 2019

3 hours ago, jonathanm said: Now the critical question becomes: are you willing to make changes, one at a time, and attempt to force the corruption issue to return?

Perhaps, but I don't think 1 week is sufficient to determine stability. I don't trust it yet.
JonathanM Posted June 21, 2019

2 minutes ago, Abzstrak said: Perhaps, but I don't think 1 week is sufficient to determine stability. I don't trust it yet.

Well, for the purposes of troubleshooting, can you establish a metric you would be comfortable with? I.e., if the longest period between corruption events was 5 days before this change, would you be satisfied it was stable at double or triple that number? I understand you don't want to do unnecessary work, or provide results that aren't accurate, but at some point you have to make the call whether you wish to keep helping with troubleshooting.
runraid Posted June 22, 2019

Ok, trying to move the Plex appdata to /mnt/cache did something weird. I had to start over, and I lost a few years of history. I'm on /mnt/cache now, but it'll have to regenerate the database. Plex will need to scan the library and download all the metadata again. Fresh start 😞
saarg Posted June 22, 2019

11 minutes ago, runraid said: Ok, trying to move the Plex appdata to /mnt/cache did something weird. I had to start over, and I lost a few years of history.

You can use trakt.tv to keep track of your watch status. If you moved the appdata folder as root without preserving attributes, you get problems. How did you move it?
runraid Posted June 22, 2019

@saarg Before I knew about the "mover" I had done a "cp" to cache a while back via ssh as root. Last night I started the move with the "mover", but the existing files screwed it up. At that point I was in a mixed state. I'm just starting fresh. I just want to get past this corruption and will do whatever it takes now.
saarg Posted June 22, 2019

42 minutes ago, runraid said: @saarg Before I knew about the "mover" I had done a "cp" to cache a while back via ssh as root. Last night I started the move with the "mover", but the existing files screwed it up. At that point I was in a mixed state. I'm just starting fresh. I just want to get past this corruption and will do whatever it takes now.

That sounds like the reason for the issues: some of the old files were already in the cache folder with root permissions. For future use, cp -a preserves the attributes. Probably best to just start over from scratch.
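To illustrate the difference, a quick self-contained sketch (temporary files only, nothing here touches your array):

```shell
# Plain `cp` gives the copy a fresh timestamp and, when run as root,
# root ownership. `cp -a` (archive mode) preserves mode, ownership,
# and timestamps, which is what a docker's appdata needs to survive
# the move intact.
src=$(mktemp -d); plain=$(mktemp -d); archive=$(mktemp -d)

touch -d '2019-01-01 00:00:00' "$src/config.xml"
chmod 640 "$src/config.xml"

cp    "$src/config.xml" "$plain/"    # new mtime, attributes reset
cp -a "$src/config.xml" "$archive/"  # mtime and mode preserved

stat -c '%y %a' "$archive/config.xml"
```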
runraid Posted June 22, 2019

@saarg Yup, starting fresh. No media was lost, just metadata. Not so bad. Lesson learned. Btw, thank you all for the continued effort in finding a fix for this.
Squid Posted June 22, 2019

On 6/15/2019 at 11:27 AM, runraid said: @Brian H. your experience mirrors mine. #2, moving to /mnt/disk only buys me about a week before things corrupt, and it seems all the SQLite dbs in multiple containers corrupt at once.

7 hours ago, runraid said: I had done a "cp" to cache a while back via ssh as root.

To assist @limetech in solving this, is it fair to say that your earlier statement about /mnt/disk/ not working is no longer valid?
runraid Posted June 22, 2019

No. I ran on /mnt/disk without touching those files via the cp command, and it corrupts every few days while on /mnt/disk. I'm on the cache drive as of this morning and I'll keep you all posted on how that goes. @Squid
runraid Posted June 23, 2019

@limetech @Squid I started fresh. I moved to /mnt/cache/appdata/plex and it's been scanning media since this morning. The database is already corrupt. I'm rolling back Unraid versions; the current version is too unstable to be used by many of us.
runraid Posted June 23, 2019 Share Posted June 23, 2019 F#%* Because I upgraded to 6.7.1-rc2 I can’t downgrade below 6.7.0. I can no longer user plex. I have no idea what to do now. This is a very bad experiences. I’m seeing more people suffering from this today. Quote Link to comment
runraid Posted June 23, 2019

Would it be helpful if I let one of the Limetech employees ssh to my server to debug?
AinsWorth Posted June 23, 2019 (edited)

I have never experienced corruption. Unraid version 6.7.0.

1. M/B: Supermicro X11SSH-CTF Version 1.01 - s/n: ZM17BS015333
BIOS: American Megatrends Inc. Version 2.2, dated 05/23/2018
CPU: Intel® Xeon® CPU E3-1220 v5 @ 3.00GHz
HVM: Enabled; IOMMU: Enabled
Cache: 128 KiB, 128 KiB, 1024 KiB, 8192 KiB
Memory: 16 GiB DDR4 single-bit ECC (max. installable capacity 64 GiB)
Network: eth0: 1000 Mbps, full duplex, mtu 1500; eth1: 1000 Mbps, full duplex, mtu 1500
Kernel: Linux 4.19.41-Unraid x86_64
OpenSSL: 1.1.1b
2. 59,031 files total: family videos, pictures, and other files, using 6.13TB.
3. Been using /mnt/cache/appdata since I first installed a docker, almost a year ago.
4. Binhex-Plexpass, with subscription. Just for more info: also the Tautulli and Unifi dockers. That's all the dockers I currently have installed.
5. Do you have automatic library update on change / partial change? I had it set to daily, with a time window of 3am to 5am.
6. No, as my DB is currently fine. I am only trying to assist by providing more information.

Dual parity: one 4TB WD Red and one 8TB Red Pro (still working on the other 8TB Red Pro). 3 disks in the pool: one 3TB WD Red and two 4TB WD Reds.

I can only hope this information will help.

Edited June 23, 2019 by AinsWorth
itimpi Posted June 23, 2019

3 hours ago, runraid said: F#%*. Because I upgraded to 6.7.1-rc2 I can't downgrade below 6.7.0. I can no longer use Plex. I have no idea what to do now.

The GUI only lets you go back one step, so in that sense you are right. However, you can always do it manually by downloading the ZIP file of the release you want from the Unraid site and then extracting all the bz* type files, overwriting the ones on the flash drive.
saarg Posted June 23, 2019

7 hours ago, runraid said: @limetech @Squid I started fresh. I moved to /mnt/cache/appdata/plex and it's been scanning media since this morning. The database is already corrupt.

Are you sure you don't have any bad hardware? From what I have seen, you are the one with the most corruption of SQLite databases. Did you post your hardware config? And have you run memtest?