SQLite DB Corruption testers needed

mdeabreu · August 19, 2019

Unfortunately, the Plex database got corrupted while I was trying to rebuild the Radarr and Sonarr databases. I decided to revert to 6.6.7. I will keep an eye on this thread in case there is any more testing that I might be able to help with. Fingers crossed we can get to the root of this soon.

raerae1616 · August 19, 2019

I've been struggling with Radarr & Sonarr "malformed" database errors for weeks now also -- ever since trying to move them from my HTPC to my Unraid server (seemed like a great idea, now not-so-much).

Unraid Setup:

I'm running a brand new Unraid server build (using my old HTPC hardware), so the first and only version I've used is Unraid v6.7.2.
I'm using binhex-radarr & binhex-sonarr docker containers.
All appdata mappings have been updated to /mnt/disk2.
I do not have a cache drive (nor do I need one for my purposes) so I'm currently using the disk array directly via /mnt/disk2.

Hardware Info:

AMD Athlon II X4 620
ASUS M4A785-M
8GB DDR2 RAM
LSI SAS 9207-8i HBA card -- all disk drives are running on it; no drives are connected to the motherboard (to avoid various known BIOS booting issues with the MB)
Running 5 Data Disks with 1 Parity Disk (brand new Toshiba 4TB NAS 7200 RPM HDD).

I haven't been able to get Sonarr stable on Unraid v6.7.2 . . . I also keep getting SQLite malformed database errors every couple of days.

Initially Sonarr seemed to be stable for a week before installing Radarr... but I wasn't paying too much attention prior to installing Radarr. Lately, however, Radarr has been working without issue, but Sonarr results in a corrupted/malformed SQLite database just about every couple of days.

I may try adding a small HDD I have in an external enclosure and moving appdata mappings to that via Unassigned Devices as a last attempt . . .

Edited August 19, 2019 by raerae1616
Details/formatting

Kosslyn · August 19, 2019

I had downgraded to 6.6.7 after I spent days with corrupted databases in plex, sonarr, and radarr, but I needed to upgrade to 6.7.x for some new hardware I got and am now having the same corruption issues.

Following this thread, I moved my appdata to cache disk, but then lost everything due to a restart. I then rebuilt from backups. I ran appdata from cache for 2 days with no corruptions. I rebuilt plex from the ground up twice. Usually importing all my video and audio would cause a corruption error, but running from cache, it did not corrupt. However, due to losing data on power loss, I decided to put an old hard drive into the server as an unassigned drive and have mounted that to use exclusively for appdata. I'm hoping that since it isn't included whatsoever in the array and the fs doesn't touch this drive, my db's will be safe. I'll update if I find any corruption.

principis · August 19, 2019

4 hours ago, Kosslyn said:

I had downgraded to 6.6.7 after I spent days with corrupted databases in plex, sonarr, and radarr, but I needed to upgrade to 6.7.x for some new hardware I got and am now having the same corruption issues.

Following this thread, I moved my appdata to cache disk, but then lost everything due to a restart. I then rebuilt from backups. I ran appdata from cache for 2 days with no corruptions. I rebuilt plex from the ground up twice. Usually importing all my video and audio would cause a corruption error, but running from cache, it did not corrupt. However, due to losing data on power loss, I decided to put an old hard drive into the server as an unassigned drive and have mounted that to use exclusively for appdata. I'm hoping that since it isn't included whatsoever in the array and the fs doesn't touch this drive, my db's will be safe. I'll update if I find any corruption.

I guess you haven't assigned a cache disk? That means that the cache is just written to ram. You could assign the extra disk as cache disk and that would solve the problem.

Kosslyn · August 19, 2019

For some reason my google-fu was failing me so I decided not to risk it. If I write to /mnt/cache without a cache disk then it writes to ram, correct? If I write there with a cache disk, then it writes to the disk and the data persists even through restarts and power loss?

And so far there still isn't anyone who has had an issue with the db's while running from /mnt/cache, correct? I figured another reason for using unassigned devices was that Unraid didn't touch the disk at all and it'd be safe from this and most future bugs. I don't mind getting the smallest nvme I can find and just putting that in as an unassigned device for appdata so that I know everything is safe. Is this a bad idea? Anything I'm missing?

I've also hit 2 other issues with 6.7.x so I'm trying to get those figured out, but I'm going to add back a cache disk at some point.

Edited August 19, 2019 by Kosslyn

Squid · August 19, 2019

1 hour ago, Kosslyn said:

If I write to /mnt/cache without a cache disk then it writes to ram, correct?

Yes

1 hour ago, Kosslyn said:

If I write there with a cache disk, then it writes to the disk and the data persists even through restarts and power loss?

Yes
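
For anyone wanting to verify this before pointing appdata at /mnt/cache, here is a minimal sketch. It assumes the `mountpoint` utility (util-linux) is available and that, as confirmed above, /mnt/cache without an assigned cache disk is only a plain directory held in RAM rather than a mounted filesystem. The function name `is_disk_backed` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only if the given path is a real mount point.
# Assumption: with no cache disk assigned, /mnt/cache is a plain directory
# on the RAM-backed root filesystem, so this check fails for it.
is_disk_backed() {
    mountpoint -q "$1"
}

if is_disk_backed /mnt/cache; then
    echo "/mnt/cache is a mounted filesystem"
else
    echo "/mnt/cache is not a mount point; data written there may live only in RAM"
fi
```

A check like this could run at the top of any script that writes to /mnt/cache, so the script refuses to continue when the path is not backed by a disk.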

mdeabreu · August 19, 2019

1 hour ago, Kosslyn said:

If I write to /mnt/cache without a cache disk then it writes to ram, correct?

8 minutes ago, Squid said:

Yes

Does this mean that even without a physical cache disk I could set my docker containers to /mnt/cache/appdata and then periodically run the mover to get the appdata out of ram and into the array? Could this be a potential solution for those of us without physical cache disks? (fully understanding that power loss means data loss)

wgstarks · August 19, 2019

5 minutes ago, mdeabreu said:

Does this mean that even without a physical cache disk I could set my docker containers to /mnt/cache/appdata and then periodically run the mover to get the appdata out of ram and into the array?

If you set an appdata path of /mnt/cache and then move appdata to the array your appdata won’t be at /mnt/cache anymore and the dockers won’t be able to connect to it.

mdeabreu · August 19, 2019

1 minute ago, wgstarks said:

If you set an appdata path of /mnt/cache and then move appdata to the array your appdata won’t be at /mnt/cache anymore and the dockers won’t be able to connect to it.

Sorry I was unclear, the appdata share points to the array (say disk1 only); the containers themselves point directly to /mnt/cache/appdata instead of /mnt/user/appdata or /mnt/disk1/appdata

Then, if I understand correctly, the containers will write directly to /mnt/cache/appdata which will be in RAM; then the mover should grab the RAM only contents and move them back into the array.

wgstarks · August 19, 2019

If you point your containers to /mnt/cache/appdata they will only see what is contained in /mnt/cache/appdata. As soon as mover runs you’ll lose your appdata.

mdeabreu · August 19, 2019

Ah, crystal clear. Thank you for clearing up my misunderstanding!

jamesj2 · August 19, 2019

Thought I'd share my experience. I changed my drive configuration a couple weeks ago. Went from 7 2TB drives with BTRFS cache pool and 2 parity to 2 14TB drives with no cache and 1 parity. I too ran into SQLite database corruption with Plex. I thought it was something I did when I moved all the files to the new drives. I did perform the Plex database fix procedures a couple times since the drive change and then came across this thread. Since then I have downgraded to Unraid 6.6.7 and it's been running fine for 5 days. None of my shares have been set to use a cache drive since moving to the new drives.

Kosslyn · August 19, 2019

2 hours ago, Squid said:

Yes

Yes

Thank you!

1 hour ago, wgstarks said:

If you point your containers to /mnt/cache/appdata they will only see what is contained in /mnt/cache/appdata. As soon as mover runs you’ll lose your appdata.

So far as I understand it, there is no way to have files on cache and on the array (i.e. duplicate), correct? The cache is a write only cache and there is no way to set it up as a tiered file system or have files live on cache, but then have basically a snapshot of cache stored onto the array without removing the files from cache, correct?

Using "yes" for cache would never move the files to the array and using "prefer" would also not move the files from cache onto the array (unless the cache fills up). And if we use "prefer" and cache fills up, then once it is cleared, the files move back to cache but are then removed from the array, correct?

Squid · August 19, 2019

53 minutes ago, Kosslyn said:

Using "yes" for cache would never move the files to the array

Incorrect. "Yes" moves them to the array.

JonathanM · August 19, 2019

3 hours ago, mdeabreu said:

Does this mean that even without a physical cache disk I could set my docker containers to /mnt/cache/appdata and then periodically run the mover to get the appdata out of ram and into the array? Could this be a potential solution for those of us without physical cache disks? (fully understanding that power loss means data loss)

If you really want to, you could do something like you said if you have enough RAM. However... if you have a power outage, you will need to manually intervene and have a long enough UPS runtime to stop the docker service, move appdata and system share to an array disk, then shut down after all data is safely back on the array. I'd guesstimate you'd need probably around an hour of runtime to get all that accomplished, so not a consumer grade UPS, or else a backup generator that will allow seamless power through the UPS.

Then, when the coast is clear, start the array, manually move the appdata and system to /mnt/cache (the mover won't work if there is no real cache drive), enable the docker service and be back up running.

If at any point the box shuts down before you get your data moved out of RAM, it's all gone.

So, theoretically given enough resources (RAM, UPS runtime) you could make it work.

However, it would seem to me that sourcing a SSD cache drive would be much cheaper and less stress.

trott · August 20, 2019

I'm new to unraid. I have tried it for about 15 days, running emby and sonarr without issue, but I do not use the array or cache; I put appdata on an SSD mounted with UD.

toonamo · August 20, 2019

So I'm a glutton for punishment and still am using 6.7.2 and get the corruption issue, but i mean duh right?

Info about system if it helps towards finding the common denominator.

eVGA X58 121-BL-E756 w/ I7-950x @ 3066 MHz; HVM: Enabled; IOMMU: Disabled (Not Available for this board); Memory: 12 GiB DDR2; Kernel: Linux 4.19.56-Unraid x86_64; 2*10TB WD RED's; no cache; docker appdata pointed to disk1 not user;

i deleted the db and slowly added everything back one by one to see if a particular movie/tv show was causing the issue. Doing it this way i was able to rebuild the database without corruption. If i just added all my files and let it rebuild all at once it was almost guaranteed to cause corruption.

after doing this and getting plex to work i noticed sonarr was corrupt now.

but since i was able to get it going and have a backup to go to i'm wondering if anyone out there knows of some commands i could put into a script to run every so often to look for corruption, if none found manually backup the database, else if corruption found, delete database and restore from backup.

not sure what command i could use to find the malformed database

can i backup the database while plex is running? i know the wal and shm files are only there while plex is running, i'm wondering what would happen if i backup the main files while those exist, or if i have to stop plex.

also would anyone know how to do this with sonarr as well?
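
On the two questions above (how to detect a malformed database, and whether it can be copied while the app is running), here is a minimal sketch. It assumes the stock `sqlite3` command-line tool is installed and can read the application's database. `PRAGMA integrity_check` returns the single row `ok` for a healthy file, and the CLI's `.backup` dot-command copies through SQLite's online backup interface rather than copying raw files. The function names and any paths are hypothetical.

```shell
#!/bin/bash
# Hypothetical helpers; the paths you pass in depend on your own setup.

# Succeeds only if SQLite reports the single row "ok" for the database.
db_is_healthy() {
    local result
    result=$(sqlite3 "$1" "PRAGMA integrity_check;" 2>/dev/null)
    [ "$result" = "ok" ]
}

# Copies a database using SQLite's online backup interface.
# Note: paths containing a single quote would break the quoting used here.
db_hot_backup() {
    sqlite3 "$1" ".backup '$2'"
}

# Example (not executed here):
#   db_is_healthy /path/to/app.db && db_hot_backup /path/to/app.db /path/to/backups/app.db
```

Whether a given app tolerates an outside process reading its database while it runs is a separate question; stopping the container first, as others in the thread do, remains the conservative choice.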

mi5key · August 20, 2019

I backup my Plex database nightly via cron because of this corruption issue. I have my auto update Library turned off and corruptions are rare, but this is a super duper workaround. I'd much rather have it auto scan, but until Limetech can fix this, this is how I roll. Sonarr corrupts more frequently due to the way it runs.

docker stop Plex

cd /mnt/user/appdata/PlexMediaServer/

tar zcf /mnt/user/appdata/PlexMediaServer/backups/Library_`date +%m-%d-%Y_%H%M`.tar.gz ./Library

docker start Plex

Same with Sonarr

docker stop binhex-sonarr

cd /mnt/user/appdata/binhex-sonarr

tar zcf /mnt/user/appdata/binhex-sonarr/Backups/cron-based/sonarrdb_`date +%m-%d-%Y_%H%M`.tar.gz config.xml nzbdrone.db nzbdrone.db-journal

docker start binhex-sonarr

Then I have a cron job sweep both locations for any backups more than 10 days old.

find /mnt/user/appdata/PlexMediaServer/backups -mtime +9 -type f -delete

Edited August 21, 2019 by mi5key
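
The sweep shown above covers only the Plex backup directory. A minimal sketch of one way to apply the same `find` expression to several directories follows; the function name `prune_backups` is hypothetical, and the directories in the usage comment are the two from the post above.

```shell
#!/bin/bash
# Hypothetical helper: prune_backups MTIME DIR [DIR...]
# Deletes regular files in each DIR matching `find -mtime +MTIME`.
prune_backups() {
    local mtime=$1
    shift
    local dir
    for dir in "$@"; do
        # Skip directories that do not exist instead of failing.
        [ -d "$dir" ] || continue
        # -mtime +9 matches files whose age in whole days is greater than 9,
        # i.e. files that are at least 10 days old.
        find "$dir" -type f -mtime +"$mtime" -delete
    done
}

# Example (not executed here):
#   prune_backups 9 /mnt/user/appdata/PlexMediaServer/backups \
#                   /mnt/user/appdata/binhex-sonarr/Backups/cron-based
```

Keeping the threshold as a parameter makes it easy to retain Sonarr backups for longer than Plex ones, which may be worthwhile given that Sonarr reportedly corrupts more often.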

toonamo · August 21, 2019

just created a script that runs PRAGMA integrity_check on the database and, if it passes, backs up the database. if it fails, then it stops the docker, restores the latest backup, and restarts the docker.

i need to make one for sonarr now and then i'm going to borrow your code on deleting backups older than 10 days.
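
The script itself was not posted, so what follows is only a sketch of the flow described above, not toonamo's actual code. It assumes the stock `sqlite3` CLI and the `docker` CLI are available; the function name, the container name, and both paths are hypothetical arguments.

```shell
#!/bin/bash
# Sketch: check_and_protect CONTAINER DB BACKUP_DIR (all hypothetical arguments).
# Healthy database -> take a timestamped backup.
# Corrupt database -> stop the container, restore the newest backup, restart.
check_and_protect() {
    local container=$1 db=$2 backup_dir=$3
    local name stamp latest
    name=$(basename "$db")

    if [ "$(sqlite3 "$db" 'PRAGMA integrity_check;' 2>/dev/null)" = "ok" ]; then
        stamp=$(date +%Y-%m-%d_%H%M)
        mkdir -p "$backup_dir"
        # Online backup, so the copy is taken through SQLite itself.
        sqlite3 "$db" ".backup '$backup_dir/$name.$stamp'"
        return
    fi

    # Newest backup by modification time, if there is one.
    latest=$(ls -t "$backup_dir/$name".* 2>/dev/null | head -n 1)
    if [ -z "$latest" ]; then
        echo "no backup available for $db" >&2
        return 1
    fi

    docker stop "$container"
    cp "$latest" "$db"
    docker start "$container"
}
```

One design point worth weighing: because a backup is only taken after a passing check, the newest backup should always be a file that was healthy at the time it was copied.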

BBLV · August 21, 2019

Still no hard fix?? I moved appdata and system to cache 2 nights ago and so far so good. I'll report back if I see corruption. If I do, I might toss this server in the can!

simalex · August 21, 2019

Downgraded again to 6.6.x version as, at least for now, my use case involves periods of heavy I/O load on the server.

The problem, as far as I can pinpoint it, is related to concurrent writes to the same file by more than one thread/process under heavy I/O load. It appears that under certain load circumstances the updates are not being applied to the file in the proper sequence, meaning that a disk section that has in theory been updated by process A and then process B is actually written to disk first by process B and then by process A, leaving the actual file in an inconsistent state.

This could be a bug or an improperly handled exception case in either unRaid or SQLite. In the case of SQLite on unRaid, the chances of this "heavy load" issue manifesting are multiplied because of the way unRaid works: a. the Parity disk is a bottleneck, and b. unused disks are spun down, which causes the system to "freeze" I/O operations when one of the disks needs to be spun up again (at least that is the case on my H310).

What is most concerning for me, however, is the following post on this thread:

On 8/19/2019 at 6:26 AM, phbigred said:

Noticed cache corruption with my VMs too becoming unable to backup

which of course might be completely unrelated. This to me indicates that even choosing to go the VM way instead of Docker for my Plex & Sonarr I might still have issues once I upgrade to the latest unRaid version.

In any case I think it would be great if we could get some update from the development team, just to understand what the status is.

naturalcarr · August 21, 2019

I've noticed a new and scary problem that happened specifically to Sonarr: for about 45 minutes, Sonarr would nearly instantly corrupt its database on 6.7.2. I do have what I'd consider to be a decent docker load running at all times (see the image for proof). On Sunday, Sonarr, Lidarr, Radarr and Plex all corrupted their DBs during a move. Radarr was backed up that Saturday and was easy to restore; for Plex, I ran my script (I'll talk more about it below); and then came Sonarr. I had 3 backups in the Sonarr scheduled backup directory and 2 in the manual, and I have 2 months of backups of those sets mostly from the original CA Appdata Backup/Restore, with my last 3 weeks being on v2. The first 2 databases didn't fix Sonarr's malformed DB (which was weird, because I used the Manual backups that I KNEW were fine when I triggered the backup). Anyway, after trying about 1/2 of my backups none of them fixed Sonarr, and then all of a sudden one worked fine. Then I tested some of the backups I'd already tried (because there was no way that ALL of my backups were corrupted for this long), and with the exception of my latest auto backup and a rogue db from 3 weeks ago, they all worked.

Now, Sonarr had corrupted these DB files, I had backups upon backups so many of them were backed up redundantly, but every time Sonarr started up on one of the backups, It would corrupt, and even after Sonarr stopped doing that, the DBs were still corrupt, the overlap in backups saved me.

[Attached image: image.png]

I wish I had recorded this better as this just happened on Sunday, but I was tired, and after everything went back to normal, I just deleted all of the corrupted databases

Now, as for the script I mentioned, I made this a few months ago to check and manage my plex database files, it checks them, backs up if needed, attempts a repair if needed, etc. You shouldn't run it headless, as it requires user input, although anyone who's done a basic script before can change that with two "#"s.

I do apologise if this is the wrong place to post this comment, I just thought that it'd be a good place to get this out to other users who need to work on their DBs a bit during this issue.

simalex · August 21, 2019

What I do for checking the Sonarr DB is periodically go through the Logs filtering out everything but the errors.

If there is a corruption in the DB you will see the malformed message there. Once I have gone through a log set, I will just clear the logs as well.

Sonarr initially still seems to be working properly when the DB has only a few corruptions. Once the number of corruptions increases, Sonarr starts showing slow responsiveness, until it reaches a point where you can't even get to the landing page.

Anyway. When doing manual backups, unless you do them from inside Sonarr, where I assume the DB is paused for this process, I think it is better to stop the Docker altogether.

I also started going through the SQLite site for additional information, and I would suggest that before restoring from a manual backup you delete any existing .db-wal files, as they contain pending transactions. If you don't back everything up so as to be able to overwrite the .db-wal files with the exact same set from when the actual DB was backed up, then these might cause a problem when restarting the DB, as SQLite will probably try to apply the pending changes. More so if the .db-wal files are corrupt or have already been partially applied.
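
A minimal sketch of that restore step, assuming the application (container) is already stopped. SQLite names its sidecar files by appending `-wal`, `-shm`, and `-journal` to the database file name. The function name `restore_db` is hypothetical, and clearing `-journal` as well as the WAL files is an addition that goes beyond what the post suggests.

```shell
#!/bin/bash
# Hypothetical helper: restore_db BACKUP_FILE DB_FILE
# Run only while nothing has the database open.
restore_db() {
    local backup=$1 db=$2
    if [ ! -f "$backup" ]; then
        echo "backup not found: $backup" >&2
        return 1
    fi
    # Remove leftover sidecar files so pending changes that belong to the
    # old database are not applied to the restored copy.
    rm -f "${db}-wal" "${db}-shm" "${db}-journal"
    cp "$backup" "$db"
}

# Example (not executed here):
#   restore_db /path/to/backups/nzbdrone.db /path/to/config/nzbdrone.db
```

The backup is checked before anything is deleted, so a mistyped backup path leaves the existing files untouched.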

naturalcarr · August 21, 2019

Quote

Anyway. When doing manual backups, unless you do them from inside Sonarr, where I assume the DB is paused for this process, I think it is better to stop the Docker altogether.

That's what I do, I use the in-app backup. I didn't know about the .db-wal files, though; that makes total sense, thank you. I'll make sure to delete those the next time something pops up.

trott · August 21, 2019

Hi guys, just want to confirm whether this unknown bug will also impact the normal files we write to the array?
