May 4, 2024

I've been having trouble with Sonarr performance. Looking in the log, I see errors related to database locks. Digging deeper, my cache drive performance is very very bad. Since I was getting a few errors on one of the SSDs, I moved everything to the array and replaced that drive a month ago. At first performance was better (but not great), but now a month later it's bad again.

Some info. My disk write performance is measured in kB/s:

    $ dd if=/dev/zero of=test1.img bs=1024 count=20 oflag=dsync
    20+0 records in
    20+0 records out
    20480 bytes (20 kB, 20 KiB) copied, 5.56003 s, 3.7 kB/s

    $ hdparm -Tt /dev/sdg
    /dev/sdg:
     Timing cached reads: 42310 MB in 2.00 seconds = 21188.73 MB/sec
     Timing buffered disk reads: 1440 MB in 3.00 seconds = 479.45 MB/sec

    $ hdparm -Tt /dev/sdh
    /dev/sdh:
     Timing cached reads: 39242 MB in 2.00 seconds = 19650.09 MB/sec
     Timing buffered disk reads: 118 MB in 3.02 seconds = 39.08 MB/sec

    $ btrfs fi df /mnt/cache
    Data, RAID1: total=529.00GiB, used=466.58GiB
    System, RAID1: total=64.00MiB, used=96.00KiB
    Metadata, RAID1: total=3.00GiB, used=1.86GiB
    GlobalReserve, single: total=512.00MiB, used=0.00B

    $ btrfs fi show
    Label: none uuid: 7853a932-8ed9-4c4c-b1b9-48ad56a03e28
        Total devices 2 FS bytes used 469.44GiB
        devid 3 size 931.51GiB used 530.03GiB path /dev/sdh1
        devid 4 size 931.51GiB used 530.03GiB path /dev/sdg1

    Label: none uuid: ecf4c172-490c-475d-8f98-be66f31b9bd7
        Total devices 1 FS bytes used 2.94MiB
        devid 1 size 1.00GiB used 174.38MiB path /dev/loop3

I've read that sqlite databases can get really fragmented, which does appear to be the case:

    $ filefrag lidarr.db
    lidarr.db: 42850 extents found

The UNRAID GUI says that a balance is not needed. But do I need to find and defrag highly fragmented files? I also read that the metadata volume can get fragmented? I started a balance operation anyway, but it was going to take 4+ days to complete so I canceled it.

To be honest, I expect a file system to not need this much manual maintenance.
Is my pool misconfigured? Should I switch to a ZFS mirrored pool?

storage-diagnostics-20240504-0706.zip

Edited May 4, 2024 by coppit: Attaching diagnostics
May 4, 2024 Community Expert
It may just be slow devices. The BX500 is Crucial's low-end model and cannot sustain high-speed writes; the MX500 is much better. You can still try ZFS to see if there's any difference.
May 4, 2024 Author
I used filefrag on all my cache files to find out which are the most fragmented (extent count, then path):

    142807 ./system/docker/docker-xfs.img
    21087 ./appdata/mariadb-official/data/ib_logfile0
    11353 ./appdata/binhex-plex/Plex Media Server/Plug-in Support/Databases/com.plexapp.plugins.library.db
    6344 ./appdata/binhex-plex/Plex Media Server/Plug-in Support/Databases/tv.plex.providers.epg.cloud-e6847e85-fd9d-4ae3-b604-b5ff4a50bfde.db
    4277 ./appdata/nextcloud/log/nextcloud/nextcloud.log
    2334 ./appdata/pihole-kids/pihole/pihole-FTL.db
    1501 ./appdata/binhex-radarr/supervisord.log.1
    <and a bunch of Deluge downloads>
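For anyone wanting to produce a similar ranking, here is a minimal sketch. It is not necessarily how the listing above was generated. It assumes GNU find and `filefrag` (from e2fsprogs), and that `filefrag` prints one `path: N extents found` summary line per file. The function names `rank_extents` and `most_fragmented` are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: rank files by extent count using filefrag's summary lines.

# Turn "path: N extent(s) found" lines on stdin into "N path",
# sorted so the most fragmented file comes first.
rank_extents() {
    sed -nE 's/^(.*): ([0-9]+) extents? found$/\2 \1/p' | sort -rn
}

# Run filefrag on every regular file under a directory (default: current
# directory), staying on one filesystem, and show the top entries.
most_fragmented() {
    local dir="${1:-.}" top="${2:-20}"
    find "$dir" -xdev -type f -exec filefrag {} + 2>/dev/null \
        | rank_extents | head -n "$top"
}
```

Something like `most_fragmented /mnt/cache 20` would then list the 20 worst files. On a large pool this walks every file, so it can take a while.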
May 5, 2024 Author
I switched to ZFS with a 2-disk mirror.

    $ dd if=/dev/zero of=test1.img bs=100M count=20 oflag=dsync
    20+0 records in
    20+0 records out
    2097152000 bytes (2.1 GB, 2.0 GiB) copied, 0.43951 s, 4.8 GB/s

Much better. I'll check it again after a month.
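A caveat when comparing the two dd runs: the first used 1 KiB synchronous writes and this one uses 100 MiB blocks, so the figures are not directly comparable. In addition, if compression is enabled on the ZFS dataset (an assumption, since the pool settings are not shown in the thread), data from /dev/zero compresses to almost nothing and can inflate the result. A rough sketch of a like-for-like test with incompressible data follows. It assumes GNU dd, and `sync_write_test` is a hypothetical helper name, not an Unraid or ZFS tool.

```shell
#!/usr/bin/env bash
# Sketch: repeat the small synchronous-write test with incompressible data.

sync_write_test() {
    local target_dir="${1:-.}" block_size="${2:-1024}" count="${3:-20}"
    local src out
    # mktemp usually places this under /tmp, away from the disk being tested.
    src="$(mktemp)" || return 1
    out="$target_dir/ddtest.$$.img"

    # Prepare random (incompressible) data up front so that /dev/urandom
    # speed does not count against the disk being measured.
    dd if=/dev/urandom of="$src" bs="$block_size" count="$count" \
        iflag=fullblock 2>/dev/null || return 1

    # Same flags as the first test in the thread: each block is synced
    # before the next one is written. dd prints its summary on stderr.
    dd if="$src" of="$out" bs="$block_size" count="$count" oflag=dsync
    local status=$?

    rm -f "$src" "$out"
    return "$status"
}
```

Running `sync_write_test /mnt/cache 1024 20` mirrors the original btrfs test, so the throughput line it prints can be compared against the 3.7 kB/s figure.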
June 17, 2024 Author
Sadly, things were only stable for a couple of weeks. I remember that I recently swapped out my air cooler for a water cooler in order to accommodate a third GPU. My new hypothesis is that the CPU was overheating and then slowing to a crawl. Unraid's temperature monitoring didn't show high temps in general, but I suppose they can spike quickly without my noticing. I've removed the third GPU, re-installed the air cooler, and reverted my CPU voltage tweaks (which likely caused it to run hotter). Hopefully this will stabilize things <fingers crossed / thumbs held>.
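Short temperature spikes are easy to miss on a dashboard that refreshes slowly. Here is a minimal sketch for catching them, assuming a Linux kernel that exposes sensors under /sys/class/thermal (zone names and availability vary by board, and this is not an Unraid feature). `hottest_zone` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: report the hottest thermal zone exposed through sysfs.

# Print "<millidegrees C> <zone type>" for the hottest zone under the
# given root directory (default: /sys/class/thermal).
hottest_zone() {
    local root="${1:-/sys/class/thermal}" zone temp kind
    for zone in "$root"/thermal_zone*; do
        [ -r "$zone/temp" ] || continue
        temp="$(cat "$zone/temp")"
        kind="$(cat "$zone/type" 2>/dev/null || echo unknown)"
        printf '%s %s\n' "$temp" "$kind"
    done | sort -rn | head -n 1
}

# Example loop (commented out): sample once per second and append to a log.
# while sleep 1; do
#     printf '%s %s\n' "$(date +%T)" "$(hottest_zone)" >> /tmp/cpu-temp.log
# done
```

Leaving the loop running during a slowdown and then looking at the log for readings near the CPU's throttle point would help confirm or rule out the overheating hypothesis.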
August 22, 2024 Author
Last update: I finally replaced the CPU. Things are stable now. The weird thing is that the old CPU is stable in another machine as well... Not sure what's going on. Maybe the motherboard's memory speed detection re-ran and is better now, or maybe I was missing a bit of thermal paste.
August 22, 2024 Community Expert
I notice the CPU is an Intel 13900K. Some problems have been coming to light with 13th and 14th gen Intel CPUs. Dunno if that's the problem, but barring any other explanation, that could be a good gremlin to blame.
August 22, 2024 Author
2 hours ago, Veah said:
    I notice the CPU is an Intel 13900k. Some problems been coming to light with 13 and 14 gen Intel's. Dunno if that's the problem, but barring any other explanation, that could be a good gremlin to blame.

Yeah, I saw that too. Hard to know...