zapp8rannigan

CPU still spiking at 100% on 6.8...

Hi,

I'm still having issues with my CPU (i5-4690K) spiking on 6.8. I was originally on 6.7 and was told there was a bug in that version which could be why I was having problems, and that it was fixed in 6.8.

On 6.7 I couldn't run one or all of (not sure which) Nextcloud, letsencrypt or mariadb. When I upgraded to 6.8-rc6 everything finally seemed to be fixed, but when I came to do a library scan on Booksonic, the CPU spiked again. I shut that down, thinking (hoping) that because the app isn't being maintained anymore it's no longer compatible. It's not ideal, but I can live without it for now.

Then a few days later I tried to do a library scan on Plex and that also caused the CPU to spike. This was much more serious, as I use Plex practically every day.

I noticed rc7 became available and quickly updated, hoping that would fix my problem. It ran fine for over 3 days, up until I added a torrent in Deluge, which caused it to spike AGAIN.

Could this be some kind of hardware issue somewhere? I tried two different quad-port NICs a while ago, thinking it was related to my pfSense setup, but it didn't change anything.

What about possibly rebuilding my Docker containers? I'd rather not do this if I can help it, as I'm worried about losing all my settings and metadata in Plex; plus, Nextcloud wasn't exactly a walk in the park to set up. But if there's a chance it could fix my problem, I guess I need to do it?

Could it be the Docker image size? I currently have it set to 30 GB with 10 containers installed. The Docker settings page says "Total to scrub: 6.77GiB". Is this anything I should be concerned about?

I've attached my diagnostics, which were taken just when it spiked again.

Thanks again for your help

babs-diagnostics-20191126-2113.zip


I'm seeing at least one process that has gone zombie, which means it can't be killed (it has already exited and is waiting for its parent to reap it). It might be part of some plugin for Plex; I looked for the Z status, then traced backwards. That's hard to do on my tablet.

Also, your SHFS process has a crazy amount of CPU time.

 

 

nobody   28640  0.0  0.0      0     0 ?        Z    20:48   0:00  |           |   \_ [watchdog.sh] <defunct>


Perhaps a better isolation of the process tree from watchdog.sh on up...

Viewed without word wrap, it doesn't look like a Plex plugin after all. The mobile screen just made it really hard to follow; it's much easier on a PC.

 

root     28316  0.0  0.0 107692  5472 ?        Sl   20:48   0:00  |   \_ containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/06d3c57214354061b944a5ac3382a1890e253c6a7bb079c98ebef0649a550835 -address /var/run/docker/containerd/containerd.sock -containerd-binary /usr/bin/containerd -runtime-root /var/run/docker/runtime-runc
root     28333  0.0  0.0   2308    84 ?        Ss   20:48   0:00  |       \_ /usr/bin/tini -- /bin/bash /usr/local/bin/init.sh
nobody   29531  1.3  0.9 615844 80444 ?        Sl   20:48   0:19  |           \_ /usr/bin/python /usr/bin/deluged -c /config -L info -l /config/deluged.log
root     29237  1.1  0.0  17856  1356 ?        Ds   20:48   0:16  |           \_ /usr/bin/openvpn --daemon --reneg-sec 0 --mute-replay-warnings --auth-nocache --setenv VPN_PROV pia --setenv DEBUG false --setenv VPN_DEVICE_TYPE tun0 --setenv VPN_ENABLED yes --setenv VPN_REMOTE sweden.privateinternetaccess.com --setenv APPLICATION deluge --script-security 2 --writepid /root/openvpn.pid --remap-usr1 SIGHUP --log-append /dev/stdout --pull-filter ignore up --pull-filter ignore down --pull-filter ignore route-ipv6 --pull-filter ignore ifconfig-ipv6 --pull-filter ignore tun-ipv6 --pull-filter ignore persist-tun --pull-filter ignore reneg-sec --up /root/openvpnup.sh --up-delay --up-restart --remote 45.12.220.172 1198 udp --remote 45.12.220.236 1198 udp --remote 45.12.220.227 1198 udp --remote 45.12.220.232 1198 udp --remote 45.12.220.211 1198 udp --remote 45.12.220.249 1198 udp --remote 45.12.220.229 1198 udp --remote 45.12.220.246 1198 udp --remote 45.12.220.235 1198 udp --remote 45.12.220.218 1198 udp --remote 45.83.91.19 1198 udp --remote 45.83.91.18 1198 udp --remote 45.12.220.238 1198 udp --remote-random --keepalive 10 60 --setenv STRICT_PORT_FORWARD yes --disable-occ --auth-user-pass credentials.conf --cd /config/openvpn --config /config/openvpn/Sweden.ovpn
nobody   29893  0.1  0.4  71020 39964 ?        D    20:48   0:01  |           \_ deluge-web
nobody    4636  0.0  0.0   4888   180 ?        D    21:08   0:00  |           \_ pgrep -fa deluge-web
root     28372  0.0  0.1  29920 15004 ?        D    20:48   0:01  |           \_ /usr/bin/python /usr/bin/supervisord -c /etc/supervisor.conf -n
root     28639  0.0  0.0   7336   612 ?        S    20:48   0:00  |           |   \_ /bin/bash /root/start.sh
root      5920  0.0  0.0   4888   232 ?        D    21:08   0:00  |           |   |   \_ pgrep -x openvpn
nobody   28640  0.0  0.0      0     0 ?        Z    20:48   0:00  |           |   \_ [watchdog.sh] <defunct>
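For anyone wanting to repeat the "look for the Z status, then trace backwards" step on their own diagnostics, here is a minimal Python sketch. It assumes `ps aux`-style columns as in the listing above (USER, PID, %CPU, %MEM, VSZ, RSS, TTY, STAT, START, TIME, COMMAND); the function name `find_stuck` is made up for this example. It also reports `D` (uninterruptible sleep, usually a process waiting on I/O), since several entries in the listing are in that state.

```python
def find_stuck(ps_text):
    """Return (pid, stat, command) for processes whose STAT starts with Z or D."""
    stuck = []
    for line in ps_text.splitlines():
        # Split into the 10 fixed columns plus the free-form COMMAND column.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # skip blank lines, headers and anything malformed
        stat = fields[7]
        if stat[0] in ("Z", "D"):
            # Drop the tree-drawing characters that the forest view adds.
            command = fields[10].lstrip("|\\_ ")
            stuck.append((int(fields[1]), stat, command))
    return stuck


# Four lines copied from the listing above.
SAMPLE = r"""
root     28639  0.0  0.0   7336   612 ?        S    20:48   0:00  |           |   \_ /bin/bash /root/start.sh
root      5920  0.0  0.0   4888   232 ?        D    21:08   0:00  |           |   |   \_ pgrep -x openvpn
nobody   28640  0.0  0.0      0     0 ?        Z    20:48   0:00  |           |   \_ [watchdog.sh] <defunct>
nobody   29855  0.0  0.0   3712   548 ?        Ss   20:48   0:00  |           \_ /usr/bin/privoxy /config/privoxy/config
"""

for pid, stat, command in find_stuck(SAMPLE):
    print(pid, stat, command)
```

This only filters by state; finding the parent still means reading the tree upwards from the reported line.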
nobody   29855  0.0  0.0   3712   548 ?        Ss   20:48   0:00  |           \_ /usr/bin/privoxy /config/privoxy/config

I added a few more lines; there were some same-level processes running 'deluge' and 'openvpn'. So is this a combo Docker container that runs torrents through OpenVPN?

But nothing shows up in the syslog as an obvious root cause or trigger.


Ah yeah, it's the binhex-delugevpn Docker container, which has some OpenVPN stuff built into it.

 

15 minutes ago, BRiT said:

Also, your SHFS process has a crazy amount of CPU time.

What's that, and how do I go about fixing it? Could that be the issue?


That's the process that provides /mnt/user/ and /mnt/user0/. It's layered on top of FUSE (Filesystem in Userspace).

Here are earlier threads about shfs, but in those it was pegged at 100% CPU load, whereas yours was only showing 0.2%:

 

 

 

Though maybe yours isn't that bad when averaged out...

 

Here's yours after 3.3 days of uptime (averaging 29.09 minutes of CPU time per day):


4462 root      20   0  145888    720    252 S   0.0   0.0   0:00.02 shfs
 4475 root      20   0 1091060  17460   1080 S   0.0   0.2  96:10.02 shfs

 

 

And from my system after 38.25 days of uptime (averaging 18.27 minutes of CPU time per day):


root@TOWER:~# uptime
 18:37:27 up 38 days,  6:04,  1 user,  load average: 0.08, 0.07, 0.03


root@TOWER:~# top -bn1 | egrep -i shfs
 6679 root      20   0  442648  33728    692 S   0.0   0.0   0:11.20 shfs
 6692 root      20   0  787656  55648    828 S   0.0   0.0 699:01.70 shfs
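The per-day averages can be reproduced with a few lines of Python. This is only a sketch: it assumes the `TIME+` column from `top` is in the usual `minutes:seconds.hundredths` form, and the helper names are made up for the example.

```python
def cpu_minutes(time_plus):
    """Convert a top TIME+ value such as '96:10.02' to minutes of CPU time."""
    minutes, seconds = time_plus.split(":")
    return int(minutes) + float(seconds) / 60


def minutes_per_day(time_plus, uptime_days):
    """Average CPU minutes consumed per day of uptime."""
    return cpu_minutes(time_plus) / uptime_days


# Values taken from the two top outputs above.
print(round(minutes_per_day("96:10.02", 3.3), 1))
print(round(minutes_per_day("699:01.70", 38.25), 1))
```

The results come out marginally higher than 29.09 and 18.27, because those figures appear to use whole minutes (96 and 699) and drop the seconds.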

 


Thanks, just having a read of that thread. It seems that issue affects the entire server, whereas I can still get on the GUI, so mine seems localised to Docker.

Back on 6.7, when Docker became unresponsive it would knock out the VM (pfSense) too, but I don't recall that happening on 6.8; if it did, it was only the once.

Do you think I might need to just wipe the Docker image and start again?


Just in case anyone is interested,

I'm still having issues. I've replaced my PSU in the hope it was a hardware issue; it needed changing anyway, as I was probably pushing it with the number of HDDs I have now. Alas, no change, it's still happening.

Still hoping it's something hardware related (as that means it's something I can actually fix), I've been testing a file on each drive to see what makes it spike.

So far I've done the last drive I added (HDD) and the cache drive (SSD). I played a file in Plex from both drives and both played fine. However, I noticed that when I reset the playback status to unplayed within Plex, this caused a significant spike on the CPU (70-80%), whereas just playing the file didn't even reach 10%. This seems odd to me. In fact, I can practically crash the server on command now by just hitting played and unplayed in Plex a few times.

 

Does that happen for anyone else?

2 hours ago, zapp8rannigan said:

Still hoping it's something hardware related (as that means it's something I can actually fix), I've been testing a file on each drive to see what makes it spike.

So far I've done the last drive I added (HDD) and the cache drive (SSD). I played a file in Plex from both drives and both played fine. However, I noticed that when I reset the playback status to unplayed within Plex, this caused a significant spike on the CPU (70-80%), whereas just playing the file didn't even reach 10%. This seems odd to me. In fact, I can practically crash the server on command now by just hitting played and unplayed in Plex a few times.

Did you map Plex appdata to /mnt/cache or /mnt/user?

52 minutes ago, testdasi said:

Did you map Plex appdata to /mnt/cache or /mnt/user?

/mnt/user/

I was under the impression if it was set to cache I wouldn't be able to access the content on the array on the stuff that hadn't been moved by the mover yet?

 

The issue itself isn't exclusive to Plex, though it MAY be exclusive to Docker: when copying files in MC, the CPU doesn't spike anywhere near as much as it does in Krusader.

 

 

12 minutes ago, zapp8rannigan said:

/mnt/user/

I was under the impression if it was set to cache I wouldn't be able to access the content on the array on the stuff that hadn't been moved by the mover yet?

Appdata would not normally contain your media, merely the Plex working files, so keeping it on the cache is good for performance. You will have a separate mapping into the Plex container for the media, and this SHOULD use /mnt/user.

4 minutes ago, zapp8rannigan said:

ahh I think I'm getting my wires crossed sorry, my appdata is set to /mnt/user/appdata/PlexMediaServer so rather than user it should be cache?

In theory both should work if appdata is set to Use Cache = Prefer or Only. However, going directly to the drive bypasses the FUSE layer used to implement User Shares and is thus likely to perform better.
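To make the path change concrete, here is a small Python sketch that rewrites a user-share path to the matching direct cache path. The helper name `to_cache_path` and the `pool` parameter are made up for this example. It only makes sense for shares whose files really all live on the cache (Use Cache = Prefer or Only, as above); media mappings should stay on /mnt/user.

```python
from pathlib import PurePosixPath


def to_cache_path(host_path, pool="cache"):
    """Rewrite /mnt/user/<share>/... to /mnt/<pool>/<share>/...

    Paths that are not under /mnt/user are returned unchanged.
    """
    parts = PurePosixPath(host_path).parts
    if parts[:3] == ("/", "mnt", "user"):
        return str(PurePosixPath("/mnt", pool, *parts[3:]))
    return host_path


# The appdata path mentioned earlier in the thread.
print(to_cache_path("/mnt/user/appdata/PlexMediaServer"))
```

The check compares whole path components, so /mnt/user0/... is deliberately left alone.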


I was actually experiencing this issue as well. It happened twice, to the point where I had to forcefully reset the computer as it was non-responsive. I updated to Version: 6.8.0-rc9 and so far the issue hasn't come back. PS: My data is pointed to /user if anyone is wondering.

2 hours ago, itimpi said:

In theory both should work if appdata is set to Use Cache = Prefer or Only. However, going directly to the drive bypasses the FUSE layer used to implement User Shares and is thus likely to perform better.

Interesting. OK, I just checked all my other containers; I'll switch them across to cache instead of user and see if it makes a difference.
 

2 hours ago, XiuzSu said:

I was actually experiencing this issue as well. It happened twice, to the point where I had to forcefully reset the computer as it was non-responsive. I updated to Version: 6.8.0-rc9 and so far the issue hasn't come back. PS: My data is pointed to /user if anyone is wondering.

Mine's been the same since 6.7. It did improve on 6.8 (RC4, I think): it was still spiking but wasn't taking my VM (pfSense) with it anymore. Since I upgraded to RC7+, though, it's been taking it out again overnight. I'm not sure what's causing it, because there hasn't been anything for the mover to do...

UPDATE... I'm not holding my breath, but it's no longer spiking when hitting watched/unwatched in Plex, SO maybe that was it after all? Weird though, no? I'm not sure I've changed the default templates for most of my Docker containers.


Sadly it's still happening. It was fine for a while, but it's getting worse again.

Just transferring files from my downloads folder (located on the cache drive) to my TV/Film folders hits a 100% spike when transferring to a share using Krusader/MC. So I just tried transferring a file in MC directly from the cache to a folder on a disk (rather than the share), and it's peaking at 42%. Is that a bit more normal, or is that still high for file transfers?

I think I tried replacing the SATA cable before, but no change.


I'm sorry to hear that and I hope someone can help you out. My issue has completely stopped, and I have been lucky to not encounter any issues at all since my update to v6.8.0.

On 12/24/2019 at 3:05 PM, XiuzSu said:

I'm sorry to hear that and I hope someone can help you out. My issue has completely stopped, and I have been lucky to not encounter any issues at all since my update to v6.8.0.

Thank you. I'm starting to get worried I'm actually causing damage to my CPU, and I can't get my head round it. Every time I think I've located the source of the issue, something else contradicts it. For example, I originally thought it was tied to Docker, but it was doing it in MC the other day, so it's not. Now I'm thinking it's my cache drive, but I haven't got another SSD to swap it with at the moment.

