
[SOLVED] Very slow mover and netlink errors


mbc0


Hi,

 

My mover is running incredibly slowly (10+ minutes per GB) even though my cache is a 970 Pro NVMe.

 

This is a new issue, as I have been running this setup for a year or so. I have also noticed a lot of these errors in the log:

 

Apr 18 21:06:01 UNRAIDSERVER kernel: netlink: 4 bytes leftover after parsing attributes in process `gs-server'.
Apr 18 21:10:01 UNRAIDSERVER kernel: netlink: 4 bytes leftover after parsing attributes in process `gs-server'.
Apr 18 21:14:01 UNRAIDSERVER kernel: netlink: 4 bytes leftover after parsing attributes in process `gs-server'.
Apr 18 21:18:01 UNRAIDSERVER kernel: netlink: 4 bytes leftover after parsing attributes in process `gs-server'.
Apr 18 21:22:02 UNRAIDSERVER kernel: netlink: 4 bytes leftover after parsing attributes in process `gs-server'.

 

I have attached my diags; if someone could take a look I would really appreciate it!

 

Thank you 🙂

 

 

unraidserver-diagnostics-20200418-2128.zip

15 minutes ago, mbc0 said:

Apr 18 21:06:01 UNRAIDSERVER kernel: netlink: 4 bytes leftover after parsing attributes in process `gs-server'.

Does it stop after stopping that container (gsdock)?

 

25 minutes ago, mbc0 said:

My mover is running incredibly slowly (10+ minutes per GB)

Looking at the files that it's moving (you really should disable mover logging; it serves no real purpose unless there's an issue), it appears to be 99% very small files. When moving files that small, filesystem overhead becomes noticeable.
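If you want to sanity-check the small-file theory yourself, here is a rough sketch; it assumes the cache pool is mounted at /mnt/cache, so adjust the path and the size threshold to taste:

#!/usr/bin/env python3
# Hedged sketch: tally how much of the cache is tiny files vs. large ones.
# Assumption: the cache pool is mounted at /mnt/cache (adjust if yours differs).
import os

SMALL = 1 * 1024 * 1024  # treat files under 1 MiB as "small"
small_count = large_count = small_bytes = large_bytes = 0

for root, _dirs, files in os.walk("/mnt/cache"):
    for name in files:
        try:
            size = os.path.getsize(os.path.join(root, name))
        except OSError:
            continue  # file may have been moved or deleted mid-walk
        if size < SMALL:
            small_count += 1
            small_bytes += size
        else:
            large_count += 1
            large_bytes += size

print(f"small (<1 MiB): {small_count} files, {small_bytes / 1e6:.1f} MB")
print(f"large (>=1 MiB): {large_count} files, {large_bytes / 1e9:.1f} GB")

If the small-file count dwarfs the large-file count, per-file overhead (metadata updates, parity updates per file) explains a slow mover far better than raw throughput does.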

 

You might also want to make sure that the SSD TRIM plugin is running on a semi-normal schedule (daily?).

 

It also looks like HandBrake is running full tilt, along with Emby transcoding something (or struggling to keep up with the massive amount of changes happening all at once), and a VM chugging along as well.

 

All in all I'd say you're running a perfect storm right now.

34 minutes ago, Squid said:

Does it stop after stopping that container (gsdock)?

I saw that just after I posted and stopped the Docker container. I've just come back and yes, those errors have gone. Do you think they were causing an issue?

34 minutes ago, Squid said:

 

Looking at the files that it's moving (you really should disable mover logging; it serves no real purpose unless there's an issue), it appears to be 99% very small files. When moving files that small, filesystem overhead becomes noticeable.

I was looking at the files being moved and saw they were large MKVs (4 GB+) taking an age to move. I know parity slows things down, but from NVMe to 6Gb/s SAS drives it seems a lot slower than it used to be?

34 minutes ago, Squid said:

 

You might also want to make sure that the SSD TRIM plugin is running on a semi-normal schedule (daily?).

I have TRIM set to run at 2 am daily.

34 minutes ago, Squid said:

 

It also looks like HandBrake is running full tilt, along with Emby transcoding something (or struggling to keep up with the massive amount of changes happening all at once), and a VM chugging along as well.

HandBrake is running 24/7 courtesy of Tdarr, but that has been the case for years without an issue, and it is only using half the resources. An Emby transcode is actually a very rare occurrence! It was a fluke that one was happening at the time I took the diags 😄

 

34 minutes ago, Squid said:

 

All in all I'd say you're running a perfect storm right now.

I thought that, considering my specs, running the mover, a HandBrake encode and an almost idle VM (Blue Iris CCTV raw capture) would leave me well within capacity? Do you think I am pushing too much?

 

I have been running this same setup for a long time, and only in the last few days have I had the slow mover issue, so I presumed something had crept in?

Annotation 2020-04-18 225833.jpg

13 minutes ago, johnnie.black said:

At the time the diags were saved there were writes going on to 10 different array disks. This can only be extremely slow, since parity needs to be updated simultaneously for all 10; avoid writing to more than one array disk at a time.
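For context, a hedged back-of-envelope of why that hurts, assuming the array is in Unraid's default read/modify/write mode (the diags don't confirm which mode is in use): every logical write to a data disk then costs four physical operations, two of which land on the single parity disk.

# Hedged sketch, not a benchmark: count the operations the lone parity disk
# must absorb. Assumption: read/modify/write mode, where each data-disk write
# costs a parity read plus a parity write; "reconstruct write" changes this.
def parity_ops(writes_per_disk: int, disks_written: int) -> int:
    ops_on_parity_per_write = 2  # read old parity + write new parity
    return writes_per_disk * disks_written * ops_on_parity_per_write

# Same total parity work in both cases, but the second scatters it across
# 10 regions of the disk, so the parity head spends its time seeking.
print(parity_ops(1000, 1))   # one busy array disk
print(parity_ops(100, 10))   # ten array disks written at once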

Thank you @johnnie.black, I really appreciate your input! If you get time, could you look through my responses in post #3 please? I am currently installing some more drives and having a re-shuffle of existing ones to tidy things up. I think I was in cuckoo land thinking my server could handle anything I threw at it!


Hi @johnnie.black

 

I wonder if you wouldn't mind taking a look at my diags again please?

 

I have reseated all connections and drives

I now have my VMs running on an unassigned SSD

barely any dockers running

no emby transcodes

Tdarr is disabled

The only thing running is the mover, which I have to keep stopping because the performance is so bad.

I am also trying to move data from disk to disk (not while the diags were taken) but am getting 3-4 MB/s.

The diags were taken with just the mover running.

 

I have another server, an old HP Gen8 MicroServer with three 10-year-old 2 TB drives and a 120 GB SATA SSD, and its mover performance is blindingly fast; I can copy disk to disk at 50-ish MB/s.

 

Considering my main server is current hardware (Threadripper, SAS controller, NVMe cache), I cannot understand why it is so slow.

 

As always, I really appreciate your time!

unraidserver-diagnostics-20200421-1552.zip

4 minutes ago, johnnie.black said:

Well, in the latest diags it's not writing to 10 array disks, but it's still writing to 8, so very slow performance is expected; simultaneous writes to more than one array disk will always be slow.

Can I please ask how you can see this, and what I can do to get some performance back?

 

Thank you

2 minutes ago, johnnie.black said:

On the diags, loads.txt

I see the file, but I cannot understand how you know there are 8 simultaneous writes?

 

 15:52:12 up  1:30,  0 users,  load average: 32.97, 31.98, 25.45  Cores: 32
[cpu]
host=31
guest=2
[cpu0]
host=57
guest=0
[cpu1]
host=7
guest=0
[cpu2]
host=5
guest=0
[cpu3]
host=20
guest=0
[cpu4]
host=92
guest=0
[cpu5]
host=75
guest=0
[cpu6]
host=47
guest=0
[cpu7]
host=64
guest=0
[cpu8]
host=30
guest=19
[cpu9]
host=13
guest=11
[cpu10]
host=10
guest=10
[cpu11]
host=5
guest=4
[cpu12]
host=2
guest=0
[cpu13]
host=1
guest=0
[cpu14]
host=4
guest=0
[cpu15]
host=47
guest=0
[cpu16]
host=11
guest=0
[cpu17]
host=21
guest=0
[cpu18]
host=11
guest=0
[cpu19]
host=10
guest=0
[cpu20]
host=88
guest=0
[cpu21]
host=86
guest=0
[cpu22]
host=29
guest=0
[cpu23]
host=43
guest=0
[cpu24]
host=21
guest=0
[cpu25]
host=3
guest=0
[cpu26]
host=37
guest=0
[cpu27]
host=33
guest=0
[cpu28]
host=57
guest=8
[cpu29]
host=2
guest=0
[cpu30]
host=32
guest=31
[cpu31]
host=5
guest=2

sda (flash)=0 0 1749 1395
nvme0n1 (cache)=1226069 492202 664040 474684
sdb (disk7)=187050 216405 158994 472662
sdg (parity)=4483754 4801194 319516 1582149
sdh (disk12)=212309 245760 196476 128018
sdj (disk14)=0 0 155011 49261
sdk (disk1)=539306 598016 160894 188120
sdi (disk13)=0 0 154924 68
sdo (disk10)=148138 60074 169691 95509
sde (disk17)=0 0 121863 70
sdd (disk3)=0 0 123776 66
sdf (disk4)=0 0 122993 54
sdp (disk5)=436906 0 146467 54
sdc (disk9)=262144 278528 156149 311445
sdl (disk2)=0 0 121364 68
sdm (disk18)=884053 987136 155168 243151
sdn (disk6)=0 0 126870 65
sdq (disk8)=18432 6144 173974 155053
sdt (disk19)=2404352 2454186 99803 24951
sdv (disk20)=0 0 149236 1144
sdw (disk21)=0 0 115131 54
sdx (disk22)=0 0 85298 28
sdz (disk11)=0 0 148447 53
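For anyone trying to follow along, here is one rough way to pull the busy disks out of that list. I'm assuming, from the rows that are all zero on the clearly idle disks, that the first two columns are write counters and the last two are read counters; that is an inference from the data above, not from any documentation.

# Hedged sketch: list the devices whose (assumed) write columns are non-zero.
import re

loads = """\
sdb (disk7)=187050 216405 158994 472662
sdg (parity)=4483754 4801194 319516 1582149
sdj (disk14)=0 0 155011 49261
"""  # paste the full disk block from loads.txt here

for line in loads.splitlines():
    m = re.match(r"(\S+) \((\S+)\)=(\d+) (\d+) (\d+) (\d+)", line)
    if not m:
        continue
    dev, label, col1, col2, col3, col4 = m.groups()
    if int(col1) or int(col2):  # assumption: columns 1-2 reflect writes
        print(f"{label:8s} ({dev}) is being written to")

With the full block pasted in, the parity disk, the cache, and the eight-or-so data disks with non-zero first columns fall straight out, which matches the count above.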

2 minutes ago, johnnie.black said:

 

Start by stopping all those writes; you can see real-time disk activity in the GUI.

Do you mean the Main tab? I can see all my drives, but it's very hard to see what is active.
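If the Main tab is too coarse, a rough alternative is to sample /proc/diskstats from the console. The sketch below relies only on the standard Linux diskstats layout (field 10 is sectors written, in 512-byte units), nothing Unraid-specific:

# Hedged sketch: show which block devices were written to over a 5-second window.
import time

def sectors_written():
    stats = {}
    with open("/proc/diskstats") as f:
        for line in f:
            parts = line.split()
            stats[parts[2]] = int(parts[9])  # field 10: sectors written
    return stats

before = sectors_written()
time.sleep(5)
after = sectors_written()

for name in sorted(after):
    delta = after[name] - before.get(name, after[name])
    if delta and not name.startswith("loop"):
        print(f"{name:10s} ~{delta * 512 / 1e6:.2f} MB written in 5 s")

Any sdX device that keeps showing up here while you think the array should be idle is the one to chase.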


Hi @johnnie.black

 

I have really put everything into this but cannot work out what is writing to my disks?  

 

I have stopped VMs

all my shares are set to use the cache

but still something(s) are writing to my array?

 

 

Annotation 2020-04-22 231435.jpg

Annotation 2020-04-22 231455.jpg

Annotation 2020-04-22 231520.jpg

 

This is with the mover running. It is moving a large MKV file from a Samsung Pro NVMe to disk 1 on the array at 1.9 MB/s; surely there is something wrong with my setup? There is other disk activity which I cannot identify, but it is only a few KB/s; is that enough to slow things down that much?
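One rough way to see what is holding files open on the array (run as root; it assumes the usual /mnt/diskN and /mnt/user mount points, so adjust if your layout differs):

# Hedged sketch: list processes with open file handles under the array mounts.
import glob
import os

for fd_path in glob.glob("/proc/[0-9]*/fd/*"):
    try:
        target = os.readlink(fd_path)
    except OSError:
        continue  # process or handle went away while scanning
    if target.startswith(("/mnt/disk", "/mnt/user")):
        pid = fd_path.split("/")[2]
        try:
            with open(f"/proc/{pid}/comm") as f:
                comm = f.read().strip()
        except OSError:
            comm = "?"
        print(f"pid {pid:>7} ({comm}): {target}")

An open handle is not proof of writing, but it narrows the list of suspects considerably.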

 

[attached screenshot]

 

 

unraidserver-diagnostics-20200422-2326.zip


@johnnie.black Sorry to add to this, but I have just seen a few other posts from people complaining about the same thing! The more I think about it, this has only been a problem since updating. I have now updated to 6.9.0-beta1 and everything is flying again!

 

Have you any idea what the issue might be on 6.8?

 

[attached screenshot]

 

 

 

1 hour ago, johnnie.black said:

You still need to find what was writing to the disks; did you stop all dockers?

Here is another set of diags with the Docker service & VM Manager stopped.

 

As you can see there is still disk activity, but how would I find out what it is?
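One rough way from the console is to sample each process's /proc/<pid>/io counters and see whose write_bytes keep climbing. The sketch below uses only the standard procfs interface (run as root so other users' processes are readable); it will not name the share being written, but it names the process doing the writing:

# Hedged sketch: find the processes actually issuing writes over a 10 s window.
import glob
import time

def write_bytes_by_pid():
    totals = {}
    for io_path in glob.glob("/proc/[0-9]*/io"):
        pid = io_path.split("/")[2]
        try:
            with open(io_path) as f:
                fields = dict(line.split(": ") for line in f.read().splitlines())
            totals[pid] = int(fields["write_bytes"])
        except (OSError, KeyError, ValueError):
            continue  # process exited or counters unreadable
    return totals

first = write_bytes_by_pid()
time.sleep(10)
second = write_bytes_by_pid()

for pid, now in second.items():
    delta = now - first.get(pid, now)
    if delta > 0:
        try:
            with open(f"/proc/{pid}/comm") as f:
                name = f.read().strip()
        except OSError:
            name = "?"
        print(f"{name:20s} (pid {pid}) wrote {delta / 1e6:.1f} MB in 10 s")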

 

Since upgrading to the beta, not only has my mover speed increased (by what seems like 100x), but my entire array is back to its "snappy" self, especially folders with a large number of images: the thumbnails load instantly like they used to instead of generating one by one. The same goes for the GUI; everything I click is now instant again.

 

I am, however, nervous about running the beta, even though it has given me my system back! Can I ask what you would recommend, please? Is it safe enough to stay on, or should I downgrade to a previous stable release (6.7.2 seems to be what others talk about downgrading to)?

 

Thanks again for all your time!

 

 

This is with the Docker & VM services stopped; different drives are accessed randomly, not just those shown.

[attached screenshot]

 

 

unraidserver-diagnostics-20200423-0935.zip

2 hours ago, johnnie.black said:

Something is going on there; that array activity is not normal. What if you boot in safe mode (still leaving all VMs/dockers disabled)?

OK, I will try that tonight; I have too much going on at the moment to do it now.

 

Can I please ask how I can downgrade to 6.7.2?

 

Thank you

5 minutes ago, johnnie.black said:

Until recently you could get it from LT's cloud, but not anymore; I guess they recently deleted the older releases. Why did you want it? v6.7.x has known performance issues.

It was a version I have seen people talking about as a reliable one, but maybe that is not entirely true then?

 

I have not had this dog-slow performance problem until the last month or so, and I cannot remember what version I was on before it happened. I have proven that changing to the 6.9.0 beta fixes my issue, but I am nervous about being on a beta, so I would like to go back to a version that is stable and that predates the performance issues I am experiencing in 6.8.

