[SOLVED] Very slow mover and netlink errors

mbc0 · April 18, 2020

Hi,

My mover is running incredibly slowly (10+minutes per gb) even though my cache is a 970 Pro NVME

This is a new issue as I have been running the setup for a year or so. I have also noticed alot of these errors in the log

Apr 18 21:06:01 UNRAIDSERVER kernel: netlink: 4 bytes leftover after parsing attributes in process `gs-server'.
Apr 18 21:10:01 UNRAIDSERVER kernel: netlink: 4 bytes leftover after parsing attributes in process `gs-server'.
Apr 18 21:14:01 UNRAIDSERVER kernel: netlink: 4 bytes leftover after parsing attributes in process `gs-server'.
Apr 18 21:18:01 UNRAIDSERVER kernel: netlink: 4 bytes leftover after parsing attributes in process `gs-server'.
Apr 18 21:22:02 UNRAIDSERVER kernel: netlink: 4 bytes leftover after parsing attributes in process `gs-server'.

I have attached my diags if someone could take a look I would really appreciate it!

Thank you 🙂

unraidserver-diagnostics-20200418-2128.zip

Squid · April 18, 2020

15 minutes ago, mbc0 said:

Apr 18 21:06:01 UNRAIDSERVER kernel: netlink: 4 bytes leftover after parsing attributes in process `gs-server'.

Does it stop after stopping that container (gsdock?)

25 minutes ago, mbc0 said:

My mover is running incredibly slowly (10+minutes per gb)

Looking at the files that it's moving (you really should disable mover logging -> serves no real purpose unless there's an issue), but it appears to be comprised 99% of very small files. When moving files that small, file system overhead becomes noticeable.

You might also want to make sure that ssd trim plugin is running on a semi normal schedule (daily?)

It also looks like handbrake is running full tilt, along with emby transcoding something (or struggling to keep up with the massive amount of changes happening all at once), and a vm chugging along also.

All in all I'd say you're running a perfect storm right now.

mbc0 · April 18, 2020

34 minutes ago, Squid said:

Does it stop after stopping that container (gsdock?)

I saw that just after I posted and stopped the docker, just come back and yes those errors have gone, do you think they are causing an issue?

34 minutes ago, Squid said:

Looking at the files that it's moving (you really should disable mover logging -> serves no real purpose unless there's an issue), but it appears to be comprised 99% of very small files. When moving files that small, file system overhead becomes noticeable.

I was looking at the files being moved and saw they were large mkv's (4gb+) taking an age to move. I know parity slows things down but NVME to 6GB SAS drives it seems a lot slower than it used to be?

34 minutes ago, Squid said:

You might also want to make sure that ssd trim plugin is running on a semi normal schedule (daily?)

I have trim set to 2am daily

34 minutes ago, Squid said:

It also looks like handbrake is running full tilt, along with emby transcoding something (or struggling to keep up with the massive amount of changes happening all at once), and a vm chugging along also.

Handbrake is running 24/7 courtesy of Tdarr but this has been the case for years and not had an issue, it is only using half the resources. an emby transode is actually a very rare occurance! it was fluke that there was one happening at the time of taking the diags 😄

34 minutes ago, Squid said:

All in all I'd say you're running a perfect storm right now.

I thought considering my specs running mover, a handbrake encode and an almost idle VM (blue iris CCTV Raw Capture) I would be running well within capacity? do you think I am pushing too much?

I have been running this same setup for a long time and only this last few days I have had the slow mover issue so I presumed something had crept in?

mbc0 · April 18, 2020

I am used to the mover completing all it's work in the small hours when nothing else is happening but now the mover seems to be running 24/7

mbc0 · April 18, 2020

Hi Again 🙂

I have disabled the docker service and started the mover again, would you mind seeing why it is so slow now please?

Thank you!

unraidserver-diagnostics-20200419-0001.zip

JorgeB · April 19, 2020

At the time the diags were saved there were writes going on to 10 different array disks, this can only be extremely slow since parity needs to be updated simultaneously for all 10, avoid writing to more than one array disk at a time.

mbc0 · April 19, 2020

13 minutes ago, johnnie.black said:

At the time the diags were saved there were writes going on to 10 different array disks, this can only be extremely slow since parity needs to be updated simultaneously for all 10, avoid writing to more than one array disk at a time.

Thank you @johnnie.black I really appreciate your input! if you get time could you look through my responses in post #3 please? I am currently installing some more drives and having a re-shuffle of exisiting to tidy things up. I think I was in cuckoo land thinking my server could handle anything I threw at it!

mbc0 · April 21, 2020

Hi @johnnie.black

I wonder if you wouldn't mind taking a look at my diags again please?

I have reseated all connections and drives

have now got VM's running on an unassigned SSD

barely any dockers running

no emby transcodes

tdarr is disabled

the only thing running is the mover which I have to keep stopping as the performance is so bad.

I also am trying to move data from disk to disk (not when diags were taken) but getting 3-4 MB/s

the diags were just taken with the mover running.

I have another server which is an old HP Gen8 Microserver with 3 10 year old 2tb drives and a 120GB sata SSD and the mover performance is blindingly fast and I can copy disk to disk at 50ish mb/s

considering my main server is current hardware, Threadripper, SAS controller, NVME cache I cannot understand why it lags so slowly

as always I really appreciate your time!

unraidserver-diagnostics-20200421-1552.zip

JorgeB · April 21, 2020

Well, in the latest diags it's not writing to 10 array disks, but it's still writing to 8, so very slow performance is expected, simultaneous writes to more than 1 array disk will always be slow.

mbc0 · April 21, 2020

4 minutes ago, johnnie.black said:

Well, in the latest diags it's not writing to 10 array disks, but it's still writing to 8, so very slow performance is expected, simultaneous writes to more than 1 array disk will always be slow.

Can I please ask how you see this? and what I can do to get some performance back please?

Thank you

JorgeB · April 21, 2020

Just now, mbc0 said:

Can I please ask how you see this?

On the diags, loads.txt

1 minute ago, mbc0 said:

and what I can do to get some performance back please?

Start by stopping all those writes, you can see real-time disk activity on the GUI

mbc0 · April 21, 2020

2 minutes ago, johnnie.black said:

On the diags, loads.txt

I see the file but cannot understand how you know there are 8 similtaneous writes?

15:52:12 up 1:30, 0 users, load average: 32.97, 31.98, 25.45 Cores: 32
[cpu]
host=31
guest=2
[cpu0]
host=57
guest=0
[cpu1]
host=7
guest=0
[cpu2]
host=5
guest=0
[cpu3]
host=20
guest=0
[cpu4]
host=92
guest=0
[cpu5]
host=75
guest=0
[cpu6]
host=47
guest=0
[cpu7]
host=64
guest=0
[cpu8]
host=30
guest=19
[cpu9]
host=13
guest=11
[cpu10]
host=10
guest=10
[cpu11]
host=5
guest=4
[cpu12]
host=2
guest=0
[cpu13]
host=1
guest=0
[cpu14]
host=4
guest=0
[cpu15]
host=47
guest=0
[cpu16]
host=11
guest=0
[cpu17]
host=21
guest=0
[cpu18]
host=11
guest=0
[cpu19]
host=10
guest=0
[cpu20]
host=88
guest=0
[cpu21]
host=86
guest=0
[cpu22]
host=29
guest=0
[cpu23]
host=43
guest=0
[cpu24]
host=21
guest=0
[cpu25]
host=3
guest=0
[cpu26]
host=37
guest=0
[cpu27]
host=33
guest=0
[cpu28]
host=57
guest=8
[cpu29]
host=2
guest=0
[cpu30]
host=32
guest=31
[cpu31]
host=5
guest=2

sda (flash)=0 0 1749 1395
nvme0n1 (cache)=1226069 492202 664040 474684
sdb (disk7)=187050 216405 158994 472662
sdg (parity)=4483754 4801194 319516 1582149
sdh (disk12)=212309 245760 196476 128018
sdj (disk14)=0 0 155011 49261
sdk (disk1)=539306 598016 160894 188120
sdi (disk13)=0 0 154924 68
sdo (disk10)=148138 60074 169691 95509
sde (disk17)=0 0 121863 70
sdd (disk3)=0 0 123776 66
sdf (disk4)=0 0 122993 54
sdp (disk5)=436906 0 146467 54
sdc (disk9)=262144 278528 156149 311445
sdl (disk2)=0 0 121364 68
sdm (disk18)=884053 987136 155168 243151
sdn (disk6)=0 0 126870 65
sdq (disk8)=18432 6144 173974 155053
sdt (disk19)=2404352 2454186 99803 24951
sdv (disk20)=0 0 149236 1144
sdw (disk21)=0 0 115131 54
sdx (disk22)=0 0 85298 28
sdz (disk11)=0 0 148447 53

2 minutes ago, johnnie.black said:

Start by stopping all those writes, you can see real-time disk activity on the GUI

Do you mean the main tab? I can see all my drives but very hard to see what is active?

JorgeB · April 21, 2020

20 minutes ago, mbc0 said:

sdb (disk7)=187050 216405 158994 472662

first value is current read speed, second current write speed

20 minutes ago, mbc0 said:

I can see all my drives but very hard to see what is active?

Toggle is on the upper right corner.

mbc0 · April 21, 2020

1 minute ago, johnnie.black said:

first value is current read speed, second current write speed

Toggle is on the upper right corner.

Thank you so much! I never noticed that toggle!

I will do some more investigation!

mbc0 · April 22, 2020

Hi @johnnie.black

I have really put everything into this but cannot work out what is writing to my disks?

I have stopped VM's

all my shares are set to use the cache

but still something(s) are writing to my array?

This is with the mover running. it is moving a large mkv file from an Samsung Pro NVME to disk 1 on the array at 1.9mb/s surely there is something wrong with my setup? The other disk activity which I have no idea what it is but it is just a few kb/s is that enough to slow down that much?

unraidserver-diagnostics-20200422-2326.zip

Edited April 23, 2020 by mbc0

mbc0 · April 23, 2020

@johnnie.black Sorry to add to this, but I have just seen a few other posts of people complaining about the same thing! and the more I think about it this has only been a problem since updating. I have now updated to 6.9.0-beta 1 and everything is flying again!

Have you any idea what the issue might be on 6.8?

Edited April 23, 2020 by mbc0

JorgeB · April 23, 2020

You still need to find what was writing to the disks, did you stop all dockers?

mbc0 · April 23, 2020

1 hour ago, johnnie.black said:

You still need to find what was writing to the disks, did you stop all dockers?

Here is another diags with docker service & VM Manager stopped

As you can see there is still disk activity but how would I find out what it is?

since upgrading to the beta, not only has my mover speed increase (what seems 100X) also my entire array is back to it's "snappy" self especially folders with large amount of images, the thumbnails load instantly like they used to instead of watching them generate 1 by 1, the same goes for the GUI, everything I click is now instant again.

I am however nervous running the beta, even though it has given me my system back! Can I ask what you would recommend please? is it safe enough to stay on or should I downgrade to a previous stable (6.7.2 seems to be what others talk about downgrading to)

Thanks again for all your time!

This is with docker & VM Services stopped, all different drives are accessed randomly not just those shown

unraidserver-diagnostics-20200423-0935.zip

JorgeB · April 23, 2020

Something is going on there, that array activity is not normal, what if you boot in safe mode? (still leaving all VMs/dockers disable)

mbc0 · April 23, 2020

2 hours ago, johnnie.black said:

Something is going on there, that array activity is not normal, what if you boot in safe mode? (still leaving all VMs/dockers disable)

ok, I will try that tonight, I have too many activities on at the moment to do it now.

Can I ask please how I can downgrade to 6.7.2?

Thank you

JorgeB · April 23, 2020

Replace the bz* files on the flashdrive with the ones from the zip and reboot.

mbc0 · April 23, 2020

2 hours ago, johnnie.black said:

Replace the bz* files on the flashdrive with the ones from the zip and reboot.

Thank you, but I cannot find 6.7.2 anywhere? only 6.8.2 & 6.8.3 on website and google not throwing anything up?

JorgeB · April 23, 2020

Just recently you could get form LT's cloud, not anymore, I guess they recently deleted older releases, why did you want it, v6.7.x has known performance issues.

mbc0 · April 23, 2020

5 minutes ago, johnnie.black said:

Just recently you could get form LT's cloud, not anymore, I guess they recently deleted older releases, why did you want it, v6.7.x has known performance issues.

It was a version that I have found people talking about as a reliable version maybe that is not all true then?

I have not had this dog slow performance problem until the last month or so, I cannot remember what version I was on before it happened. I have proven that changing to 6.9.0 beta fixes my issue but I am nervous about being on a beta so would like to go back to a version that is stable and also before these performance issues I am experiencing in 6.8.

JorgeB · April 23, 2020

Beta should be perfectly safe, it's the same as v6.8.3 just with a newer kernel.

[SOLVED] Very slow mover and netlink errors

Recommended Posts

Link to comment

Top Posters In This Topic

Popular Days

Top Posters In This Topic

Popular Days

Popular Posts

JorgeB

Posted Images

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Join the conversation