Containers slowing to a crawl


hermy65


Over the last few days I've been running into an issue where the various containers I'm running become ridiculously slow and sometimes 100% unresponsive. Downloads drop to around 300kb/s and stay there. If I reboot the server, everything returns to normal and downloads are back in the 40-60Mb/s range, then after an hour or two everything goes to hell again.

 

Today I thought maybe it was an issue with my docker image, so I removed it and started to rebuild my containers, but that doesn't seem to be helping either.

 

My machine isn't underpowered, so that shouldn't be the issue, but I'm running out of ideas.

 

Edit: I've been adding containers for ~1.5 hours and I've only been able to add maybe 10. Something is definitely not right here.

 

Diagnostics are attached

storage-diagnostics-20181229-2350.zip

Edited by hermy65

There are read/write errors on two of your cache devices, mostly cache2:

Dec 29 22:22:04 Storage kernel: BTRFS info (device sdd1): bdev /dev/sdd1 errs: wr 0, rd 20, flush 0, corrupt 0, gen 0
Dec 29 22:22:04 Storage kernel: BTRFS info (device sdd1): bdev /dev/sdc1 errs: wr 567551, rd 153668, flush 7072, corrupt 0, gen 0

This will cause corruption on NOCOW shares, which the system share is by default. See here for more info:

https://forums.unraid.net/topic/46802-faq-for-unraid-v6/?do=findComment&comment=700582
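The per-device counters in the two kernel lines quoted above can also be pulled out of a syslog programmatically. The following is only a minimal sketch: the function names `parse_btrfs_errs` and `devices_with_errors` are made up for illustration, and the regular expression assumes the exact "bdev ... errs:" layout shown in those two lines.

```python
import re

# Matches the counters in a "BTRFS info ... bdev <dev> errs: ..." kernel line.
# Assumption: the layout is exactly the one seen in the two quoted lines.
BDEV_RE = re.compile(
    r"bdev (?P<dev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def parse_btrfs_errs(lines):
    """Return {device: {counter: value}} for every matching syslog line."""
    stats = {}
    for line in lines:
        m = BDEV_RE.search(line)
        if m:
            stats[m.group("dev")] = {
                name: int(value)
                for name, value in m.groupdict().items()
                if name != "dev"
            }
    return stats


def devices_with_errors(stats):
    """Return the sorted device names that have any non-zero counter."""
    return sorted(dev for dev, counters in stats.items() if any(counters.values()))


# The two lines quoted in the post above.
SAMPLE = [
    "Dec 29 22:22:04 Storage kernel: BTRFS info (device sdd1): "
    "bdev /dev/sdd1 errs: wr 0, rd 20, flush 0, corrupt 0, gen 0",
    "Dec 29 22:22:04 Storage kernel: BTRFS info (device sdd1): "
    "bdev /dev/sdc1 errs: wr 567551, rd 153668, flush 7072, corrupt 0, gen 0",
]

if __name__ == "__main__":
    stats = parse_btrfs_errs(SAMPLE)
    for dev in devices_with_errors(stats):
        print(dev, stats[dev])
```

Both devices are reported here, since sdd1 has 20 read errors even though sdc1 accounts for nearly all of the failures.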

 


@johnnie.black I replaced sdc from the message you posted above, but now when I run the btrfs dev stats /mnt/cache command it shows 0 for all drives. Is that normal that the errors on sdd would go away after replacing sdc?
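For anyone wanting to check the output of btrfs dev stats without reading it by eye, here is a rough sketch of a parser. The sample text is illustrative only and is not taken from this server; it assumes the usual "[device].counter_name value" layout that the command prints, and the helper names `parse_dev_stats` and `all_counters_zero` are hypothetical.

```python
import re

# Assumption: each output line looks like "[/dev/sdc1].write_io_errs    0".
STAT_RE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<name>\w+)\s+(?P<value>\d+)$")


def parse_dev_stats(text):
    """Parse `btrfs dev stats` style text into {device: {counter: value}}."""
    stats = {}
    for raw in text.splitlines():
        m = STAT_RE.match(raw.strip())
        if m:
            stats.setdefault(m.group("dev"), {})[m.group("name")] = int(m.group("value"))
    return stats


def all_counters_zero(stats):
    """True if every counter on every device is 0."""
    return all(v == 0 for counters in stats.values() for v in counters.values())


# Illustrative sample, NOT real output from the server in this thread.
SAMPLE_OUTPUT = """\
[/dev/sdc1].write_io_errs    0
[/dev/sdc1].read_io_errs     0
[/dev/sdc1].flush_io_errs    0
[/dev/sdc1].corruption_errs  0
[/dev/sdc1].generation_errs  0
"""

if __name__ == "__main__":
    stats = parse_dev_stats(SAMPLE_OUTPUT)
    print("all zero:", all_counters_zero(stats))
```

The text could come from saving the command's output to a file or capturing it with `subprocess.run`; the sketch deliberately leaves that part out so it runs anywhere.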

 

Also, I'm still seeing a lot of sluggishness with most things running on my Unraid box even after the replacement. Any other suggestions? Running the diagnostics took ~20 minutes.

 

For reference, my machine is running dual Xeon E5-2630 v4s with 64GB of RAM.

storage-diagnostics-20190105-2034.zip

Edited by hermy65
Added diagnostics
8 hours ago, hermy65 said:

Is that normal that the errors on sdd would go away after replacing sdc? 

Likely they are reset when a device is replaced.

 

Nothing jumps out in the syslog, though I might have missed something since it's spammed with lines similar to these:

Jan  5 20:32:47 Storage root: #012/dev/sdaa:#012 drive state is:  active/idle
Jan  5 20:32:47 Storage root: #012/dev/sdl:#012 drive state is:  unknown
Jan  5 20:32:47 Storage root: #012/dev/sdg:#012 drive state is:  unknown
Jan  5 20:32:47 Storage root: #012/dev/sdac:#012 drive state is:  unknown
Jan  5 20:32:47 Storage root: #012/dev/sdab:#012 drive state is:  active/idle
Jan  5 20:32:47 Storage root: #012/dev/sdr:#012 drive state is:  standby
Jan  5 20:32:47 Storage root: #012/dev/sdv:#012 drive state is:  standby
Jan  5 20:32:47 Storage root: #012/dev/sdo:#012 drive state is:  active/idle
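When a syslog is flooded like this, one option is to filter the repeated drive-state lines out before reading the rest. This is just a sketch: the pattern is inferred from the lines quoted above, and `drop_drive_state_lines` is a made-up helper name.

```python
import re

# Matches the repeated lines quoted above, e.g.
# "Storage root: #012/dev/sdaa:#012 drive state is:  active/idle"
# Assumption: all of the spam follows this same layout.
SPAM_RE = re.compile(r"root: #012/dev/\w+:#012 drive state is:")


def drop_drive_state_lines(lines):
    """Return only the lines that are not drive-state spam."""
    return [line for line in lines if not SPAM_RE.search(line)]


SAMPLE = [
    "Jan  5 20:32:47 Storage root: #012/dev/sdaa:#012 drive state is:  active/idle",
    "Jan  5 20:32:47 Storage root: #012/dev/sdl:#012 drive state is:  unknown",
    "Dec 29 22:22:04 Storage kernel: BTRFS info (device sdd1): "
    "bdev /dev/sdd1 errs: wr 0, rd 20, flush 0, corrupt 0, gen 0",
]

if __name__ == "__main__":
    for line in drop_drive_state_lines(SAMPLE):
        print(line)
```

Only the BTRFS line survives the filter in this sample, which makes anything unusual in the remaining log easier to spot.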

 
