Issues with Docker timeouts


marvar

Hi guys, can someone have a look at these diagnostics and tell me why I'm suddenly having issues with dockers and massive slowdowns? Basically, it appears my CPU now has a massive constant load and I cannot update dockers, and the whole system has slowed to a snail's pace compared to what it has been.

Thanks in advance. I appreciate any help and the chance to learn.

 

cerebro-diagnostics-20220204-1210.zip

Link to comment
1 hour ago, marvar said:

my CPU now has a massive constant load

Not sure about that, but your logs show plenty of other issues.

 

1 hour ago, marvar said:

I cannot update dockers

That is probably because you have filesystem corruption on your cache drive:

Feb  4 06:25:42 Cerebro kernel: BTRFS warning (device sdi1): direct IO failed ino 53903 rw 1,34817 sector 0x83d3908 len 0 err no 10
Feb  4 06:25:42 Cerebro kernel: sd 5:0:0:0: [sdi] tag#15 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x06 cmd_age=55s
Feb  4 06:25:42 Cerebro kernel: sd 5:0:0:0: [sdi] tag#15 CDB: opcode=0x2a 2a 00 05 f6 26 30 00 00 20 00
Feb  4 06:25:42 Cerebro kernel: blk_update_request: I/O error, dev sdi, sector 100017712 op 0x1:(WRITE) flags 0x8800 phys_seg 3 prio class 0
Feb  4 06:25:42 Cerebro kernel: BTRFS error (device sdi1): bdev /dev/sdi1 errs: wr 10, rd 0, flush 0, corrupt 0, gen 0
Feb  4 06:25:42 Cerebro kernel: BTRFS warning (device sdi1): direct IO failed ino 53903 rw 1,34817 sector 0x5f62650 len 0 err no 10

 

Your docker image is also corrupted (probably as a consequence of the cache corruption?):

Feb  4 06:25:50 Cerebro kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 10, rd 0, flush 0, corrupt 0, gen 0
Feb  4 06:25:50 Cerebro kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 11, rd 0, flush 0, corrupt 0, gen 0
Feb  4 06:25:53 Cerebro kernel: blk_update_request: I/O error, dev loop2, sector 683808 op 0x1:(WRITE) flags 0x1800 phys_seg 22 prio class 0
Feb  4 06:25:53 Cerebro kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 12, rd 0, flush 0, corrupt 0, gen 0
Feb  4 06:25:53 Cerebro kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 13, rd 0, flush 0, corrupt 0, gen 0
Feb  4 06:25:53 Cerebro kernel: BTRFS: error (device loop2) in btrfs_commit_transaction:2377: errno=-5 IO failure (Error while writing out transaction)
Feb  4 06:25:53 Cerebro kernel: BTRFS info (device loop2): forced readonly
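
To keep an eye on those counters over time, a short Python sketch along these lines could pull the latest wr/rd/flush/corrupt/gen values per device out of a saved syslog. The function name and the regex are illustrative only and are not part of any Unraid or btrfs tooling; the sample lines are copied from the excerpts above.

```python
import re

# Matches the per-device error counters that btrfs writes to the kernel log.
BTRFS_ERRS = re.compile(
    r"BTRFS error \(device (?P<fs>\S+)\): bdev (?P<bdev>\S+) errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)
COUNTERS = ("wr", "rd", "flush", "corrupt", "gen")


def latest_btrfs_counters(log_text):
    """Return {bdev: {counter: value}}, keeping the last line seen per device."""
    latest = {}
    for line in log_text.splitlines():
        m = BTRFS_ERRS.search(line)
        if m:
            latest[m["bdev"]] = {name: int(m[name]) for name in COUNTERS}
    return latest


# Lines copied from the log excerpts in this post.
sample = """\
Feb  4 06:25:42 Cerebro kernel: BTRFS error (device sdi1): bdev /dev/sdi1 errs: wr 10, rd 0, flush 0, corrupt 0, gen 0
Feb  4 06:25:50 Cerebro kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 10, rd 0, flush 0, corrupt 0, gen 0
Feb  4 06:25:53 Cerebro kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 13, rd 0, flush 0, corrupt 0, gen 0
"""

for bdev, counters in latest_btrfs_counters(sample).items():
    print(bdev, counters)
```

In these excerpts only the `wr` (write error) counter is climbing, on both the cache device and the docker image loop device.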

 

Did you fill your cache at some point?

Possibly just because of the many issues detected on cache?

 

Feb  4 06:25:42 Cerebro kernel: ata3.00: exception Emask 0x0 SAct 0x81e0027f SErr 0x0 action 0x6 frozen
Feb  4 06:25:42 Cerebro kernel: ata3.00: failed command: WRITE FPDMA QUEUED
Feb  4 06:25:42 Cerebro kernel: ata3.00: cmd 61/40:00:48:22:45/05:00:20:00:00/40 tag 0 ncq dma 688128 out
Feb  4 06:25:42 Cerebro kernel:         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  4 06:25:42 Cerebro kernel: ata3.00: status: { DRDY }
Feb  4 06:25:42 Cerebro kernel: ata3.00: failed command: WRITE FPDMA QUEUED
Feb  4 06:25:42 Cerebro kernel: ata3.00: cmd 61/c0:08:88:27:45/02:00:20:00:00/40 tag 1 ncq dma 360448 out
Feb  4 06:25:42 Cerebro kernel:         res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  4 06:25:42 Cerebro kernel: ata3.00: status: { DRDY }
Feb  4 06:25:42 Cerebro kernel: ata3.00: failed command: WRITE FPDMA QUEUED
Feb  4 06:25:42 Cerebro kernel: ata3.00: cmd 61/40:10:48:2a:45/05:00:20:00:00/40 tag 2 ncq dma 688128 out
Feb  4 06:25:42 Cerebro kernel:         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  4 06:25:42 Cerebro kernel: ata3.00: status: { DRDY }

 

There are also issues with your disk 1:

Feb  2 12:12:06 Cerebro kernel: ata5: EH complete
Feb  2 12:12:13 Cerebro kernel: ata5.00: exception Emask 0x0 SAct 0x700f8000 SErr 0x0 action 0x0
Feb  2 12:12:13 Cerebro kernel: ata5.00: irq_stat 0x40000008
Feb  2 12:12:13 Cerebro kernel: ata5.00: failed command: READ FPDMA QUEUED
Feb  2 12:12:13 Cerebro kernel: ata5.00: cmd 60/40:98:f0:66:f7/05:00:05:00:00/40 tag 19 ncq dma 688128 in
Feb  2 12:12:13 Cerebro kernel:         res 41/40:00:67:6b:f7/00:00:05:00:00/40 Emask 0x409 (media error) <F>
Feb  2 12:12:13 Cerebro kernel: ata5.00: status: { DRDY ERR }
Feb  2 12:12:13 Cerebro kernel: ata5.00: error: { UNC }
Feb  2 12:12:13 Cerebro kernel: ata5.00: configured for UDMA/133
Feb  2 12:12:13 Cerebro kernel: sd 7:0:0:0: [sdk] tag#19 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=14s
Feb  2 12:12:13 Cerebro kernel: sd 7:0:0:0: [sdk] tag#19 Sense Key : 0x3 [current] 
Feb  2 12:12:13 Cerebro kernel: sd 7:0:0:0: [sdk] tag#19 ASC=0x11 ASCQ=0x4 
Feb  2 12:12:13 Cerebro kernel: sd 7:0:0:0: [sdk] tag#19 CDB: opcode=0x88 88 00 00 00 00 00 05 f7 66 f0 00 00 05 40 00 00
Feb  2 12:12:13 Cerebro kernel: blk_update_request: I/O error, dev sdk, sector 100100967 op 0x0:(READ) flags 0x4000 phys_seg 26 prio class 0
Feb  2 12:12:13 Cerebro kernel: md: disk1 read error, sector=100100896
Feb  2 12:12:13 Cerebro kernel: md: disk1 read error, sector=100100904
Feb  2 12:12:13 Cerebro kernel: md: disk1 read error, sector=100100912
Feb  2 12:12:13 Cerebro kernel: md: disk1 read error, sector=100100920
Feb  2 12:12:13 Cerebro kernel: md: disk1 read error, sector=100100928
Feb  2 12:12:13 Cerebro kernel: md: disk1 read error, sector=100100936

 

And the SMART attributes do not look good on that one.

You should configure Unraid to watch SMART attributes 1 and 200 on WD drives.
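
As a rough illustration of what watching attributes 1 and 200 amounts to, the Python sketch below scans text in the tabular layout that `smartctl -A` prints for ATA drives and reports any watched attribute with a non-zero raw value. The helper name and the sample table are made up for the example (the numbers are not taken from the diagnostics in this thread), and treating any non-zero raw value as worth a look is an assumption, not Unraid's own rule.

```python
WATCHED_IDS = {1, 200}  # the attribute IDs named in the post above


def nonzero_watched(smartctl_output, watched=WATCHED_IDS):
    """Return {id: (name, raw_value)} for watched attributes whose raw value is not 0."""
    flagged = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header row and any other text are skipped.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id = int(fields[0])
        raw = fields[9]
        if attr_id in watched and raw.isdigit() and int(raw) != 0:
            flagged[attr_id] = (fields[1], int(raw))
    return flagged


# Illustrative text only, NOT taken from the poster's diagnostics.
sample = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       37
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
"""

print(nonzero_watched(sample))
```

To use it on a real drive, the text would come from saving the output of `smartctl -A` for that device to a file and reading it in.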

 

I would start by checking the connections on all drives, both data and power, at both ends.

Then run an extended SMART test on cache and disk1 (you might have to disable disk spin down).

 

JorgeB will be more qualified to give advice on the BTRFS corruption.

 

 

Probably unrelated, but do you really have 74 shares?

Link to comment

Thanks very much for the original reply, mate. I appreciate you looking at my logs.

Yes, I have issues with disk 1; I think it's just age. The system is pretty much a Frankenstein machine, as I'm new to Unraid and NASes. I've just ordered a replacement 8 TB disk, which I will use for parity, and I will swap the old 4 TB parity disk into the disk 1 position and retire the current disk.

So the cache did fill at one point. I misunderstood how mover worked in relation to the array. I believe I have now set up the cache correctly, and also the mover scheduler.

Hopefully this is why there is a massive load on the CPU, as it is redoing the cache file system. I will wait for it to finish. Do you know if there is a way to see what the CPU is actually processing?

 

Link to comment
5 minutes ago, marvar said:

What is the best way to do this, from the terminal or from Krusader? Sorry, total noob here. I can't find config/shares in Krusader.

Setting up Krusader to allow you to work with the flash drive is a bit more complicated.

 

The simplest solution would be to put the flash drive in a PC and

2 hours ago, itimpi said:

Delete any .cfg files in config/shares on the flash drive that do not correspond to shares you actually have.  
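
With the flash drive in a PC, finding the leftover files could be sketched in Python roughly as follows. The function name, the share names and the file names in the demo are all hypothetical; the demo runs against a throwaway temporary folder, so nothing on a real flash drive is touched, and the sketch only lists candidates rather than deleting anything.

```python
import tempfile
from pathlib import Path


def stale_share_configs(shares_dir, real_shares):
    """List .cfg files in shares_dir whose name matches none of real_shares."""
    # Compared case-insensitively as a cautious choice, so fewer files get flagged.
    wanted = {name.lower() for name in real_shares}
    return sorted(
        cfg for cfg in Path(shares_dir).glob("*.cfg")
        if cfg.stem.lower() not in wanted
    )


# Demo against a temporary folder standing in for config/shares on the flash drive.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("appdata.cfg", "media.cfg", "oldtest.cfg"):  # hypothetical files
        (Path(tmp) / name).touch()
    stale = stale_share_configs(tmp, ["appdata", "media"])  # hypothetical shares
    print("candidates for deletion:", [p.name for p in stale])
```

For real use, `shares_dir` would point at the config/shares folder on the flash drive wherever the PC mounts it, and the list would hold the shares that actually exist on the server. Reviewing the printed names before deleting anything by hand is the safer route.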

 

 

5 hours ago, marvar said:

I did stupidly leave the router open and someone attacked the server and ransomwared a heap of files

 

Link to comment

OK, I have followed Squid's advice on a few posts: scrubbed the cache, reset the cache, and ordered a new parity drive to swap out with drive 1. Maybe I'm getting a little ahead of myself, but so far, smooth sailing :) Pretty sure once I submit this reply it will all fall apart, haha. As my mate says, the journey is half the fun. Thanks.

Link to comment
