
Btrfs errors in cache and high I/O Wait


igoehr


Hi,

 

I am currently seeing btrfs errors on my cache drives and I am not sure what is causing them. In addition, my system, and especially the Docker containers, sometimes behaves strangely: I/O wait times appear unusually high. CPU usage on the Dashboard sits at 50-60%, while top shows nearly zero CPU usage but 60% wait. These episodes seem to correlate with the btrfs errors. Does anyone have tips on how to investigate this behavior and these errors?
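One way to confirm the wait figure independently of the Dashboard is to sample the aggregate `cpu` line of `/proc/stat` twice and compute the iowait share of the elapsed ticks. The following is only a minimal Python sketch: the function names `iowait_percent` and `sample_iowait` are made up for illustration, and it assumes the usual Linux field order (user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice).

```python
import time


def iowait_percent(before, after):
    """Return the percentage of CPU time spent in iowait between two samples.

    Each argument is the aggregate "cpu ..." line read from /proc/stat.
    """
    def fields(line):
        parts = line.split()
        if not parts or parts[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line")
        return [int(x) for x in parts[1:]]

    deltas = [b - a for a, b in zip(fields(before), fields(after))]
    # guest and guest_nice are already counted in user and nice,
    # so only the first eight fields contribute to the total.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait


def sample_iowait(interval=5.0):
    """Read /proc/stat twice, `interval` seconds apart (Linux only)."""
    def read_cpu_line():
        with open("/proc/stat") as f:
            return f.readline()

    first = read_cpu_line()
    time.sleep(interval)
    return iowait_percent(first, read_cpu_line())
```

Calling `sample_iowait()` while the system feels sluggish gives a number that can be compared with the wait value top reports.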

 

I did a full scrub of the cache this morning but it didn't find any errors...

 

This is the output of btrfs dev stats:

 

root@eon:~# btrfs dev stats /mnt/cache
[/dev/sdd1].write_io_errs    2655
[/dev/sdd1].read_io_errs     2679
[/dev/sdd1].flush_io_errs    0
[/dev/sdd1].corruption_errs  0
[/dev/sdd1].generation_errs  0
[/dev/sde1].write_io_errs    1115
[/dev/sde1].read_io_errs     1861
[/dev/sde1].flush_io_errs    0
[/dev/sde1].corruption_errs  0
[/dev/sde1].generation_errs  0
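These counters are cumulative, so it helps to record them and compare later runs to see whether they are still growing. Below is a small Python sketch that parses output in the format shown above and keeps only the non-zero counters. The helper names `parse_dev_stats` and `nonzero_counters` are hypothetical, and `SAMPLE` simply repeats the values from this post.

```python
import re
from collections import defaultdict

# Matches lines such as "[/dev/sdd1].write_io_errs    2655"
STAT_LINE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)$")

SAMPLE = """\
[/dev/sdd1].write_io_errs    2655
[/dev/sdd1].read_io_errs     2679
[/dev/sdd1].flush_io_errs    0
[/dev/sdd1].corruption_errs  0
[/dev/sdd1].generation_errs  0
[/dev/sde1].write_io_errs    1115
[/dev/sde1].read_io_errs     1861
[/dev/sde1].flush_io_errs    0
[/dev/sde1].corruption_errs  0
[/dev/sde1].generation_errs  0
"""


def parse_dev_stats(text):
    """Return {device: {counter: value}} from `btrfs dev stats` output."""
    stats = defaultdict(dict)
    for line in text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match:
            stats[match["dev"]][match["counter"]] = int(match["value"])
    return dict(stats)


def nonzero_counters(stats):
    """Keep only the devices and counters whose value is above zero."""
    return {
        dev: {name: value for name, value in counters.items() if value}
        for dev, counters in stats.items()
        if any(counters.values())
    }


if __name__ == "__main__":
    for dev, counters in nonzero_counters(parse_dev_stats(SAMPLE)).items():
        print(dev, counters)
```

Here only the read and write I/O error counters are non-zero; the corruption and generation counters are zero on both devices.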

 

 

eon-diagnostics-20200123-0851.zip


There are ATA errors on both cache devices:

Jan 22 22:27:32 eon kernel: ata4.00: exception Emask 0x0 SAct 0x3e00000 SErr 0x0 action 0x6 frozen
Jan 22 22:27:32 eon kernel: ata4.00: failed command: READ FPDMA QUEUED
Jan 22 22:27:32 eon kernel: ata4.00: cmd 60/08:a8:38:aa:21/00:00:1d:00:00/40 tag 21 ncq dma 4096 in
Jan 22 22:27:32 eon kernel:         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 22 22:27:32 eon kernel: ata4.00: status: { DRDY }
Jan 22 22:27:32 eon kernel: ata4.00: failed command: READ FPDMA QUEUED
Jan 22 22:27:32 eon kernel: ata4.00: cmd 60/70:b0:d0:ae:21/00:00:1d:00:00/40 tag 22 ncq dma 57344 in
Jan 22 22:27:32 eon kernel:         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 22 22:27:32 eon kernel: ata4.00: status: { DRDY }


Jan 22 22:29:49 eon kernel: ata3.00: exception Emask 0x0 SAct 0x6 SErr 0x0 action 0x6 frozen
Jan 22 22:29:49 eon kernel: ata3.00: failed command: READ FPDMA QUEUED
Jan 22 22:29:49 eon kernel: ata3.00: cmd 60/38:08:88:55:22/00:00:1d:00:00/40 tag 1 ncq dma 28672 in
Jan 22 22:29:49 eon kernel:         res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 22 22:29:49 eon kernel: ata3.00: status: { DRDY }
Jan 22 22:29:49 eon kernel: ata3.00: failed command: READ FPDMA QUEUED
Jan 22 22:29:49 eon kernel: ata3.00: cmd 60/28:10:38:0e:da/00:00:00:00:00/40 tag 2 ncq dma 20480 in
Jan 22 22:29:49 eon kernel:         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 22 22:29:49 eon kernel: ata3.00: status: { DRDY }
Jan 22 22:29:49 eon kernel: ata3: hard resetting link
Jan 22 22:29:59 eon kernel: ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 22 22:29:59 eon kernel: ata3.00: configured for UDMA/133
Jan 22 22:29:59 eon kernel: sd 3:0:0:0: [sdd] tag#31 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x06
Jan 22 22:29:59 eon kernel: sd 3:0:0:0: [sdd] tag#31 CDB: opcode=0x28 28 00 1d 22 4d 68 00 00 28 00
Jan 22 22:29:59 eon kernel: print_req_error: I/O error, dev sdd, sector 488787304

This is a hardware problem. It could be a connection issue, but it is strange that it happens on both devices at the same time, so it could also be a compatibility issue between your board and that SSD model. Try replacing or swapping all cables first, including the power cables.
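To check whether the errors come back after the cables have been swapped, the syslog can be scanned for the same kind of messages as in the excerpt above. This is a rough Python sketch and not an official tool: the function name `summarize_ata_errors` and the two regular expressions are assumptions based only on the line format quoted in this thread, and `SAMPLE_LINES` repeats a few of those lines.

```python
import re
from collections import Counter

# Patterns modelled on the kernel messages quoted in this thread.
FAILED_CMD = re.compile(r"kernel: (ata\d+)\.\d+: failed command: (.+)$")
LINK_RESET = re.compile(r"kernel: (ata\d+): hard resetting link")

SAMPLE_LINES = [
    "Jan 22 22:27:32 eon kernel: ata4.00: failed command: READ FPDMA QUEUED",
    "Jan 22 22:27:32 eon kernel: ata4.00: failed command: READ FPDMA QUEUED",
    "Jan 22 22:29:49 eon kernel: ata3.00: failed command: READ FPDMA QUEUED",
    "Jan 22 22:29:49 eon kernel: ata3.00: failed command: READ FPDMA QUEUED",
    "Jan 22 22:29:49 eon kernel: ata3: hard resetting link",
    "Jan 22 22:29:59 eon kernel: ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)",
]


def summarize_ata_errors(lines):
    """Count failed commands per (port, command) and link resets per port."""
    failed = Counter()
    resets = Counter()
    for line in lines:
        line = line.rstrip()
        match = FAILED_CMD.search(line)
        if match:
            failed[(match.group(1), match.group(2))] += 1
            continue
        match = LINK_RESET.search(line)
        if match:
            resets[match.group(1)] += 1
    return failed, resets


if __name__ == "__main__":
    failed, resets = summarize_ata_errors(SAMPLE_LINES)
    for (port, command), count in sorted(failed.items()):
        print(f"{port}: {count} x {command}")
    for port, count in sorted(resets.items()):
        print(f"{port}: {count} link reset(s)")
```

Passing an open log file instead of `SAMPLE_LINES` works the same way, since the function only iterates over lines. If the counts stop growing after the cable swap, a connection problem becomes the more likely explanation.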
