
Log filling up with BTRFS errors/IO errors


Solved by JorgeB


Hi all,

 

My logs keep filling up with the errors below. It has happened twice since I installed new cache disks. I've tried adjusting the cable and am about to try replacing it. Could it be something other than a hardware issue? Diagnostics are attached below; here's an excerpt:
 

May  6 08:49:20 Tower kernel: BTRFS error (device sdg1): bdev /dev/sdg1 errs: wr 8077984, rd 49379, flush 459477, corrupt 0, gen 0
May  6 08:49:20 Tower kernel: BTRFS error (device sdg1): bdev /dev/sdg1 errs: wr 8077985, rd 49379, flush 459477, corrupt 0, gen 0
May  6 08:49:20 Tower kernel: sd 6:0:0:0: [sdg] tag#11 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=DRIVER_OK cmd_age=0s
May  6 08:49:20 Tower kernel: sd 6:0:0:0: [sdg] tag#11 CDB: opcode=0x2a 2a 00 00 62 7c 40 00 00 60 00
May  6 08:49:20 Tower kernel: I/O error, dev sdg, sector 6454336 op 0x1:(WRITE) flags 0x1800 phys_seg 12 prio class 2
May  6 08:49:20 Tower kernel: sd 6:0:0:0: [sdg] tag#12 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=DRIVER_OK cmd_age=0s
May  6 08:49:20 Tower kernel: sd 6:0:0:0: [sdg] tag#12 CDB: opcode=0x2a 2a 00 00 62 7c e0 00 00 40 00
May  6 08:49:20 Tower kernel: I/O error, dev sdg, sector 6454496 op 0x1:(WRITE) flags 0x1800 phys_seg 8 prio class 2
May  6 08:49:20 Tower kernel: BTRFS error (device sdg1): bdev /dev/sdg1 errs: wr 8077986, rd 49379, flush 459477, corrupt 0, gen 0
May  6 08:49:20 Tower kernel: BTRFS error (device sdg1): bdev /dev/sdg1 errs: wr 8077987, rd 49379, flush 459477, corrupt 0, gen 0
May  6 08:49:20 Tower kernel: BTRFS error (device sdg1): bdev /dev/sdg1 errs: wr 8077988, rd 49379, flush 459477, corrupt 0, gen 0
May  6 08:49:20 Tower kernel: sd 6:0:0:0: [sdg] tag#22 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=DRIVER_OK cmd_age=0s
May  6 08:49:20 Tower kernel: sd 6:0:0:0: [sdg] tag#22 CDB: opcode=0x35 35 00 00 00 00 00 00 00 00 00
May  6 08:49:20 Tower kernel: I/O error, dev sdg, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 2
May  6 08:49:20 Tower kernel: BTRFS error (device sdg1): bdev /dev/sdg1 errs: wr 8077988, rd 49379, flush 459478, corrupt 0, gen 0
May  6 08:49:20 Tower kernel: sd 6:0:0:0: [sdg] tag#6 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=DRIVER_OK cmd_age=0s
May  6 08:49:20 Tower kernel: sd 6:0:0:0: [sdg] tag#6 CDB: opcode=0x2a 2a 00 00 00 08 80 00 00 08 00
May  6 08:49:20 Tower kernel: I/O error, dev sdg, sector 2176 op 0x1:(WRITE) flags 0x3800 phys_seg 1 prio class 2
May  6 08:49:20 Tower kernel: I/O error, dev sdg, sector 2176 op 0x1:(WRITE) flags 0x3800 phys_seg 1 prio class 2
May  6 08:49:20 Tower kernel: btrfs_end_super_write: 17 callbacks suppressed
May  6 08:49:20 Tower kernel: BTRFS warning (device sdg1): lost page write due to IO error on /dev/sdg1 (-5)
May  6 08:49:20 Tower kernel: BTRFS error (device sdg1): bdev /dev/sdg1 errs: wr 8077989, rd 49379, flush 459478, corrupt 0, gen 0
May  6 08:49:20 Tower kernel: BTRFS error (device sdg1): error writing primary super block to device 1
May  6 08:49:20 Tower kernel: sd 6:0:0:0: [sdg] tag#7 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=DRIVER_OK cmd_age=0s
May  6 08:49:20 Tower kernel: sd 6:0:0:0: [sdg] tag#7 CDB: opcode=0x2a 2a 00 97 cd a6 a8 00 00 08 00
May  6 08:49:20 Tower kernel: BTRFS warning (device sdg1): lost page write due to IO error on /dev/sdg1 (-5)
May  6 08:49:20 Tower kernel: BTRFS error (device sdg1): error writing primary super block to device 1
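
For anyone reading an excerpt like this, the `errs:` lines carry running per-device counters (write, read, flush, corruption, generation). Below is a minimal sketch, not an official tool, that pulls the most recent counters per device out of syslog lines. The function name `latest_counters` and the regular expression are illustrative assumptions based only on the line format shown above.

```python
import re

# Matches the per-device counter line btrfs logs after a failed I/O,
# in the format seen in the excerpt above (an assumption, not a spec).
ERRS_RE = re.compile(
    r"BTRFS error \(device (?P<dev>\S+)\): bdev \S+ errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def latest_counters(lines):
    """Return {device: {counter: value}} taken from the last matching line per device."""
    latest = {}
    for line in lines:
        match = ERRS_RE.search(line)
        if match:
            fields = match.groupdict()
            dev = fields.pop("dev")
            latest[dev] = {name: int(value) for name, value in fields.items()}
    return latest


if __name__ == "__main__":
    sample = [
        "May  6 08:49:20 Tower kernel: BTRFS error (device sdg1): "
        "bdev /dev/sdg1 errs: wr 8077984, rd 49379, flush 459477, corrupt 0, gen 0",
        "May  6 08:49:20 Tower kernel: BTRFS error (device sdg1): "
        "bdev /dev/sdg1 errs: wr 8077989, rd 49379, flush 459478, corrupt 0, gen 0",
    ]
    print(latest_counters(sample))
```

In this excerpt the `wr`, `rd` and `flush` counters climb while `corrupt` and `gen` stay at 0, which would be consistent with the device not accepting I/O at all rather than returning bad data, though that reading is an inference from the counters alone.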



I think the issues begin with the error below. Could it mean that my PCI SATA controller is the underlying culprit?

 

May  5 16:18:04 Tower kernel: ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
May  5 16:18:04 Tower kernel: ata6.00: failed command: DATA SET MANAGEMENT
May  5 16:18:04 Tower kernel: ata6.00: cmd 06/01:01:00:00:00/00:00:00:00:00/a0 tag 21 dma 512 out
May  5 16:18:04 Tower kernel:         res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
May  5 16:18:04 Tower kernel: ata6.00: status: { DRDY }
May  5 16:18:04 Tower kernel: ata6: hard resetting link
May  5 16:18:10 Tower kernel: ata6: link is slow to respond, please be patient (ready=0)
May  5 16:18:14 Tower kernel: ata6: COMRESET failed (errno=-16)
May  5 16:18:14 Tower kernel: ata6: hard resetting link
May  5 16:18:20 Tower kernel: ata6: link is slow to respond, please be patient (ready=0)
May  5 16:18:24 Tower kernel: ata6: COMRESET failed (errno=-16)
May  5 16:18:24 Tower kernel: ata6: hard resetting link
May  5 16:18:30 Tower kernel: ata6: link is slow to respond, please be patient (ready=0)
May  5 16:18:59 Tower kernel: ata6: COMRESET failed (errno=-16)
May  5 16:18:59 Tower kernel: ata6: limiting SATA link speed to 3.0 Gbps
May  5 16:18:59 Tower kernel: ata6: hard resetting link
May  5 16:19:04 Tower kernel: ata6: COMRESET failed (errno=-16)
May  5 16:19:04 Tower kernel: ata6: reset failed, giving up
May  5 16:19:04 Tower kernel: ata6.00: disable device
May  5 16:19:04 Tower kernel: ata6: EH complete
May  5 16:19:04 Tower kernel: sd 6:0:0:0: [sdg] tag#5 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=DRIVER_OK cmd_age=89s
May  5 16:19:04 Tower kernel: sd 6:0:0:0: [sdg] tag#5 CDB: opcode=0x28 28 00 2d a9 29 40 00 00 08 00
May  5 16:19:04 Tower kernel: I/O error, dev sdg, sector 766060864 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 2
May  5 16:19:04 Tower kernel: BTRFS error (device sdg1): bdev /dev/sdg1 errs: wr 0, rd 1, flush 0, corrupt 0, gen 0
May  5 16:19:04 Tower kernel: sd 6:0:0:0: [sdg] tag#6 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=DRIVER_OK cmd_age=0s
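
The `CDB:` lines in these excerpts can be decoded to see which commands are being rejected. The sketch below is a hedged illustration rather than a complete SCSI decoder: it handles only 10-byte CDBs and names only the three opcodes that appear above (0x28, 0x2a, 0x35). The helper name `decode_cdb10` is hypothetical.

```python
# Names for the three 10-byte SCSI opcodes seen in the excerpts above.
OPCODES = {
    0x28: "READ(10)",
    0x2A: "WRITE(10)",
    0x35: "SYNCHRONIZE CACHE(10)",
}


def decode_cdb10(cdb_hex):
    """Decode a 10-byte CDB given as space-separated hex, as printed by the kernel."""
    raw = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(raw) != 10:
        raise ValueError("expected a 10-byte CDB")
    return {
        "command": OPCODES.get(raw[0], f"opcode 0x{raw[0]:02x}"),
        # Bytes 2-5 hold the big-endian logical block address.
        "lba": int.from_bytes(raw[2:6], "big"),
        # Bytes 7-8 hold the big-endian block count.
        "blocks": int.from_bytes(raw[7:9], "big"),
    }


if __name__ == "__main__":
    print(decode_cdb10("2a 00 00 62 7c 40 00 00 60 00"))
    print(decode_cdb10("28 00 2d a9 29 40 00 00 08 00"))
```

The decoded addresses match the sector numbers in the adjacent `I/O error` lines (6454336 and 766060864), so these are ordinary reads, writes and cache flushes being failed after `ata6.00: disable device`, rather than anything unusual about the commands themselves.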


 

Update: I added iommu=pt in case my Marvell controller is causing the issues. Before doing that, I switched cables (SATA and SATA power adapter) and rebooted. The logs for each cache drive are littered with errors, so I'm going to keep the array offline until I have a clue what's happening, as I'm concerned about data corruption/loss.

 

 

 

Thanks!

tower-diagnostics-20230506-0836.zip

Edited by TurkeyPerson