cache btrfs errors



This keeps occurring even after a complete format of the cache, rebuilding the docker images, and restoring app data. The errors seem to occur after some Deluge activity.

 

The cache pool is 2 SSDs in btrfs-encrypted RAID 1.

 

Do I have a bad SSD?

 

Is there a way to do an SSD test?

 

Quote

Nov 15 14:51:22 Tower kernel: loop: Write error at byte offset 710000640, length 4096.
Nov 15 14:51:22 Tower kernel: print_req_error: I/O error, dev loop2, sector 1386720
Nov 15 14:51:22 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
Nov 15 14:51:22 Tower kernel: loop: Write error at byte offset 710017024, length 4096.
Nov 15 14:51:22 Tower kernel: print_req_error: I/O error, dev loop2, sector 1386752
Nov 15 14:51:22 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 2, rd 0, flush 0, corrupt 0, gen 0
Nov 15 14:51:22 Tower kernel: loop: Write error at byte offset 978436096, length 4096.
Nov 15 14:51:22 Tower kernel: print_req_error: I/O error, dev loop2, sector 1911008
Nov 15 14:51:22 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 3, rd 0, flush 0, corrupt 0, gen 0
Nov 15 14:51:22 Tower kernel: loop: Write error at byte offset 978452480, length 4096.
Nov 15 14:51:22 Tower kernel: print_req_error: I/O error, dev loop2, sector 1911040
Nov 15 14:51:22 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 4, rd 0, flush 0, corrupt 0, gen 0
Nov 15 14:51:24 Tower kernel: loop: Write error at byte offset 709902336, length 4096.
Nov 15 14:51:24 Tower kernel: print_req_error: I/O error, dev loop2, sector 1386528
Nov 15 14:51:24 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 5, rd 0, flush 0, corrupt 0, gen 0
Nov 15 14:51:24 Tower kernel: loop: Write error at byte offset 709951488, length 4096.
Nov 15 14:51:24 Tower kernel: print_req_error: I/O error, dev loop2, sector 1386624
Nov 15 14:51:24 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 6, rd 0, flush 0, corrupt 0, gen 0
Nov 15 14:51:24 Tower kernel: loop: Write error at byte offset 710033408, length 4096.
Nov 15 14:51:24 Tower kernel: print_req_error: I/O error, dev loop2, sector 1386784
Nov 15 14:51:24 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 7, rd 0, flush 0, corrupt 0, gen 0
Nov 15 14:51:24 Tower kernel: loop: Write error at byte offset 710082560, length 4096.
Nov 15 14:51:24 Tower kernel: print_req_error: I/O error, dev loop2, sector 1386880
Nov 15 14:51:24 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 8, rd 0, flush 0, corrupt 0, gen 0
Nov 15 14:51:24 Tower kernel: loop: Write error at byte offset 710148096, length 4096.
Nov 15 14:51:24 Tower kernel: print_req_error: I/O error, dev loop2, sector 1387008
Nov 15 14:51:24 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 9, rd 0, flush 0, corrupt 0, gen 0
Nov 15 14:51:24 Tower kernel: loop: Write error at byte offset 710213632, length 4096.
Nov 15 14:51:24 Tower kernel: print_req_error: I/O error, dev loop2, sector 1387136
Nov 15 14:51:24 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 10, rd 0, flush 0, corrupt 0, gen 0
Nov 15 14:51:24 Tower kernel: BTRFS: error (device loop2) in btrfs_commit_transaction:2267: errno=-5 IO failure (Error while writing out transaction)
Nov 15 14:51:24 Tower kernel: BTRFS info (device loop2): forced readonly
Nov 15 14:51:24 Tower kernel: BTRFS warning (device loop2): Skipping commit of aborted transaction.
Nov 15 14:51:24 Tower kernel: BTRFS: error (device loop2) in cleanup_transaction:1860: errno=-5 IO failure
Nov 15 14:51:24 Tower kernel: BTRFS info (device loop2): delayed_refs has NO entry
Nov 15 14:54:15 Tower webGUI: Successful login user root from 10.10.10.24
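
For anyone reading a log like the one above, here is a minimal Python sketch (not part of the original post) that pulls out the btrfs per-device error counters and converts the loop driver's byte offsets into 512-byte sectors. The names `summarize_btrfs` and `SAMPLE` are illustrative only; the sample lines are copied from the quoted log.

```python
import re

# Matches lines such as:
#   BTRFS error (device loop2): bdev /dev/loop2 errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
BTRFS_ERRS = re.compile(
    r"BTRFS error \(device (?P<dev>\S+)\): bdev \S+ errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)
# Matches lines such as:
#   loop: Write error at byte offset 710000640, length 4096.
WRITE_ERR = re.compile(r"Write error at byte offset (?P<off>\d+), length (?P<len>\d+)")

SECTOR_SIZE = 512  # the kernel reports sectors in 512-byte units


def summarize_btrfs(lines):
    """Return (latest error counters per device, sectors of failed writes)."""
    counters = {}
    sectors = []
    for line in lines:
        m = BTRFS_ERRS.search(line)
        if m:
            # Counters are cumulative, so the last line seen per device wins.
            counters[m["dev"]] = {
                key: int(m[key]) for key in ("wr", "rd", "flush", "corrupt", "gen")
            }
            continue
        m = WRITE_ERR.search(line)
        if m:
            sectors.append(int(m["off"]) // SECTOR_SIZE)
    return counters, sectors


# Sample lines copied from the log quoted in this post.
SAMPLE = [
    "Nov 15 14:51:22 Tower kernel: loop: Write error at byte offset 710000640, length 4096.",
    "Nov 15 14:51:22 Tower kernel: print_req_error: I/O error, dev loop2, sector 1386720",
    "Nov 15 14:51:22 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 1, rd 0, flush 0, corrupt 0, gen 0",
    "Nov 15 14:51:22 Tower kernel: loop: Write error at byte offset 710017024, length 4096.",
    "Nov 15 14:51:22 Tower kernel: print_req_error: I/O error, dev loop2, sector 1386752",
    "Nov 15 14:51:22 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 2, rd 0, flush 0, corrupt 0, gen 0",
]

counters, sectors = summarize_btrfs(SAMPLE)
print(counters)
print(sectors)
```

The computed sectors line up with the `print_req_error` lines (710000640 / 512 = 1386720). In this excerpt only the `wr` counter climbs while `rd` and `corrupt` stay at 0, which suggests that writes to whatever backs `loop2` are failing rather than that btrfs is detecting checksum mismatches on read. That is a reading of the counters, not a confirmed diagnosis.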

 

 

tower-diagnostics-20201115-1632.zip


Tonight I got a SMART error, "199 CRC error count", on one of the SSD drives, along with many errors as shown below. It seems this could be related to a bad SATA cable. I will swap the cable and power tomorrow.

 

Any other ideas?

 

Thank you,

 

Quote

Nov 15 19:07:46 Tower kernel: TCP: request_sock_TCP: Possible SYN flooding on port 25317. Sending cookies.  Check SNMP counters.
Nov 15 20:00:07 Tower crond[1855]: exit status 1 from user root /usr/local/sbin/mover &> /dev/null
Nov 15 20:09:29 Tower kernel: ata3.00: exception Emask 0x10 SAct 0xf00 SErr 0x400100 action 0x6 frozen
Nov 15 20:09:29 Tower kernel: ata3.00: irq_stat 0x08000000, interface fatal error
Nov 15 20:09:29 Tower kernel: ata3: SError: { UnrecovData Handshk }
Nov 15 20:09:29 Tower kernel: ata3.00: failed command: WRITE FPDMA QUEUED
Nov 15 20:09:29 Tower kernel: ata3.00: cmd 61/80:40:c0:0d:e6/00:00:0b:00:00/40 tag 8 ncq dma 65536 out
Nov 15 20:09:29 Tower kernel:         res 40/00:58:40:0f:e6/00:00:0b:00:00/40 Emask 0x10 (ATA bus error)
Nov 15 20:09:29 Tower kernel: ata3.00: status: { DRDY }
Nov 15 20:09:29 Tower kernel: ata3.00: failed command: WRITE FPDMA QUEUED
Nov 15 20:09:29 Tower kernel: ata3.00: cmd 61/80:48:40:0e:e6/00:00:0b:00:00/40 tag 9 ncq dma 65536 out
Nov 15 20:09:29 Tower kernel:         res 40/00:58:40:0f:e6/00:00:0b:00:00/40 Emask 0x10 (ATA bus error)
Nov 15 20:09:29 Tower kernel: ata3.00: status: { DRDY }
Nov 15 20:09:29 Tower kernel: ata3.00: failed command: WRITE FPDMA QUEUED
Nov 15 20:09:29 Tower kernel: ata3.00: cmd 61/80:50:c0:0e:e6/00:00:0b:00:00/40 tag 10 ncq dma 65536 out
Nov 15 20:09:29 Tower kernel:         res 40/00:58:40:0f:e6/00:00:0b:00:00/40 Emask 0x10 (ATA bus error)
Nov 15 20:09:29 Tower kernel: ata3.00: status: { DRDY }
Nov 15 20:09:29 Tower kernel: ata3.00: failed command: WRITE FPDMA QUEUED
Nov 15 20:09:29 Tower kernel: ata3.00: cmd 61/60:58:40:0f:e6/00:00:0b:00:00/40 tag 11 ncq dma 49152 out
Nov 15 20:09:29 Tower kernel:         res 40/00:58:40:0f:e6/00:00:0b:00:00/40 Emask 0x10 (ATA bus error)
Nov 15 20:09:29 Tower kernel: ata3.00: status: { DRDY }
Nov 15 20:09:29 Tower kernel: ata3: hard resetting link
Nov 15 20:09:29 Tower kernel: ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Nov 15 20:09:29 Tower kernel: ata3.00: supports DRM functions and may not be fully accessible
Nov 15 20:09:29 Tower kernel: ata3.00: NCQ Send/Recv Log not supported
Nov 15 20:09:29 Tower kernel: ata3.00: supports DRM functions and may not be fully accessible
Nov 15 20:09:29 Tower kernel: ata3.00: NCQ Send/Recv Log not supported
Nov 15 20:09:29 Tower kernel: ata3.00: configured for UDMA/133
Nov 15 20:09:29 Tower kernel: ata3: EH complete
Nov 15 22:29:57 Tower kernel: ata3.00: exception Emask 0x10 SAct 0x1f800000 SErr 0x400100 action 0x6 frozen
Nov 15 22:29:57 Tower kernel: ata3.00: irq_stat 0x08000000, interface fatal error
Nov 15 22:29:57 Tower kernel: ata3: SError: { UnrecovData Handshk }
Nov 15 22:29:57 Tower kernel: ata3.00: failed command: WRITE FPDMA QUEUED
Nov 15 22:29:57 Tower kernel: ata3.00: cmd 61/80:b8:c0:26:f2/00:00:0b:00:00/40 tag 23 ncq dma 65536 out
Nov 15 22:29:57 Tower kernel:         res 40/00:d0:40:26:f2/00:00:0b:00:00/40 Emask 0x10 (ATA bus error)
Nov 15 22:29:57 Tower kernel: ata3.00: status: { DRDY }
Nov 15 22:29:57 Tower kernel: ata3.00: failed command: WRITE FPDMA QUEUED
Nov 15 22:29:57 Tower kernel: ata3.00: cmd 61/80:c0:40:27:f2/00:00:0b:00:00/40 tag 24 ncq dma 65536 out
Nov 15 22:29:57 Tower kernel:         res 40/00:d0:40:26:f2/00:00:0b:00:00/40 Emask 0x10 (ATA bus error)
Nov 15 22:29:57 Tower kernel: ata3.00: status: { DRDY }
Nov 15 22:29:57 Tower kernel: ata3.00: failed command: WRITE FPDMA QUEUED
Nov 15 22:29:57 Tower kernel: ata3.00: cmd 61/80:c8:c0:25:f2/00:00:0b:00:00/40 tag 25 ncq dma 65536 out
Nov 15 22:29:57 Tower kernel:         res 40/00:d0:40:26:f2/00:00:0b:00:00/40 Emask 0x10 (ATA bus error)
Nov 15 22:29:57 Tower kernel: ata3.00: status: { DRDY }
Nov 15 22:29:57 Tower kernel: ata3.00: failed command: WRITE FPDMA QUEUED
Nov 15 22:29:57 Tower kernel: ata3.00: cmd 61/80:d0:40:26:f2/00:00:0b:00:00/40 tag 26 ncq dma 65536 out
Nov 15 22:29:57 Tower kernel:         res 40/00:d0:40:26:f2/00:00:0b:00:00/40 Emask 0x10 (ATA bus error)
Nov 15 22:29:57 Tower kernel: ata3.00: status: { DRDY }
Nov 15 22:29:57 Tower kernel: ata3.00: failed command: WRITE FPDMA QUEUED
Nov 15 22:29:57 Tower kernel: ata3.00: cmd 61/80:d8:c0:27:f2/00:00:0b:00:00/40 tag 27 ncq dma 65536 out
Nov 15 22:29:57 Tower kernel:         res 40/00:d0:40:26:f2/00:00:0b:00:00/40 Emask 0x10 (ATA bus error)
Nov 15 22:29:57 Tower kernel: ata3.00: status: { DRDY }
Nov 15 22:29:57 Tower kernel: ata3.00: failed command: WRITE FPDMA QUEUED
Nov 15 22:29:57 Tower kernel: ata3.00: cmd 61/c0:e0:40:28:f2/00:00:0b:00:00/40 tag 28 ncq dma 98304 out
Nov 15 22:29:57 Tower kernel:         res 40/00:d0:40:26:f2/00:00:0b:00:00/40 Emask 0x10 (ATA bus error)
Nov 15 22:29:57 Tower kernel: ata3.00: status: { DRDY }
Nov 15 22:29:57 Tower kernel: ata3: hard resetting link
Nov 15 22:29:57 Tower kernel: ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
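
To see at a glance which port is affected and how often, a log like the one above can be tallied with a short Python sketch (again, not part of the original post). The name `summarize_ata` and the `SAMPLE` list are illustrative; the sample lines are copied from the quoted log.

```python
import re
from collections import Counter

# "ata3.00: failed command: WRITE FPDMA QUEUED"
FAILED_CMD = re.compile(r"kernel: (?P<port>ata\d+)\.\d+: failed command: (?P<cmd>.+)$")
# "ata3: SError: { UnrecovData Handshk }"
SERROR = re.compile(r"kernel: (?P<port>ata\d+): SError: \{ (?P<flags>[^}]*)\}")
# "ata3: hard resetting link"
RESET = re.compile(r"kernel: (?P<port>ata\d+): hard resetting link")


def summarize_ata(lines):
    """Count failed commands and link resets per port, and collect SError flags."""
    failed = Counter()
    resets = Counter()
    flags = {}
    for raw in lines:
        line = raw.rstrip()
        m = FAILED_CMD.search(line)
        if m:
            failed[(m["port"], m["cmd"])] += 1
            continue
        m = SERROR.search(line)
        if m:
            flags.setdefault(m["port"], set()).update(m["flags"].split())
            continue
        m = RESET.search(line)
        if m:
            resets[m["port"]] += 1
    return failed, flags, resets


# Sample lines copied from the log quoted in this post.
SAMPLE = [
    "Nov 15 20:09:29 Tower kernel: ata3.00: irq_stat 0x08000000, interface fatal error",
    "Nov 15 20:09:29 Tower kernel: ata3: SError: { UnrecovData Handshk }",
    "Nov 15 20:09:29 Tower kernel: ata3.00: failed command: WRITE FPDMA QUEUED",
    "Nov 15 20:09:29 Tower kernel: ata3.00: status: { DRDY }",
    "Nov 15 20:09:29 Tower kernel: ata3.00: failed command: WRITE FPDMA QUEUED",
    "Nov 15 20:09:29 Tower kernel: ata3: hard resetting link",
]

failed, flags, resets = summarize_ata(SAMPLE)
print(dict(failed))
print(flags)
print(dict(resets))
```

In the quoted log every failure is a `WRITE FPDMA QUEUED` on `ata3`, reported as an ATA bus error and followed by a hard link reset. Errors of that kind are raised at the SATA link level, so the cable, the port, and the drive's own interface are all candidates.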

 


An update on this to close it out.

 

After replacing the SATA cable and switching the SATA port, the btrfs errors continued, even after rebuilding the filesystem and docker image multiple times. I removed the suspect SATA SSD and went to a single cache drive, and the errors stopped. I have concluded that the SSD has failed, and I have replaced it. All is restored and back to normal now.

Edited by Michael Woodson
