TurkeyPerson Posted May 6, 2023 (edited)

Hi all, I'm seeing this in my logs until they fill up. It's happened twice now since I got new cache disks. I've tried adjusting the cable and am about to try replacing it. Could it be something other than a hardware issue? I have attached diagnostics below; here's an excerpt:

May 6 08:49:20 Tower kernel: BTRFS error (device sdg1): bdev /dev/sdg1 errs: wr 8077984, rd 49379, flush 459477, corrupt 0, gen 0
May 6 08:49:20 Tower kernel: BTRFS error (device sdg1): bdev /dev/sdg1 errs: wr 8077985, rd 49379, flush 459477, corrupt 0, gen 0
May 6 08:49:20 Tower kernel: sd 6:0:0:0: [sdg] tag#11 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=DRIVER_OK cmd_age=0s
May 6 08:49:20 Tower kernel: sd 6:0:0:0: [sdg] tag#11 CDB: opcode=0x2a 2a 00 00 62 7c 40 00 00 60 00
May 6 08:49:20 Tower kernel: I/O error, dev sdg, sector 6454336 op 0x1:(WRITE) flags 0x1800 phys_seg 12 prio class 2
May 6 08:49:20 Tower kernel: sd 6:0:0:0: [sdg] tag#12 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=DRIVER_OK cmd_age=0s
May 6 08:49:20 Tower kernel: sd 6:0:0:0: [sdg] tag#12 CDB: opcode=0x2a 2a 00 00 62 7c e0 00 00 40 00
May 6 08:49:20 Tower kernel: I/O error, dev sdg, sector 6454496 op 0x1:(WRITE) flags 0x1800 phys_seg 8 prio class 2
May 6 08:49:20 Tower kernel: BTRFS error (device sdg1): bdev /dev/sdg1 errs: wr 8077986, rd 49379, flush 459477, corrupt 0, gen 0
May 6 08:49:20 Tower kernel: BTRFS error (device sdg1): bdev /dev/sdg1 errs: wr 8077987, rd 49379, flush 459477, corrupt 0, gen 0
May 6 08:49:20 Tower kernel: BTRFS error (device sdg1): bdev /dev/sdg1 errs: wr 8077988, rd 49379, flush 459477, corrupt 0, gen 0
May 6 08:49:20 Tower kernel: sd 6:0:0:0: [sdg] tag#22 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=DRIVER_OK cmd_age=0s
May 6 08:49:20 Tower kernel: sd 6:0:0:0: [sdg] tag#22 CDB: opcode=0x35 35 00 00 00 00 00 00 00 00 00
May 6 08:49:20 Tower kernel: I/O error, dev sdg, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 2
May 6 08:49:20 Tower kernel: BTRFS error (device sdg1): bdev /dev/sdg1 errs: wr 8077988, rd 49379, flush 459478, corrupt 0, gen 0
May 6 08:49:20 Tower kernel: sd 6:0:0:0: [sdg] tag#6 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=DRIVER_OK cmd_age=0s
May 6 08:49:20 Tower kernel: sd 6:0:0:0: [sdg] tag#6 CDB: opcode=0x2a 2a 00 00 00 08 80 00 00 08 00
May 6 08:49:20 Tower kernel: I/O error, dev sdg, sector 2176 op 0x1:(WRITE) flags 0x3800 phys_seg 1 prio class 2
May 6 08:49:20 Tower kernel: I/O error, dev sdg, sector 2176 op 0x1:(WRITE) flags 0x3800 phys_seg 1 prio class 2
May 6 08:49:20 Tower kernel: btrfs_end_super_write: 17 callbacks suppressed
May 6 08:49:20 Tower kernel: BTRFS warning (device sdg1): lost page write due to IO error on /dev/sdg1 (-5)
May 6 08:49:20 Tower kernel: BTRFS error (device sdg1): bdev /dev/sdg1 errs: wr 8077989, rd 49379, flush 459478, corrupt 0, gen 0
May 6 08:49:20 Tower kernel: BTRFS error (device sdg1): error writing primary super block to device 1
May 6 08:49:20 Tower kernel: sd 6:0:0:0: [sdg] tag#7 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=DRIVER_OK cmd_age=0s
May 6 08:49:20 Tower kernel: sd 6:0:0:0: [sdg] tag#7 CDB: opcode=0x2a 2a 00 97 cd a6 a8 00 00 08 00
May 6 08:49:20 Tower kernel: BTRFS warning (device sdg1): lost page write due to IO error on /dev/sdg1 (-5)
May 6 08:49:20 Tower kernel: BTRFS error (device sdg1): error writing primary super block to device 1

I think the issues begin with this error, which may suggest that my PCI SATA controller is the underlying culprit?

May 5 16:18:04 Tower kernel: ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
May 5 16:18:04 Tower kernel: ata6.00: failed command: DATA SET MANAGEMENT
May 5 16:18:04 Tower kernel: ata6.00: cmd 06/01:01:00:00:00/00:00:00:00:00/a0 tag 21 dma 512 out
May 5 16:18:04 Tower kernel: res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
May 5 16:18:04 Tower kernel: ata6.00: status: { DRDY }
May 5 16:18:04 Tower kernel: ata6: hard resetting link
May 5 16:18:10 Tower kernel: ata6: link is slow to respond, please be patient (ready=0)
May 5 16:18:14 Tower kernel: ata6: COMRESET failed (errno=-16)
May 5 16:18:14 Tower kernel: ata6: hard resetting link
May 5 16:18:20 Tower kernel: ata6: link is slow to respond, please be patient (ready=0)
May 5 16:18:24 Tower kernel: ata6: COMRESET failed (errno=-16)
May 5 16:18:24 Tower kernel: ata6: hard resetting link
May 5 16:18:30 Tower kernel: ata6: link is slow to respond, please be patient (ready=0)
May 5 16:18:59 Tower kernel: ata6: COMRESET failed (errno=-16)
May 5 16:18:59 Tower kernel: ata6: limiting SATA link speed to 3.0 Gbps
May 5 16:18:59 Tower kernel: ata6: hard resetting link
May 5 16:19:04 Tower kernel: ata6: COMRESET failed (errno=-16)
May 5 16:19:04 Tower kernel: ata6: reset failed, giving up
May 5 16:19:04 Tower kernel: ata6.00: disable device
May 5 16:19:04 Tower kernel: ata6: EH complete
May 5 16:19:04 Tower kernel: sd 6:0:0:0: [sdg] tag#5 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=DRIVER_OK cmd_age=89s
May 5 16:19:04 Tower kernel: sd 6:0:0:0: [sdg] tag#5 CDB: opcode=0x28 28 00 2d a9 29 40 00 00 08 00
May 5 16:19:04 Tower kernel: I/O error, dev sdg, sector 766060864 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 2
May 5 16:19:04 Tower kernel: BTRFS error (device sdg1): bdev /dev/sdg1 errs: wr 0, rd 1, flush 0, corrupt 0, gen 0
May 5 16:19:04 Tower kernel: sd 6:0:0:0: [sdg] tag#6 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=DRIVER_OK cmd_age=0s

Update: I added iommu=pt in case it is my Marvell controller causing issues. Prior to doing that, I switched cables (SATA and SATA power adaptor) and rebooted. The logs for each cache drive are littered with errors, so I'm going to keep the array offline until I have a clue what's happening, as I'm concerned about data corruption/loss.

Thanks!

tower-diagnostics-20230506-0836.zip

Edited May 6, 2023 by TurkeyPerson
TurkeyPerson Posted May 6, 2023 (Author)

New diagnostics after changing cables and rebooting:

tower-diagnostics-20230506-1519.zip
JorgeB Posted May 7, 2023 (Solution)

Try this: https://forums.unraid.net/topic/134954-warning-crucial-mx500-ssds-world-of-pain-stay-away-from-these/?do=findComment&comment=1255816
TurkeyPerson Posted May 8, 2023 (Author)

This solution appears to be working so far. Thank you!
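
One way to check whether a fix like this is holding is to watch whether the BTRFS per-device error counters in the syslog stop climbing. The following is only a minimal sketch, assuming log lines in the exact format quoted earlier in the thread; the regular expression and the `latest_counters` helper are hypothetical names made up for illustration, not part of Unraid or btrfs.

```python
import re

# Assumed format: the "bdev ... errs:" counter line as quoted in this thread.
COUNTER_RE = re.compile(
    r"BTRFS error \(device (?P<fs>[^)]+)\): bdev (?P<dev>\S+) errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)

FIELDS = ("wr", "rd", "flush", "corrupt", "gen")


def latest_counters(lines):
    """Return {device: {counter: value}} from the last counter line seen per device."""
    latest = {}
    for line in lines:
        match = COUNTER_RE.search(line)
        if match:
            latest[match.group("dev")] = {f: int(match.group(f)) for f in FIELDS}
    return latest


# Sample lines copied from the excerpt above; a real check would read the syslog file.
sample = [
    "May 6 08:49:20 Tower kernel: BTRFS error (device sdg1): bdev /dev/sdg1 errs: wr 8077984, rd 49379, flush 459477, corrupt 0, gen 0",
    "May 6 08:49:20 Tower kernel: I/O error, dev sdg, sector 6454336 op 0x1:(WRITE) flags 0x1800 phys_seg 12 prio class 2",
    "May 6 08:49:20 Tower kernel: BTRFS error (device sdg1): bdev /dev/sdg1 errs: wr 8077989, rd 49379, flush 459478, corrupt 0, gen 0",
]

counters = latest_counters(sample)
print(counters)
```

Running this against the log before and after a change and comparing the two results shows whether `wr`, `rd`, or `flush` are still increasing. The same counters can also be read directly with `btrfs device stats` on the mounted pool.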