dodgeman Posted October 20, 2020

I've been having a few kernel panics:

Kernel panic - not syncing: Fatal exception in interrupt
Kernel offset disabled.

Before that, I see this in the syslog:

Oct 18 13:18:11 MHS kernel: ata4.00: exception Emask 0x10 SAct 0x20000 SErr 0x90302 action 0xe frozen
Oct 18 13:18:11 MHS kernel: ata4.00: irq_stat 0x00400000, PHY RDY changed
Oct 18 13:18:11 MHS kernel: ata4: SError: { RecovComm UnrecovData Persist PHYRdyChg 10B8B }
Oct 18 13:18:11 MHS kernel: ata4.00: failed command: READ FPDMA QUEUED
Oct 18 13:18:11 MHS kernel: ata4.00: cmd 60/70:88:b8:58:45/02:00:00:00:00/40 tag 17 ncq dma 319488 in
Oct 18 13:18:11 MHS kernel: res 40/00:88:b8:58:45/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Oct 18 13:18:11 MHS kernel: ata4.00: status: { DRDY }
Oct 18 13:18:11 MHS kernel: ata4: hard resetting link
Oct 18 13:18:17 MHS kernel: ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Oct 18 13:18:17 MHS kernel: ata4.00: configured for UDMA/133
Oct 18 13:18:17 MHS kernel: ata4: EH complete
Oct 18 13:18:37 MHS kernel: ata4: limiting SATA link speed to 3.0 Gbps
Oct 18 13:18:37 MHS kernel: ata4.00: exception Emask 0x10 SAct 0x40000 SErr 0x90302 action 0xe frozen
Oct 18 13:18:37 MHS kernel: ata4.00: irq_stat 0x00400000, PHY RDY changed
Oct 18 13:18:37 MHS kernel: ata4: SError: { RecovComm UnrecovData Persist PHYRdyChg 10B8B }
Oct 18 13:18:37 MHS kernel: ata4.00: failed command: READ FPDMA QUEUED
Oct 18 13:18:37 MHS kernel: ata4.00: cmd 60/60:90:28:6e:6a/03:00:00:00:00/40 tag 18 ncq dma 442368 in
Oct 18 13:18:37 MHS kernel: res 40/00:90:28:6e:6a/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Oct 18 13:18:37 MHS kernel: ata4.00: status: { DRDY }
Oct 18 13:18:37 MHS kernel: ata4: hard resetting link
Oct 18 13:18:42 MHS kernel: ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Oct 18 13:18:42 MHS kernel: ata4.00: configured for UDMA/133
Oct 18 13:18:42 MHS kernel: ata4: EH complete
Oct 18 13:19:06 MHS kernel: ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Oct 18 13:19:06 MHS kernel: ata4.00: configured for UDMA/133
Oct 18 13:19:32 MHS kernel: ata4.00: exception Emask 0x10 SAct 0x3 SErr 0x90300 action 0xe frozen
Oct 18 13:19:32 MHS kernel: ata4.00: irq_stat 0x00400000, PHY RDY changed
Oct 18 13:19:32 MHS kernel: ata4: SError: { UnrecovData Persist PHYRdyChg 10B8B }
Oct 18 13:19:32 MHS kernel: ata4.00: failed command: READ FPDMA QUEUED
Oct 18 13:19:32 MHS kernel: ata4.00: cmd 60/00:00:40:bc:ba/04:00:00:00:00/40 tag 0 ncq dma 524288 in
Oct 18 13:19:32 MHS kernel: res 40/00:08:40:c0:ba/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Oct 18 13:19:32 MHS kernel: ata4.00: status: { DRDY }
Oct 18 13:19:32 MHS kernel: ata4.00: failed command: READ FPDMA QUEUED
Oct 18 13:19:32 MHS kernel: ata4.00: cmd 60/b8:08:40:c0:ba/00:00:00:00:00/40 tag 1 ncq dma 94208 in
Oct 18 13:19:32 MHS kernel: res 40/00:08:40:c0:ba/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Oct 18 13:19:32 MHS kernel: ata4.00: status: { DRDY }
Oct 18 13:19:32 MHS kernel: ata4: hard resetting link
Oct 18 13:19:33 MHS kernel: br0: received packet on bond0 with own address as source address (addr:18:c0:4d:2f:2a:76, vlan:0)
### [PREVIOUS LINE REPEATED 1 TIMES] ###
Oct 18 13:19:37 MHS kernel: ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Oct 18 13:19:37 MHS kernel: ata4.00: configured for UDMA/133
Oct 18 13:19:37 MHS kernel: ata4: EH complete
Oct 18 13:20:06 MHS kernel: br0: received packet on bond0 with own address as source address (addr:18:c0:4d:2f:2a:76, vlan:0)
### [PREVIOUS LINE REPEATED 1 TIMES] ###
Oct 18 13:20:39 MHS kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
Oct 18 13:20:39 MHS kernel: caller _nv000908rm+0x1bf/0x1f0 [nvidia] mapping multiple BARs
Oct 18 13:20:41 MHS kernel: resource sanity check: req

Is it the controller, or the drive on the controller, that is causing this issue?
JorgeB Posted October 20, 2020
Please post the diagnostics: Tools -> Diagnostics
dodgeman Posted October 20, 2020 (Author)
mhs-diagnostics-20201020-0853.zip
JorgeB Posted October 20, 2020
Looks like a connection problem, replace cables on disk1.
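Editor's note: the pattern behind that diagnosis (repeated ATA bus errors, hard link resets, and a link speed downgrade, all on one port) can be tallied from the syslog with a short script. This is only a rough sketch: the function name `summarize_ata_events` and the counter keys are made up for illustration, the patterns are taken from the log lines posted in this thread and may be worded differently on other kernels, and it does not map an `ataN` port to an Unraid disk number.

```python
import re
from collections import Counter

# Matches "kernel: ata4: ..." and "kernel: ata4.00: ..." as seen in the posted syslog.
ATA_LINE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)$")


def summarize_ata_events(lines):
    """Count exceptions, hard resets, and speed downgrades per ATA port."""
    summary = {}
    for line in lines:
        match = ATA_LINE.search(line)
        if not match:
            continue  # not an ATA message (e.g. br0 or nvidia lines)
        port, message = match.groups()
        counts = summary.setdefault(port, Counter())
        if message.startswith("exception Emask"):
            counts["exceptions"] += 1
        elif "hard resetting link" in message:
            counts["hard_resets"] += 1
        elif "limiting SATA link speed" in message:
            counts["speed_downgrades"] += 1
    return summary


# A few lines copied from the syslog excerpt in this thread.
sample = [
    "Oct 18 13:18:11 MHS kernel: ata4.00: exception Emask 0x10 SAct 0x20000 SErr 0x90302 action 0xe frozen",
    "Oct 18 13:18:11 MHS kernel: ata4: hard resetting link",
    "Oct 18 13:18:37 MHS kernel: ata4: limiting SATA link speed to 3.0 Gbps",
    "Oct 18 13:19:33 MHS kernel: br0: received packet on bond0 with own address as source address (addr:18:c0:4d:2f:2a:76, vlan:0)",
]

for port, counts in summarize_ata_events(sample).items():
    print(port, dict(counts))
```

A port that keeps resetting while the other ports stay quiet points at that one link (cable, connector, or port) rather than the whole controller, which fits the advice above.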
JorgeB Posted October 20, 2020
Oh, and you need to run a filesystem check on the disk being rebuilt (disk19).
dodgeman Posted October 20, 2020 (Author)
Thanks. Odd for a cable to just go out, but it's replaced and a sync is running. Is there a good guide on reading the diagnostics?
JorgeB Posted October 20, 2020
8 minutes ago, dodgeman said: Is there a good guide on reading the diag?
Not that I know of.
Squid Posted October 20, 2020
13 minutes ago, dodgeman said: Is there a good guide on reading the diag?
I learned by using this
dodgeman Posted October 20, 2020 (Author)
Nice, so the real guide is experience. I will go back through the diagnostics to find the items you identified so I can pick up on them next time, as I am sure this is not the last time something like this will happen.
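Editor's note: when going back through the syslog, it can help to see how the `SErr` hex value on an exception line relates to the flag names the kernel prints on the following `SError: { ... }` line. The sketch below covers only the five flags that appear in this thread; the bit positions are my understanding of the SATA SError register layout and are consistent with both values in the posted log. `SERR_FLAGS`, `decode_serr`, and `serr_from_line` are illustrative names, not part of any tool.

```python
import re

# Assumed bit positions for the flags seen in this thread (other bits omitted).
SERR_FLAGS = {
    1: "RecovComm",     # communication error, recovered
    8: "UnrecovData",   # data integrity error, not recovered
    9: "Persist",       # persistent communication or data error
    16: "PHYRdyChg",    # PHY ready state changed
    19: "10B8B",        # 10b-to-8b decode error on the link
}

SERR_FIELD = re.compile(r"SErr (0x[0-9a-fA-F]+)")


def decode_serr(value):
    """Return flag names for each set bit; unknown bits are reported as 'bitN'."""
    return [
        SERR_FLAGS.get(bit, f"bit{bit}")
        for bit in range(32)
        if value & (1 << bit)
    ]


def serr_from_line(line):
    """Extract the SErr value from an 'exception Emask ...' syslog line, or None."""
    match = SERR_FIELD.search(line)
    return int(match.group(1), 16) if match else None


line = "Oct 18 13:18:11 MHS kernel: ata4.00: exception Emask 0x10 SAct 0x20000 SErr 0x90302 action 0xe frozen"
print(decode_serr(serr_from_line(line)))
```

All five flags describe the link between the port and the drive rather than the data stored on the drive, which is in line with the cable diagnosis earlier in the thread.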