[6.5.1] - Call Trace Error

sutherlandm · May 13, 2018

Hi All,

Fix Common Problems warned me about a Call Trace error this morning, and I'm wondering if someone could help me track down what is wrong. From the syslog it looks like the Call Trace came right after the following lines about CPU 5:

May 11 21:41:06 Picard kernel: WARNING: CPU: 5 PID: 0 at net/sched/sch_generic.c:320 dev_watchdog+0x157/0x1b8
May 11 21:41:06 Picard kernel: CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.14.35-unRAID #1
May 11 21:41:06 Picard kernel: Call Trace:

After that it goes on about "ata14" and "hard resetting link":

May 13 13:45:00 Picard kernel: ata14.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
May 13 13:45:00 Picard kernel: ata14: hard resetting link
May 13 13:45:00 Picard kernel: ata14.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
May 13 13:45:00 Picard kernel: ata14: hard resetting link
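
For anyone wanting to pull the same two kinds of lines out of a long syslog, here is a minimal Python sketch. The regular expressions are assumptions based only on the lines quoted above, and `summarize_syslog` is a hypothetical helper name, not part of Unraid or Fix Common Problems.

```python
import re
from collections import Counter

# Patterns derived only from the syslog lines quoted in this post (an assumption;
# other kernels or log formats may word these messages differently).
WARNING_RE = re.compile(r"kernel: WARNING: CPU: (\d+) PID: \d+ at (\S+) (\S+)")
RESET_RE = re.compile(r"kernel: (ata\d+): hard resetting link")


def summarize_syslog(lines):
    """Return (warnings, resets).

    warnings: list of dicts describing each kernel WARNING line.
    resets:   Counter of "hard resetting link" messages per ATA port.
    """
    warnings = []
    resets = Counter()
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            warnings.append({
                "cpu": int(match.group(1)),
                "source": match.group(2),   # file:line in the kernel source
                "symbol": match.group(3),   # function+offset where it fired
            })
            continue
        match = RESET_RE.search(line)
        if match:
            resets[match.group(1)] += 1
    return warnings, resets


if __name__ == "__main__":
    # Hypothetical path; point this at wherever the syslog was extracted.
    with open("syslog.txt", encoding="utf-8", errors="replace") as handle:
        found_warnings, found_resets = summarize_syslog(handle)
    for warning in found_warnings:
        print(warning)
    for port, count in found_resets.most_common():
        print(port, count)
```

Run against the lines above, this would report one warning raised from `net/sched/sch_generic.c:320` and count the link resets per port, which makes it easier to see whether the resets are confined to a single port such as ata14.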

I did notice some odd behavior with downloading docker updates and some of the communication between them lately, not sure if it's related.

Any help would be greatly appreciated. Thank you!

picard-diagnostics-20180513-1450.zip

JorgeB · May 14, 2018

The call trace is the same as this one. The ATA14 errors can be ignored because that is not a drive; it's the Marvell console device. BTW, never use those 4 Marvell ports for any devices.

sutherlandm · May 15, 2018

Ah alright, thank you!

I have moved a few of the drives around so I'm not using any of the 4 Marvell ports. Out of curiosity, why is it best not to use them?

itimpi · May 15, 2018

4 hours ago, sutherlandm said:

I have moved a few of the drives around so I'm not using any of the 4 Marvell ports. Out of curiosity, why is it best not to use them?

Marvell controllers seem to have a habit of letting disks drop offline randomly. Also, they do not always play well if virtualisation features are enabled in the machine's BIOS. It is a bit strange, as some people use them successfully for a while but then start getting errors affecting disks attached to the Marvell controller.

My guess is that this is related to something deep in the 64-bit drivers used on current Linux kernels, as such problems did not seem to happen when we were using 32-bit Linux kernels. As such, it is always possible that at some point the advice to avoid Marvell controllers becomes unnecessary, but as the cause is unknown the advice is still good as a generic answer.

JorgeB · May 15, 2018

As itimpi mentioned, Marvell controllers in general can have issues with dropped disks, but the one on those ASRock server boards is notoriously problematic.

John_M · May 16, 2018

On 5/15/2018 at 6:04 AM, itimpi said:

My guess is that this is related to something deep in the 64-bit drivers

I thought that too, and I believe it is indeed the case with the Marvell SASLP controllers. However, I note that this particular Marvell SATA controller uses the standard ahci module as its driver, not something that's specific to Marvell hardware.

0a:00.0 SATA controller [0106]: Marvell Technology Group Ltd. 88SE9230 PCIe SATA 6Gb/s Controller [1b4b:9230] (rev 11)
	Subsystem: ASRock Incorporation 88SE9230 PCIe SATA 6Gb/s Controller [1849:9230]
	Kernel driver in use: ahci
	Kernel modules: ahci
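
To check which driver each SATA controller is bound to without reading the listing by eye, something like the following Python sketch could be used. It assumes text in the format shown above (the kind of listing `lspci -nnk` produces), and `parse_lspci` and `sata_controller_drivers` are hypothetical helper names chosen for illustration.

```python
def parse_lspci(text):
    """Parse lspci-style text into a list of device dicts.

    Assumes each device starts on an unindented line ("<slot> <description>")
    and that its detail lines (Subsystem, Kernel driver in use, ...) are indented.
    """
    devices = []
    for line in text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # New device: the first token is the PCI slot, the rest is the description.
            slot, _, description = line.partition(" ")
            devices.append({"slot": slot, "description": description, "driver": None})
        elif devices and "Kernel driver in use:" in line:
            devices[-1]["driver"] = line.split(":", 1)[1].strip()
    return devices


def sata_controller_drivers(text):
    """Map PCI slot -> bound kernel driver for every SATA controller in the text."""
    return {
        device["slot"]: device["driver"]
        for device in parse_lspci(text)
        if "SATA controller" in device["description"]
    }
```

Fed the listing above, `sata_controller_drivers` maps slot `0a:00.0` to `ahci`, which is the point being made here: the 88SE9230 is driven by the generic ahci module rather than a Marvell-specific one.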
