Marvell disk controller chipsets and virtualization



  • 4 weeks later...

I seem to have run into this after upgrading my MicroServer Gen8 from a Celeron G1610T to a Xeon E3-1265L. Everything else seems to be working fine, but drives aren't showing up on a Marvell 88SE9230 card connected via eSATA to an external port multiplier box. The only thing that changed was the CPU. The G1610T lacks VT-d, so after finding this thread it seemed like a fairly easy fix.

 

I've turned off VT-d, VT-x, even hyperthreading and turbo boost in the BIOS with no effect. iommu=pt and/or turning off VT-d seems to remove the qc timeout errors etc, but the drives are still not recognised.

 

lspci reports

# lspci -vvnn -d 1b4b:
07:00.0 SATA controller [0106]: Marvell Technology Group Ltd. 88SE9230 PCIe SATA 6Gb/s Controller [1b4b:9230] (rev 11) (prog-if 01 [AHCI 1.0])
        Subsystem: Marvell Technology Group Ltd. 88SE9230 PCIe SATA 6Gb/s Controller [1b4b:9230]
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 32
        NUMA node: 0
        Region 0: I/O ports at 4000 [size=8]
        Region 1: I/O ports at 4008 [size=4]
        Region 2: I/O ports at 4010 [size=8]
        Region 3: I/O ports at 4018 [size=4]
        Region 4: I/O ports at 4020 [size=32]
        Region 5: Memory at fbff0000 (32-bit, non-prefetchable) [size=2K]
        [virtual] Expansion ROM at fbf00000 [disabled] [size=64K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit-
                Address: fee00000  Data: 4092
        Capabilities: [70] Express (v2) Legacy Endpoint, MSI 00
                DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 128 bytes, MaxReadReq 4096 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 5GT/s, Width x2, ASPM L0s L1, Exit Latency L0s <512ns, L1 <64us
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [e0] SATA HBA v0.0 BAR4 Offset=00000004
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                UESvrt: DLP- SDES+ TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
        Kernel driver in use: ahci
        Kernel modules: ahci

Everything had been working fine for some months before the CPU upgrade. I can go back to the old CPU but have limited heatsink compound to chop and change too much right now.

 

I haven't attempted a Marvell BIOS upgrade - it's a generic no name kind of card so not sure of what card BIOS might work.

 

Any ideas?

 

Link to comment

Hmm, so after much mucking around (HP server BIOSes take so long to boot), I flashed my controller firmware to 1065 (was 1041) per this post:

https://homeservershow.com/forums/topic/9179-marvell-9230-firmware-updates-and-such/?do=findComment&comment=142560

 

The old version was:

AUTOLOAD VERSION[0x00000000]: 200015
LOADER VERSION[0x0000C000]: 21001004
BIOS VERSION[0x00020000]: 1.0.0.1012
FIRMWARE VERSION[0x00030000]: 2.3.0.1041

 

The programmed version is:

PACKAGE VERSION[0xFFFFFFFF]: 2.3.0.1063
AUTOLOAD VERSION[0x00000000]: 200018
LOADER VERSION[0x0000C000]: 21001008
BIOS VERSION[0x00020000]: 1.0.0.1024
FIRMWARE VERSION[0x00030000]: 2.3.0.1065
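To see at a glance which components the flash actually changed, the two version dumps above can be compared with a few lines of Python. This is only a sketch: it assumes the flasher prints lines in the `NAME VERSION[offset]: value` form shown above, and the function names `parse_versions` and `diff_versions` are made up for illustration.

```python
import re

# Matches lines such as "FIRMWARE VERSION[0x00030000]: 2.3.0.1041".
VERSION_LINE = re.compile(
    r"^(?P<name>[A-Z ]+?) VERSION\[(?P<offset>0x[0-9A-Fa-f]+)\]:\s*(?P<value>\S+)$"
)


def parse_versions(text):
    """Return {component name: version string} from a version dump."""
    versions = {}
    for line in text.splitlines():
        match = VERSION_LINE.match(line.strip())
        if match:
            versions[match.group("name")] = match.group("value")
    return versions


def diff_versions(old, new):
    """Return (component, old value or None, new value) for each changed or added entry."""
    changes = []
    for name, value in new.items():
        if old.get(name) != value:
            changes.append((name, old.get(name), value))
    return changes
```

Feeding it the two listings from this post reports every component as changed, with the PACKAGE entry only present in the new dump.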

 

There may be an issue with it running at only 2.5 GT/s instead of 5 GT/s on the PCIe bus, which I still need to look at. My card is an A1 and I just let it automatically pick the firmware to upgrade to.
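A quick way to check the negotiated PCIe link is to compare the `LnkCap` (what the card advertises) and `LnkSta` (what was actually negotiated) lines from `lspci -vv`. The sketch below assumes the line format seen in the lspci output earlier in this thread; `link_summary` and `is_degraded` are hypothetical helper names, not part of any tool.

```python
import re

# Only the "LnkCap:" and "LnkSta:" lines are matched; LnkCap2/LnkSta2/LnkCtl2 are ignored.
LINK_LINE = re.compile(r"(LnkCap|LnkSta):.*?Speed ([\d.]+)GT/s.*?Width x(\d+)")


def link_summary(lspci_text):
    """Return {"LnkCap": (speed in GT/s, width), "LnkSta": (speed in GT/s, width)}."""
    summary = {}
    for kind, speed, width in LINK_LINE.findall(lspci_text):
        # Keep the first occurrence of each kind (one device per lspci dump is assumed).
        summary.setdefault(kind, (float(speed), int(width)))
    return summary


def is_degraded(summary):
    """True if the negotiated speed or width is below what the device advertises."""
    cap_speed, cap_width = summary["LnkCap"]
    sta_speed, sta_width = summary["LnkSta"]
    return sta_speed < cap_speed or sta_width < cap_width
```

Run against the lspci dump posted before the flash, this would report 5 GT/s x2 advertised versus 5 GT/s x1 negotiated, so the link was already narrower than the card's capability at that point. Whether that matters for this problem is not established in the thread.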

 

That in conjunction with disabling VT-d in the BIOS now gives a workable system (and at least means I don't have to swap back to the G1610T which didn't have VT-d anyway).

 

iommu=pt did not work, nor did adding "intel_iommu=on" alongside it (a suggestion I found somewhere).
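Before concluding that a boot parameter has no effect, it is worth confirming the kernel actually received it. On Linux the active command line is exposed in `/proc/cmdline`; the sketch below simply splits it into name/value pairs. The helper names are made up for illustration.

```python
def kernel_params(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as handle:
        return handle.read().strip()
```

On the affected machine, `kernel_params(read_cmdline()).get("iommu")` should return `"pt"` if the edited append line was picked up. If it returns `None`, the parameter never reached the kernel and the test of the workaround was not valid.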

 

I would like VT-d to work though as that was part of the reason for the Xeon upgrade... So will keep trying.

 

BTW: This is with the stock unRAID 6.3.5 kernel.

 

 

Link to comment

Updating my Marvell firmware fixed the issue for me.

http://www.station-drivers.com/index.php?option=com_remository&Itemid=353&func=download&id=1572&chk=2c7b3aedeb1f577056b779ce8835c1d8&no_html=1&lang=en

 

This file above worked for my 9123 controller card. I had to make a DOS boot disk then run the Go.bat file with -y so:

C:\go.bat -y

It totally cleared the issue up for me and I can now see my drives.

Link to comment
  • 5 months later...
On 1/6/2018 at 2:35 AM, TechFireSide said:

Updating my Marvell firmware fixed the issue for me.

http://www.station-drivers.com/index.php?option=com_remository&Itemid=353&func=download&id=1572&chk=2c7b3aedeb1f577056b779ce8835c1d8&no_html=1&lang=en

 

This file above worked for my 9123 controller card. I had to make a DOS boot disk then run the Go.bat file with -y so:

C:\go.bat -y

It totally cleared the issue up for me and I can now see my drives.

 

Do you have the link to the Station-Drivers article? The direct download link is blocked.

 

 

Thanks.

Link to comment
  • 8 months later...

Hi All,

 

I am new to Unraid and just finished my build. I am trying really hard; I'm not at all comfortable with code, but I'm learning slowly. I have been trying to boot my system and got it up and running, but I cannot see any disks. I heard about this Marvell chipset issue, so I am hoping you can help me. I installed FreeNAS on the machine and it worked great, but I don't like FreeNAS and would really like to use Unraid. I have attached my tower log for some guidance.

 

Thank you in advance for your time on this matter. 

tower-diagnostics-20190322-1707.rar

Link to comment

The .rar opens fine with something like 7-Zip.

 

I couldn't see anything obvious apart from the card not being announced in the syslog, yet it appears in lspci. For comparison, my particular card shows this:

 

Mar 23 12:06:16 Mars kernel: ata14.00: ATAPI: MARVELL VIRTUAL, , 1.09, max UDMA/66
Mar 23 12:06:18 Mars kernel: scsi 14:0:0:0: Processor         Marvell  Console          1.01 PQ: 0 ANSI: 5

 

 

 

Link to comment
31 minutes ago, Shonky said:

The .rar opens fine with something like 7-Zip.

Yes, strange: it didn't open at home with 7-Zip, but it does at work. Regardless, there's no point in RAR-ing the diags.

 

There's a problem identifying the drive connected to this port:

Mar 22 17:01:13 Tower kernel: ata13: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Mar 22 17:01:13 Tower kernel: ata13.00: qc timeout (cmd 0xec)

Since all 8 Intel SATA ports are free, OP should start by using those.
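For anyone reading these lines for the first time: the SStatus value the kernel prints is hexadecimal and packs three fields (DET in bits 0-3, SPD in bits 4-7, IPM in bits 8-11), and 0xec is the ATA IDENTIFY DEVICE command. A small sketch to decode them follows. The field meanings are as I understand them from the AHCI/SATA register layout; the table and function names are my own.

```python
# Field meanings of the SATA SStatus register (values not listed are reserved).
DET = {
    0: "no device detected",
    1: "device detected, PHY communication not established",
    3: "device detected, PHY communication established",
    4: "PHY offline",
}
SPD = {
    0: "no speed negotiated",
    1: "1.5 Gbps (Gen1)",
    2: "3.0 Gbps (Gen2)",
    3: "6.0 Gbps (Gen3)",
}
IPM = {0: "not present", 1: "active", 2: "partial", 6: "slumber"}

# The two identify commands that show up in "qc timeout (cmd 0x..)" lines.
IDENTIFY_COMMANDS = {0xEC: "IDENTIFY DEVICE", 0xA1: "IDENTIFY PACKET DEVICE"}


def decode_sstatus(hex_text):
    """Split an SStatus value, given as the hex text from the kernel log, into its fields."""
    value = int(hex_text, 16)
    return {"det": value & 0xF, "spd": (value >> 4) & 0xF, "ipm": (value >> 8) & 0xF}
```

So `SStatus 133` decodes to a detected device with an active Gen3 link, which matches the "link up 6.0 Gbps" text: the link itself came up, and it is the IDENTIFY command sent afterwards that timed out.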

Link to comment
  • 1 year later...
On 6/15/2015 at 10:50 PM, RobJ said:

Update: A potential workaround. Some users are reporting success with the following: add iommu=pt to the append line of the syslinux.cfg file on your boot flash.

  Example - change this

      append  initrd=/bzroot

  To this

      append  iommu=pt  initrd=/bzroot

I have tried adding this, as I don't see the disk attached to the card in Unraid.

Inside the card's BIOS I do see the disk.

In the syslog I got this:

Nov 14 15:36:41 Selina kernel: ahci 0000:09:00.0: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x000000000009d200 flags=0x0000]

followed by this a bit further down

Nov 14 15:36:41 Selina kernel: sd 6:0:0:0: [sde] Attached SCSI disk
Nov 14 15:36:41 Selina kernel: ata9: link is slow to respond, please be patient (ready=0)
Nov 14 15:36:41 Selina kernel: ata16.00: qc timeout (cmd 0xa1)
Nov 14 15:36:41 Selina kernel: ata16.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Nov 14 15:36:41 Selina kernel: ata16: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Nov 14 15:36:41 Selina kernel: ata9: COMRESET failed (errno=-16)
Nov 14 15:36:41 Selina kernel: ata9: link is slow to respond, please be patient (ready=0)
Nov 14 15:36:41 Selina kernel: ata16.00: qc timeout (cmd 0xa1)
Nov 14 15:36:41 Selina kernel: ata16.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Nov 14 15:36:41 Selina kernel: ata16: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Nov 14 15:36:41 Selina kernel: ata9: COMRESET failed (errno=-16)
Nov 14 15:36:41 Selina kernel: ata9: link is slow to respond, please be patient (ready=0)
Nov 14 15:36:41 Selina kernel: ata16.00: qc timeout (cmd 0xa1)
Nov 14 15:36:41 Selina kernel: ata16.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Nov 14 15:36:41 Selina kernel: ata16: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Nov 14 15:36:41 Selina kernel: ata9: COMRESET failed (errno=-16)
Nov 14 15:36:41 Selina kernel: ata9: limiting SATA link speed to 3.0 Gbps
Nov 14 15:36:41 Selina kernel: ata9: COMRESET failed (errno=-16)
Nov 14 15:36:41 Selina kernel: ata9: reset failed, giving up

So if I read this correctly, there are 3 timeouts on a 4-port card, which seems logical, as I only have one disk connected to it.
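One way to check that reading is to count the error messages per ATA port instead of per line. The sketch below assumes syslog lines in the format shown above; the marker strings are taken from that excerpt, and `summarize_ata_errors` is a made-up name.

```python
import re
from collections import Counter, defaultdict

# Captures the port ("ata16") and the message from lines like
# "... kernel: ata16.00: qc timeout (cmd 0xa1)".
ATA_LINE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)$")

# Substrings of the error messages seen in the excerpt above.
ERROR_MARKERS = (
    "qc timeout",
    "failed to IDENTIFY",
    "COMRESET failed",
    "reset failed, giving up",
)


def summarize_ata_errors(log_text):
    """Return {port: {marker: count}} for the known error markers."""
    per_port = defaultdict(Counter)
    for line in log_text.splitlines():
        match = ATA_LINE.search(line)
        if not match:
            continue
        port, message = match.groups()
        for marker in ERROR_MARKERS:
            if marker in message:
                per_port[port][marker] += 1
    return {port: dict(counts) for port, counts in per_port.items()}
```

On the excerpt above this groups everything onto two ports: ata16 has the three qc timeouts and IDENTIFY failures, and ata9 has the COMRESET failures. That looks more like repeated retries on the same ports than one timeout for each of three empty ports, though this is only my reading of the log.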

 

In System Devices I got this under "IOMMU group 18"

	[1b4b:9230] 09:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9230 PCIe SATA 6Gb/s Controller (rev 11)

Do I need to do something else to get Unraid to see the drive, or should it just show up like the drives attached to the mainboard?
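The grouping shown under System Devices can also be read directly from sysfs, where Linux exposes each IOMMU group as a directory under `/sys/kernel/iommu_groups`. A minimal sketch, with hypothetical function names, that lists the groups and looks up which one a given PCI address belongs to:

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group number: sorted list of PCI addresses} read from sysfs."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        # No such directory: the IOMMU is disabled or not exposed by this kernel.
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(
                entry.name for entry in devices_dir.iterdir()
            )
    return groups


def group_of(address, groups):
    """Return the group number containing the PCI address, or None if not found."""
    for number, devices in groups.items():
        if address in devices:
            return number
    return None
```

For the card above, `group_of("0000:09:00.0", iommu_groups())` would be expected to return 18 if the sysfs view agrees with the Unraid page. An empty result from `iommu_groups()` suggests the IOMMU is off.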

Link to comment
3 hours ago, JorgeB said:

This workaround doesn't always work, but also note that if you're using AMD hardware it's different.

Ah, I didn't see there were two different ones. I am on AMD; sadly, changing it didn't work. I got it to work by setting it to off, and right now that isn't a problem, but I have some plans for the future where I need it enabled. Is the solution then to get another card?

Link to comment
  • 1 month later...

My Marvell controller is acting odd.

It dropped out during a rebuild the other day.

I have since taken the array off it and put it on the motherboard ports, because I never want that to happen again, but I still need the ports, so I used the card for VM drives.

Now my VM reboots.

Maybe it's dying, or maybe it's connected to this issue somehow.

I'm using 6.9-rc2, so obviously I'm not saying it's the same issue, but I don't see anything online about Marvell controllers dying.

 

I need the vm to have a terminal so I can't even tell what model chipset it is (should be newer than this issue though)

 

For now I'll have to just consolidate my drives and run without that card.

 

I have no way of knowing what the issue is.

 

Link to comment
  • 2 months later...

Hello, and sorry to revive an old thread, but I guess this is still relevant in 6.9.1. I bought a StarTech SATA card that uses a Marvell controller and experienced the issues the OP was talking about. Using "append  iommu=pt  initrd=/bzroot" in syslinux.cfg solved my issue. Thank you OP!
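For anyone who would rather script the change than edit the file by hand, here is a minimal sketch of the same edit. It assumes a syslinux.cfg where each boot entry has a line starting with `append`; the function name is made up, the code touches every append line (so every boot entry), and it normalises the spacing on the lines it changes. Back up the file on the flash drive first.

```python
def add_kernel_param(config_text, param="iommu=pt"):
    """Insert `param` right after 'append' on every append line that lacks it."""
    out = []
    for line in config_text.splitlines():
        tokens = line.split()
        if tokens and tokens[0].lower() == "append" and param not in tokens[1:]:
            # Preserve the original indentation of the line.
            indent = line[: len(line) - len(line.lstrip())]
            line = indent + " ".join([tokens[0], param] + tokens[1:])
        out.append(line)
    trailing_newline = "\n" if config_text.endswith("\n") else ""
    return "\n".join(out) + trailing_newline
```

Running it twice gives the same result as running it once, so it is safe to re-apply. The changed line ends up as `append iommu=pt initrd=/bzroot`, matching the example quoted earlier in the thread.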

 

 

Link to comment
On 4/2/2021 at 4:38 PM, Enorym said:

Hello, and sorry to revive an old thread, but I guess this is still relevant in 6.9.1. I bought a StarTech SATA card that uses a Marvell controller and experienced the issues the OP was talking about. Using "append  iommu=pt  initrd=/bzroot" in syslinux.cfg solved my issue. Thank you OP!

 

 

I found out my unit overheats. (It was very hot, then it crashed.)

I bought a new one and it does the same.

I glued a 60 mm fan on each of them, run the fans from 5 V, and they both work fine now.

(Though I'm not using it in a VM any more.)

Edited by mdrodge
Link to comment
  • 2 months later...

This is more widespread than it appears. Most people just won’t bother checking why the disks connected to their Marvell controller don’t show up.

Also, this is not kernel, driver or firmware related, or at least, if it is, that is not the main cause. This is clearly an Unraid issue. I have done extensive tests, which confirmed that the same 88SE9230 cards (yes, multiple) would work just fine with OMV throughout the last 3 versions, but would fail in Unraid throughout the last 3 versions.
The fix didn’t work in my case.

It seems that the Linux distro under Unraid, or Unraid itself, initialises SATA controllers in some special way. The most common error output is “soft reset failed”.

Considering the number of Marvell controllers on the market, it would be reasonable to address this issue sooner rather than later.

Link to comment

 

12 minutes ago, p13 said:

this is not kernel,

Quick googling shows that currently OMV ships with Debian 10 which is Kernel v4.9.  Unraid hasn't used Kernel 4.9 since ~6.3.3 which is far far more than 3 versions ago.

 

Not a fair comparison.

16 minutes ago, p13 said:

Considering the number of Marvell controllers on the market

Yes, there are a ton of Marvell-based controllers out there, simply because the chipset is cheap and Chinese manufacturers (Syba, IO Crest etc.) can bang them out by the millions and not have to worry about much in terms of problems, because the largest market is Windows, and Windows will try, try, try again, reset the controller, and repeat the above ad nauseam until things happen to work.

Link to comment
53 minutes ago, Squid said:

 

Quick googling shows that currently OMV ships with Debian 10 which is Kernel v4.9.  Unraid hasn't used Kernel 4.9 since ~6.3.3 which is far far more than 3 versions ago.

 

Not a fair comparison.

Yes, there are a ton of Marvell-based controllers out there, simply because the chipset is cheap and Chinese manufacturers (Syba, IO Crest etc.) can bang them out by the millions and not have to worry about much in terms of problems, because the largest market is Windows, and Windows will try, try, try again, reset the controller, and repeat the above ad nauseam until things happen to work.


I meant 3 major versions, e.g. 4, 5, 6 (yes, that far back).

Well, one way or another, it works in OMV (and Windows) and doesn’t in Unraid. A typical consumer business would “correct” the situation. I guess Unraid is still enthusiast-oriented.

Link to comment
