Marvell disk controller chipsets and virtualization



I'm experiencing similar issues, and wonder if this is due to me having a Marvell controller. I purchased a StarTech PEXSAT34SFF PCI-E to Mini-SAS controller (I've got an HP N54L MicroServer and wanted to run Mini-SAS rather than several SATA cables due to space constraints). I've switched off SVM in the BIOS but I'm still having issues with drives. Bear in mind that the SMART reports for both SSDs and the 1x 4TB drive connected to this controller show no issues; these drives are less than a month old and have hardly seen any usage because of the issue mentioned above. Can I disable virtualization within unRAID altogether? I only use Docker; I don't plan on using KVM etc.

 

Drives all show up fine and register fine on the array; it's only when activity occurs (writes etc.) that things start playing up.

Right now, it's easy to blame Marvell for any drive issues, but your summary doesn't quite fit the typical symptoms.  May I suggest starting a support thread in the General Support board, explain it as you have above, and include your Diagnostics.  I'd prefer to handle this first as general troubleshooting, until we know better.


I think I'm getting this same issue with my Marvell controller (Model 88SE9485) with VT-d enabled.

 

I initially thought it was related to my GPU being installed, but have now discovered I only get the errors when VT-d is enabled.

 

What I don't understand is that all my drives show up fine, and I only get DMA errors randomly every couple of hours, or more often when doing a parity check.

ness-diagnostics-20160329-1611-nogpu.zip


There was considerable discussion of this controller (9480 re-ID'd as 9485) earlier in this thread.  Those who had parity errors were not common, but obviously that was a killer, and I suspect they got rid of it (don't remember, too lazy to check!).

 

You've had better luck with it, up to now.  The problem always was failing DMA reads, but on some drives they failed even the initial IDENTIFY request, so the drives never even appeared.  On others, the drives would seem to work, but might be associated with strange or defective behavior (like inconsistent parity checks).  There have been a few positives found, a fix or two and a workaround that works for some (but not all), so it might be worth reading through the thread.  Personally, I'd replace it if you can't find a fix that works for you.


I have read through this thread, and will try the fixes at the weekend.

Unfortunately I do not have the cash to upgrade the card, and the IBM and Dell cards seem quite hard to get hold of in the UK.

 

Has the Linux kernel patch been included in unRAID 6.2? If so, in which exact version was the patch added?


Has the Linux kernel patch been included in unRAID 6.2? If so, in which exact version was the patch added?

It's my understanding the patch was included long ago, possibly even before I started this thread!  No idea what version.  Apparently it didn't change anything.


It appears that I have fixed it somehow, but unfortunately I have no idea how: I was getting so frustrated with it that I ended up changing two or three things at once.

 

The last things I remember changing are:

I updated the BIOS even though it was already on the latest version, which also reset all the BIOS settings to default. There might have been a funny setting deep down somewhere; for one, I had my board set up so that the two onboard USB controllers would show separately, so I could pass one through to VMs.

I also performed a check disk on my unRAID USB, although it didn't find any errors, so I doubt it did anything.

I had my GPU back in my machine at this point, and noticed when booting I had a PCI-e to PCI bridge which wasn't there while I was having issues.

 

The third one might well be linked to the first change I made. I have had the whole server back together for over a week now, have run two parity syncs, and have had no ATA errors.


I was not having any luck with a Marvell-based controller (SAS2LP-MV8) with IOMMU enabled in the BIOS, but since I swapped to an IBM ServeRAID M1015 I have been having a much better time all round. I also run it with an HP SAS Expander for an extra 24 ports (the SAS2LP-MV8 didn't like the dual-link connection to the HP card, but the M1015 worked fine).

The only thing with the IBM card is that you will want to flash it to IT mode, which is fairly straightforward. I did have to muck around with making a UEFI-bootable USB to do the flash, as it kept throwing an error when I tried it via a DOS-based USB, but there was a post on the forum or externally somewhere covering both methods.

 

I used the following page as a guide for choosing my card at the time, but I don't know when it was last updated. The M1015 seemed to be a popular choice with other users though:

http://lime-technology.com/wiki/index.php/Hardware_Compatibility#PCI_SATA_Controllers

 

Since then I have been able to run VMs. The only thing you may want to check beforehand is whether you want to pass PCI devices through to a VM, as that caught me out. All my PCIe slots were in the same IOMMU group, so I would have had to pass them all through to the VM, which was obviously no good to me as I want my SATA controllers to stay in unRAID. You could use ACS override, but it would not be a good idea to do it on the same group as your SATA controllers, because the other device could write to the memory space of your controller, which would not be good for your data integrity.
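
For anyone wanting to check their own grouping before planning passthrough, here is a rough Python sketch (not something from this thread, just an illustration). It assumes the usual Linux sysfs layout, where each IOMMU group appears as a numbered directory under /sys/kernel/iommu_groups containing a devices folder. The function names are made up for this example.

```python
from pathlib import Path


def read_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses inside it.

    Assumes the standard sysfs layout: <root>/<group>/devices/<pci-address>.
    """
    groups = {}
    base = Path(root)
    if not base.is_dir():
        # A missing directory usually means the IOMMU is disabled or unsupported.
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    return groups


def shared_groups(groups):
    """Keep only the groups that contain more than one device."""
    return {number: devs for number, devs in groups.items() if len(devs) > 1}


if __name__ == "__main__":
    # Prints nothing if the IOMMU is off or the path does not exist.
    for number, devices in sorted(read_iommu_groups().items()):
        print(f"group {number}: {', '.join(devices)}")
```

If your SATA controller turns up in a group returned by shared_groups alongside the device you want to hand to a VM, you are in the situation described above. The addresses can be matched to actual hardware with lspci.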

 

Hope that helps.


Any update on a fix for this?

 

I assume you have tried the workaround mentioned in the first post, highlighted in blue?

 

Apart from that, I'm not aware of any changes.  I think it's up to Marvell.  I don't know of anything that Lime Technology could do.


If a fix should come into play, I would personally like it to be an optional setting.  Many users (myself included) use some of the affected controllers with IOMMU enabled and running virtual machines with absolutely zero issues.  I would hate to have any "fix" cause issues for those not affected....  My personal opinion (for right or wrong) is that this is a BIOS issue not a driver issue per se.

3 minutes ago, DingHo said:

Sorry to drag up an old post, but would this bug affect Docker containers, or only VMs?  Thanks.

 

Just VMs. Well, it will affect everything if you're trying to use virtualisation, so as long as you don't want to, you should be good.


Hopefully this saves the next person some heartache in searching for a solution...

 

Running unRAID 6.3.2 with  MSI - Z77A-G45 (MS-7752) motherboard and  Intel® Core™ i5-3570 CPU @ 3.40GHz.

 

I can confirm that with IOMMU (VT-d) enabled, the StarTech PEXSAT32 elicits the same errors as the OP's. With VT-d disabled in BIOS, the errors go away.

 

The StarTech PEXSAT32 uses the Marvell 9128 chipset, which is on the OP's list.

 

I can also confirm that adding iommu=pt does not fix this problem with this card on an Intel board.
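
If you want to rule out the possibility that the parameter simply never reached the kernel, a quick check of the running command line helps. Below is a minimal Python sketch, assuming a Linux system that exposes the boot parameters at /proc/cmdline. The helper names are made up for this example, and the simple whitespace split does not handle quoted values that contain spaces.

```python
def kernel_params(cmdline):
    """Split a kernel command line into a dict of parameter -> value.

    Flags without '=' (such as 'quiet') map to an empty string.
    """
    params = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        params[key] = value
    return params


def has_iommu_passthrough(cmdline):
    """True if the command line contains exactly iommu=pt."""
    return kernel_params(cmdline).get("iommu") == "pt"


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as fh:
            line = fh.read()
    except OSError:
        # Not on Linux, or /proc is unavailable: treat as an empty command line.
        line = ""
    print("iommu=pt active:", has_iommu_passthrough(line))
```

If this reports False after you have edited your boot configuration, the setting was not picked up on that boot, so the workaround has not actually been tested yet.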

 

This card worked without problem on an AMD processor/motherboard combination with IOMMU enabled.

 

 


First I wanted to thank the creator of this thread and all those who have added their information. I'm running unRAID 6.3.2 with a StarTech 4-port SATA RAID controller card (PEXSAT34RH). I wasn't able to preclear a new drive attached to this controller.

 

On the suggestion of this thread, I flashed the firmware on the controller to the latest version, v2.3.1065, and that fixed my issues. I was then able to preclear my new hard drive.


Thank you very much for this thread. I have been having problems for ages with disk errors on parity check and rebuild, and never knew about this until the "Fix Common Problems" plugin sent me here. I got rid of my SAS2LPs and bought two Dell PERCs and it's perfect: no errors, nothing. Just a perfectly working server.

 

Thank you to everyone who has put time into researching the cards and posting their experiences.
