UnRAID stopped booting with PCIe to SATA adapter inserted


KR1SeS

Hey guys, first post on the forums requesting support, please go easy on me. I've been using UnRAID for almost 3 years and I basically use it for photo storage and a Plex media server.

 

I have an FX-6300 on an ASUS board. I have 1 cache disk and 9 array disks (1 parity). My board only has 6 SATA ports, so I purchased this over a year ago:

IOCrest SI-PEX40064 4 Port SATA III PCIe 2.0 X 1 Controller Card Components ---- (Chipset: Marvell 88SE9215)

https://www.amazon.ca/gp/product/B00AZ9T3OU/ref=oh_aui_detailpage_o07_s00?ie=UTF8&psc=1

 

I connected 4 disks to the PCIe to SATA controller and everything ran smoothly for over a year. A few weeks ago I received alerts from UnRAID that 4 of my disks had read errors. I thought it strange and determined that each of those disks was connected to the PCIe to SATA controller. After I restarted the server, UnRAID detected that those 4 disks were missing. I disconnected and reconnected the SATA controller, booted up, and all drives were present. I could start the array and everything was back to normal... until a few weeks later.

 

Now, a few days ago, my server hung. I couldn't reboot via the webGUI or over SSH with PuTTY. I had to hard power off. When I tried to power back up I would wait but wouldn't get the UnRAID started confirmation beeps I was used to hearing, and I couldn't access the webGUI. Here is a list of troubleshooting steps I took, as I suspected a bad PCIe to SATA controller:

  • Removed the SATA controller from the PCIe slot and UnRAID booted normally; of course, 4 of my disks were missing
  • Tried the controller in every available PCIe slot; UnRAID would not boot with it inserted in any of them
  • Purchased a brand new IO Crest controller
  • Inserted it into any PCIe slot without a single SATA disk attached and UnRAID would boot
  • Attached any number of SATA disks to that new controller and UnRAID would fail to boot, in any slot
  • Purchased another brand new IO Crest controller
  • Inserted it into any PCIe slot, with or without SATA disks attached, and UnRAID would fail to boot
  • Removed the controller from the PCIe slot and UnRAID boots every time with my 4 disks missing
  • Connected the 4 disks that were used on the controller to the motherboard instead, removing 4 others that occupied those ports; UnRAID boots, the former 4 disks appear and are healthy, and the latter 4 disks are missing, as expected

 

Is it possible that I have an issue with the motherboard? I could see one PCIe slot going bad, but all of them? I assumed they don't all share the same PCIe lanes and that all of them failing wasn't very likely. It's also strange that two of the controllers tested wouldn't allow UnRAID to boot at all, while one controller let UnRAID boot as long as no SATA disks were attached.

 

I'm not sure what else to do. I would just go get a new motherboard if I was sure that was the issue and the CPU and RAM were not the cause. Any insight would be greatly appreciated.

 

TL;DR: PCIe to SATA controller worked for over a year then failed and UnRAID won't boot if it's inserted into the PCIe slot. Two new PCIe to SATA controllers of the same model also do not let UnRAID boot when inserted. This holds true when inserted into any of the available PCIe slots on the motherboard.

 

Edit 1: 

Additional Troubleshooting

Inserted PCIe GPU and connected monitor

 

Original SATA Controller

  • When inserted into any PCIe slot but no SATA disks attached PC will POST and UnRAID will boot
  • When connecting any one or more SATA disks PC will not POST

Both Replacement SATA Controllers

  • When inserted into any PCIe slot PC will POST and UnRAID will boot
  • When connecting any one or more SATA disks PC will POST and UnRAID will boot and all drives show, array can be started
  • Tested multiple arrangements of both PCIe slots and SATA disks attached and PC will POST and UnRAID boots every time, array can always be started
  • Removed GPU and UnRAID doesn't boot, can't tell if PC passes POST, can't reach webGUI and cannot ping static IP address

So I'm left thinking that for some reason my original SATA controller may have failed. I'm baffled that the two new replacement Marvell SATA controllers work every time provided I have a PCIe GPU inserted.
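To make the comparison above easier to follow, here is a minimal Python sketch that lists which single change flipped the boot result. The observation table is transcribed from the tests in this post (plus the controller-removed case from the first list); the names `OBSERVATIONS`, `FACTORS` and `single_factor_flips` are made up for illustration, and the rows reflect my reading of the notes rather than an exhaustive log.

```python
from itertools import combinations

# Observations transcribed from the troubleshooting notes in this post.
# Each row: (controller, gpu_installed, disks_on_controller, booted)
OBSERVATIONS = [
    ("none",        False, False, True),   # controller removed, headless
    ("original",    True,  False, True),   # original card, GPU in, no disks
    ("original",    True,  True,  False),  # original card, GPU in, disks attached
    ("replacement", True,  False, True),   # new card, GPU in, no disks
    ("replacement", True,  True,  True),   # new card, GPU in, disks attached
    ("replacement", False, True,  False),  # new card, GPU removed
]

FACTORS = ("controller", "gpu_installed", "disks_attached")


def single_factor_flips(observations):
    """Return (factor, run_a, run_b) for every pair of runs that differ in
    exactly one factor and also differ in whether the system booted."""
    flips = []
    for a, b in combinations(observations, 2):
        changed = [i for i in range(len(FACTORS)) if a[i] != b[i]]
        if len(changed) == 1 and a[3] != b[3]:
            flips.append((FACTORS[changed[0]], a, b))
    return flips


if __name__ == "__main__":
    for factor, a, b in single_factor_flips(OBSERVATIONS):
        print(f"{factor}: {a[:3]} booted={a[3]}  vs  {b[:3]} booted={b[3]}")
```

With these rows, three pairs differ in only one factor: attaching disks breaks the original card, swapping the original card for a replacement fixes that, and removing the GPU breaks the replacement card.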

 

Could this be a power supply issue? At first I thought it couldn't be since, if anything, adding the GPU would draw more power. But then I remembered that the GPU is powered separately, directly from the PSU with the 6-pin power cable.

Edited by KR1SeS
Added Edit 1: Additional Troubleshooting
Link to comment
12 minutes ago, mrbilky said:

I'm no expert, that's for sure, but search for Marvell controllers/chipsets. They seem to be problematic with unRAID according to most folks here. If you upgraded to the latest version, it may have broken what once worked.

I did upgrade UnRAID around the time of the initial failure. However, the controller didn't fail promptly after the OS upgrade; it would have been a few days or weeks after.

 

I found it so hard to believe that I could go through 3 of these controllers, all not working, that I started suspecting my motherboard.

 

I was thinking of purchasing an LSI SAS2008-8I SATA 9211-8i and trying it out before exploring other avenues. 

Link to comment

@KR1SeS Since you have an AMD motherboard have you by chance enabled AMD-Vi recently to do any VM work in unRAID?

 

This is more or less the description of the problem with Marvell-chipset-based SATA controllers and virtualization:

 

"There is a bad bug in the Marvell code for certain disk controller chipsets that causes connected drives to be unable to communicate when IOMMU is enabled.  If VT-d or AMD-Vi are turned on, then DMA reads fail, and the drives are unavailable."

 

Basically, if AMD-Vi is turned on, the drives connected to Marvell-based controllers can disappear in unRAID due to read errors.

 

If AMD-Vi is not enabled, this is likely not the issue and you need to look at something else as the possible cause.
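If you would rather check this from a running server than from the BIOS, here is a minimal Python sketch, assuming Python 3 is available on the box. It relies on two standard Linux sysfs locations: `/sys/kernel/iommu_groups` (populated only when the kernel has an active IOMMU) and `/sys/bus/pci/devices/*/vendor`. The vendor ID `0x1b4b` is Marvell's PCI vendor ID. The function names and the `sys_root` parameter are made up for illustration.

```python
from pathlib import Path

MARVELL_VENDOR_ID = "0x1b4b"  # PCI vendor ID assigned to Marvell


def iommu_active(sys_root="/sys"):
    """Return True if the kernel has populated at least one IOMMU group."""
    groups = Path(sys_root) / "kernel" / "iommu_groups"
    return groups.is_dir() and any(groups.iterdir())


def marvell_devices(sys_root="/sys"):
    """Return the PCI addresses of devices that report Marvell's vendor ID."""
    devices = Path(sys_root) / "bus" / "pci" / "devices"
    found = []
    if not devices.is_dir():
        return found
    for dev in sorted(devices.iterdir()):
        vendor_file = dev / "vendor"
        if vendor_file.is_file():
            vendor = vendor_file.read_text().strip().lower()
            if vendor == MARVELL_VENDOR_ID:
                found.append(dev.name)
    return found


if __name__ == "__main__":
    cards = marvell_devices()
    print("IOMMU active:", iommu_active())
    print("Marvell PCI devices:", cards or "none found")
    if iommu_active() and cards:
        print("IOMMU is on and a Marvell device is present; "
              "this matches the conditions described above.")
```

If it reports that the IOMMU is not active, the problem described in the quote is probably not what you are hitting.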

 

The recommendation is to go with an LSI controller even if you are not using virtualization to avoid any potential problems.  Clones such as the Dell H310 and IBM M1015 are also very popular and are confirmed to work.  Just  make sure they are cross-flashed to IT mode if you decide to try one of those.

Edited by Hoopster
Link to comment
I have never virtualized with my UnRAID server and don't plan to. I'll have to check my BIOS/UEFI, but I haven't touched the settings since my initial setup 3 years ago, and when I added my first Marvell controller it just worked (for over a year until now) and no settings were changed.

 

I'll look into the Dell H310 and the IBM M1015 as well.

Edited by KR1SeS
Fixed part numbers.
Link to comment

@Hoopster @mrbilky I have some more information from additional troubleshooting last night. I've also edited my OP with this.

 

Edited by KR1SeS
Changed PCI to PCIe
Link to comment
So, before you had this problem, did your server boot fine without the GPU installed and with the original PCIE/SATA card?

 

I am not familiar with your MB/CPU combo.  Does the MB have built-in graphics so no GPU is required or have you always just booted headless (no GPU required by MB) and accessed the server through GUI, PuTTY, SSH, Terminal, etc.?

 

Have you checked all graphics-related settings in your BIOS?

Link to comment
CPU: AMD FX-6300
M/B: Asus - M5A97 R2.0 ATX AM3+ Motherboard

 

My CPU doesn't have an iGPU and the motherboard does not have built-in graphics. In the very beginning I didn't have enough disks to warrant the SATA controller. I used an old 8800GT GPU for my initial setup of UnRAID. After that initial setup I physically removed the GPU from the PCIe slot to save energy and reduce heat and noise. The system would boot headless and I'd access one of two ways, via web browser to the webGUI, or with PuTTY. 
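Since a headless box gives no screen to look at, the only checks I have are whether the webGUI and SSH answer. Here is a minimal Python sketch of that check, run from another machine on the network. It assumes the webGUI listens on TCP port 80 and SSH on port 22 (the usual defaults, but your setup may differ); `port_open` and `headless_status` are names made up for illustration.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


def headless_status(host, services=(("webGUI", 80), ("SSH", 22))):
    """Return a dict mapping each service name to whether its port answers."""
    return {name: port_open(host, port) for name, port in services}
```

For example, `headless_status("192.168.1.50")` (a placeholder address, substitute the server's static IP) returns a dict like `{"webGUI": ..., "SSH": ...}`. If both come back False a minute or two after power-on, the server most likely never finished booting.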

 

Once I needed more than the 6 SATA ports my MB has, I bought this Marvell SATA controller. All I did was shut down the server, install it in a PCIe x1 slot, plug in my new SATA disks, and boot headless, no GPU. Everything booted fine and I was able to pre-clear and bring the new drives into the array. Over the next year plus I rebooted the server multiple times and it always booted headless without issues. Then we come to about a month ago, when UnRAID hung and didn't even respond to SSH powerdown commands. That brings us to the start of my OP.

 

I checked BIOS settings but I didn't really scrutinize. I didn't see virtualization settings. From day one I disabled everything but my USB Flash Drive for boot devices, and it's still that way. 

 

The server is still up since last night and running fine now, only with the GPU installed which I would still really like to avoid. It's strange how out of the blue this requirement of having a GPU presented itself. I didn't make any changes to the BIOS or UnRAID settings.

Link to comment
According to the manual for your motherboard virtualization is supported by the motherboard and AMD specs say your CPU supports it as well.  On your MB, virtualization is enabled if IOMMU is enabled.  Make sure it is disabled (which is the default setting) if you don't want it.

 

Check the Initiate Graphic Adapter setting. There are only two options, PEG/PCI (PCI Express graphics first) and PCI/PEG (regular PCI first), so you can choose which one to initialize first on boot should you have a GPU installed in both slot types. Your current setting is probably PEG/PCI since it is initializing the PCIe graphics adapter. Try changing it to PCI/PEG, remove the GPU and reboot. I am really not expecting anything to change, but you never know. This is really puzzling since it used to boot fine headless and now will not.

 

Below is a snippet from your MB manual regarding North Bridge configuration.  Both of the above are found there.

 

3.5.2 North Bridge

IOMMU [Disabled]
Allows you to disable or enable the IOMMU. IOMMU is supported on Linux based systems to convert 32bit I/O to 64bit MMIO. Configuration options: [Disabled] [Enabled]

Memory Configuration
Allows you to set the related memory configurations.

Bank Interleaving [Auto]
Allows you to enable the bank memory interleaving. Configuration options: [Auto] [Disabled]

Channel Interleaving [Auto]
Allows you to enable the channel memory interleaving. Configuration options: [Disabled] [Auto]

ECC Mode [Enabled]
Allows you to enable or disable the ECC mode. Configuration options: [Disabled] [Enabled]

Power Down Enable [Disabled]
Allows you to enable or disable the DDR power down mode. Configuration options: [Disabled] [Enabled]

Memory Hole Remapping [Enabled]
Allows you to enable or disable memory remapping around memory hole. Configuration options: [Disabled] [Enabled]

DCT Unganged Mode [Enabled]
Allows you to select unganged mode or ganged mode. Configuration options: [Disabled] [Enabled]

Initiate Graphic Adapter [PEG/PCI]
Allows you to select the primary boot graphic controller. Configuration options: [PEG/PCI] [PCI/PEG]
 
Link to comment
On 9/7/2018 at 2:49 PM, Hoopster said:

 

According to the manual for your motherboard virtualization is supported by the motherboard and AMD specs say your CPU supports it as well.  On your MB, virtualization is enabled if IOMMU is enabled.  Make sure it is disabled (which is the default setting) if you don't want it.

 

Check the Initiate Graphic Adapter setting. There are only two options, PEG/PCI (PCI Express graphics first) and PCI/PEG (regular PCI first), so you can choose which one to initialize first on boot should you have a GPU installed in both slot types. Your current setting is probably PEG/PCI since it is initializing the PCIe graphics adapter. Try changing it to PCI/PEG, remove the GPU and reboot. I am really not expecting anything to change, but you never know. This is really puzzling since it used to boot fine headless and now will not.

 

I wasn't able to troubleshoot any more until last night.

Settings as I found them last night, unchanged from when everything used to work properly:

  • IOMMU - Disabled
  • Initiate Graphic Adapter - PEG/PCI
  • ECC Mode - Enabled

So I tried a few changes in the UEFI, one at a time, and after each one I'd check whether UnRAID boots with the GPU in, and then again without. I swapped to PCI/PEG and that didn't work. I disabled ECC Mode (figured why not try at this point) and that didn't work.
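The one-change-at-a-time approach can be written down as a small test plan. This is only a sketch in Python: the baseline values are the settings listed above, the candidate values are the two changes I tried, and `one_at_a_time` is a name made up for illustration.

```python
# Baseline taken from the settings list above.
BASELINE = {
    "IOMMU": "Disabled",
    "Initiate Graphic Adapter": "PEG/PCI",
    "ECC Mode": "Enabled",
}

# The two changes tried so far, each applied on its own.
CANDIDATES = {
    "Initiate Graphic Adapter": "PCI/PEG",
    "ECC Mode": "Disabled",
}


def one_at_a_time(baseline, candidates):
    """Yield (changed_setting, settings, gpu_installed) test cases in which
    exactly one setting differs from the baseline."""
    for name, value in candidates.items():
        trial = dict(baseline)
        trial[name] = value
        # Confirm the box still boots with the GPU first, then pull it.
        for gpu_installed in (True, False):
            yield name, trial, gpu_installed


if __name__ == "__main__":
    for name, trial, gpu in one_at_a_time(BASELINE, CANDIDATES):
        gpu_text = "GPU installed" if gpu else "GPU removed"
        print(f"Change {name} to {trial[name]}, then boot with {gpu_text}")
```

Restoring the baseline between trials keeps each result attributable to a single setting.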

 

I also found a boot setting where if an error is encountered the system waits for F1 to be pressed before continuing. This setting was enabled by default. I tried disabling it, but to no avail.

 

As of now this is all I can do: boot UnRAID without a GPU only if my Marvell SATA controller is removed, or boot UnRAID with a GPU if my Marvell SATA controller is connected. So I'm leaving the system up and running with the GPU, which unfortunately draws more power and adds more heat.

 

I ordered this card and hope I can crossflash it to the latest IT-mode firmware. Then I'll see if UnRAID will boot with it and no GPU.

https://www.ebay.ca/itm/LSI-SAS2008-8I-SATA-9211-8i-6Gbps-8-Ports-HBA-PCI-E-RAID-Controller-Card/252048579357?ssPageName=STRK%3AMEBIDX%3AIT&_trksid=p2057872.m2749.l2649

 

Link to comment
16 minutes ago, Benson said:

Do you hear any boot sound, or have you set it to continue booting even on error?

When my keyboard and GPU are connected there is no beep and I have a successful POST. There is also no beep(s) when UnRAID loads successfully.

When my GPU is removed there is one short beep, which I believe indicates a successful POST with a GPU and/or keyboard not connected. When UnRAID loads headless successfully (with my SATA controller removed) there are a few successive longer beeps. When UnRAID fails to load headless (with my SATA controller connected) there are no further beeps after the one short POST beep.

 

Before this issue surfaced, when everything was working, I had no GPU connected and my Marvell SATA controller was installed. When I powered on the PC I would get the one short beep at POST, and after a few seconds I'd get a few successive longer beeps when UnRAID loaded successfully.
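Put as a lookup, this is how I read the beeps on a headless start. It is a minimal Python sketch based only on what I hear on my own board, not on any official beep-code table, and `boot_stage` is a name made up for illustration. It does not apply with the GPU installed, since I get no beeps at all in that case.

```python
def boot_stage(short_post_beep, long_boot_beeps):
    """Map the beeps heard on a headless start to the furthest stage reached.

    short_post_beep: one short beep was heard shortly after power-on.
    long_boot_beeps: a few longer beeps followed some seconds later.
    """
    if short_post_beep and long_boot_beeps:
        return "UnRAID loaded"
    if short_post_beep:
        return "POST passed, UnRAID did not load"
    return "no POST beep heard"


if __name__ == "__main__":
    # Headless with the Marvell controller installed, as it behaves today.
    print(boot_stage(short_post_beep=True, long_boot_beeps=False))
```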

Link to comment
1 minute ago, Benson said:

This should be expected.

 

Headless + controller still gives that short beep but fails to boot... Anyway, I never hear a longer beep from any of my unRAID servers after a successful boot.

The thing is, Headless + SATA Controller used to boot for me. Before this issue I'd been running that way for over a year with multiple restarts/boots. UnRAID booted headless every time without a GPU and with the controller and 4 SATA disks attached.

 

I will say that after testing three of the same Marvell controllers, my very first one is 100% not functioning, and the two new ones both behave as you said: when installed headless, my system won't boot. Long shot here, but if it's true that Headless + SATA Controller shouldn't boot, then maybe, maybe, maybe, the first controller I ever used had a flaw from the beginning, which allowed the system to boot.

Link to comment

Regardless of whether the GPU and keyboard are connected (+/-), does the controller's AHCI BIOS post?

 

If YES, I want you to try:

1) OptROM xxx (I don't remember the exact name): set it to "keep current setting" instead of "force BIOS"

2) "PCIe storage boot": set it to disabled (in the CSM section, if CSM is enabled)

 

Once you have confirmed that the controller's AHCI BIOS no longer posts, try GPU (-) and Controller (+) and see whether it boots successfully.

Edited by Benson
Link to comment
+GPU, +/-Keyboard: the controller's AHCI BIOS posts fine.

-GPU, +/-Keyboard: I don't know how to tell whether the controller's AHCI BIOS posts, as I can only go off the system beeps since I can't connect a monitor.

 

I will try both of your suggestions when I get home from work. Thank you.

Link to comment
  • 3 weeks later...

@Benson @Hoopster Sorry for the delay in following up. I appreciate the help and wanted to report back now that I have replaced my Marvell SATA controller with an LSI 9211-8i.

 

None of the suggestions recommended allowed me to boot without a GPU like I used to be able to.

 

I had been waiting for an LSI 9211-8i that I ordered from China; I just got it and installed it last night. I flashed the LSI controller to the latest firmware in IT mode. I chose not to flash it with a BIOS. I can now successfully boot without my GPU inserted again! My previous issue must have been with the SATA controller I was using, but I still have no idea why I was able to boot without a GPU while using the Marvell for over a year. Oh well, I now have space for four extra SATA drives thanks to the LSI.

Link to comment
