[SOLVED] Adding 2nd NVMe. Data Drive Disappears


Hi,

I was hoping someone could give a newbie some advice regarding a problem I've just encountered.

I’m only a few months into my Unraid experience, but the other day I decided to add a cache drive to the system, so I bought a second WD 1TB NVMe for this purpose (the first one is passed through to a Windows VM).

 

When I added this to the system and restarted Unraid, my main data drive (8TB IronWolf) was marked as missing.

Unfortunately I was panicking a bit at this point, so I took the obvious step of removing the NVMe, which reverted everything to normal (I didn't take a diagnostic dump during the problem because I was worried about losing data).

 

When I looked at the BIOS all the drives were present including the new NVMe.

 

Before attempting this again I was wondering if there was anything I was obviously doing wrong or should do to add this drive?

I’ve attached my normal diagnostics file, and my hardware below, but thought I’d raise the question before building up my nerve to try again, even if just to grab a diagnostic dump during the problem (I don't want to risk corruption or data loss if I can help it).

 

Any assistance and advice greatly appreciated.

Best Regards

 

 

 

 

Hardware

·         Version 6.8.3 2020-03-05

·         AMD 3900X

·         Gigabyte Aorus X570 Pro Motherboard

·         2X 8TB Seagate Ironwolf HD. One as data and one as parity

·         1X 320GB old data drive

·         Gigabyte GeForce RTX 2070 Super WINDFORCE OC 3x 8G (Pass through to windows VM)

·         CORSAIR VENGEANCE LPX 64GB (2 x 32GB) DDR4 3200 (PC4-25600)

·         1X WD Blue SN550 1TB High-Performance M.2 Pcie NVMe SSD (Pass through to VM)

·         1X 4 port Intel NIC

·         1X Old GPU for system

 

Plugins

·         Community Apps

·         Dynamix System Temp

·         Fix Common Problems

·         Nerd Tools

·         ProFTPd

·         Statistics

·         Tips and Tweaks

·         Unassigned Devices

·         UPnP Monitor

·         User Scripts

·         VFIO-PCI CFG

 

PCIe override = Both

unraid-diagnostics-20201024-1152.zip

Check the specs of your motherboard in detail.

On many boards, adding a second NVMe drive means one SATA port is sacrificed.

If not all SATA ports are in use, moving the HDD to a different SATA port is the best option.

 

Edit: never mind, I missed the part where you stated that all disks were present in the BIOS.

Edited by Ford Prefect

Many thanks both for the super speedy responses.

 

In terms of adding the NVMe back in to get a diagnostic dump, can I do that 'relatively' safely without data corruption or loss? As I say, I was panicking a bit at first when the drive disappeared.

 

Can I also move the drive to a different SATA port, i.e. is the drive tied to the port in some way, or is it recognised by an ID regardless of the port it is connected to?

 

Thanks

 

Steve

Edited by sgpowelluk

Unraid tracks each drive's position in the array by its serial number/GUID, not by port, so moving drives to a different port or even a different controller should be possible.
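To illustrate the idea only (this is not Unraid's actual code), here is a minimal Python sketch. It assumes the array assignment is stored as a mapping from a disk identifier to a slot; the function name, serial strings and port names are all made up for the example.

```python
# Hypothetical sketch: array slots keyed by disk identifier, not by SATA port.
# The serials and port names are invented for illustration.

def resolve_slots(assignments, detected):
    """Map each array slot to the port where its disk was found (or None).

    assignments: {serial: slot_name} saved in the array configuration
    detected:    {port_name: serial} as seen at this boot
    """
    port_by_serial = {serial: port for port, serial in detected.items()}
    return {slot: port_by_serial.get(serial) for serial, slot in assignments.items()}


assignments = {"SERIAL-PARITY": "parity", "SERIAL-DATA": "disk1"}

# Same disks on different ports: every slot still resolves.
before = resolve_slots(assignments, {"sata3": "SERIAL-DATA", "sata4": "SERIAL-PARITY"})
after = resolve_slots(assignments, {"sata5": "SERIAL-DATA", "sata4": "SERIAL-PARITY"})

# A disk that is not detected at all shows up as missing (None).
missing = resolve_slots(assignments, {"sata4": "SERIAL-PARITY"})
```

With this kind of lookup, a slot only becomes "missing" when the disk itself is not visible to the OS, which matches the symptom described in this thread.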

 

If your diagnostic routine is non-invasive, and as long as you do not format the drive, your data should be safe.

But you could always back up the data to another location before running diagnostics.

So curiosity got the better of me and I managed to pluck up the courage to try a few things.

 

Firstly I booted with the second NVMe installed to get the diags. I've attached the diags from when the data drive disappears. Hope this sheds some light.

 

From the BIOS perspective the drives appear as:

SATA 0

SATA 1

SATA 2

SATA 3 - 8TB Data

SATA 4 - 8TB Parity

SATA 5 - 320 GB

 

There does seem to be an odd anomaly though: on some of the BIOS screens SATA 5 and the 320GB drive don't appear, whereas on other info screens they do. I haven't gone through all the permutations to get to the bottom of this.

 

I've also moved the drives around the SATA ports and found that when the second NVMe is present, only the last two SATA ports are able to see drives in Unraid.

 

Oh, I should say these diags were taken with the 2070 GPU removed to ease access to the NVMe.

 

Many thanks for continued assistance.

Regards

 

Steve

 

unraid-diagnostics-20201024-1524.zip

Edited by sgpowelluk

Well, I am not an expert with AMD based systems.

The unraid diagnostics do not reveal anything going wrong, as far as I am concerned.

But you are right, only one 8TB drive is present and slot-2 in the array is missing.

Also, your MB should support 2 NVMe drives without any problems.

So I doubt that this is unraid related but rather caused by the BIOS.

What I found is this:

Gigabyte Technology Co., Ltd. - X570 AORUS PRO
BIOS Information
	Vendor: American Megatrends Inc.
	Version: F20a
	Release Date: 06/16/2020

...but according to https://www.gigabyte.com/Motherboard/X570-AORUS-PRO-rev-10/support#support-dl-bios, version F20a and its release date do not even appear in the list.

So your installed version is behind F20 at least, and according to the list, version F21 would improve PCIe compatibility (which could have an impact on PCIe-based NVMe drives).

 

So the best advice I can offer at this point is to upgrade your BIOS to the latest release first.

I know this doesn't address the problem, but I recommend not even using

4 hours ago, sgpowelluk said:

1X 320GB old data drive

It doesn't provide significant capacity, it is going to be slower than newer larger disks, and it's old. Each additional disk is an additional point of failure.

 

In order to reliably rebuild every bit of a missing disk, Unraid must be able to reliably read every bit of all other disks.
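As a rough illustration of why that is, here is a small Python sketch. It assumes plain single XOR parity and uses three tiny made-up "disks" of three bytes each; the names are invented and this is not Unraid's implementation.

```python
# Sketch of single-parity (XOR) reconstruction with tiny made-up "disks".
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


disk1 = bytes([0x12, 0x34, 0x56])
disk2 = bytes([0xAB, 0xCD, 0xEF])
disk3 = bytes([0x00, 0xFF, 0x0F])

# Parity is the XOR of all data disks.
parity = xor_blocks([disk1, disk2, disk3])

# Rebuild disk2 from parity plus every *other* data disk.
rebuilt = xor_blocks([parity, disk1, disk3])

# If one surviving disk returns a single wrong byte, the rebuild is wrong too.
bad_disk3 = bytes([0x00, 0xFE, 0x0F])
corrupted = xor_blocks([parity, disk1, bad_disk3])
```

Here `rebuilt` equals `disk2`, while `corrupted` differs from it in exactly the byte position where `bad_disk3` was wrong. Every extra disk in the array is one more input that has to read back perfectly during a rebuild.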

My suspicions were correct:

 

Device 09:00.0 was the GPU:

09:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU104 [GeForce RTX 2070 SUPER] [10de:1e84] (rev a1)
    Subsystem: Gigabyte Technology Co., Ltd Device [1458:4001]
    Kernel driver in use: vfio-pci

Now it is a SATA controller:

 

09:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
    Subsystem: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901]
    Kernel driver in use: vfio-pci
    Kernel modules: ahci

And it is being passed through/bound to a VM.
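For anyone who wants to spot this kind of change themselves, here is a minimal Python sketch that compares two saved device listings in the format quoted above and reports addresses whose vendor:device ID changed. The function names and the regular expression are my own and only assume the line layout shown in those two excerpts.

```python
import re

# Matches header lines such as:
# "09:00.0 VGA compatible controller [0300]: NVIDIA ... [10de:1e84] (rev a1)"
LINE = re.compile(
    r"^(?P<addr>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) (?P<cls>.+?) \[[0-9a-f]{4}\]: "
    r".*\[(?P<id>[0-9a-f]{4}:[0-9a-f]{4})\]"
)


def parse(listing):
    """Return {pci_address: (class_name, vendor:device)} from the listing text."""
    devices = {}
    for line in listing.splitlines():
        m = LINE.match(line.strip())
        if m:  # indented detail lines (Subsystem, Kernel driver) do not match
            devices[m["addr"]] = (m["cls"], m["id"])
    return devices


def changed_addresses(old, new):
    """Addresses present in both listings whose vendor:device ID differs."""
    return sorted(a for a in old.keys() & new.keys() if old[a][1] != new[a][1])


before = """09:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU104 [GeForce RTX 2070 SUPER] [10de:1e84] (rev a1)
    Subsystem: Gigabyte Technology Co., Ltd Device [1458:4001]
    Kernel driver in use: vfio-pci"""

after = """09:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
    Subsystem: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901]
    Kernel driver in use: vfio-pci"""

changed = changed_addresses(parse(before), parse(after))
```

On these two excerpts `changed` contains only `09:00.0`, the address that went from GPU to SATA controller while staying bound to vfio-pci.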

Thanks Ford Prefect/ Trurl,

I was aware that there has been a new BIOS for the motherboard for a while and thought that might be a sensible next step. That said, I held off as I was worried that it might mess up my IOMMU groupings and stop some of the VMs working. Similarly, if the new BIOS improved the IOMMU groupings, I thought I might be able to remove 'PCIe Override', which I guess is generally considered an undesirable workaround.

Any comments on whether these concerns are likely to be a big issue?

 

 

In terms of the 320GB drive: I used that to build the server while I waited for the 8TB ones on back order. In my naivety I hadn't realised that removing a drive had knock-on effects, such as recalculating parity (I believe), so some people had suggested leaving drives in. To be honest, I am due to swap it out shortly with either an existing 3TB from my desktop or maybe even another 8TB. That still leaves me needing three or four working SATA ports though (four to do an advance preclear).

I absolutely agree though that it shouldn't really be in there any longer.

 

I suspect the BIOS upgrade may be the logical next step, so any thoughts on whether it is likely to negatively affect my VMs, or at what point after upgrading I should disable PCIe override, would be greatly appreciated.

 

Thanks once again to both of you.

Regards

 

Steve

...you previously configured passthrough for your GPU card.

When you configured the PCI slot address for it, the second NVMe wasn't there.

The PCI address numbering is rebuilt every time the system boots.

If you change your hardware config, for example by adding new PCIe devices (like NVMe drives) or reseating existing cards in a different slot, the numbering can change.

Unfortunately, passthrough is a hard-configured list of addresses and will not adjust itself when the numbering changes.

After you added the NVMe, the address that used to be your GPU card is now your SATA controller (at least some ports of it).

So when Unraid boots, those ports disappear from the system and are passed on to the VM.

The same can happen when you pull a card: the ID just gets assigned to another device.

 

Add everything you want into your server, boot, disable VMs in the VM Manager settings, then re-configure the passthrough assignments with the VFIO-PCI plugin, save and reboot. All disks should then be there and you can re-enable VMs, too.
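To show what "stale" means here, a small Python sketch follows. It is hypothetical and not the VFIO-PCI plugin's real logic or file format: it assumes you noted the vendor:device ID next to each address when you set up passthrough, and it flags addresses that now hold a different device.

```python
# Hypothetical sketch, not the VFIO-PCI plugin's real logic or file format.

def stale_bindings(bindings, devices):
    """Return bindings whose address no longer holds the expected device.

    bindings: {pci_address: "vendor:device" ID recorded at config time}
    devices:  {pci_address: "vendor:device" ID seen at this boot}
    """
    return {
        addr: (expected, devices.get(addr))
        for addr, expected in bindings.items()
        if devices.get(addr) != expected
    }


# Recorded when the GPU was configured for passthrough.
bindings = {"09:00.0": "10de:1e84"}

# After adding the NVMe, 09:00.0 is the SATA controller (IDs as quoted earlier
# in this thread). The GPU's new address is not known from the thread, so
# "0a:00.0" is an invented placeholder.
devices_now = {"09:00.0": "1022:7901", "0a:00.0": "10de:1e84"}

stale = stale_bindings(bindings, devices_now)
```

Here `stale` reports that `09:00.0` was expected to be `10de:1e84` but is now `1022:7901`, which is the cue to redo the passthrough assignments after any hardware change.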

Apologies for slow feedback, I haven't had any free time until now.

 

Ford Prefect,

the above explanation made perfect sense, so I followed your steps and managed to get everything visible, including the NVMe and the initial VMs I really care about.

 

So all being well I think you and JorgeB have managed to resolve my issues, and I'm extremely grateful for your help. I've also managed to learn a bit more about Unraid in the process, which is also a benefit.

 

I'll give it a few days and do a bit more testing, then look at the BIOS upgrade and whether that improves options around PCIe override as well.

 

I believe I need to mark this as solved, but do I also need to credit you and JorgeB in any way from a reputation perspective?

 

Once again really appreciate both your assistance.

Regards

 

Steve

  • JorgeB changed the title to [SOLVED] Adding 2nd NVMe. Data Drive Disappears