Alabaster

unRAID not detecting both M.2 drives


Posted (edited)

I just purchased 2 M.2 drives hoping to use them as a cache pool. The UEFI menu will show both drives, but the unRAID web GUI will only list one of the drives. Below is a little about my setup.

 

Motherboard: ASRock X370 Taichi

CPU: Ryzen 5 2600

M.2 drives: 2x XPG SX6000 Lite M.2 2280 512GB PCI-Express 3.0 x4 3D NAND

HBAs: 2x LSI 9207-8i

GPU: ATI FireMV 2250

 

I tested each drive separately and each will be detected in the web GUI in either M.2 slot when installed one at a time. However, unRAID isn't showing both of them when they're both installed, although the UEFI sees both of them.

 

After swapping drives and slots, I removed the GPU thinking it may be some odd PCIe lane allocation issue, but the issue persisted. I also updated from 6.7.0 to 6.7.2. Attached are the diagnostics.

 

I looked through syslog.txt and it looks to be an issue with both drives using the same NVMe Qualified Name (NQN). I'm researching this path at the moment, but hope someone with more experience can weigh in with a possible solution. Any ideas?

 

 

elysium-diagnostics-20190713-0247.zip

Edited by Alabaster

Posted (edited)

Both NVMe drives are detected on the PCI bus, but only one is bound to the nvme driver. The problem may be fixed by a kernel update or an NVMe firmware update.

 

01:00.0 Non-Volatile memory controller [0108]: Realtek Semiconductor Co., Ltd. Device [10ec:5762] (rev 01)
    Subsystem: Realtek Semiconductor Co., Ltd. Device [10ec:5762]
    Kernel driver in use: nvme
    Kernel modules: nvme

 

21:00.0 Non-Volatile memory controller [0108]: Realtek Semiconductor Co., Ltd. Device [10ec:5762] (rev 01)
    Subsystem: Realtek Semiconductor Co., Ltd. Device [10ec:5762]
    Kernel modules: nvme

 

Jul 12 21:26:04 Elysium kernel: nvme nvme1: ignoring ctrl due to duplicate subnqn (nqn.2018-05.com.example:nvme:nvm-subsystem-OUI00E04C).
Jul 12 21:26:04 Elysium kernel: nvme nvme1: Removing after probe failure status: -22
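
If you want to check your own diagnostics for the same symptom, here is a minimal Python sketch that scans syslog text for the message above. The regular expression is written from this one sample line, so treat it as an assumption that may need adjusting for other kernel versions; the function name `find_duplicate_subnqn` is just illustrative.

```python
import re

# Pattern derived from the single kernel message quoted in this thread.
DUPLICATE_SUBNQN = re.compile(
    r"nvme (?P<ctrl>nvme\d+): ignoring ctrl due to duplicate subnqn "
    r"\((?P<nqn>[^)]+)\)"
)


def find_duplicate_subnqn(syslog_text):
    """Return (controller, nqn) pairs for each controller the kernel rejected."""
    return [
        (m.group("ctrl"), m.group("nqn"))
        for m in DUPLICATE_SUBNQN.finditer(syslog_text)
    ]


if __name__ == "__main__":
    sample = (
        "Jul 12 21:26:04 Elysium kernel: nvme nvme1: ignoring ctrl due to "
        "duplicate subnqn (nqn.2018-05.com.example:nvme:nvm-subsystem-OUI00E04C)."
    )
    for ctrl, nqn in find_duplicate_subnqn(sample):
        print(ctrl, "was dropped because another drive already reported", nqn)
```

An empty result only means this exact message was not found; it does not rule out other probe failures.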

 

https://forums.lenovo.com/t5/ThinkPad-X-Series-Laptops/X1-Extreme-Intel-NVMe-Firmware-Upgrade-NQN-Duplicate-Issue/m-p/4415819#M99048

 

commit b9453f9bb66e864f8b7d7e112aea475bdd7a4e2b
Author: James Dingwall <james@dingwall.me.uk>
Date:   Tue Jan 8 10:20:51 2019 -0700

    nvme: introduce NVME_QUIRK_IGNORE_DEV_SUBNQN
    
    [ Upstream commit 6299358d198a0635da2dd3c4b3ec37789e811e44 ]
    
    If a device provides an NQN it is expected to be globally unique.
    Unfortunately some firmware revisions for Intel 760p/Pro 7600p devices did
    not satisfy this requirement.  In these circumstances if a system has >1
    affected device then only one device is enabled.  If this quirk is enabled
    then the device supplied subnqn is ignored and we fallback to generating
    one as if the field was empty.  In this case we also suppress the version
    check so we don't print a warning when the quirk is enabled.
    
    Reviewed-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: James Dingwall <james@dingwall.me.uk>
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 

Edited by Benson

I am also having this same issue with 2 Adata XPG GAMMIX S5 256GB drives.  Both are seen in the BIOS but I see the following during system start up.

 

Jul 14 21:56:11 TheWatchtower kernel: nvme nvme1: ignoring ctrl due to duplicate subnqn (nqn.2018-05.com.example:nvme:nvm-subsystem-OUI00E04C).

Jul 14 21:56:11 TheWatchtower kernel: nvme nvme1: Removing after probe failure status: -22

 

Is there a config tweak that can be made or will this have to be added by the main dev team?

 

Alabaster, were you able to get the issue sorted out?

 

Cheers,

 

Chris

 

On 7/15/2019 at 12:09 AM, drkCrix said:

I am also having this same issue with 2 Adata XPG GAMMIX S5 256GB drives.  Both are seen in the BIOS but I see the following during system start up.

 

Jul 14 21:56:11 TheWatchtower kernel: nvme nvme1: ignoring ctrl due to duplicate subnqn (nqn.2018-05.com.example:nvme:nvm-subsystem-OUI00E04C).

Jul 14 21:56:11 TheWatchtower kernel: nvme nvme1: Removing after probe failure status: -22

 

Is there a config tweak that can be made or will this have to be added by the main dev team?

 

Alabaster, were you able to get the issue sorted out?

 

Cheers,

 

Chris

 

Unrelated to your issue?

 

Did you buy your ADATA drives from Massdrop?

Posted (edited)

That was me. I ran into this issue with my new Thinkpad X1 Extreme laptop. Feel free to adapt the patch if you have different variants of this SSD. `lspci -nn` will give you the PCI vendor and device IDs. If it's Realtek, it should be 0x10ec for the vendor ID and the device ID will be different for different models of the SSD.
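
For anyone adapting the patch, here is a small Python sketch that pulls the vendor and device IDs out of a line of `lspci -nn` output. It assumes the `[vvvv:dddd]` layout seen in the output posted earlier in this thread; the helper name `pci_ids` is hypothetical.

```python
import re

# The vendor:device pair is the bracketed group containing a colon,
# e.g. [10ec:5762]; the class code such as [0108] has no colon.
PCI_ID = re.compile(r"\[([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\]")


def pci_ids(lspci_line):
    """Return (vendor_id, device_id) as integers, or None if not found."""
    match = PCI_ID.search(lspci_line)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


if __name__ == "__main__":
    line = (
        "01:00.0 Non-Volatile memory controller [0108]: Realtek Semiconductor "
        "Co., Ltd. Device [10ec:5762] (rev 01)"
    )
    vendor, device = pci_ids(line)
    print("vendor=0x%04x device=0x%04x" % (vendor, device))
```

Those two values are what a quirk entry would need to match on for a given drive model.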

Also, we really should get ADATA/Realtek to patch their lame firmware :( I'm sure the Linux kernel guys aren't happy with an ever-growing list of quirks.

Also, here's a resource for building custom kernels for unRAID: 
https://wiki.unraid.net/Building_a_custom_kernel

(Note: I'm not an unRAID user, just circling back here as I hate the phenomenon of finding some post about some problem, but no solutions.)

9 hours ago, drkCrix said:

@Alabaster

 

I found this today

 

https://lkml.org/lkml/2019/7/15/57

 

Looks like it is for drives with the Realtek controller, like the ones we have.

 

Edited by mishan

Posted (edited)

@drkCrix

No, I haven't sorted this out.

 

According to the following link, it appears there will be a fix in the Linux 5.3 kernel. I'm not Linux savvy, so I'm not sure how that will play out for unRAID.

https://forum.proxmox.com/threads/only-one-of-two-nvme-detected-in-linux-duplicate-subnqn.54480/

 

I thought the NQN was something configurable by the manufacturer, so I would blame ADATA and not Realtek. I could be completely wrong about that, though.

 

I contacted ADATA "customer service" (in quotes since it is a joke). Their response email looks completely automated as it starts with "Dear Customer" and doesn't even have the name I entered when filling out the online form to contact them. The email starts out acknowledging I have an issue and then goes right into stating they are here to assist with the return of the product. The email did provide a few generic troubleshooting steps, but had absolutely no mention of the issue I explicitly detailed for them (again, since this is an automated, impersonal email.) I'll try replying to the email to see if that actually gets anywhere.

 

I think I'll probably just return the drives and spend a bit more on another brand.

Edited by Alabaster


I have started my return with Amazon on the Adata drives and have ordered 2 Corsair MP510s to replace them. Hopefully Corsair will be easier to deal with than Adata if issues arise.


Ok, I have cancelled the MP510 order.

 

@Benson, are there NVMe drives that just work?

 

Samsung ?

HP EX920 ?

 

I noticed that there are a few firmware updates for the Phison E12 family of NVMe drives (currently on 12.3). I wonder if any of the updates fixed the TRIM issues.

 

Thanks

Edited by drkCrix
