Alabaster Posted July 13, 2019

I just purchased two M.2 drives, hoping to use them as a cache pool. The UEFI menu shows both drives, but the unRAID web GUI lists only one of them. Here is my setup:

Motherboard: ASRock X370 Taichi
CPU: Ryzen 5 2600
M.2 drives: 2x XPG SX6000 Lite M.2 2280 512GB PCI-Express 3.0 x4 3D NAND
HBAs: 2x LSI 9207-8i
GPU: ATI FireMV 2250

I tested each drive separately, and each is detected in the web GUI in either M.2 slot when installed on its own. However, unRAID doesn't show both drives when both are installed, although the UEFI sees both. After swapping drives and slots, I removed the GPU, thinking it might be some odd PCIe lane allocation issue, but the issue persisted. I also updated from 6.7.0 to 6.7.2.

Attached are the diagnostics. I looked through syslog.txt, and the problem appears to be that both drives report the same NVMe Qualified Name (NQN). I'm researching this path at the moment, but I hope someone with more experience can weigh in with a possible solution. Any ideas?

elysium-diagnostics-20190713-0247.zip

Edited July 13, 2019 by Alabaster

Vr2Io Posted July 13, 2019

Both NVMe devices are detected; the problem may be fixed by a kernel or NVMe firmware update.

01:00.0 Non-Volatile memory controller [0108]: Realtek Semiconductor Co., Ltd. Device [10ec:5762] (rev 01)
    Subsystem: Realtek Semiconductor Co., Ltd. Device [10ec:5762]
    Kernel driver in use: nvme
    Kernel modules: nvme
21:00.0 Non-Volatile memory controller [0108]: Realtek Semiconductor Co., Ltd. Device [10ec:5762] (rev 01)
    Subsystem: Realtek Semiconductor Co., Ltd. Device [10ec:5762]
    Kernel modules: nvme

Jul 12 21:26:04 Elysium kernel: nvme nvme1: ignoring ctrl due to duplicate subnqn (nqn.2018-05.com.example:nvme:nvm-subsystem-OUI00E04C).
Jul 12 21:26:04 Elysium kernel: nvme nvme1: Removing after probe failure status: -22

https://forums.lenovo.com/t5/ThinkPad-X-Series-Laptops/X1-Extreme-Intel-NVMe-Firmware-Upgrade-NQN-Duplicate-Issue/m-p/4415819#M99048

commit b9453f9bb66e864f8b7d7e112aea475bdd7a4e2b
Author: James Dingwall <james@dingwall.me.uk>
Date: Tue Jan 8 10:20:51 2019 -0700

    nvme: introduce NVME_QUIRK_IGNORE_DEV_SUBNQN

    [ Upstream commit 6299358d198a0635da2dd3c4b3ec37789e811e44 ]

    If a device provides an NQN it is expected to be globally unique. Unfortunately some firmware revisions for Intel 760p/Pro 7600p devices did not satisfy this requirement. In these circumstances if a system has >1 affected device then only one device is enabled. If this quirk is enabled then the device supplied subnqn is ignored and we fallback to generating one as if the field was empty. In this case we also suppress the version check so we don't print a warning when the quirk is enabled.

    Reviewed-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: James Dingwall <james@dingwall.me.uk>
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Edited July 13, 2019 by Benson

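For anyone checking their own diagnostics for the same symptom, here is a minimal Python sketch that scans syslog text for the "duplicate subnqn" message quoted above. The regular expression is modeled only on the two log lines in this thread, so it is an assumption that other kernel versions word the message the same way; the function name `find_duplicate_subnqn` is made up for illustration.

```python
import re

# Pattern modeled on the kernel message quoted in this thread.
# Assumption: other kernel versions may phrase the message differently.
DUP_SUBNQN = re.compile(
    r"nvme (?P<ctrl>nvme\d+): ignoring ctrl due to duplicate subnqn \((?P<nqn>[^)]+)\)"
)


def find_duplicate_subnqn(log_text):
    """Return a list of (controller, nqn) pairs for each rejected controller."""
    return [(m.group("ctrl"), m.group("nqn")) for m in DUP_SUBNQN.finditer(log_text)]


# Sample input: the two syslog lines from the post above.
sample = (
    "Jul 12 21:26:04 Elysium kernel: nvme nvme1: ignoring ctrl due to duplicate subnqn "
    "(nqn.2018-05.com.example:nvme:nvm-subsystem-OUI00E04C).\n"
    "Jul 12 21:26:04 Elysium kernel: nvme nvme1: Removing after probe failure status: -22\n"
)

for ctrl, nqn in find_duplicate_subnqn(sample):
    print(f"{ctrl} rejected: duplicate subnqn {nqn}")
```

To use it on real data, read syslog.txt from the diagnostics zip into a string and pass it to `find_duplicate_subnqn`. An empty result would suggest a different cause than the one discussed here.
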
drkCrix Posted July 15, 2019

I am having the same issue with two Adata XPG GAMMIX S5 256GB drives. Both are seen in the BIOS, but I see the following during system startup:

Jul 14 21:56:11 TheWatchtower kernel: nvme nvme1: ignoring ctrl due to duplicate subnqn (nqn.2018-05.com.example:nvme:nvm-subsystem-OUI00E04C).
Jul 14 21:56:11 TheWatchtower kernel: nvme nvme1: Removing after probe failure status: -22

Is there a config tweak that can be made, or will this have to be added by the main dev team?

Alabaster, were you able to get the issue sorted out?

Cheers,
Chris

drkCrix Posted July 16, 2019

@Alabaster I found this today: https://lkml.org/lkml/2019/7/15/57

It looks like it is for drives with the Realtek controller, like the ones we have.

ijuarez Posted July 16, 2019

On 7/15/2019 at 12:09 AM, drkCrix said:

I am having the same issue with two Adata XPG GAMMIX S5 256GB drives. Both are seen in the BIOS, but I see the following during system startup:

Jul 14 21:56:11 TheWatchtower kernel: nvme nvme1: ignoring ctrl due to duplicate subnqn (nqn.2018-05.com.example:nvme:nvm-subsystem-OUI00E04C).
Jul 14 21:56:11 TheWatchtower kernel: nvme nvme1: Removing after probe failure status: -22

Is there a config tweak that can be made, or will this have to be added by the main dev team?

Alabaster, were you able to get the issue sorted out?

Cheers,
Chris

Unrelated to your issue: did you buy your ADATA drives from Massdrop?

mishan Posted July 17, 2019

That was me. I ran into this issue with my new ThinkPad X1 Extreme laptop. Feel free to adapt the patch if you have different variants of this SSD. `lspci -nn` will give you the PCI vendor and device IDs. If it's Realtek, the vendor ID should be 0x10ec, and the device ID will differ between models of the SSD.

Also, we really should get ADATA/Realtek to patch their lame firmware; I'm sure the Linux kernel guys aren't happy with an ever-growing list of quirks.

Also, here's a resource for building custom kernels for unRAID: https://wiki.unraid.net/Building_a_custom_kernel

(Note: I'm not an unRAID user, just circling back here because I hate the phenomenon of finding a post about some problem but no solution.)

9 hours ago, drkCrix said:

@Alabaster I found this today: https://lkml.org/lkml/2019/7/15/57

It looks like it is for drives with the Realtek controller, like the ones we have.

Edited July 17, 2019 by mishan

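As a rough illustration of reading the IDs out of `lspci -nn`, the Python sketch below extracts the `[vendor:device]` pair for each NVMe controller and flags models that appear more than once. It assumes the output looks like the lines posted earlier in this thread (class code `[0108]`, lowercase hex IDs); the helper names `nvme_ids` and `repeated_models` are made up for this example.

```python
import re
from collections import Counter

# Matches lines such as:
# 01:00.0 Non-Volatile memory controller [0108]: ... Device [10ec:5762] (rev 01)
# Assumption: the format follows the lspci output quoted in this thread.
NVME_LINE = re.compile(
    r"^(?P<slot>\S+) Non-Volatile memory controller \[0108\]: .*"
    r"\[(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})\]",
    re.MULTILINE,
)


def nvme_ids(lspci_output):
    """Return (slot, vendor_id, device_id) for each NVMe controller line."""
    return [
        (m.group("slot"), m.group("vendor"), m.group("device"))
        for m in NVME_LINE.finditer(lspci_output)
    ]


def repeated_models(ids):
    """Return the (vendor_id, device_id) pairs that occur more than once."""
    counts = Counter((vendor, device) for _slot, vendor, device in ids)
    return [pair for pair, n in counts.items() if n > 1]


# Sample input copied from the lspci output earlier in the thread.
sample = (
    "01:00.0 Non-Volatile memory controller [0108]: Realtek Semiconductor Co., Ltd. "
    "Device [10ec:5762] (rev 01)\n"
    "\tSubsystem: Realtek Semiconductor Co., Ltd. Device [10ec:5762]\n"
    "21:00.0 Non-Volatile memory controller [0108]: Realtek Semiconductor Co., Ltd. "
    "Device [10ec:5762] (rev 01)\n"
    "\tSubsystem: Realtek Semiconductor Co., Ltd. Device [10ec:5762]\n"
)

ids = nvme_ids(sample)
print(ids)
print(repeated_models(ids))
```

Two identical drives will always share a vendor and device ID, so a repeated pair is not a fault in itself. It only identifies which ID a quirk entry in the kernel patch would need to match.
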
Alabaster Posted July 17, 2019

@drkCrix No, I haven't sorted this out. According to the following link, it appears there will be a fix in the Linux 5.3 kernel. I'm not Linux savvy, so I'm not sure how that will play out for unRAID.

https://forum.proxmox.com/threads/only-one-of-two-nvme-detected-in-linux-duplicate-subnqn.54480/

I thought the NQN was something configurable by the manufacturer, so I would blame ADATA and not Realtek. I could be completely wrong about that, though.

I contacted ADATA "customer service" (in quotes since it is a joke). Their response email looks completely automated: it starts with "Dear Customer" and doesn't even include the name I entered when filling out the online contact form. The email starts by acknowledging that I have an issue and then goes right into stating that they are here to assist with the return of the product. It did provide a few generic troubleshooting steps, but made absolutely no mention of the issue I explicitly detailed for them (again, since this is an automated, impersonal email). I'll try replying to the email to see if that actually gets anywhere.

I think I'll probably just return the drives and spend a bit more on another brand.

Edited July 17, 2019 by Alabaster

drkCrix Posted July 20, 2019

I have started my return with Amazon on the Adata drives and have ordered two Corsair MP510s to replace them. Hopefully Corsair will be easier to deal with than Adata if issues arise.

Vr2Io Posted July 20, 2019

But the MP510 seems to have a TRIM issue.

drkCrix Posted July 20, 2019

OK, I have cancelled the MP510 order.

@Benson Are there NVMe drives that just work? Samsung? HP EX920?

I noticed that there are a few firmware updates for the Phison E12 family of NVMe drives (currently on 12.3). I wonder if any of the updates fixed the TRIM issues.

Thanks

Edited July 20, 2019 by drkCrix

mishan Posted August 2, 2019

On 7/20/2019 at 4:36 AM, johnnie.black said:

Samsung work fine.

Yeah, it's seriously hard to go wrong with Samsung SSDs.

Alabaster Posted August 2, 2019

UPDATE: I returned one of the ADATA drives and purchased an HP EX920. Both drives now show up.

I was testing my UPS setup to make sure the server will come back on after a power outage. I have a CyberPower CP1500AVRLCD UPS and am using the NUT plugin for communication. I noticed the UPS was turning off before the server had shut down. This caused a hard shutdown, and the ADATA drive would not show up after powering the server back on, not even in the BIOS/UEFI. It wouldn't show up on subsequent reboots either. I would have to swap the drives between the M.2 slots and boot again for both drives to show up (not convenient!!). I was able to tweak the settings and config files in the NUT plugin to get my desired shutdown sequence and avoid this problem.

In case anyone was wondering, I ran the Autodetect function in NUT, switched the 'Enable Manual Config Only' option to Yes, then added the two highlighted lines in the screenshot to the ups.conf file. Adding 'offdelay' instructs my UPS to delay its shutdown for roughly X seconds, which allows more time for the server to shut down. The default is 20 seconds, which is simply not enough. I tried adding the same lines to ups.conf with the 'Enable Manual Config Only' option set to No, but the value would revert to the default when the service was started, so I had to set it to Yes.

Below are the settings I'm using, for visual reference. (I didn't have the service started in the screenshot, as I was still testing, and I only had the shutdown time set to 1 minute for testing purposes.) Of course, your mileage may vary.

Hope that helps someone!

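The idea behind the 'offdelay' change can be sanity-checked with a small Python sketch: read the value from ups.conf text and compare it with how long the server takes to shut down. Everything specific here is an assumption for illustration: the section name `[ups]`, the value 120, and the helper names are made up, and the second highlighted line from the screenshot is not reproduced because it is not stated in the post. The fallback of 20 seconds is the default mentioned above.

```python
import re


def read_offdelay(conf_text, default=20):
    """Return the offdelay value in seconds, or the default if it is not set."""
    # Matches an uncommented line such as: offdelay = 120 (optionally quoted).
    m = re.search(r'^\s*offdelay\s*=\s*"?(\d+)"?\s*$', conf_text, re.MULTILINE)
    return int(m.group(1)) if m else default


def offdelay_is_sufficient(conf_text, shutdown_seconds):
    """True if the UPS waits longer than the server needs to shut down."""
    return read_offdelay(conf_text) > shutdown_seconds


# Hypothetical ups.conf contents; the section name and the value 120 are
# illustrative assumptions, not the settings from the screenshot.
sample_conf = """\
[ups]
driver = usbhid-ups
port = auto
offdelay = 120
"""

print(read_offdelay(sample_conf))
print(offdelay_is_sufficient(sample_conf, shutdown_seconds=60))
```

The shutdown time passed in would need to be measured on the actual server; a value timed once under light load may be too optimistic.
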
sekrit Posted September 3, 2019

On 7/20/2019 at 7:36 AM, johnnie.black said:

Samsung work fine.

I actually came here because my two Samsung NVMe drives are not both being recognized. 😕

JorgeB Posted September 3, 2019

3 minutes ago, sekrit said:

I actually came here because my two Samsung NVMe drives are not both being recognized. 😕

Then you have a different problem; they are known to work with Unraid.

sekrit Posted September 3, 2019

8 minutes ago, johnnie.black said:

Then you have a different problem; they are known to work with Unraid.

The ones previously seen, perhaps. I doubt that it is an entirely different issue, given that it involves two NVMe drives of the same model. Giving such an absolute response without even knowing which models are affected is "specious" at best. No?

JorgeB Posted September 3, 2019

See if the devices are detected in the BIOS or in another OS, or post the diagnostics.

tjb_altf4 Posted September 3, 2019

3 hours ago, sekrit said:

I actually came here because my two Samsung NVMe drives are not both being recognized. 😕

There's a good chance it's a PCIe lane sharing issue; that is a common culprit. Check your motherboard manual to see whether your M.2 slots share lanes with other devices you may be using.