• [6.10.0-rc4] NVMe device name changed after upgrading


    JorgeB
    • Solved Minor

    Just upgraded one of my main servers from v6.9.2 to rc4 and one of the NVMe devices changed name and is "wrong", there's an extra underscore:

     

    [Screenshot: image.png]

     

    No big deal, I thought: just reset the pool and it will use the new name. But it doesn't take. After resetting the pool and starting the array, the device mounts but continues to show as "new" (note the blue icon):

     

    [Screenshot: image.png]

     

    And after stopping the array:

     

    [Screenshot: image.png]

     

    So I need to reset the pool every time I stop the array. This doesn't happen with the other NVMe devices, and if I go back to v6.9.2 this NVMe device's name goes back to how it was.

     

     

     

    tower1-diagnostics-20220423-1417.zip

     







    What is the output of

    ls -l /dev/disk/by-id

     

    The upcoming rc5 has a newer version of eudev, so it is worth checking there too.

     

    Link to comment

    I assume you just need this device:

     

    lrwxrwxrwx 1 root root 13 Apr 23 14:06 nvme-TOSHIBA-RD400__664S107XTPGV -> ../../nvme1n1
    lrwxrwxrwx 1 root root 15 Apr 23 14:06 nvme-TOSHIBA-RD400__664S107XTPGV-part1 -> ../../nvme1n1p1

     

    Link to comment
    4 minutes ago, JorgeB said:

    I assume you just need this device:

     

    Yes, this is the name created by eudev.

    I did a quick check on the rules but don't see any double underscore inserted, I am on "internal" rc5 though.

     

    Link to comment

    Would making the match fuzzier help or hurt? I get that it would be VERY BAD to get a false positive match, but is there anything you could do to make the false negatives less prevalent without introducing any false positives? Maybe parse the old device string for underscores, dashes, spaces, and only compare the alphanumeric strings between the separators? That way aaa - bbb and aaa__--__bbb would be detected as the same device?
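The separator-insensitive comparison suggested above could look roughly like the sketch below. It is only an illustration of the idea, not how Unraid matches devices; the function names `id_tokens` and `same_device` are hypothetical.

```python
import re


def id_tokens(device_id):
    """Split a device ID on any run of non-alphanumeric characters."""
    return [tok for tok in re.split(r"[^A-Za-z0-9]+", device_id) if tok]


def same_device(stored_id, current_id):
    """Treat two IDs as equal when their alphanumeric tokens match in order."""
    return id_tokens(stored_id) == id_tokens(current_id)


# Separators differ, tokens are identical: treated as the same device.
print(same_device("aaa - bbb", "aaa__--__bbb"))
print(same_device("TOSHIBA-RD400_664S107XTPGV", "TOSHIBA-RD400__664S107XTPGV"))
# A different serial still fails to match.
print(same_device("TOSHIBA-RD400_664S107XTPGV", "TOSHIBA-RD400_664S107XTPGW"))
```

Because the token boundaries are kept, IDs such as `ab_c` and `a_bc` still compare as different, which limits the false-positive risk raised in the question.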

    Link to comment
    9 hours ago, limetech said:

    Please retest with -rc5

    Same deal, I guess this is device related? Strange, since this is my oldest NVMe device and I've never had issues with it since around v6.4.

    Link to comment
    9 hours ago, JorgeB said:

    Same deal, I guess this is device related? Strange, since this is my oldest NVMe device and I've never had issues with it since around v6.4.

     

    Please post output of this command:

     

    /sbin/udevadm info -q property -n /dev/nvme1n1

    Link to comment
    3 minutes ago, limetech said:

    Please post output of this command:

    DEVLINKS=/dev/disk/by-id/nvme-TOSHIBA-RD400__664S107XTPGV /dev/disk/by-id/nvme-eui.e83a9702000018f5
    DEVNAME=/dev/nvme1n1
    DEVPATH=/devices/pci0000:64/0000:64:02.0/0000:66:00.0/nvme/nvme1/nvme1n1
    DEVTYPE=disk
    DISKSEQ=18
    ID_MODEL=TOSHIBA-RD400
    ID_PART_TABLE_TYPE=dos
    ID_SERIAL=TOSHIBA-RD400_        664S107XTPGV
    ID_SERIAL_SHORT=        664S107XTPGV
    ID_WWN=eui.e83a9702000018f5
    MAJOR=259
    MINOR=0
    SUBSYSTEM=block
    USEC_INITIALIZED=36861596

     

    Hmm, guess all those extra spaces before the serial are the problem?

    Link to comment
    37 minutes ago, JorgeB said:

    Hmm, guess all those extra spaces before the serial are the problem?

    Does cat  /sys/block/nvme1n1/device/serial provide the same result with spaces?

    Link to comment
    10 minutes ago, SimonF said:

    Does cat  /sys/block/nvme1n1/device/serial provide the same result with spaces?

     

    Yep:

    root@Tower1:~# cat  /sys/block/nvme1n1/device/serial
            664S107XTPGV

     

    Link to comment

    Kind of a pain, but if you want to revert to 6.9.2 and type same command we can see what's the difference.

     

    I've looked through quite a bit of eudev code and there is a function that does exactly this, i.e.:

    • it strips all spaces from the end of the string
    • it strips all spaces from the beginning of the string
    • it collapses each sequence of one or more internal spaces to a single underscore

    So if above was applied to

    ID_SERIAL=TOSHIBA-RD400_        664S107XTPGV

    you would get this exact result. But it looks like the udev rules do not use that string directly.  Anyway this is very odd.
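As a rough model of the three steps listed above (not the actual eudev C code; `replace_whitespace` and `by_id_name` are hypothetical names used only for this sketch), the effect on the ID_SERIAL string can be reproduced like this:

```python
import re


def replace_whitespace(value):
    """Strip leading/trailing whitespace, then turn each internal run into one '_'."""
    return re.sub(r"\s+", "_", value.strip())


def by_id_name(model, serial_short):
    """Assumed shape of the link name: 'nvme-' + cleaned '<model>_<serial>'."""
    return "nvme-" + replace_whitespace(model + "_" + serial_short)


model = "TOSHIBA-RD400"
padded_serial = " " * 8 + "664S107XTPGV"  # leading spaces as reported by the device

# The padding sits *inside* the combined string, so it becomes a second underscore.
print(by_id_name(model, padded_serial))          # nvme-TOSHIBA-RD400__664S107XTPGV
# If the serial is trimmed before concatenation, only one underscore remains.
print(by_id_name(model, padded_serial.strip()))  # nvme-TOSHIBA-RD400_664S107XTPGV
```

The point of the sketch is that the leading spaces of the serial are no longer "leading" once the model and an underscore are put in front of them, so the strip step never sees them.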

     

    Link to comment

    eudev rules concatenate {ID_MODEL}_{ID_SERIAL_SHORT} to create {ID_SERIAL} and the subsequent symlink.

     

    It looks like the leading spaces in the ID_SERIAL_SHORT field are not removed and are instead replaced by an underscore (eudev by default replaces spaces with an underscore in the symlink).
     

    # /sbin/udevadm info -q property -n /dev/nvme3n1
    DEVLINKS=/dev/disk/by-id/nvme-Samsung_SSD_970_EVO_1TB_S5H9NS0NB15476D /dev/disk/by-id/nvme-eui.0025385b01410415
    DEVNAME=/dev/nvme3n1
    DEVPATH=/devices/pci0000:00/0000:00:03.3/0000:07:00.0/nvme/nvme3/nvme3n1
    DEVTYPE=disk
    DISKSEQ=20
    ID_MODEL=Samsung SSD 970 EVO 1TB
    ID_PART_TABLE_TYPE=dos
    ID_SERIAL=Samsung SSD 970 EVO 1TB_S5H9NS0NB15476D
    ID_SERIAL_SHORT=S5H9NS0NB15476D
    ID_WWN=eui.0025385b01410415
    MAJOR=259
    MINOR=4
    SUBSYSTEM=block
    USEC_INITIALIZED=32096967
    

     

    Note the spaces in the ID_SERIAL fields, which are replaced by underscore in the symlink.

     

    I remember my old Micron SSD disks had the same issue with leading spaces in the name, but these were all removed.

    Looks like a later commit to eudev introduces this issue.

     

     

    Link to comment
    6 minutes ago, limetech said:

    Kind of a pain, but if you want to revert to 6.9.2 and type same command we can see what's the difference.

    I can do that tomorrow. To be honest it wouldn't be a big loss, since I expect to retire this SSD soon anyway: it's way past its predicted life, currently at 180% with >1PB written:
     

    -Percentage used         180%
    -Data units read         311,791,494 [159 TB]
    -Data units written      2,076,218,700 [1.06 PB]

     

    On the other hand I'm curious to see how much longer it will last and though it's not a very common model there might be other users with the same device, so if it can be fixed it's always better.

    Link to comment
    4 minutes ago, bonienl said:

    eudev by default replaces spaces by an underscore in the symlink

    Where does it do that?

     

    The rule in rules/60-persistent-storage.rules:

     

    ENV{ID_SERIAL}="$env{ID_MODEL}_$env{ID_SERIAL_SHORT}", SYMLINK+="disk/by-id/nvme-$env{ID_SERIAL}"

     

    Maybe the code that actually generates the symlink using SYMLINK does this?

    Link to comment

    The file udev-rules.c has the function "get_key" which retrieves the key. Inside this function leading spaces are/should be stripped.

    Quote

    /* skip whitespace */
    while (isspace(linepos[0]) || linepos[0] == ',')
        linepos++;

     

     

    I believe "isspace" comes from a standard library, but no clue which library. Perhaps a problem here?

     

    Quote

    Where does it do that?

    See this PR (it was made 5 years ago)

    https://github.com/eudev-project/eudev/commit/5c39ec9686eb737fc012e4553b820c3ff656d446

    Link to comment
    Quote

     

    after resetting the pool and starting the array the device mounts but it continues to show the device as "new"

     

     

    This happens because the internal code of Unraid creates the device ID based on the fields given by udevadm, but this no longer corresponds to the symlink (which now has a double underscore in it), hence it "thinks" the device doesn't exist and is new...

     

    Link to comment

    We can apply the same "trick" of removing leading and trailing underscores and collapsing all sequences of 2 or more internal underscores to a single underscore.  However - will this now cause other people problems?  That's the risk/question.

     

    Maybe another approach is to run both the stored device id and the udev id through a filter first that removes all spaces and underscores unconditionally and then compare those strings - "should work" right?
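The "filter both strings, then compare" approach could be sketched as follows. This is an assumption about how such a filter might look, not Unraid's implementation; `strip_separators` and `ids_match` are hypothetical names.

```python
def strip_separators(device_id):
    """Remove every space and underscore unconditionally."""
    return device_id.replace(" ", "").replace("_", "")


def ids_match(stored_id, udev_id):
    """Compare the stored ID and the udev ID after filtering both."""
    return strip_separators(stored_id) == strip_separators(udev_id)


stored = "TOSHIBA-RD400_664S107XTPGV"                  # as saved under v6.9.2
from_udev = "TOSHIBA-RD400_" + " " * 8 + "664S107XTPGV"  # ID_SERIAL under v6.10

print(ids_match(stored, from_udev))  # True
# Caveat: separator positions are lost, so these two also compare equal.
print(ids_match("ab_c", "a_bc"))     # True
```

The second call shows the trade-off: dropping separators entirely makes the match tolerant of padding changes, but two IDs that differ only in where the separator falls would collide. Whether that can occur with real model and serial strings is an open question.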

    Link to comment

    @limetech in the past you made an adjustment to combine ID and WWN, and we introduced a new setting to include WWN or not in the displayed name. I am not sure how much that will be impacted *if* we start sanitizing the ID name.

     

    Perhaps safer to look first where or what is the difference between 6.9.2 and rc5?

     

    Link to comment
    4 hours ago, limetech said:

    but if you want to revert to 6.9.2 and type same command we can see what's the difference.

     

    DEVLINKS=/dev/disk/by-id/nvme-TOSHIBA-RD400_664S107XTPGV
    DEVNAME=/dev/nvme1n1
    DEVPATH=/devices/pci0000:64/0000:64:02.0/0000:66:00.0/nvme/nvme1/nvme1n1
    DEVTYPE=disk
    ID_MODEL=TOSHIBA-RD400
    ID_MODEL_ENC=TOSHIBA-RD400\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20
    ID_PART_TABLE_TYPE=dos
    ID_REVISION=57CZ4102
    ID_SERIAL=TOSHIBA-RD400_664S107XTPGV
    ID_SERIAL_SHORT=664S107XTPGV
    ID_TYPE=nvme
    MAJOR=259
    MINOR=0
    SUBSYSTEM=block
    USEC_INITIALIZED=27426685

     

    Re-posting output from v6.10 below so it's easier to compare without scrolling up:

     

    DEVLINKS=/dev/disk/by-id/nvme-TOSHIBA-RD400__664S107XTPGV /dev/disk/by-id/nvme-eui.e83a9702000018f5
    DEVNAME=/dev/nvme1n1
    DEVPATH=/devices/pci0000:64/0000:64:02.0/0000:66:00.0/nvme/nvme1/nvme1n1
    DEVTYPE=disk
    DISKSEQ=18
    ID_MODEL=TOSHIBA-RD400
    ID_PART_TABLE_TYPE=dos
    ID_SERIAL=TOSHIBA-RD400_        664S107XTPGV
    ID_SERIAL_SHORT=        664S107XTPGV
    ID_WWN=eui.e83a9702000018f5
    MAJOR=259
    MINOR=0
    SUBSYSTEM=block
    USEC_INITIALIZED=36829040
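For comparing two such property dumps without reading them line by line, a small sketch like the one below can help. It assumes the `KEY=VALUE` format shown in the outputs above and deliberately keeps whitespace in the values, since that is exactly what differs here; `parse_properties` and `changed_keys` are hypothetical helper names, and only three of the properties are included to keep the sample short.

```python
def parse_properties(text):
    """Parse KEY=VALUE lines; values are kept verbatim, including leading spaces."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value
    return props


def changed_keys(old, new):
    """Return {key: (old_value, new_value)} for keys present in both but different."""
    return {k: (old[k], new[k]) for k in old.keys() & new.keys() if old[k] != new[k]}


PAD = " " * 8  # leading spaces in the serial, as seen in the v6.10 output

V692 = "ID_MODEL=TOSHIBA-RD400\nID_SERIAL=TOSHIBA-RD400_664S107XTPGV\nID_SERIAL_SHORT=664S107XTPGV\n"
V610 = ("ID_MODEL=TOSHIBA-RD400\n"
        "ID_SERIAL=TOSHIBA-RD400_" + PAD + "664S107XTPGV\n"
        "ID_SERIAL_SHORT=" + PAD + "664S107XTPGV\n")

diff = changed_keys(parse_properties(V692), parse_properties(V610))
for key in sorted(diff):
    # repr() makes the otherwise invisible padding visible
    print(key, repr(diff[key][0]), "->", repr(diff[key][1]))
```

Printing with `repr()` makes the padding obvious, which is easy to miss when the two dumps are read side by side.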

     

     

     

     

    Link to comment
    7 minutes ago, bonienl said:

    What is the eudev version in 6.9.2?

     

    -rw-r--r-- 1 root root 0 Apr  7  2021 /var/log/packages/eudev-3.2.5-x86_64-2_LT

     

    Link to comment

    Going to close this, since the name change is apparently a udev issue, and the main problem (the pool config not being saved) is a different bug. I've now saved the new config and don't really care that the device name includes an extra underscore.

    Link to comment

    I've looked at about as much eudev/kernel code as I have appetite for ...

     

    The source of the problem is this:

    On 4/27/2022 at 10:30 AM, JorgeB said:
    root@Tower1:~# cat  /sys/block/nvme1n1/device/serial
            664S107XTPGV

     

    In constructing the /dev/disk/by-id symlink, eudev forms:

    "model" + "_" + "serial"

     

    Then there is additional code in eudev that looks at the overall symlink string and collapses any internal white space to a single "_" character (this was to fix a different bug).  This is why you see two underscores.

     

    I can see in the nvme device driver where trailing white space is removed, but nothing about leading white space.  Sometime between Linux kernel 5.10.x and 5.15.x there was an nvme driver change that stopped trimming leading white space in a serial number.  Probably this is also device-specific, meaning most nvme devices don't have leading white space, and no one noticed the newly introduced bug except for resident storage guru @JorgeB :)

     

    Here's what I'm going to do about this:  In the next release, -rc8, I'm going to look at the './by-id' string and collapse multiple underscores into a single underscore.  This way anyone currently running an older version of Unraid OS with an nvme device that has leading spaces in its serial string will continue to be identified properly.  However, for you, @JorgeB, it means you'll have to edit your cfg files to remove that extra underscore...
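The planned normalization can be pictured with the minimal sketch below. It is an illustration of "collapse multiple underscores into a single underscore" only; the actual -rc8 code is not shown in this thread, and `collapse_underscores` is a hypothetical name.

```python
import re


def collapse_underscores(by_id):
    """Replace every run of two or more underscores with a single underscore."""
    return re.sub(r"_{2,}", "_", by_id)


# The name produced under v6.10 maps back to the name known from v6.9.2.
print(collapse_underscores("nvme-TOSHIBA-RD400__664S107XTPGV"))
# Names that only ever had single underscores are left unchanged.
print(collapse_underscores("nvme-Samsung_SSD_970_EVO_1TB_S5H9NS0NB15476D"))
```

Since existing single-underscore names pass through untouched, only configurations that were already saved with the doubled underscore would need the manual cfg edit mentioned above.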

    Link to comment




