• Zpool problems after upgrading to 6.12.9


    blue8lucian
    • Minor



    User Feedback

    Recommended Comments

    This looks like a controller/kernel issue. Some of these controllers, despite having only 6 SATA ports, list many more in the kernel. For example, from your ASMedia ASM1166 controller on v6.12.8, it lists 30 working ports and just 2 dummies:

     

    Mar 27 18:18:42 MedHP kernel: ata1: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780100 irq 59
    Mar 27 18:18:42 MedHP kernel: ata2: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780180 irq 59
    Mar 27 18:18:42 MedHP kernel: ata3: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780200 irq 59
    Mar 27 18:18:42 MedHP kernel: ata4: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780280 irq 59
    Mar 27 18:18:42 MedHP kernel: ata5: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780300 irq 59
    Mar 27 18:18:42 MedHP kernel: ata6: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780380 irq 59
    Mar 27 18:18:42 MedHP kernel: ata7: DUMMY
    Mar 27 18:18:42 MedHP kernel: ata8: DUMMY
    Mar 27 18:18:42 MedHP kernel: ata9: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780500 irq 59
    Mar 27 18:18:42 MedHP kernel: ata10: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780580 irq 59
    Mar 27 18:18:42 MedHP kernel: ata11: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780600 irq 59
    Mar 27 18:18:42 MedHP kernel: ata12: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780680 irq 59
    Mar 27 18:18:42 MedHP kernel: ata13: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780700 irq 59
    Mar 27 18:18:42 MedHP kernel: ata14: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780780 irq 59
    Mar 27 18:18:42 MedHP kernel: ata15: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780800 irq 59
    Mar 27 18:18:42 MedHP kernel: ata16: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780880 irq 59
    Mar 27 18:18:42 MedHP kernel: ata17: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780900 irq 59
    Mar 27 18:18:42 MedHP kernel: ata18: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780980 irq 59
    Mar 27 18:18:42 MedHP kernel: ata19: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780a00 irq 59
    Mar 27 18:18:42 MedHP kernel: ata20: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780a80 irq 59
    Mar 27 18:18:42 MedHP kernel: ata21: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780b00 irq 59
    Mar 27 18:18:42 MedHP kernel: ata22: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780b80 irq 59
    Mar 27 18:18:42 MedHP kernel: ata23: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780c00 irq 59
    Mar 27 18:18:42 MedHP kernel: ata24: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780c80 irq 59
    Mar 27 18:18:42 MedHP kernel: ata25: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780d00 irq 59
    Mar 27 18:18:42 MedHP kernel: ata26: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780d80 irq 59
    Mar 27 18:18:42 MedHP kernel: ata27: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780e00 irq 59
    Mar 27 18:18:42 MedHP kernel: ata28: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780e80 irq 59
    Mar 27 18:18:42 MedHP kernel: ata29: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780f00 irq 59
    Mar 27 18:18:42 MedHP kernel: ata30: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780f80 irq 59
    Mar 27 18:18:42 MedHP kernel: ata31: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc781000 irq 59
    Mar 27 18:18:42 MedHP kernel: ata32: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc781080 irq 59

     

    With 6.12.9 it looks like the kernel is enforcing just 6 ports, so all ports after ata6 are treated as dummy ports:


     

    Mar 27 18:13:36 MedHP kernel: ahci 0000:04:00.0: ASM1166 has only six ports
    
    Mar 27 18:13:36 MedHP kernel: ata1: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780100 irq 62
    Mar 27 18:13:36 MedHP kernel: ata2: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780180 irq 62
    Mar 27 18:13:36 MedHP kernel: ata3: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780200 irq 62
    Mar 27 18:13:36 MedHP kernel: ata4: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780280 irq 62
    Mar 27 18:13:36 MedHP kernel: ata5: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780300 irq 62
    Mar 27 18:13:36 MedHP kernel: ata6: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780380 irq 62
    Mar 27 18:13:36 MedHP kernel: ata7: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata8: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata9: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata10: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata11: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata12: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata13: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata14: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata15: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata16: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata17: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata18: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata19: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata20: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata21: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata22: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata23: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata24: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata25: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata26: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata27: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata28: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata29: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata30: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata31: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata32: DUMMY

     

    But four of your drives are on ports that are now dummies: ata29, ata30, ata31 and ata32.

     

    So technically I think this is more of a controller firmware issue: it should only advertise the actual 6 ports, not 30. Still, the kernel should be able to use them, either via a quirk or by reverting this recent change, since I suspect a lot of users will be affected. A quick way to compare how many ports the kernel brings up is sketched below.
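    This is only a sketch (it assumes the same dmesg line format as the logs above, and the counts will also include ports from other controllers in the system):

    # Count ports the kernel enumerated as usable SATA links ("ataN: SATA max UDMA/133 ...")
    dmesg | grep -c 'ata[0-9]*: SATA max'
    # Count ports the kernel masked off as dummies ("ataN: DUMMY")
    dmesg | grep -c 'ata[0-9]*: DUMMY'

    On 6.12.8 this controller contributes 30 to the first count; on 6.12.9 that drops to 6, matching the logs above.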

    Link to comment
    9 hours ago, JorgeB said:

    @blue8lucian is this a 6-port controller or does it have more ports, i.e., does it have a SATA port multiplier?

    Yes, I have a SATA port multiplier, but I am in the process of replacing it.

    Link to comment

    I experienced this/similar behaviour after upgrading to 6.12.9 as well.
    After upgrading: 3 drives in my pool showed "unavailable" (pretty sure it's the three on logical drives above 32), so the pool won't import and calamity ensues... 

    Downgraded to 6.12.8: it instantly works properly again. 

    FWIW, I'm running a hybrid ZFS pool (left over from my old bare-metal Ubuntu setup; it works fine with no issues other than the UI and a forced "zpool import" on boot).

    And here are the details for the 10-port SATA controller (with 9 drives connected), output from "lshw -class storage -class disk":

    
      *-sata                    
           description: SATA controller
           product: ASM1166 Serial ATA Controller
           vendor: ASMedia Technology Inc.
           physical id: 0
           bus info: pci@0000:01:00.0
           logical name: scsi5
           logical name: scsi6
           logical name: scsi7
           logical name: scsi8
           logical name: scsi9
           logical name: scsi10
           logical name: scsi34
           logical name: scsi35
           logical name: scsi36
           version: 02
           width: 32 bits
           clock: 33MHz
           capabilities: sata pm msi pciexpress ahci_1.0 bus_master cap_list rom emulated
           configuration: driver=ahci latency=0
           resources: irq:129 memory:d1182000-d1183fff memory:d1180000-d1181fff memory:d1100000-d117ffff

     

    Link to comment
    4 hours ago, rogueGeer said:

    I experienced this/similar behaviour also after upgrading to 6.12.9.

    This issue will happen with any ASMedia 1064 controller with more than 4 ports, or any ASMedia 1166 controller with more than 6 ports, i.e., controllers that use SATA port multipliers; any devices on ports behind a PM will not be detected. This is a kernel issue; the change has already been reverted upstream, but the revert didn't make it into kernel 6.1.x, so LT will possibly need to apply the patches themselves. A quick way to check whether a controller is affected is sketched below.
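    This is only a sketch (it assumes ASMedia's PCI vendor ID 1b21 and the new 6.12.9 log message shown earlier in this thread):

    # List ASMedia controllers by PCI vendor ID (1b21 = ASMedia Technology Inc.)
    lspci -nn -d 1b21:
    # On 6.12.9, an affected ASM1166 logs this line and masks everything past port 6
    dmesg | grep 'ASM1166 has only six ports'

    If the second command returns that line and you have drives connected behind a port multiplier, those drives will be missing.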

    • Thanks 1
    Link to comment

    I think I have the same or a similar issue. My larger zpool, with 18 HDDs and 2 NVMe drives, does not mount automatically after a restart and I have to mount it manually. Once imported, the pool operates normally. The pool is on a NetApp DS2448 24-disk shelf via an LSI SAS 9201-16e.

    But the Main page indicates the pool is "Unmountable: Unsupported or no file system found" for all 18 drives. I manually added a special vdev, a mirrored pair of 1TB NVMe drives, to this pool for metadata and small files to speed up some operations, which may be a complicating factor; a sketch of how such a vdev is typically added is below.
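    For reference only, a special vdev like that is normally added along these lines (an illustrative sketch; the pool name and device names match the zpool status posted further down, but whether whole-disk or partition paths are the right ones for this system is an assumption):

    # Add a mirrored pair of NVMe devices as a special (metadata) vdev
    zpool add storage special mirror /dev/nvme1n1 /dev/nvme2n1
    # Optionally also route blocks smaller than 64K to the special vdev
    zfs set special_small_blocks=64K storage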

     

    Just in case, I have uploaded the diagnostics file:

    unraid-diagnostics-20240401-1712.zip

    Link to comment
    35 minutes ago, Eric Hollebone said:

    Think I have the same or similar issue.

    It can't be the same issue if all the devices are detected. For some reason your syslog is incomplete, but I suspect I know what the problem is; please post a screenshot of Main showing the pool assignments.

    Link to comment
    1 hour ago, ambipro said:

    I have an ASM1166 and haven't had any issues with this update.

    There's a problem only if the controller has more than 6 ports, i.e., if it uses a SATA port multiplier, and only the ports behind the PM won't work.

    Link to comment

    I am experiencing the same issue after upgrading to 6.12.9.

     

    It appears related to this Linux kernel commit:

       https://github.com/torvalds/linux/commit/9815e39617541ef52d0dfac4be274ad378c6dc09

     

    The commit attempts to correct over-enumeration of SATA ports, but it does not account for the port multipliers that are common on a large number of SATA cards.

     

    A revert has been merged into the stable kernel tree, but it looks like Unraid is still using an affected kernel build. See: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=2f3c2b39768d2c0ccf0f6712a4b27453674e5de7

     

    If an Unraid developer sees this, please investigate merging the patch into your kernel build and releasing an update. This is a critical failure for many people. Thanks!

     

     

     

    Edited by mojojojo
    Link to comment

    @JorgeB Here is the screenshot of Main. This was taken after I remounted the zpool manually; if you need a picture from before, please let me know.

    601432978_unriad-6_12.10-zpoolnotmounted.thumb.png.35e6979cc72c5edfcc16201bb78e3849.png

     

    And here is the zpool status, if it helps:
     

    Quote

    root@renfrew-unraid:~# zpool status
      pool: storage
     state: ONLINE
      scan: scrub repaired 0B in 03:35:24 with 0 errors on Sun Mar 24 02:35:25 2024
    config:

            NAME         STATE     READ WRITE CKSUM
            storage      ONLINE       0     0     0
              raidz1-0   ONLINE       0     0     0
                sdi1     ONLINE       0     0     0
                sdj1     ONLINE       0     0     0
                sdk1     ONLINE       0     0     0
              raidz1-1   ONLINE       0     0     0
                sdl1     ONLINE       0     0     0
                sdm1     ONLINE       0     0     0
                sdn1     ONLINE       0     0     0
              raidz1-2   ONLINE       0     0     0
                sdo1     ONLINE       0     0     0
                sdp1     ONLINE       0     0     0
                sdq1     ONLINE       0     0     0
              raidz1-3   ONLINE       0     0     0
                sdr1     ONLINE       0     0     0
                sds1     ONLINE       0     0     0
                sdt1     ONLINE       0     0     0
              raidz1-4   ONLINE       0     0     0
                sdu1     ONLINE       0     0     0
                sdv1     ONLINE       0     0     0
                sdw1     ONLINE       0     0     0
              raidz1-5   ONLINE       0     0     0
                sdx1     ONLINE       0     0     0
                sdy1     ONLINE       0     0     0
                sdz1     ONLINE       0     0     0
            special
              mirror-6   ONLINE       0     0     0
                nvme1n1  ONLINE       0     0     0
                nvme2n1  ONLINE       0     0     0

    errors: No known data errors

      pool: vm-storage
     state: ONLINE
      scan: scrub repaired 0B in 01:51:45 with 0 errors on Fri Mar 29 01:11:46 2024
    config:

            NAME                                         STATE     READ WRITE CKSUM
            vm-storage                                   ONLINE       0     0     0
              raidz1-0                                   ONLINE       0     0     0
                sdaa1                                    ONLINE       0     0     0
                sdab1                                    ONLINE       0     0     0
                sdac1                                    ONLINE       0     0     0
                sdad1                                    ONLINE       0     0     0
            logs
              nvme-INTEL_MEMPEK1J032GA_PHBT80930079032P  ONLINE       0     0     0

    errors: No known data errors

     

    Edited by Eric Hollebone
    Link to comment

    Update #1 - I was watching the zpool mount during array start and captured the following from the system log:

     

    Quote

    Apr  4 16:20:03 renfrew-unraid emhttpd: mounting /mnt/storage
    Apr  4 16:20:03 renfrew-unraid emhttpd: shcmd (60391): mkdir -p /mnt/storage
    Apr  4 16:20:03 renfrew-unraid emhttpd: shcmd (60392): /usr/sbin/zpool import -f -N -o autoexpand=on  -d /dev/sdi1 -d /dev/sdj1 -d /dev/sdk1 -d /dev/sdl1 -d /dev/sdm1 -d /dev/sdn1 -d /dev/sdo1 -d /dev/sdp1 -d /dev/sdq1 -d /dev/sdr1 -d /dev/sds1 -d /dev/sdt1 -d /dev/sdu1 -d /dev/sdv1 -d /dev/sdw1 -d /dev/sdx1 -d /dev/sdy1 -d /dev/sdz1 2940375383946697912 storage
    Apr  4 16:20:03 renfrew-unraid root: cannot import 'storage' as 'storage': I/O error
    Apr  4 16:20:03 renfrew-unraid root: #011Destroy and re-create the pool from
    Apr  4 16:20:03 renfrew-unraid root: #011a backup source.
    Apr  4 16:20:03 renfrew-unraid emhttpd: shcmd (60392): exit status: 1
    Apr  4 16:20:03 renfrew-unraid emhttpd: storage: import error
    Apr  4 16:20:03 renfrew-unraid emhttpd: /mnt/storage mount error: Unsupported or no file system
    Apr  4 16:20:03 renfrew-unraid emhttpd: shcmd (60393): rmdir /mnt/storage

     

    But after the array starts, I can mount the pool and have it function correctly by using:

    zpool import storage

    What seems to be missing from the scripted import command is the special vdev for the metadata mirror; a sketch of what an import including it might look like is below.
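    For illustration only (this repeats the command emhttpd ran above and adds the two special-vdev members; whether whole-disk or partition paths are the correct ones to pass for the NVMe devices is an assumption):

    /usr/sbin/zpool import -f -N -o autoexpand=on \
      -d /dev/sdi1 -d /dev/sdj1 -d /dev/sdk1 -d /dev/sdl1 -d /dev/sdm1 -d /dev/sdn1 \
      -d /dev/sdo1 -d /dev/sdp1 -d /dev/sdq1 -d /dev/sdr1 -d /dev/sds1 -d /dev/sdt1 \
      -d /dev/sdu1 -d /dev/sdv1 -d /dev/sdw1 -d /dev/sdx1 -d /dev/sdy1 -d /dev/sdz1 \
      -d /dev/nvme1n1 -d /dev/nvme2n1 \
      2940375383946697912 storage

    Without the special vdev devices in the -d list, ZFS cannot assemble the complete pool, which could explain the "I/O error" on the scripted import.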

     

    I'm thinking my posts in this topic should probably be moved to a new thread, since they don't relate to the original issue, and be classified as an enhancement request to support special vdevs.
     

    Edited by Eric Hollebone
    Link to comment

