• Zpool problems after upgrading to 6.12.9


    blue8lucian
    • Minor



    User Feedback

    Recommended Comments

    This looks like a controller/kernel issue. Some of these controllers, despite having only 6 SATA ports, list many more in the kernel. For example, from your ASMedia ASM1166 controller on v6.12.8, it lists 30 working ports and just 2 dummies:

     

    Mar 27 18:18:42 MedHP kernel: ata1: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780100 irq 59
    Mar 27 18:18:42 MedHP kernel: ata2: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780180 irq 59
    Mar 27 18:18:42 MedHP kernel: ata3: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780200 irq 59
    Mar 27 18:18:42 MedHP kernel: ata4: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780280 irq 59
    Mar 27 18:18:42 MedHP kernel: ata5: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780300 irq 59
    Mar 27 18:18:42 MedHP kernel: ata6: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780380 irq 59
    Mar 27 18:18:42 MedHP kernel: ata7: DUMMY
    Mar 27 18:18:42 MedHP kernel: ata8: DUMMY
    Mar 27 18:18:42 MedHP kernel: ata9: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780500 irq 59
    Mar 27 18:18:42 MedHP kernel: ata10: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780580 irq 59
    Mar 27 18:18:42 MedHP kernel: ata11: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780600 irq 59
    Mar 27 18:18:42 MedHP kernel: ata12: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780680 irq 59
    Mar 27 18:18:42 MedHP kernel: ata13: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780700 irq 59
    Mar 27 18:18:42 MedHP kernel: ata14: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780780 irq 59
    Mar 27 18:18:42 MedHP kernel: ata15: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780800 irq 59
    Mar 27 18:18:42 MedHP kernel: ata16: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780880 irq 59
    Mar 27 18:18:42 MedHP kernel: ata17: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780900 irq 59
    Mar 27 18:18:42 MedHP kernel: ata18: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780980 irq 59
    Mar 27 18:18:42 MedHP kernel: ata19: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780a00 irq 59
    Mar 27 18:18:42 MedHP kernel: ata20: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780a80 irq 59
    Mar 27 18:18:42 MedHP kernel: ata21: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780b00 irq 59
    Mar 27 18:18:42 MedHP kernel: ata22: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780b80 irq 59
    Mar 27 18:18:42 MedHP kernel: ata23: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780c00 irq 59
    Mar 27 18:18:42 MedHP kernel: ata24: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780c80 irq 59
    Mar 27 18:18:42 MedHP kernel: ata25: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780d00 irq 59
    Mar 27 18:18:42 MedHP kernel: ata26: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780d80 irq 59
    Mar 27 18:18:42 MedHP kernel: ata27: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780e00 irq 59
    Mar 27 18:18:42 MedHP kernel: ata28: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780e80 irq 59
    Mar 27 18:18:42 MedHP kernel: ata29: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780f00 irq 59
    Mar 27 18:18:42 MedHP kernel: ata30: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780f80 irq 59
    Mar 27 18:18:42 MedHP kernel: ata31: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc781000 irq 59
    Mar 27 18:18:42 MedHP kernel: ata32: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc781080 irq 59

     

    With 6.12.9 it looks like the kernel is enforcing just 6 ports, so all ports after ata6 are treated as dummy ports:


     

    Mar 27 18:13:36 MedHP kernel: ahci 0000:04:00.0: ASM1166 has only six ports
    
    Mar 27 18:13:36 MedHP kernel: ata1: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780100 irq 62
    Mar 27 18:13:36 MedHP kernel: ata2: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780180 irq 62
    Mar 27 18:13:36 MedHP kernel: ata3: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780200 irq 62
    Mar 27 18:13:36 MedHP kernel: ata4: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780280 irq 62
    Mar 27 18:13:36 MedHP kernel: ata5: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780300 irq 62
    Mar 27 18:13:36 MedHP kernel: ata6: SATA max UDMA/133 abar m8192@0xfc780000 port 0xfc780380 irq 62
    Mar 27 18:13:36 MedHP kernel: ata7: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata8: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata9: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata10: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata11: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata12: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata13: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata14: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata15: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata16: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata17: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata18: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata19: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata20: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata21: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata22: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata23: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata24: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata25: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata26: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata27: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata28: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata29: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata30: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata31: DUMMY
    Mar 27 18:13:36 MedHP kernel: ata32: DUMMY

     

    But four of your drives are on ports that are now dummies: ata29, ata30, ata31 and ata32.

     

    So technically I think this is more of a controller firmware issue: it should only advertise the actual 6 ports, not 30. Still, the kernel should be able to use them, either via a quirk or by reverting this recent change, since I suspect a lot of users will be affected. A quick way to compare how many ports the kernel brings up is sketched below.
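    This is only a sketch (it assumes the same dmesg line format as the logs above, and the counts will also include ports from other controllers in the system):

    # Count ports the kernel enumerated as usable SATA links ("ataN: SATA max UDMA/133 ...")
    dmesg | grep -c 'ata[0-9]*: SATA max'
    # Count ports the kernel masked off as dummies ("ataN: DUMMY")
    dmesg | grep -c 'ata[0-9]*: DUMMY'

    On 6.12.8 this controller contributes 30 to the first count; on 6.12.9 that drops to 6, matching the logs above.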

    Link to comment
    9 hours ago, JorgeB said:

    @blue8lucian is this a 6-port controller or does it have more ports, i.e., does it have a SATA port multiplier?

    Yes, I have a SATA port multiplier, but I am in the process of replacing it.

    Link to comment

    I experienced this/similar behaviour after upgrading to 6.12.9 as well.
    After upgrading: 3 drives in my pool showed "unavailable" (pretty sure it's the three on logical drives above 32), so the pool won't import and calamity ensues... 

    Downgraded to 6.12.8: it instantly works properly again. 

    FWIW, I'm running a hybrid ZFS pool (left over from my old bare-metal Ubuntu setup; it works fine with no issues other than the UI and a forced "zpool import" on boot).

    And here are the details for the 10-port SATA controller (with 9 drives connected), output from "lshw -class storage -class disk":

    
      *-sata                    
           description: SATA controller
           product: ASM1166 Serial ATA Controller
           vendor: ASMedia Technology Inc.
           physical id: 0
           bus info: pci@0000:01:00.0
           logical name: scsi5
           logical name: scsi6
           logical name: scsi7
           logical name: scsi8
           logical name: scsi9
           logical name: scsi10
           logical name: scsi34
           logical name: scsi35
           logical name: scsi36
           version: 02
           width: 32 bits
           clock: 33MHz
           capabilities: sata pm msi pciexpress ahci_1.0 bus_master cap_list rom emulated
           configuration: driver=ahci latency=0
           resources: irq:129 memory:d1182000-d1183fff memory:d1180000-d1181fff memory:d1100000-d117ffff

     

    Link to comment
    4 hours ago, rogueGeer said:

    I experienced this/similar behaviour also after upgrading to 6.12.9.

    This issue will happen with any ASMedia 1064 controller with more than 4 ports, or any ASMedia 1166 controller with more than 6 ports, i.e., controllers that use SATA port multipliers; any devices on ports behind a PM will not be detected. This is a kernel issue; the change has already been reverted upstream, but the revert didn't make it into kernel 6.1.x, so LT will possibly need to apply the patches themselves. A quick way to check whether a controller is affected is sketched below.
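    This is only a sketch (it assumes ASMedia's PCI vendor ID 1b21 and the new 6.12.9 log message shown earlier in this thread):

    # List ASMedia controllers by PCI vendor ID (1b21 = ASMedia Technology Inc.)
    lspci -nn -d 1b21:
    # On 6.12.9, an affected ASM1166 logs this line and masks everything past port 6
    dmesg | grep 'ASM1166 has only six ports'

    If the second command returns that line and you have drives connected behind a port multiplier, those drives will be missing.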

    • Thanks 1
    Link to comment

    I think I have the same or a similar issue. My larger zpool, with 18 HDDs and 2 NVMe drives, does not mount automatically after a restart and I have to mount it manually. Once imported, the pool operates normally. The pool is on a NetApp DS2448 24-disk shelf via an LSI SAS 9201-16e.

    But the Main page indicates the pool is "Unmountable: Unsupported or no file system found" for all 18 drives. I manually added a special vdev, a mirrored pair of 1TB NVMe drives, to this pool for metadata and small files to speed up some operations, which may be a complicating factor; a sketch of how such a vdev is typically added is below.
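    For reference only, a special vdev like that is normally added along these lines (an illustrative sketch; the pool name and device names match the zpool status posted further down, but whether whole-disk or partition paths are the right ones for this system is an assumption):

    # Add a mirrored pair of NVMe devices as a special (metadata) vdev
    zpool add storage special mirror /dev/nvme1n1 /dev/nvme2n1
    # Optionally also route blocks smaller than 64K to the special vdev
    zfs set special_small_blocks=64K storage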

     

    Just in case, I have uploaded the diagnostics file:

    unraid-diagnostics-20240401-1712.zip

    Link to comment
    35 minutes ago, Eric Hollebone said:

    Think I have the same or similar issue.

    It can't be the same issue if all the devices are detected. For some reason your syslog is incomplete, but I suspect I know what the problem is; please post a screenshot of Main showing the pool assignments.

    Link to comment
    1 hour ago, ambipro said:

    I have an ASM1166 and haven't had any issues with this update.

    There's a problem only if the controller has more than 6 ports, i.e., if it uses a SATA port multiplier, and only the ports behind the PM won't work.

    Link to comment

    I am experiencing the same issue after upgrading to 6.12.9.

     

    It appears related to this Linux kernel commit:

       https://github.com/torvalds/linux/commit/9815e39617541ef52d0dfac4be274ad378c6dc09

     

    The commit attempts to correct over-enumeration of SATA ports, but it does not account for the port multipliers that are common on a large number of SATA cards.

     

    A revert has been merged into the stable kernel tree, but it looks like Unraid is still using an affected kernel build. See: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=2f3c2b39768d2c0ccf0f6712a4b27453674e5de7

     

    If an Unraid developer sees this, please investigate merging the patch into your kernel build and releasing an update. This is a critical failure for many people. Thanks!

     

     

     

    Edited by mojojojo
    Link to comment

    @JorgeB Here is the screenshot of Main. This was taken after I remounted the zpool manually; if you need a picture from before, please let me know.

    601432978_unriad-6_12.10-zpoolnotmounted.thumb.png.35e6979cc72c5edfcc16201bb78e3849.png

     

    And here is the zpool status, if it helps:
     

    Quote

    root@renfrew-unraid:~# zpool status
      pool: storage
     state: ONLINE
      scan: scrub repaired 0B in 03:35:24 with 0 errors on Sun Mar 24 02:35:25 2024
    config:

            NAME         STATE     READ WRITE CKSUM
            storage      ONLINE       0     0     0
              raidz1-0   ONLINE       0     0     0
                sdi1     ONLINE       0     0     0
                sdj1     ONLINE       0     0     0
                sdk1     ONLINE       0     0     0
              raidz1-1   ONLINE       0     0     0
                sdl1     ONLINE       0     0     0
                sdm1     ONLINE       0     0     0
                sdn1     ONLINE       0     0     0
              raidz1-2   ONLINE       0     0     0
                sdo1     ONLINE       0     0     0
                sdp1     ONLINE       0     0     0
                sdq1     ONLINE       0     0     0
              raidz1-3   ONLINE       0     0     0
                sdr1     ONLINE       0     0     0
                sds1     ONLINE       0     0     0
                sdt1     ONLINE       0     0     0
              raidz1-4   ONLINE       0     0     0
                sdu1     ONLINE       0     0     0
                sdv1     ONLINE       0     0     0
                sdw1     ONLINE       0     0     0
              raidz1-5   ONLINE       0     0     0
                sdx1     ONLINE       0     0     0
                sdy1     ONLINE       0     0     0
                sdz1     ONLINE       0     0     0
            special
              mirror-6   ONLINE       0     0     0
                nvme1n1  ONLINE       0     0     0
                nvme2n1  ONLINE       0     0     0

    errors: No known data errors

      pool: vm-storage
     state: ONLINE
      scan: scrub repaired 0B in 01:51:45 with 0 errors on Fri Mar 29 01:11:46 2024
    config:

            NAME                                         STATE     READ WRITE CKSUM
            vm-storage                                   ONLINE       0     0     0
              raidz1-0                                   ONLINE       0     0     0
                sdaa1                                    ONLINE       0     0     0
                sdab1                                    ONLINE       0     0     0
                sdac1                                    ONLINE       0     0     0
                sdad1                                    ONLINE       0     0     0
            logs
              nvme-INTEL_MEMPEK1J032GA_PHBT80930079032P  ONLINE       0     0     0

    errors: No known data errors

     

    Edited by Eric Hollebone
    Link to comment

    Update #1 - I was watching the zpool mount during array start and captured the following from the system log:

     

    Quote

    Apr  4 16:20:03 renfrew-unraid emhttpd: mounting /mnt/storage
    Apr  4 16:20:03 renfrew-unraid emhttpd: shcmd (60391): mkdir -p /mnt/storage
    Apr  4 16:20:03 renfrew-unraid emhttpd: shcmd (60392): /usr/sbin/zpool import -f -N -o autoexpand=on  -d /dev/sdi1 -d /dev/sdj1 -d /dev/sdk1 -d /dev/sdl1 -d /dev/sdm1 -d /dev/sdn1 -d /dev/sdo1 -d /dev/sdp1 -d /dev/sdq1 -d /dev/sdr1 -d /dev/sds1 -d /dev/sdt1 -d /dev/sdu1 -d /dev/sdv1 -d /dev/sdw1 -d /dev/sdx1 -d /dev/sdy1 -d /dev/sdz1 2940375383946697912 storage
    Apr  4 16:20:03 renfrew-unraid root: cannot import 'storage' as 'storage': I/O error
    Apr  4 16:20:03 renfrew-unraid root: #011Destroy and re-create the pool from
    Apr  4 16:20:03 renfrew-unraid root: #011a backup source.
    Apr  4 16:20:03 renfrew-unraid emhttpd: shcmd (60392): exit status: 1
    Apr  4 16:20:03 renfrew-unraid emhttpd: storage: import error
    Apr  4 16:20:03 renfrew-unraid emhttpd: /mnt/storage mount error: Unsupported or no file system
    Apr  4 16:20:03 renfrew-unraid emhttpd: shcmd (60393): rmdir /mnt/storage

     

    But after the array starts, I can mount the pool and have it function correctly by using:

    zpool import storage

    What seems to be missing from the scripted import command is the special vdev for the metadata mirror; a sketch of what an import including it might look like is below.
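    For illustration only (this repeats the command emhttpd ran above and adds the two special-vdev members; whether whole-disk or partition paths are the correct ones to pass for the NVMe devices is an assumption):

    /usr/sbin/zpool import -f -N -o autoexpand=on \
      -d /dev/sdi1 -d /dev/sdj1 -d /dev/sdk1 -d /dev/sdl1 -d /dev/sdm1 -d /dev/sdn1 \
      -d /dev/sdo1 -d /dev/sdp1 -d /dev/sdq1 -d /dev/sdr1 -d /dev/sds1 -d /dev/sdt1 \
      -d /dev/sdu1 -d /dev/sdv1 -d /dev/sdw1 -d /dev/sdx1 -d /dev/sdy1 -d /dev/sdz1 \
      -d /dev/nvme1n1 -d /dev/nvme2n1 \
      2940375383946697912 storage

    Without the special vdev devices in the -d list, ZFS cannot assemble the complete pool, which could explain the "I/O error" on the scripted import.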

     

    I'm thinking my posts in this topic should probably be moved to a new thread, since they don't relate to the original issue, and be classified as an enhancement request to support special vdevs.
     

    Edited by Eric Hollebone
    Link to comment

