July 15, 2025

I am at my wits' end with my raidz2 array. I have 6x8TB HDDs attached to a SUPERMICRO SAS825TQ backplane, which is connected to my LSI 9500-8i.

I originally had no issues with it. Then the side table that the 3D-printed enclosure was sitting on took the drives right into the concrete. Hard drives don't like that, so I lost 4 out of my 8 drives. I replaced 2 of the lost ones and recreated a raidz2 array out of the 6 remaining drives.

Now, a week or so later, applications are going unresponsive, seemingly locked up on disk I/O that at times takes minutes to complete. When this happens I am eventually forced to shut the machine down and bring it back up, because the processes won't stop while they are waiting on disk I/O. When it comes back up I end up stuck in the "mounting" step with hundreds of messages in /proc/spl/kstat/zfs/dbgmsg.

I don't get any errors at all in the WebGUI and the drives show read activity, but only a few hundred KiB/s. The entire system stays stuck like that for hours until the ZFS pool eventually mounts, typically after 6-12 hours.
I don't know how to begin to figure out what the issue is. I have attached the full output of /proc/spl/kstat/zfs/dbgmsg. Lines like the one below seem the most concerning to me, but I see errors for both sdm and sdl, and I just replaced the sdm drive. I'm not sure whether these are actual drive issues or just caused by some other source of latency, since requesting SMART data from the drive gives me a very normal response.

1752608776 ffff888114bad3c0 zio.c:2302:zio_deadman_impl(): slow zio[10]: zio=ffff8885808da140 timestamp=10191871982502 delta=26961250 queued=0 io=0 path=/dev/sdl1 last=10191888020255 type=2 priority=3 flags=0x700080 stage=0x400000 pipeline=0x4e00000 pipeline-trace=0xa00001 objset=0 object=79 level=0 blkid=2547347 offset=5137725710336 size=4096 error=0

What I have tried hardware-wise:
- Swapped out the LSI 9500 for an LSI 9207, same issue
- By doing the above I also swapped all the cables to the backplane
- I also swapped out the backplane
- Saw those errors about sdm, so after all of the above I swapped that drive out for one of my spare refurbs

dbmsg output.txt
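To see whether the slow I/O is concentrated on one device, the "slow zio" lines in the attached output can be tallied by their path= field. The following is a minimal sketch, assuming every such line has the same key=value layout as the one quoted above. The function names parse_slow_zio and count_by_path are made up for this illustration, and the numeric fields are kept as raw strings without interpreting their units.

```python
import re
from collections import Counter

# The log line quoted in the post above, used here as sample input.
SAMPLE = (
    "1752608776 ffff888114bad3c0 zio.c:2302:zio_deadman_impl(): slow zio[10]: "
    "zio=ffff8885808da140 timestamp=10191871982502 delta=26961250 queued=0 io=0 "
    "path=/dev/sdl1 last=10191888020255 type=2 priority=3 flags=0x700080 "
    "stage=0x400000 pipeline=0x4e00000 pipeline-trace=0xa00001 objset=0 object=79 "
    "level=0 blkid=2547347 offset=5137725710336 size=4096 error=0"
)

# Keys may contain a hyphen (e.g. pipeline-trace); values run to the next space.
KEY_VALUE = re.compile(r"([\w-]+)=(\S+)")


def parse_slow_zio(line):
    """Return the key=value fields of a 'slow zio' line, or None for other lines."""
    if "slow zio" not in line:
        return None
    # Only look at the text after the 'slow zio' marker.
    _, _, fields = line.partition("slow zio")
    return dict(KEY_VALUE.findall(fields))


def count_by_path(lines):
    """Count slow-zio messages per device path."""
    counts = Counter()
    for line in lines:
        record = parse_slow_zio(line)
        if record and "path" in record:
            counts[record["path"]] += 1
    return counts


if __name__ == "__main__":
    record = parse_slow_zio(SAMPLE)
    print(record["path"], record["delta"], record["error"])
```

Passing an open file handle for the saved dbgmsg output to count_by_path gives a per-device count. If the messages are spread across several devices, that would point to a shared cause rather than a single failing disk.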
July 16, 2025 Community Expert

Run the DiskSpeed container test to see if it can find one or more obviously slow disks.
July 19, 2025 Author Solution

So... at least one of those has got to be hanging on by a thread. I ended up not being able to get it to mount again in r/w, but I was able to get it mounted read-only by doing the following:
1. Disabled the automatic mounting of disks
2. Forced a restart
3. Ran the following: zpool import -F -f -o readonly=on poolname
4. Started the array

The pool mounts almost immediately since it doesn't have to deal with the ZIL, but the Unraid GUI shows an issue with the filesystem that you can safely ignore.

Then I was able to copy the data over to my array using rsync with the archive flag (-a), and I used a terminal session to manually create the share folder on the array so I could just switch the destination for the mover.

Now I'm going to take a break from this, then hook up the spare HBA to my desktop and test the disks.

Edited July 19, 2025 by bebis
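For anyone wanting to script the command-line part of this, here is a rough sketch that builds the read-only import and the rsync copy as argument lists and only prints them unless told otherwise. The helper names and the example paths (/mnt/poolname and /mnt/user/recovered) are placeholders assumed for illustration, not taken from the post. Disabling auto-mount, restarting, and starting the array are still done outside this script.

```python
import shlex
import subprocess


def recovery_commands(pool, source_dir, dest_dir):
    """Build the argv lists for the read-only import and the archive copy."""
    return [
        # -F: recovery mode, -f: force the import, readonly=on: never write to the pool
        ["zpool", "import", "-F", "-f", "-o", "readonly=on", pool],
        # -a: archive mode; the trailing slash copies the directory's contents
        ["rsync", "-a", source_dir.rstrip("/") + "/", dest_dir],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Placeholder pool name and paths; substitute your own before a real run.
    run_all(recovery_commands("poolname", "/mnt/poolname", "/mnt/user/recovered"))
```

Leaving dry_run at its default lets you check the exact commands before anything touches the pool.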