March 14, 2024 Recently I moved my server to a bigger case and everything was working fine. A couple of days later I upgraded my cache pool from two 1TB SATA SSDs on btrfs to three 2TB NVMe drives using ZFS. After moving the appdata and recreating the docker image file, everything seemed to be fine. A couple of days after that, the server stopped responding and had to be restarted. A couple of days later, even though all the other dockers were working, Plex was failing to start. After some troubleshooting I realized the docker image was corrupted, so I had to delete and recreate it. The very next day the server crashed again, but this time after rebooting the array gets stuck on starting and never finishes. I rebooted the server several times with no luck; I even tried safe mode. Suspecting that the cache pool had something to do with these issues, I decided to remove one of the drives, and I was then able to start the array in safe mode. After that I stopped the array and added the removed drive back to the cache pool, and now Unraid recognizes it as a new drive and states that all data on it will be erased if I start the array. All my appdata is on the cache; I do not have any VMs. Any help will be appreciated. poseidon-diagnostics-20240313-2112.zip
March 14, 2024 Community Expert The log is being spammed with rootshare-related errors; disable that and post new diags after a reboot.
March 14, 2024 Author This is a new log file after rebooting the server. I do not know how to disable the rootshare-related errors unless I get the log from syslog, and I am not sure whether those are okay to share openly. poseidon-diagnostics-20240314-0852.zip
March 14, 2024 Community Expert 29 minutes ago, NGMK said: "rootshare" The recommended way to handle this now is with the Unassigned Devices plugin. It looks like you are doing it in smb-extra.conf instead: Settings - SMB - SMB Extras
March 14, 2024 Community Expert Remove the rootshare from SMB Extras and post the output of zpool import
March 14, 2024 Author 1 hour ago, JorgeB said: "Remove the rootshare from SMB extras and post the output of zpool import"

   pool: cache
     id: 2664919203947636995
  state: DEGRADED
 status: One or more devices contains corrupted data.
 action: The pool can be imported despite missing or damaged devices. The fault tolerance of the pool may be compromised if imported.
    see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
 config:
        cache          DEGRADED
          raidz1-0     DEGRADED
            nvme0n1    UNAVAIL  invalid label
            nvme1n1p1  ONLINE
            nvme2n1p1  ONLINE

Before opening this post I found another post where someone was having issues starting the array and was advised to remove the cache pool drives. I attempted to do this, but although I unassigned all three NVMe drives from the pool in the UI, when I started the array I realized that only the first drive had actually been unassigned. This may be the cause of these results.
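For anyone reading along, the member states in output like the above can be pulled out with a small filter. This is only a minimal sketch: the function name unhealthy_devices is hypothetical, and it assumes the usual multi-line layout of zpool import output, where a "config:" line precedes the device tree and each device line has the name in the first column and the state in the second.

```shell
#!/bin/sh
# Hypothetical helper: list pool members that are not ONLINE.
# Reads the text of `zpool import` on stdin and prints "name state" pairs.
unhealthy_devices() {
    awk '
        # Everything after the "config:" line is the device tree.
        $1 == "config:" { in_config = 1; next }
        # Column 1 is the pool, vdev or device name; column 2 is its state.
        in_config && NF >= 2 && $2 ~ /^(UNAVAIL|FAULTED|DEGRADED|OFFLINE|REMOVED)$/ {
            print $1, $2
        }
    '
}
```

On the server this would be used as `zpool import | unhealthy_devices`. With the output posted above it would report the pool and the raidz1-0 vdev as DEGRADED and nvme0n1 as UNAVAIL, which points at the first device as the one to look at.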
March 14, 2024 Community Expert See if the pool imports in its current state, so that you can then try to fix it: unassign all pool devices, start the array, stop the array, reassign all pool devices in the same order that zpool import shows, start the array, and post new diags.
March 15, 2024 Author The array now starts after removing all the NVMe cache drives from the pool, starting the array without any drives in the pool, stopping the array, and adding all three drives back to the cache pool in the same order. I left the FS as auto. Now Unraid wants me to format the drives, as right now they are not mountable. What's next? poseidon-diagnostics-20240314-2127.zip
March 15, 2024 Community Expert The pool is not importing because the first device doesn't have a valid fs. Try this:

    sfdisk /dev/nvme0n1

Then type 2048 and hit enter. Finally, post the output or a screenshot of the results.
March 15, 2024 Community Expert Type N to keep the signature and hit enter, then type write and hit enter. After that, restart the array and post new diags.
March 16, 2024 Author When I run the command [ sfdisk /dev/nvme0n1 ], then type [2048], then [N] to not remove the signature, I get the following, asking for the other devices in the pool. I type write, close the command line and try starting the array, but it won't start. Below are the screenshot and the new diagnostics. P.S. I really appreciate you taking the time to help. poseidon-diagnostics-20240315-2031.zip
March 16, 2024 Community Expert That is not for the other devices; it's in case you wanted to create a second partition. Just type write and hit enter. It's not clear whether you already did that or not.
March 16, 2024 Author Yes, I already typed [write] afterwards, and I went through this whole process more than once, but the array won't start after, and I always end up having to reboot the server. The only valuable data in the cache pool was my appdata, and I have a two-week-old backup on the main array. However, I'm very concerned about how this cache pool became so corrupted. This is my very first time using ZFS, and I read in another post that zfs1 with 3 drives in a cache pool was only recommended in an experimental setting and not on a mission-critical server. What else can we try here? I would prefer to save the pool if possible. New diagnostics attached. poseidon-diagnostics-20240316-1430.zip
March 17, 2024 Community Expert Solution 16 hours ago, NGMK said: "post that zfs1 with 3 drives in a cache pool was only recommended in a experimental" ZFS raidz1 is far from experimental; it has been considered stable for a long time. 16 hours ago, NGMK said: "array won't start after, and I always end up having to reboot the server." That suggests the pool is crashing the server on mount. Before starting the array, type:

    zpool import -o readonly=on cache

If that is successful, start the array. The GUI will show the pool as unmountable, but the data should be under /mnt/cache. Then back it up and re-create the pool.
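The read-only import and backup suggested above could be sketched roughly as follows. This is a hedged sketch, not a tested recovery script: the helper names (step, import_readonly, backup_pool), the RUN switch and the destination path /mnt/disk1/cache_backup are all assumptions made for illustration. By default it only prints the commands, so they can be reviewed before anything touches the pool.

```shell
#!/bin/sh
# Hypothetical helper: print the command by default, run it only when RUN=1.
step() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Import the pool read-only so nothing is written to the damaged pool.
# $1 is the pool name ("cache" in this thread).
import_readonly() {
    step zpool import -o readonly=on "$1"
}

# Copy everything from the pool mount point to a backup directory.
# $1 is the pool name, $2 is the destination (an assumed path on the array).
backup_pool() {
    step mkdir -p "$2" || return 1
    # -a preserves permissions, ownership and timestamps.
    step rsync -a "/mnt/$1/" "$2/"
}
```

The order from the post still applies: run `import_readonly cache` before starting the array, start the array from the GUI, and only then run `backup_pool cache /mnt/disk1/cache_backup`, since the array disks are not available as a destination until the array is started.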
March 17, 2024 Author Yes, I already tried zpool import -o readonly=on cache and was able to start the array with the cache in read-only status. The cache pool is available in the GUI file explorer. I tried copying the appdata folder to one of the array disks, and all was going well until it got stuck transferring a single file forever.
March 17, 2024 Community Expert Check the Main page for write speeds to see if it's still going, and also check the syslog for any errors.
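One way to check the syslog for storage errors while a copy appears stuck is a simple filter like the one below. It is a rough sketch: the function name recent_storage_errors is hypothetical, /var/log/syslog is assumed to be the log location, and the search patterns are only a guess at what might be relevant for an NVMe ZFS pool, not a complete list.

```shell
#!/bin/sh
# Hypothetical helper: show the most recent log lines that mention
# ZFS, NVMe devices, I/O errors or kernel call traces.
# $1 is the log file; it defaults to the assumed location /var/log/syslog.
recent_storage_errors() {
    log="${1:-/var/log/syslog}"
    # -i ignores case, -E enables alternation; keep only the last 50 matches.
    grep -iE 'zfs|nvme|i/o error|call trace' "$log" | tail -n 50
}
```

Running `recent_storage_errors` a few times during the transfer shows whether new errors keep appearing or whether the log is quiet and the copy is simply slow.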
March 17, 2024 Author So I created a new share on another cache pool I have, with only one SATA SSD, and copied the appdata directory to it. Judging by its size, I believe all the files are there. Should I just give up on the NVMe pool (main), reformat and recreate the pool?
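Rather than judging the copy by size alone, the two directory trees can be compared directly before the old pool is destroyed. This is a minimal sketch using diff; the function name verify_copy is hypothetical, and the paths in the usage note are placeholders, since the actual share name is not given in the thread.

```shell
#!/bin/sh
# Hypothetical helper: compare a source tree with its copy.
# $1 is the original directory, $2 is the copy.
verify_copy() {
    src="$1"
    dst="$2"
    # -r walks both trees; -q reports only which files differ or are missing.
    if diff -rq "$src" "$dst"; then
        echo "trees match"
    else
        echo "differences found"
        return 1
    fi
}
```

For example, `verify_copy /mnt/cache/appdata /mnt/otherpool/recovered/appdata`, where the second path stands in for wherever the copy was placed. Any file listed by diff is either missing from the copy or has different content, and is worth copying again while the read-only pool is still mounted.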
March 19, 2024 Author Consider this one solved; the array is back online along with the cache pool. I transferred the appdata recovered from the failed pool, and I hope Plex is able to recover. Thanks, JorgeB, for your assistance.