Can't get the array to start.


Solved by JorgeB

I recently moved my server to a bigger case and everything was working fine. A couple of days later I upgraded my cache pool from 2 × 1TB SATA SSDs (btrfs) to 3 × 2TB NVMe drives using ZFS. After moving the appdata and recreating the docker image file, everything seemed fine. A couple of days later the server stopped responding and needed to be restarted, and a couple of days after that, even though all the other dockers were working, Plex was failing to start. After some troubleshooting I realized the docker image was corrupted, so I deleted and recreated it.

The very next day the server crashed again, but this time after rebooting, the array would get stuck starting and never finish. I rebooted the server several times with no luck; I even tried safe mode. Suspecting the cache pool had something to do with these issues, I removed one of the drives and was able to start the array in safe mode. I then stopped the array and re-added the removed drive to the cache pool, but now Unraid recognizes it as a new drive and states that all data on it will be erased if I start the array. All my appdata is on the cache; I do not have any VMs. Any help will be appreciated.

poseidon-diagnostics-20240313-2112.zip

1 hour ago, JorgeB said:

Remove the rootshare from SMB extras and post the output of

zpool import

 

pool: cache
     id: 2664919203947636995
  state: DEGRADED
status: One or more devices contains corrupted data.
 action: The pool can be imported despite missing or damaged devices.  The
        fault tolerance of the pool may be compromised if imported.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
 config:

        cache          DEGRADED
          raidz1-0     DEGRADED
            nvme0n1    UNAVAIL  invalid label
            nvme1n1p1  ONLINE
            nvme2n1p1  ONLINE
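
A way to dig into the "invalid label" status on the UNAVAIL member is to dump its ZFS labels with `zdb` and compare against a healthy member. This is a diagnostic sketch; the device names are taken from the `zpool import` output above, and whether the label lives on the whole disk or on partition 1 depends on how the pool was created, so checking both paths may be necessary.

```shell
# Dump the ZFS labels on the device reported as UNAVAIL. A healthy
# member prints four copies of the label; "failed to unpack label"
# on all four indicates the on-disk label really is gone.
zdb -l /dev/nvme0n1
zdb -l /dev/nvme0n1p1   # in case the label sits on the partition

# Compare against a member the pool reports as ONLINE:
zdb -l /dev/nvme1n1p1
```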

 

Before opening this post I found another thread where someone had a similar issue starting the array and was advised to remove the cache pool drives. I attempted this, but although I unassigned all three NVMe drives from the pool in the UI, when I started the array I realized that only the first drive had actually been unassigned. That may explain these results.


Yes, I already typed [write] afterwards, and I went through this whole process more than once, but the array won't start after, and I always end up having to reboot the server. The only valuable data in the cache pool was my appdata, and I have a two-week-old backup on the main array. However, I'm very concerned about how this cache pool became so corrupted. This is my very first time using ZFS, and I read in another post that raidz1 with 3 drives in a cache pool was only recommended in an experimental setting, not on a mission-critical server. What else can we try here? I would prefer to save the pool if possible. New diagnostics attached.


poseidon-diagnostics-20240316-1430.zip

  • Solution
16 hours ago, NGMK said:

post that raidz1 with 3 drives in a cache pool was only recommended in an experimental

ZFS raidz1 is far from experimental; it has been stable for a long time.

 

16 hours ago, NGMK said:

the array won't start after, and I always end up having to reboot the server.

That suggests the pool is crashing the server on mount. Before starting the array, type:

 

zpool import -o readonly=on cache

 

If successful, then start the array. The GUI will show the pool as unmountable, but the data should be under /mnt/cache; back it up and then re-create the pool.
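
The recovery steps above can be sketched as a pair of commands. The backup destination path below is only an example; adjust it to an existing share on the array.

```shell
# Import the pool read-only so nothing more gets written to the
# damaged members (pool name "cache" from the zpool import output):
zpool import -o readonly=on cache

# Once the array is started and the data shows up under /mnt/cache,
# copy it to an array disk. /mnt/disk1/appdata-backup/ is a
# hypothetical destination -- pick a real share on your server.
rsync -avh --progress /mnt/cache/appdata/ /mnt/disk1/appdata-backup/
```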

 


Yes, I already tried

zpool import -o readonly=on cache 

and was able to start the array with the cache in read-only status. The cache pool is available in the GUI file explorer. I tried copying the appdata folder to one of the array disks, and all was going well until it got stuck transferring a single file forever.
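
When a copy from a damaged pool hangs on one file, one option (an assumption on my part, not something suggested in the thread) is to use rsync with an I/O timeout so a stalled transfer aborts instead of blocking forever, then re-run it to pick up where it left off. The destination path is an example.

```shell
# --partial keeps partially-transferred files so re-runs can resume;
# --timeout=60 aborts if no data moves for 60 seconds instead of
# hanging indefinitely on an unreadable file.
rsync -avh --partial --timeout=60 /mnt/cache/appdata/ /mnt/disk1/appdata-backup/

# Re-running the same command skips files already copied. While the
# copy runs, watch the kernel log for I/O errors on the NVMe devices:
dmesg | tail -n 50
```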



So I created a new share on another cache pool I have (a single SATA SSD) and copied the appdata directory to it, and judging by its size I believe all the files are there. Should I just give up on the NVMe pool (main), reformat, and re-create the pool?
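
If the decision is to abandon the old pool, one way to make Unraid treat the drives as fresh disks before re-creating the pool in the GUI is to wipe their filesystem signatures. This is a destructive sketch under assumptions: the device names below are examples, and they must be verified against Main → Pool Devices before running anything, with the array stopped and the data already backed up.

```shell
# DESTRUCTIVE: erases all filesystem/ZFS signatures on each device.
# Only run after the backup is verified and with the array stopped.
wipefs -a /dev/nvme0n1
wipefs -a /dev/nvme1n1
wipefs -a /dev/nvme2n1
```

After this, the drives can be re-assigned to a new pool in the GUI and formatted normally.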


Consider this one solved. The array is back online along with the cache pool. I transferred the appdata recovered from the failed pool; I hope Plex is able to recover.

Thanks JorgeB for your assistance.

  • Like 1
