
Replaced HDD, rebuilt successfully, but then data disappeared and Shares disappeared.


Solved by Kudjo


I upgraded disk1 from a 10TB HDD to an 18TB HDD. That went well. The contents were emulated properly (I was able to use plex, etc.) and the data rebuild completed without any issues that I could tell. Got the green dot, everything seemed fine. I chose this time to upgrade unraid from 6.12.10 to 6.12.11. I rebooted and I thought everything was fine. A few hours later (I don't know exactly how much later) users were letting me know that several of my services were down and had been for a while. I logged on and found all my shares had disappeared.
[screenshot]

When I logged onto the unraid machine directly, I couldn't enter the /mnt/user directory. Then I realized that I couldn't access /mnt/disk1 or its data either (it was no longer being emulated).

[screenshots]


I read online that sometimes when the shares disappear, a reboot will fix it. So I tried that, and when the reboot was complete I COULD see the shares listed in the unraid webGUI, but only for a few minutes before they disappeared again. At no time could I access any of the shares from the terminal or from a file manager. The only way I can access anything is by going directly to the disks (e.g. /mnt/disk2/stuff/thing/file.me) on the unraid server itself.

 

When my system rebooted, it also started a parity-check and immediately started reporting LOTS of "Sync errors corrected:". I let the parity-check complete (took over a day) and the result was over 2.4 BILLION Sync errors corrected. (!!!) But when the parity check was finally finished, I still couldn't see anything in disk1 and another reboot didn't help either. (Still no shares in the webGUI after a few minutes, and still no access to /mnt/user or /mnt/disk1). I was getting quite worried at this point. I decided the best thing to do would be to roll back to 6.12.10 and hope I had just had the misfortune of finding a 6.12.11 bug. I finished the rollback (about an hour or two ago) and after the reboot I got the same behavior: no access to /mnt/user or /mnt/disk1, and no access to the shares in the webGUI (even though I can see the shares from my Windows file explorer, I still cannot access them).
[screenshot]

I am terrified to do anything without direction now. I really don't want to lose 10TB of stuff. I've never had trouble like this with unraid before and have used it for MANY years.

I've attached my latest Diagnostics. Please direct me on what to do. I'm happy to do the work, I'm just at a loss for how to proceed safely.

P.S. And before anyone says so: yes, I know the importance of backups. This is 1 of 2 unraid servers, and part of a personal backup improvement project. I was literally in the process of backing up everything important when this hiccup happened. But the backups target the shares and I have no idea what was stored on disk1.
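
Since user shares on unraid are built from the top-level folders on each array disk, one way to find out which shares have data on which disk is to list those folders per disk. The following is only a minimal sketch: the function name `shares_by_disk` is made up for illustration, and it assumes the disks are mounted under /mnt/disk1, /mnt/disk2, and so on. A disk that cannot be read (as disk1 here) is skipped, so this is mainly useful before a problem occurs or once the disk mounts again.

```python
from collections import defaultdict
from pathlib import Path


def shares_by_disk(mnt_root="/mnt"):
    """Map each top-level folder (share name) to the disks holding part of it.

    Hypothetical helper: assumes array disks appear as <mnt_root>/disk<N>.
    """
    mapping = defaultdict(list)
    for disk in sorted(Path(mnt_root).glob("disk[0-9]*")):
        try:
            entries = sorted(disk.iterdir())
        except OSError:
            # An unreadable or unmountable disk cannot be listed; skip it.
            continue
        for entry in entries:
            if entry.is_dir():
                mapping[entry.name].append(disk.name)
    return dict(mapping)
```

Calling `shares_by_disk()` on the server and printing the result gives one line of disks per share, which could be saved alongside the backups so the contents of a failed disk are at least known by share.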

unkudjo-diagnostics-20240731_0451.zip

Link to comment
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair.  If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.
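
The message above describes a small decision: try mounting to replay the log first, and only fall back to -L if the mount fails. As a rough sketch of that logic (the function names are invented here and the wording of the returned hints is mine, not xfs_repair's), it could be written like this:

```python
# Phrase taken from the xfs_repair output quoted above.
LOG_REPLAY_MARKER = "valuable metadata changes in a log"


def needs_log_replay(xfs_repair_output):
    """Return True if the output says the XFS log must be replayed first."""
    # Collapse whitespace so wrapped lines do not hide the marker.
    normalized = " ".join(xfs_repair_output.split())
    return LOG_REPLAY_MARKER in normalized


def suggested_next_step(xfs_repair_output, mount_works):
    """Turn the advice in the error message into a single hint string."""
    if not needs_log_replay(xfs_repair_output):
        return "no dirty-log error found; review the xfs_repair report"
    if mount_works:
        return ("mount the filesystem to replay the log, unmount it, "
                "then re-run xfs_repair")
    # Last resort according to the message: destroying the log may cause corruption.
    return "re-run xfs_repair with -L (destroys the log)"
```

In this thread the disk could not be mounted, which is why the -L path was the one that applied.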

 

Link to comment

I ran the operation without the "-n" and with "-L". I think the operation is complete, because the read/write count has stopped updating.
[screenshot]

But every time I try to load the page where I ran the check filesystem from, I get this, so I can't actually see what happened.

[screenshot]

I pulled Diagnostics, but don't know where to look to see instructions for what's next in them. According to what I'm reading online, others that were in a similar situation would reboot and start the array in normal mode (instead of Maintenance Mode, like I had it in for the check filesystem operation). Is that what I do next?

Thank you very much for all of your help with this so far! I am very grateful.

unkudjo-diagnostics-20240801-0917.zip

Link to comment
  • Solution

After reviewing the diagnostics, I followed the instructions on how to check filesystem and first ran without the "-n".

Pulled diags again and then ran the check filesystem without the "-n" again, but included "-L" this time. Once the operation was complete (the WebGUI "Main" display no longer showed the disk read/writes increasing), we pulled diags again and then rebooted the machine and started the array again.

At this point, the shares had been restored and the data on the target disk was accessible again. No data loss as far as I can tell. We pulled diags one more time to verify that all looks good.
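
For anyone who prefers to see the sequence spelled out, here is a rough sketch of the xfs_repair invocations that correspond to the steps above. It only builds and prints the command lines, it does not run them. The device path `/dev/md1p1` is an assumption for disk1 on this unraid version, not something confirmed in this thread, and the helper name `repair_commands` is made up. On unraid the check is normally run from the webGUI with the array in Maintenance Mode, as was done here.

```python
import shlex


def repair_commands(device):
    """Return the xfs_repair command lines in the order used in this thread.

    `device` is a placeholder; confirm the correct device before running anything.
    """
    return [
        ["xfs_repair", "-n", device],  # read-only check, changes nothing
        ["xfs_repair", device],        # real repair; stops if the log is dirty
        ["xfs_repair", "-L", device],  # zero the log, then repair (last resort)
    ]


if __name__ == "__main__":
    # Hypothetical device path, shown for illustration only.
    for cmd in repair_commands("/dev/md1p1"):
        print(shlex.join(cmd))
```

The -L step comes last on purpose: it is only needed when the plain repair refuses to proceed because of the unreplayed log.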

Thank you for all your help, @JorgeB!
