server crash while 2 hdd failed :-/


Nexius2


Hello, bad morning for me today.

I started rebuilding a disk earlier this week as part of an upgrade. The data rebuild has not finished yet.

During the night, 2 other HDDs failed... no luck...

I have 2 parity HDDs, but together with the one that was rebuilding, I guess that takes the count of missing disks to 3, which is more than dual parity can cover, so my array doesn't like it.

My shares don't all show up. For example, I can't see appdata (which is on a cache, so why?)

 

Can someone help me get this back up with minimal data loss?

thanks

nostromo-diagnostics-20231202-1001.zip

Link to comment

Here is what I was thinking for now:

 

- get 2 new hdd

- preclear them

- copy data from the crashed HDDs to the new ones (don't know exactly how for now)

- replace failed hdd with the new ones

- replace the hdd that was rebuilding with the old one

- reset the config, losing my 2 parity disks

- let it rebuild the parity

- cross my fingers very hard

 

This method will take a very long time (a couple of days or weeks) and doesn't explain the loss of shares (and, I think, of data on disks that are fine).

Link to comment

It doesn't seem like I can cancel the rebuild... nothing happens and I don't see anything about it in the logs.

Is there a command line to stop the rebuild?

 

 

The logs are full, which might explain why I can't cancel the rebuild or do anything.

df -h /var/log
Filesystem      Size  Used Avail Use% Mounted on
tmpfs           128M  128M     0 100% /var/log
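
With `/var/log` on a 128M tmpfs at 100%, a reasonable next step is to find which files are using the space. The following is a minimal sketch, not an Unraid-specific tool: the function name `largest_entries` and its arguments are made up for illustration, and it only reads file sizes without deleting anything.

```shell
#!/bin/bash
# Hypothetical helper: list the N largest files under a directory,
# smallest first, so the biggest offender is on the last line.
# Sizes are in KiB as reported by `du -k`.
largest_entries() {
    local dir="${1:-/var/log}"   # directory to inspect
    local count="${2:-10}"       # how many entries to show
    find "$dir" -type f -exec du -k {} + 2>/dev/null | sort -n | tail -n "$count"
}

# Example: show the 10 largest files in the log tmpfs.
largest_entries /var/log 10
```

If a single file dominates, the message repeated inside it is usually a better lead than the file size itself.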

nostromo-diagnostics-20231203-0935.zip

Link to comment

The server rebooted, but it says the rebuild is finished (with lots of errors). I think it was disk 25. I haven't started the array yet, but I guess I can't swap back since the new disk is 18 TB and the old one was 8 TB.

 

Disks 6 and 19 were moved, but because of a bad power or connection problem they are now disabled.

don't know what to do 😕

nostromo-diagnostics-20231203-1736.zip

Link to comment

This will only work if parity is still valid:

 

-Tools -> New Config -> Retain current configuration: All -> Apply
-Check all assignments and assign any missing disk(s) if needed
-IMPORTANT - Check both "parity is already valid" and "maintenance mode" and start the array (note that the GUI will still show that data on parity disk(s) will be overwritten, this is normal as it doesn't account for the checkbox, but it won't be as long as it's checked)
-Stop array
-Unassign disk25
-Start array (in normal mode now) and post new diags.

Link to comment

As parity can't be valid after the reboot (the rebuilt disk is marked as OK but doesn't have even half the data on it), I decided to try copying data from the failing disks to new ones.

After a preclear, I can format and mount the new disks. But is there a way to mount disks from a stopped array that are disabled? I have heard that Unassigned Devices could do it, but couldn't figure out how.

thanks

Link to comment
2 hours ago, Nexius2 said:

as parity can't be valid after the reboot (rebuild disk marked as ok but has not even half the data on it)

It could still be mostly valid; it was worth a try since it wouldn't cause any harm.

 

2 hours ago, Nexius2 said:

but is there a way to mount disks from a stopped array that are disabled?

Not sure what you mean; disabled disks are emulated. If you mean mounting the actual disk, you can do that with UD (Unassigned Devices), or do a new config to re-enable the disk (this will prevent rebuilding).
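
Once the actual disk is mounted outside the array (for example through UD), copying its contents to a new disk could look like the sketch below. The mount points `/mnt/disks/old_disk` and `/mnt/disks/new_disk` and the function name `copy_tree` are assumptions for illustration. The sketch uses `rsync` when it is installed and falls back to `cp -a` otherwise. For a disk with read errors, a dedicated recovery tool may be a better choice than a plain file copy.

```shell
#!/bin/bash
# Hypothetical helper: copy everything under SRC into DST, preserving
# permissions, ownership (when run as root) and timestamps.
copy_tree() {
    local src="$1" dst="$2"
    [ -d "$src" ] || { echo "source $src is not a directory" >&2; return 1; }
    mkdir -p "$dst" || return 1
    if command -v rsync >/dev/null 2>&1; then
        # -a: archive mode; --partial: keep partially transferred files
        # instead of deleting them when the copy is interrupted.
        rsync -a --partial "$src"/ "$dst"/
    else
        cp -a "$src"/. "$dst"/
    fi
}

# Example (assumed mount points, adjust to the real ones):
# copy_tree /mnt/disks/old_disk /mnt/disks/new_disk
```

Mounting the source disk read-only before copying keeps the failing disk from being written to during the transfer.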

 

Link to comment
  • 2 weeks later...
