re-enabled disabled disk - need help :)


MWDK


So I had a SATA cable fail somehow, which ended up with an array disk being disabled.

I followed the guide to re-enable the disk: stop the array, remove the disk from the array, start, stop, re-assign the disk to the array, and let it rebuild. That worked fine, BUT it now says that my disk has no file system and it wants to format the drive (after the rebuild finished without errors...)

Is this correct to format it?

Has it automatically moved the files that were ON the disc to the other discs in the array?

I hope I'm not losing files :(

 

Link to comment
16 minutes ago, MWDK said:

Is this correct to format it?

NO!!!

 

16 minutes ago, MWDK said:

Has it automatically moved the files that were ON the disc to the other discs in the array?

no

 

16 minutes ago, MWDK said:

I hope I'm not losing files

  Do you have backups of anything important and irreplaceable?

 

 

Go to Tools - Diagnostics and attach the complete diagnostics zip file to your NEXT post.

Link to comment
31 minutes ago, trurl said:

Looks like you have a corrupted filesystem on disk1, and it was that way in both the "before" and "after" diagnostics.

 

Have you rebooted since the rebuild? If not, don't. Post a screenshot of Main - Array Devices.

 

 

 

 

Attached. Haven't rebooted 

Screenshot_20200216_232128.jpg

Screenshot_20200216_232211.jpg

Screenshot_20200216_232154.jpg

Link to comment

Doesn't look like there were any I/O errors during the rebuild, and I didn't see any in the diagnostics, but there was some clutter in the syslog so I wanted to check. Do you have multiple browsers or browser tabs open to your server, including mobile? That would cause the clutter I was seeing. (csrf token)

 

So, you need to repair the filesystem on disk1.

 

Stop the array and start it in Maintenance mode, then click on Disk1 to get to its page and click the button to check its filesystem.

 

Capture any output and post it.

Link to comment
9 hours ago, trurl said:

Doesn't look like there were any I/O errors during the rebuild, and I didn't see any in the diagnostics, but there was some clutter in the syslog so I wanted to check. Do you have multiple browsers or browser tabs open to your server, including mobile? That would cause the clutter I was seeing. (csrf token)

 

So, you need to repair the filesystem on disk1.

 

Stop the array and start it in Maintenance mode, then click on Disk1 to get to its page and click the button to check its filesystem.

 

Capture any output and post it.

Yes to several open browsers.

Attached is the output... I would recommend changing how this output is shown, as it crashes my browser because of its size. It would be much better to generate a file, like the diagnostics do :) I had to save the page HTML and take the info from there.

gnubbi_Device.txt

Link to comment
5 hours ago, MWDK said:

I would recommend changing how this output is shown, as it crashes my browser because of its size

The output comes from the Linux xfs_repair utility and isn't usually that large. The fact that it is so large is NOT a good sign.

 

The only idea I have at this point is to unassign the disk so it is emulated by parity again and see if that can be repaired any better.

 

Let me drag @johnnie.black into this thread and see if he has any ideas.

Link to comment

There might have been a better way of resolving this before the rebuild, but I think the best option now is to run xfs_repair without -n. The filesystem looks very corrupt, but sometimes the xfs_repair output makes it look worse than it is.
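For anyone doing this from the command line instead of the GUI, the invocations being discussed can be sketched roughly as below. This is only an illustration: the helper name `xfs_repair_command` is made up, nothing is actually executed, and the device path `/dev/md1` for disk1 is an assumption (check what your own server uses, since it can differ between Unraid versions and for encrypted disks). The array has to be started in Maintenance mode first. `-n` is the check-only mode and `-L` zeroes the log.

```python
import shlex


def xfs_repair_command(device, check_only=True, zero_log=False):
    """Build an xfs_repair argument list (hypothetical helper, nothing is run)."""
    if check_only and zero_log:
        # Design choice of this sketch: never mix the dry-run flag with log zeroing.
        raise ValueError("choose either a check-only run or a log-zeroing repair")
    cmd = ["xfs_repair"]
    if check_only:
        cmd.append("-n")  # no-modify mode: only report what would be fixed
    if zero_log:
        cmd.append("-L")  # zero the log; use only if xfs_repair asks for it
    cmd.append(device)
    return cmd


DEVICE = "/dev/md1"  # assumed device for disk1; verify on your own server

for args in (
    xfs_repair_command(DEVICE),                                   # check only
    xfs_repair_command(DEVICE, check_only=False),                 # actual repair
    xfs_repair_command(DEVICE, check_only=False, zero_log=True),  # repair, zeroing the log
):
    print(shlex.join(args))
```

The printed lines are what you would type in a terminal; running the check-only form first costs nothing.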

 

15 minutes ago, trurl said:

The only idea I have at this point is to unassign the disk so it is emulated by parity again and see if that can be repaired any better.

If the disk was recently rebuilt without error, the emulated disk should be the same, but it won't hurt much to try.
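To illustrate why the emulated disk should be the same, here is a toy model. It assumes single parity computed as a plain XOR across the data disks, and the disk contents and the `xor_blocks` name are invented for the example. The point is that parity reproduces whatever bytes were written, corrupt filesystem included.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three tiny stand-ins for data disks; disk1 already holds a damaged filesystem.
disk1 = b"XFSB-corrupt"
disk2 = b"movies......"
disk3 = b"photos......"

parity = xor_blocks([disk1, disk2, disk3])

# Emulating (or rebuilding) disk1 uses parity plus the remaining data disks.
emulated_disk1 = xor_blocks([parity, disk2, disk3])

print(emulated_disk1 == disk1)  # True: the damage comes back bit for bit
```

Parity has no idea what a file or a filesystem is, so it cannot tell good bytes from bad ones.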

Link to comment
On 2/17/2020 at 3:16 PM, johnnie.black said:

There might have been a better way of resolving this before the rebuild, but I think the best option now is to run xfs_repair without -n. The filesystem looks very corrupt, but sometimes the xfs_repair output makes it look worse than it is.

 

If the disk was recently rebuilt without error, the emulated disk should be the same, but it won't hurt much to try.

It has just been rebuilt without error, so I assume it wouldn't do any good.

Can someone explain why the parity disk can't "fix" this? I mean, if a disk goes bad, I had the idea that the parity would hold a "backup", and if a new disk was inserted, everything would be fine again? :(

Link to comment
1 minute ago, johnnie.black said:

There's only one way, though you can go ahead and use -L if it asks for it.

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to be replayed. Mount the filesystem to replay the log, and unmount it before re-running xfs_repair. If you are unable to mount the filesystem, then use the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount of the filesystem before doing this.

Link to comment
2 hours ago, johnnie.black said:

If you mean after xfs_repair is done, just start the array normally; the disk should mount now.

So it started up fine. I can browse the files. Can I start a check now to see if everything is as it should be somehow?

I have a huge lost+found folder now with lots of files in it :) Why?

Edited by MWDK
Link to comment
58 minutes ago, MWDK said:

I have a huge lost+found folder now with lots of files in it :) Why?

That is where the repair process puts any files it finds for which it cannot find the directory information (and thus the folder and filename). If the data is critical to you, then you have to examine each such file manually to try to determine what it is (the Linux 'file' command can help here by at least giving you the likely file type).
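As a rough idea of how that kind of identification works, here is a small sketch that looks at the first bytes of each file. It is a tiny stand-in and not a replacement for 'file', which knows thousands of formats; the function names and the short signature list are just for illustration.

```python
from pathlib import Path

# A few well-known file signatures ("magic bytes").
SIGNATURES = [
    (b"\xff\xd8\xff", "JPEG image"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"%PDF-", "PDF document"),
    (b"PK\x03\x04", "ZIP archive (also docx/xlsx/odt)"),
]


def guess_type(header):
    """Guess a file type from its first bytes (hypothetical helper)."""
    for magic, description in SIGNATURES:
        if header.startswith(magic):
            return description
    return "unknown"


def survey(folder):
    """Map each regular file under `folder` to its guessed type."""
    results = {}
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            with path.open("rb") as handle:
                results[str(path)] = guess_type(handle.read(16))
    return results
```

Something like `survey("/mnt/disk1/lost+found")` would then give a first overview, assuming that is where the folder ended up on your server.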
 

In practice it is normally easier to restore the files from your backups. This is also a reason why you might want to keep file checksums for all your files, so you can quickly identify which files might differ from the backups.
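A minimal sketch of the checksum idea, assuming the backup has the same folder layout as the repaired disk. The function names are made up, and for simplicity it hashes both copies at comparison time rather than reading hashes that were stored earlier.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(live_root, backup_root):
    """List backup files that are missing from, or differ in, the live copy."""
    live_root, backup_root = Path(live_root), Path(backup_root)
    differing = []
    for backup_file in sorted(backup_root.rglob("*")):
        if not backup_file.is_file():
            continue
        relative = backup_file.relative_to(backup_root)
        live_file = live_root / relative
        if not live_file.is_file() or sha256_of(live_file) != sha256_of(backup_file):
            differing.append(str(relative))
    return differing
```

Only the files this reports would need restoring, which is much less work than going through lost+found by hand.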

Link to comment
On 2/18/2020 at 6:41 PM, johnnie.black said:

the best option here would likely have been to re-enable the disk, instead of rebuilding

How does one 're-enable the disk' other than the way TS wrote? I also followed the guide to re-enable the disk: stop the array, remove the disk from the array, start the array, stop the array, re-assign the disk to the array, and let it rebuild from parity. You seem to know of some other, miraculous kind of 're-enable' option that is not documented anywhere... For most people it will be too late by then, since they have already rebuilt.

Link to comment
14 hours ago, fluisterben said:

You seem to know of some other, miraculous kind of 're-enable' option that is not documented anywhere...

I think he is referring to the option that makes Unraid trust that "parity is already valid", which avoids a rebuild. See here... I've not used this procedure yet, so if you have questions, make sure to ask someone before making changes.

Link to comment
15 hours ago, fluisterben said:

How does one 're-enable the disk' other than what TS wrote?

Consider that Unraid only disables a disk when a write fails, which means that under most conditions what is physically on the disk is wrong, possibly corrupt. If you choose to discard the writes that have been made to that disk slot and revert to what is physically on the disk, you must be aware of what you stand to lose. The safest option is to rebuild to a different disk; then, if the rebuild from emulation doesn't work properly, you still have the option of recovering what is on the physical disk. The next safest option is to rebuild to the same disk. The last option is to discard parity and rebuild it from what is currently on all the physical disks.

 

None of these options are instant; all take resources and involve risk to get the array back into fault tolerance.

Link to comment
49 minutes ago, jonathanm said:

None of these options are instant; all take resources and involve risk to get the array back into fault tolerance.

And even if you use the trust parity option, you still need a correcting parity check to make sure parity really is in sync with the physical disks.
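What a correcting check does can be pictured with a toy model like the one below. It assumes single XOR parity and treats the data disks as authoritative; the function name and the three-byte "disks" are made up for the example.

```python
def correcting_parity_check(data_disks, parity):
    """Recompute XOR parity, overwrite mismatching bytes, return the error count."""
    errors = 0
    for offset in range(len(parity)):
        expected = 0
        for disk in data_disks:
            expected ^= disk[offset]
        if parity[offset] != expected:
            parity[offset] = expected  # the data disks win, parity is rewritten
            errors += 1
    return errors


disks = [bytearray(b"\x01\x02\x03"), bytearray(b"\x10\x20\x30")]
stale_parity = bytearray(b"\x11\x22\x00")  # last byte no longer matches the disks

print(correcting_parity_check(disks, stale_parity))  # 1 mismatch found and corrected
print(correcting_parity_check(disks, stale_parity))  # 0 on the second pass
```

Until such a pass has run cleanly, "trusted" parity is only assumed to be valid.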

 

Generally, trusting parity should only be done after careful consideration. It might make sense in some specific circumstances. It often does make sense in a New Config situation, seldom in a disabled disk situation.

Link to comment
