
[HELP] Putting disks back to normal again



Hello,

 

I've been using Unraid for a long time, but only in the last few months have I started using the disks more heavily. Currently I have around 15 TB of total disk capacity, but some of my disks have been having problems (mainly used Dell SAS 3 TB drives).

 

Last week one of those disks was disabled, so I took it out and added a 10 TB disk, which I assigned as Parity 2 (since 10 TB was the largest size in the array). The plan was to then convert Parity 1 into a data disk (replacing the disabled one).

That finished, and currently I have 2 parity disks and 1 missing data disk.

 

Now I'm trying to move Parity 1 into a data slot. Since I cannot swap Parity 2 into the Parity 1 slot and Parity 1 into data at the same time, I ended up assigning another 10 TB disk as Parity 1, so the original Parity 1 drive can then go into data.

But I'm getting some errors, and I can't understand why, since they seem quite random.

 

I hope someone can help me with this. I'm a bit of a noob on this subject.

 

Diagnostics are attached.

 

EDIT: I ended up restarting the server, and now Parity 1 appears with a green dot, I don't know why. Just in case, I started the array and I'm currently running a parity check to make sure everything is alright.

 

 

Screenshot 2023-10-11 at 08.23.07.png

towernas-diagnostics-20231011-0814.zip

Edited by ricreis394
new update

@JorgeB Thanks a lot! It seems the PSU was the problem; I switched to another (better) one and the parity check is almost done without issues.

Anyway, I still have a disabled disk which says `Unmountable: Wrong or no file system`.

I do think some data is missing; can it be recovered successfully?

 

I'm following this article; is there anything else I should pay attention to? How do I know everything is still on that disk and nothing is corrupted?
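For reference, the usual first step before any repair is a read-only check from the console, with the array started in maintenance mode. This is only a sketch: it assumes the disabled disk is disk1, which Unraid would expose as `/dev/md1` (adjust the number for your slot).

```shell
# Read-only filesystem check (array in maintenance mode).
# Assumption: the disabled disk is disk1 -> /dev/md1 in Unraid.
xfs_repair -n /dev/md1   # -n = no modify; only reports problems
```

Running against the `/dev/mdX` device (rather than the raw `/dev/sdX` disk) keeps parity in sync with any later repair.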

4 hours ago, JorgeB said:

Oops, disk1 was the first one I saw and I assumed there was only one; it's disks 1 and 2 that need a filesystem check. disk1 is also disabled and unmountable.

I attempted to run xfs_repair without -n and it returned this:

Phase 1 - find and verify superblock...
bad primary superblock - bad CRC in superblock !!!

attempting to find secondary superblock...
.found candidate secondary superblock...
verified secondary superblock...
writing modified primary superblock
sb realtime bitmap inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 129
resetting superblock realtime bitmap inode pointer to 129
sb realtime summary inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 130
resetting superblock realtime summary inode pointer to 130
Phase 2 - using internal log
        - zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair.  If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.

Is it safe, or is it the better option, to run xfs_repair with the -L parameter?
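The error message above asks you to try a mount first so the log can be replayed, and to reach for -L only if that fails. A sketch of that order of operations, where `/dev/sdX1` and `/mnt/test` are placeholder names, not the actual device:

```shell
# 1. Try to replay the XFS log by mounting once, as the error suggests.
#    /dev/sdX1 and /mnt/test are placeholders.
mkdir -p /mnt/test
mount /dev/sdX1 /mnt/test && umount /mnt/test

# 2. If the mount succeeded, re-run the repair WITHOUT -L:
xfs_repair -v /dev/sdX1

# 3. Only if the mount fails should the log be destroyed:
# xfs_repair -L /dev/sdX1   # may lose the most recent metadata changes
```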


OK, I ran the command with -L and this is the output:

Phase 1 - find and verify superblock...
sb realtime bitmap inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 129
resetting superblock realtime bitmap inode pointer to 129
sb realtime summary inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 130
resetting superblock realtime summary inode pointer to 130
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
clearing needsrepair flag and regenerating metadata
sb_icount 0, counted 10624
sb_ifree 0, counted 772
sb_fdblocks 732208911, counted 62708200
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 3
        - agno = 2
        - agno = 1
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Maximum metadata LSN (1:538229) is ahead of log (1:2).
Format log to cycle 4.
done

The disk is still disabled and unmountable, but it appears I'm able to access its files (?)

 

Diagnostics attached.

towernas-diagnostics-20231012-2159.zip

Edited by ricreis394
diagnostics

This disk seems recovered.

I'm attempting to recover the other disk (disk1) and I'm afraid I may have lost data.

Right now I have missing data. I tried plugging disk1 back in and adding it to the array (without starting it), but the array recognizes it as a new disk (sad...), so I tried mounting it via Unassigned Devices, but it gives me the error `Device '/dev/sdd' failed to mount. Check the syslog for details.`

Then I ran `xfs_repair -v /dev/sdd`, but it can't find a valid superblock:

Phase 1 - find and verify superblock...
bad primary superblock - bad magic number !!!

attempting to find secondary superblock...
.found candidate secondary superblock...
unable to verify superblock, continuing...
.found candidate secondary superblock...
unable to verify superblock, continuing...
...........................................................................................................................................................................

Does this mean I have lost all of that data?

Diagnostics attached.

towernas-diagnostics-20231013-0850.zip

Edited by ricreis394
39 minutes ago, JorgeB said:

Try adding a 1 in the end:

xfs_repair -v /dev/sdd1
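That tip works because the XFS filesystem lives on the partition (`sdd1`), not on the raw disk (`sdd`); pointing xfs_repair at the whole device is a common cause of the "bad magic number" error. One quick way to confirm which node holds the filesystem, sketched here against the `sdd` device from this thread:

```shell
# List the disk and its partitions with size and detected filesystem type.
# The XFS entry should appear on the partition row (e.g. sdd1), not on sdd.
lsblk -o NAME,SIZE,FSTYPE /dev/sdd
```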

 

Nice! That gives a better result:

Phase 1 - find and verify superblock...
        - block cache size set to 1506168 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 1778397 tail block 1778324
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair.  If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.

Can I proceed with -L?


Okay, I ran the command with -L and the result was good.

I can now mount the disk via Unassigned Devices, and on it I can find a lot of the files that were missing from the array, that's good!

 

I suspect the next step is to copy the contents into the array, right? Only then can I format the disk and add it back to the array to resolve the missing disk.

Am I right?
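That is the usual order. A sketch of the copy step, assuming the recovered disk is mounted by Unassigned Devices at `/mnt/disks/recovered` and the data should land in a user share named `restore` (both paths are placeholders for your actual mount point and share):

```shell
# Copy everything off the recovered disk into the array before
# reformatting it. -a preserves permissions/ownership/timestamps,
# -X preserves extended attributes, -v lists files as they copy.
rsync -avX /mnt/disks/recovered/ /mnt/user/restore/

# Afterwards, verify with a checksum-based dry run (-n = no changes,
# -c = compare file contents); no output means the copies match.
rsync -ancX /mnt/disks/recovered/ /mnt/user/restore/
```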

