2 disks show Unmountable: No file system


boosted


I have an 8-disk array with 6 data and 2 parity disks.  I hadn't really been paying much attention to the GUI, then I noticed some slowness and strange behavior from the array, so I went to the GUI.  It shows disk 3 and disk 5 with a red X and an "Unmountable: No file system" error.   I also saw disk 6, if I remember correctly, as being unmounted.  It seemed strange that if 3 out of 6 disks are out of the array, it can still emulate the data on 3 and 5.  The array is not very full, so disk 6 might not have anything on it.  I clicked "mount" for disk 6, then stopped and started the array.  It now shows disk 6 in the array as normal.  

 

What should I do about disks 3 and 5?  This rig is barely 2 years old; it seems odd to be losing 2 drives.  Can any diagnostics be run to see what happened?  It doesn't seem to show any SMART errors.  I noticed that files I copied into the array are not there anymore via the NFS share.  Did I lose any data?  If disks 3 and 5 are emulated, shouldn't all the data still be there?  Attached is the error log.  

unraid-syslog-20201123-1300.zip

Link to comment
7 minutes ago, boosted said:

Haven't really paid much attention to the GUI then I noticed

You must set up Notifications to alert you immediately by email or another agent as soon as a problem is detected. Don't let one problem become multiple problems (as it seems you may have) and end in data loss.

 

Go to Tools - Diagnostics and attach the complete Diagnostics ZIP file to your NEXT post in this thread.
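(If the webGUI ever gets too sluggish to use, roughly the same ZIP can be generated from a terminal or SSH session instead; this assumes the built-in diagnostics script is available on your release:)

diagnostics    # typically writes a dated diagnostics ZIP to the logs folder on the flash drive (/boot/logs)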

Link to comment
16 minutes ago, trurl said:

Since you have dual parity, it is able to emulate both of the missing disks, but unfortunately the emulated disks are unmountable. Be sure to check connections on ALL disks since ALL disks are needed to accurately emulate the disabled disks.

I understand that it can emulate 2 disks since I have 2 parity disks. But when disk 6 was in the unassigned list, it also said emulating. I wonder how that happened or how that works.

 

I opened up the system and checked the connections; they look fine.  I reseated the SATA and power cables on both ends.  Here's the diagnostics.

unraid-diagnostics-20201123-1349.zip

Link to comment

Some things we can't tell at all from the diagnostics on that old version, and other things we can only tell if we work harder at it.

 

For example, I have to open up multiple folders and files just to see which disks are disabled and then be able to compare them to the SMART reports for those disks.

 

Disabled and emulated disks 3 and 5 are still not mounted, but the physical disks are connected now. Disks 3 and 5 SMART attributes look OK, but neither has had any self-tests run on it yet.
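(If you want to kick one off from a terminal, something like the following should work for a plain SATA drive; sdX is a placeholder for the actual device letter shown on the Main page:)

smartctl -t short /dev/sdX    # start a short self-test (sdX is a placeholder, not a real device here)
smartctl -a /dev/sdX          # after a few minutes, view the attributes and the self-test log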

 

The best way to proceed would be to try to repair the emulated filesystems but first answer these 2 questions:

 

Do you have any spare disks of the same size or larger (but no larger than either parity)?

 

Do you have backups of anything important and irreplaceable?

Link to comment
2 minutes ago, trurl said:

Some things we can't tell at all from the diagnostics on that old version, and other things we can only tell if we work harder at it.

 

For example, I have to open up multiple folders and files just to see which disks are disabled and then be able to compare them to the SMART reports for those disks.

 

Disabled and emulated disks 3 and 5 are still not mounted, but the physical disks are connected now. Disks 3 and 5 SMART attributes look OK, but neither has had any self-tests run on it yet.

 

The best way to proceed would be to try to repair the emulated filesystems but first answer these 2 questions:

 

Do you have any spare disks of the same size or larger (but no larger than either parity)?

 

Do you have backups of anything important and irreplaceable?

Is it wise to upgrade to the latest version of the OS right now, while in this degraded state, for better diagnostics?

 

I do not have a spare drive at the moment.

 

No backups of the entire array, but if I lose what's on disk 3, it might be OK.  From what I can tell, only things I added in the last month seem to have been lost.  That tells me that, with the high-water allocation setting, the array may have just started writing to disk 3 recently after disks 1 and 2 were half full, so whatever I added recently may have been lost on disk 3, but I believe I still have that data elsewhere.  We're not talking about losing the whole array, right?

 

I made a huge copy of files yesterday, with multiple (6) copy streams running at the same time.  That's when the issue started.  It doesn't make sense that that would kill a drive, though.

 

 

Link to comment
5 minutes ago, boosted said:

No backups of the entire array

I don't either. But I have multiple offsite copies of anything important and irreplaceable. And I have a backup Unraid server for some of the less important things just because I had some hardware leftover after upgrading my main server.

 

Even dual parity is not a substitute for a backup plan.

7 minutes ago, boosted said:

We're not talking about losing the whole array right?

All the mounted disks should be OK, and maybe we can fix the others.

 

8 minutes ago, boosted said:

I do not have a spare drive at the moment.

The reason I ask is because it might be useful to keep the original disks unchanged in any way. It is even possible that the original disks are in fact mountable, but for some reason the emulated disks are not.

 

In any case, we are going to start with checking the emulated filesystems of the disabled disks.

 

Study this and ask if you have any questions:

 

https://wiki.unraid.net/Check_Disk_Filesystems#Checking_and_fixing_drives_in_the_webGui
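(For reference, the webGUI check described there amounts to running xfs_repair against the md device for that slot while the array is started in Maintenance mode, so that parity stays in sync. A minimal sketch, assuming disk 3 corresponds to /dev/md3 on this release:)

xfs_repair -nv /dev/md3    # read-only, verbose check of the emulated disk 3 filesystem (device name assumed)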

Link to comment
21 minutes ago, trurl said:

I don't either. But I have multiple offsite copies of anything important and irreplaceable. And I have a backup Unraid server for some of the less important things just because I had some hardware leftover after upgrading my main server.

 

Even dual parity is not a substitute for a backup plan.

All the mounted disks should be OK, and maybe we can fix the others.

 

The reason I ask is because it might be useful to keep the original disks unchanged in any way. It is even possible that the original disks are in fact mountable, but for some reason the emulated disks are not.

 

In any case, we are going to start with checking the emulated filesystems of the disabled disks.

 

Study this and ask if you have any questions:

 

https://wiki.unraid.net/Check_Disk_Filesystems#Checking_and_fixing_drives_in_the_webGui

I understand that parity is no substitute for backups.  But I have 2 other identical Synology DiskStations already set up backing each other up, plus an APC rack-mount UPS to keep the power stable.  With this 3rd array, the funds are just not there, lol.  But the DiskStations hold the absolute irreplaceables; the Unraid data is more or less replaceable.  I'd be really sad if some of it isn't recoverable, but it won't affect my life, so that's the choice I made.  Although I have been too lazy about disk checks on the Unraid.

 

Let me read through the check wiki and get back to you.  Thank you for your continued assistance.  Apologies for the ancient OS version making it difficult to match up the logs.

Link to comment

Had to finish up some stuff.  Here are the results.  I put the array in maintenance mode, added the verbose flag to the options to make it -nv, and ran the check on both drives.  Disk 3 took a while, and I clicked refresh to get the result.  Disk 5 took no time at all, almost as if it didn't run?

 

disk3

Phase 1 - find and verify superblock...

bad primary superblock - bad CRC in superblock !!!

 

attempting to find secondary superblock...

.found candidate secondary superblock...

verified secondary superblock...

would write modified primary superblock

Primary superblock would have been modified.

Cannot proceed further in no_modify mode.

Exiting now.

 

disk5

Phase 1 - find and verify superblock...

bad primary superblock - bad CRC in superblock !!!

 

attempting to find secondary superblock...

.found candidate secondary superblock...

verified secondary superblock...

would write modified primary superblock

Primary superblock would have been modified.

Cannot proceed further in no_modify mode.

Exiting now.

Link to comment

Here's disk 3 with -v:

 

 

Phase 1 - find and verify superblock...

bad primary superblock - bad CRC in superblock !!!

 

attempting to find secondary superblock...

.found candidate secondary superblock...

verified secondary superblock...

writing modified primary superblock

        - block cache size set to 120736 entries

sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 96

resetting superblock root inode pointer to 96

sb realtime bitmap inode 18446744073709551615 (NULLFSINO) inconsistent with calculated value 97

resetting superblock realtime bitmap ino pointer to 97

sb realtime summary inode 18446744073709551615 (NULLFSINO) inconsistent with calculated value 98

resetting superblock realtime summary ino pointer to 98

Phase 2 - using internal log

        - zero log...

zero_log: head block 487811 tail block 487807

ERROR: The filesystem has valuable metadata changes in a log which needs to

be replayed.  Mount the filesystem to replay the log, and unmount it before

re-running xfs_repair.  If you are unable to mount the filesystem, then use

the -L option to destroy the log and attempt a repair.

Note that destroying the log may cause corruption -- please attempt a mount

of the filesystem before doing this.

Link to comment

Disk 5 is much different:

 

Phase 1 - find and verify superblock...

bad primary superblock - bad CRC in superblock !!!

 

attempting to find secondary superblock...

.found candidate secondary superblock...

verified secondary superblock...

writing modified primary superblock

        - block cache size set to 120736 entries

sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 96

resetting superblock root inode pointer to 96

sb realtime bitmap inode 18446744073709551615 (NULLFSINO) inconsistent with calculated value 97

resetting superblock realtime bitmap ino pointer to 97

sb realtime summary inode 18446744073709551615 (NULLFSINO) inconsistent with calculated value 98

resetting superblock realtime summary ino pointer to 98

Phase 2 - using internal log

        - zero log...

zero_log: head block 163 tail block 163

        - scan filesystem freespace and inode maps...

sb_icount 0, counted 64

sb_ifree 0, counted 60

sb_fdblocks 1952984865, counted 1952984857

        - found root inode chunk

Phase 3 - for each AG...

        - scan and clear agi unlinked lists...

        - process known inodes and perform inode discovery...

        - agno = 0

        - agno = 1

        - agno = 2

        - agno = 3

        - agno = 4

        - agno = 5

        - agno = 6

        - agno = 7

        - process newly discovered inodes...

Phase 4 - check for duplicate blocks...

        - setting up duplicate extent list...

        - check for inodes claiming duplicate blocks...

        - agno = 0

        - agno = 1

        - agno = 2

        - agno = 3

        - agno = 4

        - agno = 5

        - agno = 6

        - agno = 7

Phase 5 - rebuild AG headers and trees...

        - agno = 0

        - agno = 1

        - agno = 2

        - agno = 3

        - agno = 4

        - agno = 5

        - agno = 6

        - agno = 7

        - reset superblock...

Phase 6 - check inode connectivity...

        - resetting contents of realtime bitmap and summary inodes

        - traversing filesystem ...

        - agno = 0

        - agno = 1

        - agno = 2

        - agno = 3

        - agno = 4

        - agno = 5

        - agno = 6

        - agno = 7

        - traversal finished ...

        - moving disconnected inodes to lost+found ...

Phase 7 - verify and correct link counts...

Note - stripe unit (0) and width (0) were copied from a backup superblock.

Please reset with mount -o sunit=,swidth= if necessary

 

        XFS_REPAIR Summary    Mon Nov 23 18:00:47 2020

 

Phase           Start           End             Duration

Phase 1:        11/23 18:00:47  11/23 18:00:47

Phase 2:        11/23 18:00:47  11/23 18:00:47

Phase 3:        11/23 18:00:47  11/23 18:00:47

Phase 4:        11/23 18:00:47  11/23 18:00:47

Phase 5:        11/23 18:00:47  11/23 18:00:47

Phase 6:        11/23 18:00:47  11/23 18:00:47

Phase 7:        11/23 18:00:47  11/23 18:00:47

 

Total run time:

done

Link to comment

You will have to use -L on disk 3. That is just the way the Linux xfs_repair tool works. It is giving you a chance to mount the disk and replay the transaction log, but Unraid has already determined the disk is unmountable, so there is nothing to do but make it forget about that transaction log and proceed.
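(From a terminal that would be roughly the following, with the array still started in Maintenance mode; the device name is assumed as before:)

xfs_repair -vL /dev/md3    # zero the log and repair the emulated disk 3 filesystem (device name assumed)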

 

Is disk5 mounted now?

Link to comment
2 minutes ago, trurl said:

You will have to use -L on disk 3. That is just the way the Linux xfs_repair tool works. It is giving you a chance to mount the disk and replay the transaction log, but Unraid has already determined the disk is unmountable, so there is nothing to do but make it forget about that transaction log and proceed.

 

Is disk5 mounted now?

I'm still in maintenance mode.  Do I take it out of maintenance mode to see if disk 5 is mountable?  Currently it still says both disks 3 and 5 are not mountable in maintenance mode.

Link to comment

OK, I ran -vL on disk 3.

 

Phase 1 - find and verify superblock...

        - block cache size set to 120736 entries

sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 96

resetting superblock root inode pointer to 96

sb realtime bitmap inode 18446744073709551615 (NULLFSINO) inconsistent with calculated value 97

resetting superblock realtime bitmap ino pointer to 97

sb realtime summary inode 18446744073709551615 (NULLFSINO) inconsistent with calculated value 98

resetting superblock realtime summary ino pointer to 98

Phase 2 - using internal log

        - zero log...

zero_log: head block 487811 tail block 487807

ALERT: The filesystem has valuable metadata changes in a log which is being

destroyed because the -L option was used.

        - scan filesystem freespace and inode maps...

sb_icount 0, counted 18112

sb_ifree 0, counted 334

sb_fdblocks 1952984865, counted 1064388454

        - found root inode chunk

Phase 3 - for each AG...

        - scan and clear agi unlinked lists...

        - process known inodes and perform inode discovery...

        - agno = 0

        - agno = 1

        - agno = 2

        - agno = 3

        - agno = 4

        - agno = 5

        - agno = 6

        - agno = 7

        - process newly discovered inodes...

Phase 4 - check for duplicate blocks...

        - setting up duplicate extent list...

        - check for inodes claiming duplicate blocks...

        - agno = 1

        - agno = 0

        - agno = 2

        - agno = 3

        - agno = 4

        - agno = 5

        - agno = 6

        - agno = 7

Phase 5 - rebuild AG headers and trees...

        - agno = 0

        - agno = 1

        - agno = 2

        - agno = 3

        - agno = 4

        - agno = 5

        - agno = 6

        - agno = 7

        - reset superblock...

Phase 6 - check inode connectivity...

        - resetting contents of realtime bitmap and summary inodes

        - traversing filesystem ...

        - agno = 0

        - agno = 1

        - agno = 2

        - agno = 3

        - agno = 4

        - agno = 5

        - agno = 6

        - agno = 7

        - traversal finished ...

        - moving disconnected inodes to lost+found ...

Phase 7 - verify and correct link counts...

Maximum metadata LSN (1:487789) is ahead of log (1:2).

Format log to cycle 4.

 

        XFS_REPAIR Summary    Mon Nov 23 19:10:33 2020

 

Phase           Start           End             Duration

Phase 1:        11/23 19:07:53  11/23 19:07:53

Phase 2:        11/23 19:07:53  11/23 19:08:42  49 seconds

Phase 3:        11/23 19:08:42  11/23 19:08:44  2 seconds

Phase 4:        11/23 19:08:44  11/23 19:08:44

Phase 5:        11/23 19:08:44  11/23 19:08:44

Phase 6:        11/23 19:08:44  11/23 19:08:45  1 second

Phase 7:        11/23 19:08:45  11/23 19:08:45

 

Total run time: 52 seconds

done

 

It still says unmountable. I ran disk 3 with -n again just to see if there are more repairs needed.  Maybe there are?

 

Phase 1 - find and verify superblock...

Phase 2 - using internal log

        - zero log...

        - scan filesystem freespace and inode maps...

        - found root inode chunk

Phase 3 - for each AG...

        - scan (but don't clear) agi unlinked lists...

        - process known inodes and perform inode discovery...

        - agno = 0

        - agno = 1

        - agno = 2

        - agno = 3

        - agno = 4

        - agno = 5

        - agno = 6

        - agno = 7

        - process newly discovered inodes...

Phase 4 - check for duplicate blocks...

        - setting up duplicate extent list...

        - check for inodes claiming duplicate blocks...

        - agno = 0

        - agno = 1

        - agno = 2

        - agno = 3

        - agno = 4

        - agno = 5

        - agno = 6

        - agno = 7

No modify flag set, skipping phase 5

Phase 6 - check inode connectivity...

        - traversing filesystem ...

        - traversal finished ...

        - moving disconnected inodes to lost+found ...

Phase 7 - verify link counts...

No modify flag set, skipping filesystem flush and exiting.

 

 

Edited by boosted
Link to comment
