Posts posted by boosted
-
12 more hours till the rebuild is done. I'll pick up some drives as spares during this BF sale if possible. Prices have gone down since I built it.
-
5 hours ago, trurl said:
Let us know how it goes.
The extended SMART tests finally finished. Both said completed without error. Here are their reports. I think I'm OK to proceed, but I'm not sure if some of the attribute numbers are good. Most say "Old_age" even though it's just 2 years old.
unraid-smart-20201124-1215-disk3.zip unraid-smart-20201124-1215-disk5.zip
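For what it's worth, "Old_age" in a SMART report is the attribute's *type* (as opposed to "Pre-fail"), not a verdict on the drive; an attribute is only considered failing when its normalized VALUE drops to or below its THRESH. A minimal sketch of that check, using made-up sample lines in the `smartctl -A` table layout (not values from the actual reports in this thread):

```python
# Sketch: flag SMART attributes whose normalized VALUE has dropped to or
# below the failure THRESH. The sample table below is illustrative only,
# not taken from the real reports attached in this thread.
SAMPLE = """\
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   076   076   000    Old_age   Always       -       17600
194 Temperature_Celsius     0x0022   112   103   000    Old_age   Always       -       38
"""

def failing_attributes(smart_table: str):
    """Return names of attributes where VALUE <= THRESH (ignoring THRESH 0)."""
    bad = []
    for line in smart_table.splitlines():
        parts = line.split()
        if len(parts) < 10:
            continue  # skip anything that isn't an attribute row
        name, value, thresh = parts[1], int(parts[3]), int(parts[5])
        if thresh > 0 and value <= thresh:
            bad.append(name)
    return bad

print(failing_attributes(SAMPLE))  # -> [] : nothing failing despite "Old_age"
```

So a report full of "Old_age" rows on a 2-year-old drive is normal; it's VALUE vs. THRESH (and the raw reallocated/pending sector counts) that matter.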
-
1 minute ago, trurl said:
yes
Thank you!
-
One more question: can I rebuild both drives at the same time? The wiki didn't say whether that's advisable or possible.
-
OK, I will do that. Extended SMART test, then stop the array, unassign the disks, start the array, stop the array, assign the disks back, then start it to rebuild, per the wiki. Thank you again.
-
Thank you for helping me get this far! I honestly don't think I lost anything, but I will see over time; I can most likely recover the last 2 months, which I think is how long they've been down. So does no lost+found mean nothing was lost? Or that nothing that was lost can be recovered?
What is the rebuilding process from this point? The wiki didn't say what to do to rebuild after a repair. I don't want to assume anything.
As for what happened, can you speculate at all? Maybe those 2 drives lost power at some point and corrupted the data? Should I still trust these 2 drives, or should I get replacements for them?
-
The data that were gone in maintenance mode are back, mainly the things I copied over the past month or so, but I don't know if I'm missing anything.
-
Disks 3 and 5 show emulated with a red X, and no unmountable labeling for either. Nothing in the unassigned section. Here's the diagnostic.
-
I did not.
-
OK, I ran -vL on disk 3:
Phase 1 - find and verify superblock...
- block cache size set to 120736 entries
sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 96
resetting superblock root inode pointer to 96
sb realtime bitmap inode 18446744073709551615 (NULLFSINO) inconsistent with calculated value 97
resetting superblock realtime bitmap ino pointer to 97
sb realtime summary inode 18446744073709551615 (NULLFSINO) inconsistent with calculated value 98
resetting superblock realtime summary ino pointer to 98
Phase 2 - using internal log
- zero log...
zero_log: head block 487811 tail block 487807
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
- scan filesystem freespace and inode maps...
sb_icount 0, counted 18112
sb_ifree 0, counted 334
sb_fdblocks 1952984865, counted 1064388454
- found root inode chunk
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 1
- agno = 0
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
Phase 5 - rebuild AG headers and trees...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- reset superblock...
Phase 6 - check inode connectivity...
- resetting contents of realtime bitmap and summary inodes
- traversing filesystem ...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- traversal finished ...
- moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Maximum metadata LSN (1:487789) is ahead of log (1:2).
Format log to cycle 4.
XFS_REPAIR Summary Mon Nov 23 19:10:33 2020
Phase Start End Duration
Phase 1: 11/23 19:07:53 11/23 19:07:53
Phase 2: 11/23 19:07:53 11/23 19:08:42 49 seconds
Phase 3: 11/23 19:08:42 11/23 19:08:44 2 seconds
Phase 4: 11/23 19:08:44 11/23 19:08:44
Phase 5: 11/23 19:08:44 11/23 19:08:44
Phase 6: 11/23 19:08:44 11/23 19:08:45 1 second
Phase 7: 11/23 19:08:45 11/23 19:08:45
Total run time: 52 seconds
done
It still says unmountable. I ran disk 3 with -n again just to see if more repairs are needed. Maybe there are?
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan (but don't clear) agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
- traversing filesystem ...
- traversal finished ...
- moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.
-
2 minutes ago, trurl said:
You will have to use the -L on disk3. That is just the way the linux xfs repair tool works. It is giving you a chance to mount the disk and replay the transaction log, but Unraid has already determined the disk is unmountable so nothing to do but make it forget about that transaction log and proceed.
Is disk5 mounted now?
I'm still in maintenance mode. Do I take it out of maintenance mode to see if disk 5 is mountable? Currently it still says both disks 3 and 5 are not mountable in maintenance mode.
-
Disk 5 is much different:
Phase 1 - find and verify superblock...
bad primary superblock - bad CRC in superblock !!!
attempting to find secondary superblock...
.found candidate secondary superblock...
verified secondary superblock...
writing modified primary superblock
- block cache size set to 120736 entries
sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 96
resetting superblock root inode pointer to 96
sb realtime bitmap inode 18446744073709551615 (NULLFSINO) inconsistent with calculated value 97
resetting superblock realtime bitmap ino pointer to 97
sb realtime summary inode 18446744073709551615 (NULLFSINO) inconsistent with calculated value 98
resetting superblock realtime summary ino pointer to 98
Phase 2 - using internal log
- zero log...
zero_log: head block 163 tail block 163
- scan filesystem freespace and inode maps...
sb_icount 0, counted 64
sb_ifree 0, counted 60
sb_fdblocks 1952984865, counted 1952984857
- found root inode chunk
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
Phase 5 - rebuild AG headers and trees...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- reset superblock...
Phase 6 - check inode connectivity...
- resetting contents of realtime bitmap and summary inodes
- traversing filesystem ...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- traversal finished ...
- moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Note - stripe unit (0) and width (0) were copied from a backup superblock.
Please reset with mount -o sunit=,swidth= if necessary
XFS_REPAIR Summary Mon Nov 23 18:00:47 2020
Phase Start End Duration
Phase 1: 11/23 18:00:47 11/23 18:00:47
Phase 2: 11/23 18:00:47 11/23 18:00:47
Phase 3: 11/23 18:00:47 11/23 18:00:47
Phase 4: 11/23 18:00:47 11/23 18:00:47
Phase 5: 11/23 18:00:47 11/23 18:00:47
Phase 6: 11/23 18:00:47 11/23 18:00:47
Phase 7: 11/23 18:00:47 11/23 18:00:47
Total run time:
done
-
Here's disk 3 with -v:
Phase 1 - find and verify superblock...
bad primary superblock - bad CRC in superblock !!!
attempting to find secondary superblock...
.found candidate secondary superblock...
verified secondary superblock...
writing modified primary superblock
- block cache size set to 120736 entries
sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 96
resetting superblock root inode pointer to 96
sb realtime bitmap inode 18446744073709551615 (NULLFSINO) inconsistent with calculated value 97
resetting superblock realtime bitmap ino pointer to 97
sb realtime summary inode 18446744073709551615 (NULLFSINO) inconsistent with calculated value 98
resetting superblock realtime summary ino pointer to 98
Phase 2 - using internal log
- zero log...
zero_log: head block 487811 tail block 487807
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed. Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair. If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.
-
Had to finish up some stuff. Here are the results. I put it in maintenance mode, added verbose to the options to make it -nv, and ran the check on both drives. Drive 3 took a while, and I clicked refresh to get the result. Disk 5 took no time at all, almost as if it didn't run?
disk3
Phase 1 - find and verify superblock...
bad primary superblock - bad CRC in superblock !!!
attempting to find secondary superblock...
.found candidate secondary superblock...
verified secondary superblock...
would write modified primary superblock
Primary superblock would have been modified.
Cannot proceed further in no_modify mode.
Exiting now.
disk5
Phase 1 - find and verify superblock...
bad primary superblock - bad CRC in superblock !!!
attempting to find secondary superblock...
.found candidate secondary superblock...
verified secondary superblock...
would write modified primary superblock
Primary superblock would have been modified.
Cannot proceed further in no_modify mode.
Exiting now.
-
21 minutes ago, trurl said:
I don't either. But I have multiple offsite copies of anything important and irreplaceable. And I have a backup Unraid server for some of the less important things just because I had some hardware leftover after upgrading my main server.
Even dual parity is not a substitute for a backup plan.
All the mounted disks should be OK, and maybe we can fix the others.
The reason I ask is because it might be useful to keep the original disks unchanged in any way. It is even possible that the original disks are in fact mountable, but for some reason the emulated disks are not.
In any case, we are going to start with checking the emulated filesystems of the disabled disks.
Study this and ask if you have any questions:
https://wiki.unraid.net/Check_Disk_Filesystems#Checking_and_fixing_drives_in_the_webGui
I understand that parity is no substitute for backups. But I have 2 other identical Synology DiskStations already set up backing each other up, plus an APC rack-mount UPS to keep the power stable. With this 3rd array, the funds are just not there lol. But the DiskStations hold the absolute irreplaceables. The unRaid data are more or less replaceable. I'd be really sad if some of it isn't recoverable, but it won't affect my life, so that's the choice I made. Although I have been too lazy about the disk checks on the unRaid.
Let me read through the check wiki and get back to you. Thank you for your continued assistance. Apologies that the ancient OS version makes it difficult to match up the logs.
-
2 minutes ago, trurl said:
Some things we can't tell at all from the diagnostics on that old a version, and other things we can tell if we work harder at it.
For example, I have to open up multiple folders and files just to see which disks are disabled and then be able to compare them to the SMART reports for those disks.
Disabled and emulated disks 3 and 5 still not mounted but the physical disks are connected now. Disks 3 and 5 SMART attributes look OK but neither have had any self tests run on them yet.
The best way to proceed would be to try to repair the emulated filesystems but first answer these 2 questions:
Do you have any spare disks of the same size or larger (but no larger than either parity)?
Do you have backups of anything important and irreplaceable?
Is it wise to upgrade to the latest version of the OS right now, while in this degraded state, for better diagnostics?
I do not have a spare drive at the moment.
No backups of the entire array, but if I lose what's on disk 3, it might be OK. From what I can tell, only things I added in the last month or so were lost. That tells me that the high-water setting may have just started writing to disk 3 recently, after disks 1 and 2 were half full, so whatever I added recently may have been lost on disk 3, but I still have copies of that data, I believe. We're not talking about losing the whole array, right?
I made a huge copy of files yesterday, with multiple (6) copy streams at the same time. That's when the issue started. It doesn't make sense that that would kill a drive, though.
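The high-water reasoning above can be sketched out. My understanding (hedged: this is a simplified illustration of Unraid's high-water allocation, and the numbers are made up): the water mark starts at half the largest data disk's size, writes go to the lowest-numbered disk whose free space is above the mark, and when no disk qualifies the mark is halved.

```python
# Rough sketch of "high-water" allocation as described above. This is an
# illustration under my stated assumptions, not Unraid's actual code;
# sizes are in GB and purely hypothetical.
def pick_disk(free_spaces, largest_disk_size):
    """Return the index of the disk a new write would go to, or None if full."""
    mark = largest_disk_size // 2
    while mark > 0:
        for i, free in enumerate(free_spaces):
            if free > mark:      # first disk above the current water mark wins
                return i
        mark //= 2               # nobody qualifies: lower the water mark
    return None

# 4 TB disks: disks 1 and 2 already near half full, disk 3 still empty.
free = [1900, 1950, 4000]        # GB free on disks 1..3
print(pick_disk(free, 4000))     # -> 2, i.e. disk 3: new writes land there
```

That matches the observation: once disks 1 and 2 dropped below the half-size mark, recent copies would start landing on disk 3, so only the newest files would be at risk.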
-
16 minutes ago, trurl said:
Since you have dual parity, it is able to emulate both of the missing disks, but unfortunately the emulated disks are unmountable. Be sure to check connections on ALL disks since ALL disks are needed to accurately emulate the disabled disks.
I understand that it can emulate 2 disks since I have 2 parities, but when disk 6 was in the unassigned section, it also said emulating. I wonder how that happened, or how that works.
I opened up the system and checked the connections; they look fine. I reseated the SATA and power on both ends. Here's the diagnostic.
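The "all disks are needed to emulate" point can be shown with a toy example. A hedged sketch, using single parity only (Unraid's second parity uses different math, which is how two missing disks can be emulated at once, but the principle is the same): parity is the XOR of all data disks, so any one missing disk is recomputed byte by byte from parity plus every surviving disk.

```python
# Simplified single-parity illustration: parity = XOR of all data disks,
# so one missing disk can be recomputed from parity + the remaining disks.
# Toy 3-byte "disks"; real arrays do this per sector on the fly.
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

disks = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xaa\xbb\xcc"]
parity = xor_blocks(disks)

# "Disk 1" goes missing; emulate it from parity + the surviving disks.
emulated = xor_blocks([parity, disks[0], disks[2]])
assert emulated == disks[1]
print("emulated disk matches the original")
```

This is also why every other disk has to be healthy and connected: a read error anywhere corrupts the reconstruction of the emulated disk.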
-
1 minute ago, trurl said:
Disks 3 and 5 don't appear to be connected. Shut down, check all connections, power and SATA, including any splitters. Reboot and post new diagnostics.
I will do that. Hopefully just bad cabling that came loose.
-
2 minutes ago, trurl said:
You are on a very old (nearly 3 years) version of Unraid, so unfortunately, the diagnostics you gave us won't tell us as much as newer versions.
Hopefully it still tells you something. I'm always hesitant to upgrade.
-
1 minute ago, trurl said:
Not clear what you mean here since there is no "mount" to click for a disk in the parity array.
I made a typo. Disk 6 showed up in the unassigned devices section, so I clicked "mount" for it. At that time, disks 3 and 5 were showing contents emulated.
-
Yeah, I need to get notifications set up. Here's the diagnostic log. Thank you for taking a look for me.
-
I have an 8-disk array with 6 data and 2 parity. I hadn't really paid much attention to the GUI; then I noticed some slowness and strange behavior from the array, so I went to the GUI. It shows disk 3 and disk 5 with a red X and an "Unmountable: No file system" error. I also saw disk 6, if I remember correctly, as not being mounted. It seemed strange that if 3 out of 6 disks are out of the array, it can still emulate the data on 3 and 5? The array is not very full, so disk 6 might not have anything. I clicked "mount" for disk 6, then stopped and started the array. It now shows disk 6 in the array as normal.
What should I do about disks 3 and 5? This rig is barely 2 years old; it seems odd to be losing 2 drives. Can any diagnostics be run to see what happened? It doesn't seem to show SMART errors. I noticed that files I copied into the array are no longer there via the NFS share. Did I lose any data? If disks 3 and 5 are emulated, shouldn't all the data still be there? Attached is the error log.
-
I vote to bump up the jumbo frame on the list.
2 disks show Unmountable: No file system (in General Support)
The rebuild completed without errors. It does not appear that I have lost any files, as far as I can tell, as long as any loss would be a whole file and not chunks of one; that would just corrupt the file without me knowing it for a long while. A spare HD has been ordered as well.
Thanks again, trurl, for your help on this.