
Help needed with unmountable file system and disk errors (initial setup)



Trying to set up an unRaid box to replace my Drobo 5N (which, thankfully, still works after 5+ years). I've got a Gigabyte GA-Z97MX-Gaming 5 motherboard (not sure what SATA controller it has as the specs page doesn't seem to say).

 

I have a WD Red 6TB drive that was in the Drobo for 2.5 years and recently (supposedly) failed; the Drobo ejected it from the array for some reason. I took it in for warranty replacement, where they ran a thorough test that took a couple of days and reported the access time for every sector of the disk. It apparently passed with flying colors, so I'm not sure why the Drobo ejected it.

 

In any case, I gave it a full (non-quick) NTFS format on my Windows PC and ran `chkdsk /R`, both of which completed with no errors or issues. At this point I was very confused as to why it had been ejected from the Drobo, so I plugged it into my otherwise-empty unRaid box to see what would happen.

 

As expected, it reported that the NTFS-formatted disk was unmountable, so I clicked to have unRaid format it. Now it says the "FS" column is "xfs", but reports that the disk is "Unmountable: No file system". I have no idea what's going on.

 

Finally, the disk log shows a lot of errors, but I don't know what's going on with those either. I'm attaching both the SMART report and the copied disk log (is there a better way to extract that log than just copy-paste?).
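On the log-extraction question, one option is to filter the system log for lines that mention the drive and save them to a file, rather than copy-pasting from the browser. The sketch below assumes Python is available on the box. The function name, the device keywords (`sdb`, `ata3`), and the log path mentioned in the note are illustrative assumptions, not values taken from the attached files.

```python
def filter_disk_lines(log_text: str, keywords: list[str]) -> list[str]:
    """Return the log lines that mention any of the keywords (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return [
        line
        for line in log_text.splitlines()
        if any(k in line.lower() for k in lowered)
    ]
```

Assuming the log lives at `/var/log/syslog`, something like `filter_disk_lines(Path("/var/log/syslog").read_text(errors="replace"), ["sdb", "ata3"])` (with `from pathlib import Path`) would return just the matching lines, which can then be written to a text file and attached.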

 

Any ideas? Thanks so much for any help you can give me.

 

6tb disk log.txt cube-smart-20210311-1119.zip

Link to comment

I've swapped SATA cables with new ones out of the bag; no immediate change. I then stopped the array to try formatting the drive again. After about 2-3 minutes it just stopped formatting, still reporting "Unmountable: No file system", with a notification on the side:

Quote

 

Unraid array errors: 11-03-2021 18:22

Warning [CUBE] - array has errors
Array has 1 disk with read errors

 

The disk log is attached below, as is the full diagnostics as requested.

 

Thank you so much for your time and assistance with this!

6tb new cables formatting stopped errors.txt cube-diagnostics-20210311-1824.zip

Link to comment

Hmm. OK, here's the extra wrench in the works: when the Drobo kicked out the 6TB, I immediately bought an 8TB "replacement" before realizing the 6TB was still under warranty. Then, since the Drobo was down to single-disk redundancy, I ran the 8TB through a week-long `badblocks` test (which it passed) and installed it in my empty unRaid box. It mounted fine, I made it a share (or whatever the terminology is for creating shares on drives), and began another week-long process: using `rsync` to copy all the data (just under 6TB) from the Drobo to the 8TB drive in unRaid.

 

That completed successfully, and for a while (5-10 days?) unRaid was working super-fast (relative to the slowpoke Drobo) on the network with just the 8TB drive. Then, in the last few days, while I was trying to figure out the issue with the 6TB drive, unRaid suddenly stopped recognizing the 8TB drive as well. Same exact reported issue: "Unmountable: No file system".

 

Does any of that shed any more light on what might be going on here? Is it possible that, instead of the drive being bad, the motherboard/controller is bad? But if so, why would it copy 6TB of data over without a hitch? And if the drives are bad, why does every test under the sun, except attempting to mount them in unRaid, say the drives are fine? 🤔

Link to comment
Phase 1 - find and verify superblock...
        - block cache size set to 703632 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 2058495 tail block 2058491
ALERT: The filesystem has valuable metadata changes in a log which is being
ignored because the -n option was used.  Expect spurious inconsistencies
which may be resolved by first mounting the filesystem to replay the log.
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 2
        - agno = 0
        - agno = 3
        - agno = 1
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

        XFS_REPAIR Summary    Thu Mar 11 20:28:53 2021

Phase		Start		End		Duration
Phase 1:	03/11 20:26:33	03/11 20:26:34	1 second
Phase 2:	03/11 20:26:34	03/11 20:26:34
Phase 3:	03/11 20:26:34	03/11 20:27:46	1 minute, 12 seconds
Phase 4:	03/11 20:27:46	03/11 20:27:46
Phase 5:	Skipped
Phase 6:	03/11 20:27:46	03/11 20:28:53	1 minute, 7 seconds
Phase 7:	03/11 20:28:53	03/11 20:28:53

Total run time: 2 minutes, 20 seconds

 

 

Does the above indicate it found issues that need to be repaired? The manual seemed to indicate there'd be a clearly stated option to use for re-running the check command, but I don't see anything that matches that description above.
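For reading a report like this one, here is a rough Python sketch. It only looks for three phrases that appear verbatim in the `xfs_repair` output posted in this thread. The function name and the dictionary keys are made up for illustration, and it is no substitute for reading the whole report.

```python
def summarize_xfs_repair(output: str) -> dict[str, bool]:
    """Flag a few notable phrases in xfs_repair output."""
    # Join lines so a phrase split across a line break is still matched.
    text = " ".join(line.strip() for line in output.splitlines())
    return {
        # Run with -n: nothing was written to the disk.
        "dry_run": "No modify flag set" in text,
        # The log holds metadata changes that have not been replayed.
        "dirty_log": "valuable metadata changes in a log" in text,
        # Run with -L: the log was zeroed instead of replayed.
        "log_zeroed": "destroyed because the -L option was used" in text,
    }
```

Applied to the `-n` report above, this would flag `dry_run` and `dirty_log`. The only item the report itself calls out appears to be the log alert, and the alert text suggests mounting the filesystem to replay the log. Per the `xfs_repair` man page, `-L` zeroes the log instead, which can lose those pending changes, so it is usually the fallback when the mount itself fails.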

Link to comment

Geez, now I know where all that Hollywood tech speak in movies comes from. O.O

Phase 1 - find and verify superblock...
        - block cache size set to 703632 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 2058495 tail block 2058491
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 3
        - agno = 2
        - agno = 1
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
Phase 5 - rebuild AG headers and trees...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Maximum metadata LSN (1:2058555) is ahead of log (1:2).
Format log to cycle 4.

        XFS_REPAIR Summary    Thu Mar 11 20:48:18 2021

Phase		Start		End		Duration
Phase 1:	03/11 20:45:54	03/11 20:45:54
Phase 2:	03/11 20:45:54	03/11 20:46:07	13 seconds
Phase 3:	03/11 20:46:07	03/11 20:47:03	56 seconds
Phase 4:	03/11 20:47:03	03/11 20:47:03
Phase 5:	03/11 20:47:03	03/11 20:47:03
Phase 6:	03/11 20:47:03	03/11 20:47:52	49 seconds
Phase 7:	03/11 20:47:52	03/11 20:47:52

Total run time: 1 minute, 58 seconds
done

 

Great, so... now what? Do I take the array out of maintenance mode and restart it normally? Currently, the Main screen still shows both disks as "Unmountable: No file system", although at least there's progress in that the 8TB's FS is now "xfs" instead of just "auto". ¯\_(ツ)_/¯

 

Also, if at any point in all this there's any indication whether the issue would be due to a failing drive vs failing MB vs random, please do let me know. :)

Link to comment

OK, so that does seem to have brought the 8TB back, although I'm still scratching my head as to how it got borked in the first place. Nevertheless, thank you!!

 

As for the original issue, the 6TB drive... as best as you can tell, that does seem to be either a cable or a drive issue, correct? And since I've already swapped in new cables....
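One common way to tell a cable problem from a drive problem is to look at a few SMART attributes: a rising `UDMA_CRC_Error_Count` usually points at the cable or connection, while non-zero `Reallocated_Sector_Ct` or `Current_Pending_Sector` points at the disk surface. The Python sketch below parses a `smartctl -A` style attribute table. The function names and the simple rule of thumb are illustrative assumptions, not a definitive diagnosis.

```python
def parse_smart_attributes(report: str) -> dict[str, int]:
    """Map attribute name -> raw value from a `smartctl -A` style table."""
    attrs: dict[str, int] = {}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the raw value is the 10th column.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attrs[fields[1]] = int(fields[9])
    return attrs


def likely_cause(attrs: dict[str, int]) -> str:
    """Very rough rule of thumb; real diagnosis needs the full picture."""
    media = attrs.get("Reallocated_Sector_Ct", 0) + attrs.get("Current_Pending_Sector", 0)
    if media > 0:
        return "drive media suspect"
    if attrs.get("UDMA_CRC_Error_Count", 0) > 0:
        return "cable or connection suspect"
    return "no hint from these attributes"
```

Note that the CRC counter is cumulative and does not reset when the cable is replaced, so what matters is whether it keeps increasing between two reports taken after the swap.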

Link to comment

BTW, when I run a check on the 6TB drive with `-n`, the result is this (canceled after a few minutes of whole-lotta-nothin):

 

Quote

Phase 1 - find and verify superblock...

bad primary superblock - filesystem mkfs-in-progress bit set !!!

attempting to find secondary superblock...

.......................................................................................................................................................................................

And the line with the dots just grows and grows.

Link to comment
25 minutes ago, Sandwich said:

No, from running the array in maintenance mode, clicking the disk, and running the check.

 

I have never seen that particular error message before, so I do not know what it means.

 

I think you are going to need to let it scan the disk to see if it can find a valid superblock (which can take hours on a large disk).

 

Maybe someone else will have a suggestion?

Link to comment

Bit of an update and further puzzlement: I've gotten the 6TB drive replaced, and the tech at the store assured me that the new drive was error-free. Great! Same thing he said about the original 6TB drive. 🙄

 

So I installed the replacement 6TB and, unlike the original one, unRAID was able to format it. Yay! Then, since the largest drive in the array needs to be the parity disk, I `rsync`ed everything from the 8TB (which held everything `rsync`ed from the Drobo 5N) to the replacement 6TB. That process took about half a day and completed successfully. I then created a new drive config (Tools -> New Config) and assigned the 8TB as the parity disk and the 6TB as data. It started building parity on the 8TB, which it said would take about a day, so I left it like that last night.

 

This morning I come back to see that the process paused partway through, and there's an error notification on the screen:

[Image: screenshot of the error notification]

 

If I click on the disk log, the modal window just loads screen after screen of log rows, with errors scattered everywhere.

 

At this point, I have no idea what to think.

 

Full diagnostic attached.

cube-diagnostics-20210318-0833.zip

Link to comment
