
Drive went from 100% full to 'Unmountable: No file system' after a clean shutdown and reboot


nfriedly


Hi folks.

 

First up: I have backups of my data, so I haven't actually lost anything. I just want to know what happened and how to avoid it in the future.

 

I have a drive in my array, disk 3, that I had been copying data to with a command-line rsync. I didn't realize I was trying to send a little more data than would fit on the drive, so it ended up 100% full and the rsync failed to complete. No big deal, just wasted time; I figured I'd clean it up later. Not sure if that's relevant, but it might be.
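
For what it's worth, the rsync was run straight from the command line; something along these lines, with placeholder paths rather than the real ones:

rsync -avh /mnt/user/some_share/ /mnt/disk3/backup/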

 

I shut down my system and added more RAM a couple of days ago. It was a clean shutdown via the web UI, and I waited for the system to completely power down before swapping out the RAM. I booted it back up, but then had to leave, so I didn't really inspect it that closely. I'm not sure if it was reporting the drive as full or not formatted at that point.

 

This morning I logged in to look at things more closely and found the drive reporting as not formatted. I tried a reboot (via the web UI), but that didn't change anything. I dumped the diagnostics, but I'm fairly new to Unraid, so I'm not even sure what I should be looking for. As far as I can tell, it's not acting like any errors happened; it's just acting as if I had never formatted the drive in the first place.

 

The SMART reallocated sector count has a raw value of 8, but it's been that way for a while, so I'm not sure if that's related.
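
In case it's useful, that number came from the drive's SMART report; assuming the disk shows up as /dev/sdX, something like this pulls up the attribute:

smartctl -A /dev/sdX | grep -i reallocated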

unraid-diagnostics-20181220-1253.zip


Ok, here's the output:

 

Phase 1 - find and verify superblock...
couldn't verify primary superblock - not enough secondary superblocks with matching geometry !!!

attempting to find secondary superblock...
.......................................found candidate secondary superblock...
verified secondary superblock...
would write modified primary superblock
Primary superblock would have been modified.
Cannot proceed further in no_modify mode.
Exiting now.
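
For anyone following along: that was the read-only check. Assuming disk 3 maps to /dev/md3 and the array was started in maintenance mode, the command would have been something like:

xfs_repair -n /dev/md3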

 


Ok, without the -n, it gave me this:
 

Phase 1 - find and verify superblock...
couldn't verify primary superblock - not enough secondary superblocks with matching geometry !!!

attempting to find secondary superblock...
.......................................found candidate secondary superblock...
verified secondary superblock...
writing modified primary superblock
sb realtime bitmap inode 18446744073709551615 (NULLFSINO) inconsistent with calculated value 129
resetting superblock realtime bitmap ino pointer to 129
sb realtime summary inode 18446744073709551615 (NULLFSINO) inconsistent with calculated value 130
resetting superblock realtime summary ino pointer to 130
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
sb_icount 0, counted 3651520
sb_ifree 0, counted 176
sb_fdblocks 732208915, counted 957649
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 1
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Note - stripe unit (0) and width (0) were copied from a backup superblock.
Please reset with mount -o sunit=,swidth= if necessary
done

So, I think that means it fixed something. What's the best next move? Just restart the array without maintenance mode? Or do I need to worry about parity being wrong?

 

Also, does anyone have any idea how the drive got into the broken state? That's my bigger concern.

Thank you for the help! (And patience!)

25 minutes ago, nfriedly said:

Just restart the array without maintenance mode?

Start in normal mode. If you followed the wiki's instructions, parity was updated during xfs_repair, so there's nothing extra to worry about there.
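
Once it's started, it's also worth checking whether xfs_repair moved anything into lost+found on that disk, e.g. (assuming it mounts at /mnt/disk3):

ls -la /mnt/disk3/lost+found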

 

26 minutes ago, nfriedly said:

Also, does anyone have any idea how the drive got into the broken state? That's my bigger concern.

You should always avoid completely filling up a disk, but for more details on what went wrong you'd need to ask on the xfs mailing list; it's an xfs issue, not an Unraid issue.
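
A quick sanity check before a big copy is to compare the size of the source against the free space on the destination disk, e.g. (placeholder paths):

du -sh /mnt/user/some_share
df -h /mnt/disk3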

 

