Tybio Posted December 21, 2018

All, I replaced my old E3 with an E-2176G today on a new SM board, and ran into some issues with an LSI card. The ROM would load and I could see the drives in Linux, but Unraid would get stuck mounting the only array drive on the LSI board. After some effort I got it to shut down, and moved the LSI card to another slot. Now everything /seems/ to boot properly, but the disk is in the "Error Disabled" state. Before I "Start" the array, I wanted to run the diags by the group here. I'd appreciate any input! I've got over 50TB of data on this array and am getting a bit nervous :).

tower-diagnostics-20181221-1333.zip
Tybio Posted December 21, 2018

The SMART report looks OK. I wonder if I should just try to start the array; it isn't asking to format or anything.
JonathanM Posted December 21, 2018

I haven't looked at your diagnostics, but your description leads me to believe you will need to rebuild the dropped drive; otherwise you will be vulnerable to another disk failure causing data loss. A screenshot of the main GUI page might clear some things up.
Tybio Posted December 21, 2018

Here you go! I'm not sure what will happen when I start; it doesn't give the normal "Start Array and rebuild disk" message, just the normal "Start".
JonathanM Posted December 21, 2018

Hmm. The message to format an unmountable drive wouldn't show up until you start the array, so you don't yet know whether there will be any issues. Maybe someone else will have a different opinion, but if it were me I'd start in Maintenance mode and do a file system check on all the drive slots before either rebuilding the drive in place or discarding parity and using the kicked drive as is. If the file system check comes up clean on all 8 data slots, then rebuilding slot 8 onto the same drive is probably the correct option.
Tybio Posted December 21, 2018

What's the best way to do an FS check on an XFS filesystem?
itimpi Posted December 21, 2018

Just now, Tybio said: "What's the best way to do an FS check on an XFS filesystem?"

If you start the array in Maintenance mode, you can click on each diskX entry; one of the options in the resulting dialog is to run a file system check.
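If you'd rather run JonathanM's all-slots check from the console instead of the GUI, here is a minimal sketch, assuming eight data disks exposed as /dev/md1 through /dev/md8 (the device naming matches this array; adjust the count and names to yours) and the array started in Maintenance mode:

    # Read-only XFS check (-n = no modify) on each data slot.
    # Nothing is written; any corruption is only reported.
    for i in $(seq 1 8); do
        echo "=== disk$i ==="
        xfs_repair -n /dev/md$i
    done

Because the -n pass writes nothing, it's safe to run on every slot before deciding between a rebuild and discarding parity.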
Tybio Posted December 21, 2018

I'm not seeing that option; it doesn't even know the FS on the drive for some reason.
Tybio Posted December 21, 2018

In reading the docs, I'm starting to worry. It should know what file system the disks use, and there should be a file system check section... but there isn't. Tell me I didn't lose 50+TB of data, please?
JorgeB Posted December 22, 2018

Unassign disk8 and start the array; if the emulated disk mounts and the data looks correct, rebuild on top.
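Once the array is started with disk8 unassigned, a quick way to sanity-check the emulated disk before committing to the rebuild is a couple of console commands; a sketch, assuming the standard /mnt/diskX mount points:

    # The missing slot is emulated from parity and mounts at its usual path.
    df -h /mnt/disk8      # did the emulated disk mount, with the expected usage?
    ls -la /mnt/disk8     # spot-check the top-level folders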
Tybio Posted December 22, 2018

OK, I ran xfs_repair from the command line as the docs said and didn't see anything obviously wrong. This is the output for the "Disabled" disk:

    root@Tower:/boot/config# xfs_repair -nv /dev/md8
    Phase 1 - find and verify superblock...
            - block cache size set to 1460776 entries
    Phase 2 - using internal log
            - zero log...
    zero_log: head block 1389819 tail block 1389819
            - scan filesystem freespace and inode maps...
            - found root inode chunk
    Phase 3 - for each AG...
            - scan (but don't clear) agi unlinked lists...
            - process known inodes and perform inode discovery...
            - agno = 0
            - agno = 1
            - agno = 2
            - agno = 3
            - agno = 4
            - agno = 5
            - agno = 6
            - agno = 7
            - agno = 8
            - agno = 9
            - process newly discovered inodes...
    Phase 4 - check for duplicate blocks...
            - setting up duplicate extent list...
            - check for inodes claiming duplicate blocks...
            - agno = 2
            - agno = 5
            - agno = 0
            - agno = 3
            - agno = 4
            - agno = 7
            - agno = 6
            - agno = 8
            - agno = 1
            - agno = 9
    No modify flag set, skipping phase 5
    Phase 6 - check inode connectivity...
            - traversing filesystem ...
            - agno = 0
            - agno = 1
            - agno = 2
            - agno = 3
            - agno = 4
            - agno = 5
            - agno = 6
            - agno = 7
            - agno = 8
            - agno = 9
            - traversal finished ...
            - moving disconnected inodes to lost+found ...
    Phase 7 - verify link counts...
    No modify flag set, skipping filesystem flush and exiting.

            XFS_REPAIR Summary    Fri Dec 21 16:09:54 2018

    Phase           Start           End             Duration
    Phase 1:        12/21 16:09:49  12/21 16:09:49
    Phase 2:        12/21 16:09:49  12/21 16:09:49
    Phase 3:        12/21 16:09:49  12/21 16:09:52  3 seconds
    Phase 4:        12/21 16:09:52  12/21 16:09:52
    Phase 5:        Skipped
    Phase 6:        12/21 16:09:52  12/21 16:09:54  2 seconds
    Phase 7:        12/21 16:09:54  12/21 16:09:54

    Total run time: 5 seconds
    root@Tower:/boot/config#

Advice on next steps?
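(For completeness, since the -nv run above is read-only and changed nothing: if a check like this had reported corruption, the repair pass would be the same command without -n. A sketch, same /dev/md8 device, array still in Maintenance mode:)

    # Actually repair the filesystem -- this one writes. Only run it after
    # reviewing the read-only -n pass, and with parity/backups in good shape.
    xfs_repair -v /dev/md8
    # If xfs_repair refuses because of a dirty log, mounting and cleanly
    # unmounting the disk replays the log; zeroing it with -L is a last
    # resort that can lose the most recent metadata changes.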
Tybio Posted December 22, 2018

3 minutes ago, johnnie.black said: "Unassign disk8 and start the array; if the emulated disk mounts and the data looks correct, rebuild on top."

Er, if it doesn't know the filesystem to mount with, wouldn't it just format the drive?
Tybio Posted December 22, 2018

12 minutes ago, johnnie.black said: "Unassign disk8 and start the array; if the emulated disk mounts and the data looks correct, rebuild on top."

Johnnie, sorry for being annoying here, but what will happen if I hit Start and for some reason it can't mount the disks? Will it just fail, or try to initialize them?
Tybio Posted December 22, 2018

OK, it loaded up fine and is showing the files on the bad disk. Going to re-add it now and start the rebuild. Thanks for the help!
trurl Posted December 22, 2018

40 minutes ago, Tybio said: "Tell me I didn't lose 50+TB of data, please?"

Just for future reference: it's almost impossible to do this without a massive, smoking hardware failure, or a theft of your server. Unlike RAID systems, each disk in Unraid (not RAID) is independent. Even if several of them did truly die, you could still get any files that are on the good ones.
Tybio Posted December 22, 2018

Rebuild finished, and all seems well! Thanks for the help.