A drive went unmountable at midnight


Solved by itimpi

Hi guys,

 

We have an Unraid server running Unraid OS Starter.

Last night it restarted all by itself. After that, one drive (disk 2) was unmountable and all of its shares were gone. What is weird is that the array is now running a Data-Rebuild on another drive we put in yesterday, and disk 2 still says Unmountable.

 

Details about the drive:

 

Disk Model: ST12000NM0127
Capacity: 12TB
Device: sdb
File system: XFS
Status: Unmountable: Unsupported or no file system

Attached as Picture1 is a picture of the array.

 

Thanks guys

Picture1.png

Link to comment
Quote

Attach Diagnostics to your NEXT post in this thread. 

My server diagnostics have been added @trurl

 

Quote

Standard handling  of disks going 'unmountable' is covered here in the online documentation accessible via the Manual link at the bottom of the Unraid GUI.

@itimpi

The weird thing is that the icon next to the drive is green and says "Normal Operation, device is active, click to spin down device".

I can't stop the array for 6 hours as I am rebuilding another drive (disk 3). The other weird thing is that disk 2 is being read from while it is unmountable, and it looks like that data is being written to disk 3 (I know, for the data rebuild). But how is the rebuild using disk 2 when it says it can't mount it?

image.thumb.png.89deea594be6561e54d4309908c190fd.png

fjanahi.server-diagnostics-20240630-2145.zip

Edited by Rashoodi
mentioning/@ing people
Link to comment

A disk going unmountable does not mean there is anything wrong with the disk itself, just that something appears to be corrupt at the file system level and is preventing it from being mounted.

Disk2 is being read because, during a rebuild, sectors from every drive that has not been disabled (i.e. marked with a red ‘x’) are used in conjunction with the parity drive(s) to reconstruct the sector content for the drive being rebuilt. The rebuild has no concept of a file system (and thus does not care that a drive is unmountable) as it works at the raw sector level.
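To make the raw sector point concrete, here is a minimal sketch of a single-parity rebuild. It assumes plain XOR parity across the data disks and uses made-up 4-byte "sectors"; the function names are hypothetical and this is not Unraid's actual code. It only illustrates why the bytes on disk 2 are still usable even though the file system on it will not mount.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def compute_parity(data_sectors):
    """Parity sector = XOR of the same sector on every data disk."""
    return xor_blocks(data_sectors)


def rebuild_sector(surviving_sectors, parity_sector):
    """Recover the missing disk's sector from all other disks plus parity."""
    return xor_blocks(list(surviving_sectors) + [parity_sector])


# Made-up sector contents. Disk 2's bytes are just bytes: whether they form
# a mountable file system is irrelevant to the XOR arithmetic.
disk1 = b"\x10\x20\x30\x40"
disk2 = b"\xde\xad\xbe\xef"
disk3 = b"\x01\x02\x03\x04"  # the disk being rebuilt

parity = compute_parity([disk1, disk2, disk3])
rebuilt_disk3 = rebuild_sector([disk1, disk2], parity)
print(rebuilt_disk3 == disk3)
```

The same arithmetic also shows the flip side: any corruption already present in the emulated disk's file system is rebuilt faithfully, so a rebuild does not repair an unmountable file system.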

Link to comment

Good morning to you all 🙃,

 

The server has been restarting every 10 - 12 hours and it is not letting me finish my Data-Rebuild.

I am thinking of stopping the rebuild if it restarts one more time. I don't know if the restarting is a hardware issue or a software issue.

I think the restarting is what broke Disk 2.

 

Quote

Model: Custom
M/B: ASRock B450M/ac R2.0
BIOS: American Megatrends Inc. Version P3.10 Dated 10/27/2022
CPU: AMD Ryzen 7 5700G with Radeon Graphics @ 3800 MHz
HVM: Enabled
IOMMU: Disabled
Cache: L1 - Cache: 512 KiB, L2 - Cache: 4 MiB, L3 - Cache: 16 MiB
Memory: 32 GiB DDR4 (max. installable capacity 128 GiB)
Network: bond0: fault-tolerance (active-backup), mtu 1500
Kernel: Linux 6.1.79-Unraid x86_64
OpenSSL: 1.1.1v
Uptime: 3 hours, 40 minutes

 

Link to comment

Two questions:

I had disk 3 rebuilding, right? Now it just became unmountable after a restart :(

And how can I solve this?

 

1 hour ago, JorgeB said:

That is almost always a hardware problem, or bad power.

I will check this when I return home

Link to comment

1 minute ago, Rashoodi said:

I had disk 3 rebuilding, right? Now it just became unmountable after a restart :(

And how can I solve this?


You would normally use the process in the link given earlier for handling unmountable drives.

You might also want to post your diagnostics zip file to see if that shows anything of interest.

Link to comment

Attached is the output of xfs_repair:

 

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair.  If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.

 

Link to comment

You can start by running memtest, but since memtest is only definitive if it finds errors, and because you have multiple sticks, you can also try running the server with just one stick. If the problem is the same, try with a different one. That will basically rule out bad RAM.

Link to comment
  • Solution
37 minutes ago, Rashoodi said:

attached output of  xfs repair

 

 

You now need to run with -L and without -n.
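As a rough sketch of that decision, the snippet below checks a captured xfs_repair output for the log-replay error and builds the next command line accordingly. The helper names are hypothetical, and the device path /dev/md2p1 is an assumption taken from the md2p1 name in the kernel messages posted in this thread; substitute whatever device your own check ran against. The command is only printed, never executed.

```python
import shlex


def needs_log_zeroing(repair_output):
    """True if xfs_repair refused to continue because of a dirty log."""
    # Collapse line wraps so the phrase matches however the text was wrapped.
    normalized = " ".join(repair_output.split())
    return "valuable metadata changes in a log" in normalized


def build_repair_command(device, zero_log, verbose=True):
    """Build an xfs_repair argument list; -n (check only) is never added."""
    command = ["xfs_repair"]
    if verbose:
        command.append("-v")
    if zero_log:
        command.append("-L")  # destroys the log, as the error text warns
    command.append(device)
    return command


captured_output = """Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair."""

# Assumed device path, for illustration only.
command = build_repair_command("/dev/md2p1", needs_log_zeroing(captured_output))
print(shlex.join(command))
```

Since the drive here cannot be mounted to replay the log, -L is the remaining option, with the caveat from the error text that destroying the log may lose the most recent metadata changes.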

Link to comment

Update.

 

I am running xfs_repair with -v in the terminal. It will take some time as my drive is 12TB, and it could take until midnight.

I have removed some problematic RAM sticks and will see if that was my problem. If not, I need to change the entire kit.

I have some errors showing on my Unraid display (plugged into the GPU):

 

XFS (md2p1): Corruption detected. Unmount and run xfs_repair
XFS (md2p1): Corruption of in-memory data
XFS (md2p1): Metadata I/O Error (0x1) detected at xfs_inactive_ifree.isra.0+0x123/0x174 [xfs] (fs/xfs/xfs_inode.c:1612). Shutt
XFS (md2p1): Please unmount the filesystem and rectify the problem(s)
XFS (md2p1): Failed to recover leftover CoW staging extents, err -5.
XFS (md3p1): Corruption detected. Unmount and run xfs_repair
XFS (md3p1): Metadata I/O Error (0x1) detected at xfs_inactive_ifree.isra.0+0x123/0x174 [xfs] (fs/xfs/xfs_inode.c:1612). Shutt
XFS (md3p1): Please unmount the filesystem and rectify the problem(s)
XFS (md3p1): Failed to recover leftover CoW staging extents, err -5.

Keep in mind I have 2 drives that are unmountable: Disk 2 (sdb) and Disk 3 (sdd, being emulated).
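For anyone sorting through a longer syslog, here is a small sketch that groups messages like the ones above by array disk. It assumes that a device named mdNp1 corresponds to array disk N, which matches md2p1/md3p1 and Disk 2/Disk 3 in this thread but is not guaranteed elsewhere. The function name is hypothetical, and the pattern deliberately tolerates a clipped prefix such as "FS" in place of "XFS".

```python
import re

# Matches "XFS (md2p1): message", and also clipped prefixes like "FS (md2p1):".
XFS_LINE = re.compile(r"FS \(md(\d+)p\d+\):\s*(.*)")


def messages_by_disk(log_text):
    """Group XFS kernel messages by the disk number in the md device name."""
    grouped = {}
    for line in log_text.splitlines():
        match = XFS_LINE.search(line)
        if match:
            disk_number = int(match.group(1))
            grouped.setdefault(disk_number, []).append(match.group(2).strip())
    return grouped


sample_log = """FS (md2p1): Corruption detected. Unmount and run xfs_repair
XFS (md3p1): Corruption detected. Unmount and run xfs_repair
XFS (md3p1): Please unmount the filesystem and rectify the problem(s)
some unrelated kernel line"""

for disk, messages in sorted(messages_by_disk(sample_log).items()):
    print(f"disk {disk}: {len(messages)} message(s)")
```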

Link to comment
1 hour ago, Rashoodi said:

problematic RAM sticks

You should never attempt to run any computer unless memory is working perfectly. Everything goes through RAM. The OS and other executable code, your data. Everything. The CPU can't do anything with anything until it is loaded into RAM.

 

If you have been running with bad RAM that is probably the reason you have corrupted your filesystems.

Link to comment

I have Good News 🙌🙌🙌

 

  1. My 2 drives are back alive thanks to @itimpi
  2. I have solved my RAM issue by removing sticks 2 and 4
  3. It is now stable

I have some bad news 😩😩😩

  1. bbergle-jellyfin is not working anymore, but I don't mind and will reinstall it

And that's all. Thanks to everyone who helped me, and I hope you have a good day/night ahead.

Link to comment
