[SOLVED] Corruption of in-memory data detected. Shutting down filesystem



Hello,

I have had some on-and-off issues recently on my server with XFS corruption on my cache drive. I repair the corruption with xfs_repair -v on the drive, and then the issue comes back a week later. I just replaced a drive and now I am having some strange issues I don't know how to solve, but they are not XFS corruption on the cache drive. It's showing a problem with md19, which is a drive I'm trying to replace. Please see my attached diagnostic logs.

 

Diagnostics from my previous post about these issues:

unraid-diagnostics-20200520-1051.zip

 

I decided it would be a good idea to run memtest. However, when I choose that option from the Unraid boot menu, it immediately reboots my server. I am unable to load Memtest.

Edited by guyonphone
adding more info
Link to comment

I have launched Memtest on the machine from a USB drive with just memtest86+ on it. I'll report back on the memory test status. If anyone is kind enough to interpret my logs and see what might be wrong, I would truly appreciate it :)

 

Edited by guyonphone
adding more info
Link to comment

Hi Tee-Tee Jorge,

 

I'm NOT having corruption over and over on the same drive; it's been a different drive every time. First it was my docker image on my cache drive. Then it was the cache drive itself, running XFS. And now it says md19 is having this issue, which is a drive that is emulated and in the process of being rebuilt as we speak.

 

I am seeing the following in my logs:

 

May 21 17:35:52 Unraid emhttpd: error: get_fs_sizes, 6412: Input/output error (5): statfs: /mnt/user/lost+found

 

I have noticed I now have missing data from my array.

 

You helped another user who appears to have had the exact same issues I'm having.

 

I am going to put the old disk back in and perform a new config. I have attached my current logs; if you would look them over I would appreciate it, as I don't know how to determine on my own what is happening.

unraid-diagnostics-20200521-1739.zip

Edited by guyonphone
Link to comment

Hello Advanced Member 😉

 

The usual advice is to set the Minimum Free for each user share to larger than the largest file you expect to write to that share.

 

Unraid has no way to know in advance how large a file will become when it chooses a disk for the file. If a disk has less free space than the Minimum, it will choose another disk.

 

Note that this doesn't mean the remaining space will never be used. Here are some examples:

 

A share has a 10G Minimum and a disk has 11G free. The disk can be chosen. You write a 9G file to the disk. It now has 2G free, which is less than the Minimum, so the disk won't be chosen again.

 

A share has a 10G Minimum and a disk has 11G free. The disk can be chosen. You write a 12G file. The write fails when the disk runs out of space.
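
If it helps to see the comparison spelled out, here is a rough sketch of the check involved. This is only an illustration of the rule, not how Unraid implements it internally, and the 10G value and /mnt/disk1 path are just examples:

# illustration only: is this disk still eligible for new writes to the share?
MIN_FREE_KB=$((10 * 1024 * 1024))                       # 10G Minimum Free, in KiB
DISK_FREE_KB=$(df --output=avail /mnt/disk1 | tail -1)  # current free space on one data disk, in KiB
if [ "$DISK_FREE_KB" -ge "$MIN_FREE_KB" ]; then
    echo "disk has at least Minimum Free, so it can still be chosen"
else
    echo "disk is below Minimum Free, so another disk will be chosen"
fi

The point being that the check happens before the file is written, which is why a file larger than the remaining free space can still fail mid-write.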

Link to comment

Ha ha! I've been around on the forums for a while, though not necessarily advanced in the Unraid technical sense. Right now I have a Minimum Free setting of 15GB, usually 20GB. What can I say, I trim it down when I'm running out of space to squeeze as much out of things as I can; drives are expensive. However, 20GB or even 50GB in relation to a 14TB drive still shows as essentially 100% full once the drive has been filled to that level. Currently the disk I have with the least amount of free space is at 6GB.
 

 

Edited by guyonphone
Link to comment
10 hours ago, trurl said:

Most of your shares have no Minimum Free

9 hours ago, guyonphone said:

I have a minimum free setting of 15GB

I do see one share with that setting. There is no way to know from the diagnostics whether that is the only share you typically write to.

 

Link to comment

OK, my problem has come back, so let me try to explain things as clearly as possible so this issue can be solved.

 

Originally:

I had some issue that corrupted the filesystem on drive md19. I hadn't fully realized this, and I pulled the drive to replace it with a larger one. After this my array showed that I had lost a bunch of files. I stopped the rebuild, pulled the new drive (14TB), put the old drive back in (6TB), did a new config, and started the rebuild again. All the files that had gone missing from that drive suddenly returned.

 

Now:

My drive is rebuilding using the original 6TB drive that was in the md19 slot. I checked my system and I now see that I have lost a bunch of files again. It looks like this is the culprit:

 

May 22 06:39:46 Unraid kernel: XFS (md19): xfs_do_force_shutdown(0x8) called from line 439 of file fs/xfs/libxfs/xfs_defer.c.  Return address = 00000000e39b5244
May 22 06:39:46 Unraid kernel: XFS (md19): Corruption of in-memory data detected.  Shutting down filesystem
May 22 06:39:46 Unraid kernel: XFS (md19): Please umount the filesystem and rectify the problem(s)

My array is currently rebuilding parity. What is the best way to move forward without losing my data?

 

1. Should I allow the array the time to finish rebuilding and then try an XFS repair? Currently, if I look at md19 it shows as empty, no files, but the array sees that it is full of data.

 

or

 

2. Should I stop the array rebuild and try to run an XFS repair on an emulated disk?

 

or

 

3. Should I stop the rebuild, pull the drive, and see if I can copy the data off it using some sort of third-party tool?

 

I know Tee-Tee Jorge said I should back up the data and format the drive, and I want to follow that advice, but what's complicating things for me is the array rebuild.

 

 

Other Questions:

 

Why is "Corruption of in-memory data" happening? It looks like the answer to this is just XFS corruption on disk, not bad RAM.

 

Will the XFS corruption cause issues with the array rebuild? My guess is no, since parity is calculated from the raw sector values?

 

unraid-diagnostics-20200522-1305.zip

Edited by guyonphone
Link to comment

Just want to give an update.

 

After putting the 6TB back in and rebuilding parity, I started the array in Maintenance mode and ran an 'xfs_repair -n'. I received a notice that the drive had a valuable metadata log that needed to be replayed and that I should try to mount the drive first. I tried, and the drive would not mount, so I ran an 'xfs_repair -L' and zeroed the log. I was then able to mount the drive, and it doesn't look like I suffered any/much corruption.
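
For anyone who finds this later, this is roughly the sequence I ran, from memory, with the array started in Maintenance mode. The /dev/md19 device path is the one for my disk 19 slot, so adjust for whichever disk you are working on:

xfs_repair -n /dev/md19    # read-only check first; this is what complained about the dirty log and told me to mount the disk so it could be replayed
# mounting failed, so as a last resort:
xfs_repair -L /dev/md19    # zeroes (discards) the log and then repairs; only do this when the disk genuinely won't mount

Zeroing the log can lose whatever changes were still sitting in it, which is why the mount attempt comes first.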

 

Once my array was healthy, I replaced the 6TB drive with the 14TB drive, and it's rebuilding now.

 

As for my CPU temps: I took out the ol' air compressor with a moisture trap and hit the fins with 120 PSI. This cleaned the heatsink, although honestly it wasn't that dusty (no buildup between the fan and the fins). I think my heatsink just isn't really up to the task of properly cooling this CPU (it's a low-profile Noctua heatsink/fan), so I will look into getting a beefier one.

 

Thanks for your help Johnnie Black, trurl!

Link to comment
