guyonphone Posted May 20, 2020 (edited)
Hello, I've been having on-and-off issues with XFS corruption on my server's cache drive. I repair the corruption with xfs_repair -v, and the issue comes back a week later. I just replaced a drive and am now seeing some strange issues I don't know how to solve, but this time it isn't XFS corruption on the cache drive: it's reporting a problem with md19, which is the drive I'm trying to replace. Please see my attached diagnostics.
Previous post of issues: unraid-diagnostics-20200520-1051.zip
I decided it would be a good idea to run Memtest. However, when I choose it from the Unraid boot menu, it immediately reboots my server, so I am unable to load Memtest.
Edited May 24, 2020 by guyonphone: adding more info
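(For reference, the repair described above is roughly the following; the device names are examples, not taken from the diagnostics. On Unraid, array disks are repaired through their /dev/mdX device with the array started in Maintenance mode so the filesystem is unmounted, while an XFS cache/pool device is repaired via its partition.)

    # Array data disk (Maintenance mode) - the mdX number matches the disk slot,
    # and using the md device keeps parity in sync with the repair
    xfs_repair -v /dev/md19
    # XFS cache or other pool device - repair the partition directly, e.g.:
    xfs_repair -v /dev/sdb1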
JorgeB Posted May 20, 2020
18 minutes ago, guyonphone said: However, when I choose it from the Unraid boot menu, it immediately reboots my server, so I am unable to load Memtest.
Memtest won't work with UEFI boot, only legacy BIOS/CSM.
guyonphone (Author) Posted May 20, 2020 (edited)
I have launched Memtest on the machine from a USB drive with just Memtest86+ on it. I'll report back on the memory test results. If anyone is kind enough to interpret my logs and spot what might be wrong, I would truly appreciate it.
Edited May 20, 2020 by guyonphone: adding more info
JorgeB Posted May 20, 2020
First rule out bad RAM, since that could be the reason for multiple filesystem corruptions. If no errors are found, I would try reformatting that disk to create a new filesystem, but you need to back it up first.
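(A minimal sketch of that backup step, assuming the data on the affected disk still mounts and another disk has enough free space; the paths are examples.)

    # Copy the disk's contents somewhere safe before reformatting
    rsync -avh --progress /mnt/disk19/ /mnt/disk5/disk19-backup/
    # Quick sanity check that the copy matches the source
    diff -rq /mnt/disk19/ /mnt/disk5/disk19-backup/ | head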
guyonphone (Author) Posted May 20, 2020 (edited)
My array is in a degraded state right now because I was upgrading a drive from 6TB to 14TB; this error occurred during the rebuild. I will do as you suggest. I appreciate the help.
Edited May 20, 2020 by guyonphone
guyonphone (Author) Posted May 21, 2020 (edited)
RAM tested good. My array is rebuilding right now; what steps should I take next?
Edited May 21, 2020 by guyonphone
JorgeB Posted May 21, 2020
As mentioned, if the same filesystem keeps getting corrupted, I would create a new one.
guyonphone (Author) Posted May 22, 2020 (edited)
Hi JorgeB, I'm NOT having corruption over and over on the same drive; it has been a different drive every time. First it was my docker image on my cache drive, then the cache drive itself running XFS, and now it says md19 has the issue, which is the disk currently being emulated and rebuilt as we speak. I am seeing the following in my logs:
May 21 17:35:52 Unraid emhttpd: error: get_fs_sizes, 6412: Input/output error (5): statfs: /mnt/user/lost+found
I have noticed I now have data missing from my array. You helped another user who appeared to have exactly these same issues. I am going to put the old disk back in and perform a new config. I have attached my current diagnostics; if you would look them over I would appreciate it, as I don't know how to determine what is happening.
unraid-diagnostics-20200521-1739.zip
Edited May 22, 2020 by guyonphone
trurl Posted May 22, 2020
My guess is that overfilling your disks is what is corrupting things. Stop writing to your server until you get more capacity, and stop caching so much. Most of your shares have no Minimum Free set.
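(One quick way to see how full each data disk and the cache actually are is something like the following from the Unraid console; the paths assume the standard /mnt/diskN layout and a pool named cache.)

    df -h /mnt/disk* /mnt/cache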
guyonphone (Author) Posted May 22, 2020 (edited)
Hello Constructor, what is the correct amount of free capacity to keep so I can safely write to my array? Thanks.
Edited May 22, 2020 by guyonphone
trurl Posted May 22, 2020
Hello Advanced Member 😉
The usual advice is to set Minimum Free for each user share to larger than the largest file you expect to write to that share. Unraid has no way of knowing in advance how large a file will become when it chooses a disk for it. If a disk has less free space than Minimum Free, another disk is chosen. Note that this doesn't mean the space below the minimum will never be used. Here are some examples:
A share has a 10G minimum and a disk has 11G free, so the disk can be chosen. You write a 9G file to it. The disk now has 2G free, which is less than the minimum, so it won't be chosen again.
A share has a 10G minimum and a disk has 11G free, so the disk can be chosen. You write a 12G file. The write fails when the disk runs out of space.
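(A rough illustration of the rule described above, written as a shell check; the 10G value and the disk path are hypothetical, not Unraid's actual allocator code.)

    # A disk is only a candidate for new files while its free space is above Minimum Free
    minimum_free_kb=$((10 * 1024 * 1024))               # 10G minimum, in KiB
    free_kb=$(df --output=avail /mnt/disk19 | tail -1)  # free space on this disk, in KiB
    if [ "$free_kb" -lt "$minimum_free_kb" ]; then
        echo "below Minimum Free: Unraid would pick another disk for new files"
    fi

The point of the two examples above is that the check happens when the file is created, not while it grows, which is why a file larger than the remaining space can still fail mid-write.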
guyonphone (Author) Posted May 22, 2020 (edited)
Ha ha! I've been around on the forums for a while, though not necessarily advanced in the Unraid technical sense. Right now I have a Minimum Free setting of 15GB, usually 20GB. What can I say, I trim it down when I'm running out of space to squeeze as much out of the drives as I can; drives are expensive. That said, 20GB or even 50GB free on a 14TB drive is still going to show as 100% full once the drive is filled to that level. Currently the disk with the least free space has 6GB.
Edited May 22, 2020 by guyonphone
trurl Posted May 22, 2020
10 hours ago, trurl said: Most of your shares have no Minimum Free set.
9 hours ago, guyonphone said: I have a Minimum Free setting of 15GB.
I do see one share with that setting. There is no way to know from the diagnostics whether that is the only share you typically write to.
guyonphone (Author) Posted May 22, 2020 (edited)
OK, my problem has come back, so let me try to explain things as clearly as I can.
Originally: some issue corrupted the filesystem on drive md19. I hadn't fully realized this, and I pulled the drive to replace it with a larger one. After that, the array showed that I had lost a bunch of files. I stopped the rebuild, pulled the new drive (14TB), put the old drive (6TB) back in, did a new config, and started the rebuild again. All the files that had gone missing on that drive suddenly returned.
Now: the array is rebuilding using the original 6TB drive that was in md19. I checked my system and I see that I have lost a bunch of files again. It looks like this is the culprit:
May 22 06:39:46 Unraid kernel: XFS (md19): xfs_do_force_shutdown(0x8) called from line 439 of file fs/xfs/libxfs/xfs_defer.c. Return address = 00000000e39b5244
May 22 06:39:46 Unraid kernel: XFS (md19): Corruption of in-memory data detected. Shutting down filesystem
May 22 06:39:46 Unraid kernel: XFS (md19): Please umount the filesystem and rectify the problem(s)
My array is currently rebuilding parity. What is the best way to continue without losing data?
1. Should I let the array finish rebuilding and then try an XFS repair? Currently md19 shows as empty, with no files, but the array sees it as full of data.
2. Should I stop the rebuild and try to run an XFS repair on the emulated disk?
3. Should I stop the rebuild, pull the drive, and see if I can copy data off it with some third-party tool?
I know JorgeB said I should back up the data and format the drive, and I want to do that, but the rebuild is complicating things for me.
Other questions: Why is "corruption of in-memory data" happening? It looks like the answer is simply XFS corruption, not bad RAM. Will the XFS corruption cause issues with the array rebuild? My guess is no, since parity works on raw values.
unraid-diagnostics-20200522-1305.zip
Edited May 22, 2020 by guyonphone
JorgeB Posted May 23, 2020
11 hours ago, guyonphone said: 1. Should I let the array finish rebuilding and then try an XFS repair?
This. Also, the CPU is overheating; you need to check/clean the cooler.
guyonphone (Author) Posted May 24, 2020
Just want to give an update. After putting the 6TB back in and rebuilding parity, I started the array in Maintenance mode and ran xfs_repair -n. I received a notice that the drive had valuable journal entries that needed to be replayed and that I should try mounting the drive first. I tried, and the drive would not mount, so I ran xfs_repair -L and zeroed the log. I was then able to mount the drive, and it doesn't look like I suffered any (or much) corruption. Once the array was healthy, I replaced the 6TB drive with the 14TB drive, and it is rebuilding now.
As for my CPU temps: I took out the old air compressor with a moisture trap and hit the fins with 120 PSI. This cleaned the heatsink, though honestly it wasn't that dusty (no buildup between the fan and the fins). I think my heatsink just isn't up to the task of properly cooling this CPU (it's a low-profile Noctua heatsink/fan), so I will look into getting a beefier one.
Thanks for your help JorgeB and trurl!
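(For anyone following along, the repair sequence described in this update is roughly the following, run with the array started in Maintenance mode; md19 is the example device from this thread.)

    xfs_repair -nv /dev/md19   # dry run: report problems, change nothing
    # If it reports a dirty log, try a normal mount first so the log can replay;
    # only if mounting fails, zero the log and repair for real:
    xfs_repair -L /dev/md19

Zeroing the log with -L discards whatever metadata changes were still sitting in the journal, so it is a last resort after a mount attempt fails.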