
Log Full 100% Every Day



I'm not sure what's going on with my server, but the log fills to 100% every single day.  I ran a command from another thread to expand the log to 300+ MB, and the server stays at 1% log usage most of the day, but it's back at 100% when I check it the following day.  Another command seemed to indicate Docker was responsible, but I don't know how to tell which Docker app is doing it; when I review the log it also looks like my cache drive may have some issue, but I've no idea.


Is there anything besides the attached diagnostics you need from me to help pinpoint the issue?  Also, how can I better debug these issues on my own in the future to avoid cluttering the forum with more issue threads?

unraid-diagnostics-20210102-1107.zip
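
A minimal sketch of the kind of first-pass checks involved, assuming the standard Unraid layout where /var/log is a small tmpfs:

df -h /var/log                           # how full the log filesystem is
du -ah /var/log | sort -h | tail -n 20   # largest files and directories under /var/log
tail -n 50 /var/log/syslog               # what is being written most recently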


I think it might be my cache drive, as the log mentioned something about XFS and the cache drive. When I put the array into maintenance mode and scan the disk, it tells me there are changes in the log that need to be replayed, but I can't seem to get them replayed by starting the array normally, stopping it, and putting it back into maintenance mode. Of course, maybe I'm way off, but maybe not.
 

Here’s a fresh log which might show this cache drive issue?

 


Log usage looks normal in those, but there are some things I notice.

 

For one thing, your flash drive is showing up as two devices, sda and sr0, and sr0 is logging some errors. How is the flash drive attached? You might try another port.
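
A quick way to pull those sr0 messages out for a closer look (a sketch; /var/log/syslog is the standard location on Unraid):

grep -i sr0 /var/log/syslog | tail -n 20   # most recent kernel messages mentioning sr0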

 

It also looks like you may have some corruption in the cache filesystem. See here:

https://wiki.unraid.net/Check_Disk_Filesystems#Checking_and_fixing_drives_in_the_webGui

 

And you have a Ryzen CPU. Don't know if you have seen this or not:

 


Thanks for checking the logs for me @trurl.

 

I have the flash drive connected to 1 of 7 ports on the back of the PC.  I believe some of the ports are USB 2.0 while others are USB 3.0 ... I say this because I've added a USB 3.0 controller to the flash drive config using vfio-pci.ids in order to pass that controller through to a Windows VM so I can hot swap USB devices on a USB hub.  The USB hub's devices are only detected by Windows when the hub is connected to specific ports in the back, so I assume some are USB 2.0.  I suspect the unRAID flash drive is connected to a USB 2.0 port; I'm not sure whether it would perform better on USB 3.0, and wouldn't it fail to boot anyway with the USB 3.0 controller being passed through?  I should also mention my Windows VM's USB Controller setting is set to 3.0 (qemu XHCI) and I select the USB 3.0 controller from PCI Devices ... since I'm passing through this controller and unRAID cannot see it, will this cause some issue and excessive logging?
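
For reference, binding the controller this way puts the vfio-pci.ids parameter on the kernel command line; a quick sketch of how to confirm it took effect (the vendor:device ID shown is a placeholder):

cat /proc/cmdline              # should include something like vfio-pci.ids=1b21:2142
lspci -nnk | grep -A3 -i usb   # the passed-through controller should list "Kernel driver in use: vfio-pci"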

 

FWIW - I do not have "Permit UEFI boot mode" checked within unRAID, I've set the BIOS "Boot from Storage Devices" to Legacy only, and I've chosen to boot the 3835MB entry rather than the 95MB entry from my unRAID flash drive.

 

I'm not overclocking anything and have my 2x 16GB memory running at 2133MHz with a voltage of 1.200V.  I'm not entirely sure how to read the page you linked as it relates to memory ranks.  I have a Ryzen Gen 2 2600X, with 2 of 4 memory slots filled ... should I be overclocking the memory higher than 2133MHz?

 

I do not see any settings for the PSU, but the ASUS BIOS has stuff hidden, so I'll google to see if I can find it.

 

I'm not sure why the flash drive would be mounted as sr0 ... I only see the one flash device (sda) on the main unRAID page.  I'll check out the corruption link and post my findings.

 

 


Here are the results of running Check File System Status on the cache with the -nv options.

 

Phase 1 - find and verify superblock...
        - block cache size set to 1536576 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 24696 tail block 22124
ALERT: The filesystem has valuable metadata changes in a log which is being
ignored because the -n option was used.  Expect spurious inconsistencies
which may be resolved by first mounting the filesystem to replay the log.
        - scan filesystem freespace and inode maps...
sb_icount 389504, counted 446016
sb_ifree 1827, counted 164
sb_fdblocks 67598087, counted 64496147
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
would have reset inode 270693270 nlinks from 0 to 1
No modify flag set, skipping filesystem flush and exiting.

        XFS_REPAIR Summary    Sun Jan  3 10:31:09 2021

Phase		Start		End		Duration
Phase 1:	01/03 10:31:06	01/03 10:31:06
Phase 2:	01/03 10:31:06	01/03 10:31:07	1 second
Phase 3:	01/03 10:31:07	01/03 10:31:08	1 second
Phase 4:	01/03 10:31:08	01/03 10:31:08
Phase 5:	Skipped
Phase 6:	01/03 10:31:08	01/03 10:31:09	1 second
Phase 7:	01/03 10:31:09	01/03 10:31:09

Total run time: 3 seconds

 


Not sure if this helps, but when running from the command line this is what I see, and the dots just keep going ... maybe it would move to Phase 2 eventually, but it didn't within 5 minutes ...

 

xfs_repair -nv /dev/sde
Phase 1 - find and verify superblock...
bad primary superblock - bad magic number !!!

attempting to find secondary superblock...
.found candidate secondary superblock...
unable to verify superblock, continuing...
.found candidate secondary superblock...
unable to verify superblock, continuing...
.found candidate secondary superblock...
unable to verify superblock, continuing...
........................................................................................................................................................................................................................................................

 

28 minutes ago, bugsysiegals said:

unRAID flash drive is connected to a USB 2.0 port, not sure if it would perform better with USB 3.0

USB2 is more reliable, and performance isn't a factor since Unraid OS is loaded fresh into RAM from the archives on flash at each boot and runs completely in RAM. Flash also holds the settings you make in the webUI; these are applied at boot and updated on flash when changes are made. So there is very little access to flash.

27 minutes ago, bugsysiegals said:

running from command line this is what I see

The reason it isn't working from the command line is that you need to run it on a partition, not a whole device. Doing it from the webUI takes care of that for you and is better because there are several ways to do it wrong from the command line. For example, when working with disks in the parity array, you must specify the md# device for the disk instead of the sdX1 partition, or you will invalidate parity.
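
In other words, the check comes down to pointing xfs_repair at the right block device; a sketch using the device names from this thread:

xfs_repair -nv /dev/sde    # wrong: the whole device starts with the partition table, hence the bad magic number and endless dots
xfs_repair -nv /dev/sde1   # cache drive: run it against the partition
xfs_repair -nv /dev/md1    # a parity-array disk: use its md device so parity stays valid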

 

35 minutes ago, bugsysiegals said:

results of running Check File System Status on the cache with options -nv

Those first results you posted look OK but of course it doesn't actually fix anything with the -n (nomodify) option specified. See the wiki link. 


This is what I get when I try to run the -v option.  How do I mount the filesystem to replay the log, then unmount it and try again with -v?  I started the array in regular mode, stopped it, and put it back into maintenance mode, but this doesn't seem to do anything.

 

Phase 1 - find and verify superblock...
        - block cache size set to 1536576 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 24696 tail block 22124
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair.  If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.

 


I also tried with the partition but received the same error I did from the GUI.  Unfortunately, I see nothing in the newest wiki regarding this specific error message...

 

root@unRAID:~# xfs_repair -v /dev/sde1
Phase 1 - find and verify superblock...
        - block cache size set to 1536576 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 24696 tail block 22124
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair.  If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.


From that wiki

Quote

If repairing a XFS formatted drive then it is quite normal for the xfs_repair process to give you a warning and saying you need to provide the -L option to proceed. Despite this ominous warning message this is virtually always the right thing to do and does not result in data loss.

 


I clicked "Disk Log Information" for the Cache drive and found "mount -t xfs -o noatime,nodiratime /dev/sde1 /mnt/cache" in the log details.  I tried this on the command line, but it gave an error about the mount point not existing, so I changed it to /mnt, which did exist, and it mounted the drive.  I then tried to run the repair, but the disk was mounted, so I unmounted it with "umount /dev/sde1".  Finally, I ran "xfs_repair -v /dev/sde1", which proceeded with what appeared to be a successful repair.
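
Put together, the sequence described above was roughly:

mount -t xfs -o noatime,nodiratime /dev/sde1 /mnt   # mounting replays the XFS journal
umount /dev/sde1                                    # the filesystem has to be unmounted again before repair
xfs_repair -v /dev/sde1                             # the repair then runs without the log warning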

 

I did it again from the GUI using the -v option and this is what I see ... seems good now?

 

Phase 1 - find and verify superblock...
        - block cache size set to 1536568 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 24700 tail block 24700
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...

        XFS_REPAIR Summary    Sun Jan  3 12:57:30 2021

Phase		Start		End		Duration
Phase 1:	01/03 12:57:27	01/03 12:57:27
Phase 2:	01/03 12:57:27	01/03 12:57:27
Phase 3:	01/03 12:57:27	01/03 12:57:29	2 seconds
Phase 4:	01/03 12:57:29	01/03 12:57:29
Phase 5:	01/03 12:57:29	01/03 12:57:29
Phase 6:	01/03 12:57:29	01/03 12:57:30	1 second
Phase 7:	01/03 12:57:30	01/03 12:57:30

Total run time: 3 seconds
done

 


FWIW - After rebooting, I see the following in the Disk Log Information for the Cache drive which had errors before ...

 

Jan 3 13:03:51 unRAID kernel: ata6: SATA max UDMA/133 abar m131072@0xfc680000 port 0xfc680380 irq 48
Jan 3 13:03:51 unRAID kernel: ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 3 13:03:51 unRAID kernel: ata6.00: supports DRM functions and may not be fully accessible
Jan 3 13:03:51 unRAID kernel: ata6.00: ATA-11: Samsung SSD 860 EVO 500GB, S3Z1NB0KC89212E, RVT02B6Q, max UDMA/133
Jan 3 13:03:51 unRAID kernel: ata6.00: 976773168 sectors, multi 1: LBA48 NCQ (depth 32), AA
Jan 3 13:03:51 unRAID kernel: ata6.00: supports DRM functions and may not be fully accessible
Jan 3 13:03:51 unRAID kernel: ata6.00: configured for UDMA/133
Jan 3 13:03:51 unRAID kernel: ata6.00: Enabling discard_zeroes_data
Jan 3 13:03:51 unRAID kernel: sd 6:0:0:0: [sde] 976773168 512-byte logical blocks: (500 GB/466 GiB)
Jan 3 13:03:51 unRAID kernel: sd 6:0:0:0: [sde] Write Protect is off
Jan 3 13:03:51 unRAID kernel: sd 6:0:0:0: [sde] Mode Sense: 00 3a 00 00
Jan 3 13:03:51 unRAID kernel: sd 6:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jan 3 13:03:51 unRAID kernel: ata6.00: Enabling discard_zeroes_data
Jan 3 13:03:51 unRAID kernel: sde: sde1
Jan 3 13:03:51 unRAID kernel: ata6.00: Enabling discard_zeroes_data
Jan 3 13:03:51 unRAID kernel: sd 6:0:0:0: [sde] Attached SCSI disk
Jan 3 13:03:58 unRAID emhttpd: Samsung_SSD_860_EVO_500GB_S3Z1NB0KC89212E (sde) 512 976773168
Jan 3 13:03:58 unRAID emhttpd: import 30 cache device: (sde) Samsung_SSD_860_EVO_500GB_S3Z1NB0KC89212E
Jan 3 13:04:03 unRAID emhttpd: shcmd (38): mount -t xfs -o noatime,nodiratime /dev/sde1 /mnt/cache
Jan 3 13:04:03 unRAID kernel: XFS (sde1): Mounting V5 Filesystem
Jan 3 13:04:04 unRAID kernel: XFS (sde1): Ending clean mount

 

11 minutes ago, bugsysiegals said:

I clicked "Disk Log Information" for the Cache drive and found "mount -t xfs -o noatime,nodiratime /dev/sde1 /mnt/cache" in the log details.  I tried this on the command line, but it gave an error about the mount point not existing, so I changed it to /mnt, which did exist, and it mounted the drive.  I then tried to run the repair, but the disk was mounted, so I unmounted it with "umount /dev/sde1".  Finally, I ran "xfs_repair -v /dev/sde1", which proceeded with what appeared to be a successful repair.

Don't know where you got all that additional complication.

 

When the array is started in normal mode, the disks are mounted. When the array is started in maintenance mode, no disks are mounted. All you needed to do was start in maintenance mode and use the -L option.
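
For reference, a sketch of that simpler route using the device names in this thread (with the same caveat about -L as in the wiki quote above):

# with the array started in maintenance mode, nothing is mounted
xfs_repair -L -v /dev/sde1   # cache: zero the log and repair the partition directly
xfs_repair -L -v /dev/md1    # an array disk would use its md device instead, to keep parity valid

From the webGUI the equivalent would simply be adding -L to the check options while in maintenance mode.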

 

Looks OK though. Post new diagnostics.

 

 


I see my USB flash drive is showing up as sda and sr0, as you mentioned.  Would this be because it sees the two things I mentioned earlier, which are 3835MB and 95MB?  I'm not even sure why it sees two different things; I assumed one was maybe for UEFI, but maybe the drive has been formatted incorrectly?

Linux 4.19.107-Unraid.
Last login: Sun Jan  3 18:00:45 -0600 2021 on /dev/pts/1.
root@unRAID:~# lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop0    7:0    0  22.7M  1 loop /lib/modules
loop1    7:1    0   7.1M  1 loop /lib/firmware
loop2    7:2    0    25G  0 loop /var/lib/docker
loop3    7:3    0     1G  0 loop /etc/libvirt
sda      8:0    1   3.8G  0 disk 
└─sda1   8:1    1   3.8G  0 part /boot
sdb      8:16   0   7.3T  0 disk 
└─sdb1   8:17   0   7.3T  0 part 
sdc      8:32   0   7.3T  0 disk 
└─sdc1   8:33   0   7.3T  0 part 
sdd      8:48   0   3.7T  0 disk 
├─sdd1   8:49   0    16M  0 part 
└─sdd2   8:50   0   3.7T  0 part 
sde      8:64   0 465.8G  0 disk 
└─sde1   8:65   0 465.8G  0 part /mnt/cache
sdf      8:80   0 232.9G  0 disk 
├─sdf1   8:81   0   499M  0 part 
├─sdf2   8:82   0   100M  0 part 
├─sdf3   8:83   0    16M  0 part 
└─sdf4   8:84   0 232.3G  0 part 
sdg      8:96   0   7.3T  0 disk 
└─sdg1   8:97   0   7.3T  0 part 
md1      9:1    0   7.3T  0 md   /mnt/disk1
md2      9:2    0   7.3T  0 md   /mnt/disk2
sr0     11:0    1    96M  0 rom  
root@unRAID:~# 

It seems I'm correct ... should the unRAID flash drive be presenting this extra 96M bootable device?
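
One way to confirm whether sda and sr0 are two faces of the same stick is to ask lsblk for the transport, vendor and model columns as well (a sketch; these are standard lsblk output fields):

lsblk -o NAME,SIZE,TRAN,VENDOR,MODEL,MOUNTPOINT   # if sr0 shows the same USB transport/vendor/model as sda, it is the flash drive's second interface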

 

