
Log Spam filling /var/log to 100%: Cannot stat file /proc/10916/fd/xxxxxx: Stale file handle



My logs are being spammed with messages about a "Stale file handle" to the point that /var/log is 100% filled and I have to restart my server. Here's an excerpt:

Jun 16 04:40:16 Tower root: Cannot stat file /proc/10916/fd/278312: Stale file handle
Jun 16 04:40:16 Tower root: Cannot stat file /proc/10916/fd/278314: Stale file handle
Jun 16 04:40:16 Tower root: Cannot stat file /proc/10916/fd/278316: Stale file handle
Jun 16 04:40:16 Tower root: Cannot stat file /proc/10916/fd/278318: Stale file handle
Jun 16 04:40:16 Tower root: Cannot stat file /proc/10916/fd/278321: Stale file handle
Jun 16 04:40:16 Tower root: Cannot stat file /proc/10916/fd/278322: Stale file handle
Jun 16 04:40:16 Tower root: Cannot stat file /proc/10916/fd/278324: Stale file handle
Jun 16 04:40:16 Tower root: Cannot stat file /proc/10916/fd/278326: Stale file handle
Jun 16 04:40:16 Tower root: Cannot stat file /proc/10916/fd/278329: Stale file handle
Jun 16 04:40:16 Tower root: Cannot stat file /proc/10916/fd/278330: Stale file handle
Jun 16 04:40:16 Tower root: Cannot stat file /proc/10916/fd/278331: Stale file handle
Jun 16 04:40:16 Tower root: Cannot stat file /proc/10916/fd/278335: Stale file handle
Jun 16 04:40:16 Tower root: Cannot stat file /proc/10916/fd/278337: Stale file handle
Jun 16 04:40:16 Tower root: Cannot stat file /proc/10916/fd/278338: Stale file handle
Jun 16 04:40:16 Tower root: Cannot stat file /proc/10916/fd/278339: Stale file handle
Jun 16 04:40:16 Tower root: Cannot stat file /proc/10916/fd/278342: Stale file handle
Jun 16 04:40:16 Tower root: Cannot stat file /proc/10916/fd/278799: Stale file handle
Jun 16 04:40:16 Tower root: Cannot stat file /proc/10916/fd/278800: Stale file handle
Jun 16 04:40:16 Tower root: Cannot stat file /proc/10916/fd/278802: Stale file handle
Jun 16 04:40:16 Tower root: Cannot stat file /proc/10916/fd/280171: Stale file handle
Jun 16 04:40:16 Tower root: Cannot stat file /proc/10916/fd/280172: Stale file handle

How can I fix this and/or stop it from filling up my logs?
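Every line in the excerpt points at the same `/proc/<pid>/fd` directory, so tallying the messages by PID is one way to find which process to look at. The following is a minimal Python sketch under that assumption, not an Unraid tool; the function name `count_stale_handles` is hypothetical, and the regular expression only matches the message format shown in the excerpt above.

```python
import re
from collections import Counter

# Matches the message format shown in the excerpt above.
STALE_RE = re.compile(r"Cannot stat file /proc/(\d+)/fd/\d+: Stale file handle")


def count_stale_handles(lines):
    """Return a Counter mapping PID -> number of stale-handle messages."""
    counts = Counter()
    for line in lines:
        match = STALE_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Two lines copied from the excerpt, used as sample input.
sample = [
    "Jun 16 04:40:16 Tower root: Cannot stat file /proc/10916/fd/278312: Stale file handle",
    "Jun 16 04:40:16 Tower root: Cannot stat file /proc/10916/fd/278314: Stale file handle",
]
for pid, n in count_stale_handles(sample).most_common(5):
    print(f"PID {pid}: {n} messages")  # prints: PID 10916: 2 messages
```

To run it against the live log, pass an open file object instead of `sample` (for example the syslog file under /var/log, assuming that is where these messages land on your system).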

Link to comment
18 hours ago, JorgeB said:

Please post the diagnostics.

Trying, but every time I try to download the diagnostics, the Chrome tab eventually crashes with an out-of-memory error. I'm trying now in Edge, but it's been about 20 minutes so far with memory usage slowly increasing. It seems to be running sed on a bunch of files on my cache drive:
[Screenshot attached]

If it ever finishes I'll go ahead and upload the diagnostics.

Link to comment

You have multiple things going on:

Jun 16 19:45:29 Tower rc.docker: pihole: Error response from daemon: network br0 not found
Jun 16 19:45:29 Tower rc.docker: Error: failed to start containers: pihole

 

Jun 16 19:46:15 Tower root: Minecraft-polytech-direwolf20-1.19: Could not download icon https://vignette.wikia.nocookie.net/minecraft/images/7/7b/Grass_block.png
Jun 16 19:46:15 Tower root: urBackup-Server: Could not download icon http://192.168.1.90:55414/images/urbackup.png
Jun 16 19:46:24 Tower  smbd[17971]: [2023/06/16 19:46:24.485175,  0] ../../source3/smbd/close.c:1397(close_directory)
Jun 16 19:46:24 Tower  smbd[17971]:   close_directory: Could not get share mode lock for polytech-direwolf20-1.19
Jun 16 19:46:24 Tower  smbd[17971]: [2023/06/16 19:46:24.485224,  0] ../../source3/smbd/fd_handle.c:39(fd_handle_destructor)

 

 

Jun 16 19:51:00 Tower root: Fix Common Problems Version 2023.04.26
Jun 16 19:51:00 Tower root: Fix Common Problems: Error: eth0 does not have a valid IP address

 

You have networking issues you need to straighten out.  I'd also get all your Docker Containers updated and turn off mover logging.

 

Link to comment
3 hours ago, dlandon said:

You have networking issues you need to straighten out.

Working on it already:

 

3 hours ago, dlandon said:

I'd also get all your Docker Containers updated and turn off mover logging.

I'll update the containers and try turning off mover logging. Are you saying the stale file handle messages are from the mover?

5 hours ago, JorgeB said:

Nothing obvious that I can see, are you using Time Machine with that share? Also is NFS enabled for any shares?

I have Time Machine enabled for that share in anticipation of potentially using it, but nothing is actually using the share for Time Machine yet. NFS is disabled for all shares.

Link to comment
1 minute ago, sazrocks said:

I'll update the containers and try turning off mover logging. Are you saying the stale file handle messages are from the mover?

Not necessarily, the mover is very chatty and makes the log hard to read though.  That is not helping your log filling issue.

 

That being said, the stale file handle messages seem to appear when the mover is running.

 

Once you get your networking sorted out, post the diagnostics when the stale file handle messages appear again.

 

I would also try running your system without any Docker Containers running to see if that's contributing to the stale file handle messages.  I saw some log messages about the minecraft Docker Container that didn't look right.

Link to comment
16 minutes ago, dlandon said:

Not necessarily, the mover is very chatty and makes the log hard to read though.  That is not helping your log filling issue.

 

That being said, the stale file handle messages seem to appear when the mover is running.

 

Once you get your networking sorted out, post the diagnostics when the stale file handle messages appear again.

Makes sense. I’ll monitor the log now that I’ve disabled mover logging and will post diagnostics if/when it happens again.
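For the monitoring itself, a small script can report how full the log filesystem is before it reaches 100%. This is a minimal sketch using only the Python standard library; the names `percent_used` and `check` and the 80% threshold are assumptions chosen for illustration, not Unraid settings.

```python
import shutil


def percent_used(path="/var/log"):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def check(path="/var/log", threshold=80.0):
    """Return a warning string when usage reaches `threshold`, else None."""
    pct = percent_used(path)
    if pct >= threshold:
        return f"{path} is {pct:.0f}% full"
    return None
```

Calling `check()` periodically (for example from a scheduled script) and acting on a non-None result gives some warning before the log fills, which may make it easier to capture diagnostics while the messages are still appearing.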

17 minutes ago, dlandon said:

I would also try running your system without any Docker Containers running to see if that's contributing to the stale file handle messages.  I saw some log messages about the minecraft Docker Container that didn't look right.

Unfortunately, I run a number of services, not just for myself, and I try to minimize their downtime, so turning off all Docker containers is something I'd like to save for later if possible.

 

You say you saw some log messages from a Minecraft Docker container that didn't look right. Is this the message you're talking about?

Quote

close_directory: Could not get share mode lock for polytech-direwolf20-1.19

This is one of my currently running containers; I'm not sure exactly what the message means. I haven't seen any strange behavior from the container.

Link to comment
  • 2 weeks later...

Here is what I see:

Jun 23 04:40:02 Tower root: Fix Common Problems Version 2023.04.26
Jun 23 04:40:03 Tower root: Fix Common Problems: Warning: Unraid OS not up to date
Jun 23 04:40:03 Tower root: Fix Common Problems: Warning: Docker Application thespaghettidetective_redis_1 has an update available for it
Jun 23 04:40:03 Tower root: Fix Common Problems: Warning: Share Secure-storage is set for both included (disk4) and excluded (disk1,disk2,disk3) disks
Jun 23 04:40:03 Tower root: Fix Common Problems: Warning: Share VMs is set for both included (disk3) and excluded (disk1,disk2) disks
Jun 23 04:40:03 Tower root: Fix Common Problems: Warning: Deprecated plugin ca.backup2.plg
Jun 23 04:40:05 Tower root: Fix Common Problems: Warning: Docker application Dolphin has moderator comments listed
Jun 23 04:40:05 Tower root: Fix Common Problems: Warning: Docker application PlexMediaServer has moderator comments listed
Jun 23 04:40:05 Tower root: Fix Common Problems: Warning: Docker application steamcache-DNS has moderator comments listed ** Ignored
Jun 23 04:40:06 Tower root: Fix Common Problems: Warning: NerdPack.plg Not Compatible with Unraid version 6.11.5
Jun 23 04:40:07 Tower root: minecraft-James-revelation: Could not download icon https://vignette.wikia.nocookie.net/minecraft/images/7/7b/Grass_block.png
Jun 23 04:40:07 Tower root: Minecraft-polytech-direwolf20-1.19: Could not download icon https://vignette.wikia.nocookie.net/minecraft/images/7/7b/Grass_block.png
Jun 23 04:40:08 Tower root: urBackup-Server: Could not download icon http://192.168.1.90:55414/images/urbackup.png
Jun 23 04:40:10 Tower root: minecraft-James-revelation: Could not download icon https://vignette.wikia.nocookie.net/minecraft/images/7/7b/Grass_block.png
Jun 23 04:40:11 Tower root: Minecraft-polytech-direwolf20-1.19: Could not download icon https://vignette.wikia.nocookie.net/minecraft/images/7/7b/Grass_block.png
Jun 23 04:40:11 Tower root: urBackup-Server: Could not download icon http://192.168.1.90:55414/images/urbackup.png
Jun 23 04:40:11 Tower root: minecraft-James-revelation: Could not download icon https://vignette.wikia.nocookie.net/minecraft/images/7/7b/Grass_block.png
Jun 23 04:40:11 Tower root: Minecraft-polytech-direwolf20-1.19: Could not download icon https://vignette.wikia.nocookie.net/minecraft/images/7/7b/Grass_block.png
Jun 23 04:40:12 Tower root: urBackup-Server: Could not download icon http://192.168.1.90:55414/images/urbackup.png
Jun 23 04:40:12 Tower root: Fix Common Problems: Warning: Write Cache is disabled on disk5
Jun 23 04:40:13 Tower root: Fix Common Problems: Other Warning: Unassigned Devices Plus not installed
Jun 23 04:40:14 Tower root: Fix Common Problems: Warning: Docker Update Patch not installed
Jun 23 04:47:31 Tower  apcupsd[7934]: Communications with UPS lost.

 

Here are my suggestions:

  • Go through all the FCP issues and get them sorted out.  Be sure to remove the NerdPack plugin as it is not compatible with 6.11.  You have some disk mapping issues.
  • Find out why your UPS keeps losing communications.
  • Run Unraid without any Docker containers running and see if the log messages and filling of the log stops.
Link to comment
2 hours ago, dlandon said:

Here are my suggestions:

  • Go through all the FCP issues and get them sorted out. Be sure to remove the NerdPack plugin as it is not compatible with 6.11.  You have some disk mapping issues.

Done, except for migrating to 6.12. I see way too many issues with that here in these forums and I can't afford to deal with that right now.

Quote

Find out why your UPS keeps losing communications.

It's not connected. I went ahead and turned off the daemon.

Quote

Run Unraid without any Docker containers running and see if the log messages and filling of the log stops.

This is not really an option. It takes the better part of 10 days for the issue to manifest after a reboot, and the primary purpose of this server is to run a number of services using docker. I might as well just have the server turned off for 9 days. Is there any way at all to investigate if docker is the problem without rendering my server useless for a week?

I also went ahead and rebooted so that at least my VM can run. Will post here again if the issue recurs.
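One possible way to investigate without stopping every container: when the messages recur, look up the PID they name (10916 in the original excerpt) and check whether that process belongs to a Docker container. The sketch below assumes the process's cgroup path contains the 64-character container ID, which is common but depends on the cgroup layout; the function names are hypothetical.

```python
import re
from pathlib import Path

# Docker container IDs are 64 hex characters; they often appear in the
# cgroup path of processes running inside a container (assumption: this
# depends on the host's cgroup layout).
CONTAINER_ID_RE = re.compile(r"[0-9a-f]{64}")


def container_id_from_cgroup(cgroup_text):
    """Return the first 64-hex container ID found in the text, or None."""
    match = CONTAINER_ID_RE.search(cgroup_text)
    return match.group(0) if match else None


def describe_pid(pid):
    """Return (command line, container ID or None) for a running PID."""
    proc = Path("/proc") / str(pid)
    raw = (proc / "cmdline").read_bytes()
    # Arguments in /proc/<pid>/cmdline are separated by NUL bytes.
    cmdline = raw.replace(b"\0", b" ").decode(errors="replace").strip()
    container = container_id_from_cgroup((proc / "cgroup").read_text())
    return cmdline, container
```

If `describe_pid` returns a container ID, its first 12 characters can be compared with the IDs listed by `docker ps` to identify the container. If it returns None, the process is probably running on the host rather than in a container, which would itself be useful to report here.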

Link to comment
