Why is my Syslog getting full???



My syslog keeps filling up and locking up the server. It usually sits at about 1%, then all of a sudden it fills up. I think it might coincide with a parity check running. I have a 12TB parity drive, and every time this has happened I noticed a parity check had finished within the previous day. I set up a "Syslog server" after the last time so I wouldn't lose the file (normally the server freezes and when I reboot the syslog is erased).

 

Well, I'm trying to upload the file but it is 13.5GB, and when I try to attach it here I keep getting an error -200 and it won't let me attach it... I'm guessing because of the size. Any ideas on what to do?

 

Thanks!!!


If your log is getting full, there is most certainly some stuff spamming the log.

Either errors or mover logging.

 

You should identify it and, if it is errors, attach an extract of the log that includes them. The larger the better, ideally starting where the errors begin.

 

Did you try to zip the syslog? If the same error pops up again and again, you might get a pretty good compression ratio and be able to attach the file.
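
One way to identify what is spamming the log, sketched here as a small shell function: strip the timestamp and hostname from each line, then count how often each remaining message repeats. The function name `top_messages` and the default of 20 results are placeholders chosen for illustration, and it assumes the usual `Mon DD HH:MM:SS host message` syslog layout.

```shell
#!/bin/bash
# Hypothetical helper: print the most frequently repeated messages in a
# syslog-style file. Usage: top_messages FILE [COUNT]
top_messages() {
    local file="$1" count="${2:-20}"
    # Blank out month, day, time and hostname (fields 1-4) so identical
    # messages logged at different times collapse into one entry.
    awk '{ $1=$2=$3=$4=""; sub(/^ +/, ""); print }' "$file" \
        | sort | uniq -c | sort -rn | head -n "$count"
}
```

For example, `top_messages /var/log/syslog 10` on the server, or point it at the copy saved by the remote syslog server. Sorting a multi-gigabyte file can take a while and needs temporary space, so it may be worth running it on an extract first.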

Replying to ChatNoir (2/2/2021 at 12:39 AM):

I'm sorry for the delay, but I haven't been able to get back to the forum until now. I zipped the file and it is 112 MB, but it still won't upload. It gives me an error -200, which I guess is for being too big, but I don't really know. Without compression the file is 13.4 GB, so getting down to 112 MB seems pretty good. Any other ideas? Should I try a specific zip tool that might make it smaller?

 

With the file being 13.4 GB I can't even open it in Notepad, because it says the file is too large. Is there some other app that can open a log file this large?

 

Also, I do have my mover scheduled to run often (hourly). However, my log sits at 1% all the time, then in one day, right after a parity check, it maxes out at 100%. I have my parity check running monthly and this always seems to happen right after the parity check finishes.


If your syslog is growing out of control but your server hasn't yet crashed, you could type the following at the command line to display the last, say, 20 lines on the screen:

 

tail -n 20 /var/log/syslog

 

Or, to extract the last 500 lines to a separate file on your flash device:

 

tail -n 500 /var/log/syslog > /boot/syslog-tail
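
If the server has already crashed and you only have a saved copy of the log, a similar approach can cut out the part where the errors start. This is a rough sketch rather than a tested recipe for your system: `extract_from_first_match` is a made-up helper name, the pattern and output path are whatever you choose, and it assumes GNU `grep`, `tail`, `head` and `gzip` are available.

```shell
#!/bin/bash
# Hypothetical helper: save LINES lines starting at the first line that
# matches PATTERN, then compress the result.
# Usage: extract_from_first_match FILE PATTERN OUT [LINES]
extract_from_first_match() {
    local file="$1" pattern="$2" out="$3" lines="${4:-500}"
    local start
    # grep -n prefixes each match with its line number; -m 1 stops at the first.
    start=$(grep -n -m 1 -e "$pattern" "$file" | cut -d: -f1)
    if [ -z "$start" ]; then
        echo "pattern not found in $file" >&2
        return 1
    fi
    # tail -n +N starts output at line N; head caps the size of the extract.
    tail -n "+$start" "$file" | head -n "$lines" > "$out"
    # gzip replaces OUT with OUT.gz.
    gzip -f "$out"
}
```

Something like `extract_from_first_match /path/to/saved/syslog 'rsyslogd: omfwd' /boot/syslog-extract 2000` (both paths are placeholders) would leave a small `/boot/syslog-extract.gz` that should be much easier to attach than the full file.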

 

Replying to John_M (2/6/2021 at 9:29 PM):

Thank you, I'll probably do that. It typically isn't growing out of control, but for some reason about once a month, all of a sudden, it fills up very fast and the server crashes. It must happen in a matter of hours and I've never been able to catch it in progress; I just see the server crash. I also have a parity check run once a month and it always seems to happen in conjunction with that.

On 2/8/2021 at 9:19 AM, trurl said:

You should post diagnostics before this happens just to give us a baseline of how your system is configured with it working normally.

 

Here is what it looks like right now with the server running well. I updated my containers today and everything is running great. The log is showing 1% on the Dashboard, which is what it normally shows except when it goes nuts every so often and maxes out.

 

Do you see anything in here that looks unusual?

tower-diagnostics-20210215-2145.zip


You have errors in your log:

 

Feb  9 07:46:56 Tower kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: 0000:00:03.0
Feb  9 07:46:56 Tower kernel: pcieport 0000:00:03.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Feb  9 07:46:56 Tower kernel: pcieport 0000:00:03.0:   device [8086:2f08] error status/mask=00000080/00002000
Feb  9 07:46:56 Tower kernel: pcieport 0000:00:03.0:    [ 7] BadDLLP 

but the frequency does not look like it should fill the log.

 

I also see this type of error:

Feb  4 19:13:45 Tower kernel: igb 0000:0c:00.0 eth1: igb: eth1 NIC Link is Down
Feb  4 19:13:45 Tower kernel: e1000e: eth0 NIC Link is Down
Feb  4 19:13:45 Tower kernel: bond0: link status definitely down for interface eth0, disabling it
Feb  4 19:13:45 Tower kernel: bond0: link status definitely down for interface eth1, disabling it
Feb  4 19:13:45 Tower kernel: bond0: now running without any active interface!
Feb  4 19:13:45 Tower kernel: br0: port 1(bond0) entered disabled state
Feb  4 19:13:46 Tower dhcpcd[2809]: br0: carrier lost
Feb  4 19:13:46 Tower dhcpcd[2809]: br0: deleting route to 192.168.1.0/24
Feb  4 19:13:46 Tower dhcpcd[2809]: br0: deleting default route via 192.168.1.1
Feb  4 19:13:46 Tower rsyslogd: omfwd/udp: socket 6: sendto() error: Network is unreachable [v8.1908.0 try https://www.rsyslog.com/e/2354 ]
Feb  4 19:13:46 Tower rsyslogd: omfwd: socket 6: error 101 sending via udp: Network is unreachable [v8.1908.0 try https://www.rsyslog.com/e/2354 ]
Feb  4 19:13:46 Tower rsyslogd: action 'action-2-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.1908.0 try https://www.rsyslog.com/e/2007 ]
Feb  4 19:13:46 Tower rsyslogd: action 'action-2-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.1908.0 try https://www.rsyslog.com/e/2359 ]
Feb  4 19:13:46 Tower rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.1908.0 try https://www.rsyslog.com/e/2354 ]
Feb  4 19:13:46 Tower rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.1908.0 try https://www.rsyslog.com/e/2354 ]
Feb  4 19:13:46 Tower rsyslogd: action 'action-2-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.1908.0 try https://www.rsyslog.com/e/2007 ]
Feb  4 19:13:46 Tower rsyslogd: action 'action-2-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.1908.0 try https://www.rsyslog.com/e/2359 ]
Feb  4 19:13:46 Tower rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.1908.0 try https://www.rsyslog.com/e/2354 ]
Feb  4 19:13:46 Tower rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.1908.0 try https://www.rsyslog.com/e/2354 ]
Feb  4 19:13:46 Tower rsyslogd: action 'action-2-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.1908.0 try https://www.rsyslog.com/e/2007 ]

 

I guess that if the network stayed down long enough, it could spam the log to death?

 

You should look at those issues.
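
To check whether the log volume really spikes around a network outage or a parity check, it can help to count log lines per hour. A minimal sketch, assuming the standard `Mon DD HH:MM:SS` timestamp at the start of each line; `lines_per_hour` is just an illustrative name, and it reads the file in a single pass so it should cope with a very large log.

```shell
#!/bin/bash
# Hypothetical helper: show the ten busiest hours in a syslog-style file.
# Usage: lines_per_hour FILE
lines_per_hour() {
    # Build a key of "Mon DD HH" from the timestamp and count lines per key.
    awk '{ split($3, t, ":"); n[$1 " " $2 " " t[1]]++ }
         END { for (k in n) print n[k], k }' "$1" \
        | sort -rn | head -n 10
}
```

Each output line is a count followed by the month, day and hour. If one hour holds almost all of the lines, an extract from the start of that hour is probably the most useful thing to post, and `grep -c 'omfwd' FILE` would show how many of the lines are the rsyslog forwarding errors.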

Replying to ChatNoir:

 

Regarding the errors on Feb 4th, was it just on Feb 4th or is it an ongoing error? We were having some internet issues not too long ago and I rebooted the router. If Feb 4th was an isolated incident, it could be from that, right?

 

I don't have any idea regarding the pcieport error on Feb 9th. I only have one PCIe device, which is a video card. Does this mean something odd is going on with the video card?

 

I'm sorry, I'm not much of a computer person, so I'm not positive what these are telling me.
