Persistent Syslog or "How to save syslog in a crash"



I've seen some discussion in old forum posts about creating a persistent RAM disk.  Has anyone seen an app or script that can save the syslog when the server crashes?  In the past 6 to 8 months, I've had my server crash hard, requiring a power-on restart.  I'd love to see if the log can point anything out.  I converted from 5.x to 6.x with my SAS2LP controller, and I've been hearing that controller is no good anymore, so I'm considering a change.  I have two servers with that controller on unRaid 6.x.  One has continued to work without error.  The other, not so well...

 

It would be nice to have something like this available in unRaid...

Cisco IOS has a feature that writes the syslog to flash:

https://www.cisco.com/c/en/us/td/docs/ios/12_0s/feature/guide/cs_sysls.html

 

Appreciate any advice/tips/techniques to make the syslog persistent so I can see if there is anything diagnostic in it...

Edited by jeffreywhunter

Use the User Scripts plugin.

 

Add this as a script to run at first array start only in the background:


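#!/bin/bash

# Keep the copies on the flash drive (/boot) so they survive a crash or reboot
mkdir -p /boot/logs
# One file per boot, named after the epoch time at which the script started
FILENAME="/boot/logs/syslog-$(date +%s)"
# Copy the last 10 existing lines, then every new line, until the server goes down
tail -f /var/log/syslog > "$FILENAME"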

This creates a new file syslog-xxxxx (xxxxx being the number of seconds since the Unix epoch) at every boot and keeps writing to it until a reboot or crash.
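After a crash, the file with the largest epoch suffix is the one that was being written when the server went down. A minimal sketch for finding it, assuming the /boot/logs directory and syslog-<epoch> naming from the script above; list_boot_logs and latest_boot_log are made-up helper names, not part of unRAID or the User Scripts plugin:

```shell
# List captured logs as "<epoch> <path>", oldest first.
# The directory defaults to the one used by the capture script.
list_boot_logs() {
    local dir="${1:-/boot/logs}" f epoch
    for f in "$dir"/syslog-*; do
        [ -e "$f" ] || continue              # nothing captured yet
        epoch="${f##*/syslog-}"              # keep only the suffix
        case "$epoch" in
            ''|*[!0-9]*) continue ;;         # skip non-numeric suffixes
        esac
        printf '%s %s\n' "$epoch" "$f"
    done | sort -n
}

# Print the path of the newest capture (the boot that ended in the crash).
latest_boot_log() {
    list_boot_logs "$1" | tail -n 1 | cut -d' ' -f2-
}
```

For example, tail -n 100 "$(latest_boot_log)" shows the last lines written before the crash, and date -d @xxxxx (GNU date) turns the suffix back into a readable boot time.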

 

Alternatively, use Fix Common Problems in troubleshooting mode, but it's not designed to run for long periods of time, and it logs a ton of extra info that could be considered spam.

 

1 hour ago, jeffreywhunter said:

I have two servers with that controller on unRaid 6.x.  One has continued to work without error.  The other, not so well...

Can you post the diagnostics from both servers (labelled as to which is good and which is not so good)?  I've been working on a theory regarding Marvell controllers.

 

 

On 8/7/2017 at 0:08 PM, Squid said:

Can you post the diagnostics from both servers (labelled as to which is good and which is not so good)?  I've been working on a theory regarding Marvell controllers.

 

Here ya go.  I laid out all the disk arrays and what controllers they are attached to in the HunterNAS Disk Arrays.xlsx spreadsheet.  Let me know if you need anything further...  Hope it helps!  Happy to run any diagnostics or even do a screen share to see the internals of the servers.

 

Hunternas - The system with the lockup problem.  UnRaid 6.3.5.  Diagnostics File Attached: hunternas-diagnostics-20170810-1342.zip

M/B: ASUSTeK COMPUTER INC. - P8Z77-V LK
CPU: Intel® Core™ i5-2500K CPU @ 3.30GHz
HVM: Disabled
IOMMU: Disabled
Cache: 256 kB, 1024 kB, 6144 kB
Memory: 24 GB (max. installable capacity 32 GB)
Network: eth0: 1000 Mb/s, full duplex, mtu 1500
Kernel: Linux 4.9.30-unRAID x86_64
OpenSSL: 1.0.2k

HunternasNDH - Stable system.  Diagnostics File Attached: hunternasndh-diagnostics-20170810-1315.zip

M/B: BIOSTAR Group - TA880GU3+
CPU: AMD Athlon™ II X4 640 @ 2999 MHz
HVM: Enabled
IOMMU: Disabled
Cache: 512 kB, 2048 kB
Memory: 8 GB (max. installable capacity 8 GB)
Network: eth0: 1000 Mb/s, full duplex, mtu 1500
Kernel: Linux 4.9.30-unRAID x86_64
OpenSSL: 1.0.2k

 

hunternasndh-diagnostics-20170810-1315.zip

hunternas-diagnostics-20170810-1342.zip

HunterNAS Disk Arrays.xlsx

1 hour ago, jeffreywhunter said:

 

Here ya go.  I laid out all the disk arrays and what controllers they are attached to in the HunterNAS Disk Arrays.xlsx spreadsheet.  Let me know if you need anything further...  Hope it helps!  Happy to run any diagnostics or even do a screen share to see the internals of the servers.

 


If / when you get bored, try this:

 

Move these three 3 TB drives off of the SAS2LP and onto the motherboard: Hitachi_HUA723030ALA640_MK0371YVG9PYUA, Hitachi_HUA723030ALA640_MK0373YVHH0A4C, and Hitachi_HUA723030ALA640_MK0361YHJ0N8JD.  Don't put them onto the Syba card.

 

And see if there's any improvement.  No guarantees.

Edited by Squid

Interesting.  Are you thinking this is caused by the 3TB Hitachi drives?

Just now, jeffreywhunter said:

Interesting.  Are you thinking this is caused by the 3TB Hitachi drives?

Drives of 3 TB and larger that use ATA8-ACS as the interface.  It's only a theory.  All I can prove is that certain ST3000DMs using ATA8-ACS on a SAS2LP cause 2 known Marvell issues.

 

You've got ports to spare on the mobo, so nothing to lose
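To see which drives report ATA8-ACS, the "ATA Version is:" line in the output of smartctl -i is the place to look. A rough sketch, assuming smartctl (smartmontools) is installed and the drives show up as /dev/sdX; ata_summary and scan_drives are hypothetical helper names, and drives that don't print those two lines will simply show "unknown":

```shell
# Reduce `smartctl -i` output (read from stdin) to "model | ATA version".
ata_summary() {
    awk -F': *' '
        /^Device Model:/   { model = $2 }
        /^ATA Version is:/ { ata = $2 }
        END {
            if (model == "") model = "unknown"
            if (ata == "")   ata = "unknown"
            printf "%s | %s\n", model, ata
        }
    '
}

# Print one summary line per disk. Needs root.
# Note: the /dev/sd? glob only covers sda..sdz.
scan_drives() {
    local dev
    for dev in /dev/sd?; do
        [ -b "$dev" ] || continue
        printf '%s: ' "$dev"
        smartctl -i "$dev" | ata_summary
    done
}
```

Running scan_drives from a root shell gives one line per disk, which can be matched against the controller each disk is attached to.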

 

  • 2 weeks later...
  • 3 months later...
On 8/10/2017 at 4:52 PM, Squid said:

Drives of 3 TB and larger that use ATA8-ACS as the interface.  It's only a theory.  All I can prove is that certain ST3000DMs using ATA8-ACS on a SAS2LP cause 2 known Marvell issues.

 

You've got ports to spare on the mobo, so nothing to lose

 

 

So over the past couple of months I've tried a number of things.  While I initially thought that moving the drives as you suggested helped, it didn't make a permanent difference.  I continue to have the server crash after a day or two.

 

I recently replaced the Marvell controller with 2 LSI controllers and moved everything off the motherboard except for the cache drive and parity.  Performance picked up, but the server continues to crash.

 

Since the crash happens pretty quickly, I decided to try your suggestion to use Fix Common Problems troubleshooting mode.  Attached is the output from that.

 

The only interesting thing I see in the log (with my limited knowledge) is a mention of "shfs/user: share cache full", but those messages start and stop many hours before the system crashed.
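One way to check how far a message is from the crash is to print its first and last occurrence next to the final line of the log. A small sketch; message_span is a made-up helper name, and the file name in the usage note is just the attachment from this post:

```shell
# Show how many lines contain a fixed string, the first and last
# matching lines, and the final line of the log for comparison.
message_span() {
    local pattern="$1" file="$2" count
    # grep -c exits non-zero when there are no matches, so ignore its status
    count=$(grep -F -c -- "$pattern" "$file" || true)
    printf 'matches: %s\n' "$count"
    [ "$count" -gt 0 ] || return 0
    printf 'first:   %s\n' "$(grep -F -- "$pattern" "$file" | head -n 1)"
    printf 'last:    %s\n' "$(grep -F -- "$pattern" "$file" | tail -n 1)"
    printf 'log end: %s\n' "$(tail -n 1 "$file")"
}
```

Usage would be something like message_span 'share cache full' FCPsyslog_tail.txt, then comparing the timestamps on the "last" and "log end" lines.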

 

Reading through the forums, I see that many of the GUI lockups are put down to cache issues.  Perhaps the "share cache full" message means something.

 

Thanks in advance.  I appreciate you taking the time to review this.  I'd be very thankful to get this behind me...

FCPsyslog_tail.txt

hunternas-diagnostics-20171214-1741.zip

unbalance.log

  • 1 month later...

@Squid - Have you had any further thoughts on this?  My server is basically a paperweight.  It boots fine, runs a parity check, runs for 24-36 hours, then just dies.  I'm at a loss as to what to do next.  I've even turned off all the dockers and plugins.  It still crashes after a couple of days.

 

Per suggestions earlier this year, I rebuilt the unRaid USB from scratch a few months ago and re-added the various plugins/dockers from scratch.  No difference, still seeing the crashes.  Could this be a motherboard problem?  How would I diagnose that?

Edited by jeffreywhunter
  • 3 weeks later...
