jeffreywhunter Posted August 7, 2017

I've seen some discussion in old forum posts about creating a persistent RAM disk. Has anyone seen an app or script that can save the syslog when the server crashes? In the past 6 or 8 months I've had my server crash hard, requiring a power-on restart, and I'd love to see if the log can point anything out.

I made the conversion from 5.x to 6.x with my SAS2LP controller; I've been hearing that card is no good anymore, so I'm considering a change. I have two servers with that controller on unRaid 6.x. One has continued to work without error. The other, not so well...

It would be nice if unRaid had an OS feature to write the syslog to flash, something like this: https://www.cisco.com/c/en/us/td/docs/ios/12_0s/feature/guide/cs_sysls.html

I'd appreciate any advice/tips/techniques for making the syslog persistent, so I can see if there is anything diagnostic in it.
Squid Posted August 7, 2017

Use the User Scripts plugin. Add this as a script set to run at first array start only, in the background:

    #!/bin/bash
    # Copy every new syslog line to a timestamped file on the flash drive
    mkdir -p /boot/logs
    FILENAME="/boot/logs/syslog-$(date +%s)"
    tail -f /var/log/syslog > "$FILENAME"

This will create a new file syslog-xxxxx (with xxxxx being the number of seconds since the Linux epoch) at every boot, and keep capturing until a reboot/crash. Alternatively, Fix Common Problems has a troubleshooting mode, but it's not designed to run for long periods of time, and it logs a ton of extra info that could be considered spam.
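A hedged addendum to the script above: each boot leaves another syslog-* file on the small flash drive, so it can be worth pruning old captures. This is a sketch, assuming the same /boot/logs path as the capture script; the KEEP count of 5 is an arbitrary choice.

```shell
#!/bin/bash
# Sketch: prune old syslog captures from the flash drive, keeping only
# the newest few. LOG_DIR matches the path used in the capture script
# above; KEEP=5 is an arbitrary choice.
LOG_DIR="/boot/logs"
KEEP=5

# List captures newest-first, skip the first $KEEP, delete the rest.
ls -1t "$LOG_DIR"/syslog-* 2>/dev/null | tail -n +$((KEEP + 1)) | while read -r old; do
    rm -f -- "$old"
done
```

This could run as a second User Scripts entry at array start, so the flash drive never accumulates more than a handful of crash logs.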
jeffreywhunter Posted August 7, 2017 (Author)

Thanks Squid! Should this go in the "go" file or somewhere else?
Squid Posted August 7, 2017

Easiest to use the User Scripts plugin, as noted.
jeffreywhunter Posted August 7, 2017 (Author)

Ah, sorry, missed that (obviously). I haven't used that plugin before; I'll check it out. Thanks!
Squid Posted August 7, 2017

Can you post the diagnostics from both servers (labelled as to which is the good one and which is the not-so-good one)? I've been working on a theory with regards to Marvell controllers.
jeffreywhunter Posted August 10, 2017 (Author)

Here ya go. I laid out all the disk arrays and which controllers they are attached to in the HunterNAS Disk Arrays.xlsx spreadsheet. Let me know if you need anything further... hope it helps! Happy to run any diagnostics or even do a screen share to look at the internals of the servers.

Hunternas - the system with the lockup problem, unRaid 6.3.5. Diagnostics attached: hunternas-diagnostics-20170810-1342.zip
M/B: ASUSTeK COMPUTER INC. - P8Z77-V LK
CPU: Intel® Core™ i5-2500K CPU @ 3.30GHz
HVM: Disabled
IOMMU: Disabled
Cache: 256 kB, 1024 kB, 6144 kB
Memory: 24 GB (max. installable capacity 32 GB)
Network: eth0: 1000 Mb/s, full duplex, mtu 1500
Kernel: Linux 4.9.30-unRAID x86_64
OpenSSL: 1.0.2k

HunternasNDH - the stable system. Diagnostics attached: hunternasndh-diagnostics-20170810-1315.zip
M/B: BIOSTAR Group - TA880GU3+
CPU: AMD Athlon™ II X4 640 @ 2999
HVM: Enabled
IOMMU: Disabled
Cache: 512 kB, 2048 kB
Memory: 8 GB (max. installable capacity 8 GB)
Network: eth0: 1000 Mb/s, full duplex, mtu 1500
Kernel: Linux 4.9.30-unRAID x86_64
OpenSSL: 1.0.2k

Attachments: hunternas-diagnostics-20170810-1342.zip, hunternasndh-diagnostics-20170810-1315.zip, HunterNAS Disk Arrays.xlsx
Squid Posted August 10, 2017

If/when you get bored, try this: move these three drives off of the SAS2LP and onto the motherboard (don't put them onto the Syba card):

Hitachi_HUA723030ALA640_MK0371YVG9PYUA - 3 TB
Hitachi_HUA723030ALA640_MK0373YVHH0A4C - 3 TB
Hitachi_HUA723030ALA640_MK0361YHJ0N8JD - 3 TB

And see if there's any improvement. No guarantees.
jeffreywhunter Posted August 10, 2017 (Author)

Interesting. Are you thinking this is caused by the 3 TB Hitachi drives?
Squid Posted August 10, 2017

Drives 3 TB+ using ATA8-ACS as the interface. It's a theory; I can only prove that certain ST3000DM drives using ATA8-ACS on a SAS2LP cause two known Marvell issues. You've got ports to spare on the mobo, so there's nothing to lose.
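For anyone wanting to check which drives actually report ATA8-ACS before shuffling cables, smartctl prints the standard in its info output. A sketch; the device names here are placeholders, not your actual array assignments:

```shell
#!/bin/bash
# Sketch: print the ATA standard each drive reports, so 3 TB+ ATA8-ACS
# drives sitting on the SAS2LP can be identified. The /dev/sd{b,c,d}
# list is an example only - substitute your real devices.
for dev in /dev/sd{b,c,d}; do
    [ -e "$dev" ] || continue
    echo "== $dev =="
    # The "ATA Version is:" line shows e.g. "ATA8-ACS ..." on the
    # drives Squid's theory is about.
    smartctl -i "$dev" | grep -E 'Device Model|User Capacity|ATA Version'
done
```

Drives showing ATA8-ACS in the "ATA Version is:" line and attached to the SAS2LP would be the candidates to move.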
jeffreywhunter Posted August 10, 2017 (Author)

Interesting, so avoid any 3 TB disks with ATA8-ACS... I'll give this a try on the next reboot and post the results back here.
jeffreywhunter Posted August 22, 2017 (Author)

OK, so you must be a genius. Three days, five hours, no crash. I'll keep you posted!
jeffreywhunter Posted December 15, 2017 (Author)

So over the past couple of months I've tried a number of things. While I initially thought that moving the drives as you suggested helped, it didn't make a permanent difference; I continue to have the server crash after a day or two. I recently replaced the Marvell controller with two LSI controllers and moved everything off the motherboard except for the cache drive and parity. Performance picked up, but I continue to crash.

Since the crash now happens pretty quickly, I decided to try your suggestion of running Fix Common Problems in troubleshooting mode. Attached is the output from that. The only interesting thing I see in the log (with my limited knowledge) is a mention of "shfs/user: share cache full", but that appears and stops many hours before the system crashed. Reading through the forums, a lot of the reported causes of GUI lockups are related to cache issues, so perhaps the "share cache full" means something.

Thanks in advance. I appreciate the time to review this, and would be very thankful to get this behind me...

Attachments: FCPsyslog_tail.txt, hunternas-diagnostics-20171214-1741.zip, unbalance.log
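On the "shfs/user: share cache full" message: it generally appears when a cache-enabled share can't allocate space on the cache drive. A quick way to watch the cache fill level is a sketch like the following; /mnt/cache is the standard unRAID cache mount point, and the 90% threshold is an arbitrary choice:

```shell
#!/bin/bash
# Sketch: report the cache drive's fill level. Assumes a cache drive
# is assigned and mounted at the standard unRAID path /mnt/cache.
df -h /mnt/cache

# Warn if usage crosses a threshold (90% here, arbitrary).
usage=$(df --output=pcent /mnt/cache 2>/dev/null | tail -1 | tr -dc '0-9')
if [ "${usage:-0}" -ge 90 ]; then
    echo "Cache is ${usage}% full - mover may need to run, or the share's minimum free space raised"
fi
```

Run periodically (cron or User Scripts), this would show whether the cache is genuinely filling before the "share cache full" messages appear.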
jeffreywhunter Posted January 22, 2018 (Author)

@Squid - have you had any further thoughts on this? My server is basically a paperweight. It boots fine, runs a parity check, runs for 24-36 hours, then just dies. I'm at a loss for what to do next. I've even turned off all the dockers and plugins, and it still crashes after a couple of days. Per suggestions earlier this year, I totally rebuilt the unRaid USB from scratch a few months ago and re-added the various plugins/dockers from scratch. No difference, still seeing the crashes. Could this be a motherboard problem? How would I diagnose that?
jeffreywhunter Posted January 28, 2018 (Author)

Bump. Anyone want a paperweight?
jeffreywhunter Posted February 12, 2018 (Author)

Just an update: the upgrade to 6.4.1 flagged a couple of plugins as incompatible (Preclear, Dynamix buttons). Removing those and upgrading to 6.4.1 seems to have resolved my 24-48 hour server crash.