zybron

Everything posted by zybron

  1. My /var/log is showing full (100%) on the dashboard, but if I run du -sh /var/log/* in the console I get this:

     root@Tower:/var/log# du -sh /var/log/*
     40K     /var/log/Xorg.0.log
     0       /var/log/apcupsd.events
     4.0K    /var/log/apcupsd.events.1
     0       /var/log/btmp
     0       /var/log/cron
     0       /var/log/debug
     88K     /var/log/dmesg
     0       /var/log/docker.log
     0       /var/log/docker.log.1
     0       /var/log/faillog
     0       /var/log/lastlog
     0       /var/log/libvirt
     0       /var/log/maillog
     0       /var/log/messages
     0       /var/log/nfsd
     4.0K    /var/log/nginx
     0       /var/log/packages
     0       /var/log/pkgtools
     0       /var/log/plugins
     0       /var/log/removed_packages
     0       /var/log/removed_scripts
     0       /var/log/samba
     0       /var/log/scripts
     0       /var/log/secure
     0       /var/log/setup
     0       /var/log/spooler
     0       /var/log/swtpm
     308K    /var/log/syslog
     8.0K    /var/log/wtmp

     That's clearly less than either the default of 128M or the 384M that I have attempted to increase it to in my go file using this line:

     mount -o remount,size=384m /var/log

     Can anyone shed any light on what might be using the extra space? Diagnostic files attached. tower-diagnostics-20191112-1531.zip
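     A common cause of a tmpfs reporting full while du shows very little is a process that still holds a deleted log file open; the space is not released until the process closes the file, and du cannot see it. A minimal sketch of how to check for that, assuming standard df and lsof are available on the server (these commands are mine, not from the original post):

     # what the filesystem itself thinks is used, versus the du totals above
     df -h /var/log

     # open files with a link count of 0 (deleted but still held open);
     # a large /var/log entry here would account for the missing space
     lsof +L1 | grep /var/log

     If such a file shows up, restarting the process that owns it (or rotating its log) should free the space.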
  2. I just finished using this plugin to split up a monthly parity check. I have my normal parity check schedule set to correct errors; however, both the notifications during the parity check and the parity.check.tuning.progress.save file are showing non-correcting parity checks.

     type|date|time|sbSynced|sbSynced2|sbSyncErrs|sbSyncExit|mdState|mdResync|mdResyncPos|mdResyncSize|mdResyncCorr|mdResyncAction|Description
     STARTED|2019 Oct 20 00:00:01|1571544001|1570408676|1570493018|0|0|STARTED|0|0|7814026532|0|check P|Non-Correcting Parity Check|
     PAUSE|2019 Oct 20 07:30:01|1571571001|1571544001|0|0|0|STARTED|7814026532|2851442360|7814026532|1|check P|Non-Correcting Parity Check|
     RESUME|2019 Oct 21 00:00:06|1571630406|1571630401|0|0|0|STARTED|7814026532|2851930044|7814026532|1|check P|Non-Correcting Parity Check|
     PAUSE|2019 Oct 21 07:30:01|1571657401|1571630401|0|0|0|STARTED|7814026532|5397770972|7814026532|1|check P|Non-Correcting Parity Check|
     RESUME|2019 Oct 22 00:00:06|1571716806|1571716801|0|0|0|STARTED|7814026532|5398227420|7814026532|1|check P|Non-Correcting Parity Check|
     COMPLETED|2019 Oct 22 07:30:01|1571743801|1571716801|1571741144|0|0|STARTED|0|0|7814026532|1|check P|Non-Correcting Parity Check|
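     To see which records actually carried the correcting flag, the mdResyncCorr column (field 12 in the header above, presumably 1 = correcting and 0 = read-only) can be pulled out of the pipe-delimited file. A small sketch, assuming the file path and the meaning of that flag (both are my assumptions, not confirmed by the plugin):

     # print record type, mdResyncCorr flag, action, and description for each entry
     awk -F'|' '{ print $1, $12, $13, $14 }' parity.check.tuning.progress.save

     In the log above the flag is 0 on the STARTED record and 1 on every later record, so the notifications may simply be labelling the whole run from that initial record rather than from the live flag.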
  3. Awesome, thanks! I will do some testing and see what I find.
  4. I'm seeing the following in my log periodically:

     Sep 29 10:31:51 Tower kernel: mce: [Hardware Error]: Machine check events logged
     Sep 29 10:31:51 Tower kernel: EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
     Sep 29 10:31:51 Tower kernel: EDAC sbridge MC0: CPU 0: Machine Check Event: 0 Bank 9: 8c000041000800c0
     Sep 29 10:31:51 Tower kernel: EDAC sbridge MC0: TSC 12475bcf2a26c
     Sep 29 10:31:51 Tower kernel: EDAC sbridge MC0: ADDR 54c5c4000
     Sep 29 10:31:51 Tower kernel: EDAC sbridge MC0: MISC 90002000200028c
     Sep 29 10:31:51 Tower kernel: EDAC sbridge MC0: PROCESSOR 0:306e4 TIME 1569767511 SOCKET 0 APIC 0
     Sep 29 10:31:51 Tower kernel: EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0x54c5c4 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:0 ha:0 channel_mask:1 rank:0)

     In my system profiler, I have this entry for one of the DIMM modules:

     Memory Device
             Total Width: 72 bits
             Data Width: 64 bits
             Size: 8192 MB
             Form Factor: DIMM
             Set: None
             Locator: P1_DIMMD1
             Bank Locator: Node0_Bank0
             Type: DDR3
             Type Detail: Registered (Buffered)
             Speed: 1600 MT/s
             Manufacturer: Kingston
             Serial Number: CF0F9841
             Asset Tag: Dimm9_AssetTag
             Part Number: 9965433-180.A
             Rank: 1
             Configured Memory Speed: 1600 MT/s

     Does the Dimm9_AssetTag correspond to the Bank 9 reference in the log? Just in case, I have also attached diagnostics if that is useful. tower-diagnostics-20190929-1433.zip
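     The EDAC decode line (CPU_SrcID#0_Ha#0_Chan#0_DIMM#0) usually identifies the physical module more directly than the "Bank 9" value, which refers to a machine-check register bank on the CPU rather than a memory bank or slot. A sketch of how the kernel's own DIMM mapping could be cross-checked, assuming the standard Linux EDAC sysfs entries are present (paths and tools are generic Linux, not taken from the attached diagnostics):

     # per-DIMM labels, locations, and corrected-error counters as EDAC sees them
     grep . /sys/devices/system/edac/mc/mc*/dimm*/dimm_label \
            /sys/devices/system/edac/mc/mc*/dimm*/dimm_location \
            /sys/devices/system/edac/mc/mc*/dimm*/dimm_ce_count

     # full DMI memory table, to match channel/slot against the Locator fields (P1_DIMMD1, etc.)
     dmidecode -t memory

     Matching the channel:0 slot:0 values from the log against those outputs should point at the physical stick that is logging the corrected errors.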
  5. Yeah, I saw some of those while searching. Since I was previously using MiniSAS to SATA breakout cables, it ends up costing about the same either way once the extra internal-to-internal cables are factored in. Thank you for the suggestion.
  6. Before I spend about $100 just to diagnose whether the card is the issue, is there a way to connect this disk shelf (which uses QSFP cables) to an internal LSI card? I see somewhat expensive QSFP to MiniSAS (8088) cables, but nothing that is QSFP to internal MiniSAS (8087). I realize that under normal circumstances you'd probably never want that particular combination, but in this one case I sort of would. Does anyone know an easier way to test my LSI 9201-8i against the NetApp DS4243 (which uses QSFP) than buying a new external LSI card and the QSFP to MiniSAS cable to go with it?
  7. I was previously using an LSI controller in this machine until I ran out of disk slots. It's an internal SAS card, and I don't have any QSFP to SAS cables. I can look into acquiring a QSFP to SAS cable for testing this, though.
  8. I'm having an odd issue whenever I attempt to run a parity sync/check on my array. I recently purchased a used NetApp DS4243 for my disks and a NetApp 111-00341 PMC Sierra PM8003 HBA card to interface with the disk shelf. Under normal operating conditions the disks are readily accessible, and extended SMART tests show they are healthy; however, starting a parity check seems to knock random disks out of the array and, of course, read errors start happening. If I run the array without the parity disk, I don't see any errors or other issues even when the server is under as much load as the dockers/VM provide. I have done a full pre-clear on the parity disk to check for issues, just in case, and it succeeded without any errors. Google searches have not turned up any answers so far. Can anyone assist me in diagnosing this problem? tower-diagnostics-20190409-0938.zip
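     Since a parity check reads every disk simultaneously, it loads the HBA, cabling, and shelf far more heavily than normal use, so interconnect problems often only surface then. A sketch of checks that could help separate the disks from the interconnect (device names and patterns are placeholders, not taken from the diagnostics):

     # CRC / link error counters usually increment on bad cabling or backplane contacts,
     # not on failing media
     for d in /dev/sd[b-z]; do
         echo "== $d =="
         smartctl -A "$d" | grep -i -E 'crc|error'
     done

     # kernel-side link resets or transport errors logged around the time disks drop out
     dmesg | grep -i -E 'sas|reset|link'

     If the CRC counters climb only while a parity check is running, that would point at the shelf, cables, or HBA rather than the disks themselves.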
  9. I'm getting the error messages below repeated about once an hour. Is there something I can tweak or modify to address this? Everything seems to be working correctly, but I wonder if I'm actually getting updates because of this.

     ERROR: constraint "dbmirror_pendingdata_SeqId" of relation "dbmirror_pendingdata" does not exist
     STATEMENT: ALTER TABLE dbmirror_pendingdata DROP CONSTRAINT "dbmirror_pendingdata_SeqId" CASCADE
     WARNING: SET TRANSACTION can only be used in transaction blocks
     WARNING: SET CONSTRAINTS can only be used in transaction blocks
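     One way to see whether the constraint exists under a different (case-folded) name is to list the constraints PostgreSQL actually has on that table. A sketch assuming the database is reachable with psql; the user and database name are placeholders:

     # list every constraint defined on the dbmirror_pendingdata table
     psql -U postgres -d <dbname> -c \
       "SELECT conname, contype FROM pg_constraint
        WHERE conrelid = 'dbmirror_pendingdata'::regclass;"

     If the constraint was originally created without quotes it will have been folded to lower case, and the quoted "dbmirror_pendingdata_SeqId" in the DROP statement would then never match it, which would explain the same error repeating every hour.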
  10. EDIT: It seems my issue is resolved. I tried everything I could think of except rebooting the server. After the reboot, I can access everything just fine. I'm still confused about why it happened, but it seems fine now. I recently stopped my DelugeVPN container with the thought of mapping another path to deal with some custom RSS feeds. After doing so, I am unable to access the web UI or connect via the daemon with a thin client. The log (attached) seems to show everything starting up and running normally, without any errors. Any ideas why I cannot seem to connect now? deluge.txt
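      A sketch of how the connection could be checked in a case like this, assuming the container is named deluge and uses Deluge's default ports of 8112 (web UI) and 58846 (daemon); the container name and the presence of netstat inside it are assumptions, not from the original post:

      # is anything listening on the web UI and daemon ports inside the container?
      docker exec deluge netstat -lntp | grep -E '8112|58846'

      # does the web UI answer at all from the host?
      curl -sI http://localhost:8112

      If the ports are listening but the UI is unreachable from outside, the problem is more likely port mapping or the VPN container's routing rules than Deluge itself.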