MarkRMonaco

Members
  • Posts

    93

Everything posted by MarkRMonaco

  1. @Altheran, if you have not done it already, I would suggest moving Plex to its own dedicated SSD via Unassigned Devices. I have mine running on a 2nd NVMe SSD formatted as XFS. From there, I have my Plex Docker container's appdata mapped to the unassigned drive's mount point. It's helped to take the load off Unraid's SSD (not only in performance but also in storage used) and has not been affected by my recent stability issues. You can use Krusader to move the Plex appdata while its container is not running.
  2. I let the system run for a little over two hours and went ahead with reformatting my cache drive to XFS.
  3. @jonathanm, that would make a lot more sense. I have two slots configured (as I was going to add a 2nd drive a few months ago, but never did). I'll worry about reformatting once I get my system stabilized.
  4. Another update: I had a few more issues pop up last night. First, the WebGUI was complaining that the license file was missing/corrupted. So, I downloaded a fresh copy of my key file and placed it on the USB drive. I then moved the drive to a different port on the computer (because I was also seeing a mention of "reset SuperSpeed Gen 1 USB device" in the syslog). Once the system was back up, it ran for several hours without issues (before I went to bed). When I woke up this morning, I found that the system was unresponsive (hard-locked). I verified in the BIOS that Global C-States was already off and typical current was already enabled. Therefore, I turned off spread spectrum (since XMP was enabled at that time). Once it was back up, I added "rcu_nocbs=0-15" to the syslinux config and rebooted (at the time, I didn't realize that I had mistyped it as "cu_nocbs=0-15" in the config). From there, I went out to the store and came back a few hours later to another hard-lock. This time, I went back into the BIOS and turned off XMP. Once the system was back up, I corrected the "rcu_nocbs=0-15" entry and rebooted. From there, I opened PuTTY on my other computer and began a tail on the syslog. Note, I already have the syslog server enabled on the Unraid system with it looping back to itself, but have never been able to get anything to write to the share. As for the USB drive itself, I have Unraid configured to use UEFI mode.
  5. Thanks. I'll have to look into that. When I reassigned the cache drive, it automatically formatted as btrfs.
  6. Well, this morning I logged in and found that the WebGUI was reporting that it couldn't access flash. So, I powered the server down, reformatted the thumb drive with a fresh copy of 6.9-rc2, and restored my config backup. Now, I need to start the parity check all over again... joy.
  7. I agree, @Squid. I'm assuming there is a chance that the appdata may have had some corruption in it when it was backed up before the reformat. At the moment, everything seems to be ok and any BTRFS errors were corrected (according to the logs). Since it is doing a parity check after the forced reboot, I'm going to let it sit for the time being. If I see anything else pop up in the system log or if any other abnormal activity occurs, I'll reply back here (hopefully, w/ logs).
  8. Just another update. Ran into some issues while restoring my Docker containers from my saved templates. It would occasionally cause the Docker service to stop. In most cases, I was able to stop the array and restart it, which would get Docker running again. However, at some point, stopping the array would get hung up at the cache drive. Thankfully, I was able to stop it via terminal with "umount -l /mnt/cache". At that point, I rebooted the server and immediately ran a BTRFS scrub again. Errors/corruption were detected, but corrected.

     Scrub device /dev/nvme0n1p1 (id 1) done
     Scrub started: Thu Jan 28 21:51:42 2021
     Status: finished
     Duration: 0:00:54
     Total to scrub: 235.02GiB
     Rate: 3.15GiB/s
     Error summary: no errors found
     WARNING: errors detected during scrubbing, corrected

     Unfortunately, I forgot to pull logs before I rebooted... I'll continue to keep an eye on it and will report back (w/ logs) if anything changes.
  9. Just an update on this. After reading other information on the forums and the wiki, I decided to reformat the drive. As a precaution, I first erased the drive, unassigned it, formatted it to XFS, reassigned it, and let Unraid automatically reformat it back to BTRFS. When rebuilding the Docker image file, I also opted for the XFS option. Running appdata backup/restore as we speak...
  10. Ran into a kernel panic error within the last 24 hours. Recently (within the past hour or so), my Unraid server went unresponsive. At the time, the only activity was a single user on Plex (Docker) transcoding. I had to force the system to shut down (via the power button) and turn it back on again. As a precaution, since transcoding was on that cache drive, I moved it to RAM for the time being. Looking at the log, I found one instance of BTRFS complaining about corruption:

      Jan 28 20:47:26 WadeWilson kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0

      Since my cache drive is the only BTRFS-formatted device in the system, I ran a scrub on it:

      Scrub device /dev/nvme0n1p1 (id 1) done
      Scrub started: Thu Jan 28 20:57:14 2021
      Status: finished
      Duration: 0:00:25
      Total to scrub: 133.02GiB
      Rate: 2.77GiB/s
      Error summary: csum=1
        Corrected: 0
        Uncorrectable: 1
        Unverified: 0
      ERROR: there are uncorrectable errors

      Does anyone have any advice on what I should do next? Logs are attached. wadewilson-diagnostics-20210128-2102.zip
  11. I don't have a 2nd Unraid server (yet). So, I'm just trying to save the files locally. It was a recommendation I received in a prior post, just to make sure that I am not losing anything in the event the system crashes (which was happening to me before I realized I had a bad SATA3 cable).
  12. Looking at my system log (and the fact that my local share is never populated with files), I suspect that rsyslog is never initializing all the way.
      - I've tried various configuration changes, including trying to send (loop) the files back to the server using the remote address option (per a tip I received in one of my prior forum posts). However, it never seems to work correctly...
      - Is there a way to reinstall/rebuild it?

      System Log:
      Oct 1 10:50:09 WadeWilson rsyslogd: [origin software="rsyslogd" swVersion="8.2002.0" x-pid="21673" x-info="https://www.rsyslog.com"] start
      Oct 1 10:50:35 WadeWilson ool www[21084]: /usr/local/emhttp/plugins/dynamix/scripts/rsyslog_config
      Oct 1 10:50:38 WadeWilson rsyslogd: Could not find template 1 'remote' - action disabled [v8.2002.0 try https://www.rsyslog.com/e/3003 ]
      Oct 1 10:50:38 WadeWilson rsyslogd: error during parsing file /etc/rsyslog.conf, on or before line 121: errors occured in file '/etc/rsyslog.conf' around line 121 [v8.2002.0 try https://www.rsyslog.com/e/2207 ]
      Oct 1 10:50:38 WadeWilson rsyslogd: [origin software="rsyslogd" swVersion="8.2002.0" x-pid="21912" x-info="https://www.rsyslog.com"] start

      My "syslog" Share:
      Current Settings:
      This is the specific line that the error is complaining about in /etc/rsyslog.conf:

      wadewilson-diagnostics-20201001-1106.zip
  13. Everything appears to be fixed on my end, and no further CRC errors have shown up since the cables were replaced. For the time being, I am keeping the RAM at the factory XMP speed without any additional overclocks applied. As for the syslog server, it is still not functioning correctly. Therefore, I may look into either a Docker container that can serve the same purpose (making the Unraid server itself just a client that sends logs) or one of my Raspberry Pis.
  14. Just another update... I picked up the replacement SATA-III cables and installed them. Of course, that means that I now need to wait for the drive to rebuild again since it was postponed earlier today. If anyone has any further insight into the Syslog issue, I would love to get that up and running. Thanks in advance.
  15. Ok, I just received a CRC error on a completely different drive. Therefore, I'm pretty confident that the cable itself is bad (or going bad). Since that particular drive is my only parity drive, I'm going to pause the rebuild just to be safe. I'm going to shut down the server and have already placed a store pickup order for replacement cables (going to replace all 4, since the original cables were a sleeved bundle) from my local Fry's. Just waiting on the "order ready" confirmation...
  16. I get that and will drop it down further, if necessary. - However, I just changed several things at once (again), and need to see if any of it has made an impact.
  17. Thanks, @kevschu. An update on my end... About half a day later, the drive went back into "disabled" status due to errors. Therefore, I went into the BIOS and brought the RAM clock back down to the base/stock XMP setting (3000MHz) w/ no additional overclock. From there, I shut the system back down and reversed the SATA cable ordering (they're physically tagged) across the four 3.5" drives (now 1 through 4, top to bottom, instead of 4 through 1). All of the power connectors were checked as well to ensure that they were fully seated. Now, I'm back to square one with the parity rebuild/sync since the drive had to be removed and re-added to the pool... In the meantime, let me know if anyone is interested in a new set of logs pulled from the system.
  18. Just an update: my drive is almost 100% rebuilt, and I have not run into issues with it being unresponsive (yet). So, it looks like one of these steps (above) solved the issue. I am, however, still having trouble getting syslog working at all. The Unraid share has yet to be populated with anything, and I am still running into that single error message whenever the service is started or restarted.
  19. I also checked my "syslog" share, and it looks like it is not being populated with any files either...
  20. Thanks. I missed that part. With the "flash mirroring" option turned off and the syslog server set to "both" for protocols, I'm still getting one error message returned when the service is started/restarted:

      Starting rsyslogd daemon: /usr/sbin/rsyslogd -i /var/run/rsyslogd.pid
      rsyslogd: Could not find template 1 'remote' - action disabled [v8.2002.0 try https://www.rsyslog.com/e/3003 ]
      rsyslogd: error during parsing file /etc/rsyslog.conf, on or before line 121: errors occured in file '/etc/rsyslog.conf' around line 121 [v8.2002.0 try https://www.rsyslog.com/e/2207 ]

      Current Config:
  21. I also put a screenshot of the specific lines that were called out from the /etc/rsyslog.conf file in my previous reply.
  22. Now, regarding the syslog configuration, is this something I should be concerned about (and do I need to take any action)? Lines 66 & 67: Line 123:
  23. Fair enough. - That will be my next step (going back down to the base XMP setting) if the system goes unresponsive again.
  24. Just an update. I fixed the "system" share issue and made sure that it only resides within the cache pool. The other files that were on one of my drives were outdated. Therefore, I deleted them through Krusader. I also did the following:
      • Enabled the local syslog server and have it mirroring between the cache pool and the "flash" share.
      • Reverted back to stock (and rebooted) from the linuxserver.io Nvidia (Unraid Nvidia plugin) image since my card wasn't supported.
      • Turned off (disabled) any ErP or C-State settings in the BIOS (which were previously enabled).
  25. I'm not running in single-channel mode. There are two DIMMs installed. 16GB (2x8GB) means that the installed kit is made up of two 8GB modules, which is pretty standard notation for RAM specs.
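
For anyone who wants to check scrub results like the ones quoted in posts 8 and 10 without reading them by eye, here is a minimal Python sketch. It only parses saved report text and does not run btrfs itself. It assumes the report contains the same "Corrected", "Uncorrectable", and "Unverified" fields as the output pasted in those posts; the function names are made up for illustration.

```python
import re


def scrub_counters(report: str) -> dict:
    """Pull the Corrected/Uncorrectable/Unverified counts out of scrub text.

    Fields that do not appear (e.g. a "no errors found" report) stay at 0.
    """
    counters = {"corrected": 0, "uncorrectable": 0, "unverified": 0}
    for name in counters:
        # Matches e.g. "Uncorrectable: 1" regardless of spacing or letter case.
        match = re.search(rf"{name}:\s*(\d+)", report, flags=re.IGNORECASE)
        if match:
            counters[name] = int(match.group(1))
    return counters


def has_uncorrectable(report: str) -> bool:
    """True when the report shows at least one uncorrectable error."""
    return scrub_counters(report)["uncorrectable"] > 0


# Error summary as pasted in post 10 (flattened onto one line there).
POST_10 = (
    "Error summary: csum=1 Corrected: 0 Uncorrectable: 1 Unverified: 0 "
    "ERROR: there are uncorrectable errors"
)

if __name__ == "__main__":
    print(scrub_counters(POST_10))
    print("needs attention:", has_uncorrectable(POST_10))
```

A report like the one in post 8 ("no errors found ... corrected") leaves every counter at 0, so it would not be flagged.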
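
The mistyped "cu_nocbs=0-15" entry from post 4 is the sort of thing a quick check can catch before a reboot. The Python sketch below scans the text of a syslinux config for "append" lines and reports whether an expected kernel parameter is present, plus any near-miss spelling. The sample config text and the function names are assumptions for illustration only; the real file on the flash drive will have more entries.

```python
import difflib


def append_params(cfg_text: str) -> list:
    """Return the parameter names found on every 'append' line."""
    names = []
    for line in cfg_text.splitlines():
        words = line.split()
        if words and words[0].lower() == "append":
            # "rcu_nocbs=0-15" -> "rcu_nocbs"; bare flags are kept as-is.
            names.extend(word.split("=", 1)[0] for word in words[1:])
    return names


def check_param(cfg_text: str, expected: str) -> tuple:
    """Return (found, near_miss) for one expected parameter name.

    near_miss is the closest misspelled name, or None if nothing is close.
    """
    names = append_params(cfg_text)
    if expected in names:
        return True, None
    close = difflib.get_close_matches(expected, names, n=1, cutoff=0.8)
    return False, (close[0] if close else None)


# Hypothetical config text containing the typo described in post 4.
SAMPLE_CFG = """\
label Unraid OS
  kernel /bzimage
  append cu_nocbs=0-15 initrd=/bzroot
"""

if __name__ == "__main__":
    found, near_miss = check_param(SAMPLE_CFG, "rcu_nocbs")
    print("found:", found, "near miss:", near_miss)
```

This only compares spellings. It does not confirm that the kernel actually applied the parameter after boot.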