shooga

Members
  • Posts: 195
Everything posted by shooga

  1. Thanks for the info guys. I read through the post and it sounds like exactly the same issue. I've had enough of chasing this, so I just ordered an LSI SAS9201-8i card. Found one for $40 on eBay. Hopefully this will put it to bed for good.
  2. I've been fighting intermittent errors that come up as a result of parity checks for a few months. I keep thinking I've figured it out, but it comes back up. The weirdest part about it is that it's always exactly 5 errors. Twice now, I thought I figured out the problem and was able to get back to a complete parity check with no errors only to have the 5 errors come back around. The last time this happened I ran a correcting check, then checked again and found 0 errors. Does this sound familiar to anyone? Any ideas on how to troubleshoot? I've been trying to use the Dynamix File Integrity plugin to track/find the errors when this happens, but the plugin has never reported any errors.
  3. Aha. That actually made me realize that we have a laser printer that definitely pulls the voltage down (lights dim) when it kicks on to print. The UPS/server is in the same closet as the printer. I'll do a test to see if printing triggers the message. I guess there's nothing really wrong then, but ideally I'd get more current to that room...
  4. I'm sure we're not losing power this frequently. Even 2 second power drops would leave noticeable evidence. This is happening multiple times per day, sometimes when we are home. Also worth noting that my UPS sounds an alarm when the power goes out and I haven't heard it.
  5. It's just a theory really, but I couldn't think of any other difference between a manual run of this backup script and the scheduled ones. I never had an issue with a manual run. This workaround seems to be helping so far. Also: It's not really clear above, but I run mover every night and the backup script once per week. I wanted to make sure there was no overlap on the day they both run.
  6. I'm noticing false positives for power failures via my UPS in my syslog. Any ideas what would be causing this? Is it something to be worried about? They almost always coincide with disk spin downs, but not 100% of the time. The log entries look like this:

     Jul 7 08:22:55 Bunker kernel: mdcmd (117): spindown 0
     Jul 7 08:22:56 Bunker kernel: mdcmd (118): spindown 8
     Jul 7 08:22:57 Bunker kernel: mdcmd (119): spindown 29
     Jul 7 11:41:23 Bunker apcupsd[9954]: Power failure.
     Jul 7 11:41:25 Bunker apcupsd[9954]: Power is back. UPS running on mains.
     Jul 7 16:10:55 Bunker apcupsd[9954]: Power failure.
     Jul 7 16:10:57 Bunker apcupsd[9954]: Power is back. UPS running on mains.
     Jul 7 16:43:49 Bunker apcupsd[9954]: Power failure.
     Jul 7 16:43:51 Bunker apcupsd[9954]: Power is back. UPS running on mains.
     Jul 7 19:08:00 Bunker autofan: Highest disk temp is 36C, adjusting fan speed from: OFF (0% @ 0rpm) to: 160 (62% @ 1506rpm)
     Jul 7 19:20:06 Bunker autofan: Highest disk temp is 37C, adjusting fan speed from: 160 (62% @ 1480rpm) to: 170 (66% @ 1591rpm)
     Jul 7 19:25:09 Bunker kernel: mdcmd (120): spindown 3
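One way to test the "coincides with spin downs" impression in post 6 is to measure it from the syslog. The following is a minimal sketch only: it assumes the line layout shown in that post, the function names (`parse`, `power_blips`, `seconds_since_last_spindown`) are hypothetical, and the `year` argument is an arbitrary placeholder because syslog timestamps carry no year.

```python
import re
from datetime import datetime

# Sample lines copied from post 6 (autofan lines omitted for brevity).
SAMPLE_LOG = """\
Jul 7 08:22:55 Bunker kernel: mdcmd (117): spindown 0
Jul 7 08:22:56 Bunker kernel: mdcmd (118): spindown 8
Jul 7 08:22:57 Bunker kernel: mdcmd (119): spindown 29
Jul 7 11:41:23 Bunker apcupsd[9954]: Power failure.
Jul 7 11:41:25 Bunker apcupsd[9954]: Power is back. UPS running on mains.
Jul 7 16:10:55 Bunker apcupsd[9954]: Power failure.
Jul 7 16:10:57 Bunker apcupsd[9954]: Power is back. UPS running on mains.
Jul 7 16:43:49 Bunker apcupsd[9954]: Power failure.
Jul 7 16:43:51 Bunker apcupsd[9954]: Power is back. UPS running on mains.
Jul 7 19:25:09 Bunker kernel: mdcmd (120): spindown 3
"""

# Assumed layout: "<Mon> <day> <HH:MM:SS> <host> <source>[pid]: <message>"
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ (\S+?)(?:\[\d+\])?: (.*)$"
)


def parse(log_text, year=2000):
    """Return a list of (timestamp, source, message) tuples.

    `year` is a placeholder: the log itself does not record one.
    """
    events = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not fit the assumed layout
        stamp, source, message = match.groups()
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        events.append((when, source, message))
    return events


def power_blips(events):
    """Pair each 'Power failure' with the next 'Power is back'.

    Returns (failure_time, duration_in_seconds) tuples.
    """
    blips = []
    failure_at = None
    for when, source, message in events:
        if source != "apcupsd":
            continue
        if message.startswith("Power failure"):
            failure_at = when
        elif message.startswith("Power is back") and failure_at is not None:
            blips.append((failure_at, (when - failure_at).total_seconds()))
            failure_at = None
    return blips


def seconds_since_last_spindown(events, moment):
    """Seconds between `moment` and the most recent earlier spindown, or None."""
    earlier = [
        when
        for when, source, message in events
        if source == "kernel" and "spindown" in message and when <= moment
    ]
    if not earlier:
        return None
    return (moment - max(earlier)).total_seconds()


if __name__ == "__main__":
    events = parse(SAMPLE_LOG)
    for failure_at, duration in power_blips(events):
        gap = seconds_since_last_spindown(events, failure_at)
        print(failure_at.time(), f"lasted {duration:.0f}s,",
              f"{gap:.0f}s after the last spindown")
```

Run against a full syslog rather than this excerpt, a consistently small gap would support the spin-down theory, while large gaps would point elsewhere (for example, a brief voltage sag from another device on the same circuit).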
  7. @SelfSD I've had this same issue. Not really a fix, but I have found that (so far) the problem is avoided by adjusting my schedules so that Mover runs at midnight and this backup runs at 4am. That seems to leave enough time for Mover to finish so there isn't a conflict. I wish there was a more concrete solution, but so far so good...
  8. Thanks @Squid. I understand your logic here and it makes sense. I am using xfs on my array and btrfs on my cache pool. I will try changing the backup setting to use a single disk. I am also trying to space out my scheduled mover and backup times, just to make sure they are not running at the same time. I guess it's worth mentioning that I was just deleting some old backup data and when I used rm -r on the plex directory it took ~10 minutes and my GUI was unresponsive for part of that. It did come back though. The hang I see when backing up seems to last indefinitely, but the behavior is similar.
  9. The log file info that you used to troubleshoot your issue.
  10. Ok. I'll check out this plugin. Thanks again!
  11. I've been having trouble on and off with this too. It seems to work fine when I do a manual backup, but sometimes the automated backups seem to hang and cause the server to become non-responsive. I'm also backing up my Plex directory, but that's one of the main things I want to back up, so I don't want to exclude it. @thaddeussmith Are you just looking at the system log? Does that persist past the forced reboot? I'm unable to access the server when mine hangs and couldn't get to the log file.
  12. Unfortunately, I don't have checksums. Is there a recommended tool for calculating and then checking the files? Just did a quick search and I see that Squid has a plugin, but it's no longer being actively maintained.
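For the checksum question in post 12, the general idea (independent of any particular plugin) can be sketched with Python's standard library: record a SHA-256 digest per file once, then recompute later and compare. This is an illustrative sketch, not a replacement for a maintained tool; the function names and the in-memory manifest format are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large media files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(root, manifest):
    """Return names from the manifest whose current digest differs or is missing."""
    current = build_manifest(root)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )
```

A manifest built while the data is known to be good can be saved (for example as JSON) and checked after a parity check reports errors. Note that a mismatch cannot distinguish an intentional edit from corruption, so this works best on files that are not expected to change.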
  13. I guess I'm not 100% sure, but I think the only recent unclean shutdown triggered a non-correcting check, which came back with zero errors. Isn't there a chance that I will be writing bad data to parity? Is there further troubleshooting that can be done to figure out whether the parity or data is correct? (thanks for the replies)
  14. The second non-correcting check just finished with the exact same number of errors. Any advice?
  15. Here's the full sequence of events:
     • The monthly parity check setting had inadvertently changed from non-correcting to correcting. It ran and found/corrected 5 errors.
     • I then changed the setting and ran a non-correcting check. This check found something like 239 errors.
     • I pulled the memory and ran a non-correcting check. This found 5 errors. (from the correcting check that found false positives)
     • I ran a correcting check, which found and corrected 5 errors.
     • I ran a non-correcting check, which found zero errors.
     • I RMA'd the new stick of memory (a second 8GB stick), while keeping my original 8GB stick in place.
     • While waiting for the replacement, I had a server crash that required a hard reset of the server.
     • After coming back up, the server did a non-correcting parity check and found zero errors.
     • After coming back up, I got a "Warning [BUNKER] - current pending sector is 1" for one of my drives (unrelated I think).
     • Received the new stick of memory and completed 3 cycles of memtest with no errors.
     • Non-correcting monthly parity check completed with 2789 errors.
     I will run another non-correcting check to see if I get the same number of errors.
  16. I'm having issues chasing parity check errors. I added an additional memory stick about a month ago and had parity check errors. I was able to remove the memory stick and solve them. Replaced it and ran the new one through 3 memtest passes with no issues. Have just completed my monthly parity check with the new memory in place and this time have almost 3000 errors. Help troubleshooting would be greatly appreciated. My diagnostic info is attached. You can see the errors at: Jun 1 18:13:48. Thanks!!! bunker-diagnostics-20170601-2148.zip
  17. @Squid I'm having problems with the Backup and Auto-Update plugins too. After using them for many months with no issues, I've had two consecutive weeks where my server has become totally unresponsive (requiring a long power button press) when doing my weekly backup and update. WebGUI won't load, can't SSH in. I've disabled both for now. What's the best way to troubleshoot this?
  18. Ok, thanks. That makes sense. I'll give included/excluded a try.
  19. Yeah, it's strange indeed. The power connection means that the drives aren't connected to the power switch at all, they always get power. That MB is part of what has me considering an upgrade. That plus the tempting new Ryzen CPUs. But I'm having second thoughts now because this server has been rock solid. And you're right about that gift horse, but finding a mobile CPU cooler that I like sure was hard. I have my 6TB drives and SSD on the MB SATA ports, but thanks for pointing that out. BTW, I'm glad you found the info about my server, but how did you find it? Just old posts? I'd like to share it and people used to put that info in their signatures here on the forum, but it seems like that stopped with the new forum software. I don't see an obvious place for it in my profile either. Edit: Well now I see that there is a setting that you have to turn on to show people's signatures. Guess you probably had that turned on... (and my old sig is still there, but now it's in Account Settings rather than Profile)
  20. Ah, ok. Got it. That makes sense. It seems that the nature of my files on Unraid means that I'm typically only accessing one or two drives at a time. Or am I missing something? It just depends on the number of users and the way files are split across drives right? Your post has TONS of great info! Thanks for sharing that. It looks like the 8 x 80MB/s limitation must be based on the card itself and not the x4 bus speed, because it's not approaching the 1000MB/s limit. Correct?
  21. Just wanted to follow up with what I've learned in case it can help anyone else. It looks like the AOC-SASLP-MV8 supports 300MB/s per channel (8 channels). 6TB WD drives support sustained read/write speeds in the 175-225 MB/s range (Blues are slower than Reds). My testing with the diskspeed script shows results that are in line with this for my 6TB drives (slower for my older/smaller drives). So the card doesn't seem to be a bottleneck at all, and there's no reason to upgrade until sustained drive speeds exceed 300MB/s. If you want to connect SSDs to a SAS/SATA controller then you'd probably want something faster, but I'd recommend using the motherboard SATA ports for those if at all possible.
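The arithmetic behind posts 20 and 21 can be written out as a back-of-the-envelope sketch. It uses only figures quoted in those posts (300 MB/s per channel, 175-225 MB/s per drive, and the roughly 1000 MB/s x4 bus limit mentioned in post 20); the function name and the simple "bus is shared evenly" model are assumptions, not a description of how the card actually schedules traffic.

```python
def per_drive_throughput(active_drives, drive_mb_s,
                         channel_mb_s=300.0, bus_mb_s=1000.0):
    """Estimate MB/s available to each drive when several are busy at once.

    Assumed model: each drive is capped by its own speed and by its channel,
    and all active drives split the shared bus limit evenly.
    """
    if active_drives <= 0:
        raise ValueError("active_drives must be positive")
    per_channel_cap = min(drive_mb_s, channel_mb_s)
    fair_bus_share = bus_mb_s / active_drives
    return min(per_channel_cap, fair_bus_share)


if __name__ == "__main__":
    # One or two drives in use at a time (typical access pattern per post 20).
    print(per_drive_throughput(2, 225.0))
    # All eight ports busy at once, e.g. during a parity check.
    print(per_drive_throughput(8, 225.0))
```

Under these assumptions the per-channel limit never binds for the drives described, and the shared bus only matters when many drives are read simultaneously, which matches the observation that everyday access to one or two drives is not bottlenecked.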
  22. It looks like the Cache Dirs plugin is preventing several of my drives from spinning down. I used the Open Files plugin to figure out that a find process was the only thing accessing the drives and after some forum searching realized that Cache Dirs might be the culprit. Sure enough, the drives seem to stay spun down after disabling Cache Dirs. Is this a known issue? Or are there any fixes or workarounds? I installed it long ago and didn't used to have this problem. I believe I'm using the default settings. I'm on version 6.3.2.
  23. Thanks for the responses. Just wanted to make sure I really understood (and I do now).
  24. This looks like a great utility. Quick question: The instructions say to "ensure no other processes are running on the server". It seems impossible to take that literally. Does this mean processes that would impact disk performance? No parity check, file copy, mover, etc? I can easily shut down docker, but what about plugins?