weirdcrap

Everything posted by weirdcrap

  1. UPDATE 3/3/2021: I have definitively determined my performance issues are caused by WireGuard. I do not yet know if or when I'll find a solution.

     UPDATE 8/1/2022: This is still very much broken. I try a file transfer every couple of months and it continues to be horribly slow. Using rsync over SSH outside the WireGuard tunnel works great and is what I will continue to use until I can figure this sh*t out.

     FINAL UPDATE 11/24/2022: See my last post here for the solution and TL;DR.

     Let me preface all of this by saying I'm not sure where my issue lies, so I'm going to lay out what I know and hopefully get some ideas on where to look for my performance woes.

     The before times: Before setting up WireGuard I had SSH open to the world (with security precautions in place) on my main server so that once a month my backup server could connect and push and pull content as defined in my backup script. This worked splendidly for years, and I always got my full speeds up to the bandwidth limit I set in my rsync parameters.

     Now: With the release of WireGuard for UnRAID I quickly shut down my SSH port forward and set up WireGuard. I have one tunnel for my administrative devices and a second tunnel which serves as server-to-server access between NODE and VOID.

     NODE is my main server and runs 6.8.3 stable. It is located on a 100Mbps/100Mbps fiber line. UPDATE: As a last-ditch effort I upgraded NODE to 6.9.0-RC2 as well; no change in the issue.

     VOID is my backup, runs 6.9.0-RC2, and lives in my home on a 400Mbps/20Mbps cable line.

     In this setup, my initial rsync session will go full speed for anywhere from 5-30 minutes, then suddenly and dramatically drop in speed, down to 10Mbps or less, and stay there until I cancel the transfer. I can restart the transfer immediately and regain full speed for a time, but it always eventually falls again.
     Here is my rsync call:

         rsync -avu --stats --numeric-ids --progress --delete -e "ssh -i /mnt/cache/.watch/id_rsa -T -o Compression=no -x -o StrictHostKeyChecking=no" root@NODE:/mnt/user/TV/Popeye/ /mnt/user/TV/Popeye/

     Here is a small sample of the rsync transfer log to illustrate the sudden and sharp drop in speed:

         Season 1938/Popeye - S1938E09 - Mutiny Ain't Nice DVD [BTN].mkv
             112,422,538 100%   10.80MB/s   0:00:09 (xfr#24, to-chk=58/135)
         Season 1938/Popeye - S1938E10 - Goonland DVD [BTN].avi
              72,034,304 100%    9.76MB/s   0:00:07 (xfr#25, to-chk=57/135)
         Season 1938/Popeye - S1938E11 - A Date to Skate DVD [BTN].mkv
             138,619,127 100%   10.44MB/s   0:00:12 (xfr#26, to-chk=56/135)
         Season 1938/Popeye - S1938E12 - Cops Is Always Right DVD [BTN].mkv
             127,109,972 100%   11.02MB/s   0:00:10 (xfr#27, to-chk=55/135)
         Season 1939/Popeye - S1939E01 - Customers Wanted DVD [BTN].mkv
             114,673,044 100%   10.50MB/s   0:00:10 (xfr#28, to-chk=54/135)
         Season 1939/Popeye - S1939E02 - Aladdin and His Wonderful Lamp DVD [BTN].mkv
             325,996,501 100%   11.69MB/s   0:00:26 (xfr#29, to-chk=53/135)
         Season 1939/Popeye - S1939E03 - Leave Well Enough Alone DVD [BTN].mkv
             105,089,182 100%   11.30MB/s   0:00:08 (xfr#30, to-chk=52/135)
         Season 1939/Popeye - S1939E04 - Wotta Nitemare DVD [BTN].mkv
             149,742,115 100%  754.78kB/s   0:03:13 (xfr#31, to-chk=51/135)
         Season 1939/Popeye - S1939E05 - Ghosks Is The Bunk DVD [BTN].mkv
             114,536,257 100%  675.53kB/s   0:02:45 (xfr#32, to-chk=50/135)
         Season 1939/Popeye - S1939E06 - Hello, How Am I DVD [BTN].mkv
              92,083,730 100%  700.03kB/s   0:02:08 (xfr#33, to-chk=49/135)
         Season 1939/Popeye - S1939E07 - It's The Natural Thing to Do DVD [BTN].mkv
             110,484,799 100%  715.66kB/s   0:02:30 (xfr#34, to-chk=48/135)
         Season 1939/Popeye - S1939E08 - Never Sock a Baby DVD [BTN].mkv
              97,660,132 100%  716.88kB/s   0:02:13 (xfr#35, to-chk=47/135)
         Season 1940/Popeye - S1940E01 - Shakespearian Spinach DVD [BTN].mkv
             102,543,357 100%  632.64kB/s   0:02:38 (xfr#36, to-chk=46/135)
         Season 1940/Popeye - S1940E02 - Females is Fickle DVD [BTN].mkv
             102,363,188 100%  674.34kB/s   0:02:28 (xfr#37, to-chk=45/135)
         Season 1940/Popeye - S1940E03 - Stealin' Ain't Honest DVD [BTN].mkv
             100,702,236 100%  732.80kB/s   0:02:14 (xfr#38, to-chk=44/135)
         Season 1940/Popeye - S1940E04 - Me Feelins is Hurt DVD [BTN].mkv
             111,018,052 100%  672.35kB/s   0:02:41 (xfr#39, to-chk=43/135)
         Season 1940/Popeye - S1940E05 - Onion Pacific DVD [BTN].mkv
             103,088,015 100%  650.18kB/s   0:02:34 (xfr#40, to-chk=42/135)
         Season 1940/Popeye - S1940E06 - Wimmin is a Myskery DVD [BTN].mkv
              61,440,000  59%  757.02kB/s   0:00:56
         ^C
         rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(701) [generator=3.2.3]

     ...and my accompanying stats page during the same transfer. You can see the sudden decline around 11:46, which coincides with the drop in transfer speed above. I don't see anything telling in the system logs on either server when this speed drop happens. It almost seems like a buffer is filling up and not being emptied quickly enough, causing the speed to tank.

     What I don't think it is: I don't think my issue is with WireGuard or my ISP speeds on either end. While the transfer is crawling along over SSH at sub-par speeds, I can browse to NODE over WireGuard from my Windows or Mac computer, pick any file to copy over the tunnel, and fully saturate the sending server's upload with no issues while SSH is choking in the background.

     Could it have something to do with the SSH changes that took place between 6.8.3 and 6.9.0? None of the changes I'm aware of sound like the culprit, but I could be wrong.
     So besides that, I'm pretty much out of ideas on what it could be, short of just playing with random ssh and rsync options. Let me know if there is some other info I can provide; below are both servers' diagnostic files: node-diagnostics-20210204-0751.zip void-diagnostics-20210204-0752.zip

     EDIT: I just realized LimeTech has published a guide about this: https://unraid.net/blog/unraid-server-to-server-backups-with-rsync-and-wireguard I looked it over and I'm not really doing anything different, except not passing -z (compression) to rsync and disabling compression for the SSH connection. A lot of what I transfer is video and doesn't compress well, so why waste the CPU cycles on it?
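     For anyone else chasing a similar problem, one way to separate the tunnel itself from rsync/SSH as the culprit is to benchmark raw throughput both through and around the tunnel with iperf3, left running long enough to catch the delayed slowdown. A minimal sketch; the 10.253.0.2 tunnel address and VOID.example.com are placeholders for your own peer's addresses:

         # on the receiving end (VOID), start a listener
         iperf3 -s

         # on NODE, push through the WireGuard tunnel for 10 minutes --
         # long enough to catch a slowdown that starts 5-30 minutes in
         iperf3 -c 10.253.0.2 -t 600

         # then repeat against the peer's public address, outside the tunnel
         iperf3 -c VOID.example.com -t 600

     If the tunnel run degrades while the direct run holds steady, the problem is somewhere in WireGuard's path rather than in rsync or SSH.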
  2. Yeah, I noticed that when I upgraded my test server to beta 31 (the first beta I tried this round). My non-root users were still able to log in, but I was getting random permission errors when trying to do things like launch MC. Unfortunately, rather than mess with troubleshooting it, I decided to just scrub it and wait until the new stable comes out to see what DocGyver and the community come up with for non-root access again. I'll be following your thread to see what you find out, so do keep us posted.
  3. The changes are detailed in the beta 25 release notes. They changed about three lines in the config, which if I had to take a stab are:

     • Only the root user is permitted to log in via SSH (remember: no traditional users in Unraid OS - just 'root'): AllowUsers root
     • Non-root tunneling is disabled: Match Group root / AllowTcpForwarding yes
     • A non-null password is now required: not sure about this one; I think SSH already disallowed blank passwords by default. PermitEmptyPasswords is set to No but commented out in a freshly generated UnRAID sshd_config file (at least for me).

     Are you using DocGyver's SSH plugin to allow non-root users access? Before trying out the betas I decided to scrub his plugin and its changes from my system, on the off chance I was going to have conflicts, since he doesn't appear to have updated it for the betas. I have attached my sshd_config for you to compare with if you want; it should be entirely UnRAID defaults besides my changing where to look for the authorizedkeysfile (line 44). sshd_config
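     For reference, gathered in one place, the three changes described above look roughly like this in sshd_config form (a sketch assembled from the release notes quoted above, not a verbatim copy of the shipped file):

         # only root may log in (no traditional users in Unraid OS)
         AllowUsers root

         # non-root tunneling is disabled; only root gets TCP forwarding
         Match Group root
             AllowTcpForwarding yes

         # commented out by default; sshd already refuses empty passwords
         #PermitEmptyPasswords no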
  4. Single or dual parity? UnRAID should be smart enough to keep track of your drive locations without you having to do anything special with a single parity disk. Dual parity is something I'm not familiar with, as I don't currently use it, but it does make a difference. With that in mind, you should probably grab a copy of your diagnostics zip and take a screenshot of the Main tab where all your drives and serial #s are displayed in the current order, just in case.
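     If it helps, the diagnostics zip can also be generated from the console or an SSH session rather than the webGUI; as far as I know the stock command writes the zip to the flash drive:

         # generates an anonymized diagnostics zip under /boot/logs/
         diagnostics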
  5. Oh yeah I thought about trying to dig through the PLGs myself but I didn't really know what I was looking for so I figured asking would be quicker. I'd just have a million questions about the PLGs instead lol. @dlandon thank you for the quick fix! Glad it wasn't anything serious.
  6. I did remove DocGyver's SSH and denyhosts plugins after the beta update, as they were quite old and I saw the SSH changes made in beta 25. I initially thought they might be my problem, but even after removing them and rebooting, this persisted.
  7. The update checker in UnRAID disagrees with you. I assumed I was all good. How do I tell which are out of date if I can't trust the updater? EDIT: I even have automatic weekly plugin updates turned on in Squid's plugin.
  8. Decided to dive straight into beta 35 (from stable) with my backup server to get ahead of any issues I might encounter, as my backup and main are configured very similarly. I've noticed two innocuous errors on the monitor attached to the server. They don't show up in the logs anywhere that I can find, and there is nothing preceding or following them. I rebooted into safe mode and those two lines don't appear, so it has something to do with a plugin, but I'm not sure how to go about figuring out which one. I poked through the beta threads and tried some searching but didn't see this mentioned anywhere else yet. I haven't noticed any lost or broken functionality, so the errors appear to be harmless, but I would feel better knowing where they are coming from and what damage, if any, they may cause. Diagnostics: void-diagnostics-20201203-1515.zip
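     Since safe mode makes the lines disappear, one way to narrow it down without scrubbing everything is to list what's installed and then pull plugins one at a time between reboots. These are the standard Unraid plugin locations, nothing beta-specific:

         # plugin definitions stored on the flash drive
         ls /boot/config/plugins/*.plg

         # plugins actually loaded this boot
         ls /var/log/plugins/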
  9. UnRAID v6.8.3. Diagnostics and the previous two system logs, capturing roughly my last 30 days of uptime: node-diagnostics-20201129-0852.zip

     A very strange issue I have just now started experiencing. Twice in the last 48 hours UnRAID has completely lost its ability to resolve DNS names without a reboot of the server. All attempts to ping by name result in "name or service not known". UnRAID is unable to resolve any names for update checks and the like. Pinging by IP address works without issue, and my WireGuard server continues to provide access to the system.

     I have made no recent changes to my network setup or DNS. My network settings are statically configured, and I utilize three upstream DNS servers, which all remain pingable during this outage: 8.8.8.8, 1.1.1.1, 8.8.4.4.

     When this issue occurs I can get into other devices on the network and they can resolve names just fine, so this would appear to be something exclusive to my machine. If you look at my syslog from 11/28-11/29 (included in the zip above) you will see I rebooted the server; after a few minutes DNS resolution started working again and I was able to update some plugins and whatnot.

         Nov 28 11:49:12 Node emhttpd: cmd: /usr/local/emhttp/plugins/dynamix.plugin.manager/scripts/plugin update fix.common.problems.plg
         Nov 28 11:49:12 Node root: plugin: creating: /boot/config/plugins/fix.common.problems/fix.common.problems-2020.11.28-x86_64-1.txz - downloading from URL https://raw.githubusercontent.com/Squidly271/fix.common.problems/master/archive/fix.common.problems-2020.11.28-x86_64-1.txz
         Nov 28 11:49:12 Node root: plugin: checking: /boot/config/plugins/fix.common.problems/fix.common.problems-2020.11.28-x86_64-1.txz - MD5
         Nov 28 11:49:12 Node root: plugin: running: /boot/config/plugins/fix.common.problems/fix.common.problems-2020.11.28-x86_64-1.txz
         Nov 28 11:49:13 Node root: plugin: running: anonymous
         Nov 28 12:49:00 Node kernel: veth375a643: renamed from eth0
         Nov 28 12:49:00 Node kernel: docker0: port 1(veth20db4d4) entered disabled state
         Nov 28 12:49:00 Node kernel: docker0: port 1(veth20db4d4) entered disabled state
         Nov 28 12:49:00 Node kernel: device veth20db4d4 left promiscuous mode
         Nov 28 12:49:00 Node kernel: docker0: port 1(veth20db4d4) entered disabled state
         Nov 28 13:51:25 Node kernel: mdcmd (47): spindown 4
         Nov 28 13:51:26 Node kernel: mdcmd (48): spindown 6
         Nov 28 13:51:26 Node kernel: mdcmd (49): spindown 7
         Nov 28 13:51:26 Node kernel: mdcmd (50): spindown 9
         Nov 28 13:51:52 Node kernel: mdcmd (51): set md_write_method 0
         Nov 28 13:51:52 Node kernel:
         Nov 28 14:00:43 Node kernel: mdcmd (52): spindown 2
         Nov 28 14:01:36 Node kernel: mdcmd (53): spindown 1
         Nov 28 15:21:54 Node kernel: mdcmd (54): set md_write_method 1
         Nov 28 15:21:54 Node kernel:
         Nov 28 17:25:38 Node kernel: mdcmd (55): spindown 1
         Nov 28 17:25:48 Node kernel: mdcmd (56): spindown 2
         Nov 28 17:25:50 Node kernel: mdcmd (57): spindown 9
         Nov 28 17:25:53 Node kernel: mdcmd (58): spindown 7
         Nov 28 17:25:56 Node kernel: mdcmd (59): set md_write_method 0
         Nov 28 17:25:56 Node kernel:
         Nov 28 19:40:22 Node kernel: mdcmd (60): spindown 4
         Nov 28 19:52:48 Node kernel: mdcmd (61): spindown 8
         Nov 28 20:41:07 Node kernel: mdcmd (62): spindown 6
         Nov 28 21:27:50 Node kernel: mdcmd (63): spindown 9
         Nov 28 23:20:56 Node kernel: mdcmd (64): spindown 4
         Nov 28 23:27:19 Node kernel: mdcmd (65): spindown 5
         Nov 28 23:41:38 Node kernel: mdcmd (66): spindown 7
         Nov 29 00:00:01 Node Docker Auto Update: Community Applications Docker Autoupdate running
         Nov 29 00:00:01 Node Docker Auto Update: Checking for available updates
         Nov 29 00:00:02 Node Docker Auto Update: No updates will be installed
         Nov 29 00:15:03 Node kernel: mdcmd (67): spindown 8
         Nov 29 00:43:31 Node kernel: mdcmd (68): spindown 2
         Nov 29 01:00:01 Node root: Fix Common Problems Version 2020.11.28
         Nov 29 01:00:01 Node root: Fix Common Problems: Error: Unable to communicate with GitHub.com
         Nov 29 01:00:01 Node root: Fix Common Problems: Other Warning: Could not check for blacklisted plugins
         Nov 29 01:00:12 Node root: Fix Common Problems: Other Warning: Could not perform docker application port tests
         Nov 29 01:00:12 Node sSMTP[31740]: Unable to locate smtp.gmail.com
         Nov 29 01:00:12 Node sSMTP[31740]: Cannot open smtp.gmail.com:465
         Nov 29 01:01:11 Node kernel: mdcmd (69): set md_write_method 1

     Everything was great until 1AM, when FCP wanted to run and was unable to resolve github.com again... I see nothing between those two events that explains my sudden loss of DNS, or why this is suddenly happening daily. I'd love some help figuring this out, as it brings my server and dockers to a grinding halt with so many parts depending on name resolution.

     EDIT: In the time it took me to draft this topic I have lost DNS resolution again. I'm at a loss as to what has suddenly changed to cause this...

     EDIT2: Hmm, it almost seems to be related to docker? I stopped the dockers and the docker service and now it's back up? I'm honestly just grasping at straws here though; I can't find a rhyme or reason yet. Bringing docker back online doesn't seem to immediately break it.

     UPDATE: Well, it's now been 48 hours with no issues... I still have no idea what caused this but it seems to have resolved itself....
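     The next time resolution dies, it's worth checking whether queries sent straight to an upstream server still work while the system resolver fails; that separates a local configuration problem from a network path problem. A quick sketch:

         # what the system resolver is configured to use
         cat /etc/resolv.conf

         # lookup via the system resolver (does this fail during the outage?)
         nslookup github.com

         # lookup sent directly to an upstream, bypassing local config
         nslookup github.com 8.8.8.8

     If the direct query works while the plain one fails, the problem is local resolver configuration (something rewriting /etc/resolv.conf, for instance) rather than connectivity to the DNS servers.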
  10. Is your AT&T router in bridge mode, or otherwise allowing pfSense to handle IP addressing, routing, DNS, etc.? Putting the AT&T device in bridge mode essentially turns it into a dumb modem, letting pfSense handle your network traffic and security. If not, you may be double-NATing yourself, which may be your biggest problem here. I couldn't find home-user instructions and am not familiar with AT&T-provided routers, but this should be a start: https://www.att.com/support/smallbusiness/article/smb-internet/KM1188700/

      Beyond that, I had on-and-off issues with remote access until I added server:private-domain: "plex.direct" to the DNS Resolver custom options box in pfSense: https://forums.plex.tv/t/secure-connections-lan-and-pfsense/123319/9 (I didn't find the NAT+Proxy setting in that thread necessary; mine is set to system default, which for me keeps NAT reflection disabled.) That, plus making sure the port forward was set up properly, was (IIRC) all I had to do to get Plex working behind pfSense. As mfwade mentioned, if you use pfBlockerNG and have GeoIP filtering on, you may want to turn it off until you get Plex working, to eliminate it as a potential problem.
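      For anyone copying that fix: the snippet is standard unbound syntax and goes in the DNS Resolver's custom options box (in the pfSense builds I've used, that box lives under Services > DNS Resolver > General Settings; exact placement may vary by version):

         server:
         private-domain: "plex.direct"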
  11. Yeah, I would be interested in both your original use case (failed fan) and my fringe case (failed AC and parity check kick-off). I have 4x 5-in-3 drive cages with separate fans in my second server and would definitely be interested in having the ability to stop the array or shut the system down if one of those failed and my drives started heating up real bad. To put my mind at ease a bit: when you had this happen to you, @JorgeB, did you notice the "cooked" drives failed at a higher rate than the others? I'm waiting for someone to get into the DC and check the AC before my system gets powered up, so right now I'm just doing a lot of reading on overheated drives and possible issues I may encounter.
  12. Well, in this case that would have helped me a bit. The drives were toasty but not overheating before the parity check kicked off. I'll look into that plugin, as I apparently can't trust people to remember to turn on the flippin' air conditioner after a power outage. It would have saved several of my drives from toasting themselves out of warranty coverage (the older Reds have a max operating temp of 60C). Anyone have experience with RMA'ing drives that have overheated? Is that something normally checked by WD? I'm just trying to educate myself on how likely I am to be screwed by this little incident if I end up trying to RMA some of these down the road.
  13. I would be interested in seeing this come to fruition for scenarios where cooling is normally adequate but a sudden failure leads to skyrocketing temps. I had a power outage where my main server is hosted; the server stayed up on battery, but when the power came back the AC was not switched back on and a parity check kicked off. This left my disks running at 60-62C for the whole night, until I woke up, saw the 50+ alerts from UnRAID, and shut the server down. Every single one of my disks reached 60C at one time or another. I'm thinking about stronger fans that can move more air as well, but in a locked DC closet with limited airflow, I think a shutdown would always be the safer option when the AC is out.
  14. Next gen seems to work well enough. Thanks binhex!
  15. Ah, so Android 10 then, not AndroidTV. See if the My Files app lets you connect to the public share as a guest. The "Files" app on my Android 11 phone doesn't appear to support browsing my local network, so I can't really test your specific scenario. Let me know if you have trouble connecting with Kodi; I can try to help with it, though I mostly use it strictly for the Plex add-on so I can watch offline when the internet goes down (the Plex app on the Shield breaks without internet).
  16. Nope, I got two notifications from UnRAID, one about the read errors and a second about the 2 pending sectors. I went to check the SMART stats and saw the discrepancy, canceled the check and shut down the server to swap the disk with another. The diag file posted above is from before I shut down the server and after the alerts were generated. I'll let the parity check finish and order a replacement disk. I missed my warranty window by about 6 months =(
  17. To be clear, is this actually an AndroidTV box like the Shield TV, or is it an "Android box for TVs" like the ones you can find all over Amazon? I ask because some of these run AndroidTV while others just run Android, and they can look and act differently. The "My Files" app is not a part of AndroidTV that I'm aware of, so I imagine you are using a set-top box with Android installed. On my 2019 Shield TV I was able to use Kodi's file manager to access my UnRAID shares with credentials by passing them as part of the SMB path (see the example below). Requiring credentials for public shares is unfortunately rather common when dealing with SMB shares, in my experience. It is not an UnRAID issue specifically; certain OSes and apps seem to be able to handle it, others can't. I use File Commander on my Shield to side-load apps from my UnRAID public share and it works fine (the guest box is required; it doesn't like blank credentials).
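      The screenshot of the path didn't carry over, but Kodi accepts credentials embedded in the SMB URL in the standard form (the server and share names here are placeholders):

         smb://username:password@tower/Public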
  18. UnRAID v6.8.3. void-diagnostics-20200915-0707.zip

      My monthly parity check on my backup server has produced read errors on disk 1 for the last two months... The first time it happened there was no report of pending or re-allocated sectors, and the drive passed both a short and a long SMART self-test, so I wrote it off as a fluke and went on with my life. This morning it happened again within minutes of the parity check starting, this time with UnRAID claiming there are 2 pending sectors. However, when I go to look at the drive stats in the context menu, SMART doesn't report any pending or reallocated sectors?? I plan on moving the drive to a different slot to see whether the error follows the disk or stays with the slot. Anyone ever seen UnRAID misreport pending sectors like that before? Is SMART just slow on the uptake?

      EDIT: Swapped the disk with another slot; rerunning a nocorrect check now.

      EDIT2: It appears to be following the disk: different slot, same disk with read errors. No reports of reallocated sectors by UnRAID this time, just read errors.

         Sep 15 07:25:06 VOID kernel: mdcmd (57): check nocorrect
         Sep 15 07:25:06 VOID kernel: md: recovery thread: check P ...
         Sep 15 07:25:24 VOID emhttpd: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog
         Sep 15 07:26:20 VOID emhttpd: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog
         Sep 15 07:28:31 VOID ntpd[1859]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized
         Sep 15 07:28:52 VOID kernel: mpt2sas_cm1: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
         Sep 15 07:28:52 VOID kernel: mpt2sas_cm1: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
         Sep 15 07:28:52 VOID kernel: mpt2sas_cm1: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
         Sep 15 07:28:52 VOID kernel: mpt2sas_cm1: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
         Sep 15 07:28:52 VOID kernel: mpt2sas_cm1: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
         Sep 15 07:28:52 VOID kernel: mpt2sas_cm1: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
         Sep 15 07:28:52 VOID kernel: sd 10:0:0:0: [sdp] tag#3130 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
         Sep 15 07:28:52 VOID kernel: sd 10:0:0:0: [sdp] tag#3130 Sense Key : 0x3 [current]
         Sep 15 07:28:52 VOID kernel: sd 10:0:0:0: [sdp] tag#3130 ASC=0x11 ASCQ=0x0
         Sep 15 07:28:52 VOID kernel: sd 10:0:0:0: [sdp] tag#3130 CDB: opcode=0x88 88 00 00 00 00 00 01 ba 94 b0 00 00 04 00 00 00
         Sep 15 07:28:52 VOID kernel: print_req_error: critical medium error, dev sdp, sector 29005968
         Sep 15 07:28:52 VOID kernel: md: disk1 read error, sector=29005904
         Sep 15 07:28:52 VOID kernel: md: disk1 read error, sector=29005912
         Sep 15 07:28:52 VOID kernel: md: disk1 read error, sector=29005920
         Sep 15 07:28:52 VOID kernel: md: disk1 read error, sector=29005928
         Sep 15 07:29:16 VOID kernel: sd 10:0:0:0: attempting task abort! scmd(00000000ee3221de)
         Sep 15 07:29:16 VOID kernel: sd 10:0:0:0: [sdp] tag#3104 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e5 00
         Sep 15 07:29:16 VOID kernel: scsi target10:0:0: handle(0x0009), sas_address(0x4433221104000000), phy(4)
         Sep 15 07:29:16 VOID kernel: scsi target10:0:0: enclosure logical id(0x5c81f660e69c9f00), slot(7)
         Sep 15 07:29:17 VOID kernel: sd 10:0:0:0: task abort: SUCCESS scmd(00000000ee3221de)
         Sep 15 07:29:17 VOID kernel: sd 10:0:0:0: Power-on or device reset occurred
         Sep 15 07:29:22 VOID kernel: sd 10:0:0:0: Power-on or device reset occurred
         Sep 15 07:29:34 VOID kernel: mpt2sas_cm1: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
         Sep 15 07:29:34 VOID kernel: mpt2sas_cm1: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
         Sep 15 07:29:34 VOID kernel: sd 10:0:0:0: [sdp] tag#3105 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
         Sep 15 07:29:34 VOID kernel: sd 10:0:0:0: [sdp] tag#3105 Sense Key : 0x3 [current]
         Sep 15 07:29:34 VOID kernel: sd 10:0:0:0: [sdp] tag#3105 ASC=0x11 ASCQ=0x0
         Sep 15 07:29:34 VOID kernel: sd 10:0:0:0: [sdp] tag#3105 CDB: opcode=0x88 88 00 00 00 00 00 01 ba c8 b0 00 00 04 00 00 00
         Sep 15 07:29:34 VOID kernel: print_req_error: critical medium error, dev sdp, sector 29019160
         Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019096
         Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019104
         Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019112
         Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019120
         Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019128
         Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019136
         Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019144
         Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019152
         Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019160
         Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019168
         Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019176
         Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019184
         Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019192
         Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019200
         Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019208
         Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019216
         Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019224
         Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019232
         Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019240
         Sep 15 07:29:39 VOID rc.diskinfo[12312]: SIGHUP received, forcing refresh of disks info.
         Sep 15 07:30:57 VOID emhttpd: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog

      EDIT: "Solved", per se. I know the drive is dying, though I do find the ghost re-allocated sectors strange. I have a new one on order.
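      When UnRAID's alerts and the context menu disagree like this, querying the disk directly with smartctl settles what the drive itself is reporting (sdp is the device node from the log above):

         # full SMART report for the suspect disk
         smartctl -a /dev/sdp

         # just the two attributes UnRAID alerts on
         smartctl -A /dev/sdp | grep -E 'Reallocated_Sector_Ct|Current_Pending_Sector'

         # queue an extended self-test; check back with -a when it finishes
         smartctl -t long /dev/sdp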
  19. Just came here to see why the update broke the container and what the deal was with the spinning logo. Also not a fan of the animated logo. Thanks for the quick fix.
  20. Is anyone else noticing that the Plex docker is getting killed for out-of-memory errors during Plex's server maintenance window? This morning alone Plex has been OOM-reaped dozens of times, sometimes as often as every minute!! https://pastebin.com/8BcJnQ5J

      This is happening across two different servers with different memory amounts and client loads (both use the LSIO Plex docker). The errors begin within minutes of the maintenance window opening and never appear after it has closed. I do have hard RAM limits set in the docker's config; however, they have never approached or hit this limit in normal operation, so I'm hesitant to raise them (they are in place to prevent runaway RAM consumption by dockers). NODEFlix has an 8GB RAM limit set for the PMS docker; NOTYOFLIX has a 6GB RAM limit set for the PMS docker.

      I haven't changed anything about either server's Plex settings or docker configuration recently. The only thing I can think of is that Plex's new "Detect Intro" feature runs as a scheduled task, so I have disabled it and will monitor whether the errors return with that setting off. However, I don't recall seeing this issue when that feature was introduced; this problem just appeared a few days ago... Attached are both servers' diagnostic zips: node-diagnostics-20200705-0545.zip void-diagnostics-20200705-0545.zip I've posted about my issues over on the Plex forums, since this thread is too massive for anything to be seen or addressed: https://forums.plex.tv/t/server-maintenance-leads-to-excessive-resource-utilization/611012/
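      For context, the hard RAM limits mentioned above are applied through docker's --memory flag (in Unraid it goes in the container's Extra Parameters field). The equivalent docker run form, with the 8GB cap described for NODEFlix, is roughly this (ports, volumes, and other required flags omitted for brevity):

         # hard-cap the LSIO Plex container at 8 GiB; the kernel OOM-kills
         # processes inside the container once the cap is exceeded
         docker run -d --name=plex --memory=8g linuxserver/plex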
  21. For me it is somewhat different: it is 100% reproducible with Chrome (currently version 83.0.4103.116, 64-bit) on Windows 10 v1909, and the tunnel goes down and stays down. I just tried with Chrome on my Android phone (Pixel 3a XL with Android 10) over LTE, and it also brought down the tunnel and did not restart it. The first set of stop/start is me on Windows 10 adding a test peer, then logging in locally and re-enabling the tunnel. The second set is me logging in via my Android and removing said peer, which also brought down the tunnel. Connected directly to the web interface via the LAN (not a VPN), I can make changes to the tunnel settings in Chrome on Windows 10 and the tunnel rolls without issue. The tunnel only stops and stays down when I'm managing WireGuard over a WireGuard connection.

      EDIT: Also happens in the latest Firefox on Windows.

      EDIT2: I tried to manage WireGuard over WireGuard again, this time using RDP from one Windows machine to another, then using that machine to access the UnRAID webUI. Management over the RDP connection tunneled through WireGuard successfully brought the tunnel down and back up. So my problem seems to be that any direct attempt to manage the WireGuard server over a WireGuard connection results in the tunnel going down and staying down. If I connect to another machine on the LAN over WireGuard and use that machine to manage the WireGuard server, it seems to go down and come back up gracefully.
  22. Is WireGuard supposed to just stop the tunnel and leave it stopped when adding a new peer, or really when making any changes at all? I set up WireGuard remote access to LAN for my phone and PC no problem; super easy, as advertised. But when I'm connected over WireGuard managing UnRAID and I go to add a peer and hit apply, the UnRAID webUI stops working because the tunnel has been stopped:

         Jun 25 09:41:05 Node wireguard: Tunnel WireGuard-wg0 stopped
         Jun 25 09:43:20 Node webGUI: Successful login user root from xxx.xxx.xxx.xxx
         Jun 25 09:43:24 Node wireguard: Tunnel WireGuard-wg0 started

      Thankfully I have other remote access methods to this server, so I was able to go in and restart the tunnel, but I don't see how this could be by design... Shouldn't it be able to gracefully roll the connection? I'll make a new thread where troubleshooting can be done if this is unexpected behavior.

      EDIT: Just got kicked again, merely changing the connection type for a peer that isn't even in use currently. It just stopped the tunnel and left it off...

      EDIT2: It sounds like, depending on how peers are added, active session interruption could be avoided (see the sketch below): https://manpages.debian.org/unstable/wireguard-tools/wg.8.en.html#COMMANDS:~:text=syncconf

      EDIT3: I am just so utterly lost on how to make my main server talk to my backup server directly over WireGuard. I currently have SSH and rsync running a monthly backup of my data; I would like to stop leaving SSH open to the net, but I can't get server-to-server or remote access to the server to work to save my life. I followed your "rough instructions" to set up server-to-server on one and import on the other, but now I have a second tunnel I don't really want. Do I have to have a second tunnel for this to work? Can I not just add the server as a peer to my existing tunnel with my phone and home PC? ...I got server-to-server working. Still not sure if a second port forward and tunnel was required or not, but at least the Chinese bots will stop spamming my logs with SSH brute-force attempts (key-based auth only, so it is more aesthetic than a real security concern).
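      The syncconf idiom from that man page can be run by hand against a live interface; it is the documented way to apply peer changes without tearing the tunnel down. A sketch, assuming the interface is wg0 (as in the log above) and the config lives where wg-quick can find it:

         # strip the wg-quick-only fields (Address, DNS, etc.) that wg(8)
         # doesn't understand, then sync the rest into the running
         # interface without restarting it
         wg syncconf wg0 <(wg-quick strip wg0)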
  23. Yeah, I ran a second scrub after deleting the corrupted file and it reports no further errors:

         UUID:             cc9f1614-fc5d-406a-8ee7-58a5651dc9ae
         Scrub started:    Thu May 21 07:58:40 2020
         Status:           finished
         Duration:         0:02:48
         Total to scrub:   75.17GiB
         Rate:             458.17MiB/s
         Error summary:    no errors found

      Thanks for reminding me about not being able to repair without a pool; I forgot that was the case.
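      For reference, the scrub and the status report above come from btrfs's built-in commands (the mount point is Unraid's default cache path; adjust for your own pool):

         # start a scrub on the cache pool, then poll its progress
         btrfs scrub start /mnt/cache
         btrfs scrub status /mnt/cache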
  24. Ok, cool, so I don't necessarily need to do a scrub repair? Neat, I'll just delete the file then. Thanks for the reassurance 😃