weirdcrap

Everything posted by weirdcrap

  1. Good morning itimpi, this may be related to the issue ScottinOkla is describing. I had a parity check running all day yesterday (manually started), went to bed, and woke up to find it paused for no reason. I do have the Parity Check Tuning plugin installed, and my only pause scenario is disks overheating. I see no indication in the logs that any disks were overheating (additionally, my house has quite a cold ambient temperature). I did have "Send notification for temperature related pause" set to NO, but I assume it would still have logged the pause and its reason? It stopped right at 3:30AM CST with no corresponding log entries around it as to why...

     Feb 12 13:44:24 VOID webGUI: Successful login user root from 192.168.1.8
     Feb 12 13:45:00 VOID root: Fix Common Problems Version 2021.01.27
     Feb 12 13:47:23 VOID kernel: mdcmd (38): check nocorrect
     Feb 12 13:47:23 VOID kernel: md: recovery thread: check P ...
     Feb 12 13:47:48 VOID kernel: mdcmd (39): set md_write_method 1
     Feb 12 13:47:48 VOID kernel:
     Feb 12 13:49:04 VOID root: Stopping Auto Turbo
     Feb 12 13:49:04 VOID root: Setting write method to unRaid defined
     Feb 12 13:49:04 VOID kernel: mdcmd (40): set md_write_method auto
     Feb 12 17:04:35 VOID dhcpcd[1683]: br0: failed to renew DHCP, rebinding
     Feb 12 18:19:34 VOID emhttpd: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog
     Feb 12 23:00:01 VOID Plugin Auto Update: Checking for available plugin updates
     Feb 12 23:00:06 VOID Plugin Auto Update: Checking for language updates
     Feb 12 23:00:06 VOID Plugin Auto Update: Community Applications Plugin Auto Update finished
     Feb 13 00:00:07 VOID crond[1849]: exit status 1 from user root /usr/local/sbin/mover &> /dev/null
     Feb 13 00:10:37 VOID emhttpd: spinning down /dev/sdh
     Feb 13 00:10:37 VOID emhttpd: spinning down /dev/sdg
     Feb 13 00:10:37 VOID emhttpd: spinning down /dev/sdb
     Feb 13 00:10:37 VOID emhttpd: spinning down /dev/sdf
     Feb 13 00:10:37 VOID emhttpd: spinning down /dev/sdq
     Feb 13 00:10:37 VOID emhttpd: spinning down /dev/sdo
     Feb 13 02:00:01 VOID Docker Auto Update: Community Applications Docker Autoupdate running
     Feb 13 02:00:01 VOID Docker Auto Update: Checking for available updates
     Feb 13 02:00:07 VOID Docker Auto Update: No updates will be installed
     Feb 13 03:30:06 VOID kernel: mdcmd (49): nocheck PAUSE
     Feb 13 03:30:06 VOID kernel:
     Feb 13 03:30:06 VOID kernel: md: recovery thread: exit status: -4
     Feb 13 05:02:21 VOID emhttpd: read SMART /dev/sdh
     Feb 13 05:02:29 VOID emhttpd: read SMART /dev/sdg
     Feb 13 05:02:35 VOID emhttpd: read SMART /dev/sdb
     Feb 13 05:02:42 VOID emhttpd: read SMART /dev/sdo
     Feb 13 05:02:49 VOID emhttpd: read SMART /dev/sdf
     Feb 13 05:02:54 VOID emhttpd: read SMART /dev/sdq

     void-diagnostics-20210213-0521.zip
  2. Interesting, you may have more than one plugin to blame here then. As Ken-ji mentioned in that other thread, you will probably need to go through the plugins one by one to figure out which one is changing your ownership and ask the author to fix it.
  3. You shouldn't even need to remove the plugin. At least for me, the proper ownership of the / directory was restored by simply rebooting UnRAID.
  4. You were right @ken-ji. My other server hadn't had the Parity Check Tuning plugin updated yet, so I installed the update. Sure enough, I lost public key access and my /usr folder is owned by nobody:users now. EDIT: So I went digging into the folders it seems to have touched and I ended up in the plugins folder, and it looks like some of the other plugins I've installed over the years aren't owned by root either. Is it a rule of thumb that all of these files should be owned by root? Or is it different from plugin to plugin?
  5. Interesting. The reboot actually fixed it for me, so UnRAID must be correctly applying the default permissions at boot and overriding anything the plugins may be messing with. I don't have any more plugin updates currently, but I will keep an eye out for one and see if my SSH breaks after applying it. EDIT: Yeah, it was almost certainly the /usr folder that caused this, as I recall it was the only folder in / that wasn't owned by root. I ignored it because I didn't think that folder would have anything to do with my issue, since my key and SSH files weren't stored there.
  6. Yeah, I was able to get in with my password still, but I could not find anything wrong with my / directory. Next time it happens I'll be sure to get a screenshot and share it so I can get some more eyes on it. If it was a plugin though, wouldn't it have affected me from the moment I booted the system? This occurred after 6 days of uptime and multiple uses of the key up to that point. I haven't added or removed any plugins; I think I may have updated one, the Parity Check Tuning plugin.
  7. I also just got hit with my first case of "Authentication refused: bad ownership or modes for directory /". I have SSH'd into this server every day this week using my public key. Now, with zero changes on my end, UnRAID says my permissions are wrong. I have a second server with the exact same setup and the public key works fine... I'm mid appdata backup; I'm going to restart afterwards and see if this persists. I use Termius, but also have PuTTY installed. Updating PuTTY from 0.73 to 0.74 did nothing to fix the issue. EDIT: While I'm waiting for the appdata backup to finish I've been poking around, and I can't find anything wrong with the ownership of any of the files or folders according to the output of ls -al. I've even tried re-owning and chmodding the relevant folders as suggested above and on other sites; it makes no difference. Public key auth is just flat out broken for no discernible reason. EDIT: Yeah, a reboot fixes it with no changes to my config. Hopefully whatever causes this gets fixed before stable; it's alarming to suddenly not be able to connect to your server. Luckily I don't disallow password auth right now, though I would like to turn it off at some point for heightened security.
  8. Well, I guess I'll stick with the MX500 then, even though just ignoring the issue gives me a deep-seated feeling of wrongness lol. Maybe the smartmontools update you linked to will resolve this once and for all by adjusting the alerting behavior for this drive. A question about the 860 EVO drive on my LSI and its lack of TRIM: if I don't ever fill my SSD up (it generally hovers around 100-200GB used), will the lack of TRIM support have any noticeable performance and/or longevity effects? I've been reading about the subject, and people talk about the write amplification implications of not having TRIM or garbage collection for heavily utilized SSDs. However, it sounds like if you have lots of blocks without needed data, there shouldn't be a lot of shuffling around for the controller firmware to do, right?
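On the TRIM half of the question: when the controller does pass TRIM through, a manual trim is a one-liner (mount point hypothetical; I believe the Dynamix SSD TRIM plugin schedules essentially this same call):

```shell
# Trim all free blocks on the cache filesystem; -v reports how much was trimmed.
# /mnt/cache is a hypothetical mount point. On a controller without TRIM
# pass-through, fstrim fails with an I/O error instead of trimming.
fstrim -v /mnt/cache 2>&1 || echo "fstrim not supported here"
```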
  9. @JorgeB So what are good 1TB SATA (non-M.2) SSDs that are known to work well with UnRAID, without frustratingly weird firmware issues like my 860 EVO problem or the Crucial MX500's randomly reporting bogus bad sectors? I just bought an MX500 to put into my remote server as a replacement for my other aging 850 EVO, but if it's going to randomly report bad sectors I don't want anything to do with it and would rather return it for something else. Intel seems to have all but abandoned consumer-line SSDs bigger than 512GB, and I don't particularly want to pay a premium for a "data center quality" SATA SSD: https://www.intel.com/content/www/us/en/products/memory-storage/solid-state-drives/data-center-ssds.html I've not had good experiences with Kingstons, and I don't really know much about the Western Digital line of SSDs. EDIT: So I found the main topic for these Crucial drives. So disabling the monitoring for attribute 197 just prevents email/push alerts, but it will still track and report bad sectors that stick around in the webUI? UnRAID only disables a disk if it fails a write, so this bug shouldn't cause any sort of disabling issues, right?
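If you want to keep an eye on attribute 197 yourself while notifications are off, the raw value is easy to pull out of smartctl's attribute table. A sketch (the device name in the usage comment is hypothetical; the filter works on the standard `smartctl -A` column layout, where the attribute ID is the first field and the raw value is the last):

```shell
# Extract the raw value of attribute 197 (Current_Pending_Sector)
# from smartctl's attribute table on stdin.
smart_attr_197() {
  awk '$1 == 197 { print $NF }'
}
# Typical use:  smartctl -A /dev/sdX | smart_attr_197
```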
  10. You are correct on all counts. I have no clue what to make of it; I've recreated the tunnel and peer numerous times... Could this be some sort of weird networking or router quirk? My only thought would be to turn on allowed-connection logging and compare the "good" connections to the "bad" ones.
  11. I am still perplexed by why WireGuard is not using the port specified in its config to reach my home server. Why the random ports instead of the defined port that I have a forward for?
  12. @claunia That is interesting; I have been blaming my setup for the issues. With how widespread WireGuard adoption is becoming, and its touted speed, I would have thought someone would have noticed this before us if it was in fact caused by WireGuard. I also wonder why it only seems to affect SSH/rsync transfers for me; I can use Windows File Explorer to copy data at full speed. I've used up most of my data cap from my ISP for this month, so I can't do more testing right now. However, next month I plan on re-enabling direct SSH access with public-key-only auth for a few days of testing and comparing what speeds I can get. I will be quite disappointed if it turns out I can't use WireGuard for backups...
  13. Yeah, I thought so as well; I guess I didn't wait long enough for the tunnel to close. I don't get why WireGuard isn't using the ports I defined in the interface to communicate with the endpoints. My port forwards are for 51820 and 51821, but for whatever reason it's trying to hit random ports like 49997, 26608, 10757, etc. It feels like I'm missing something really obvious that I've blatantly misconfigured, but I can't figure out what it is. Source is on the left, which is me on NODE trying to ping VOID. Destination is on the right, which appears to be NODE trying random ports that the firewall is blocking because those aren't my forwarded ports.
  14. I spoke to LJM42 about this a bit in my other thread, but I figured I'd post in the proper thread for help as it still isn't working correctly... I have a server-to-server setup for doing backup and sync activities between two remote UnRAID servers. My problem is that, for some strange reason, I can only ever start the tunnel from one side of the connection despite my setup being identical on both ends (from what I can tell). I've already tried completely deleting the tunnel on both ends and re-creating it. VOID and NODE are both WG1 on their respective servers and have all the proper endpoints defined (see screenshots below). VOID sits behind a pfSense box with a 2-port UDP range (51820-51821) forwarded for my two tunnels; I can always start the tunnel (WG1) from VOID. NODE sits behind a Zyxel USG110 with the same 2-port UDP range forwarded; I can never start the tunnel (WG1) from NODE. Once I send a ping from VOID to NODE, then and only then can NODE start talking to VOID. In pfSense, when I am attempting to ping from NODE to VOID, I see blocked connections in the firewall originating from NODE's endpoint IP/port and coming to VOID, but NOT on my forwarded ports, so they are getting denied. I assume this is why I can't start the tunnel from the other side, but why is it not respecting the set port? Do I just not understand how the ping function works?
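For reference, the relevant pieces of a server-to-server WireGuard config look like this (keys, addresses, and hostname are hypothetical placeholders, not the actual setup from these posts). WireGuard sends its packets *from* the ListenPort it binds, so the port is defined in one place:

```ini
# VOID side (values hypothetical). WireGuard uses ListenPort as both the
# listening port and the source port of outgoing packets. A NAT router in
# the path can still rewrite that source port in transit, which would show
# up at the far firewall as seemingly random source ports.
[Interface]
PrivateKey = <VOID-private-key>
Address = 10.253.1.1/32
ListenPort = 51821

[Peer]
PublicKey = <NODE-public-key>
Endpoint = node.example-ddns.net:51821
AllowedIPs = 10.253.1.2/32
```

Note that the blocked entries described above are about the *source* port of NODE's packets; the port forward only needs to match the *destination* port, which is set by the Endpoint line on the sending side.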
  15. Well, ideally I'd like to make this one work in this system, as I'm stuck with it. I have moved it to the LSI controller to see if that helps. If not, I may just keep this drive for an eventual gaming build (whenever the scalping and price craziness stops) and buy a Crucial MX500. I was reading their marketing, and they have TRIM at the firmware level, which sounds interesting for scenarios where the drive may be on an LSI controller. EDIT: OK, so far so good on the LSI. EDIT: Well, TRIM fails and throws an error, but other than that things appear to be working.
  16. Well, crap, apparently there is no newer firmware available... EDIT: https://bugzilla.kernel.org/show_bug.cgi?id=203475 Seems to be both TRIM and NCQ related. I TRIM on Sundays, so this is almost certainly because of NCQ. Disabling NCQ tanks 4K random performance though, which isn't ideal. If I had known the 860 EVOs were going to be such trouble I would never have bought them... I guess I should start buying a different brand of SSD. How would I go about disabling NCQ in UnRAID for this disk as a test? EDIT: Alternatively, I have not yet tried this drive on the LSI. Loss of TRIM would obviously bypass that issue, but what about NCQ? Does the LSI support it?
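One way to test disabling NCQ for a single disk is dropping its queue depth to 1 via sysfs, which lasts until reboot. A sketch (the device name is hypothetical; substitute the disk's actual sdX identifier and run as root):

```shell
# Setting queue_depth to 1 effectively disables NCQ for this disk
# until the next reboot. DEV is a hypothetical device name.
DEV=sdh
QD=/sys/block/$DEV/device/queue_depth
if [ -w "$QD" ]; then
  echo "current depth: $(cat "$QD")"
  echo 1 > "$QD"
else
  echo "would run: echo 1 > $QD"
fi
```

For a persistent test across reboots, the kernel also accepts a `libata.force=7.00:noncq` boot parameter (using the ata7.00 port number from the log below), which on Unraid would go on the append line in syslinux.cfg.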
  17. @JorgeB It waited 2 days to resurface, but the errors are back, even on a different controller with different SATA and power cables...

      Feb 5 00:45:10 VOID crond[1837]: exit status 1 from user root /usr/local/sbin/mover &> /dev/null
      Feb 5 01:00:58 VOID kernel: ata7.00: exception Emask 0x0 SAct 0xc0c00041 SErr 0x0 action 0x6 frozen
      Feb 5 01:00:58 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
      Feb 5 01:00:58 VOID kernel: ata7.00: cmd 61/08:00:b8:ca:00/00:00:00:00:00/40 tag 0 ncq dma 4096 out
      Feb 5 01:00:58 VOID kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
      Feb 5 01:00:58 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:00:58 VOID kernel: ata7.00: failed command: READ FPDMA QUEUED
      Feb 5 01:00:58 VOID kernel: ata7.00: cmd 60/08:30:00:20:cd/00:00:0c:00:00/40 tag 6 ncq dma 4096 in
      Feb 5 01:00:58 VOID kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
      Feb 5 01:00:58 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:00:58 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
      Feb 5 01:00:58 VOID kernel: ata7.00: cmd 61/08:b0:28:a4:63/00:00:00:00:00/40 tag 22 ncq dma 4096 out
      Feb 5 01:00:58 VOID kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
      Feb 5 01:00:58 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:00:58 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
      Feb 5 01:00:58 VOID kernel: ata7.00: cmd 61/08:b8:e8:72:65/00:00:00:00:00/40 tag 23 ncq dma 4096 out
      Feb 5 01:00:58 VOID kernel: res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
      Feb 5 01:00:58 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:00:58 VOID kernel: ata7.00: failed command: SEND FPDMA QUEUED
      Feb 5 01:00:58 VOID kernel: ata7.00: cmd 64/01:f0:00:00:00/00:00:00:00:00/a0 tag 30 ncq dma 512 out
      Feb 5 01:00:58 VOID kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
      Feb 5 01:00:58 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:00:58 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
      Feb 5 01:00:58 VOID kernel: ata7.00: cmd 61/08:f8:b0:be:00/00:00:00:00:00/40 tag 31 ncq dma 4096 out
      Feb 5 01:00:58 VOID kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
      Feb 5 01:00:58 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:00:58 VOID kernel: ata7: hard resetting link
      Feb 5 01:00:58 VOID kernel: ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
      Feb 5 01:00:58 VOID kernel: ata7.00: supports DRM functions and may not be fully accessible
      Feb 5 01:00:58 VOID kernel: ata7.00: supports DRM functions and may not be fully accessible
      Feb 5 01:00:58 VOID kernel: ata7.00: configured for UDMA/133
      Feb 5 01:00:58 VOID kernel: ata7: EH complete
      Feb 5 01:00:58 VOID kernel: ata7.00: Enabling discard_zeroes_data
      Feb 5 01:00:58 VOID kernel: ata7.00: invalid checksum 0xdc on log page 10h
      Feb 5 01:00:58 VOID kernel: ata7: log page 10h reported inactive tag 1
      Feb 5 01:00:58 VOID kernel: ata7.00: exception Emask 0x1 SAct 0x1f8 SErr 0x0 action 0x0
      Feb 5 01:00:58 VOID kernel: ata7.00: irq_stat 0x40000008
      Feb 5 01:00:58 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
      Feb 5 01:00:58 VOID kernel: ata7.00: cmd 61/08:18:b0:be:00/00:00:00:00:00/40 tag 3 ncq dma 4096 out
      Feb 5 01:00:58 VOID kernel: res 40/00:20:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
      Feb 5 01:00:58 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:00:58 VOID kernel: ata7.00: failed command: SEND FPDMA QUEUED
      Feb 5 01:00:58 VOID kernel: ata7.00: cmd 64/01:20:00:00:00/00:00:00:00:00/a0 tag 4 ncq dma 512 out
      Feb 5 01:00:58 VOID kernel: res 40/00:20:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
      Feb 5 01:00:58 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:00:58 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
      Feb 5 01:00:58 VOID kernel: ata7.00: cmd 61/08:28:e8:72:65/00:00:00:00:00/40 tag 5 ncq dma 4096 out
      Feb 5 01:00:58 VOID kernel: res 40/00:20:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
      Feb 5 01:00:58 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:00:58 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
      Feb 5 01:00:58 VOID kernel: ata7.00: cmd 61/08:30:28:a4:63/00:00:00:00:00/40 tag 6 ncq dma 4096 out
      Feb 5 01:00:58 VOID kernel: res 40/00:20:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
      Feb 5 01:00:58 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:00:58 VOID kernel: ata7.00: failed command: READ FPDMA QUEUED
      Feb 5 01:00:58 VOID kernel: ata7.00: cmd 60/08:38:00:20:cd/00:00:0c:00:00/40 tag 7 ncq dma 4096 in
      Feb 5 01:00:58 VOID kernel: res 40/00:20:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
      Feb 5 01:00:58 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:00:58 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
      Feb 5 01:00:58 VOID kernel: ata7.00: cmd 61/08:40:b8:ca:00/00:00:00:00:00/40 tag 8 ncq dma 4096 out
      Feb 5 01:00:58 VOID kernel: res 40/00:20:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
      Feb 5 01:00:58 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:00:58 VOID kernel: ata7.00: failed to IDENTIFY (I/O error, err_mask=0x100)
      Feb 5 01:00:58 VOID kernel: ata7.00: revalidation failed (errno=-5)
      Feb 5 01:00:58 VOID kernel: ata7: hard resetting link
      Feb 5 01:00:59 VOID kernel: ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
      Feb 5 01:00:59 VOID kernel: ata7.00: supports DRM functions and may not be fully accessible
      Feb 5 01:00:59 VOID kernel: ata7.00: supports DRM functions and may not be fully accessible
      Feb 5 01:00:59 VOID kernel: ata7.00: configured for UDMA/133
      Feb 5 01:00:59 VOID kernel: ata7.00: device reported invalid CHS sector 0
      Feb 5 01:00:59 VOID kernel: sd 8:0:0:0: [sdh] tag#4 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=57s
      Feb 5 01:00:59 VOID kernel: sd 8:0:0:0: [sdh] tag#4 Sense Key : 0x5 [current]
      Feb 5 01:00:59 VOID kernel: sd 8:0:0:0: [sdh] tag#4 ASC=0x21 ASCQ=0x4
      Feb 5 01:00:59 VOID kernel: sd 8:0:0:0: [sdh] tag#4 CDB: opcode=0x93 93 08 00 00 00 00 00 00 10 00 00 00 00 20 00 00
      Feb 5 01:00:59 VOID kernel: blk_update_request: I/O error, dev sdh, sector 4096 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 0
      Feb 5 01:00:59 VOID kernel: ata7: EH complete
      Feb 5 01:00:59 VOID kernel: ata7.00: Enabling discard_zeroes_data
      Feb 5 01:01:59 VOID kernel: ata7.00: exception Emask 0x0 SAct 0x10ff018 SErr 0x0 action 0x6 frozen
      Feb 5 01:01:59 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
      Feb 5 01:01:59 VOID kernel: ata7.00: cmd 61/10:18:d0:1f:cd/00:00:0c:00:00/40 tag 3 ncq dma 8192 out
      Feb 5 01:01:59 VOID kernel: res 40/00:20:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
      Feb 5 01:01:59 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:01:59 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
      Feb 5 01:01:59 VOID kernel: ata7.00: cmd 61/08:20:e0:1f:cd/00:00:0c:00:00/40 tag 4 ncq dma 4096 out
      Feb 5 01:01:59 VOID kernel: res 40/00:20:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
      Feb 5 01:01:59 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:01:59 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
      Feb 5 01:01:59 VOID kernel: ata7.00: cmd 61/20:60:c0:a4:c5/00:00:05:00:00/40 tag 12 ncq dma 16384 out
      Feb 5 01:01:59 VOID kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
      Feb 5 01:01:59 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:01:59 VOID kernel: ata7.00: failed command: SEND FPDMA QUEUED
      Feb 5 01:01:59 VOID kernel: ata7.00: cmd 64/01:68:00:00:00/00:00:00:00:00/a0 tag 13 ncq dma 512 out
      Feb 5 01:01:59 VOID kernel: res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
      Feb 5 01:01:59 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:01:59 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
      Feb 5 01:01:59 VOID kernel: ata7.00: cmd 61/20:70:40:a5:c5/00:00:05:00:00/40 tag 14 ncq dma 16384 out
      Feb 5 01:01:59 VOID kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
      Feb 5 01:01:59 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:01:59 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
      Feb 5 01:01:59 VOID kernel: ata7.00: cmd 61/20:78:a0:a5:c5/00:00:05:00:00/40 tag 15 ncq dma 16384 out
      Feb 5 01:01:59 VOID kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
      Feb 5 01:01:59 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:01:59 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
      Feb 5 01:01:59 VOID kernel: ata7.00: cmd 61/80:80:c0:35:2b/02:00:00:00:00/40 tag 16 ncq dma 327680 out
      Feb 5 01:01:59 VOID kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
      Feb 5 01:01:59 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:01:59 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
      Feb 5 01:01:59 VOID kernel: ata7.00: cmd 61/80:88:c0:35:33/02:00:00:00:00/40 tag 17 ncq dma 327680 out
      Feb 5 01:01:59 VOID kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
      Feb 5 01:01:59 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:01:59 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
      Feb 5 01:01:59 VOID kernel: ata7.00: cmd 61/08:90:58:75:65/00:00:00:00:00/40 tag 18 ncq dma 4096 out
      Feb 5 01:01:59 VOID kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
      Feb 5 01:01:59 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:01:59 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
      Feb 5 01:01:59 VOID kernel: ata7.00: cmd 61/08:98:90:77:65/00:00:00:00:00/40 tag 19 ncq dma 4096 out
      Feb 5 01:01:59 VOID kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
      Feb 5 01:01:59 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:01:59 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
      Feb 5 01:01:59 VOID kernel: ata7.00: cmd 61/20:c0:20:a4:c5/00:00:05:00:00/40 tag 24 ncq dma 16384 out
      Feb 5 01:01:59 VOID kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
      Feb 5 01:01:59 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:01:59 VOID kernel: ata7: hard resetting link
      Feb 5 01:02:00 VOID kernel: ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
      Feb 5 01:02:00 VOID kernel: ata7.00: supports DRM functions and may not be fully accessible
      Feb 5 01:02:00 VOID kernel: ata7.00: supports DRM functions and may not be fully accessible
      Feb 5 01:02:00 VOID kernel: ata7.00: configured for UDMA/133
      Feb 5 01:02:00 VOID kernel: ata7.00: device reported invalid CHS sector 0
      Feb 5 01:02:00 VOID kernel: ata7: EH complete
      Feb 5 01:02:00 VOID kernel: ata7.00: Enabling discard_zeroes_data
      Feb 5 01:02:30 VOID kernel: ata7.00: NCQ disabled due to excessive errors
      Feb 5 01:02:30 VOID kernel: ata7.00: exception Emask 0x0 SAct 0x60000007 SErr 0x0 action 0x6 frozen
      Feb 5 01:02:30 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
      Feb 5 01:02:30 VOID kernel: ata7.00: cmd 61/20:00:c0:a4:c5/00:00:05:00:00/40 tag 0 ncq dma 16384 out
      Feb 5 01:02:30 VOID kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
      Feb 5 01:02:30 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:02:30 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
      Feb 5 01:02:30 VOID kernel: ata7.00: cmd 61/08:08:e0:1f:cd/00:00:0c:00:00/40 tag 1 ncq dma 4096 out
      Feb 5 01:02:30 VOID kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
      Feb 5 01:02:30 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:02:30 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
      Feb 5 01:02:30 VOID kernel: ata7.00: cmd 61/10:10:d0:1f:cd/00:00:0c:00:00/40 tag 2 ncq dma 8192 out
      Feb 5 01:02:30 VOID kernel: res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
      Feb 5 01:02:30 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:02:30 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
      Feb 5 01:02:30 VOID kernel: ata7.00: cmd 61/20:e8:40:a5:c5/00:00:05:00:00/40 tag 29 ncq dma 16384 out
      Feb 5 01:02:30 VOID kernel: res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
      Feb 5 01:02:30 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:02:30 VOID kernel: ata7.00: failed command: SEND FPDMA QUEUED
      Feb 5 01:02:30 VOID kernel: ata7.00: cmd 64/01:f0:00:00:00/00:00:00:00:00/a0 tag 30 ncq dma 512 out
      Feb 5 01:02:30 VOID kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
      Feb 5 01:02:30 VOID kernel: ata7.00: status: { DRDY }
      Feb 5 01:02:30 VOID kernel: ata7: hard resetting link
      Feb 5 01:02:30 VOID kernel: ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
      Feb 5 01:02:30 VOID kernel: ata7.00: supports DRM functions and may not be fully accessible
      Feb 5 01:02:30 VOID kernel: ata7.00: supports DRM functions and may not be fully accessible
      Feb 5 01:02:30 VOID kernel: ata7.00: configured for UDMA/133
      Feb 5 01:02:30 VOID kernel: ata7: EH complete
      Feb 5 01:02:30 VOID kernel: ata7.00: Enabling discard_zeroes_data
      Feb 5 01:02:43 VOID kernel: BTRFS warning (device sdh1): failed to trim 1 block group(s), last error -5

      void-diagnostics-20210205-0458.zip

      Maybe a firmware update will help it get its sh*t together. Putting the timeout error into Google brings up lots of these: https://bbs.archlinux.org/viewtopic.php?id=168530 https://askubuntu.com/questions/1154493/why-is-my-ssd-periodically-hanging-for-20-to-30-seconds
  18. Oh, learned something new today then. Glad you got it sorted.
  19. According to your disk.cfg, your default FS type is XFS, and all of your disks are either explicitly set to use XFS or are set to auto (which I assume means they are XFS as well). You can see that loop2 is the BTRFS disk image under the advanced Docker settings. EDIT: Oh, I apologize, I missed your second question. You don't; UnRAID doesn't allow you to use XFS for the docker.img file. BTRFS is your only option as far as I'm aware.
  20. Yeah I've been looking through Google, I was just hoping someone here would have encountered this issue before and would be able to guide me to a solution quicker than I could find one just trying random stuff I find online. A lot of what I find on Google seems to be people expecting rsync to be able to saturate gigabit ethernet consistently, or somehow magically overcome the limitations of their USB 2.0 drive, HDD I/O, CPU, whatever bottleneck they may have: https://serverfault.com/questions/377598/why-is-my-rsync-so-slow https://unix.stackexchange.com/questions/303937/why-my-rsync-slow-down-over-time https://superuser.com/questions/424512/why-do-file-copy-operations-in-linux-get-slower-over-time The consensus online seems to be rsync and SSH are not and were never meant to be the most performant pieces of software out there, which I totally get. There are threads full of alternatives to try. However, in my googling I'm regularly seeing all these people who complain about the slowness of rsync easily achieving the (IMO) very low speeds I'm wanting. Their worst reported speeds are honestly what I'm aiming for here. Like your example from reddit, I'd be absolutely thrilled if I could maintain 7MB/s or even 5MB/s that they complain of in that thread. Everywhere I've seen stats reported, rsync with SSH should be completely capable of maintaining this paltry speed with the encryption & protocol overhead and my so-so hardware. I normally limit my rsync jobs to 5MB/s (--bwlimit=5000) as I have to share bandwidth with other servers where NODE is hosted and I'm positive that it can handle a consistent 5MB/s stream of data both sending and receiving. I'm not seeing high iowaits, CPU, RAM, or anything really when the transfers do slow down. That is what has made this so hard to diagnose, a complete and utter lack of clues. 
https://www.raspberrypi.org/forums/viewtopic.php?p=1404560&sid=ac2739c958d835a87f2afff7ad0df267#p1404560 This suggestion is interesting; I had not considered that it could be the network equipment in between. However, I'm able to utilize other TCP-heavy protocols like SCP, SFTP, and FTP at maximum speed when downloading files from the internet to these servers separately. I simply can't make them talk quickly to each other for extended periods of time. Finally, I want to say thank you for taking time out of your day to look at this with me. Trying to figure out a problem you've been working on for weeks and weeks sometimes requires new perspectives.
  21. Ok the WireGuard issue is fixed, now either server can restart the connection. It shouldn't have mattered but I'm going to start another transfer and see if the WireGuard fix made any difference. I'm really hoping to rally the vast community knowledge here (and reddit), I can't imagine I'm the only person using rsync and ssh in this manner for backups. EDIT: Nope, no difference with the WG issue fixed.
  22. I am, I for some reason thought it was written by one of the UnRAID admins; I didn't realize it was a user guest blog. Either way, I didn't mean to come off like I expected him to come help me; I just meant that in comparing my self-made setup to theirs, I did everything I could "by the book", so to speak. It varies: sometimes I can get through 10-20 files at 1-2GB a pop; other times the speed will tank halfway through the first file. I did test a cache-to-cache transfer in my flurry of work yesterday. It did not improve the situation from what I recall, but I did not document the results well, so it bears another test to be sure. I do not, unfortunately, and the primary server is actually hosted about 4 hours away from me, so it's not something I can just go and visit on a whim. When I first set up WireGuard I definitely tested this and both ends were able to start the connection, so I'm not sure what changed there. My router here with VOID is a pfSense router, and from what I can tell my port forwards are set up correctly. I'll look those threads over again; I'll admit my grasp of WireGuard when I configured it was tentative at best, and it's all looking Greek to me now. I think I see my problem though: on NODE I have the peer address set to the same thing as my local endpoint. The peer for VOID should be my dynamic DNS name for my home IP, right? EDIT: Started a cache (NODE) to cache (VOID) transfer and barely made it into the first file before the speed tanked. Meanwhile, I've been uploading files from VOID to NODE for the last hour at full speed: 10GB files at 2.8MB/s (the max for my crappy Comcast upload). This is the part that drives me nuts; I can't find a pattern to why some transfers utilize their full potential speed while others just languish in the land of dial-up.
  23. I didn't think to try this until now, when I run multiple transfers in parallel, one slowing down does NOT cause the other to slow down. Why is that? If this was a resource issue (whether CPU, RAM, bandwidth, etc) I would think parallel transfers would be affected equally...
  24. @ljm42 I'm noticing an odd behavior with my server-to-server WireGuard tunnel... When I'm signed into NODE, I can't ping VOID initially. Once I start a ping from VOID to NODE, however, replies start flowing both ways... This is not something I had noticed before. Should I enable keep-alive to prevent this? I'm going to snag a video of this behavior. EDIT: As you can see in the video, I start to ping VOID's WG IP from NODE and get no replies until I start a ping from VOID to NODE. Then, like magic, NODE suddenly realizes VOID is, in fact, available. I'm at the point where I'm willing to offer a cash reward if someone can just tell me WTF is wrong with rsync/SSH.
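The keep-alive being asked about is a one-line peer setting in WireGuard. A sketch (keys, addresses, and hostname are hypothetical placeholders):

```ini
# NODE's peer entry for VOID (values hypothetical). PersistentKeepalive
# makes this side send a packet every 25 seconds, so the path through
# NAT stays open without waiting for VOID to initiate traffic first.
[Peer]
PublicKey = <VOID-public-key>
Endpoint = void.example-ddns.net:51820
AllowedIPs = 10.253.0.1/32
PersistentKeepalive = 25
```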
  25. Yup, that's the thread I found. I linked to LJM42's reply specifically as that appears to be what my issue was. On a side note, I see the Tips and Tweaks plugin recommends the Performance scaling governor; does this offer much difference for day-to-day use over the On Demand option?