Everything posted by jsmj

  1. I've reproduced the issue when booting from each individual dimm by itself but I haven't run memtest with a single dimm installed. Is that worth doing?
  2. As far as I can tell, memory speeds are unchanged by me and running at 2133. The sticks are rated for 3200. Here's the product page for the memory I'm using. I've attached the screenshots of my BIOS settings: bios_screens.zip
  3. Hey guys, sorry I went dark. I was out of town. I started memtest to run overnight and forgot to stop it, and it ran for 161 hours before I manually shut it down. It accumulated 1 error, well after the 24-hour mark.
     I tried using a different PCIe GPU and it ran for 3 hours or so before I went to bed with high hopes, but it crashed sometime in the night. When I went to reboot it, it wouldn't even make it to the boot selector screen before lockup.
     This behavior of needing a "rest" after being booted for longer (hour+) periods is super strange to me. The length of time the machine will stay booted seems to be related to the amount of time it's been powered down. I went out of town and got 3 whole hours of uptime! But if I boot it directly after a crash, it can't make it long enough to fully boot. That seems to be either a temp issue which I can't chase down, or Unraid needing to "forget" something that it did or accumulated in order to cause the crash.
     I'd like to make an Unraid trial USB and boot from that, but I wanted to check and make sure that's safe for my data. I wouldn't start the array.
  4. It's pretty dust free in there. I wouldn't eat out of it, but I've seen far worse. Ventilation shouldn't be a problem. I'll run memtest for 24 hours and see how it goes. Attached are the syslog from the USB after the latest crash and the full diagnostics zip.
     Edit to add: There are 13 files on my flash drive named FSCK000*.REC that I don't remember having before. Is this of any significance? I can add my flash backup .zip file as well if that'll help.
     syslog
     tower-diagnostics-20190910-0012.zip
  5. Checked the paste today, confirmed the pins aren't bent or anything, and redid the paste. No dice. Stayed booted for about 45 minutes, just enough to crush my hopes.
  6. The only setting I found in my BIOS is global C-state, and disabling that didn't solve it. The RAM settings are unchanged from board default, which is 2133. I tried changing it to 3200 for kicks, and the system won't boot at all that way. I haven't messed with timings or voltage or anything.
     This morning I tried a different PSU, which didn't solve anything either. I also ditched the two PCIe 4-port SATA cards for an IBM M1015 HBA in IT mode, which seems to be working properly when the server is running, but I still get the blackouts.
     I've also noticed that after the machine crashes, subsequent boots only last a couple minutes, if it boots at all. If I give it some down time, it'll stay booted for longer, but eventually crashes. I thought this might mean a thermal problem, but I can't find any offending high temps, 40C at the most.
  7. I upgraded my mobo and CPU and am having trouble keeping the server booted. It starts up normally, and I can access the GUI over the network. I haven't tried starting the array yet, as I'm worried about the unclean shutdowns. Basically what happens is the server operates normally (as normal as it can without the array mounted) and then falls off the network, and the monitor attached to the machine goes black. Keyboard input does nothing. All I can do is shut it down.
     I've tried booting from a Ubuntu drive, and I can boot into Ubuntu and noodle around, and it stays booted for a few hours (as long as I've tested) with no crash to black screen. I can also run memtest86 from the Unraid flash drive and it runs for hours as well. No crash, but I might have it run again overnight for good measure. Unraid is the only thing that makes it crash so far, and 30 minutes is as long as it'll stay booted. Sometimes it crashes to black sooner, or doesn't even get through the boot-up process before it crashes to black screen.
     Specs:
       • ASUS ROG Strix B450-F
       • Ryzen 5 2600
       • 32GB G.Skill 3200 DDR4
       • Corsair CX600M
       • MSI GTX 970
       • Two PCIe 4-port SATA adapters (one is full, the other has one drive attached)
       • 11 disks (6 on the mobo, 5 on the PCIe SATA cards)
     Another thing I've tried is booting UEFI from the flash drive, and this doesn't crash to black; instead it will just reboot after a few minutes. Only thing I have yet to try is a new flash drive, and I will as soon as I can keep it booted long enough to download a backup. Can I just copy the contents of the current flash drive onto a new one, or is there something special about the backup?
     I also watched temps in the BIOS for a good while and never saw anything above 40C. BIOS is updated to the current version. Logs are attached. When I looked through them I ctrl+F'd 'fail' and found a lot, but don't understand them.
     Edit: Also tried booting with one stick of RAM; both sticks (16 GB/ea) show the same issue.
     syslog.txt
  8. Edit: Nevermind, the problem persists. I'm still at 1000 down / 100 up. I get 1G/1G only if the array is offline. If I start the array, the link goes back to 1G/100M 👎
     So the bad news: that command took the server off the network completely. I hooked up a monitor to try to do a graceful reboot from the command line, but couldn't get an image, so I eventually had to just hard reset it. The good news is my link is 1G/1G in all directions, every which way. No idea why or what or how. The following iperf tests were done with the array offline and all result in 1G/1G.
     iperf3 in both directions for problem server <=> Shield:

     root@Tower:~# iperf3 -c 192.168.1.208 -i 20
     Connecting to host 192.168.1.208, port 5201
     [  4] local 192.168.1.101 port 44286 connected to 192.168.1.208 port 5201
     [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
     [  4]   0.00-10.00  sec  1.06 GBytes   910 Mbits/sec    0   5.66 KBytes
     - - - - - - - - - - - - - - - - - - - - - - - - -
     [ ID] Interval           Transfer     Bandwidth       Retr
     [  4]   0.00-10.00  sec  1.06 GBytes   910 Mbits/sec    0   sender
     [  4]   0.00-10.00  sec  1.06 GBytes   908 Mbits/sec        receiver
     iperf Done.

     root@Tower:~# iperf3 -c 192.168.1.208 -i 20 -R
     Connecting to host 192.168.1.208, port 5201
     Reverse mode, remote host 192.168.1.208 is sending
     [  4] local 192.168.1.101 port 44290 connected to 192.168.1.208 port 5201
     [ ID] Interval           Transfer     Bandwidth
     [  4]   0.00-10.00  sec  1.08 GBytes   932 Mbits/sec
     - - - - - - - - - - - - - - - - - - - - - - - - -
     [ ID] Interval           Transfer     Bandwidth       Retr
     [  4]   0.00-10.00  sec  1.09 GBytes   934 Mbits/sec    0   sender
     [  4]   0.00-10.00  sec  1.09 GBytes   932 Mbits/sec        receiver
     iperf Done.

     iperf3 in both directions for the other Unraid server (server B), A <=> B:

     root@Tower:~# iperf3 -c 192.168.1.86 -i 20
     Connecting to host 192.168.1.86, port 5201
     [  4] local 192.168.1.101 port 42310 connected to 192.168.1.86 port 5201
     [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
     [  4]   0.00-10.00  sec  1.09 GBytes   935 Mbits/sec    0   5.66 KBytes
     - - - - - - - - - - - - - - - - - - - - - - - - -
     [ ID] Interval           Transfer     Bandwidth       Retr
     [  4]   0.00-10.00  sec  1.09 GBytes   935 Mbits/sec    0   sender
     [  4]   0.00-10.00  sec  1.09 GBytes   934 Mbits/sec        receiver
     iperf Done.

     root@Tower:~# iperf3 -c 192.168.1.86 -i 20 -R
     Connecting to host 192.168.1.86, port 5201
     Reverse mode, remote host 192.168.1.86 is sending
     [  4] local 192.168.1.101 port 42314 connected to 192.168.1.86 port 5201
     [ ID] Interval           Transfer     Bandwidth
     [  4]   0.00-10.00  sec  1.10 GBytes   941 Mbits/sec
     - - - - - - - - - - - - - - - - - - - - - - - - -
     [ ID] Interval           Transfer     Bandwidth       Retr
     [  4]   0.00-10.00  sec  1.10 GBytes   943 Mbits/sec    0   sender
     [  4]   0.00-10.00  sec  1.10 GBytes   941 Mbits/sec        receiver
     iperf Done.
  9. Here's mine for the problem server. Looks slightly different from yours, particularly the bits about "Supported pause frame use: Symmetric Receive-only" and the port being MII while yours is "Twisted Pair". PHYAD is different as well.

     root@Tower:~# ethtool eth0
     Settings for eth0:
             Supported ports: [ TP MII ]
             Supported link modes:   10baseT/Half 10baseT/Full
                                     100baseT/Half 100baseT/Full
                                     1000baseT/Half 1000baseT/Full
             Supported pause frame use: Symmetric Receive-only
             Supports auto-negotiation: Yes
             Supported FEC modes: Not reported
             Advertised link modes:  10baseT/Half 10baseT/Full
                                     100baseT/Half 100baseT/Full
                                     1000baseT/Half 1000baseT/Full
             Advertised pause frame use: Symmetric Receive-only
             Advertised auto-negotiation: Yes
             Advertised FEC modes: Not reported
             Link partner advertised link modes:  10baseT/Half 10baseT/Full
                                                  100baseT/Half 100baseT/Full
                                                  1000baseT/Full
             Link partner advertised pause frame use: Symmetric Receive-only
             Link partner advertised auto-negotiation: Yes
             Link partner advertised FEC modes: Not reported
             Speed: 1000Mb/s
             Duplex: Full
             Port: MII
             PHYAD: 0
             Transceiver: internal
             Auto-negotiation: on
             Supports Wake-on: pumbg
             Wake-on: g
             Current message level: 0x00000033 (51)
                                    drv probe ifdown ifup
             Link detected: yes
     root@Tower:~#

     And here's the output from the other server that has a working 1G/1G link:

     Settings for eth0:
             Supported ports: [ TP ]
             Supported link modes:   10baseT/Half 10baseT/Full
                                     100baseT/Half 100baseT/Full
                                     1000baseT/Full
             Supported pause frame use: No
             Supports auto-negotiation: Yes
             Supported FEC modes: Not reported
             Advertised link modes:  10baseT/Half 10baseT/Full
                                     100baseT/Half 100baseT/Full
                                     1000baseT/Full
             Advertised pause frame use: No
             Advertised auto-negotiation: Yes
             Advertised FEC modes: Not reported
             Speed: 1000Mb/s
             Duplex: Full
             Port: Twisted Pair
             PHYAD: 1
             Transceiver: internal
             Auto-negotiation: on
             MDI-X: off
             Supports Wake-on: g
             Wake-on: d
             Link detected: yes
  10. I'm testing using two clients: another Unraid server (call it server B), and an Nvidia Shield. The Shield has an app that allows it to host an iperf3 connection. I get 1 Gbps in both directions when testing between those two machines, so they are both negotiating a 1G link. It's only when I test either of those clients against the problem Unraid server (call it server A) that the speed drops:

     A => B       = <100 Mbps
     A => Shield  = <100 Mbps
     Shield => A  = 1000 Mbps
     B => A       = 1000 Mbps
     B <=> Shield = 1000 Mbps

     Here are the iperfs between B and the Shield in both directions, but to summarize, they are both able to negotiate a 1 Gbps link both ways:

     Connecting to host 192.168.1.208, port 5201
     [  4] local 192.168.1.86 port 53388 connected to 192.168.1.208 port 5201
     [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
     [  4]   0.00-10.00  sec  1.04 GBytes   889 Mbits/sec    0   5.66 KBytes
     - - - - - - - - - - - - - - - - - - - - - - - - -
     [ ID] Interval           Transfer     Bandwidth       Retr
     [  4]   0.00-10.00  sec  1.04 GBytes   889 Mbits/sec    0   sender
     [  4]   0.00-10.00  sec  1.03 GBytes   887 Mbits/sec        receiver
     iperf Done.

     root@TwoTower:~# iperf3 -c 192.168.1.208 -i 20 -R
     Connecting to host 192.168.1.208, port 5201
     Reverse mode, remote host 192.168.1.208 is sending
     [  4] local 192.168.1.86 port 53394 connected to 192.168.1.208 port 5201
     [ ID] Interval           Transfer     Bandwidth
     [  4]   0.00-10.00  sec  1.01 GBytes   867 Mbits/sec
     - - - - - - - - - - - - - - - - - - - - - - - - -
     [ ID] Interval           Transfer     Bandwidth       Retr
     [  4]   0.00-10.00  sec  1.01 GBytes   869 Mbits/sec    0   sender
     [  4]   0.00-10.00  sec  1.01 GBytes   867 Mbits/sec        receiver
     iperf Done.
  11. Best I can tell, MTU is set to 1500 on the two Unraid machines I am testing between. The other device I'm using is an Nvidia Shield, and I don't believe there's a way to set MTU on it without rooting it.
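     A quick way to confirm the MTU on a Linux/Unraid box, as a sketch (it reads the standard sysfs path, so no extra tools are assumed; interface names vary between machines):

     ```shell
     # List the MTU of every network interface the kernel knows about.
     for iface in /sys/class/net/*; do
         printf '%s: %s\n' "$(basename "$iface")" "$(cat "$iface/mtu")"
     done
     ```

     A physical NIC at the default frame size should report 1500; the loopback interface will report a much larger value, which is normal.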
  12. That was the first thing I tried. Was really hoping it'd be something simple like that.
  13. Hey guys, I'm troubleshooting an asymmetrical network link. I get gigabit speeds when going from client -> Unraid server, but 100 Mbits/sec (at best) when testing from Unraid -> client. I've changed cables, tested cables, changed ports on the switch and router, and I'm out of ideas. The lights on the switch show a full-duplex 1000 link, as does Unraid on the dashboard.
     Here are a couple iperf3 results with the server acting as client (sending):

     root@Tower:~# iperf3 -c 192.168.1.86 -i 20
     Connecting to host 192.168.1.86, port 5201
     [  4] local 192.168.1.101 port 40006 connected to 192.168.1.86 port 5201
     [ ID] Interval           Transfer     Bandwidth       Retr   Cwnd
     [  4]   0.00-10.00  sec  36.6 MBytes  30.7 Mbits/sec  24588  79.2 KBytes
     - - - - - - - - - - - - - - - - - - - - - - - - -
     [ ID] Interval           Transfer     Bandwidth       Retr
     [  4]   0.00-10.00  sec  36.6 MBytes  30.7 Mbits/sec  24588  sender
     [  4]   0.00-10.00  sec  34.5 MBytes  29.0 Mbits/sec         receiver
     iperf Done.

     root@Tower:~# iperf3 -c 192.168.1.208 -i 20
     Connecting to host 192.168.1.208, port 5201
     [  4] local 192.168.1.101 port 47854 connected to 192.168.1.208 port 5201
     [ ID] Interval           Transfer     Bandwidth       Retr   Cwnd
     [  4]   0.00-10.00  sec  94.8 MBytes  79.5 Mbits/sec  65838  67.9 KBytes
     - - - - - - - - - - - - - - - - - - - - - - - - -
     [ ID] Interval           Transfer     Bandwidth       Retr
     [  4]   0.00-10.00  sec  94.8 MBytes  79.5 Mbits/sec  65838  sender
     [  4]   0.00-10.00  sec  93.6 MBytes  78.5 Mbits/sec         receiver
     iperf Done.

     And here are a couple more with the -R flag to show I get 1 Gbps in the other direction (receiving):

     root@Tower:~# iperf3 -c 192.168.1.86 -i 20 -R
     Connecting to host 192.168.1.86, port 5201
     Reverse mode, remote host 192.168.1.86 is sending
     [  4] local 192.168.1.101 port 40156 connected to 192.168.1.86 port 5201
     [ ID] Interval           Transfer     Bandwidth
     [  4]   0.00-10.00  sec  1.07 GBytes   919 Mbits/sec
     - - - - - - - - - - - - - - - - - - - - - - - - -
     [ ID] Interval           Transfer     Bandwidth       Retr
     [  4]   0.00-10.00  sec  1.07 GBytes   920 Mbits/sec    0   sender
     [  4]   0.00-10.00  sec  1.07 GBytes   919 Mbits/sec        receiver
     iperf Done.

     root@Tower:~# iperf3 -c 192.168.1.208 -i 20 -R
     Connecting to host 192.168.1.208, port 5201
     Reverse mode, remote host 192.168.1.208 is sending
     [  4] local 192.168.1.101 port 48004 connected to 192.168.1.208 port 5201
     [ ID] Interval           Transfer     Bandwidth
     [  4]   0.00-10.00  sec  1.07 GBytes   923 Mbits/sec
     - - - - - - - - - - - - - - - - - - - - - - - - -
     [ ID] Interval           Transfer     Bandwidth       Retr
     [  4]   0.00-10.00  sec  1.08 GBytes   925 Mbits/sec    0   sender
     [  4]   0.00-10.00  sec  1.07 GBytes   923 Mbits/sec        receiver
     iperf Done.

     My next thought is replacing the onboard Realtek NIC (8111C), but I'm kind of running out of expansion slots, so I want to be sure that's the issue before I tackle that. Any ideas? Logs are attached.
     tower-diagnostics-20190821-1526.zip
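     Those Retr columns are worth a second look. As rough arithmetic (the ~1448-byte TCP segment size is an assumption for a 1500 MTU, not something from the logs), the 30.7 Mbits/sec run sent roughly 26k segments and iperf3 reported 24588 retransmissions, so nearly every segment was resent. That pattern points at a faulty transmit path (NIC, driver, or cabling on TX) rather than a link that simply negotiated at 100 Mbps:

     ```shell
     # Back-of-the-envelope retransmit check for the 30.7 Mbits/sec run above.
     sent_bytes=$((36 * 1024 * 1024))   # ~36.6 MBytes transferred (rounded down)
     seg_size=1448                      # typical TCP payload with a 1500 MTU (assumed)
     segments=$((sent_bytes / seg_size))
     retr=24588                         # Retr count reported by iperf3
     echo "segments sent: ~$segments, retransmitted: $retr"
     ```

     A clean gigabit run, like the -R results, shows Retr of 0 over a full gigabyte, which is why the direction with tens of thousands of retransmits stands out.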
  14. Alright, preclear finished overnight and I formatted and added the drive to the pool this morning.
     • Rebooted and the rootfs warning was gone from FCP
     • Successfully updated to 6.6.7, rebooted again (new UI looks legit)
     • FCP is reporting no problems; Plex seems to be working and behaving normally
     Best I can tell I'm in the clear. Thanks guys
  15. Alright, I moved the /config mapping to /mnt/user/appdata/plex, and when this preclear finishes I'll reboot and see where that gets me. I had to set up the Plex libraries again, but I take it they'll stick now that they're on a physical drive.
  16. I don't. I used to, but I was under the impression that it's no longer needed and that the server will take care of it by default when the array is spun up with a new drive. It was set at 10G by default and I was nearing that limit. I wasn't sure of a sane value to expand it to; 50G sounded good and I had the storage. This was quite a while back.
     The only /mnt/cache mapping I can see is Plex's /config mapped to /mnt/cache/appdata/Plex. But it's been like this for months. Could it slowly grow to fill rootfs and only now be cropping up? What should it be mapped to if not /mnt/cache/appdata?
  17. Hey friends. I can't figure this one out. My Fix Common Problems plugin is griping that rootfs is getting full. If I understand it right, something is writing to /tmp or memory; best I can tell my dockers all write to /mnt/user or /mnt/disk# or /mnt/cache, but I could have overlooked something.
     Possibly related: I can't download the 6.7 update because I get an I/O failure. From googling, this is because the GUI saves the update to RAM, and mine is full from the rootfs issue.
     I've attached my diagnostics from Fix Common Problems. Let me know if I can provide anything else. I'm currently pre-clearing a drive, if that matters. Just trying to include everything I can think of that's going on.
     tower-diagnostics-20190321-1617.zip
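     For anyone chasing a similar rootfs warning, a quick way to see what is actually eating the in-RAM root, sketched assuming a GNU-style `du` (the `-x` flag keeps `du` on the root filesystem itself, so the array and cache mounts under /mnt don't drown out the numbers):

     ```shell
     # Show the ten largest top-level directories on the root filesystem only.
     du -xh -d1 / 2>/dev/null | sort -h | tail -n 10
     ```

     Whichever directory dominates that list (commonly /tmp, or a container path that missed its volume mapping) is the place to dig next.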