a_bomb

  1. It is certainly worth the experiment, I guess, but it's not a long-term solution to me. If that does end up fixing it, I will probably roll back to 6.9.1, set it all back up the way I had it, and see what happens.
  2. So if they were paired under the same controller, don't you think I would see the same type of issues with my Unraid OS flash drive?
  3. I did reseat the cable at both the UPS and the back of the server. The server motherboard only has 2 USB ports in the rear, and the other one has my Unraid flash drive. I guess I can just move that flash drive to a front USB port without issue, then try the other rear USB port?
  4. I upgraded to 6.9.2 recently and started getting a TON of alert notifications about my Tripp Lite UPS, which had no issues before. I did end up correcting my email notifications, which seems to have stopped the alerts from firing every second in the server notifications; now I just get them every 5 minutes or so. Diagnostics attached. The log is showing:
     Apr 16 19:36:24 Server kernel: usb 2-1-port5: disabled by hub (EMI?), re-enabling...
     Apr 16 19:36:24 Server kernel: usb 2-1.5: USB disconnect, device number 124
     Apr 16 19:36:24 Server kernel: usb 2-1.5: new low-speed USB device number 125 using ehci-pci
     Apr 16 19:36:24 Server kernel: hid-generic 0003:09AE:3015.00E8: hiddev96,hidraw3: USB HID v1.10 Device [Tripp Lite TRIPP LITE SMART1500RM2U ] on usb-0000:00:1d.0-1.5/input0
     Apr 16 19:36:30 Server apcupsd[17859]: Communications with UPS restored.
     skynet-diagnostics-20210416-1932.zip
  5. I took out my GTX 1050 and a few memory sticks and I have been up for over 2 days now.
  6. I didn't seem to find it there. I've taken out all but 4 sticks, just whittling it down until I don't see the errors anymore. Seems it crashed and rebooted again this morning (last log entries before that happened are below). I went ahead and took out the 1050 just a moment ago as well.
     Mar 13 03:50:37 Skynet kernel: EDAC sbridge MC1: HANDLING MCE MEMORY ERROR
     Mar 13 03:50:37 Skynet kernel: EDAC sbridge MC1: CPU 8: Machine Check Event: 0 Bank 7: cc0007c000010093
     Mar 13 03:50:37 Skynet kernel: EDAC sbridge MC1: TSC 93f4c92e09d4
     Mar 13 03:50:37 Skynet kernel: EDAC sbridge MC1: ADDR 5ae239040
     Mar 13 03:50:37 Skynet kernel: EDAC sbridge MC1: MISC 403e0486
     Mar 13 03:50:37 Skynet kernel: EDAC sbridge MC1: PROCESSOR 0:306e4 TIME 1615625437 SOCKET 1 APIC 20
     Mar 13 03:50:37 Skynet kernel: EDAC MC1: 31 CE memory read error on CPU_SrcID#1_Ha#0_Chan#3_DIMM#0 (channel:3 slot:0 page:0x5ae239 offset:0x40 grain:32 syndrome:0x0 - OVERFLOW area:DRAM err_code:0001:0093 socket:1 ha:0 channel_mask:8 rank:1)
     Mar 13 03:50:37 Skynet kernel: EDAC sbridge MC1: HANDLING MCE MEMORY ERROR
     Mar 13 03:50:37 Skynet kernel: EDAC sbridge MC1: CPU 8: Machine Check Event: 0 Bank 7: cc00078000010093
     Mar 13 03:50:37 Skynet kernel: EDAC sbridge MC1: TSC 93f4c92e889c
     Mar 13 03:50:37 Skynet kernel: EDAC sbridge MC1: ADDR 5b17cbd80
     Mar 13 03:50:37 Skynet kernel: EDAC sbridge MC1: MISC 52768086
     Mar 13 03:50:37 Skynet kernel: EDAC sbridge MC1: PROCESSOR 0:306e4 TIME 1615625437 SOCKET 1 APIC 20
     Mar 13 03:50:37 Skynet kernel: EDAC MC1: 30 CE memory read error on CPU_SrcID#1_Ha#0_Chan#3_DIMM#0 (channel:3 slot:0 page:0x5b17cb offset:0xd80 grain:32 syndrome:0x0 - OVERFLOW area:DRAM err_code:0001:0093 socket:1 ha:0 channel_mask:8 rank:1)
     Connection reset by 192.168.1.63 port 22
  7. I am in the "randomly restarts by itself" group with 6.9 and 6.9.1.
  8. Well, I got up to 11 hours after doing the above steps and then upgrading to 6.9.1. I was tailing the syslog and got this before it shut down. Going by the timestamps, it rebooted more than once. This is just what was there on the terminal I had open last night. There seem to be memory errors for sure, but it also looks like it is handling them? I'm not sure how I would go about pulling the exact sticks, short of assuming Channel 0 = Channel A etc. or pulling them one by one and checking the log. I'm thinking about pulling the 1050 as well.
     Mar 12 00:49:33 Skynet kernel: EDAC MC0: 1 CE memory read error on CPU_SrcID#0_Ha#0_Chan#3_DIMM#0 (channel:3 slot:0 page:0x6a751a offset:0x340 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0093 socket:0 ha:0 channel_mask:8 rank:1)
     Mar 12 01:26:41 Skynet kernel: mce: [Hardware Error]: Machine check events logged
     Mar 12 01:26:41 Skynet kernel: EDAC sbridge MC1: HANDLING MCE MEMORY ERROR
     Mar 12 01:26:41 Skynet kernel: EDAC sbridge MC1: CPU 8: Machine Check Event: 0 Bank 7: 8c00004000010093
     Mar 12 01:26:41 Skynet kernel: EDAC sbridge MC1: TSC 1a5c2d76dc3d
     Mar 12 01:26:41 Skynet kernel: EDAC sbridge MC1: ADDR 727108f40
     Mar 12 01:26:41 Skynet kernel: EDAC sbridge MC1: MISC 1424a5c86
     Mar 12 01:26:41 Skynet kernel: EDAC sbridge MC1: PROCESSOR 0:306e4 TIME 1615530401 SOCKET 1 APIC 20
     Mar 12 01:26:41 Skynet kernel: EDAC MC1: 1 CE memory read error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#1 (channel:1 slot:1 page:0x727108 offset:0xf40 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0093 socket:1 ha:0 channel_mask:2 rank:5)
     Mar 12 01:34:45 Skynet kernel: mce: [Hardware Error]: Machine check events logged
     Mar 12 01:34:45 Skynet kernel: EDAC sbridge MC1: HANDLING MCE MEMORY ERROR
     Mar 12 01:34:45 Skynet kernel: EDAC sbridge MC1: CPU 8: Machine Check Event: 0 Bank 7: 8c00004000010093
     Mar 12 01:34:45 Skynet kernel: EDAC sbridge MC1: TSC 1b80e90da98b
     Mar 12 01:34:45 Skynet kernel: EDAC sbridge MC1: ADDR d65da23c0
     Mar 12 01:34:45 Skynet kernel: EDAC sbridge MC1: MISC 425a4686
     Mar 12 01:34:45 Skynet kernel: EDAC sbridge MC1: PROCESSOR 0:306e4 TIME 1615530885 SOCKET 1 APIC 20
     Mar 12 01:34:45 Skynet kernel: EDAC MC1: 1 CE memory read error on CPU_SrcID#1_Ha#0_Chan#3_DIMM#0 (channel:3 slot:0 page:0xd65da2 offset:0x3c0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0093 socket:1 ha:0 channel_mask:8 rank:1)
     Mar 12 01:51:10 Skynet kernel: mce: [Hardware Error]: Machine check events logged
     Mar 12 01:51:10 Skynet kernel: EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
     Mar 12 01:51:10 Skynet kernel: EDAC sbridge MC0: CPU 0: Machine Check Event: 0 Bank 7: 8c00004000010093
     Mar 12 01:51:10 Skynet kernel: EDAC sbridge MC0: TSC 1dd51b124c58
     Mar 12 01:51:10 Skynet kernel: EDAC sbridge MC0: ADDR 68663a340
     Mar 12 01:51:10 Skynet kernel: EDAC sbridge MC0: MISC 1526a5886
     Mar 12 01:51:10 Skynet kernel: EDAC sbridge MC0: PROCESSOR 0:306e4 TIME 1615531870 SOCKET 0 APIC 0
     Mar 12 01:51:10 Skynet kernel: EDAC MC0: 1 CE memory read error on CPU_SrcID#0_Ha#0_Chan#3_DIMM#0 (channel:3 slot:0 page:0x68663a offset:0x340 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0093 socket:0 ha:0 channel_mask:8 rank:1)
     Mar 12 01:59:03 Skynet kernel: mce: [Hardware Error]: Machine check events logged
     Mar 12 01:59:03 Skynet kernel: EDAC sbridge MC1: HANDLING MCE MEMORY ERROR
     Mar 12 01:59:03 Skynet kernel: EDAC sbridge MC1: CPU 8: Machine Check Event: 0 Bank 7: 8c00004000010093
     Mar 12 01:59:03 Skynet kernel: EDAC sbridge MC1: TSC 1ef3720de11e
     Mar 12 01:59:03 Skynet kernel: EDAC sbridge MC1: ADDR 7289aeb00
     Mar 12 01:59:03 Skynet kernel: EDAC sbridge MC1: MISC 4214e486
     Mar 12 01:59:03 Skynet kernel: EDAC sbridge MC1: PROCESSOR 0:306e4 TIME 1615532343 SOCKET 1 APIC 20
     Mar 12 01:59:03 Skynet kernel: EDAC MC1: 1 CE memory read error on CPU_SrcID#1_Ha#0_Chan#0_DIMM#1 (channel:0 slot:1 page:0x7289ae offset:0xb00 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0093 socket:1 ha:0 channel_mask:1 rank:5)
     Mar 12 02:00:06 Skynet root: /etc/libvirt: 920.4 MiB (965103616 bytes) trimmed on /dev/loop3
     Mar 12 02:00:06 Skynet root: /var/lib/docker: 15.5 GiB (16609398784 bytes) trimmed on /dev/loop2
     Mar 12 02:00:06 Skynet root: /mnt/cache: 191.9 GiB (206013878272 bytes) trimmed on /dev/sdj1
     Connection reset by 192.168.1.63 port 22
  9. Thanks. I will go ahead and do that. Then if it crashes again I guess I will just roll back to 6.8.3 again.
  10. root@Skynet:~# ls -lah /mnt/disk23/system
     total 0
     drwxrwxrwx 3 nobody users 20 Mar 3 21:15 ./
     drwxrwxrwx 8 nobody users 155 Mar 7 23:50 ../
     drwxrwxrwx 2 root root 24 Mar 3 21:15 docker/
     So I should go ahead and move the docker/ folder back to the cache drive? Or just delete it? It seems there is an appdata folder on disk 23. I could just remove that, since all of that data should be on the cache drive.
  11. I am having the same issue. I have a 1050 in my server, though it seems to be fine in 6.8.3.
  12. I have also been having a lot of issues with the 6.9 upgrade. Cache drive unmountable: I use a Samsung EVO 250GB as a cache drive, and I ended up wiping it before seeing the solutions about stopping the array and unmounting/mounting it (lesson learned there). I seem to have gotten past that and re-created almost everything I needed. I also had a lot of issues with it rebooting on its own, usually after about 30-40 minutes, and it seemed like it would just keep doing it until I reverted to 6.8.3. I had 6.9 stable for over 7 hours today, then started to bring back my VMs (had to recreate the domains share and libvirt folder), and the reboots started up again, which also triggered a parity check. It just did it again while I was typing this: only up for 9 minutes before a reboot this time. skynet-diagnostics-20210308-0015.zip
  13. I am trying a similar thing with VMware Workstation Windows 10 VM > Kali Linux VM and I get I already did these steps after stopping VM's and disabling VM Manager:
  14. Just a quick update. I created a Win10 1903 VM on Unraid, mapped the same drive, and was able to scan folders without issue. Things I don't want to do now: 1.) A fresh Windows 10 install on my gaming PC 2.) Scan my network folders via a Windows 10 VM (Sighs heavily)
  15. No kidding? This is super weird then. It was working the other month, then stopped. Then it worked again a couple of weeks ago, and now it has stopped again.
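
For the EDAC lines in posts 6 and 8 above, one way to see which DIMM label is producing most of the corrected errors, rather than pulling sticks one by one, is to tally the "CE memory read error" summary lines. This is only a rough sketch: the function name and the regular expression are my own assumptions based on the log format shown in those posts, and the `sample` text is three lines copied from them. It does not map a label such as Chan#3_DIMM#0 to a physical slot; that still depends on the motherboard manual.

```python
import re
from collections import Counter

# Matches the EDAC summary lines seen in the posts above, for example:
#   "EDAC MC1: 31 CE memory read error on CPU_SrcID#1_Ha#0_Chan#3_DIMM#0 (channel:3 ..."
# The "EDAC sbridge MC1: ..." detail lines are deliberately not matched.
CE_PATTERN = re.compile(r"EDAC MC(\d+): (\d+) CE memory read error on (\S+)")


def tally_corrected_errors(log_text):
    """Return a Counter mapping each DIMM label to its total corrected-error count."""
    totals = Counter()
    for match in CE_PATTERN.finditer(log_text):
        _controller, count, label = match.groups()
        totals[label] += int(count)
    return totals


# Three summary lines taken from the syslog excerpts in posts 6 and 8.
sample = (
    "Mar 13 03:50:37 Skynet kernel: EDAC MC1: 31 CE memory read error on "
    "CPU_SrcID#1_Ha#0_Chan#3_DIMM#0 (channel:3 slot:0 page:0x5ae239)\n"
    "Mar 13 03:50:37 Skynet kernel: EDAC MC1: 30 CE memory read error on "
    "CPU_SrcID#1_Ha#0_Chan#3_DIMM#0 (channel:3 slot:0 page:0x5b17cb)\n"
    "Mar 12 01:59:03 Skynet kernel: EDAC MC1: 1 CE memory read error on "
    "CPU_SrcID#1_Ha#0_Chan#0_DIMM#1 (channel:0 slot:1 page:0x7289ae)\n"
)

totals = tally_corrected_errors(sample)
for label, count in totals.most_common():
    print(f"{count:5d}  {label}")
```

To run it against a real log, read the saved syslog file into a string and pass that to `tally_corrected_errors` instead of `sample`. A label that dominates the tally is a reasonable first stick to pull or swap.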