ChronoStriker1

Members
  • Posts: 148

Everything posted by ChronoStriker1

  1. It was the CPU, and Intel has RMA'ed it. Unfortunately, the USB drive I was using seems to have failed, so I now have a licensing support ticket open with Unraid. Hopefully that can be taken care of relatively quickly.
  2. Nope, still unstable with the same kind of errors. I guess contacting Intel is next.
  3. OK, I've replaced the board. So far it "feels" more stable, but I'm still seeing segfaults (it has not hard crashed on me yet). Would anyone mind taking a look at the diagnostics and a quick grab of my current syslog?
  4. The motherboard seems like it's going to be the hardest thing to RMA, since I can't find how to do an Advanced RMA through ASUS, and as much as I'm crashing, I don't want to be without the server for at least 10 days. On a related note, since I don't think it's the RAM, I decided to just buy the RAM that was sent as a replacement, so I have 128GB now, which is nice.
  5. Atomic burn worked without issue. Max temp was 71C, and there were no crashes while it ran.
  6. CPU temps have been fine this entire time. I'll install the plugin and run it after the latest parity check is complete. I will also contact Intel and ASUS to see if I can RMA the processor and motherboard, since I'm outside my 30-day window with Amazon.
  7. Well, that will be fun to RMA then. Another question that is hopefully easy: even after reboots and swapping the RAM, it's always the same things that segfault. Wouldn't random things segfault if it thinks there is an issue or it runs out of memory? Looking at my syslog from yesterday:
     May 9 21:37:29 Tower kernel: unraid-api[16221]: segfault at ffffffffffff3b28 ip 0000000001518f00 sp 00007ffe4d6f11a8 error 5 in unraid-api[91c000+167b000] likely on CPU 0 (core 0, socket 0)
     May 9 21:45:25 Tower kernel: python3[11316]: segfault at 7 ip 00001504488506f3 sp 00007ffdc92cd5f0 error 4 in libpython3.10.so.1.0[15044873b000+1be000] likely on CPU 14 (core 28, socket 0)
     May 9 22:25:13 Tower kernel: Thunar[22347]: segfault at 600000003a ip 00001512f15d1f1c sp 00007ffc678c45b0 error 4 in libglib-2.0.so.0.6600.8[1512f157e000+88000] likely on CPU 12 (core 24, socket 0)
     May 9 22:25:18 Tower kernel: thunar[1956]: segfault at 600000003a ip 0000153cad537f1c sp 00007ffc9fc1f040 error 4 in libglib-2.0.so.0.6600.8[153cad4e4000+88000] likely on CPU 2 (core 4, socket 0)
     May 9 22:45:28 Tower kernel: python[24692]: segfault at 1 ip 00001507b28ac411 sp 00007fff2a585a50 error 6 in libpython3.11.so.1.0[1507b2799000+1bb000] likely on CPU 0 (core 0, socket 0)
     May 9 23:56:34 Tower kernel: unraid-api[8794]: segfault at ffffffffffff3b28 ip 0000000001518f00 sp 00007fff36d19508 error 5 in unraid-api[91c000+167b000] likely on CPU 0 (core 0, socket 0)
     May 10 01:09:03 Tower kernel: unraid-api[15814]: segfault at ffffffffffff3b28 ip 0000000001518f00 sp 00007fffca0cda58 error 5 in unraid-api[91c000+167b000] likely on CPU 0 (core 0, socket 0)
     May 10 02:37:20 Tower kernel: python[7637]: segfault at 8 ip 000014f6acd47ac9 sp 000014f6a84bba90 error 4 in libpython3.9.so.1.0[14f6acc13000+1b8000] likely on CPU 0 (core 0, socket 0)
     May 10 03:14:47 Tower kernel: python3[27555]: segfault at 0 ip 000014dc983af61b sp 000014dc95621998 error 6 in libpython3.8.so.1.0[14dc98273000+183000] likely on CPU 12 (core 24, socket 0)
     May 10 03:46:44 Tower kernel: python[4413]: segfault at 6 ip 0000151abbf715e6 sp 00007ffc2b350e40 error 6 in libpython3.11.so.1.0[151abbe5f000+1bb000] likely on CPU 0 (core 0, socket 0)
     May 10 05:24:00 Tower kernel: php7[4270]: segfault at 40 ip 00005585e53dd3a0 sp 00007ffc2b656380 error 4 in php7[5585e5200000+240000] likely on CPU 0 (core 0, socket 0)
     May 10 05:29:39 Tower kernel: unraid-api[30617]: segfault at ffffffffffff3b28 ip 0000000001518f00 sp 00007ffe8bc8fca8 error 5 in unraid-api[91c000+167b000] likely on CPU 0 (core 0, socket 0)
     May 10 06:32:13 Tower kernel: unraid-api[21391]: segfault at ffffffffffff3b28 ip 0000000001518f00 sp 00007ffe80db5a58 error 5 in unraid-api[91c000+167b000] likely on CPU 0 (core 0, socket 0)
     I know for a fact that unraid-api, python3, and thunar are always the programs (or the libraries associated with them) that seem to segfault. Is it possible that some of the files have been damaged due to the crashes and that's why they are faulting?
  8. Welp, I tried one stick twice and had the same issue. I was able to get another set and am having the same errors, so I think at this point I can say it's not the RAM. So where would the next place to check be?
  9. And it looks like it crashed again. I can try running one stick at a time later today to see if there are any changes, but is there anything else I can test other than just the memory?
  10. I have disabled XMP. There has been at least one segfault that I've noticed so far, but it hasn't crashed yet. I will continue to keep an eye on it.
  11. After another crash yesterday I ran a full 4-pass memtest86 run and my memory passed. While it's still possible the issue is the memory, I need to be more specific so I can RMA parts. Is there any better way to track down what's going on?
  12. The system stopped letting me do some actions again. One CPU looked like it was pegged at 100% by "/usr/src/app/vendor/bundle/ruby/3.1.0/bin/rake jobs:work". I tried killing it, but it wouldn't die. I tried stopping things in order to reboot, but the web interface became unresponsive. I attempted to reboot from the command line, but the last message I saw was "Tower init: Trying to re-exec init". It did eventually reboot, but it had an unclean shutdown.
  13. I ran it last night for 1 pass and the RAM passed. I can try letting it run overnight sometime this weekend.
  14. After deleting the file and doing another scrub, that error went away. I also manually moved things again, and this time around it looks like things moved (it looked like it stopped partway through previously). Currently I am not getting that "shfs: cache disk full" message. Still waiting for it to crash again.
  15. I ran a scrub on my cache and received two errors. One is in a file that I can redownload, so no issue there. The other was:
      zfs permanent error cache:<0x24fc8a>
      I do not know how to deal with that one.
  16. I will after it crashes again. I do have a second question, since I've been trying to monitor the syslog myself. If I invoke the mover manually, I see this come up:
      May 3 09:09:50 Tower move: mover: started
      May 3 09:09:50 Tower shfs: /usr/sbin/zfs destroy -r cache/Downloads
      May 3 09:09:50 Tower shfs: error: retval 1 attempting 'zfs destroy'
      Is that expected? I decided to go with ZFS to learn more about it, but I don't know why it would try to do a destroy.
  17. I had enabled it but had not set it to write to USB (I have it doing that now). In the same run the server crashed again, and I did get a picture of the screen. I've also noticed that on reboot the server would crash. I think this is that error, or at the very least I think it's the same error. Also, in the logs I noticed it keeps saying "shfs: cache disk full". As far as I can tell, all of the share floors are set low enough that I shouldn't be getting that message.
  18. I seem to be having a problem with the share floor plugin on 6.12.0-rc.4.1. None of the shares below have a file that large in them, so I'm not quite sure what's happening. The cache is a ZFS mirror pool of two 2TB NVMe drives.
      Share 'Downloads' updated to new floor level: 659.7 GB
      Share 'retronas' updated to new floor level: 82.5 GB
      Share 'appdata' updated to new floor level: 82.5 GB
      Pool 'cache' updated to new floor level: 659.7 GB
  19. Right now I doubt the issue is due to being on the RC, but as I am using it I figure I should post this here. I recently replaced most of the hardware on my Unraid server, but now it seems to be somewhat unstable. The last "crash" happened probably an hour ago. It's not really a crash: I have partial access, some dockers will still be running, and my SSH session seems to stay up, but some applications (like htop) will just freeze. I think I may lose /mnt/user at the time, but I've only been able to prove it once. Things are usually so bad that I can't run diagnostics or even the shutdown or reboot commands. This was way more prevalent (happening almost daily, if not more often) when I first set the server up. After changing some hardware around (I had a few bad SATA cables and an HBA that seemed to be having issues until I updated its firmware) it seemed like it went away, but I guess that's not the case. I ran memtest and the memory passed, so I'm kind of out of ideas. I would appreciate anyone's help.
  20. The apps page won't load right now. It looks like the JSON file that it's trying to connect to is blank on GitHub.
  21. Does anyone know how to proxy noVNC with Nginx Proxy Manager? I can't figure out the correct settings for /websockify to get it to forward sound properly.
  22. OK, I think I fixed the issue with sound. At the very least, it's now working through noVNC. The vnc-audio.ini needed to be changed. It looks like tcpserver isn't able to look up localhost via DNS, so I changed it to 127.0.0.1. I sent a pull request, but I'm somewhat new to git, so the build file for the docker needs to be changed back to the main repo from mine.
  23. OK, I was looking more at the Dockerfile on git. I don't think the environment variables for the ports will change anything as they are currently. The ports are exposed as the actual port numbers and not the variables, so even if you change the variable, the port is still the same. I could be wrong, as this is really the first time I'm looking at a Dockerfile meant for dockerman. I tested on the fork I made, and it seems to work the way I think it should when I do this:
      # Configure required ports
      ENV \
          PORT_SSH="2222" \
          PORT_VNC="5904" \
          PORT_AUDIO_STREAM="5905" \
          PORT_NOVNC_WEB="18083" \
          PORT_AUDIO_WEBSOCKET="32123"
      # Expose the required ports
      EXPOSE ${PORT_SSH}
      EXPOSE ${PORT_VNC}
      EXPOSE ${PORT_AUDIO_STREAM}
      EXPOSE ${PORT_NOVNC_WEB}
      EXPOSE ${PORT_AUDIO_WEBSOCKET}
      Note I left in port changes I had made for myself, but I tested by changing the env variables for VNC and audio to 5908 and 5909. I can say for sure that I can VNC into 5908. I still have no audio, so that's still an issue, but I don't "think" it's related to this, unless I'm using port 32123 somewhere and do not realize it.
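
Post 7 asks whether the same programs always segfault. One way to check is to tally the kernel segfault lines by program, by faulting object, and by the CPU the kernel reports. Below is a minimal Python sketch, assuming the syslog lines have exactly the shape quoted in post 7. The regex, the function name tally_segfaults, and the SAMPLE list are made up for illustration and are not part of Unraid or any of its plugins.

```python
import re
from collections import Counter

# Assumed line shape (copied from the syslog excerpt in post 7):
#   ... kernel: PROG[PID]: segfault at ADDR ip ... sp ... error N in OBJ[BASE+SIZE]
#   likely on CPU X (core Y, socket Z)
SEGFAULT_RE = re.compile(
    r"kernel: (?P<prog>[^\[\s]+)\[\d+\]: segfault at [0-9a-f]+ "
    r".*? in (?P<obj>[^\[\s]+)\[.*?"
    r"likely on CPU (?P<cpu>\d+) \(core (?P<core>\d+)"
)


def tally_segfaults(lines):
    """Count segfault lines per program, per faulting object, and per CPU."""
    by_program = Counter()
    by_object = Counter()
    by_cpu = Counter()
    for line in lines:
        match = SEGFAULT_RE.search(line)
        if match is None:
            continue  # not a segfault line in the assumed format
        # Lowercase so that "Thunar" and "thunar" are counted together.
        by_program[match.group("prog").lower()] += 1
        by_object[match.group("obj")] += 1
        by_cpu[int(match.group("cpu"))] += 1
    return by_program, by_object, by_cpu


# Four segfault lines from post 7 plus one unrelated line from post 16.
SAMPLE = [
    "May 9 21:37:29 Tower kernel: unraid-api[16221]: segfault at ffffffffffff3b28 ip 0000000001518f00 sp 00007ffe4d6f11a8 error 5 in unraid-api[91c000+167b000] likely on CPU 0 (core 0, socket 0)",
    "May 9 21:45:25 Tower kernel: python3[11316]: segfault at 7 ip 00001504488506f3 sp 00007ffdc92cd5f0 error 4 in libpython3.10.so.1.0[15044873b000+1be000] likely on CPU 14 (core 28, socket 0)",
    "May 9 22:25:13 Tower kernel: Thunar[22347]: segfault at 600000003a ip 00001512f15d1f1c sp 00007ffc678c45b0 error 4 in libglib-2.0.so.0.6600.8[1512f157e000+88000] likely on CPU 12 (core 24, socket 0)",
    "May 9 22:25:18 Tower kernel: thunar[1956]: segfault at 600000003a ip 0000153cad537f1c sp 00007ffc9fc1f040 error 4 in libglib-2.0.so.0.6600.8[153cad4e4000+88000] likely on CPU 2 (core 4, socket 0)",
    "May 3 09:09:50 Tower move: mover: started",
]

if __name__ == "__main__":
    programs, objects, cpus = tally_segfaults(SAMPLE)
    print("by program:", programs.most_common())
    print("by object: ", objects.most_common())
    print("by CPU:    ", cpus.most_common())
```

To run it over a whole log, pass any iterable of lines, for example an open file handle for wherever the syslog is saved (the location is not given in the posts). A tally that piles up on one CPU or core would be a hint toward hardware rather than damaged files, though on its own it is not proof either way.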