
2Piececombo

Members
  • Posts

    138
  • Joined

  • Last visited

Everything posted by 2Piececombo

  1. Small follow-up: I've booted into safe mode, started the array and Docker, and it has not crashed, and the onboard graphics is displaying the console as normal. This makes me think the Nvidia driver plugin is somehow to blame for my issues. Thoughts?
  2. Follow-up: booted the server up in safe mode (so no Nvidia plugin) and the output is displaying properly.
  3. Powered down my server and reinstalled the GPU, booted back up normally, and within 5 minutes it shut down again. It's 100%, without a doubt, related to the GPU. I will try again booted into safe mode and post results. @alexbn71 I wanted to ask you: when you have a GPU installed, does your onboard graphics display the Unraid boot process/login screen normally? My BIOS is set to force display out the onboard graphics (so Unraid's display/console goes out the onboard output), and it works as normal up until the boot process is complete and it goes to the login screen. At that point I just get a black screen and am not able to use the local console. The webUI is still fine and accessible, but the local console is not. Do you have this same behavior? I thought it might be something to do with the Nvidia driver plugin, so I posted on the support page. The author claimed the issue was due to me still being on 6.9 and said it would go away once I upgraded to 6.10.3. Well, it didn't. Curious if you have this issue as well.
  4. @ich777 It's been two months since I last posted about my issues. I have since updated Unraid to 6.10.3 as you suggested, but the problem persists: Unraid boots up and outputs display on the onboard display adapter, but as soon as it hits the login screen the display goes blank. The server is still accessible via the webUI and everything else seems to work fine. You mentioned previously that this would be resolved by upgrading to 6.10, but the issue is still there. Additionally, and possibly related, I have crashing/shutdown issues that only seem to occur while a GPU is connected to my system. I've documented the issue HERE. I have tested EVERY piece of hardware in my server, and the crashing has happened with two different GPUs. I understand if the shutdowns are not something you have any input on; I just figured I'd mention it in case it helps in some way. I'd really like to know why Unraid will not display past the boot process with a GPU installed. (And no, the login is not showing on the GPU's display either.) I've included diags and can supply any other info if needed. Cheers diagnostics-20220821-1744.zip
  5. I have not run in safe mode with the GPU installed, but I guess I should give it a go. @Frank1940, supposing this is the cause, are you aware of any remedies for this problem, or what causes it? I've never heard of an issue like this with Docker, but I'm clearly no expert.
  6. BIOS is up to date. To my knowledge, the bottom PCIe slot in my mobo is the only one that does not support use of a graphics card (the mobo is a Tyan S7012).
  7. I have extensively tested cooling by booting into another OS and running a stress test for well over an hour. The PSU has also been replaced previously, with an 850W Gold-rated EVGA supply. It should be noted that this shutdown only occurs while in Unraid; I can leave the GPU in and boot into something else with no issues. Every piece of hardware has been replaced or tested: all RAM, both CPUs, the mobo, HBA, and PSU. There is no hardware issue here. These shutdowns have occurred with 2 different GPUs, neither of which even draws additional power from the PSU.
  8. Same thing for me as well. I'm syslogging to a second Unraid server and there's nothing of note there; the logs just end when the shutdown occurs.
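In case it helps anyone replicate the remote-syslog setup for testing, here's a rough sketch using only Python's standard library. This mimics the idea of forwarding to a second box, it is not Unraid's actual implementation, and the host address in the usage comment is a placeholder:

```python
import logging
import logging.handlers

def make_remote_logger(host: str, port: int = 514) -> logging.Logger:
    """Build a logger that forwards records to a remote syslog
    daemon over UDP, similar in spirit to pointing Unraid's
    remote-syslog setting at a second server."""
    logger = logging.getLogger("remote-syslog-demo")
    logger.setLevel(logging.INFO)
    # SysLogHandler with a (host, port) tuple sends UDP datagrams,
    # so a sudden power-off simply means the stream stops mid-air,
    # which matches the "logs just end" behavior described above.
    handler = logging.handlers.SysLogHandler(address=(host, port))
    logger.addHandler(handler)
    return logger

# Usage (the address below is a placeholder for the second server):
# log = make_remote_logger("192.168.1.50")
# log.info("heartbeat")
```

Because it is UDP, the sender never learns whether the last few messages arrived, which is exactly why the remote log can't show a cause for the shutdown, only where it stops.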
  9. Nope. Solid as a rock since I took the GPU out.
  10. Out of curiosity, what GPU do you have installed in your server? You said your server reboots, which is slightly different from the shutdown issue I've been facing. Is your server truly rebooting on its own? Or is it shutting down and being powered back on automatically via BIOS settings, or manually? Take the GPU out, run it like that for a while, and see if the issue goes away.
  11. I upgraded the PSU a while back; it's an 850W EVGA. The GPU is only a P600, which doesn't even require additional power, just the PCIe slot power, so it shouldn't be a power issue.
  12. Update to this issue. I realized something a few nights ago that I should have figured out a long time ago. When I was having this problem months ago (before it magically went away) I had a GPU installed that I was planning to use for Plex. The shutdowns started happening, and I was so frustrated I gave up on setting up Plex for transcoding and eventually took the card out. I left the server offline for some time after that, still too frustrated to keep dealing with it. When I powered it back up it was fine and no longer crashed. I then put a different GPU in the server to give Plex transcoding another go, and sometime in the next few days to a week the shutdowns came back. After I installed the GPU, but before it started shutting down, I had an issue where Unraid wouldn't display the GUI through the onboard display output but instead used the GPU; I verified this by plugging a monitor into the GPU. To fix this I went into the BIOS and forced it to use the onboard graphics. After rebooting, the boot sequence would show on the onboard video port (not just the mobo boot process, but Unraid as well, like the blue screen where you can choose which mode to boot: GUI/non-GUI/safe mode/etc.), but as soon as it got to the point where it would show the login screen, nothing, just a blank black screen. The server was still accessible through the webGUI. I posted in the Nvidia plugin support page and the author said it was due to not being on 6.10 (I was still on 6.9 something) and said it should be resolved after I update. I was hesitant to update Unraid, because by now the shutdowns were happening again and I didn't want it to shut down mid-upgrade. Eventually I did it anyway, and I'm now on 6.10.3. The display output issue still isn't solved, but that's an issue for the Nvidia plugin guy I guess. And the shutdowns continued. Fast forward to a few days ago, and it hit me that the shutdowns only seem to happen when I have a GPU installed.
I initially ruled out the GPU as the cause, since I ended up switching GPUs; so it's not the specific GPU, but rather it seems to happen when ANY GPU is installed. Why this could be, I haven't the foggiest. But to confirm it I took the GPU out yesterday, and the server has been running fine for 28 hours now, no shutdowns. I continue to be puzzled by this issue and I'm hoping one of you brilliant people has an idea why this would be happening. Cheers for any help as usual.
  13. Oh, I misunderstood what you were asking. The syslog I showed you is from a remote syslog server, so it should include everything right up until the very second it died.
  14. IIRC, I grabbed those diags right after booting back up from the server shutting down on its own. I will gather another set of diags immediately after it shuts down to be sure, though. Thanks for the help.
  15. I've replaced every piece of hardware except the HDDs/SSDs and the USB boot drive. I assume you saw nothing in the diags that pointed to anything? I'm just at a complete loss. The weird part to me is that this only happens while booted into Unraid. Would it be worth replacing the USB and reflashing Unraid to a new drive?
  16. My server has been randomly shutting down for a while. I've tested basically every possible component and replaced the CPUs, RAM, motherboard, PSU, and HBA. Tested the RAM for over 24 hours with no errors. Sometimes it shuts down within a minute or two of booting up, sometimes it lasts for hours; it even lasted around a month at one point. But eventually it always shuts down. It's a Tyan S7012 motherboard. The IPMI remains accessible, and there is no event created. I have the server syslogging to a second Unraid server, and I see nothing there that identifies a problem. I'm not good enough with the diagnostics to find a problem, so I'm hoping someone else can take a look and find something. I'm wondering if something with Unraid is causing it, because I've booted into both Windows and Linux and neither has crashed once; it only seems to occur when booted into Unraid. I'm not sure what that could mean. Someone suggested I replace the USB, but it seems unlikely that this would be the cause. I simply don't know what else to look for, test, or check. I have this server and one other plugged into the same UPS, and only this one shuts down, so I don't think it's a power problem either. Any help is greatly appreciated; my head is raw from all the scratching. server diags 7-28-22.zip
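Since the remote syslog just stops at the moment of shutdown, one trick for pinpointing exactly when the box died is to log a heartbeat every 30 seconds or so and then look for silent gaps afterwards. This is just my own sketch (the function name and thresholds are made up, not any Unraid feature):

```python
def find_gaps(timestamps, max_gap=60.0):
    """Given a sorted list of heartbeat timestamps (seconds),
    return (last_seen, next_seen) pairs where the stream went
    silent for longer than max_gap -- candidate shutdown windows."""
    gaps = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur - prev > max_gap:
            gaps.append((prev, cur))
    return gaps

# Example: heartbeats every 30 s, then silence until second 400.
# find_gaps([0, 30, 60, 400]) -> [(60, 400)], i.e. the server went
# down sometime shortly after second 60 and came back at 400.
```

Feeding it the epoch timestamps from a cron job that logs `date +%s` each minute would show whether the shutdowns correlate with any scheduled task.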
  17. After some head-scratching and testing, I have realized a few things. In the IPMI, it lists the temps as "CPU Below Tmax" followed by a value. According to Intel Ark, for my CPUs (X5670) Tcase is 81.3°C. So if you subtract the value in the screenshot, 67, from Tcase: 81.3 - 67 = 14.3°C. This was basically impossible given the temperature of the room my server is in. I booted into Linux and used a sensor tool to check temps and compare them to the IPMI values. It stated the CPU was idling at about 25-30°C. It also listed two more values, HIGH = 80 and MAX = 96; neither of these matches Intel Ark, but if we use 96 in the previous equation the numbers work out: 96 - 67 = 29°C, or about 84°F, which is exactly how hot it was in that room. The CPUs have Noctua coolers on them, which should be doing a pretty good job of keeping them cool. I ran a CPU stress test and monitored the temps in both Linux and the IPMI interface, and as the temps in Linux went up, the value in IPMI went down, confirming it IS measuring distance from SOME max value. I don't know how to confirm what it thinks the Tmax value really is (and the thresholds table makes little sense to me), but I've at least confirmed that temps are in check and the CPUs aren't overheating. I still need to do some more testing while monitoring the temp data in Unraid, but I will do that tomorrow and share any interesting data. Hopefully this helps at least one person someday who might be scratching their head like I was.
  18. I'm having some trouble getting the correct CPU temp info. I installed the CPU temp plugin and installed perl. Hitting detect finds coretemp, i5500_temp, and w83793, and from there the dropdown gives me the following options. But none of these appear to be correct, as this is what the sensor readings in the IPMI show. I have no idea if the detected driver is correct. The mobo is a Tyan S7012. I can't seem to find anywhere to confirm what the sensor driver should be for this board, or perhaps I just don't know where to look.
  19. No particular reason I didn't go to 6.9.2; a lack of time more than anything. I read a few random comments on YT videos and such that people had a few issues with 6.10, so I was going to upgrade eventually. I'll update over the weekend and post back if I still have issues. Cheers.
  20. Done. If I find a solution I'll post back here as well in case anyone else is having the same issue. Also, just to note, I confirmed it was the Nvidia driver plugin by uninstalling it, which brought back the login page. Reinstalled it, and the login page is gone again.
  21. To confirm the plugin was the problem, I removed it and rebooted, and the login screen was back. I then reinstalled the plugin, turned Docker off/on, rebooted, and no more login page. So it's definitely the plugin causing something. I also found this thread where someone else appears to have the exact same problem, but it looks like he didn't find a solution.
  22. Sorry if this has already been addressed somewhere in here; if so, I couldn't find it. I posted in general support, but it was suggested to post here, as best I can tell the issue likely comes from the Nvidia plugin. I installed a P600 and the Nvidia plugin, and changed my BIOS to use the iGPU instead of the PCIe GPU. Everything is fine through the whole boot process, except at the very end, when you would normally get a login screen, it's blank. The server is up and running fine and can be accessed via the webGUI, but the console monitor shows no login page. The BIOS settings are correct; otherwise I would get NO video at all from the onboard graphics, but like I said it's fine until the login page. I booted in GUI safe mode and the login screen reappeared, which makes me more confident the issue stems from the Nvidia plugin. I have no idea what to try next.
  23. Update: booted into GUI safe mode and the login screen is back. I'm sure it has something to do with the Nvidia plugin, which leaves me with the following question: how do I use the Nvidia driver plugin so my GPU can be used in Plex, but get the login screen back? I don't even know where to start looking for a fix on this one.
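One low-risk thing worth checking when the console goes blank after the driver loads is which kernel driver currently owns the console framebuffer, which `/proc/fb` lists one device per line. Here's a small parsing sketch I'd use (the helper name is mine and the sample device names in the test are illustrative, not taken from this server):

```python
def parse_proc_fb(text: str) -> dict:
    """Parse the contents of /proc/fb, which is one
    '<index> <driver name>' entry per line, into {index: driver}.
    Comparing this output with and without the Nvidia plugin
    loaded can show whether the console framebuffer moved."""
    devices = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        idx, name = line.split(maxsplit=1)
        devices[int(idx)] = name.strip()
    return devices

# On the server itself (over SSH, since the local console is blank):
# with open("/proc/fb") as fh:
#     print(parse_proc_fb(fh.read()))
```

If the framebuffer list changes once the proprietary driver binds the card, that would be consistent with the login getty ending up on a display that is no longer wired to the onboard output.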