delgadot2040 Posted February 24

Hi everyone, I am a new Unraid user. I thought I had everything installed smoothly, but over the last two days I have had unexpected crashes. I get no warning before this happens, so I am not sure how to recreate it. I did grab a screenshot of the dashboard that was frozen while the server restarted. I also attached the diagnostics zip grabbed after the reboot. The only things I had running at the time of both crashes were qBittorrent and my Windows VM. I am running Version: 6.12.6.

The first time the crash happened, my video card stopped being passed through to my VM on the reboot. I noticed IOMMU had been changed to disabled. I went back into my BIOS and changed it to enabled, passed the card through to my VM again, and everything ran nicely until just now. This second crash happened, but the GPU is still passed through and IOMMU is still enabled, so maybe that was just a random thing that happened during the first crash?

I am very new to all this, so if there is anything else that would help you help me work things out, I am happy to provide it. Just let me know how to obtain the info from my system and I can share it with you.

System INFO:
Model: Custom
M/B: Micro-Star International Co., Ltd. Z590 PRO WIFI (MS-7D09) Version 1.0 s/n 07D0910_L31E243506
BIOS: American Megatrends International, LLC. Version 1.90 Dated 06/06/2023
CPU: 11th Gen Intel® Core™ i9-11900KF @ 3.50GHz
HVM: Enabled
IOMMU: Enabled
Cache: L1 Cache: 384 KiB, L1 Cache: 256 KiB, L2 Cache: 4 MiB, L3 Cache: 16 MiB
Memory: 32 GiB DDR4 (max. installable capacity 64 GiB)
Network: eth0: 1000 Mbps, full duplex, mtu 1500
Kernel: Linux 6.1.64-Unraid x86_64
OpenSSL: 1.1.1v

Attachment: pcserver-diagnostics-20240223-1919.zip
trurl Posted February 24

Have you done memtest? Set up a syslog server.
delgadot2040 (Author) Posted February 24

18 minutes ago, trurl said: Have you done memtest? Setup syslog server.

I set up the syslog server after posting this thread, as I saw others say this is a way to capture crash logs. I am currently running Memtest86, but it looks like it is going to take a while to complete. I did recently add another kit of identical RAM to my system, so maybe that's the issue? Hopefully the test shows more.
itimpi Posted February 24

4 hours ago, delgadot2040 said: i am currently running Memtest86 but it looks like it gonna be a while to complete. i did recently add another kit of identical ram to my system so maybe thats the issue? hopefully the test shows more.

If memtest fails, then that is definitive. If it passes, you can still have a RAM issue when the system is under load, so in that case the solution is often to run with fewer RAM sticks installed.
delgadot2040 (Author) Posted February 24 (edited)

5 hours ago, itimpi said: If memtest fails then that is definitive. If it passes you can still have a RAM issue when the system is under load so in such a case the solution is often to run with less RAM sticks installed.

OK, so I ran the test overnight and the memory passed with 0 errors. I noticed the RAM config during the test was as follows, and I was not using these settings at the time of the crash because I had XMP enabled (that was running at 2933 speed).

Attachment: IMG_0850.HEIC

So perhaps the crash was caused by the RAM being pushed harder than it could handle? When the test finished, I went into my BIOS and checked that I had the same basic RAM config, NOT the XMP settings.

Attachment: IMG_0852.HEIC

I also noticed that IOMMU had been disabled again, so I enabled it. With the RAM settings that passed memtest, I started Unraid again and tried to run my Windows VM, but the HDMI output connected to my GPU did not pass a signal even though I had re-enabled IOMMU. My screen changes from the initial Unraid Linux boot text to a black screen, as if the VM were going to come through, but then it just says no signal. I tried to stop the VM but it would not stop, and I needed to use the force stop option to close it. So I am able to run Unraid but not my VM. What should I do?

EDIT: I was able to get back into my VM by unbinding and rebinding my video card to VFIO. The strange thing is that the official Unraid logo no longer appears before the VM loads. It just goes to a black screen, no signal, and then I'm in Windows.

If none of this seems like a major issue, I am fine with testing this new RAM setting on the system until a new crash happens. Once that happens, I have the syslog server set up to capture the log and I could come back here with that info. Worst case scenario, I just remove the two new sticks of RAM I installed and go with the old setup that didn't have an issue. Let me know if that sounds like a good plan of action.

Thanks, y'all.

Double EDIT: So the system just crashed. Here are the logs from the syslog server.

Attachment: syslog.log

Edited February 24 by delgadot2040
delgadot2040 (Author) Posted February 24

I have removed the two new sticks of RAM and left the original two in. So far no crash, with an uptime of 4 hours and counting. Does the syslog specify that the RAM was the issue?
trurl Posted February 24

I don't see anything about RAM in that syslog, but there is a problem communicating with some disk. I can't tell which one without more context, though. Attach Diagnostics to your NEXT post in this thread.
delgadot2040 (Author) Posted February 24

17 minutes ago, trurl said: Don't see anything about RAM in that syslog, but problem communicating with some disk. Can't tell which without more context though. Attach Diagnostics to your NEXT post in this thread.

Hmm, that's interesting. Here you go.

Attachment: pcserver-diagnostics-20240224-1341.zip
trurl Posted February 24

I don't see the disk problem in those diagnostics, which were taken after the reboot.
delgadot2040 (Author) Posted February 24

6 minutes ago, trurl said: Don't see it in those after reboot.

Is it possible that the 4 sticks of RAM interfered with communication to the disk you mentioned? The server seems to be running OK with the original two RAM sticks. I might get a larger-capacity pair of sticks to replace these, to avoid installing 4. Any advice?
delgadot2040 (Author) Posted February 25

It just crashed on me now with the original 2 sticks of RAM. It happened while launching a game inside my Windows VM. Could it have something to do with the vdisk living on an NVMe drive that I'm also using as my cache? I don't know what to do or try. Here is the latest log.

Attachment: syslog-.log
trurl Posted February 25

Same ata4 resets you had earlier. They seem to be referring to disk3. Check connections. That shouldn't cause a crash, though. Something else is probably going on.

The usual advice is to boot in SAFE mode with Docker and VM Manager disabled and let it run like that for a while to see if it still crashes.
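For anyone wanting to check their own captured syslog for the kind of ata link resets mentioned above, here is a minimal Python sketch. It rests on assumptions: the regular expression targets kernel messages of the form `ataN: hard resetting link`, whose exact wording can vary between kernel versions, and the sample lines are made up for illustration (they are not taken from the logs attached in this thread). The function name `count_ata_resets` is hypothetical.

```python
import re
from collections import Counter

# Assumed message shape: "... kernel: ata4: hard resetting link".
# Wording may differ by kernel version, so adjust the pattern to your log.
RESET_RE = re.compile(r"\b(ata\d+)(?:\.\d+)?: .*resetting link")


def count_ata_resets(lines):
    """Count reset-related log lines per ATA port name (e.g. 'ata4')."""
    counts = Counter()
    for line in lines:
        match = RESET_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Illustrative sample lines only; NOT from the poster's syslog.
SAMPLE = [
    "Feb 24 13:02:11 pcserver kernel: ata4: hard resetting link",
    "Feb 24 13:02:16 pcserver kernel: ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)",
    "Feb 24 13:05:40 pcserver kernel: ata4: hard resetting link",
    "Feb 24 13:06:02 pcserver kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)",
]

if __name__ == "__main__":
    # Print ports with the most reset messages first.
    for port, n in count_ata_resets(SAMPLE).most_common():
        print(f"{port}: {n} reset message(s)")
```

To run it against a real file, pass an open file object instead of `SAMPLE`, for example `count_ata_resets(open("syslog.log", errors="replace"))`. A port that shows repeated resets only tells you which link is unstable; matching that port to a specific array disk still requires the diagnostics.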
delgadot2040 (Author) Posted February 25

14 minutes ago, trurl said: Same ata4 resets you had earlier. Seems to be referring to disk3. Check connections. Shouldn't cause crash though. Something else is probably going on. Usual advice is to boot in SAFE mode with Docker and VM Manager disabled and let it run like that for a while to see if it still crashes.

Could this have something to do with that drive being a shucked WD white label with the 3.3V pin? Disk 2 was the same, but I put tape over the third pin after it was not recognized on initial install, and the tape solved it for that drive. When I put in disk 3, also a shucked white label, it was recognized immediately without covering that pin, so I never did. Perhaps if I remove the drive and add tape over the pin like on the other one, it will function as intended.
delgadot2040 (Author) Posted February 25

An update: after putting tape on the third pin of the power connector on disk3, things have stayed up for 19 hours now. I have been running my VM for gaming, running torrents and Plex, and doing a parity check. No crashes, knock on wood. I'm going to assume the problem was the WD white label with that power pin. I just wanted to come back and let y'all know in case anyone else runs into this in the future.