Rudder2

Members
  • Posts

    157

Everything posted by Rudder2

  1. I've been having a problem with all my web UIs becoming unresponsive every 26 to 48 hours. This includes unRAID, Plex, Sonarr, etc. During the shutdown process they start working again until the server kills the web UI processes. I'm sure something is hanging that gets killed early in the shutdown process, but I don't know what. I SSH into the server and run a diagnostics collection, and the diagnostics appear to hang until I initiate the shutdown; then the collection completes before the server shuts down. Here are the diagnostics files from yesterday and today, which I collected from the SSH terminal before shutdown since the web UIs were unresponsive. If anyone could give me a clue I would greatly appreciate it. Thank you! rudder2-server-diagnostics-20181227-0928.zip rudder2-server-diagnostics-20181226-0431.zip
  2. As promised, I am writing to tell you the outcome. unRAID no longer locks up. Changing all interrupts to MSI interrupts has stopped the original problem I opened this thread about. I've been researching and haven't found a definitive answer yet, but I'm starting to think that with the kernel update in 6.6.0 and/or the QEMU update, backward compatibility between the old interrupt system, which Micro$haft still uses, and the new MSI interrupt system, which Linux adopted a long time ago in technology time, was removed. The original complaint of this thread has been cured by enabling MSI interrupts. Thank you everyone for your help!
  3. https://bugs.launchpad.net/qemu/+bug/1580459 It's an old bug report, so I don't know why I only just ran into the problem... Maybe the problem was worsened when QEMU upgraded to 3.0, or by a combination with the new kernel... Who knows, but this is the bug report that ultimately led me to finding the problem. I suggest adding a note to the Windows VM help page telling users to make sure they turn on MSI interrupts.
  4. I will leave this open and let my system run a week without doing anything with her, because I've had my server run up to 6 days before I suffered a freeze, but I think the problem is resolved. It was settings in WinBlow$ causing the problem. With my VM running gaming benchmarks for over an hour the host, unRAID, didn't lock up! The things I did during diagnostics that I didn't reverse:
     1. Upgraded the Windows 10 VM from SeaBIOS to OVMF using this solution
     2. Upgraded the machine type from i440fx-2.10 to i440fx-3.0
     3. Changed the USB controller from 3.0 (nec XHCI) to 3.0 (qemu xhci)
     4. Upgraded all the VirtIO drivers to the latest downloadable from unRAID (virtio-win-0.1.160-1)
     The above didn't seem to fix the problem.
     5. Used the MSI_util_v2 found at http://www.mediafire.com/file/2kkkvko7e75opce/MSI_util_v2.zip to turn on MSI interrupts for all hardware in the util's list.
     #5 is what fixed the lockup issue as far as I can tell. I will know when I update you next week.
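For anyone wondering what "turning on MSI interrupts" for a device amounts to inside the Windows guest: Microsoft documents a registry value, MSISupported, under each device's Interrupt Management key. The sketch below only builds the text of a .reg file for one device. It assumes, without having confirmed it from the utility itself, that MSI_util_v2 flips this same value, and the device instance path in it is a made-up placeholder to be replaced with the one Device Manager shows for the real device.

```python
# Sketch only: builds .reg text that sets MSISupported for ONE device.
# Assumption: this is the same value MSI_util_v2 toggles (not verified here).
ENUM_ROOT = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum"
MSI_SUBKEY = (
    r"Device Parameters\Interrupt Management"
    r"\MessageSignaledInterruptProperties"
)

# Hypothetical placeholder, not a real device instance path.
EXAMPLE_INSTANCE_PATH = (
    r"PCI\VEN_XXXX&DEV_XXXX&SUBSYS_XXXXXXXX&REV_XX\X&XXXXXXXX&X&XXXX"
)


def msi_reg_text(instance_path, enabled=True):
    """Return .reg file text that enables or disables MSI for one device."""
    key = "\\".join([ENUM_ROOT, instance_path, MSI_SUBKEY])
    value = 1 if enabled else 0
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{key}]\n"
        f'"MSISupported"=dword:{value:08x}\n'
    )


if __name__ == "__main__":
    print(msi_reg_text(EXAMPLE_INSTANCE_PATH))
```

Generating text instead of writing to the registry directly keeps the change reviewable before anything is imported in the guest. Back up the registry first; the device normally has to be restarted (or the VM rebooted) before the setting is picked up.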
  5. Hello all. Since there have been crickets here, I assume no one has any idea what to try. I disabled my Asus GTX 960 OC passthrough and the VM stopped locking up the host. I found a bug report on Launchpad relating to this very problem. People are reporting success fixing it by enabling MSI interrupts in the Windows guest for all hardware that supports them. Their testing shows that Linux uses MSI interrupts by default, and the theory there is that M$ WinBlow$ not using them by default is causing conflicts. I'm going to let my VM run a couple of hours without the passthrough to make sure it's stable, then start trying the fixes from the Launchpad bug report for QEMU. It's looking like this is a QEMU problem and not a problem with unRAID at all. This problem is happening across Linux flavors. I'm starting to think that the QEMU update that came with going from 6.5.0 to 6.6.x is the real problem here. I will keep y'all posted with my results. It's so nice to be in between classes and have the time to research and try things properly.
  6. So are there any more ideas on how I might figure out why my MicroShaft WinBlow$ 10 VM with Asus nVidia GTX 960 SC passthrough is locking up my entire unRAID system on 6.6.x unRAID upgrades? This VM is VERY important to my education, so any help in fixing this would be great.
  7. Are you saying I might have a hardware problem if it only happens when my Windows 10 VM is running? It doesn't happen on 6.5.0. Or are you telling the last poster to do more diagnostics?
  8. I tested my system without my VMs running. It has been 100% stable for the better part of a week. I started my Windows 10 VM and it locked up the system. It's starting to look like a problem with the Windows 10 VM with nVidia GTX 960 passthrough when I upgrade to 6.6.x, not a hardware issue or a problem with core unRAID. If it is, we need to figure out why... I need my Windows 10 VM for college... Luckily classes are ending, so I don't need it for the next 3 weeks and I will have more time to try to diagnose the problem. I've had this Windows 10 VM with passthrough for over a year. 2018-12-12 Windows 10 VM Configuration Page.pdf Windows Steam Streaming.xml
  9. You are the man! It still works perfectly!!
  10. Thank you for the information. I will not ax ASRock as a consideration for future upgrades. They might also feel that since it's a 4-year-old board they shouldn't have to support it. I bought ASRock because of friends who had 10-year-old computers running on their boards without a hitch. I planned on getting 8 to 10 years out of my unRAID hardware before upgrading or replacing.
  11. Just for information in this thread, this didn't change the problem. I installed the custom update that keeps the kernel from installing the microcode. My server still locks up every 23 minutes of run time, like it started doing with 6.6.6. It used to run 5 or 6 days before the first lockup on 6.6.0-6.6.5. Also, for shits and giggles, I ran a memory test again and it passed without issues. Anything else to try would be greatly appreciated. Here are my logs: FCPsyslog_tail-2018-12-05 6.6.6 custom BZRoot.txt rudder2-server-diagnostics-6.6.6-20181205-1124-custom BZRoot.zip
  12. Thank you for this information. I was afraid of this when they stopped replying to me about BIOS info and pointed to the fact that I'm running Linux. I'm trying an experimental 6.6.6 right now without the microcode added, and this should shed more light on the problem, we hope. I'm losing faith in ASRock. If this proves to be their problem, what manufacturer would you recommend? I always thought of ASRock as a great company. As said in Indiana Jones and The Last Crusade, "He Chose Poorly". Ironically, my Shuttle XPC barebones build with a 4770K, running KUbuntu 18.04 with kernel 4.15.0-38 (the latest for 18.04), hasn't run into issues, even though Shuttle isn't known for their awesomeness and stopped updating the BIOS in 2016. It was built 6 months before my super expensive unRAID server. I just noticed that its kernel is way older than unRAID's.
  13. No, only Windows is listed for the board I have. I bought this board 4 years ago for use with unRAID and it had been working GREAT from when I installed it until the problems I'm having with 6.6.x.
  14. Lasted 30 minutes again. Whatever happened, it's worse now. Here are the logs from the 2nd try... not sure if they will help. rudder2-server-diagnostics-6.6.6-20181203-2029.zip FCPsyslog_tail-2018-12-03 6.6.6 2nd try.txt What's next to try?
  15. This is the shortest the server has ever run... It lasted only 30 minutes on 6.6.6. I did a power off and power on instead of using the reset button. Diagnostics after reboot and the FCP tail: rudder2-server-diagnostics-6.6.6-20181203-2005.zip FCPsyslog_tail.txt Going to let it run once more with Fix Common Problems in diagnostics mode before I downgrade and wait for something to try.
  16. Installing 6.6.6 today and putting Fix Common Problems into diagnostics mode just in case, since I haven't had a successful install of a 6.6.x update. Really trying to capture the logs as it happens this time. I will also power down the server for a couple of minutes and power it back on after the update instead of issuing a reboot. Fingers crossed!
  17. They haven't responded to me yet. I looked at all their current boards and not one lists Linux. I'm hoping they message me back, but I'm starting to think ASRock is not the manufacturer to use if you want to run Linux. I've read on the Ubuntu forums that others have had the same issue with ASRock. I wasn't even asking them about a Linux-related problem... I was just curious, to help with our diagnostics, why they haven't produced a BIOS with updated microcode since R24: whether it was because of compatibility or they just haven't gotten around to it. I like ASRock boards but I think this will be my last... I don't have a winblows computer in my house... just a winblows 10 VM on my unRAID server to do Steam in-home streaming for games not compatible with Linux. I gave up on MicroShaft after windblows 7. I've been 100% Linux since 2012. For the future of my computer building, can someone suggest a more Linux-friendly motherboard manufacturer? It's just odd that I only get machine check messages when I'm on unRAID 6.6.0 - 6.6.5. I think if we figure that out, we will figure out why my system locks up when upgraded.
  18. So, ASRock basically told me to buy a board that they designed for Linux, because they designed this board for winblows, and refused to answer the question about the microcode version in their BIOS.
  19. I know, it's so odd. Since it's hard locked I can't power down properly, so I press the reset button. Honestly, I haven't had a computer lock up in so long that I forgot powering off, waiting a minute, and powering back on was different from pressing the reset button.
  20. I'm totally open to this as long as I retain my ability to downgrade to the release that is stable for me, 6.5.0. I've been testing every release and downgrading since 6.6.0, because that's when this problem started for me (I only missed 6.6.4 because I never got a notification when it came out; the notification issue has been remedied). I thought it would go away when you fixed the RealTek NIC problem, but it didn't. Just tell me what you want me to try and I will upgrade my system and try it. When I first install an upgrade it takes 5-6 days to lock up the first time, then only 6 hours, then only 3 hours, then every 30 minutes, so it might take a week or so to get you the results. Also, if you could tell me a way to have the logs stored on the cache drive, I will do this so we have logs right up to the lockup. I've been wanting this as an option for a long time, maybe with logs deleted once they are a week or 2 old. In the meantime I'm putting in an email to ASRock support. Thank you for your help. I really appreciate it.
  21. From the log: Nov 14 23:48:13 Rudder2-Server kernel: microcode: microcode updated early to revision 0x25, date = 2018-04-02 From the motherboard manufacturer (ASRock): BIOS P2.80: Update Haswell CPU Microcode to revision 24 and Broadwell CPU Microcode to revision 1D. So yes, I think we are on to something... The microcode update in the kernel is breaking the CPU. I wonder if this is why ASRock didn't update the BIOS to it. Where do we go from here? Is this something that must be fixed by Limetech, or is there code I can insert to fix my system until a permanent fix is implemented? I really don't want to be stuck with no updates to unRAID. Thank you for your help, you shed some light on the situation...
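The comparison made in the post above (kernel-loaded microcode vs. the revision shipped in the BIOS) can be sketched in a few lines of Python. This is a minimal sketch, not a tool from the thread: the log line is the one quoted above, the function name is made up for illustration, and it assumes the "revision 24" in the ASRock changelog is hexadecimal like the kernel's 0x25.

```python
import re

# Kernel log line exactly as quoted in the post above.
LOG_LINE = (
    "Nov 14 23:48:13 Rudder2-Server kernel: microcode: "
    "microcode updated early to revision 0x25, date = 2018-04-02"
)

# ASRock BIOS P2.80 changelog says "revision 24" for Haswell.
# Assumption: that number is hexadecimal, like the kernel's value.
BIOS_REVISION = 0x24


def parse_microcode_revision(line):
    """Return the revision from an 'updated early' kernel line, or None."""
    match = re.search(
        r"microcode updated early to revision (0x[0-9a-fA-F]+)", line
    )
    return int(match.group(1), 16) if match else None


kernel_revision = parse_microcode_revision(LOG_LINE)
if kernel_revision is not None and kernel_revision > BIOS_REVISION:
    print(
        f"Kernel loaded microcode {kernel_revision:#x}, "
        f"newer than the BIOS revision {BIOS_REVISION:#x}"
    )
```

The same function could be run over every line of a saved syslog to see on which boots the kernel replaced the BIOS microcode; a newer revision by itself does not prove the microcode is the cause of the lockups.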