Everything posted by Mantene
-
Let me preface this by saying everything seems to be working great, so this seems like an error of no consequence at the moment, but I am seeing this:
-
I can probably do all of those things. I am fairly sure I have a spare, though lower wattage, PSU. And just using two DIMMs is easy enough to try. However, I just started the Prime95 CPU test, so I will let that run for a few hours (if the PC stays up that long)! Thank you for the suggestions, I will report back when I have more information.
-
May 5 08:58:30 Eeyore kernel: RSP: 0018:ffffc900007b78a0 EFLAGS: 00010202
May 5 08:58:30 Eeyore kernel: RAX: ffffea0005e41d80 RBX: ffffc900007b7940 RCX: 0000000000000006
May 5 08:58:30 Eeyore kernel: RDX: 0000000000000101 RSI: 17fec0817ed02b28 RDI: ffffc900007b78e8
May 5 08:58:30 Eeyore kernel: RBP: ffffc900007b7930 R08: 000000000000007f R09: ffffea0005e41d80
May 5 08:58:30 Eeyore kernel: R10: 0000000000000000 R11: ffff888103891500 R12: 000000000000000d
May 5 08:58:30 Eeyore kernel: R13: ffff888103891500 R14: ffff8881026826c0 R15: 17fec0817ed02b08
May 5 08:58:30 Eeyore kernel: FS: 0000000000000000(0000) GS:ffff888ffea40000(0000) knlGS:0000000000000000
May 5 08:58:30 Eeyore kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 5 08:58:30 Eeyore kernel: CR2: 00001510c418d4e8 CR3: 000000000200a000 CR4: 0000000000350ee0
May 5 08:59:20 Eeyore kernel: mce: [Hardware Error]: Machine check events logged
May 5 08:59:20 Eeyore kernel: [Hardware Error]: Corrected error, no action required.
May 5 08:59:20 Eeyore kernel: [Hardware Error]: CPU:9 (17:71:0) MC2_STATUS[-|CE|MiscV|AddrV|-|-|SyndV|CECC|-|-|-]: 0x9c20400000020136
May 5 08:59:20 Eeyore kernel: [Hardware Error]: Error Addr: 0x00000001790531e0
May 5 08:59:20 Eeyore kernel: [Hardware Error]: IPID: 0x000200b000000000, Syndrome: 0x000171f21a4418f5
May 5 08:59:20 Eeyore kernel: [Hardware Error]: L2 Cache Ext. Error Code: 2, L2M Data Array ECC Error.
May 5 08:59:20 Eeyore kernel: [Hardware Error]: cache level: L2, tx: DATA, mem-tx: DRD
May 5 08:59:20 Eeyore kernel: mce: [Hardware Error]: Machine check events logged
May 5 08:59:20 Eeyore kernel: [Hardware Error]: Corrected error, no action required.
May 5 08:59:20 Eeyore kernel: [Hardware Error]: CPU:1 (17:71:0) MC14_STATUS[Over|CE|MiscV|AddrV|-|SyndV|CECC|-|-|-]: 0xdc2040000004010b
May 5 08:59:20 Eeyore kernel: [Hardware Error]: Error Addr: 0x00000001790531e0
May 5 08:59:20 Eeyore kernel: [Hardware Error]: IPID: 0x000700b020f50300, Syndrome: 0x000171f21a47010a
May 5 08:59:20 Eeyore kernel: [Hardware Error]: L3 Cache Ext. Error Code: 4, L3M Data ECC Error.
May 5 08:59:20 Eeyore kernel: [Hardware Error]: cache level: L3/GEN, tx: GEN, mem-tx: GEN
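If these corrected errors keep showing up, it may help to see whether they cluster on particular cores or banks. Below is a minimal Python sketch that tallies the `MCn_STATUS` lines per CPU and bank. The function name `tally_mce` and the regular expression are my own illustration, written only against the line format shown in the excerpt above; other kernels or CPUs may log a different layout.

```python
import re
from collections import Counter

# Matches status lines shaped like the ones in the excerpt above, e.g.
#   "... [Hardware Error]: CPU:9 (17:71:0) MC2_STATUS[...]: 0x9c20400000020136"
# This pattern is an assumption based on that excerpt only.
MCE_STATUS = re.compile(
    r"\[Hardware Error\]: CPU:(\d+) \S+ MC(\d+)_STATUS\[[^\]]*\]: (0x[0-9a-fA-F]+)"
)

def tally_mce(lines):
    """Count machine-check status lines per (cpu, bank) pair."""
    counts = Counter()
    for line in lines:
        match = MCE_STATUS.search(line)
        if match:
            cpu, bank = int(match.group(1)), int(match.group(2))
            counts[(cpu, bank)] += 1
    return counts

# Three lines copied from the excerpt above; only two are status lines.
sample = [
    "May 5 08:59:20 Eeyore kernel: [Hardware Error]: CPU:9 (17:71:0) "
    "MC2_STATUS[-|CE|MiscV|AddrV|-|-|SyndV|CECC|-|-|-]: 0x9c20400000020136",
    "May 5 08:59:20 Eeyore kernel: [Hardware Error]: Error Addr: 0x00000001790531e0",
    "May 5 08:59:20 Eeyore kernel: [Hardware Error]: CPU:1 (17:71:0) "
    "MC14_STATUS[Over|CE|MiscV|AddrV|-|SyndV|CECC|-|-|-]: 0xdc2040000004010b",
]

for (cpu, bank), count in sorted(tally_mce(sample).items()):
    print(f"CPU {cpu}, bank MC{bank}: {count} status line(s)")
```

To run it over a whole log, pass an open file object instead of `sample`. A tally like this only shows where the reports concentrate; it does not by itself say which component is at fault.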
-
Yep, it still does it with the RAM at 2133. I tried safe mode with Docker and VMs disabled. It stays up for longer, but it still seems to reboot randomly. So yes, I am also of the opinion that it is hardware. I just wish I knew which component. MB, CPU, or PSU are the main suspects.
-
Thank you, @JorgeB, for moving the thread to the correct forum. Apologies to @Squid for posting in the wrong place. I was in somewhat of a panic when I created the original thread.
@John_M - I let MemTest run overnight and there were no errors in the morning; the system also did not reboot at all. I am now mirroring syslog to flash, and I will attach a new diags bundle. I have also removed the modprobe i915 now - this used to be on an Intel system, and that is a remnant.
@ChatNoir - The PSU seems to be okay, but that is one of the more difficult pieces of hardware to know for sure. Cooling also seems okay - the CPU and MB temps hover around 45, one occasionally hits 60 but only ever for a few seconds, and it has always been so. These were my first thoughts too.
I have seen some errors relating to L2 or L3 cache in the syslog. Could that be the issue? Is there a way to test the CPU for faults? I am at a loss here. The system stays up if I boot into safe mode and don't mount the array. Once I mount the array, it takes just minutes until an unexpected reboot.
@JorgeB - as to the memory overclocking - you are right! I had XMP turned on for my RAM. I turned it off this AM and the current speed should be 2133.
I deeply appreciate all the help you are all providing. Any ideas what my next steps should be? eeyore-diagnostics-20210505-1013.zip
-
Help, Server is rebooting every few minutes!
Mantene commented on Mantene's report in Stable Releases
Oh, it even does it in safe mode. I am ready to throw the box out the window
-
Help, Server is rebooting every few minutes!
Mantene commented on Mantene's report in Stable Releases
Yes, I read that a while back. And I do not overclock my RAM (or my CPU), I am using approved RAM and all the same RAM in all the slots. Also, my power settings are correct, so C-states should not be an issue. And again, I haven't made any changes to any of that recently and I have been running Unraid for quite some time now. Also, here are the diagnostics from a regular boot - it crashed about 20 seconds after I got this! eeyore-diagnostics-20210504-1730.zip
-
I don't know what is going on. All of a sudden my server is rebooting what seems like every few minutes. I have been running 6.9.2 since it first came out, so it isn't like I am running a beta release. And I haven't made any configuration changes to the server. In fact, I was simply using the Windows 10 VM when it started this behavior. I am attaching the diagnostic data from Safe Mode with the array started. Yes, it seems to work in safe mode. I know that some plugins got updated today but I honestly don't know which ones - Unassigned Devices? But that shouldn't cause this, right? Please help! eeyore-diagnostics-20210504-1719.zip
-
I am getting the following in my log file constantly:
Mar 8 09:58:03 Eeyore kernel: caller _nv000708rm+0x1af/0x200 [nvidia] mapping multiple BARs
Mar 8 09:58:04 Eeyore kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
Mar 8 09:58:04 Eeyore kernel: caller _nv000708rm+0x1af/0x200 [nvidia] mapping multiple BARs
Mar 8 09:58:05 Eeyore kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
Mar 8 09:58:05 Eeyore kernel: caller _nv000708rm+0x1af/0x200 [nvidia] mapping multiple BARs
Mar 8 09:58:07 Eeyore kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
What do I have configured incorrectly? eeyore-diagnostics-20210308-0959.zip
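When a log is dominated by a few repeating messages like these, a quick count can show which ones are responsible. The following is a rough Python sketch, not a definitive tool: `top_messages` and the `PREFIX` pattern are hypothetical names of my own, and the timestamp format is assumed from the lines quoted in this post.

```python
import re
from collections import Counter

# Strips the leading "Mon D HH:MM:SS hostname " part of a syslog line so
# that identical messages logged at different times compare equal.
# The format is assumed from the excerpt in this post.
PREFIX = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2} \S+ ")

def top_messages(lines, n=5):
    """Return the n most frequent messages, ignoring timestamp and host."""
    counts = Counter(
        PREFIX.sub("", line.strip()) for line in lines if line.strip()
    )
    return counts.most_common(n)

# Messages copied from this post, with the timestamps that appear in it.
CALLER = "kernel: caller _nv000708rm+0x1af/0x200 [nvidia] mapping multiple BARs"
SANITY = ("kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], "
          "which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]")
sample = [
    f"Mar 8 09:58:03 Eeyore {CALLER}",
    f"Mar 8 09:58:04 Eeyore {SANITY}",
    f"Mar 8 09:58:04 Eeyore {CALLER}",
    f"Mar 8 09:58:05 Eeyore {SANITY}",
    f"Mar 8 09:58:05 Eeyore {CALLER}",
    f"Mar 8 09:58:07 Eeyore {SANITY}",
]

for message, count in top_messages(sample):
    print(count, message)
```

To use it on a real log, pass an open file object for the syslog in place of `sample`; the path to that file depends on the system and is not assumed here.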
-
I just wanted to come back and report that uninstalling atop did the trick. My log has been sitting at 1% Full for the past two weeks. Thank you, @JorgeB and @trurl!
-
Yep, that is what I decided after the first response.
-
Thanks! I will remove it and see what develops!
-
Can someone please help me figure out why my log is filling up? Please let me know if there is anything else I can provide to help solve this issue. I would prefer not to reboot every week to keep the log from hitting 100%! Thank you, Matt eeyore-diagnostics-20201118-0841.zip
-
I added a second NVIDIA card to my box and now I cannot get my Windows 10 VM to start up. The monitor keeps flashing on and off every minute, as though it sends a signal and then stops. I cannot find anything in the logs. ANY help would be greatly appreciated. eeyore-diagnostics-20200927-1747.zip
-
Windows 10 VM - Is it possible to get WSL2 to work?
Mantene replied to Mantene's topic in VM Engine (KVM)
I think the @limetech people need to fix this. The new build of Windows 10 still does not allow AMD procs to use Hyper-V.
-
Windows 10 VM - Is it possible to get WSL2 to work?
Mantene replied to Mantene's topic in VM Engine (KVM)
Nothing from me. I keep hoping some Unraid update will fix this.
-
Until that is a feature, I would think it should be possible to build WireGuard into the containers. I haven't experimented with that, but I would imagine it should be possible. I could be very, very wrong though.
-
Docker Desktop (Hyper-V) inside (nested) Windows 10 VM
Mantene replied to k11su's topic in VM Engine (KVM)
No one has found a solution to this yet?
-
Windows 10 VM - Is it possible to get WSL2 to work?
Mantene replied to Mantene's topic in VM Engine (KVM)
So I am guessing no one can help. Okay. I will keep playing around.
-
Windows 10 VM - Is it possible to get WSL2 to work?
Mantene replied to Mantene's topic in VM Engine (KVM)
Does anyone have WSL2 working with an AMD processor?
-
Windows 10 VM - Is it possible to get WSL2 to work?
Mantene replied to Mantene's topic in VM Engine (KVM)
WSL1 works. The better WSL2 does not.