November 1, 2017

Hi all,

I have been running an unRAID server on an Intel Core-i5 7640X and it runs fine. However, I replaced the processor with a Core-i9 7940X (with no other hardware changes), and I get a kernel panic on boot (shown below).

I suspected a hardware problem with the processor, but I wanted to eliminate unRAID as a factor, as this is quite a new processor (it's only been out for about a month), so I put a spare hard drive in and installed Windows 10, and to my surprise it ran fine. I even ran the Intel Processor Diagnostic Tool, and everything passed. I also ran the Windows Memory Diagnostic Tool, just in case, with no errors, and I ran the machine through a game just to stress it a bit, but no issues emerged; the hardware would seem to be completely fine by every test.

I'm not quite sure what else to try. It would seem to be an unRAID issue, since the processor works fine and passes stress tests with Windows, and the other hardware works fine with unRAID.

Has anyone else tried unRAID with one of the newer Core-i9 processors? The 7940X I'm using is the 14-core model, but I'd be happy to hear about the 18-, 16-, 12-, and 10-core models, or the "X" models of the i7s and i5s that use the LGA2066 socket.

I would appreciate your help. I'm pretty desperate at this point for anything else to try.

Other details / things I've tried:

The exact kernel panic shown is as follows:

mce: [Hardware Error]: CPU 0: Machine Check Exception: 5 Bank 0: b200000000070005
mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff810d2560> {mark_page_accessed+0x118/0x132}
mce: [Hardware Error]: TSC 1db62fe240
mce: [Hardware Error]: PROCESSOR 0:50654 TIME 1509237128 SOCKET 0 APIC 8 microcode 200002b
mce: [Hardware Error]: Run the above through 'mcelog --ascii'
mce: [Hardware Error]: Machine check: processor context corrupt
Kernel panic - not syncing: Fatal machine check
Kernel Offset: disabled
Rebooting in 30 seconds..

The kernel panic doesn't always appear at the same point in the boot. This time, it appeared following "cpuload started". Previously, it appeared following "Disabling lock debugging due to kernel taint". It seems fairly random each boot.

The panic is slightly different each time; previously, for example, instead of {mark_page_accessed... it was {get_page_from_freelist+0x1ef/0x78f}. Various hexadecimal values are different each time.

I am running unRAID 6.3.5 with the "Plus" license.

No overclocking, XMP, etc., and I have tried reverting all BIOS settings to their defaults, to no effect.

I put the i5 7640X back in and verified that it still works fine with that processor (i.e., to confirm that another component didn't just happen to fail when I swapped the processors), and then put the i9 7940X in again and verified that it still kernel panics.

I tried disabling every core of the i9 except one, and then again with a different core enabled, but it had no effect.

Thermals are all good, with the processor at around 30 °C at idle and 55 °C when stressed. It has an AIO cooler on it.

Around the internet, I found a suggestion that this may be related to the "TSC clock source" and that I could try using a different clock source, but I couldn't figure out how to do this, and I'm not sure whether I'm really on the right track.

Since this machine doesn't boot, I can't "run the above through mcelog --ascii", and my other unRAID box (on completely different hardware) doesn't seem to have a program called "mcelog" installed anyway, so I don't really understand what it's suggesting I do here.

Other hardware:

MSI X299 Tomahawk AC motherboard
Corsair TX850M power supply
16 GB RAM (normally 64 GB; the other sticks have been removed for the time being, and I also tried a different stick)
1 TB SanDisk Ultra II SSD
MSI GT 710 graphics card

Normally, I have a five-disk array of hard disks and two additional graphics cards installed, but they are all removed for the moment to make sure they aren't causing any problems.
If any additional information or tests I can run would be helpful I would be happy to provide them.
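
For anyone reading a panic like this without mcelog to hand: the raw numbers can be partly decoded by hand. The sketch below is only a rough aid, not a replacement for mcelog. It assumes the "Bank 0" value is the bank's status register with the flag bits laid out as I understand them from the Intel SDM's machine-check chapter, that the "PROCESSOR 0:50654" field is the CPUID signature in hex, and that TIME is a Unix timestamp. The function names and the small error-code table are my own and cover only a handful of the simple codes.

```python
from datetime import datetime, timezone

# Architectural flag bits of a machine-check bank status register
# (layout assumed from the Intel SDM; not taken from the original post).
STATUS_FLAGS = {
    63: "VAL",    # register holds a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # MISC register holds extra information
    58: "ADDRV",  # ADDR register holds an address
    57: "PCC",    # processor context corrupt
}

# A few "simple" MCA error codes; anything else is reported as unknown.
SIMPLE_MCA_CODES = {
    0x0001: "unclassified",
    0x0002: "microcode ROM parity error",
    0x0003: "external error",
    0x0004: "FRC error",
    0x0005: "internal parity error",
}


def decode_status(status):
    """Split a 64-bit bank status value into flags and error codes."""
    flags = [name for bit, name in STATUS_FLAGS.items() if (status >> bit) & 1]
    mca_code = status & 0xFFFF
    return {
        "flags": flags,
        "mca_code": mca_code,
        "mca_meaning": SIMPLE_MCA_CODES.get(mca_code, "unknown (not in this table)"),
        "model_specific_code": (status >> 16) & 0xFFFF,
    }


def decode_cpuid(signature):
    """Return (family, model, stepping) from a CPUID signature."""
    stepping = signature & 0xF
    model = (signature >> 4) & 0xF
    family = (signature >> 8) & 0xF
    if family in (6, 15):
        # The extended model field supplies the high nibble of the model.
        model |= ((signature >> 16) & 0xF) << 4
    if family == 15:
        family += (signature >> 20) & 0xFF
    return family, model, stepping


if __name__ == "__main__":
    # Values copied from the panic output in the post.
    print(decode_status(0xB200000000070005))
    print("family %d, model 0x%x, stepping %d" % decode_cpuid(0x50654))
    print(datetime.fromtimestamp(1509237128, tz=timezone.utc).isoformat())
```

For the values in this panic, the status word has VAL, UC, EN and PCC set, which matches the "processor context corrupt" line in the log, and the low 16 bits are 0x0005. If the table above is right, that is an internal parity error reported by the processor itself, though the model-specific part (0x0007) needs Intel's per-model documentation to interpret.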
November 3, 2017 Author

So, when I booted up unRAID again (still 6.3.5), it just worked. I didn't do anything. I spent two weeks trying to figure out what was wrong with the darn thing, and you'd think I'd be happy that it works now, yet somehow I am just kind of annoyed.

So, false alarm, I guess? Maybe Windows 10 updated the firmware of some hardware component without me realizing it?

Anyway, thanks for your help.