Rbby258 Posted March 16, 2023 (edited)

Hi, I've been a longtime Unraid user. I recently upgraded my system to a 13900K and everything had been working fine, but now I'm having an issue. I first noticed it when my VM didn't boot and reported that a core was missing. I checked the processor cores under System Devices, and cores 12 and 13 are not there, so only 30 cores are listed in total. The BIOS is stock in terms of voltages and cores. Booting bare metal into Windows, all 32 cores are present, and all benchmarks run perfectly, including about 5 hours of gaming last night.

Today I've tried updating the BIOS, swapping memory, and recreating the Unraid USB and copying over my config folder. I'm not sure what else to do. Thanks in advance.

I also just ran a memtest, which passed fine.

tower-diagnostics-20230316-1916.zip
Squid Posted March 16, 2023

Bad CPU?

Mar 16 19:15:21 Tower mcelog: CPU 12 on socket 0 has large number of corrected cache errors in Level-2 Instruction
Mar 16 19:15:21 Tower mcelog: System operating correctly, but might lead to uncorrected cache errors soon
Mar 16 19:15:21 Tower mcelog: Running trigger `cache-error-trigger' (reporter: yellow)
Mar 16 19:15:21 Tower mcelog: Too many trigger children running already
Mar 16 19:15:21 Tower mcelog: CPU 13 on socket 0 has large number of corrected cache errors in Level-2 Instruction
Mar 16 19:15:21 Tower mcelog: System operating correctly, but might lead to uncorrected cache errors soon
Mar 16 19:15:21 Tower mcelog: Running trigger `cache-error-trigger' (reporter: yellow)
Mar 16 19:15:21 Tower mcelog: Too many trigger children running already
Mar 16 19:15:21 Tower mcelog: Offlining CPU 12 due to cache error threshold

Tons of MCE errors on CPU 12 during the boot process, and the kernel disabled the core to protect the system's integrity.
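To see at a glance which CPUs are affected, the mcelog lines in a syslog can be tallied per CPU. The following is a minimal Python sketch, not part of the original posts. It assumes the message wording matches the lines quoted above; the function name `summarize_mcelog` and the `sample` list are hypothetical, introduced only for illustration.

```python
import re
from collections import Counter

# Patterns follow the mcelog wording quoted in this thread (an assumption:
# other mcelog versions may phrase these messages differently).
CACHE_ERR = re.compile(
    r"mcelog: CPU (\d+) on socket \d+ has large number of corrected cache errors"
)
OFFLINED = re.compile(r"mcelog: Offlining CPU (\d+) due to cache error threshold")


def summarize_mcelog(lines):
    """Return (warning counts per CPU, sorted list of offlined CPUs)."""
    warnings = Counter()
    offlined = set()
    for line in lines:
        match = CACHE_ERR.search(line)
        if match:
            warnings[int(match.group(1))] += 1
            continue
        match = OFFLINED.search(line)
        if match:
            offlined.add(int(match.group(1)))
    return dict(warnings), sorted(offlined)


# Three of the lines quoted in the post above, used as sample input.
sample = [
    "Mar 16 19:15:21 Tower mcelog: CPU 12 on socket 0 has large number of "
    "corrected cache errors in Level-2 Instruction",
    "Mar 16 19:15:21 Tower mcelog: CPU 13 on socket 0 has large number of "
    "corrected cache errors in Level-2 Instruction",
    "Mar 16 19:15:21 Tower mcelog: Offlining CPU 12 due to cache error threshold",
]

warnings, offlined = summarize_mcelog(sample)
print("corrected cache error warnings per CPU:", warnings)
print("CPUs offlined by mcelog:", offlined)
```

To run it against a real log, pass the lines of the syslog file from the diagnostics zip to `summarize_mcelog` instead of `sample`.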
apandey Posted March 17, 2023 (edited)

Yes, it looks like a CPU problem. Since you see all cores when booting Windows, try running a CPU benchmark and look for any errors.

One idea is to disable hyper-threading, run the Prime95 single-thread test, then go to Task Manager and set the CPU affinity to only the one core you are testing. Watch for stats like killed threads.

There might be other tools, like OCCT, that can help verify this as well. You just want to test core stability.
Rbby258 Posted March 17, 2023 (Author)

I'll look into this further later today. It seems weird that everything functions fine when booting into Windows. Maybe a warped CPU with bad contact to the pins? I've never seen a CPU sort of half work, especially with everything set to default + XMP. Thanks guys, I will post any updates.
Rbby258 Posted March 19, 2023 (Author, Solution, edited)

Is there a way I can temporarily stop the kernel from disabling the core? I've been using OCCT in bare metal Windows and got the system stable, but I still can't get those cores back. It was working originally. Running "cd /sys/devices/system/cpu" followed by "ls" shows all 32 cores.

Edit: I've figured it out. I added "mce=off" under Syslinux configuration. The cores are isolated and used only by a VM, so for the meantime I'm going to run like this and see how it goes.
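A note on the "ls" check above: on Linux the cpuN directories generally remain under /sys/devices/system/cpu even after a CPU has been taken offline, so listing them does not show which CPUs are actually usable. The `present` and `online` files in that directory hold CPU lists that can be compared instead. Below is a minimal Python sketch of that comparison, not something from the original posts. The helper names `parse_cpu_list` and `offline_cpus` are hypothetical, and the list strings in the usage note are illustrative values rather than output captured from this system.

```python
from pathlib import Path

# Standard sysfs location for CPU topology and hotplug state on Linux.
CPU_SYSFS = Path("/sys/devices/system/cpu")


def parse_cpu_list(text):
    """Parse a kernel CPU list such as '0-11,14-31' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file or trailing comma yields no CPUs
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def offline_cpus(present_text, online_text):
    """Return the sorted CPUs that are present but not online."""
    return sorted(parse_cpu_list(present_text) - parse_cpu_list(online_text))


if __name__ == "__main__":
    present = CPU_SYSFS / "present"
    online = CPU_SYSFS / "online"
    # Only read sysfs when it exists, so the sketch also runs on non-Linux hosts.
    if present.exists() and online.exists():
        print("offline CPUs:", offline_cpus(present.read_text(), online.read_text()))
```

For example, `offline_cpus("0-31", "0-11,14-31")` returns `[12, 13]`, which would correspond to the two missing cores described in the first post if the files contained those values.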