August 20, 2019

Hi everyone,

I am having a few issues with my Unraid server since I moved the disks into a new platform. The new system is X370/Ryzen 1st gen based, and initially I had a problem with random system freezes. Since setting rcu_nocbs=0-5 (Ryzen 5 1600), as per this post, the system does not hang anymore.

However, now the array is randomly dropping offline every few hours (without the server hanging) and I am clueless as to why. I had a look in the logs and apparently the CPU is throwing some MCE codes, which is of course rather disconcerting. Since they appeared after the above-mentioned fix, I am wondering if the two are connected?

Aug 20 06:14:45 VAULT kernel: mce: [Hardware Error]: Machine check events logged
Aug 20 06:14:45 VAULT kernel: mce: [Hardware Error]: CPU 7: Machine Check: 0 Bank 5: bea0000000000108
Aug 20 06:14:45 VAULT kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffff816560e2 MISC d012000200000000 SYND 4d000000 IPID 500b000000000
Aug 20 06:14:45 VAULT kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1566274467 SOCKET 0 APIC 3 microcode 8001137

System:
Gigabyte Aorus X370 G5
Ryzen 5 1600
16 GiB DDR4-2666
3x HDD / 2x SATA SSD / 1x NVMe SSD
Unraid 6.7.2 2019-06-25

Plugins: Community Applications, Fix common problems, Dynamix SSD trim, Dynamix File Integrity, PreClear Disks, Nerdtools

Docker: Plex, Transmission

Diagnostics: here

Edit: I just realized I had rcu_nocbs=0-5 set for cores instead of threads. I have now changed it to rcu_nocbs=0-11 and will report back if the problem goes away.

Edit 2: Problem persists.

Edited August 20, 2019 by Fiffty
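For anyone trying to read the Bank 5 value in the log (bea0000000000108), here is a rough Python sketch that pulls out the flag bits of an x86 machine-check status word. It only covers the top flag bits (63 to 57) and the low 16-bit error code, which as far as I know are laid out the same way on AMD and Intel. The AMD-specific fields in between are deliberately left alone, and decode_mci_status is just a name made up for this example, not part of any tool.

```python
# Sketch: decode the vendor-neutral flag bits of an x86 MCi_STATUS value.
# Assumption: only bits 63..57 and the low 16-bit error code are interpreted;
# vendor-specific fields (e.g. AMD extended error codes) are ignored.

# Bit position -> flag name.
FLAGS = {
    63: "VAL",    # status register contains a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # reporting for this error was enabled
    59: "MISCV",  # MISC register holds valid data
    58: "ADDRV",  # ADDR register holds valid data
    57: "PCC",    # processor context may be corrupted
}


def decode_mci_status(status: int) -> dict:
    """Return the set flag names and the low 16-bit MCA error code."""
    flags = [name for bit, name in FLAGS.items() if (status >> bit) & 1]
    return {"flags": flags, "error_code": status & 0xFFFF}


# Status value taken from the "Bank 5" log line in the post.
result = decode_mci_status(0xBEA0000000000108)
print(result["flags"], hex(result["error_code"]))
```

This does not say which component is at fault. It only shows that the entry is flagged valid, uncorrected, and with ADDR/MISC data present, which matches the ADDR and MISC fields appearing on the following log line.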