Trylo Posted September 18, 2018

Hi! Last night my Unraid machine restarted and "Fix Common Problems" gave me this message:

"Your server has detected hardware errors. You should install mcelog via the NerdPack plugin, post your diagnostics and ask for assistance on the unRaid forums. The output of mcelog (if installed) has been logged."

Please let me know if it's something serious. Diagnostics attached. Thank you in advance!

nas-diagnostics-20180918-1154.zip
Frank1940 Posted September 18, 2018 (edited)

One thing that stands out is that you have a 250GB SSD as your cache drive. You then attempted to copy 350GB of files to the array. I am guessing that the User Share that was the target for this copy is set to use the cache drive! GUESS WHAT!!!! It filled up. While this should not have caused a restart, you should probably not use the cache drive for this User Share until this copy is finished.

Edited September 18, 2018 by Frank1940
Trylo Posted September 18, 2018 (Author)

Yeah, it filled up, that doesn't surprise me. After it fills up, it copies the rest of the files straight to the HDD. I have done this a few times and it never resulted in a restart.
Frank1940 Posted September 18, 2018

There is also this series of events during the boot process:

Sep 18 00:14:47 NAS kernel: mce: [Hardware Error]: Machine check events logged
Sep 18 00:14:47 NAS kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: fa000010000b0c0f
Sep 18 00:14:47 NAS kernel: mce: [Hardware Error]: TSC 0 MISC d012000001000000
Sep 18 00:14:47 NAS kernel: mce: [Hardware Error]: PROCESSOR 2:660f01 TIME 1537222472 SOCKET 0 APIC 0 microcode 600610e
Sep 18 00:14:47 NAS kernel: Performance Events: Fam15h core perfctr, AMD PMU driver.
Sep 18 00:14:47 NAS kernel: ... version: 0
Sep 18 00:14:47 NAS kernel: ... bit width: 48
Sep 18 00:14:47 NAS kernel: ... generic registers: 6
Sep 18 00:14:47 NAS kernel: ... value mask: 0000ffffffffffff
Sep 18 00:14:47 NAS kernel: ... max period: 00007fffffffffff
Sep 18 00:14:47 NAS kernel: ... fixed-purpose events: 0
Sep 18 00:14:47 NAS kernel: ... event mask: 000000000000003f
Sep 18 00:14:47 NAS kernel: Hierarchical SRCU implementation.
Sep 18 00:14:47 NAS kernel: smp: Bringing up secondary CPUs ...
Sep 18 00:14:47 NAS kernel: x86: Booting SMP configuration:
Sep 18 00:14:47 NAS kernel: .... node #0, CPUs: #1 #2 #3

I am not sure what is causing it. Hopefully, someone else will have a better feel for what is happening here...

One thing you can do is Google

mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: fa000010000b0c0f

and look at the results. You could also search for the other mce errors and see if there is some commonality.
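Before searching, it can help to pull the status value from the mce line in the log above apart. The following is only a rough sketch: it decodes the architectural flag bits in the top byte of the x86 machine-check status register and the low 16-bit error code, and converts the TIME field (seconds since the Unix epoch) to UTC. The helper names `decode_mce_line` and `mce_time_utc` are made up for illustration, and the sketch deliberately does not interpret the vendor-specific bits, which need the CPU vendor's documentation or mcelog.

```python
import re
from datetime import datetime, timezone

# Architectural flag bits in the top byte of an x86 machine-check status register.
STATUS_FLAGS = {
    63: "VAL",    # register holds a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # reporting was enabled for this error
    59: "MISCV",  # MISC register holds extra information
    58: "ADDRV",  # ADDR register holds a valid address
    57: "PCC",    # processor context may be corrupt
}

# Matches the "CPU n: Machine Check: n Bank n: <16 hex digits>" kernel line.
LINE_RE = re.compile(
    r"CPU (?P<cpu>\d+): Machine Check: \d+ "
    r"Bank (?P<bank>\d+): (?P<status>[0-9a-f]{16})"
)


def decode_mce_line(line):
    """Hypothetical helper: split one kernel mce line into its basic fields."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    status = int(match.group("status"), 16)
    flags = [
        name
        for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
        if (status >> bit) & 1
    ]
    return {
        "cpu": int(match.group("cpu")),
        "bank": int(match.group("bank")),
        "flags": flags,
        "error_code": status & 0xFFFF,  # low 16 bits: MCA error code
    }


def mce_time_utc(epoch_seconds):
    """Hypothetical helper: convert the TIME field to an aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


if __name__ == "__main__":
    line = ("Sep 18 00:14:47 NAS kernel: mce: [Hardware Error]: "
            "CPU 0: Machine Check: 0 Bank 4: fa000010000b0c0f")
    print(decode_mce_line(line))
    print(mce_time_utc(1537222472).isoformat())
```

For the line posted above, this reports bank 4 with the UC and PCC flags set, which marks an uncorrected error, and a TIME of 2018-09-17 22:14:32 UTC. If the server's clock runs at UTC+2, that would put the event a few seconds before the 00:14:47 boot messages, but that offset is a guess.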
trurl Posted September 18, 2018

2 hours ago, Trylo said:
"Yeah, it filled up, that doesn't surprise me. After it fills up, it copies the rest of the files straight to the HDD. I have done this a few times and it never resulted in a restart."

You should set Minimum Free on the cache drive to larger than the largest file you expect to write to cache. Then, if you are writing to a cached user share, it will go ahead and overflow to the array before you actually fill up cache and get an error. Minimum Free for cache is in Global Share Settings.

And of course, each user share has its own Minimum Free setting, which should be set to larger than the largest file you expect to write to the user share. Unraid has no way to know how large a file will become when it chooses a disk to write it to. If there is less than Minimum Free on a disk, it will choose another.

None of this should have anything to do with your restart though. Have you done a memtest?
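The Minimum Free rule described above can be sketched in a few lines. This is an illustration of the idea only, not Unraid's actual allocation code: `pick_target`, the disk names, and the sizes are all made up, and the real allocation methods take more into account than a single threshold.

```python
GB = 10**9


def pick_target(free_bytes, min_free_bytes, order):
    """Hypothetical sketch: return the first disk in `order` that still has
    at least its Minimum Free available, or None if no disk qualifies.

    The file's final size is unknown when the disk is chosen, so the only
    test available is "free space >= Minimum Free".
    """
    for disk in order:
        if free_bytes[disk] >= min_free_bytes[disk]:
            return disk
    return None


if __name__ == "__main__":
    # Assumed numbers: Minimum Free of 20 GB on both cache and disk1.
    minimum = {"cache": 20 * GB, "disk1": 20 * GB}

    # Cache is nearly full, so a new file overflows to the array.
    nearly_full = {"cache": 5 * GB, "disk1": 900 * GB}
    print(pick_target(nearly_full, minimum, ["cache", "disk1"]))

    # Cache has room again, so it is chosen first.
    roomy = {"cache": 100 * GB, "disk1": 900 * GB}
    print(pick_target(roomy, minimum, ["cache", "disk1"]))
```

With Minimum Free set to 0, the first case would still pick the cache, and a file larger than the remaining 5 GB would fail partway through the write. That is why the threshold should exceed the largest file you expect to write.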
Trylo Posted September 18, 2018 (Author)

So Google says it's either a bad CPU or bad microcode on the motherboard. Can you do memtest headless? If not, taking the NAS apart and installing a GPU in it will take me around 2 hours...
trurl Posted September 18, 2018

59 minutes ago, Trylo said:
"Can you do memtest headless?"

No. Maybe or maybe not if you have IPMI, but I have never had that on any of my builds, so I don't know. Some googling suggests it might be possible, but if you don't have IPMI either, it's moot.
JorgeB Posted September 18, 2018

25 minutes ago, trurl said:
"Maybe or maybe not if you have IPMI"

Just for reference, anything you can do with a monitor/keyboard you can do with IPMI. IMO it's one of the best things ever for servers/NAS. Once I got my first IPMI board and got used to it, it took very little time until I replaced all my other servers with IPMI-enabled boards.