Muath (Author), posted February 24, 2020:
On 2/19/2020 at 6:46 PM, Dissones4U said: This may be elementary but did you try to remove the new hardware and revert to the prior "working" configuration?
I thought it was very unlikely to be the cause, so I forgot about it, but I will try it this week.
* The parity check is now stuck for the third time in a row 😞.
Muath (Author), posted March 2, 2020 (edited March 2, 2020):
So the parity check will take 3 whole years to finish 😞. This is the 4th time I have started the parity check and it got stuck.
* I just noticed that one of the threads is stuck too!
Attachment: moathcenterr-diagnostics-20200302-2154.zip
JorgeB, posted March 2, 2020:
Very strange, there's nothing in the log. If you pause/unpause, do you get the same nginx errors as before?
Muath (Author), posted March 2, 2020 (edited March 2, 2020):
2 hours ago, johnnie.black said: Very strange, there's nothing on the log, if you pause/unpause do you get the same nginx errors as before?
Yes, but I don't think nginx is the cause.
Mar 3 00:04:02 MoathCenterr kernel: mdcmd (63): nocheck Pause
Mar 3 00:05:35 MoathCenterr nginx: 2020/03/03 00:05:35 [error] 5831#5831: *1630028 connect() to unix:/var/run/emhttpd.socket failed (11: Resource temporarily unavailable) while connecting to upstream, client: 192.168.100.35, server: , request: "POST /update.htm HTTP/1.1", upstream: "http://unix:/var/run/emhttpd.socket:/update.htm", host: "moathcenterr", referrer: "http://moathcenterr/Main"
(Video Recording)
Parity Check History didn't show the last 4 hung sessions, only the one I canceled.
JorgeB, posted March 3, 2020:
14 hours ago, Muath said: Yes, but I don't think the nginx is the reason.
Yes, you're right. This is an issue I've never seen before, and I have no idea what the problem is or how to diagnose it. Maybe @limetech has some ideas.
itimpi, posted March 3, 2020:
To eliminate as many variables as possible: do you get the same symptoms if you disable the Docker and VM services under Settings and then reboot in Safe Mode to suppress plugins?
Muath (Author), posted March 6, 2020 (edited March 6, 2020):
On 3/3/2020 at 2:57 PM, johnnie.black said: Yes, you're right, this is an issue I've never seen before and have no idea what's the problem or how to diagnose it, maybe @limetech has some ideas.
Thank you very much for your help.
On 3/3/2020 at 3:06 PM, itimpi said: To try and eliminate as many variables as possible do you get the same symptoms if you disable the docker and VM services under Settings and then reboot in Safe Mode to suppress plugins?
Now, on the fifth try, the parity check completed with no issues! I didn't do much: I just swapped the GPU for an old one and cleaned the fans. I'm not sure whether the issue is fixed, so I will be back next month, during the next parity check, if anything happens.
Thank you everyone.
Muath (Author), posted April 16, 2020 (edited April 24, 2020):
My system kept hanging from time to time over the last month, and since there was no more info I could gather, I didn't update my issue here. Sometimes a few threads hang and the system keeps running, but other times all the threads hang and I have to force a restart.
But now Fix Common Problems has detected hardware errors, right after a parity check was suddenly triggered!
Error logs:
Apr 16 11:57:06 MoathCenterr kernel: mce: [Hardware Error]: Machine check events logged
Apr 16 11:57:06 MoathCenterr kernel: mce: [Hardware Error]: CPU 10: Machine Check: 0 Bank 5: bea0000000000108
Apr 16 11:57:06 MoathCenterr kernel: mce: [Hardware Error]: TSC 0 ADDR 14798839663c MISC d010000000000000 SYND 4d000000 IPID 500b000000000
Apr 16 11:57:06 MoathCenterr kernel: mce: [Hardware Error]: PROCESSOR 2:870f10 TIME 1587027393 SOCKET 0 APIC 5 microcode 8701013
Apr 16 12:07:25 MoathCenterr root: Fix Common Problems: Error: Machine Check Events detected on your server
Apr 16 12:07:25 MoathCenterr root: mcelog: ERROR: AMD Processor family 23: mcelog does not support this processor. Please use the edac_mce_amd module instead.
Apr 16 14:13:07 MoathCenterr kernel: Plex Script Hos[29328]: segfault at 0 ip 000014f2ee0a8d37 sp 000014f2e540e130 error 4 in libpython2.7.so.1.0[14f2edf71000+19f000]
Could it be a CPU failure? 😥
UPDATE: the logs below keep appearing from time to time and trigger a parity check.
Attachments: moathcenterr-diagnostics-20200424-0508.zip, moathcenterr-diagnostics-20200416-1648.zip
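Editor's sketch, for readers who want to look inside the status word reported in the `Bank 5` line of these logs. It assumes the upper bits of the value follow the common x86 machine-check status layout (VAL at bit 63, OVER at 62, UC at 61, EN at 60, MISCV at 59, ADDRV at 58, PCC at 57) and that the low 16 bits hold the error code. The function name `decode_mca_status` is made up for this example, and the vendor-specific fields are deliberately left undecoded.

```python
# Hypothetical helper: decode the generic flag bits of a machine-check
# status word as printed by the kernel (e.g. "Bank 5: bea0000000000108").
# Bit positions are an assumption based on the common x86 MCA layout.
MCA_STATUS_FLAGS = {
    63: "VAL",    # entry is valid
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # reporting was enabled for this error
    59: "MISCV",  # MISC register holds valid data
    58: "ADDRV",  # ADDR register holds valid data
    57: "PCC",    # processor context may be corrupt
}


def decode_mca_status(status_hex: str) -> dict:
    """Return the set flag names (high bit first) and the low 16-bit error code."""
    value = int(status_hex, 16)
    flags = [
        name
        for bit, name in sorted(MCA_STATUS_FLAGS.items(), reverse=True)
        if (value >> bit) & 1
    ]
    return {"flags": flags, "error_code": value & 0xFFFF}


if __name__ == "__main__":
    result = decode_mca_status("bea0000000000108")
    print(result["flags"], hex(result["error_code"]))
```

Under that assumed layout, the UC and PCC bits being set would be consistent with an uncorrected error rather than a harmless corrected one, but the sketch says nothing about which component is at fault.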
Muath (Author), posted April 24, 2020:
This is becoming annoying:
Apr 24 04:58:20 MoathCenterr kernel: mce: [Hardware Error]: Machine check events logged
Apr 24 04:58:20 MoathCenterr kernel: mce: [Hardware Error]: CPU 10: Machine Check: 0 Bank 5: bea0000000000108
Apr 24 04:58:20 MoathCenterr kernel: mce: [Hardware Error]: TSC 0 ADDR 14b767dc2084 MISC d012000100000000 SYND 4d000000 IPID 500b000000000
Apr 24 04:58:20 MoathCenterr kernel: mce: [Hardware Error]: PROCESSOR 2:870f10 TIME 1587693467 SOCKET 0 APIC 5 microcode 8701013
Apr 24 05:03:30 MoathCenterr root: Fix Common Problems: Error: Machine Check Events detected on your server
Apr 24 05:03:30 MoathCenterr root: mcelog: ERROR: AMD Processor family 23: mcelog does not support this processor. Please use the edac_mce_amd module instead.
Apr 24 05:08:22 MoathCenterr root: Fix Common Problems: Error: Machine Check Events detected on your server
Apr 24 05:08:22 MoathCenterr root: mcelog: ERROR: AMD Processor family 23: mcelog does not support this processor. Please use the edac_mce_amd module instead.
Apr 24 18:12:33 MoathCenterr kernel: CPU: 10 PID: 5903 Comm: unraidd0 Tainted: G O 4.19.107-Unraid #1
Apr 24 18:12:33 MoathCenterr kernel: Call Trace:
Apr 24 18:13:04 MoathCenterr kernel: CPU: 5 PID: 1727 Comm: scsi_eh_10 Tainted: G D O 4.19.107-Unraid #1
Apr 24 18:13:04 MoathCenterr kernel: Call Trace:
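Editor's sketch: since the same machine-check lines keep recurring, it can help to count them per CPU, bank, and status word to see whether one core stands out. The snippet below is a minimal illustration that only matches the `CPU N: Machine Check: ... Bank N: <status>` line format seen in this thread. The name `summarize_mce` and the regular expression are assumptions for this example, not part of any Unraid tool.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   "... mce: [Hardware Error]: CPU 10: Machine Check: 0 Bank 5: bea0000000000108"
MCE_LINE = re.compile(
    r"mce: \[Hardware Error\]: CPU (?P<cpu>\d+): "
    r"Machine Check: \d+ Bank (?P<bank>\d+): (?P<status>[0-9a-f]+)"
)


def summarize_mce(lines):
    """Count machine-check events keyed by (cpu, bank, status)."""
    counts = Counter()
    for line in lines:
        match = MCE_LINE.search(line)
        if match:
            key = (int(match["cpu"]), int(match["bank"]), match["status"])
            counts[key] += 1
    return counts


if __name__ == "__main__":
    # Sample lines copied from the logs posted in this thread.
    sample = [
        "Apr 16 11:57:06 MoathCenterr kernel: mce: [Hardware Error]: CPU 10: Machine Check: 0 Bank 5: bea0000000000108",
        "Apr 24 04:58:20 MoathCenterr kernel: mce: [Hardware Error]: CPU 10: Machine Check: 0 Bank 5: bea0000000000108",
        "Apr 24 05:03:30 MoathCenterr root: Fix Common Problems: Error: Machine Check Events detected on your server",
    ]
    for (cpu, bank, status), n in summarize_mce(sample).items():
        print(f"CPU {cpu} bank {bank} status {status}: {n} event(s)")
```

To run it against a real log, the sample list could be replaced with the lines of the syslog file from a diagnostics archive; the exact file path depends on the system and is not assumed here.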
JorgeB, posted April 24, 2020:
That's a hardware issue, most likely RAM, CPU or board related.
Muath (Author), posted April 24, 2020 (edited April 24, 2020):
6 minutes ago, johnnie.black said: That's a hardware issue, most likely RAM, CPU or board related.
I changed the RAM, GPU and motherboard, so it's most likely a CPU issue! Or could it be that the RAM isn't compatible with the AMD CPU?
JorgeB, posted April 24, 2020:
8 minutes ago, Muath said: could it be RAM not supporting AMD CPU?
Make sure you're respecting the max officially supported RAM speed for your config; exceeding it is a common source of problems with Ryzen.
Hoopster, posted April 24, 2020:
9 minutes ago, Muath said: I changed the RAMs and motherboard, so most likely CPU issue! or could it be RAM not supporting AMD CPU?
Is the RAM on the QVL for your motherboard? Sometimes, certain Ryzen CPU/motherboard combinations have problems if all four RAM slots are occupied. They become very picky about RAM speed and only support certain speeds before they become unstable.
Are you overclocking the RAM at all, or are you running it at the stock RAM speed?
There will probably be a chart in your motherboard manual that shows which RAM speeds are supported depending on which and how many RAM slots are populated.
Muath (Author), posted May 10, 2020 (edited July 7, 2020):
I'm using this RAM: https://www.amazon.com/G-SKILL-TridentZ-288-Pin-3000MHz-F4-3000C16D-16GTZR/dp/B06WP4L3D7/
and I also tried: https://www.newegg.com/g-skill-16gb-288-pin-ddr4-sdram/p/N82E16820232290
It is not overclocked. I was using 4 sticks, then removed 2 and swapped them between the slots, but the issue remains... actually, it's getting worse!
(I will keep updating this comment with new logs so that it can be found by anyone searching for them 👁️🗨️)
Jun 24 18:18:05 MoathCenterr kernel: mce: [Hardware Error]: Machine check events logged
Jun 24 18:18:05 MoathCenterr kernel: mce: [Hardware Error]: CPU 10: Machine Check: 0 Bank 5: bea0000000000108
Jun 24 18:18:05 MoathCenterr kernel: mce: [Hardware Error]: TSC 0 ADDR 4206c8 MISC d012000100000000 SYND 4d000000 IPID 500b000000000
Jun 24 18:18:05 MoathCenterr kernel: mce: [Hardware Error]: PROCESSOR 2:870f10 TIME 1593011852 SOCKET 0 APIC 5 microcode 8701013
2020-07-06 logs:
Jul 6 18:46:08 MoathCenterr kernel: mce: [Hardware Error]: Machine check events logged
Jul 6 18:46:08 MoathCenterr kernel: mce: [Hardware Error]: CPU 2: Machine Check: 0 Bank 5: bea0000000000108
Jul 6 18:46:08 MoathCenterr kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffff8109a37a MISC d012000100000000 SYND 4d000000 IPID 500b000000000
Jul 6 18:46:08 MoathCenterr kernel: mce: [Hardware Error]: PROCESSOR 2:870f10 TIME 1594050335 SOCKET 0 APIC 4 microcode 8701013
Jul 6 18:56:27 MoathCenterr root: Fix Common Problems: Error: Machine Check Events detected on your server
Jul 6 18:56:27 MoathCenterr root: mcelog: ERROR: AMD Processor family 23: mcelog does not support this processor. Please use the edac_mce_amd module instead.
2020-07-07: the OS suddenly shut down and the message below was shown:
Muath (Author), posted July 15, 2020:
It did turn out to be a faulty CPU 💔.
Thank you everyone 🌹