[SOLVED] My Array Won't Work!! HELP


Muath


On 2/19/2020 at 6:46 PM, Dissones4U said:

This may be elementary but did you try to remove the new hardware and revert to the prior "working" configuration?

I thought it was very unlikely to be the cause, so I forgot about it, but I will try it this week.

 

* The parity check is now stuck for the third time in a row 😞.

 

1673433435_unraid2.thumb.png.4803dfd04759ce614b5b52b3c5f1f2ab.png

 

 

 

 

2 hours ago, johnnie.black said:

Very strange, there's nothing on the log, if you pause/unpause do you get the same nginx errors as before?

Yes, but I don't think nginx is the cause.

Mar  3 00:04:02 MoathCenterr kernel: mdcmd (63): nocheck Pause
Mar  3 00:05:35 MoathCenterr nginx: 2020/03/03 00:05:35 [error] 5831#5831: *1630028 connect() to unix:/var/run/emhttpd.socket failed (11: Resource temporarily unavailable) while connecting to upstream, client: 192.168.100.35, server: , request: "POST /update.htm HTTP/1.1", upstream: "http://unix:/var/run/emhttpd.socket:/update.htm", host: "moathcenterr", referrer: "http://moathcenterr/Main"

 

(Video Recording).

 

Parity Check History doesn't show the last 4 hung sessions, only the one I canceled.

 

 

 

Edited by Muath
On 3/3/2020 at 2:57 PM, johnnie.black said:

Yes, you're right, this is an issue I've never seen before and have no idea what's the problem or how to diagnose it, maybe @limetech has some ideas.

Thank you very much for your assistance.

 

On 3/3/2020 at 3:06 PM, itimpi said:

To try and eliminate as many variables as possible  do you get the same symptoms if you disable the docker and VM services under Settings and then reboot in Safe Mode to suppress plugins?

Now on the fifth try the parity check completed with no issue!

I didn't do much: I just swapped the GPU for an old one and cleaned the fans. I'm not sure whether the issue is fixed or not. I will be back next month, after the next parity check, if anything happens.

image.png.307ad98806b57398aa12f260a2581c18.png

 

Thank you everyone.

Edited by Muath
  • 1 month later...

My system has kept hanging from time to time over the last month, and since I couldn't gather any more information, I didn't update my issue here.

Sometimes only some threads hang and the system keeps running, but other times all the threads hang and I have to force a restart:
 1642335941_unraidcpu100.png.6107b9da44f9741696ad1ce46d5c7edd.png

 

 

But now Fix Common Problems has detected hardware errors, right after a parity check was suddenly triggered:

image.thumb.png.a2396ab3a7b6ced4ba3928bb852fdc35.png

 

error logs:
 

Apr 16 11:57:06 MoathCenterr kernel: mce: [Hardware Error]: Machine check events logged
Apr 16 11:57:06 MoathCenterr kernel: mce: [Hardware Error]: CPU 10: Machine Check: 0 Bank 5: bea0000000000108
Apr 16 11:57:06 MoathCenterr kernel: mce: [Hardware Error]: TSC 0 ADDR 14798839663c MISC d010000000000000 SYND 4d000000 IPID 500b000000000 
Apr 16 11:57:06 MoathCenterr kernel: mce: [Hardware Error]: PROCESSOR 2:870f10 TIME 1587027393 SOCKET 0 APIC 5 microcode 8701013
Apr 16 12:07:25 MoathCenterr root: Fix Common Problems: Error: Machine Check Events detected on your server
Apr 16 12:07:25 MoathCenterr root: mcelog: ERROR: AMD Processor family 23: mcelog does not support this processor.  Please use the edac_mce_amd module instead.
Apr 16 14:13:07 MoathCenterr kernel: Plex Script Hos[29328]: segfault at 0 ip 000014f2ee0a8d37 sp 000014f2e540e130 error 4 in libpython2.7.so.1.0[14f2edf71000+19f000]
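A note for anyone who lands here from a search: the 64-bit value after `Bank 5:` is the raw machine check status register for that bank. Below is a minimal Python sketch that decodes only the generic top-level flag bits (bits 63 to 57) and the low 16-bit error code, as I understand the common x86 MCA layout. The helper name `decode_mca_status` is made up for this post, and the vendor-specific fields in between are deliberately left alone, so treat it as a reading aid and not a diagnosis.

```python
# Hypothetical helper: decodes only the generic x86 MCA status flag bits.
# Vendor-specific fields (extended error codes etc.) are NOT interpreted here.
FLAGS = {
    63: "VAL",    # register contains a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # reporting for this error was enabled
    59: "MISCV",  # the MISC value in the log is valid
    58: "ADDRV",  # the ADDR value in the log is valid
    57: "PCC",    # processor context may be corrupt
}


def decode_mca_status(status_hex: str) -> dict:
    """Return the generic flag bits and the low 16-bit error code."""
    status = int(status_hex, 16)
    decoded = {name: bool((status >> bit) & 1) for bit, name in FLAGS.items()}
    decoded["error_code"] = status & 0xFFFF  # bits 15:0
    return decoded


if __name__ == "__main__":
    # Status value copied from the "Bank 5" line above.
    for key, value in decode_mca_status("bea0000000000108").items():
        print(key, value)
```

For `bea0000000000108` this reports VAL, UC, EN, MISCV, ADDRV and PCC set and OVER clear, with error code 0x108. In other words the kernel logged an uncorrected error, which would be consistent with the hangs.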

 

Could it be a CPU failure? 😥

 

UPDATE:

The errors logged below keep recurring from time to time and trigger a parity check.

 

image.thumb.png.b90bdc430e48ce1b6dc1897c57b5ed25.png

 

moathcenterr-diagnostics-20200424-0508.zip

moathcenterr-diagnostics-20200416-1648.zip

Edited by Muath
  • 2 weeks later...

This is becoming annoying:

 

image.thumb.png.619d97e22ced1ebe425251aea6ee72bd.png

 

Apr 24 04:58:20 MoathCenterr kernel: mce: [Hardware Error]: Machine check events logged
Apr 24 04:58:20 MoathCenterr kernel: mce: [Hardware Error]: CPU 10: Machine Check: 0 Bank 5: bea0000000000108
Apr 24 04:58:20 MoathCenterr kernel: mce: [Hardware Error]: TSC 0 ADDR 14b767dc2084 MISC d012000100000000 SYND 4d000000 IPID 500b000000000 
Apr 24 04:58:20 MoathCenterr kernel: mce: [Hardware Error]: PROCESSOR 2:870f10 TIME 1587693467 SOCKET 0 APIC 5 microcode 8701013
Apr 24 05:03:30 MoathCenterr root: Fix Common Problems: Error: Machine Check Events detected on your server
Apr 24 05:03:30 MoathCenterr root: mcelog: ERROR: AMD Processor family 23: mcelog does not support this processor.  Please use the edac_mce_amd module instead.
Apr 24 05:08:22 MoathCenterr root: Fix Common Problems: Error: Machine Check Events detected on your server
Apr 24 05:08:22 MoathCenterr root: mcelog: ERROR: AMD Processor family 23: mcelog does not support this processor.  Please use the edac_mce_amd module instead.
Apr 24 18:12:33 MoathCenterr kernel: CPU: 10 PID: 5903 Comm: unraidd0 Tainted: G           O      4.19.107-Unraid #1
Apr 24 18:12:33 MoathCenterr kernel: Call Trace:
Apr 24 18:13:04 MoathCenterr kernel: CPU: 5 PID: 1727 Comm: scsi_eh_10 Tainted: G      D    O      4.19.107-Unraid #1
Apr 24 18:13:04 MoathCenterr kernel: Call Trace:
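Since these blocks keep piling up, here is a rough Python sketch for pulling the useful fields out of the `mce: [Hardware Error]` lines so that events can be compared across dates. The regular expressions are written against the lines exactly as they appear in this thread, the function name `summarize_mce` is hypothetical, and I am assuming the `TIME` field is a Unix epoch timestamp in seconds.

```python
import re
from datetime import datetime, timezone

# Patterns match the kernel "mce: [Hardware Error]" lines as pasted in this thread.
BANK_RE = re.compile(r"CPU (\d+): Machine Check: \d+ Bank (\d+): ([0-9a-f]+)")
TIME_RE = re.compile(r"PROCESSOR \S+ TIME (\d+)")

SAMPLE_LINES = [
    "Apr 24 04:58:20 MoathCenterr kernel: mce: [Hardware Error]: Machine check events logged",
    "Apr 24 04:58:20 MoathCenterr kernel: mce: [Hardware Error]: CPU 10: Machine Check: 0 Bank 5: bea0000000000108",
    "Apr 24 04:58:20 MoathCenterr kernel: mce: [Hardware Error]: TSC 0 ADDR 14b767dc2084 MISC d012000100000000 SYND 4d000000 IPID 500b000000000",
    "Apr 24 04:58:20 MoathCenterr kernel: mce: [Hardware Error]: PROCESSOR 2:870f10 TIME 1587693467 SOCKET 0 APIC 5 microcode 8701013",
]


def summarize_mce(lines):
    """Return one dict per machine check event found in the given syslog lines."""
    events = []
    for line in lines:
        bank = BANK_RE.search(line)
        if bank:
            events.append({
                "cpu": int(bank.group(1)),
                "bank": int(bank.group(2)),
                "status": bank.group(3),
            })
            continue
        stamp = TIME_RE.search(line)
        if stamp and events:
            # ASSUMPTION: TIME is seconds since the Unix epoch.
            events[-1]["time_utc"] = datetime.fromtimestamp(
                int(stamp.group(1)), tz=timezone.utc
            )
    return events


if __name__ == "__main__":
    for event in summarize_mce(SAMPLE_LINES):
        print(event)
```

With the Apr 24 lines, the TIME value 1587693467 comes out as 2020-04-24 01:57:47 UTC, which lines up with the 04:58 syslog stamp if the server clock is set to UTC+3.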

 


 

6 minutes ago, johnnie.black said:

That's a hardware issue, most likely RAM, CPU or board related.

 

I have already changed the RAM, GPU and motherboard, so it's most likely a CPU issue!

Or could it be that the RAM is not compatible with the AMD CPU?

Edited by Muath
9 minutes ago, Muath said:

 

I changed the RAMs and motherboard, so most likely CPU issue! 

or could it be RAM not supporting AMD CPU?

Is the RAM on the QVL for your motherboard?

 

Sometimes, certain Ryzen CPU/Motherboards have problems if all four RAM slots are occupied.  They become very picky with RAM speed and only support certain speeds before they become unstable.

 

Are you overclocking the RAM at all or are you running it at the stock RAM speed? 

 

There will probably be a chart in your motherboard manual that shows what RAM speeds are supported depending on which and how many RAM slots are populated.

  • Like 1
  • 3 weeks later...

I'm using this RAM:

https://www.amazon.com/G-SKILL-TridentZ-288-Pin-3000MHz-F4-3000C16D-16GTZR/dp/B06WP4L3D7/

and I also tried:

https://www.newegg.com/g-skill-16gb-288-pin-ddr4-sdram/p/N82E16820232290

 

It's not overclocked. I was using 4 sticks, then removed 2 and swapped the rest between slots, but the issue remains. Actually, it's getting worse!

 

 

(I will keep updating this comment with new logs so that it gets indexed in case someone searches for these errors 👁️‍🗨️)

Jun 24 18:18:05 MoathCenterr kernel: mce: [Hardware Error]: Machine check events logged
Jun 24 18:18:05 MoathCenterr kernel: mce: [Hardware Error]: CPU 10: Machine Check: 0 Bank 5: bea0000000000108
Jun 24 18:18:05 MoathCenterr kernel: mce: [Hardware Error]: TSC 0 ADDR 4206c8 MISC d012000100000000 SYND 4d000000 IPID 500b000000000 
Jun 24 18:18:05 MoathCenterr kernel: mce: [Hardware Error]: PROCESSOR 2:870f10 TIME 1593011852 SOCKET 0 APIC 5 microcode 8701013

photo_٢٠٢٠-٠٥-١٠_٢٣-٣٨-٣٤.jpg

 

2020-07-06 logs:

 

Jul  6 18:46:08 MoathCenterr kernel: mce: [Hardware Error]: Machine check events logged
Jul  6 18:46:08 MoathCenterr kernel: mce: [Hardware Error]: CPU 2: Machine Check: 0 Bank 5: bea0000000000108
Jul  6 18:46:08 MoathCenterr kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffff8109a37a MISC d012000100000000 SYND 4d000000 IPID 500b000000000 
Jul  6 18:46:08 MoathCenterr kernel: mce: [Hardware Error]: PROCESSOR 2:870f10 TIME 1594050335 SOCKET 0 APIC 4 microcode 8701013
Jul  6 18:56:27 MoathCenterr root: Fix Common Problems: Error: Machine Check Events detected on your server
Jul  6 18:56:27 MoathCenterr root: mcelog: ERROR: AMD Processor family 23: mcelog does not support this processor.  Please use the edac_mce_amd module instead.
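One thing worth checking in the two blocks above: the error moved from CPU 10 (APIC 5) to CPU 2 (APIC 4), but the bank and status value are unchanged. The Python sketch below groups the reporting CPUs by an assumed core index. The big assumption, which I have not verified on this board, is that the two SMT threads of a core have adjacent APIC IDs, so that `APIC >> 1` identifies the physical core. The function name `group_by_assumed_core` is made up for this post.

```python
import re
from collections import defaultdict

CPU_RE = re.compile(r"CPU (\d+): Machine Check")
APIC_RE = re.compile(r"SOCKET (\d+) APIC (\d+)")

# The CPU and PROCESSOR lines from the Jun 24 and Jul 6 blocks above.
SAMPLE_LINES = [
    "Jun 24 18:18:05 MoathCenterr kernel: mce: [Hardware Error]: CPU 10: Machine Check: 0 Bank 5: bea0000000000108",
    "Jun 24 18:18:05 MoathCenterr kernel: mce: [Hardware Error]: PROCESSOR 2:870f10 TIME 1593011852 SOCKET 0 APIC 5 microcode 8701013",
    "Jul  6 18:46:08 MoathCenterr kernel: mce: [Hardware Error]: CPU 2: Machine Check: 0 Bank 5: bea0000000000108",
    "Jul  6 18:46:08 MoathCenterr kernel: mce: [Hardware Error]: PROCESSOR 2:870f10 TIME 1594050335 SOCKET 0 APIC 4 microcode 8701013",
]


def group_by_assumed_core(lines):
    """Map (socket, assumed core index) -> set of logical CPUs that reported errors.

    ASSUMPTION: core index = APIC ID >> 1 (two SMT threads per core with
    adjacent APIC IDs). Check the real topology before relying on this.
    """
    groups = defaultdict(set)
    cpu = None
    for line in lines:
        match = CPU_RE.search(line)
        if match:
            cpu = int(match.group(1))  # remember the CPU until its APIC line
            continue
        match = APIC_RE.search(line)
        if match and cpu is not None:
            socket, apic = int(match.group(1)), int(match.group(2))
            groups[(socket, apic >> 1)].add(cpu)
            cpu = None
    return dict(groups)


if __name__ == "__main__":
    print(group_by_assumed_core(SAMPLE_LINES))
```

Under that assumption both events fall into the same group, (socket 0, core 2), which would mean CPU 2 and CPU 10 are two threads of one physical core. Reading `/sys/devices/system/cpu/cpu10/topology/thread_siblings_list` on the server should confirm or refute that.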

 

2020-07-07 - the OS suddenly shut down and the message below was shown:

 

 20200707_215809.thumb.jpg.d519bd4507995cc6493c85946adfbcb2.jpg

Edited by Muath
  • 2 months later...
  • JorgeB changed the title to [SOLVED] My Array Won't Work!! HELP
