February 4, 2020 I've been battling random system lockups for the last month and need help. I upgraded the RAM from 16 GB to 32 GB and started having crashes after 12 to 24 hours of running. Things I've done so far: updated the BIOS to the latest version; reset the BIOS to the default config; replaced my Marvell HBA with an LSI; switched the PCI slot of the HBA; updated Unraid to 6.8.2. None of these helped. I've switched logging to the flash drive. A copy of the syslog from right after a crash and the diagnostics files are attached. Any help would be appreciated. Attachments: syslog crash, mediaserv-diagnostics-20200204-0523.zip
February 4, 2020 Author Thank you for the suggestion! I didn't think brand new RAM could have a problem, but once again I was wrong. I have 4 x 8 GB modules installed. Is there a way to figure out which stick is bad, or should I just start over with all new RAM? Using: G.SKILL Ripjaws V Series DDR4 PC4-25600 3200MHz, Model F4-3200C16D-16GVKB. Any recommendations on a replacement? Edited February 4, 2020 by krh1009
February 4, 2020 Try reseating the RAM first and run memtest. Sometimes that is all it is. If that doesn't fix it, you can pull RAM modules until you no longer get errors in memtest, which isolates the bad one.
February 4, 2020 Also make sure the CPU is seated properly. This is an easily overlooked problem with Threadripper CPUs. All 3 CPU hold-down screws must be torqued in the proper sequence and by the proper amount for the CPU to be seated properly. I had a memory issue with mine at first, and reseating the CPU fixed it for me.
February 4, 2020 Community Expert You can also try dropping the speed profile down to 3000 MHz (2933?).
February 4, 2020 Author 22 minutes ago, Gragorg said: "Try and reseat the ram first and run memtest. Sometimes that is all it is. If it doesn't fix it you can pull ram chips until you don't get errors on memtest to eliminate the bad chip." Thanks. I'm going to reseat and test each pair separately and see if I get errors.
February 5, 2020 Author So I tested each pair separately (two tests of 16 GB). Both tests showed no errors, so I know the modules are not the problem. When I insert all four at once, I get a ton of errors. 11 hours ago, jpowell8672 said: "Also make sure CPU is seated properly. This is a easily overlooked problem with Threadripper CPU's. All 3 CPU hold down screws must be torqued in proper sequence & amount for CPU to be seated properly. I had a memory issue with mine at first and reseating the CPU fixed it for me." I think reseating the CPU might be what is needed. Do I need to unmount the CPU completely and remount it, or can I loosen the bracket and retighten the screws in the proper sequence? Thanks again for the help.
February 5, 2020 12 minutes ago, krh1009 said: "When I insert all four at one time I get a ton of errors." Are you reducing the memory speed to compensate for the added modules?
February 5, 2020 Author 3 minutes ago, jonathanm said: "Are you reducing the memory speed to compensate for the added modules?" I can give it a try. I need to find that setting in the BIOS. I'd rather do that than reseat the CPU.
February 5, 2020 2 hours ago, krh1009 said: "So I tested each pair separately ( two tests of 16GB). Both tests show no errors, so I know the modules are not the problem. When I insert all four at one time I get a ton of errors. I think the CPU re-seating might be what is needed, Do i need to unmount the CPU completely and remount or can I loosen the bracket and re-tighten the screws in the proper sequence? Thanks again for the help"
February 5, 2020 If you are unable to resolve the issue with your current RAM and decide to purchase new RAM, Samsung B-die chips work best with Threadripper v1 and v2, if you can find some. https://benzhaomin.github.io/bdiefinder/
February 5, 2020 Author 9 minutes ago, jpowell8672 said: "If you are unable to resolve the issue with your current ram and decide to purchase new ram the Samsung B-die chips work the best with Threadripper v1 & 2 if you can find some. https://benzhaomin.github.io/bdiefinder/" Thanks for the video and the brand recommendation. I'm going to 1) lower the memory speed; if no luck, 2) reseat the CPU; if no luck, 3) buy more RAM.
February 5, 2020 10 hours ago, krh1009 said: "Thanks for the video and the brand recommendation. I'm going to 1) lower the memory speed, if no luck 2) reseat CPU, if no luck 3) buy more ram" For (1), lower the speed to 2133 MHz (or, simpler, just turn off XMP or whatever AMD calls it). 2133 MHz is the standard stock DDR4 speed.
February 6, 2020 Author OK, so I took the coward's way out. I didn't feel like removing the cooler, cleaning the TIM, and reseating the CPU. So I ordered 2 sticks of 16 GB RAM and placed them in the lower two (working) slots. Memtest showed a 100% pass, so I think I'm good for now. One long rainy weekend I'll attempt to reseat the CPU, which I think will solve the problem completely.