Squazz, posted December 3, 2018 (edited December 3, 2018):
Hi there. I have been troubled by parity errors on and off for some time. I might be reading the logs the wrong way, but it seems that the same sector is being corrected on subsequent runs. The sector I found was: sector=366010328
Is this something I have to worry about? If a drive is beginning to malfunction, how do I identify the specific drive in order to replace it?
Attachment: nas-diagnostics-20181203-0631.zip
JorgeB, posted December 3, 2018:
Since there are repeating but also new sync errors, this suggests a RAM problem to me. Start by running Memtest. If no errors appear, try lowering the RAM clock speed, as other Ryzen users had similar issues when running higher-clocked memory.
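One way to see which corrected sectors repeat between parity checks and which are new is to pull the sector numbers out of the log for each run and compare them. The following is a minimal sketch, not an Unraid tool: it assumes each corrected sector appears in the log as `sector=<number>` (the form quoted in the first post), and the sample log lines and the function names `sectors_in_log` and `classify_sectors` are hypothetical.

```python
import re
from collections import Counter

# Matches the "sector=<number>" form quoted in the first post.
SECTOR_RE = re.compile(r"sector=(\d+)")


def sectors_in_log(text):
    """Return the set of sector numbers mentioned in one parity-check log."""
    return {int(number) for number in SECTOR_RE.findall(text)}


def classify_sectors(runs):
    """Split sectors into those seen in more than one run and those seen once.

    `runs` is a list of log excerpts, one string per parity check.
    """
    counts = Counter()
    for text in runs:
        # Each run contributes a sector at most once, so the count is
        # the number of runs in which that sector was corrected.
        counts.update(sectors_in_log(text))
    repeating = sorted(s for s, n in counts.items() if n > 1)
    seen_once = sorted(s for s, n in counts.items() if n == 1)
    return repeating, seen_once


# Hypothetical excerpts; only sector=366010328 comes from the thread.
run_1 = "md: P corrected, sector=366010328\nmd: P corrected, sector=1000"
run_2 = "md: P corrected, sector=366010328\nmd: P corrected, sector=2000"

repeating, seen_once = classify_sectors([run_1, run_2])
print("repeating:", repeating)
print("seen once:", seen_once)
```

A mix of repeating and one-off sectors is the pattern described above as pointing towards RAM rather than a single disk.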
Squazz (author), posted December 3, 2018:
Quoting johnnie.black: Since there are repeating but also new sync errors, this suggests a RAM problem to me. Start by running Memtest. If no errors appear, try lowering the RAM clock speed, as other Ryzen users had similar issues when running higher-clocked memory.
I have tried Memtest before, with multiple days of consecutive runs, and got no errors. But I will try it again if the problem persists and we are unable to find a specific drive that generates the errors.
JorgeB, posted December 3, 2018:
Very unlikely to be a disk. If Memtest doesn't find anything, lower the RAM clock to 2133 or 2400 and run a couple of correcting parity checks. The first one could still find errors, but the second one should find 0.
Squazz (author), posted December 3, 2018 (edited December 3, 2018):
Quoting johnnie.black: Very unlikely to be a disk. If Memtest doesn't find anything, lower the RAM clock to 2133 or 2400 and run a couple of correcting parity checks. The first one could still find errors, but the second one should find 0.
Do we know why higher memory clocks could be causing these errors? After all, Ryzen is hungry for the MHz, so I would feel kind of sad having to kneecap my processor by forcing the memory frequency down.
JorgeB, posted December 3, 2018:
It's an instability problem, i.e., the platform can't handle the high clocks and memory errors occur.
Squazz (author), posted December 3, 2018:
Quoting johnnie.black: It's an instability problem, i.e., the platform can't handle the high clocks and memory errors occur.
Then it is just so weird that Memtest doesn't tell this story but Unraid does.
JorgeB, posted December 3, 2018:
Memtest can't prove a negative, i.e., you can't be 100% sure the memory is fine when no errors are detected. It can only prove there are problems when an error is detected, which is why I use and recommend ECC RAM for file servers if you care about data integrity.
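The point that a clean Memtest result does not prove the memory is fine can be illustrated with a little arithmetic. This is a rough sketch under a simplifying assumption that is not from the thread: each test pass independently has a fixed probability of catching a marginal fault. The probabilities below are made-up illustrative values, and `chance_of_clean_result` is a hypothetical helper name.

```python
def chance_of_clean_result(p_detect_per_pass, passes):
    """Probability that `passes` independent passes all miss an existing fault.

    Assumes every pass detects the fault with the same probability,
    independently of the other passes.
    """
    return (1.0 - p_detect_per_pass) ** passes


# Illustrative values only: a fault that is caught rarely can survive
# many passes without ever being reported.
for p_detect in (0.001, 0.01, 0.05):
    for passes in (10, 100):
        clean = chance_of_clean_result(p_detect, passes)
        print(f"detect={p_detect:<5} passes={passes:<3} "
              f"chance of a clean result={clean:.3f}")
```

Under this model a fault that only shows up under a particular load or temperature, and so has a low per-pass detection chance, can pass days of testing while a parity check still trips over it.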
Squazz (author), posted December 3, 2018:
Quoting johnnie.black: Memtest can't prove a negative, i.e., you can't be 100% sure the memory is fine when no errors are detected. It can only prove there are problems when an error is detected, which is why I use and recommend ECC RAM for file servers if you care about data integrity.
I'd love to run ECC, but I was stupid enough to buy an MSI board for my Ryzen 1700. It can boot with ECC memory, but ECC is disabled.
Regarding Memtest and the errors: if I can't find memory errors at any frequency, could it still be a frequency problem?
I'll have to give parity checking a few more runs and double-check whether I can consistently get rid of the errors with a lower frequency.
JorgeB, posted December 3, 2018:
Quoting Squazz: If I can't find memory errors at any frequency, could it still be a frequency problem?
It's a possibility and it doesn't hurt to try. There were at least a couple of similar cases with other Ryzen users where parity sync errors stopped after the RAM clock was lowered.
Squazz (author), posted December 3, 2018:
Quoting johnnie.black: It's a possibility and it doesn't hurt to try. There were at least a couple of similar cases with other Ryzen users where parity sync errors stopped after the RAM clock was lowered.
Thanks, I'll try it out. My data is too valuable to risk losing it in favor of a few MHz.
My next upgrade will be to a proper motherboard that supports ECC memory.
John_M, posted December 3, 2018:
Your sig says "2x 8GB 3200Mhz". Is that just the spec of the RAM or also the speed at which you're running it? If the latter, then it's way out of spec for a 1st gen. Ryzen. What are the part numbers for the DIMMs? The fastest you can run the memory controller without overclocking it is 2666 MHz, but depending on how many DRAM chips you're hanging on the bus you might have to de-rate it further.
Quoting Squazz: My next upgrade will be to a proper motherboard that supports ECC memory.
I'm interested to hear of any AM4 motherboard that actually supports ECC, as I haven't found one yet. Yes, many allow you to use ECC RAM and some even have the necessary extra tracks to the sockets, but I'm not aware of any that have support for ECC in the BIOS. It's a shame because the processors all support it but the motherboards are aimed at gamers. Perhaps if SuperMicro or ASRock Rack ever develop an AM4 board... They both make EPYC motherboards though, so I think I'm going to use one of those in my next server.
Squazz (author), posted December 3, 2018:
Quoting John_M: Your sig says "2x 8GB 3200Mhz". Is that just the spec of the RAM or also the speed at which you're running it? If the latter, then it's way out of spec for a 1st gen. Ryzen. What are the part numbers for the DIMMs? The fastest you can run the memory controller without overclocking it is 2666 MHz, but depending on how many DRAM chips you're hanging on the bus you might have to de-rate it further.
They are running at 3200 MHz, which is the speed set by XMP. The part number for the memory is F4-3200C16D
Quoting John_M: I'm interested to hear of any AM4 motherboard that actually supports ECC, as I haven't found one yet. Yes, many allow you to use ECC RAM and some even have the necessary extra tracks to the sockets, but I'm not aware of any that have support for ECC in the BIOS. It's a shame because the processors all support it but the motherboards are aimed at gamers. Perhaps if SuperMicro or ASRock Rack ever develop an AM4 board... They both make EPYC motherboards though, so I think I'm going to use one of those in my next server.
ASRock is making their Pro series, which supports and runs ECC memory: https://www.asrock.com/mb/AMD/X370 Pro4/index.asp#Specification
Squazz (author), posted December 3, 2018:
Just changed the speed to 2400 MHz and started a new run. I will let you know how it went in 24 hours (2 runs).
John_M, posted December 3, 2018:
Quoting Squazz: The part number for the memory is F4-3200C16D
I haven't been able to confirm for your specific motherboard* but two DIMMs of that RAM should run at DDR-2666 without taking the memory controller of an R7 1700 out of spec. With a 2000-series CPU or APU you would be able to run it at DDR-2933.
*But I have been able to confirm it's the case with all the others I was able to check, including the ASRock X370 Pro4 that you mentioned.
On the subject of that board, I've been chasing round in circles (ASRock support, Reddit, AMD) for a few hours and, while I'm intrigued, I'm still not convinced. In fact I'll remain sceptical until I see a screenshot of the BIOS showing support for the ECC function. There's a lot of speculation, disinformation and downright wishful thinking on the Internet. There are a number of boards that support the use of ECC RAM and some even include it in their QVL, but all that does is separate them from the boards that won't even POST with ECC RAM fitted. This particular ASRock board is tantalising because it falls into that category and it seems to go further (on the web site only; there's no mention in the manual) with:
Quote: For Ryzen Series CPUs (Raven Ridge), ECC is only supported with PRO CPUs.
Now, I know that when Ryzen first launched the question was asked as to whether the AM4 platform supports the ECC function. The answer from AMD was yes, it does, but in the consumer range of products it was not qualified and motherboard support depends on the manufacturer. From this, I take it that in the PRO range of products it is qualified. But why single out Raven Ridge?
Squazz (author), posted December 4, 2018:
Quoting John_M: I haven't been able to confirm for your specific motherboard* but two DIMMs of that RAM should run at DDR-2666 without taking the memory controller of an R7 1700 out of spec. With a 2000-series CPU or APU you would be able to run it at DDR-2933. *But I have been able to confirm it's the case with all the others I was able to check, including the ASRock X370 Pro4 that you mentioned.
For dual-rank memory, the memory controller on the Ryzen 1700 only supports 2400 MHz: https://en.wikichip.org/wiki/amd/ryzen_7/1700#Memory_controller
The kit I have is dual-rank memory: https://www.reddit.com/r/Amd/comments/649ay8/ram_collection_thread_please_post_your_ram/
John_M, posted December 4, 2018:
Quoting Squazz: The part number for the memory is F4-3200C16D
There are ten entries for that part number in the table you referenced. Some of them show dual rank/double sided and some show single rank/single sided, depending on whether they use 16 x 4 Gb or 8 x 8 Gb DRAM chips. For example:
Quote: G.Skill Ripjaws V, 3200 MHz, CL16, F4-3200C16D-16GVGB, 4Gb Samsung E-Die, Dual rank, Double sided
and
Quote: G.Skill Trident Z, 3200 MHz, CL16*, F4-3200C16D-16GTZ, 8Gb Hynix M-Die, Single rank, Single sided
It isn't simply a difference between the Ripjaws V and Trident Z brands; both come in both configurations. The part number you gave me is too generic. When I checked on G.Skill's site it said single/single too. Presumably it only shows the newer type with the bigger chips.
Squazz (author), posted December 4, 2018:
Quoting John_M: There are ten entries for that part number in the table you referenced. Some of them show dual rank/double sided and some show single rank/single sided, depending on whether they use 16 x 4 Gb or 8 x 8 Gb DRAM chips. It isn't simply a difference between the Ripjaws V and Trident Z brands; both come in both configurations. The part number you gave me is too generic. When I checked on G.Skill's site it said single/single too. Presumably it only shows the newer type with the bigger chips.
Oh, my bad. It's the F4-3200C16D-16GVKB version I have.
John_M, posted December 4, 2018:
Quoting Squazz: It's the F4-3200C16D-16GVKB version I have
Well, in that case, according to the table:
Quote: G.Skill Ripjaws V, 3200 MHz, CL16*, F4-3200C16D-16GVKB, 8Gb Hynix M-Die, Single rank, Single sided
So it's single rank and DDR-2666 should be fine.
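The rule of thumb discussed in the last few posts can be written down as a small lookup. This is only a sketch of the figures quoted in this thread for two DIMMs on a first-generation Ryzen (DDR-2666 for single rank, DDR-2400 for dual rank); other DIMM counts and CPU generations are not covered, and the names `SUPPORTED_DDR4_SPEED` and `is_within_spec` are hypothetical.

```python
# Highest in-spec speed for two DIMMs on a first-generation Ryzen,
# using only the figures quoted in this thread.
SUPPORTED_DDR4_SPEED = {
    "single": 2666,  # single-rank DIMMs
    "dual": 2400,    # dual-rank DIMMs
}


def is_within_spec(rank, configured_speed):
    """Return True if the configured speed does not exceed the quoted limit.

    `rank` is "single" or "dual"; `configured_speed` is the DDR4 rating
    set in the BIOS, e.g. 3200 when an XMP profile is active.
    """
    return configured_speed <= SUPPORTED_DDR4_SPEED[rank]


# The kit in this thread was running at 3200 via XMP.
for rank in ("single", "dual"):
    for speed in (2400, 2666, 3200):
        status = "within spec" if is_within_spec(rank, speed) else "overclocked"
        print(f"{rank}-rank at DDR-{speed}: {status}")
```

Either way, 3200 is above both quoted limits, so dropping to 2400 is the conservative choice when the rank of the kit is uncertain.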
Squazz (author), posted December 4, 2018:
Quoting John_M: Well, in that case, according to the table, it's single rank and DDR-2666 should be fine.
It has two entries.
So I'll have to take a closer look and determine whether it's Hynix or Samsung modules.
John_M, posted December 4, 2018:
Quoting Squazz: It has two entries.
Oh yes, you're right. Now, why the &@£$ do they do that?!
Squazz (author), posted December 4, 2018 (edited January 25, 2020):
Quoting johnnie.black (12/3/2018, 12:45 PM): It's a possibility and it doesn't hurt to try. There were at least a couple of similar cases with other Ryzen users where parity sync errors stopped after the RAM clock was lowered.
Quoting John_M (12/3/2018, 3:48 PM): Your sig says "2x 8GB 3200Mhz". Is that just the spec of the RAM or also the speed at which you're running it? If the latter, then it's way out of spec for a 1st gen. Ryzen. What are the part numbers for the DIMMs? The fastest you can run the memory controller without overclocking it is 2666 MHz, but depending on how many DRAM chips you're hanging on the bus you might have to de-rate it further.
I lowered the speed to 2400 MHz and have now run two parity checks. The first run found a single error. The second run didn't find anything.
I have yet to confirm that the errors have stopped for good, but for now it seems that it helped. Thanks for everything so far.
Edit: Yes, the errors stopped after lowering the speed.