The same sector keeps being corrected in subsequent parity checks


Squazz


Hi there.

I have been troubled by parity errors on and off for some time.

I might be looking at the logs the wrong way, but it seems that the same sector is being corrected on subsequent runs.

The sector I found was: sector=366010328
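One way to check whether the same sector really does recur between runs is to pull every `sector=<number>` token out of the log text from each parity check and count how many runs each sector shows up in. The following is only a rough sketch: the surrounding log wording in the sample strings is made up for illustration, and only the `sector=<number>` form is taken from this post.

```python
import re
from collections import Counter

# Matches tokens such as "sector=366010328".
SECTOR_RE = re.compile(r"sector=(\d+)")


def repeated_sectors(runs):
    """Return {sector: number_of_runs} for sectors seen in more than one run.

    `runs` is a list of strings, each holding the log text of one parity check.
    """
    counts = Counter()
    for text in runs:
        # Use a set so a sector is counted at most once per run.
        counts.update({int(s) for s in SECTOR_RE.findall(text)})
    return {sector: n for sector, n in counts.items() if n > 1}


# Hypothetical log excerpts; the wording before "sector=" is an assumption.
run1 = "parity corrected, sector=366010328\nparity corrected, sector=1000\n"
run2 = "parity corrected, sector=366010328\nparity corrected, sector=2000\n"

print(repeated_sectors([run1, run2]))
```

A sector that appears in several runs alongside sectors that appear only once would match the "repeating but also new" pattern discussed later in this thread.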

 

Is this something I'll have to worry about?

If we are talking about a drive that's beginning to malfunction, then how do I identify the specific drive in order to replace it?

 

nas-diagnostics-20181203-0631.zip

Edited by Squazz
Link to comment
1 minute ago, johnnie.black said:

Since there are repeating but also new sync errors, this suggests a RAM problem to me. Start by running Memtest; if no errors appear, try lowering the RAM clock speed, as other Ryzen users had similar issues when running higher-clocked memory.

I have tried Memtest earlier, with multiple days of consecutive runs, with no errors.

But I will try this again if the problem persists and we are unable to find a specific drive that generates the errors.

Link to comment
24 minutes ago, johnnie.black said:

Very unlikely to be a disk. If Memtest doesn't find anything, lower the RAM clock to 2133 or 2400 and run a couple of correcting parity checks; the 1st one could still find errors, but the 2nd one should find 0.

Do we know why a higher memory speed could be causing these errors?

After all, Ryzen is hungry for the MHz, so I would feel kinda sad having to kneecap my processor by forcing the frequency of the memory down.

Edited by Squazz
Link to comment
1 hour ago, johnnie.black said:

Memtest can't prove a negative: you can't be 100% sure the memory is fine when no errors are detected; it can only prove there is a problem when an error is detected. That's why I use, and recommend using, ECC RAM for file servers if you care about data integrity.

I'd love to run ECC, but I was stupid enough to buy an MSI board for my Ryzen 1700. It can boot ECC memory, but ECC is disabled.

Regarding Memtest and the errors:
If I can't find errors on the memory on any frequency, then it could still be a frequency problem?

I'll have to give parity checking a few more runs and double-check whether I can consistently get rid of the errors with a lower frequency.

Link to comment
4 minutes ago, Squazz said:

If I can't find errors on the memory on any frequency, then it could still be a frequency problem?

It's a possibility and it doesn't hurt to try; there were at least a couple of similar cases with other Ryzen users where parity sync errors stopped after the RAM clock was lowered.

Link to comment
Just now, johnnie.black said:

It's a possibility and it doesn't hurt to try; there were at least a couple of similar cases with other Ryzen users where parity sync errors stopped after the RAM clock was lowered.

Thanks, I'll try it out.
My data is too valuable to risk discarding it in favor of a few MHz ;)

 

Next upgrade will be on a proper motherboard that supports ECC memory :P

Link to comment

Your sig says "2x 8GB 3200Mhz". Is that just the spec of the RAM or also the speed at which you're running it? If the latter, then it's way out of spec for a 1st gen. Ryzen. What are the part numbers for the DIMMs? The fastest you can run the memory controller is 2666MHz without overclocking it, but depending on how many DRAM chips you're hanging on the bus you might have to de-rate it further.

 

2 hours ago, Squazz said:

Next upgrade will be on a proper motherboard that supports ECC memory :P

I'm interested to hear of any AM4 motherboard that actually supports ECC as I haven't found one yet. Yes, many allow you to use ECC RAM and some even have the necessary extra tracks to the sockets, but I'm not aware of any that have support for ECC in the BIOS. It's a shame because the processors all support it but the motherboards are aimed at gamers. Perhaps if SuperMicro or ASRock Rack ever develop an AM4 board... They both make EPYC motherboards though, so I think I'm going to use one of those in my next server.

Link to comment
43 minutes ago, John_M said:

Your sig says "2x 8GB 3200Mhz". Is that just the spec of the RAM or also the speed at which you're running it? If the latter, then it's way out of spec for a 1st gen. Ryzen. What are the part numbers for the DIMMs? The fastest you can run the memory controller is 2666MHz without overclocking it, but depending on how many DRAM chips you're hanging on the bus you might have to de-rate it further.

They are running at 3200MHz, because XMP is enabled.

The part number for the memory is F4-3200C16D.

 

43 minutes ago, John_M said:

I'm interested to hear of any AM4 motherboard that actually supports ECC as I haven't found one yet. Yes, many allow you to use ECC RAM and some even have the necessary extra tracks to the sockets, but I'm not aware of any that have support for ECC in the BIOS. It's a shame because the processors all support it but the motherboards are aimed at gamers. Perhaps if SuperMicro or ASRock Rack ever develop an AM4 board... They both make EPYC motherboards though, so I think I'm going to use one of those in my next server.

ASRock makes a Pro series that supports and runs ECC memory: https://www.asrock.com/mb/AMD/X370%20Pro4/index.asp#Specification

Link to comment
6 hours ago, Squazz said:

The part number for the memory is F4-3200C16D

I haven't been able to confirm for your specific motherboard* but two DIMMs of that RAM should run at DDR-2666 without taking the memory controller of an R7 1700 out of spec. With a 2000-series CPU or APU you would be able to run it at DDR-2933.

 

*But I have been able to confirm it's the case with all the others I was able to check, including the ASRock X370 Pro4 that you mentioned.

 

On the subject of that board, I've been chasing round in circles (ASRock support, Reddit, AMD) for a few hours and, while I'm intrigued, I'm still not convinced. In fact I'll remain sceptical until I see a screenshot of the BIOS showing support for the ECC function. There's a lot of speculation, disinformation and downright wishful thinking on the Internet. There are a number of boards that support the use of ECC RAM and some even include it in their QVL, but all that does is separate them from the boards that won't even POST with ECC RAM fitted. This particular ASRock board is tantalising because it falls into that category and it seems to go further (on the web site only; there's no mention in the manual) with

Quote

For Ryzen Series CPUs (Raven Ridge), ECC is only supported with PRO CPUs.

Now, I know that when Ryzen first launched the question was asked as to whether the AM4 platform supports the ECC function. The answer from AMD was yes, it does, but in the consumer range of products it was not qualified and motherboard support depends on the manufacturer. From this, I take it that in the PRO range of products it is qualified. But why single out Raven Ridge?

Link to comment
8 hours ago, John_M said:

I haven't been able to confirm for your specific motherboard* but two DIMMs of that RAM should run at DDR-2666 without taking the memory controller of an R7 1700 out of spec. With a 2000-series CPU or APU you would be able to run it at DDR-2933.

 

*But I have been able to confirm it's the case with all the others I was able to check, including the ASRock X370 Pro4 that you mentioned.

For dual-rank memory, the memory controller on the Ryzen 1700 only supports 2400MHz:

https://en.wikichip.org/wiki/amd/ryzen_7/1700#Memory_controller

The kit I have is dual-rank memory: https://www.reddit.com/r/Amd/comments/649ay8/ram_collection_thread_please_post_your_ram/
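To make the comparison concrete, here is a small sketch that checks a configured memory speed against the two limits quoted in this thread for two DIMMs on a first-generation Ryzen (2666 for single rank, 2400 for dual rank). The table and function names are made up for illustration, and the numbers are only the ones mentioned above, not a complete specification.

```python
# Limits quoted in this thread for two DIMMs on a 1st gen Ryzen.
# Other configurations (e.g. four DIMMs) are deliberately left out.
SUPPORTED_SPEED = {"single": 2666, "dual": 2400}


def within_spec(rank, configured_speed):
    """Return True if the configured speed does not exceed the quoted limit."""
    return configured_speed <= SUPPORTED_SPEED[rank]


print(within_spec("dual", 3200))  # XMP setting used so far
print(within_spec("dual", 2400))  # lowered setting
```

With these figures, the XMP setting of 3200 exceeds the dual-rank limit, while 2400 is within it.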

Link to comment
20 hours ago, Squazz said:

The part number for the memory is F4-3200C16D

There are ten entries for that part number in the table you referenced. Some of them show dual rank/double sided and some show single rank/single sided, depending on whether they use 16 x 4 Gb or 8 x 8 Gb DRAM chips. For example:

Quote

G.Skill Ripjaws V 3200 MHz CL16  F4-3200C16D-16GVGB  4Gb Samsung E-Die  Dual  Double

and

Quote

G.Skill Trident Z 3200 MHz CL16*  F4-3200C16D-16GTZ  8Gb Hynix M-Die  Single  Single

It isn't simply a difference between the Ripjaws V and Trident Z brands - both come in both configurations. The part number you gave me is too generic. When I checked on G.Skill's site it said single/single too. Presumably it only shows the newer type with the bigger chips.
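The arithmetic behind the two layouts can be sketched as follows. This assumes x8 DRAM chips (8 data bits each) and a 64-bit non-ECC rank, which is typical for desktop DIMMs but is an assumption here, not something stated on the kit's label; the function name is made up for illustration.

```python
def dimm_layout(chip_count, chip_density_gbit, chip_width_bits=8, rank_width_bits=64):
    """Return (ranks, capacity in GB) for a DIMM built from identical chips.

    Assumes every chip contributes `chip_width_bits` to a rank that is
    `rank_width_bits` wide (64 bits for a non-ECC DIMM).
    """
    chips_per_rank = rank_width_bits // chip_width_bits
    ranks = chip_count // chips_per_rank
    capacity_gbyte = chip_count * chip_density_gbit / 8  # 8 bits per byte
    return ranks, capacity_gbyte


print(dimm_layout(16, 4))  # 16 chips of 4 Gb
print(dimm_layout(8, 8))   # 8 chips of 8 Gb
```

Under those assumptions both layouts give an 8 GB DIMM, but the 16-chip version needs two ranks while the 8-chip version fits in one.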

Link to comment
9 minutes ago, John_M said:

There are ten entries for that part number in the table you referenced. Some of them show dual rank/double sided and some show single rank/single sided, depending on whether they use 16 x 4 Gb or 8 x 8 Gb DRAM chips. For example:

and

It isn't simply a difference between the Ripjaws V and Trident Z brands - both come in both configurations. The part number you gave me is too generic. When I checked on G.Skill's site it said single/single too. Presumably it only shows the newer type with the bigger chips.

Oh, my bad.

It's the F4-3200C16D-16GVKB version I have.

Link to comment
On 12/3/2018 at 12:45 PM, johnnie.black said:

It's a possibility and it doesn't hurt to try; there were at least a couple of similar cases with other Ryzen users where parity sync errors stopped after the RAM clock was lowered.

 

On 12/3/2018 at 3:48 PM, John_M said:

Your sig says "2x 8GB 3200Mhz". Is that just the spec of the RAM or also the speed at which you're running it? If the latter, then it's way out of spec for a 1st gen. Ryzen. What are the part numbers for the DIMMs? The fastest you can run the memory controller is 2666MHz without overclocking it, but depending on how many DRAM chips you're hanging on the bus you might have to de-rate it further.

 

I lowered the speed to 2400MHz and have now run two parity checks. The first run found a single error. The second run didn't find anything.
I have yet to confirm that the errors have stopped for good, but for now it seems that it helped.
Thanks for everything so far :)

 

Edit: Yes, the errors stopped after lowering the speed.

Edited by Squazz
Link to comment
