Constant corruption/checksum errors on cache drive



Hello all,

 

I'm having regular file corruption/checksum errors on my cache drive SSD. The SSD is brand new and this is a fresh server build. The SSD is a WD 500GB Blue plugged in via SATA.

 

I noticed it yesterday (the first time the cache drive was used). I woke up and files hadn't moved off the cache. I looked at the logs and saw that the files were being reported as corrupt. So I shut down the array and reformatted the cache (BTRFS again). I got some downloads started last night and woke up to the exact same thing this morning.

 

Is it something that I'm doing or a setting that I have incorrect that's causing this? Or is my SSD possibly bad?
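For anyone wanting to put a number on the errors rather than reading the log by eye, the per-device counters printed by `btrfs device stats` can be tallied. This is only a rough sketch: it assumes the usual `[device].counter value` line format, and the sample text and the helper names `parse_device_stats` and `nonzero_counters` are made up for illustration, not taken from this server.

```python
import re
from collections import defaultdict

# Matches lines such as "[/dev/sdb1].corruption_errs  12"
STAT_LINE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<name>\w+)\s+(?P<value>\d+)$")


def parse_device_stats(text):
    """Return {device: {counter: value}} parsed from `btrfs device stats` output."""
    stats = defaultdict(dict)
    for line in text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match:
            stats[match["dev"]][match["name"]] = int(match["value"])
    return dict(stats)


def nonzero_counters(stats):
    """Keep only the devices and counters whose value is above zero."""
    return {
        dev: {name: value for name, value in counters.items() if value}
        for dev, counters in stats.items()
        if any(counters.values())
    }


# Hypothetical sample output, for illustration only (not from the attached diagnostics)
SAMPLE = """\
[/dev/sdb1].write_io_errs    0
[/dev/sdb1].read_io_errs     0
[/dev/sdb1].flush_io_errs    0
[/dev/sdb1].corruption_errs  12
[/dev/sdb1].generation_errs  0
"""

if __name__ == "__main__":
    print(nonzero_counters(parse_device_stats(SAMPLE)))
```

On a live system the text would come from running `btrfs device stats` against the cache mount point (presumably `/mnt/cache` on Unraid). A rising `corruption_errs` with zero read/write I/O errors would point more toward bad data reaching the drive (RAM, for example) than toward the SSD or cable failing outright.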

movererror2.png


Attached are the diagnostics from yesterday.

 

blackbart-diagnostics-20200524-1107.zip

 

I did start running a memtest yesterday (after dumping diagnostics). Oddly, I was getting errors when I had all 4 sticks in at once (see photo), so I have been testing 2 at a time. The first two completed 4/4 passes of MemTest86 without errors. If the next two do as well, I may try something else like Karhu or HCI. I may also try a single pass with one stick of RAM at a time, cycling through each motherboard slot.

 

memtest.jpg

 

As for the SATA cables themselves, they are new and there are no crazy bends in them.

Are all your sticks identical or is it a mix and match?


All the sticks are identical - 4x G.Skill 16GB

 

It's odd that the motherboard would have issues with all 4 slots populated. The test I posted above eventually aborted because of too many errors (10k+). But if that's the issue, I can live with 2x 16GB.


This was the result of all 4 sticks in at once. But when tested 2x at a time, they work great :/

 

Would it still be worth paying for additional testing software like HCI or Karhu, just to make sure that the sticks don't actually have errors when run 2 at a time?

20200525_081023.jpg

47 minutes ago, ryperx said:

Do you have the latest BIOS version installed?

Probably not. I could try to update the BIOS, but I'm realizing that I don't actually need 64GB of RAM. If 2x16 doesn't cause errors, I may honestly stick with that and call it a day. The only reason that I have 64GB is that I came by it for free.

 

I was thinking of using another memory test program to verify 100% that the two sticks ran without issue and that the errors did in fact come from having four sticks in.
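One no-cost cross-check, alongside a dedicated memory tester rather than instead of one, would be a write/read-back hash comparison on the cache drive. The following is a minimal sketch, assuming Python 3 is available on the box; `round_trip_check` and its parameters are names made up here for illustration. It writes random data to a temporary file, reads it back, and counts SHA-256 mismatches.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def round_trip_check(directory, size_mib=64, passes=3):
    """Write random data into `directory`, read it back, return the mismatch count."""
    mismatches = 0
    for _ in range(passes):
        data = os.urandom(size_mib * 1024 * 1024)
        expected = hashlib.sha256(data).hexdigest()
        fd, path = tempfile.mkstemp(dir=directory)
        try:
            with os.fdopen(fd, "wb") as handle:
                handle.write(data)
                handle.flush()
                os.fsync(handle.fileno())  # ask the OS to push the data to the device
            if sha256_of(path) != expected:
                mismatches += 1
        finally:
            os.remove(path)
    return mismatches


if __name__ == "__main__":
    # Uses the system temp directory here; point it at the cache mount to test that drive.
    print("mismatches:", round_trip_check(tempfile.gettempdir(), size_mib=8, passes=2))
```

The read-back is likely to be served from the page cache in RAM rather than from the SSD, so a mismatch here would implicate memory more than the drive. A clean result proves little on its own, but any mismatch with 2 sticks installed would be a strong hint that the problem is not limited to the 4-stick configuration.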

 

One thing to note: the two pairs differed significantly in how long memtest took to complete (by roughly 2+ hours). In theory they are identical RAM, but could a large timing/latency difference between the pairs be contributing to this issue?
