September 14, 2021

Hi there,

When I've had failing disks before, Unraid would show an error count on the main screen. After the most recent parity check, however, I noticed that the system found and fixed 142 errors. I've attached my diagnostics and SMART reports, but I can't figure out which drive or controller is causing the issue. Could somebody please point me in the right direction?

My system is a Ryzen 2600, MSI B450-A Pro, 24 GB RAM @ 2933, an LSI 9200-8i (= 9211-8i) 6Gbps SAS HBA in IT mode, and an Asus 1050 Ti.

Thanks in advance for your help.

Attachment: Unraid issues.zip
September 14, 2021 · Community Expert

First we need to know whether this was a one-time thing or whether you keep getting sync errors after they were fixed. If that was a correcting check, run another one. Also, you're overclocking your RAM, and that is known to corrupt data with some Ryzen servers; see here for the max officially supported speed for your config.
September 14, 2021 · Author

I had 12 errors two weeks ago but nothing before this. The RAM has been stable in the system since I built it in 2018, but maybe the memory controller has degraded, so I'll downclock it. Is there any way to tell from my logs which disks or files were impacted by the errors?
September 14, 2021 · Community Expert

57 minutes ago, adamreid said: "Is there any way to tell from my logs which disks or files were impacted by the errors?"

The diags you posted don't show any parity sync, but even if they did, it's not possible to know where the errors come from.
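To illustrate why a sync error can't be traced back to a disk, here is a minimal Python sketch. It assumes a single parity disk computed as the XOR of the data disks; the function names and byte values are made up for illustration, and this is not Unraid's actual code. A bit flipped on any data disk, or on the parity disk itself, produces exactly the same mismatch, so the check can only report that a stripe is out of sync.

```python
from functools import reduce
from operator import xor


def parity_of(blocks):
    """XOR of the bytes found at the same offset on every data disk."""
    return reduce(xor, blocks, 0)


def sync_error(data_blocks, parity_block):
    """Return 0 if the stripe is in sync, otherwise the mismatching bits."""
    return parity_of(data_blocks) ^ parity_block


# Hypothetical bytes at one offset on three data disks, plus their parity.
data = [0x3C, 0xA5, 0x0F]
parity = parity_of(data)

flip = 0x10  # a single flipped bit, wherever it happens

# Corrupt each data disk in turn, then the parity disk instead.
mismatches = []
for i in range(len(data)):
    damaged = list(data)
    damaged[i] ^= flip
    mismatches.append(sync_error(damaged, parity))
mismatches.append(sync_error(data, parity ^ flip))

# Every scenario yields the same mismatch, so the culprit is not identifiable.
print([hex(m) for m in mismatches])

# A correcting check simply rewrites parity to match whatever the data disks
# return, so the stripe is "in sync" again even if a data disk was at fault.
damaged = [data[0] ^ flip] + data[1:]
corrected_parity = parity_of(damaged)
print(sync_error(damaged, corrected_parity))
```

The last two lines also show why a correcting check can hide a problem: parity ends up consistent with the damaged data, and the error count on the next check drops to zero without the data having been repaired.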
September 14, 2021 · Author

8 minutes ago, JorgeB said: "The diags you posted don't show any parity sync, but even if they did it's not possible to know where the errors come from."

Sorry for being a noob, I just followed the guide to download diagnostics. Could you please let me know where I could find these logs?
September 14, 2021 · Author

Ughhhh, I had to turn the system off (gracefully) last night because one of the breakers in our house tripped, so I shut everything down before resetting the RCD. I didn't see the results of my parity check until this afternoon. I'll watch for this happening again and then check the syslog. Thank you.
September 14, 2021 · Community Expert

7 hours ago, adamreid said: "found and fixed 142 errors"

Why were you running a correcting parity check? You should run non-correcting parity checks until you determine you have sync errors that need to be corrected. You don't want to discover that you have another disk obviously causing problems and corrupting parity because you always run correcting checks.
September 14, 2021 · Author

1 hour ago, trurl said: "Why were you running a correcting parity check? You should run non-correcting parity checks until you determine you have sync errors that need to be corrected. You don't want to discover that you have another disk obviously causing problems and corrupting parity because you always run correcting checks."

You know what, I have no idea. I must've set 'Write corrections to parity disk' to yes without thinking about it years ago. Should I set this to no?
September 17, 2021 · Author

OK, so I set it to no and re-ran the check. My parity1 drive is now showing 493 errors, and Unraid reports the following:

Last check completed on Fri 17 Sep 2021 04:11:25 PM AEST (today)
Finding 5789 errors
Duration: 19 hours, 7 minutes, 57 seconds. Average speed: 87.1 MB/sec

Attachment: unraid-diagnostics-20210917-1615.zip

Could this be from a failing SAS-to-SATA cable or my HBA failing? I suppose it could be the disk, but it seems random that I've had two fail like this in the last month. At one stage during the check it also slowed down to single-digit MB/sec for a while. How should I proceed from here?

Edit: I'm a fucking idiot and probably didn't un-tick write corrections to parity when I ran the check, I just un-ticked it in the scheduler.

Edited September 17, 2021 by adamreid
September 17, 2021 · Community Expert

RAM is still overclocked. After fixing that, run a couple of correcting checks without rebooting and post new diags.

On 9/14/2021 at 9:15 AM, JorgeB said: "you're overclocking your RAM and that is known to corrupt data with some Ryzen servers, see here for max officially supported speed for your config."