Sync errors corrected / read error



Hello, I hope someone can assist and point me in the right direction.

 

To start with: mover took 24 hours to move 50GB, which is normally done within an hour.

 

A parity check started the other day.

So far it has taken 1 day to get to 98.4%, and the estimate says it will take longer still to complete.

A parity check normally only takes 10 to 12 hours.

I noticed the estimated speed drops into the KB/s range for long periods; normally it is steady in the MB/s range.

I also noticed it has over 600 sync errors corrected.

Screenshot below.

 

I have scanned the logs and found the following:

Jan 14 16:36:25 Server kernel: md: disk5 read error, sector=1729901760

It is only with disk 5. Is this a disk failure or maybe a cable issue?
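
To check whether read errors really are limited to one disk, the syslog can be tallied per disk. This is only a rough sketch: the pattern is modelled on the single log line quoted above, and other Unraid versions may word the message differently. `count_read_errors` is a made-up helper name, not part of Unraid.

```python
import re
from collections import Counter

# Pattern based on the one syslog line quoted in this post (assumption:
# every md read error is logged in this same format).
READ_ERROR = re.compile(r"md: disk(\d+) read error, sector=(\d+)")


def count_read_errors(lines):
    """Return a Counter mapping disk number -> number of read-error lines."""
    counts = Counter()
    for line in lines:
        match = READ_ERROR.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


sample = [
    "Jan 14 16:36:25 Server kernel: md: disk5 read error, sector=1729901760",
    "Jan 14 10:03:01 Server emhttpd: unclean shutdown detected",
]
print(count_read_errors(sample))
```

With the two sample lines this reports one read error, on disk 5. To run it over a real log, pass an open file object in place of `sample`.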

 

 

 

Can someone please review the logs and share their thoughts? Logs attached.

 

 

Capture.PNG

server-syslog-20210115-1023.zip

 

 

Just checked Grafana, and it shows the slowdown started around 1am GMT. Image below.

 

image.thumb.png.0c8d7559deac59c64212f915722a295e.png

Edited by LoyalScotsman
Link to comment

Disk5 appears to be failing, so you should run an extended SMART test. The sync errors are likely the result of this:

 

Jan 14 10:03:01 Server emhttpd: unclean shutdown detected

 

As for the speed, it could be disk related, but not disk5 since the check is already past it. Parity is also showing some issues, and there also appears to be something else accessing the array, so stop all that first.

Link to comment

Thank you for reviewing.

Issues on parity like what?

I am aware the parity disk has 1 "UDMA CRC error count" error, but this has been there for a while now, and this disk is an older disk.

I have just stopped all dockers. Is there anything you want me to do, or should I just wait and see if there is any change?

I am aware that my auto downloader has picked up something, but this will be in my cache and has not been moved into the array yet.

I can also see that it says mover is running, but if I am right, mover shouldn't kick in during a parity check?

image.png.b6d26d0e043da7acf5ad17c58349e1be.png

 

 

Link to comment
You also mentioned the check has passed disk 5. Are you able to work out which disk the parity check was on?

Link to comment
10 minutes ago, LoyalScotsman said:

issues on parity like what ?

Ideally these should be 0 on a healthy WD drive:

 

ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    65
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    2

There are also UNC @ LBA errors in the report. Keep monitoring those attributes; if they keep increasing, there will likely be more errors soon.
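
One way to keep an eye on those attributes is to save the SMART attribute table every so often and compare raw values between snapshots. The sketch below assumes the table layout shown above (raw value in the eighth column). `parse_raw_values` and `increased` are made-up helper names, and the second snapshot is invented purely to exercise the comparison.

```python
def parse_raw_values(report):
    """Map attribute ID -> (name, raw value) from a smartctl-style table."""
    values = {}
    for line in report.splitlines():
        fields = line.split()
        # Data rows start with a numeric ID; the header row does not.
        if len(fields) >= 8 and fields[0].isdigit() and fields[7].isdigit():
            values[int(fields[0])] = (fields[1], int(fields[7]))
    return values


def increased(before, after, watch=(1, 200)):
    """Names of watched attributes whose raw value grew between snapshots."""
    return [
        after[attr_id][0]
        for attr_id in watch
        if attr_id in before and attr_id in after
        and after[attr_id][1] > before[attr_id][1]
    ]


# Snapshot copied from the parity disk report above.
first = """\
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    65
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    2
"""
# Hypothetical later snapshot, invented only to exercise the comparison.
second = first.replace("-    65", "-    70")

print(increased(parse_raw_values(first), parse_raw_values(second)))
```

Here only Raw_Read_Error_Rate is reported, because it is the only value changed in the made-up second snapshot. Watching attributes 1 and 200 follows the WD advice above and would not be meaningful for drives that report these values differently.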

 

 

13 minutes ago, LoyalScotsman said:

and see if there is any change ?

Yes.

 

13 minutes ago, LoyalScotsman said:

if I am right, mover shouldn't kick in during a parity check?

AFAIK it will.

 

8 minutes ago, LoyalScotsman said:

you also mentioned disk 5 has passed it, are you able to work out which disk the parity check was on?

It's past it because disk5 is only 1TB, so the check stops reading it at the 1TB mark.
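
The same reasoning as rough arithmetic, in case it helps. Assumptions: the sector number in the syslog line counts 512-byte sectors, and the parity check spans 4 TB (the thread never states the parity size; 4 TB is just the largest disk mentioned). The helper names and the `disk_4tb` label are made up for illustration.

```python
SECTOR_BYTES = 512  # assumption: the kernel log counts 512-byte sectors


def sector_to_gb(sector):
    """Convert a sector number to an offset in decimal gigabytes."""
    return sector * SECTOR_BYTES / 1e9


def disks_still_checked(position_bytes, disk_sizes):
    """Names of disks large enough to still be read at this position."""
    return [name for name, size in disk_sizes.items() if size > position_bytes]


# Sector from the read error in the first post: inside the 1 TB disk5.
print(round(sector_to_gb(1729901760), 1))  # 885.7

# 98.4% of an assumed 4 TB parity check is well beyond the 1 TB mark.
position = 3_936_000_000_000
sizes = {"disk5": 1_000_000_000_000, "disk_4tb": 4_000_000_000_000}
print(disks_still_checked(position, sizes))
```

So the read error at roughly 885.7 GB happened while disk5 was still part of the check, but at 98.4% only the larger disks are being read, which is why the late slowdown cannot come from disk5.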

 

Link to comment

I'm not fully sure how the quoting works, so I will just type this as a comment.

After stopping all dockers, the parity check has completed.

Mover is nearly done as well now that the parity check is complete. Good to know it kicks in during a parity check.

The most recent change I made to Unraid was setting up Grafana, so I am not sure if this is what caused the read/write delays. It had no issues before this.

I looked for the errors you mentioned, and I will monitor them for the next few days, then move to weekly checks and so on.

I also checked the other drives for the same attributes you listed above and found the following.

For the following disks:

 

ST4000DM004-2CV104_ZFN2TLLK - 4 TB - not a very old disk; a friend gave it to me because the array was low on space. It has the following:

 

ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR--   083   064   006    -    214020306
  7 Seek_Error_Rate         POSR--   077   060   045    -    49259391
  9 Power_On_Hours          -O--CK   098   098   000    -    1982 (165 171 0)


There is no Multi_Zone_Error_Rate attribute on this disk.

 

WDC_WD10JPVX-22JC3T0_WD-WX31E73AWSD2 - 1 TB

ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR--   198   198   051    -    33639
  7 Seek_Error_Rate         -OSR-K   200   200   000    -    0
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    30

 

I was planning on picking up 3x WD Red disks in March, but going by the above errors on those disks, should I just get them replaced sooner?

Oh, that makes sense why it had passed the 1TB disk :)

 

 

Link to comment
38 minutes ago, JorgeB said:

Seagate drives report this differently, I mentioned it should be 0 only on WD drives.

 

Run an extended SMART test on disk5, but it's likely past its prime.

Ahh OK, so the Seagate disk is OK for now. EDIT: I missed your hyperlink. It's OK, the Seagate disk has no errors :)

I will start it now and report back with the results.
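
For anyone wondering why those large Seagate numbers are not errors: a commonly cited interpretation (community knowledge, not an official Seagate specification as far as I know) is that the raw value of Raw_Read_Error_Rate and Seek_Error_Rate is a 48-bit number, with the error count in the upper 16 bits and the operation count in the lower 32 bits. A small sketch under that assumption, using the raw values posted above (`split_seagate_raw` is a made-up helper name):

```python
def split_seagate_raw(raw):
    """Split a 48-bit raw value into (error_count, operation_count).

    Assumes the commonly described Seagate layout: upper 16 bits are
    errors, lower 32 bits are the number of operations.
    """
    errors = raw >> 32
    operations = raw & 0xFFFFFFFF
    return errors, operations


# Raw values copied from the ST4000DM004 report earlier in the thread.
for name, raw in [("Raw_Read_Error_Rate", 214020306),
                  ("Seek_Error_Rate", 49259391)]:
    errors, operations = split_seagate_raw(raw)
    print(name, "errors:", errors, "operations:", operations)
```

Both raw values are below 2^32, so under this interpretation the error part is 0 for both attributes, which matches the Seagate disk having no errors.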

Edited by LoyalScotsman
Link to comment

A small update.

I ran the extended SMART test and it failed at 90%, saying the disk needed to spin up even though it was already spun up. So I decided to run a disk diagnostics test on disk 5 and got the results below. Going by the stats it is not looking good: a huge drop, then a rise, then a drop again, which is very worrying, so I have started pulling everything off that drive.

I have rerun the extended test, but to be honest it is not looking good.

 

image.thumb.png.ce1ef2a00fb7edcefd0f73d80e4bb431.png

Link to comment
Hello

Thanks for your help diagnosing the suspected drive. I decided to pull all the data off it, and that finished around an hour ago. Thankfully so, because I just got an email from Unraid saying the disk is not OK, and after further investigation in the archived notifications it turns out the disk is now flagging 147 errors.

I have purged the drive from my Unraid server and shrunk the array for now, but 2 new 4TB WD Red disks have been ordered.

Also, FYI, the SMART test still hasn't completed and has been stuck at 90% for a good few hours.

Image below: ce61843eac77aeffb6296b2263a47fbf.jpg


Link to comment
