
Painfully slow parity check - dying drive?


FreeMan
Solved by JorgeB


My parity checks have never been particularly fast on my machine, usually running 22-24 hours. The check that kicked off on 1 Nov took nearly 59 hours!

 

My presumption is that I have a failing drive. However, after running extended SMART checks on all drives, they've all returned without errors (even though 2 of the disks took > 24 hours to run).
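(For anyone curious, the extended tests can be kicked off and reviewed from the console as well as from each disk's settings page in the GUI. A minimal sketch, with the device name as an example only:)

smartctl -t long /dev/sdb      # start an extended (long) self-test on one drive (example device)
smartctl -a /dev/sdb           # later: review the self-test log, attributes, and overall health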

 

I've attached diagnostics. Does anyone see anything there that looks suspect? What else should I be checking to identify why things have suddenly slowed down so much? Other than the update to 6.11.1 (the 6.11.2 update was pending the completion of the SMART checks, though I'm holding off for now), I haven't made any config changes. I have CA doing automatic updates of dockers/plugins, but I haven't added or removed any dockers, plugins, or VMs, or done anything other than read & write data to the array.

 

nas-diagnostics-20221106-1306.zip


This is my most recent disk speed test from "a while ago" (prior to the addition of the 9th data drive):

 

[screenshot: disk speed test before the 9th drive was added]

 

And here is one I just ran this morning (including my new 9th drive):

[screenshot: disk speed test including the new 9th drive]

 

The only difference I see is an extra drive making the speed dip from ~4TB to ~5.5TB, and that seems to be attributable to the addition of the new drive. Those 3 drives are all Seagate IronWolfs, so similar behavior is to be expected.

4 weeks later...

Well, the next monthly parity check has rolled around and it's currently running at about 37MB/sec.

 

As requested, here are diagnostics during the run.

 

Also of note, I received a warning:

 


Unraid Disk 7 SMART health [5]:

Warning [NAS] - reallocated sector ct is 8

ST8000NM005-1RM112_ZA1FS9VW (sdn)

Disk 7 is, at least, still under warranty, but it's not one of the drives I suspected of failing because of the very slow extended SMART tests earlier. Sigh...

 

Should have kept my eyes open during Black Friday/Cyber Monday sales for good deals on hard drives. Forgot all about it. :(

nas-diagnostics-20221201-0642.zip


There's something else reading from disks 1 and 5, looks like it's cache dirs:

 

root      7894  0.1  0.0   5580  3684 ?        S    Nov21  27:16 /bin/bash /usr/local/emhttp/plugins/dynamix.cache.dirs/scripts/cache_dirs -e appdata -l on
root      2905  0.0  0.0   4556  2848 ?        S    06:42   0:00  \_ /bin/bash /usr/local/emhttp/plugins/dynamix.cache.dirs/scripts/cache_dirs -e appdata -l on
root      2915  0.0  0.0   2584   908 ?        S    06:42   0:00  |   \_ /bin/timeout 30 find /mnt/disk1/Audio -noleaf
root      2924  0.6  0.0   3872  2472 ?        D    06:42   0:00  |       \_ find /mnt/disk1/Audio -noleaf
root      2911  0.0  0.0   4556  2912 ?        S    06:42   0:00  \_ /bin/bash /usr/local/emhttp/plugins/dynamix.cache.dirs/scripts/cache_dirs -e appdata -l on
root      3101  0.0  0.0   2584   872 ?        S    06:42   0:00      \_ /bin/timeout 30 find /mnt/disk5/Photos -noleaf
root      3102  3.2  0.0   3940  2524 ?        D    06:42   0:00          \_ find /mnt/disk5/Photos -noleaf
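(If anyone wants to check for the same thing on their own server, a listing like this can be pulled from the console; the grep pattern and disk path below are just examples:)

ps auxf | grep -A3 "[c]ache_dirs"        # show cache_dirs and the find processes it spawns
lsof +D /mnt/disk1 2>/dev/null | head    # alternative: list open files on one array disk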

 


Interesting. I've been running cache dirs for ages (since v5.x, probably earlier than that) and it's never seemed to have an impact on parity speeds like this. I may have to go have a look and see if it's been updated recently, and whether that might be causing issues. (I run CA Auto Update and get notifications of updates, but I don't recall every one of them.) Nope: the last update to cache dirs was August 2020, so it's been this way for more than 2 years now.

 

I wonder if this is also what's been preventing drives from spinning down for the last several weeks. I've noticed that every time I look at the dashboard (and in every one of my 4x/day array status notifications), all the drives are spinning. I did manage to manually spin them down and keep them down for a little bit when I tried yesterday, but they all went back to spinning within a few minutes.

 

Is the reallocated sector count something to be concerned about or to just keep an eye on to make sure it's not increasing too rapidly?
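(For keeping an eye on it, something along these lines re-reads the relevant attributes from the console; the device name is taken from the warning above:)

smartctl -A /dev/sdn | grep -iE 'reallocated|pending|uncorrect'   # watch attributes 5, 197, 198 over time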

42 minutes ago, JorgeB said:

First thing to do is to stop anything else accessing the array and see the difference it makes.

Well, I shut down all my dockers and gee... now I'm running along at nearly 100MB/s.

 

Guess I have to rummage about and figure out why they're all suddenly creating so much disk access. I never used to have to shut down dockers to get decent parity check speed, so something's changed.
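(For anyone else trying this, the containers don't have to be stopped one at a time in the GUI. A rough sketch from the console, assuming you're happy to restart them by hand afterwards:)

docker ps --format '{{.Names}}'   # note which containers are currently running
docker stop $(docker ps -q)       # stop them all for the duration of the parity check
# when the check finishes, start them again from the Docker tab or with: docker start <name>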

 

Thanks for pointing out the obvious. And I do still need to replace at least the one drive with reallocated sectors. I think it's still under warranty, so that should be a simple one.


Interesting.

 

Telegraph seems to cause a cyclical dip in read speeds, from ~90MB/s down to ~60MB/s, down to ~20MB/s, then back up to ~90MB/s.

Jellyfin just causes the drives to thrash and read speeds fall to the floor.
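(If it helps, per-process disk reads during a check can be watched with iotop, assuming it's installed, e.g. via the NerdTools plugin; a rough sketch:)

iotop -oPa    # -o only processes doing I/O, -P per process rather than per thread, -a accumulated totals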

 

I've been running Telegraph for a few years and never saw any significant slowdown because of it, but then again, maybe I just wasn't paying attention...

 

Seems a pain to have to shut down dockers in order to get through the parity check in a reasonable amount of time, but these two aren't critical, so I can live w/o 'em for 24 hours.

