
Painfully slow parity check - dying drive?


FreeMan
Solved by JorgeB


My parity checks have never been particularly fast on my machine, usually running 22-24 hours. The check that kicked off on 1 Nov took nearly 59 hours!

 

My presumption is that I have a failing drive. However, after running extended SMART checks on all drives, they've all returned without errors (even though 2 of the disks took > 24 hours to run).
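(For anyone curious, the extended tests can be kicked off and reviewed from the console as well as from each disk's settings page in the GUI. A minimal sketch, with the device name as an example only:)

smartctl -t long /dev/sdb      # start an extended (long) self-test on one drive (example device)
smartctl -a /dev/sdb           # later: review the self-test log, attributes, and overall health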

 

I've attached diagnostics. Does anyone see anything there that looks suspect? What else should I be checking to identify why things have suddenly slowed down so much? Other than the update to 6.11.1 (the 6.11.2 update was pending the completion of the SMART checks, though I'm holding off for now), I haven't made any config changes. I have CA doing automatic updates of dockers/plugins, but I haven't added or removed any dockers, plugins, or VMs, or done anything other than read & write data to the array.

 

nas-diagnostics-20221106-1306.zip


This is my most recent disk speed test from "a while ago" (prior to the addition of the 9th data drive):

 

[screenshot: disk speed test before the 9th drive was added]

 

And here is one I just ran this morning (including my new 9th drive):

[screenshot: disk speed test including the new 9th drive]

 

The only difference I see is an extra drive making the speed dip from ~4TB to ~5.5TB, and that seems to be attributable to the addition of the new drive. Those 3 drives are all Seagate IronWolfs, so similar behavior is to be expected.

4 weeks later...

Well, the next monthly parity check has rolled around and it's currently running at about 37MB/sec.

 

As requested, here are diagnostics during the run.

 

Also of note, I received a warning:

 


Unraid Disk 7 SMART health [5]:

Warning [NAS] - reallocated sector ct is 8

ST8000NM005-1RM112_ZA1FS9VW (sdn)

Disk 7 is, at least, still under warranty, but it's not one of the drives I suspected of failing because of the very slow extended SMART tests earlier. Sigh...

 

Should have kept my eyes open during Black Friday/Cyber Monday sales for good deals on hard drives. Forgot all about it. :(

nas-diagnostics-20221201-0642.zip


There's something else reading from disks 1 and 5, looks like it's cache dirs:

 

root      7894  0.1  0.0   5580  3684 ?        S    Nov21  27:16 /bin/bash /usr/local/emhttp/plugins/dynamix.cache.dirs/scripts/cache_dirs -e appdata -l on
root      2905  0.0  0.0   4556  2848 ?        S    06:42   0:00  \_ /bin/bash /usr/local/emhttp/plugins/dynamix.cache.dirs/scripts/cache_dirs -e appdata -l on
root      2915  0.0  0.0   2584   908 ?        S    06:42   0:00  |   \_ /bin/timeout 30 find /mnt/disk1/Audio -noleaf
root      2924  0.6  0.0   3872  2472 ?        D    06:42   0:00  |       \_ find /mnt/disk1/Audio -noleaf
root      2911  0.0  0.0   4556  2912 ?        S    06:42   0:00  \_ /bin/bash /usr/local/emhttp/plugins/dynamix.cache.dirs/scripts/cache_dirs -e appdata -l on
root      3101  0.0  0.0   2584   872 ?        S    06:42   0:00      \_ /bin/timeout 30 find /mnt/disk5/Photos -noleaf
root      3102  3.2  0.0   3940  2524 ?        D    06:42   0:00          \_ find /mnt/disk5/Photos -noleaf
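(If anyone wants to check for the same thing on their own server, a listing like this can be pulled from the console; the grep pattern and disk path below are just examples:)

ps auxf | grep -A3 "[c]ache_dirs"        # show cache_dirs and the find processes it spawns
lsof +D /mnt/disk1 2>/dev/null | head    # alternative: list open files on one array disk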

 


Interesting. I've been running cache dirs for ages (since v5.x, probably earlier than that) and it's never seemed to have an impact on parity speeds like this. I may have to go have a look and see if it's been updated recently, and whether that might be causing issues. (I run CA Auto Update and get notifications of updates, but I don't recall every one of them.) Nope: the last update to cache dirs was August 2020, so it's been this way for more than 2 years now.

 

I wonder if this is also what's been preventing drives from spinning down for the last several weeks. I've noticed that every time I look at the dashboard (and in every one of my 4x/day array status notifications), all the drives are spinning. I did manage to manually spin them down and keep them down for a little bit when I tried yesterday, but they all went back to spinning within a few minutes.

 

Is the reallocated sector count something to be concerned about or to just keep an eye on to make sure it's not increasing too rapidly?
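(For keeping an eye on it, something along these lines re-reads the relevant attributes from the console; the device name is taken from the warning above:)

smartctl -A /dev/sdn | grep -iE 'reallocated|pending|uncorrect'   # watch attributes 5, 197, 198 over time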

42 minutes ago, JorgeB said:

First thing to do is to stop anything else accessing the array and see the difference it makes.

Well, I shut down all my dockers and gee... now I'm running along at nearly 100MB/s.

 

Guess I have to rummage about and figure out why they're all suddenly creating so much disk access. I never used to have to shut down dockers to get decent parity check speed, so something's changed.
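(For anyone else trying this, the containers don't have to be stopped one at a time in the GUI. A rough sketch from the console, assuming you're happy to restart them by hand afterwards:)

docker ps --format '{{.Names}}'   # note which containers are currently running
docker stop $(docker ps -q)       # stop them all for the duration of the parity check
# when the check finishes, start them again from the Docker tab or with: docker start <name>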

 

Thanks for pointing out the obvious. And I do still need to replace at least the one drive with reallocated sectors. I think it's still under warranty, so that should be a simple one.


Interesting.

 

Telegraph seems to cause a cyclical dip in read speeds, from ~90MB/s down to ~60MB/s, down to ~20MB/s, then back up to ~90MB/s.

Jellyfin just causes the drives to thrash and read speeds fall to the floor.
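(If it helps, per-process disk reads during a check can be watched with iotop, assuming it's installed, e.g. via the NerdTools plugin; a rough sketch:)

iotop -oPa    # -o only processes doing I/O, -P per process rather than per thread, -a accumulated totals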

 

I've been running Telegraph for a few years and never saw any significant slowdown because of it, but then again, maybe I just wasn't paying attention...

 

Seems a pain to have to shut down dockers in order to get through the parity check in a reasonable amount of time, but these two aren't critical, so I can live w/o 'em for 24 hours.

