Very unstable read/write speeds during parity sync


Fredrick


Disks look normal, though it's sometimes hard to find a slow disk with diskspeed, even with a lot of samples.

 

20 minutes ago, Fredrick said:

One more bit of information. Several of my drives are very full. Could this be a problem?

 

No, filesystem used and how full it is has no effect whatsoever.

 

Start a new parity check and monitor for a few minutes, making sure nothing is using the disks.

9 minutes ago, johnnie.black said:

No, filesystem used and how full it is has no effect whatsoever.

 

 

9 minutes ago, Benson said:

XFS should not have such a problem, and those problems only matter during filesystem reads/writes, not during parity operations.

 

 

Great news, thanks :)

 

9 minutes ago, johnnie.black said:

Start a new parity check and monitor for a few minutes, making sure nothing is using the disks.

 

Coming right up. I'll start the array but shut down dockers/VMs so it's quiet and let it run for 15-20 minutes to see if there are any odd writes. Just gonna let the current diskspeed test finish.

 

10 minutes ago, Benson said:

You may need further tests to dig out the cause; this means plugging hardware in and out.

 

Suggestions? I've only got one SFF-8088 cable, but have ordered a second one (from China so it's gonna be a few weeks). I've got another HBA I could test with, or I could move my current HBA to another PCIe slot. None of these would explain the poor performance on internal drives on the P410 controller. Shouldn't I be getting SMART and other data from these drives?

 

Also, this is not a test setup; my server is very much in a live environment where it is running services for a couple of users, in addition to hosting my home automation system. Any downtime is unwanted ^^


Just out of interest what Dockers have you got running?

 

Deluge (I know it can be a pain for the array...), Plex, PlexPy, Radarr, Sonarr, NZBHydra, Jackett, NZBGet, LetsEncrypt, Organizr, one Win10 VM. Most of these are idling.

 

-----------

 

Alright, some more data!

 

Diskspeed with array stopped looks very much the same as before:

My cache drive (and the Logical Volume) refuses to test beyond 0 GB. Might be a bug with the script?

[Screenshot: diskspeed results with array stopped]

 

Here is my parity check with docker/VM disabled after 10 minutes. There are writes here and I have no clue what they could be. Note that there are no sync errors corrected, and that there are also reads from cache and writes to the USB boot drive.
[Screenshot: parity check after 10 minutes with docker/VM disabled]

 

I cleared the statistics, and here is the view after another 10 minutes. More writes here, but not to all drives.

[Screenshot: statistics after another 10 minutes]

 

 

 


Also my parity check history:

 

[Screenshot: parity check history]

 

Further underlining the chaotic mess of the performance. Note that it's just the last one, from 28.09.2017, that was with my current configuration with the EXP3000. Previously the drives were attached to my HBA with SATA breakout cables.

 

I'm running the tunables tester now, so I won't have any news for a couple of hours.

1 hour ago, johnnie.black said:

The low number of writes on the first screenshot is normal but on the second one something is still writing to disk1.

 

How can I find out what this was? Does it say in a log?

 

I don't know what it could be.

 

Tunables script running now:

[Screenshot: tunables script running, 2017-09-29 11:03]

40 minutes ago, Fredrick said:

Wouldn't it still give better results than the "universal" tunables you posted?

 

To expand a little more on this: the current script completely ignores the new md_sync_thresh setting, and that setting has a big influence on performance. The "universal" settings should work close to optimal with most configs, except if a SASLP and/or SAS2LP is used.


Fine :) Script cancelled, and I went with your values. 

 

[Screenshot, 2017-09-29 11:59]

 

I tried a new parity check, again without docker and VM running, this time also without any shares. Basically it should just be running the system and plugins. I let it run for 10 minutes, and this is what it looks like:

 

[Screenshots: parity check after 10 minutes, 2017-09-29 12:15 to 12:16]

 

First of all, the maximum speed is higher with the new tunables. I'm very pleased with that! Furthermore the speeds seem to be much more stable and without the dips we saw previously. There is still a small amount of writes going on, would this be from plugins?

 

Next on my agenda is to swap Parity 1 for the unassigned Red, which means doing a full parity sync. I'll probably start this process in a couple of hours or so if there is nothing more to test here. I guess I could start docker/VM/shares again and repeat to make sure. I can't have all my services stopped each time I'm checking parity.

 

I also see the P410 controller in my Proliant is not ideal as it doesn't support JBOD. I could move my SSDs outside to a separate HBA. This would mean I'd get full speeds and SMART info on these drives as well.

5 minutes ago, Fredrick said:

First of all, the maximum speed is higher with the new tunables. I'm very pleased with that! Furthermore the speeds seem to be much more stable and without the dips we saw previously.

 

That looks much better. The slower speed for the first minutes is normal if you have cache dirs installed and started the check right after starting the array. The SAS expander link will be limited to 1100MB/s, so you're practically there.

 

8 minutes ago, Fredrick said:

There is still a small amount of writes going on, would this be from plugins?

 

You can ignore those; a very small amount is normal. As long as it doesn't go into the hundreds or thousands you're OK.
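
For anyone who wants to check this outside the GUI, here is a rough Python sketch that compares per-device write counters between two samples of `/proc/diskstats`. The function names and the threshold of 100 are illustrative assumptions (one reading of "hundreds or thousands"), not anything Unraid provides. It only shows which devices received writes, not which process caused them, and the kernel device names (sdX) still have to be matched to array slots by hand.

```python
import time


def parse_diskstats(text):
    """Map device name -> writes completed, from /proc/diskstats-style text."""
    writes = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or malformed lines
        # Field layout: major, minor, name, reads completed, reads merged,
        # sectors read, ms spent reading, writes completed, ...
        writes[fields[2]] = int(fields[7])
    return writes


def write_deltas(before, after):
    """Writes completed per device between two samples."""
    return {dev: after[dev] - before[dev] for dev in after if dev in before}


def flag_busy(deltas, threshold=100):
    """Devices whose write count grew by at least `threshold` (assumed cutoff)."""
    return sorted(dev for dev, count in deltas.items() if count >= threshold)


def read_diskstats(path="/proc/diskstats"):
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__":
    first = parse_diskstats(read_diskstats())
    time.sleep(600)  # sample again after 10 minutes of parity check
    second = parse_diskstats(read_diskstats())
    deltas = write_deltas(first, second)
    for dev in flag_busy(deltas):
        print(dev, deltas[dev])
```

Run it while the check is in progress; devices it prints nothing for stayed under the assumed threshold.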

 

22 hours ago, johnnie.black said:

That looks much better. The slower speed for the first minutes is normal if you have cache dirs installed and started the check right after starting the array. The SAS expander link will be limited to 1100MB/s, so you're practically there.

 

 

 

Just wanted to update here with my latest parity sync (when removing the Seagate and adding the WD as parity):

 

[Screenshot: latest parity sync, 2017-09-30 10:33]

 

Note that I'm preclearing the Seagate in parallel here, hence the high amount of writes. The drop in writes is when the preclear finished. It's clear that I'm hitting the limits of the SAS expander link for the first half or so. As far as I can tell I would be hitting this limit only during parity checks as the amount of sequential read is not normal for my array in other cases.

 

The array was under load for some of the preclear+sync as well, and for the most part seemed to work well under this stress.

 

As I have two EXP3000s and currently just use one of them, I'm thinking I could move half of my drives to the other one to practically double the available speed. I'd be limited when adding more drives, but that's a problem for tomorrow. Optionally I could move my cache+Unassigned SSD to the other EXP3000 and get S.M.A.R.T. and diagnostics for those instead. Thoughts?
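
To put rough numbers on the splitting idea, here is a back-of-the-envelope Python sketch. It assumes the roughly 1100MB/s link figure quoted earlier is shared evenly by all disks read at once, and that a second enclosure would get its own link of the same speed. The disk count (11) and the 150 MB/s per-disk speed are made-up example values, not measurements from this array.

```python
def per_disk_ceiling(link_mb_s, disks):
    """Even share of the expander link per disk when all disks stream at once."""
    if disks <= 0:
        raise ValueError("disks must be positive")
    return link_mb_s / disks


def parity_speed(link_mb_s, disks, slowest_disk_mb_s):
    """A parity check runs at the lower of the link share and the slowest disk."""
    return min(slowest_disk_mb_s, per_disk_ceiling(link_mb_s, disks))


if __name__ == "__main__":
    LINK_MB_S = 1100   # figure quoted in the thread
    DISK_MB_S = 150    # hypothetical slowest-disk sequential speed

    # All 11 example disks behind one enclosure link.
    print("one enclosure:", parity_speed(LINK_MB_S, 11, DISK_MB_S))

    # Split 6 + 5 across two enclosures, each with its own link (assumption);
    # the check is paced by the more crowded side.
    print("two enclosures:", parity_speed(LINK_MB_S, 6, DISK_MB_S))
```

With these example values the single link is the bottleneck, while after the split the disks themselves become the limit, so the gain from a second enclosure depends on how far the per-disk share sits below what the drives can deliver.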

12 minutes ago, Fredrick said:

As far as I can tell I would be hitting this limit only during parity checks as the amount of sequential read is not normal for my array in other cases.

 

Yes, only during parity check/sync and disk rebuild, or if you use turbo write, but it's only limiting by a small amount, IMO not significantly.
