
(SOLVED) Irregular speed during parity check, sync or disk rebuild


JorgeB


In the last few months I’ve noticed that on some of my servers parity checks, syncs and disk replacements sometimes run at a very irregular speed. This happens on at least 3 different servers, so it can’t be a bad board or controller card.

 

There’s nothing in the log when the speed changes. I tried blowing a fan on the SAS card in case it was overheating, with no change. It happens with 6.0.1 and the latest 6.1 RCs, but not always and to varying degrees.

 

Below are screenshots of a disk replacement and a parity sync, one done almost always at the minimum speed and the other fluctuating many times between normal and slow.

 

It’s frustrating because it’s not always reproducible, but when it happens it can double the duration of the operation. Has anyone seen anything similar, or does anyone have any ideas?

1.png.d7456fba25ed643061f9a01773c312f7.png

2.png.202389123bdb4a044d92a321dc0bfae7.png

3.png.ed5b8831beec8dac32fea61871a9854a.png

4.png.e37384050cd65e7db4639b5111b75b70.png

Link to comment

I believe I found what is causing this issue. I was replacing the slower drives in my main servers, and as soon as I had replaced all the Samsung HD203WI drives (older HDDs with 500GB platters) the problem disappeared. I then installed 4 of these drives in an HP Gen 7 Microserver to use as a backup server, and it now exhibits the same problem. This is what makes me believe they are the issue: my Microservers always ran parity checks at a constant speed, but I had never used these disks in them, and the disks were in all 3 servers that were giving me problems.

 

Although the disks are slower than current models they still work normally in read / write operations to the array, the problem is only during parity checks / syncs and disk rebuilds and not always the same, sometimes slower than others.

I don’t remember having this issue in Unraid V5.

 

These are older drives and probably little used in the community, but if someone still uses them and has the same problem, try replacing them. The same could be true for the 3-platter cousin, the HD153WI.

t6_before.png.0a3d671ad66438392e4558bc3c7389fc.png

t6_after.png.b463a1be8b1caeaa6abe8c10ad98d7b6.png

Microserver.png.4820e821d9cef35df5552278803254b9.png

Link to comment
  • 2 weeks later...
  • 6 months later...

Update:

 

At the time of testing, since the issue was present with very different controllers, from Intel and AMD, I incorrectly assumed that it was a general problem. Then I read a post by BobPhoenix describing a similar issue with some Hitachi disks (although in those cases only the parity sync is affected). I noticed that the issue only occurred when using the onboard controller in AHCI mode (AMD in Bob’s case, Intel in the OP), and that for Bob the disks performed normally when used on an HBA. So after doing some testing, these are the results:

 

The following controllers have the issue with the Samsungs:

 

Intel Onboard (AHCI mode)

AMD/ATI Onboard (AHCI mode)

ASMedia 1061

Marvell 9230

 

In these they work at normal speed:

 

Adaptec 1430SA

SASLP

SAS2LP

Dell H310

 

So what do the problem controllers have in common?

 

Kernel driver in use: ahci

 

I won’t assume this is a general rule, but I strongly suspect that with v6 (they work fine with v5) any controller in AHCI mode will have the issue with these disks. I remember at least 2 or 3 cases of users here with the same HDD model and issue, so this will hopefully help them and others in the future.
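
For anyone who wants to check their own controllers: the "Kernel driver in use" line comes from `lspci -k`. Below is a minimal Python sketch that parses that output and lists the devices bound to the ahci driver. The function names and the sample text are made up for illustration (the sample is not output from any of my servers), and it assumes the usual `lspci -k` layout of an unindented device line followed by indented detail lines.

```python
import subprocess

# Illustrative sample only (hypothetical devices), shaped like `lspci -k` output.
SAMPLE = """\
00:1f.2 SATA controller: Example Vendor Onboard SATA Controller [AHCI mode]
\tSubsystem: Example Vendor Board
\tKernel driver in use: ahci
\tKernel modules: ahci
01:00.0 Serial Attached SCSI controller: Example Vendor SAS HBA
\tSubsystem: Example Vendor Card
\tKernel driver in use: mpt2sas
02:00.0 Ethernet controller: Example Vendor NIC
\tSubsystem: Example Vendor Board
"""


def controllers_by_driver(lspci_text):
    """Map each device line of `lspci -k` output to its kernel driver.

    Devices without a "Kernel driver in use" line are left out.
    """
    drivers = {}
    current = None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # Unindented line: start of a new device entry.
            current = line.strip()
        elif current and "Kernel driver in use:" in line:
            drivers[current] = line.split(":", 1)[1].strip()
    return drivers


def ahci_controllers(lspci_text):
    """Return the device lines whose kernel driver is exactly 'ahci'."""
    return [dev for dev, drv in controllers_by_driver(lspci_text).items()
            if drv == "ahci"]


def read_lspci():
    """Run `lspci -k` on the local machine and return its text output."""
    return subprocess.run(["lspci", "-k"], capture_output=True,
                          text=True, check=True).stdout
```

On a real server, `ahci_controllers(read_lspci())` would list the controllers to avoid for these disks, if my guess about the AHCI driver is right.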

 

The screenshots below show what 10 minutes of a parity sync looks like with and without one or more of these disks connected to an AHCI controller.

AHCI.png.f6c3a85764829a679d41f9f153fd70ff.png

nonAHCI.png.a56152670d581212679b5be8f35b4877.png

Link to comment

VERY interesting results.  Especially given that it's not limited to just the HD203WI drives, but can also happen with some Hitachi models.    I suppose it simply confirms that "standards" don't always guarantee conformity  :)

 

Hopefully this won't be an issue with any of the newer 1TB/platter (and denser) drives that most of us are now using in our servers ... but there's certainly no guarantee of that.    I've seen several similar strange issues over the decades ... many are just chalked up to "life's little mysteries" -- although it's always nice to ultimately find out exactly what the cause was.

 

Link to comment

I am using two AOC-SAT2-MV8 (Marvell 88SX6081) plus the onboard SATA ports on the Supermicro X7SBE motherboard.

 

I believe all of these are running in AHCI mode.  Is my only option to get new controllers?  I don't believe there is an HBA mode for the motherboard or the AOC-SAT2-MV8 controllers.

Link to comment

I will look into moving the Samsung drives to the AOC-SAT2-MV8 controllers.

 

Since my Norco 4224 has hot swap bays, can I just swap the drives' positions, or do I also need to reconfigure unRAID?  Does unRAID care which controller and ports the drives are connected to?

Link to comment
  • 2 weeks later...

Big thanks to Johnnie.Black here.  I simply moved my Samsung drive to my Adaptec 1430SA controller card and this appears to have resulted in a much faster parity check speed (see attached screenshot).  I can't say for sure whether this is a reasonable speed for a 4 TB parity drive with data drives of 1 - 2 TB each, but at least it is a significant improvement.

Snap_2016-04-04_at_09_15_01.jpg.70991fd238d238cdeae494e02e46c99a.jpg

Link to comment
  • 3 months later...

Hello again johnnie.black

Since we last talked, I've added the Samsung drive to my array, and am checking parity for the first time today. As you predicted, the parity check is chugging along at a woeful 25MB/s - no doubt due to the Samsung drive (the last time I checked parity it completed with an average speed of ~152MB/s - although this was before I enabled data caching in BIOS).

Sitting next to my server is a Dell Perc H310 controller that I will soon install and flash to IT mode. It was going to be for my SSD cache (currently only 1x disk, but was hoping to expand to 3x), but I would consider moving the HDD cage to that card if it would fix this problem. Only, I don't know if the card in IT mode is just the same as the motherboard 'card' in AHCI mode, and therefore will cause the same slow parity builds? Any idea?

Thanks for your help.

Link to comment

It will work normally on the H310. The problem exists only when the controller uses the AHCI driver, and the H310 uses the mpt2sas driver.
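
To see which driver sits behind a particular disk rather than a whole controller, one option is to follow the disk's sysfs entry up to its PCI device. This is only a rough sketch: the function name `pci_driver_for_disk` is hypothetical, and it assumes the usual Linux sysfs layout where `/sys/block/<disk>` is a symlink into `/sys/devices/...` and the PCI device directory has `subsystem` and `driver` symlinks.

```python
import os


def pci_driver_for_disk(disk, sysfs_root="/sys"):
    """Return the kernel driver of the PCI controller behind a block device.

    `disk` is a name such as "sdb". Returns None if no PCI ancestor with a
    bound driver is found (for example, if the disk does not exist).
    """
    stop = os.path.realpath(sysfs_root)
    path = os.path.realpath(os.path.join(sysfs_root, "block", disk))
    # Walk up the device tree until we reach the sysfs root.
    while path != stop and path != os.path.dirname(path):
        subsystem = os.path.join(path, "subsystem")
        driver = os.path.join(path, "driver")
        # Skip SCSI/ATA levels; only a device on the PCI bus counts.
        if (os.path.islink(subsystem) and os.path.islink(driver)
                and os.path.basename(os.path.realpath(subsystem)) == "pci"):
            return os.path.basename(os.path.realpath(driver))
        path = os.path.dirname(path)
    return None
```

For example, `pci_driver_for_disk("sdb")` should return `ahci` for a disk on an onboard port in AHCI mode, and the HBA's driver name for a disk attached to a card like the H310.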

 

Good to know. I've just flashed my Dell H310 to IT mode, and have plugged in the MicroServer's Drive cage (all my HDDs), as well as a MiniSAS > 4x SATA adapter (for my SSD cache drive).

About to boot now, so will see how speeds are affected.

Thanks again for your help and info.

 

EDIT: Just started a parity check, and it is running at 103MB/s :D

Link to comment
