Parity stuck at 3.0 Gb/s - Rebuild for 2TB to 4TB replacement is super slow


DieFalse


I recently updated to the current RC, 6.7.0-rc3, and noticed during tonight's rebuild (upgrading a 2TB drive to 4TB) that it is running at half its normal speed, and even slower at times (right now 17.3 MB/s, ouch). So I checked all the drives, and the parity drive is somehow stuck at 3.0 Gb/s.

 

root@NAS:~# smartctl -x /dev/sdl | grep SATA
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
0x11       GPL     R/O      1  SATA Phy Event Counters log
SATA Phy Event Counters (GP Log 0x11)

 

Enabled features:

       *    Gen1 signaling speed (1.5Gb/s)
       *    Gen2 signaling speed (3.0Gb/s)
       *    Gen3 signaling speed (6.0Gb/s)
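To check several drives for a downgraded link without reading each report by eye, the "SATA Version is:" line can be parsed. This is only a minimal sketch: it assumes the line format shown in the output above, and the function name `parse_sata_speeds` is made up for illustration.

```python
import re

# Sample line copied from the smartctl output above.
line = "SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)"

def parse_sata_speeds(text):
    """Return (max_gbps, current_gbps) from a 'SATA Version is:' line, or None."""
    m = re.search(
        r"SATA Version is:.*?([\d.]+) Gb/s \(current: ([\d.]+) Gb/s\)", text
    )
    if m is None:
        return None
    return float(m.group(1)), float(m.group(2))

max_gbps, current_gbps = parse_sata_speeds(line)
# A link is "downgraded" when it negotiated below what the drive supports.
downgraded = current_gbps < max_gbps
print(max_gbps, current_gbps, downgraded)
```

For the parity drive above this reports a 6.0 Gb/s capable drive currently linked at 3.0 Gb/s.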

 

Any ideas on this one? All cables are secure, and on the last release before upgrading to 6.7.0-rc3 it was running at 6.0 Gb/s.

Edited by fmp4m

Three other things to note: 

 

First, I did replace Drive 14, going from 2TB to 4TB, and it is rebuilding that drive right now; it is not a parity check. My mis-wording. The only other change was the RC update.

 

Second, diags are attached in case you would like them.

nas-diagnostics-20190214-2335.zip

 

Third,  last parity info:

Date                   Duration                Speed        Status   Errors
2019-02-12, 05:33:52   11 hr, 19 min, 13 sec   196.3 MB/s   OK       0
2019-02-11, 03:20:25   12 hr, 37 min, 48 sec   176.0 MB/s   OK       0

Edited by fmp4m
2 hours ago, fmp4m said:

and the parity is somehow stuck at 3.0gbps.

Likely cable/connection related, but that's not the problem here, since SATA2 can still do around 275MB/s.
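The arithmetic behind that figure can be sketched as follows. It assumes the standard 8b/10b line encoding used by SATA (10 bits on the wire per 8 bits of data), which gives a theoretical ceiling; the roughly 275MB/s quoted above is the practical figure after protocol overhead. The helper name is hypothetical.

```python
def sata_payload_mb_s(line_rate_gbps):
    """Theoretical payload rate in MB/s for a SATA link after 8b/10b encoding."""
    line_bits_per_s = line_rate_gbps * 1e9
    data_bits_per_s = line_bits_per_s * 8 / 10  # 8 data bits per 10 line bits
    return data_bits_per_s / 8 / 1e6            # bits -> bytes -> MB

sata2 = sata_payload_mb_s(3.0)
sata3 = sata_payload_mb_s(6.0)
print(sata2, sata3)

# The fastest parity check listed earlier in the thread ran at 196.3 MB/s,
# which still fits inside a SATA2 link.
print(196.3 < sata2)
```

So even the best speed this array has recorded is below what a 3.0 Gb/s link can carry, and a rebuild crawling at 17.3MB/s is nowhere near it.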

 

Something is reading from disks 2 and 12 during the rebuild; this will cause a noticeable slowdown.

 

P.S.: unrelated, but disk14 has filesystem corruption; you'll need to run xfs_repair when the rebuild finishes.


Thanks Johnnie

 

It's currently running at 6.5 MB/s and has 4 days left. It's never been this slow. All drives are hot swap, and I checked the backplane connections (yay rails); everything is tight.

 

Thanks for the catch on Disk14 (that's the new 4TB I just put in; wouldn't the rebuild be fixing the corruption?).

9 minutes ago, fmp4m said:

Its currently running at 6.5MB/s and has 4 days left.

That's very slow, but the only thing I can say for sure is that it's not because one disk is linking at SATA2 speeds; it might be a disk with slow sectors.

 

10 minutes ago, fmp4m said:

(thats the new 4tb I just put in,  wouldn't the rebuild be fixing the corruption?).

No, parity can't help with filesystem corruption: the rebuild reproduces the disk's contents as they were, corruption included.


OK, so it kept getting slower and slower, and I decided to pull Drive14. I put another 4TB in its place and now I am getting 2.0GB/s rebuild speed. Since Drive14 was the 2TB to 4TB change and the RC update was unlikely to be the cause, I chose to try this, and I am glad I did. When the rebuild finishes I will post another diagnostic to see whether the data is still corrupted.

 

Johnnie, what did you see that told you there was corruption, and what can I do to find, test for, and repair this in the future?

7 minutes ago, fmp4m said:

2.0GB/s

That seems too fast :)

 

7 minutes ago, fmp4m said:

What did you see that told you there was corruption and what can I do to find this / test for this and repair it in the future?

Feb 14 23:29:31 NAS kernel: XFS (md14): Metadata CRC error detected at xfs_dir3_data_read_verify+0x80/0xc9 [xfs], xfs_dir3_data block 0xaea866c0
Feb 14 23:29:31 NAS kernel: XFS (md14): Unmount and run xfs_repair
Feb 14 23:29:31 NAS kernel: XFS (md14): First 128 bytes of corrupted metadata buffer:

md14 is disk14
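To look for the same kind of message in the future, the syslog text can be scanned for these XFS lines. A minimal sketch, assuming the message format shown in the excerpt above and the mdN to diskN mapping just mentioned; the function name is hypothetical, and the log text would come from wherever you have the syslog saved.

```python
import re

# Log lines copied from the excerpt above.
log = """\
Feb 14 23:29:31 NAS kernel: XFS (md14): Metadata CRC error detected at xfs_dir3_data_read_verify+0x80/0xc9 [xfs], xfs_dir3_data block 0xaea866c0
Feb 14 23:29:31 NAS kernel: XFS (md14): Unmount and run xfs_repair
Feb 14 23:29:31 NAS kernel: XFS (md14): First 128 bytes of corrupted metadata buffer:
"""

# Match only the two message types seen in the excerpt.
PATTERN = re.compile(
    r"XFS \(md(\d+)\): (?:Metadata CRC error|Unmount and run xfs_repair)"
)

def disks_with_xfs_corruption(text):
    """Return the sorted disk numbers that have XFS corruption messages."""
    return sorted({int(m.group(1)) for m in PATTERN.finditer(text)})

print(disks_with_xfs_corruption(log))
```

On the excerpt above this flags only disk 14. A log with no such lines gives an empty list, which only means these two messages were not logged, not that the filesystem is guaranteed clean.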

17 hours ago, johnnie.black said:

That seems too fast :)

This makes me think I misspoke. The original 17.3 MB/s was the OVERALL data transfer rate for all 16 disks in the rebuild (which was painfully slow).

The 2.0 GB/s is the same kind of figure: overall, averaging 117.0+ MB/s per disk.
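A quick way to turn an overall figure into a per-disk one is to divide by the number of devices taking part. The sketch below assumes 1 GB = 1000 MB. The count of 16 is the number given above; 17 is purely an assumption, tried only because it lands closer to the quoted 117 MB/s.

```python
def per_device_mb_s(overall_mb_s, device_count):
    """Average per-device rate when the overall rate is the sum over all devices."""
    return overall_mb_s / device_count

# The healthy rebuild: 2.0 GB/s overall.
for count in (16, 17):
    print(count, round(per_device_mb_s(2000.0, count), 1))

# The slow rebuild: 17.3 MB/s overall across 16 disks.
print(round(per_device_mb_s(17.3, 16), 2))
```

Either way, the healthy rebuild works out to well over 100 MB/s per disk, while the slow one was at roughly 1 MB/s per disk.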

 

So it's right where it should be again. Thanks again; I will check for corruption when parity finishes in a couple of hours.
