Issues with read/parity check speed.

Vincent77 · January 22

My parity check and read check speeds are abysmal: 2MB/s at most, yet I am able to read from and write to the array at GB speed over the network.

I've read through the logs and found a couple of errors but no idea how to interpret them or how to proceed.

At one time the parity check speed was fine so somewhere along the line I've added or changed something that broke it.

Help from the experts would be appreciated.

Specs of my setup:

- Norco DS 24-bay disk shelf with 18 disks in the array, connected via 6 SFF-8088 cables to two LSI 9201-16e SAS cards on an Asus PRIME B450M with a Ryzen 3 3200G and 16GB DDR4

tower-diagnostics-20240122-1549.zip

Edited January 22 by Vincent77
Specs

JorgeB · January 22

Post new diags saved while the check is running.

Vincent77 · January 22

Hi. New diagnostics attached, taken during read check.

tower-diagnostics-20240122-2127.zip

itimpi · January 22

At the moment there appears to be something writing to disk16, which will be slowing things down.

Vincent77 · January 22

Interesting. I did notice reads and writes are much higher on that disk, and climbing slowly.

JorgeB · January 22

Jan 22 21:26:44 Tower kernel: sd 2:0:6:0: Power-on or device reset occurred
### [PREVIOUS LINE REPEATED 1 TIMES] ###
Jan 22 21:26:45 Tower kernel: sd 10:0:0:0: Power-on or device reset occurred
Jan 22 21:26:50 Tower kernel: mpt2sas_cm0: log_info(0x31110d00): originator(PL), code(0x11), sub_code(0x0d00)
### [PREVIOUS LINE REPEATED 1 TIMES] ###
Jan 22 21:26:51 Tower flash_backup: adding task: /usr/local/emhttp/plugins/dynamix.my.servers/scripts/UpdateFlashBackup update
Jan 22 21:26:55 Tower kernel: sd 2:0:6:0: Power-on or device reset occurred
Jan 22 21:26:55 Tower kernel: sd 2:0:3:0: Power-on or device reset occurred
### [PREVIOUS LINE REPEATED 1 TIMES] ###
Jan 22 21:26:56 Tower kernel: sd 2:0:6:0: Power-on or device reset occurred
Jan 22 21:26:56 Tower kernel: sd 10:0:0:0: Power-on or device reset occurred
Jan 22 21:27:00 Tower kernel: mpt2sas_cm0: log_info(0x31110d01): originator(PL), code(0x11), sub_code(0x0d01)
Jan 22 21:27:00 Tower kernel: mpt2sas_cm0: log_info(0x31110d00): originator(PL), code(0x11), sub_code(0x0d00)
Jan 22 21:27:04 Tower kernel: sd 2:0:4:0: Power-on or device reset occurred
### [PREVIOUS LINE REPEATED 1 TIMES] ###
Jan 22 21:27:05 Tower kernel: sd 2:0:6:0: Power-on or device reset occurred
### [PREVIOUS LINE REPEATED 1 TIMES] ###
Jan 22 21:27:10 Tower kernel: mpt2sas_cm1: log_info(0x31110d00): originator(PL), code(0x11), sub_code(0x0d00)
### [PREVIOUS LINE REPEATED 1 TIMES] ###
Jan 22 21:27:10 Tower kernel: mpt2sas_cm0: log_info(0x31110d00): originator(PL), code(0x11), sub_code(0x0d00)
### [PREVIOUS LINE REPEATED 4 TIMES] ###
Jan 22 21:27:15 Tower kernel: sd 2:0:6:0: Power-on or device reset occurred
Jan 22 21:27:16 Tower kernel: sd 10:0:0:0: Power-on or device reset occurred
Jan 22 21:27:16 Tower kernel: sd 2:0:6:0: Power-on or device reset occurred

This usually means a power/connection issue, check/replace cables for those devices.
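To see which devices are resetting most often, the syslog can be tallied with a short script. What follows is only a rough sketch: the function names are made up for illustration, the `PREVIOUS LINE REPEATED` marker is assumed to mean "the line above occurred N more times", and the bit layout used to split `log_info` (bus type, originator, code, sub-code from high to low bits) is the one that matches the fields the driver prints in the lines above. The syslog path is passed on the command line rather than assumed.

```python
import re
import sys
from collections import Counter

RESET_RE = re.compile(r"kernel: sd (\d+:\d+:\d+:\d+): Power-on or device reset occurred")
LOGINFO_RE = re.compile(r"kernel: (mpt2sas_cm\d+): log_info\(0x([0-9a-fA-F]{8})\)")
REPEAT_RE = re.compile(r"### \[PREVIOUS LINE REPEATED (\d+) TIMES\] ###")


def decode_log_info(value):
    """Split a 32-bit log_info word into its bit fields."""
    return {
        "bus_type": (value >> 28) & 0xF,    # 3 is SAS
        "originator": (value >> 24) & 0xF,  # 1 is printed as PL in the log
        "code": (value >> 16) & 0xFF,
        "sub_code": value & 0xFFFF,
    }


def tally(lines):
    """Count device resets per SCSI address and log_info events per controller."""
    resets = Counter()
    log_infos = Counter()
    last = None  # (counter, key) of the previous line, if it was a match
    for line in lines:
        m = REPEAT_RE.search(line)
        if m:
            # Assumption: the marker adds N extra occurrences of the line above.
            if last is not None:
                counter, key = last
                counter[key] += int(m.group(1))
            continue
        last = None
        m = RESET_RE.search(line)
        if m:
            resets[m.group(1)] += 1
            last = (resets, m.group(1))
            continue
        m = LOGINFO_RE.search(line)
        if m:
            key = (m.group(1), int(m.group(2), 16))
            log_infos[key] += 1
            last = (log_infos, key)
    return resets, log_infos


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as f:
        resets, log_infos = tally(f)
    for device, count in resets.most_common():
        print(f"{device}: {count} resets")
    for (controller, value), count in log_infos.most_common():
        print(f"{controller} 0x{value:08x} x{count}: {decode_log_info(value)}")
```

The devices at the top of the reset list are the ones whose cables and power connections are worth checking first.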

trurl · January 22

21 minutes ago, Vincent77 said:

I did notice reads and writes are much higher on that disk, and climbing slowly.

Your appdata, domains, and system shares are on the array, including disk16. It is better if those shares have all their files on cache or another fast pool, so Docker/VMs will perform better and array disks can spin down, since these files are always open.

But you don't appear to have cache or any other pools. What is the purpose of your nvme?
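A quick way to check which array disks hold files for those shares is to look for the share's top-level folder on each disk mount. This is a minimal sketch, assuming the usual layout where array disks are mounted as `/mnt/disk1`, `/mnt/disk2`, and so on; the function name and the `mnt_root` parameter are made up for the example.

```python
from pathlib import Path


def disks_holding_share(share, mnt_root="/mnt"):
    """Return array disk mount names that contain a top-level folder for `share`."""
    root = Path(mnt_root)
    if not root.is_dir():
        return []
    found = []
    for entry in root.iterdir():
        name = entry.name
        # Only consider mounts named disk<number>; pools such as cache are skipped.
        if name.startswith("disk") and name[4:].isdigit() and (entry / share).is_dir():
            found.append(name)
    return sorted(found, key=lambda n: int(n[4:]))


if __name__ == "__main__":
    for share in ("appdata", "domains", "system"):
        print(share, "->", disks_holding_share(share) or "not on any array disk")
```

Any disk listed for appdata, domains, or system will be kept busy while Docker or VMs are running.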

Vincent77 · January 22

31 minutes ago, JorgeB said:

This usually means a power/connection issue, check/replace cables for those devices.

I'll grab a new PSU and the required connectors tomorrow. I think the current PSU in the disk shelf is a Corsair 550W.

19 minutes ago, trurl said:

But you don't appear to have cache or any other pools. What is the purpose of your nvme?

The nvme was the cache drive originally but I disabled it at some point a while back fiddling, trying anything to fix this issue. I'll set it back now.

trurl · January 22

2 minutes ago, Vincent77 said:

The nvme was the cache drive originally but I disabled it at some point a while back fiddling, trying anything to fix this issue. I'll set it back now.

There will be more work to do to get those shares back on cache. We can work on that later. For now, disable Docker and VM Manager in Settings.

Vincent77 · January 22

They're both disabled now.

Vincent77 · January 23

Today I did some testing. I removed all the disks from the array except for one and did a read check. No problems; it ran at 120MB/s.

I then added the disks back one at a time, doing a read check after adding each one. What I found is that three disks, when added to the array individually, cause the read check to crawl.

Disks 4, 8 and 24 are causing the problem.

I am able to read and write to these disks at full speed, yet they slow the read check down to under 2MB/s.
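For anyone wanting to put a number on "full speed" for a single disk, a rough sequential read timing can be taken outside of the read check. This is only a sketch: the function name and defaults are made up, reading a raw device normally needs root, and data already in the page cache will inflate the result, so it is a sanity check rather than a benchmark.

```python
import time


def read_throughput_mb_s(path, total_bytes=1024**3, chunk_size=1024 * 1024):
    """Read up to total_bytes sequentially from path and return the rate in MB/s."""
    done = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        while done < total_bytes:
            chunk = f.read(min(chunk_size, total_bytes - done))
            if not chunk:  # reached the end of the file or device
                break
            done += len(chunk)
    elapsed = time.perf_counter() - start
    return done / 1e6 / elapsed if elapsed > 0 else float("inf")


# Example (placeholder device name, substitute the real one):
# print(read_throughput_mb_s("/dev/sdX"))
```

A disk that reads at normal speed here but still drags the read check down points away from the drive's platters and toward whatever it shares with the other disks, such as cabling, backplane, or controller.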

I don't really want to remove these 3 disks from the array, as that's 32TB of storage that otherwise performs as it should.

Tomorrow I will clear and format one to see if that does anything.

JorgeB · January 23

That's very strange, especially since they are not all the same model. I doubt clearing them will change anything, but it's worth a try.
