jbartlett

DiskSpeed, hard drive benchmarking (unRAID 6+), version 2.0


13 minutes ago, BRiT said:

Replace your slowest drives?

I'm sure at some point I'll be upgrading all the 4TB drives to 8TB, but at $250US+ each, that probably will only happen as the old drives fail. Was hoping to be able to improve parity check speeds by reconfiguring the system since it seems that my speeds are abnormally low.

Share this post


Link to post
Posted (edited)
1 hour ago, wgstarks said:

Sorry. Realized I had run the wrong benchmark tests.

 

Controller benchmarking shows a variation of 3.3% on the onboard controller (Your controller is not bottlenecking) and a variation of 0.0% on the dell H310 controller (Your controller is not bottlenecking). Not sure what the next step would be to resolve the speed issues?

Can you share the graphs of the controller benchmarks?

 

Drive 9 was found to have a steady read speed over a large portion of the drive when it should be declining for a spinner. This indicates that the drive can send data faster than the controller it's attached to can handle.
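
To illustrate the idea of a "steady read speed over a large portion of the drive", here is a rough sketch in Python. It is not DiskSpeed's actual detection code; the function name, the 2% tolerance, and the sample numbers are all made up for illustration. It measures how much of a benchmark curve sits on a flat plateau, which a spinner that is not being capped should not show.

```python
def longest_flat_fraction(speeds_mb_s, tolerance=0.02):
    """Return the longest run of near-constant samples as a fraction of the curve.

    A run continues while each sample stays within `tolerance` (relative)
    of the first sample in that run. Hypothetical helper, not DiskSpeed code.
    """
    if not speeds_mb_s:
        return 0.0
    best = 0
    start = 0
    for i, speed in enumerate(speeds_mb_s):
        # Restart the run when this sample drifts away from the run's first sample.
        if abs(speed - speeds_mb_s[start]) > tolerance * speeds_mb_s[start]:
            start = i
        best = max(best, i - start + 1)
    return best / len(speeds_mb_s)


# Made-up curves: one capped at 180 MB/s for 60% of the drive, one declining normally.
capped = [180, 180, 180, 180, 180, 180, 170, 150, 120, 90]
normal = [200, 190, 180, 170, 160, 150, 140, 130, 120, 110]
print(longest_flat_fraction(capped))  # 0.6
print(longest_flat_fraction(normal))  # 0.1
```

A large flat fraction at the fast (outer) end of the drive is the pattern described above: the drive could go faster there, but something upstream is holding it at a ceiling.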

 

You also have a drive that has a wave to it. Can you tell me what make/model drive that is? Waves can be a sign of degraded areas. Benchmark it at least once a month to see if it holds steady - if it does, it may be how that drive was designed to operate. I'm working on updating the Hard Drive Database website so it displays the latest benchmark from everyone who has the same drive (instead of averaging all tests, which can cause big spikes) to see if that's just how that drive operates.

Edited by jbartlett

Share this post


Link to post
1 hour ago, wgstarks said:

I'm sure at some point I'll be upgrading all the 4TB drives to 8TB, but at $250US+ each, that probably will only happen as the old drives fail. Was hoping to be able to improve parity check speeds by reconfiguring the system since it seems that my speeds are abnormally low.

If it isn't an issue, any chance you can switch PCIExpress ports for the controller? Just to rule out any inaccuracies in the motherboard manual on which port is wired to which?

Share this post


Link to post
2 hours ago, jbartlett said:

Can you share the graphs of the controller benchmarks?

SafariScreenSnapz167.thumb.jpg.232b1a17e8af7ac70af8c19304da560f.jpg

 

SafariScreenSnapz168.thumb.jpg.7a83ba484e2baaad5200dfb0f3454703.jpg

 

2 hours ago, jbartlett said:

You also have a drive that has a wave to it. Can you tell me what make/model drive that is?

I think you mean this one-

 

SafariScreenSnapz169.thumb.jpg.a6cfd0228f926576cfdf04e7dfdce7a1.jpg

 

I also uploaded the drive benchmarks to your db.

Share this post


Link to post
Posted (edited)

This one also has a wave form. Looks like the same model.

 

SafariScreenSnapz170.thumb.jpg.38794aa5081229dbb7cef8b52fb122e2.jpg

 

Edit: Not the same but similar.

Edited by wgstarks

Share this post


Link to post
2 hours ago, BRiT said:

If it isn't an issue, any chance you can switch PCIExpress ports for the controller? Just to rule out any inaccuracies in the motherboard manual on which port is wired to which?

If I understand the controller benchmark report correctly, it's showing that the controller is installed in an x8 slot, right? I could switch slots later this week if there is a need for it.

Share this post


Link to post

It does look like it's in an x8 slot.

 

Well, your system does seem a bit of a mystery as to why your Parity Check speeds are limited, unless it's truly limited by the speed of the smaller capacity drives. So the thought was there's no harm in trying the other slot to see if it impacts the benchmarks at all. If it's put into a less capable slot, the impact should show immediately, but there's a slight chance it might be put into a more capable slot, or at least one that behaves differently. It's more of a "this doesn't make much sense, so try something" idea.

 

 

Share this post


Link to post
14 minutes ago, BRiT said:

It does look like it's in an x8 slot.

 

Well, your system does seem a bit of a mystery as to why your Parity Check speeds are limited, unless it's truly limited by the speed of the smaller capacity drives. So the thought was there's no harm in trying the other slot to see if it impacts the benchmarks at all. If it's put into a less capable slot, the impact should show immediately, but there's a slight chance it might be put into a more capable slot, or at least one that behaves differently. It's more of a "this doesn't make much sense, so try something" idea.

 

 

Can't do any harm I guess. I'll give it a shot later this week when I can take the server offline.

Share this post


Link to post

Hmm... do I need to be concerned about Disk 1 in this array? All drives are 8TB WD Reds of the same age...

 

It has a consistent slow spot at the start of its platters and more wobbles than the rest of the disks. In addition, when I benchmarked the controller, the first run-through showed a VERY low result for it (so much so that DiskSpeed thought the controller was saturated). I redid the benchmark and the result is consistent between tests now. I have also attached the Quick SMART for that drive.

DiskSpeedBenchmark04.jpg

DiskSpeedBenchmark03.jpg

DiskSpeedBenchmark01.jpg

DiskSpeedBenchmark02.jpg

WDC_WD80EFZX-68UW8N0_VK1DZHAY-20190820-1724.txt

Share this post


Link to post

I've got some weird results that I'm hoping to get some explanation for. Here's the controller benchmark followed by the disk benchmark. I've colour coded the matching disk models in the benchmark test. I'm surprised the same model drive on the same controller can show such wild results.

 

DiskMarked.thumb.png.b6700e39b6d44a2b42b153c06a6cf4d3.png

DiskBenchmark.PNG.80b4680bf315f1c9b86fa2867175d0f7.PNG

Share this post


Link to post
13 hours ago, DanielCoffey said:

Hmm... do I need to be concerned about Disk 1 in this array?

Perform a benchmark on it every week or so to see if it returns an identical test. If it does return an identical test over the span of a month, it may just be how that drive is.
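
As a rough illustration of what "returns an identical test" could mean in practice, here is a small Python sketch that compares two benchmark runs point by point. The function name, the 3% tolerance, and the sample values are assumptions made for the example, not anything DiskSpeed itself does.

```python
def runs_match(run_a, run_b, tolerance=0.03):
    """Return True when two benchmark curves agree within `tolerance` at every point.

    Both runs are lists of MB/s samples taken at the same drive positions.
    Hypothetical helper for illustration only.
    """
    if len(run_a) != len(run_b):
        raise ValueError("runs must sample the same number of positions")
    return all(abs(a - b) <= tolerance * max(a, b) for a, b in zip(run_a, run_b))


# Made-up samples: week 2 repeats week 1 closely, week 3 dips at one position.
week1 = [160, 158, 150]
week2 = [161, 157, 151]
week3 = [160, 140, 150]
print(runs_match(week1, week2))  # True
print(runs_match(week1, week3))  # False
```

If the wobbles land in the same places at the same speeds every week, that points toward the drive's normal behaviour; a dip that appears or grows between runs is the thing to watch.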

Share this post


Link to post
8 hours ago, dalben said:

I've got some weird results that I'm hoping to get some explanation for.

It's weird that you're seeing such a variance. It's the same command given to each drive - read balls-to-the-wall starting from the start of the drive, with a kill command sent 15 seconds later - regardless of whether it's running on its own or all at once.
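
For anyone curious what a "read from the start for a fixed time, then stop" test looks like, here is a minimal Python sketch. It is only an approximation under stated assumptions: DiskSpeed runs an external read command and kills it after 15 seconds, whereas this hypothetical function simply reads in a loop until the time is up. Reads that hit the page cache will also inflate the number, so treat it as a sketch of the method, not a benchmark tool.

```python
import time


def timed_read_mb_s(path, seconds=15.0, block_size=1024 * 1024):
    """Read `path` sequentially from offset 0 for up to `seconds`; return MiB/s.

    `path` could be a large file or (with sufficient privileges) a block device.
    Hypothetical helper, not the command DiskSpeed actually runs.
    """
    total_bytes = 0
    start = time.monotonic()
    with open(path, "rb", buffering=0) as f:
        while time.monotonic() - start < seconds:
            chunk = f.read(block_size)
            if not chunk:
                break  # reached the end before the time limit
            total_bytes += len(chunk)
    elapsed = time.monotonic() - start
    return total_bytes / (1024 * 1024) / elapsed if elapsed > 0 else 0.0
```

Because every drive gets the same fixed-time read from the same starting point, the same model on the same controller would normally be expected to land close together.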

 

Did you click off of it or otherwise leave the benchmarking page at any point prior? I added code to stop reads if such an abort takes place. If you have the "Dynamix System Statistics" plugin installed, check to see if there's constant drive activity, such as one of those tasks running wild. If stopping & starting the DiskSpeed docker clears it up, then I have more work to do on that front.

Share this post


Link to post

OK, let me do another test in a more controlled manner (making sure I don't surf out to another page etc.) and I'll fire up system stats in another window.  I'll send it through once done.

 

Thanks

Share this post


Link to post
1 hour ago, jbartlett said:

Perform a benchmark on it every week or so to see if it returns an identical test. If it does return an identical test over the span of a month, it may just be how that drive is.

The problem is that the server locks up on about every four shutdowns/sleeps and drops Parity 1 and Disk 1 every time. I was looking for anomalies and spotted that Disk 1 stood out under the DiskSpeed tests. I have my own thread started about the dropouts and have added the DiskSpeed results to it.

Share this post


Link to post
Posted (edited)
2 hours ago, DanielCoffey said:

The problem is that the server locks up on about every four shutdowns/sleeps and drops Parity 1 and Disk 1 every time. I was looking for anomalies and spotted that Disk 1 stood out under the DiskSpeed tests. I have my own thread started about the dropouts and have added the DiskSpeed results to it.

Have you run multiple tests on that drive with the same results? If so, and the tests are comparable, I would replace it.

 

On a side note, I replaced my two HP220 controllers with a new LSI 16i and have gained 38MB/s in my parity check, decreasing my time by 5.5 hours: 23 hrs old, 17.5 hrs new. I am hoping to shave off another couple of hours once all my 8TB drives are replaced with 10TB drives.
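
For reference, the before/after times work out as follows. This is just arithmetic on the two figures quoted above (23 hours and 17.5 hours); the function name is made up for the example.

```python
def parity_check_gain(old_hours, new_hours):
    """Return (hours saved, fractional time reduction, average speed-up factor)."""
    saved = old_hours - new_hours
    reduction = saved / old_hours
    speedup = old_hours / new_hours
    return saved, reduction, speedup


saved, reduction, speedup = parity_check_gain(23.0, 17.5)
print(f"saved {saved} h, {reduction:.1%} less time, {speedup:.2f}x average speed")
# saved 5.5 h, 23.9% less time, 1.31x average speed
```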

 

Thanks to @jbartlett for this diskspeed test, it gave me a graphic view on where my slow downs were occurring and on what controllers.

 

Edited by Harro

Share this post


Link to post

I have run multiple tests, yes. The really low result was a one-off, and the c. 162MB/s was its regular result, which coincides with the highest speed I can get at the beginning of a parity check.

 

I don't think my 8 drives are even close to saturating the 9201-8i I have, as it allows 4GB/s transfer and the eight drives are less than half of that.
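
A quick back-of-the-envelope check of that claim, as a Python sketch. The assumptions are loud ones: the controller figure is taken as roughly 4000 MB/s, Disk 1 is taken at its usual 162 MB/s, and the other seven drives are assumed to run about 20 MB/s faster (182 MB/s) purely for illustration.

```python
def controller_load(drive_speeds_mb_s, controller_limit_mb_s):
    """Return combined drive throughput as a fraction of the controller limit."""
    return sum(drive_speeds_mb_s) / controller_limit_mb_s


# Assumed figures: one drive at 162 MB/s, seven at 182 MB/s, ~4000 MB/s controller.
speeds = [162] + [182] * 7
load = controller_load(speeds, 4000)
print(f"{sum(speeds)} MB/s combined, {load:.0%} of the controller limit")
# 1436 MB/s combined, 36% of the controller limit
```

Under those assumptions the eight drives together use a bit over a third of the available bandwidth, which is consistent with "less than half".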

 

I will be pulling that drive today anyway once the rebuild completes and I can write to the array again.

Share this post


Link to post

Have you tried to mount that disk(1) on the onboard controller? 

Something just looks off with a 20MB/s difference compared to your other drive tests.

Share this post


Link to post

I agree the speed is odd, but it is the one disk that seems to trigger the hard lock on power down or sleep. It is outside the array now and undergoing a full SMART test. My other Unassigned 8TB is back in the array undergoing a rebuild.

Share this post


Link to post
On 8/21/2019 at 3:42 PM, jbartlett said:

It's weird that you're seeing such a variance. It's the same command given to each drive - read balls-to-the-wall starting from the start of the drive, with a kill command sent 15 seconds later - regardless of whether it's running on its own or all at once.

 

Did you click off of it or otherwise leave the benchmarking page at any point prior? I added code to stop reads if such an abort takes place. If you have the "Dynamix System Statistics" plugin installed, check to see if there's constant drive activity, such as one of those tasks running wild. If stopping & starting the DiskSpeed docker clears it up, then I have more work to do on that front.

OK, here's another run in a more controlled environment. The weirdness has gone and the matching disk models now give the same performance.

DiskControll-2.thumb.PNG.f46745ea693347a2ed9d86c3c3759a98.PNG

Share this post


Link to post
On 8/23/2019 at 2:42 PM, dalben said:

OK, here's another run in a more controlled environment. The weirdness has gone and the matching disk models now give the same performance.

I added code to invalidate a test result if a drive gives a 10% or greater improvement in speed over its single-drive speed.
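
The rule can be sketched roughly like this in Python. This is not the actual DiskSpeed code (which is CFML); the function name and sample numbers are hypothetical, and only the 10% threshold comes from the post above.

```python
def is_valid_result(single_mb_s, concurrent_mb_s, threshold=0.10):
    """Return False when the all-drives speed beats the single-drive speed by >= threshold.

    A drive should not read noticeably faster while sharing the controller
    than it does alone, so such a result suggests a bad single-drive baseline.
    """
    if single_mb_s <= 0:
        raise ValueError("single-drive speed must be positive")
    improvement = (concurrent_mb_s - single_mb_s) / single_mb_s
    return improvement < threshold


print(is_valid_result(150, 148))  # True: slightly slower when sharing, as expected
print(is_valid_result(150, 160))  # True: about 6.7% faster, under the threshold
print(is_valid_result(150, 170))  # False: about 13% faster, result is discarded
```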

Share this post


Link to post
On 7/24/2019 at 1:54 PM, jbartlett said:

Apologies that this slipped through the cracks. Does unRAID return the Serial Numbers for the drives?

Sorry, been a little while since I checked for a reply.

 

Here are the drive details of the first 3 drives, with the complete serial numbers for ease, all viewed with unRAID...

 

Under Drive details for the drives I see:


(Note using as the SMART controller type: 3Ware    2    /dev/twa1)
Model family:    SAMSUNG SpinPoint F3
Device model:    SAMSUNG HD502HJ
Serial number:    S27FJ9FZ404491


(Note using as the SMART controller type: 3Ware    1    /dev/twa1)
Model family:    SAMSUNG SpinPoint F3
Device model:    SAMSUNG HD502HJ
Serial number:    S27FJ9FZ404504


(Note using as the SMART controller type: 3Ware    1    /dev/twa0)
Model family:    Seagate Barracuda 7200.7 and 7200.7 Plus
Device model:    ST3160828AS
Serial number:    5MT44SV6

 

 

And under the MAIN tab in Unraid, under Devices, I see this:


Device        Identification

Parity        1AMCC_FZ404491000000000000 - 500 GB (sdf)

Parity 2    1AMCC_FZ404504000000000000 - 500 GB (sdg)

Disk 1        1AMCC_5MT44SV6000000000000 - 160 GB (sdc)


 

 

Share this post


Link to post
19 minutes ago, electron286 said:

Sorry, been a little while since I checked for a reply.

 

Here are the drive details of the first 3 drives, with the complete serial numbers for ease, all viewed with unRAID...

 

Under Drive details for the drives I see:


(Note using as the SMART controller type: 3Ware    2    /dev/twa1)
Model family:    SAMSUNG SpinPoint F3
Device model:    SAMSUNG HD502HJ
Serial number:    S27FJ9FZ404491


(Note using as the SMART controller type: 3Ware    1    /dev/twa1)
Model family:    SAMSUNG SpinPoint F3
Device model:    SAMSUNG HD502HJ
Serial number:    S27FJ9FZ404504


(Note using as the SMART controller type: 3Ware    1    /dev/twa0)
Model family:    Seagate Barracuda 7200.7 and 7200.7 Plus
Device model:    ST3160828AS
Serial number:    5MT44SV6

 

 

And under the MAIN tab in Unraid, under Devices, I see this:


Device        Identification

Parity        1AMCC_FZ404491000000000000 - 500 GB (sdf)

Parity 2    1AMCC_FZ404504000000000000 - 500 GB (sdg)

Disk 1        1AMCC_5MT44SV6000000000000 - 160 GB (sdc)


 

 

I just saw there have been a few updates to the tool. I downloaded the latest version and it no longer stalls, but this is what I now get:

 

 

DiskSpeed - Disk Diagnostics & Reporting tool
Version: 2.1
 

Scanning Hardware
12:44:12 Spinning up hard drives
12:44:12 Scanning system storage
12:44:25 Scanning USB Bus
12:44:32 Scanning hard drives

Lucee 5.2.9.31 Error (application)

Message: Error invoking external process

Detail: /usr/bin/lspci: option requires an argument -- 's'
Usage: lspci [<switches>]

Basic display modes:
-mm Produce machine-readable output (single -m for an obsolete format)
-t Show bus tree

Display options:
-v Be verbose (-vv for very verbose)
-k Show kernel drivers handling each device
-x Show hex-dump of the standard part of the config space
-xxx Show hex-dump of the whole config space (dangerous; root only)
-xxxx Show hex-dump of the 4096-byte extended config space (root only)
-b Bus-centric view (addresses and IRQ's as seen by the bus)
-D Always show domain numbers

Resolving of device ID's to names:
-n Show numeric ID's
-nn Show both textual and numeric ID's (names & numbers)
-q Query the PCI ID database for unknown ID's via DNS
-qq As above, but re-query locally cached entries
-Q Query the PCI ID database for all ID's via DNS

Selection of devices:
-s [[[[<domain>]:]<bus>]:][<slot>][.[<func>]] Show only devices in selected slots
-d [<vendor>]:[<device>][:<class>] Show only devices with specified ID's

Other options:
-i <file> Use specified ID database instead of /usr/share/misc/pci.ids.gz
-p <file> Look up kernel modules in a given file instead of default modules.pcimap
-M Enable `bus mapping' mode (dangerous; root only)

PCI access options:
-A <method> Use the specified PCI access method (see `-A help' for a list)
-O <par>=<val> Set PCI access parameter (see `-O help' for a list)
-G Enable PCI access debugging
-H <mode> Use direct hardware access (<mode> = 1 or 2)
-F <file> Read PCI configuration dump from a given file

Stacktrace: The Error Occurred in
/var/www/ScanControllers.cfm: line 456 

454: <CFSET tmpbus=Replace(Key,":","-","ALL")>
455: <CFFILE action="write" file="#PersistDir#/lspci-vmm-s_#tmpbus#_exec.txt" output="/usr/bin/lspci -vmm -s #Key#" addnewline="NO" mode="666">
456: <cfexecute name="/usr/bin/lspci" arguments="-vmm -s #Key#" timeout="300" variable="lspci" />
457: <CFFILE action="delete" file="#PersistDir#/lspci-vmm-s_#tmpbus#_exec.txt">
458: <CFFILE action="write" file="#PersistDir#/lspci-vmm_#tmpbus#.txt" output="#lspci#" addnewline="NO" mode="666">
 

called from /var/www/ScanControllers.cfm: line 455 

453: <!--- Get the controller information --->
454: <CFSET tmpbus=Replace(Key,":","-","ALL")>
455: <CFFILE action="write" file="#PersistDir#/lspci-vmm-s_#tmpbus#_exec.txt" output="/usr/bin/lspci -vmm -s #Key#" addnewline="NO" mode="666">
456: <cfexecute name="/usr/bin/lspci" arguments="-vmm -s #Key#" timeout="300" variable="lspci" />
457: <CFFILE action="delete" file="#PersistDir#/lspci-vmm-s_#tmpbus#_exec.txt">
 

Java Stacktrace: lucee.runtime.exp.ApplicationException: Error invoking external process
  at lucee.runtime.tag.Execute.doEndTag(Execute.java:258)
  at scancontrollers_cfm$cf.call_000046(/ScanControllers.cfm:456)
  at scancontrollers_cfm$cf.call(/ScanControllers.cfm:455)
  at lucee.runtime.PageContextImpl._doInclude(PageContextImpl.java:933)
  at lucee.runtime.PageContextImpl._doInclude(PageContextImpl.java:823)
  at lucee.runtime.listener.ClassicAppListener._onRequest(ClassicAppListener.java:66)
  at lucee.runtime.listener.MixedAppListener.onRequest(MixedAppListener.java:45)
  at lucee.runtime.PageContextImpl.execute(PageContextImpl.java:2464)
  at lucee.runtime.PageContextImpl._execute(PageContextImpl.java:2454)
  at lucee.runtime.PageContextImpl.executeCFML(PageContextImpl.java:2427)
  at lucee.runtime.engine.Request.exe(Request.java:44)
  at lucee.runtime.engine.CFMLEngineImpl._service(CFMLEngineImpl.java:1090)
  at lucee.runtime.engine.CFMLEngineImpl.serviceCFML(CFMLEngineImpl.java:1038)
  at lucee.loader.engine.CFMLEngineWrapper.serviceCFML(CFMLEngineWrapper.java:102)
  at lucee.loader.servlet.CFMLServlet.service(CFMLServlet.java:51)
  at javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
  at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:292)
  at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
  at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
  at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
  at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
  at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:212)
  at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:94)
  at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:492)
  at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:141)
  at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:80)
  at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:620)
  at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:684)
  at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:88)
  at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:502)
  at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1152)
  at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:684)
  at org.apache.tomcat.util.net.AprEndpoint$SocketWithOptionsProcessor.run(AprEndpoint.java:2464)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
  at java.lang.Thread.run(Thread.java:748)
 

Timestamp: 9/8/19 12:44:32 PM PDT
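
Reading the dump, "option requires an argument -- 's'" suggests that `#Key#` was empty when line 456 ran, so lspci received `-vmm -s` with nothing after it. The sketch below shows the kind of guard that would avoid that call. It is written in Python for illustration, not in the tool's CFML, and the function name and the sample slot address are hypothetical.

```python
def lspci_args(slot):
    """Build the lspci argument list for one slot, or return None if the slot is blank.

    `-s` needs a slot address after it; calling lspci with a bare `-s`
    produces the "option requires an argument" usage error seen above.
    """
    slot = (slot or "").strip()
    if not slot:
        return None  # nothing to query; the caller should skip this entry
    return ["/usr/bin/lspci", "-vmm", "-s", slot]


print(lspci_args("03:00.0"))  # ['/usr/bin/lspci', '-vmm', '-s', '03:00.0']
print(lspci_args(""))         # None
```

If that reading is right, the open question is why the scan came up with a blank bus address for one of the controllers on this system.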

Share this post


Link to post
