DiskSpeed, hdd/ssd benchmarking (unRAID 6+), version 2.10.8



6 hours ago, SShadow said:

I know this isn't exactly what you asked for but I hope it helps.  Let me know if you need more.

 

It helped a lot! Once I saw the difference between the two was displaying the controller block & iframe, I was able to isolate the issue to a location in the code. It loops over the list of drives selected and does some validation before adding the flagged drive(s) to the benchmark list.

 

If you could, take a screenshot of your drive selection screen (no need to proceed past it) and submit a debug file from the "Create debug file" link at the bottom of the main page, showing all the drives. Email them to [email protected]. This will let me look at your configuration and trace through the logic using it.

 

I appreciate your assistance!

Edited by jbartlett
Link to comment
7 hours ago, klee said:

Only two of my four HDDs are being benchmarked. I'm able to run the controller benchmark and everything seems OK. However, when I try to benchmark my HDDs, only two show up and I get a blank screen when I try to benchmark the other two. Any idea what's going on?

[three screenshots]

You are likely to get better informed feedback if you attach your system’s diagnostics zip file to your next post in this thread.

Link to comment
On 11/2/2023 at 7:30 PM, jbartlett said:

Sorry, I still don't have any idea what you were referring to.

 

Was it reporting anything like retrying (x)? If so, check the SpeedGap box on the drive selection screen. How many drives do you have attached to the controller that's not completing? It's set to time out after 50 minutes; I figured that should be enough time, unless someone has a crazy loaded-down controller. Are you that someone?

 

I was referring to my SSDs. When I drilled into them, the information was a little off. The vendor field was blank, but the vendor name was prepended to the model number, so I thought the information was being parsed incorrectly.

 

I didn't see anything indicating retrying anything. 

If there's a way to shoot oneself in the foot, I will find it, so there's a very good chance that I am that someone. 🙂

 

[screenshot]

I have 24 drives in each server.

 

The above server still gets the error even with the SpeedGap box checked.

The below server, also with the SpeedGap box checked, just stops/freezes.

The screenshots are where they errored (above) or froze (below).

 

You mentioned a 50-minute timeout. I don't know if that's exactly how long they ran for, but it's really close to an hour when I either get the error (above) or it freezes (below). Is there a way for me to increase it to see if it makes a difference?

 

--For the below server, there were a few changes since last time which may be why it no longer errors but freezes.

   1) Changed out the backplane from 6G to 12G

   2) Changed the HBA from 6G to 12G

   3) Dual-linked the HBA to the backplane (it had been single link)

   4) Moved the HBA from an x4 to an x8 slot (the card is x8)

I suspect 3 and 4 had the most impact.

 

[screenshot]
 

 

Want to really, really thank you for this tool!!  Until I ran the controller benchmark, I never realized my storage was so bottlenecked and misconfigured.   Previously, my parity checks would take about 50 hours....running it now and it's flying, should complete it in about 20 hours.   I have some changes to my other server as well based on what I learned with your tool.   

So again, thank you so much!!

 

 

Link to comment
On 11/7/2023 at 10:46 PM, itimpi said:

You are likely to get better informed feedback if you attach your system’s diagnostics zip file to your next post in this thread.

The Diagnostic zip file doesn't do me any good. There's a "Create Debug File" link inside the DiskSpeed app at the bottom of the main page that provides the data I can make use of.

Link to comment
On 11/7/2023 at 2:50 PM, klee said:

Only two of my four HDDs are being benchmarked. I'm able to run the controller benchmark and everything seems OK. However, when I try to benchmark my HDDs, only two show up and I get a blank screen when I try to benchmark the other two. Any idea what's going on?

 

[screenshot]

 

 

When this happens, can you right-click anywhere and select "Save Page As" and attach that to a reply?

Link to comment
On 11/8/2023 at 10:19 AM, broncosaddict said:

I didn't see anything indicating retrying anything. 

If there's a way to shoot oneself in the foot, I will find it, so there's a very good chance that I am that someone. 🙂

 

I'm afraid to look at my own feet.....

 

On 11/8/2023 at 10:19 AM, broncosaddict said:

You mentioned a 50-minute timeout. I don't know if that's exactly how long they ran for, but it's really close to an hour when I either get the error (above) or it freezes (below). Is there a way for me to increase it to see if it makes a difference?

 

Then I think that's the issue. It looks like the main process is timing out after the configured 50 minutes, before it finishes all the drives on the system. I figured that would be enough, but it looks like it's not. I didn't want to omit the timeout entirely in case of a rogue runaway process. That should never happen, but when you deal with computers, the "never happens" has been known to happen. I'll update the timeout to 2 hours. It's possible that the Lucee app server is ending the process before the "Show this error" message is displayed, so it looks like a freeze instead.

 

On 11/8/2023 at 10:19 AM, broncosaddict said:

Want to really, really thank you for this tool!!  Until I ran the controller benchmark, I never realized my storage was so bottlenecked and misconfigured.   Previously, my parity checks would take about 50 hours....running it now and it's flying, should complete it in about 20 hours.   I have some changes to my other server as well based on what I learned with your tool.   

So again, thank you so much!!

 

Glad to help!

Link to comment
1 hour ago, jbartlett said:

Version 2.10.7.3 has been pushed with the increased timeout and typo correction.

Thank you!! I will give it a shot tomorrow.

 

The parity check slowed down toward the end, so it took 26 hours instead of the 20 hours estimated at the start, but that's still a reduction of about 50%. I'll take that every day and twice on Sundays. 🙂

 

[screenshot]

Thanks again!!

Link to comment
13 hours ago, jbartlett said:

Version 2.10.7.3 has been pushed with the increased timeout and typo correction.

That resolved the issue, both servers finished without any issue.   Thanks!

Having never studied/reviewed this type of output before, I have a few questions regarding the output as I'm not really sure how to interpret what I'm looking at nor any actions I need to take.

 

1) The sharp drop of the black and yellow lines is bad?   (start looking for replacements?)

2) The flat lines that remain below 50M the entire time.  Same as number 1?

3) When it looks like a piece of a sine wave like the pink line.  Is that the drive speeding up/slowing down and what is labeled a speed gap?  

4) I'm confused by "Bandwidth was capped on Disk 15". Disk 15 is one of those 3 flat lines that remain straight across below 50 the entire time.

      a) Is it interpreted as bandwidth being capped because the line is flat?

      b) It mimics 2 other drives with the same line, why would they not be listed?   

      c) What would cause a drive going so slow to have anything capped? 

5) The verbiage at the bottom "A speed Gap was detected on a drive which means the amount of data reach from one second to the next...."

     a) How to determine which drive?

     b) Is "reach" a typo in  "amount of data reach from"?  I'm thinking you meant "received", but if not, what does that mean?

6)  If you keep getting speed gap warnings, the solution is to disable speed gap detection, so why have the warning in the first place?

 

--This is my older server so drives on their last legs is not surprising 

[screenshot]

Edited by broncosaddict
Adding additional info
Link to comment
On 11/10/2023 at 7:05 AM, broncosaddict said:

1) The sharp drop of the black and yellow lines is bad?   (start looking for replacements?)

The black line dropping fast between 4TB and 5TB, or the yellow line between 10TB and 11TB? Yeah, I would say something's up with those drives, especially if they tested normally previously. Search for your drive on the companion website HDDB and see what others are getting with their scans.

 

On 11/10/2023 at 7:05 AM, broncosaddict said:

2) The flat lines that remain below 50M the entire time.  Same as number 1?

Sometimes a drive will drop from SATA 3 to a lower speed. A full power-off tends to correct that, but I've seen it come back later. Potentially a faulty cable. If a drive tests the same by itself, with no other drives also running benchmarks, it's likely the drive is in some sort of safe mode.

 

On 11/10/2023 at 7:05 AM, broncosaddict said:

3) When it looks like a piece of a sine wave like the pink line.  Is that the drive speeding up/slowing down and what is labeled a speed gap?  

I'm not sure what causes a jumping line like that. See what others are getting for the same drive to see if it's expected.

 

A "Speed Gap" is my term for when a drive is reading at a steady rate, then there's a big drop in the amount of data being read, and then it goes back up and continues normally. Typically, this is caused by the drive being accessed by some other process at the same time, but it could also indicate a spot that has remapped sectors.
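To make that definition concrete, here is a minimal Python sketch of one way such a drop-and-recover pattern could be found in a list of per-second read rates. It is not DiskSpeed's actual code: the function name `find_speed_gaps`, the `drop_ratio` threshold of 0.5, and the sample numbers are all assumptions chosen for illustration.

```python
from typing import List, Tuple


def find_speed_gaps(rates: List[float], drop_ratio: float = 0.5) -> List[Tuple[int, int]]:
    """Return (first_index, last_index) ranges where the rate dropped and later recovered.

    A sample counts as "in a gap" when it falls below drop_ratio times the last
    steady rate. The threshold is an illustrative assumption, not a DiskSpeed value.
    """
    gaps: List[Tuple[int, int]] = []
    if not rates:
        return gaps
    baseline = rates[0]   # last rate considered steady
    gap_start = None      # index where the current gap began, if any
    for i in range(1, len(rates)):
        rate = rates[i]
        if rate < baseline * drop_ratio:
            if gap_start is None:
                gap_start = i
        else:
            if gap_start is not None:
                # The rate came back up, so the drop was a gap rather than a slowdown.
                gaps.append((gap_start, i - 1))
                gap_start = None
            baseline = rate
    return gaps


# Made-up per-second read rates in MB/s: steady, a two-second dip, then steady again.
sample_rates = [200.0, 198.0, 199.0, 60.0, 55.0, 197.0, 196.0]
print(find_speed_gaps(sample_rates))
```

A drop that never recovers before the data ends is deliberately not reported, since the description above requires the rate to go back up afterwards.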

 

On 11/10/2023 at 7:05 AM, broncosaddict said:

4) I'm confused by "Bandwidth was capped on Disk 15". Disk 15 is one of those 3 flat lines that remain straight across below 50 the entire time.

      a) Is it interpreted as bandwidth being capped because the line is flat?

      b) It mimics 2 other drives with the same line, why would they not be listed?   

      c) What would cause a drive going so slow to have anything capped? 

 

A run of consecutive, non-decreasing read speeds (within a small percentage) indicates the drive is capable of sending data faster than the data link can support. This is easy to see on an older multi-drive controller loaded down with SSDs. If Drive 15 is also the one that looks like the sine wave, note that I don't have logic that clears the bandwidth indicator if the speed suddenly spikes higher than when the steady read occurred. There are likely other reasons for it, but I don't have such drives to add logic to detect them.
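As a rough illustration of that idea (not the app's real logic), the Python sketch below flags a drive when enough consecutive samples stay within a small percentage of each other. The function name `looks_bandwidth_capped`, the 2% tolerance, and the minimum run length of 5 are hypothetical values picked for the example.

```python
from typing import List


def looks_bandwidth_capped(rates: List[float], tolerance: float = 0.02, min_run: int = 5) -> bool:
    """Return True if min_run consecutive samples stay within tolerance of the run's first sample.

    Both tolerance and min_run are illustrative assumptions. Once a flat run is
    found the function returns True and ignores any later change in speed.
    """
    run_start = 0  # index of the first sample in the current flat run
    for i in range(1, len(rates)):
        reference = rates[run_start]
        if abs(rates[i] - reference) > tolerance * reference:
            # The rate moved too far, so start a new run at this sample.
            run_start = i
        elif i - run_start + 1 >= min_run:
            return True
    return False


# Made-up MB/s samples: an SSD pinned near a link limit versus a hard drive
# whose speed declines steadily as it reads.
ssd_on_slow_link = [550.0, 551.0, 549.0, 550.0, 550.0, 548.0, 520.0, 480.0]
typical_hdd = [200.0, 190.0, 180.0, 170.0, 160.0, 150.0]
print(looks_bandwidth_capped(ssd_on_slow_link), looks_bandwidth_capped(typical_hdd))
```

A check like this cannot tell a link-limited drive from one that is simply slow and flat for another reason, which fits the caveat above that there are likely other causes.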

 

As to point C, multiple devices running at the same time could do it, but that's not likely to be the cause here. The Controller Benchmark does read all drives at the same time to look for controller capacity being maxed out. Could be something wonky with how the drive was designed. Could be a shucked drive that has platters/heads disabled. <shrug>

 

On 11/10/2023 at 7:05 AM, broncosaddict said:

5) The verbiage at the bottom "A speed Gap was detected on a drive which means the amount of data reach from one second to the next...."

     a) How to determine which drive?

     b) Is "reach" a typo in  "amount of data reach from"?  I'm thinking you meant "received", but if not, what does that mean?

 

a - It's displayed during testing, but I can look into adding it to the overview.

b - should be "read", corrected for the next release.

 

On 11/10/2023 at 7:05 AM, broncosaddict said:

6)  If you keep getting speed gap warnings, the solution is to disable speed gap detection, so why have the warning in the first place?

To make sure you're aware of it if it happens, that one of the drives could be wonky or have unexpected behaviors.

Link to comment
6 hours ago, The_Target said:

Is there a fix yet for when the benchmark fails to run? I get exactly the same problem as SShadow where I can't click the dots so I can't give any more info as to what is happening.

 

Thank you for the file. Looks like there's no error happening. I'm still investigating.

Link to comment

Hello, I just installed this in the hope that I could check whether both my cache drives have similar speeds (I think one of them is very slow). They are both SSDs and are detected by the app, but it says they cannot be benchmarked because it cannot find the mount point. I did attach /mnt to the Docker container at /mnt/unraid, but I still get the same error. Is it because they are cache drives in a mirror?

 

Thank you

Link to comment
19 hours ago, Nodiaque said:

Hello, I just installed this in the hope that I could check whether both my cache drives have similar speeds (I think one of them is very slow). They are both SSDs and are detected by the app, but it says they cannot be benchmarked because it cannot find the mount point. I did attach /mnt to the Docker container at /mnt/unraid, but I still get the same error. Is it because they are cache drives in a mirror?

 

Thank you

It won't benchmark multi-drive devices. But if the drives have their own block devices, you can run the following command on each, replacing "/dev/sdi" with your device. Press CTRL-C to stop it; otherwise it ends when it reaches the end of the drive.

 

dd if=/dev/sdi of=/dev/null bs=256MB iflag=direct conv=noerror status=progress

Link to comment

Concerning the blank screen when benchmarking a drive, it seems to be related to a cloned or restored drive where both the old and new drives are still on the system. I'm adding logic to detect this and allow the benchmark if one of the duplicate drives is not mounted.

 

Current workarounds are to change the Partition ID(s) on the old drive or to delete and recreate the partition.
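For anyone wanting to check whether they are in this situation, here is a minimal Python sketch that groups partitions by identifier and reports any identifier that appears more than once. It assumes that "Partition ID" corresponds to what `lsblk -o NAME,PARTUUID,MOUNTPOINT` reports as PARTUUID, which may not be what DiskSpeed actually compares. The function name and sample values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

# (device name, partition identifier, mount point or None); a hypothetical input
# shape that could be filled in by hand from lsblk output.
Partition = Tuple[str, str, Optional[str]]


def find_duplicate_partition_ids(partitions: Iterable[Partition]) -> Dict[str, List[Partition]]:
    """Group partitions by identifier and keep only identifiers seen more than once."""
    by_id: Dict[str, List[Partition]] = defaultdict(list)
    for part in partitions:
        part_id = part[1]
        if part_id:  # skip entries that have no identifier at all
            by_id[part_id].append(part)
    return {pid: parts for pid, parts in by_id.items() if len(parts) > 1}


# Made-up example: sdc1 is a clone of sdb1 and is still attached but not mounted.
sample = [
    ("sdb1", "1111-aaaa", "/mnt/disk1"),
    ("sdc1", "1111-aaaa", None),
    ("sdd1", "2222-bbbb", "/mnt/disk2"),
]
for part_id, parts in find_duplicate_partition_ids(sample).items():
    for name, _, mountpoint in parts:
        state = f"mounted at {mountpoint}" if mountpoint else "not mounted"
        print(f"{part_id}: {name} ({state})")
```

If a duplicate shows up, the workarounds above apply to the old drive of the pair.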

Edited by jbartlett
Link to comment

Good morning, I updated to 2.10.7.5 and unfortunately I still have the blank screen issue on one of my servers. My other server with the HBA card works fine. I did have an odd issue on both servers, though: a two-minute timeout error on Fetching Drive Platter Information. I guess my server with the issue falls under the cloned drive scenario you mention above, since I have one data drive and one parity drive? This was a new build, and when I added the drives I did a new config to add them.

[screenshot: DiskSpeed drive scan error]

Edited by SShadow
Link to comment
On 10/10/2023 at 12:59 PM, jbartlett said:

The graph is displayed if the file (smb share path) \\nas\appdata\DiskSpeed\Instances\local\driveinfo\DriveBenchmarks.txt exists which contains the graph data.

 

Can you check to see if that file exists and can be viewed when the graph is or is not visible?

File exists for me, but graph is missing. I'm on version 2.10.7.4.

Link to comment
  • jbartlett changed the title to DiskSpeed, hdd/ssd benchmarking (unRAID 6+), version 2.10.8
