Drive performance testing (version 2.6.5) for UNRAID 5 thru 6.4



There's definitely a bigger issue here in that it's not able to detect many of your drive assignments or UNRAID slots.

 

ETA: Actually, it did detect the drive assignments but it's listing all the drives twice. I'll need to dig into this one. Are you running the most recent version of the script?


Bah, I see that now too. The original plugin-generated one, however, is not working for me in Chrome or FF. I regretfully did not have the foresight to save a copy of the full script run, not realizing it got overwritten by the 2nd, smaller run.

 

I'll run the full script again.

 

Question: why do my drives appear twice under the drive identification area?


I noticed you asked another user for this on the last page for troubleshooting. While I don't have the same problem, here is my output in case it helps:

Unraid is 6.3.5

root@Unraid:~# lsscsi -i
[0:0:0:0]    disk    SanDisk  Cruzer Fit       1.27  /dev/sda   SanDisk_Cruzer_Fit_4C530012270322105470-0:0
[1:0:0:0]    cd/dvd  HL-DT-ST BDDVDRW GGC-H20L 1.03  /dev/sr0   HL-DT-ST_BDDVDRW_GGC-H20L_123456789C5F-0:0
[2:0:0:0]    disk    ATA      TOSHIBA DT01ACA3 ABB0  /dev/sdb   -
[2:0:1:0]    disk    ATA      TOSHIBA HDWD130  ACF0  /dev/sdc   -
[2:0:2:0]    disk    ATA      TOSHIBA HDWD130  ACF0  /dev/sdd   -
[2:0:3:0]    disk    ATA      WDC WD10EACS-00D 1A01  /dev/sde   -
[2:0:4:0]    disk    ATA      ST4000DM000-1F21 CC54  /dev/sdf   -
[2:0:5:0]    disk    ATA      TOSHIBA DT01ACA3 ABB0  /dev/sdg   -
[2:0:6:0]    disk    ATA      Hitachi HUS72403 A5F0  /dev/sdh   -
[2:0:7:0]    disk    ATA      Hitachi HUS72403 A5F0  /dev/sdi   -
[2:0:8:0]    disk    ATA      TOSHIBA DT01ACA3 ABB0  /dev/sdj   -
[2:0:9:0]    disk    ATA      WDC WD10EACS-00D 1A01  /dev/sdk   -
[2:0:10:0]   disk    ATA      Hitachi HUS72403 A5F0  /dev/sdl   -
[2:0:11:0]   disk    ATA      TOSHIBA DT01ACA3 ABB0  /dev/sdm   -
[2:0:12:0]   disk    ATA      WDC WD10EARX-00N AB51  /dev/sdn   -
[2:0:13:0]   disk    ATA      WDC WD10EACS-00D 1A01  /dev/sdo   -
[2:0:14:0]   disk    ATA      TOSHIBA DT01ACA3 ABB0  /dev/sdp   -
[2:0:15:0]   disk    ATA      TOSHIBA DT01ACA3 ABB0  /dev/sdq   -
[2:0:16:0]   disk    ATA      SAMSUNG MZ7WD480 NS00  /dev/sdr   -
[2:0:17:0]   disk    ATA      TOSHIBA DT01ACA3 ABB0  /dev/sds   -
[2:0:18:0]   disk    ATA      ST3000DM001-1ER1 CC43  /dev/sdt   -
[2:0:19:0]   disk    ATA      WDC WD10EACS-00Z 1B01  /dev/sdu   -
[2:0:20:0]   disk    ATA      ST4000DM000-1F21 CC54  /dev/sdv   -
[2:0:21:0]   enclosu LSI CORP SAS2X36          0717  -          -
[2:0:22:0]   disk    ATA      ST4000DM005-2DP1 0001  /dev/sdw   -
[2:0:23:0]   disk    ATA      ST4000DM005-2DP1 0001  /dev/sdx   -

 


Using the Firefox debugger tools, I found that at least one of the JSON data lines was missing a closing bracket, which effectively breaks the graphs. But I'm thinking the root issue is in detecting the inventory properly. I have a script for just that on my backup NAS; I'll boot it up and get it soon.


Pretty standard setup as far as I'm aware. SuperMicro SC846 chassis with SAS2 backplane, SATA drives, LSI Controller (M1015 flashed to HBA mode). Can't remember if I had 1 or 2 cables going to the backplane from the controller off hand, will have to check when I get home.

  • 3 weeks later...

@jbartlett -

 

Had been meaning to try out your diskspeed script for some time, and this weekend finally installed it and tried it out.

 

It is awesome and is pointing out some interesting things about my system! For one thing, I have an old SSD drive that is showing very inconsistent performance. At 25G it drops to about 210 MB/sec - very low for an SSD. I think I may not have "trimmed" the SSD sufficiently, but not sure (sdl). The new SSD is quite a bit faster! (I was running 3 iterations so this is not a fluke).

 

But I also found a problem that I wanted to report. I made an effort to fix it, and think I have it 95% fixed, but I'm seeing some results that lead me to question whether it's working.

 

So thought I would share the problem, my "fix", and ask for your perspective on whether you think my results appear valid. (Before the fix the parity disk was only showing one data point on the graph - at the starting line 0%.)


The root cause is that my parity disk is a hardware RAID0 configuration on my Areca ARC-1200 card. It does not much like the hdparm command. When I run hdparm -I on it, I get the following output:

 

/dev/sdb:
SG_IO: bad/missing sense data, sb[]:  f0 00 05 00 00 00 00 0b 00 00 00 00 20 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

ATA device, with non-removable media
Standards:
        Likely used: 1
Configuration:
        Logical         max     current
        cylinders       0       0
        heads           0       0
        sectors/track   0       0
        --
        Logical/Physical Sector size:           512 bytes
        device size with M = 1024*1024:           0 MBytes
        device size with M = 1000*1000:           0 MBytes
        cache/buffer size  = unknown
Capabilities:
        IORDY not likely
        Cannot perform double-word IO
        R/W multiple sector transfer: not supported
        DMA: not supported
        PIO: pio0

It seems that your script anticipated this failing, with an alternate means of getting the disk's size, but it expects stderr to contain the string "failed". As you can see, I don't get "failed", but I do get "bad". So I changed the line around line 560 as follows:

#Err=$(grep "failed" /tmp/diskspeed.err)
Err=$(grep -E "failed|bad" /tmp/diskspeed.err)

This really helped, but the drive was reporting capacity closer to 7500GB vs the 8001GB that other 8T drives were reporting as they took their samples, leading me to make one more change on or around line 570. (The 500G shortfall meant the 10%, 20%, etc. points came too early, and the last 500G of the disk was being ignored completely.)

#Capacity=$GB
Capacity=$(( $Bytes / 1000 / 1000 / 1000 ))
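The combined effect of the two changes can be sketched as below, using the script's variable names (Err, Bytes, Capacity); the simulated stderr file stands in for a real hdparm -I run against the RAID volume, and the hard-coded byte count is just illustrative:

```shell
#!/bin/bash
# Minimal sketch of the fallback path, assuming the script's variable names.
# The simulated stderr stands in for a real "hdparm -I" run on a RAID volume.
echo 'SG_IO: bad/missing sense data' > /tmp/diskspeed.err
Err=$(grep -E "failed|bad" /tmp/diskspeed.err)   # -E alternation matches either word
if [ -n "$Err" ]; then
    Bytes=8001573355520                # raw size, e.g. from fdisk -l or similar
    Capacity=$(( Bytes / 1000 / 1000 / 1000 ))
fi
echo "${Capacity} GB"
```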

With this change, the reading of data went as expected. On the summary graph HTML, the drive is still showing 7.3T, and has an unpleasant message about being unable to determine the type of drive. But this seemed cosmetic, so I didn't attempt to fix it.

Parity: 	Unable to determine  	        7.3 TB	  205 MB/sec avg
Disk 1: 	HGST HDN728080ALE604 VLKK7EBZ  	8 TB	  163 MB/sec avg

So that seemed to be that. But what was odd is that the drive showed an incredibly consistent speed across most of its capacity. I was not expecting that. The drive holds 208 MB/sec until the 6.5T mark, and only then begins to decline, down to 171 MB/sec at the very end.

 

Literally as I am writing this, I think I have a possible explanation. The RAID card is in a PCIe 1.1 x1 slot, which has a max bandwidth of 250 MB/sec. Maybe the disk is hitting that bandwidth cap and can't return data any faster than 208 MB/sec, and it is only at the 6.5TB mark that the read speed drops below the bandwidth limitation. But 208 is somewhat less than 250, so I'm not sure whether that is it or not. I wanted to confirm that my changes are correct, and to ask if there are any other ideas you might have on the flat-line performance for this drive.
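(For reference on where the 250 MB/sec figure comes from: PCIe 1.1 runs at 2.5 GT/s per lane with 8b/10b encoding, so an x1 link tops out at 250 MB/sec of theoretical payload bandwidth. A quick sanity check:)

```shell
# PCIe 1.1 x1: 2.5 GT/s line rate, 8b/10b encoding (8 payload bits per
# 10 bits transferred), divided by 8 bits per byte
mbps=$(awk 'BEGIN { printf "%.0f", 2.5e9 * 8 / 10 / 8 / 1e6 }')
echo "${mbps} MB/s"
```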

 

( @johnnie.black, I know you've done a lot of testing with the script and wonder if you had any ideas.)

 

John, thanks for the script and any insights you might have!

7 minutes ago, bjp999 said:

which has a max bandwidth of 250 MB/sec.

 

That is the theoretical max speed; you'll never achieve that. There's always overhead, and in my experience PCIe usable bandwidth is about 70-80% of the theoretical max.

 

You can see some of the controllers I tested here:

 

30 minutes ago, bjp999 said:

So thought I would share the problem, my "fix", and ask for your perspective on whether you think my results appear valid. (Before the fix the parity disk was only showing one data point on the graph - at the starting line 0%.)

I haven't tested it with a RAID setup, though I recently installed an old Promise IDE card that I can experiment with. This particular issue is because hdparm is returning a 0 GB size, so the percentages all fall under zero. I'll need to code a different way to get the raw drive capacity.

 

33 minutes ago, bjp999 said:

unpleasant message about unable to determine the type of drive

This isn't the type of drive, it's the drive serial number, and for an array there really isn't a true value to display. But there might be one generated somewhere. Try running "lsscsi -i" to see if it gives a label in the far right column for the array. If it doesn't, there's really not much to do except seeing if there is a way to reliably identify this as a RAID and label it as such.

16 minutes ago, jbartlett said:

I haven't tested it with a RAID setup, though I recently installed an old Promise IDE card that I can experiment with. This particular issue is because hdparm is returning a 0 GB size, so the percentages all fall under zero. I'll need to code a different way to get the raw drive capacity.

 

This isn't the type of drive, it's the drive serial number, and for an array there really isn't a true value to display. But there might be one generated somewhere. Try running "lsscsi -i" to see if it gives a label in the far right column for the array. If it doesn't, there's really not much to do except seeing if there is a way to reliably identify this as a RAID and label it as such.

 

Thanks John! Hope below might help ...

 

/proc/mdcmd should give you what you need for array drives

 

Here are the values for my RAID0 parity (I gave it the name "PARITY" via its config tool. The controller assigned that long serial number ending with 223):

diskId.0=PARITY_0000001296206223
rdevName.0=sdb
rdevSize.0=7814036428

GB calculation:

7814036428 * 1024 / 1000/1000/1000 = 8001.57

 

And here is an example for a regular disk:

diskId.1=HGST_HDN728080ALE604_VLKK7xxx
rdevName.1=sdv
rdevSize.1=7814026532

GB calculation

7814026532 * 1024 / 1000/1000/1000 = 8001.56
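A throwaway sketch of the same arithmetic, assuming rdevSize from /proc/mdcmd is always reported in 1024-byte blocks (the block count below is hard-coded from the parity sample above; a live version would parse it out of /proc/mdcmd):

```shell
#!/bin/bash
# rdevSize from /proc/mdcmd appears to be in 1024-byte blocks; convert to
# decimal gigabytes to match the hand calculation above
blocks=7814036428                       # rdevSize.0 from the sample output
gb=$(awk -v b="$blocks" 'BEGIN { printf "%.2f", b * 1024 / 1000 / 1000 / 1000 }')
echo "${gb} GB"
```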

 

 

 

"lsscsi -i" returns a similar string, and also includes non-array disks. No size, though.

[1:0:0:0]    disk    Areca    PARITY           R001  /dev/sdb   PARITY_0000001296206223
[8:0:0:0]    disk    ATA      HGST HDN728080AL W91X  /dev/sdv   SATA_HGST_HDN728080ALVLKK7xxx

 

 

 

"fdisk -l /dev/sdX" returns size information:

 

Here is the output for the parity disk:

Disk /dev/sdb: 7.3 TiB, 8001573355520 bytes, 15628072960 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 445C476F-4F1C-4C22-9E38-5E13899A9987

Device     Start         End     Sectors  Size Type
/dev/sdb1     64 15628072926 15628072863  7.3T Linux filesystem
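If fdisk ends up being the source, the byte count could be scraped along these lines (the sample line mirrors the parity-disk output above; a live version would pipe `fdisk -l /dev/sdX` directly instead of the hard-coded string):

```shell
#!/bin/bash
# Grab the raw byte count from the "Disk ..." summary line of fdisk -l output
line='Disk /dev/sdb: 7.3 TiB, 8001573355520 bytes, 15628072960 sectors'
bytes=$(echo "$line" | grep -oE '[0-9]+ bytes' | cut -d' ' -f1)
echo "$bytes"
```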

 

Hope this helps!


I can't rely on mdcmd because the solution needs to be global, able to determine raw capacity for drives both in & out of the array. If memory serves, fdisk had issues with certain types of drives.

 

I'll need to run this command through my VM with just about every type of drive configured but it looks like "lsscsi -s" may work.
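A rough sketch of what parsing "lsscsi -s" output might look like, where the capacity appears in the last column (the sample line below stands in for a live run; on a real system you would pipe `lsscsi -s` directly):

```shell
#!/bin/bash
# Pull device name and raw capacity from an lsscsi -s style line. Taking the
# last two whitespace-separated fields sidesteps the variable field count
# caused by spaces in the vendor/model strings.
sample='[2:0:0:0]    disk    ATA      TOSHIBA DT01ACA3 ABB0  /dev/sdb   3.00TB'
out=$(echo "$sample" | awk '{ print $(NF-1), $NF }')
echo "$out"
```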

1 hour ago, johnnie.black said:

 

That is the theoretical max speed; you'll never achieve that. There's always overhead, and in my experience PCIe usable bandwidth is about 70-80% of the theoretical max.

 

You can see some of the controllers I tested here:

 

 

Interesting. 70%-80% of 250MB/sec would be 175 to 200 MB/sec. 208 MB/sec is looking pretty good, actually!

 

Seems my theory is probably reality. The max throughput through my Areca RAID card is 208 MB/sec; otherwise I'd probably be hitting 320+ on the outer sectors. Too bad, although practically speaking it makes little difference. It is as fast or faster than any of my other disks. But it might have helped on parallel accesses.

 

Looking to see if there is a PCIe 2.0 x1 RAID0 non-Marvell (say that 3 times fast) card. There is an Areca 1203-2I, but it is well over $200. Will keep my eyes open for a deal, but I'm in no hurry. Didn't see any other non-Marvell options.

  • 2 weeks later...

I'm making good progress on the next version, but I'd like to get your help in populating a collection of drive models. I currently have details on over 300 Western Digital & Seagate drives 1TB and larger, along with a few scattered others to complete my own personal set. The time-consuming part is finding the model numbers and an image of each drive to display instead of a default/generic drive image. If you'd like to help, and to ensure your own runs are more pleasant, please run the attached "diskmodels.sh" script. It sends the models of the drives you have connected to a web site app, which will report whether each model was added, was previously added, or contains unexpected characters. I've spent the past 4 days working on the drive model database and I'd rather work on the website code for now, so any help on getting models for non-Seagate/non-WD or old no-longer-sold drives in use will help.

 

If the diskmodels.sh execution of "lshw" seems to be running very slowly (more than a couple of seconds), spin up your hard drives.

 

The drive information list contains the model number, along with interface, RPM, and cache size if known. The rest of the drive details (not displayed yet in the images) is fetched from the OS itself.

 

To Do:

  • Display full controller information
  • Display full drive information including which port it's hooked to on the controller
  • Perform bandwidth tests on a controller using one and more drives
  • Perform speed testing on one or more drives, allowing 1 drive per controller to be tested at the same time
  • Display speed results in a graph with optional historical graph comparison from past runs to see if the drive is performing worse over time
  • Display a heat map of the drive and hopefully a heat map of the drive by platter

I'm developing this on an UNRAID server running a Lucee (ColdFusion app server) self-contained install and will see about porting it to PHP in the RC phase. I've been coding in ColdFusion since 1998 so I can develop code fast but I haven't done much development of new PHP code, just modifying existing apps. In the plugin app itself on the UNRAID GUI, the user will have the option to enable/disable the Lucee server as the server itself will use up to 256MB of RAM in addition to however much file space the app server takes - so it won't run on a system with less than 2 GB of available RAM (1.25GB of that is for the speed testing itself).

 

I'm pretty excited about this project. It seems like there's nothing else out there like it. The closest thing was an old text-only app.

 

Attached are screen shots of my two UNRAID systems listing the controllers & the drives attached to those controllers.

main.png

backup.png

diskmodels.sh


I ran your diskmodels.sh and this was my output. 

Is that what you are looking for?

 

Checking model [HGST HDN724040AL] - Added
Checking model [ST8000AS0002-1NA] - Previously added
Checking model [Cruzer] - Added
Checking model [Samsung SSD 850] - Previously added
 

6 hours ago, Harro said:

I ran your diskmodels.sh and this was my output. 

is that what you are looking for?

 

Yup! I don't need C&P of the runs, but I do see that you & others have run the script. The Cruzer is probably your USB drive; they've been a bit harder to trace in the OS /sys/devices tree, so they're not in yet. I wouldn't be able to do more than a generic brand icon for "Cruzer" anyway. But I do plan on adding support for putting in your own drive image, with the option of overlaying the drive size over the image and the ability to position, rotate & highlight the text to best fit the image.

29 minutes ago, jbartlett said:

 

I've seen that behavior on my systems if the drives aren't spun up. But no worries, thanks for the attempt!

 

Maybe I should read the instructions a little better!

 

There IS a significant delay in running that command that folks should be aware of. I'd give it at least 30 seconds even after the disks are spun up.

 

I added a total of 11 drive types between my prod and backup servers (not including USB sticks). One of them is my RAID0 parity (just called PARITY), which might not be much help. The rest are a mixture of mostly Hitachi/HGST drives (2T - 8T), plus a Toshiba 5T, a WD 2T, and an ancient 750G Seagate.

 

Hope this helps!

 

11 hours ago, bjp999 said:

There IS a significant delay in running that command that folks should be aware of. I'd give it at least 30 seconds even after the disks are spun up.

 

There are command line options to disable certain tests not related to the storage subsystem, but they didn't seem to do anything for me. I opted to just leave it as-is to protect against any issues on the different systems.

On 7/9/2017 at 7:25 AM, jbartlett said:

I'm making good progress on the next version but I'd like to get your help in populating a collection of drive models. [...]

Attached are screen shots of my two UNRAID systems listing the controllers & the drives attached to those controllers.

main.png

backup.png

diskmodels.sh

 

@jbartlett

 

What tool did you use to produce this output?

 

