DiskSpeed, hdd/ssd benchmarking (unRAID 6+), version 2.10.7



8 hours ago, johnnie.black said:

Very cool. If you need help testing, I have some disks with known slow sectors.

That would be awesome! If you wouldn't mind, please execute this command against such a drive from a command line. It'll read the entire drive, 1 GB at a time, logging how long it took to read that GB to a file. Replace the two "sdx" with the Drive ID and email the resulting file to [email protected] along with the make & model.

 

dd if=/dev/sdx of=/dev/null bs=1GB skip=0 iflag=direct status=progress conv=noerror 2> /mnt/user/appdata/sdx.scan
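For anyone who wants to look at the resulting file themselves, here is a minimal sketch of how such a log could be read. It assumes GNU dd's usual progress format ("N bytes (...) copied, T s, rate"), updates separated by carriage returns, and a locale that uses "." as the decimal separator. The function names are made up for illustration and are not part of DiskSpeed.

```python
import re

# Matches GNU dd progress lines such as:
#   1000000000 bytes (1.0 GB, 954 MiB) copied, 5 s, 200 MB/s
PROGRESS = re.compile(r"(\d+) bytes .*? copied, ([\d.]+) s")


def parse_dd_progress(text):
    """Return a list of (bytes_copied, elapsed_seconds) samples."""
    samples = []
    # status=progress overwrites its line using "\r", so split on both.
    for chunk in re.split(r"[\r\n]+", text):
        match = PROGRESS.search(chunk)
        if match:
            samples.append((int(match.group(1)), float(match.group(2))))
    return samples


def time_per_step(samples):
    """Return (start_offset_bytes, seconds, MB/s) for each advance of the byte counter."""
    steps = []
    prev_bytes, prev_secs = 0, 0.0
    for nbytes, secs in samples:
        if nbytes > prev_bytes:  # ignore updates where no new block finished
            elapsed = secs - prev_secs
            rate = (nbytes - prev_bytes) / elapsed / 1e6 if elapsed > 0 else None
            steps.append((prev_bytes, elapsed, rate))
            prev_bytes, prev_secs = nbytes, secs
    return steps
```

Since dd only refreshes the progress line about once a second, the timings are coarse, but a region that takes far longer than its neighbours should still stand out.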

Link to comment
2 hours ago, jonathanm said:

That sounds like a recipe to enable parity based file corruption detection. Instead of just telling you that parity is wrong at a certain address, you could take that address and generate a list of potentially affected files, one on each drive, if a file exists at that address. Then before you correct parity, you could verify each file against backup or external checksum to determine whether the data drive should be updated instead. With dual parity, maybe you could correct corruption automatically?

 

It's been too long since I worked through the scenarios with dual parity and data correction, but I have it in the back of my mind that one of the stumbling blocks was determining exactly which files corresponded to a specific address.

Interesting! The filefrag command reports which physical blocks a file occupies (documenting here so I don't have to rediscover it again in another week like I had to do today) but it only works against a directly mounted drive and not against a combined directory such as /mnt/user.

root@NASBackup:/mnt/disk1/Movies/xXx# filefrag -e -b512 xXx.m4v
Filesystem type is: 58465342
File size of xXx.m4v is 6129442384 (11971568 blocks of 512 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..  262015: 15548729960..15548991975: 262016:
   1:   262016.. 1048447: 15539024720..15539811151: 786432: 15548991976:
   2:  1048448.. 4194175: 15530636232..15533781959: 3145728: 15539811152:
   3:  4194176..11971567: 15444185408..15451962799: 7777392: 15533781960: last,eof
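As a rough illustration of how this output could be used to answer "does this file cover a given address?", here is a small sketch that parses the extent table and looks up a physical block. It assumes the column layout shown above and that the lookup uses the same 512-byte units and the same origin as filefrag's physical_offset column. Those offsets should be relative to the filesystem's block device, so a whole-disk sector number would likely need the partition start subtracted first. The helper names are hypothetical.

```python
import re
from collections import namedtuple

Extent = namedtuple("Extent", "logical_start logical_end physical_start physical_end length")

# One extent row: "ext: logical_start..logical_end: physical_start..physical_end: length:"
EXTENT_ROW = re.compile(
    r"^\s*(\d+):\s*(\d+)\.\.\s*(\d+):\s*(\d+)\.\.\s*(\d+):\s*(\d+):",
    re.MULTILINE,
)


def parse_extents(filefrag_output):
    """Extract the extent table from `filefrag -e` output."""
    return [
        Extent(*(int(g) for g in match.groups()[1:]))
        for match in EXTENT_ROW.finditer(filefrag_output)
    ]


def logical_block_for(extents, physical_block):
    """Return the file's logical block stored at physical_block, or None if the file is not there."""
    for ext in extents:
        if ext.physical_start <= physical_block <= ext.physical_end:
            return ext.logical_start + (physical_block - ext.physical_start)
    return None
```

Running this over every file on a disk would be slow, so a real implementation would probably build an index of extents once and then search it per bad address.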

The issue I'm trying to work through now is being able to tell what physical drive is mounted where on the file system. We may know that sdb is mounted at /mnt/disk1 but every unix command I've been able to find so far shows that "/dev/md1" is mounted. I haven't found a way yet to trace back to find the physical drive sdb from md1 or take sdb and see how it's linked to the virtualfs md1. The purpose of this is to be agnostic in case the Docker is run on a non-unraid OS.

Link to comment
1 hour ago, jbartlett said:

The issue I'm trying to work through now is being able to tell what physical drive is mounted where on the file system. We may know that sdb is mounted at /mnt/disk1 but every unix command I've been able to find so far shows that "/dev/md1" is mounted. I haven't found a way yet to trace back to find the physical drive sdb from md1 or take sdb and see how it's linked to the virtualfs md1. The purpose of this is to be agnostic in case the Docker is run on a non-unraid OS.

Looks like the file /proc/mdstat gives me the linking information I need.

diskNumber.1=1
diskName.1=md1
diskSize.1=9766436812
diskState.1=7
diskId.1=WDC_WD100EFAX-68LHPN0_JEGTUUVM
rdevNumber.1=1
rdevStatus.1=DISK_OK
rdevName.1=sdg
rdevOffset.1=64
rdevSize.1=9766436812
rdevId.1=WDC_WD100EFAX-68LHPN0_JEGTUUVM
rdevNumErrors.1=0
rdevLastIO.1=0
rdevSpinupGroup.1=0

I can loop over that to map out the md mounts.
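A minimal sketch of that loop, assuming the key.N=value layout shown above (this is the format unRAID's md driver prints; a stock Linux /proc/mdstat looks different). The function names are illustrative only.

```python
def parse_mdstat(text):
    """Group unRAID-style 'key.N=value' lines into {N: {key: value}}."""
    slots = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        name, dot, index = key.partition(".")
        # Skip lines without '=' or without a numeric '.N' suffix.
        if sep and dot and index.isdigit():
            slots.setdefault(int(index), {})[name] = value
    return slots


def md_to_physical(slots):
    """Map each md device name (e.g. 'md1') to its physical device (e.g. 'sdg')."""
    return {
        slot["diskName"]: slot["rdevName"]
        for slot in slots.values()
        if slot.get("diskName") and slot.get("rdevName")
    }


if __name__ == "__main__":
    with open("/proc/mdstat") as handle:
        for md_name, device in sorted(md_to_physical(parse_mdstat(handle.read())).items()):
            print(f"/dev/{md_name} -> /dev/{device}")
```

Empty slots are skipped because they have no diskName or rdevName value to map.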

Link to comment

I am sorry if this has been covered in the topic already, I did search.

 

Found this great app today but am having a few problems with it on an LSI 9300-8i. It seems to benchmark the 2x SSDs (cache and unassigned for plex) that are connected directly to my motherboard but the 8x drives connected to my HBA (LSI 9300-8i) are not scanned. It just seems to hang here. 30 minutes and counting. 

 

image.thumb.png.ebeee5c52d6844c6b58e0b8fc4542ccb.png

 

Incidentally, the controller and drives are accurately represented at the beginning before starting the benchmark. Maybe I'm doing something wrong?

Link to comment
16 hours ago, SpuddyUK said:

Found this great app today but am having a few problems with it on an LSI 9300-8i. It seems to benchmark the 2x SSDs (cache and unassigned for plex) that are connected directly to my motherboard but the 8x drives connected to my HBA (LSI 9300-8i) are not scanned. It just seems to hang here. 30 minutes and counting.

 

Incidentally, the controller and drives are accurately represented at the beginning before starting the benchmark. Maybe I'm doing something wrong?

See the period at the end of the line "Click on a drive label to hide or show it."? There's a hidden link on that period. Clicking it makes visible the hidden iframes that are doing all the actual work, so you'll be able to see what's happening and whether there's an error.

 

It sounds like the drives are properly being detected but there's an error happening when it's trying to do read tests against the drives on that controller.

 

Copy-n-paste any error message. You do not need to copy-n-paste the Java stack trace.

Link to comment

Thanks a lot for this awesome tool! It looks like I finally figured out the issues with my server that have bugged me pretty much since I started running it. Not yet sure why the results from one PCIe x8 slot are so much worse than in the other x8 one, but still, at least it finally works in the latter.

 

However, even after "fixing" it this way, I’m still worried about my parity drive (Seagate, I’ve lost so many of them...). For some reason, it is noticeably slower at first, with the rest of the graph looking kinda okay (not exactly linear, but linear-ish). The first test is reproducibly around 25 MB/s slower than the second, while the rest of the "valleys" and "bumps" fluctuate with each test. That’s why I focused on the start of the test, although my other drives look better overall. What do you think, should I be worried?

 

(Oh, by the way, I tested every 5% with this test, so I guess the curve won’t necessarily look as linear that way?)

 

Bildschirmfoto 2020-05-08 um 11.38.56.png

Edited by aureus
Link to comment
On 2/4/2019 at 7:02 AM, jbartlett said:

Good idea! I'll add it to my To Do list.

I was looking for a way to export, so I can use the data gained and the positions of the drives in my array to try and debug issues.

It doesn't look like anything like that has been implemented yet?


I'm running the latest, and I'll probably just old school copy and paste and screen shot, but it would be nice to have an export function!

Link to comment
2 hours ago, alexdodd said:

I was looking for a way to export, so I can use the data gained and the positions of the drives in my array to try and debug issues.

It doesn't look like anything like that has been implemented yet?


I'm running the latest, and I'll probably just old school copy and paste and screen shot, but it would be nice to have an export function!

Nope, not yet.

 

I'm working on a surface read speed heatmap right now, then plan on working on a drive utilization map which can help pinpoint what files are located on bad sectors.

 

image.png.1e0520a03fb923391284bc0dad77577f.png

 

image.png.0d7a1e881b8c9ef7ed1f4423ab886e7e.png

 

image.thumb.png.1d6a15678508c89463e7cfc3ad58326c.png

Link to comment

The heatmap looks pretty incredible, I'll grant you that. I have a few wobbly Seagate SAS drives, which I will be moving out of service, but they would probably be sufficient to test on if you want some feedback. I'm waiting on some new breakout cables first, though. Had quite the ride moving hardware the last week....

Link to comment
  • 4 weeks later...
On 6/1/2020 at 12:11 PM, Alexstrasza said:

For some reason drive images don't seem to be downloading for me, even though I've checked the database and my drives are present.  If I add a custom image, and then click "reset image", they show up then.

Drive images are downloaded when you scan the hardware. Do you see the following message: "There was an error fetching the initial drive images"?

Link to comment
19 hours ago, jbartlett said:

Drive images are downloaded when you scan the hardware. Do you see the following message: "There was an error fetching the initial drive images"

No, I don't get any errors at all, yet the drive images don't appear unless I perform that "reset image" shuffle.

Scanning Hardware
19:31:49 Spinning up hard drives
19:31:49 Scanning system storage
19:31:50 Scanning USB Bus
19:31:56 Scanning hard drives
19:31:57 Scanning storage controllers
19:31:57 Scanning USB hubs & devices
19:31:57 Scanning motherboard information
19:31:57 Fetching known drive vendors from the HDDB
19:31:58 Found controller SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon]
19:31:58 Found drive Seagate ST10000DM0004 Rev: DN01 Serial: / (sdb)
19:31:58 Found drive Western Digital WD80EZAZ Rev: 83.H0A83 Serial: / (sdc)
19:31:58 Found drive Western Digital WD60EFRX Rev: 82.00A82 Serial: / (sdd)
19:31:58 Found drive Western Digital WD80EMAZ Rev: 81.00A81 Serial: / (sde)
19:31:58 Found drive Western Digital WD80EMAZ Rev: 81.00A81 Serial: / (sdf)
19:31:58 Found controller P1 NVMe PCIe SSD
19:31:58 Found drive CT1000P1SSD8 Rev: P3CR013 Serial: / (nvme0n1)
19:31:58 Found controller Matisse USB 3.0 Host Controller
19:31:58 Found drive SanDisk' Cruzer Fit Rev: 0 Serial: / (sda)
19:31:58 Found controller P1 NVMe PCIe SSD
19:31:58 Found drive CT1000P1SSD8 Rev: P3CR013 Serial: / (nvme1n1)
19:31:58 Fetching drive images
19:31:58 Fetching Drive Platter Information
19:31:59 Checking Hard Drive Database for drives
19:31:59 Configuration saved
Link to comment

Firstly, thank you for this great tool.

 

I have successfully used it, and have attached the screenshot.

 

FST4P.png

 

Is there a website where these submitted benchmarks are visible? I would like to see if the graphs for my 2TB WD Reds are the norm.

 

2TB WD Red WD20EFAX (Disk 1)

FSTzR.png

 

2TB WD Red WD20EFAX (Disk 2)

FST7Z.png

Ever since reading this article https://arstechnica.com/gadgets/2020/05/western-digital-gets-sued-for-sneaking-smr-disks-into-its-nas-channel/

I became more interested in my 'great purchase' of WD Red EFAX and began to wonder about their long term performance and reliability for NAS.

 

My starting array was the 4TB as parity1 for the 2 x 2TB. When I ran out of room (ty Sonarr), I added an 8TB Ironwolf as parity2, then moved the 4TB into the array once parity2 had finished syncing. That is why you will see Parity 2 and no parity1.

The 4TB has not shown the same tracking as the 2TB drives, but its peak/low speeds were worse.

 

4TB WD Red WD40EFAX

FST86.png

 

8TB Seagate Ironwolf ST8000VN004

FSTcS.png

 

 

Now the good news is I took the advice in the article, contacted WD support, and I have an RMA now and they will swap the drives to the previous model that does not use SMR.

 

It will be interesting to see the benchmark changes with the incoming new drives.

PS:

 

Your introduction post shows a 10TB WD RED WD100EFAX (Disk 1)

FSTv2.pngFSTvH.png

Which shows none of the performance issues I saw on mine.

 

So I went searching for more information and found this tidbit 

Quote

They definitely are SMR drives. It’s pretty clear when you look at platter count. Keep in mind that SMR isn't necessarily a bad thing. It all depends on use case. Anyways I ended up going all WD100EFAX drives which are still PMR and very happy I did.

 

https://community.synology.com/enu/forum/1/post/127228

 

Edited by Dauser
Established reason for 10TB possible variance
Link to comment
On 6/3/2020 at 11:34 AM, Alexstrasza said:

No, I don't get any errors at all, yet the drive images don't appear unless I perform that "reset image" shuffle.

I'll add some debugging to the scanning process that'll hopefully help pinpoint your issue.

 

11 hours ago, Dauser said:

Is there a website where these submitted benchmarks are visible? I would like to see if the graphs for my 2TB WD Reds are the norm.

You'll find it eventually at http://strangejourney.net/hddb.cfm?View=models&Vendor=Western Digital&Model=WD20EFAX but I haven't added graphs for the current version of the benchmark. The previous version averaged out all scans, which was misleading because scans of bad drives and of drives unable to reach full speed due to the interface were throwing things off. I'll be adding a graph that shows the scans for each individual uploaded drive instead of an average.

Link to comment

Hi @jbartlett this is a really cool tool, thank you.

 

The tests are finding a few problems. Note that Disk 4 is on the onboard controller and the others are on a SAS HBA card.

  1. First, the app reports a bandwidth cap error on all my SATA drives.
  2. Second, I'm getting frequent Speed Gap notices when I run the all-disk benchmark.
  3. Third, the graph appears to be pretty shaky/wobbly for several of my disks.

I attached a shot of the graph. Any suggestions for what is causing this behavior?

 

 

 

Screen Shot 2020-06-05 at 8.42.51 PM.png

Screen Shot 2020-06-05 at 8.50.31 PM.png

Edited by scud133b
added images
Link to comment
23 hours ago, scud133b said:

First, the app reports a bandwidth cap error on all my SATA drives

It's not an error; it means that the read speed over a span was basically the same, which hints that the drive can output data faster than the controller can ingest it.
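To illustrate the idea (this is only a sketch, not the app's actual check), a flat stretch can be found by looking for the longest run of consecutive samples whose speeds stay within a small tolerance of each other. A spinning drive normally slows down steadily from the outer to the inner tracks, so a long flat run at the start suggests something upstream is limiting it. The function name and the tolerance value are assumptions.

```python
def longest_flat_run(speeds, tolerance):
    """Return (start_index, length) of the longest run where max - min <= tolerance."""
    best_start, best_len = 0, 0
    start = 0
    for end in range(len(speeds)):
        # Shrink the window from the left until it is "flat" again.
        while max(speeds[start:end + 1]) - min(speeds[start:end + 1]) > tolerance:
            start += 1
        if end - start + 1 > best_len:
            best_start, best_len = start, end - start + 1
    return best_start, best_len


if __name__ == "__main__":
    # Hypothetical MB/s readings taken at evenly spaced positions across a drive.
    readings = [180, 181, 180, 179, 180, 165, 150, 130]
    print(longest_flat_run(readings, tolerance=3))
```

How long a flat run has to be before it counts as a cap is a judgment call; the app's own rule is not documented in this thread.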

 

On 6/5/2020 at 6:43 PM, scud133b said:

Second, I'm getting frequent Speed Gap notices when I run the all-disk benchmark.

When the app analyzes the dd progress log, it saves the min & max read speeds. A "SpeedGap" means there is a larger than expected gap (or span) between these two numbers. This typically indicates that the drive was accessed while the scan was running. Since this app runs inside a Docker container and can only run while the array is up and available, this check is there to mitigate such occurrences. It can also indicate a failing drive if the app keeps having to retry reading an area, increasing the allowed SpeedGap range after each attempt.
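A rough sketch of that logic, under stated assumptions: the threshold values, the retry count, and the `read_speeds` callback (anything that re-reads a spot and returns a non-empty list of MB/s samples) are all hypothetical and not taken from the app.

```python
def speed_gap(speeds, allowed_gap):
    """Return (exceeded, lowest, highest) for a non-empty list of MB/s samples."""
    lowest, highest = min(speeds), max(speeds)
    return (highest - lowest) > allowed_gap, lowest, highest


def scan_with_retries(read_speeds, initial_gap=45.0, step=5.0, max_attempts=5):
    """Retry a read test, widening the allowed gap after every failed attempt.

    Returns (attempt_number, allowed_gap_used, average_speed) on success,
    or None if every attempt still showed too wide a gap.
    """
    allowed = initial_gap
    for attempt in range(1, max_attempts + 1):
        speeds = read_speeds()
        exceeded, _, _ = speed_gap(speeds, allowed)
        if not exceeded:
            return attempt, allowed, sum(speeds) / len(speeds)
        allowed += step  # tolerate a bit more variation on the next try
    return None
```

A spot that only passes after several retries, or never passes, is the kind of result that would point at either outside disk activity or a drive that is struggling to read that area.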

On 6/5/2020 at 6:43 PM, scud133b said:

Third, the graph appears to be pretty shaky/wobbly for several of my disks.

Try swapping which drives are on which controller. This may help determine if it is tied to the controller or not.

Link to comment
On 6/6/2020 at 9:04 PM, jbartlett said:

When the app analyzes the dd progress log, it saves the min & max read speeds. A "SpeedGap" means there is a larger than expected gap (or span) between these two numbers. This typically indicates that the drive was accessed while the scan was running. Since this app runs inside a Docker container and can only run while the array is up and available, this check is there to mitigate such occurrences. It can also indicate a failing drive if the app keeps having to retry reading an area, increasing the allowed SpeedGap range after each attempt.

 

Try swapping which drives are on which controller. This may help determine if it is tied to the controller or not.

I gathered that the drives might be accessed during the test, so I shut down all VMs and all other Docker containers (so this was the only thing running), and that produced the test results you saw in my post.

 

I haven't had a chance to try swapping disks between the onboard controller and the PCIe controller but I can do that next if you think it might illuminate something.

 

Thanks

Edited by scud133b
Link to comment
23 hours ago, scud133b said:

I haven't had a chance to try swapping disks between the onboard controller and the PCIe controller but I can do that next if you think it might illuminate something.

It's worth a try only if you care enough to see if it's the controller or not. The "dd" utility is doing all the reporting, my app just takes what it spits out. The graph is indicative of the real performance you're getting but it's not likely something you'd actually ever notice.

Link to comment
On 6/14/2020 at 12:19 PM, jms2321 said:

I just installed this plugin, but it's not displaying any images for the identified drives, despite images being in your database. Is there any way to either add them directly to the plugin or have them refresh and display?

This is the 2nd report I've gotten of this. I'll look into it further. I moved the whole back-end system off my private server and onto "the cloud" a couple months ago and there might be something with that.

 

Can you check the following share: "\\tower\appdata\DiskSpeed\Instances\local" and email the file DriveImageHTTPResult.json to [email protected]? This will help me see if there's a transport issue.

 

In the meantime, you can visit http://strangejourney.net/hddb.cfm?View=images to look for brand images. I haven't updated this in about a year so if your drive isn't found, you can typically check the site you ordered it from to get an image to use. Clicking any drive image will download it to your PC. In the DiskSpeed app, click on the drive you want to update the image for and then click the "Edit" button under it. Click the "Upload Image" button. Click on the upload area or drag the downloaded file to it. On the edit screen, set the viewable range or click the expand button to use all and click "Save"

Link to comment
3 hours ago, jms2321 said:

I have emailed the json file as asked.

Thank you. For some reason, you are getting back basically an empty packet with no data. I'll need to add some debugging on my end that'll log the requests coming in and the responses. I'll reply to your email when it's done though I don't have an ETA for when it'll be ready as I'm getting married on the 27th at my home and we're doing basically everything 'cause of COVID-19. :)

Link to comment

I do appreciate your hard work and time on this minor issue.  I look forward to trying your fix.  Could one source of the issue be the NVMe drive that I have installed?  I know that it was picking up everything before I added the NVMe drive.  So I wonder if that might be the source of the issue?

Link to comment
1 hour ago, jms2321 said:

I do appreciate your hard work and time on this minor issue.  I look forward to trying your fix.  Could one source of the issue be the NVMe drive that I have installed?  I know that it was picking up everything before I added the NVMe drive.  So I wonder if that might be the source of the issue?

I don't suspect it is. The only information being passed is a list of vendors & model numbers.

 

ETA: This got me thinking - what are the manufacturer & model number of the drive displayed in the app?

Edited by jbartlett
Link to comment
  • jbartlett changed the title to DiskSpeed, hdd/ssd benchmarking (unRAID 6+), version 2.10.7
