Turbo Read



We already have turbo write, so how about investigating turbo read?

 

@limetech This is just a rough sketch of what would be required. I realize how much work it would take to implement, but I think it would be worth at least rolling the idea around, and maybe throwing together a proof-of-concept md driver to see if it's even feasible.

 

What I am envisioning is a read request being split into two threads, the physical disk thread and the parity-reconstruction thread, one requesting the first sector of a file, the other requesting the second. Then, as each thread returns with valid data, it requests the next queued sector. Theoretically you could almost double throughput, depending on bottlenecks.

Link to comment

The entire Linux I/O path is already highly threaded with multiple queues.  Are you seeing a read bottleneck?  "Turbo" write results in faster write throughput (vs. unRaid "normal" read/modify/write) because it eliminates the extra rotational latency of mechanical hard drives, but at the cost of requiring all drives to be spun up and active in each write operation.
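
For readers unfamiliar with the two write modes, here is a minimal sketch of how each one arrives at the new parity block. It assumes a single XOR parity disk and uses hypothetical helper names (`xor_blocks`, `rmw_parity`, `reconstruct_parity`); it is an illustration of the arithmetic, not the md driver's code.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR any number of equal-length byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rmw_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    # Read/modify/write: the old data and old parity must be read first,
    # then the same sectors are written back on both disks.
    return xor_blocks(old_parity, old_data, new_data)


def reconstruct_parity(new_data: bytes, other_disks: list) -> bytes:
    # "Turbo" (reconstruct) write: read the same block from every other
    # data disk and recompute parity from scratch. No read-before-write on
    # the target or parity disk, but every drive has to be spinning.
    return xor_blocks(new_data, *other_disks)
```

Both paths produce the same parity; the difference is which disks are touched and whether a disk has to read a sector and then wait for it to come around again before writing it.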

Link to comment
30 minutes ago, limetech said:

Are you seeing a read bottleneck?

Only that imposed by all reads being serviced by a single device. What I'm asking is whether you can read from all the drives at once: roughly half of the data directly from the single device, and the other half by parity reconstruction of that same data from the rest of the drives.

Link to comment

Ok I think I understand what your idea is:  Suppose we are reading a large file, where the transfer is broken up into a set of "segments", each of some size (presumably a large size, like 1MB or more).  We can number the segments like this:

 

0, 1, 2, 3, ..

 

Your idea is to read say the even numbered segments from the source drive, and the odd numbered segments from reconstruct-read of all the other drives.  The key being that a read of an even segment is happening in parallel with reconstruct-reads of an odd segment.

 

Correct?  If so, I don't think this will help much because the killer in disk I/O is rotational latency.  If the segment size were "just right" you might see a speed-up, but because HDDs are highly zoned (variable track size), I think it would be almost impossible to pick the right size, and it would have to vary as the transfer progressed.  Interesting idea though...
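
To make the even/odd scheme concrete, here is a small sketch, assuming single XOR parity and in-memory byte strings standing in for disks. `SEGMENT`, `direct_read`, `reconstruct_read`, and `split_read` are hypothetical names, and the loop runs sequentially; a real implementation would issue the two kinds of reads in parallel.

```python
from functools import reduce

SEGMENT = 4  # bytes per segment; tiny on purpose, the post suggests 1MB or more


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR any number of equal-length byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def direct_read(disk: bytes, seg: int) -> bytes:
    """Read one segment straight from a disk."""
    return disk[seg * SEGMENT:(seg + 1) * SEGMENT]


def reconstruct_read(others: list, parity: bytes, seg: int) -> bytes:
    """Rebuild the target disk's segment from parity plus all other disks."""
    return xor_blocks(direct_read(parity, seg),
                      *(direct_read(d, seg) for d in others))


def split_read(target: bytes, others: list, parity: bytes, n_segments: int) -> bytes:
    """Even segments come from the target disk, odd ones from reconstruction."""
    out = []
    for seg in range(n_segments):
        if seg % 2 == 0:
            out.append(direct_read(target, seg))
        else:
            out.append(reconstruct_read(others, parity, seg))
    return b"".join(out)
```

The sketch only shows that the two sources return identical data; it says nothing about speed, which is where the rotational-latency objection comes in.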

Link to comment
13 minutes ago, limetech said:

The key being that a read of an even segment is happening in parallel with reconstruct-reads of an odd segment.

Yep. Exactly what I envisioned, except maybe more optimized to allow the faster leg of the parallel process to do more of the reading. In the extreme of an old slow ailing drive, one set of reads would be much faster than the other set, so the faster process would get more segments.
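
A rough way to picture the "faster leg gets more segments" variant is a shared queue that each leg pulls from as soon as it is free. The sketch below is a toy simulation under strong assumptions: each leg has a fixed per-segment time, and there is no penalty for skipping over segments the other leg handled. The function name `schedule` and its parameters are hypothetical.

```python
def schedule(n_segments: int, direct_ms: float, recon_ms: float):
    """Hand each segment to whichever leg becomes free first.

    Returns (segments handled per leg, time at which the last leg finishes).
    """
    free_at = {"direct": 0.0, "recon": 0.0}   # when each leg is next idle
    cost = {"direct": direct_ms, "recon": recon_ms}
    counts = {"direct": 0, "recon": 0}
    for _ in range(n_segments):
        # min() returns the first key on a tie, so ties go to the direct leg.
        leg = min(free_at, key=free_at.get)
        free_at[leg] += cost[leg]
        counts[leg] += 1
    return counts, max(free_at.values())
```

For example, `schedule(8, 10, 30)` gives the direct leg six segments and the reconstruct leg two, finishing at 60 ms instead of the 80 ms the direct leg would need alone. With an ailing drive (say `direct_ms` much larger than `recon_ms`) the split tips the other way.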

 

25 minutes ago, limetech said:

I don't think this will help much because the killer in disk I/O is rotational latency.

So unless the segment size is large enough that both processes spend more time reading data than they waste skipping to the next segment, there are no gains. :$

 

25 minutes ago, limetech said:

Interesting idea though...

Oh well, for a few minutes I thought I might have had an unraid game changer idea. :) 

 

I know that when a drive is severely ill, sometimes reading the data is MUCH faster when you remove the drive and read from the parity emulated volume.

Link to comment

Well it's a clever idea for sure.  Not sure it's a game changer... Another application for this would be the small random read case:  have some requests satisfied by reading the drive directly, others by reconstruct-read of the other drives.  This is similar to the read speed-up with RAID-1 configs.  However, the reconstruct-read requests would be considerably slower on average vs. reading directly from the target drive, assuming all devices are about the same speed.  This is because in the recon-read case we must read from all the other drives before we can reconstruct, so we end up waiting for the slowest of them, and the effective "seek" for each of those operations will be much longer.  A seek is composed of two parts: move the r/w head over the proper track, then wait for the sector to rotate underneath the r/w head.  In the "average seek" case, we wait 1/2 disk revolution.  In the recon-read case we wait almost 1 full revolution.  For a disk spinning at, say, 5400 rpm this is about 11 ms vs. 5.5 ms, which would be pretty significant I think.
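
The 11 ms vs. 5.5 ms figures can be checked with a few lines of arithmetic. This sketch assumes all drives spin at the same speed and that their rotational positions are independent and uniformly distributed, in which case the expected wait for the slowest of n drives is n/(n+1) of a revolution. The function names are hypothetical.

```python
def revolution_ms(rpm: float) -> float:
    """Time for one full platter revolution, in milliseconds."""
    return 60_000.0 / rpm


def avg_latency_single(rpm: float) -> float:
    """Average rotational latency for one drive: half a revolution."""
    return revolution_ms(rpm) / 2


def avg_latency_wait_for_all(rpm: float, n_disks: int) -> float:
    """Expected wait until the slowest of n independent drives is in position.

    The expected maximum of n uniform fractions of a revolution is n/(n+1),
    which approaches one full revolution as n grows.
    """
    return revolution_ms(rpm) * n_disks / (n_disks + 1)


if __name__ == "__main__":
    rpm = 5400
    print(f"one revolution:      {revolution_ms(rpm):.2f} ms")
    print(f"direct read:         {avg_latency_single(rpm):.2f} ms")
    print(f"recon-read, 7 disks: {avg_latency_wait_for_all(rpm, 7):.2f} ms")
```

With only a handful of drives the reconstruct-read wait is somewhat under a full revolution, and it creeps toward the full 11 ms as the array gets wider.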

Link to comment
4 minutes ago, limetech said:

Have some requests satisfied by directly reading the drive, others by reconstruct-read of the other drives.

So if the direct read is fully saturating a specific disk, and a request comes in for another file on the same drive, it could be satisfied by reconstruct. But, what happens when a read for another disk interrupts the reconstruct? I guess both transfers suffer badly.

 

So, my "turbo read" mode would kneecap simultaneous multi disk reads, but augment simultaneous file access from a single disk. I guess the overall possible benefit is probably too small to warrant much work. :( You would probably have to implement a cumbersome logic tree to figure out in what circumstances to try to boost a read without killing other I/O at the same time.

 

Maybe when we are using SSD media and seek times are no longer the bottleneck? :D

Link to comment
