Turbo Write


jonp


Hi Guys,

 

What's the status of this? I've had a look at the RC and can't see any options for it anywhere; I could be blind. Did it get implemented? I assumed yes, because it's still on the v6.0 roadmap.

 

I currently have a 3-disk array in my backup server without a cache drive, and wondered what the impact of this feature would be.

 

Daniel

Link to comment

I feel like this is a prime candidate for a quick plug-in until LT embeds it. I just don't know if anyone will want to put in the work knowing it will be superseded. Then again, maybe their work could be used by LT to make it happen faster in 6.1 (6.0 is officially in code freeze, after all).

 

As to your question, the feature should help you a fair bit in all likelihood. You know you can toggle it manually, right?

/root/mdcmd set md_write_method 1    # turbo write on
/root/mdcmd set md_write_method 0    # turbo write off (normal read-modify-write)

Link to comment

what the impact of this feature would be?

 

Accelerated writes to the protected RAID array, depending on where you are writing.

 

Populating a single array disk from a source could reach as high as 90MB/s over the network for some operations.

Once you start writing to the other disks simultaneously, this benefit starts to diminish.

Link to comment

How is this achieved (technically)?

 

"the ability for smaller arrays to write data faster without caching."

 

The HDD's write cache? The OS's file cache? The cache drive?

Why does it need all drives to be spun up? On a write operation it's just the parity + data drive being written to, right?

Link to comment

Normally unRAID does:

read of parity,

read of data,

XOR new data,

write data,

write parity (order may not be correct, but that's the concept).

 

This equates to 4 I/O operations, 2 on each drive, potentially requiring 2 rotations of each drive.
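The parity update behind those 4 I/O operations is a simple XOR identity: new parity = old parity XOR old data XOR new data. A quick sketch using shell arithmetic, with made-up byte values standing in for sector contents:

```shell
# Read-modify-write parity update (conceptual).
# The hex values below are hypothetical sector bytes, not real array data.
old_data=$(( 0xA5 ))
new_data=$(( 0x3C ))
old_parity=$(( 0xF0 ))

# new_parity = old_parity XOR old_data XOR new_data
new_parity=$(( old_parity ^ old_data ^ new_data ))
printf 'new parity: 0x%02X\n' "${new_parity}"   # prints: new parity: 0x69
```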

 

With Turbo Write:

the new data sector is written (1 write),

all of the other data drives are read (as many reads as there are data drives),

the XOR is calculated,

and the parity drive is written with the new parity.

 

This takes advantage of drive caches and sequential operations.
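In turbo mode the new parity isn't patched up from the old parity at all; it's recomputed as the XOR of every data drive's contents, new data included. A minimal sketch with hypothetical sector values for a 4-data-drive array:

```shell
# Turbo-write parity: XOR of all data drives' sector values.
# All byte values are hypothetical examples.
d1=$(( 0x3C ))   # the new data being written to disk1 (already in memory)
d2=$(( 0x11 ))   # read from disk2
d3=$(( 0x22 ))   # read from disk3
d4=$(( 0x44 ))   # read from disk4

parity=$(( d1 ^ d2 ^ d3 ^ d4 ))
printf 'parity: 0x%02X\n' "${parity}"   # prints: parity: 0x4B
```

Each drive sees exactly one operation, which is why the heads never have to flip between reading and writing on the same disk.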

 

Why would this equate to being faster?

It takes advantage of internal drive caches, the buffer cache, and elevator sorting algorithms.

Drives that are being updated do not necessarily have to have those sectors read into the buffer cache first before being written.

Idle drives being read 'in parallel' could achieve up to 225MB/s on a read (6TB 7200 RPM).

Writes to the data drive can be completed at near that speed.

Writes to the parity can be updated at near that speed.

 

For large writes, turbo write helps greatly.

The downside: if there are many parallel reads on the other data drives from other operations, the write operations slow down once again.

 

Turbo-write is good for loading large bulk data to a single drive.

Link to comment

@WeeboTech: ohhh...

 

So rather than getting the data from the target disk, it calculates the data from all the other drives, similar to the 'phantom drive' (that exists during a disk failure), thus achieving parallel read-write?

 

Correct me if I'm wrong. Say an array with 5 disks: parity, d1, d2, d3 and d4.

We want to write data to disk1. Normally you read the parity and the data on d1 and just XOR the two, leaving all the other data drives asleep.

Turbo write gets the data of d1 from P, d2, d3, d4, thus allowing d1 to be continuously written. That's why it needs all the data drives to be spun up.

Link to comment

Plus you don't have to flip the heads from read-to-write and back constantly... that takes a full rotation (or more) on some drives.  Even on other drives, it takes a finite amount of time that ends up costing you a rotation under conventional unRAID writing.

Link to comment

BubbaQ's one liner explanation is spot on.

 

As far as turbo write it would be more like

disk2 disk3 disk4 data for disk 1  disk1  parity

read  read  read  xor              write  write

 

The data for disk1 would already be in memory, so it does not have to be read.

Read all the other spinning drives, XOR them with the data for disk 1, then write to disk1 and parity.

 

Since there is only 1 operation per disk per rotation, you can take advantage of the sequential speed of the drive and any cache on the drive.

 

A penalty occurs if you have other operations going on, let's say on disk3:

if there is a massive read or write going on there, it slows the writes.

Link to comment

BubbaQ's one liner explanation is spot on.

 

As far as turbo write it would be more like

disk2 disk3 disk4 data for disk 1  disk1  parity

read  read  read  xor              write  write

 

The data for disk1 would already be in memory, so it does not have to be read.

Read all the other spinning drives, XOR them with the data for disk 1, then write to disk1 and parity.

 

Since there is only 1 operation per disk per rotation, you can take advantage of the sequential speed of the drive and any cache on the drive.

 

It's a bit more complex than my 1-liner w/r/t timing... the parity write can't be done until all the data disks are read, so 1 write will likely take 2 rotations minimum, but the pattern is not 2n, it is n+1... so 200 writes will take 201 rotations.
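BubbaQ's n+1 pattern can be put into rough numbers. Assuming one rotation per disk operation at 7200 RPM (roughly 8 ms per rotation, ignoring caching and seek time, so purely a best-case sketch):

```shell
# Rough rotation-count comparison; all numbers are hypothetical best-case figures.
rpm=7200
ms_per_rot=$(( 60000 / rpm ))                   # ~8 ms per rotation (integer math)
writes=200

turbo_ms=$(( (writes + 1) * ms_per_rot ))       # n+1 rotations total
conventional_ms=$(( 2 * writes * ms_per_rot ))  # ~2 rotations per conventional write
echo "turbo: ${turbo_ms} ms, conventional: ${conventional_ms} ms"
# turbo: 1608 ms, conventional: 3200 ms
```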

Link to comment

This feature is being moved to a later release.

 

Hopefully not a too-distant one. For what it's worth, not sure if you've followed the entire thread, but I think the on/off/auto option that WeeboTech suggested is by far the best way to implement this ==> where On means TurboWrite is on (and will spin up all disks to do any writes); Off simply means it's off, so writes are done in the usual way; and Auto means use Turbo Write if all disks are spinning, but otherwise do writes normally (i.e. don't spin up extra disks to do the write). I like the Auto mode best -- but there are cases where the Off/On choices are better (as WeeboTech pointed out in our discussion).

 

 

Link to comment

This feature is being moved to a later release.

 

Hopefully not a too-distant one. For what it's worth, not sure if you've followed the entire thread, but I think the on/off/auto option that WeeboTech suggested is by far the best way to implement this ==> where On means TurboWrite is on (and will spin up all disks to do any writes); Off simply means it's off, so writes are done in the usual way; and Auto means use Turbo Write if all disks are spinning, but otherwise do writes normally (i.e. don't spin up extra disks to do the write). I like the Auto mode best -- but there are cases where the Off/On choices are better (as WeeboTech pointed out in our discussion).

 

Auto is nice, but then you have to do something to spin up the drives on purpose and keep them spinning.

A manual button, /proc interface, or timed method would be a nice easy stopgap, as it would allow people to turn it on when they are doing large loads to the system and turn it off on purpose.

 

Also an option to do this in the mover might be attractive.

At that point all disks would be spinning thus a good time to capture smart reports of all active drives. (but that's deviating from the original purpose).

 

As long as I don't lose manual control via the interface and I can turn off an auto mode, I'm happy.

I'm not sure you would want an auto mode to engage if you have a very wide array or activity on the other disks.

Link to comment

... Auto is nice, but then you have to do something to spin up the drives on purpose and keep them spinning.

 

Actually you don't have to do anything to "keep them spinning."    If a drive (or more) had spun down, it would simply use the traditional write method.    What I like about this mode is it's virtually no-impact ... writes work as normal, but if all drives happen to be spinning it uses Turbo Write.    So if you're getting ready to do a lot of writes you can simply click "Spin Up" and they'll be faster;  otherwise they just work as normal.

 

You did bring up some good points in our discussion that still make a dedicated On/Off setting a good idea.  In particular, if you have some specific array disks that are very active and you don't want them "distracted" for turbo-write reads, you'd want the feature turned Off;  or if you did want turbo write to work anyway, but didn't want to have to spin down all drives to revert to normal writes you'd also need the "Off" feature.    So I think a 3-choice option works the best ... On/Off for those who want positive control of when it's used;  Auto for those who'd simply like it to work that way if all disks are already spinning, but don't want extra disk spinups otherwise.
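The Auto behaviour described above boils down to one check: use turbo only if no disk is in standby. A hypothetical helper sketching that decision (in a real script the spin states would come from something like `hdparm -C`; here they are simply passed in as arguments, so this is only an illustration of the logic, not unRAID's actual implementation):

```shell
# Hypothetical Auto-mode decision: given each array disk's spin state
# ("active" or "standby"), echo the md_write_method value Auto would choose.
choose_write_method() {
    local state
    for state in "$@"; do
        # any sleeping disk means fall back to the normal write method
        if [ "${state}" = "standby" ]; then
            echo 0
            return
        fi
    done
    echo 1   # every disk already spinning: turbo write costs nothing extra
}

choose_write_method active active active    # -> 1
choose_write_method active standby active   # -> 0
```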

 

Link to comment

FWIW, being able to turn off turbo write manually is important to me.

 

At specific times of the day I do a lot of updates. Constantly. I want the small array spinning all day.

 

At other times of the day I may update 1 disk at a lower volume on a regular basis, but I do not need the speed nor want all the other disks spinning all night long.

 

I'm sure the answer would be 'use a cache disk', but I only have 4 drives in a server. I don't need, nor can I fit, a cache disk.

Manual turbo write compensates nicely.

 

Having turbo write in auto mode all the time will keep all disks spinning if I continue to update the one busy disk.

Without being able to turn off turbo write on some schedule, every write I do to any of the disks keeps all the other disks spinning.

 

The 3-choice option would work well.

Link to comment

This feature is being moved to a later release.

 

Hopefully not a too-distant one. For what it's worth, not sure if you've followed the entire thread, but I think the on/off/auto option that WeeboTech suggested is by far the best way to implement this ==> where On means TurboWrite is on (and will spin up all disks to do any writes); Off simply means it's off, so writes are done in the usual way; and Auto means use Turbo Write if all disks are spinning, but otherwise do writes normally (i.e. don't spin up extra disks to do the write). I like the Auto mode best -- but there are cases where the Off/On choices are better (as WeeboTech pointed out in our discussion).

 

Firstly, disappointed that this feature got dropped from v6.0 and without notification.

 

All good discussion. For me, I know what times of the day I would like Turbo Write to be enabled, so a schedule feature would be good. I know you "could" enable auto and just write a script to spin all drives up, external to this feature, at the time(s) you want, thus technically enabling Turbo at those times => BUT a small scheduling feature would be cool!

 

P.S. I keep feeling more advanced scheduling has applications in lots of other current and future functions, e.g. backup, so if it was written in a way that you could leverage the code in other features it wouldn't be wasted dev effort.

Link to comment

Firstly, disappointed that this feature got dropped from v6.0 and without notification.

 

Well, the notification was in my reply to this thread, but I guess you meant earlier notification?  This was one of those features that we were thinking about implementing before the last beta, but just didn't have time.

 

P.S. I keep feeling more advanced scheduling has applications in lots of other current and future functions, e.g. backup, so if it was written in a way that you could leverage the code in other features it wouldn't be wasted dev effort.

 

Scheduling is something we too see as a feature that can be greatly expanded, adding lots of other functionality (including TRIM). All in due time...

Link to comment
  • 5 months later...

How to keep turbo writes always on?

 

I thought it would be enough to add the command to my go file, but it doesn't work. Do I need to insert some kind of delay?

 

This is the go file:

 

#!/bin/bash
# Start the Management Utility
/usr/local/sbin/emhttp &
mdcmd set md_write_method 1
beep -f 700 ; beep -f 500 ; beep -f 700 ; beep -f 500

 

also tried:

 

/root/mdcmd set md_write_method 1

 

and

 

/usr/local/sbin/mdcmd set md_write_method 1

Link to comment

You need to be sure the array is up.

So yes,  a delay is appropriate.

 

I use the following snippet in my go script.

I included my readahead snippet as well, feel free to use or delete.

 

declare -a CHAR=('+' 'x');

# Wait up to 60 seconds for the array device to appear (press a key to skip)
let i=0 notices=60
DEV=/dev/md1
while [[ ${notices} -gt 0 && ! -b ${DEV} ]]
do    printf "Waiting $notices seconds for ${DEV}. Press ANY key to continue: [${CHAR[${i}]}]: "
      read -n1 -t1 && break
      echo -e "\r\c"
      (( notices-=1 ))
      [[ $(( i+=1 )) -ge ${#CHAR[@]} ]] && let i=0;
done
[ ${notices} -ne 60 ] && echo

# Wait up to 60 seconds for the first data disk to be mounted (press a key to skip)
let i=0 notices=60
DIR=/mnt/disk1
while [[ ${notices} -gt 0 && ! -d "${DIR}" ]]
do    printf "Waiting $notices seconds for ${DIR}. Press ANY key to continue: [${CHAR[${i}]}]: "
      read -n1 -t1 && break
      echo -e "\r\c"
      (( notices-=1 ))
      [[ $(( i+=1 )) -ge ${#CHAR[@]} ]] && let i=0;
done
[ ${notices} -ne 60 ] && echo

shopt -s extglob
# Raise readahead on the array (md) and physical (sd) devices
READAHEAD=1024
for disk in /dev/md+([[:digit:]]) ; do blockdev --setra ${READAHEAD} ${disk}; done
for disk in /dev/sd+([[:alpha:]]) ; do blockdev --setra ${READAHEAD} ${disk}; done

 

FWIW, I do not enable turbo write all the time; I do it on a schedule from the /etc/cron.d directory.

 

 

In my go script I rsync a file from /boot/local/etc/cron.d/md_write_method to /etc/cron.d/md_write_method

(actually I rsync a whole tree of things, but this is what you need for this application).

 

root@unRAID:/boot/bin# cat /etc/cron.d/md_write_method 
30 08 * * * [ -e /proc/mdcmd ] && echo 'set md_write_method 1' >> /proc/mdcmd
30 23 * * * [ -e /proc/mdcmd ] && echo 'set md_write_method 0' >> /proc/mdcmd
#
# * * * * * <command to be executed>
# | | | | |
# | | | | |
# | | | | +---- Day of the Week   (range: 0-7, 0 and 7 are Sunday)
# | | | +------ Month of the Year (range: 1-12)
# | | +-------- Day of the Month  (range: 1-31)
# | +---------- Hour              (range: 0-23)
# +------------ Minute            (range: 0-59)

Link to comment

You need to be sure the array is up.

So yes,  a delay is appropriate.

 

I use the following snippet in my go script.

I included my readahead snippet as well, feel free to use or delete.

 

That was it, many thanks!

 

My main server is always on, and because I don't usually do large writes (and to keep the disks sleeping) I use normal write mode; but my other servers are normally off and I only turn them on to archive data, usually once a week.

 

Also, I thought that turbo write was only beneficial for servers with few drives, but even on my biggest servers, which have 15 disks (the most I feel comfortable with on single parity), I can practically double my write speed and almost max out my gigabit connection on large transfers.

 

Link to comment

I thought that turbo write was only beneficial for servers with few drives, but even on my biggest servers, which have 15 disks (the most I feel comfortable with on single parity), I can practically double my write speed and almost max out my gigabit connection on large transfers.

 

This is really great news. It was my theory that with larger arrays, there was a diminishing return.

I think for the backup application you've proven that turbo write is an effective feature.

 

I know in 'some' of my use cases, while simultaneously writing and accessing other drives in read mode, there was a negative effect.

However in the backup scenario where there are single massive writes, turbo write shines!

 

That's awesome!

Link to comment

This is really great news. It was my theory that with larger arrays, there was a diminishing return.

I think for the backup application you've proven that turbo write is an effective feature.

 

I believe now that turbo write will work well on bigger servers as long as there are no significant bottlenecks, since all disks have to be read simultaneously.

 

I think I'm benefiting from my quasi-obsession with good parity check speeds, as no bottlenecks there should translate to good performance with turbo write.

 

Link to comment

To confirm my suspicion that turbo write is beneficial for large arrays as long as the server has no controller bottlenecks, I did some quick tests with 18 drives including parity, the maximum I had available for testing.

 

Copy1 was done using my fastest controllers; parity check with this config is ~170MB/s. Drives are connected as follows: 16 on 2 x Dell H310 + 2 onboard.

 

Copy2 was done after replacing one of the H310s with a SASLP, which limits parity checks to ~80MB/s.

 

For both tests I copied the same data to disk2, always connected to a Dell H310. Copy1 is limited by gigabit Ethernet; copy2 starts at gigabit speed while unRAID is caching, but then drops to ~75MB/s. My conclusion is that since turbo write requires all the other disks to be read simultaneously, there's a slowdown caused by the limited bandwidth of the SASLP.

 

For some usage scenarios turbo write can greatly improve performance even on large servers. In this case I think the max write speed will be close to the lowest of these three speeds:

 

- Server bandwidth when reading all disks, should be similar to parity check/sync speed

- Network speed

- Write speed of parity or disk involved in the write operation
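That "lowest of the three" estimate is easy to sketch. The numbers below are hypothetical placeholders for the three limits listed above, not measurements from any particular server:

```shell
# Rough turbo-write ceiling: the minimum of the three limiting speeds (MB/s).
# All values here are hypothetical examples.
min3() { printf '%s\n' "$1" "$2" "$3" | sort -n | head -n1; }

parity_check_speed=170   # server bandwidth when reading all disks
network_speed=112        # gigabit Ethernet tops out around ~112 MB/s
disk_write_speed=150     # write speed of parity / target disk

min3 "${parity_check_speed}" "${network_speed}" "${disk_write_speed}"   # prints 112
```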

(Attached screenshots: copy1.png and copy2.png)

Link to comment
