Disk Buffering

September 19, 2009

In unMenu there is a user script to set read-ahead buffers for all array data drives (only) to 2048 sectors (and other sizes). But what about cache and parity?

Can you apply the blocksize parameter to sd devices the same way you would to md devices? The cache drive has a filesystem, so I'd assume you would handle it at the filesystem level, such as /dev/sdX1. Does that work the same?

Parity has no filesystem, so can blocksize be applied to /dev/sdX directly?

Any thoughts or ideas on this?

--Bill

September 19, 2009

Yes, you can set the read-ahead buffers on other drives, such as cache.

September 20, 2009

Author

> Yes, you can set the read-ahead buffers on other drives, such as cache.

Good, thanks. Does applying the buffering to the block device rather than the filesystem partition make any significant difference to what or how it is buffered?

The unMenu user script to set 2048 just does a recursive loop through created /dev/md* systems. I was thinking of doing something similar to the created /dev/sdX devices instead (skipping sdX1 fs references). Would that accomplish the same thing performance-wise?
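A minimal sketch of that idea: walk a list of device nodes, skip anything that looks like a partition, and apply one read-ahead value to each whole disk. The function name set_ra_all and the DRY_RUN switch are made up for illustration and are not part of unMenu. Treating any node whose name ends in a digit as a partition is an assumption that holds for sdX/sdX1 naming.

```shell
#!/bin/sh
# Sketch only: set_ra_all and DRY_RUN are hypothetical names.
# Usage: set_ra_all <read-ahead in 512-byte sectors> <device>...

set_ra_all() {
    ra="$1"
    shift
    for dev in "$@"; do
        case "$dev" in
            *[0-9]) continue ;;   # name ends in a digit: assume a partition, skip it
        esac
        if [ -n "${DRY_RUN:-}" ]; then
            # Dry run: print the command instead of running it
            echo "blockdev --setra $ra $dev"
        else
            blockdev --setra "$ra" "$dev"
        fi
    done
}
```

Running DRY_RUN=1 set_ra_all 2048 /dev/sd* first prints the commands without touching any device, which is a cheap way to confirm the partition nodes really are skipped.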

--Bill

September 20, 2009

> > Yes, you can set the read-ahead buffers on other drives, such as cache.
>
> Good, thanks. Does applying the buffering to the block device rather than the filesystem partition make any significant difference to what or how it is buffered?
>
> The unMenu user script to set 2048 just does a recursive loop through created /dev/md* systems. I was thinking of doing something similar to the created /dev/sdX devices instead (skipping sdX1 fs references). Would that accomplish the same thing performance-wise?
>
> --Bill

It is something to experiment with. At the time I tried it (and created those unMenu scripts), it had little effect on what I was doing, but then, perhaps the /dev/md buffering was the bottleneck at the time.

I've often wondered if setting the disk read-ahead of the /dev/sdX and /dev/hdX devices to a much larger value (the size of a full cylinder on the disk) might help with the speed of writing to the array. We know the disk has to spin around anyway to write a sector, but the next sector to be read might already be in memory more often. So, instead of setting the read-ahead to 2048, you can try 256000 or bigger.
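For scale, blockdev counts read-ahead in 512-byte sectors, so these settings are larger than the bare numbers suggest. A small conversion helper makes that concrete; the name ra_kib is made up here for illustration.

```shell
#!/bin/sh
# Sketch only: ra_kib is a hypothetical helper name.
# Convert a blockdev read-ahead value (512-byte sectors) to KiB.

ra_kib() {
    echo $(( $1 * 512 / 1024 ))
}

ra_kib 256      # 128     (128 KiB)
ra_kib 2048     # 1024    (1 MiB)
ra_kib 256000   # 128000  (125 MiB)
```

So a setting of 256000 asks the kernel to read ahead roughly 125 MiB, which is worth keeping in mind when comparing it with 2048.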

See here: http://markmail.org/message/kjm6mw7xh2ttugrg

Searching Google for "md device read ahead" turns up plenty of others with similar questions.

If you set read-ahead bigger, random I/O performance on small files might suffer, but sequential access of large files might improve (as will parity checks).

Let us know what you find.

Joe L.

September 20, 2009

I was under the impression that on the parity disk, the reads were always unbuffered. I never saw any parity-check speed improvement on I/O to the parity disk with large buffer settings.

September 21, 2009

Author

> I was under the impression that on the parity disk, the reads were always unbuffered. I never saw any parity-check speed improvement on I/O to the parity disk with large buffer settings.

Possibly because it's write-bound? I'm not sure, but it seems like it would be faster to do buffered reads from x drives in the array than an unbuffered write to one drive (parity). Though there is write buffering internally on most newer drives.

The sd* and md* devices have different major device numbers, which suggests they are seen as independent devices with independent RA buffering. The blockdev program seems to bear that out: you can set different buffer sizes on /dev/md1 and /dev/sdg (in my case) for disk 1. But only the size set on the md devices is going to matter to unRAID, because that's the device it talks to the disks through. The cache drive is referred to by sd* only. blockdev also treats the native device (sdX) and the filesystem on that device (sdX1) as capable of having separate settings (--getra/--setra and --getfra/--setfra), but my testing shows that both devices change when either mode is used.
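One way to repeat that check is to print both values for each node side by side before and after changing one of them. This is only a sketch: report_ra and the BLOCKDEV override are hypothetical names introduced here, while --getra and --getfra are the standard blockdev query options.

```shell
#!/bin/sh
# Sketch only: report_ra and BLOCKDEV are hypothetical names.
# Prints the read-ahead (--getra) and filesystem read-ahead (--getfra)
# reported for each device node, one line per node.

report_ra() {
    for dev in "$@"; do
        # BLOCKDEV lets a stub stand in for the real blockdev command
        ra=$("${BLOCKDEV:-blockdev}" --getra "$dev")
        fra=$("${BLOCKDEV:-blockdev}" --getfra "$dev")
        printf '%s ra=%s fra=%s\n' "$dev" "$ra" "$fra"
    done
}
```

For example, report_ra /dev/md1 /dev/sdg /dev/sdg1 (run as root) shows at a glance whether a change made on one node shows up on the others.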

If the parity disk is not write-bound, it would seem that increasing the RA buffering significantly on the data drives (md devices) might help.

One of the messages in the link that Joe included said that most modern drives read at least a track at a time into their local buffer, and that large Linux-level read buffers probably wouldn't help much, both because of that and because the only way to fill a larger buffer is by repeated read requests to the drive, which kind of defeats the purpose.

I just tested my fastest drive (Seagate 1.5 TB, 7200 RPM) at several different RA sizes using dd if=/dev/sdX of=/dev/null count=8192000 on an idle system:

RA=256 32.4356 seconds, 129 MB/s

RA=2048 32.5393 seconds, 129 MB/s

RA=4096 33.2323 seconds, 126 MB/s

RA=256000 45.8386 seconds, 91.5 MB/s
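As a cross-check on those figures: dd defaults to 512-byte blocks, so count=8192000 reads 8192000 x 512 = 4,194,304,000 bytes, and dd reports MB as 10^6 bytes. A small helper (the name mbps is made up here) reproduces the reported rates from the elapsed times.

```shell
#!/bin/sh
# Sketch only: mbps is a hypothetical helper name.
# Usage: mbps <bytes transferred> <elapsed seconds>
# Prints throughput in MB/s (1 MB = 1,000,000 bytes), one decimal place.

mbps() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

mbps 4194304000 32.4356   # 129.3
mbps 4194304000 45.8386   # 91.5
```

Both values agree with the 129 MB/s and 91.5 MB/s that dd printed for RA=256 and RA=256000.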

Here I performed the same tests but with the md1 device using dd if=/dev/mdX of=/dev/null count=8192000 on an idle system:

RA=256 48-82 seconds, 51-86 MB/s

RA=2048 32-33 seconds, 125-127 MB/s

RA=4096 32.5-33.5 seconds, 127-129 MB/s

RA=10240 34 seconds, 120 MB/s

RA=256000 45 seconds, 91.5 MB/s

In multiple tests there were very minor differences in each group, but the relationship between sizes in that group remained the same. So there's a big difference whether you're testing with the raw device, or the md filesystem device that unRaid uses.
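For anyone wanting to repeat the comparison on their own hardware, the test can be sketched as a loop over read-ahead values. bench_ra and DRY_RUN are hypothetical names, and flushing the page cache between runs via /proc/sys/vm/drop_caches is an added assumption that was not part of the tests above; it needs root and a kernel that provides that file.

```shell
#!/bin/sh
# Sketch only: bench_ra and DRY_RUN are hypothetical names.
# Usage: bench_ra <device> <read-ahead value>...
# For each read-ahead value, set it and time a sequential read with dd.

bench_ra() {
    dev="$1"
    shift
    for ra in "$@"; do
        if [ -n "${DRY_RUN:-}" ]; then
            # Dry run: print the commands instead of running them
            echo "blockdev --setra $ra $dev"
            echo "dd if=$dev of=/dev/null count=8192000"
        else
            blockdev --setra "$ra" "$dev"
            sync
            # Assumption: start each run with a cold page cache
            echo 3 > /proc/sys/vm/drop_caches
            dd if="$dev" of=/dev/null count=8192000
        fi
    done
}
```

Something like DRY_RUN=1 bench_ra /dev/md1 256 2048 4096 prints the planned commands; dropping DRY_RUN runs them, and dd reports the time and rate for each pass.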

At least in these tests, it appears that the default 256-sector system buffering is optimal for the raw device, but there's significantly improved performance at 2048 or 4096 buffer sizes on the mdX devices. I'd guess these relationships would be different on different motherboards and controllers and may change in other ways depending on load conditions.

Maybe this whole system buffering issue is moot beyond this point, and improved disk I/O in newer kernels (alleged to be happening) and newer drives with transfer modes faster than UDMA 6 (133 MB/s) are all that's going to really help?

--Bill

September 21, 2009

Your tests seem to have results similar to those I found back in this thread: http://lime-technology.com/forum/index.php?topic=965.0

Since that time, lime-tech has increased the default read-ahead on the /dev/mdX devices to 1024 in the last few releases.

Joe L.

September 21, 2009

> I'm not sure, but it seems like it would be faster to do buffered reads from x drives in the array than an unbuffered write to one drive (parity).

Yes, it would, but there is a data integrity issue when a multi-threaded system writes to a sector after you have fetched it with a buffered read. It depends on what level you are at.

> Though there is write buffering internally on most newer drives.

Integrity again: you can write through that buffer with the proper command, which many high-reliability systems do (i.e. sacrificing speed for reliability/integrity).

> it would seem that increasing the RA buffering significantly on the data drives (md devices) might help.

Yes, it generally will. My post was restricted to the parity drive, however.

September 21, 2009

Author

> Your tests seem to have results similar to those I found back in this thread: http://lime-technology.com/forum/index.php?topic=965.0
>
> Since that time, lime-tech has increased the default read-ahead on the /dev/mdX devices to 1024 in the last few releases.
>
> Joe L.

Interesting read, thanks for the pointer.

Is there no way to 'insert' some scripting so the blockdev --setra calls can be executed each time the array is brought up? Having them only in the go script, where they take effect only after a reboot, seems rather limiting.

--Bill

September 21, 2009

> > Your tests seem to have results similar to those I found back in this thread: http://lime-technology.com/forum/index.php?topic=965.0
> >
> > Since that time, lime-tech has increased the default read-ahead on the /dev/mdX devices to 1024 in the last few releases.
> >
> > Joe L.
>
> Interesting read, thanks for the pointer.
>
> Is there no way to 'insert' some scripting so the blockdev --setra calls can be executed each time the array is brought up? Having them only in the go script, where they take effect only after a reboot, seems rather limiting.
>
> --Bill

Not at this time, but at some point in the future...

The desire/need to insert "scripting" at various steps of the array management has been discussed at some length in the wiki:

http://lime-technology.com/wiki/index.php/Third_Party_Boot_Flash_Plugin_Architecture

Early this year, Tom at Lime-tech started a thread in response. In version 5 of unRAID, we will begin to have some event-based processing. See this thread: http://lime-technology.com/forum/index.php?topic=3461.0

We have no idea how far version 5 has progressed, if at all. I think the initial thought was to develop it in parallel with Version 4.5.x

Joe L.
