Everything posted by eweitzman

  1. bubbaQ, Browsing through the driver code, I see it can hold on to ~1200 "stripes" of data. This term must be a legacy from the days when this code was really a RAID driver, right? Anyway, if the driver is aware of 1200 simultaneous IO requests, perhaps some of them can be grouped, reordered, and processed so that a large series of data reads on adjacent tracks is done in parallel with a similar series of parity-drive reads. That is, if the requested stripes have some sort of addressing that can be mapped to drive geometry, there is the possibility of disk-sequential, deterministic reading instead of "non-determinative optimistic read-ahead." After these complete -- with minimal seeks and no wasted rotations -- parity would be computed, and then both drives would be written to in the same order as the reads. (There's a rough sketch of this batching idea after this list.)

     ------

     Can anyone recommend a pithy summary/overview of Linux driver architecture and programming? I'll look at this reordering and batching idea in more detail once I understand the overall picture better.

     - Eric
  2. I see. I've been looking at this as if the unRAID code was higher up, i.e., not a driver, and had knowledge of what needed to be written beyond a single block or atomic disk operation. If each call to the driver by the OS has no knowledge of previous and forthcoming calls (that is, data to read/write), then it would be very gnarly to have the driver coordinate with other invocations of it.

     From what I've gleaned since last night, there are three main parts to unRAID:

     md driver - unRAID-modified kernel disk driver
     shfs - shared file system (user shares?) built on FUSE
     emhttp - management utility

     Any others? (A bare-bones FUSE skeleton, showing the kind of framework shfs builds on, follows this list.)

     - Eric
  3. First, an introduction. I've just started using unRAID Plus (not Pro) and I like it a lot. It has the right trade-offs for me, and is replacing a slow, dedicated RAID NAS box and some unprotected drives in a PC. I'm a developer. I worked on various unixes (and other OSes) from the mid 80s to mid 90s. Getting ps -aux, ls -lRt, top, and even vi back into L2 has been a trip. (vi twice so for an emacs guy.) I dug up a 1989, spiral-bound O'Reilly Nutshell book on BSD 4.3 that I bought back in the day. I'm not a kernel programmer or driver programmer or hardware guy, so the following thoughts may be naive. Clue me in if you can.

     I've read that unRAID has to do two disk reads and two disk writes for each data chunk when writing a file. See http://lime-technology.com/forum/index.php?topic=4390.60 for Tom's description. That description, and Joe L.'s and bubbaQ's posts, make it sound like these operations must be done sequentially, with seeks between each op, waiting for the start sector to spin back to the heads, and so on. All this waiting seems unnecessary to me, except for files that only occupy part of a track, and you'll never get high throughput with them anyway because of all the directory activity.

     With large files, the parts don't necessarily have to be read, written, or processed sequentially along the length of the file. Let me illustrate. Multithreaded code could issue 20 synchronous reads, each from a separate thread, at different locations in a file. The drive will optimize how it reads and retrieves that data. When the IO requests complete in somewhat random order and the threads unblock, each can work with its chunk of data to compute or update parity. After this, the write commands can be issued in any order and again, the drive will reorder the commands to write with the fewest seeks and least rotational latency. (A minimal sketch of this multithreaded read pattern also follows this list.)

     It would seem to me that allowing out-of-order processing in the code, coupled with smart IO request reordering in the drive firmware, can keep the heads moving smoothly through a file for relatively long periods. Of course, there will be some limits imposed by memory limits and interleaved block operations in the code, but if 20 tracks can be processed at a time this way, with only one or two seeks instead of 20, it's a big win.

     I'm sure this has been investigated, or that there are underlying reasons due to the architecture of unRAID that make this unfeasible. Anybody know the reasons?

     Also, I'm very interested in reading an overview of unRAID's architecture: custom daemons, drivers, executables, and so on. Any pointers would be appreciated.

     Thanks,
     - Eric
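
The batching idea from post 1, sketched in user space. This is not the md driver's actual code path: struct stripe_req, the fixed 4 KB stripe size, and process_batch() are hypothetical names used only for illustration, and the real driver works at the block layer rather than through pread()/pwrite(). The point is only the ordering: sort a batch of stripe requests by device offset, do all the old-data and old-parity reads in one sweep, recompute parity (new parity = old parity XOR old data XOR new data), then write data and parity back in the same order.

    /* User-space sketch of batched read-modify-write, assuming one data
     * device and one parity device opened by the caller. Hypothetical
     * structures; not unRAID code. */
    #include <stdint.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <unistd.h>

    #define STRIPE_SIZE 4096

    struct stripe_req {
        off_t   offset;                  /* byte offset on both devices    */
        uint8_t new_data[STRIPE_SIZE];   /* data the caller wants written  */
        uint8_t old_data[STRIPE_SIZE];   /* read back from the data disk   */
        uint8_t parity[STRIPE_SIZE];     /* read back from the parity disk */
    };

    static int cmp_offset(const void *a, const void *b)
    {
        const struct stripe_req *x = a, *y = b;
        return (x->offset > y->offset) - (x->offset < y->offset);
    }

    /* Process one batch. Returns 0 on success, -1 on any short I/O. */
    int process_batch(int data_fd, int parity_fd, struct stripe_req *reqs, size_t n)
    {
        qsort(reqs, n, sizeof(reqs[0]), cmp_offset);   /* one sweep per phase */

        for (size_t i = 0; i < n; i++) {               /* read phase */
            if (pread(data_fd,   reqs[i].old_data, STRIPE_SIZE, reqs[i].offset) != STRIPE_SIZE ||
                pread(parity_fd, reqs[i].parity,   STRIPE_SIZE, reqs[i].offset) != STRIPE_SIZE)
                return -1;
        }

        for (size_t i = 0; i < n; i++)                 /* parity update phase */
            for (size_t j = 0; j < STRIPE_SIZE; j++)
                reqs[i].parity[j] ^= reqs[i].old_data[j] ^ reqs[i].new_data[j];

        for (size_t i = 0; i < n; i++) {               /* write phase, same order */
            if (pwrite(data_fd,   reqs[i].new_data, STRIPE_SIZE, reqs[i].offset) != STRIPE_SIZE ||
                pwrite(parity_fd, reqs[i].parity,   STRIPE_SIZE, reqs[i].offset) != STRIPE_SIZE)
                return -1;
        }
        return 0;
    }

Sorting once and reusing that order for both the read pass and the write pass is what keeps both heads sweeping instead of seeking per stripe.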
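For post 2: a bare-bones FUSE filesystem, to show the kind of framework shfs is built on. This is just the standard libfuse 2.x "hello" pattern exposing a single read-only file, not shfs's actual code; how shfs maps user shares onto the data disks is only inferred from the description above.

    /* Minimal FUSE filesystem: one read-only file, /hello.
     * Build (assuming libfuse 2.x): gcc -Wall shim.c `pkg-config fuse --cflags --libs` -o shim
     * Run: ./shim /mnt/point */
    #define FUSE_USE_VERSION 26
    #include <fuse.h>
    #include <string.h>
    #include <errno.h>
    #include <fcntl.h>

    static const char *hello_str  = "hello from a FUSE filesystem\n";
    static const char *hello_path = "/hello";

    static int shim_getattr(const char *path, struct stat *st)
    {
        memset(st, 0, sizeof(*st));
        if (strcmp(path, "/") == 0) {
            st->st_mode = S_IFDIR | 0755;
            st->st_nlink = 2;
        } else if (strcmp(path, hello_path) == 0) {
            st->st_mode = S_IFREG | 0444;
            st->st_nlink = 1;
            st->st_size = strlen(hello_str);
        } else {
            return -ENOENT;
        }
        return 0;
    }

    static int shim_readdir(const char *path, void *buf, fuse_fill_dir_t filler,
                            off_t offset, struct fuse_file_info *fi)
    {
        (void) offset; (void) fi;
        if (strcmp(path, "/") != 0)
            return -ENOENT;
        filler(buf, ".", NULL, 0);
        filler(buf, "..", NULL, 0);
        filler(buf, hello_path + 1, NULL, 0);
        return 0;
    }

    static int shim_open(const char *path, struct fuse_file_info *fi)
    {
        if (strcmp(path, hello_path) != 0)
            return -ENOENT;
        if ((fi->flags & O_ACCMODE) != O_RDONLY)
            return -EACCES;
        return 0;
    }

    static int shim_read(const char *path, char *buf, size_t size, off_t offset,
                         struct fuse_file_info *fi)
    {
        (void) fi;
        size_t len = strlen(hello_str);
        if (strcmp(path, hello_path) != 0)
            return -ENOENT;
        if ((size_t)offset >= len)
            return 0;
        if (offset + size > len)
            size = len - offset;
        memcpy(buf, hello_str + offset, size);
        return size;
    }

    static struct fuse_operations shim_ops = {
        .getattr = shim_getattr,
        .readdir = shim_readdir,
        .open    = shim_open,
        .read    = shim_read,
    };

    int main(int argc, char *argv[])
    {
        /* Every file operation on the mount point is routed through
         * shim_ops by the fuse kernel module; a user-share filesystem
         * like shfs would presumably answer these same callbacks by
         * fanning out to the underlying data disks. */
        return fuse_main(argc, argv, &shim_ops, NULL);
    }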
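And a minimal sketch of the multithreaded read pattern described in post 3, assuming plain POSIX threads and pread() on an ordinary file. The thread count, chunk size, and toy XOR "parity" are made up for illustration; unRAID's real parity work happens in the md driver, not in user space. The idea is only that 20 blocking reads issued at once give the OS I/O scheduler and the drive firmware something to reorder.

    /* 20 threads each issue one synchronous pread() at a different offset.
     * Build: gcc -Wall -pthread chunks.c -o chunks */
    #include <fcntl.h>
    #include <pthread.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>

    #define NTHREADS   20
    #define CHUNK_SIZE (1 << 20)          /* 1 MiB per thread */

    struct chunk_job {
        int     fd;
        off_t   offset;
        uint8_t buf[CHUNK_SIZE];
        uint8_t parity;                   /* toy XOR, stands in for real parity work */
    };

    static void *read_chunk(void *arg)
    {
        struct chunk_job *job = arg;

        /* Blocking read; the thread sleeps until the drive gets around to
         * this offset, in whatever order minimizes seeks. */
        ssize_t n = pread(job->fd, job->buf, CHUNK_SIZE, job->offset);

        for (ssize_t i = 0; i < n; i++)   /* "parity" over whatever was read */
            job->parity ^= job->buf[i];
        return NULL;
    }

    int main(int argc, char *argv[])
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s <file>\n", argv[0]);
            return 1;
        }
        int fd = open(argv[1], O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        static struct chunk_job jobs[NTHREADS];   /* static: ~20 MiB, keep off the stack */
        pthread_t threads[NTHREADS];

        /* Issue all 20 reads at once; completion order is up to the disk. */
        for (int i = 0; i < NTHREADS; i++) {
            jobs[i].fd = fd;
            jobs[i].offset = (off_t)i * CHUNK_SIZE;
            pthread_create(&threads[i], NULL, read_chunk, &jobs[i]);
        }
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(threads[i], NULL);

        close(fd);
        return 0;
    }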