Trying to run unRAID on regular hard drive



josetann,

This is great work indeed!  So this setup means you also have swap already and that changes (such as new installations) would survive a reboot, right?  If so, this would probably be the easiest way to get any desired software installed and runnable by an end user....that is, it would be much easier to get VMware running this way than what I'm doing trying to turn it into a package.

 

Yes it has swap (I have 4GB for the heck of it), and any changes survive a reboot.  It's better for someone like me, who wants to run multiple things on a linux server, and doesn't plan on upgrading every time a new release comes out.  Currently if you want to upgrade (I see beta6 is already out, so the download link I gave for beta5 won't work anymore), you have to download the zip file, unzip it, run cpio on the bzroot, then compare it to the old bzroot contents, and copy over anything that's changed.  I cheated for the initial install, I just had it copy over anything that wasn't already there.  Can't cheat for the upgrade, so I need to provide a way to compare two unRAID distros and copy over anything that changed.
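For the upgrade comparison described above, here is a minimal sketch in Python. It assumes the old and new bzroot images have already been unpacked with cpio into two separate directories; the function name and directory layout are hypothetical and not part of unRAID. It walks the new tree and lists every regular file that is missing from, or differs from, the old tree.

```python
import filecmp
import os


def changed_files(old_root, new_root):
    """Return relative paths that are new or differ in new_root versus old_root."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(new_root):
        for name in filenames:
            new_path = os.path.join(dirpath, name)
            # Assumption: symlinks are skipped in this sketch; only regular files are compared.
            if os.path.islink(new_path):
                continue
            rel = os.path.relpath(new_path, new_root)
            old_path = os.path.join(old_root, rel)
            # Either the file did not exist before, or its bytes changed.
            if not os.path.isfile(old_path) or not filecmp.cmp(old_path, new_path, shallow=False):
                changed.append(rel)
    return sorted(changed)
```

The returned list could then be reviewed by hand before copying anything onto the installed system; files that were removed in the new release are not reported by this sketch.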

 

A couple of questions:

  • Does the installation hard drive count against the unRAID hard drive limit?  I'm assuming not, because it's not in the array.
  • (A more general question to everybody)  Can the device address for a hard drive change as you add new drives?  For example, say I have multiple SATA drives and unRAID is currently installed on a drive that comes up as /dev/sda.  Is there any chance that the same physical drive's /dev/sda address could change after I add more hard drives (say, with an additional SATA controller added)? Let me know if this question is not clear.

 

I would assume it wouldn't count against the hard drive limit.  Ask WeeboTech, I can't afford that many drives right now (at least, not that many drives that are worth using).

 

As long as you keep adding controllers, the drive assignments shouldn't change.  If you start disabling some or removing add-in cards, then I guess it could.  Anything built-in on the motherboard should be recognized in the same order, regardless of what other add-in cards you add.  If you disable one though, it could bump up one of the others (so whatever was sde might now be sdd).  On every motherboard I've seen, add-in cards were initialized after the onboard stuff, so adding/removing add-in cards shouldn't ever change the assignments of drives that are plugged directly into the motherboard.

Link to comment

For example, say I have multiple SATA drives and unRAID is currently installed on a drive that comes up as /dev/sda.  Is there any chance that the same physical drive's /dev/sda address could change after I add more hard drives (say, with an additional SATA controller added)? Let me know if this question is not clear.

 

After some thought, I remembered that it actually depends on the boot order of the controllers' BIOS, i.e. where each controller lies in the PCI configuration and the BIOS.

Chances are additional cards would come AFTER the motherboard ports unless you alter the BIOS order (some boards allow this).

 

Most of the time I've seen these issues, it was when dealing with mixed systems of IDE/SATA/SCSI.

 

So the possibility exists.

In fact, it occurred with my ITX unRAID system after adding the Promise controller.

The internal hard drive WAS sda, then became sdf.

 

 

Link to comment


Weird, my Abit AB9 Pro doesn't do this, nor did the other two or three systems in which I had an add-in card installed (an old Promise Ultra66).  Guess it's another one of those quirks: it "usually" works a certain way, but not always.

Link to comment

My MSI itx motherboard does this.

My Supermicros with the embedded SCSI controllers do this.

 

It's bios initialization order.

 

It really tee'd me off with the ITX machine as I wanted to have a way of booting unraid, then booting the dev system.

After I installed drives, the internal boot drive shifted, making Slackware fail to boot (grumble).

 

I gave up and went with a vmware dev environment.

 

Trying to rebuild a bzroot to test out CPU Freq now.

 

 

Link to comment
(A more general question to everybody)  Can the device address for a hard drive change as you add new drives?  For example, say I have multiple SATA drives and unRAID is currently installed on a drive that comes up as /dev/sda.  Is there any chance that the same physical drive's /dev/sda address could change after I add more hard drives (say, with an additional SATA controller added)?

 

I can definitely say that drive/device symbols do change.  I have seen quite a few syslogs now, and have seen the order change fairly often.  For instance, on some boards with a JMB chipset, I have seen the JMB ports assigned before the standard onboard ports.  So a user might fill up the regular ones first, then use a JMB port and see them all slide up.  Also, PCI slots may not be identified in the expected way.  An added Promise card may be processed before other long-installed cards.  Another common issue: the ports are not necessarily probed one after another in a single thread; their threads may be running simultaneously.  In particular, the USB drive thread will assign your unRAID flash drive whenever it feels like it, regardless of where in the drive setup it is.  On my machine, it seems to be about 50/50 as to whether sde will be assigned to my flash or to a data drive.  Sometimes the USB routine grabs sde first, other times the SATA drive grabs it, and the loser will get sdf.

 

In addition, with each kernel change, the drive analysis may change and different chipsets be identified in a different order.  With one or two of the unRAID upgrades (v4.1?), some of the users had significant changes, significant enough that several drives were 'Missing' on the Web page, and had to be re-assigned.
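Since sdX letters can move around like this, one hedge is to look drives up by a persistent identifier rather than by letter. The sketch below assumes a system where udev populates /dev/disk/by-id with symlinks named after model and serial number; that is common on Linux distributions, but whether a given unRAID build provides it is an assumption. The function name is hypothetical.

```python
import os


def map_stable_names(by_id_dir="/dev/disk/by-id"):
    """Map each persistent identifier to the kernel device it currently points at."""
    mapping = {}
    for name in sorted(os.listdir(by_id_dir)):
        link = os.path.join(by_id_dir, name)
        if os.path.islink(link):
            # realpath follows the symlink, e.g. ../../sda resolves to /dev/sda
            mapping[name] = os.path.realpath(link)
    return mapping
```

Running this before and after adding a controller would show whether the same serial number still resolves to the same /dev/sdX node.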

 

Link to comment

Thanks for the info, guys, re: the drive assignments.  I was thinking that I had seen some shifting on my previous setup, but I haven't looked closely at my current one.  However, as long as there's no need to uncompress/recompress for every little thing, then shifting assignments hopefully won't be too much of an issue.  I guess I could always use a PATA drive as my boot drive and have everything else SATA.

Link to comment

Yeah, I know I should figure this out on my own, but this has been bugging me all day.  First of all, thanks for a great guide, josetann.  I got my new hardware up and running last night, so I tried to give this a go.  Everything went well except that my new kernel won't boot.  The first time, I hadn't changed the ATA and SATA support to be compiled into the kernel.  I thought that I had read later that leaving them as modules would still work, but I might have been dreaming.  Anyway, I went back and just changed those from M to *, but it still wouldn't boot.  I think lilo finds the image, I get the initial loading line followed by a successful BIOS check line, but then it hangs.

 

I'm sure I probably just need to go deeper into the make menuconfig, but I just wanted to throw it out there in case anyone has hit this before.

Link to comment

Another thing: some of those driver entries expand into further options.  If you're just changing a handful of options from M to *, then you're not getting them all.  Hit Enter on each one; either nothing will happen, or you'll be given a whole new list of options (which you must go through and change from M to *).

 

The note about not having to change from Modules to built-in was about the filesystem stuff.  Everything that's required is already compiled directly in.  You just need to get the ATA/SATA drivers compiled directly into the kernel.
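One way to double-check that no disk driver was left as a module is to scan the .config itself. This is only a sketch: the function name is hypothetical, and the prefix list is an assumption about which option families matter for booting from an ATA/SATA disk, so adjust it for your controller.

```python
def modular_options(config_text,
                    prefixes=("CONFIG_ATA", "CONFIG_SATA", "CONFIG_PATA", "CONFIG_BLK_DEV_SD")):
    """Return options with the given prefixes that are still built as modules (=m)."""
    found = []
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments such as "# CONFIG_FOO is not set" and anything without a value.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "m" and name.startswith(prefixes):
            found.append(name)
    return found
```

Feeding it the contents of /usr/src/linux/.config should return an empty list once every relevant entry has been switched from M to *.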

Link to comment

Another thing: some of those driver entries expand into further options.  If you're just changing a handful of options from M to *, then you're not getting them all.  Hit Enter on each one; either nothing will happen, or you'll be given a whole new list of options (which you must go through and change from M to *).

Yeah, I was being lazy and thought the highest level would create the changes I needed.  Actually, I was hoping that the submenus already had reasonable defaults.  I'll go in there and change those, as well.

 

 

The note about not having to change from Modules to built-in was about the filesystem stuff.  Everything that's required is already compiled directly in.  You just need to get the ATA/SATA drivers compiled directly into the kernel.

 

Makes sense...I'm glad to know I didn't imagine it.

Link to comment

Yeah, I was being lazy and thought the highest level would create the changes I needed.  Actually, I was hoping that the submenus already had reasonable defaults.  I'll go in there and change those, as well.

 

No need.  Re-read the how-to (not because you didn't read it the first time, rather I just updated it).  Download the linux-2.6.24.4.tar.gz to your /usr/src directory.  Run tar xzf on it.  It should copy over all the needed files including the .config file I used.  Now everything should work (of course you will need to compile the kernel and modules, don't forget to make modules_install like I have multiple times).

Link to comment

So I went back and made all the M to * changes, and it would still hang after the "BIOS check successful" (or whatever) line.  I then copied over a .config file from a working 2.6.24.4 setup (made from a general upgrade to slackware-current).  I did a "kompare" of the two config files and, after poring through hundreds of lines, I finally got the clue to my bonehead mistake.  One of the lines in the working config contained something about "AGP"...that was an "ah ha" moment.

 

I was building on a system with X installed, and my lilo config was using VGA mode 773.  I changed VGA back to "normal" and rebooted.  Turns out, the system wasn't hanging, it was booting BUT I JUST COULDN'T SEE IT!  (It also didn't help that on my new hardware, my HDD LED wires were connected backwards, so I could never tell if the HDD was spinning).  I guess I'm just livin' and learnin'!
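Comparing two kernel configs by eye means wading through hundreds of lines, so here is a hedged sketch that reduces the comparison to only the options whose values differ. The function names are hypothetical, and treating an absent option the same as "is not set" is a simplifying assumption. In the case above the real culprit was the lilo VGA mode rather than the config, so a diff like this only supplies hints.

```python
def parse_config(text):
    """Parse kernel .config text into {option: value}; 'is not set' becomes 'n'."""
    suffix = " is not set"
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def config_diff(working_text, broken_text):
    """Return {option: (working_value, broken_value)} for options that differ."""
    working, broken = parse_config(working_text), parse_config(broken_text)
    diff = {}
    for name in sorted(set(working) | set(broken)):
        a, b = working.get(name, "n"), broken.get(name, "n")
        if a != b:
            diff[name] = (a, b)
    return diff
```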

Link to comment

Ok, I have SMP working on a virtual machine, I was able to start and stop it repeatedly and no crashes.  However I haven't tested this on a real machine, nor with any usage other than starting and stopping the array (so bad things may happen if you do this and actually use the array).

 

In unraid.c, simply change this line:

 

#define UNRAID_PARANOIA 1

 

to this:

 

#define UNRAID_PARANOIA 0

 

If anyone wants to try this, I'd recommend trying it in VMWare first, or on a spare machine.  If enough people report success, then maybe you can do this on a production server, IF you're feeling lucky.  Well, are you feeling lucky?

Link to comment

I'm a little leery about that change.

There's a reason an assert_spin_locked is there. It's to catch a bug or to force a failure on a condition.

Could be because the code itself is not thread/SMP safe.

 

So although it succeeds.... on heavy load there may be an issue with data and stripe buffers.

 

What is probably needed is a wait condition on the &conf->device_lock

 

 

Link to comment

I'm leery too, that's why I'm not doing it on my main server yet.  Just above it has this:

 

#if UNRAID_PARANOIA && defined(CONFIG_SMP)

 

So I figure maybe when Tom wrote it, he figured that SMP may give some problems, or may not.  If you were paranoid, set the paranoia bit (it's set by default, so now we know that Tom is indeed paranoid).  Maybe we shouldn't be so paranoid?

 

Actually, it's this way in the raid5.c driver.  I searched but couldn't find any good documentation in it.  Apparently the CHECK_DEVLOCK(); on line 278 is what's giving us fits.  Either we need to redefine it (it's currently set to "#define CHECK_DEVLOCK() assert_spin_locked(&conf->device_lock)" if running an SMP kernel and you have paranoia set), or look into what's happening around line 278.  Or just stop being so paranoid and see what happens.  I am curious as to what the worst that could happen is.

Link to comment

Just some more observations.  So far it appears to have no adverse effect.  I tested it extensively in VMware and was satisfied enough that I tried it out on my main server.  I don't have enough time to fully test it out (gotta get some sleep sometime), but I'll try that tomorrow.

 

When I tested under VMware, I timed how long it took to transfer a file over.  I used the same file, and had a parity drive selected (so writing would consume a decent amount of cpu).  It was definitely faster with smp enabled.  I tried a variety of things to stress it out (and hopefully get a crash if it was going to do so), including deleting a file (which would need to recalculate parity), start a new transfer at the same time, then stopping the array (previously I'd get a crash whenever stopping the array).  Worked fine.
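For anyone wanting to repeat the timing comparison between the SMP and non-SMP kernels, here is a minimal sketch that copies one file and reports elapsed time and throughput. The function name is hypothetical, and it assumes the destination is a mounted path such as a share on the array. It is not how the numbers above were measured.

```python
import os
import shutil
import time


def timed_copy(src, dst):
    """Copy src to dst and return (elapsed_seconds, megabytes_per_second)."""
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    # Ask the OS to flush the copy to disk so caching does not flatter the result.
    with open(dst, "r+b") as f:
        os.fsync(f.fileno())
    elapsed = max(time.perf_counter() - start, 1e-9)
    size_mb = os.path.getsize(dst) / 1e6
    return elapsed, size_mb / elapsed
```

Using the same source file for every run, as in the test described above, keeps the comparison fair.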

 

Now, partly because I'm still a bit chicken, and partly because this is what I wanted the final result on my machine to be, I've somewhat restricted everything on the server to use CPU0 (without some work I can't bind absolutely everything to a certain cpu, but this should be good enough).  I can force certain programs to use a different cpu, so I set vmware to use CPU1.  I figure that even if there is a bug in the unraid.c code, it shouldn't be triggered if it doesn't ever switch cpus.  Only time will tell for sure.  If anyone's wanting to do the same thing as me (and you shouldn't, let me be the guinea pig for a while), here's what I did:

 

Move the existing kernel and System.map out of the way; I used /boot/vmlinuz-2.6.24.4-unRAIDnosmp and /boot/System.map-2.6.24.4-unRAIDnosmp.  Edit your /usr/src/linux/drivers/md/unraid.c file; you only have to go down about a page and change "#define UNRAID_PARANOIA 1" to "#define UNRAID_PARANOIA 0".  Now go to /usr/src/linux, run make menuconfig, and enable SMP support.  Then make bzImage, move that to /boot/vmlinuz-2.6.24.4-unRAID, move System.map to /boot/System.map-2.6.24.4-unRAID, make modules, then make modules_install.

 

Ok, now edit /etc/lilo.conf.  Change your old kernel to point to the -nosmp one.  Make a new entry just above it, just copy everything from the old.  Remember to use the new kernel's name though.  At the bottom of the new entry add this: append = "isolcpus=1".  This will force the kernel to NOT use CPU1 (which would actually be the second cpu, numbering starts at 0) unless you tell it to (there are exceptions, but they shouldn't affect us).  This is good enough for a dual-core processor, I do not know what you'd need to enter for more cores than that.  Your entry should look something like this (note that your root = is probably different from mine):

 

image = /boot/vmlinuz-2.6.24.4-unRAID
  root = /dev/hda3
  label = unRAID
  append = "isolcpus=1"
  read-only

 

When you're done editing, run "lilo -v" without quotes.  You'll probably get an error related to a device mapper not compiled into the kernel or some-such, that's fine.
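After rebooting into the new entry, the kernel command line in /proc/cmdline should contain the isolcpus option. The sketch below pulls the CPU numbers out of such a string. The function name is hypothetical, and it assumes the value is a plain comma-separated list of numbers or ranges such as 1 or 1,2-3; other forms are not handled.

```python
def isolated_cpus(cmdline):
    """Return the set of CPU numbers named by isolcpus= in a kernel command line."""
    key = "isolcpus="
    cpus = set()
    for token in cmdline.split():
        if token.startswith(key):
            for part in token[len(key):].split(","):
                if "-" in part:
                    # A range such as 2-3 covers both endpoints.
                    low, high = part.split("-", 1)
                    cpus.update(range(int(low), int(high) + 1))
                elif part:
                    cpus.add(int(part))
    return cpus
```

For example, calling isolated_cpus(open("/proc/cmdline").read()) on the setup above would be expected to give a set containing only CPU 1.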

 

Now for VMware, open up the .vmx file for your guest OS.  At the bottom put this:

 

processor0.use = "FALSE"
processor1.use = "TRUE"

 

That tells it to not use CPU0, and to use CPU1.  You now have a server where the unRAID OS runs on CPU0, and your VMware guest OS runs on CPU1.
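For programs other than VMware, the same kind of pinning can be done with the taskset command, or from a script. Here is a hedged sketch using Python's os.sched_setaffinity, which exists only on platforms that support it (Linux in particular) and needs a far newer Python than a 2008-era Slackware ships; the function name is hypothetical.

```python
import os


def pin_to_cpus(cpus, pid=0):
    """Restrict process `pid` (0 means the calling process) to the given CPU numbers."""
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("CPU affinity calls are not available on this platform")
    os.sched_setaffinity(pid, set(cpus))
    # Read the mask back so the caller can confirm what the kernel accepted.
    return os.sched_getaffinity(pid)
```

With isolcpus=1 in effect, calling pin_to_cpus([1], pid) on a chosen process would move it onto the core that the scheduler otherwise leaves alone.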

 

Feel free to try it out for yourself, but please don't do it on a server with important data, not until someone who knows what they're doing can comment on this.  I'm willing to take the risk, but if you do too, don't blame me (or anyone else other than yourself) if it crashes, catches on fire, dials your cellphone in the middle of the day eating up your minutes, etc.

 

Link to comment

I suppose if you use the isolcpus then you'll be somewhat safe as the kernel will not be using the other cpu.

I would suggest trying to use the old #define and the isolcpus together to see what happens.

That would give me a warmer fuzzier feeling  ;D

 

I forgot about those vmware parms, good find.

 

I'm still using vmware server 1, so I'm going to have to try and get that one running.

 

I'm going to browse through the raid5 and unraid code to see if something grabs me.

 

My feeling is that if the #define is enabled in raid6 without tripping a bug, then there's something amiss in unraid and there's a potential for a buffer issue somewhere.

 

There's a reason for spin locks and ignoring it could be trouble waiting to visit.

 

Still very cool you've gotten this far.

 

Just got a bunch of my removables and new 120MM fans, so the build of my larger environment is making progress.

Going to do some benchmarking and testing with NFS today. /me crosses fingers.

 

Link to comment

There's very little (read: none that I could find) documentation on just what the paranoia bit is supposed to be for.  Since it says up above it's for debugging, one would assume it's not that necessary.  Then again it's enabled by default.  Who knows?

 

I "think" that leaving the define as is, and restricting the cpu to one core, would still cause a crash.  I haven't tested it yet though, if I get bored I might.  I'm just happy it's working the way I want on my system, but I would like to see just what's going on and make sure it's safe for others to run too.

 

BTW I'm using VMware server 1 as well.  I couldn't get RAW disks to work in 2, plus with the expiration date since it's a beta...I just want to get this working and then forget about it.

Link to comment

While I most likely will be going back to my old setup (everything on CPU0 unless otherwise specified), I've been running the past 24 hours with both cores wide-open (i.e. linux can use both cores however it wants, and my VMware install of Vista could use both cores too).  I've had no ill effects yet.  Unfortunately even with both cores available TVersity just isn't able to handle some of my files, most notably mpeg-2 video with ac3 audio.  Both cores would go to 100%, but it couldn't keep up (my HR20 would only play a few seconds before exiting, Xbox 360 kept buffering every few seconds).

 

I've been trying out fuppes, and while it's pretty difficult to get going it does work.  I had to give up on the vob files for my 360 though, it'd transcode fine but the 360 wouldn't play it (though transcoding manually worked fine, go figure).  The PS3 will play these files manually, so I'm just going to stick with that for now.  Fuppes could probably be installed on unRAID out of the box (vs on a hard drive like I'm doing), but it'd take quite a bit to get it working right (requires quite a few files for transcoding).

Link to comment

Both cores would go to 100%

 

How did you manage to determine that in Linux?

 

 

You can use top.  Just press "1" and it'll switch between showing the average of all the cpus/cores, or each individual core.  Of course since I was maxxing out the cpu while using Vista running under VMware, I just looked at task manager.
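The per-core figures that top shows come from the counters in /proc/stat, so they can also be computed from two snapshots of that file. Below is a minimal sketch with a hypothetical function name; counting idle plus iowait as "not busy" is a choice made here, not something top is guaranteed to match exactly.

```python
def per_core_busy(before, after):
    """Return {cpuN: busy percent} between two snapshots of /proc/stat text."""
    def parse(text):
        cores = {}
        for line in text.splitlines():
            fields = line.split()
            # Per-core lines look like "cpu0 user nice system idle iowait irq softirq steal".
            if fields and fields[0].startswith("cpu") and fields[0] != "cpu":
                values = [int(v) for v in fields[1:]]
                idle = values[3] + (values[4] if len(values) > 4 else 0)
                # Only the first eight counters are summed; later guest fields overlap user time.
                cores[fields[0]] = (sum(values[:8]), idle)
        return cores

    first, second = parse(before), parse(after)
    usage = {}
    for core, (total2, idle2) in second.items():
        total1, idle1 = first.get(core, (0, 0))
        delta = total2 - total1
        usage[core] = 100.0 * (delta - (idle2 - idle1)) / delta if delta else 0.0
    return usage
```

Taking one snapshot, sleeping a second or two, and taking another gives the load over that interval for each core.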

Link to comment

You shouldn't be maxing out both cores with Windows, unless you did not install the VMware tools.

 

I just realized my Slackware VM was running at 100%; then I installed the VMware tools and it went back to idle.

 

 

I didn't mean the server was just sitting there doing nothing with both cores maxxed out.  They were only maxxed out when I ran TVersity, trying to stream a vob to my HR20 or my Xbox 360 (since neither could play it natively, it had to transcode).

Link to comment
