UnRAID on VMWare ESXi with Raw Device Mapping



Yes, you can pass through the motherboard's controller, but there are some caveats with this and other devices. It appears that you pass through the whole bus that the PCI (or other) device sits on. In my case, my AOC-SATA8-MV8 is on the PCI bus. If I put in another PCI card, it cannot be assigned to any guest other than the one that has the SATA card. Similarly, the motherboard's onboard graphics appears to be on the PCI bus too, so I lose my graphics. Not that you need it with ESXi.

 

Just a note related to the above. I just read that:

The VT-d specification states that all conventional PCI devices behind a PCIe-to-PCI bridge have to be assigned to the same domain. PCIe devices do not have this restriction.

(taken from http://wiki.xensource.com/xenwiki/VTdHowTo )
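
The rule quoted above can be illustrated with a small Python sketch. The device names other than the AOC-SATA8-MV8, the `bridge0` label, and the tuple layout are all made up for illustration; this only models the grouping rule and does not query real hardware.

```python
from collections import defaultdict


def assignment_groups(devices):
    """Return lists of devices that must be assigned to the same guest.

    devices: iterable of (name, kind, bridge) tuples. kind is "pci" for a
    conventional PCI device or "pcie" for a PCIe device; bridge is a label
    for the PCIe-to-PCI bridge a conventional PCI device sits behind
    (ignored for PCIe devices).
    """
    behind_bridge = defaultdict(list)
    groups = []
    for name, kind, bridge in devices:
        if kind == "pci":
            # Everything behind one bridge shares a single domain.
            behind_bridge[bridge].append(name)
        else:
            # PCIe devices can be assigned individually.
            groups.append([name])
    groups.extend(behind_bridge.values())
    return groups


if __name__ == "__main__":
    example = [
        ("AOC-SATA8-MV8", "pci", "bridge0"),
        ("second PCI card", "pci", "bridge0"),   # hypothetical device
        ("PCIe NIC", "pcie", None),              # hypothetical device
    ]
    for group in assignment_groups(example):
        print(group)
```

In this model the two conventional PCI cards end up in one group, which is why a second PCI card cannot go to a different guest than the SATA card.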

Link to comment

The problem with my sharing the LSI card is because of a bug in the LSI driver according to Tom:

 

Hi David,

Yeah this is a bug in the LSI driver, and is typical of proprietary drivers (they don’t implement all of the required command set).  There is a workaround I can do in the unRaid driver which is on my todo list.

Cheers,

Tom

Link to comment

 

I got ESXi 4.1 installed on a CF card with an IDE adapter (I needed to add customized drivers to support the network chip, a Realtek 8111C on a GIGABYTE GA-MA780G-UD3H motherboard), and it runs great.

However, I ran into a problem with the unRAID VM, which is configured with physical RDMs and the LSI SAS controller. emhttp sees the drives and lets me assign them correctly on the Devices page, and it restarts the md-mod driver with the correct parameters (major/minor device numbers), but then the md-mod driver cannot import the drives because the SCSI inquiry doesn't return the correct model and serial number. hdparm doesn't work either.

I looked at the driver code (md.c) but can't figure out what exactly goes wrong there. A small test program that uses almost the same ioctl calls with a SCSI query works fine: serial and model numbers are returned correctly. "Almost", because it uses a user-space ioctl instead of the kernel ioctl_by_bdev executed by the driver.

 

Oct 21 23:56:58 Tower emhttp: unRAID System Management Utility version 4.5.6
Oct 21 23:56:58 Tower emhttp: Copyright © 2005-2010, Lime Technology, LLC
Oct 21 23:56:59 Tower emhttp: Basic key detected, GUID:
Oct 21 23:56:59 Tower emhttp: shcmd (1): udevadm settle
Oct 21 23:56:59 Tower rc.local_startup[1517]: Initiating Local Custom Startup.
Oct 21 23:56:59 Tower emhttp: Device inventory:
Oct 21 23:56:59 Tower emhttp: pci-0000:03:00.0-sas-phy0:1-0x5000c29db5f14ed2:0-lun0 host1 (sda) ST31000528AS_6VP06JTM
Oct 21 23:56:59 Tower emhttp: pci-0000:03:00.0-sas-phy1:1-0x5000c29a767a1bee:1-lun0 host1 (sdb) ST31000528AS_6VP06L45
Oct 21 23:56:59 Tower emhttp: pci-0000:03:00.0-sas-phy2:1-0x5000c29f90899daf:2-lun0 host1 (sdc) ST31000528AS_6VP05VMM
Oct 21 23:56:59 Tower emhttp: shcmd (2): modprobe -rw md-mod 2>&1 | logger
Oct 21 23:56:59 Tower emhttp: shcmd (3): modprobe md-mod super=/boot/config/super.dat slots=8,16,8,32,8,0 2>&1 | logger
Oct 21 23:56:59 Tower kernel: xor: automatically using best checksumming function: pIII_sse
Oct 21 23:56:59 Tower kernel: pIII_sse : 2358.000 MB/sec
Oct 21 23:56:59 Tower kernel: xor: using function: pIII_sse (2358.000 MB/sec)
Oct 21 23:56:59 Tower ifplugd(eth0)[1452]: Link beat detected.
Oct 21 23:56:59 Tower kernel: md: unRAID driver 0.95.4 installed
Oct 21 23:56:59 Tower emhttp: Spinning up all drives...
Oct 21 23:56:59 Tower kernel: md0: import: scsi_inquiry (std inquiry) error: -14
Oct 21 23:56:59 Tower kernel: md0: import: scsi_inquiry (vpd: unit ser no) error: -14
Oct 21 23:56:59 Tower kernel: md: import disk0: [8,16] (sdb) offset: 63 size: 976762552
Oct 21 23:56:59 Tower kernel: md: disk0 wrong
...
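
The `slots=8,16,8,32,8,0` argument in the log above looks like a flat list of major,minor pairs, which is consistent with the `[8,16] (sdb)` shown in the import line. Here is a minimal Python sketch of that decoding, assuming the classic Linux `sd` numbering (major 8, 16 minor numbers per disk). The function names are mine and are not part of unRAID.

```python
def parse_slots(slots):
    """Split a 'major,minor,major,minor,...' string into (major, minor) pairs."""
    numbers = [int(field) for field in slots.split(",")]
    if len(numbers) % 2 != 0:
        raise ValueError("slots must contain an even number of fields")
    return list(zip(numbers[0::2], numbers[1::2]))


def sd_name(major, minor):
    """Map a (major, minor) pair to an sd device name.

    Assumes the first sd block major (8), which covers sda through sdp
    with 16 minor numbers per disk (minor % 16 is the partition number).
    """
    if major != 8 or not 0 <= minor < 256:
        raise ValueError("only major 8 with minor 0-255 is handled here")
    disk, partition = divmod(minor, 16)
    name = "sd" + chr(ord("a") + disk)
    return name if partition == 0 else name + str(partition)


if __name__ == "__main__":
    for slot, (major, minor) in enumerate(parse_slots("8,16,8,32,8,0")):
        print(f"disk{slot}: [{major},{minor}] ({sd_name(major, minor)})")
```

Under that assumption the three pairs decode to sdb, sdc and sda, in that slot order.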

 

I figured out why I was getting error -14 and managed to get unRAID to work. This may also help in other cases where the Devices web page (emhttp) shows the correct serial/model number but the md-mod driver refuses to import the disks because it can't get the model/serial number.

In short (in my configuration under ESXi), the Linux sg driver handles SG_IO SCSI inquiry calls expecting a user-space buffer and refuses to write the info to a kernel-space buffer. My workaround is to try SG_IO first and, on an EFAULT (-14) error, try the legacy (deprecated but still supported) ioctl call, which handles a kernel buffer correctly. I created and successfully tested a patch for md.c; let me know if anyone else wants to try it. By the way, I compiled it against the 2.6.35.8 kernel, but it can be applied to any kernel with almost no modifications.
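
The try-then-fall-back logic can be sketched in user-space Python. This is not the md.c patch (which is kernel C and is not shown in this thread); the two callables stand in for the SG_IO path and the legacy ioctl path, and their names are placeholders. The parser assumes the standard SCSI INQUIRY layout (vendor in bytes 8-15, product in bytes 16-31, revision in bytes 32-35).

```python
import errno


def inquiry_with_fallback(sg_io_inquiry, legacy_inquiry, device):
    """Try the SG_IO-style call first; on EFAULT, retry with the legacy call.

    Both callables are hypothetical stand-ins that take a device and
    return raw INQUIRY bytes or raise OSError.
    """
    try:
        return sg_io_inquiry(device)
    except OSError as exc:
        if exc.errno != errno.EFAULT:
            raise  # any other failure is a real error
        return legacy_inquiry(device)


def parse_std_inquiry(data):
    """Extract vendor, product and revision strings from standard INQUIRY data."""
    if len(data) < 36:
        raise ValueError("standard INQUIRY data is at least 36 bytes")

    def text(chunk):
        return chunk.decode("ascii", errors="replace").strip()

    return {
        "vendor": text(data[8:16]),
        "product": text(data[16:32]),
        "revision": text(data[32:36]),
    }
```

Only EFAULT triggers the fallback, so unrelated failures still surface instead of being masked by the legacy path.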

 

Now I am looking to create a custom VMware-optimized unRAID configuration (with paravirtualized SCSI/net and the balloon driver) and have it running 24x7 in a VM.

 

Nov 13 22:29:24 Tower kernel: md: unRAID driver 0.95.4 installed
Nov 13 22:29:24 Tower kernel: program modprobe is using a deprecated SCSI ioctl, please convert it to SG_IO
Nov 13 22:29:24 Tower kernel: program modprobe is using a deprecated SCSI ioctl, please convert it to SG_IO
Nov 13 22:29:24 Tower kernel: md: import disk0: [8,16] (sdb) model: 'ST31000528AS    ' serial: '6VP06L45' offset: 63 size: 976762552
Nov 13 22:29:24 Tower kernel: program modprobe is using a deprecated SCSI ioctl, please convert it to SG_IO
Nov 13 22:29:24 Tower kernel: program modprobe is using a deprecated SCSI ioctl, please convert it to SG_IO
Nov 13 22:29:24 Tower kernel: md: import disk1: [8,32] (sdc) model: 'ST31000528AS    ' serial: '6VP05VMM' offset: 63 size: 976761496
Nov 13 22:29:24 Tower kernel: program modprobe is using a deprecated SCSI ioctl, please convert it to SG_IO
Nov 13 22:29:24 Tower kernel: program modprobe is using a deprecated SCSI ioctl, please convert it to SG_IO
Nov 13 22:29:24 Tower kernel: md: import disk2: [8,0] (sda) model: 'ST31000528AS    ' serial: '6VP06JTM' offset: 63 size: 976762552

Link to comment

There is a workaround for the spindown/spinup issue that one may experience with SATA drives presented to the unRAID VM using physical RDM with the LSI Logic SCSI or paravirtualized SCSI controller. The issue reported in the syslog (shown below) seems to arise because the Linux/VMware SCSI layer doesn't accept ATA spinup/spindown commands. It does, however, accept SCSI unit stop/start commands, which can be sent using sg_start from sg3_utils (http://sg.danny.cz/sg/sg3_utils.html).

For instance, sg_start --stop /dev/sda will stop the drive and sg_start --start /dev/sda will start it.
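
For readers curious what sg_start sends: the SCSI START STOP UNIT command is a six-byte CDB with opcode 0x1B, and the START bit (bit 0 of byte 4) selects spin-up or spin-down. The Python sketch below builds that CDB and the matching sg_start command line. It only constructs values and does not touch a device, and the helper names are illustrative rather than taken from the md.c patch.

```python
START_STOP_UNIT = 0x1B  # SCSI opcode


def start_stop_cdb(start, immed=False):
    """Build a 6-byte START STOP UNIT CDB.

    start=True spins the unit up, start=False spins it down. immed=True
    asks the device to return status before the operation completes.
    """
    return bytes([
        START_STOP_UNIT,
        0x01 if immed else 0x00,  # byte 1, bit 0: IMMED
        0x00,
        0x00,
        0x01 if start else 0x00,  # byte 4, bit 0: START
        0x00,                     # control byte
    ])


def sg_start_argv(device, start):
    """Return the sg_start command line from the post as an argument list."""
    return ["sg_start", "--start" if start else "--stop", device]
```

The argument list can be handed to `subprocess.run` on a system that has sg3_utils installed; doing so needs sufficient privileges on the device node.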

 

I created a patch for md.c to integrate this into the md-mod driver. It works well so far, with no more messages like this:

 

Oct 21 23:56:59 Tower kernel: md: disk2: ATA_OP_SETIDLE1 ioctl error: -22

 

I still have to figure out the temperature issue with smartctl.

Link to comment

I hit another wall with following through with my ESXi plans... the uncertainty of 3TB drives.  I don't want to overhaul the server now only to find out 3 months from now that I need to overhaul again for 3TB drive support.  Does anyone have any input regarding this?

Link to comment

There are SO many things that need to change to get 3TB drive support.

 

I would NOT build a server/computer today and expect it to work with 3TB drives.  The SATA controller, BIOS (EFI), etc. need to change to support 3TB drives, and the hardware has not quite caught up yet.  I would not look to build a server that supports 3TB for at least another year.  Build the server with all 2TB drives and go from there.

Link to comment
Oct 21 22:38:32 Tower emhttp: disk_temperature: ioctl (smart_enable): Invalid argument

 

The temperature issue above is much harder to resolve given the lack of source code for emhttp. Apparently it fails for a reason similar to why hdparm fails: both use kernel queries (ioctl calls) intended for ATA devices, which do not work under VMware when using RDM disks with SCSI drivers.

Those calls need to be made for SCSI devices. By the way, that's exactly how smartctl operates, and that's why it returns the temperature correctly. So while the changes are relatively simple, the best scenario is for limetech to fix it in the emhttp code.
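
Since smartctl reads the temperature correctly, one stopgap is to parse its output from a script. A rough Python sketch follows. It assumes smartctl's `-A` attribute listing and handles two layouts I believe smartctl prints: the ATA attribute table row for attribute 194, and the SCSI-style `Current Drive Temperature` line. Which of the two appears on an RDM disk is not stated in this thread, so treat both patterns as assumptions.

```python
import re
import subprocess


def parse_temperature(smartctl_output):
    """Return the drive temperature in Celsius, or None if not found."""
    for line in smartctl_output.splitlines():
        # SCSI-style report, e.g. "Current Drive Temperature:     36 C"
        match = re.match(r"\s*Current Drive Temperature:\s+(\d+)\s*C", line)
        if match:
            return int(match.group(1))
        # ATA attribute table columns: ID, name, flag, value, worst,
        # thresh, type, updated, when_failed, raw value.
        fields = line.split()
        if len(fields) >= 10 and fields[0] == "194" and fields[9].isdigit():
            return int(fields[9])
    return None


def read_temperature(device):
    """Run 'smartctl -A <device>' and parse the result (usually needs root)."""
    result = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True, text=True, check=False,
    )
    return parse_temperature(result.stdout)
```

`check=False` is deliberate, because smartctl uses a non-zero exit status to report conditions that are not fatal to reading the attributes.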

 

Link to comment

Please correct me if I am wrong, but from what I have read in your posts, you have gotten everything to work with your changes except spin-down and temperature?

So on the Main page you can see the model and serial number of the disks on the LSI controller rather than a dash? Thus, if you switch the disks around on the ports of the LSI controller, you could put them back together again in unRAID without problems?

I'm really looking forward to being able to reduce three 24-hour machines down to just one, given the hardware/power consumption overkill going on.

 

Eagerly following this thread.

Link to comment

The SATA controller, BIOS (EFI), etc need to change to support 3TB drives and hardware is not quite caught up yet.

 

This website at Western Digital says that 3TB drives are supported by Linux although there is a footnote.

 

http://www.wdc.com/en/solutions/Greaterthan22.asp

 

Edit:

WD is bundling WD Caviar® Green™ 2.5 TB and 3 TB drives with an AHCI-compliant HBA that, once installed, allows the operating system to use a known driver to correctly support large capacity drives.

 

So it seems you need a compatible HBA...

Link to comment

Spindown/spinup got fixed as well, but the temperature reading issue is not, as described above.

Here is my completion/to-do list:

- fix model/serial number SCSI query issue - completed (unRAID driver patch)
- fix spinup/spindown issue - completed (unRAID driver patch)
- create/compile VMware-optimized kernel - work in progress
- compile/install VMware tools - todo
- fix temperature issue - todo (hardest piece)

By the way, I sent the unRAID patches to limetech in the hope of getting them incorporated into the mainstream version eventually, but I have not gotten any response so far.

 

My current VMware-optimized kernel options are the following:

- Device drivers -> SCSI support -> SCSI low-level drivers -> VMware PVSCSI (*)
- Device drivers -> Network device support -> VMware VMXNET3 (M)
- Device drivers -> Misc devices -> VMware Balloon driver (M)

In general, PVSCSI (paravirtualized SCSI) is recommended for I/O-intensive workloads, vmxnet3 is the paravirtualized network driver, and the VMware balloon driver speaks for itself. There are some other tuning parameters available; a good article on Linux guests under VMware is at http://events.linuxfoundation.org/slides/2010/linuxcon2010_raja.pdf
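
One way to double-check a kernel build against the options listed above is to scan its .config file. The sketch below assumes the menu entries correspond to the Kconfig symbols CONFIG_VMWARE_PVSCSI, CONFIG_VMXNET3 and CONFIG_VMWARE_BALLOON (my mapping, not stated in the post), with (*) meaning built in (`y`) and (M) meaning module (`m`).

```python
# Assumed mapping from the menu entries to Kconfig symbols.
WANTED = {
    "CONFIG_VMWARE_PVSCSI": "y",   # PVSCSI (*), built in
    "CONFIG_VMXNET3": "m",         # VMXNET3 (M), module
    "CONFIG_VMWARE_BALLOON": "m",  # balloon driver (M), module
}


def parse_kernel_config(text):
    """Return {symbol: value} for every 'CONFIG_X=value' line."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skips blanks and '# CONFIG_X is not set'
        symbol, sep, value = line.partition("=")
        if sep:
            options[symbol] = value
    return options


def mismatches(text, wanted=WANTED):
    """Return {symbol: (expected, actual)} for options that differ."""
    options = parse_kernel_config(text)
    return {
        symbol: (expected, options.get(symbol))
        for symbol, expected in wanted.items()
        if options.get(symbol) != expected
    }
```

Calling `mismatches(open(".config").read())` from the kernel source directory returns an empty dict when all three options match the assumed settings.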

 

 

Link to comment
So on the Main page you can see the model and serial number of the disks on the LSI controller rather than a dash? Thus, if you switch the disks around on the ports of the LSI controller, you could put them back together again in unRAID without problems?

 

I don't have a real LSI controller (or any controller separate from the motherboard) to switch disks around on, and I do not use VMDirectPath I/O or PCI pass-through to present a controller directly to the unRAID VM.

What I do have is a number of SATA drives (attached to the SATA controller built into the motherboard) that are presented to the guest unRAID VM using the physical RDM feature of VMware. I can configure the VM with the LSI Logic SCSI controller (virtual) or PVSCSI (virtual) to present those disks.

 

Link to comment

 

So your patches will help with any of the controller cards listed below that use the mptsas driver? If so, that is great news.  :)

I have a Dell SAS 5/e non-RAID (LSI 1068), which also uses the mptsas driver.

In addition:

- Dell SAS 5i, 5iR, SAS 6i and 6iR
- LSI SAS1064 and SAS1064E, SAS1068 and SAS1068E, SAS1078
- VMware SAS controller
- Fujitsu and Intel both make cards based on the LSI1068 as well.

 

 

Link to comment

So basically, with the exception of temps, I could run my LSI card passed through to my unRAID VM, and it would not only get the correct names on the main status page (as on a regular box) but also have spindown/spinup working?  Temp is low on my list of requirements, but spinup/spindown and correct naming are high.  Here's hoping you'll have either some good, detailed instructions on how to set it up (with your md mods as well) or an installable setup that will work out of the box :D.

 

Link to comment

Essentially the problem is with SCSI support in unRAID. The patches I made do enable it in the unRAID driver (at least enough for VMware; I don't have the hardware to test SCSI on a physical server). The management piece of unRAID (emhttp) and user customizations (such as uumenu) are still not SCSI-aware, which can be lived with. For emhttp, so far I have only discovered that temperature readings are not working, and the uumenu scripts that query info from the drives can be patched to make them SCSI-aware.

To simplify things, I created an ISO file that my unRAID VM boots from. Essentially the ISO consists of the bzimage and bzroot files, which boot into RAM disks the same way stock unRAID boots from a USB drive. The USB drive is still needed for the unRAID configuration, license, customization and so on.

I will upload the .iso and usage instructions to the internet later today or tomorrow so people can play with this (I need to find a place for ~100 MB first).

 

 

Link to comment

Actually it is even less, ~50 MB, and it is now available at http://www.mediafire.com/?4ypkpvf6dff8x

The included readme file describes how to use it. If something is not clear, let me know.

It would be interesting to hear comments.

Yes! I copied the bzroot and bzimage off your ISO onto the flash drive for my test machine. It booted fine, recognized the drive on my LSI1068E-chipset card as a new hard drive, and proceeded with a rebuild of that drive.

On the main page everything shows except the temperature. I'll check spin down/up when it has finished its rebuild.

 

Attaching a syslog for those interested.

syslog.zip

Link to comment

It doesn't seem that the ISO includes the source of the patches you made. The /usr/src/linux/drivers/md files seem identical to stock unRAID 4.5.6. Did I overlook them somewhere?

 

Can you please provide them? This way those of us interested in them can patch them into the 5.0 beta 2 on our test box(es), and you can be compliant with the GPL.

 

Thanks!

Link to comment

Right, it is missing, since I applied the patch to the Linux kernel source tree after copying the unRAID drivers over. I will include it in the next revision of the ISO, where I hope to have VMware tools integrated. For now it is uploaded to the same directory.

Link to comment
