13:00:51 <baoli> #startmeeting PCI Passthrough
13:00:52 <openstack> Meeting started Tue Apr 15 13:00:51 2014 UTC and is due to finish in 60 minutes.  The chair is baoli. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:53 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:56 <openstack> The meeting name has been set to 'pci_passthrough'
13:01:00 <baoli> Hi there
13:01:04 <heyongli> hi
13:01:12 <beagles> hi
13:01:51 <baoli> Irenab is not going to be here today
13:02:20 <baoli> Let's wait to see if John and Rkukura are going to join
13:02:35 <Guest84743> (nick has gone crazy, Guest84743 is beagles)
13:02:53 <rkukura> I’m lurking, but am not caught up on current sr-iov goings on
13:04:48 <baoli> I think that we should get started.
13:05:08 * russellb lurking
13:05:28 <baoli> https://review.openstack.org/#/c/86606/
13:07:16 <baoli> For networking, my simplistic view of how a tenant is going to use it: 1) they want an SR-IOV port (with macvtap or not), 2) the port is connected to a particular network.
13:08:46 <heyongli> baoli, agreed, I picture it the same way
13:09:20 <baoli> So basically a compute host needs to provide the number of sr-iov ports per network it's connected with
13:10:51 <heyongli> I don't quite get your point, but it sounds good
13:11:58 <baoli> heyongli, I'm talking about the information required from a compute host for sr-iov networking
13:12:44 <heyongli> got it, does this bring something new?
13:13:01 <Guest84743> baoli, to clarify, when you write "(macvtap or not)" above, do you mean the tenant specifies the type of connection (direct/macvtap) or is this implicit in the sr-iov/network spec?
13:13:58 <baoli> guest84743, two variants with sr-iov ports
13:14:03 <Guest84743> right okay
13:14:18 <Guest84743> so they have a degree of control over how it is connected?
13:14:26 <baoli> heyongli, not really. We have gone over it several times.
13:14:37 <heyongli> I suppose the port type could be specified by the user or not; both should be valid use cases
13:15:10 <baoli> Guest84743, yeah, with macvtap you get less performance, but live migration support
13:15:16 * Guest84743 nods
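For context, the workflow being discussed maps roughly onto the vnic_type port attribute that was under proposal at the time; a minimal sketch, assuming the proposed binding:vnic_type values of "direct" and "macvtap" are accepted, with "net1" as an example network name:

    # create an SR-IOV port on an existing network
    neutron port-create net1 --binding:vnic_type direct   # or: macvtap
    # boot a VM attached to that port
    nova boot --flavor m1.large --image <image-id> --nic port-id=<port-uuid> vm1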
13:16:18 <baoli> heyongli, on top of the basic use cases, you brought up that a tenant may want to use an SR-IOV port that resides on a particular vendor's card.
13:17:12 <baoli> Would everyone agree that the cloud should provide this support to the tenant?
13:17:18 <Guest84743> has there ever been a discussion about using classifiers instead of direct "specification" — meaning something like 'passthru-to-net-foo-direct' instead of lower-level details — or is this type of generalization/indirection seen as being part of a flavor mechanism?
13:17:44 <russellb> this feels like flavor driven stuff to me
13:17:46 <heyongli> guest84743, i agree
13:17:52 <russellb> Guest84743: /nick beagles :-)
13:18:05 <baoli> beagles, yes, we talked about that.
13:18:10 <sadasu> Guest84743: good point
13:18:14 <Guest84743> russellb, wouldn't let me change.. or kept changing it back
13:18:34 <russellb> Guest84743: may need to identify with nickserv
13:18:37 <rkukura> I’m no expert on this area, but do we need to think about how the tenant ensures the VM has the right driver for whatever device they are asking for?
13:18:37 <Guest84743> yeah, I've read the backlog of emails but in the end I couldn't get a strong handle on the general consensus
13:18:51 <heyongli> I also think the vendor and vnic type should be hidden in the flavor, but the user should also be able to choose them explicitly if they really want to
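For reference, the "hide it in the flavor" idea builds on the PCI alias mechanism that already existed in nova at the time; a minimal sketch, where the vendor/product IDs and the flavor name are examples only:

    # nova.conf on the controller: define an alias for a class of devices
    pci_alias = {"vendor_id":"15b3", "product_id":"1004", "name":"mlnx_vf"}
    # request one such device through a flavor extra spec
    nova flavor-key m1.sriov set "pci_passthrough:alias"="mlnx_vf:1"

A tenant who simply boots with that flavor never sees the vendor details, while a tenant who cares could still be offered flavors keyed to specific vendors.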
13:19:35 <sadasu> rkukura: yes...that is very critical for the feature to work
13:19:38 <b3nt_pin> heh, b3nt_pin will have to do for now.. won't let me (b3nt_pin is beagles)
13:20:04 <baoli> rkukura, do you mean the kernel device driver?
13:20:10 <rkukura> baoli: yes
13:20:16 <rkukura> in the VM
13:20:23 <b3nt_pin> rkukura, that wouldn't suck :)
13:20:30 <sadasu> when the tenant picks the vendor and product ids, this indirectly takes care of the tenant driver criteria
13:20:30 <russellb> you can do something for that between flavor metadata and image metadata
13:20:38 <russellb> capabilities / requirements matching
13:20:55 <russellb> if you want
13:21:42 <b3nt_pin> russellb, is there a good example of this in use that you would recommend?
13:21:43 <rkukura> If the tenant needs to ensure the VM has the correct driver, I don’t think we should be obfuscating how the tenant specifies/sees the vNIC type too much
13:22:08 <russellb> rkukura: that's a fair point
13:22:56 <sadasu> rkukura: the same driver generally supports both vnic_types
13:24:05 <heyongli> the driver should depend directly on the device in use, not on the vnic type, I think
13:24:28 <sadasu> driver decides which host the VM can run on and the vnic_type decides the mode in which the sr-iov port can be used
13:25:03 <rkukura> Again, I’m far from an expert on this area, but does the VM need a driver specific to the SR-IOV card’s vendor/type?
13:25:20 <sadasu> vnic_type is not useful in the placement of a VM on a host but the driver info is
13:25:26 <sadasu> rkukura: correct
13:26:37 <baoli> rkukura, would an image be built with all the device drivers the cloud would support?
13:26:39 <rkukura> So vnic_type is more of a mode of operation than a device type?
13:27:03 <sadasu> rkukura: correct ...again :-)
13:27:20 <rkukura> baoli: I guess if the image is supplied by the cloud provider, that would be reasonable.
13:27:23 <sadasu> and only neutron cares about the vnic_type...nova can be blind to it
13:27:59 <sadasu> we just want to include it in a single API, so the user can launch a VM with all these specifications via one command
13:28:46 <heyongli> to get a device driver properly set up, we could set the metadata after the actual device is assigned, but I'm not sure there is a way to do that at that point for a tenant image.
13:29:10 <sadasu> rkukura: by image do you mean kernel driver for the sr-iov ports?
13:29:11 <b3nt_pin> sadasu, rkukura: to clarify, are we saying that a VM needs a device specific driver? If yes, does it still need that if the vnic type is macvtap (doesn't sound right)?
13:29:32 <heyongli> and this is a common issue for all PCI passthrough, not only for SR-IOV
13:29:38 <rkukura> b3nt_pin: I don’t know the answers
13:30:09 <b3nt_pin> rkukura, :) okay... I'll poke around and see if I can find out what the deal with libvirt is at least.
13:30:12 <sadasu> b3nt_pin: host needs a device specific driver
13:30:31 <b3nt_pin> sadasu, host, but not VM, right?
13:30:45 <rkukura> sadasu: Yes, I meant the image that is booted, with its kernel, drivers, …
13:30:52 <sadasu> b3nt_pin: correct
13:31:03 <b3nt_pin> sadasu, aahhh okay... that makes sense :)
13:32:13 <b3nt_pin> sadasu, I'm curious how VFs are presented to a VM but I suppose that is a virtualization detail.
13:32:18 <sadasu> rkukura: at least with the PCI passthrough devices that I have been exposed to so far...the driver resides in the kernel of the host OS and not the guest OS
13:32:34 <sadasu> not sure if any devices out there break that model
13:33:17 <b3nt_pin> sadasu, I'm happy to "hear" that as it is consistent with my admittedly "academic" view (awaiting hardware to get hands-on)
13:33:29 <sadasu> but I am guessing there will definitely be a host OS level driver...there might be an additional guest OS level driver
13:33:33 <heyongli> sadasu, any device? I don't think so; for some devices a driver must also be in the guest OS
13:33:50 <b3nt_pin> heyongli, yeah.. that would make sense, especially for non-networking SR-IOV
13:34:02 <sadasu> heyongli: yes...mentioned that..
13:34:11 * b3nt_pin wonders if anybody has a crypto board
13:34:31 <sadasu> the point I am trying to make is that ..there is a host kernel dependency also...
13:34:34 <heyongli> b3nt_pin, i had one
13:34:59 <heyongli> sadasu, seems like a deployment problem
13:35:08 <russellb> i think even if we only pulled off sr-iov NICs for Juno, that'd be a huge step forward :)
13:35:09 <b3nt_pin> sadasu, yeah.. so this has to be presented as part of the info, basically. Whether configured or discovered.
13:35:10 <sadasu> if the dependency is only on the VM image, then I think it is pretty simple because the cloud provider can provide the correct image ...end of story
13:35:18 <b3nt_pin> russellb, +100
13:35:26 <baoli> I think that we are talking about a different issue: how to select a host that meets the requirement that comes from a tenant built image.
13:35:27 <russellb> "only"
* b3nt_pin didn't mean to "dilute"
13:36:10 <russellb> nah i said that
13:36:40 <rkukura> My concern was purely with the tenant VM having the correct driver for the virtual device that shows up in the VM, not with the hosts’ drivers.
13:37:45 <heyongli> rkukura, the metadata service might be an answer for this; correct me if I'm wrong
13:37:51 <sadasu> rkukura: in that case, the tenant has to supply nova with the correct image id, correct?
13:38:08 <baoli> Say the tenant built image supports mlnx driver only.
13:38:34 <heyongli> baoli, then we need vendor information here, in the flavor
13:39:52 <baoli> How do we extract the requirement from the image?
13:39:58 <heyongli> sadasu, for any image, providing some information to the VM is a possible solution
13:40:11 <sadasu> baoli: thought we covered that...
13:40:38 <heyongli> baoli, this is another story; I am trying to make it possible for an image to carry PCI device information, via specs or something flavor-like
13:40:51 <sadasu> heyongli: yes, that's where the vendor_id and product_id come into play
13:41:16 <baoli> sadasu, can you enlighten me again?
13:41:27 <sadasu> baoli: :-)
13:41:54 <sadasu> you are the expert..
13:42:11 <sadasu> ok...so I think this is where we are:
13:42:14 <b3nt_pin> can we not simply assign that info to the image via metadata?
13:42:39 <sadasu> a PCI device needs: 1. a driver in the host OS, 2. a driver in the guest OS, or 3. both
13:43:00 <heyongli> b3nt_pin, we must have a facility to extract that from the image.
13:43:04 <sadasu> everyone on the same page so far?
13:44:14 <heyongli> sure, the host driver is deployment scope I think; the guest image is the problem. We should probably split that out into another topic, since it's common to all PCI devices. Correct me if I'm wrong.
13:44:23 <sadasu> specifying vendor_id and product_id will help nova place a VM of case 1 on the correct host
13:45:16 <baoli> b3nt_pin: do we have an API to associate the meta-data with the image, which then can be used for host scheduling?
13:45:26 <sadasu> based on the driver inside the VM image, once again giving vendor_id and product_id to nova should let us place it on a host having the correct HW
13:45:37 <heyongli> sadasu, I don't think the VM needs this in a case 1 request; where am I wrong?
13:45:38 <sadasu> case 3 is the same as above
13:46:06 <b3nt_pin> baoli, http://docs.openstack.org/grizzly/openstack-compute/admin/content/image-metadata.html
13:46:30 <baoli> b3nt_pin, thanks
13:46:48 <sadasu> for case 1, we will use vendor_id and product_id just to determine if PCI passthrough devices exist
13:47:25 <sadasu> b3nt_pin: thanks will take a look
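For reference, the image metadata mechanism in the link above attaches arbitrary key/value properties to a Glance image; a minimal sketch, where the property name sriov_drivers is purely hypothetical (no standard key for this existed at the time):

    # tag an image with the SR-IOV drivers it ships (hypothetical property name)
    glance image-update --property sriov_drivers=mlx4_en <image-id>

The scheduler side would still need something that compares such a property against what the chosen host or flavor offers, broadly along the lines of the existing image-property scheduler filters.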
13:47:31 <heyongli> sadasu, no, the whitelist should do this, at the deploy stage.
13:47:57 <sadasu> agreed...whitelist contains the same info..
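As a reminder of what the deploy-stage whitelist looks like, a minimal sketch for nova.conf on a compute host (vendor/product IDs are examples only, and the exact accepted syntax was still evolving at the time):

    # expose only matching devices for passthrough on this compute host
    pci_passthrough_whitelist = {"vendor_id":"15b3", "product_id":"1004"}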
13:48:29 <sadasu> it would definitely help to associate metadata with the VM instead of expecting the tenant to assign the correct image to the VM
13:48:29 <heyongli> anything related to the host should be a deployment problem, or at least deployment should be involved
13:49:03 <sadasu> yes...even at deploy you are providing vendor_id, product_id
13:51:21 <heyongli> sadasu, sure, it helps; the current image metadata cannot address the VM's driver problem unless we check it within the PCI passthrough scope
13:52:09 <b3nt_pin> I'm kind of struggling with the "extract from vm" idea... what do we mean by that?
13:53:07 <heyongli> b3nt_pin, where is the "extract from VM" idea? I'm lost here
13:53:30 * b3nt_pin goes back ...
13:53:44 <b3nt_pin> sorry.. heyongli I misinterpreted your previous statement
13:53:54 <b3nt_pin> you said, "extract from image" not from VM
13:53:58 <b3nt_pin> my bad
13:54:14 <sadasu> I am a little worried if we are making the configuration very difficult
13:54:52 <b3nt_pin> can you elaborate on which points you feel might be increasing the difficulty?
13:55:00 <heyongli> sadasu, it's not really difficult; which point do you find difficult?
13:55:17 * b3nt_pin is telepathically linked to heyongli apparently
13:55:19 <sadasu> whitelist, image meta-data, sr-iov port modes, picking the correct ml2 driver on the neutron side :-)
13:55:46 <sadasu> I get what is going on..I am looking at it from a tenant perspective
13:56:00 * b3nt_pin nods...
13:56:20 <b3nt_pin> are all of these essential for basic SR-IOV usage, or only required for finer and finer control?
13:56:20 <heyongli> all of this could be done automatically except the image metadata; that might be confusing right now, but it's not very bad
13:56:34 <heyongli> I think it's finer control
13:57:00 <b3nt_pin> yeah
13:57:26 <sadasu> b3nt_pin: hard to say what is basic SR-IOV usage anymore :-)..every device/vendor is doing things slightly differently
13:57:31 <b3nt_pin> lol
13:57:35 <heyongli> because the OS admin could say: we have all these devices, please pre-install the drivers in your image... that is a simple solution here
13:59:04 <baoli> heyongli, I was thinking about the same thing.
14:00:16 <baoli> ok, we have to wrap it up today
14:00:18 <heyongli> if we provide an image constraint, that's also a good feature
14:00:23 <sadasu> ok...so my action item is to understand the image metadata feature better to see if that better suits the VM image driver dependency issue
14:00:30 <baoli> thanks everyone
14:00:34 <b3nt_pin> cheers all!
14:00:37 <baoli> #endmeeting