13:00:04 <baoli> #startmeeting PCI passthrough
13:00:05 <openstack> Meeting started Thu Jan  9 13:00:04 2014 UTC and is due to finish in 60 minutes.  The chair is baoli. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:06 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:08 <openstack> The meeting name has been set to 'pci_passthrough'
13:00:23 <baoli> Hi everyone
13:00:27 <irenab> hi
13:00:33 <johnthetubaguy1> hi
13:00:41 <heyongli> hello
13:00:55 <baoli> John, can you lead the discussion today?
13:01:06 <johnthetubaguy> if you want
13:01:17 <johnthetubaguy> I would love to talk about this:
13:01:27 <johnthetubaguy> https://wiki.openstack.org/wiki/Meetings/Passthrough#The_user_view_of_requesting_things
13:01:40 <johnthetubaguy> what would the nova CLI calls look like
13:01:56 <johnthetubaguy> when requesting gpu passthrough first
13:03:00 <johnthetubaguy> anyone want to suggest a proposal?
13:03:08 <irenab> john, I guess it requires flavor creation with extra_spec for GPU device and then regular 'nova boot'
13:03:26 <johnthetubaguy> +1 on that I think
13:03:43 <heyongli> current is: nova flavor-key  m1.large set  "pci_passthrough:alias"="a1:2"
13:03:57 <heyongli> nova boot --image new1 --key_name test --flavor m1.large 123
13:03:58 <johnthetubaguy> right
13:04:20 <johnthetubaguy> I will add that in the wiki
13:04:21 <irenab> heyongli: and its already supported, right?
13:04:31 <heyongli> yes
13:05:18 <johnthetubaguy> so there is a limitation there
13:05:25 <johnthetubaguy> you only get one PCI passthrough device
13:05:39 <johnthetubaguy> do we care about that for GPU etc? I think the answer is yes
13:06:12 <irenab> john: you can request number of devices
13:06:33 <johnthetubaguy> irenab: how?
13:06:41 <johnthetubaguy> oh, I see
13:06:45 <irenab> a1:2 => 2 devices
13:06:53 <heyongli> a1:2, a2:3
13:06:54 <johnthetubaguy> but there are all from the same alias
13:06:56 <heyongli> is also ok
13:07:04 <johnthetubaguy> ah, so thats a better example
13:07:20 <johnthetubaguy> we support this today then: a1:2, a2:3
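The alias request syntax quoted above is a comma-separated list of `<alias-name>:<count>` pairs. As a rough illustration (not nova's actual parser), such a spec string could be handled like this:

```python
def parse_alias_spec(spec):
    """Parse a flavor extra-spec value like "a1:2, a2:3" into
    a list of (alias_name, requested_count) pairs.
    Illustrative sketch only, not nova's implementation."""
    requests = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, count = item.partition(":")
        # In this sketch, a missing count defaults to one device.
        requests.append((name.strip(), int(count) if count else 1))
    return requests

print(parse_alias_spec("a1:2, a2:3"))  # [('a1', 2), ('a2', 3)]
```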
13:07:22 <heyongli> and the alias supports a mixed spec: two types of device
13:07:25 <irenab> I think you can add another alias too
13:08:07 <heyongli> you can define an alias:
13:08:17 <irenab> my feeling is the GPU case is quite solved and we just need to keep it working when adding the networking case, agree?
13:08:22 <heyongli> a1={type1}
13:08:30 <heyongli> then write the same name again: a1={type2}
13:08:54 <heyongli> then when you request a1:2, it means both type1 and type2 are ok
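A sketch of the behaviour heyongli describes: defining the same alias name twice accumulates device specs, and a device matching any one of them satisfies the alias. The vendor/product ids below are hypothetical placeholders, and this is an illustration rather than nova's actual matching code.

```python
# Accumulated alias definitions: name -> list of device spec dicts.
aliases = {}

def define_alias(name, spec):
    """Defining the same alias name again appends another spec."""
    aliases.setdefault(name, []).append(spec)

def device_matches(alias_name, device):
    """A device satisfies the alias if it matches ANY of its specs."""
    return any(all(device.get(k) == v for k, v in spec.items())
               for spec in aliases[alias_name])

define_alias("a1", {"vendor_id": "1137", "product_id": "0071"})  # type1
define_alias("a1", {"vendor_id": "15b3", "product_id": "1004"})  # type2 (hypothetical ids)

print(device_matches("a1", {"vendor_id": "15b3", "product_id": "1004"}))  # True
```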
13:09:00 <johnthetubaguy> heyongli: what are the CLI commands for that, I am a bit confused
13:09:20 <heyongli> in nova configuration now.
13:09:25 <johnthetubaguy> I added the GPU case here:
13:09:25 <johnthetubaguy> https://wiki.openstack.org/wiki/Meetings/Passthrough#The_user_view_of_requesting_things
13:09:29 <heyongli> no API yet
13:09:32 <johnthetubaguy> I agree we just need to keep it working
13:09:52 <johnthetubaguy> heyongli: I am talking about the flavor, and what we already have today
13:10:15 <sadasu> heyongli: cli is confusing ... "x:y"="a:b" would be interpreted as x = a and y = b which is not the case in your CLI
13:11:02 <heyongli> sadasu: where you see x:y = a:b?
13:11:10 <johnthetubaguy> hang on, hang on
13:11:13 <johnthetubaguy> is this valid today
13:11:14 <johnthetubaguy> nova flavor-key  m1.large set  "pci_passthrough:alias"=" large_GPU:1,small_GPU:1"
13:11:14 <sadasu> "pci_passthrough:alias"="a1:2"
13:11:37 <baoli> pci_alias='{"name":"Cisco.VIC","vendor_id":"1137","product_id":"0071"}'
13:11:44 <baoli> this is how it's defined today
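The `pci_alias` value baoli quotes is a JSON document set in nova.conf; a minimal sketch of reading it back (illustrative only):

```python
import json

# The alias definition exactly as quoted in the discussion above.
pci_alias = '{"name":"Cisco.VIC","vendor_id":"1137","product_id":"0071"}'

alias = json.loads(pci_alias)
# The name is what flavor extra specs reference; the vendor/product
# ids are what select matching PCI devices on the host.
print(alias["name"], alias["vendor_id"], alias["product_id"])
```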
13:11:49 <heyongli> sadasu: this is another problem; john: right, it works today
13:11:56 <johnthetubaguy> I don't mind about the alias
13:12:08 <johnthetubaguy> I am trying to ask how the flavor extra specs work today
13:12:11 <johnthetubaguy> is this valid?
13:12:12 <johnthetubaguy> nova flavor-key  m1.large set  "pci_passthrough:alias"=" large_GPU:1,small_GPU:1"
13:12:28 <heyongli> johnthetubaguy, it works
13:12:32 <johnthetubaguy> cool
13:12:47 <johnthetubaguy> so, we have the user request for vGPU
13:12:48 <sadasu> I am sure it works...you can make it work...I am just suggesting that it is not very self explanatory
13:12:48 <johnthetubaguy> now...
13:12:56 <johnthetubaguy> SRIOV
13:13:30 <johnthetubaguy> sadasu: it could be better, it could be worse, but I vote we try not to worry about that right now
13:13:49 <sadasu> ok...get it...lets move to SRIOV
13:13:58 <heyongli> sadasu: the pci_passthrough:alias has to be this style because of scheduler history reasons.
13:14:36 <johnthetubaguy> so first, nova boot direct
13:14:38 <irenab> SRIOV + neutron, ok?
13:14:46 <johnthetubaguy> yep
13:15:31 <irenab> john: suggestion is to add attributes to --nic
13:15:45 <johnthetubaguy> yep, can we give an example
13:15:54 <johnthetubaguy> I am trying to type one up and not liking any of them
13:15:57 <baoli> Can we go over the jan 8th agenda I posted yesterday?
13:16:14 <baoli> It contains all the details we have been working on so far
13:16:39 <johnthetubaguy> OK, we can reference it for sure, I just would love to agree on this user end bit first
13:16:46 <irenab> baoli: let's go to the last use case
13:17:11 <irenab> john: do you agree with nova boot format?
13:17:29 <johnthetubaguy> I can't easily see an example in that text
13:17:36 <johnthetubaguy> oh wait
13:17:37 <johnthetubaguy> sorry
13:17:39 <baoli> nova boot --flavor m1.large --image <image_id>                     --nic net-id=<net-id>,vnic-type=macvtap,pci-group=<group-name> <vm-name>
13:17:39 <johnthetubaguy> I am blind
13:17:46 <johnthetubaguy> macvtap?
13:17:57 <johnthetubaguy> vs direct vs vnic
13:17:59 <irenab> or direct or virtio
13:18:02 <johnthetubaguy> why do we need that?
13:18:14 <johnthetubaguy> I mean, why do we have three here?
13:18:32 <baoli> For SRIOV, there is both macvtap and direct
13:18:54 <johnthetubaguy> OK, is that not implied by the device type and vif driver config?
13:19:02 <baoli> With macvtap, it's still pci passthrough but a host macvtap device is involved
13:19:41 <baoli> Well, the device type and vif driver can support both at the same time on the same device
13:20:06 <johnthetubaguy> hmm, OK
13:20:14 <johnthetubaguy> macvtap doesn't look like passthrough
13:20:23 <johnthetubaguy> it looks like an alternative type of vnic
13:20:31 <baoli> John, it's one form of PCI passthrough
13:20:36 <irenab> john, the idea is to work with the neutron ML2 plugin that will enable different types of vnics
13:20:37 <baoli> I mean one type
13:21:16 <johnthetubaguy> OK...
13:21:26 <johnthetubaguy> does the PCI device get attached to the VM?
13:21:39 <baoli> John, yes.
13:21:46 <johnthetubaguy> hmm, OK
13:21:47 <irenab> john: both macvtap and direct are network interfaces on PCI device
13:21:58 <johnthetubaguy> OK
13:22:11 <johnthetubaguy> seems like we need that then
13:22:15 <irenab> direct requires a vendor driver in the VM and macvtap doesn't
13:22:36 <baoli> irenab, I think it's the opposite
13:22:50 <baoli> Irenab, sorry, you are right
13:22:52 <johnthetubaguy> so, as a user, I don't want to type all this stuff in, let me suggest something...
13:23:10 <johnthetubaguy> the user wants a nic-flavor right?
13:23:27 <johnthetubaguy> defaults to whatever makes sense in your cloud setup
13:23:33 <baoli> John, we have a special case in which the user doesn't need to type it
13:23:43 <irenab> john: exactly
13:23:44 <johnthetubaguy> but if there are options, the user picks "slow" or "fast" or something like that
13:24:04 <johnthetubaguy> so I would expect to see...
13:24:49 <johnthetubaguy> nova boot --flavor m1.large --image <image_id>
13:24:49 <johnthetubaguy> --nic net-id=<net-id>,vnic-flavor=<slow | fast | foobar> <vm-name>
13:25:16 <johnthetubaguy> vnic-type is probably better than flavor I guess
13:25:33 <baoli> John, we don't want to add QoS to this yet, which is a separate effort
13:25:47 <johnthetubaguy> nova boot --flavor m1.large --image <image_id>
13:25:47 <johnthetubaguy> --nic net-id=<net-id>,vnic-type=<slow | fast | foobar> <vm-name>
13:26:00 <baoli> But I guess that you can do that
13:26:02 <johnthetubaguy> this isn't QoS...
13:26:09 <johnthetubaguy> slow = virtual
13:26:14 <johnthetubaguy> fast = PCI passthrough
13:26:25 <heyongli> does this mean vnic-type contains vnic-type=macvtap in it?
13:26:29 <irenab> john: agree on this
13:27:04 <johnthetubaguy> heyongli, the concept represented by vnic-type would include such settings, yes
13:27:16 <johnthetubaguy> so do we all agree on this:
13:27:17 <johnthetubaguy> nova boot --flavor m1.large --image <image_id>
13:27:17 <johnthetubaguy> --nic net-id=<net-id>,vnic-type=<slow | fast | foobar> <vm-name>
13:27:34 <heyongli> i'm ok with it.
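For illustration, the `--nic` argument agreed above is a comma-separated list of `key=value` pairs; a hypothetical parser (not novaclient's real one) might look like:

```python
def parse_nic_arg(arg):
    """Parse a --nic value like "net-id=1234,vnic-type=fast" into a
    dict. Hypothetical sketch, not the actual novaclient parser."""
    return dict(pair.split("=", 1) for pair in arg.split(","))

# The user-facing form agreed in the meeting: a logical vnic-type
# name ("slow", "fast", ...) rather than implementation details.
print(parse_nic_arg("net-id=1234,vnic-type=fast"))
```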
13:27:44 <irenab> john: missing here the 'pointer' to the pool of  PCI devices
13:28:03 <baoli> John, how do you define vnic-type?
13:28:16 <johnthetubaguy> well, thats the question
13:28:24 <johnthetubaguy> vnic-type is the user concept
13:28:32 <johnthetubaguy> we need to map that to concrete settings
13:28:49 <johnthetubaguy> but before we get there, are we OK with the theory of that user facing command?
13:29:00 <baoli> our original idea was to define a type of vnic that a user would attach its VM to
13:29:29 <johnthetubaguy> right, thats what I am suggesting here I think...
13:30:09 <baoli> Can we classify the VNICs to have types of virtio, pci-passthrough without macvtap, pci-passthrough with macvtap
13:30:33 <johnthetubaguy> the user doesn't care about all that, thats an admin thing, I think
13:30:50 <johnthetubaguy> the user cares about the offerings, not the implementation
13:31:03 <johnthetubaguy> at least, thats our general assumption in the current APIs
13:31:20 <irenab> john: I guess the user will be charged differently depending on what vnic he has, so probably he should be aware
13:31:45 <irenab> but logically it should have names meaningful to the user and not technical
13:31:55 <johnthetubaguy> exactly
13:32:05 <baoli> #agreed
13:32:09 <johnthetubaguy> logical names; the user chooses which one, but they only care about the logical name
13:32:34 <johnthetubaguy> cool… I don't really care what that is, but this works for now I think...
13:32:44 <johnthetubaguy> boot --flavor m1.large --image <image_id> --nic net-id=<net-id>,nic-type=<slow | fast | foobar> <vm-name>
13:33:04 <johnthetubaguy> I removed the "v" bit, it seems out of place, but we can have that argument later
13:33:22 <irenab> still missing here: the binding to the PCI devices that are allowed for this nic
13:33:27 <baoli> #agreed
13:33:55 <johnthetubaguy> irenab: yes, lets do that in a second
13:33:59 <johnthetubaguy> now one more question...
13:34:08 <johnthetubaguy> if we have the above, I think we also need this...
13:34:26 <johnthetubaguy> nova boot --flavor m1.large --image <image_id> --nic port-id=<port-id>
13:34:36 <johnthetubaguy> i.e. all the port settings come from neutron
13:34:42 <johnthetubaguy> which means...
13:34:45 <baoli> John, yes.
13:34:48 <johnthetubaguy> we probably need this
13:34:55 <johnthetubaguy> quantum port-create --fixed-ip subnet_id=<subnet-id>,ip_address=192.168.57.101 <net-id> --nic-type=<slow | fast | foobar>
13:34:57 <baoli> we had it described in our doc
13:35:07 <irenab> john: yes, it will be added with the same nic-type attribute
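Assuming the nic-type attribute is simply carried alongside the usual port fields, the port-create request body implied above might look like this (hypothetical sketch; the attribute name was not finalized in the meeting):

```python
# Hypothetical neutron port-create body matching the CLI example
# above; "nic-type" is the proposed new attribute, not an existing
# neutron field.
port_request = {
    "port": {
        "network_id": "<net-id>",
        "fixed_ips": [{"subnet_id": "<subnet-id>",
                       "ip_address": "192.168.57.101"}],
        "nic-type": "fast",
    }
}

print(port_request["port"]["nic-type"])
```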
13:35:14 <johnthetubaguy> cool, apologies for repeating the obvious
13:35:20 <johnthetubaguy> just want to get agreement
13:35:33 <irenab> agree
13:35:40 <heyongli> agree
13:35:41 <johnthetubaguy> cool, so does this look correct:
13:35:41 <johnthetubaguy> https://wiki.openstack.org/wiki/Meetings/Passthrough#The_user_view_of_requesting_things
13:36:00 <irenab> john: its neutron :)
13:36:22 <johnthetubaguy> agree, just want to make sure
13:36:40 <johnthetubaguy> I have little knowledge of neutron these days, but that seems to make sense
13:36:45 <johnthetubaguy> cool
13:37:01 <baoli> overall, it looks good
13:37:07 <johnthetubaguy> so how do we get a mapping from nic-type to macvtap and pci devices
13:37:17 <johnthetubaguy> I vote macvtap goes into the alias
13:37:21 <johnthetubaguy> is that crazy?
13:37:25 <heyongli> +1
13:37:49 <irenab> john: do you suggest it to work with flavor?
13:38:23 <johnthetubaguy> irenab: not really, at least I don't think so
13:38:34 <johnthetubaguy> irenab: sounds like info that the VIF driver needs
13:38:39 <irenab> so what do you mean by goes into alias?
13:38:52 <johnthetubaguy> a good question...
13:39:22 <heyongli> the alias maps the nic-type to the information needed.
13:39:41 <heyongli> this means the vnic type is one kind of alias
13:39:57 <johnthetubaguy> pci_alias='{"name":"Cisco.VIC","vendor_id":"1137","product_id":"0071", "nic-type":"fast", "attach-type":"macvtap"}'
13:40:08 <heyongli> +1
13:40:22 <johnthetubaguy> with nic-type and attach-type as optional, its not that sexy, but could work I think...
13:40:23 <irenab> not good...
13:40:31 <baoli> no good
13:40:40 <irenab> this definition makes it static
13:40:52 <johnthetubaguy> pci_alias_2='{"name":"Cisco.VIC.Fast","vendor_id":"1137","product_id":"0071", "nic-type":"faster", "attach-type":"direct"}'
13:40:57 <heyongli> what do you mean static?
13:41:06 <johnthetubaguy> user chooses "fast" or "faster"
13:41:36 <irenab> it comes back to my previous question regarding the pool of available PCI devices for the vnic
13:41:37 <johnthetubaguy> does that work?
13:41:48 <irenab> seems you define all of this as part of the alias
13:41:57 <johnthetubaguy> at the moment, yes
13:42:08 <baoli> John, if we have done the user's point of view, can we go over the original post?
13:42:41 <johnthetubaguy> baoli: sure, that probably makes sense
13:43:00 <baoli> Thanks, john
13:43:32 <irenab> If we want a VM to be connected to 3 networks: one via SRIOV direct, one with SRIOV macvtap and one with virtio, how will it be done?
13:44:20 <johnthetubaguy> this is now
13:44:27 <johnthetubaguy> one sec...
13:44:28 <baoli> Thanks, Irenab, for bringing that up
13:45:09 <johnthetubaguy> nova boot --flavor m1.large --image <image_id> --nic net-id=<net-id>,nic-type=fast --nic net-id=<net-id>,nic-type=faster  <vm-name>
13:45:15 <johnthetubaguy> pci_alias='{"name":"Cisco.VIC", devices:[{"vendor_id":"1137","product_id":"0071", address:"*"}],"nic-type":"fast", "attach-type":"macvtap"}'
13:45:16 <johnthetubaguy> pci_alias_2='{"name":"Cisco.VIC.Fast",devices:[{"vendor_id":"1137","product_id":"0071", address:"*"}],"nic-type":"faster", "attach-type":"direct"}'
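A sketch of how a user's nic-type could be resolved against the two alias definitions pasted above (with all keys quoted so the strings are valid JSON; this is an illustration, not nova's implementation):

```python
import json

# The two alias definitions from the discussion, keys fully quoted.
pci_alias = '{"name":"Cisco.VIC","devices":[{"vendor_id":"1137","product_id":"0071","address":"*"}],"nic-type":"fast","attach-type":"macvtap"}'
pci_alias_2 = '{"name":"Cisco.VIC.Fast","devices":[{"vendor_id":"1137","product_id":"0071","address":"*"}],"nic-type":"faster","attach-type":"direct"}'

aliases = [json.loads(a) for a in (pci_alias, pci_alias_2)]

def resolve_nic_type(nic_type):
    """Find the alias whose nic-type matches the user's --nic request."""
    return next(a for a in aliases if a["nic-type"] == nic_type)

print(resolve_nic_type("faster")["attach-type"])  # direct
```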
13:45:37 <heyongli> john, agree
13:45:55 <johnthetubaguy> hang on, we missed regular...
13:46:28 <johnthetubaguy> nova boot --flavor m1.large --image <image_id> —nic net-id=<net-id-1> —nic net-id=<net-id-2>,nic-type=fast --nic net-id=<net-id-3>,nic-type=faster  <vm-name>
13:47:11 <irenab> john: we need the devices that provide network connectivity, how is that going to happen?
13:47:33 <johnthetubaguy> right, we haven't covered how we implement it
13:47:38 <johnthetubaguy> just how the user requests it
13:48:14 <johnthetubaguy> irenab: is that OK for the user request?
13:48:17 <heyongli> this will work smoothly for pci, in my opinion, and the connectivity can also be a spec of the alias
13:48:32 <irenab> john: if a user has both cisco and mellanox nics, he will have to define cisco_fast and mellanox_fast ...
13:48:42 <johnthetubaguy> irenab: correct
13:48:46 <johnthetubaguy> unless
13:48:54 <johnthetubaguy> you want them to share...
13:48:55 <baoli> John, when you say user, you mean the final user or someone providing the service
13:49:23 <johnthetubaguy> well we are only really doing the end user at the moment
13:49:37 <johnthetubaguy> but we should do the PCI alias stuff for the deployer
13:49:49 <johnthetubaguy> hmm...
13:50:53 <johnthetubaguy> pci_alias_2='{"name":"Fast",devices:[{"vendor_id":"1137","product_id":"0071", address:"*","attach-type":"direct"}, {"vendor_id":"123","product_id":"0081", address:"*","attach-type":"macvtap"}],"nic-type":"faster"}'
13:50:58 <johnthetubaguy> does that work better?
13:51:09 <irenab> I am not a fan of these alias definitions, but I am waiting to see how we resolve the network connectivity before agreeing
13:51:18 <heyongli> john: just defining pci_alias_2 two times works now
13:51:27 <johnthetubaguy> irenab: what do you mean: "resolve network connectivity"?
13:51:40 <baoli> John, we are trying to avoid alias on the controller node
13:51:49 <baoli> first of all
13:51:55 <sadasu> john: the cisco vic and the mellanox vic could be connected to diff networks
13:52:10 <sadasu> so they cannot be part of the same alias
13:52:21 <johnthetubaguy> right, thats OK still though, I think
13:52:28 <baoli> I don't think anyone using pci cares about vendor id, whatsoever
13:52:45 <sadasu> then both of them would seem equivalent at the time of nova boot
13:52:46 <johnthetubaguy> baoli: that is a deployer option
13:52:47 <irenab> how do we make the VM land on a node with PCI devices connected to the correct provider-network, and have the correct PCI device allocated
13:53:12 <johnthetubaguy> irenab: OK, thats the scheduling issue then?
13:53:51 <irenab> john: agree, but I think that input comes from the nova boot command
13:54:18 <irenab> either in flavor or --nic
13:54:25 <johnthetubaguy> right, so let me ramble on about how I see this working… I know its not ideal
13:54:39 <johnthetubaguy> so, user makes request
13:54:54 <johnthetubaguy> nova looks for required pci devices
13:55:10 <johnthetubaguy> flavor extra specs or network_info might have them
13:55:15 <irenab> john: missing scheduler
13:55:18 <johnthetubaguy> we get a list of required alias
13:55:55 <johnthetubaguy> the scheduler can look at the required alias
13:56:09 <johnthetubaguy> and filters out hosts that can't meet the requests for all the requested devices
13:56:25 <johnthetubaguy> i.e. thats some scheduler filter kicking in
13:56:33 <johnthetubaguy> when the requests gets to the compute node
13:56:41 <johnthetubaguy> talks to resource manager to claim the resource as normal
13:57:19 <johnthetubaguy> compute node sends updates to the scheduler on what devices are available (we can sort out format later)
13:57:29 <johnthetubaguy> when VM is setup...
13:57:32 <johnthetubaguy> create domain
13:57:41 <johnthetubaguy> add any requested devices in the flavor extra specs
13:57:44 <johnthetubaguy> when plugging vifs
13:57:54 <johnthetubaguy> vif driver gets extra PCI device info
13:58:27 <johnthetubaguy> it also gets some lib that points back to nova driver specific ways of plugging PCI devices, and does what it wants to do
13:58:29 <johnthetubaguy> maybe
13:58:40 <johnthetubaguy> anyways, that was my general thinking
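The scheduling step in the walkthrough above, filtering out hosts that cannot satisfy every requested (alias, count) pair, can be sketched as follows (illustrative only; the real nova filter and resource tracker are more involved):

```python
def host_passes(host_pci_counts, requests):
    """host_pci_counts: {alias_name: free_device_count} as reported by
    a compute node; requests: [(alias_name, count), ...] derived from
    flavor extra specs / network info. Illustrative filter logic."""
    return all(host_pci_counts.get(name, 0) >= count
               for name, count in requests)

# Hypothetical host inventory and request.
hosts = {
    "node1": {"fast": 2, "faster": 0},
    "node2": {"fast": 4, "faster": 1},
}
requests = [("fast", 1), ("faster", 1)]

# Only hosts that can meet ALL requested devices survive the filter.
eligible = [h for h, counts in hosts.items() if host_passes(counts, requests)]
print(eligible)  # ['node2']
```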
13:59:13 <heyongli> john: currently pci works in this way, almost
13:59:25 <irenab> john: I have to admit I do not see how networking part is resolved ...
13:59:37 <baoli> Hi John, I think that's how it works today. But we need to resolve network connectivity issue as Irenab has pointed out. We need a PCI device that connects to a physical net
13:59:45 <johnthetubaguy> well VIF driver gets its config from neutron in regular way
14:00:08 <johnthetubaguy> combined with PCI alias info from nova
14:00:13 <johnthetubaguy> it should be able to do what it needs to do
14:00:19 <johnthetubaguy> at least thats my suggestion
14:00:39 <heyongli> john: yep, this can also resolve the connectivity problem
14:01:00 <johnthetubaguy> anyways
14:01:05 <johnthetubaguy> its probably the nova meeting on here now
14:01:16 <baoli> Time is up. Do you guys want to end the meeting soon?
14:01:20 <irenab> john: PCI alias does not put any info regarding connectivity according to what you defined previously
14:01:36 <heyongli> irenab: we can extend it
14:01:41 <johnthetubaguy> irenab: agreed, that has to come from neutron, in my model
14:01:42 <shanewang> yes, nova meeting
14:01:46 <johnthetubaguy> hyongli: its the nova meeting
14:01:48 <irenab> there can be several mellanox NICs but only one connecting to the physical network
14:01:54 <baoli> #endmeeting