13:00:04 #startmeeting PCI passthrough
13:00:05 Meeting started Thu Jan 9 13:00:04 2014 UTC and is due to finish in 60 minutes. The chair is baoli. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:06 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:08 The meeting name has been set to 'pci_passthrough'
13:00:23 Hi everyone
13:00:27 hi
13:00:33 hi
13:00:41 hello
13:00:55 John, can you lead the discussion today?
13:01:06 if you want
13:01:17 I would love to talk about this:
13:01:27 https://wiki.openstack.org/wiki/Meetings/Passthrough#The_user_view_of_requesting_things
13:01:40 what would the nova CLI calls look like
13:01:56 when requesting gpu passthrough first
13:03:00 anyone want to suggest a proposal?
13:03:08 john, I guess it requires flavor creation with an extra_spec for the GPU device and then a regular 'nova boot'
13:03:26 +1 on that I think
13:03:43 current is: nova flavor-key m1.large set "pci_passthrough:alias"="a1:2"
13:03:57 nova boot --image new1 --key_name test --flavor m1.large 123
13:03:58 right
13:04:20 I will add that in the wiki
13:04:21 heyongli: and it's already supported, right?
13:04:31 yes
13:05:18 so there is a limitation there
13:05:25 you only get one PCI passthrough device
13:05:39 do we care about that for GPU etc, I think the answer is yet
13:05:42 I mean yes
13:06:12 john: you can request a number of devices
13:06:33 irenab: how?
13:06:41 oh, I see
13:06:45 a1:2 => 2 devices
13:06:53 a1:2, a2:3
13:06:54 but they are all from the same alias
13:06:56 is also ok
13:07:04 ah, so that's a better example
13:07:20 we support this today then: a1:2, a2:3
13:07:22 and an alias supports a mixed spec: 2 types of device
13:07:25 I think you can add another alias too
13:08:07 you can define an alias:
13:08:17 my feeling is the GPU case is quite solved and we just need to keep it working when adding the networking case, agree?
13:08:22 a1={type1}
13:08:30 then write the same again: a1={type2}
13:08:54 then when you request a1:2, it means both type1 and type2 are ok
13:09:00 heyongli: what are the CLI commands for that, I am a bit confused
13:09:20 in nova configuration now.
13:09:25 I added the GPU case here:
13:09:25 https://wiki.openstack.org/wiki/Meetings/Passthrough#The_user_view_of_requesting_things
13:09:29 no API yet
13:09:32 I agree we just need to keep it working
13:09:52 heyongli: I am talking about the flavor, and what we already have today
13:10:15 heyongli: the cli is confusing ... "x:y"="a:b" would be interpreted as x = a and y = b which is not the case in your CLI
13:11:02 sadasu: where do you see x:y = a:b?
13:11:10 hang on, hang on
13:11:13 is this valid today
13:11:14 nova flavor-key m1.large set "pci_passthrough:alias"="large_GPU:1,small_GPU:1"
13:11:14 "pci_passthrough:alias"="a1:2"
13:11:37 pci_alias='{"name":"Cisco.VIC","vendor_id":"1137","product_id":"0071"}'
13:11:44 this is how it's defined today
13:11:49 sadasu: this is another problem. john: right, it works today
13:11:56 I don't mind about the alias
13:12:08 I am trying to ask how the flavor extra specs work today
13:12:11 is this valid?
13:12:12 nova flavor-key m1.large set "pci_passthrough:alias"="large_GPU:1,small_GPU:1"
13:12:28 johnthetubaguy, it works
13:12:32 cool
13:12:47 so, we have the user request for vGPU
13:12:48 I am sure it works... you can make it work... I am just suggesting that it is not very self-explanatory
13:12:48 now...
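For reference, the existing alias-based GPU flow discussed above fits together roughly like this. This is only a sketch: the alias name and the vendor/product IDs are illustrative, and the config format mirrors the pci_alias example quoted during the meeting.

    # nova.conf on the compute host (deployer side); one pci_alias entry is
    # needed per alias name used in flavors, IDs below are illustrative
    pci_passthrough_whitelist='{"vendor_id":"10de","product_id":"11b4"}'
    pci_alias='{"name":"large_GPU","vendor_id":"10de","product_id":"11b4"}'

    # flavor extra spec requesting the device(s), then a regular boot
    nova flavor-key m1.large set "pci_passthrough:alias"="large_GPU:1"
    nova boot --image new1 --key_name test --flavor m1.large 123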
13:12:56 SRIOV
13:13:30 sadasu: it could be better, it could be worse, but I vote we try not to worry about that right now
13:13:49 ok... get it... let's move to SRIOV
13:13:58 sadasu: the pci_passthrough:alias should be this style because of scheduler history reasons.
13:14:36 so first, nova boot direct
13:14:38 SRIOV + neutron, ok?
13:14:46 yep
13:15:31 john: the suggestion is to add attributes to --nic
13:15:45 yep, can we give an example
13:15:54 I am trying to type one up and not liking any of them
13:15:57 Can we go over the jan 8th agenda I posted yesterday?
13:16:14 It contains all the details we have been working on so far
13:16:39 OK, we can reference it for sure, I just would love to agree on this user end bit first
13:16:46 baoli: let's go to the last use case
13:17:11 john: do you agree with the nova boot format?
13:17:29 I can't easily see an example in that text
13:17:36 oh wait
13:17:37 sorry
13:17:39 nova boot --flavor m1.large --image <image> --nic net-id=<net-id>,vnic-type=macvtap,pci-group=<group>
13:17:39 I am blind
13:17:46 macvtap?
13:17:57 vs direct vs vnic
13:17:59 or direct or virtio
13:18:02 why do we need that?
13:18:14 I mean, why do we have three here?
13:18:32 For SRIOV, there is both macvtap and direct
13:18:54 OK, is that not implied by the device type and vif driver config?
13:19:02 With macvtap, it's still pci passthrough but a host macvtap device is involved
13:19:41 Well, the device type and vif driver can support both at the same time on the same device
13:20:06 hmm, OK
13:20:14 macvtap doesn't look like passthrough
13:20:23 it looks like an alternative type of vnic
13:20:31 John, it's one form of PCI passthrough
13:20:36 john, the idea is to work with the neutron ML2 plugin that will enable different types of vnics
13:20:37 I mean one type
13:21:16 OK...
13:21:26 does the PCI device get attached to the VM?
13:21:39 John, yes.
13:21:46 hmm, OK
13:21:47 john: both macvtap and direct are network interfaces on a PCI device
13:21:58 OK
13:22:11 seems like we need that then
13:22:15 direct requires a vendor driver in the VM and macvtap doesn't
13:22:36 irenab, I think it's the opposite
13:22:50 Irenab, sorry, you are right
13:22:52 so, as a user, I don't want to type all this stuff in, let me suggest something...
13:23:10 the user wants a nic-flavor right?
13:23:27 defaults to whatever makes sense in your cloud setup
13:23:33 John, we have a special case in which the user doesn't need to type it
13:23:43 john: exactly
13:23:44 but if there are options, the user picks "slow" or "fast" or something like that
13:24:04 so I would expect to see...
13:24:49 nova boot --flavor m1.large --image <image>
13:24:49 --nic net-id=<net-id>,vnic-flavor=<flavor>
13:25:16 vnic-type is probably better than flavor I guess
13:25:33 John, we don't want to add QoS to this yet, which is a separate effort
13:25:47 nova boot --flavor m1.large --image <image>
13:25:47 --nic net-id=<net-id>,vnic-type=<type>
13:26:00 But I guess that you can do that
13:26:02 this isn't QoS...
13:26:09 slow = virtual
13:26:14 fast = PCI passthrough
13:26:25 does this mean vnic-type contains vnic-type=macvtap in it?
13:26:29 john: agree on this
13:27:04 heyongli, the concept represented by vnic-type would include such settings, yes
13:27:16 so do we all agree on this:
13:27:17 nova boot --flavor m1.large --image <image>
13:27:17 --nic net-id=<net-id>,vnic-type=<type>
13:27:34 i'm ok with it.
13:27:44 john: missing here is the 'pointer' to the pool of PCI devices
13:28:03 John, how do you define vnic-type?
13:28:16 well, that's the question
13:28:24 vnic-type is the user concept
13:28:32 we need to map that to concrete settings
13:28:49 but before we get there, are we OK with the theory of that user facing command?
13:29:00 our original idea was to define a type of vnic that a user would attach its VM to
13:29:29 right, that's what I am suggesting here I think...
13:30:09 Can we classify the VNICs to have types of virtio, pci-passthrough without macvtap, pci-passthrough with macvta[
13:30:15 sorry, macvtap
13:30:33 the user doesn't care about all that, that's an admin thing, I think
13:30:50 the user cares about the offerings, not the implementation
13:31:03 at least, that's our general assumption in the current APIs
13:31:20 john: I guess the user will be charged differently depending on what vnic he has, so probably he should be aware
13:31:45 but logically it should have names meaningful to the user and not technical
13:31:55 exactly
13:32:05 #agreed
13:32:09 logical names, the users choose which one, but they care about the logical name
13:32:34 cool... I don't really care what that is, but this works for now I think...
13:32:44 boot --flavor m1.large --image <image> --nic net-id=<net-id>,nic-type=<type>
13:33:04 I removed the "v" bit, it seems out of place, but we can have that argument later
13:33:22 still missing here: the binding to the PCI devices that are allowed for this nic
13:33:27 #agreed
13:33:55 irenab: yes, let's do that in a second
13:33:59 now one more question...
13:34:08 if we have the above, I think we also need this...
13:34:26 nova boot --flavor m1.large --image <image> --nic port-id=<port-id>
13:34:36 i.e. all the port settings come from neutron
13:34:42 which means...
13:34:45 John, yes.
13:34:48 we probably need this
13:34:55 quantum port-create --fixed-ip subnet_id=<subnet-id>,ip_address=192.168.57.101 --nic-type=<type>
13:34:57 we had it described in our doc
13:35:07 john: yes, it will be added with the same nic-type attribute
13:35:14 cool, apologies for repeating the obvious
13:35:20 just want to get agreement
13:35:33 agree
13:35:40 agree
13:35:41 cool, so does this look correct:
13:35:41 https://wiki.openstack.org/wiki/Meetings/Passthrough#The_user_view_of_requesting_things
13:36:00 john: it's neutron :)
13:36:22 agree, just want to make sure
13:36:40 I have little knowledge of neutron these days, but that seems to make sense
13:36:45 cool
13:37:01 overall, it looks good
13:37:07 so how do we get a mapping from nic-type to macvtap and pci devices
13:37:17 I vote macvtap goes into the alias
13:37:21 is that crazy?
13:37:25 +1
13:37:49 john: do you suggest it works with the flavor?
13:38:23 irenab: not really, at least I don't think so
13:38:34 irenab: sounds like info that the VIF driver needs
13:38:39 so what do you mean by "goes into the alias"?
13:38:52 a good question...
13:39:22 the alias maps the nic-type to contain the information needed.
13:39:41 this means the vnic type is one kind of alias
13:39:57 pci_alias='{"name":"Cisco.VIC","vendor_id":"1137","product_id":"0071", "nic-type":"fast", "attach-type":"macvtap"}'
13:40:08 +1
13:40:22 with nic-type and attach-type as optional, it's not that sexy, but could work I think...
13:40:23 not good...
13:40:31 no good
13:40:40 this definition makes it static
13:40:52 pci_alias_2='{"name":"Cisco.VIC.Fast","vendor_id":"1137","product_id":"0071", "nic-type":"faster", "attach-type":"direct"}'
13:40:57 what do you mean static?
13:41:06 the user chooses "fast" or "faster"
13:41:36 it comes back to my previous question regarding the pool of available PCI devices for the vnic
13:41:37 does that work?
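Pulling the pieces proposed so far into one place, a sketch with placeholder IDs; "fast" is just an example logical name, and the nic-type/attach-type keys in the alias (and the --nic-type flag on port-create) are the proposed extensions rather than anything that exists today.

    # user-facing request: pick a network plus a logical nic-type
    nova boot --flavor m1.large --image <image> --nic net-id=<net-id>,nic-type=fast

    # or pre-create the port carrying the nic-type and boot against it
    neutron port-create <net> --fixed-ip subnet_id=<subnet-id>,ip_address=192.168.57.101 --nic-type=fast
    nova boot --flavor m1.large --image <image> --nic port-id=<port-id>

    # deployer-side alias mapping the logical nic-type to devices and an attach type
    pci_alias='{"name":"Cisco.VIC","vendor_id":"1137","product_id":"0071","nic-type":"fast","attach-type":"macvtap"}'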
13:41:48 seems you define all of this as part of the alias
13:41:57 at the moment, yes
13:42:08 John, if we have done the user's point of view, can we go over the original post?
13:42:41 baoli: sure, that probably makes sense
13:43:00 Thanks, john
13:43:32 If we want a VM to be connected to 3 networks: one via SRIOV direct, one with SRIOV macvtap and one with virtio, how will it be done?
13:44:20 this is now
13:44:27 one sec...
13:44:28 Thanks Irenab for bringing that up
13:45:09 nova boot --flavor m1.large --image <image> --nic net-id=<net1>,nic-type=fast --nic net-id=<net2>,nic-type=faster
13:45:15 pci_alias='{"name":"Cisco.VIC", devices:[{"vendor_id":"1137","product_id":"0071", address:"*"}],"nic-type":"fast", "attach-type":"macvtap"}'
13:45:16 pci_alias_2='{"name":"Cisco.VIC.Fast",devices:[{"vendor_id":"1137","product_id":"0071", address:"*"}],"nic-type":"faster", "attach-type":"direct"}'
13:45:37 john, agree
13:45:55 hang on, we missed regular...
13:46:28 nova boot --flavor m1.large --image <image> --nic net-id=<net1> --nic net-id=<net2>,nic-type=fast --nic net-id=<net3>,nic-type=faster
13:47:11 john: we need the devices that provide network connectivity, how is it going to happen?
13:47:33 right, we haven't covered how we implement it
13:47:38 just how the user requests it
13:48:14 irenab: is that OK for the user request?
13:48:17 this will work smoothly for pci, in my opinion, and the connectivity can also be a spec of the alias
13:48:32 john: if a user has both cisco and mellanox nics, they will have to define cisco_fast and mellanox_fast ...
13:48:42 irenab: correct
13:48:46 unless
13:48:54 you want them to share...
13:48:55 John, when you say user, do you mean the final user or someone providing the service?
13:49:23 well we are only really doing the end user at the moment
13:49:37 but we should do the PCI alias stuff for the deployer
13:49:49 hmm...
13:50:53 pci_alias_2='{"name":"Fast",devices:[{"vendor_id":"1137","product_id":"0071", address:"*","attach-type":"direct"}, {"vendor_id":"123","product_id":"0081", address:"*","attach-type":"macvtap"}],"nic-type":"faster"}'
13:50:58 does that work better?
13:51:09 I am not a fan of these alias definitions, but I am waiting to see how we resolve the network connectivity before agreeing
13:51:18 john: just defining the pci_alias 2 times works now
13:51:27 irenab: what do you mean: "resolve network connectivity"?
13:51:40 John, we are trying to avoid the alias on the controller node
13:51:49 first of all
13:51:55 john: the cisco vic and the mellanox vic could be connected to different networks
13:52:10 so they cannot be part of the same alias
13:52:21 right, that's OK still though, I think
13:52:28 I don't think anyone using pci cares about vendor id, whatsoever
13:52:45 then both of them would seem equivalent at the time of nova boot
13:52:46 baoli: that is a deployer option
13:52:47 how do we make the VM land on a node with PCI devices connecting to the correct provider-network, and get the correct PCI device allocated?
13:53:12 irenab: OK, that's the scheduling issue then?
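The three-network request plus the multi-device alias john sketched above, consolidated for readability. Again a sketch: the network IDs are placeholders, the second device's vendor/product IDs are copied verbatim from the example above, and the devices/nic-type/attach-type keys are the proposed extension.

    # one regular virtio nic, one fast (macvtap) SR-IOV nic, one faster (direct) SR-IOV nic
    nova boot --flavor m1.large --image <image> \
        --nic net-id=<net1> \
        --nic net-id=<net2>,nic-type=fast \
        --nic net-id=<net3>,nic-type=faster

    # a single alias grouping devices from two vendors under one logical nic-type
    pci_alias='{"name":"Fast","devices":[{"vendor_id":"1137","product_id":"0071","address":"*","attach-type":"direct"},
                                         {"vendor_id":"123","product_id":"0081","address":"*","attach-type":"macvtap"}],
                "nic-type":"faster"}'

As irenab points out, nothing in this definition says which provider network each device is wired to, which is the open scheduling/connectivity question picked up next.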
13:53:51 john: agree, but I think that input comes from the nova boot command
13:54:18 either in the flavor or --nic
13:54:25 right, so let me ramble on about how I see this working... I know it's not ideal
13:54:39 so, the user makes a request
13:54:54 nova looks for the required pci devices
13:55:10 flavor extra specs or network_info might have them
13:55:15 john: missing the scheduler
13:55:18 we get a list of required aliases
13:55:55 the scheduler can look at the required aliases
13:56:09 and filters out hosts that can't meet the requests for all the requested devices
13:56:25 i.e. that's some scheduler filter kicking in
13:56:33 when the request gets to the compute node
13:56:41 it talks to the resource manager to claim the resource as normal
13:57:19 the compute node sends updates to the scheduler on what devices are available (we can sort out the format later)
13:57:29 when the VM is set up...
13:57:32 create the domain
13:57:41 add any requested devices from the flavor extra specs
13:57:44 when plugging vifs
13:57:54 the vif driver gets the extra PCI device info
13:58:27 it also gets some lib that points back to nova-driver-specific ways of plugging PCI devices, and does what it wants to do
13:58:29 maybe
13:58:40 anyways, that was my general thinking
13:59:13 john: currently pci works in this way, almost
13:59:25 john: I have to admit I do not see how the networking part is resolved ...
13:59:37 Hi John, I think that's how it works today. But we need to resolve the network connectivity issue as Irenab has pointed out. We need a PCI device that connects to a physical net
13:59:45 well, the VIF driver gets its config from neutron in the regular way
14:00:08 combined with the PCI alias info from nova
14:00:13 it should be able to do what it needs to do
14:00:19 at least that's my suggestion
14:00:39 john: yep, this can also resolve the connectivity problem
14:01:00 anyways
14:01:05 it's probably the nova meeting on here now
14:01:16 Time is up. Do you guys want to end the meeting soon?
14:01:20 john: the PCI alias does not carry any info regarding connectivity, according to what you defined previously
14:01:36 irenab: we can extend it
14:01:41 irenab: agreed, that has to come from neutron, in my model
14:01:42 yes, nova meeting
14:01:46 heyongli: it's the nova meeting
14:01:48 there can be several mellanox NICs but only one connecting to the physical network
14:01:54 #endmeeting
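The flow john walks through above mostly maps onto machinery nova already has: the scheduler side is the PCI passthrough filter plus the per-host device stats the compute nodes report, and the claim on the compute node goes through the resource tracker. A minimal sketch of the deployer-side wiring, assuming the standard filter scheduler is in use:

    # nova.conf on the controller: include the PCI filter so hosts that cannot
    # satisfy all requested aliases are filtered out
    scheduler_default_filters=RetryFilter,AvailabilityZoneFilter,RamFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,PciPassthroughFilter

The missing piece, as irenab and baoli note, is the VIF-plugging/connectivity side, where the PCI device info from nova has to be combined with the port details coming from neutron.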