13:04:42 <baoli> #startmeeting PCI passthrough
13:04:43 <openstack> Meeting started Wed Jan  8 13:04:42 2014 UTC and is due to finish in 60 minutes.  The chair is baoli. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:04:44 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:04:46 <openstack> The meeting name has been set to 'pci_passthrough'
13:05:01 <johnthetubaguy> hi, do we have a rough agenda for today?
13:05:13 <baoli> Hi John
13:05:27 <baoli> I posted this on the wiki yesterday: https://wiki.openstack.org/wiki/Meetings/Passthrough
13:06:11 <irenab__> baoli: how do you suggest we proceed?
13:06:18 <baoli> Yesterday, we were discussing predefined PCI groups. Yongli doesn't seem to like the idea
13:06:37 <baoli> Let's continue from what we have left yesterday
13:06:41 <johnthetubaguy> baoli: got it, thanks
13:07:27 <ttx> baoli: will the meeting change back to Tuesdays once agreement is reached ? I haven't updated the meeting calendar given it's very temporary...
13:07:46 <baoli> ttx, yes.
13:07:58 <ttx> ok, let's skip the calendar update then :)
13:08:03 <baoli> we will be doing daily meetings for this week.
13:08:19 <baoli> except for Friday/Saturday
13:08:33 <johnthetubaguy> baoli: I can't promise to make all those, but lets see how it goes
13:09:04 <johnthetubaguy> Have we agreed the list of use cases we want to support yet?
13:09:29 <johnthetubaguy> like a short term list (for Icehouse) and longer term aims too?
13:09:58 <baoli> john, we didn't go through those cases yet. We got stuck on the first part
13:10:17 <johnthetubaguy> I thought use cases would be the first part, which bit are we stuck on?
13:10:18 <baoli> But I guess that we should go through them first?
13:10:34 <irenab__> baoli: there is a list of use cases you put on wiki
13:11:03 <baoli> Shall we start from use cases today, then?
13:11:03 <irenab__> I'm just missing one more case there: mixed VIFs, both SRIOV and vNIC for the same VM
13:11:19 <baoli> #topic use cases
13:11:22 <heyongli> i think john means the use cases in the blueprint of nova.
13:11:33 <baoli> irenab, yes, I should put that in
13:11:53 <johnthetubaguy> heyongli: I think we probably want both, but lets start with this wiki first
13:12:02 <heyongli> sure
13:12:29 <baoli> #topic SRIOV-based cloud
13:12:37 <baoli> Any thoughts on this?
13:12:51 <johnthetubaguy> Can we start with GPU passthrough?
13:13:00 <johnthetubaguy> just to keep things simple
13:13:04 <baoli> Ok
13:13:27 <johnthetubaguy> how do we want that to look?
13:13:39 <johnthetubaguy> nova boot --flavor bigGPU
13:14:04 <johnthetubaguy> nova boot --flavor smallGPU_4GBRAM_2_vCPUs
13:14:06 <baoli> John, our discussion so far is based on PCI groups
13:14:31 <heyongli> groups are almost identical to pci-flavor
13:14:39 <johnthetubaguy> I think there will be more agreement if we work from what the user wants, then look how to deliver that
13:15:01 <johnthetubaguy> i.e. agree the problem we are solving, then look how to implement
13:15:08 <johnthetubaguy> then apply that to networking
13:15:26 <irenab__> johnthetubaguy: I think we mostly talked on PCI for networking, and this is quite different from GPU case
13:15:31 <baoli> #agreed
13:16:02 <johnthetubaguy> I agree its different, but we need the object model to work for both right?
13:16:12 <baoli> Well, we have been working on this for a while, and certainly we would think about it from the user's point of view
13:16:34 <baoli> Also taking into account what existing API we have in nova/neutron
13:16:59 <irenab__> johnthetubaguy: not sure it will be the same from request point of view
13:17:26 <irenab__> I have a strong objection to elaborating the flavor request for SRIOV NICs
13:17:46 <baoli> John, in any case, a PCI group/pci flavor can be used in the nova server flavor
13:17:49 <irenab__> which is fine for Device passthrough case
13:17:58 <johnthetubaguy> yes, and I think I agree, but I would just like to see both SRIOV and GPU side by side
13:18:30 <johnthetubaguy> if we agree how to setup GPU, for example, it should be very similar for SRIOV, agreed the user bit is probably different
13:18:42 <irenab__> I think GPU case should be mostly as today, with extra_spec for PCI device
13:19:09 <irenab__> the proposal is to change the terminology from pci_alias to pci_group
13:20:02 <heyongli> irenab: no, i think the alias is different; we can drop it, but it's not the same thing as group
13:20:06 <johnthetubaguy> OK, can we recap what we have in the code today, if only for my benefit?
13:20:22 <johnthetubaguy> then agree how GPU looks in the new(er) world?
13:20:52 <baoli> Yongli, can you go ahead to describe that for John?
13:20:55 <heyongli> what we have now is: alias: defines how you choose the device
13:21:11 <heyongli> server flavor: uses an alias to request your device
13:21:37 <heyongli> whitelist: selects devices from a host, picking which can be assigned to VMs
13:22:00 <baoli> Just want to add that the extra_specs/whitelist is based on the PCI device's vendor_id and product_id
13:22:22 <johnthetubaguy> and how do the whitelist and alias relate again?
13:22:41 <baoli> by vendor_id and product_id
13:22:55 <heyongli> alias chooses devices from the available pool
13:23:10 <johnthetubaguy> how does the device id come into things? only via whitelist?
13:23:10 <heyongli> whitelist chooses devices from all those on a specific host
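The whitelist/alias/flavor model heyongli describes above can be sketched roughly as follows — this is an illustrative reconstruction, not the actual Nova code, and the vendor/product ids are made up for the example:

```python
# Rough sketch of the model described above:
#   whitelist - selects which devices on a host may be assigned to VMs
#   alias     - chooses devices out of the available (whitelisted) pool
# All ids below are illustrative, not real devices.

def whitelist_match(dev, whitelist):
    """True if the device matches any whitelist entry by vendor/product id."""
    return any(dev["vendor_id"] == w["vendor_id"] and
               dev["product_id"] == w["product_id"] for w in whitelist)

def alias_match(dev, alias):
    """True if the device matches any of the alias's vendor/product specs."""
    return any(dev["vendor_id"] == a["vendor_id"] and
               dev["product_id"] == a["product_id"] for a in alias["specs"])

host_devices = [
    {"vendor_id": "8086", "product_id": "10fb", "address": "0000:08:00.1"},
    {"vendor_id": "10de", "product_id": "11bc", "address": "0000:0a:00.0"},
]
whitelist = [{"vendor_id": "8086", "product_id": "10fb"}]

# an alias can list several vendor/product pairs ("GPUv3 or GPUv4" below)
gpu_alias = {"name": "bigGPU",
             "specs": [{"vendor_id": "10de", "product_id": "11bc"},
                       {"vendor_id": "10de", "product_id": "11bd"}]}

# the assignable pool on this host is whatever passes the whitelist
pool = [d for d in host_devices if whitelist_match(d, whitelist)]
```

A server flavor would then reference the alias by name to request a device from the pool at scheduling time.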
13:24:13 <johnthetubaguy> Ok, so if I want a flavor that says pick either GPUv3 and GPUv4, can I do that?
13:24:36 <heyongli> alias supports this
13:24:57 <heyongli> define an alias, say, GPUv3 or GPUv4
13:25:05 <johnthetubaguy> OK, so alias is a list of possible vendor_ids and product_ids?
13:25:12 <heyongli> yeah
13:25:16 <johnthetubaguy> does it include device ids?
13:25:29 <heyongli> what id do you mean?
13:25:43 <johnthetubaguy> the PCI device id, where does that come into the model?
13:26:11 <heyongli> no, alias does not include the device id (the DB main key)
13:26:50 <johnthetubaguy> so where does the device id come from? it gets selected out of the whitelist on the device when attaching it to the VM?
13:26:51 <baoli> John, by id, do you mean PCI slot?
13:26:59 <johnthetubaguy> possibly
13:27:21 <heyongli> that information is stored in the pci device model
13:27:47 <johnthetubaguy> I think I mean address, sorry
13:28:03 <heyongli> alias should not include the address
13:28:05 <baoli> domain:bus:slot:func
13:28:16 <johnthetubaguy> right, thats the thing
13:28:26 <heyongli> whitelist does not either, but i already added it in the current patch i released
13:28:28 <johnthetubaguy> is that in the whitelist?
13:28:40 <johnthetubaguy> ah, OK
13:28:57 <heyongli> and it supports * and [1-5]
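The `*` and `[1-5]` patterns heyongli mentions look like ordinary glob rules; a minimal sketch of matching PCI addresses against such patterns (the exact syntax in the released patch isn't shown here, so this is an assumption):

```python
# Sketch: match PCI addresses (domain:bus:slot.func) against glob-style
# whitelist patterns such as "0000:08:*" or "0000:08:0[1-5].*".
# The actual pattern syntax in the patch under discussion may differ.
from fnmatch import fnmatch

def address_allowed(address, patterns):
    """True if the address matches any whitelist address pattern."""
    return any(fnmatch(address, p) for p in patterns)
```

For example, `address_allowed("0000:08:03.0", ["0000:08:0[1-5].*"])` accepts slots 01-05 on bus 08 while rejecting slot 07.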
13:29:01 <johnthetubaguy> so the big step between GPU and SRIOV is grouping different addresses?
13:29:33 <baoli> yes, they belong to different groups
13:29:52 <johnthetubaguy> so that should go in the alias now? for SRIOV?
13:30:05 <heyongli> alias doesn't need the address
13:30:16 <irenab__> Do we talk about SRIOV for networking or general?
13:30:18 <heyongli> adding the group to alias is sufficient
13:30:37 <johnthetubaguy> OK, so we are adding an extra thing called group?
13:30:43 <heyongli> this is also in my patches released
13:30:46 <johnthetubaguy> that deals with grouping addresses?
13:30:46 <heyongli> yeah
13:30:59 <johnthetubaguy> why is this not just part of alias, that is just a grouping right?
13:31:12 <heyongli> yeah, just in group
13:31:16 <heyongli> alias is global
13:31:23 <johnthetubaguy> (sorry lots of dumb questions, but I just don't think I get where you are coming from now)
13:31:30 <johnthetubaguy> so group is going to be local to each server?
13:31:39 <heyongli> you should not say "i want the device whose bdf is a:b:c", that is meaningless
13:31:59 <baoli> PCI group is global
13:32:02 <heyongli> kind of local, like pci vendor
13:32:20 <heyongli> if we keep alias as it is, this is local
13:32:29 <heyongli> if we kill alias, this is going to be global
13:32:35 <baoli> Yongli, alias is defined on the controller node
13:32:41 <heyongli> yeah
13:32:48 <johnthetubaguy> hmm, but its a local thing that gets referenced in a global concept (flavor)
13:32:55 <johnthetubaguy> I think this is where it gets very confusing
13:33:11 <heyongli> kind of confusing; we might have a better solution
13:33:34 <johnthetubaguy> So, from my outsider view, this seems:
13:33:46 <heyongli> but group is very much like the vendor id
13:34:15 <johnthetubaguy> (a) roughly complete, but (b) a bit confusing, and (c) re-inventing groupings we already have in other bits of nova
13:34:17 <heyongli> we can say vendor id is global, because it's allocated by the pci world
13:34:40 <johnthetubaguy> I think we can agree on this though...
13:35:08 <johnthetubaguy> PCI device has: vendor_id, product_id, address
13:35:27 <johnthetubaguy> and we want to group them
13:35:44 <baoli> well, vendor_id is a hardware specific thing
13:35:50 <johnthetubaguy> types of GPU (don't care about address), types of VIF (do care about specific groups of addresses)
13:36:25 <johnthetubaguy> by default we should not expose any of these devices, unless we configure nova to allow such a device on a particular host to be exposed
13:36:26 <heyongli> VIFs should not care about the address, i think; they just need partitioning by address, am i right?
13:36:47 <johnthetubaguy> well, they are grouped by an address range right?
13:36:59 <heyongli> yeah, i think
13:37:14 <irenab__> john: it may be the PF
13:37:24 <irenab__> parent of all Virtual Functions
13:37:44 <johnthetubaguy> ah, OK, so we have virtual functions from a specific address too?
13:37:55 <johnthetubaguy> or is function just part of the address?
13:38:18 <baoli> John, in SRIOV, we have PF and VF
13:38:40 <baoli> PF: physical function, VF: virtual function. The function is part of the address
13:38:56 <johnthetubaguy> thats cool, just checking we are still grouping by address
13:38:58 <irenab__> a Virtual Function is a PCI device of an SRIOV NIC that has a parent Physical Function representing the SRIOV NIC itself
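To make the addressing above concrete: a PCI address is domain:bus:slot.func (e.g. "0000:08:00.2"), and for SRIOV the function part distinguishes individual VFs; a small parsing sketch (the exact validation rules are an assumption):

```python
# Sketch: parse a PCI address of the form domain:bus:slot.func.
# For SRIOV, VFs typically sit under the same domain:bus as their PF,
# distinguished by slot/function (details vary by device).
import re

PCI_ADDR = re.compile(
    r"^(?P<domain>[0-9a-fA-F]{4}):(?P<bus>[0-9a-fA-F]{2}):"
    r"(?P<slot>[0-9a-fA-F]{2})\.(?P<func>[0-7])$")

def parse_pci_address(address):
    """Split a PCI address into its domain/bus/slot/func components."""
    m = PCI_ADDR.match(address)
    if not m:
        raise ValueError("not a PCI address: %s" % address)
    return m.groupdict()
```

For example, `parse_pci_address("0000:08:00.2")` yields func "2", i.e. one virtual function on that bus/slot.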
13:39:21 <johnthetubaguy> thats all cool, just trying to work out what we are grouping
13:39:54 <baoli> A PCI group is a collection of PCI devices that share the same functions or belong to the same subsystem in a cloud.
13:40:27 <irenab__> actually what we need for basic networking case is grouping by network connectivity
13:40:57 <baoli> Irenab, that's what I mean by subsystem
13:41:02 <johnthetubaguy> OK, so we need some way to link the address to the neutron network-uuid?
13:41:42 <baoli> in the case of SRIOV, new --nic options will achieve that
13:42:15 <irenab__> john: yes, but we need to make sure that the VM is scheduled on an appropriate Host
13:42:38 <johnthetubaguy> well, I am not sure it always can, the user doesn't know which host that request will land on right? it just hints to some mappings
13:42:51 <heyongli> +1
13:42:54 <johnthetubaguy> anyways, I think we are moving foward here
13:43:24 <baoli> pci group is a logical abstraction
13:43:53 <irenab__> john: it's an idea. Based on the VM boot request, it should be scheduled on a Host that is capable of providing SRIOV nics and connecting to the correct physical network
13:43:59 <baoli> it doesn't care where it lands, but as long as it's using a device in a particular pci group
13:44:25 <heyongli> agree
13:44:45 <johnthetubaguy> right, so what is the user requesting here
13:45:05 <johnthetubaguy> the neutron network, and the type of connection?
13:45:27 <johnthetubaguy> so passthrough, or virtual, and also which type of passthrough, 1Gb or 10Gb, etc?
13:45:29 <baoli> a  neutron network with a NIC that is in a particular PCI group
13:45:39 <irenab__> on wiki: nova boot --flavor m1.large --image <image_id> --nic net-id=<net-id>,vnic-type=macvtap,pci-group=<group-name> <vm-name>
13:45:42 <johnthetubaguy> I am trying to ignore our terms here, and think of the user
13:46:02 <baoli> John, 1Gb or 10Gb is a QoS thing
13:46:34 <baoli> It's not related to what we are discussing here. But conceptually, you can have a PCI group with 1Gb nics
13:46:36 <johnthetubaguy> depends, it could be different cards right?
13:46:54 <irenab__> john: on --nic there is what we think is needed
13:47:28 <heyongli> to deal with the 1G/10G thing, adding the pci device_id to the alias is a good solution
13:47:37 <heyongli> i think
13:48:05 <baoli> Again, you can use PCI groups to group NICs that are on different kinds of cards
13:48:16 <heyongli> also work
13:48:33 <johnthetubaguy> OK
13:48:47 <johnthetubaguy> I have written up what I think we said here:
13:48:48 <johnthetubaguy> https://wiki.openstack.org/wiki/Meetings/Passthrough#Definitions
13:48:59 <johnthetubaguy> Do we all agree with those statements?
13:49:40 <johnthetubaguy> sorry, I missed a bit, please refresh
13:49:48 <johnthetubaguy> extra bullet on SRIOV
13:50:23 <heyongli> i post my +1
13:50:29 <irenab__> john: I think the last SRIOV bullet is not accurate.
13:50:54 <johnthetubaguy> irenab__: yeah, I don't like it, what is a better statement?
13:51:15 <irenab__> It's not specific to a neutron network; it's specific to a provider_network, which many neutron networks can be defined for
13:51:39 <baoli> John, can we go through the original post and see if they make sense?
13:51:43 <johnthetubaguy> OK, so it could be specific to a group of neutron networks?
13:51:52 <irenab__> john: yes
13:52:06 <johnthetubaguy> irenab__: awesome, got you, thanks
13:53:00 <johnthetubaguy> irenab__: can you check my update please, is that better?
13:53:32 <johnthetubaguy> baoli: we can do that next, I just wanted to agree some basics of what we have, and what we need
13:54:04 <baoli> ok
13:54:26 <irenab__> john: it's OK. Not sure what you mean by specific configuration
13:54:45 <johnthetubaguy> I was meaning neutron might specify settings like VLAN id
13:55:19 <irenab__> john: correct
13:55:30 <johnthetubaguy> cool, thanks, let me add an e.g.
13:55:55 <johnthetubaguy> so I guess in basic cases we pass the full nic through
13:56:08 <johnthetubaguy> and its straight to the provider network
13:56:11 <irenab__> each device can be configured differently, but the common part is that it has the same network connectivity (to the same Fabric)
13:56:22 <johnthetubaguy> but if we have virtual devices, we can do some fun stuff
13:56:25 <johnthetubaguy> right
13:57:10 <irenab__> john: with full NIC passthrough , I think there is nothing for neutron to do
13:57:38 <johnthetubaguy> irenab__: yeah, it probably gives the guest IP addresses, and things, but yes, there is little connection info I guess
13:58:30 <irenab__> In full passthrough it can be only configured from inside the VM
13:58:39 <baoli> server flavor can still be used for generic PCI passthrough
13:58:50 <irenab__> at least for cases I need, we talk only of SRIOV VFs
13:58:51 <johnthetubaguy> I don't get why that is, neutron DHCP can still set it up, if it's given the mac address?
13:59:28 <irenab__> john: agree. I mean that you need the VM to actually do something to get the config, like send a DHCP request
13:59:30 <heyongli> they might mean passing through a regular PCI device
13:59:51 <johnthetubaguy> irenab__: ah, yep, sorry, thats true
14:00:09 <johnthetubaguy> cool, so I think we can agree the GPU passthrough case then...
14:00:30 <johnthetubaguy> user requests a flavor; extra specs *imply* which possible PCI devices can be connected
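For reference, the GPU flow being agreed here maps onto the whitelist/alias options discussed earlier (whitelist on the compute host, alias plus flavor extra spec on the controller). Option names are as they existed in nova.conf around this time and the ids are illustrative, so treat this as a sketch rather than authoritative configuration:

```ini
# nova.conf on the compute host: expose matching devices (whitelist)
pci_passthrough_whitelist = {"vendor_id": "10de", "product_id": "11bc"}

# nova.conf on the controller: name the device class (alias)
pci_alias = {"vendor_id": "10de", "product_id": "11bc", "name": "bigGPU"}
```

The flavor then carries an extra spec referencing the alias (e.g. `pci_passthrough:alias=bigGPU:1`), so `nova boot --flavor bigGPU ...` implies the device request without the user naming any PCI details.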
14:00:33 <heyongli> still based on alias, right?
14:00:51 <irenab__> john: would you be available tomorrow for this meeting to dig into SRIOV net details?
14:00:55 <johnthetubaguy> I am leaving that out for now.. we can add that later
14:01:06 <johnthetubaguy> what time is tomorrow?
14:01:15 <baoli> same time
14:01:23 <johnthetubaguy> 13.00 UTC?
14:01:27 <johnthetubaguy> that should be OK
14:01:31 <baoli> Yes
14:01:40 <irenab__> great. thanks
14:01:45 <baoli> Do we want to end this meeting now?
14:01:57 <johnthetubaguy> we might have to soon
14:02:02 <johnthetubaguy> I can do another 10 mins
14:02:14 <irenab__> I can too
14:02:20 <baoli> cool
14:02:23 <heyongli> fine
14:03:07 <irenab__> Do we want to start SRIOV NIC case?
14:03:32 <johnthetubaguy> well, just thinking about doing a statement like
14:03:33 <johnthetubaguy> user requests a flavor; extra specs *imply* which possible PCI devices can be connected
14:03:42 <johnthetubaguy> as in thats the GPU case
14:03:54 <johnthetubaguy> what do we say for the SRIOV case?
14:04:13 <irenab__> I think flavor extra spec is not a good solution for the networking case
14:04:37 <baoli> a VM needs NICs from one or more PCI groups
14:04:51 <johnthetubaguy> user requests neutron nics, on specific neutron networks, but connected in a specific way (i.e. high speed SRIOV vs virtual)
14:05:01 <johnthetubaguy> does that make sense?
14:05:15 <irenab__> and VM can be attached to different virtual networks
14:05:39 <irenab__> and interface can be attached/detached later on
14:05:51 <baoli> I should say, a VM needs NICs on some networks from some PCI groups
14:05:55 <johnthetubaguy> some of the nics may be virtual, some may be passthrough, and some might be a different type of passthrough
14:06:03 <baoli> yes
14:06:08 <irenab__> john: correct
14:06:15 <johnthetubaguy> I am trying to exclude any of the admin terms in the user description
14:06:26 <johnthetubaguy> so we have a clear vision we can agree on, thats all
14:06:49 <baoli> #agreed
14:06:56 <irenab__> john: vision yes, implementation details - no
14:07:00 <johnthetubaguy> OK, I updated the wiki page
14:07:08 <johnthetubaguy> https://wiki.openstack.org/wiki/Meetings/Passthrough#The_user_view_of_requesting_things
14:07:14 <johnthetubaguy> do we agree on that?
14:07:47 <irenab__> john: yes
14:08:14 <johnthetubaguy> sorry to take up the whole meeting on this, but really happy to get a set of aims we all agree on now
14:08:14 <baoli> #agreed
14:08:19 <johnthetubaguy> sweet
14:08:30 <heyongli> +1
14:08:35 <johnthetubaguy> so I think the question now, is how do we get the admin to set this up and configure it
14:08:40 <johnthetubaguy> and what do we call everything
14:09:01 <irenab__> agree
14:09:06 <johnthetubaguy> that sounds like something for tomorrow, but maybe spend 5 mins discussing one point...
14:09:06 <baoli> #agreed
14:09:15 <irenab__> ok
14:09:20 <johnthetubaguy> at the summit we raised an issue with the current config
14:09:44 <irenab__> john: can you recap
14:10:10 <johnthetubaguy> basically we are trying to keep more of the config as API driven, to stop the need for reloading nova.conf, etc, and general ease of configuration
14:10:18 <johnthetubaguy> now clearly not everything should be an API
14:11:01 <johnthetubaguy> also, in other sessions, we have pushed back on ideas that introduce new groups that are already covered by existing generic groupings (i.e. use host aggregates, don't just add a new grouping)
14:11:06 <baoli> John, we have discussed configuration versus API for the past couple of meetings. Would you be able to look at the logs? I can send you the logs
14:11:31 <johnthetubaguy> yeah, if you can mail me the logs that would be awesome, or are they on the usual web address?
14:11:43 <johnthetubaguy> did we have nova-core review any outcomes of that yet?
14:11:47 <irenab__> john: we try to define auto-discovery of PCI devices in order to minimize the items needed for config
14:11:55 <johnthetubaguy> right that sounds good
14:12:11 <johnthetubaguy> I should read up on those logs
14:12:16 <baoli> A couple of them are in the daily logs, but not in the meeting logs
14:12:29 <johnthetubaguy> ah...
14:12:49 <baoli> I need to find a way to link them back here. I'll try to do that
14:12:59 <irenab__> baoli: I think it's better you send it, since there was a meeting name change and one meeting without a #startmeeting...
14:13:12 <johnthetubaguy> cool, we should probably end this meeting, then add those pointers to the wiki page?
14:13:18 <baoli> I'll send them again.
14:13:28 <baoli> Sure
14:13:32 <johnthetubaguy> cool, could we just add it to that meeting wiki page?
14:13:37 <johnthetubaguy> cool
14:13:49 <irenab__> thanks, I think the meeting was productive. see you tomorrow
14:13:55 <baoli> I'll do both. See you guys tomorrow
14:13:55 <heyongli> thanks, baoli
14:14:00 <johnthetubaguy> so to be upfront, I think we can do the whole grouping with host aggregates and an API to list all pci devices
14:14:09 <johnthetubaguy> but yep, lets chat tomorrow!
14:14:21 <baoli> #endmeeting