14:00:50 <apuimedo> #startmeeting kuryr
14:00:51 <openstack> Meeting started Mon Jan 16 14:00:50 2017 UTC and is due to finish in 60 minutes.  The chair is apuimedo. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:52 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:54 <openstack> The meeting name has been set to 'kuryr'
14:01:10 <apuimedo> Hello and welcome everybody to another Kuryr weekly IRC meeting
14:01:21 <garyloug> o/
14:01:22 <ivc_> o/
14:01:27 <yedongcan> o/
14:01:28 <apuimedo> Today vikasc and irenab have excused themselves and won't be able to join
14:01:30 <mchiappero> o/
14:02:04 <apuimedo> #topic kuryr-lib
14:03:40 <alraddarla> o/
14:03:41 <apuimedo> There's basically no news on kuryr-lib :-)
14:03:44 <apuimedo> No news, good news
14:03:47 <apuimedo> as they say
14:03:54 <apuimedo> only this patch
14:03:58 <apuimedo> #link https://review.openstack.org/#/c/418792/
14:04:19 <apuimedo> which pisses me off a bit that came so late
14:04:45 <apuimedo> but... What can we do, we merge it now and we'll consider if we do a point release to get the requirement upped
14:05:04 <ltomasbo> o/
14:05:46 <apuimedo> Anybody's got anything on kuryr-lib?
14:06:42 <apuimedo> very well!
14:06:44 <apuimedo> Moving on
14:06:48 <apuimedo> #topic kuryr-libnetwokr
14:06:51 <apuimedo> darn!!!
14:06:54 <apuimedo> typo
14:06:56 <mchiappero> eheh
14:06:58 <apuimedo> #topic kuryr-libnetwork
14:07:15 <apuimedo> ltomasbo: can you update us on the race?
14:07:25 <ltomasbo> Hi
14:07:40 <ltomasbo> the problem is that, after calling to attach the (sub)port to the trunk
14:07:49 <ltomasbo> we (kuryr-libnetwork) calls update_port
14:07:59 <ltomasbo> and it seems attach_subport also calls update_port internally
14:08:05 <ltomasbo> therefore, there is a race there
14:08:12 <ltomasbo> and something me device_owner is not properly set
14:08:29 <ltomasbo> which makes kuryr-libnetwork not able to remove the port after removing the container
14:08:53 <ltomasbo> since there is a filter by deviceowner before removing the port
14:09:10 <apuimedo> #link https://review.openstack.org/#/c/419028/
14:09:41 <ltomasbo> I've been discussing with armax about the possibility of reverting the patch that set device_owner for trunk ports
14:10:04 <ltomasbo> another easy fix could ve to remove the device_owner filter we do on ipam_release before calling the remove_port
14:10:22 <apuimedo> #info there is a race in container-in-VM flow due to subport addition usage of device owner
14:11:06 <mchiappero> I'm sorry, but I'm not sure I've understood the root cause of the race
14:11:15 <apuimedo> ltomasbo: it is kind of a nice service we provide our users to mark which ports we manage for them automatically
14:11:26 <ltomasbo> apuimedo, yes, I agree
14:11:28 <apuimedo> so we should try to preserve it as much as possible
14:11:46 <apuimedo> mchiappero: basically we update the device:owner field to container:kuryr
14:11:50 <ltomasbo> the problem is that, the patch I proposed to revert is setting a different device_owner to the subports
14:11:57 <apuimedo> and the trunk_subport_add does the same behind the scene
14:12:07 <ltomasbo> therefore we call attach_subport and then update_port
14:12:18 <ltomasbo> but attach_subport internally calls to update_port too
14:12:24 <apuimedo> so IIRC, if the trunk_subport_add op finds it changed before it finishes, it goes boom. Is that right ltomasbo?
14:12:27 <ltomasbo> and both update_ports set a different device_owner
14:12:47 <apuimedo> or is it only that it breaks our removal?
14:13:12 <ltomasbo> we need to call first trunk_subport_add and then update_port
14:13:26 <ltomasbo> since trunk_subport_add will fail if device_id is already set in the port
14:13:55 <ltomasbo> butn trunk_subport_add also calls update_port to set the device_owner to trunk:subport
14:14:14 <ltomasbo> while we call update_port from kuryr-libnetwork to set device_owner to kuryr:container
14:14:15 <ivc_> ltomasbo wait a sec, if trunk_subport_add fails, does that mean that it does not support device_id? or is it a bug in neutron?
14:14:42 <ltomasbo> with the current implementation, there is no problem with device_id, as we call first trunk_subport_add
14:14:52 <ltomasbo> the problem is that, as device_owner is not set to kuryr:container
14:15:00 <ltomasbo> the port will not be deleted after the container is removed
14:15:30 <ltomasbo> and, the fact that turnk_subport_add fails if device_id is already set is not a bug
14:15:32 <ivc_> what is the device_owner then? if not 'kuryr:container'?
14:15:33 <ltomasbo> is the way it should work
14:15:55 <ltomasbo> if the port is already in used, it should not be made part of a trunk
14:16:01 <apuimedo> ivc_: no, it is trunk:subport
14:16:08 <ltomasbo> device_owner is set to trunk:subport
14:16:08 <apuimedo> sometimes, depending on the race
14:16:42 <ivc_> apuimedo thats what i see as the problem. neutron trunk code set it to 'trunk:subport' and we reset it to 'kuryr:container', right?
14:16:55 <ivc_> i think we are breaking some neutron contract here
14:16:56 <apuimedo> ivc_: I think so
14:17:03 <apuimedo> no, no contract
14:17:05 <ltomasbo> yes
14:17:17 <mchiappero> yes, no? :D
14:17:22 <apuimedo> they only set it to trunk:subport for no specific reason
14:17:25 <ltomasbo> they just tagged subports to kuryr:container for simplicity
14:17:33 <ltomasbo> there is no real need for that on neutron
14:17:36 <ivc_> neutron might expect it to be 'trunk:subport'
14:17:48 <ivc_> maybe it's not used now, but we should not rely on it
14:17:59 <apuimedo> ivc_: mchiappero: "Altough this is not currently required by any of the business logic, it is handy to have this set to help users quickly identify ports used in trunks"
14:18:11 <apuimedo> This is the justification for setting trunk:subport in the original patch
14:18:16 <apuimedo> nothing uses this fact
14:18:19 <ivc_> 'currently'
14:18:36 <apuimedo> ivc_: this 'currently' has not changed
14:18:42 <apuimedo> and there's no good reason it should
14:18:56 <ivc_> at some point neutron could add the business logic that would rely on it
14:18:57 <mchiappero> I agree with using kuryr as device owner, but still, I don't fully understand whether it's a timing issue or what
14:18:57 <ltomasbo> we can modify (I'm waiting for Armax answer) the way trunk_subport_owner set the device_owner
14:19:04 <apuimedo> ivc_: I agree with you on that
14:19:08 <apuimedo> I find it misguided
14:19:15 <ltomasbo> and make it possible to not set it to anything (for the kuryr case)
14:19:59 <apuimedo> ivc_: I honestly just don't see the point of setting it to trunk:subport
14:20:09 <ltomasbo> neither do I
14:20:14 <apuimedo> when it is something that can be checked in the API, that it belongs to a trukn
14:20:16 <apuimedo> *trunk
14:20:21 <ltomasbo> I think it is just to easily find the subports
14:20:24 <ivc_> apuimedo i understand that, but thats the current api
14:20:26 <apuimedo> but now the damage is done
14:20:29 <apuimedo> exactly
14:21:08 <ivc_> imo best course of action is to update neutron trunk api in a way that would allow us to legitimately set device_owner
14:21:16 <ltomasbo> yes, it seems reverting that could affect already set deployments
14:21:23 <ivc_> excluding the potential conflict between trunk code and kuryr
14:21:26 <apuimedo> it is quite annoying, but we'll probably have to consider whether we do not mark subports or use tags or something else
14:21:38 <apuimedo> ivc_: that was my first thought
14:21:46 <apuimedo> to extend the api of trunk_subport_add
14:21:51 <apuimedo> so that you can pass it a device owner
14:22:10 <apuimedo> (which by the way saves us one neutron call :P )
14:22:15 <ivc_> yup
14:22:29 <apuimedo> ltomasbo: did you propose that to armax?
14:22:30 <ltomasbo> and easy modifications could be to just change the config option, and allow not to set any device_owner
14:22:35 <apuimedo> Is there a mailing list thread for that?
14:22:38 <ltomasbo> without needed to modify the API
14:22:43 <ltomasbo> which is always trickier
14:23:02 <apuimedo> ivc_: the problem with extending the API is that we're probably too close to freeze or already frozen
14:23:03 <ltomasbo> it is being discussed on the revert patch:
14:23:09 <ltomasbo> https://review.openstack.org/#/c/419028
14:23:12 <apuimedo> ltomasbo: gotcha
14:23:49 <ivc_> -2 ...
14:23:55 <ltomasbo> :D
14:24:09 <ltomasbo> I know! But still discussing with armax
14:24:20 <apuimedo> ltomasbo got the hammer of justice!
14:24:28 <ltomasbo> and I agree reverting could not be done, but we are using that to discuss
14:24:37 <ltomasbo> and then (I suppose) another solution will be proposed
14:24:55 <ltomasbo> I see the other option, not setting TRUNK_SUBPORT_OWNER
14:25:02 <apuimedo> ltomasbo: from the commit message discussion I see an interesting option
14:25:16 <ltomasbo> and making the code not setting the device owner on trunk_add_subport
14:25:20 <apuimedo> we could probably argue for having the device:owner unchanged if it is not None
14:25:22 <ltomasbo> it will be just a couple of lines
14:25:25 <ivc_> so how tricky it would be to keep 'trunk:subport' owner? do we have some sort of workaround?
14:25:30 <apuimedo> and move the update before the trunk_port_add
14:25:37 <apuimedo> *trunk_subport_add
14:25:44 <mchiappero> sorry but I guess there is no guarantee on the serialization of the operations in neutron
14:25:53 <mchiappero> but shouldn't be that way for the same port?
14:26:09 <mchiappero> shouldn't actions to the same port be serialized?
14:26:20 <mchiappero> wouldn't this solve the issue?
14:26:30 <apuimedo> mchiappero: not sure
14:26:37 <ivc_> mchiappero afaik they are. as soon as you got confirmation for your request, it is commited
14:27:08 <ltomasbo> yes, but calls are async, so, they can be executed in different orders
14:27:09 <mchiappero> so in this case our update port could get confirmed first, right?
14:27:15 <apuimedo> ivc_: re keeping trunk:subport would break our contract of marking our resources
14:27:19 <ivc_> the problem is not the race as i understand it but just the conflict between kuryr's port update and trunk logic
14:27:32 <ltomasbo> yes and not
14:27:42 <apuimedo> ivc_: For me the race is just a symptom
14:27:57 <ltomasbo> if it ensure that our call is called after trunk_add_subport fully finished
14:28:01 <mchiappero> apuimedo: right
14:28:05 <ltomasbo> then from kuryr point of view, that will work
14:28:14 <ltomasbo> but I agree with apuimedo
14:28:17 <hongbin> o/
14:28:21 <ltomasbo> the problem is the use of device_owner
14:28:29 <apuimedo> exactly
14:28:33 <ivc_> yup
14:28:46 <mchiappero> I would expect neutron to perform some ordering or take some port specific lock
14:28:48 <apuimedo> I would really like to have the extra parameter in the trunk_subport_add
14:29:06 <ltomasbo> but that will not solve the problem
14:29:15 <ltomasbo> if we set device_owner to whatever we want
14:29:19 <ltomasbo> we can still do that
14:29:23 <ltomasbo> but the problem will be the same
14:29:36 <ltomasbo> not an unified view of what device_owner should be about
14:29:57 <apuimedo> ltomasbo: recognizing that it can be set to something else kind of forces Neutron to acknowledge that this field is informative for them and not for logic
14:30:06 <ivc_> ltomasbo, toni's point is if we get 'device_owner' as part of 'trunk_subport_add' api, it would make 'kuryr:container' device owner legit
14:30:18 <ltomasbo> got it
14:30:22 <apuimedo> ivc_: or a "use at your own risk"
14:30:34 <apuimedo> depends on the wording in the method doc
14:30:39 <apuimedo> that gets merged
14:31:09 <ltomasbo> of course, for kuryr deployment we can state that TRUNK_SUBPORT_OWNER should be set to kuryr:container
14:31:19 <mchiappero> I still don't fully understand: does setting the owner after a well finished and seccessful trunk_subport_add work?
14:31:21 <ltomasbo> and that will solve the problem, at the expense of flexibility
14:31:23 <apuimedo> ivc_: I didn't want to say it. But the right fix, things being what they are now, would be that there would be multiple owners
14:31:39 <apuimedo> but that is even more API breaking :/
14:31:52 <apuimedo> mchiappero: it does
14:31:59 <mchiappero> ok, so the problem is neutron
14:32:05 <apuimedo> but it makes us tread in unsafe waters
14:32:19 <mchiappero> that's something for them to fix
14:32:31 <apuimedo> that if neutron subport related code started relying on this (like for upgrades)
14:32:39 <apuimedo> it could render our subports useless
14:33:12 <apuimedo> the fix I'd like to avoid, but that work work right away
14:33:18 <ivc_> apuimedo i think multiple owners would only add confusion. the 'owner' should be unique, but the device_owner field should not be used for storing 'informative' date as it is the case with 'trunk:subport'
14:33:30 <ivc_> data*
14:33:32 <mchiappero> ivc_: agree
14:33:32 <apuimedo> is to use a new tag instead of the device:owner
14:34:21 <ivc_> imo kuryr is the real owner in this case
14:34:32 <apuimedo> ivc_: I agree with you, if Neutron wanted so much to have this informative to avoid checking in DB whether it was a subport, it could have added a field for that
14:34:39 <apuimedo> ivc_: no question about that
14:34:45 <apuimedo> but we need a pragmatic solution
14:35:06 <apuimedo> let's wait to hear what ltomasbo gets from armax in proposing a new parameter for the trunk operations
14:35:47 <ltomasbo> would you agree/like the other solution? just disabling trunk_add_subport to write on device:owner?
14:35:51 <ivc_> the problem is HCF
14:35:56 <apuimedo> #action ltomasbo to continue discussion with armax, proposing trunk_subport_add to receive optionally an API owner name
14:36:13 <apuimedo> ltomasbo: disabling it how?
14:36:16 <apuimedo> ivc_: HCF?
14:36:23 <ivc_> hard code freeze
14:36:41 <ltomasbo> as they just set whatever value is in TRUNK_SUBPORT_OWNER in the config.py file
14:36:45 <ltomasbo> just setting that to none
14:37:02 <ltomasbo> and if it is set to none, then just don't call update_port
14:37:15 <ltomasbo> and if it is set to wahtever it is, keep working as it is
14:37:22 <ltomasbo> so that it does not break current deployments
14:37:32 <apuimedo> ltomasbo: that would apply to all the ports, not just those used by kuryr
14:37:44 <ltomasbo> only to subports
14:38:00 <apuimedo> right, but a user may use subports for other purposes
14:38:20 <ivc_> apuimedo ltomasbo if we look at it from different perspective, do we need 'device_owner' for cleanup only?
14:38:38 <ltomasbo> we don't even really needed
14:38:45 <ltomasbo> is just a filter to speed up the search
14:38:55 <apuimedo> ivc_: we use it for cleanup. But its main purpose is to notify that it is automatically handled by kuryr to users
14:39:10 <apuimedo> it was sort of... Since it is already there, let's use it
14:39:47 <ltomasbo> to me, it makes sense, as it is kuryr service creating/managing them
14:39:49 <mchiappero> I used it a lot while working on ipvlan
14:39:50 <ivc_> as much as i dislike special-casing, maybe then we could have a special case for 'trunk:subport' that would fetch the ports for kuryr-managed nodes somehow
14:39:58 <mchiappero> i often had leftovers
14:40:10 <ltomasbo> while tagging that it is a subport, should not go on device_owner, but device_type or something like that if they need it
14:40:13 <ivc_> ofc until we get a proper api update on neutron side
14:41:36 <apuimedo> ivc_: let's wait a week to see what Neutron people say
14:41:48 <apuimedo> and then we can decide on contingency measures
14:41:50 <ivc_> sure
14:42:32 <apuimedo> anything else about kuryr-libnetwork?
14:43:34 <apuimedo> very well
14:43:35 <apuimedo> moving on
14:43:42 <apuimedo> #topic fuxi
14:43:50 <apuimedo> #chair hongbin
14:43:51 <openstack> Current chairs: apuimedo hongbin
14:43:53 <hongbin> hi
14:44:06 <hongbin> in last week, there are several proposed fixes
14:44:17 <hongbin> #link https://review.openstack.org/#/c/419767/
14:44:19 <apuimedo> hongbin: today I was asked about fuxi on magnum. Do we have some docs on that? Or it only targets bare metal?
14:44:39 <hongbin> apuimedo: i am happy to explore fuxi on magnum
14:44:56 <hongbin> apuimedo: it is definitely one of the target
14:45:03 <hongbin> apuimedo: there are several things that needs to be done
14:45:13 <hongbin> apuimedo: 1. containerized fuxi
14:45:19 <hongbin> apuimedo: 2. trust support
14:45:34 <hongbin> apuimedo: then, we are ready to propose it to magnum
14:45:57 <apuimedo> cool
14:46:04 <apuimedo> sorry for the interruption
14:46:09 <hongbin> apuimedo: np
14:46:09 <apuimedo> :-)
14:46:24 <hongbin> yes, to continue,
14:46:36 <hongbin> i was trying to move fuxi to py35
14:46:43 <hongbin> #link https://review.openstack.org/#/c/419683/
14:47:03 <hongbin> the last one, i have a pov for making multi-tenancy support
14:47:12 <hongbin> #link https://review.openstack.org/#/c/420386/
14:47:25 <hongbin> all of those are under review, feedback is appreciate
14:47:31 <hongbin> apuimedo: that is all from my side
14:47:51 <apuimedo> hongbin: I suppose swift's failure with py3 is reported as a bug, right?
14:47:55 <apuimedo> thanks hongbin
14:48:02 <apuimedo> #topic kuryr-kubernetes
14:48:09 <hongbin> apuimedo: yes, we worked around swift
14:49:04 <apuimedo> #info ivc_ is a new core for Kuryr-kubernetes! Congratulations!
14:49:19 <ivc_> thanks! :)
14:49:20 <apuimedo> ivc_: and now that you are congratulated, pls review https://review.openstack.org/#/c/419933/ for merge :P
14:50:00 <apuimedo> #info vikasc reported that he is finishing addressing ltomasbo and ivc_'s comments to https://review.openstack.org/#/c/410578/
14:50:29 <apuimedo> hongbin: we could appreciate help into drafting a plan to integrate kuryr-kubernetes with Magnum once that patch is merged
14:50:48 <apuimedo> a TODO list or something like that
14:50:57 <hongbin> apuimedo: i can try
14:51:04 <apuimedo> s/could/would/
14:51:05 <ltomasbo> nice, I'm trying to follow the instruction to set it up, but so far no luck with it
14:51:05 <apuimedo> :-)
14:51:07 <apuimedo> thanks hongbin
14:51:19 <apuimedo> ltomasbo: you mean vikasc's patch?
14:51:23 <ltomasbo> yep
14:51:31 <ltomasbo> using the devstack templates
14:51:33 <apuimedo> ltomasbo: ping him in the morning then :P
14:51:40 <ltomasbo> I'll do
14:51:43 <apuimedo> cool
14:51:51 <apuimedo> ivc_: any news on the services front?
14:52:07 <ivc_> nope
14:52:31 <apuimedo> very well
14:52:36 <ltomasbo> servivcs == lbaas/octavia?
14:52:43 <ltomasbo> s/servivcs/services
14:52:44 <apuimedo> ltomasbo: neutron-lbaasv2
14:52:55 <apuimedo> we should add octavia after that
14:53:05 <ltomasbo> is there a patch on that already? I would like to take a look
14:53:06 <ivc_> ltomasbo i think toni is referring to the split of https://review.openstack.org/#/c/376045/ :)
14:53:10 <apuimedo> it should be a matter of changing the driver
14:53:28 <ltomasbo> great!
14:53:30 <apuimedo> ltomasbo: there is a patch, the one ivc_ links to. However, it needs a bit of splitting and UT
14:53:54 <ltomasbo> should that include the floating ip support too (in a follow up patch)?
14:54:12 <apuimedo> let me check
14:54:27 <ltomasbo> (maybe it already does...)
14:54:32 <ivc_> ltomasbo it does not
14:54:33 <ltomasbo> just asking...
14:54:39 <ltomasbo> ok
14:55:01 <apuimedo> I was checking if you could define externalIP for pod
14:55:04 <apuimedo> apparently you can't
14:55:12 <apuimedo> so yeah, it should be a follow-up patch
14:55:53 <apuimedo> anybody's got anything about kuryr-kubernetes?
14:56:20 <ivc_> apuimedo ltomasbo thats a rather trivial change technically, but i'm not yet certain if floating ip is the right fit for external IP
14:57:11 <apuimedo> ivc_: why?
14:57:38 <ltomasbo> I think it is something you can add to the VIP in lbaas, so that the loadbalancer can get reached from outside
14:58:09 <ivc_> ltomasbo apuimedo because there's also 'loadbalancer' type service
14:58:58 <ivc_> my understanding was that external ip (https://kubernetes.io/docs/user-guide/services/#external-ips) from k8s point of view is a an IP configured on the node's interface
14:59:04 <sigmavirus> apuimedo: y'all wrapping up?
14:59:11 <apuimedo> sigmavirus: we are
14:59:15 <apuimedo> sorry about that
14:59:21 <sigmavirus> No problem :)
14:59:27 <apuimedo> let's move to the channel
14:59:32 <apuimedo> thank you all for joining
14:59:36 <apuimedo> #endmeeting