22:03:13 <armax> #startmeeting neutron_drivers
22:03:13 <openstack> Meeting started Thu Sep 22 22:03:13 2016 UTC and is due to finish in 60 minutes.  The chair is armax. Information about MeetBot at http://wiki.debian.org/MeetBot.
22:03:14 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
22:03:16 <openstack> The meeting name has been set to 'neutron_drivers'
22:03:24 <armax> if worst comes to worst we can make this meeting super brief
22:03:25 <HenryG> o/
22:03:31 <armax> I wanted to discuss the RC2 backlog
22:03:36 <amotoki> hi
22:03:38 <armax> it looks like we’re pretty close to squashing it
22:03:42 <armax> amotoki: hi!
22:03:58 <armax> you guys ready?
22:04:03 <armax> #link https://launchpad.net/neutron/+milestone/newton-rc2
22:04:07 <HenryG> YES
22:04:10 <armax> YES?
22:04:11 <armax> good
22:04:19 <armax> HenryG: are you sure?
22:04:24 <HenryG> NO
22:04:32 <kevinbenton> HI
22:04:40 <armax> NO SHOUTING PLEASE
22:04:47 <ihrachys> I don't see Inessa's port calculation fix in. was it deferred to O?
22:05:00 <armax> ihrachys: you’re jumping the gun
22:05:04 <armax> ihrachys: go back to your place
22:05:17 <armax> ihrachys: I ask politely
22:05:20 <armax> mind you
22:05:32 <ihrachys> I see!
22:05:35 <armax> you see?
22:05:36 <armax> good
22:05:47 <armax> just making sure we all know who rules here
22:05:55 <armax> and that’d be me
22:06:10 <armax> ok
22:06:14 <armax> jokes aside
22:06:33 <armax> let’s dive in
22:06:44 <armax> the provisional deadline for RC2 is Sep 29
22:06:58 <armax> so ihrachys and I will aim to have everything in tip top shape by the 27
22:07:05 <armax> so 27, we’ll cut RC2
22:07:17 <armax> that gives us 48 hours for the clusterfudge of the last minute
22:07:28 <armax> fair?
22:07:38 <armax> ihrachys: what do you reckon?
22:07:50 <armax> since you’re the release manager, I am asking you to stand up now
22:08:04 <njohnston> sounds fair to me
22:08:08 <ihrachys> + to what the gentleman said
22:08:15 <armax> ihrachys: thank you sir
22:08:18 <armax> so
22:08:33 <armax> besides, I am on PTO from Sep 28 :)
22:08:49 <ihrachys> so you don't show up on Mon?
22:08:55 <armax> 28 is Wed
22:09:16 <ihrachys> pff, oh well, sorry
22:09:17 <armax> ok, now with that in mind
22:09:25 <armax> ihrachys: it’s ok we know it’s late in your neck of the woods
22:09:41 <armax> let’s go from the lowest priority to the highest
22:10:03 <armax> bug 1537091
22:10:05 <openstack> bug 1537091 in neutron "Prevent the attachment of a subnet to a router" [Wishlist,In progress] https://launchpad.net/bugs/1537091 - Assigned to Mathieu Rohon (mathieu-rohon)
22:10:12 <armax> it’s a low-hanging-fruit ish
22:10:34 <armax> let’s vote on whether we want to work on/review it to have it in
22:10:52 <armax> I think the patch churned enough times and it’s ready-ish
22:10:56 <armax> so I am +1
22:11:24 <armax> +1 to work to allow it in RC2
22:11:43 <armax> if we review it extensively and we don’t like it so be it
22:11:45 <kevinbenton> +1
22:11:48 <armax> ok
22:11:50 <amotoki> +1 from me too. I just commented on variable naming in the review, but it looks good in general.
22:11:51 <armax> HenryG: ?
22:11:57 <armax> ihrachys: ?
22:12:10 <ihrachys> I haven't reviewed the patch; I am also not sure why it's high priority on bgpvpn side (it seems like a lack of validation which is not critical in my worldview). but I don't mind either.
22:12:18 <armax> ok
22:12:25 <armax> let’s keep it targeted
22:12:29 <armax> next
22:12:30 <armax> bug 1625981
22:12:31 <openstack> bug 1625981 in neutron "update response of ML2 doesn't include bumped revision number" [Medium,In progress] https://launchpad.net/bugs/1625981 - Assigned to Kevin Benton (kevinbenton)
22:12:40 <armax> I think kevinbenton wants this
22:12:54 <armax> and assuming he doesn’t screw up again, I am ok with helping him
22:13:07 <kevinbenton> +1. it makes the dhcp agent unable to distinguish between some stale port updates
22:13:09 <kevinbenton> :)
22:13:11 * armax loves kevinbenton
22:13:28 <armax> I reviewed the patches already
22:13:31 <armax> they’re ready-ish
22:13:50 <armax> HenryG, amotoki, ihrachys ?
22:13:51 <ihrachys> +1 on that one, I will review tomorrow.
22:13:56 <armax> cool
22:14:06 <amotoki> +1 from me. i am reviewing it.
22:14:09 <armax> sweet
22:14:11 <armax> next one
22:14:13 <armax> bug 1623953
22:14:14 <openstack> bug 1623953 in neutron "Updating firewall rule that is associated with a policy causes KeyError" [Medium,In progress] https://launchpad.net/bugs/1623953 - Assigned to Sridar Kandaswamy (skandasw)
22:14:20 <armax> njohnston: that’s in your realm
22:14:37 <njohnston> Yes, it's an edge case that SridarK found in testing
22:14:40 <armax> njohnston: I am happy to support you
22:15:00 <armax> amotoki, kevinbenton, HenryG, ihrachys ?
22:15:14 <kevinbenton> this is in fwaas
22:15:15 <kevinbenton> +1
22:15:17 <armax> yes
22:15:18 <njohnston> the fix is simple, and SridarK already has a patch up that includes a test
22:15:28 <armax> njohnston: ok, make sure the fix lands in master
22:15:29 <ihrachys> +1, though who cares about fwaasv2 :P
22:15:32 <armax> and we’ll take care of the backport
22:15:46 <njohnston> I swear I'll read the doc this time! :-)
22:16:00 <armax> njohnston: just press the button
22:16:01 <armax> :)
22:16:01 <amotoki> I am not sure about the impact. njohnston, any idea on priority?
22:16:12 <armax> this early in the release it’s probably the only time when it works
22:16:30 <armax> amotoki: looks like an edge case, but worth nailing down nonetheless
22:16:44 <njohnston> amotoki: SridarK didn't mention how major the edge case was that he found, so I have only his rating of 'medium' to go on
22:16:44 <armax> besides the fwaas gate is bleeding fast
22:16:55 <amotoki> +1 to try to have this in.
22:16:56 <njohnston> armax: bleeding?
22:16:57 <armax> because they don’t test anything
22:17:00 <armax> cough cough
22:17:26 <armax> njohnston: I mean that the fwaas gate is not as cumbersome and job intensive as the neutron one
22:17:43 <njohnston> armax: Ah, I see. :)
22:17:44 <armax> so approving/merging code is usually quick
22:17:53 <armax> besides not being in the integrated gate
22:18:02 <armax> patches are tested in isolation
22:18:14 <armax> and thus free of gate resets induced by other projects
22:18:14 <HenryG> Is the fwaas functional job running yet? Do we know if the models and migrations are in sync?
22:18:15 <armax> anyhow
22:18:19 <armax> HenryG: not yet
22:18:24 <armax> I don’t think
22:18:49 <njohnston> HenryG: Not yet https://review.openstack.org/#/c/359320/
22:18:51 <armax> but that’s a good point, if someone were to look into that and found niggly bits
22:19:17 <armax> I would not mind nailing issues down
22:19:29 <armax> HenryG: didn’t you reconcile the models a while back?
22:19:35 <armax> I don’t recall migrations going in recently
22:19:36 <HenryG> I tried
22:19:40 <armax> but failed?
22:19:49 <HenryG> I don't remember
22:19:50 <njohnston> I have run the model migration test manually a few times in the past week, it worked for me
22:20:01 <HenryG> cool
22:20:02 <armax> HenryG: oh boy
22:20:14 * njohnston will put "it worked for me" on his tombstone
22:20:17 <armax> HenryG: it must have been a failure then
22:20:27 <armax> HenryG: those are the events that the human mind tends to discard
22:20:30 <armax> anyhoo
22:20:33 <armax> let’s move on
22:20:33 <HenryG> I recall I got them working but the job was not up
22:20:40 <armax> bug 1623708
22:20:42 <openstack> bug 1623708 in neutron "OVS trunk management does not tolerate agent failures" [Medium,In progress] https://launchpad.net/bugs/1623708 - Assigned to Armando Migliaccio (armando-migliaccio)
22:20:47 <armax> this one has two changes
22:20:52 <armax> one from me and one from rossella_
22:20:58 <armax> rossella_’s is nearly ready
22:21:18 <armax> mine I just posted it, I need to clean it up a little but I will be done in the next hour or so
22:21:33 <armax> then I need kuba and some other OVS guru to review it
22:21:42 <armax> obviously I’d love it in
22:21:55 <armax> amotoki, ihrachys, kevinbenton, HenryG?
22:22:03 <ihrachys> definitely + on that one
22:22:07 <HenryG> no objection from me
22:22:09 <kevinbenton> +1
22:22:10 <amotoki> no objection
22:22:14 <armax> cool
22:22:22 <armax> I would also like this one
22:22:23 <armax> https://review.openstack.org/#/c/374388/
22:22:30 <armax> but I have to talk some sense into kevinbenton first
22:22:52 <armax> if that doesn’t make it though it’s not the end of the world
22:23:12 <armax> I’ll blame kevinbenton in case people want this
22:23:24 <armax> so,
22:23:40 <armax> next two issues get a little thorny
22:23:48 <armax> bug 1619253
22:23:49 <openstack> bug 1619253 in neutron "Subnet update bumps revision_number for network but does not notify about the change on RPC wire" [Medium,In progress] https://launchpad.net/bugs/1619253 - Assigned to Darren Shaw (dronshaw)
22:23:57 <armax> this has a proposed fix but it’s broken
22:24:00 <armax> I pinged the author
22:24:03 <armax> no response so far
22:24:08 <armax> kevinbenton: seems important
22:24:16 <kevinbenton> no, this can be deferred
22:24:16 <armax> kevinbenton: do you want to spread your magic dust?
22:24:19 <ihrachys> why is it rc2?
22:24:23 <armax> ok
22:24:37 <ihrachys> + to defer, not critical and potentially scary
22:24:42 <kevinbenton> worth a back-port when it does get fixed (assuming not too complex)
22:24:42 <armax> so we want to defer it a priori?
22:24:46 <armax> ok
22:24:52 <armax> amotoki, HenryG you concur?
22:25:19 <HenryG> I don't understand the implications well enough
22:25:22 <armax> ok
22:25:27 <armax> that’s enough for me to take it out of RC2
22:25:29 <amotoki> I haven't understood the whole picture
22:25:31 <kevinbenton> it will only affect things in the future if we depend on using revision numbers to detect out of sync things
22:25:36 <armax> your comments say it all
22:25:38 <armax> done
22:25:47 <armax> ok
22:25:49 <armax> next one
22:25:50 <armax> bug 1622616
22:25:51 <openstack> bug 1622616 in neutron "delete_subnet update_port appears racey with ipam" [High,In progress] https://launchpad.net/bugs/1622616
22:25:56 <armax> this one is well known I guess
22:26:04 <armax> we had a couple of stop gaps in place
22:26:16 <armax> it looks like the gate is back to behaving better
22:26:29 <armax> but we’re still prone to potential issues, afaik
22:26:29 <kevinbenton> i think this can be deferred at this point
22:26:39 <HenryG> Root cause not yet pin-pointed?
22:26:43 <armax> I did put a patch up
22:26:50 <kevinbenton> root cause is concurrent port updates
22:26:53 <armax> kevinbenton and I can try and whip into shape
22:26:54 <ihrachys> HenryG: I think we understand the cause
22:26:56 <kevinbenton> to the same port
22:26:58 <armax> if it looks good we could have it in
22:27:01 <armax> if not so be it
22:27:10 <armax> I am talking about this one
22:27:10 <armax> https://review.openstack.org/#/c/373536/
22:27:19 <ihrachys> armax: post in the bug comments?
22:27:24 <armax> ihrachys: I will
22:27:26 <kevinbenton> which is pretty rare. we just had a bug with DHCP agent racing to update its port with the port update in delete_subnet
22:27:47 <kevinbenton> that patch has a -2 on it
22:27:53 <armax> kevinbenton: duh!
22:27:58 <armax> :)
22:28:07 <armax> can’t merge like that, now can it?
22:28:27 <armax> you and I need to whip it into shape remember
22:28:28 <armax> ?
22:28:29 <ihrachys> also seems like catch is misplaced in that WIP?
22:28:57 <ihrachys> it's in delete_subnet() while should go into update_port()?
22:28:58 <armax> ihrachys: yeah, I literally didn’t give it more than 5 minutes worth of thought but I swear I want to get back to it
22:29:08 <ihrachys> ok :)
22:29:17 <armax> I’ll work on it
22:29:27 <armax> and we can make the judgement call later
22:29:32 <armax> if we defer, not the end of the world
22:29:40 <armax> ok?
22:29:43 <ihrachys> it would be nice to have it in if it's ready. I prefer we keep it in list.
22:29:47 <kevinbenton> rally subnet tests are passing now even with high load
22:30:00 <armax> cool
22:30:01 <armax> ok
22:30:02 <kevinbenton> which is what uncovered this quite a bit before
22:30:05 <armax> we squashed this list
22:30:09 <armax> let’s move on to another list
22:32:23 <armax> #link https://bugs.launchpad.net/neutron/+bugs?field.searchtext=&orderby=-importance&field.status%3Alist=NEW&field.status%3Alist=CONFIRMED&field.status%3Alist=TRIAGED&field.status%3Alist=INPROGRESS&field.status%3Alist=FIXCOMMITTED&field.status%3Alist=INCOMPLETE_WITH_RESPONSE&field.status%3Alist=INCOMPLETE_WITHOUT_RESPONSE&assignee_option=any&field.assignee=&field.bug_reporter=&field.bug_commenter=&field.subscriber=&field.structural_subscriber=&field.milestone%3Alist=79508&field.tag=newton-rc-potential+&field.tags_combinator=ANY&field.has_cve.used=&field.omit_dupes.used=&field.omit_dupes=on&field.affects_me.used=&field.has_patch.used=&field.has_branches.used=&field.has_branches=on&field.has_no_branches.used=&field.has_no_branches=on&field.has_blueprints.used=&field.has_blueprints=on&field.has_no_blueprints.used=&field.has_no_blueprints=on&search=Search
22:32:29 <armax> don’t get scared
22:32:31 <armax> it’s just 3 bugs
22:32:36 <armax> bug 1622002
22:32:38 <openstack> bug 1622002 in neutron "dhcp_release6 can be called when it is not present" [High,In progress] https://launchpad.net/bugs/1622002 - Assigned to Brian Haley (brian-haley)
22:32:47 <armax> this seems thorny
22:33:13 <armax> I don’t think it’s doable
22:33:15 <armax> in time
22:33:18 <kevinbenton> +1 for RC2
22:33:46 <armax> what other things?
22:33:46 <ihrachys> armax: define 'it'
22:33:56 <armax> ihrachys: it’s doable to have it in RC@
22:33:58 <armax> RC2
22:34:10 <ihrachys> armax: depends on what we WANT to see in RC2 for that.
22:34:22 <armax> btw we need to nail the grenade patch if it didn’t already
22:34:30 <armax> ihrachys: what’s your suggestion?
22:34:44 <ihrachys> I may argue it's mostly fine as-is right now
22:35:05 <ihrachys> it fails with a traceback, yes, that's expected when you have your setup broken.
22:35:31 <armax> right, on that basis and assuming we were particularly careful to release-note this
22:35:37 <armax> I think we should be ok if this doesn’t land
22:35:45 <ihrachys> maybe graceful catch of the error is ok, but definitely I discourage going the path of api response
22:36:12 <amotoki> agree with ihar
22:36:17 <armax> ihrachys: agreed, it’s overkill for this type of scenario
22:36:26 <armax> the user should install the damn right tool and move on
22:36:40 <armax> if he/she wants IPv6
22:36:51 <armax> I mean dhcp v6 stateful
22:37:01 <armax> ok, so let’s keep it targeted for O-1?
22:37:11 <armax> i.e. out of RC2?
22:37:19 <ihrachys> yeah, we can backport refining of error handling later
22:37:22 <armax> ok
22:37:32 <armax> next one is bug 1625221
22:37:34 <openstack> bug 1625221 in neutron "Fullstack looses test workers if eventlet's Timeout is raised" [High,In progress] https://launchpad.net/bugs/1625221 - Assigned to Ihar Hrachyshka (ihar-hrachyshka)
22:37:34 <amotoki> +1
22:37:34 <kevinbenton> are we sure this doesn't break?
22:37:50 <armax> kevinbenton: you mean ^?
22:37:56 <armax> the bug I just referenced?
22:38:21 <kevinbenton> no, the dhcp release uncaught exception
22:38:51 <armax> break how?
22:39:38 <kevinbenton> this is going to prevent normal reload of allocations for a network
22:40:00 <kevinbenton> if '_release_unused_leases' leaks an exception
22:40:10 <armax> you mean the patch proposed or the existing code?
22:40:20 <kevinbenton> the existing code
22:40:43 <armax> ok, let’s take this offline and see if there are loose ends
22:40:48 <kevinbenton> ok
22:40:50 <armax> but let’s agree not to do anymore
22:40:52 <armax> than that
22:40:59 <armax> ok?
22:41:23 <kevinbenton> yeah, we need to contain the failure for RC2 IMO
22:41:27 <kevinbenton> logs can be ugly
22:41:30 <armax> ok moving on
22:41:31 <kevinbenton> but it shouldn't interfere
22:41:33 <armax> to bug 1625221
22:41:34 <openstack> bug 1625221 in neutron "Fullstack looses test workers if eventlet's Timeout is raised" [High,In progress] https://launchpad.net/bugs/1625221 - Assigned to Ihar Hrachyshka (ihar-hrachyshka)
22:41:42 <armax> ihrachys: what do you feel?
22:41:52 <armax> this needs more time?
22:42:00 <ihrachys> that one... fullstack is non-voting, also it does not break gate in any way, so...
22:42:11 <ihrachys> it just makes some tests not executed if another failed already
22:42:14 <armax> ok
22:42:19 <ihrachys> I don't think it's rc2 critical
22:42:23 <ihrachys> but it's nice to have
22:42:27 <ihrachys> the fix is invasive though
22:42:33 <ihrachys> so maybe better to wait
22:42:47 <HenryG> it feels like inception
22:42:48 <armax> ok, let’s keep it on the backburner for now
22:43:08 <armax> last one of the O-1+rc-potential pile
22:43:10 <armax> is bug 1611991
22:43:12 <openstack> bug 1611991 in neutron "[ovs firewall] Port masking adds wrong masks in several cases." [High,In progress] https://launchpad.net/bugs/1611991 - Assigned to Inessa Vasilevskaya (ivasilevskaya)
22:43:27 <armax> we have this affecting mitaka as well as newton
22:43:42 <armax> amuller initially suggested to defer
22:43:54 <armax> but I don’t know if sending a defer signal is the right thing to do
22:44:10 <armax> we want people to gain confidence and adopt the driver
22:44:18 <armax> especially when it’s the only driver that works with trunks
22:44:35 <armax> at the same time I wouldn’t want the existing functionality to regress
22:44:39 <armax> so I am a bit on the fence here
22:44:41 <kevinbenton> this should be RC2
22:44:49 <kevinbenton> the existing functionality is busted :)
22:44:54 <armax> only in some cases
22:44:55 <kevinbenton> not much to regress
22:44:56 <armax> not all cases
22:45:03 <armax> depends where the wind blows
22:45:13 <kevinbenton> we have tests for the stuff though
22:45:19 <kevinbenton> we need to merge this!
22:45:22 <kevinbenton> !!!!~!~~!~!
22:45:23 <openstack> kevinbenton: Error: "!!!~!~~!~!" is not a valid command.
22:45:23 <armax> if you want it, commit to it!
22:45:49 <armax> what do other people think/
22:45:50 <armax> ?
22:46:03 <ihrachys> what happened with the idea of reusing Jakub's algorithm to validate the better one?
22:46:16 <armax> no-one is acting on it
22:46:22 <kevinbenton> i can act on it!
22:46:26 <armax> kevinbenton: oh boy
22:46:29 <kevinbenton> armax did just tell me to commit to it
22:46:32 <armax> I smell disaster
22:46:33 <kevinbenton> git commit
22:46:36 <armax> ok
22:46:55 <ihrachys> kevinbenton: do you need a nudge? I give you a nudge!
22:47:05 <armax> kevinbenton: are you gonna commit with diversity in mind?
22:47:19 <armax> ok, then I hear consensus of having this RC2?
22:47:26 <amotoki> rc2 potential looks better.
22:47:27 <ihrachys> + for rc2
22:47:33 <armax> ok
22:47:38 <armax> I’ll add this in a bit
22:47:47 <armax> let’s move on to the next and final list
22:48:02 <armax> and that’s the bugs that are marked potential
22:48:10 <armax> but not vetted yet and hence have no milestone
22:48:21 <armax> bug 1624079
22:48:22 <openstack> bug 1624079 in neutron "KeyError on "subnet_dhcp_ip = subnet_to_interface_ip[subnet.id]"" [High,Confirmed] https://launchpad.net/bugs/1624079
22:48:42 <armax> this might be another of kevinbenton’s screw-ups
22:48:52 <armax> but it needs triaging
22:48:59 <armax> anyone keen on it?
22:49:06 <armax> kevinbenton cough kevinbenton cough?
22:49:33 <kevinbenton> yeah, i can look at the cause of this
22:49:38 <kevinbenton> looks like nobody is quite sure yet
22:49:51 <armax> ok
22:49:55 <armax> let’s reassess offline
22:50:17 <armax> but eyes on it would be good
22:50:28 <armax> so if people could help look into it
22:50:31 <armax> that’d be good
22:50:33 <armax> next one
22:50:35 <armax> bug 1625305
22:50:36 <openstack> bug 1625305 in neutron "neutron-openvswitch-agent is crashing due to KeyError in _restore_local_vlan_map()" [High,New] https://launchpad.net/bugs/1625305
22:50:43 <armax> this is somewhat troubling
22:50:48 <armax> but we don’t have enough to go by
22:51:06 <armax> it does sound scare
22:51:09 <armax> *scary
22:51:13 <ihrachys> I have deja-vu
22:51:24 <ihrachys> we had something similar back in mitaka rc* times :)
22:51:32 <armax> right
22:51:41 <armax> but it doesn’t seem the fix worked for the guy
22:52:48 <kevinbenton> i'm still collecting info
22:52:51 <armax> ok
22:52:57 <kevinbenton> to get a keyerror it seems the ports have to have different net uuids
22:53:14 <armax> I don’t understand enough of this code to judge
22:53:57 <armax> does anyone have an opinion?
22:54:10 <ihrachys> no, I would need to read the code; I will tomorrow.
22:54:13 <armax> ok
22:54:14 <armax> thanks
22:54:22 <armax> last one of this pile
22:54:25 <armax> bug 1625305
22:54:26 <openstack> bug 1625305 in neutron "neutron-openvswitch-agent is crashing due to KeyError in _restore_local_vlan_map()" [High,New] https://launchpad.net/bugs/1625305
22:54:33 <armax> oops
22:54:38 <armax> bug 1626010
22:54:39 <openstack> bug 1626010 in neutron "Connectivity problem on trunk parent with MAC reuse and openvswitch firewall driver" [High,New] https://launchpad.net/bugs/1626010 - Assigned to Jakub Libosvar (libosvar)
22:54:47 <armax> this probably is going to stay untargeted for now
22:55:37 <armax> because we need to understand a bit more how ovs-fw can handle the same mac on different networks
22:56:08 <armax> dougwig: ping
22:56:41 <armax> ok, I have nothing else
22:56:50 <armax> for now
22:57:28 <armax> we have another couple of days to squash these
22:58:02 <armax> thanks everyone for watching stable backports and current fixes
22:58:14 <armax> any last minute comment?
22:58:18 <armax> if not
22:58:54 <armax> looks like not
22:59:00 <armax> #endmeeting