16:00:33 <ihrachys> #startmeeting neutron_ci
16:00:34 <openstack> Meeting started Tue Oct  3 16:00:33 2017 UTC and is due to finish in 60 minutes.  The chair is ihrachys. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:35 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:37 <openstack> The meeting name has been set to 'neutron_ci'
16:01:39 <mlavalle> o/
16:01:44 <ihrachys> before we start, I'd like to mention that I was not very attentive to upstream fallout lately so I may miss crucial things. if so, speak up.
16:01:52 <ihrachys> #topic Action items from prev week
16:02:20 <ihrachys> we had two items for the same
16:02:28 <ihrachys> "ihrachys to report bug for iptables apply failure" and "jlibosva to triage iptables apply failure in linuxbridge scenarios job"
16:02:39 <ihrachys> I am afraid I haven't done the job, but let me check
16:02:49 <jlibosva> I haven't found time to look at it
16:02:59 <ihrachys> oh I actually did, wow https://bugs.launchpad.net/neutron/+bug/1719711
16:03:01 <openstack> Launchpad bug 1719711 in neutron "iptables failed to apply when binding a port with AGENT.debug_iptables_rules enabled" [High,Confirmed]
16:03:17 <ihrachys> my memory bank is not long enough it seems.
16:03:45 <ihrachys> jlibosva, will you? or we should find someone else?
16:04:26 <jlibosva> I'd love to look at it but I was busy with some other things lately ..
16:04:42 <jlibosva> I was also two days off last week so that's my excuse :)
16:04:53 <haleyb> ihrachys: i can look, just didn't have time this past week, but since i have another iptables issue on my plate i will have that part of my brain swapped-in
16:04:58 <ihrachys> we are not to blame here :)
16:05:13 <ihrachys> jlibosva, sounds fair if we pass the cake to haleyb ?
16:05:19 <jlibosva> sure
16:05:23 <jlibosva> thanks haleyb :)
16:05:35 <ihrachys> ok, assigned to haleyb
16:05:38 <ihrachys> haleyb++
16:05:53 <ihrachys> #topic Grafana
16:06:05 <ihrachys> grafana is dead: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
16:06:10 <ihrachys> no data points
16:06:15 <ihrachys> probably a fallout of zuulv3 switch
16:06:30 <ihrachys> haleyb, I remember you asked about it. was there any progress after that to get it back?
16:06:38 <ihrachys> any patches to chew?
16:07:17 <haleyb> ihrachys: no, that was near end of day here, but i could take a look.  this zuulv3 change was not as clean as i expected
16:07:36 <ihrachys> haleyb, I suspect it's because job names changed
16:07:44 <ihrachys> maybe our board was not updated with new
16:07:54 <haleyb> ihrachys: yes, there's a lot of legacy-* now
16:08:20 <ihrachys> yeah, no 'legacy' matches in grafana/neutron.yaml
16:08:30 <ihrachys> #action haleyb to update grafana board with new job names
16:08:48 <ihrachys> I hope this part of the repo is still fresh and we don't need to learn more new ways
16:09:00 <haleyb> who is going to update all the jobs... :(
16:09:18 <ihrachys> they can live as legacy for a while
16:09:37 <ihrachys> the main problem is that we now are on the hook to migrate them if we need improvements/new jobs...
16:09:47 <ihrachys> a lot of patches were caught in flight
16:10:05 <ihrachys> mlavalle, are you aware of anyone working on migration to new job format?
16:10:28 <mlavalle> no
16:10:36 <boden> I might wait to see what happens with zuul v3 before doing too much work.. based on the latest ML thread there are a lot of problems and folks are starting to talk about a revert if we can’t get the gates healthy with v3
16:10:58 <mlavalle> I pinged back the person who asked about it on Friday
16:11:06 <mlavalle> but didn't get back to me
16:11:06 <ihrachys> boden, wow that's harsh
16:11:43 <boden> http://lists.openstack.org/pipermail/openstack-dev/2017-October/123022.html
16:11:52 <boden> well it has been pretty disruptive
16:12:09 <boden> neutron-lib gate is on the floor and there are also issues with neutron gate
16:13:22 <ihrachys> yeah, I see patches failing with POST_FAILUREs
16:13:29 <ihrachys> I thought we were past that?
16:13:34 <boden> there are other problems
16:13:37 <ihrachys> apparently more bits of the puzzle were deployed
16:14:02 <ihrachys> boden, do we have a list of grievances on our side?
16:14:16 <boden> I’ve been adding them here https://etherpad.openstack.org/p/zuulv3-migration-faq
16:14:44 <boden> so far for neutron I only noticed the legacy releasenotes job busted… but it's busted across the board best I can tell
16:14:50 <boden> neutron-lib is a different story
16:15:27 <armax> do we have a list of neutron related jobs that are known to be unstable/broken?
16:16:16 <boden> the only list I have is that faq… but it's hard to tell right now b/c there are random POST_FAILURES that are not related to gate “job logic” best I can tell
16:16:42 <armax> shall we attempt to focus on one pipeline at a time? there might be common problems and once we have identified those it’s easier to do a sweep across the board?
16:17:01 <ihrachys> yeah, that would be nice to have something neutron specific. I think we could do a quick triage of failures in our gates based on late patches and have a list that we would then run against what's in faq, and if smth is not there, escalate it to infra
16:17:01 <armax> perhaps the neutron-lib pipeline is easier to bring back to sanity?
16:17:20 <boden> armax: TBH I think it has more problems than neutron
16:17:21 <armax> then go to neutron and the other networking-* projects?
16:17:27 <armax> boden: even better :)
16:17:36 <boden> I’d say get neutron working 1st
16:17:46 <ihrachys> ok
16:17:47 <boden> I know for sure the legacy releasenotes is busted
16:17:49 <ihrachys> let's start the pad
16:17:57 <ihrachys> #link https://etherpad.openstack.org/p/neutron-zuulv3-grievances Etherpad for zuulv3 grievances
16:18:20 <boden> why not just add to https://etherpad.openstack.org/p/zuulv3-migration-faq so everyone knows the issues
16:18:26 <boden> other people might have similar problems
16:18:35 <boden> other people = other projects
16:18:47 <ihrachys> I think it makes sense for ourselves to understand what the fallout is and then cross-match with what they already track
16:18:55 <boden> cool
16:18:57 <mlavalle> yeah
16:19:03 <ihrachys> I don't intend to have it forever, just to classify and pass over
16:19:43 <ihrachys> we have neutron, -lib, and client + stable branches to classify
16:20:08 <armax> what if we have a liaison for each of these areas, tasked to report the status of the gate by EOB?
16:20:12 <boden> FYI: I did land this patch to try and fix a lib gate job: https://review.openstack.org/#/c/508945/
16:20:16 <ihrachys> we could split those right now and work the next 24h on getting the full picture?
16:20:23 <armax> then perhaps we can have an ad hoc sync-up tomorrow to see where we are?
16:20:26 <ihrachys> armax, yeah +
16:20:41 <armax> I can take neutron-lib
16:20:54 <ihrachys> I take stable
16:21:07 <ihrachys> all of them
16:21:12 <armax> OK
16:21:30 <mlavalle> I'll take Neutron
16:21:31 <ihrachys> who's on neutron?
16:21:34 <ihrachys> great
16:21:37 <armax> who wants to take the neutron- and networking- ones?
16:21:46 <ihrachys> team team team team. if you can't work as a team...
16:22:49 <ihrachys> mlavalle, I put your name in the pad
16:22:54 <boden> I dont mind to help, problem is I’m nearly gate illiterate
16:22:55 <mlavalle> ++
16:22:56 <ihrachys> more volunteers for the rest?
16:23:31 <haleyb> i can look at client
16:23:33 <armax> I’ll have a look at the periodic runs if I have time
16:23:41 <armax> haleyb: take it off of me then
16:23:43 <ihrachys> haleyb, check the list of not assigned in the pad
16:23:59 <armax> I added myself but happy to hand it over :)
16:24:12 * haleyb takes a step back :)
16:24:34 <armax> chicken
16:24:38 <armax> :)
16:24:42 <ihrachys> ok. if nothing else, I am not too nervous about networking- / neutron- / periodic at this point
16:24:53 <ihrachys> it's on subteams (except periodic that doesn't block)
16:25:03 <armax> so for now we don’t worry about the non-voting jobs, right?
16:25:06 <ihrachys> thanks to everyone who is not a chicken
16:25:08 <ihrachys> :)
16:25:19 <ihrachys> armax, yeah, goal is unblock gate
16:25:24 * mlavalle is a chicken but is trying to hide it
16:25:29 <ihrachys> we will review the gate in a week for others
16:25:52 <armax> I think let’s try to figure out the grafana black hole asap
16:25:57 <armax> we can’t fix what we can’t see
16:25:58 <haleyb> we're more like squirrels who all got run over by the zuulv3 bus, and it's backing-up now
16:26:25 <haleyb> armax: i was going to look at grafana, most likely just needs update to job names
16:26:43 <armax> haleyb: true, but we’ll have to move away from the legacy- prefix sooner or later
16:26:59 <ihrachys> that's after we are back on our legs
16:27:00 * jlibosva is also chicken
16:27:02 <armax> unless we want to create a legacy dashboard
16:27:20 <armax> either way
16:27:32 <haleyb> armax: i hope we don't have to do that
16:27:41 <ihrachys> I don't think infra is in a position to force the switch through further
16:27:47 <armax> great, I get a status.json: Proxy Error when looking at http://zuulv3.openstack.org/
16:28:09 <armax> ihrachys: agreed, but I loathe to see ‘legacy’ everywhere :)
16:28:12 <ihrachys> haleyb, I don't think that's a realistic hope
16:28:35 <haleyb> ihrachys: you mean not having a legacy dash?
16:28:40 <armax> I hope the translation is pretty straightforward
16:28:43 <armax> yeah
16:28:43 <ihrachys> haleyb, not having to do it
16:28:54 <ihrachys> armax, it's a switch to ansible, man
16:29:07 <ihrachys> well, we could call a bash script from there I guess?
16:29:22 <armax> ihrachys: rock. on.
16:29:27 <haleyb> i will see what other dashboard changes landed recently i guess
16:29:41 <ihrachys> would probably make sense to move definitions as scripts into neutron tree; then work on ansible switch as needed.
16:30:10 <ihrachys> so excited of all the productive work we are about to do
16:30:15 <ihrachys> meh
16:31:04 <ihrachys> I don't think we have anything to discuss for grafana or gate without data, so let's move on
16:31:06 <ihrachys> #topic Bugs
16:31:16 <ihrachys> https://bugs.launchpad.net/neutron/+bugs?field.tag=gate-failure
16:31:44 <ihrachys> I don't see anything new in the list except the iptables apply issue that haleyb will look at
16:31:55 <ihrachys> so we can focus on gate for the most part.
16:32:04 <mlavalle> great
16:32:33 <ihrachys> #topic Fullstack
16:32:56 <ihrachys> I am not sure there is a lot of reason to discuss fullstack or scenarios at the point where we are right now. thoughts?
16:33:09 <mlavalle> agree
16:33:27 <ihrachys> ok
16:33:32 <armax> it’s like discussing that the house is dirty when the roof is on fire
16:33:39 <ihrachys> #topic Open discussion
16:33:54 <ihrachys> anything critical to discuss that is more important than putting the fire out?
16:34:07 <armax> back to the zuulv3 topic any idea what to do with node_failure errors?
16:34:20 <armax> it seems like infra is fighting stability issues of their own?
16:34:45 <boden> armax: yes, that’s what I said earlier… the infra isn’t even stable enough to test the gate jobs now
16:35:11 <ihrachys> armax, maybe ask them about whether they have it fixed, or when it's going to be fixed
16:35:15 <mlavalle> so maybe talking to them, to get a sense as to where they stand
16:35:34 <ihrachys> maybe they will tell us straight they revert and we don't need to do the work in the first place :)
16:35:36 <armax> OK, I am going to learn a bit about this today so that I can ask intelligent questions
16:36:02 <mlavalle> ihrachys: that's a nice dream
16:36:21 <ihrachys> mlavalle, not really, only means you will go through another round of pain in the future
16:36:32 <mlavalle> that's true
16:36:36 <ihrachys> ok, thanks everyone, let's follow up with infra, and classify. that will be something already.
16:36:39 <ihrachys> keep up
16:36:43 <ihrachys> #endmeeting