22:01:31 <jeblair> #startmeeting zuul
22:01:32 <openstack> Meeting started Mon Oct 23 22:01:31 2017 UTC and is due to finish in 60 minutes.  The chair is jeblair. Information about MeetBot at http://wiki.debian.org/MeetBot.
22:01:33 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
22:01:36 <openstack> The meeting name has been set to 'zuul'
22:02:03 <jeblair> there is no agenda for this meeting -- so let's start by asking if there's anything folks want to talk about :)
22:03:08 <jeblair> i'd like to give folks a quick heads up on some branch-related issues i've started digging into
22:03:15 <fungi> point of order: zuul v3 is awesome
22:03:26 <pabelanger> yes it is!
22:03:39 <Shrews> i have a topic
22:04:03 <jeblair> Shrews: what's that?
22:04:08 <jeblair> and maybe we should give the issues etherpad a quick once-over?
22:04:18 <Shrews> i've been working on migrating the nodepool jobs (https://review.openstack.org/#/q/topic:nodepool-migration+status:open). all of those (except the WIP) could use reviews
22:04:30 <jeblair> Shrews: very meta!
22:04:37 <Shrews> so i need some nodepool cores to get active there
22:05:15 <jeblair> okay, if you have other topics, let me know and we'll wing it
22:05:22 <jeblair> #topic zuulv3 issues
22:05:30 <Shrews> jeblair: it's all been very confuzzling trying to get everything in all the projects in the right order, so i sympathize with our users
22:06:09 <jeblair> Shrews: good! we're supposed to do that :)
22:06:51 <jeblair> so first thing, i guess, is that we're still in firefighting mode
22:07:00 <jeblair> #link https://etherpad.openstack.org/p/zuulv3-issues
22:07:10 <jeblair> that etherpad is active
22:07:19 <jeblair> if you have time to jump on issues there, please do so
22:08:08 <jeblair> i think most of the issues under zuul are being worked in some form or other
22:08:33 <jeblair> anyone have any questions about those?
22:08:46 <jeblair> fungi: i left a comment on your sqlreporter patch
22:08:58 <fungi> thanks!
22:09:19 <fungi> it may be the comment to which i already responded
22:09:23 <jeblair> fungi: briefly: i think we can merge that as a quick fix, but there's a slightly larger patch that we should write soon -- before we have tristanC's dashboard
22:09:30 <dmsimard> \o sorry I'm late
22:10:07 <fungi> jeblair: yep, i'm good either way and happy to work on the more correct solution, just looking for pointers from someone with a deeper knowledge of the orm bits
22:10:08 <jeblair> fungi: ah yeah, so i think we'll want a migration to allow nulls
22:10:55 <jeblair> basically, if we do it now, all we need to do is change the column; if we do it later, we'll need to 'update buildset set change=null where change=0' or something.  not too big of a deal.
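A minimal sketch of the backfill jeblair describes, using an in-memory SQLite database in Python. Only the `update buildset set change=null where change=0` statement comes from the discussion; the two-column table layout and the sample rows are assumptions for illustration, not Zuul's actual schema or migration.

```python
import sqlite3

# Hypothetical, simplified stand-in for the sql reporter's buildset table.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE buildset (id INTEGER PRIMARY KEY, "change" INTEGER)')
conn.executemany(
    'INSERT INTO buildset (id, "change") VALUES (?, ?)',
    [(1, 1001), (2, 0), (3, 0)],  # assumption: 0 was stored where no change exists
)

# The "do it later" case: turn the placeholder zeros into real NULLs.
cur = conn.execute('UPDATE buildset SET "change" = NULL WHERE "change" = 0')
backfilled = cur.rowcount

rows = conn.execute('SELECT id, "change" FROM buildset ORDER BY id').fetchall()
```

If the column is made nullable before any placeholder values are written, the UPDATE step is unnecessary, which is the point jeblair makes about doing the migration now rather than later.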
22:11:42 <fungi> i also added a related issue to the pad
22:12:00 <dmsimard> I guess we will need to unfreeze the v2 files to let https://review.openstack.org/#/c/507180/ merge
22:12:02 <fungi> may be helpful if one reporter breaking doesn't cause zuul to skip other reporters
22:12:08 <jeblair> fungi: and yeah, i don't think there's any work in progress to make the reporters more idempotent.
22:13:03 <jeblair> i think the only trick there is what to do if gerrit fails to merge the change.  if we make them idempotent, does that impact any of the subsequent reporters?
22:13:33 <jeblair> (currently, if gerrit fails to merge, no other reporters run)
22:13:52 <jeblair> my inclination would be to make them idempotent.  we can't do a two-phase commit across them, so no use pretending we do.  :)
22:14:03 * clarkb wanders by late
22:14:06 <fungi> ahh, yeah, seems like reporters which also merge changes are a slightly different class than those which just provide data
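The idea of not letting one broken reporter block the others can be sketched roughly as follows. This is a hypothetical illustration in Python, not Zuul's reporter code: `run_reporters` and the reporter callables are made-up names, and the sketch deliberately ignores the open question of what should happen when the gerrit merge itself fails.

```python
def run_reporters(reporters, item):
    """Call every reporter, collecting failures instead of stopping at the first."""
    failures = []
    for name, report in reporters:
        try:
            report(item)
        except Exception as exc:  # one failing reporter must not skip the rest
            failures.append((name, exc))
    return failures


# Hypothetical reporters: the sql one breaks, the gerrit one still runs.
delivered = []


def sql_report(item):
    raise RuntimeError("column 'change' cannot be null")


def gerrit_report(item):
    delivered.append(item)


failures = run_reporters(
    [("sql", sql_report), ("gerrit", gerrit_report)], "buildset-1"
)
```

Collecting the failures rather than raising keeps the reporters independent of one another, in line with the remark that a two-phase commit across them is not possible anyway.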
22:14:42 <jeblair> fungi: is the failing proposal a zuul issue or job issue?
22:15:05 <fungi> i put it under zuul since the patch will be to zuul's codebase
22:15:22 <fungi> oh, wait, failing proposals
22:15:31 <jeblair> http://logs.openstack.org/periodic/git.openstack.org/openstack/requirements/stable/newton/propose-updates/a515bba/job-output.txt.gz
22:15:33 <jeblair> that link ^
22:15:34 <fungi> we've switched subtopics, sorry ;)
22:15:52 <fungi> i saw that on the pad and am trying to regain my former context there
22:16:47 <clarkb> that's a job issue using ZUUL_REFNAME right?
22:16:54 <clarkb> (which is not something we'll continue to provide in v3)
22:17:16 <fungi> oh, right, this issue got somehow split from the refname details
22:17:30 <jeblair> clarkb: i think zuul_refname is sometimes provided by the legacy filter
22:18:03 <clarkb> iirc the comment in the filter says it is intentionally omitted but unsure if 100% of the time
22:18:33 <jeblair> clarkb: that's zuul_ref.  refname should be there for non-change items.
22:18:39 <clarkb> oh right
22:19:02 <clarkb> this is the thing where we had to set refname explicitly on old v2 periodic jobs
22:19:03 <fungi> so... for periodic pipeline jobs i don't think v2 provided one at all
22:19:15 <clarkb> fungi: correct, it was hardcoded on the job before iirc since we had a job per branch
22:19:23 <fungi> and, yeah, i think this has since been solved?
22:19:35 <clarkb> but I want to say that was lost in the conversion process. And ya I want to say the job was updated ?
22:19:51 <clarkb> to just use what the checkout is rather than try and checkout in the job body itself?
22:20:05 <jeblair> "propose-updates" should, by virtue of not having the word legacy in it, not be using ZUUL_ vars
22:20:22 <jeblair> so hopefully things are as clarkb says
22:20:29 <clarkb> jeblair: ya it's in the script it runs so the jjb bits didn't but then the script in jenkins/scripts did/does
22:20:33 <pabelanger> yah, are they using parent: base?
22:20:35 * clarkb looks to see if that was cleaned
22:20:54 <fungi> right, the issue was within the scripts it runs; it was retained in the script for backward-compat in case of a rollback (we hadn't decided against rolling back yet at that point)
22:21:38 <clarkb> # Zuul v3 adds refs/heads, remove that to get the branch
22:21:40 <clarkb> then
22:21:46 <clarkb> ZUUL_REFNAME=${ZUUL_REFNAME#refs/heads/}
22:22:42 <clarkb> ya I think this was addressed based on the git log
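For readers less familiar with the shell expansion clarkb quotes: `${ZUUL_REFNAME#refs/heads/}` removes a leading `refs/heads/` if present and otherwise leaves the value alone. A rough Python equivalent, with `branch_from_refname` as a made-up helper name rather than anything from the script:

```python
def branch_from_refname(refname, prefix="refs/heads/"):
    """Mimic ${VAR#refs/heads/}: drop the prefix only when it leads the string."""
    if refname.startswith(prefix):
        return refname[len(prefix):]
    return refname


stripped = branch_from_refname("refs/heads/stable/newton")
unchanged = branch_from_refname("stable/newton")
```

Both calls yield the bare branch name, so the same code copes with a value that already has the prefix removed.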
22:22:53 <fungi> the job is getting past that point now, so i think it can be updated to whatever the new failure reason is or moved to the fixed list and a new issue added for the new failure mode
22:22:57 <fungi> #link http://logs.openstack.org/periodic/git.openstack.org/openstack/requirements/stable/newton/propose-updates/35f37e8/job-output.txt.gz
22:23:01 <fungi> that's from today's run
22:23:02 <jeblair> oh. it's an old job masquerading as a new one.
22:23:04 <clarkb> fungi: yup I agree
22:23:06 <pabelanger> ya, propose-update jobs parent to base, we should likely change that to legacy-base if still using zuul-cloner
22:23:24 <jeblair> if it's not using legacy-base, how's it getting ZUUL_REFNAME?
22:23:34 <clarkb> jeblair: it is populating it itself
22:24:02 <jeblair> from?
22:24:10 <fungi> also, and perhaps related (or perhaps not?) this looks like the job to generate constraints updates, but we're not supposed to run that for any branch besides master because we freeze constraints on stable branches
22:24:28 <jeblair> fungi: i think AJaeger had a patch to fix that today
22:24:31 <clarkb> jeblair: not sure yet  but where I am reading it it is passed as a cli arg to the script
22:25:01 <clarkb> playbooks/proposal/propose-updates.yaml:      command: "{{ ansible_user_dir }}/scripts/propose_update.sh {{ update_target }} {{zuul.branch}}"
22:25:09 <fungi> #link http://git.openstack.org/cgit/openstack-infra/project-config/tree/jenkins/scripts/propose_update.sh#n91
22:25:13 <fungi> passes it in there
22:25:23 <jeblair> ah, ok.
22:25:30 <jeblair> so yeah, that should be pretty v3-safe.
22:25:44 <pabelanger> cool
22:26:00 <jeblair> and now that we've decided to stop maintaining backwards compat, we can remove the ZUUL_REFNAME references from the script
22:26:18 <pabelanger> my main concern is when we move forward with https://review.openstack.org/513506/ if that job was still using zuul-cloner too
22:26:26 <pabelanger> but, looks to be okay also
22:26:48 <fungi> jeblair: agreed
22:26:50 <jeblair> pabelanger: we should probably only merge that change after we remove zuul-cloner from images
22:27:06 <clarkb> fungi: and ya current issue looks like update-constraints has moved on from newton and grown some new flags that we should just avoid using against stable branches
22:27:28 <jeblair> pabelanger: otherwise, i think it will have the opposite effect (zuul-cloner will work "better" because it doesn't "need required-projects" any more) :)
22:28:09 <pabelanger> sure, I'm trying to avoid having jobs move to parent: base while using zuul-cloner
22:28:39 <jeblair> pabelanger: i understand.  that change won't stop that until we remove zuul-cloner from images.
22:28:45 <pabelanger> when do you think we'd remove zuul-cloner
22:28:59 <pabelanger> (from images)
22:29:07 <jeblair> pabelanger: as soon as someone writes that change, i would think.  now that we've decided not to roll back.
22:29:12 <dmsimard> The shim?
22:29:16 <pabelanger> jeblair: ack
22:29:29 <clarkb> dmsimard: the actual installation that the shim then goes in over the top of
22:29:36 <dmsimard> Ah, okay.
22:30:03 <jeblair> so if you're a legacy job, you still get the shim.  if you're not, there is no zuul-cloner.
22:30:09 <jeblair> only zuul.
22:30:21 <dmsimard> :)
22:30:21 <pabelanger> ++
22:30:39 <fungi> ;)
22:30:45 <jeblair> okay, anything else from the etherpad jump out at anyone?
22:31:23 <jeblair> #topic nodepool jobs
22:31:24 <clarkb> do we still need to restart executors for latest ssl fix?
22:31:30 <jeblair> #undo
22:31:30 <openstack> Removing item from minutes: #topic nodepool jobs
22:31:31 <dmsimard> On my end, mostly the ara wsgi thing. I think that would easily allow us to have 1) ara reports all the time 2) at a much lower cost
22:31:46 <fungi> clarkb: yes, i can probably knock out some new executor restarts this evening if that has merged now
22:31:49 <jeblair> clarkb: i think we do, yes.
22:31:57 <clarkb> ok let me know if I can help with those
22:32:12 <fungi> you can most certainly help. i'm merely volunteering ;)
22:32:22 <fungi> we can split them up or whatever
22:32:26 <clarkb> I've ended up down the bwrap rabbit hole today due to that too fwiw :P
22:32:28 <jeblair> dmsimard: do we need any new ara releases for that?
22:32:54 <dmsimard> jeblair: nope. Just a wsgi script which, pending a new release, is in-tree in puppet-openstackci
22:32:55 <fungi> i'm still helping hammer on the release jobs (latest fun is tag-releases can't auth to lp when it wants to add bug comments)
22:33:37 <dmsimard> The wsgi script will be bundled in ara on the next release so we can just stop carrying it.
22:34:04 <dmsimard> Oh, on that note, Ansible 2.4.1 should be out this Wednesday.
22:34:19 <dmsimard> Along with the new release of ara to support 2.4.1 properly.
22:34:37 <jeblair> dmsimard: cool.  i'm somewhat inclined to defer that until after we manage to release openstack.  we have delayed the release, and it may be best to avoid risking further disruption for something not immediately on fire right now.
22:34:46 <jeblair> dmsimard: hopefully that's not long though.  :)
22:34:59 <dmsimard> Sure
22:35:28 <dmsimard> reviews in the meantime are appreciated, even if we don't deploy it yet
22:35:32 <jeblair> (i totally want ara back though, i have missed it.  :)
22:35:40 <clarkb> ++ re getting release working first
22:36:41 <jeblair> #topic nodepool jobs
22:37:03 <jeblair> #link please review https://review.openstack.org/#/q/topic:nodepool-migration+status:open
22:37:20 <jeblair> Shrews: what issues did you run into when doing that?
22:38:21 <Shrews> jeblair: first and foremost was understanding the PTI policy and the reasoning behind it. but i understand that now
22:39:21 <jeblair> cool.  i think there are maybe some grey areas there still... i was thinking of bringing it up at tomorrow's infra meeting.
22:39:46 <Shrews> jeblair: the other was understanding the job variants and disabling a template job for certain branches.
22:40:14 <Shrews> jeblair: yeah, i've seen others have the same confusion in #-infra
22:40:58 <jeblair> yeah, i think that's squarely in the grey area.  basically: where should a project disable or alter a PTI job.
22:43:01 <Shrews> thanks to pabelanger, all changes to project-config to remove the problematic templates have merged
22:43:10 <Shrews> https://review.openstack.org/512637 implements the py27 jobs directly
22:43:33 <Shrews> https://review.openstack.org/513766 is for the feature/zuulv3 branch (and py35 jobs)
22:44:22 <Shrews> we should get those merged ASAP to have py27 and py35 jobs running
22:44:50 <jeblair> we should probably dust off the zuul-nodepool integration job when we have a few mins too.
22:45:13 <Shrews> jeblair: yeah. i want to move to the non-legacy devstack job too. but first things first
22:45:16 <clarkb> are no jobs running now because we are in an in between state?
22:45:29 <jeblair> Shrews: ++
22:45:30 <Shrews> clarkb: dsvm jobs are, but yeah
22:45:48 * clarkb makes note to review those after the meeting
22:45:58 <Shrews> clarkb: many thanks!
22:46:43 <jeblair> #topic stable branch issues
22:47:16 <jeblair> in tracking down some of the issues on the etherpad, i realized there are some issues with zuul's configuration system and stable branches
22:47:33 <jeblair> at the end of last week, i wasn't even in a position to articulate what the issues were
22:48:00 <jeblair> but i took a moleskine with me on a hike this weekend to try to work through it
22:48:37 * dmsimard googles moleskine
22:49:08 <jeblair> and i think i've got a handle on them.  i think there are 5 related problems, and they all pretty much need to be solved simultaneously.
22:49:13 <fungi> dead-tree notebook
22:49:39 <jeblair> i think i also just about have solutions to them as well
22:49:45 <Shrews> dmsimard: jeblair is old school. you should ask to see his camera sometime  :)
22:49:53 <jeblair> or at least, a first pass at a solution
22:50:20 <dmsimard> Shrews: it's alright, no problem with that. English is not my first language so sometimes there's one of those new words... :)
22:50:30 * Shrews hopes jeblair got some good photos on the hike
22:50:54 <jeblair> my hope is to finish working through this, and then describe the problems along with some proposed solutions
22:51:43 <dmsimard> You're keeping us in suspense
22:51:45 <jeblair> i've dug into this now, because i think as soon as people really try to use some of the branch stuff in earnest, we're going to hit problems that don't currently have good solutions.
22:52:06 <dmsimard> Telling us there's a problem, that you have a solution but no details :(
22:52:23 <jeblair> i don't have a solution.  i think i'm close to having a solution.
22:52:38 <jeblair> i could describe everything now, but i feel like i'd be wasting people's time.
22:53:06 <dmsimard> Ok, so, anything to look out for ?
22:53:29 <jeblair> basically, i think i should at least be able to describe the problems fully before i waste anyone else's time.
22:54:17 <jeblair> but i wanted to mention it so that folks know there's some undesirable behavior around stable branches
22:54:31 <jeblair> and if you see any issues related to that, send them to me
22:54:43 <dmsimard> Ack.
22:54:47 <Shrews> i was not aware of issues in that area, but will keep an eye out now. thx for the info
22:55:00 <jeblair> i'm hoping i'll be able to write up my findings by tomorrow.
22:55:22 <jeblair> in like, legible electronic form.  not what's in the moleskine.
22:55:47 <fungi> the margin of your moleskine is too narrow to contain the solution?
22:55:48 <pabelanger> would be interesting to see them however :)
22:55:53 <clarkb> you could send copies via snail mail
22:56:45 <jeblair> addressed to clark boylan, c/o convention center, sydney, nsw, australia
22:57:01 <clarkb> better send it air mail
22:57:11 <jeblair> #topic even more open discussion
22:57:20 <jeblair> anything else?
22:57:44 <clarkb> I'm gonna pop out for a bit after meeting then will be back to review nodepool job changes and help with executor restarts
22:57:45 <pabelanger> was going to ask about powering down zuulv2 servers, but maybe that is for tomorrow infra meeting
22:58:08 <jeblair> pabelanger: yeah, let's check in then
22:58:08 <fungi> i missed an opportunity to exercise my latin: hanc marginis exiguitas non caperet
22:58:23 <pabelanger> ++
22:58:29 <dmsimard> oh, open floor
22:59:02 <dmsimard> pabelanger and I this morning hosted an informal Q&A/ask me anything on TripleO and Zuul v3 with about two dozen developers involved in TripleO and its CI
22:59:25 <jeblair> dmsimard: nice!  you seem to have survived?
22:59:37 <dmsimard> happy to report that we got a good turnout and I think we managed to fend off some worries/frustrations with v3 and told them it was awesome
22:59:56 <dmsimard> it was recorded (red hat bluejeans) and shared with the folks that couldn't attend
23:00:00 <clarkb> would it be helpful to properly bubble that feedback upstream?
23:00:27 <jeblair> yeah, even if it's stuff we know, would still be good to know what the initial roadblocks are for folks.
23:00:27 <dmsimard> there's an unorganized etherpad https://etherpad.openstack.org/p/migrating-tripleo-zuulv3
23:00:32 <clarkb> (storyboard stories or whatever)
23:00:50 <dmsimard> It was more about education than missing features or blockers, most of them had not even been interested in v3 at all
23:01:19 <fungi> which makes sense if they were also not especially interested in v2
23:01:26 <jeblair> general interest in zuulv3 is a recent phenomenon.  :)
23:01:29 <dmsimard> The topic of artifacts did come up and I very briefly discussed that with jeblair
23:02:07 <dmsimard> I think the moment we were able to convince them that v3 is able to make their lives better and easier, you could see the sparkles in their eyes
23:02:15 <fungi> "artifacts" as in being able to pass build artifacts from one job to another?
23:02:20 <dmsimard> fungi: yes
23:02:24 <pabelanger> Yah, could be turned into a FAQ for sure
23:02:25 <fungi> cool
23:02:28 <jeblair> dmsimard: nice! thanks.  i'll give the etherpad a once over.
23:02:33 <jeblair> i think we're a bit over time...
23:02:36 <jeblair> so let's
23:02:38 <jeblair> #endmeeting