22:03:11 <jeblair> #startmeeting zuul
22:03:12 <openstack> Meeting started Mon Aug 14 22:03:11 2017 UTC and is due to finish in 60 minutes.  The chair is jeblair. Information about MeetBot at http://wiki.debian.org/MeetBot.
22:03:13 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
22:03:15 <openstack> The meeting name has been set to 'zuul'
22:03:27 <jeblair> okay, i had hoped more people would be here :(
22:03:32 <fungi> i am
22:03:36 <jeblair> #link agenda https://wiki.openstack.org/wiki/Meetings/Zuul
22:03:37 <fungi> but yeah
22:03:42 <jeblair> you will note that i have cleared the agenda
22:03:45 <jeblair> it has two topics
22:03:52 <jeblair> #topic  What needs to happen before PTG
22:04:05 <jeblair> i am not certain that we will be ready for the ptg cutover
22:04:13 <jeblair> that doesn't mean we won't be
22:04:21 <jeblair> but i would not place any bets on it at this point
22:04:28 <jeblair> we have 20 work days until then
22:04:36 <jeblair> many of us, myself included, have fewer than that
22:04:51 <jeblair> if our pace were like it was 2 weeks ago: no problem, we're fine.
22:04:58 <jeblair> if it's like last week: we're in trouble
22:05:19 <leifmadsen> summer ftl :)
22:05:47 <jeblair> anyway, the point is not to speculate or try to convince ourselves one way or the other, but to try to get on the same page about what needs to be done
22:06:02 <jeblair> and make sure that folks who are able to help know what to do to try to maximize our chances
22:06:18 <jeblair> to that end, here's an etherpad: https://etherpad.openstack.org/p/zuulv3-pre-ptg
22:06:30 * mordred waves
22:06:45 <jeblair> that's just some things i brainstormed quickly
22:07:17 <jeblair> if the stuff in the first section doesn't happen (some sooner than others), then i'm very likely to advocate we call it off.
22:07:32 <jeblair> the second section is what i think we could accomplish if progress is really good
22:07:53 <Shrews> what is "startup time"?
22:07:56 <jeblair> the third section is not going to happen before the ptg, i just couldn't bear to title it that way.
22:08:04 <leifmadsen> re: improved docs -- is that docs for migration, or just overall "how to use zuul" docs?
22:08:04 <SpamapS> o/
22:08:08 <SpamapS> sorry I got distracted coding
22:08:14 <mordred> Shrews: testing zuul startup time with the entire set of migrated job config for openstack
22:08:21 <pabelanger> so, I have a good handle on what we need to do for tarball/publish jobs, devstack-gate is an unknown to me right now. Are we just going to copypaste bash scripts into ansible playbooks like we did with tox job?
22:08:36 <fungi> leifmadsen: migration docs are already in the absolutely section
22:08:39 <jeblair> Shrews: it was one of the risks mordred noted in the announcement.  zuul has to read 2k repos in order to startup.  we have no idea how long that will take.
22:08:47 <fungi> leifmadsen: so i think improved general docs for zuul
22:09:01 <mordred> pabelanger: d-g is in some ways easier since it's already in a repo that can be included in job changes
22:09:14 <leifmadsen> fungi: ok cool, I'm (kinda) working on that as a "I have no fucking clue what I'm doing here!" person approaching zuul with github integration
22:09:39 <jeblair> fungi, leifmadsen: correct.  i think our docs are mostly *accurate* for what openstack users will need, but they could still be improved from a narrative viewpoint.
22:09:55 <jeblair> leifmadsen: and that perspective in particular is great and exactly what we need :)
22:09:58 <leifmadsen> they are very hard to approach for someone landing on zuul and trying to spin something up
22:09:58 <mordred> pabelanger: so I think the idea is to add d-g to v3 then iterate on the ansiblification effort that was already underway and get it to a point where it's decomposed enough that a playbook to drive it works
22:10:04 <fungi> leifmadsen: the migration docs are likely to continue to be a priority after the cut-over as well as we discover things we didn't think to add, but this should at least be the minimal set of stuff we know will come up
22:10:19 <leifmadsen> fungi: that's fine, I have no intention or interest in those :) (re: migration docs)
22:10:51 <SpamapS> you know...
22:10:51 <leifmadsen> I did notice a blog post from tristanC on RDO blog site recently (Aug 1) that might help bootstrap me a bit further as well
22:10:56 <SpamapS> I know we'd like to ansiblify d-g
22:10:58 <fungi> oh, also the migration docs are likely to be of interest to operators of other zuul v2 deployments, i expect
22:11:00 <SpamapS> but is it necessary for the cutover?
22:11:45 <mordred> pabelanger: which means we should be able to start with stupidly simple v3 jobs in the shade repo (shell: run_devstack_gate.sh) and iterate until they're better - but verifying that the changes don't break anything
22:11:53 <jeblair> SpamapS: i think it's necessary for 2 reasons: 1) it will be more difficult to do it after the cutover and 2) i think we need to actually be able to show people what the system is actually for before we cutover.
22:12:05 <pabelanger> mordred: right, the ansiblify effort isn't as much a technical issue, but we do spend a lot of time discussing the design. So if we have another etherpad session like we did for tox and tarball jobs, I think that will allow us to iterate at a faster rate.
22:12:16 <mordred> pabelanger: ++
22:12:19 <fungi> so d-g as our "advanced usage" example
22:12:25 <jeblair> SpamapS: otherwise, we're going to end up with zuulv3 being a *worse* zuul than v2 for most of what openstack uses it for, and our cries of "but no! we imagine it will be great!" will fall on deaf ears
22:12:53 <leifmadsen> so I gotta bail for dinner, but if anyone has interest in doing a bit of training on my side to speed up my knowledge transfer, I'm happy to take notes and write up some documentation, submit patches, etc
22:12:54 <SpamapS> jeblair: I like some of v3's features that have little to do with Ansible.. so I'm not sure (2) speaks to me that loudly. But (1) is huge.
22:12:56 <mordred> pabelanger: I agree. although I do think that d-g is going to be 'easier' since it's way more about doing remote things and less about the harder bits of secrets integration and whatnot
22:12:59 <fungi> also, d-g provides us with a complex use case to help us find the rough edges (by getting cut on them)
22:13:05 <jeblair> leifmadsen: cool, i'll ping you tomorrow
22:13:15 <leifmadsen> I'm pretty close I think to having zuul up and running (I have the first half done), but I don't have the nodepool and job run stuff done
22:13:24 <mordred> leifmadsen: sweet!
22:13:42 <leifmadsen> jeblair: cool, I feel like a bluejeans, showing where I'm at, what I know, and then someone helping to take me to the end would be incredibly useful
22:13:51 <leifmadsen> it would get me to writing docs faster, and more accurately
22:14:01 <jeblair> leifmadsen: can do
22:14:03 <mordred> pabelanger: which is all just me being optimistic- I think we should totally do another session like we did for tarballs/publish
22:14:05 <SpamapS> So would it make sense to stop painting the dark corners of the code base now and see if we can split up the d-g work even more?
22:14:11 <leifmadsen> rock on, bailing now, but ping me tomorrow or Wednesday
22:14:18 * SpamapS may be a week or two behind and that may already be happening ;)
22:14:40 <pabelanger> for d-g, I would people interested maybe get into a sprint channel so we can dedicate some serious time to this? Because, it is hard just waiting for people to review ansible code.
22:14:44 <jeblair> SpamapS: yeah i think so.  i think at the end of last week, i put out the last code related fire that i think was holding us back
22:14:50 <pabelanger> suggest people*
22:14:51 <mordred> ++
22:15:10 <jeblair> so unless another pops up, i will start focusing on these items
22:15:21 <jeblair> maybe let's go through these items together real quick?
22:15:26 <mordred> cool
22:15:35 <jeblair> #topic tarball/publish jobs
22:15:41 <jeblair> i think this can be quick
22:15:50 <jeblair> mordred, pabelanger, and i made an etherpad for this last week
22:15:59 <jeblair> it took us a couple hours
22:16:11 <jeblair> but we ended up with a discrete todo list (it's sorta in the middle of the pad, sorry)
22:16:16 <jeblair> i think that was really helpful
22:16:44 <jeblair> it's now missing a step
22:17:01 <jeblair> as folks would like to see some more verification around secrets before we fully execute it
22:17:42 <jeblair> completion of that may now be sitting behind writing some more unit tests and/or creating test jobs with dummy keys
22:18:43 <jeblair> but aside from that -- does the list look okay?  do we need to reassign any tasks there?  will folks be able to work on that in the next day or so?
22:18:48 <SpamapS> With the short horizon, I like the idea of just etherpadding the tasks and updating people via IRC when one changes things.
22:19:05 <jeblair> #link tarball/publish jobs etherpad https://etherpad.openstack.org/p/mVSVwG4xos
22:19:15 <mordred> jeblair: yah - list looks good to me and with v3 up and going again I'm good to crank on the next bits
22:19:31 <jeblair> okay, next:
22:19:34 <jeblair> #topic devstack jobs
22:19:50 <jeblair> i like the suggestion from earlier that we etherpad this like the other jobs
22:19:55 <mordred> ++
22:20:07 <pabelanger> yes, please
22:20:09 <jeblair> we have a lot of stuff done previously, but also, that was a while ago and things may have changed
22:20:10 <clarkb> one thing to keep in mind with ansibilification of d-g is that final RCs for openstack happen next week. So we'll need to be careful merging changes to d-g (as many projects rely on it for their testing)
22:20:12 <jeblair> so when should we do this?
22:20:49 <jeblair> clarkb: good point.  we should be able to merge additive changes easily
22:21:05 <pabelanger> any time works for me
22:21:06 <clarkb> yup and the vast majority of the changes should also be self testing giving a good degree of confidence in them
22:21:15 <mordred> yup
22:21:21 <fungi> yeah, i think it ought not be too hard to make nondisruptive changes to d-g while working on this
22:21:30 <jeblair> i guess another thing to consider is that we're also probably close enough that we could consider d-g slushy and start building new stuff in parallel.
22:21:55 <jeblair> let me ask a different question first: who's interested in working on this?
22:21:57 <jeblair> o/
22:21:59 <fungi> i'd be cool declaring d-g mostly-frozen similar to what we're starting to do with project-config
22:22:01 <mordred> o/
22:22:07 <pabelanger> o/
22:22:15 <clarkb> I'm happy to continue reviewing the changes, though currently I've been quite distracted by the release :/
22:22:28 <fungi> i'm interested in it but disappearing for the next week-ish
22:22:29 <jeblair> clarkb: that's useful too
22:22:51 <jeblair> fungi: that's good to know :)
22:22:55 <fungi> probably reviewing is something i can squeeze in at odd hours over bad connectivity for the coming week
22:23:26 <fungi> blame eclipses
22:23:33 <SpamapS> o/ for working on it... but more o/ for working on whatever needs doing
22:23:39 <mordred> jeblair: we should also perhaps define how 'done' we want to get - I imagine we're not going to be *fully* done, and auto-converting all of the various d-g cargocults will be tricky - but probably we want to get further than a shell script that just runs safe-devstack-gate.sh yeah?
22:23:40 <jeblair> pabelanger, mordred: would you like to finish the tarball/publish work before moving to d-g? or start our etherpad session for that sooner?
22:24:00 <pabelanger> jeblair: etherpad sooner is better for me
22:24:05 <jeblair> mordred: yeah.  maybe that's an outcome of the etherpad session (hopefully we'll have a better idea of structure then)
22:24:07 <pabelanger> so I can start thinking about it
22:24:08 <mordred> jeblair: ++
22:25:33 <jeblair> mordred, pabelanger: would you like to etherpad devstack jobs tomorrow?  morningish or afternoonish?
22:25:49 <fungi> so the general idea is that hooks are left as raw shell and we'll continue to source settings from the environment and continue using the same name for a gate wrap script as an entrypoint?
22:25:58 <fungi> just trying to understand the overall scope
22:26:08 <pabelanger> jeblair: tomorrow works for me, morningish should be fine too
22:26:11 <jeblair> fungi: let's get back to you on that :)
22:26:14 <fungi> k
22:26:16 <fungi> ;)
22:26:34 <fungi> so not yet hashed out, something else which will come out of the etherpad brainstorming
22:27:43 <fungi> though _if_ we stick to those basics, i expect we can provide a fairly smooth incremental migration for people without having to do it all for them up front
22:28:03 <jeblair> mordred: does 1600 utc work?
22:28:26 <mordred> yup. I'd love it if that works for SpamapS - I think having his brain in the etherpadding would be helpful
22:28:52 <jeblair> SpamapS, clarkb: ^?
22:29:08 <clarkb> I'll be around
22:29:14 <fungi> i'll be around some later tomorrow (maybe 21:00 utc or so) and will gladly review the etherpad you put together
22:29:16 <SpamapS> yes that works
22:29:37 <jeblair> #agreed etherpad brainstorming for devstack-gate jobs tuesday 15 aug 1600 utc
22:29:50 <jeblair> #topic migration script
22:29:50 <SpamapS> and my brain should be fully booted back into zuulOS by then
22:30:03 <jeblair> ++
22:30:09 <jeblair> mordred has taken a stab at this
22:30:19 <jeblair> i haven't looked at it yet
22:30:21 <mordred> oh - don't
22:30:25 <mordred> it's very incomplete
22:30:39 <jeblair> oh, actually, let me say something first
22:30:42 <SpamapS> Rust would be a bad choice for this, yes?
22:30:45 <SpamapS> ;-)
22:30:47 <mordred> :)
22:30:58 <jeblair> i think the two previous topics -- the publishing and devstack jobs are things we need to get done first
22:31:16 <jeblair> because creating jobs is what is driving understanding of pain points and surfacing bugs
22:31:29 <jeblair> all of the stuff i've been sweeping up for the past few weeks directly came out of writing jobs
22:31:43 <SpamapS> +1
22:31:45 <mordred> ++
22:31:53 <jeblair> so we need to have that work done early so we have time to discover those things and fix them
22:32:06 <jeblair> similarly, we need to have it done early so that the migration script has something to output :)
22:32:08 <SpamapS> We don't want to be tempted to paper over those blemishes with the automated script.
22:32:30 <fungi> hear hear
22:32:33 <SpamapS> That's just duplicating v2.5's ansiblizer
22:32:38 <jeblair> ya
22:32:40 <mordred> yup
22:33:03 <SpamapS> That said, this should have a solid foundation, so I think having one person get it to a poorly-not-really-working state now isn't a bad idea.
22:33:05 <clarkb> ya my suggestion was to avoid that for the "core" jobs
22:33:20 <mordred> https://review.openstack.org/#/c/491804 is the one thing in the in-work migration script that's worth mentioning
22:33:22 <clarkb> do it for everything else and point people at the "core" jobs as the example to then go and migrate towards
22:33:23 <pabelanger> mordred: once run, we are doing a git commit and starting iterating on the output?
22:33:39 <jeblair> so this is probably something to start poking at, and as SpamapS said, start building a foundation, but we should plan on having the job work done before we really focus on this
22:33:46 <mordred> jeblair: agree
22:34:19 <jeblair> (unlike almost anything else on the list, if we are still tweaking this the friday before the cutover, i will not be terrified.  like, not thrilled, but not terrified :)
22:34:31 <pabelanger> are we still planning on running against -infra jobs first?
22:34:51 <mordred> clarkb: I think "core" here might be worth expanding - there's a set of stuff where auto-translation is both easy and desirable and a big win (tox jobs are a good example) and others where auto-translation will be next to impossible where we need to do something more like what you're saying
22:35:14 <clarkb> mordred: ya I tried to give examples of the nasty jobs in my email to the list as well
22:35:18 <mordred> yah
22:35:37 <jeblair> pabelanger: i think mordred was looking at having jobs run continuously so we can keep an eye on the output (once it basically works)
22:35:55 <mordred> sorry I haven't responded to that yet - there are a few things we can do in some circumstances I want to highlight - but I broadly think we're on the same page
22:36:14 <mordred> yah - https://review.openstack.org/#/c/491937/1 is a followup to the above which runs the script and saves the output so we can look at it
22:36:48 <mordred> or - it will be, once the "run the migrate script" job works and I don't have it looking for incorrect paths :)
22:37:39 <jeblair> #agreed work on foundation now, but significant work will depend on completion of basic and devstack jobs
22:37:45 <mordred> but I'll start poking folks once that's ready to be looked at for real
22:37:49 <mordred> ++
22:37:58 <jeblair> #topic startup time
22:38:09 <jeblair> sorry this was enigmatic, but i think it got explained earlier
22:38:21 <jeblair> we have no idea how long it will take to start zuul the first time we do it at openstack scale
22:38:39 <jeblair> we can actually preview this by adding all the repos now, even though they don't have zuul.yaml files
22:38:50 <jeblair> that should at least get us order of magnitude info
22:39:03 * SpamapS expects at least one wormhole to form once zuulv3 is fully formed
22:39:34 <jeblair> but in order to get a realistic estimate, we need some more mergers
22:39:58 <jeblair> given at least a few, we can probably extrapolate somewhat
22:40:05 <fungi> this was hinted at in the discussion a few weeks back about testing that zuul doesn't fall over loading configuration at "openstack scale" i guess
22:40:12 <jeblair> ya
22:40:25 <fungi> cool, just making sure i'm connecting the dots
22:40:29 <clarkb> you probably want to include the biggest repos in your subset just so that they skew you conservatively rather than being missed and causing surprises later
22:40:34 <clarkb> nova neutron openstack-manuals etc
22:40:39 <SpamapS> are you more concerned about all the git operations, or python/gear/zk lag?
22:41:20 <jeblair> SpamapS: git ops, gear data serialization, and config parsing
22:41:24 <SpamapS> I guess it has to clone "everything" yeah?
22:41:33 <jeblair> we can test the first with repos without zuul.yaml files
22:41:56 <jeblair> we won't know the rest until we have the final config
22:41:58 <jeblair> oh, quick aside:
22:42:15 <fungi> oh, oof, yeah i hadn't considered the start-up time for cloning all those repos
22:43:15 <jeblair> another reason we need a good corpus of base jobs is that zuulv3's config routines are *much* less efficient than zuulv2's.  that is mitigated by the fact that there should be *much less* configuration than zuulv2 (eg one py35 job instead of 1600).  that's only true if we are able to factor out our most common basic jobs
22:43:49 <mordred> yah. luckily the most common jobs are also the most consistent and similar jobs
22:43:53 <jeblair> ++
22:43:57 <mordred> so there's big wins on each of those
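The "one py35 job instead of 1600" point above can be illustrated with a rough counting sketch. This is not Zuul code: the project names, the "gate-{name}-python35" template and both helper functions are hypothetical, and only the job name py35 and the figure 1600 come from the discussion.

```python
# Illustrative only: project names, the name template and these helpers
# are hypothetical and are not part of Zuul.

def per_project_definitions(projects, template="gate-{name}-python35"):
    """Expanded style: one job definition generated for every project."""
    return [template.format(name=p) for p in projects]


def shared_definitions(projects, job="py35"):
    """Shared style: one job definition that each project refers to by name."""
    definitions = [job]
    attachments = {p: [job] for p in projects}
    return definitions, attachments


# 1600 is the figure quoted in the meeting; the names are made up.
projects = ["project-%04d" % i for i in range(1600)]

expanded = per_project_definitions(projects)
definitions, attachments = shared_definitions(projects)

print(len(expanded), "job definitions when expanded per project")
print(len(definitions), "job definition when the job is shared")
```

Each project still carries a reference to the job in the shared style, so the saving is in the number of definitions to parse, not in the number of projects.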
22:44:21 <jeblair> anyway, the thing we can easily do any time between now and ptg is to spin up some more mergers and/or executors, and add all the repos and time startup
22:44:46 <SpamapS> That seems like a pretty straightforward and useful task to get done now.
22:45:29 <jeblair> i was thinking that since we have lost some cloud capacity (hopefully we'll be getting some of it and more back soon), we might even spin down some zuulv2 launchers so it doesn't greatly impact our control plane size
22:45:33 <SpamapS> I know we're all good at managing our own schedules, but did we want to maintain sort of a master authoritative list somewhere like https://etherpad.openstack.org/p/zuulv3-pre-ptg ?
22:46:53 <jeblair> SpamapS: i am happy to increase communication around things like that for the next few weeks
22:47:37 <mordred> ++
22:47:42 <jeblair> i'll volunteer to drive startup time testing.  i'll coordinate with infra-root folks about servers and report back when i have data.
22:47:59 <jeblair> #action jeblair drive startup time testing
22:48:06 <fungi> thanks
22:48:08 <mordred> \o/
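The extrapolation discussed under this topic (time a subset of repos on a few mergers, then scale up) could look roughly like the sketch below. It rests on two assumptions that were not verified in the meeting: that time grows linearly with repo count, and that the work splits evenly across mergers. The function name and every number except the roughly 2k repos mentioned earlier are made up for illustration.

```python
def estimate_startup_seconds(sample_seconds, sample_repos, sample_mergers,
                             total_repos, target_mergers):
    """Scale a timed sample up to the full repo set.

    Assumes linear scaling with repo count and perfect parallelism
    across mergers; both are simplifying assumptions.
    """
    if min(sample_repos, sample_mergers, target_mergers) <= 0:
        raise ValueError("repo and merger counts must be positive")
    # Merger-seconds spent per repo in the timed sample.
    merger_seconds_per_repo = sample_seconds * sample_mergers / sample_repos
    # Total work for all repos, shared across the target merger count.
    return merger_seconds_per_repo * total_repos / target_mergers


# Hypothetical measurement: 100 repos took 120 s on 2 mergers.
# 2000 is the "2k repos" figure from the meeting; 8 mergers is made up.
estimate = estimate_startup_seconds(120.0, 100, 2, 2000, 8)
print("estimated startup: %.0f seconds" % estimate)
```

As clarkb suggests, the sample should include the largest repos (nova, neutron, openstack-manuals) so that any error in the estimate leans conservative. The sketch only covers git operations; gear serialization and config parsing cannot be estimated this way until the final config exists.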
22:48:11 <jeblair> #topic migration docs
22:48:24 <jeblair> oh lemme link
22:48:43 <jeblair> #link migration guide https://docs.openstack.org/infra/manual/zuulv3.html
22:48:46 <jeblair> that's a stub
22:48:51 <SpamapS> I think storyboard is too heavy and will bog us down for such a short horizon.. however, I worry we'll get off track easily if we don't have a place where we can all look back to make sure we've remembered to do all the things we said we'd do. I also worry that if we don't have that place, that place will be "Jim's head"
22:49:11 <fungi> saw you just updated the additional migration docs change an hour or so ago but haven't had time to look at it yet
22:49:11 <jeblair> #link additional content https://review.openstack.org/489691
22:49:24 <jeblair> SpamapS: yes, my head is a bad place for things to be put
22:49:49 <jeblair> there's more to add to that (that change adds todos as well as content)
22:50:13 <jeblair> and once we manage to get all the things we think openstack folks need to know about written down there...
22:50:29 <jeblair> ... i fully expect us to then rapidly add all the things they *actually* need to know about once they start asking
22:50:46 <jeblair> so we should consider it a rapidly evolving document for the next 2 months at least
22:50:52 <fungi> yeah, that will tend to get built up as an faq
22:51:14 <jeblair> anyway, my plan is to try to add a little bit here and there over the remaining time
22:51:57 <jeblair> if we can get a good chunk of info in there sooner, that would be great -- we could send out another announcement and folks can start looking at it and thinking about it
22:52:14 <jeblair> but i think the only critical thing is that we have it ready by the cutover
22:53:18 <jeblair> anyway, please feel free to add to it -- even todo comments for big topics we've missed; i'll be happy to expand them
22:53:19 <fungi> and be ready to rapidly add whatever comes up
22:53:37 <jeblair> also, i've been trying to be a bit less formal than the reference documentation
22:54:01 <jeblair> it could, honestly, probably be even less formal than i'm being, but that's a really hard thing for me to dial back when writing docs :)
22:54:25 <fungi> i enjoyed the voice used in the stub at least, and skimming the pending addition it looks to be consistent, so cool by me
22:54:51 <jeblair> (i think it's more effective the more it directly relates and addresses the openstack dev community)
22:55:07 <fungi> people are going to be landing there when they're a little irritated, so a less dry and formal document will hopefully lighten their mood if even slightly
22:55:17 <jeblair> fungi: good point
22:55:30 <jeblair> we should add soothing pictures
22:55:34 <jeblair> #topic open discussion
22:55:40 <clarkb> and ocean wave sound track
22:55:46 <fungi> mordred is not allowed to choose the cat pictures
22:55:59 <jeblair> anything else folks want to discuss?
22:56:06 <mordred> fungi: not all cats can be un-moist
22:57:39 <fungi> i have a misbehaving kitten who is being trained with the assistance of a spray bottle of water, and so is often rather damp
22:58:17 <jeblair> thanks everyone for attending and pitching in on this.  i'm glad we have a plan (and in some cases, a plan for a plan).  i think we have a fighting chance of getting this thing out the door.  :)
22:58:39 <fungi> thanks jeblair!
22:59:02 <jeblair> #endmeeting