22:03:11 #startmeeting zuul
22:03:12 Meeting started Mon Aug 14 22:03:11 2017 UTC and is due to finish in 60 minutes. The chair is jeblair. Information about MeetBot at http://wiki.debian.org/MeetBot.
22:03:13 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
22:03:15 The meeting name has been set to 'zuul'
22:03:27 okay, i had hoped more people would be here :(
22:03:32 i am
22:03:36 #link agenda https://wiki.openstack.org/wiki/Meetings/Zuul
22:03:37 but yeah
22:03:42 you will note that i have cleared the agenda
22:03:45 it has two topics
22:03:52 #topic What needs to happen before PTG
22:04:05 i am not certain that we will be ready for the ptg cutover
22:04:13 that doesn't mean we won't be
22:04:21 but i would not place any bets on it at this point
22:04:28 we have 20 work days until then
22:04:36 many of us, myself included, have fewer than that
22:04:51 if our pace were like it was 2 weeks ago: no problem, we're fine.
22:04:58 if it's like last week: we're in trouble
22:05:19 summer ftl :)
22:05:47 anyway, the point is not to speculate or try to convince ourselves one way or the other, but to try to get on the same page about what needs to be done
22:06:02 and make sure that folks who are able to help know what to do to try to maximize our chances
22:06:18 to that end, here's an etherpad: https://etherpad.openstack.org/p/zuulv3-pre-ptg
22:06:30 * mordred waves
22:06:45 that's just some things i brainstormed quickly
22:07:17 if the stuff in the first section doesn't happen (some sooner than others), then i'm very likely to advocate we call it off.
22:07:32 the second section is what i think we could accomplish if progress is really good
22:07:53 what is "startup time"?
22:07:56 the third section is not going to happen before the ptg, i just couldn't bear to title it that way.
22:08:04 re: improved docs -- is that docs for migration, or just overall "how to use zuul" docs?
22:08:04 o/
22:08:08 sorry I got distracted coding
22:08:14 Shrews: testing zuul startup time with the entire set of migrated job config for openstack
22:08:21 so, I have a good handle on what we need to do for tarball/publish jobs; devstack-gate is an unknown to me right now. Are we just going to copy-paste bash scripts into ansible playbooks like we did with the tox job?
22:08:36 leifmadsen: migration docs are already in the "absolutely" section
22:08:39 Shrews: it was one of the risks mordred noted in the announcement. zuul has to read 2k repos in order to start up. we have no idea how long that will take.
22:08:47 leifmadsen: so i think improved general docs for zuul
22:09:01 pabelanger: d-g is in some ways easier since it's already in a repo that can be included in job changes
22:09:14 fungi: ok cool, I'm (kinda) working on that as a "I have no fucking clue what I'm doing here!" person approaching zuul with github integration
22:09:39 fungi, leifmadsen: correct. i think our docs are mostly *accurate* for what openstack users will need, but they could still be improved from a narrative viewpoint.
22:09:55 leifmadsen: and that perspective in particular is great and exactly what we need :)
22:09:58 they are very hard to approach for someone landing on zuul and trying to spin something up
22:09:58 pabelanger: so I think the idea is to add d-g to v3 then iterate on the ansiblification effort that was already underway and get it to a point where it's decomposed enough that a playbook to drive it works
22:10:04 leifmadsen: the migration docs are likely to continue to be a priority after the cut-over as well, as we discover things we didn't think to add, but this should at least be the minimal set of stuff we know will come up
22:10:19 fungi: that's fine, I have no intention or interest in those :) (re: migration docs)
22:10:51 you know...
22:10:51 I did notice a blog post from tristanC on the RDO blog site recently (Aug 1) that might help bootstrap me a bit further as well
22:10:56 I know we'd like to ansiblify d-g
22:10:58 oh, also the migration docs are likely to be of interest to operators of other zuul v2 deployments, i expect
22:11:00 but is it necessary for the cutover?
22:11:45 pabelanger: which means we should be able to start with stupidly simple v3 jobs in the shade repo (shell: run_devstack_gate.sh) and iterate until they're better - but verifying that the changes don't break anything
22:11:53 SpamapS: i think it's necessary for 2 reasons: 1) it will be more difficult to do it after the cutover and 2) i think we need to actually be able to show people what the system is actually for before we cut over.
22:12:05 mordred: right, the ansiblify effort isn't as much a technical issue, but we do spend a lot of time discussing the design. So if we have another etherpad session like we did for tox and tarball jobs, I think that will allow us to iterate at a faster rate.
22:12:16 pabelanger: ++
22:12:19 so d-g as our "advanced usage" example
22:12:25 SpamapS: otherwise, we're going to end up with zuulv3 being a *worse* zuul than v2 for most of what openstack uses it for, and our cries of "but no! we imagine it will be great!" will fall on deaf ears
22:12:53 so I gotta bail for dinner, but if anyone has interest in doing a bit of training on my side to speed up my knowledge transfer, I'm happy to take notes and write up some documentation, submit patches, etc
22:12:54 jeblair: I like some of v3's features that have little to do with Ansible.. so I'm not sure (2) speaks to me that loudly. But (1) is huge.
22:12:56 pabelanger: I agree. although I do think that d-g is going to be 'easier' since it's way more about doing remote things and less about the harder bits of secrets integration and whatnot
22:12:59 also, d-g provides us with a complex use case to help us find the rough edges (by getting cut on them)
22:13:05 leifmadsen: cool, i'll ping you tomorrow
22:13:15 I'm pretty close I think to having zuul up and running (I have the first half done), but I don't have the nodepool and job run stuff done
22:13:24 leifmadsen: sweet!
22:13:42 jeblair: cool, I feel like a bluejeans call, showing where I'm at, what I know, and then someone helping to take me to the end would be incredibly useful
22:13:51 it would get me to writing docs faster, and more accurately
22:14:01 leifmadsen: can do
22:14:03 pabelanger: which is all just me being optimistic - I think we should totally do another session like we did for tarballs/publish
22:14:05 So would it make sense to stop painting the dark corners of the code base now and see if we can split up the d-g work even more?
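For reference, a minimal sketch of the "stupidly simple" starting point described above: a v3 job in the shade repo whose playbook just shells out to devstack-gate. The job name, playbook path, and wrapper script name are hypothetical, taken only from the examples mentioned in the discussion, not from any merged config.

    # .zuul.yaml (in openstack-infra/shade) -- sketch only
    - job:
        name: shade-devstack-gate
        parent: base
        run: playbooks/devstack-gate/run.yaml
        timeout: 7200
        required-projects:
          - openstack-infra/devstack-gate

    - project:
        name: openstack-infra/shade
        check:
          jobs:
            - shade-devstack-gate

    # playbooks/devstack-gate/run.yaml -- wraps the existing bash entrypoint
    - hosts: all
      tasks:
        - name: Run devstack-gate via a simple wrapper script (hypothetical name)
          shell: ./run_devstack_gate.sh
          args:
            chdir: "{{ ansible_user_dir }}"

Keeping the playbook this thin is what lets the ansiblification proceed incrementally: each change that decomposes the wrapper further runs through the same job, so it verifies that nothing breaks.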
22:14:11 rock on, bailing now, but ping me tomorrow or Wednesday
22:14:18 * SpamapS may be a week or two behind and that may already be happening ;)
22:14:40 for d-g, I would people interested maybe get into a sprint channel so we can dedicate some serious time to this? Because it is hard just waiting for people to review ansible code.
22:14:44 SpamapS: yeah i think so. i think at the end of last week, i put out the last code-related fire that i think was holding us back
22:14:50 suggest people*
22:14:51 ++
22:15:10 so unless another pops up, i will start focusing on these items
22:15:21 maybe let's go through these items together real quick?
22:15:26 cool
22:15:35 #topic tarball/publish jobs
22:15:41 i think this can be quick
22:15:50 mordred, pabelanger, and i made an etherpad for this last week
22:15:59 it took us a couple hours
22:16:11 but we ended up with a discrete todo list (it's sorta in the middle of the pad, sorry)
22:16:16 i think that was really helpful
22:16:44 it's now missing a step
22:17:01 as folks would like to see some more verification around secrets before we fully execute it
22:17:42 completion of that may now be sitting behind writing some more unit tests and/or creating test jobs with dummy keys
22:18:43 but aside from that -- does the list look okay? do we need to reassign any tasks there? will folks be able to work on that in the next day or so?
22:18:48 With the short horizon, I like the idea of just etherpadding the tasks and updating people via IRC when one changes things.
22:19:05 #link tarball/publish jobs etherpad https://etherpad.openstack.org/p/mVSVwG4xos
22:19:15 jeblair: yah - list looks good to me and with v3 up and going again I'm good to crank on the next bits
22:19:31 okay, next:
22:19:34 #topic devstack jobs
22:19:50 i like the suggestion from earlier that we etherpad this like the other jobs
22:19:55 ++
22:20:07 yes, please
22:20:09 we have a lot of stuff done previously, but also, that was a while ago and things may have changed
22:20:10 one thing to keep in mind with ansiblification of d-g is that final RCs for openstack happen next week. So we'll need to be careful merging changes to d-g (as many projects rely on it for their testing)
22:20:12 so when should we do this?
22:20:49 clarkb: good point. we should be able to merge additive changes easily
22:21:05 any time works for me
22:21:06 yup, and the vast majority of the changes should also be self-testing, giving a good degree of confidence in them
22:21:15 yup
22:21:21 yeah, i think it ought not be too hard to make nondisruptive changes to d-g while working on this
22:21:30 i guess another thing to consider is that we're also probably close enough that we could consider d-g slushy and start building new stuff in parallel.
22:21:55 let me ask a different question first: who's interested in working on this?
22:21:57 o/
22:21:59 i'd be cool declaring d-g mostly-frozen similar to what we're starting to do with project-config
22:22:01 o/
22:22:07 o/
22:22:15 I'm happy to continue reviewing the changes, though currently I've been quite distracted by the release :/
22:22:28 i'm interested in it but disappearing for the next week-ish
22:22:29 clarkb: that's useful too
22:22:51 fungi: that's good to know :)
22:22:55 probably reviewing is something i can squeeze in at odd hours over bad connectivity for the coming week
22:23:26 blame eclipses
22:23:33 o/ for working on it... but more of an o/ to working on whatever needs doing
22:23:39 jeblair: we should also perhaps define how 'done' we want to get - I imagine we're not going to be *fully* done, and auto-converting all of the various d-g cargocults will be tricky - but probably we want to get further than a shell script that just runs safe-devstack-gate.sh yeah?
22:23:40 pabelanger, mordred: would you like to finish the tarball/publish work before moving to d-g? or start our etherpad session for that sooner?
22:24:00 jeblair: etherpad sooner is better for me
22:24:05 mordred: yeah. maybe that's an outcome of the etherpad session (hopefully we'll have a better idea of structure then)
22:24:07 so I can start thinking about it
22:24:08 jeblair: ++
22:25:33 mordred, pabelanger: would you like to etherpad devstack jobs tomorrow? morningish or afternoonish?
22:25:49 so the general idea is that hooks are left as raw shell and we'll continue to source settings from the environment and continue using the same name for a gate wrap script as an entrypoint?
22:25:58 just trying to understand the overall scope
22:26:08 jeblair: tomorrow works for me, morningish should be fine too
22:26:11 fungi: let's get back to you on that :)
22:26:14 k
22:26:16 ;)
22:26:34 so not yet hashed out, something else which will come out of the etherpad brainstorming
22:27:43 though _if_ we stick to those basics, i expect we can provide a fairly smooth incremental migration for people without having to do it all for them up front
22:28:03 mordred: does 1600 utc work?
22:28:26 yup. I'd love it if that works for SpamapS - I think having his brain in the etherpadding would be helpful
22:28:52 SpamapS, clarkb: ^?
22:29:08 I'll be around
22:29:14 i'll be around some later tomorrow (maybe 21:00 utc or so) and will gladly review the etherpad you put together
22:29:16 yes that works
22:29:37 #agreed etherpad brainstorming for devstack-gate jobs tuesday 15 aug 1600 utc
22:29:50 #topic migration script
22:29:50 and my brain should be fully booted back into zuulOS by then
22:30:03 ++
22:30:09 mordred has taken a stab at this
22:30:19 i haven't looked at it yet
22:30:21 oh - don't
22:30:25 it's very incomplete
22:30:39 oh, actually, let me say something first
22:30:42 Rust would be a bad choice for this, yes?
22:30:45 ;-)
22:30:47 :)
22:30:58 i think the two previous topics -- the publishing and devstack jobs -- are things we need to get done first
22:31:16 because creating jobs is what is driving understanding of pain points and surfacing bugs
22:31:29 all of the stuff i've been sweeping up for the past few weeks directly came out of writing jobs
22:31:43 +1
22:31:45 ++
22:31:53 so we need to have that work done early so we have time to discover those things and fix them
22:32:06 similarly, we need to have it done early so that the migration script has something to output :)
22:32:08 We don't want to be tempted to paper over those blemishes with the automated script.
22:32:30 hear hear
22:32:33 That's just duplicating v2.5's ansiblizer
22:32:38 ya
22:32:40 yup
22:33:03 That said, this should have a solid foundation, so I think having one person get it to a poorly-not-really-working state now isn't a bad idea.
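To make the "something to output" concrete: a hedged sketch of the kind of .zuul.yaml stanza an auto-converter could emit for a simple legacy job. The script discussed here is still very incomplete, so the parent job, file layout, and names below are assumptions for illustration only, not its actual output format.

    # hypothetical auto-converted stanza for a legacy python27 job
    - job:
        name: legacy-gate-example-python27
        parent: legacy-base        # assumed shared parent for converted jobs
        run: playbooks/legacy/gate-example-python27/run.yaml
        timeout: 2400

    - project:
        name: openstack/example
        check:
          jobs:
            - legacy-gate-example-python27

The easy, high-value cases (e.g. tox jobs) are where this kind of mechanical translation pays off; the nastier jobs will need hand-written replacements instead, as the discussion below gets into.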
22:33:05 ya my suggestion was to avoid that for the "core" jobs
22:33:20 https://review.openstack.org/#/c/491804 is the one thing in the in-work migration script that's worth mentioning
22:33:22 do it for everything else and point people at the "core" jobs as the example to then go and migrate towards
22:33:23 mordred: once run, are we doing a git commit and then iterating on the output?
22:33:39 so this is probably something to start poking at, and as SpamapS said, start building a foundation, but we should plan on having the job work done before we really focus on this
22:33:46 jeblair: agree
22:34:19 (unlike almost anything else on the list, if we are still tweaking this the friday before the cutover, i will not be terrified. like, not thrilled, but not terrified :)
22:34:31 are we still planning on running against -infra jobs first?
22:34:51 clarkb: I think "core" here might be worth expanding - there's a set of stuff where auto-translation is both easy and desirable and a big win (tox jobs are a good example) and others where auto-translation will be next to impossible, where we need to do something more like what you're saying
22:35:14 mordred: ya I tried to give examples of the nasty jobs in my email to the list as well
22:35:18 yah
22:35:37 pabelanger: i think mordred was looking at having jobs run continuously so we can keep an eye on the output (once it basically works)
22:35:55 sorry I haven't responded to that yet - there are a few things we can do in some circumstances I want to highlight - but I broadly think we're on the same page
22:36:14 yah - https://review.openstack.org/#/c/491937/1 is a followup to the above which runs the script and saves the output so we can look at it
22:36:48 or - it will be, once the "run the migrate script" job works and I don't have it looking for incorrect paths :)
22:37:39 #agreed work on foundation now, but significant work will depend on completion of basic and devstack jobs
22:37:45 but I'll start poking folks once that's ready to be looked at for real
22:37:49 ++
22:37:58 #topic startup time
22:38:09 sorry this was enigmatic, but i think it got explained earlier
22:38:21 we have no idea how long it will take to start zuul the first time we do it at openstack scale
22:38:39 we can actually preview this by adding all the repos now, even though they don't have zuul.yaml files
22:38:50 that should at least get us order-of-magnitude info
22:39:03 * SpamapS expects at least one wormhole to form once zuulv3 is fully formed
22:39:34 but in order to get a realistic estimate, we need some more mergers
22:39:58 given at least a few, we can probably extrapolate somewhat
22:40:05 this was hinted at in the discussion a few weeks back about testing that zuul doesn't fall over loading configuration at "openstack scale" i guess
22:40:12 ya
22:40:25 cool, just making sure i'm connecting the dots
22:40:29 you probably want to include the biggest repos in your subset just so that they skew you conservatively rather than being missed and causing surprises later
22:40:34 nova neutron openstack-manuals etc
22:40:39 are you more concerned about all the git operations, or python/gear/zk lag?
22:41:20 SpamapS: git ops, gear data serialization, and config parsing
22:41:24 I guess it has to clone "everything" yeah?
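The "add all the repos now" preview above boils down to listing every project in the scheduler's tenant configuration, even though most repos carry no zuul.yaml yet. A minimal sketch, assuming a main.yaml-style tenant file and using only the example repos named above (the real openstack tenant config would be far longer):

    # /etc/zuul/main.yaml (sketch)
    - tenant:
        name: openstack
        source:
          gerrit:
            config-projects:
              - openstack-infra/project-config
            untrusted-projects:
              - openstack/nova
              - openstack/neutron
              - openstack/openstack-manuals
              # ...and the rest of the ~2k repos

Timing a scheduler start against a config like this mostly exercises the git operations; the config-parsing and gear-serialization side won't be known until the final migrated job config exists.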
22:41:33 we can test the first with repos without zuul.yaml files
22:41:56 we won't know the rest until we have the final config
22:41:58 oh, quick aside:
22:42:15 oh, oof, yeah i hadn't considered the start-up time for cloning all those repos
22:43:15 another reason we need a good corpus of base jobs is that zuulv3's config routines are *much* less efficient than zuulv2. that is mitigated by the fact that there should be *much less* configuration than zuulv2 (eg one py35 job instead of 1600). that's only true if we are able to factor out our most common basic jobs
22:43:49 yah. luckily the most common jobs are also the most consistent and similar jobs
22:43:53 ++
22:43:57 so there's big wins on each of those
22:44:21 anyway, the thing we can easily do any time between now and ptg is to spin up some more mergers and/or executors, and add all the repos and time startup
22:44:46 That seems like a pretty straightforward and useful task to get done now.
22:45:29 i was thinking that since we have lost some cloud capacity (hopefully we'll be getting some of it and more back soon), we might even spin down some zuulv2 launchers so it doesn't greatly impact our control plane size
22:45:33 I know we're all good at managing our own schedules, but did we want to maintain sort of a master authoritative list somewhere like https://etherpad.openstack.org/p/zuulv3-pre-ptg ?
22:46:53 SpamapS: i am happy to increase communication around things like that for the next few weeks
22:47:37 ++
22:47:42 i'll volunteer to drive startup time testing. i'll coordinate with infra-root folks about servers and report back when i have data.
22:47:59 #action jeblair drive startup time testing
22:48:06 thanks
22:48:08 \o/
22:48:11 #topic migration docs
22:48:24 oh lemme link
22:48:43 #link migration guide https://docs.openstack.org/infra/manual/zuulv3.html
22:48:46 that's a stub
22:48:51 I think storyboard is too heavy and will bog us down for such a short horizon.. however, I worry we'll get off track easily if we don't have a place where we can all look back to make sure we've remembered to do all the things we said we'd do. I also worry that if we don't have that place, that place will be "Jim's head"
22:49:11 saw you just updated the additional migration docs change an hour or so ago but haven't had time to look at it yet
22:49:11 #link additional content https://review.openstack.org/489691
22:49:24 SpamapS: yes, my head is a bad place for things to be put
22:49:49 there's more to add to that (that change adds todos as well as content)
22:50:13 and once we manage to get all the things we think openstack folks need to know about written down there...
22:50:29 ... i fully expect us to then rapidly add all the things they *actually* need to know about once they start asking
22:50:46 so we should consider it a rapidly evolving document for the next 2 months at least
22:50:52 yeah, that will tend to get built up as an faq
22:51:14 anyway, my plan is to try to add a little bit here and there over the remaining time
22:51:57 if we can get a good chunk of info in there sooner, that would be great -- we could send out another announcement and folks can start looking at it and thinking about it
22:52:14 but i think the only critical thing is that we have it ready by the cutover
22:53:18 anyway, please feel free to add to it -- even todo comments for big topics we've missed; i'll be happy to expand them
22:53:19 and be ready to rapidly add whatever comes up
22:53:37 also, i've been trying to be a bit less formal than the reference documentation
22:54:01 it could, honestly, probably be even less formal than i'm being, but that's a really hard thing for me to dial back when writing docs :)
22:54:25 i enjoyed the voice used in the stub at least, and skimming the pending addition it looks to be consistent, so cool by me
22:54:51 (i think it's more effective the more it directly relates and addresses the openstack dev community)
22:55:07 people are going to be landing there when they're a little irritated, so a less dry and formal document will hopefully lighten their mood if even slightly
22:55:17 fungi: good point
22:55:30 we should add soothing pictures
22:55:34 #topic open discussion
22:55:40 and ocean wave sound track
22:55:46 mordred is not allowed to choose the cat pictures
22:55:59 anything else folks want to discuss?
22:56:06 fungi: not all cats can be un-moist
22:57:39 i have a misbehaving kitten who is being trained with the assistance of a spray bottle of water, and so is often rather damp
22:58:17 thanks everyone for attending and pitching in on this. i'm glad we have a plan (and in some cases, a plan for a plan). i think we have a fighting chance of getting this thing out the door. :)
22:58:39 thanks jeblair!
22:59:02 #endmeeting