22:02:45 #startmeeting zuul
22:02:46 Meeting started Mon Aug 21 22:02:45 2017 UTC and is due to finish in 60 minutes. The chair is jeblair. Information about MeetBot at http://wiki.debian.org/MeetBot.
22:02:47 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
22:02:49 The meeting name has been set to 'zuul'
22:02:51 #link agenda https://wiki.openstack.org/wiki/Meetings/Zuul
22:03:00 #link previous meeting http://eavesdrop.openstack.org/meetings/zuul/2017/zuul.2017-08-14-22.03.html
22:03:26 the agenda remains: "What needs to happen before PTG" (as i expect it to until the ptg)
22:03:43 #link pre-ptg etherpad https://etherpad.openstack.org/p/zuulv3-pre-ptg
22:04:02 so let's go through those which we identified last time
22:04:23 actually, i'm going to do easy things first
22:04:33 #topic startup time
22:04:52 we ran this test, at least as best we can until we have more jobs defined
22:05:11 we got a baseline for how long it will take zuul to checkout all branches of all projects
22:05:30 if we size zuulv3 at least as large as zuulv2, things look good. it should be able to start in less than 2.5 minutes
22:05:53 switching to phone, family dragging me to icecream
22:06:02 that's excellent time for an initial start - especially considering it doesn't need to initial start much
22:06:08 i think that's quite manageable for a system that large which is, after all, not actually intended to restart very often
22:06:10 mordred: ya that :)
22:06:36 (ftr: that's ~1600 repos, with 8 mergers and 8 executors)
22:06:49 is a progress indicator emitted during that?
22:06:52 and that was with pre primed repos right?
22:07:09 (that may be worthy of deployment documentation if expected to be used)
22:07:19 Shrews: no
22:07:32 Not bad at all.
22:07:42 clarkb: yes -- that's a good point. if we want to minimize this for our very first restart, we should prime the git repos on our new hosts
22:07:53 There are logs in the file...
22:08:09 jeblair: was 2.5 minutes with or without pre-primed git repos?
22:08:19 clarkb: i don't think i'd mention this in general documentation though. most other zuul v3 instances will be able to grow to this size
22:08:24 mordred: with pre-primed
22:08:27 ok. cool
22:08:33 primed in this case means they were already cloned onto the host
22:09:22 and yah, I don't think it's super useful for normal docs - our migration is a weird special case of starting a new v3 at massive size :)
22:09:25 clarkb: by which i mean, it's really only the case where a zuul v3 instance springs from nowhere with 1600 repos that it's worth considering
22:09:49 i'm comfortable scratching this from the pending list; any other concerns?
22:10:02 the results have exceeded my expectations
22:10:08 That's like replacing an existing Zuul site with a new host and new file systems.
22:10:18 So, like a major failure.
22:10:40 yah. and in that case, pre-cloning the repos isn't likely to buy you much in terms of downtime response
22:10:54 jlk: yeah, and if you're a large site, you'd have to lose all your (ideally many) executors+mergers to achieve this level of failure
22:10:58 in fact, it's likely to make it slower, since you'll have to clone the repos which means scripting that real quick
22:11:08 mordred: indeed
22:11:24 So a very unlikely scenario.
22:11:43 #topic fix for "A worker was found in a dead state" bug
22:11:53 we didn't actually talk about this last time
22:12:18 but i put it on the list because it was killing a significant portion of our ansible-playbook runs
22:12:34 good news: we tracked it down to a python segfault which has been fixed in current versions of python
22:12:54 mordred created a ppa with the backported bugfix (which we're running now i believe?)
22:13:07 and SpamapS started the ubuntu SRU process for it
22:13:32 https://launchpad.net/~openstack-ci-core/+archive/ubuntu/python-bpo-27945-backport <- ppa exists with python3.5 package built
22:13:42 so the package there is good to go
22:13:53 mordred: ah, do we still need to add that to our puppet?
22:14:23 I did
22:14:26 no, we have done that already
22:14:28 cool
22:14:30 Neat.
22:14:31 and sorry I missed the start AGAIN... weird day. :-P
22:14:36 https://review.openstack.org/495399 was merged
22:14:43 SpamapS: sundial messed up?
22:14:45 jlk, SpamapS: y'all likely want to add that :)
22:14:46 exactly
22:15:29 SpamapS: let us know what happens with the sru process, please :)
22:15:33 I would expect that upload to be released into xenial-proposed within a week.
22:15:53 and once it's there, we should report back that the binaries work, and it will spend another few days in proposed before they release it in updates.
22:15:59 I suggest subscribing to the bug.
22:16:27 Okay
22:16:32 SpamapS: have the bug link handy?
22:16:49 oh i do
22:16:54 https://bugs.launchpad.net/ubuntu/+source/python3.6/+bug/1711724
22:16:55 Launchpad bug 1711724 in python3.5 (Ubuntu Xenial) "Segfaults with dict" [High,In progress] - Assigned to Clint Byrum (clint-fewbar)
22:17:06 #link https://bugs.launchpad.net/ubuntu/+source/python3.6/+bug/1711724
22:17:16 SpamapS: thanks!
22:17:31 #topic tarball/publish jobs
22:17:46 #link https://etherpad.openstack.org/p/mVSVwG4xos
22:18:02 pabelanger and mordred have been working on this
22:18:19 pabelanger has some next patches up - I've got some locally I wrote on the plane that follow up to his patches that I'll get pushed up after the meeting
22:18:43 #link https://review.openstack.org/494672
22:18:45 yes, python-branch-tarball should be ready for testing
22:18:50 also added upload-twine role
22:19:03 #link https://review.openstack.org/495972
22:19:08 i think those are those two patches ^
22:19:17 yes
22:19:36 https://review.openstack.org/495973/ uses the new twine role
22:19:43 pabelanger: would you mind keeping the etherpad updated and adding links to those patches in there?
22:19:52 i just added those, i mean in the future
22:19:53 jeblair: sure
22:20:06 Also ping me if you need more work. I have an empty plate.
22:20:34 jeblair: pong (sorry got sidetracked)
22:20:51 release-openstack-tarball still needs to be pushed up, but holding off until we get branch-tarball working
22:20:52 * dmsimard reads backlog
22:21:31 pabelanger: you mean release-openstack-python?
22:21:42 Ah, yes
22:22:32 actually, I'll push up that review shortly
22:22:44 pabelanger: thank you. that will let us work on things in parallel
22:22:53 agree
22:23:06 pabelanger, mordred: do you think we can wrap up these jobs in the next 2 days or so?
22:23:23 yes, I hope we can finish them for tomorrow
22:23:56 jeblair: yes. I agree with pabelanger on tomorrow
22:24:01 okay, thanks!
22:24:03 they're very close
22:24:12 if we land 494672 today, that would be helpful too
22:25:40 please continue to ping me as soon as any patches directly related to these efforts are ready for review
22:25:53 shall we move on to devstack now?
22:26:05 yes
22:26:15 #topic devstack jobs
22:26:18 #link https://etherpad.openstack.org/p/AIFz4wRKQm
22:26:28 there's the brainstorming etherpad for this one
22:26:50 jlk: i suspect that there may be more opportunity for you to jump in here, as compared to the tarball jobs
22:26:58 Okay
22:27:46 jeblair: I did some (mostly useless) noodling in an off moment over the weekend - I do not think we'll wind up being able to use anything I poked at over the weekend directly...
22:28:10 once mordred and pabelanger finish up with the tarball jobs, i expect their focus will shift to this
22:28:11 yes
22:28:24 jeblair: but one thing that jumped out that we're missing from the list is a role or something to get from our new repo structure to something devstack can consume
22:28:26 and clarkb has volunteered to review some of this as well :)
22:28:39 I'll read up this evening / tomorrow.
22:28:44 because the PROJECTS list and repos on disk is drastically different
22:28:58 I think mordred added devstack-gate to zuulv3 last week?
22:29:06 yes. d-g is in v3 currently
22:29:40 great
22:30:19 mordred: good point. i added an item to the etherpad todo list
22:30:24 jeblair: my hunch was that it might be good for you to at least eyeball that and ponder it
22:31:00 mordred: i expect 'required-projects' to be a replacement for the $PROJECTS variable
22:31:06 I do too
22:31:10 +1
22:31:15 clarkb's work to reduce the use of that to minimum will be helpful here
22:31:55 so yeah, i'll plan on thinking about that next
22:32:22 one question on d-g, for legacy hooks, that is basically just going to be a shell task right?
22:32:26 in the meantime, i started on the localrc ansible module we discussed. that's a nice out-of-the-way thing i could start while other folks were finishing the tarball jobs
22:32:33 jeblair: ++
22:32:51 the guts of that are done, i just need to wrap it up in module boilerplate. should have that up tomorrow.
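The localrc module mentioned above is not shown in this log, so what follows is only a minimal Python sketch of the general idea: turning a dictionary of settings into localrc-style KEY=value text that a shell can source. The function name `render_localrc` and the sample settings are hypothetical and are not taken from the actual module, and the Ansible module boilerplate is left out entirely.

```python
import re
import shlex

# Hypothetical helper: not the actual module discussed in the meeting.
_VALID_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def render_localrc(settings):
    """Render a dict of settings as shell-sourceable KEY=value lines.

    Keys are sorted so the output is stable between runs, and values are
    shell-quoted so that spaces or special characters survive sourcing.
    """
    lines = []
    for key in sorted(settings):
        if not _VALID_NAME.match(key):
            raise ValueError(f"not a valid shell variable name: {key!r}")
        lines.append(f"{key}={shlex.quote(str(settings[key]))}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Sample values are made up for illustration.
    print(render_localrc({"SERVICE_TIMEOUT": 120, "LOGFILE": "/tmp/stack log.txt"}), end="")
```

Sorting the keys keeps the generated file stable across runs, which makes diffs between job runs easier to read.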
22:33:25 pabelanger: the etherpad says: make "legacy" playbooks that run hook scripts with vars collected by part one
22:33:32 yah
22:33:34 (part one is "process env vars")
22:33:53 and in non-legacy jobs people can just use pre-tasks as needed
22:34:09 pabelanger: so yeah, i think so -- a playbook with a shell task with the current hook content
22:34:21 automatically generated by the migration script
22:34:29 this: https://review.openstack.org/#/c/495930/ does not work - but is the first attempt at dealing with part one "process env vars" btw
22:34:49 it does not work and probably should be deleted - but that was the inspiration
22:34:49 Right, so I guess we'll have to do some magic for bash variables like WORKSPACE too
22:35:19 well - the idea in my brainhole is that we make a thing that produces all of the legacy envvars that things normally run with
22:35:31 okay, cool
22:35:37 then when we take a hook script, we run it in a script that first sources those vars, then runs the hook script
22:35:51 perfect
22:36:03 we'll want legacy vars for a bunch of types of jobs, so that's probably a role to generate the file, then source when appropriate?
22:36:19 yah
22:37:15 A role that wraps a script, script passed as role var at role call?
22:37:29 not that I am looking for an answer, but have we given any thought on how long we'd support a legacy hook for?
22:37:38 jlk: yah- something like that
22:38:22 pabelanger: preferably not a super long time :)
22:38:36 pabelanger: i think once we get the automagic conversion done, we (openstack) set a release+1 goal for projects to migrate their own jobs (and help projects as we're able)
22:38:47 jeblair: ++
22:38:49 okay, cool
22:38:50 Less time than keystone v2...
22:39:14 anything else devstack related?
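The "generate the legacy env vars, then source them before running the hook" idea discussed above can be sketched roughly as follows. This is an illustrative Python sketch only, not the role or migration script from the meeting: the function names `render_env_file` and `render_wrapper`, the file paths, and the sample value for WORKSPACE are all assumptions made for the example.

```python
import shlex


def render_env_file(legacy_vars):
    """Return the text of a shell file that exports each legacy variable."""
    lines = [
        f"export {name}={shlex.quote(str(value))}"
        for name, value in sorted(legacy_vars.items())
    ]
    return "\n".join(lines) + "\n"


def render_wrapper(env_file, hook_script):
    """Return a bash wrapper that sources the env file, then runs the hook."""
    return "\n".join([
        "#!/bin/bash",
        "set -e",
        # Load the legacy variables first so the hook sees them.
        f"source {shlex.quote(env_file)}",
        # Then hand control to the unmodified hook script.
        f"exec bash {shlex.quote(hook_script)}",
    ]) + "\n"


if __name__ == "__main__":
    # Hypothetical paths and values, for illustration only.
    print(render_env_file({"WORKSPACE": "/home/zuul/workspace"}), end="")
    print(render_wrapper("/tmp/legacy-env.sh", "/tmp/hook.sh"), end="")
```

Keeping the variable file separate from the wrapper means one generated file could be sourced by several kinds of legacy jobs, which matches the "role to generate the file, then source when appropriate" suggestion in the discussion.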
22:39:19 also - fwiw, MOST of the hook scripts are one line scripts
22:39:24 that just call a script in the repo
22:39:31 which then calls our d-g script
22:39:34 mordred: tripleo-ci is the usecase I am thinking off
22:39:35 so we also have to keep that around
22:39:37 of*
22:39:43 Just a hook, script, and a jump?
22:39:49 the ones that are more than one line are mostly 2 lines, with the first line being a CD
22:39:52 cd
22:39:54 gah
22:40:07 CD \
22:40:17 jeblair: nothing more here
22:40:35 #topic migration script
22:40:44 so thankfully there's not a ton of actual complex logic in the job definitions themselves - and we can probably take any of the very few actual special cases and nudge them to fit into the pattern of the other hook scripts
22:41:04 this is basically pending completion of the publish and devstack jobs
22:41:23 mostly because we need to know what to migrate to
22:41:48 i'm hoping we can get those jobs wrapped up very soon and have at least a couple of weeks to work on the migration script and deal with the output
22:41:56 ++
22:42:11 ++
22:42:39 luckily the migration script is a thing that just reads local yaml files - so it's actually easy to iterate on locally
22:42:56 it may hurt one's brain - but other than that it's just normal hacking
22:43:39 #topic migration docs
22:43:52 Shrews has started to pick up work on this
22:43:59 YAML2YAML The YAMLing
22:44:31 updating the infra-manual migration page based on his experiences trying to use zuulv3 with almost no relevant documentation :)
22:44:31 https://review.openstack.org/495971
22:44:44 Yay
22:45:23 Shrews: should i continue to flesh out some of the todo items i left in there? or did you want to take some of them over?
22:45:47 that link is just for a 0-day, "omg, how do i do something" guide
22:46:39 jeblair: i'm afraid of assuming where you were heading with some of those todos
22:46:39 (eg: one of the important things i think we need to communicate is how the variant binding works. there's some tribal knowledge about metajobs and skip-if that we need to tell people how to do with variants)
22:46:45 * jlk has to go afk. Will review logs.
22:47:20 Shrews: okay, i'll continue to poke at them as able, and maybe try to trick you into doing some if i can articulate it adequately :)
22:47:24 jeblair: does what i put up cover your "actually work" todos?
22:47:40 or were you planning to go deeper?
22:48:18 There should definitely be a "zuulv3 for the jjb programmer"
22:48:34 i guess it doesn't really cover inheritance or roles very well
22:50:39 has anybody looked at project-template from python-jobs?
22:51:12 possible that is in mordred migration script, but I haven't looked
22:53:01 Shrews: i'm not sure; there's some good stuff in there, but it almost looks like it's more aimed at the user who hasn't used zuulv2 in openstack. i think we'll be able to use a lot of that in the project drivers guide of the infra-manual to replace the current jjb/zuulv2 stuff that's there.
22:54:47 i'll leave some suggestions as to how we might change things for the migration audience
22:57:09 anything else?
22:58:24 i forgot to start the meeting with our countdown clock, so i'll end it with: we have 15 workdays until the scheduled ptg cutover
22:58:27 thanks everyone!
22:58:30 #endmeetig
22:58:31 #endmeeting