19:03:04 #startmeeting infra
19:03:05 Meeting started Tue Nov 1 19:03:04 2016 UTC and is due to finish in 60 minutes. The chair is fungi. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:03:06 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:03:07 Morning
19:03:09 The meeting name has been set to 'infra'
19:03:13 #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:03:18 #topic Announcements
19:03:19 o/
19:03:20 o/
19:03:25 #info Many members of the Infra team met in person last week at the Ocata Design Summit in Barcelona; a summary will be provided to openstack-infra@lists.openstack.org later this week.
19:03:29 #link https://wiki.openstack.org/wiki/Design_Summit/Ocata/Etherpads#Infrastructure
19:03:35 #action fungi send summit session summary to infra ml
19:03:41 as always, feel free to hit me up with announcements you want included in future meetings
19:03:51 #topic Actions from last meeting
19:03:56 #link http://eavesdrop.openstack.org/meetings/infra/2016/infra.2016-10-18-19.02.html
19:04:06 pleia2 add skeleton infra session etherpads linked from the ocata design summit etherpads wiki page
19:04:07 that's done (see above link in this week's announcements)
19:04:14 thanks for taking care of that!
19:04:26 ianw work on deploying a pholio server
19:04:28 #link https://review.openstack.org/389511
19:04:32 that looks pretty close
19:04:40 yep, imagine that will be up this week
19:04:56 * fungi rubs hands together evilly
19:04:59 ...excellent...
19:05:16 #topic Specs approval: PROPOSED "Neutral governance website" (ttx)
19:05:21 #link https://review.openstack.org/382447 "Neutral governance website" spec
19:05:29 #info The "Neutral governance website" spec is open for Infra Council vote until 19:00 UTC Thursday, November 3.
19:05:35 Mostly a rundown of all the steps I need to go through
19:05:50 to rotate current / to /tc with minimal disruption
19:06:01 it looked complete enough to me
19:06:32 anybody need to raise any quick questions about this?
19:06:34 let me know if you have questions
19:07:13 it's likely common sense for those who have been following the governance changes with the uc, but the commit message and spec description spell it out pretty explicitly
19:07:20 I remember skimming this and it seems to be straightforward
19:07:32 yeah, nothing fancy really
19:07:43 thanks ttx! if nobody objects, i'll be approving it in ~48 hours
19:07:50 cool, thanks!
19:08:01 #topic Priority Efforts
19:08:04 nothing is called out on the agenda
19:08:06 though rcarrillocruz has proposed to mark infra-cloud as implemented
19:08:10 #link https://review.openstack.org/391443 "Mark infracloud spec as complete"
19:08:15 aside from the technicalities of the index change, anyone disagree that it's implemented enough to be listed as such?
19:08:20 we seemed to have some consensus on this in the friday afternoon unconference anyway
19:08:32 +1, we are in run-the-cloud mode now, but it's there
19:08:57 ++
19:09:53 okay, cool. once rcarrillocruz corrects the patch (or when i get around to it if he doesn't have time), i'll approve it
19:10:35 congrats crinkle, rcarrillocruz, yolanda and everyone who worked so hard on making this work!
19:10:52 #topic Cached image reduction changes (ianw)
19:10:56 o/
19:10:58 you've got a pretty lengthy summary in the agenda, but care to restate it for posterity of the meeting logs?
19:10:59 (sorry I'm late)
19:11:30 this was proposed by xen people for devstack, to allow the image list to be reduced in devstack's ./tools/image_list.sh
19:12:02 thus there is a project-config change to only get those images for testing that we use in the gate
19:12:30 i think this dovetails into the arbitrary object mirroring work we've discussed recently
19:13:06 I would agree
19:13:12 ya I think we probably want a minimal image list cached for devstack (cirros or its possible future replacement) and then put everything else on the arbitrary file caching setup
19:13:12 so i've proposed that we stop devstack downloading images on the fly, and put in a way to block that, and proposed enabling it
19:13:20 it would be nice to figure out which of these we actually use in jobs, and then which of them we use often enough to benefit from pre-caching on job nodes vs local mirrors in each provider
19:13:58 but i agree, not caching images we don't use at all in our jobs is a great place to start
19:13:59 hmm, so that would conflict with the "don't download testing images on the fly" approach, if some are to be grabbed from mirrors
19:14:02 I'm fine with the current change set in general. The initial changes contained a duplicated hardcoded list, which I strongly objected to.
19:14:23 ianw: they may be grabbed via /afs though
19:14:32 ianw: in which case it's not a download but a filepath that we set in our jobs
19:14:52 looking at the paste, codesearch showed all images as being used - with the exception of cirros-0.3.0-x86_64-disk.vhd.tgz. Not sure where this comes from
19:14:59 job nodes referencing via /afs paths misses out on a reusable cache though because the nodes aren't reused
19:15:09 clarkb: as much as i love afs, that's almost certain to be slower than wget https://mirror/
19:15:09 fungi: that's true
19:15:34 ya forgot that we get the local caching from the apache servers which is handy
19:15:51 so, maybe that means we don't try trimming the image list for infra?
19:15:52 ianw: yeah, the goal i have is that we find some balance between images that are too large and/or infrequently used such that we can stop embedding them in our images
19:16:23 and then provide those over local afs-backed mirrors
19:16:27 ianw: so what about having the flag permit downloading from the mirror, but not the internet?
19:16:55 jeblair: yes, it could be some sort of allowed regex i guess
19:17:01 like a tri-state flag
19:17:05 true, false and maybe
19:17:18 i'm not sure it has a concept of downloading from a mirror at all, at the moment
19:17:26 it lists full image urls
19:17:56 yeah, and we definitely don't want to encode our mirrors in that list
19:18:01 nope.
19:18:24 it's probably fine to just leave the status quo of downloading all images to be safe then
19:18:28 maybe the flag could transform the url in that list to one on the mirror, then devstack would either download it from the mirror successfully, or fail if it was not mirrored.
19:18:49 otoh - I know that local mirrors of things are useful to developers in places that are not texas ... maybe adding the concept of an overridable mirror to devstack wouldn't be _too_ terrible?
19:19:21 like "if mirror is defined, fetch this image from $MIRROR/$PATH ; else from $UPSTREAM_URL" - or something?
19:20:02 yes, i can propose that
19:20:09 then if someone wanted to set up a local apache to just host 5 images they use all the time, it's easy to do without hacking devstack a ton
19:20:11 \o/
19:20:16 and the $PATH portion of the url would be the same as the on-disk cache path maybe?
19:20:21 maybe so, yeah
19:20:27 that way we don't need any fancy url mangling functions
19:20:29 that would actually make it a nice easy interface
19:20:50 you could even copy your local devstack cache to a webserver and serve it up without rearranging that way
19:21:18 fungi: even easier if your local devstack cache is in afs :)
19:21:29 heh
19:21:40 * jeblair high fives clarkb
19:21:41 alright, so i'd propose we do NOT do the project-config change to get a reduced set of images -> https://review.openstack.org/377159 and focus on the ability to get images from a mirror?
19:21:48 heh
19:22:19 ianw: sounds reasonable
19:22:47 i thought we still wanted to have some local cache?
19:23:15 just more options -- like "used so frequently it should be in the image" vs "used infrequently enough we want to download from mirror"
19:23:19 right, but that's easily combined
19:23:21 I think we should local cache the "base" small image
19:23:26 because 95% of jobs use that one
19:23:33 and it's relatively tiny
19:24:40 #agreed Any image size reduction solutions should take care to avoid making HTTP mirroring or larger or infrequently-used files impossible.
19:24:43 i like that, but if we feel that d/l it from the mirror would be sufficiently fast/reliable, i could get on board with attempting the no-image-cache idea...
19:24:45 ^ yeah?
19:24:52 (that was in response to clarkb)
19:25:01 oh, typo
19:25:03 #undo
19:25:04 Removing item from minutes:
19:25:13 #agreed Any image size reduction solutions should take care to avoid making HTTP mirroring of larger or infrequently-used files impossible.
19:25:20 ok ... yeah i have some more ideas to move forward with thanks
19:25:28 anyone _disagree_ with that statement?
19:25:55 not me :)
19:25:57 thanks for bringing this hairy implementation change to the meeting, ian!
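The overridable-mirror behavior sketched in the discussion above ("if mirror is defined, fetch this image from $MIRROR/$PATH ; else from $UPSTREAM_URL") could look roughly like the following. This is a hedged sketch, not actual devstack code: the `IMAGE_MIRROR` variable and `image_fetch_url` function are hypothetical names for illustration.

```shell
#!/bin/sh
# Sketch of the mirror-override idea: given an upstream image url, return
# the url to actually fetch from. IMAGE_MIRROR and image_fetch_url are
# illustrative names, not real devstack variables.
image_fetch_url() {
    upstream_url=$1
    # Reuse the image's filename as its path on the mirror, matching the
    # on-disk cache layout suggested in the discussion.
    image_file=${upstream_url##*/}
    if [ -n "${IMAGE_MIRROR:-}" ]; then
        # Only the mirror is consulted; if the file is not mirrored, the
        # subsequent download fails rather than silently hitting the internet.
        echo "${IMAGE_MIRROR}/${image_file}"
    else
        echo "${upstream_url}"
    fi
}

# Example: no mirror configured, so the upstream url is used verbatim.
image_fetch_url "http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-disk.img"
```

A caller would then do something like `wget "$(image_fetch_url "$url")"`, which keeps the url mangling in one place and lets someone point a local apache (or an afs-backed mirror) at their cached images without hacking devstack further.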
19:26:41 ++
19:26:48 #topic Force gate-{name}-pep8-{node} to build needed wheels (pabelanger)
19:26:53 #link https://review.openstack.org/391875 "Force gate-{name}-pep8-{node} to build needed wheels"
19:26:56 thanks for hacking on this!
19:27:00 o/
19:27:01 i meant to do it months ago, and then as usual got distracted by other fires
19:27:03 it looks more or less like what we talked about on friday
19:27:17 So, this popped into my head again at summit and was able to get it working quickly with an experimental zuul job
19:27:33 as an alternative, maybe we want to push that into the projects and they can do --no-use-wheels or whatever the pip flag is?
19:27:40 wanted to bring some eyes to the review and maybe discuss how we are forcing no wheels and the message about why we are doing this and the date
19:28:10 clarkb: I think we could do that too
19:28:11 clarkb: yeah, that's definitely an option i considered, but it's a lot of changes and a very lengthy transition
19:28:28 in theory both could be done in parallel
19:28:34 fungi: the nice thing about it is the magicalness isn't hidden in the ci system. every local run of pep8 will be the same as in ci
19:28:36 yah - and would make responding to fungi's concern in that review harder in the future
19:28:38 which to me is very important
19:29:19 thing is - the use of our wheel mirrors is special to our ci system - so working around them seems fair to do in the ci system
19:29:40 mordred: sort of, aiui the intent is to make it so that somewhere we do installs using sdists
19:29:49 anybody remember off the top of your head where we similarly test that you haven't broken sdist'ing? are we doing that in tox.ini or the job definition/scripts?
19:29:50 which isn't a ci mirror thing, it's a "can we install our things without wheels"
19:30:04 they're likely to break because they don't have bindep system dependencies for things they are currently getting as wheels, right?
19:30:05 clarkb: yah - the mechanics of that are only important for the ci system itself
19:30:09 fungi: I think that happened in the run-tox.sh script (or whatever the equivalent is for pep8)
19:30:15 ianw: yah
19:30:17 fungi, in jenkins/scripts somewhere - let me get details
19:30:23 mordred: that's not true, things break when sdist-only
19:30:36 mordred: and that happens regardless of ci or not
19:30:44 fungi: in run-pep8.sh
19:30:52 ianw: yeah, there's a semi-frequent race with new dep releases that are sdist-only, where we won't have the wheels built for an hour or so after our pypi mirror updates
19:31:03 totes. I'm just saying that removing extra-index-url from the /etc/pip.conf that's on our build nodes is an action specific to our ci system
19:31:16 because our ci system injects wheels into the environment that do not exist for normal users
19:31:30 mordred: ya we hit it extra hard because of that but I think it's a general issue
19:31:31 so doing this actually makes things _more_ similar to how they run for the user on their local machine
19:32:15 this is true
19:32:25 I agree with that too
19:33:18 the reason i asked about our test that sdist works is that i think confirming these things in the same place would be a bit more consistent
19:33:36 fungi: ya I can get on board with that
19:33:46 in which case we can move this awking to run-pep8.sh
19:34:02 wfm
19:34:09 sure, I'll make some changes
19:34:48 pabelanger: clarkb: "move" in this case would be after it baked in the job config for a bit (if ever) so that we can easily revert initially without having to rebuild images
19:35:09 fungi: that sounds like a good idea :)
19:35:19 agreed
19:35:48 What sort of timelines are people thinking about making the change? 30 days out?
19:35:52 i worry that if the roll-out happens in a script embedded in our images, it makes us less able to respond to mistakes or premature change-over in a timely fashion
19:36:21 pabelanger: next 2 weeks?
19:36:21 maybe a good time to just move run-pep8.sh into jjb
19:36:30 ++
19:36:38 jeblair: agreed. we've talked about how these don't need to be in separate scripts for the most part
19:37:10 AJaeger: wfm, nobody objected
19:37:18 pabelanger: 30 days seems fine, i'd also be okay with sooner
19:37:26 I can also start work on a ML post, get some eyes on it for spelling mistakes
19:38:18 this is something that shouldn't cause an issue for most projects, and if it does it's because they're already arguably broken. also if they're using bindep.txt then it's a quick patch to fix things for them, and if they're not then odds are our bindep fallback already has all the needed deps covered anyway
19:38:44 or there is some really subtle bug where wheels work and sdists don't
19:38:52 right
19:38:53 but cases of that seem far less common
19:39:04 yeah, sounds like a real win for keeping it real with bindep.txt
19:39:08 *cough* pandas *cough*
19:39:15 fungi: :)
19:39:23 though they fixed that in 0.19.1 after much 'splainin
19:41:05 #agreed Move forward filtering pip.conf to remove wheel mirrors for pep8 jobs in two weeks; optionally move run-pep8.sh into the calling JJB builder macro.
19:41:12 ^ any disagreements on that?
19:41:58 none from me
19:42:02 thanks for the help
19:42:07 nope sounds good
19:42:11 pabelanger: presumably just an e-mail announcing this to the openstack-dev ml now and a reminder before we merge the change will be sufficient?
19:42:12 ++
19:42:24 fungi: ++
19:42:32 pabelanger: you want to do the announcing too?
19:42:45 fungi: sure, I'll get something into etherpad first
19:42:54 i wonder if we could tell people how to test with a debootstrap or something
19:43:12 #action pabelanger announce upcoming wheel-less pep8 job transition to openstack-dev ml
19:43:47 ianw: yeah... i was pondering that as well.
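The pip.conf filtering agreed above (stripping the wheel-mirror extra-index-url so pep8 jobs install from sdists and exercise the project's bindep system dependencies) could be sketched like this. The function name, paths, and exact filtering are illustrative; the real change is the one under review in project-config.

```shell
#!/bin/sh
# Hedged sketch of the pep8-job tweak: copy pip.conf without its
# extra-index-url line (where the CI wheel mirror is configured) so
# subsequent pip installs fall back to building from sdists.
# filter_wheel_mirror is an illustrative name, not the actual script.
filter_wheel_mirror() {
    src=$1
    dst=$2
    # Drop any extra-index-url line; keep everything else (index-url etc.).
    # "|| true" guards against grep's nonzero exit when every line matches.
    grep -v '^extra-index-url' "$src" > "$dst" || true
}

# Usage in a job might look like:
#   filter_wheel_mirror /etc/pip.conf "$HOME/.pip/pip.conf"
# after which pip prefers the user-level config and never sees the
# wheel mirror, so sdist-only breakage surfaces in the pep8 job.
```

Keeping this in run-pep8.sh (as discussed) rather than baked into the images means a bad roll-out can be reverted with a config change instead of an image rebuild.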
testing in a chroot is certainly a fairly clean way to go about it but instructions could get lengthy
19:44:43 alternatively we could try to figure out how to pre-test it (add that experimental job to a bunch of projects and make dummy changes in them?)
19:45:17 with the expectation that any pre-testing we do is sure to be incomplete
19:45:29 this is one place zuul v3 would make things so much simpler
19:45:52 projects who are worried about it could just propose a change running that modified job and see what happens
19:46:05 yup
19:46:25 you can also just run it locally using the infra images...
19:46:31 * clarkb has done a ton of this with the xenial stuff
19:46:54 this is true, especially if we have a good walkthrough of using our dib elements
19:47:07 the build image script should just work currently
19:47:08 which is mostly just running that script in project-config
19:47:12 but I can double check that
19:47:59 more like should we be recommending not pre-caching all the repos and stuff (does the script do that automatically)? and at least pointers for how to use the resulting image (loopback mount is likely fine in this particular case?)
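Trimming the caching elements out of a local image build, as suggested above, might look like the following. Only "openstack-repos" and "cache-devstack" come from the discussion; the rest of the element list is a guessed minimal base, not the actual nodepool configuration.

```shell
#!/bin/sh
# Illustrative sketch of building a smaller local test image by removing
# the caching elements from the full element list. The element names other
# than openstack-repos and cache-devstack are assumptions for illustration.
FULL_ELEMENTS="ubuntu-minimal vm simple-init openstack-repos cache-devstack"

# Filter out the elements that pre-cache git repos and devstack files;
# zuul-cloner can work without the cache, so jobs should mostly still run.
MINIMAL_ELEMENTS=$(printf '%s\n' $FULL_ELEMENTS \
    | grep -v -E '^(openstack-repos|cache-devstack)$' \
    | tr '\n' ' ')

echo "Building with: $MINIMAL_ELEMENTS"
# Then something like (raw output so the result can be loopback-mounted
# for inspection, per the discussion above):
#   disk-image-create -t raw -o test-image $MINIMAL_ELEMENTS
```

The actual disk-image-create invocation is left commented out since building a full image takes substantial time and disk; the point is just that dropping the caching elements yields a much smaller image for local testing.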
19:48:32 the script should make you a 1:1 to what nodepool uses but we could modify it to be more minimal by default
19:48:40 with a toggle to add in the other elements
19:48:51 i don't think you can really stop the precaching
19:49:17 ianw: you can if you remove that element from the build, and since zuul-cloner knows how to work without a cache it should mostly just work
19:49:22 and result in much smaller images
19:49:35 we could stand to refactor our elements to avoid pre-caching git repos, distro packages, devstack files, et cetera
19:49:54 fungi: I think we already have it split out to handle that
19:49:54 which we've probably already at least mostly done
19:49:56 ya
19:50:14 clarkb: yeah, but openstack-repos gets dragged in
19:50:17 clarkb: not yet - see https://review.openstack.org/322487
19:50:39 right, you can turn it all on or all off is what I mean to say
19:50:39 by cache-devstack, puppet, etc
19:50:40 I would appreciate some review of that one ^ - I wasn't sure whether that's beneficial or not
19:50:43 and that's already supported
19:50:49 you just remove the elements that do the caching
19:50:54 also we seem to be veering straight toward our last topic for the day...
19:50:56 #topic Open discussion
19:51:27 does anyone know of work in progress to run a second openstack bot since we hit our channel limit?
19:51:39 jeblair: yeah - let me find a change...
19:52:06 gerritbot presumably?
19:52:18 jeblair: https://review.openstack.org/355588 - but this needs more work and looks abandoned ;(
19:52:23 or have we hit the limit with meetbot now as well?
19:52:41 I assume gerritbot - 355588 is for gerritbot
19:52:42 related to caching things, what do people think about not caching the debian package repos until they no longer happen to be complete forks of all our repos
19:52:49 huh, seems like we should hit the limit for both at the same time
19:53:25 bkero: are you still working on that ^?
19:53:30 fewer teams moved to channel logging.
gerrit event updates have been more pupular for longer
19:53:34 popular too
19:53:38 jeblair: not every channel that uses gerritbot uses meetbot as well. But yes, meetbot will be next ;)
19:54:31 yep, i haven't counted but wouldn't be surprised to hit it soon if we haven't already
19:55:15 clarkb: there is a heck of a lot of them ...
19:55:36 clarkb: is there a plan for them not to be forks?
19:55:57 jeblair: sort of, in the debian session last week the ubuntu folk were pretty adamant that their system of not making it a complete fork was great
19:56:04 fungi, jeblair: 110 currently set up for meetbot if I counted correctly
19:56:05 and they basically said we should do that or something similar
19:56:12 need zigo to come around to it
19:56:18 oh, before next week i'll have the ocata cycle artifact signing key generated and ready for people to confirm/sign
19:56:20 #action fungi generate and sign ocata cycle signing key
19:56:37 meant to do it on friday but my brain was already turning to mush
19:56:48 mine is still mush
19:56:53 clarkb: I still think ubuntu carries tarballs in tree
19:56:54 clarkb: yeah, not caching the deb repos looks fine to me.
19:57:06 pabelanger: ya but they are point-in-time with a few of their things
19:57:13 pabelanger: it's not the 1GB of nova history or whatever it is
19:57:23 right
19:57:25 which is what I think jamespage was trying to explain, you have much smaller repos
19:57:40 but, they do contain some forked code, just less of it
19:57:41 the other alternative is the overlay thing you talked about
19:57:48 clarkb: is that to reduce the disk pressure on nodepool-builder or the image upload times? or just on principle?
19:57:56 fungi: all of the above?
:)
19:58:07 fungi: it's mostly to keep our images as small as possible because upload time is crazy right now
19:58:18 and also build times are related to hygiene in the git repo cache
19:58:20 that debian etherpad is not listed at https://wiki.openstack.org/wiki/Design_Summit/Ocata/Etherpads ;(
19:58:32 oh! right, we're caching those on all our repos right now
19:58:36 er, on all our images
19:58:41 yes to brain still being mush
20:00:05 we're out of time--thanks everyone!
20:00:06 #endmeeting