19:03:04 <fungi> #startmeeting infra
19:03:05 <openstack> Meeting started Tue Nov  1 19:03:04 2016 UTC and is due to finish in 60 minutes.  The chair is fungi. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:03:06 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:03:07 <jhesketh> Morning
19:03:09 <openstack> The meeting name has been set to 'infra'
19:03:13 <fungi> #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:03:18 <fungi> #topic Announcements
19:03:19 <bkero> o/
19:03:20 <crinkle> o/
19:03:25 <fungi> #info Many members of the Infra team met in person last week at the Ocata Design Summit in Barcelona; a summary will be provided to openstack-infra@lists.openstack.org later this week.
19:03:29 <fungi> #link https://wiki.openstack.org/wiki/Design_Summit/Ocata/Etherpads#Infrastructure
19:03:35 <fungi> #action fungi send summit session summary to infra ml
19:03:41 <fungi> as always, feel free to hit me up with announcements you want included in future meetings
19:03:51 <fungi> #topic Actions from last meeting
19:03:56 <fungi> #link http://eavesdrop.openstack.org/meetings/infra/2016/infra.2016-10-18-19.02.html
19:04:06 <fungi> pleia2 add skeleton infra session etherpads linked from the ocata design summit etherpads wiki page
19:04:07 <fungi> that's done (see above link in this week's announcements)
19:04:14 <fungi> thanks for taking care of that!
19:04:26 <fungi> ianw work on deploying a pholio server
19:04:28 <fungi> #link https://review.openstack.org/389511
19:04:32 <fungi> that looks pretty close
19:04:40 <ianw> yep, imagine that will be up this week
19:04:56 * fungi rubs hands together evilly
19:04:59 <fungi> ...excellent...
19:05:16 <fungi> #topic Specs approval: PROPOSED "Neutral governance website" (ttx)
19:05:21 <fungi> #link https://review.openstack.org/382447 "Neutral governance website" spec
19:05:29 <fungi> #info The "Neutral governance website" spec is open for Infra Council vote until 19:00 UTC Thursday, November 3.
19:05:35 <ttx> Mostly a rundown of all the steps I need to go through
19:05:50 <ttx> to rotate current / to /tc with minimal disruption
19:06:01 <fungi> it looked complete enough to me
19:06:32 <fungi> anybody need to raise any quick questions about this?
19:06:34 <ttx> let me know if you have questions
19:07:13 <fungi> it's likely common sense for those who have been following the governance changes with the uc, but the commit message and spec description spell it out pretty explicitly
19:07:20 <clarkb> I remember skimming this and it seems to be straightforward
19:07:32 <ttx> yeah, nothing fancy really
19:07:43 <fungi> thanks ttx! if nobody objects, i'll be approving it in ~48 hours
19:07:50 <ttx> cool, thanks!
19:08:01 <fungi> #topic Priority Efforts
19:08:04 <fungi> nothing is called out on the agenda
19:08:06 <fungi> though rcarrillocruz has proposed to mark infra-cloud as implemented
19:08:10 <fungi> #link https://review.openstack.org/391443 "Mark infracloud spec as complete"
19:08:15 <fungi> aside from the technicalities of the index change, anyone disagree that it's implemented enough to be listed as such?
19:08:20 <fungi> we seemed to have some consensus on this in the friday afternoon unconference anyway
19:08:32 <clarkb> +1, we are in run the cloud mode now, but its there
19:08:57 <pabelanger> ++
19:09:53 <fungi> okay, cool. once rcarrillocruz corrects the patch (or when i get around to it if he doesn't have time), i'll approve it
19:10:35 <fungi> congrats crinkle, rcarrillocruz, yolanda and everyone who worked so hard on making this work!
19:10:52 <fungi> #topic Cached image reduction changes (ianw)
19:10:56 <mordred> o/
19:10:58 <fungi> you've got a pretty lengthy summary in the agenda, but care to restate it for posterity of the meeting logs?
19:10:59 <mordred> (sorry I'm late)
19:11:30 <ianw> this was proposed by xen people for devstack, to allow the image list to be reduced in devstack's ./tools/image_list.sh
19:12:02 <ianw> thus there is a project config change to only get those images for testing that we use in the gate
19:12:30 <fungi> i think this dovetails into the arbitrary object mirroring work we've discussed recently
19:13:06 <pabelanger> I would agree
19:13:12 <clarkb> ya I think we probably want a minimal image list cached for devstack (cirros or its possible future replacement) and then put everything else on arbitrary file caching setup
19:13:12 <ianw> so i've proposed that we stop devstack downloading images on the fly, and put in a way to block that, and proposed enabling it
19:13:20 <fungi> it would be nice to figure out which of these we actually use in jobs, and then which of them we use often enough to benefit from pre-caching on job nodes vs local mirrors in each provider
19:13:58 <fungi> but i agree, not caching images we don't use at all in our jobs is a great place to start
19:13:59 <ianw> hmm, so that would conflict with the "don't download testing images on the fly" approach, if some are to be grabbed from mirrors
19:14:02 <AJaeger> I'm fine with the current change set in general. The initial changes contained duplicated hardcoded list, which I strongly objected to.
19:14:23 <clarkb> ianw: they may be grabbed via /afs though
19:14:32 <clarkb> ianw: in which case its not a download but a filepath that we set in our jobs
19:14:52 <AJaeger> looking at the paste, codesearch showed all images as being used - with exception of cirros-0.3.0-x86_64-disk.vhd.tgz. Not sure where this comes from
19:14:59 <fungi> job nodes referencing via /afs paths misses out on a reusable cache though because the nodes aren't reused
19:15:09 <jeblair> clarkb: as much as i love afs, that's almost certain to be slower than wget https://mirror/
19:15:09 <clarkb> fungi: thats true
19:15:34 <clarkb> ya forgot that we get the local caching from the apache servers which is handy
19:15:51 <ianw> so, maybe that means we don't try trimming the image list for infra?
19:15:52 <fungi> ianw: yeah, the goal i have is that we find some balance between images that are too large and/or infrequently used such that we can stop embedding them in our images
19:16:23 <fungi> and then provide those over local afs-backed mirrors
19:16:27 <jeblair> ianw: so what about having the flag permit downloading from the mirror, but not the internet?
19:16:55 <ianw> jeblair: yes, it could be some sort of allowed regex i guess
19:17:01 <mordred> like a tri-state flag
19:17:05 <mordred> true, false and maybe
19:17:18 <ianw> i'm not sure it has a concept of downloading from a mirror at all, at the moment
19:17:26 <ianw> it lists full image urls
19:17:56 <jeblair> yeah, and we definitely don't want to encode our mirrors in that list
19:18:01 <mordred> nope.
19:18:24 <ianw> it's probably fine to just leave the status quo of downloading all images to be safe then
19:18:28 <jeblair> maybe the flag could transform the url in that list to one on the mirror, then devstack would either download it from the mirror successfully, or fail if it was not mirrored.
19:18:49 <mordred> otoh - I know that local mirrors of things are useful to developers in places that are not texas ... maybe adding the concept of an overridable mirror to devstack wouldn't be _too_ terrible?
19:19:21 <mordred> like "if mirror is defined, fetch this image from $MIRROR/$PATH ; else from $UPSTREAM_URL" - or something?
19:20:02 <ianw> yes, i can propose that
19:20:09 <mordred> then if someone wanted to set up a local apache to just host 5 images they use all the time, it's easy to do without hacking devstack a ton
19:20:11 <mordred> \o/
19:20:16 <fungi> and the $PATH portion of the url would be the same as the on-disk cache path maybe?
19:20:21 <mordred> maybe so, yeah
19:20:27 <fungi> that way we don't need any fancy url mangling functions
19:20:29 <mordred> that would actually make it a nice easy interface
19:20:50 <fungi> you could even copy your local devstack cache to a webserver and serve it up without rearranging that way
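A minimal sketch of the "$MIRROR/$PATH, else $UPSTREAM_URL" idea discussed above, not devstack's actual implementation. It assumes the on-disk cache path of an image is simply its file name; the function names, the `mirror` parameter and the example.org URLs are all hypothetical.

```python
from posixpath import basename
from urllib.parse import urlsplit


def cache_relpath(upstream_url):
    """Relative cache path of an image (assumption: just the file name)."""
    return basename(urlsplit(upstream_url).path)


def image_source(upstream_url, mirror=None):
    """Pick the URL to fetch an image from.

    If a mirror base URL is configured, the image is fetched from
    <mirror>/<cache path>; otherwise the upstream URL is used unchanged.
    """
    if mirror:
        return mirror.rstrip("/") + "/" + cache_relpath(upstream_url)
    return upstream_url


if __name__ == "__main__":
    # Hypothetical upstream and mirror locations, for illustration only.
    upstream = "https://upstream.example.org/0.3.4/cirros-0.3.4-x86_64-disk.img"
    print(image_source(upstream))
    print(image_source(upstream, mirror="https://mirror.example.org/images/"))
```

Because the path under the mirror matches the cache layout, a copy of the local cache directory could be served by any web server without rearranging files, and no URL mangling beyond the prefix swap is needed.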
19:21:18 <jeblair> fungi: even easier if your local devstack cache is in afs :)
19:21:29 <fungi> heh
19:21:40 * jeblair high fives clarkb
19:21:41 <ianw> alright, so i'd propose we do NOT do the project-config change to get reduced set of images -> https://review.openstack.org/377159 and focus on ability to get images from a mirror?
19:21:48 <clarkb> heh
19:22:19 <fungi> ianw: sounds reasonable
19:22:47 <jeblair> i thought we still wanted to have some local cache?
19:23:15 <jeblair> just more options -- like "used so frequently it should be in the image" vs "used infrequently enough we want to download from mirror"
19:23:19 <fungi> right, but that's easily combined
19:23:21 <clarkb> I think we should local cache the "base" small image
19:23:26 <clarkb> because 95% of jobs use that one
19:23:33 <clarkb> and its relatively tiny
19:24:40 <fungi> #agreed Any image size reduction solutions should take care to avoid making HTTP mirroring or larger or infrequently-used files impossible.
19:24:43 <jeblair> i like that, but if we feel that d/l it from the mirror would be sufficiently fast/reliable, i could get on board with attempting the no-image-cache idea...
19:24:45 <fungi> ^ yeah?
19:24:52 <jeblair> (that was in response to clarkb)
19:25:01 <fungi> oh, typo
19:25:03 <fungi> #undo
19:25:04 <openstack> Removing item from minutes: <ircmeeting.items.Agreed object at 0x7fecfadbf710>
19:25:13 <fungi> #agreed Any image size reduction solutions should take care to avoid making HTTP mirroring of larger or infrequently-used files impossible.
19:25:20 <ianw> ok ... yeah i have some more ideas to move forward with thanks
19:25:28 <fungi> anyone _disagree_ with that statement?
19:25:55 <ianw> not me :)
19:25:57 <fungi> thanks for bringing this hairy implementation change to the meeting, ian!
19:26:41 <jeblair> ++
19:26:48 <fungi> #topic Force gate-{name}-pep8-{node} to build needed wheels (pabelanger)
19:26:53 <fungi> #link https://review.openstack.org/391875 "Force gate-{name}-pep8-{node} to build needed wheels"
19:26:56 <fungi> thanks for hacking on this!
19:27:00 <pabelanger> o/
19:27:01 <fungi> i meant to do it months ago, and then as usual got distracted by other fires
19:27:03 <fungi> it looks more or less like what we talked about on friday
19:27:17 <pabelanger> So, this popped into my head again at summit and was able to get it working quickly with an experimental zuul job
19:27:33 <clarkb> as an alternative, maybe we want to push that into the projects and they can do --no-use-wheels or whatever the pip flag is?
19:27:40 <pabelanger> wanted to bring some eyes to the review and may discuss how we are forcing no wheels and the message about why we are doing this and date
19:28:10 <pabelanger> clarkb: I think we could do that too
19:28:11 <fungi> clarkb: yeah, that's definitely an option i considered, but it's a lot of changes and a very lengthy transition
19:28:28 <fungi> in theory both could be done in parallel
19:28:34 <clarkb> fungi: the nice thing about it is the magicalness isn't hidden in the ci system. every local run of pep8 will be the same as in ci
19:28:36 <mordred> yah - and would make responding to fungi's concern in that review harder in the future
19:28:38 <clarkb> which to me is very important
19:29:19 <mordred> thing is - the use of our wheel mirrors is special to our ci system - so working around them seems fair to do in the ci system
19:29:40 <clarkb> mordred: sort of, aiui the intent is to make it so that somewhere we do installs using sdists
19:29:49 <fungi> anybody remember off the top of your head where we similarly test that you haven't broken sdist'ing? are we doing that in tox.ini or the job definition/scripts?
19:29:50 <clarkb> which isn't a ci mirror thing, it's a "can we install our things without wheels" thing
19:30:04 <ianw> they're likely to break because they don't have bindep system dependencies for things they are currently getting as wheels, right?
19:30:05 <mordred> clarkb: yah - the mechanics of that are only important for the ci system itself
19:30:09 <clarkb> fungi: I think that happened in the run-tox.sh script (or whatever the equivalent is for pep8)
19:30:15 <mordred> ianw: yah
19:30:17 <AJaeger> fungi, in jenkins/scripts somewhere - let me get details
19:30:23 <clarkb> mordred: thats not true, things break when sdist only
19:30:36 <clarkb> mordred: and that happens regardless of ci or not
19:30:44 <AJaeger> fungi: in run-pep8.sh
19:30:52 <fungi> ianw: yeah, there's a semi-frequent race with new dep releases that are sdist-only, where we won't have the wheels built for an hour or so after our pypi mirror updates
19:31:03 <mordred> totes. I'm just saying that removing extra-index-url from the /etc/pip.conf that's on our build nodes is an action specific to our ci system
19:31:16 <mordred> because our ci system injects wheels into the environment that do not exist for normal users
19:31:30 <clarkb> mordred: ya we hit it extra hard because of that but I think its a general issue
19:31:31 <mordred> so doing this actually makes things _more_ similar to how they run for the user on their local machine
19:32:15 <fungi> this is true
19:32:25 <pabelanger> I agree with that too
19:33:18 <fungi> the reason i asked about our test that sdist works is that i think confirming these things in the same place would be a bit more consistent
19:33:36 <clarkb> fungi: ya I can get on board with that
19:33:46 <clarkb> in which case we can move this awking to run-pep8.sh
19:34:02 <mordred> wfm
19:34:09 <pabelanger> sure, I'll make some changes
19:34:48 <fungi> pabelanger: clarkb: "move" in this case would be after it baked in the job config for a bit (if ever) so that we can easily revert initially without having to rebuild images
19:35:09 <clarkb> fungi: that sounds like a good idea :)
19:35:19 <pabelanger> agreed
19:35:48 <pabelanger> What sort of timelines are people thinking about making the change? 30days out?
19:35:52 <fungi> i worry that if the roll-out happens in a script embedded in our images, it makes us less able to respond to mistakes or premature change-over in a timely fashion
19:36:21 <AJaeger> pabelanger: next 2 weeks?
19:36:21 <jeblair> maybe a good time to just move run-pep8.sh into jjb
19:36:30 <mordred> ++
19:36:38 <fungi> jeblair: agreed. we've talked about how these don't need to be in separate scripts for the most part
19:37:10 <pabelanger> AJaeger: wfm, nobody objected
19:37:18 <fungi> pabelanger: 30 days seems fine, i'd also be okay with sooner
19:37:26 <pabelanger> I can also start work on a ML post, get some eyes on it for spelling mistakes
19:38:18 <fungi> this is something that shouldn't cause an issue for most projects, and if it does it's because they're already arguably broken. also if they're using bindep.txt then it's a quick patch to fix things for them, and if they're not then odds are our bindep fallback already has all the needed deps covered anyway
19:38:44 <clarkb> or there is some really subtle bug where wheels work and sdists don't
19:38:52 <fungi> right
19:38:53 <clarkb> but cases of that seem far less common
19:39:04 <ianw> yeah, sounds like a real win for keeping it real with bindep.txt
19:39:08 <fungi> *cough* pandas *cough*
19:39:15 <clarkb> fungi: :)
19:39:23 <fungi> though they fixed that in 1.19.1 after much 'splainin
19:41:05 <fungi> #agreed Move forward filtering pip.conf to remove wheel mirrors for pep8 jobs in two weeks; optionally move run-pep8.sh into the calling JJB builder macro.
19:41:12 <fungi> ^ any disagreements on that?
19:41:58 <pabelanger> none from me
19:42:02 <pabelanger> thanks for the help
19:42:07 <clarkb> nope sounds good
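A minimal sketch of the pip.conf filtering agreed above, not the change under review in 391875. It assumes the wheel mirror is injected as a single-line `extra-index-url` setting (as described for /etc/pip.conf on the build nodes); the function name and the example.org URLs are hypothetical, and multi-line values are not handled.

```python
def strip_wheel_mirror(pip_conf_text):
    """Return pip.conf-style text with every extra-index-url line removed.

    Assumption: each setting sits on one line as "key = value".
    """
    kept = []
    for line in pip_conf_text.splitlines():
        # Compare only the key, so "index-url" itself is left alone.
        key = line.split("=", 1)[0].strip().lower()
        if key == "extra-index-url":
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # Hypothetical configuration, for illustration only.
    sample = (
        "[global]\n"
        "index-url = https://mirror.example.org/pypi/simple\n"
        "extra-index-url = https://mirror.example.org/wheel/\n"
    )
    print(strip_wheel_mirror(sample), end="")
```

With the extra index gone, pip falls back to whatever the primary index offers, so packages that only had wheels on the CI wheel mirror get built from sdists, which is closer to what a developer sees locally.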
19:42:11 <fungi> pabelanger: presumably just an e-mail announcing this to the openstack-dev ml now and a reminder before we merge the change will be sufficient?
19:42:12 <mordred> ++
19:42:24 <pabelanger> fungi: ++
19:42:32 <fungi> pabelanger: you want to do the announcing too?
19:42:45 <pabelanger> fungi: sure, I'll get something into etherpad first
19:42:54 <ianw> i wonder if we could tell people how to test with a debootstrap or something
19:43:12 <fungi> #action pabelanger announce upcoming wheel-less pep8 job transition to openstack-dev ml
19:43:47 <fungi> ianw: yeah... i was pondering that as well. testing in a chroot is certainly a fairly clean way to go about it but instructions could get lengthy
19:44:43 <fungi> alternatively we could try to figure out how to pre-test it (add that experimental job to a bunch of projects and make dummy changes in them?)
19:45:17 <fungi> with the expectation that any pre-testing we do is sure to be incomplete
19:45:29 <fungi> this is one place zuul v3 would make things so much simpler
19:45:52 <fungi> projects who are worried about it could just propose a change running that modified job and see what happens
19:46:05 <mordred> yup
19:46:25 <clarkb> you can also just run it locally using the infra images...
19:46:31 * clarkb has done a ton of this with the xenial stuff
19:46:54 <fungi> this is true, especially if we have a good walkthrough of using our dib elements
19:47:07 <clarkb> the build image script should just work currently
19:47:08 <fungi> which is mostly just running that script in project-config
19:47:12 <clarkb> but I can double check that
19:47:59 <fungi> more like should we be recommending not pre-caching all the repos and stuff (does the script do that automatically)? and at least pointers for how to use the resulting image (loopback mount is likely fine in this particular case?)
19:48:32 <clarkb> the script should make you a 1:1 to what nodepool uses but we could modify it to be more minimal by default
19:48:40 <clarkb> with a toggle to add in the other elements
19:48:51 <ianw> i don't think you can really stop the precaching
19:49:17 <clarkb> ianw: you can if you remove that element from the build, and since zuul-cloner knows how to work without a cache it should mostly just work
19:49:22 <clarkb> and result in much smaller images
19:49:35 <fungi> we could stand to refactor our elements to avoid pre-caching git repos, distro packages, devstack files, et cetera
19:49:54 <clarkb> fungi: I think we already have it split out to handle that
19:49:54 <fungi> which we've probably already at least mostly done
19:49:56 <clarkb> ya
19:50:14 <ianw> clarkb: yeah, but openstack-repos gets dragged in
19:50:17 <AJaeger> clarkb: not yet - see https://review.openstack.org/322487
19:50:39 <clarkb> right you can turn it all on or all off is what I mean to say
19:50:39 <ianw> by cache-devstack, puppet, etc
19:50:40 <AJaeger> I would appreciate some review of that one ^ - I wasn't sure whether that's beneficial or not
19:50:43 <clarkb> and thats already supported
19:50:49 <clarkb> you just remove the elements that do the caching
19:50:54 <fungi> also we seem to be veering straight toward our last topic for the day...
19:50:56 <fungi> #topic Open discussion
19:51:27 <jeblair> does anyone know of work in progress to run a second openstack bot since we hit our channel limit?
19:51:39 <AJaeger> jeblair: yeah - let me find a change...
19:52:06 <fungi> gerritbot presumably?
19:52:18 <AJaeger> jeblair: https://review.openstack.org/355588 - but this needs more work and looks abandoned ;(
19:52:23 <fungi> or have we hit the limit with meetbot now as well?
19:52:41 <AJaeger> I assume gerritbot - 355588 is for gerritbot
19:52:42 <clarkb> related to caching things, what do people think about not caching the debian package repos until they no longer happen to be complete forks of all our repos
19:52:49 <jeblair> huh, seems like we should hit the limit for both at the same time
19:53:25 <jeblair> bkero: are you still working on that ^?
19:53:30 <fungi> fewer teams moved to channel logging. gerrit event updates have been more pupular for longer
19:53:34 <fungi> popular too
19:53:38 <AJaeger> jeblair: not every channel that uses gerritbot uses meetbot as well. But yes, meetbot will be next ;)
19:54:31 <fungi> yep, i haven't counted but wouldn't be surprised to hit it soon if we haven't already
19:55:15 <ianw> clarkb: there is a heck of a lot of them ...
19:55:36 <jeblair> clarkb: is there a plan for them not to be forks?
19:55:57 <clarkb> jeblair: sort of, in debian session last week the ubuntu folk were pretty adamant that their system of not making it a complete fork was great
19:56:04 <AJaeger> fungi, jeblair 110 currently setup for meetbot if I counted correctly
19:56:05 <clarkb> and they basically said we should do that or something similar
19:56:12 <clarkb> need zigo to come around to it
19:56:18 <fungi> oh, before next week i'll have the ocata cycle artifact signing key generated and ready for people to confirm/sign
19:56:20 <fungi> #action fungi generate and sign ocata cycle signing key
19:56:37 <fungi> meant to do it on friday but my brain was already turning to mush
19:56:48 <clarkb> mine is still mush
19:56:53 <pabelanger> clarkb: I still think ubuntu carries tarballs in tree
19:56:54 <AJaeger> clarkb: yeah, not caching the deb repos looks fine to me.
19:57:06 <clarkb> pabelanger: ya but they are point in time with a few of their things
19:57:13 <clarkb> pabelanger: its not the 1GB of nova history or whatever it is
19:57:23 <pabelanger> right
19:57:25 <clarkb> which is what I think jamespage was trying to explain, you have much smaller repos
19:57:40 <pabelanger> but, they do contain some forked code, just less of it
19:57:41 <clarkb> the other alternative is the overlay thing you talked about
19:57:48 <fungi> clarkb: is that to reduce the disk pressure on nodepool-builder or the image upload times? or just on principle?
19:57:56 <clarkb> fungi: all of the above? :)
19:58:07 <clarkb> fungi: its mostly to keep our images as small as possible because upload time is crazy right now
19:58:18 <clarkb> and also build times are related to hygiene in the git repo cache
19:58:20 <AJaeger> that debian etherpad is not listed at https://wiki.openstack.org/wiki/Design_Summit/Ocata/Etherpads ;(
19:58:32 <fungi> oh! right, we're caching those on all our repos right now
19:58:36 <fungi> er, on all our images
19:58:41 <fungi> yes to brain still being mush
20:00:05 <fungi> we're out of time--thanks everyone!
20:00:06 <fungi> #endmeeting