19:05:21 <fungi> #startmeeting infra
19:05:22 <austin81> o/
19:05:23 <openstack> Meeting started Tue Jan  5 19:05:21 2016 UTC and is due to finish in 60 minutes.  The chair is fungi. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:05:24 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:05:26 <openstack> The meeting name has been set to 'infra'
19:05:33 <fungi> #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:05:43 <fungi> #topic Announcements
19:06:10 <fungi> i didn't have any new announcements for this week, but will repeat any from the past couple meetings for the benefit of those who had a life ;)
19:06:48 <fungi> #info Tentative deadline for infra-cloud sprint registrations is Friday, January 29, 2016.
19:06:59 <fungi> #info HPE is catering lunches at the infra-cloud sprint; please update your registration with any dietary restrictions.
19:07:08 <fungi> #link http://lists.openstack.org/pipermail/openstack-infra/2015-December/003602.html
19:07:10 <nibalizer> thanks hpe
19:07:16 <fungi> #link https://wiki.openstack.org/wiki/Sprints/InfraMitakaSprint
19:07:33 <fungi> that was all we had under announcements looks like
19:07:34 <jeblair> i'm hungry already
19:07:39 <mordred> mmmm
19:07:42 <fungi> me^2
19:07:56 <fungi> #topic Actions from last meeting
19:08:01 <fungi> #link http://eavesdrop.openstack.org/meetings/infra/2015/infra.2015-12-29-19.02.html
19:08:17 <fungi> AJaeger propose a spec to consolidate translation setup
19:08:34 <fungi> that was based on outcome from the discussion in last week's meeting
19:08:41 <fungi> anybody happen to spot if that is up?
19:09:13 <fungi> #link https://review.openstack.org/262545
19:09:15 <fungi> i think that's it
19:09:29 <anteaya> looks like it
19:09:34 <fungi> still wip, but i'll not re-add the action
19:09:42 <fungi> it's at least in progress
19:09:50 <fungi> that's all we still had on action items
19:09:56 <fungi> #topic Specs approval
19:09:56 <AJaeger> yep, in progress - I need some input from all of you on this ;)
19:09:59 <cody-somerville> \o_
19:10:17 <fungi> repeating this one since i deferred it an extra week to give vacationers some opportunity to check it out
19:10:32 <fungi> #link https://review.openstack.org/246550
19:10:41 <fungi> PROPOSED: Consolidation of docs jobs
19:10:51 <fungi> #info Council voting is open on the "Consolidation of docs jobs" spec until 19:00 UTC on Thursday, January 7, 2016.
19:11:15 <fungi> AJaeger has been a spec-writing fiend lately
19:11:28 <fungi> anyway, that's the only spec on the agenda currently
19:11:38 <AJaeger> That's enough specs for this year, fungi ;)
19:11:41 <jeblair> oh hrm
19:11:42 <fungi> heh
19:12:01 <jeblair> it's worth discussing why we don't just run 'tox -e docs' now
19:12:27 <AJaeger> jeblair: I'm happy to have that discussion!
19:12:52 <fungi> cue pti discussion from 2015 tc meetings?
19:12:55 <jeblair> which is because in the past, people have put extra steps in the tox env which means the normal way of building docs with setup.py doesn't work
19:12:56 <jeblair> right
19:13:33 <jeblair> this previously came up at the tc level with a proposed change to the project testing interface
19:13:54 <jeblair> anyway -- i'm not advocating that we must keep things the way they are...
19:13:55 <AJaeger> jeblair: the challenge is if there's more than one root - releasenotes, api doc, and "normal" docs, which also get published in different places
19:14:13 <pabelanger> Ya, I was pretty surprised to find out tox -edocs wasn't run in the gate a few weeks ago.
19:14:31 <jeblair> however, i think it's worth noting that they are the way they are on purpose, and if we want to change them, we need to either decide that we don't care about the benefits we're getting now or find some other way to get them
19:14:48 <jeblair> and also, that if we change this, we need to change the PTI as well (simply an infra spec is not sufficient)
19:16:14 <clarkb> AJaeger: couldn't you have all of those be built by default?
19:16:31 <clarkb> via a common root
19:16:51 <AJaeger> clarkb: they might have different themes
19:16:55 <clarkb> ah
19:17:26 <krotscheck> o/
19:17:26 <AJaeger> clarkb: still, it's worth trying to do this.
19:18:25 <AJaeger> clarkb: I don't know yet how to do this properly and would evaluate that option, so please mention it in the spec.
19:18:46 <fungi> any particular reason why they should have different themes for docs, api reference and release notes?
19:19:07 <fungi> i mean, just because some of them do doesn't necessarily mean that was intended or even desired
19:19:16 <dhellmann> it is desired
19:19:22 <AJaeger> fungi: http://developer.openstack.org/api-guide/compute/ uses openstackdocstheme, the others use oslo.sphinx
19:19:41 <AJaeger> so, that was intentional by annegentle
19:19:46 <dhellmann> the release notes theme may change, too
19:20:56 <fungi> cool, just making sure that was complicating this for a good reason
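(For context on the theme point: each Sphinx doc root carries its own conf.py, and the theme is selected there, which is why roots with different themes can't trivially share a single build. The fragments below are only a rough sketch of the configuration style of that era -- oslosphinx vs. openstackdocstheme -- not any particular project's actual conf.py.)

    # doc/source/conf.py (sketch) -- "normal" docs using the oslosphinx theme
    extensions = ['oslosphinx']

    # api-guide/source/conf.py (sketch) -- api guide using openstackdocstheme,
    # following the old pre-extension usage documented at the time
    import openstackdocstheme
    html_theme = 'openstackdocs'
    html_theme_path = [openstackdocstheme.get_html_theme_path()]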
19:21:41 <fungi> so i guess the other concern is that the problem description doesn't come out and describe a problem
19:22:09 <fungi> it notes that we run multiple brief jobs but doesn't say why that's bad
19:22:55 <fungi> sometimes the cure can be worse than the disease. need to be able to evaluate whether the overhead of multiple jobs for this is sufficiently undesirable to warrant the proposed solution
19:23:38 <AJaeger> fungi, agreed - as I said last time: Please evaluate whether the complications this will bring are worth the decrease in jobs
19:24:01 <fungi> it's counter-intuitive, but long-running jobs tie up a lot more resources. the waste from these is a relatively small % of quota being burned in node churn
19:24:40 <fungi> anyway, let's put comments in the review and move on with the meeting agenda
19:25:07 <fungi> #topic Priority Efforts: Store Build Logs in Swift
19:25:31 <fungi> jhesketh has recently put updates up for the remaining pieces it looks like
19:25:38 <fungi> #link https://review.openstack.org/#/q/status:open+topic:enable_swift
19:25:52 <jeblair> yep; apparently there's a zuul change needed too
19:25:56 <fungi> not sure if he's around and has anything else to add
19:26:09 <jeblair> so we'll need a restart somewhere in there
19:26:11 <fungi> two zuul changes from the look of things
19:26:38 <fungi> or at least a change in two parts
19:27:08 <fungi> also looks like it includes a scope expansion on the spec
19:27:26 <fungi> covering swift-based file tree metadata management
19:28:15 <fungi> i guess nothing else to cover here at the moment, other than we can hopefully knock this out soon with some reviewing and a zuul maintenance
19:28:30 <anteaya> thanks jhesketh
19:28:38 <fungi> but probably best not to schedule the maintenance until changes are close to approval
19:29:18 <fungi> i don't see anything else being called out on the priority efforts this week, so moving on to other meeting agenda topics
19:29:35 <fungi> #topic How to setup translations?
19:30:04 <clarkb> AJaeger: I think I saw that the claim is django must use django.pot?
19:30:07 <fungi> looks like this is the wip spec i mentioned earlier
19:30:13 <AJaeger> not sure whether we need to discuss this again after talking about it last week.
19:30:22 <AJaeger> please review the linked spec
19:30:25 <fungi> AJaeger: okay, and we covered the spec under action items
19:30:25 <AJaeger> clarkb: indeed
19:30:38 <fungi> but to reiterate
19:30:42 <fungi> #link https://review.openstack.org/262545
19:30:46 <fungi> thanks AJaeger!
19:30:59 <fungi> #topic StackViz integration? (timothyb89, austin81)
19:31:08 <fungi> #link https://etherpad.openstack.org/p/BKgWlKIjgQ
19:31:24 <fungi> what's all this then?
19:31:51 <timothyb89> right, so, we're looking for input on how best to get CI going for stackviz, and to get it running for gate jobs
19:32:12 <timothyb89> we've detailed our goals in the etherpad, along with some examples and possible solutions
19:32:33 <AJaeger> timothyb89: what's stackviz?
19:32:44 <fungi> this is a tool which analyzes metrics about the services which were running under devstack during a test job?
19:33:12 <jeblair> this looks like subunit2html?
19:33:12 <clarkb> are you able to use the npm that ships with distros?
19:33:13 <timothyb89> more or less, it's a performance viz and debugging tool for devstack/tempest runs
19:33:30 <fungi> oh, it's analyzing tempest tests individually?
19:33:31 <clarkb> jeblair: it does a bit more, mapping resource use on the system against test runs over time
19:33:34 <timothyb89> clarkb: yes, it uses the same setup as openstack-health on the build-side, which is already working
19:33:40 <jeblair> ah i see the cpu/mem stuff now
19:34:24 <clarkb> timothyb89: so the npm runtime cost is because npm is slow, not because of installing/building npm?
19:34:32 <fungi> is the test-specific part making use of the same subunit2sql database openstack-health uses?
19:34:34 <austin81> clarkb: Correct
19:34:51 <fungi> ahh, you just answered that before i asked it
19:34:57 <timothyb89> fungi: right now it pulls all data from an actual subunit file, no db connection is needed at all
19:35:41 <jeblair> aha, i think i'm caught up
19:36:03 <fungi> thanks
19:36:30 <austin81> We are leaning towards the tarball option for storing a pre-built site, then pulling it from a d-g cleanup script
19:36:30 <jeblair> so there's some heavy work (build stackviz) with 2 options for making sure it can happen before test runs, and then inside of test runs, just a little bit of post-processing, and then we can upload the statically generated results with the test results
19:36:46 <timothyb89> jeblair: yes, exactly that
19:37:16 <jeblair> the downside to building in nodepool images is that sometimes we go for a long time without being able to successfully build them
19:37:36 <fungi> yeah, need to be able to tolerate days, maybe even weeks, of staleness there
19:37:40 <jeblair> so it means that if you want to change stackviz itself, there may be periods (like a week or more) where what's on nodepool is stale
19:38:03 <fungi> also need to be able to tolerate inconsistency across providers/regions
19:38:20 <timothyb89> that isn't a huge problem for us, with just the 2 of us development isn't fast-paced at all
19:38:28 <fungi> we might successfully upload in ovh and two of three rackspace regions but have one that's exhibiting issues getting new images uploaded
19:38:44 <fungi> or something like that
19:38:51 <jeblair> if you're okay with that, i think building them into nodepool images is fine, and fairly simple.  but if stackviz is evolving enough that you wouldn't be able to tolerate that, then the tarball thing is better
19:39:07 <mordred> ++
19:39:13 <jeblair> downside of tarball is there's extra network traffic and delay in each job
19:39:14 <fungi> so as long as we expect that it will sometimes lag and may lag by different amounts in different places, it's a possibility
19:39:19 <clarkb> how big is the output? I guess I can just look at the examples
19:39:22 <timothyb89> realistically we're looking at maybe 1-2 changes merged per week, and I don't see a pressing need to have changes applied immediately
19:39:45 <timothyb89> clarkb: ~500kb to 1.5MB if compressed, depending on log sizes
19:39:52 <fungi> if the tarball is tiny, then it's probably not any worse than a lot of other things devstack downloads (distro package lists, for example)
19:39:59 <jeblair> fungi: right
19:40:00 <clarkb> ok thats not bad (I always worry with log related things as they can explode)
19:40:22 <austin81> What sort of delay are we looking at with pulling from the tarball site?
19:40:25 <timothyb89> clarkb: actually, the actual tarball would be only ~100kb, that larger size is post-data processing
19:40:46 <jeblair> austin81: just whatever it takes to fetch that (possibly over an ocean) and install it
19:40:51 <clarkb> ya I am interested in the post data processing since that gets uploaded to the log server for each job right?
19:41:11 <fungi> austin81: no real delay. if you build in a post job for stackviz changes merged or something, then it basically just depends on how long the post pipeline backlog is on any given day. sometimes hours
19:41:27 <jeblair> (clarkb and i are talking about 2 different network transfers, i'm talking about downloading stackviz itself to the worker node, clarkb is talking about uploading from the worker to the log server)
19:41:45 <fungi> and i'm talking about uploading your tarball to the tarballs site
19:41:51 <fungi> so make that three transfers
19:42:20 <austin81> fungi: Ah okay I was more curious about pulling it onto the logs server. Not too concerned about the upload speed
19:42:38 <austin81> Since, like timothyb89 said, we are not dealing with very many patches
19:42:38 <clarkb> and it's not a python project that could have a prebuilt sdist on our pypi mirrors?
19:42:41 <timothyb89> clarkb: the demo site has a 9.8MB footprint with data due to logs, but when trimmed it is < 1MB compressed
19:43:10 <clarkb> (I see pip on the etherpad so not sure how that works)
19:43:20 <fungi> yeah, if tarballs are an option, then packaging them in an sdist/wheel and mirroring them on our pypi mirrors could make sense
19:43:35 <fungi> just tag a release when you have new changes you want incorporated
19:43:41 <timothyb89> clarkb: the data processing is a small python module, we'll probably want some form of prepackaging for that
19:44:09 <clarkb> timothyb89: I am suggesting combine it all so that when you push a tag of stackviz you get a pip installable sdist that works
19:44:25 <clarkb> then that can be done whenever you are happy with it, devstack installs from the mirror and runs it
19:44:26 <fungi> and includes the static web components prebuilt
19:44:31 <clarkb> ya
19:44:46 <timothyb89> clarkb: ah, right, that makes sense
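A minimal sketch of what clarkb is suggesting here, assuming the npm build step copies its static output into a package directory before the sdist is built; the module names and entry point below are hypothetical, not stackviz's actual layout:

    # setup.py (hypothetical layout): ship the prebuilt static site inside the
    # sdist so consumers only need pip, never node/npm
    from setuptools import setup, find_packages

    setup(
        name='stackviz',
        version='0.1.0',  # in practice driven by the git tag (e.g. via pbr)
        packages=find_packages(),
        # static/ is assumed to hold the output of the npm/gulp build
        package_data={'stackviz': ['static/*', 'static/*/*']},
        entry_points={
            'console_scripts': [
                # hypothetical console script for the subunit post-processing step
                'stackviz-export = stackviz.export:main',
            ],
        },
    )

With something like that tagged, released and mirrored, devstack could pip install it from the local mirror and run the export step, as the discussion below describes.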
19:45:12 <fungi> anyway, it sounds like we have some potential direction on next steps for this? want to move on to the other three agenda items in the next 15 minutes
19:45:30 <timothyb89> that's a lot of good advice, thanks all for input!
19:45:31 <austin81> fungi: Yes, thank you for the discussion
19:45:34 <fungi> thanks timothyb89/austin81!
19:45:51 <fungi> #topic Next steps for release automation work (dhellmann)
19:45:59 <fungi> #link http://specs.openstack.org/openstack-infra/infra-specs/specs/complete-reviewable-release-automation.html
19:46:18 <fungi> dhellmann: i suppose some of this is dependent on me working on the artifact signing pieces i have planned
19:46:26 <dhellmann> we have most of the scripts we need in the release-tools repo, so I'm looking for what our next steps are
19:46:33 <fungi> the key management bits in particular
19:46:40 <dhellmann> right now those scripts have 2 dependencies I don't know how we'll work with in CI: git keys and sending email
19:46:51 <mordred> does key management depend on zuul v3?
19:47:00 <fungi> not in the current design, no
19:47:02 <mordred> kk
19:47:04 <mordred> col
19:47:06 <fungi> just a trusted jenkins slave
19:47:06 <mordred> cool
19:47:16 <jeblair> what's a git key?
19:47:21 <mordred> jeblair: gpg key
19:47:23 <fungi> that spec is also published but not really underway yet
19:47:26 <jeblair> oh ok
19:47:28 <mordred> jeblair: to use to sign the tag
19:47:32 <dhellmann> sorry, gpg
19:47:39 <fungi> #link http://specs.openstack.org/openstack-infra/infra-specs/specs/artifact-signing.html
19:48:03 <jeblair> so that's probably just another secret on a trusted slave, yeah?
19:48:22 <fungi> i think we determined at the summit that we were just going to use the same openpgp key to make detached signatures of artifacts and to sign git tags
19:48:32 <dhellmann> yes, that's what I remember, too
19:48:37 <mordred> jeblair: yah. I think so. although I'm guessing we'd want to sign that key with infra-root and release-team keys at some point maybe?
19:48:38 <fungi> so, yes, stick in hiera, puppet onto worker
19:48:49 <jeblair> (which means the signing job can't run any in-repo code)
19:48:50 <fungi> mordred: yep, mentioned in the second spec i linked
19:48:53 <mordred> woot
19:48:59 <fungi> yeah
19:49:00 <mordred> it's like you've thought through this already
19:49:26 <fungi> it's like we've written this stuff a while ago because it's sometimes easier to find free time to plan than to execute ;)
19:49:30 <jeblair> as for email -- i suppose we can have a slave (probably the same) send email; they all probably already run exim servers anyway...
19:49:43 <dhellmann> jeblair : the signing script is http://git.openstack.org/cgit/openstack-infra/release-tools/tree/release.sh but I don't know if that runs code you'd define as in-tree
19:50:08 <jeblair> where is the email being sent to?
19:50:15 <dhellmann> openstack-dev or openstack-announce
19:50:16 <fungi> dhellmann: i think we discussed a while back that it might need to move to project-config jenkins/scripts directory instead
19:50:16 <jeblair> just to lists.o.o?
19:50:26 <dhellmann> jeblair : yeah
19:50:32 <dhellmann> fungi : we did, that's true
19:50:41 <fungi> and yeah, exim is running on release.slave.o.o for example
19:50:50 <fungi> so having the worker send e-mail is pretty trivial
19:50:59 <jeblair> cool, then since we control that anyway, it seems like it shouldn't be much of a problem -- we don't really need a smarthost or anything since lists.o.o will effectively handle that for us.
19:51:14 <dhellmann> the scripts generate a fully formatted email with headers and all, so we just need a way to get that into the outgoing mail queue
19:51:23 <fungi> import smtp
19:51:28 <fungi> smtp.send(foo)
19:51:31 <fungi> if memory serves ;)
19:51:34 <jeblair> (and that means this will transition to zuulv3 easily where the slave will be short-running -- we'll just make sure lists.o.o can accept mail from our slaves)
19:52:09 <dhellmann> fungi : yeah, I could write a little thing; I wasn't sure if something like that already existed. I'm using msmtp on my local system
19:52:17 <jeblair> dhellmann, fungi: yeah, either smtp over port 25 or /usr/lib/sendmail should be fine.
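A minimal sketch of the localhost-SMTP option under discussion, assuming the release scripts have already written a fully formatted message (headers and all) to a file; the helper name is hypothetical, and switching to a smarthost later is just a change to the host argument:

    # send_release_mail.py (hypothetical helper)
    import email
    import email.utils
    import smtplib
    import sys

    def main(path):
        # the release scripts are assumed to have produced a complete message
        with open(path) as f:
            msg = email.message_from_file(f)
        sender = email.utils.parseaddr(msg.get('From', ''))[1]
        recipients = [addr for _, addr in email.utils.getaddresses(
            msg.get_all('To', []) + msg.get_all('Cc', []))]
        # hand the message to the local exim; point at a smarthost instead later
        server = smtplib.SMTP('localhost')
        try:
            server.sendmail(sender, recipients, msg.as_string())
        finally:
            server.quit()

    if __name__ == '__main__':
        main(sys.argv[1])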
19:52:19 <clarkb> but zuulv3 also supports long running slave too right? so we won't have to make that transition immediately
19:52:42 <clarkb> or as part of the switch (depending on what is easier)
19:52:47 <jeblair> clarkb: yeah
19:52:51 <dhellmann> what's the difference between long and short running and how does it apply here?
19:52:53 <fungi> yeah, preferably stuff it through /usr/lib/sendmail so we get queuing in the event of a network issue
19:53:06 <jeblair> fungi: smtp to localhost
19:53:22 <fungi> oh, right, that
19:53:35 <jeblair> fungi: so network issues highly unlikely
19:53:57 <fungi> dhellmann: the current proposal is a persistent manually maintained job worker rather than dynamic single-use nodepool-maintained workers
19:54:07 <jeblair> honestly, if there's no other reason, i'd go with smtp because it's more flexible (can be used in other situations to talk to a remote smarthost with no local server)
19:54:09 <fungi> but in zuul v3 it'll just all be the latter
19:54:09 <dhellmann> fungi : ah, ok
19:54:36 <dhellmann> fungi : in that latter case, would there be a chance that the node would be killed before the queued email is actually sent?
19:54:51 <fungi> jeblair: yeah, the smtp module in stdlib pointed at the loopback makes the most sense, and exim is almost certainly already configured to dtrt there
19:55:13 * dhellmann makes a note to work on a script to push mail to smtp on localhost
19:55:21 <fungi> dhellmann: hrm, perhaps. we might have to make sure the job doesn't terminate until the message is delivered
19:55:29 <jeblair> ya
19:55:32 <mordred> dhellmann, jeblair: I think when we go to ephemeral nodes, we'll want to do smtp to remote smarthost
19:55:35 <fungi> worth pondering anyway
19:55:43 <dhellmann> mordred : yeah
19:56:09 <fungi> mordred: agreed, though then we need to do our own retrying
19:56:14 <jeblair> yeah
19:56:18 <mordred> indeed
19:56:29 <jeblair> well, if we write the script now to use smtp to localhost, then it's trivial to smtp to smarthost later
19:56:31 <fungi> we're about out of time, two topics remaining on the agenda
19:56:34 <mordred> jeblair: ++
19:56:38 <persia> Perhaps one of the small desktop-intended SMTP outbound servers would work for that, rather than all of exim.
19:56:43 <fungi> clarkb: pabelanger: can your topics wait for next week?
19:56:48 <mordred> persia: we've already got exim everywhere
19:56:52 <clarkb> fungi: maybe? it's time bound to january 31
19:56:55 <mordred> persia: doing a different smtp server would be harder
19:56:58 <clarkb> and I will be afk ish the week prior to that
19:57:03 <fungi> clarkb: that's why i thought it might
19:57:13 <persia> mordred: Now, but for ephemeral?
19:57:18 <jeblair> (and if we're sure that we only deliver to lists.o.o, we can skip the step of setting up a smarthost and just 'smarthost' to lists.o.o :)
19:57:27 <pabelanger> fungi: Yup
19:57:29 <pabelanger> no rush
19:57:29 <clarkb> we can also just discuss it really quickly after the meeting
19:57:31 <mordred> persia: it's in our basic puppet manifests
19:57:32 <jeblair> persia: nothing is easier than exim
19:57:36 <mordred> jeblair: ++
19:58:30 <fungi> clarkb: sounds good
19:58:32 <fungi> pabelanger: thanks
19:58:52 <dhellmann> jeblair : should I just have it send directly to lists.o.o now, then?
19:59:36 <jeblair> dhellmann: maybe?
19:59:49 <fungi> dhellmann: that would probably be fine, though you may want to block and retry until it succeeds in case the network hates us for brief periods
19:59:52 <mordred> jeblair: won't he have to deal with retry logic if he does?
19:59:53 <mordred> yeah
20:00:01 <fungi> anyway, we're out of time
20:00:04 <dhellmann> k
20:00:05 <fungi> thanks everyone!
20:00:09 <fungi> #endmeeting