19:05:21 #startmeeting infra
19:05:22 o/
19:05:23 Meeting started Tue Jan 5 19:05:21 2016 UTC and is due to finish in 60 minutes. The chair is fungi. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:05:24 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:05:26 The meeting name has been set to 'infra'
19:05:33 #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:05:43 #topic Announcements
19:06:10 i didn't have any new announcements for this week, but will repeat any from the past couple meetings for the benefit of those who had a life ;)
19:06:48 #info Tentative deadline for infra-cloud sprint registrations is Friday, January 29, 2016.
19:06:59 #info HPE is catering lunches at the infra-cloud sprint; please update your registration with any dietary restrictions.
19:07:08 #link http://lists.openstack.org/pipermail/openstack-infra/2015-December/003602.html
19:07:10 thanks hpe
19:07:16 #link https://wiki.openstack.org/wiki/Sprints/InfraMitakaSprint
19:07:33 that was all we had under announcements, looks like
19:07:34 i'm hungry already
19:07:39 mmmm
19:07:42 me^2
19:07:56 #topic Actions from last meeting
19:08:01 #link http://eavesdrop.openstack.org/meetings/infra/2015/infra.2015-12-29-19.02.html
19:08:17 AJaeger propose a spec to consolidate translation setup
19:08:34 that was based on the outcome of the discussion in last week's meeting
19:08:41 anybody happen to spot if that is up?
19:09:13 #link https://review.openstack.org/262545
19:09:15 i think that
19:09:17 's it
19:09:29 looks like it
19:09:34 still wip, but i'll not readd the action
19:09:42 it's at least in progress
19:09:50 that's all we still had on action items
19:09:56 #topic Specs approval
19:09:56 yep, in progress - I need some input from all of you on this ;)
19:09:59 \o_
19:10:17 repeating this one since i deferred it an extra week to give vacationers some opportunity to check it out
19:10:32 #link https://review.openstack.org/246550
19:10:41 PROPOSED: Consolidation of docs jobs
19:10:51 #info Council voting is open on the "Consolidation of docs jobs" spec until 19:00 UTC on Thursday, January 7, 2016.
19:11:15 AJaeger has been a spec-writing fiend lately
19:11:28 anyway, that's the only spec on the agenda currently
19:11:38 That's enough specs for this year, fungi ;)
19:11:41 oh hrm
19:11:42 heh
19:12:01 it's worth discussing why we don't just run 'tox -e docs' now
19:12:27 jeblair: I'm happy to have that discussion!
19:12:52 cue pti discussion from 2015 tc meetings?
19:12:55 which is because in the past, people have put extra steps in the tox env, which means the normal way of building docs with setup.py doesn't work
19:12:56 right
19:13:33 this previously came up at the tc level with a proposed change to the project testing interface
19:13:54 anyway -- i'm not advocating that we must keep things the way they are...
19:13:55 jeblair: the challenge is if there's more than one root - releasenotes, api doc, and "normal" docs, which also get published in different places
19:14:13 Ya, I was pretty surprised to find out tox -e docs wasn't run in the gate a few weeks ago.
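[A minimal illustration of the contrast being discussed, with assumed contents rather than anything from the spec or a real project: the gate builds docs via the PTI command, while a project's local tox docs env may layer extra steps on top of it, which is how the two can drift apart. The tools/generate_api_samples.py step below is hypothetical.]

    # Hypothetical tox.ini docs environment (illustrative sketch only).
    # The gate runs the PTI build directly:
    #   python setup.py build_sphinx
    # A project's local env might add steps like the hypothetical one
    # below, which the gate job would then silently skip.
    [testenv:docs]
    commands =
        python tools/generate_api_samples.py
        python setup.py build_sphinx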
19:14:31 however, i think it's worth noting that they are the way they are on purpose, and if we want to change them, we need to either decide that we don't care about the benefits we're getting now or find some other way to get them
19:14:48 and also, that if we change this, we need to change the PTI as well (simply an infra spec is not sufficient)
19:16:14 AJaeger: couldn't you have all of those be built by default?
19:16:31 via a common root
19:16:51 clarkb: they might have different themes
19:16:55 ah
19:17:26 o/
19:17:26 clarkb: still, it's worth trying to do this.
19:18:25 clarkb: I don't know yet how to do this properly and would evaluate that option, so please mention it in the spec.
19:18:46 any particular reason why they should have different themes for docs, api reference and release notes?
19:19:07 i mean, just because some of them do doesn't necessarily mean that was intended or even desired
19:19:16 it is desired
19:19:22 fungi: http://developer.openstack.org/api-guide/compute/ uses openstackdocstheme, the others use oslo.sphinx
19:19:41 so, that was intentional by annegentle
19:19:46 the release notes theme may change, too
19:20:56 cool, just making sure that was complicating this for a good reason
19:21:41 so i guess the other concern is that the problem description doesn't come out and describe a problem
19:22:09 it notes that we run multiple brief jobs but doesn't say why that's bad
19:22:55 sometimes the cure can be worse than the disease. need to be able to evaluate whether the overhead of multiple jobs for this is sufficiently undesirable to warrant the proposed solution
19:23:38 fungi, agreed - as I said last time: Please evaluate whether the complications this will bring are worth the decrease in jobs
19:24:01 it's counter-intuitive, but long-running jobs tie up a lot more resources. the waste from these is a relatively small % of quota being burned in node churn
19:24:40 anyway, let's put comments in the review and move on with the meeting agenda
19:25:07 #topic Priority Efforts: Store Build Logs in Swift
19:25:31 jhesketh has recently put updates up for the remaining pieces, it looks like
19:25:38 #link https://review.openstack.org/#/q/status:open+topic:enable_swift
19:25:52 yep; apparently there's a zuul change needed too
19:25:56 not sure if he's around and has anything else to add
19:26:09 so we'll need a restart somewhere in there
19:26:11 two zuul changes from the look of things
19:26:38 or at least a change in two parts
19:27:08 also looks like it includes a scope expansion on the spec
19:27:26 covering swift-based file tree metadata management
19:28:15 i guess nothing else to cover here at the moment, other than we can hopefully knock this out soon with some reviewing and a zuul maintenance
19:28:30 thanks jhesketh
19:28:38 but probably best not to schedule the maintenance until changes are close to approval
19:29:18 i don't see anything else being called out on the priority efforts this week, so moving on to other meeting agenda topics
19:29:35 #topic How to setup translations?
19:30:04 AJaeger: I think I saw that the claim is django must use django.pot?
19:30:07 looks like this is the wip spec i mentioned earlier
19:30:13 not sure whether we need to discuss this again after talking about it last week.
19:30:22 please review the linked spec
19:30:25 AJaeger: okay, and we covered the spec under action items
19:30:25 clarkb: indeed
19:30:38 but to reiterate
19:30:42 #link https://review.openstack.org/262545
19:30:46 thanks AJaeger!
19:30:59 #topic StackViz integration? (timothyb89, austin81)
19:31:08 #link https://etherpad.openstack.org/p/BKgWlKIjgQ
19:31:24 what's all this then?
19:31:51 right, so, we're looking for input on how best to get CI going for stackviz, and to get it running for gate jobs
19:32:12 we've detailed our goals in the etherpad, along with some examples and possible solutions
19:32:33 timothyb89: what's stackviz?
19:32:44 this is a tool which analyzes metrics about the services which were running under devstack during a test job?
19:33:12 this looks like subunit2html?
19:33:12 are you able to use the npm that ships with distros?
19:33:13 more or less, it's a performance viz and debugging tool for devstack/tempest runs
19:33:30 oh, it's analyzing tempest tests individually?
19:33:31 jeblair: it does a bit more, mapping resource use on the system against test runs over time
19:33:34 clarkb: yes, it uses the same setup as openstack-health on the build-side, which is already working
19:33:40 ah i see the cpu/mem stuff now
19:34:24 timothyb89: so the npm runtime cost is because npm is slow, not installing/building npm?
19:34:32 is the test-specific part making use of the same subunit2sql database openstack-health uses?
19:34:34 clarkb: Correct
19:34:51 ahh, you just answered that before i asked it
19:34:57 fungi: right now it pulls all data from an actual subunit file, no db connection is needed at all
19:35:41 aha, i think i'm caught up
19:36:03 thanks
19:36:30 We are leaning towards the tarball option for storing a pre-built site, then pulling it from a d-g cleanup script
19:36:30 so there's some heavy work (build stackviz) with 2 options for making sure it can happen before test runs, and then inside of test runs, just a little bit of post-processing, then we can upload the statically generated results with the test results
19:36:46 jeblair: yes, exactly that
19:37:16 the downside to building in nodepool images is that sometimes we go for a long time without being able to successfully build them
19:37:36 yeah, need to be able to tolerate days, maybe even weeks, of staleness there
19:37:40 so it means that if you want to change stackviz itself, there may be periods (like a week or more) where what's on nodepool is stale
19:38:03 also need to be able to tolerate inconsistency across providers/regions
19:38:20 that isn't a huge problem for us, with just the 2 of us development isn't fast-paced at all
19:38:28 we might successfully upload in ovh and two of three rackspace regions but have one that's exhibiting issues getting new images uploaded
19:38:44 or something like that
19:38:51 if you're okay with that, i think building them into nodepool images is fine, and fairly simple. but if stackviz is evolving enough that you wouldn't be able to tolerate that, then the tarball thing is better
19:39:07 ++
19:39:13 downside of tarball is there's extra network traffic and delay in each job
19:39:14 so as long as we expect that it will sometimes lag and may lag by different amounts in different places, it's a possibility
19:39:19 how big is the output?
I guess I can just look at the examples
19:39:22 realistically we're looking at maybe 1-2 changes merged per week, and I don't see a pressing need to have changes applied immediately
19:39:45 clarkb: ~500kb to 1.5MB if compressed, depending on log sizes
19:39:52 if the tarball is tiny, then it's probably not any worse than a lot of other things devstack downloads (distro package lists, for example)
19:39:59 fungi: right
19:40:00 ok that's not bad (I always worry with log related things as they can explode)
19:40:22 What sort of delay are we looking at with pulling from the tarball site?
19:40:25 clarkb: actually, the actual tarball would be only ~100kb, that larger size is post-data processing
19:40:46 austin81: just whatever it takes to fetch that (possibly over an ocean) and install it
19:40:51 ya I am interested in the post data processing since that gets uploaded to the log server for each job right?
19:41:11 austin81: no real delay. if you build in a post job for stackviz changes merged or something then it basically just depends on how long the post pipeline backlog is on any given day. sometimes hours
19:41:27 (clarkb and i are talking about 2 different network transfers, i'm talking about downloading stackviz itself to the worker node, clarkb is talking about uploading from the worker to the log server)
19:41:45 and i'm talking about uploading your tarball to the tarballs site
19:41:51 so make that three transfers
19:42:20 fungi: Ah okay I was more curious about pulling it onto the logs server. Not too concerned about the upload speed
19:42:38 Since, like timothyb89 said, we are not dealing with very many patches
19:42:38 and it's not a python project that could have a prebuilt sdist on our pypi mirrors?
19:42:41 clarkb: the demo site has a 9.8MB footprint with data due to logs, but when trimmed it is < 1MB compressed
19:43:10 (I see pip on the etherpad so not sure how that works)
19:43:20 yeah, if tarballs are an option, then packaging them in an sdist/wheel and mirroring them on our pypi mirrors could make sense
19:43:35 just tag a release when you have new changes you want incorporated
19:43:41 clarkb: the data processing is a small python module, we'll probably want some form of prepackaging for that
19:44:09 timothyb89: I am suggesting combining it all so that when you push a tag of stackviz you get a pip installable sdist that works
19:44:25 then that can be done whenever you are happy with it, devstack installs from mirror and runs it
19:44:26 and includes the static web components prebuilt
19:44:31 ya
19:44:46 clarkb: ah, right, that makes sense
19:45:12 anyway, it sounds like we have some potential direction on next steps for this? want to move on to the other three agenda items in the next 15 minutes
19:45:30 that's a lot of good advice, thanks all for input!
19:45:31 fungi: Yes, thank you for the discussion
19:45:34 thanks timothyb89/austin81!
19:45:51 #topic Next steps for release automation work (dhellmann)
19:45:59 #link http://specs.openstack.org/openstack-infra/infra-specs/specs/complete-reviewable-release-automation.html
19:46:18 dhellmann: i suppose some of this is dependent on me working on the artifact signing pieces i have planned
19:46:26 we have most of the scripts we need in the release-tools repo, so I'm looking for what our next steps are
19:46:33 the key management bits in particular
19:46:40 right now those scripts have 2 dependencies I don't know how we'll work with in CI: git keys and sending email
19:46:51 does key management depend on zuul v3?
19:47:00 not in the current design, no
19:47:02 kk
19:47:04 col
19:47:06 just a trusted jenkins slave
19:47:06 cool
19:47:16 what's a git key?
19:47:21 jeblair: gpg key
19:47:23 that spec is also published but not really underway yet
19:47:26 oh ok
19:47:28 jeblair: to use to sign the tag
19:47:32 sorry, gpg
19:47:39 #link http://specs.openstack.org/openstack-infra/infra-specs/specs/artifact-signing.html
19:48:03 so that's probably just another secret on a trusted slave, yeah?
19:48:22 i think we determined at the summit that we were just going to use the same openpgp key to make detached signatures of artifacts and to sign git tags
19:48:32 yes, that's what I remember, too
19:48:37 jeblair: yah. I think so. although I'm guessing we'd want to sign that key with infra-root and release-team keys at some point maybe?
19:48:38 so, yes, stick in hiera, puppet onto worker
19:48:49 (which means the signing job can't run any in-repo code)
19:48:50 mordred: yep, mentioned in the second spec i linked
19:48:53 woot
19:48:59 yeah
19:49:00 it's like you've thought through this already
19:49:26 it's like we've written this stuff a while ago because it's sometimes easier to find free time to plan than to execute ;)
19:49:30 as for email -- i suppose we can have a slave (probably the same) send email; they all probably already run exim servers anyway...
19:49:43 jeblair : the signing script is http://git.openstack.org/cgit/openstack-infra/release-tools/tree/release.sh but I don't know if that runs code you'd define as in-tree
19:50:08 where is the email being sent to?
19:50:15 openstack-dev or openstack-announce
19:50:16 dhellmann: i think we discussed a while back that it might need to move to the project-config jenkins/scripts directory instead
19:50:16 just to lists.o.o?
19:50:26 jeblair : yeah
19:50:32 fungi : we did, that's true
19:50:41 and yeah, exim is running on release.slave.o.o for example
19:50:50 so having the worker send e-mail is pretty trivial
19:50:59 cool, then since we control that anyway, it seems like it shouldn't be much of a problem -- we don't really need a smarthost or anything since lists.o.o will effectively handle that for us.
19:51:14 the scripts generate a fully formatted email with headers and all, so we just need a way to get that into the outgoing mail queue
19:51:23 import smtp
19:51:28 smtp.send(foo)
19:51:31 if memory serves ;)
19:51:34 (and that means this will transition to zuulv3 easily, where the slave will be short-running -- we'll just make sure lists.o.o can accept mail from our slaves)
19:52:09 fungi : yeah, I could write a little thing; I wasn't sure if something like that already existed. I'm using msmtp on my local system
19:52:17 dhellmann, fungi: yeah, either smtp over port 25 or /usr/lib/sendmail should be fine.
19:52:19 but zuulv3 also supports long running slaves too right? so we won't have to make that transition immediately
19:52:42 or as part of the switch (depending on what is easier)
19:52:47 clarkb: yeah
19:52:51 what's the difference between long and short running and how does it apply here?
19:52:53 yeah, preferably stuff it through /usr/lib/sendmail so we get queuing in the event of a network issue
19:53:06 fungi: smtp to localhost
19:53:22 oh, right, that
19:53:35 fungi: so network issues highly unlikely
19:53:57 dhellmann: the current proposal is a persistent manually maintained job worker rather than dynamic single-use nodepool-maintained workers
19:54:07 honestly, if there's no other reason, i'd go with smtp because it's more flexible (can be used in other situations to talk to a remote smarthost with no local server)
19:54:09 but in zuul v3 it'll just all be the latter
19:54:09 fungi : ah, ok
19:54:36 fungi : in that latter case, would there be a chance that the node would be killed before the queued email is actually sent?
19:54:51 jeblair: yeah, the smtp module in stdlib pointed at the loopback makes the most sense, and exim is almost certainly already configured to dtrt there
19:55:13 * dhellmann makes a note to work on a script to push mail to smtp on localhost
19:55:21 dhellmann: hrm, perhaps. we might have to make sure the job doesn't terminate until the message is delivered
19:55:29 ya
19:55:32 dhellmann, jeblair: I think when we go to ephemeral nodes, we'll want to do smtp to a remote smarthost
19:55:35 worth pondering anyway
19:55:43 mordred : yeah
19:56:09 mordred: agreed, though then we need to do our own retrying
19:56:14 yeah
19:56:18 indeed
19:56:29 well, if we write the script now to use smtp to localhost, then it's trivial to smtp to a smarthost later
19:56:31 we're about out of time, two topics remaining on the agenda
19:56:34 jeblair: ++
19:56:38 Perhaps one of the small desktop-intended SMTP outbound servers would work for that, rather than all of exim.
19:56:43 clarkb: pabelanger: can your topics wait for next week?
19:56:48 persia: we've already got exim everywhere
19:56:52 fungi: maybe? it's time bound to january 31
19:56:55 persia: doing a different smtp server would be harder
19:56:58 and I will be afk-ish the week prior to that
19:57:03 clarkb: that's why i thought it might
19:57:13 mordred: Now, but for ephemeral?
19:57:18 (and if we're sure that we only deliver to lists.o.o, we can skip the step of setting up a smarthost and just 'smarthost' to lists.o.o :)
19:57:27 fungi: Yup
19:57:29 no rush
19:57:29 we can also just discuss it really quickly after the meeting
19:57:31 persia: it's in our basic puppet manifests
19:57:32 persia: nothing is easier than exim
19:57:36 jeblair: ++
19:58:30 clarkb: sounds good
19:58:32 pabelanger: thanks
19:58:52 jeblair : should I just have it send directly to lists.o.o now, then?
19:59:36 dhellmann: maybe?
19:59:49 dhellmann: that would probably be fine, though you may want to block and retry until it succeeds in case the network hates us for brief periods
19:59:52 jeblair: won't he have to deal with retry logic if he does?
19:59:53 yeah
20:00:01 anyway, we're out of time
20:00:04 k
20:00:05 thanks everyone!
20:00:09 #endmeeting
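[A minimal sketch of the localhost-SMTP helper dhellmann noted he would write, following the approach the discussion converged on: hand the already fully formatted message to the local exim over the loopback, and block and retry briefly so the job doesn't end before delivery. This is not the actual release-tools script; the file-based interface, retry policy, and function name are assumptions.]

    #!/usr/bin/env python
    # Sketch only: read a fully formatted email (headers and body) from a
    # file and inject it into the local MTA via SMTP on localhost.
    import email
    import email.utils
    import smtplib
    import sys
    import time


    def send_formatted_mail(path, host='localhost', retries=3, delay=30):
        with open(path) as f:
            msg = email.message_from_file(f)
        # Reuse the headers already present in the generated message.
        sender = email.utils.parseaddr(msg.get('From', ''))[1]
        recipients = [addr for _, addr in
                      email.utils.getaddresses(msg.get_all('To', []))]
        for attempt in range(retries):
            try:
                smtp = smtplib.SMTP(host)
                smtp.sendmail(sender, recipients, msg.as_string())
                smtp.quit()
                return
            except (smtplib.SMTPException, IOError):
                # Block and retry rather than letting the job terminate
                # with the message undelivered.
                if attempt == retries - 1:
                    raise
                time.sleep(delay)


    if __name__ == '__main__':
        send_formatted_mail(sys.argv[1])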