19:01:20 <jeblair> #startmeeting infra
19:01:21 <openstack> Meeting started Tue Oct 28 19:01:20 2014 UTC and is due to finish in 60 minutes.  The chair is jeblair. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:22 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:25 <openstack> The meeting name has been set to 'infra'
19:01:26 <jeblair> #link agenda https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting
19:01:29 <jeblair> #link previous meeting http://eavesdrop.openstack.org/meetings/infra/2014/infra.2014-10-21-19.00.html
19:01:41 <jeblair> #topic  Actions from last meeting
19:01:43 <zaro> o/
19:01:50 <jeblair> clarkb figure out gerrit per project third party voting ACLs and third party accounts via openid
19:02:16 <clarkb> I have not started that
19:02:30 <jeblair> clarkb: still want to do it?
19:02:34 <clarkb> yup I can do it
19:02:35 <jeblair> #action clarkb figure out gerrit per project third party voting ACLs and third party accounts via openid
19:02:39 <jeblair> clarkb write py26 deprecation email
19:02:44 <clarkb> that was written and sent
19:02:44 <jeblair> i think that one happened?
19:02:53 <clarkb> and we deprecated py26 on the server projects already
19:03:06 <cody-somerville> o/
19:03:09 <jeblair> clarkb: is there more deprecation that needs doing?
19:03:11 <krtaylor> o/  sorry I'm late
19:03:24 <fungi> deprecation for other projects (e.g. infra) and also stackforge
19:03:25 <clarkb> jeblair: ya, potentially some stuff to sort out around oslo projects (though we may just leave them as is)
19:03:31 <clarkb> then infra and stackforge
19:03:52 <jeblair> does infra need 2.6 for anything?
19:03:54 <fungi> the other projects review was waiting for confirmation from impacted projects' ptls right?
19:03:57 <jeblair> maybe jjb?
19:04:05 <clarkb> jeblair: any code that runs on centos will need it so zuul should keep it probably
19:04:09 <clarkb> (for the zuul-cloner)
19:04:19 <clarkb> jjb is probably reasonable to keep as well
19:04:25 <clarkb> everything else probably doesn't need it
19:04:36 <jeblair> clarkb: oh, that's a really good point re zuul-cloner
19:04:48 <fungi> #link https://review.openstack.org/#/q/status:open+project:openstack-infra/project-config+branch:master+topic:python26,n,z
19:04:52 <fungi> for the reviews in question
19:05:20 <clarkb> and we are going to give stackforge projects a lot of time (fungi was uncomfortable with quick switching them especially with all the summit goings on)
19:05:50 <fungi> scheduled for november 30th according to the announcement
19:06:11 <jeblair> ok
19:06:31 <jeblair> fungi draft third-party testing liaisons section for wiki
19:06:57 <fungi> that was a tentative item pending clarkb's testing
19:07:25 <fungi> need to know whether we actually need liaisons for requests to disable/reenable accounts, or whether we can engineer that in per-project acls as well
19:07:45 <fungi> e.g. preventing them from commenting at all
19:08:09 <jeblair> fungi: oh, i thought we wanted liaisons anyway
19:08:11 <fungi> so no action there yet
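[The per-project ACL approach fungi mentions would presumably live in each project's Gerrit project.config. A minimal sketch, assuming a Gerrit group named `third-party-ci` — the group name is hypothetical and exact permission names vary by Gerrit version:

```
[access "refs/heads/*"]
  label-Verified = -1..+1 group third-party-ci
```

This grants a voting range to a dedicated group on one project only; fully preventing an account from commenting is a separate question, which is part of why account disabling, and hence liaisons, may still be needed.]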
* AJaeger is sorry for being late
19:08:24 <jeblair> at least, i asked that last meeting and thought i got an affirmative response
19:08:48 <fungi> er, right, as rallying points for the third-party ci operators testing the respective projects
19:08:51 <jeblair> 19:44:36 <fungi> but beyond that, the liaisons idea acts as a rallying point for the third-party testers on those projects in place of our infra team
19:08:51 <jeblair> 19:44:42 <krtaylor> third-party liaisons would also be helpful for third-party systems, a point of contact for systems with questions
19:08:52 <jeblair> 19:44:56 <jeblair> so it sounds like liaisons may still be useful even if we go to self-service, both for us (disabling for abuse, and facilitating onboarding of new ci systems with the projects themselves)
19:08:55 <krtaylor> yes, I think it was needed either way
19:09:04 <fungi> so the question is whether they'll be needed for just that, or for additional tasks
19:09:24 <krtaylor> it is a starting point for additional tasks
19:09:24 <jeblair> fungi: yep.  so you want to wait until we know what we're asking of them before we start asking it of them?
19:09:25 <fungi> i'll get the initial writeup knocked out in that case and we can amend it with further needs as they become apparent
19:09:43 <jeblair> fungi: that sounds like a reasonable plan
19:09:48 <krtaylor> +1
19:09:53 <fungi> i'm indifferent there, but wouldn't be a bad idea before actually making it official
19:10:29 <hogepodge> fungi I'd like to be involved and as helpful as I can be.
19:10:37 <jeblair> #topic Priority Efforts: Swift logs
19:10:50 <fungi> #action fungi draft initial third-party liaisons description, to later be amended as needed before publication
19:10:59 <fungi> hogepodge: thanks--i'll keep you in the loop
19:11:06 <jeblair> jhesketh: are you around?
19:11:24 <krtaylor> fungi, I'll be glad to help as well
19:11:33 <fungi> krtaylor: appreciated
19:13:49 <fungi> i'm assuming either some of us are lagging badly or jhesketh is not present
19:14:08 <jeblair> last week we discussed that we may need a jenkins plugin to do log uploads regardless of job status
19:14:31 <clarkb> in order to capture logs when jobs fail
19:14:32 <jeblair> separately, an internal hp discussion brought up this plugin: https://wiki.jenkins-ci.org/display/JENKINS/PostBuildScript+Plugin
19:14:46 <jeblair> which is a way less insane way of doing that than the last time i looked
19:15:12 <jeblair> (there was a plugin where you needed to regex match on .* in the console log to have it run -- weird)
19:15:19 <fungi> probably makes more sense than hacking together yet-another-java-project
19:15:37 <jeblair> so at some point, i expect we'll want to look into using that
19:15:52 <jeblair> and if hp uses it, hopefully we can get some feedback there too
19:16:03 <clarkb> that plugin looks reasonable
19:16:26 <jeblair> yeah, it _seems_ like it shouldn't have any sync points or other jenkins things that make us unhappy
19:16:48 <jeblair> that's all i have on this one
19:16:59 <fungi> #link https://wiki.jenkins-ci.org/display/JENKINS/PostBuildScript+Plugin
19:17:04 <jeblair> oh thanks :)
19:17:07 <jeblair> #topic Priority Efforts: Puppet module split
19:17:27 <jeblair> asselin: ping
19:17:33 <jeblair> " asselin is working on Jenkins module split, when should we schedule the freeze and split?" is in the agenda
19:17:48 <krtaylor> here is the link to the topic
19:17:51 <krtaylor> https://review.openstack.org/#/q/branch:master+topic:module-split,n,z
19:18:20 <fungi> i think this spun out of yesterday's discussions about the next ways to coordinate the switch-over step so as to minimize module-specific freeze periods
19:18:27 <fungi> s/next/best/
19:18:35 <jeblair> okay, let's talk about that then :)
19:18:46 <krtaylor> mmedvede, ^^^
19:18:56 <fungi> anteaya: this was something you asked to have on the agenda i think?
19:19:07 <fungi> oh, right, she may be travelling today
19:19:14 <clarkb> we accidentally merged half of a split's changes
19:19:28 <fungi> so, i think it was the puppet-kibana module
19:19:33 <clarkb> yup
19:19:37 <fungi> the new project creation change got approved/merged
19:19:49 <fungi> with minimal fanfare
19:20:10 <jeblair> then what did not get merged?
19:20:22 <fungi> luckily someone (anteaya?) noticed and pointed out we should refrain from making changes to the copy of that module in system-config now
19:20:25 <clarkb> jeblair: the system-config change I asked you guys to review really quickly yesterday
19:20:28 <nibalizer> ya anteaya is out right now
19:20:36 <clarkb> which did merge iirc
19:21:10 <jeblair> clarkb: and that change was to add it to testing, the modules env file, and remove the old code?
19:21:17 <fungi> so i believe this meeting topic was in hopes of bringing some sane recommendations to the authors and approvers of those changes for better coordination
19:21:21 <clarkb> jeblair: yup
19:21:38 <nibalizer> fungi: exactly
19:22:33 <clarkb> I suggested that we try to communicate the coupling a bit better. changes should be proposed and WIP'd and very clearly specify the relationships between changes
19:22:36 <fungi> options proposed were to schedule the approvals of those, or to seek out infra-core shepherds to work with the authors proposing them and handle the final coordination
19:22:38 <jeblair> so perhaps we should: wait until both changes are staged; link the system-config change to the project-config change with "Depends-On:"
19:22:40 <mmedvede> what we have left off at yesterday is that there should be a core who would coordinate each individual split, correct?
19:23:30 <fungi> in my opinion, i think making it clear in the commit messages that they depend on one another and core reviewers making sure to pay attention to commit messages should help most of this
19:23:51 <clarkb> agreed. I don't think we need a specific core to coordinate around each of these changes
19:23:55 <nibalizer> how does that deal with lag?
19:24:12 <fungi> but -2 or wip is another potential safeguard (though it does mean the author needing to actively troll for reviews)
19:24:14 <nibalizer> lag between the initial submodule split and the series of follow-up patches is the freeze period
19:24:29 <jeblair> i think the thing getting a core on-board with it gives us is a commitment from someone to be around in a few hours to ensure #2 doesn't sit outstanding for too long
19:24:37 <jeblair> or what nibalizer said :)
19:24:41 <nibalizer> ya
19:24:47 <jeblair> however, i don't think it needs to be the same core...
19:24:49 <nibalizer> we could even do the other thing anteaya said
19:25:01 <nibalizer> which is friday after project rename we do split outs for the week
19:25:03 <fungi> right, so i don't know it has to be an assigned core reviewer the entire way through, just when it comes time to approve things together
19:25:17 <clarkb> right whoever does approve one should approve the other
19:25:23 <jeblair> so maybe it's a matter of whoever approves the first one, they at least make sure it's likely that they or someone else will be around for the second?
19:25:36 <fungi> seems fair
19:26:36 <jeblair> want to try that for a bit?  and if we need more structure, maybe we do split-out friday? :)
19:26:55 <clarkb> wfm
19:27:15 <nibalizer> wfm
19:27:16 <fungi> proposed agreement: when dependent puppet-module splits are completely ready to merge, a core reviewer will commit to approving them in the appropriate order or coordinate with another reviewer to take over
19:27:46 <jeblair> fungi: wfm; i'm going to add a second item
19:27:58 <fungi> oh, i had a second one too, but go ahead
19:28:26 <jeblair> proposed agreement: system-config module removals should include Depends-On: in commit message referencing corresponding project-config module adds
19:28:38 <jeblair> anyone disagree with fungi's statement?
19:28:54 <fungi> heh, that was essentially my second agreement item ;)
19:29:12 <wenlock> seems like a good plan, its what we've been doing too btw
19:29:24 <krtaylor> should commit message include the core working it?
19:29:42 <jeblair> krtaylor: no, i don't think we want that kind of fixed structure
19:29:43 <nibalizer> works!
19:29:43 <wenlock> the commit message should include a link to the other commit that's dependent IMO
19:29:50 <fungi> i think the point of these is to serve as a reminder that we should communicate, pay attention, and stick around if we approve part of something to do the rest
19:29:57 <jeblair> fungi: yep
19:30:14 <asselin> asselin's here now
19:30:49 <jeblair> wenlock: should we add a third item: "project-config module add commit messages should link to system-config module removals with "Needed-By:" in commit message" ?
19:30:50 <fungi> i hear a resounding tacit approval
19:30:56 <jeblair> #agreed when dependent puppet-module splits are completely ready to merge, a core reviewer will commit to approving them in the appropriate order or coordinate with another reviewer to take over
19:31:00 <jeblair> #agreed system-config module removals should include Depends-On: in commit message referencing corresponding project-config module adds
19:31:09 <wenlock> jeblair +1
19:31:17 <mmedvede> So should both related patches have Depends-On, or should the project-config one have something else, e.g. a reminder that merging it implies a need to merge the second one?
19:31:22 * fungi is good with all three
19:31:38 <jeblair> that's a little extra work on the commit side (it will require a git commit --amend in one of the repos)
19:31:55 <jeblair> or, i guess, clicking that little icon in gerrit :)
19:32:19 <fungi> or ctrl-d in gertty
19:32:33 <jeblair> #agreed project-config module add commit messages should link to system-config module removals with "Needed-By:" in commit message
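[Taken together, the two #agreed items imply paired commit messages along these lines — the change-ids below are made-up placeholders, not real reviews:

```
Remove puppet-kibana from system-config

The module now lives in its own repository.

Depends-On: I0123456789abcdef0123456789abcdef01234567
```

```
Add puppet-kibana module

Split out of openstack-infra/system-config.

Needed-By: Iabcdef0123456789abcdef0123456789abcdef01
```

Whoever approves the second change can then find the first from the footer, and vice versa.]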
19:33:02 <SergeyLukjanov> +1
19:33:13 <jeblair> #link review these module splits: https://review.openstack.org/#/q/branch:master+topic:module-split,n,z
19:33:19 <fungi> and i think if those are in there, wip/-2 aren't really needed to keep sanity. core reviewers caught ignoring commit messages will be summarily flogged with a fish
19:33:33 <jeblair> fungi: a wet fish?
19:33:46 <fungi> i was thinking perhaps a smoked mackeral
19:33:57 <krtaylor> better than a wet cat
19:33:58 <fungi> mackerel
19:34:01 <jeblair> mordred is always very specific about the moisture content of his felines
19:34:07 <clarkb> krtaylor: mordred can get the wet cats
19:34:12 <krtaylor> lol
19:34:15 <jeblair> #topic Priority Efforts: Nodepool DIB
19:34:30 <jeblair> anything we should be doing on this?
19:35:25 <jeblair> i'm starting to think this needs an owner
19:36:02 <fungi> where did we get to last with it?
19:36:04 <jeblair> it seems like it's been round-robining amongst mordred, clarkb, and yolanda
19:36:23 <fungi> we reverted the centos7 dib change because we need newer tar right?
19:36:32 <jeblair> oh
19:36:36 <jeblair> from last meeting:
19:36:38 <clarkb> we need to upgrade nodepool.o.o to trusty, fix nodepool's label can only build one type of image issue, and then kill whatever new things we discover
19:36:39 <ianw> yes, clarkb is looking at upgrading nodepool host
19:36:40 <fungi> so next step was going to be rebuilding the nodepool server on trusty i think?>
19:36:43 <jeblair> 19:17:16 <clarkb> the dib multiple outputs change merged \o/
19:36:44 <jeblair> 19:17:23 <clarkb> should release tomorrow if dib sticks to their schedule
19:36:55 <clarkb> fungi: yup, and I think we have everything we need to do that now
19:37:07 <clarkb> dib released, I tested the TMPDIR override locally
19:37:12 <jeblair> clarkb: i have a gap in my knowledge -- why upgrade to trusty?
19:37:18 <fungi> 0.1.35 tagged 5 days ago
19:37:20 <ianw> i also have an issue with image labels that i will look at (been saying that for a few days)
19:37:46 <ianw> jeblair: newer tar that supports xattr for building centos images
19:37:50 <clarkb> jeblair: centos7 images ship a qcow2 with extended fs attributes
19:37:58 <clarkb> jeblair: for whatever reason dib converts that image to a tar
19:38:05 <jeblair> ah ok
19:38:07 <clarkb> then later untars that into the dir it will chroot into
19:38:24 <clarkb> (I still think dib should just mount the image and chroot over that but I can't get anyone to tell me why that is a bad idea)
19:38:29 <jeblair> we need to review this change too
19:38:36 <jeblair> #link https://review.openstack.org/#/c/126747/
19:38:36 <fungi> and precise's tar is too old to support extended attribs
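[The tar requirement is easy to check locally. A minimal sketch: round-trip a file with `--xattrs`, a flag GNU tar gained in 1.27 (trusty ships 1.27; precise ships 1.26, which does not know the flag):

```shell
set -e
# Round-trip a file through tar while asking it to preserve extended
# attributes; on precise's tar 1.26 the --xattrs option fails outright.
workdir=$(mktemp -d)
mkdir -p "$workdir/src" "$workdir/out"
echo data > "$workdir/src/file"
tar --xattrs -cf "$workdir/img.tar" -C "$workdir" src
tar --xattrs -xf "$workdir/img.tar" -C "$workdir/out"
cat "$workdir/out/src/file"   # prints: data
```

On a filesystem without xattrs the round-trip still succeeds; the flag only matters when attributes are actually present, as in the centos7 qcow2 case discussed above.]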
19:38:57 <clarkb> I can rebase that change today
19:39:12 <clarkb> and I also wrote https://review.openstack.org/#/c/130878/
19:39:30 <clarkb> jeblair: if you look at https://review.openstack.org/#/c/130878/ that might be helpful. I got the test passing but it isn't doing what I wanted so it is WIPed
19:40:05 <jeblair> #link wip nodepool test change https://review.openstack.org/#/c/130878/
19:40:13 <clarkb> oh I think I may see a problem there
19:40:23 <clarkb> so maybe I just needed to do something else for a day or two :)
19:40:41 <jeblair> clarkb: how do you want to handle the nodepool trusty upgrade?
19:41:09 <clarkb> its a bit tricky
19:41:33 <clarkb> the lazy in me wants to just spin up a new node, swap DNS then cleanup old slaves and images via the alien listing
19:41:44 <clarkb> but I think alien listing only works for slaves not images?
19:42:13 <jeblair> clarkb: there's an 'alien-image-list'
19:42:15 <fungi> i think we had determined that they can run in parallel for a bit if demand is low (so that we don't overrun quotas, or we cut max-servers on them for the duration), then delete alien nodes after the old one is stopped
19:42:27 <clarkb> oh perfect then ya I think we just sort of do it without a db migration and clean up aliens
19:42:34 <clarkb> fungi: ya
19:43:07 <jeblair> clarkb: zuul gearman will firewall the new server
19:43:21 <clarkb> ya so the dns update is important
19:43:28 <jeblair> clarkb: you could ignore that and just let it supply min-ready for a while to make sure it's working
19:43:32 <clarkb> it will basically be our cutover
19:43:42 <fungi> that seems pretty safe then
19:43:58 <jeblair> clarkb: more or less, yeah.  it should be that whichever has iptables access will do the bulk of the work
19:44:07 <jeblair> the other should only do min-ready
19:44:16 <jeblair> (and replacements thereof when nodes are used)
19:44:22 <clarkb> should I possibly just go ahead and spin up the replacement this afternoon?
19:44:30 <clarkb> and we can let it build images and make min-ready nodes?
19:44:31 <jeblair> clarkb: oh no!  we'll reset the node id counter :)
19:44:42 <clarkb> actually that may not be entirely safe
19:44:50 <clarkb> because those nodes will end up in jenkins but we won't get the node used events
19:45:02 <clarkb> (maybe that is ok, the node will still be single use)
19:45:10 <jeblair> clarkb: they come from zmq... does zmq have a firewall rule too?
19:45:15 <clarkb> ya
19:45:17 <clarkb> on the jenkins side
19:45:35 <fungi> oh, hrm, yeah that might get messy
19:45:39 <clarkb> I think in that case we will have "ready" nodes that have been used
19:45:40 <jeblair> clarkb: then if you want to run both in parallel, we probably need to manually open the firewalls
19:45:43 <clarkb> but jenkins will only use them once
19:45:52 <clarkb> so it should still be safe
19:45:55 <jeblair> clarkb: yeah, but they won't ever be deleted
19:46:11 <clarkb> they will after 8 hours right?
19:46:28 <jeblair> clarkb: nope, ready sticks around for a while
19:46:34 <jeblair> forever, i think?
19:46:39 <clarkb> gotcha
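[The alien-cleanup step being sketched would look roughly like this — an operational sketch against a live nodepool server, not runnable standalone, and assuming the nodepool CLI of the time (`alien-image-list` is named above; `alien-list` is the corresponding server command):

```shell
# On the new trusty server, once DNS has been switched over:
nodepool alien-list          # cloud servers the local DB doesn't know about
nodepool alien-image-list    # cloud images the local DB doesn't know about
# then delete the listed aliens left over from the old precise server
```
]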
19:47:11 <clarkb> do we want to try doing this before the summit? I will also be afk week after summit
19:47:22 <jeblair> clarkb: so i think either manually add iptables rules for the new server, or shut it all down and do it on a saturday.  :)
19:47:41 <clarkb> I like the idea of shutting it all down simply because there is so much other stuff going on
19:47:50 <jeblair> my last pre-summit day is tomorrow
19:48:07 <fungi> i'll actually be on my way to paris tomorrow and the day after
19:48:07 <clarkb> fungi's too iirc
19:48:15 <clarkb> ya so lets post summit this and do it right?
19:48:21 <fungi> leaving for the airport tomorrow morning
19:48:23 <jeblair> clarkb: sounds like a plan
19:48:29 <fungi> i agree
19:48:41 <jeblair> #agreed upgrade nodepool to trusty after summit
19:49:03 <jeblair> #topic Priority Efforts: Jobs on trusty
19:49:11 <ianw> so can we organise a restart to at least get hp centos7 jobs going?
19:49:17 <ianw> they're currently failing on login
19:49:38 <jeblair> ianw: shouldn't be a problem
19:49:48 <jeblair> #link https://etherpad.openstack.org/p/py34-transition
19:50:18 <fungi> yeah, inching closer
19:50:34 <fungi> heat-translator fixed their issues, python-heatclient has a working change series up
19:50:44 <fungi> python-glanceclient is still stagnant
19:51:05 <jeblair> fungi: did you bring that up at the project meeting?
19:51:07 <fungi> and no new word on the two outstanding ubuntu bugs we need an sru to fix. hopefully ubuntuites are over their release hangovers now
19:51:34 <fungi> jeblair: i did not bring it up at the last project meeting, no, but now that it's been sitting for a week, probably a good idea
19:52:09 <jeblair> fungi: cool (if anyone shows up for it this week ;)
19:52:13 <fungi> #link https://launchpad.net/bugs/1382582
19:52:14 <uvirtbot> Launchpad bug 1382582 in python-glanceclient "untestable in Python 3.4" [Medium,Confirmed]
19:52:26 <fungi> i guess that's been 11 days
19:52:56 <fungi> anyway, nothing else new on this front for now
19:53:17 <jeblair> #topic  puppet-apache
19:53:26 <jeblair> ianw: i think this is yours?
19:53:34 <jeblair> #link https://review.openstack.org/#/c/129496/
19:53:57 <ianw> yes, i'm happy to do this as an exercise in puppet
19:54:04 <ianw> there has been disagreement over the name
19:54:32 <ianw> possibly disagreement over forking puppet-apache, but AFAICS we're pretty stuck if we don't
19:55:16 <fungi> i thought the primary proposal was to fork puppet-apache simply by renaming it, and then work to incrementally migrate
19:55:34 <ianw> yes, that is it
19:55:46 <clarkb> fungi: what does incrementally migrate mean?
19:55:51 <clarkb> migrate to it or away from it?
19:55:54 <ianw> although if we want to migrate remains an open question
19:56:01 <nibalizer> i want to fork the apache mod
19:56:05 <fungi> clarkb: switch modules from the fork to latest upstream bit by bit
19:56:13 <jeblair> clarkb: one .erb at a time; migrate from 0.0.4 (or our fork of it) to upstream
19:56:15 <clarkb> fungi: see I am almost a -2 on doing that :)
19:56:18 <jeblair> nibalizer: you want to fork it and stay on the fork?
19:56:27 <hogepodge> puppet-apache is a pain point for PL from what I understand.
19:56:37 <clarkb> it will make our apache vhosts much harder to reconsume by not puppet
19:56:42 <nibalizer> jeblair: kindof
19:56:48 <clarkb> as we will essentially model everything in more native puppet types
19:56:48 <nibalizer> certainly i think that is a thing we can get consensus on
19:56:54 <nibalizer> since clarkb loves his 0.0.4 apache mod
19:56:59 <clarkb> which apparently makes puppet happy but files are files and should be treated as such imo
19:57:14 <nibalizer> and honestly the design there, which is to weakly model apache, isn't bad
19:57:54 <nibalizer> so we could keep using apache 0.0.4 on our fork, then we and our downstreams have the option to bring in a newer apache if we want
19:58:00 <nibalizer> but at least we have the option
19:58:17 <nibalizer> and if we want to set up a service that already had a puppet module , we wouldn't be boxed out of it
19:58:36 <jeblair> does upstream have the capability to do things that we do in our vhosts?  eg, mod_rewrite certain urls, etc?
19:58:37 <nibalizer> if you remember i had to remove the apache dependency in the puppetboard module
19:58:40 <fungi> well, if it's proposed as a permanent fork and not a temporary stepping-stone, i take it there's no hope of getting templated vhost support readded to latest apache module?
19:59:11 <jeblair> i think doing so would end up with essentially a second configuration language for apache :/
19:59:21 <nibalizer> im not sure what the extent of the capabilities of the new apache mod it
19:59:25 <nibalizer> is*
19:59:33 <jeblair> i expect we may talk about this over some beer
19:59:34 <nibalizer> but yea, the problem with the upstream apache module is you have to learn two things
19:59:45 <nibalizer> since the puppet mod exposes apache configs into the puppet language
20:00:06 <nibalizer> forking is nice because it says loudly that we do not expect to upgrade to latest apache
20:00:17 <nibalizer> and if latest apache ends up in -infra for some ancillary service, im ok with that
20:00:22 <fungi> yeah, we're out of time now anyway
20:00:23 <jeblair> in all cases, https://review.openstack.org/#/c/129496/ seems to support any of the choices before us while getting us out of the dead-end of 0.0.4.
20:00:31 <jeblair> so is certainly worth our review
20:00:33 <jeblair> ianw: thanks!
20:00:48 <jeblair> and thanks everyone else; i hope to see you next week!
20:00:49 <jeblair> #endmeeting