19:01:27 <jeblair> #startmeeting infra
19:01:28 <openstack> Meeting started Tue Mar 25 19:01:27 2014 UTC and is due to finish in 60 minutes.  The chair is jeblair. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:30 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:32 <openstack> The meeting name has been set to 'infra'
19:01:33 <zaro> 0o/
19:01:34 <jeblair> #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting
19:01:34 <jeblair> agenda ^
19:01:34 <jeblair> #link http://eavesdrop.openstack.org/meetings/infra/2014/infra.2014-03-18-19.02.html
19:01:34 <jeblair> last meeting ^
19:01:59 <jeblair> #topic Agenda
19:02:16 <jeblair> there's no agenda, so anyone have anything they want to discuss?
19:02:30 <jeblair> puppetboard
19:03:02 <sdague> jeblair: I think I noticed something out of the corner of my eye the other day on you doing multinode nodepool?
19:03:14 <jeblair> multinodepool
19:03:37 <clarkb> when do RCs happen and does infra need to soft freeze for that?
19:03:56 <jeblair> okay, throw other things out if you think of them, otherwise, this may be short (which is no bad thing)
19:04:00 <jeblair> #topic puppetboard
19:04:00 <fungi> oh, i guess the openstack-attic namespace might be worth some deliberation too
19:04:10 <jeblair> yay we haz puppetboard!
19:04:19 <fungi> it released?
19:04:22 <SergeyLukjanov> yay!
19:04:30 <jeblair> on the only slightly inaccurately named http://puppetdb.openstack.org/
19:04:49 <fungi> oh, puppetboard. i had storyboard in my head
19:04:55 <fungi> yes, puppetboard!
19:05:07 <jeblair> so what's the story with the name anyway?
19:05:07 <clarkb> jeblair: it has been suggested we make a CNAME for it
19:05:08 <jeblair> hostname
19:05:23 <clarkb> jeblair: puppetdb runs on that node and puppetboard depends on puppetdb so they were colocated
19:05:30 <jeblair> should puppetboard and puppetdb be on the same host?
19:05:47 <fungi> it seems fine using the server hostname to me, but if people want a different fqdn for the url that's easy enough to fix
19:05:49 <clarkb> jeblair: I don't see a problem with them being colocated
19:06:00 <anteaya> I think they have to be
19:06:25 <jeblair> okay, if we think they belong on the same host, then yeah, let's cname that
19:06:30 <jeblair> for discoverability, etc
19:06:49 <SergeyLukjanov> +1
19:06:55 <zaro> +1
19:07:01 <fungi> so should we move puppet-dashboard.o.o to be a cname to that then?
19:07:23 <jeblair> fungi: not a bad idea; though of course if anyone has a stale bookmark, it would be to port 3000
19:07:25 <fungi> or retire the old name now that we can tear down the old server and use puppetboard.o.o or something?
19:07:48 <jeblair> so given the port issue, i'd be okay just dropping puppet-dashboard
19:07:49 <anteaya> I like puppetboard.o.o
19:07:51 <clarkb> I think we should tear down the old server and use puppetboard
19:08:11 <clarkb> it wasn't a highly used external service anyways (I don't imagine many people will have to update bookmarks)
19:08:30 <fungi> there may be a handful of docs which need updating in the config repo now when pointed at it
19:08:41 <fungi> git grep ought to find them in short order
19:08:55 <jeblair> #agreed cname puppetboard.o.o and delete puppet-dashboard.o.o server and dns
19:09:00 <jeblair> i reckon i can do that
19:09:07 <jeblair> #action jeblair cname puppetboard.o.o and delete puppet-dashboard.o.o server and dns
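The action item above boils down to one new DNS record plus retiring the old one. A minimal sketch, assuming a conventional BIND-style zone (the entry shown is illustrative, not the actual zone contents):

    # hypothetical zone entry for the new alias (BIND syntax):
    #   puppetboard.openstack.org.  IN  CNAME  puppetdb.openstack.org.
    # once it propagates, the alias can be checked with dig:
    dig +short CNAME puppetboard.openstack.org
    # expected output: puppetdb.openstack.org.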
19:09:30 <anteaya> fungi: 9 instances of puppet-dashboard in /config
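Those references are easy to enumerate; fungi's git grep suggestion, run from a checkout of the config repo, would look something like:

    # from a clone of openstack-infra/config; -n prints line numbers
    git grep -n puppet-dashboard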
19:09:46 <zaro> question about gerrit upgrade when done with this topic
19:09:48 <jeblair> and i guess we can start merging the refactoring patches, though we might want to confirm that aarongr, etc, are around to fix anything
19:09:58 <pleia2> yay :)
19:10:33 <jeblair> #topic multinode nodepool
19:10:46 <SergeyLukjanov> ++ for puppetboard
19:11:01 <jeblair> #link https://review.openstack.org/#/q/status:open+project:openstack-infra/nodepool+branch:master+topic:multi,n,z
19:11:13 <jeblair> sdague: ^ there's a 4 patch series to add multi-node support to nodepool
19:11:46 * anteaya sees 7 patches
19:11:50 <jeblair> it's a pretty naive implementation, but it can be refined/extended/replaced later
19:11:52 <sdague> jeblair: awesome
19:12:09 <jeblair> anteaya: there are some other unrelated patches later on in the series
19:12:15 <anteaya> ah
19:12:16 <clarkb> jeblair: I think for a naive implementation it has quite a bit of flexibility
19:12:17 <jeblair> i just didn't bother to change the topic
19:12:29 <clarkb> granted I haven't tried to spin up any subnodes yet so can't speak from experience
19:13:05 <jeblair> sdague: there isn't any L2 connectivity yet; but the nodes will be able to ssh into each other
19:13:10 <sdague> so when do we think we could try to kick the tires with a job that uses it?
19:13:34 <jeblair> sdague: they could set up an openvpn..., or soon we should have neutron support in hp and rax, and when that happens, we can have nodepool create an l2 network
19:13:43 <fungi> depends on what else the job needs to build on top of it (like vpn)
19:13:49 <fungi> er, what jeblair said
19:14:15 <jeblair> but if we can do something that doesn't require L2, it should be pretty easy
19:14:17 <sdague> what code is going to run pre-job start to do something like lay down the vpn?
19:14:37 <sdague> yeh, well the start point for me is multinode devstack
19:14:54 <sdague> so I'm trying to figure out how I could hack this into making that work
19:14:55 <clarkb> sdague: it is arbitrary, you give nodepool a script to run on each node and all nodes can ssh to each other
19:14:56 <jeblair> basically, nodepool is going to run a script on all the nodes in a group after spinning them up and before attaching them to jenkins
19:15:03 <jeblair> so we can do whatever we want there
19:15:29 <sdague> it's going to run the same script on all nodes? or run a script ahead of time?
19:15:31 <clarkb> and each node knows what its role is so you can distinguish between the node connected to jenkins and the subnodes
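A minimal sketch of what such a ready script might look like, assuming (hypothetically) that nodepool exposes each node's role and the group's addresses through files; the paths and format here are invented for illustration, since the real interface is still in the patches under review:

    #!/bin/bash
    # Hypothetical inputs written by nodepool before the script runs:
    #   /etc/nodepool/role       -- "primary" or "sub"
    #   /etc/nodepool/sub_nodes  -- one subnode IP per line (primary only)
    role=$(cat /etc/nodepool/role)

    if [ "$role" = "primary" ]; then
        while read -r ip; do
            # per-subnode setup goes here (firewall holes, keys, ...)
            echo "preparing subnode $ip"
        done < /etc/nodepool/sub_nodes
    fi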
19:15:42 <jeblair> sdague: does multinode devstack need l2 connectivity, or can we just give it some ip addresses and let it talk over L3?
19:15:43 <sdague> I'm mostly thinking about cert generation to create the L2 connect
19:16:01 <sdague> jeblair: good question
19:16:13 <clarkb> sdague: you would probably have each node do their own cert generation then ssh to all of the others to copy it across
19:16:14 <sdague> actually we could probably get away with L3
19:16:25 <sdague> what's the firewall situation between hosts?
19:16:27 <clarkb> then start the openvpn service
19:16:52 <jeblair> cool, i'd like to avoid investing too much in creating openvpn stuff if we're going to end up with neutron in the not too distant future
19:16:54 <sdague> clarkb: well openvpn is hub and spoke, so we'd build the certs on one box, and distribute them
19:16:57 <clarkb> sdague: as is, it's completely locked down; only ssh is allowed in
19:17:14 <fungi> is this where we need heat to gather things like signatures and then enroll on other nodes in the group?
19:17:27 <jeblair> but if we need it, it's an option
19:17:34 <clarkb> jeblair: ++
19:17:38 <jeblair> fungi: we could, but this is all doable with ssh
19:17:40 <clarkb> sdague: right so assuming we don't need L2
19:17:53 <sdague> I'm not convinced doing that in heat is actually any easier than with ssh
19:17:55 <clarkb> sdague: I imagine the script on each node would update the iptables to allow whichever ports you need in the mesh
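On each node, that iptables step might look roughly like the following; the port is illustrative (1194/udp would be openvpn) and the peer list is assumed to come from the hypothetical files above:

    # open a port to each peer in the node group; run as root
    while read -r peer; do
        iptables -I INPUT -p udp -s "$peer" --dport 1194 -j ACCEPT
    done < /etc/nodepool/sub_nodes   # hypothetical path, as above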
19:18:14 <jeblair> fungi: the nodes all know each other, and the controller node is deterministically the last one to run the ready script
19:18:14 <fungi> yeah, i think neutron's design for multi-node assumes l2 connectivity but that can presumably come later
19:18:15 <sdague> clarkb: we'll have firewall control inside the nodes ourselves?
19:18:19 <clarkb> sdague: yup you have root
19:18:25 <clarkb> sdague: and we only do local firewalls
19:18:31 <sdague> I thought that previously there was an issue with punching holes
19:18:34 <sdague> that heat was running into
19:18:47 <clarkb> sdague: there was, but heat fixed it by updating the rules on those nodes
19:18:59 <sdague> ok
19:19:11 <jeblair> so i imagine the ready script pokes holes in iptables, adds the controller node's ssh key on all the subnodes, then on the controller node, it can ssh back out to the subnodes to do any key distribution, etc.
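The controller-side fan-out described there could be as simple as this, assuming the controller's key has already been authorized on the subnodes and that jobs run as a "jenkins" user (both assumptions for illustration, not settled facts):

    # on the controller: reach each subnode and push whatever credentials
    # or config the job will need
    while read -r ip; do
        scp -o StrictHostKeyChecking=no some-shared-secret "jenkins@$ip:"
        ssh -o StrictHostKeyChecking=no "jenkins@$ip" 'echo subnode ready'
    done < /etc/nodepool/sub_nodes   # hypothetical path, as above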
19:19:57 <jeblair> anyway, that script is the last part of this that hasn't been written
19:20:06 <sdague> ok, cool
19:20:15 <sdague> well once that part is in place, let me know
19:20:30 <sdague> I think an experimental devstack sanity check job would be a good place to start after that
19:20:49 <sdague> and then we'll get to see a whole new level of failure in the gate :)
19:20:53 <jeblair> sdague: from a jenkins POV, it's just a single node attached to jenkins.  it just happens that the job jenkins runs on that node will have access to a set of other machines.
19:21:02 <sdague> jeblair: cool
19:21:04 <jeblair> sdague: hehe
19:21:32 <jeblair> #topic openstack-attic
19:21:44 <jeblair> i registered this on github in case we need it
19:22:20 <fungi> when last the conversation wound down, it sounded like the tc had basically already approved policy that we would archive things we no longer care about into a separate namespace
19:22:42 <fungi> thus something we need to plan for. i guess when we do the next round of renames
19:22:49 <jeblair> i missed that tc meeting
19:23:06 <jeblair> fungi: do you have a reference?
19:23:13 <fungi> i missed that the policy patch was stating that, or i'd have suggested not doing so
19:23:29 <fungi> ttx linked to the change in governance. i'll dig it back up
19:24:13 <mordred> o/
19:24:16 <mordred> (sorry late)
19:24:32 <pleia2> welcome mordred
19:24:36 <SergeyLukjanov> mordred, hey
19:25:58 <fungi> #link http://git.openstack.org/cgit/openstack/governance/commit/reference/programs.yaml?id=7044c17
19:26:10 <zaro> mordred: your appearance made everything quiet.
19:26:36 * mordred is the quiet maker
19:27:16 <fungi> in the commit message, which was apparently part of what's voted on along with the patch itself, "openstack-dev/openstack-qa, openstack/melange, openstack/openstack-chef, and openstack/python-melangeclient which should move to some attic ... openstack-dev/sandbox and openstack/python-openstackclient which should be in stackforge"
19:27:35 <jeblair> hrm
19:27:47 <jeblair> i'm not sure a commit message makes policy
19:27:55 <fungi> that was more or less what the discussion ended with at any rate
19:27:59 <jeblair> especially a vague one like that
19:28:11 <ttx> o/
19:28:19 <jeblair> i mean, i think it's the actions on files in the repo that are the policy and what people are voting on
19:28:49 <jeblair> i see the commit message as more like the speech you would give to the committee about the motion, not the motion itself
19:29:03 <mordred> jeblair: ++
19:29:08 <fungi> in which case, more clarification needed after all
19:29:10 <mordred> I would consider that to be the case
19:29:30 <ttx> fungi: commit message @ http://git.openstack.org/cgit/openstack/governance/commit/?id=7044c177dcc02d44321660db8b909483dad68be3
19:29:58 <ttx> yes, that was not a "resolution", more of a default understanding
19:29:59 <fungi> ttx: right, that's what i linked above
19:30:07 <ttx> missed link sorry
19:30:22 <fungi> okay, so it'
19:30:24 <fungi> er
19:30:30 <ttx> (hence the "should" language which is far from prescriptive)
19:30:31 <jeblair> it probably does represent what most of us think anyway, at least generally
19:30:45 <jeblair> as in, i think we could probably all get behind moving melange to openstack-attic
19:31:08 <ttx> I think there is the idea that all projects under openstack*/ should be accounted for
19:31:14 <fungi> i was less convinced that openstack-dev/sandbox belongs in stackforge
19:31:20 <jeblair> fungi: ++
19:31:22 <ttx> not really forcing the implementation
19:31:45 <clarkb> and in general I am not sure stackforge is a dumping ground so ++ to attic
19:31:48 <ttx> the commit message reflects the discussion we had about those orphans
19:32:06 <clarkb> stackforge isn't a default. It is specifically a way for community members to leverage the tools we provide for active development
19:32:09 <jeblair> so i think our open questions are: a) what to do with melange? should we initialize openstack-attic and move it there?  or leave it in openstack and categorize it as abandoned?
19:32:20 <fungi> though i guess the gantt and openstackclient moves to stackforge are more in line with what we expect to do in the case of an aborted incubation
19:32:29 <jeblair> and b) what to do with gantt
19:33:04 <jeblair> i would say under normal conditions, that's an easy one, we move it to stackforge, except people keep saying it might come back to life again later
19:33:15 <fungi> in which case, the compute program deciding to no longer foster gantt is actually an aborted incubation right?
19:33:42 <jeblair> fungi: i think so.  i just hope they know that if they move it to stackforge, they're giving up the right to use the name gantt.  :)
19:34:04 <jeblair> because i think it would be mean to rename a stackforge project because openstack likes the name
19:34:13 <clarkb> ++
19:34:22 <fungi> or is there some other class of projects which might get spun out of an official program into unofficial space?
19:34:25 <jeblair> (that is unless they want to rename it _back_ to openstack in the future)
19:35:46 <jeblair> fungi: i think once they are unofficial, they're unofficial; we don't care anymore
19:35:53 <sdague> gantt is going to be fully reinitialized when it happens again
19:35:59 <fungi> basically wondering if gantt was incubated and is now being un-incubated, in which case we have initial precedent for the process
19:36:08 <jeblair> sdague: what's it going to be called?
19:36:13 <sdague> gantt
19:36:26 <sdague> basically the forklift failed
19:36:34 <jeblair> sdague: so there's a group of people that want to keep working on the current forklift
19:36:52 <sdague> my understanding is it will be a new forklift
19:37:08 <sdague> after cleaning up some nova scheduler parts to make the forklift doable
19:37:10 <clarkb> right, which leads to the issue jeblair points out above
19:37:14 <sdague> agreed
19:37:29 <jeblair> sdague: i think if nova has decided to mothball gantt, but wants to revive it in the future.... there are two least-bad choices
19:37:55 <clarkb> keep gantt as openstack/gantt >_>
19:38:03 <jeblair> a) rename openstack/gantt -> something/notgantt; this gives nova devs a clean start
19:38:22 <sdague> jeblair: honestly, I have no idea why we're not just deleting gantt. I think the people that want to play with the code should do so off in github if they really want to. Because there is no path forward from the current codebase to the eventual project
19:38:41 <fungi> s/delete/mothball/ but otherwise agree
19:38:47 <sdague> fungi: sure
19:38:56 <sdague> openstack-attic/gantt-mark1
19:39:05 <jeblair> b) keep openstack/gantt and disable the acls; when it is restarted, people can merge a new branch over the current one; we won't do a force push, but we can do a wholesale replacement in one commit like we did with keystone (it preserves history too)
19:39:28 <jeblair> sdague: yeah, that's closest to option (a) (actual deletion isn't an option)
19:40:22 <jeblair> sdague: thanks for that clarification, i think it helps a lot
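For reference, option (b)'s one-commit wholesale replacement is a standard git maneuver; whether the keystone swap used exactly these commands is not stated above, but a hedged sketch, with branch names illustrative, would be:

    # with the revived code on a local branch called "revival" (illustrative):
    git checkout master
    git merge -s ours --no-commit revival   # start a merge, keeping our tree for now
    git read-tree -u --reset revival        # then take the revival tree wholesale
    git commit -m "Replace mothballed tree with revived code"
    # the result is a single merge commit whose tree matches "revival", with
    # the full history of both lines preserved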
19:40:48 <jeblair> anyone have preferences on option a vs b?
19:41:19 <fungi> i think b is a little more genuine and preserves what went on
19:41:34 <clarkb> ++ to b
19:41:44 <jeblair> i still have my notes from when i did that for keystone, so it should be pretty easy to do
19:41:48 <clarkb> continuing work on what was gantt can continue wherever the devs like
19:42:10 <fungi> with a we send a signal that it's okay to etch-a-sketch projects by just renaming them out of the way and trying harder next time
19:42:27 <jeblair> fungi: yeah, i agree, that makes me uneasy
19:43:30 <jeblair> ttx: it seems like we think the best thing to do with gantt is to leave it where it is but disable access for now, until it is either revived (most likely), or it is certain that it will not be revived.
19:44:06 <jeblair> ttx: maybe we should have a mothball status for projects
19:44:57 <jeblair> anyone object to moving melange to -attic ?
19:45:29 <anteaya> I do not object
19:45:38 <fungi> if we are agreed on wanting openstack-attic, then .*melange.* are good fodder
19:45:45 <anteaya> it isn't like there hasn't been adequate discussion
19:46:07 <SergeyLukjanov> fungi, ++
19:46:32 <jeblair> fungi: i think that's implicit in the question; i am also asking if you think moving it elsewhere or not moving it are preferable
19:47:10 <sdague> it would be a good initial signal as well
19:47:22 <sdague> see what other dead projects people propose moving to the attic
19:47:39 <sdague> is your intent to move stackforge there as well?
19:47:44 <sdague> or should there be a stackforge-attic?
19:47:49 <fungi> i'm fine with moving stuff into an attic as a low-priority task when we're already doing other more urgent renames, since there seems to be tc consensus that we should not keep things around in the openstack/.* namespace indefinitely
19:47:57 <jeblair> i'll send an email to the infra list suggesting dispositions for these tricky projects, and we can verify that we agree with them, and if so, i'll forward to the tc to see if it's agreeable to them
19:48:23 <jeblair> and if they object, then we'll start writing policy i guess.  :)
19:48:38 * clarkb returns to following meeting
19:48:55 <jeblair> (by writing policy i mean proposing changes to governance to make this explicit)
19:49:17 <jeblair> #action jeblair propose organizational cleanup repo renames to infra and then tc lists
19:49:24 <jeblair> #topic freeze and release schedule
19:49:29 <jeblair> clarkb: just in time
19:49:47 <jeblair> #link https://wiki.openstack.org/wiki/Icehouse_Release_Schedule
19:50:03 <jeblair> milestones start thursday
19:50:08 <clarkb> I saw that the first RCs are starting to come out
19:50:19 <clarkb> and was curious if we needed to institute a soft freeze around that happening
19:50:20 <jeblair> oh that seems early
19:50:41 <sdague> it's later than you think
19:50:47 <sdague> monday is april
19:51:06 <jeblair> i guess that calendar means rc's start at the beginning of the week that has the 27th
19:51:08 <fungi> oh, and related to release stuffs, i should get started on the grizzly eol i meant to start yesterday
19:51:13 <ttx> jeblair: RCs may start anytime now
19:51:17 <jeblair> i initially read it as rcs start on the 27th
19:51:34 <ttx> there is no start date for them, the schedule just shows when they are ~expected
19:51:40 <sdague> yeh
19:51:44 <jeblair> i don't think a 1 month freeze is tenable
19:51:48 <ttx> the first one shall happen tomorrow fwiw
19:52:09 <clarkb> jeblair: agreed, perhaps we should be nice during the first part of the freeze (this week) then go back to business as usual?
19:52:22 <ttx> jeblair: soft freeze, everyone works on storyboard instead ! :)
19:52:31 <krotscheck> yay!
19:52:44 <anteaya> sorry what are we freezing?
19:52:47 <sdague> this week and next week are probably the sensitive times
19:52:49 <jeblair> ttx: i started breaking/fixing storyboard things this morning!
19:52:50 <clarkb> anteaya: infra
19:53:05 <sdague> when rc-critical bugs are still likely to be flowing at a good rate
19:53:31 <jeblair> sdague, ttx: right, but there aren't any hard deadlines right now
19:53:34 <ttx> jeblair: the RCS are not that time-sensitive. I'd rather softfreeze the week before release so that we can respin RCs quickly if need be
19:53:45 <sdague> sure
19:53:56 <ttx> if a RC has to wait another day due to wreckage, not that big of a deal
19:54:06 <sdague> and what do we really mean by soft freeze? basically no zuul or nodepool upgrades?
19:54:16 <sdague> it's definitely on the soft side
19:54:19 <jeblair> sdague: unless we really want to, yes.  :)
19:54:31 <ttx> so I'd say, hold off on weird changes (like Gerrit 2.8), and softfreeze from Apr 10 to Apr 17
19:54:31 <clarkb> sdague: basically we tend to spend more time evaluating things (eg FFE)
19:54:35 <sdague> basically just try to reduce high risk changes
19:54:57 <fungi> and not schedule outages unless they're absolutely needed to get things un-broken
19:54:57 <sdague> ttx: well gerrit is already pushed past that, so that's all good
19:55:08 <ttx> sdague: yes, using it as an example
19:55:20 <jeblair> so yeah, let's be cautious now, but not defer significant work yet, and instead expect a soft freeze around april 10-17
19:55:25 <clarkb> ++
19:55:25 <fungi> so for example, not doing project renames while we're in a soft freeze
19:55:36 <sdague> fungi: yeh, that seems solid
19:55:47 <ttx> frankly speaking, the following two weeks are less sensitive infra-wise than FeatureFreeze week
19:55:47 <sdague> probably worth actually figuring out what soft freeze means :)
19:55:56 <sdague> true
19:56:34 <ttx> so if you give me two weeks for freezing infra changes, I'd pick feature freeze week and release week
19:56:35 <jeblair> #topic gerrit upgrade
19:56:44 <jeblair> zaro: ?
19:56:44 <clarkb> ttx: good to know thanks
19:57:14 <zaro> you asked me about using WIP plugin for gerrit 2.8.  still interested in that?
19:57:26 <jeblair> zaro: yes, especially if we have a bit of extra time to look into it
19:57:42 <jeblair> zaro: if the patch needed has landed in master, i think we can consider backporting it to 2.8
19:58:14 <zaro> ok.  i'll ask _david_ about which patches are necessary.
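Assuming the necessary fix has indeed merged to Gerrit's master, the backport would typically be a cherry-pick onto the stable branch; a sketch, with the branch and commit names illustrative:

    git checkout -b wip-plugin-backport origin/stable-2.8
    git cherry-pick -x <sha-of-master-commit>   # -x records the original sha
    # resolve any conflicts, then push the result up for review as usual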
19:58:25 <jeblair> zaro: cool, thanks!
19:58:41 <jeblair> thanks everyone!
19:58:45 <jeblair> #endmeeting