19:01:27 #startmeeting infra
19:01:28 Meeting started Tue Mar 25 19:01:27 2014 UTC and is due to finish in 60 minutes. The chair is jeblair. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:30 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:32 The meeting name has been set to 'infra'
19:01:33 0o/
19:01:34 #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting
19:01:34 agenda ^
19:01:34 #link http://eavesdrop.openstack.org/meetings/infra/2014/infra.2014-03-18-19.02.html
19:01:34 last meeting ^
19:01:59 #topic Agenda
19:02:16 there's no agenda, so anyone have anything they want to discuss?
19:02:30 puppetboard
19:03:02 jeblair: I think I noticed something out of the corner of my eye the other day on you doing multinode nodepool?
19:03:14 multinodepool
19:03:37 when do RCs happen and does infra need to soft freeze for that?
19:03:56 okay, throw other things out if you think of them, otherwise, this may be short (which is no bad thing)
19:04:00 #topic puppetboard
19:04:00 oh, i guess the openstack-attic namespace might be worth some deliberation too
19:04:10 yay we haz puppetboard!
19:04:19 it released?
19:04:22 yay!
19:04:30 on the only slightly inaccurately named http://puppetdb.openstack.org/
19:04:49 oh, puppetboard. i had storyboard in my head
19:04:55 yes, puppetboard!
19:05:07 so what's the story with the name anyway?
19:05:07 jeblair: it has been suggested we make a CNAME for it
19:05:08 hostname
19:05:23 jeblair: puppetdb runs on that node and puppetboard depends on puppetdb so they were colocated
19:05:30 should puppetboard and puppetdb be on the same host?
19:05:47 it seems fine using the server hostname to me, but if people want a different fqdn for the url that's easy enough to fix
19:05:49 jeblair: I don't see a problem with them being colocated
19:06:00 I think they have to be
19:06:25 okay, if we think they belong on the same host, then yeah, let's cname that
19:06:30 for discoverability, etc
19:06:49 +1
19:06:55 +1
19:07:01 so should we move puppet-dashboard.o.o to be a cname to that then?
19:07:23 fungi: not a bad idea; though of course if anyone has a stale bookmark, it would be to port 3000
19:07:25 or retire the old name now that we can tear down the old server and use puppetboard.o.o or something?
19:07:48 so given the port issue, i'd be okay just dropping puppet-dashboard
19:07:49 I like puppetboard.o.o
19:07:51 I think we should tear down the old server and use puppetboard
19:08:11 it wasn't a highly used external service anyways (I don't imagine many people will have to update bookmarks)
19:08:30 there may be a handful of docs in the config repo which point at it and will need updating
19:08:41 git grep ought to find them in short order
19:08:55 #agreed cname puppetboard.o.o and delete puppet-dashboard.o.o server and dns
19:09:00 i reckon i can do that
19:09:07 #action jeblair cname puppetboard.o.o and delete puppet-dashboard.o.o server and dns
19:09:30 fungi: 9 instances of puppet-dashboard in /config
19:09:46 question about gerrit upgrade when done with this topic
19:09:48 and i guess we can start merging the refactoring patches, though we might want to confirm that aarongr, etc, are around to fix anything
19:09:58 yay :)
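A rough sketch of the two pieces of cleanup agreed above: git grep is the straightforward way to find the config-repo references fungi counted, and the zone-file line is only illustrative, since the actual record name and how the openstack.org zone is managed may differ.

    # find the config-repo docs/manifests that still point at the old name
    git grep -n puppet-dashboard

    ; hypothetical zone-file entry for the new alias
    puppetboard.openstack.org.  IN  CNAME  puppetdb.openstack.org.
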
19:10:33 #topic multinode nodepool
19:10:46 ++ for puppetboard
19:11:01 #link https://review.openstack.org/#/q/status:open+project:openstack-infra/nodepool+branch:master+topic:multi,n,z
19:11:13 sdague: ^ there's a 4-patch series to add multi-node support to nodepool
19:11:46 * anteaya sees 7 patches
19:11:50 it's a pretty naive implementation, but it can be refined/extended/replaced later
19:11:52 jeblair: awesome
19:12:09 anteaya: there are some other unrelated patches later on in the series
19:12:15 ah
19:12:16 jeblair: I think for a naive implementation it has quite a bit of flexibility
19:12:17 i just didn't bother to change the topic
19:12:29 granted I haven't tried to spin up any subnodes yet so can't speak from experience
19:13:05 sdague: there isn't any L2 connectivity yet; but the nodes will be able to ssh into each other
19:13:10 so when do we think we could try to kick the tires with a job that uses it?
19:13:34 sdague: they could set up an openvpn..., or soon we should have neutron support in hp and rax, and when that happens, we can have nodepool create an l2 network
19:13:43 depends on what else the job needs to build on top of it (like vpn)
19:13:49 er, what jeblair said
19:14:15 but if we can do something that doesn't require L2, it should be pretty easy
19:14:17 what code is going to run pre-job-start to do something like lay down the vpn?
19:14:37 yeh, well the starting point for me is multinode devstack
19:14:54 so I'm trying to figure out how I could hack this into making that work
19:14:55 sdague: it is arbitrary, you give nodepool a script to run on each node and all nodes can ssh to each other
19:14:56 basically, nodepool is going to run a script on all the nodes in a group after spinning them up and before attaching them to jenkins
19:15:03 so we can do whatever we want there
19:15:29 it's going to run the same script on all nodes? or run a script ahead of time?
19:15:31 and each node knows what its role is so you can distinguish between the node connected to jenkins and the subnodes
19:15:42 sdague: does multinode devstack need l2 connectivity, or can we just give it some ip addresses and let it talk over L3?
19:15:43 I'm mostly thinking about cert generation to create the L2 connection
19:16:01 jeblair: good question
19:16:13 sdague: you would probably have each node do their own cert generation then ssh to all of the others to copy it across
19:16:14 actually we could probably get away with L3
19:16:25 what's the firewall situation between hosts?
19:16:27 then start the openvpn service
19:16:52 cool, i'd like to avoid investing too much in creating openvpn stuff if we're going to end up with neutron in the not too distant future
19:16:54 clarkb: well openvpn is hub and spoke, so we'd build the certs on one box, and distribute them
19:16:57 sdague: as is, it's completely locked down; only ssh is allowed in
19:17:14 is this where we need heat to gather things like signatures and then enroll on other nodes in the group?
19:17:27 but if we need it, it's an option
19:17:34 jeblair: ++
19:17:38 fungi: we could, but this is all doable with ssh
19:17:40 sdague: right so assuming we don't need L2
19:17:53 I'm not convinced doing that in heat is actually any easier than with ssh
19:17:55 sdague: I imagine the script on each node would update the iptables to allow whichever ports you need in the mesh
19:18:14 fungi: the nodes all know each other, and the controller node is deterministically the last one to run the ready script
19:18:14 yeah, i think neutron's desperation for multi-node assumes l2 connectivity but that can presumably come later
19:18:15 clarkb: we'll have firewall control inside the nodes ourselves?
19:18:19 sdague: yup you have root
19:18:25 sdague: and we only do local firewalls
19:18:31 I thought that previously there was an issue with punching holes
19:18:34 that heat was running into
19:18:47 sdague: there was, but heat fixed it by updating the rules on those nodes
19:18:59 ok
19:19:11 so i imagine the ready script pokes holes in iptables, adds the controller node's ssh key on all the subnodes, then on the controller node, it can ssh back out to the subnodes to do any key distribution, etc.
19:19:57 anyway, that script is the last part of this that hasn't been written
19:20:06 ok, cool
19:20:15 well once that part is in place, let me know
19:20:30 I think an experimental devstack sanity check job would be a good place to start after that
19:20:49 and then we'll get to see a whole new level of failure in the gate :)
19:20:53 sdague: from a jenkins POV, it's just a single node attached to jenkins. it just happens that the job jenkins runs on that node will have access to a set of other machines.
19:21:02 jeblair: cool
19:21:04 sdague: hehe
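The ready script jeblair describes had not been written at the time of this meeting; what follows is only a sketch of the shape such a per-node script might take, assuming nodepool exposes each node's role and the addresses of the other nodes in the group. The ROLE variable and the /etc/nodepool/sub_nodes path are illustrative placeholders, not real interfaces.

    #!/bin/bash
    # Hypothetical per-node setup ("ready") script: nodepool would run this on
    # every node in a group after boot, and the controller node runs it last,
    # before being attached to jenkins.
    set -e

    SUBNODES=$(cat /etc/nodepool/sub_nodes)   # illustrative: IPs of the other group members

    # poke holes in the local iptables rules so group members can reach each
    # other; per the discussion, hosts are otherwise locked down to ssh only
    for ip in $SUBNODES; do
        iptables -I INPUT -s "$ip" -j ACCEPT
    done

    if [ "$ROLE" = "controller" ]; then
        # the controller is deterministically the last node to run this, so the
        # subnodes are already prepared; distribute a key so the job can reach them
        # (the nodes can already ssh to each other, per the discussion above)
        test -f ~/.ssh/id_rsa || ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
        for ip in $SUBNODES; do
            ssh "$ip" 'cat >> ~/.ssh/authorized_keys' < ~/.ssh/id_rsa.pub
        done
    fi
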
19:21:32 #topic openstack-attic
19:21:44 i registered this on github in case we need it
19:22:20 when last the conversation wound down, it sounded like the tc had basically already approved policy that we would archive things we no longer care about into a separate namespace
19:22:42 thus something we need to plan for. i guess when we do the next round of renames
19:22:49 i missed that tc meeting
19:23:06 fungi: do you have a reference?
19:23:13 i missed that the policy patch was stating that, or i'd have suggested not doing so
19:23:29 ttx linked to the change in governance. i'll dig it back up
19:24:13 o/
19:24:16 (sorry late)
19:24:32 welcome mordred
19:24:36 mordred, hey
19:25:58 #link http://git.openstack.org/cgit/openstack/governance/commit/reference/programs.yaml?id=7044c17
19:26:10 mordred: your appearance made everything quiet.
19:26:36 * mordred is the quiet maker
19:27:16 in the commit message, which was apparently part of what was voted on along with the patch itself, "openstack-dev/openstack-qa, openstack/melange, openstack/openstack-chef, and openstack/python-melangeclient which should move to some attic ... openstack-dev/sandbox and openstack/python-openstackclient which should be in stackforge"
19:27:35 hrm
19:27:47 i'm not sure a commit message makes policy
19:27:55 that was more or less what the discussion ended with at any rate
19:27:59 especially a vague one like that
19:28:11 o/
19:28:19 i mean, i think it's the actions on files in the repo that are the policy and what people are voting on
19:28:49 i see the commit message as more like the speech you would give to the committee about the motion, not the motion itself
19:29:03 jeblair: ++
19:29:08 in which case, more clarification needed after all
19:29:10 I would consider that to be the case
19:29:30 fungi: commit message @ http://git.openstack.org/cgit/openstack/governance/commit/?id=7044c177dcc02d44321660db8b909483dad68be3
19:29:58 yes, that was not a "resolution", more of a default understanding
19:29:59 ttx: right, that's what i linked above
19:30:07 missed link sorry
19:30:22 okay, so it'
19:30:24 er
19:30:30 (hence the "should" language which is far from prescriptive)
19:30:31 it probably does represent what most of us think anyway, at least generally
19:30:45 as in, i think we could probably all get behind moving melange to openstack-attic
19:31:08 I think there is the idea that all projects under openstack*/ should be accounted for
19:31:14 i was less convinced that openstack-dev/sandbox belongs in stackforge
19:31:20 fungi: ++
19:31:22 not really forcing the implementation
19:31:45 and in general I am not sure stackforge is a dumping ground so ++ to attic
19:31:48 the commit message reflects the discussion we had about those orphans
19:32:06 stackforge isn't a default. It is specifically a way for community members to leverage the tools we provide for active development
19:32:09 so i think our open questions are: a) what to do with melange? should we initialize openstack-attic and move it there? or leave it in openstack and categorize it as abandoned?
19:32:20 though i guess the gantt and openstackclient moves to stackforge are more in line with what we expect to do in the case of an aborted incubation
19:32:29 and b) what to do with gantt
19:33:04 i would say under normal conditions, that's an easy one, we move it to stackforge, except people keep saying it might come back to life again later
19:33:15 in which case, the compute program deciding to no longer foster gantt is actually an aborted incubation right?
19:33:42 fungi: i think so. i just hope they know that if they move it to stackforge, they're giving up the right to use the name gantt. :)
19:34:04 because i think it would be mean to rename a stackforge project because openstack likes the name
19:34:13 ++
19:34:22 or is there some other class of projects which might get spun out of an official program into unofficial space?
19:34:25 (that is unless they want to rename it _back_ to openstack in the future)
19:35:46 fungi: i think once they are unofficial, they're unofficial; we don't care anymore
19:35:53 gantt is going to be fully reinitialized when it happens again
19:35:59 basically wondering if gantt was incubated and is now being un-incubated, in which case we have initial precedent for the process
19:36:08 sdague: what's it going to be called?
19:36:13 gantt
19:36:26 basically the forklift failed
19:36:34 sdague: so there's a group of people that want to keep working on the current forklift
19:36:52 my understanding is it will be a new forklift
19:37:08 after cleaning up some nova scheduler parts to make the forklift doable
19:37:10 right, which leads to the issue jeblair points out above
19:37:14 agreed
19:37:29 sdague: i think if nova has decided to mothball gantt, but wants to revive it in the future.... there are two least-bad choices
19:37:55 keep gantt as openstack/gantt >_>
19:38:03 a) rename openstack/gantt -> something/notgantt; this gives nova devs a clean start
19:38:22 jeblair: honestly, I have no idea why we're not just deleting gantt. I think the people that want to play with the code should do so off in github if they really want to. Because there is no path forward from the current codebase to the eventual project
19:38:41 s/delete/mothball/ but otherwise agree
19:38:47 fungi: sure
19:38:56 openstack-attic/gantt-mark1
19:39:05 b) keep openstack/gantt and disable the acls; when it is restarted, people can merge a new branch over the current one; we won't do a force push, but we can do a wholesale replacement in one commit like we did with keystone (it preserves history too)
19:39:28 sdague: yeah, that's closest to option (a) (actual deletion isn't an option)
19:40:22 sdague: thanks for that clarification, i think it helps a lot
19:40:48 anyone have preferences on option a vs b?
19:41:19 i think b is a little more genuine and preserves what went on
19:41:34 ++ to b
19:41:44 i still have my notes from when i did that for keystone, so it should be pretty easy to do
19:41:48 continuing work on what was gantt can continue wherever the devs like
19:42:10 with a we send a signal that it's okay to etch-a-sketch projects by just renaming them out of the way and trying harder next time
19:42:27 fungi: yeah, i agree, that makes me uneasy
19:43:30 ttx: it seems like we think the best thing to do with gantt is to leave it where it is but disable access for now, until it is either revived (most likely), or it is certain that it will not be revived.
19:44:06 ttx: maybe we should have a mothball status for projects
19:44:57 anyone object to moving melange to -attic?
19:45:29 I do not object
19:45:38 if we are agreed on wanting openstack-attic, then .*melange.* are good fodder
19:45:45 it isn't like there hasn't been adequate discussion
19:46:07 fungi, ++
19:46:32 fungi: i think that's implicit in the question; i am also asking if you think moving it elsewhere or not moving it is preferable
19:47:10 it would be a good initial signal as well
19:47:22 see what other dead projects people propose moving to the attic
19:47:39 is your intent to move stackforge projects there as well?
19:47:44 or should there be a stackforge-attic?
19:47:49 i'm fine with moving stuff into an attic as a low-priority task when we're already doing other more urgent renames, since there seems to be tc consensus that we should not keep things around in the openstack/.* namespace indefinitely
19:47:57 i'll send an email to the infra list suggesting dispositions for these tricky projects, and we can verify that we agree with them, and if so, i'll forward to the tc to see if it's agreeable to them
19:48:23 and if they object, then we'll start writing policy i guess. :)
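For reference, option (b)'s "wholesale replacement in one commit" can be done without a force push along roughly these lines; the paths, remote name, and commit message are placeholders, and this is not necessarily the exact procedure used for keystone.

    # on a clone of the mothballed repo, with the revived code in a separate tree
    git checkout master
    git rm -r -q .                                   # drop the old tree; history is preserved
    rsync -a --exclude=.git /path/to/new-gantt/ .    # placeholder path for the new codebase
    git add -A
    git commit -m "Replace contents with restarted scheduler forklift"
    git push gerrit master                           # ordinary fast-forward push, no --force
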
19:48:38 * clarkb returns to following meeting
19:48:55 (by writing policy i mean proposing changes to governance to make this explicit)
19:49:17 #action jeblair propose organizational cleanup repo renames to infra and then tc lists
19:49:24 #topic freeze and release schedule
19:49:29 clarkb: just in time
19:49:47 #link https://wiki.openstack.org/wiki/Icehouse_Release_Schedule
19:50:03 milestones start thursday
19:50:08 I saw that the first RCs are starting to come out
19:50:19 and was curious if we needed to institute a soft freeze around that happening
19:50:20 oh that seems early
19:50:41 it's later than you think
19:50:47 monday is april
19:51:06 i guess that calendar means RCs start at the beginning of the week that has the 27th
19:51:08 oh, and related to release stuffs, i should get started on the grizzly eol i meant to start yesterday
19:51:13 jeblair: RCs may start anytime now
19:51:17 i initially read it as RCs start on the 27th
19:51:34 there is no start date for them, the schedule just shows when they are ~expected
19:51:40 yeh
19:51:44 i don't think a 1 month freeze is tenable
19:51:48 the first one shall happen tomorrow fwiw
19:52:09 jeblair: agreed, perhaps we should be nice during the first part of the freeze (this week) then go back to business as usual?
19:52:22 jeblair: soft freeze, everyone works on storyboard instead ! :)
19:52:31 yay!
19:52:44 sorry what are we freezing?
19:52:47 this week and next week are probably the sensitive times
19:52:49 ttx: i started breaking/fixing storyboard things this morning!
19:52:50 anteaya: infra
19:53:05 when the likelihood of rc-critical bugs still flowing at a good rate exists
19:53:31 sdague, ttx: right, but there aren't any hard deadlines right now
19:53:34 jeblair: the RCs are not that time-sensitive. I'd rather softfreeze the week before release so that we can respin RCs quickly if need be
19:53:45 sure
19:53:56 if an RC has to wait another day due to wreckage, not that big of a deal
19:54:06 and what do we really mean by soft freeze? basically no zuul or nodepool upgrades?
19:54:16 it's definitely on the soft side
19:54:19 sdague: unless we really want to, yes. :)
19:54:31 so I'd say, hold off on weird changes (like Gerrit 2.8), and softfreeze from Apr 10 to Apr 17
19:54:31 sdague: basically we tend to spend more time evaluating things (eg FFE)
19:54:35 basically just try to reduce high-risk changes
19:54:57 and not schedule outages unless they're absolutely needed to get things un-broken
19:54:57 ttx: well gerrit is already pushed past that, so that's all good
19:55:08 sdague: yes, using it as an example
19:55:20 so yeah, let's be cautious now, but not defer significant work yet, and instead expect a soft freeze around april 10-17
19:55:25 ++
19:55:25 so for example, not doing project renames while we're in a soft freeze
19:55:36 fungi: yeh, that seems solid
19:55:47 frankly speaking, the following two weeks are less sensitive infra-wise than FeatureFreeze week
19:55:47 probably worth actually figuring out what soft freeze means :)
19:55:56 true
19:56:34 so if you give me two weeks for freezing infra changes, I'd pick feature freeze week and release week
19:56:35 #topic gerrit upgrade
19:56:44 zaro: ?
19:56:44 ttx: good to know thanks
19:57:14 you asked me about using the WIP plugin for gerrit 2.8. still interested in that?
19:57:26 zaro: yes, especially if we have a bit of extra time to look into it
19:57:42 zaro: if the patch needed has landed in master, i think we can consider backporting it to 2.8
19:58:14 ok. i'll ask _david_ about which patches are necessary.
19:58:25 zaro: cool, thanks!
19:58:41 thanks everyone!
19:58:45 #endmeeting