19:01:53 #startmeeting infra
19:01:53 Meeting started Tue Nov 25 19:01:53 2014 UTC and is due to finish in 60 minutes. The chair is jeblair. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:54 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:55 o/
19:01:57 The meeting name has been set to 'infra'
19:02:08 o/
19:02:16 o/
19:02:22 #link agenda https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting
19:02:38 o/
19:02:44 #link last meeting http://eavesdrop.openstack.org/meetings/infra/2014/infra.2014-11-18-19.02.html
19:02:45 o/
19:02:52 o/
19:03:13 #topic Actions from last meeting
19:03:18 there are going to be a lot of these i actioned to myself and was then too lazy to get around to
19:03:29 * fungi wears hat of shame
19:03:33 clarkb figure out gerrit per project third party voting ACLs and third party accounts via openid
19:03:41 I have a document on that
19:03:55 #link https://etherpad.openstack.org/p/third-party-openid-accounts
19:04:01 o/
19:04:03 I actually don't think ACLs need to change at least in the first iteration
19:04:12 instead we can do this all purely with gerrit groups
19:04:20 agreed, that could be left to a future improvement
19:04:25 to maintain the current status quo of voting rights
19:04:42 then in the future individual projects could update their acls as I describe there to get a more restricted thing
19:04:51 comments and feedback welcome
19:04:53 belated o/
19:05:10 o/
19:06:10 o/
19:06:12 that seems reasonable....
19:06:17 o/
19:06:32 should we have the testers group owned by a group that is empowered to help manage third-party ci systems?
19:06:40 yes
19:06:49 testers _groups_ i guess
19:06:55 i think that group already exists
19:07:18 oh, they are owned by $project-ptl in the proposal
19:07:19 #link https://review.openstack.org/#/admin/groups/440,members
19:07:22 but project-ptl doesn't exist
19:07:26 jeblair: ya my initial proposal figured project-ptl could do that
19:07:30 er release
19:07:38 but that can be changed
19:07:43 oh, right, for the "no longer managed by infra" group membership
19:08:12 which was the down-the-road part of your proposal
19:08:14 I don't think those groups should be self owned
19:08:25 agreed
19:08:30 yeah, having them owned by the ptl or core makes sense...
19:08:34 they should explicitly _not_ be self-owned
19:08:52 but then there's no way for third-party coordinators to assist in managing them
19:08:58 maybe that's a feature :)
19:09:01 jeblair: I think thats a good thing
19:09:06 imo this is a project thing
19:09:08 yeah, i think that's the end state
19:09:18 pushing as much of the work into the projects as possible is a good thing
19:09:25 i expect that there will be less consistency around naming, behavior, etc.
19:09:38 infra gerrit admins can still step in to "fix" things which the ptl (or delegates) cannot
19:09:50 true
19:10:35 jeblair: oh also
19:10:57 I haven't tested this but it would be neat if we could give ownership of the larger third party groups to say the third party manager group
19:11:09 then they can have at least some large level hammer tools
19:11:17 that's doable
19:11:24 but what could they do with that?
19:11:31 manage things
19:11:34 jeblair: remove $project-testers from groups
19:11:35 only remove an entire $project-testers group, right?
19:11:38 jeblair: ya
19:11:49 what was the third-party manager group?
19:11:51 so its not super useful but potentially a thing
19:11:52 that's a really big hammer -- "disable third-party testing for an entire project"
19:12:06 "disable _all_ third-party testing for an entire project"
19:12:23 though perhaps the reverse is useful
19:12:30 "enable any third-party testing for an entire project" :)
19:12:44 that alone is probably reason enough to do it
19:13:02 krtaylor: right now it's anteaya, and she can make accounts voting or not by changing group membership
19:13:11 right, ok
19:13:25 krtaylor: though that has been of limited use because most issues have warranted entirely disabling the account anyway
19:13:36 i'm also unsure the non-voting group is relevant in the new order?
19:14:23 fungi: maybe less so, i guess? if there is less that is special about a ci account, then there's probably less reason to keep a list of them
19:14:30 yes, we had proposed that we "help" that process by some third-party peer team reviewing whether a system was ready to be re-enabled
19:14:43 fungi: oh ya it would be
19:14:49 er would be redundant
19:15:04 if the workflow is that a third-party ci operator creates a gerrit account for their system and then at some later date requests addition into the group enabling verify voting, there will be more than a few out there which don't identify themselves initially (or ever if they don't get around to requesting voting status)
19:15:33 and so we won't really be tracking them, in that case, i don't think
19:16:00 krtaylor: so does this seem workable?
19:16:00 ya I think we can remove the nonvoting group
19:16:22 there are cases where a system doesnt need voting status
19:16:34 but I think that might be ok
19:16:37 krtaylor: right but you get that for free
19:16:41 right
19:17:14 although, it can still need to be disabled due to spam, but that is not limited
19:17:17 oh, i guess the one place where the non-voting group currently has something conveyed by an acl is voting on the sandbox repo
19:17:58 should we have the third-party managers group manage the sandbox-testers group?
19:18:00 krtaylor: infra gerrit admins still have to disable/enable accounts if they're commenting on projects which want them to stop
19:18:01 we could have an acl that allows registered users to +/-1 on sandbox
19:18:11 or jeblair's suggestion
19:18:14 i was thinking the latter, yeah
19:18:19 all registered users
19:18:26 fungi, right, just thinking it through
19:18:40 i'm okay with that -- though requesting voting on sandbox gets you interaction with the third party managers group
19:18:47 though maybe they don't want that interaction :)
19:18:56 hehheh
19:19:29 the other question i have, is whether this is something we can press forward with soon and then redirect all the pending new account requests to this new workflow?
19:19:37 so let me rewrite the proposals to remove nonvoting and suggest registered users get +/- verified on sandbox
19:19:47 +1 to let all users vote on sandbox
19:19:52 then we can argue over those refinements (I don't want to spend too much time on this in this meeting)
19:19:59 fungi: I think we can
19:20:00 sounds good
19:20:16 since gerrit group creation can be scripted and old accounts can be given time to move
19:20:58 maybe i should send a message to the third-party-announce list saying that we've been overwhelmed with requests but have a plan to make it self-service in the very near future
19:21:01 i think if we do want to do all registered users, we should add a new repo
19:21:01 i think sandbox should look like what most people see for code review training purposes
19:21:01 so shall we ask clarkb to take the etherpad and turn it into a patch to the ci.o.o third party docs?
19:21:04 oh wow lag
19:21:12 jeblair: ya I can do that
19:21:32 jeblair: ++ to that and sandbox
19:21:38 cool
19:21:39 we can make a ci-sandbox repo
19:21:46 where the rules are different
19:21:48 sounds good
19:21:55 I can start putting that together
19:21:57 wfm
19:22:10 sounds good
19:22:14 #action clarkb propose self-service third-party accounts to ci docs
19:22:19 fungi draft initial third-party liaisons description, to later be amended as needed before publication
19:22:29 so a related action from last week... how does this relate?
19:22:34 I'll bring it up in the next meeting also
19:22:40 yeah, didn't get around to it. though now maybe those actually become the ptl delegates who manage voting control
19:22:46 third-party meeing that is
19:22:50 meeting
19:22:59 krtaylor: cool, thanks
19:23:44 fungi: yeah... though if we want the ptl to delegate, perhaps we really should create a new group
19:24:15 thats simple to do as well
19:24:21 jeblair: i expect that in at least nova and neutron (possibly also cinder) cases it will not be the ptl doing it directly
19:24:40 fungi: it does seem like from a personnel pov, the third-party liaison should be the one managing the group
19:24:40 maybe start with release and if mikal yells we can make a new group
19:24:46 we could refer to them as liaisons but i think they wouldn't really be liaisoning with infra much if any in that case
19:25:19 though i could certainly be wrong
19:25:22 clarkb: that sounds good; easy enough to change, especially since there's no complicated acls (yet)
19:25:28 yup
19:25:57 fungi, clarkb: so this topic is still relevant in that we do need to communicate with the projects about this new responsibility
19:26:14 but like the cross-projects liaisons it defaults to the ptl i think
19:26:29 jeblair: yep. still a valid action item, and i'll work on messaging for it
19:26:38 so maybe when we have this ready to go, we just need to make a nice message to the list and point to the ci docs
19:26:40 ya an email to the dev list would be good too
19:26:44 ++
19:26:55 we might or might not want to actually use the word liaison but regardless, still need to have something written
19:27:30 #action fungi draft messaging to communicate the new third-party account process
19:27:38 fungi nibalizer get pip and github modules split out
19:27:55 still on my plate to work with nibalizer on
19:28:03 #action fungi nibalizer get pip and github modules split out
19:28:34 fungi: is this extra complicated?
19:29:14 jeblair: the pip one needs a push --force to update it since the project creation got approved while the upstream split wasn't in sync with system-config
19:29:26 fungi: when you're ready
19:29:29 ok
19:29:38 fungi: i dont think thats actually true
19:29:43 nibalizer: after this meeting is good if you're around
19:29:48 (pip being out of date)
19:29:51 nibalizer: oh, is it possibly still in shape?
19:29:52 but we can talk after meeting
19:29:58 sounds good
19:30:00 fungi push puppet-apache 0.0.4 into puppet-httpd master
19:30:06 the two commits its 'missing' reverted each other so....
19:30:15 nibalizer: nice!
19:30:26 jeblair: that one's done, so i guess i didn't completely sit on my thumbs since last meeting
19:30:43 fungi refresh storyboard imports and lock lp bugs
19:30:45 krotscheck announce infra projects migration to storyboard
19:30:51 i think those all happened, ya?
19:30:53 jeblair: also done and done
19:30:54 I saw that announcement
19:30:57 yup
19:31:09 #topic Priority Specs
19:31:23 #link story types spec https://review.openstack.org/#/c/129267/
19:31:59 i'd like to bring this storyboard spec to everyone's attention as it's a pretty significant aspect of storyboard that i think a lot of us have an interest in
19:32:08 so if you have a moment, please review :)
19:32:19 thanks!
19:32:25 #topic Priority Efforts
19:32:25 in the list
19:32:48 #topic Priority Efforts (Swift logs)
19:33:18 i think this is still pending jhesketh doing some performance analysis, unless he has updates i haven't seen?
19:33:31 he is afk this week back next iirc
19:33:38 I believe our cinder issue last week elevated the importance of this one in my eyes
19:33:39 k, let's skip then
19:33:39 he posted a blog with numbers prior to summit though
19:33:46 mordred: agreed
19:33:51 but ya much easier to talk about when he is back
19:33:55 yup
19:34:00 yeah, i think at the summit he said that he's comfortable moving on
19:34:09 so i _think_ the status may be "start doing it" :)
19:34:13 sweet
19:34:19 pull trigger
19:34:24 we do need to get the jenkins plugin to upload post job everywhere
19:34:25 are we thinking we might write a thing to upload our old logs to swift?
19:34:31 we are still blocking on that iirc
19:34:40 mordred: no, I think we just let them die on the vine
19:34:41 clarkb: oh, how so?
19:34:45 clarkb: ok
19:34:50 clarkb: is there a change to add it?
19:34:57 jeblair: the way its done today is a build script. if job fails no log uploads
19:35:02 mordred: clarkb: yeah, 4-6 months and they'll be gone anyway
19:35:04 jeblair: so you need a post build action to run the script
19:35:12 clarkb: yeah, there was a plugin that looked like what we wanted
19:35:14 yes iirc there is a change to add it
19:35:23 so we need that in then restart all the jenkinses
19:35:29 then update jobs to use it and retest
19:35:39 we should link to that in the agenda so we know it's a priority review :)
19:35:42 can you find a link for that change and i'll have a look this afternoon
19:35:45 let me find it
19:36:04 that would be a nice present for jhesketh's return
19:36:11 #link https://review.openstack.org/#/c/133179/
19:36:11 indeedy
19:36:19 thanks clarkb
19:36:47 cool
19:36:56 #topic Priority Efforts (Puppet module split)
19:37:19 asselin: do you have your spec change handy?
19:37:29 I have been working with asselin this week to create the scripts for the puppet module split. I have a script running live on github right now.
19:37:53 nice
19:37:55 asselin: had to step out, I can field any questions
19:38:24 mmedvede, ^^^
19:38:30 i was just looking for the review we were talking about earlier
19:38:41 oh this one
19:38:42 spec is here: http://specs.openstack.org/openstack-infra/infra-specs/specs/puppet-modules.html
19:38:43 #link https://review.openstack.org/#/c/135452/
19:39:04 nibalizer: has a change that I think we should get in first
19:39:23 because it reduces the number of changes to system-config for each split
19:39:26 so that looks good except i think we should remove the word "READ:" because i think it's silly to tell people to read something
19:39:33 clarkb: link?
19:39:40 #link https://review.openstack.org/#/c/134723/
19:40:09 (where first is relative to other system-config changes, the spec is independent)
19:40:58 okay, so we should review both of those
19:41:16 i agreed to merge some changes this afternoon, should i postpone that until after 134723 lands then?
19:41:24 jeblair: imo yes
19:41:28 ok
19:41:39 we have identified a cause of headache in the process and have a fix for that headache we should do that first
19:41:44 i'll make a point of digging into 134723 first thing after the meeting and then we can rebase the pending changes
19:42:24 sweston: thanks for writing that script
19:42:39 #link https://github.com/Triniplex/puppet-module-split/blob/master/module_split.sh
19:42:44 jeblair: you bet
19:42:48 should we add that to the system-config repo in tools/?
19:43:09 I think it would be a useful addition
19:43:26 wfm
19:43:30 sweston: if you want to relicense to apache2 and propose it, i think it would be welcome :)
19:43:39 i would certainly welcome it
19:43:54 jeblair: sure, I only used that license because it's the one we use for specs ;-)
19:44:24 I think it's a super useful script!
19:44:31 fungi: jeblair I will propose it to the repo
19:44:35 mordred: thanks
19:44:36 heh, yeah, i thought about specs, but i think it's a bit too long for direct inclusion there, may as well put it in with the code :)
19:44:39 thanks sweston!
19:45:01 #topic Priority Efforts (Storyboard migration)
19:45:14 fungi: welcome
19:45:16 Seems to be done.
19:45:19 woot
19:45:20 yep!
19:45:24 #link http://lists.openstack.org/pipermail/openstack-dev/2014-November/051117.html
19:45:37 anything we should talk about? feedback so far?
19:45:59 I'd check but my internet sucks - what came of the openstack-ci/openstack-infra discussion?
19:46:07 if you lose network connectivity storyboard doesn't really let you know about that
19:46:16 lol
19:46:16 which may be important for people like mordred
19:46:16 i (or someone) needs to remember to periodically refresh system-config until we can close down openstack-ci bugs
19:46:22 thats the only thing I have really noticed
19:46:43 fungi: are people filing bugs there?
19:46:51 i mean, non e-r bugs
19:46:59 yeah, some
19:47:03 jeblair: i think one came in just before the meeting
19:47:06 sudo storyboard-migrate --from-project openstack-ci --to-project openstack-infra/system-config --auto-increment 2000000
19:47:07 (I still get email notifications that I pay attention to :))
19:47:09 for reference
19:47:29 It’s a bit hard to see new tickets though.
19:47:30 also organization of bugs is a little bit hard
19:47:33 ya that
19:47:59 fungi, pleia2: should we mark all non e-r bugs in lp as invalid?
19:48:01 krotscheck: do we need to increase --auto-increment each time we run it to be higher than the most recently non-imported new bug number?
19:48:12 (or will that mess up the sync?)
19:48:21 fungi has been marking Won't Fix I think
19:48:23 jeblair: they will get imported
19:48:33 pleia2: i haven't been
19:48:45 fungi: oh, maybe just some new ones I saw?
19:48:49 i don't think it will mess up the sync per se
19:49:14 we'll just wind up importing an invalid/won't fix/whatever copy of a bug we asked someone to refile in storyboard
19:50:15 well, we're probably just going to have to live with it for a while no matter what we do
19:50:43 agreed
19:50:45 could we set up a new lp tracker for e-r
19:50:51 then shutdown openstack-ci?
19:50:54 is it useful to keep this topic on the agenda, or should we drop it?
19:51:03 clarkb: problem is, the best name for that is 'openstack-ci'.
19:51:23 mtreinish and jogo indicated a willingness to help switch e-r over to a different lp project for bug tracking
19:51:35 clarkb: I thought that was the original plan and that we'd liked the name 'openstack-gate'
19:51:38 if we decide that's the appropriate route
19:51:41 or add e-r support for storyboard
19:51:43 which is why this: https://launchpad.net/openstack-gate was made
19:52:01 fungi: it actually doesn't make a difference which project the bugs against
19:52:05 when we were sitting in the room in paris hacking on this
19:52:06 it looks them up by number
19:52:10 I think we keep the topic open if we intend on doing one of these things otherwise we shouldn't need it
19:52:39 fungi: No need to update the autoincrement after the first time
19:52:54 mordred: oh i must have been running around at the time :)
19:53:02 i think it's probably safe to drop this topic from the priority efforts agenda and just raise it as a normal meeting topic when there are updates we need to discuss
19:53:19 anyone opposed to moving e-r to openstack-gate and closing openstack-ci?
19:53:25 krotscheck: okay, does continuing to set it break anything?
19:53:33 fungi: Nope.
19:53:34 jeblair: i'm in favor
19:53:41 krotscheck: thanks
19:53:59 i assume clarkb is as well so
19:53:59 I think we should do that
19:54:02 yup
19:54:04 jeblair: i think it would help cut down on infra-proper bugs getting filed on lp now
19:54:06 jeblair: go ahead
19:54:15 #agreed close openstack-ci and move e-r to use openstack-gate
19:54:22 who wants to own that? :)
19:54:46 i'll volunteer since i already handled the others
19:55:08 #action fungi close openstack-ci and move e-r to use openstack-gate
19:55:13 #action fungi work with jogo/mtreinish on a plan to switch elastic-recheck to a new lp project
19:55:16 #undo
19:55:33 #topic Priority Efforts (Nodepool DIB)
19:55:44 we opened a can of worms with this one
19:55:45 so this moved forward then rolled back
19:55:47 ya
19:55:49 ya
19:56:07 on my todo list for today is to attack booting dib images in rackspace
19:56:32 though first we need to get a handle on whether we can run nodepool on trusty safely
19:56:38 i think our current plan for the centos7 issue is: fix async io in gear; move zuul to it; try trusty nodepool with that
19:56:39 so there are several concurrent things happening here
19:56:46 and separately is the thing mordred is working on
19:56:48 yah
19:57:06 there is the thing jeblair describes. the thing mordred describes then a set of changes yolanda and I have written to make nodepool generally better with dib
19:57:08 I mean, dib or no dib, figuring out nodepool on trusty is important
19:57:27 #link https://review.openstack.org/#/c/130878/
19:57:36 #link https://review.openstack.org/#/c/126747/
19:57:44 #link https://review.openstack.org/#/c/137110/2
19:57:49 especially since we have a bug and i don't think it's trusty's fault, that's just where we happen to have observed it emerge
19:58:03 yup
19:58:22 i'm planning on digging into the gear io issue this afternoon
19:58:31 did anyone see the thing I wrote in -infra about my intended approach for network on rax?
19:58:46 mordred: i did not; last i saw was you were looking into options
19:58:48 essentially - put in an init script that tries to mount a config drive partition if it's there, and if it is able to do that and there is network information in the config drive instance, replace the DHCP config with static config based on what's pulled from config drive. if there isn't a config drive, then fail with no error
19:59:19 mordred: that won't work
19:59:20 so that way we can control whether we need for network to be dealt with by enabling config drive on instance boot
19:59:22 clarkb: why not?
19:59:24 because the cloud drive doesn't have the correct data
19:59:29 per johnthetubaguy
19:59:36 I don't believe that's what he said
19:59:37 also there is no dhcp in rax fwiw
19:59:45 mordred: does hp support config drive, or are they dhcp only?
19:59:52 right, we'd need to inspect the configdrive content to see if the network details are included and fall back on something else if not
19:59:52 I believe what he said was that they did not have a mechanism to use config-drive to set the network info
20:00:03 but I'm going to explore this empirically today
20:00:08 and not depend on something I've heard
20:00:16 mordred: ++ thanks!
20:00:19 i looked at what's on rax's configdrive currently and they don't include the network configuration, but they do still have configdrive working
20:00:23 mordred: I guess I don't understand your fallback case
20:00:25 jeblair: hp supports config drive, but I do not believe we need to use it there
20:00:27 we know it will fail
20:00:30 so we need to handle that
20:00:37 clarkb: dhcp is for hp
20:00:44 clarkb: because we want the same images
20:00:47 oh right
20:00:53 clarkb: if we don't get the info from the config drive, we delete the instance
20:01:08 because it'll fail the "ssh has come up" test after X time
20:01:20 but first, I'm going to make sure that this is even possible at all
20:01:20 mordred: right and thats a bad situation if it fails 100% of the time
20:01:22 then I'll poke at automation
20:01:25 sure we should test it
20:01:30 thanks everyone!
20:01:32 clarkb: I'm pretty sure I'm not going to try to do this without having tested it
20:01:32 Quick reminder: the Infra-manual documentation Sprint is coming up on Monday-Tuesday: http://lists.openstack.org/pipermail/openstack-infra/2014-November/002088.html (I'll email out a reminder too)
20:01:47 pleia2: thanks!
20:01:51 woot!
20:01:53 #endmeeting