19:03:46 #startmeeting infra
19:03:46 Meeting started Tue Jun 28 19:03:46 2016 UTC and is due to finish in 60 minutes. The chair is fungi. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:03:47 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:03:49 The meeting name has been set to 'infra'
19:03:53 #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:04:00 #topic Announcements
19:04:12 #info Gerrit downtime on 1 July 2016 at 20:00 UTC
19:04:19 #link http://lists.openstack.org/pipermail/openstack-infra/2016-June/004463.html
19:04:28 #info Reminder: late-cycle joint Infra/QA get together to be held September 19-21 (CW38) at SAP offices in Walldorf, DE
19:04:37 #link https://wiki.openstack.org/wiki/Sprints/QAInfraNewtonSprint
19:04:47 #info Two new infra-core reviewers/infra-root sysadmins: ianw and rcarrillocruz
19:04:50 a hearty welcome to both of you! feel free to go ahead and propose changes to add yourselves to the relevant places in the system-config repo and hit me up for appropriate group additions after the meeting
19:04:57 \o/
19:05:16 \o/ thx folks!
19:05:21 ianw: rcarrillocruz: congrats and welcome to both!
19:05:22 ianw and rcarrillocruz have been helping us out for quite a while, and i'm looking forward to what they'll be able to do next
19:05:30 congrats!
19:05:34 welcome, ianw and rcarrillocruz !
19:05:52 i'll get an announcement out to the mailing lists later today as well
19:06:02 looking forward to helping out more
19:06:12 ++
19:06:38 congrats!
19:06:50 thank you both for volunteering for these additional responsibilities, and demonstrating your ability to get things done without them up to now
19:06:57 yay!
19:07:04 welcome!
19:07:24 #topic Actions from last meeting
19:07:30 #link http://eavesdrop.openstack.org/meetings/infra/2016/infra.2016-06-21-19.03.html
19:07:30 o/ btw
19:07:38 pleia2 send maintenance notification for static/zuul server upgrades 20:00 utc friday july 1
19:07:43 that's done (linked during our announcements a few minutes ago). thanks pleia2!
19:07:59 #topic Specs approval
19:08:03 rcarrillocruz, congrats!
19:08:06 when it rains, it pours
19:08:12 #info APPROVED Priority specs cleanup/update for Newton cycle
19:08:18 #link http://specs.openstack.org/openstack-infra/infra-specs/
19:08:28 #topic Specs approval: PROPOSED Update Artifact Signing details (fungi)
19:08:35 #link https://review.openstack.org/332968
19:08:41 this is mostly minor updates to account for the fact that we're no longer using jenkins, but doesn't really impact the important details of the spec itself
19:08:53 #info Council voting is open on "Update Artifact Signing details" until 19:00 UTC, Thursday, June 30
19:09:33 mostly just wanted to get it through a formal vote since it includes some adjustments like a new job node naming scheme we discussed in #openstack-infra late last week
19:09:56 (using "ci" instead of "slave" now that we no longer have jenkins slaves for our job nodes)
19:10:08 #topic Specs approval: PROPOSED Finglonger (nibalizer)
19:10:13 #link https://review.openstack.org/310948
19:10:16 this is mostly just rehashed from last week
19:10:18 looks like this has been updated based on feedback from last week
19:10:31 one proposal now, with some feedback from ianw and jhesketh
19:10:57 sounds good. ready to put it up for a vote?
19:11:55 i'm assuming yes
19:11:56 i am
19:11:58 #info Council voting is open on "Finglonger" until 19:00 UTC, Thursday, June 30
19:12:01 woo
19:12:10 #topic Specs approval: PROPOSED Add wiki modernization spec (pleia2, fungi)
19:12:27 i listed pleia2 as she's the author of the spec, though i'm the primary assignee
19:12:31 this is already underway, so just wanted to get it up for a formal vote and recorded
19:12:49 I haven't looked at this one. I'll do that today
19:13:49 cool. you've got a couple days to register concerns
19:13:58 and we can always amend it later as needed too
19:14:08 #info Council voting is open on "Add wiki modernization spec" until 19:00 UTC, Thursday, June 30
19:14:23 oh, forgot to link it
19:14:29 #link https://review.openstack.org/328455
19:14:47 #topic Priority Efforts
19:14:56 i've updated the list in the agenda to reflect the new priorities list on the specs site, but it doesn't look like we have any urgent discussion for them flagged at the moment. anybody have last-minute updates for one of these?
19:14:58 clark has a proposed priority addition too, which i'm reordering to be the next discussion topic
19:15:54 is that my cue?
19:15:56 #topic Move testing of master to Xenial from Trusty (possible priority effort candidate) (clarkb)
19:16:04 _that's_ your cue ;)
19:16:22 standby clarkb, go clarkb
19:16:25 so ya Xenial images are being built, openstack-ansible and openstack puppet are both using it at this point
19:16:45 Ideally we would get everything currently using the default of ubuntu trusty onto xenial this cycle
19:17:07 clarkb, on that note, we should be good to move to xenial with kolla as well
19:17:16 The tricky bit comes from how we want to keep testing liberty and mitaka on trusty but master/newton and future branches on xenial
19:17:35 since we use gearman we need to make sure that on the JJB side we register jobs on both image types
19:17:51 then on the zuul side we need to choose trusty or xenial based on the branch or other info
19:18:18 there was one possibility discussed where we switch to different job names for trusty vs xenial jobs
19:18:28 yes I pushed a change to mock that up
19:18:30 and then filter those jobs by branch instead
19:18:38 let me get a link
19:18:50 #link https://review.openstack.org/335166
19:18:52 both solutions are ugly, so it's a matter of trying to decide which is less ugly
19:18:54 what about documentation jobs? Do we need this split there as well - or can we move them for all branches to xenial?
19:19:03 AJaeger: in the past we have split those iirc
19:19:17 but I would have to read git logs to be sure
19:19:40 yeah, _if_ there are tooling specifics/dependency pins which keep your earlier branches of documentation working, we might rbeak those by stuffing them all onto xenial
19:19:54 s/rbeak/break/
19:20:01 so i think i like the name approach, there *may* be cases we want to cross-test things to trusty as a one-off job [ever]?
19:20:10 335166 basically encodes the image into the job name, then in zuul we can separate them based on job name suffix. This is nice because it is very explicit and should be hard to confuse anyone over where and why their job is running there
19:20:12 I see little risk to doing all branches the same for documentation jobs.
19:20:16 clarkb: ++
19:20:21 the potentially huge issue with that is it doubles our job count
19:20:22 I like 335166
19:20:49 though it doubles our job _name_ count really
19:20:55 the names are more flexible than matching strictly on branches.
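
For illustration only, here is a rough sketch of what encoding the node into the job name could look like on the JJB side; the template name, project name and builder below are hypothetical and are not taken from change 335166:

    # hypothetical JJB fragment: the node label becomes part of the job name
    - job-template:
        name: 'gate-{name}-pep8-{node}'
        node: '{node}'
        builders:
          - shell: 'tox -e pep8'

    # each project instantiates the template once per node label it needs,
    # so gearman ends up with e.g. gate-example-project-pep8-ubuntu-trusty
    # and gate-example-project-pep8-ubuntu-xenial registered separately
    - project:
        name: example-project
        jobs:
          - 'gate-{name}-pep8-{node}':
              node: ubuntu-trusty
          - 'gate-{name}-pep8-{node}':
              node: ubuntu-xenial
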
19:20:58 but indeed tricky to figure out, fungi
19:21:02 it doesn't especially increase our job _registration_ count in gearman
19:21:12 we do have issues with that in general as is
19:21:20 fungi: correct, since we would register different jobs for each image either way
19:21:20 I have some concerns on the regex format for matching, but we can talk about that in the review
19:21:32 we have issues with the job registrations in gearman, but they'd be roughly the same either way
19:21:37 * notmorgan nods.
19:21:57 the alternative which we used with the precise/trusty split is to have a zuul function implicitly determine things based on job name and branch
19:22:14 basically from gearman's standpoint it's a difference between ubuntu-trusty:gate-nova-pep8 and gate-nova-pep8-trusty
19:22:15 the nice thing about what we did with the precise/trusty split is few people needed to think about the differences here and it mostly just worked
19:22:23 but anytime someone started changing jobs it could get a little confusing
19:22:51 the only issue I see with adding the node into the job name is that we'll likely hit the pip 127-character limit for the interpreter line. So we need to keep that in mind
19:22:55 adding jobs for anything that wasn't trusty or precise based got pretty painful
19:23:53 since you needed to update regular expressions in zuul's parameter function script and also be mindful not to accidentally reorder regex matches on jobs in the layout
19:24:23 i think the precise/trusty split was pretty heavy on the implicit magic, it's easy to forget that openstack_functions.py is doing things
19:24:24 fungi: yup, but those are still a tiny minority of our jobs
19:24:40 of our 8k jobs maybe a few hundred are affected by that
19:25:10 so it had the benefit of keeping the jjb configuration simpler but made the zuul configuration very complex, and often resulted in an infra-root logging into zuul and digging in debug logging and gearman registrations to figure out why things weren't firing or were running on the wrong nodes or were coming back not registered
19:25:34 clarkb: we're at 9763 jobs
19:25:47 well, the vast majority of our jobs are configured by a comparatively small number of job-templates
19:26:03 indeed, they are
19:26:20 so if you look at it from the perspective of job configuration surface area rather than expanded job list counts, they represent a more significant proportion
19:26:43 oh it was definitely an issue, I am just wondering if all of a sudden we have 20k jobs whether we made anything better
19:26:52 I really dislike both options available to us but can't think of a third
19:27:21 a quick grep: 140 jobs and 410 job-templates are set up right now
19:27:37 we will be trading confusion around gearman registration for confusion over why no jobs ran on my stable branch or why no jobs ran on my master branch
19:27:40 probably the biggest point using magic parameter functions has going for it is that we've done it once already, so we know where the warts are
19:28:19 devil you know vs the one you don't
19:28:28 also, keep in mind, we don't have NOT_REGISTERED any more. So, it will be slightly harder to debug too
19:28:42 I think either way there will be confusion over particular corner issues
19:28:47 pabelanger: we don't? why not?
19:29:10 aiui the gearman side of zuul didn't change so we should still get those regardless of using zuul launcher or jenkins
19:29:22 clarkb: let me get the review, but mostly because we have JJB tests in project-config to do the validation now
19:29:52 pabelanger: oh so zuul can still do it we just gate on not letting it happen
19:30:12 https://review.openstack.org/#/c/327323/
19:30:15 the thing i like the most about the job name mangling proposal is that it's easier for non-rooters to debug. when you look at a change on your project in an old branch you see what jobs ran and you can tell by job name what platform they needed to run on
19:30:27 anyways I am happy to help implement either choice, I don't think we will be completely happy with whatever we choose so it's a matter of picking one and making it work
19:30:29 clarkb: ya, we just added a config flag to enable / disable it
19:32:00 I should probably send mail to the list to get a larger set of eyes on this, then we can maybe start implementing after feedback next week
19:32:18 so anyway, i'm leaning toward job name mangling this time around instead of a parameter function script doing turing-complete tasks to decide what node type to use
19:32:25 but I think we should consider this priority work in order to get it done this cycle
19:32:54 yeah, i agree it needs to be added to the priority list and an ml discussion could probably help bring it to a wider audience (since we're missing a few people today)
19:33:09 I'm happy to help out where needed too
19:33:26 to make it "officially" a priority it will need a (very small) spec submitted outlining the plan
19:34:07 can do that once we get a bit more feedback, there may be a completely different option here I am not considering
19:34:20 yep, agreed
19:34:41 #action clarkb bring xenial default job transition discussion to the mailing list
19:34:52 anything else on this?
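
To make the zuul side of the name-suffix approach concrete, here is a minimal sketch of how suffixed job names might be pinned to branches in a zuul v2 layout; the regexes are hypothetical and not taken from change 335166:

    # hypothetical zuul layout.yaml fragment: choose the platform by job
    # name suffix instead of in a parameter function script
    jobs:
      # jobs ending in -xenial only run against master (newton and later)
      - name: ^gate-.*-xenial$
        branch: ^master$
      # jobs ending in -trusty keep running for the older stable branches
      - name: ^gate-.*-trusty$
        branch: ^stable/(liberty|mitaka)$
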
19:34:58 not from me
19:35:17 thanks clarkb, for getting this moving
19:35:20 #topic Bikeshed on Gerrit account name for automated release tag pushing (fungi)
19:35:22 #link http://eavesdrop.openstack.org/meetings/releaseteam/2016/releaseteam.2016-06-24-14.00.html last week's release team meeting
19:35:28 in discussion with the release team about their automation goals, we weighed the pros and cons of reusing the existing proposal bot account to push release tags, or setting up a separate gerrit account for a release bot so we could limit blanket tag permissions to that
19:35:41 the account name we came up with is "OpenStack Release Bot " where the e-mail address matches the one i used in our artifact signing key
19:35:49 #link http://p80.pool.sks-keyservers.net/pks/lookup?op=vindex&search=0x64DBB05ACC5E7C28&fingerprint=on&exact=on artifact signing key for newton cycle
19:35:56 the credentials for the account would be puppeted from hiera, as usual, and in this case would only get used by the signing.ci.openstack.org job node
19:36:02 #link https://review.openstack.org/#/q/topic:artifact-signing+status:open changes to add signing.ci.openstack.org
19:36:55 looks great
19:37:01 just wanting to make sure there were no significant objections to that idea before i go create the gerrit account and add puppeting to install its ssh key and whatnot
19:37:37 yah seems fine
19:37:51 the current change stack doesn't add that yet since it was outside the scope of the artifact signing spec
19:38:00 fungi: people will expect there to be bot code somewhere :) but name is fine
19:38:31 yeah, we at one point a while back decided that gerrit accounts used by automation would have "bot" in their names
19:39:10 in retrospect that has caused some people to assume there is an actual robot with articulated servo-operated appendages typing at the other end of the line or something, i guess
19:39:41 The servomechanisms are virtualised.
19:40:10 right. maybe it just causes them to assume there's a single codebase which performs all the functions they see manifested on the surface as whatever the account name is
19:40:33 anyway, i'm not too concerned about that. we're usually pretty quick to explain that buance to people
19:40:40 er. nuance
19:40:52 my fingers are off today. must be a malfunctioning servo
19:41:11 ya I think people expect they can run a single process and magically have signed releases :) but ya not a huge deal
19:41:32 sounds like no objections, so i'll move forward with it and let people scream later if it turns out they didn't like my choice
19:41:44 #topic Open discussion
19:41:57 o/
19:42:00 i know inc0 and mrhillsman had an interest in talking about the new osic resources some more?
19:42:14 yeah, soo
19:42:16 maybe finish fleshing out a plan to bring them online in nodepool?
19:42:25 i put a feeler out on the infra-list thread about the midcycle trying to figure out what the topics are
19:42:35 we want to deploy openstack for infra, but we want to re-deploy it every now and then
19:42:36 nibalizer: saw that--great idea
19:42:41 so anyone with feelings on what topics should get worked on, would love some feedback
19:42:50 (i'm on the fence for attending)
19:42:55 that means, we need a good way to quickly bootstrap whatever you guys need on fresh openstack
19:43:10 #link https://etherpad.openstack.org/p/infra-redeploy-notes
19:43:12 inc0: notably, we are not all guys
19:43:20 * fungi is a meat popsicle
19:43:29 rcarrillocruz: any gut feeling on whether or not we will have servers in a usable state mid September for the midcycle?
19:43:32 I think infracloud and finglonger+automated infra deployments would be good imho
19:43:39 I assume so
19:43:45 my bad guys, gals, better-not-to-tell and popsicles of various substance
19:43:46 rcarrillocruz: when is the move happening?
19:43:55 Last I heard the move of racks will be end of July
19:44:31 rcarrillocruz: and a month or so is enough time for them to figure out networking again and passwords on ilos?
19:44:31 rcarrillocruz, pardon my ignorance, could you please tell what infracloud and finglonger is?
19:44:48 I haven't dug into infracloud in the last 10 days. I was hoping to do so this week
19:45:06 Finglonger you better ask nibalizer to link you
19:45:08 we actually have specs (one approved, priority; one proposed for approval this week) on them both, but quick summaries might be nice
19:45:28 Infracloud is hw donated by HPE to run our own cloud provider
19:45:37 also, we can help you guys write any code or whatever is needed
19:45:43 #link http://specs.openstack.org/openstack-infra/infra-specs/specs/infra-cloud.html Infra-cloud spec
19:45:46 (for reference)
19:45:49 Sorry if I cannot link you, typing from phone
19:45:54 rcarrillocruz, well, we will deploy openstack
19:46:05 so you'll get this part covered
19:46:25 what I'd need is automated "clean openstack to infra-compatible tenants and stuff"
19:46:51 Hard to tell clarkb, past experience is not on our side
19:46:58 rcarrillocruz: indeed
19:47:03 #link http://git.openstack.org/cgit/openstack-infra/puppet-infracloud/ Infra-cloud primary Puppet module
19:47:06 (also for reference)
19:47:08 rcarrillocruz: will you go to the midcycle?
19:47:13 rcarrillocruz: if that's the case should we plan on doing that work during the midcycle? that is what I am trying to figure out
19:47:29 because if infra-cloud fell through, and we just hacked on finishing puppet-apply, that would be worth it for me
19:47:31 rcarrillocruz: set some reasonable expectations so that people don't fly around the world for that then find out it can't happen because raisins
19:47:40 inc0: let's sync up next week on that initial cloud bootstrap
19:48:09 rcarrillocruz, sure, I'm usually on #openstack-infra chat so ping me whenever you have info
19:48:23 I will try to go, depends
19:48:26 nibalizer: I haven't submitted my travel request for midcycle either. waiting to hear what topics we'll be focusing on too
19:48:27 in the meantime, we can definitely still work through manual bits
19:48:32 on budget
19:48:33 rcarrillocruz: thanks for showing up!
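
As a rough illustration of the "hand us account credentials and we point nodepool at it" model discussed below, this is the kind of provider stanza a 2016-era nodepool.yaml used; the provider name, cloud entry and sizes are made up for the example and the exact keys vary between nodepool versions:

    # hypothetical nodepool.yaml fragment for bringing a new cloud online
    providers:
      - name: osic-example          # made-up provider name
        cloud: osic-example         # matches a clouds.yaml entry holding the credentials
        max-servers: 100            # portion of the donated quota nodepool may consume
        images:
          - name: ubuntu-xenial     # image nodepool builds and uploads itself
            min-ram: 8192
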
19:48:45 have a good remainder of your holiday
19:48:46 yeah, we can go manual and just automate it later
19:49:10 but it definitely sounds like your goals and ours (with infra-cloud automated deployment) may be compatible
19:49:16 anyway, could we get your eyes on the etherpad so we'll at least figure out potential issues?
19:49:29 yeah, seems like it
19:49:58 input from clarkb and nibalizer would probably be good since they've done new environment turn-ups in nodepool recently-ish i think
19:50:12 ish
19:50:24 basically the best experience on our end IMO is the public cloud model
19:50:35 yeah, I'd guess
19:50:35 Infrastructure as a service works best if I don't have to care about the infrastructure
19:50:49 we care about infrastructure
19:50:57 the bluebox model of semi managed, semi not, here have half an admin is more work on things that we don't really care about, I don't think
19:51:05 what we need is to set up and re-set-up anything on top of clean openstack
19:51:44 if you can hand us account credentials with networking already configured for sanity (eg provider networks) then we just point nodepool at it and we are done
19:52:13 clarkb, yeah, that won't change, but I think a potential issue was to set up a mirror for images and stuff
19:52:27 we can just try and figure it out as we go...
19:52:38 that's not a big deal, we do that on our end with every other cloud and it takes a few minutes
19:52:57 so I guess "figure it out as we go" it is?
19:53:02 yes
19:53:07 fair enough
19:53:08 Ya, just tested it last week. No problems launching a mirror
19:53:33 cool, we'll get back to you then when we have hardware up and running
19:53:35 pabelanger: which cloud?
19:53:48 tripleo?
19:53:48 clarkb: tripleo-test-cloud-rh2
19:53:51 ya
19:53:55 ah cool
19:54:06 used launch-node.py, worked
19:54:49 cool
19:54:50 yeah, since the mirror servers are basically stateless, cookiecutter web server frontends to afs, there's not much to it i guess
19:55:00 Yup
19:59:13 and we're at time
19:59:24 thanks everyone!
19:59:31 thanks
19:59:31 #endmeeting