19:03:22 #startmeeting infra
19:03:23 Meeting started Tue Jan 12 19:03:22 2016 UTC and is due to finish in 60 minutes. The chair is fungi. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:03:24 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:03:27 The meeting name has been set to 'infra'
19:03:31 o/
19:03:32 o/
19:03:51 #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:03:53 o/
19:04:02 #topic Announcements
19:04:14 o/
19:04:34 i didn't have any specific announcements at this point, anything i need to mention for posterity which we won't cover as part of the agenda?
19:04:46 o/
19:05:08 #topic Actions from last meeting
19:05:11 #link http://eavesdrop.openstack.org/meetings/infra/2016/infra.2016-01-05-19.05.html
19:05:13 there were none, completed successfully
19:05:26 #topic Specs approval
19:05:37 o/
19:06:07 #info "Consolidation of docs jobs" specification voting is deferred and placed back into an in-progress state due to requested revisions
19:06:16 #link https://review.openstack.org/246550
19:06:18 fungi, yep. I want to enhance it a bit more but am currently concentrating on the translation work since that's more urgent.
19:06:30 there were no approvals or new proposed specs for voting this week
19:06:50 #topic Priority Efforts: Zuul v3
19:07:01 jhesketh timrc pabelanger anteaya Clint mordred GheRivero zaro yolanda rcarrillocruz SpamapS fungi fbo cschwede greghaynes nibalizer fdegir: ping!
19:07:09 * Clint twitches.
19:07:13 you all put your name on an etherpad in vancouver to help with zuulv3
19:07:16 pong!
19:07:19 I did
19:07:21 yes
19:07:22 yup
19:07:23 yep
19:07:25 yep
19:07:25 I have reviewed some patches
19:07:27 and, as we all know, you can't edit an etherpad after the fact.
19:07:33 true
19:07:35 totally impossible
19:07:35 that is the law
19:07:37 so you're stuck with it.
19:08:04 is there a list of next steps we can get more specific about volunteering to tackle now?
19:08:11 anyway... i have some really bad code that very roughly sketches out the bulk of the major changes in zuulv3
19:08:28 and in the spec, there are a few major work items that i think can proceed in parallel: http://specs.openstack.org/openstack-infra/infra-specs/specs/zuulv3.html#work-items
19:08:36 pong - yes
19:09:12 i think we can take my very bad code and land it on the v3 branch
19:09:24 and then start to make it better with some parallel efforts
19:09:27 probably not too many yet
19:09:40 jeblair: the gerrit topic you are using is feature/zuulv3 (v3) yes?
19:10:04 it's actually a branch
19:10:11 but i think that a) nodepool, b) job definition and playbook loading/organization, c) ansible launching are all major areas that can happen without too much toe stepping
19:10:16 jeblair: indeed!
19:10:27 sorry, that is the branch, yes: https://review.openstack.org/#/q/status:open+branch:feature/zuulv3+topic:v3
19:10:33 yes, I've been landing some of jeblair's code.. There are some very large patchsets so I haven't finished reviewing them all, but the strategy was to land them and then create followups where necessary
19:10:42 re nodepool, current plan is to get image builder changes in tomorrow and have it running for thursday image builds
19:10:51 ohai
19:11:00 which is sort of ancillary to zuulv3-related work
19:11:03 clarkb: nice, already to start using them
19:11:04 cool, so we can probably branch nodepool next week for work on (a)
19:11:19 Ya, hopefully
19:11:24 the udea was to branch nodepool after the image worker stuff is in then?
19:11:27 third time is the charm
19:11:27 er, idea
19:11:33 fungi: yes
19:11:37 righteous
19:11:51 does anyone want to start hacking on any of those things? i'm thinking i'd like to take (b).. anyone want (a) or (c)?
19:12:19 I'm happy to take a look at c)
19:12:23 or did someone have their heart set on (b)? :)
19:12:42 (also happy for (b) if there are no other takers)
19:13:01 I will continue to open patches and stare at them saying i did so and asking questions if I can
19:13:11 we want to branch nodepool before we land the shade patches?
19:13:45 maybe we should land shade?
19:13:51 (err, (a) I meant)
19:14:09 mordred: how ready is shade?
19:14:21 so we're mostly in need of a volunteer for (a) so that jhesketh doesn't need to work on both (a) and (c)
19:14:35 jeblair: shade patch I've been ignoring until the image worker patches land
19:14:38 yeah, the nodepool work is fairly self-contained
19:14:38 specifically "Modify nodepool to support new allocation and distribution"
19:14:44 jeblair: largely just needs to be rebased once the codebase is not churning
19:15:01 sorry, i cannot commit to that in the next weeks, but i'll be happy to review, and help in the future
19:15:09 mordred: don't we also need to get to the bottom of the larger api call counts?
19:15:15 mordred: or was that figured out?
19:15:17 jeblair: how about I take a quick stab at the rebase after the image builders land, and if it's big, we can shelve it until the v3 timeframe
19:15:24 clarkb: figured out
19:15:30 yay
19:15:38 mordred: sounds like a plan.
19:15:45 \o/
19:16:02 so, no volunteers yet to actually do the nodepool v3 work?
19:16:13 oh, I'll definitely work on it
19:16:18 my pipeline has backed up a lot since the summit, so while i'm willing to take a stab at item (a) i'm hoping someone else steps up
19:16:44 * mordred actively wants to work on it
19:16:53 all yours, mordred!
19:16:58 jeblair: i want to work on a) but I'm worried about overcommitting
19:17:07 cool, i'll propose a change to the spec to add these
19:17:20 nibalizer: overcommitting is a time-honored infra tradition
19:17:45 nibalizer: i'm hoping that as these larger things take shape we can start having more people chip in on less all-encompassing tasks
19:18:42 i expect some somewhat tightly scoped subtasks to fall out of these larger tasks as well
19:18:50 jhesketh: i've learned some things this very day about ansible, so we'll chat later
19:19:06 jeblair: sounds good
19:19:23 fungi: i think that's good for now
19:19:25 (I'd also like to chat about a few zuul things after this)
19:19:41 thanks jeblair, mordred, jhesketh!
19:19:55 #topic Turning off HPCloud node provider(s) gracefully (clarkb)
19:20:04 this was held over from last week's agenda
19:20:22 we chatted a little after the meeting about plans, but it would help to echo the nuggets of that in the meeting
19:20:42 sure, basic plan was to not ease off of it and instead just turn it off at some point before the 31st
19:20:59 i believe we decided some time shortly before the 31st we should gracefully shift from current quotas to 0 in nodepool
19:21:01 that way we can control when it happens and don't run into funniness with nodepool operations (say 500 tests all turning off at the same time like aderaan)
19:21:24 clarkb: a great disturbance indeed
19:21:32 (alderaan)
19:21:35 jeblair beat me to the quote
19:21:48 the changes to do this have been written; there are two portions: the first is to set our max-servers to -1 so that we stop running jobs, and the second removes the configuration from nodepool.yaml
19:21:51 pleia2: did your IRC client just show you a picture of alderaan?
19:21:51 makes sense, and I spoke with some HPE folks last week and they were asking about our plans
19:21:58 +1 pleia2
19:22:01 let me get change links
19:22:17 #link https://review.openstack.org/#/c/264371/
19:22:19 that one and its child
19:22:24 it might be good to push this back as far as we feel comfortable though...
19:22:28 it would help to get confirmation from hpe that they don't intend to pull that rug out from under us before the 31st
19:22:39 fungi: no, the 31st it is
19:22:45 we will be fine until 31
19:22:47 okay, awesome
19:23:02 so that a) we have more opportunity for other quota to magically show up, and b) do we have some image types that are hpcloud only?
19:23:10 the 31st is a Sunday
19:23:11 so given that's a sunday we likely want to do it on a day people are more likely around to pull the trigger
19:23:12 jeblair: ++
19:23:17 jeblair: there is one that is only in hpcloud
19:23:28 and I actually need to finish detangling that (or someone does)
19:23:32 it is devstack-centos7 iirc
19:23:51 yup
19:23:54 ianw: were there issues getting that booting in other providers?
19:24:03 I'm at the nova mid-cycle the last week of Jan, so won't be around to answer the phone
19:24:14 so I may need to split the 3 changes into 2 changes with devstack-centos7 handled first, then max-servers = -1, then remove hpcloud from config
19:24:22 er 2 into 3
19:24:24 clarkb: ++
19:24:24 I can maths
19:24:39 fungi: i haven't tried it on the other providers yet. it's on my new-year todo list
19:24:40 clarkb and i aren't really around that week leading up to the 31st either
19:24:57 right, traveling sunday through thurs
19:24:57 fewer phone answers
19:25:02 ianw: are there any voting jobs depending on it? i assume no?
19:25:02 given past experience, my inclination is that it will not work
19:25:24 should not be voting jobs
19:25:26 yeah, I'm leaving for LCA on the 27th
19:25:29 ianw: rax is the only provider with a funny lack of dhcp; we may be able to get it working on the other providers with dib
19:25:41 clarkb: yep, that's the plan
19:25:42 i won't be home until mid-day on friday the 29th but could approve changes that afternoon
19:25:46 ianw: kk
19:26:12 there is also a non-zero possibility that I may be buying a house which makes everything extra crazy
19:26:17 so who _is_ available the 28th/29th to answer the channel and calm and disgruntled masses?
19:26:25 s/and/any
19:26:28 I should be around until the 31st then off to LCA (but still available)
19:26:35 thanks
19:26:36 i mean, i _could_ do it from the plane or a hotel before the afternoon of the 29th but would rather not commit to that
19:26:37 one
19:26:44 fungi: yup
19:26:46 * krotscheck is probably not available
19:26:47 clearly a different timezone to the masses though
19:26:57 I might be around but can't commit yet either ;(
19:27:04 jhesketh: we'll take what we can get
19:27:20 or alternatively we move it to a few days earlier
19:27:26 jhesketh: ++
19:27:27 * yolanda will not be available, on travel
19:27:36 i expect to be around then
19:27:36 * craige will be already at LCA but you can ping me to help, jhesketh
19:27:53 craige: you'd need to be here to help
19:27:57 jeblair: yay two
19:28:29 so as for what to expect, we're going to roughly halve our current capacity. if that coincides with some major problems in another provider (particularly rackspace) then that could be pretty terrible. otherwise it's likely we'll just be backed up a bit more than usual at peak times
19:29:43 right we also need to work on getting unittests on the new providers
19:30:10 which is the bindep and pre test setup related work
19:31:04 yeah, those macros should in theory be ready to get added to jobs/job-templates now, but more testing would help
19:31:58 the trick is that we need to reorder builders a little bit because bindep's ability to read a project-supplied list of package names means that repo cloning needs to happen prior to revoking sudo
19:32:23 though we could flip to bindep with the global list to start
19:32:26 right now our jobs which revoke sudo do so before cloning the repo
19:32:35 then reorder, either way we have options and much of the work is done we just need to get it in place
19:32:59 yeah, it should just work and ignore repo-supplied package lists in that case
19:33:09 rather, ignore the fact that it can't find any
19:33:22 I am not sure I have time to take that on this week, but can attempt next week
19:33:27 i don't think the reorder should be problematic
19:33:43 the database setup and bindep macros should just be a no-op on bare-.* workers so could be added now
19:34:28 does anybody have time to hack on that a little between now and the end of the month?
19:34:53 i can help but am wary of promising to have available time to do it all before then
19:35:55 i think i can get involved with that
19:36:22 ianw: awesome--it would actually help to have more rh-oriented insight on it too
19:36:35 as mentioned, i'll be keeping on the general rpm distro side of things throughout
19:37:23 i'll get up with you after the meeting on that
19:37:43 so did we decide to merge the hpcloud turn-down changes on friday the 29th?
19:38:28 * craige thought we did.
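[Aside: the HPCloud wind-down discussed above is a two-step change to nodepool's configuration; https://review.openstack.org/#/c/264371/ and its child are the actual changes. The sketch below is only an illustration of the shape of step one, assuming the nodepool.yaml layout of the time; the provider name is hypothetical, not the production configuration.]

```yaml
# Step one (sketch): keep the hpcloud provider defined but stop booting new
# nodes there; per the discussion above, dropping max-servers stops new jobs
# from being scheduled onto that provider while existing nodes drain.
providers:
  - name: hpcloud-b1      # illustrative name only
    max-servers: -1       # was a positive quota; -1/0 means launch nothing new
# Step two, in a follow-up change, removes the hpcloud provider entries
# (and any hpcloud-only image references) from nodepool.yaml entirely.
```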
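[Aside: the builder reordering described above (clone the repo before revoking sudo, so the bindep-driven package install can still use sudo and can see a project-supplied package list, falling back to the global default list when a project doesn't ship one) would look roughly like this in a Jenkins Job Builder template. The job name, node label, and macro names here are assumptions written from memory of project-config's conventions at the time, not copied from the real job definitions.]

```yaml
# Illustrative JJB job-template showing the builder ordering discussed above.
- job-template:
    name: 'gate-{name}-python27'   # hypothetical job name
    node: bare-trusty              # hypothetical node label
    builders:
      - gerrit-git-prep            # clone the project first...
      - install-distro-packages    # ...so the bindep step can find the repo's package list (sudo still available)
      - revoke-sudo                # only revoke sudo once distro packages are installed
      - shell: 'tox -e py27'       # stand-in for the real test builder macro
```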
19:38:31 that's the soonest i could do it that week, but i didn't see anybody else volunteer
19:39:03 I won't be here to respond to questions so I don't have an opinion
19:39:45 29th sounds good
19:39:52 #agreed HPCloud nodepool quotas will be set to 0 on some time on Friday, January 29th in preparation for their public cloud sunset on the 31st
19:39:54 gives us a couple of additional days to work through anything funny that may come up
19:40:19 meh, my grammar was terrible on that but not so terrible that i'm going to fix it
19:40:35 okay, so 3 more topics in the next 20 minutes
19:40:51 er, 4
19:40:58 #topic puppetlabs-apache migration (pabelanger)
19:41:08 #link https://review.openstack.org/205596
19:41:17 pabelanger: how's this going?
19:42:59 or does anybody else know why he wanted to discuss it in the meeting?
19:43:26 *crickets*
19:43:31 i guess we can come back to it after the other topics if he returns and there's still time
19:43:32 I do not know
19:43:39 #topic clarifying requirements for moving release tools into project-config repository (dhellmann)
19:43:44 hi!
19:43:50 dhellmann: hi
19:43:50 howdy
19:44:02 so last week during the discussion of release automation it was mentioned that the scripts would need to move as part of that work
19:44:16 I interpreted that as them needing to move into project-config, though that may not be a valid interpretation
19:44:29 so I'm looking for clarification on why and where, to make sure I can line up that work
19:44:55 in fact, now that I think harder, I think fungi said the jenkins slave scripts directory?
19:44:57 dhellmann: sorry if this is obvious, which release tools?
19:45:06 dhellmann: basically any scripts that run on the release.slave or signing.slave hosts will need to be in project-config's jenkins/scripts directory
19:45:22 fungi: ah thanks now I understand
19:45:23 anteaya : good question. There are a bunch of tools in openstack-infra/release-tools. Some shell, some python. They are all needed as part of the release tagging and publishing process.
19:45:41 sorry
19:45:46 19:50:16 dhellmann: i think we discussed a while back that it might need to move to project-config jenkins/scripts directory instead
19:45:49 http://eavesdrop.openstack.org/meetings/infra/2016/infra.2016-01-05-19.05.log.html
19:45:52 fungi : ok, so that's where. I want to understand why, because that's going to make maintaining them much more inconvenient for us.
19:45:52 dhellmann: sounds like scripts that run on release or signing slaves are the ones that need to be moved, yes?
19:45:57 fungi: got distracted, will loop back in open topic
19:46:13 not that I'm objecting, just making sure I fully understand
19:46:30 anteaya : that will be enough of them that we might as well move them all.
19:46:35 oh okay
19:46:44 dhellmann: mostly so that the infra-core and project-config-core reviewers have a chance to audit them to make sure they hopefully don't expose the private key material or other credentials we have secured on those hosts
19:47:11 so scripts that don't run on release and signing slaves, in jenkins/scripts as well (this is for the rest of infra)?
19:47:20 fungi : ok. Since they are a mix of python and bash, how do we manage the installation on the nodes? is it ok for them to pip install things, for example?
19:47:40 dhellmann: in a virtual environment that they set up
19:47:48 dhellmann: we do pip install some things from pypi if that's what you're asking
19:47:53 AJaeger : sure, they use a virtualenv now
19:48:04 though actually now that i think about it, no we don't any longer (twine was an exception)
19:48:19 we now install distro packages of twine on release.slave
19:48:23 fungi : yeah, the scripts look for a virtualenv and create it if they need to, but it's not clear if that goes against security policies to do that at runtime vs. image build time or something
19:48:49 fungi: will have to get back to it next week, need to step away from computer
19:48:59 in particular, the script that figures out which releases were requested is python and needs pyyaml, and the script that updates launchpad with comments after a release needs some launchpad libraries
19:49:07 dhellmann: we try to make sure any dependencies are installed on the machine in its system context rather than using a virtualenv
19:49:13 there are likely to be others, those are the big ones I can think of off the top of my head
19:49:18 ideally from distro packages of those python libraries
19:49:18 both of those are in ubuntu
19:49:35 ok, so I'll get the rest of the list of requirements
19:49:46 and I guess we can change the script to not use a virtualenv and require that those things be installed
19:49:55 however, some of the scripts themselves are installed as console scripts, too
19:50:00 I guess that will need to change
19:50:06 the desire for packages is more about a desire for stability of jobs that run in the post or release pipelines than security, per se
19:50:16 ok, that makes sense, too
19:50:28 yeah, avoids shifting dependencies
19:50:44 and how do I express the dependencies on system packages for those scripts?
19:51:18 dhellmann: puppet
19:51:21 #link http://git.openstack.org/cgit/openstack-infra/system-config/tree/modules/openstack_project/manifests/release_slave.pp
19:51:36 ok
19:51:50 you can see there where it's installing the "twine" package
19:51:56 for example
19:52:10 also python-wheel
19:52:28 other questions about this?
19:52:57 fungi : yes, can ttx and I get +2 approval on that part of the project-config tree?
19:53:30 dhellmann: I promise to give your changes priority in reviewing...
19:53:39 dhellmann: not really without agreeing to review project-config changes in general. are the scripts that complex?
19:53:54 and going to experience frequent changes?
19:54:04 they tend to change often
19:54:14 fungi : it's hard to say. if I have to move the code, I would like to move the review team with it, though.
19:54:20 but maybe that's just a transition phase
19:54:50 I suggest we see how this turns out and then discuss again if needed
19:54:56 fungi : some of the python is a little twisty, but the tagging stuff is pretty simple
19:55:06 dhellmann ttx: if one of you proposes something, as long as the other +1's, AJaeger and I will give it priority
19:55:16 the jenkins slave scripts directory isn't a great place for things that need to be updated frequently -- since they are installed on images, they can take weeks to actually be updated.
19:55:31 yeah, that's not optimal in this case
19:55:31 it seems like it's probably similar to some of the other scripts we have for, e.g., proposing requirements changes or translations updates from a complexity perspective
19:56:24 jeblair: I fear "urgent" release tools updates as we discover problems in a release and need to fix something in the next one coming in 10 min
19:56:27 fungi : there's a lot of launchpad and release notes stuff in here, too
19:56:28 true, not the case at the moment for the release and signing slaves, but in zuul/nodepool v3 those become dynamically-generated workers
19:56:38 in fact, reno is needed and that's probably not available in a system package yet
19:57:14 i'm worried that this topic is going to overrun the remainder of the meeting (and is probably not something we're going to be able to decide in the next 3 minutes)
19:57:22 yeah, I agree
19:57:29 yeah we should move that off meeting
19:58:10 yolanda: zaro: are your topics urgent that they get covered in the next minute?
19:58:10 fungi: have the tc meeting next, so I'll come back to -infra tomorrow and ping you to continue?
19:58:27 fungi, not from my side, i'm progressing on several areas on infra cloud
19:58:52 dhellmann: yeah, it's likely a large-ish change both to the complexity of what we expect to run in release jobs and to the teams expected to review the tooling, so tomorrow would be good to continue
19:58:53 fungi: no
19:59:25 dhellmann: and we should try to get input from more project-config-core and infra-root people
19:59:51 fungi : sounds good
19:59:54 okay, i'm going to defer pabelanger, yolanda and zaro's topics to next week and skip open discussion
19:59:56 thanks everyone!
19:59:59 #endmeeting