19:01:25 #startmeeting infra
19:01:26 Meeting started Tue Apr 10 19:01:25 2018 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:27 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:29 The meeting name has been set to 'infra'
19:01:38 #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:01:54 #topic Announcements
19:02:11 clarkb, I just added an item under open discussion
19:02:17 I don't have any announcements. Anything we want to announce?
19:02:19 anteaya: ok
19:02:30 thanks
19:02:51 I guess a reminder that may 3, 2018 is gerrit upgrade day
19:03:00 *gerrit server upgrade, not gerrit service upgrade
19:03:16 there'll probably be a social media push for the ci/cd opendev later today if people want to help promote it
19:03:36 o/
19:04:27 as not a social media person myself, i won't be doing that, but i figure some people do that sort of thing ;)
19:04:33 #info Gerrit server upgrade to Xenial on May 3, 2018
19:04:42 facebook is looking fragile today
19:04:54 #info OpenDev social media push if you want to help promote it
19:05:09 twitter is apparently the new facebook, or something like that
19:05:19 +1
19:05:40 #topic Actions from last meeting
19:05:48 #link http://eavesdrop.openstack.org/meetings/infra/2018/infra.2018-04-03-19.01.txt minutes from last meeting
19:06:07 We didn't have any formally noted #actions but we did have a bunch of stuff come out of last week's meeting and I think many of those happened
19:06:29 the gerrit upgrade was announced to the dev list, we sent a mini irc abuse faq email to the dev list, discussion on new priority efforts was started and so on
19:06:38 thank you everyone that helped get all that done
19:07:20 #topic Specs Approval
19:07:44 I don't think there are any new specs up for approval.
But I did merge the specs that I said I would get merged last week
19:08:14 also if you have a moment dmsimard's eavesdrop updates spec and anteaya/pleia2's survey spec could use reviews
19:08:38 #link https://review.openstack.org/349831 Survey service spec
19:08:52 #link https://review.openstack.org/550550 IRC discoverability spec
19:09:00 #topic Priority Efforts
19:09:05 #topic Storyboard
19:09:24 fungi: sounds like the migration process continues? any luck sorting out bug updates from gerrit events?
19:09:46 aside from importing cyborg i haven't done much since the last meeting, no
19:10:18 it's on my list for the next day or so to start playing with story commenting from review-dev to storyboard-dev
19:10:43 cool, offer still stands to help there should that be useful since I did the gerrit 2.13 upgrade and missed this (I feel bad :P )
19:11:03 i'll definitely loop you in with what i find, at least
19:11:12 #topic New Priority Efforts
19:11:20 fungi: I think
19:11:22 gah
19:11:26 * mordred doesn't computer well
19:11:36 mordred: should I undo if there is more storyboarding to talk about?
19:12:38 I guess not? /me continues with new priority effort discussion
19:13:12 we can catch up in open discussion if need be
19:13:12 the mailing list thread on this has pointed out an important topic which is modernizing our control plane deployment tooling. Options that have come up are supporting puppet 4 or 5, and ansibilifying our control plane
19:13:45 these were not on my original list but thinking about it more this is likely our biggest current pressing issue
19:14:04 I haven't had a chance to comment myself, but lively discussion
19:14:22 Both of the specs related to this have gone a bit stale I think. Merge conflicts if nothing else.
A good next step may be to update both specs so that we can consider our options with up to date information
19:14:34 i wonder whether dropping the central puppetmaster and removing puppet from our test node images means that we're now only testing everything across a common puppet version because we try to install it all on one test node
19:15:26 fungi: we run the apply tests on a variety of distros
19:15:27 and whether we could incrementally move systems to newer puppet if we stopped assuming we use a common puppet version everywhere. no need for a big bang upgrade
19:15:33 fungi: which should use the puppet for that $distro
19:16:04 fungi: that's a good point, and I think that would be doable
19:16:27 i suppose system-config is a bit of a linchpin where we'd need backward-compatibility
19:16:31 I thought distro puppet was still lagging on the version they use
19:16:43 fungi: as in, making install_modules.sh not the canonical source of puppet modules?
19:16:48 pabelanger: it is and it's largely why we've been fine on puppet3
19:17:03 pabelanger: but we don't have to use distro puppet (and haven't in the past)
19:17:17 yah, bionic jumps up to 4.10
19:17:23 ianw: maybe more than one module list, or dropping the central module list, yeah
19:18:22 I do like the idea of a central puppet list, if only to keep things in sync across hosts. But do agree, it is a little harder to deal with upgrading
19:18:27 using the limesurvey module we were trying to use as a recent example, it's over a year old and claims to need puppet 4.5 or newer since it was initially created, so even bionic's system puppet would be too old
19:19:08 clarkb: I think we should take a step back and think holistically about how we manage things. as it stands now I'd be more in favor of work to shift to ansible than work to rewrite puppet to support a new puppet ...
19:19:25 mordred: yes I'm suggesting we update both specs
19:19:29 and evaluate them together
19:19:43 both specs are out of date and if we were to implement them would need updates I think
19:19:57 but then we can consider them both and decide which is worth the effort
19:20:00 clarkb: but I can also make arguments that it's time to consider if/how containers fit in to what we're doing and whether adopting their use in places would make our lives better
19:20:10 well, updating existing puppet modules would still presumably be way less effort than rewriting them all in ansible (i strongly doubt most of the services we rely on have equivalent ansible modules already)
19:20:22 clarkb: so maybe for completeness I should write a straw-man spec for that too (even if the answer winds up being nope)
19:20:30 mordred: ya that could be useful as well
19:20:40 I'll rebase the ansibilfy spec later today
19:21:15 and then there is another meta idea where we take fungi's puppet idea and could expand that to be service specific deployment tooling (I don't think we really want to do that at least not long term but may make sense for incremental deployment to get to some future state)
19:21:20 i'd be curious to read the containers spec, because so far i can't even begin to imagine how they make production deployment any simpler for us
19:21:37 fungi: agree - however, it could be the case that we could simplify what we're asking of the puppet modules currently (the game where people's apps are better after they migrate from one tech to the other - not because the new tech is better, but because they refactor the app armed with more knowledge than when the app was written the first time)
19:21:52 but I think one of the struggles with this topic has been it's a difficult one in general and we haven't actually written down proper start to finish specs that we can read side by side
19:22:10 clarkb: ++
19:22:14 which I think will help everyone understand the needs and what is involved and
allow us to make a better decision (if a bit more upfront work)
19:22:23 fungi: broadly speaking i think containers could make things easier by completely decoupling apps from system. our biggest problems seem to be that we keep running into os-upgrades interfering with app upgrades.
19:22:38 clarkb: yeah, i wouldn't expect us to try to run multiple puppet versions forever, just suggesting that finding an incremental upgrade solution might be easier than an all-at-once
19:22:41 they are not the only solution to that, but they are one. :)
19:22:45 fungi: ya
19:22:57 sounds like we have a volunteer for ansible and a volunteer for container spec
19:23:04 corvus, fungi: yah - and also the coupling of installing software and configuring software in our puppet modules
19:23:14 anyone willing to go back over our puppet spec and consider fungi's idea there as well as puppet 5?
19:23:24 i also think this touches on the openstackci question
19:23:28 (I'm guessing that one might fall on me by default if no one else wants it)
19:23:45 corvus: ya I think so too
19:23:56 The bonus with ansible, is there are container connection plugins, if users choose to use them. Or how OSA does lxc things on the remote node
19:24:00 ianw brought this up in zuul meeting yesterday, and it'd be nice for us to discuss it more on its own.
but regardless, we need to decide what we want the future of that to be
19:24:24 pabelanger: I think I'd like to avoid using less common container tooling fwiw
19:24:26 clarkb: it's also possible if nobody steps up to write the puppet spec that we've learned that nobody really wants to champion it as what we use moving forward
19:24:41 corvus: it's on the agenda :)
19:24:41 pabelanger: one of the benefits of using containers would be others know how to work with them and we lose a lot of that if using special magic for containers
19:24:47 mordred: indeed :)
19:24:51 clarkb: basically saying that I don't think you should write the puppet5 spec by default (although you certainly _can_)
19:25:08 clarkb: understood, but is an option you'd toggle based on inventory variables
19:25:19 +1
19:25:25 ok I'll respond to the list with these thoughts (basically let's update the specs so we can consider them directly against each other and then make a decision)
19:25:41 cool. and yay. somehow now I'm on the hook for writing a spec
19:25:41 this sort of assumes we want to make this the priority spec going forward but honestly I think not making it the priority effort would be a mistake
19:25:43 what's wrong with me?
19:25:46 my biggest fear there is that we end up blocking fixing bugs in services by forcing fixes to come with a complete rewrite of deployment/configuration management tooling
19:25:48 clarkb: +100
19:26:06 fungi: I agree with that - and I think any plan we adopt should not have that quality
19:26:09 fungi: ya I think we need to make it incremental
19:26:21 so that we don't put a force lock on everything for a long period of time
19:26:24 * mordred is very supportive of incremental as long as we have an end-goal in mind
19:26:28 we *definitely* need to make it incremental
19:26:30 mordred: ++
19:26:31 now incremental may be do it all in a month or two
19:26:33 clarkb: I definitely think it should be a priority effort
19:26:39 Development Effort Estimate, Person-Years (Person-Months) = 33.57 (402.89)
19:26:39 (Basic COCOMO model, Person-Months = 2.4 * (KSLOC**1.05))
19:26:40 but you should know where to push the update to say fix etherpad
19:26:46 that's sloccount on /etc/modules
19:27:00 re-writing all that in ansible isn't particularly appealing
19:27:02 like we discovered yesterday that we may need to upgrade to latest etherpad and that took some moderate puppet updating. i would hate to think of what would have instead been involved in replacing puppet-etherpad_lite with ansible-role-etherpad_lite written from scratch
19:27:04 nope
19:27:14 fungi: ya
19:27:21 this is why I'd like to evaluate the specs together
19:27:33 so that we can consider the effort of pushing puppet forward against what using containers or ansible would get us
19:27:33 fungi: there are lots of existing ansible roles
19:27:36 yup.
19:28:13 corvus: i'd be excited to find out that etherpad has an ansible role already
19:28:28 fungi: i'd wager that once the *rest* of the infrastructure is in place (basic system, users, etc), the effort for upgrading to puppet5 vs ansible for most services would be within a small percentage.
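For reference, the effort figure quoted at 19:26:39 follows from the basic COCOMO formula given in that same message, Person-Months = 2.4 * (KSLOC**1.05). The sketch below only illustrates that arithmetic in Python. The function names are hypothetical and are not part of sloccount, and the code size is back-computed from the quoted 402.89 person-months because the log does not state the KSLOC count itself.

```python
# Basic COCOMO effort estimate, using the coefficients quoted in the log:
#   Person-Months = 2.4 * (KSLOC ** 1.05)
# All function names are illustrative assumptions, not sloccount APIs.


def person_months(ksloc: float) -> float:
    """Estimated effort in person-months for a code base of `ksloc` thousand lines."""
    return 2.4 * ksloc ** 1.05


def ksloc_from_person_months(pm: float) -> float:
    """Invert the formula: the code size (KSLOC) implied by an effort estimate."""
    return (pm / 2.4) ** (1 / 1.05)


def person_years(pm: float) -> float:
    """Convert person-months to person-years."""
    return pm / 12.0


if __name__ == "__main__":
    quoted_pm = 402.89  # person-months figure quoted in the meeting
    print(f"implied code size: {ksloc_from_person_months(quoted_pm):.1f} KSLOC")
    print(f"effort: {person_years(quoted_pm):.2f} person-years")
```

The estimate depends only on line count, so it says nothing about how much of the existing puppet logic an ansible or container approach could reuse or drop.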
19:28:29 this is also likely something we should include in the specs
19:28:39 a survey of existing tooling for our various services
19:29:02 fungi: there are at least 4
19:29:10 wow!
19:29:18 https://galaxy.ansible.com/list#/roles?page=1&page_size=10&autocomplete=etherpad%20lite
19:29:20 but I think the thing to get us over the hump on actually moving forward here is being able to consider them together rather than just getting annoyed/scared/overwhelmed by the amount of work for $effort
19:29:29 that's actually how the ansible spec is written, deal with common services first, port them to ansible, but still run puppet for other bits. Then slowly replace or decide to uplift to another version
19:29:38 becomes a lot less scary when you know that doing something else is either more work or bigger risk or has fewer features
19:30:16 hopefully next week I can ask everyone to go and read three specs and we can then make a decision sometime not too long after that
19:30:23 anything else we want to consider that won't be covered by ^
19:31:52 cool that takes us to the next topic
19:31:56 #topic General Topics
19:32:02 the future of third party ci instructions
19:32:39 ianw: corvus this came out of the zuul meeting yesterday
19:33:07 yes, there's some notes in the agenda
19:33:23 it sounded like we had two different cases... it would be nice for the zuul family of projects to have a thorough instructional walkthrough...
it might also be necessary to have some more neatly-packaged solution for people who are deploying an openstack third-party ci system and don't want to have to learn anything
19:33:26 the basic concern appears to be that puppet-openstackci cannot deploy a zuul v3 as it exists today
19:33:49 fungi: right and puppet-openstackci has been our existing "use this" tool for the openstack third party ci specific use case
19:34:13 it is also how we deploy our system (sort-of), with additional things third-party folks don't normally want
19:34:18 does puppet-openstackci still work?
19:34:21 clarkb: i guess that's the root cause ... the bigger question is if someone is offering to help fix that, how should we assist
19:34:24 does it deploy something?
19:34:25 anteaya: for zuulv2 yes
19:34:39 thanks
19:34:47 so it's 3 things: our system. simple openstack third-party ci. project that wants to have an "openstack infra".
19:34:58 puppet-openstackci was also driven mostly by third-party ci operators, so it seems to me that it would be fine if we waited for some of them to want to write the zuul v3 edition
19:35:04 it still deploys part of our system
19:35:13 i've only worked around some parts of it for zuulv3
19:35:49 true. we'd need to end up using less and less of it if we don't upfit it ourselves
19:36:40 do we want to continue with the premise that module should work for all 3 audiences? it's possible, but it's work.
19:36:40 for me I've always struggled with the overlap between our system and people that want an openstack infra
19:36:56 system-config is not a public interface which is why we have puppet-openstackci I think
19:37:00 perhaps a session at opendev?
19:37:15 or do we want to make separate systems: a 3pci-in-a-box, and then whatever we're using to run ours.
19:38:13 i don't have a problem with changes to puppet-zuul. nor would i with ansible-role-zuul.
where this gets really hard for me is puppet-openstackci where i feel like i'm the sysadmin for 100 systems i've never met.
19:38:27 corvus: right that
19:38:29 i think if we make a "whatever we're using to run ours" (which we need to do anyway) then we can let "3pci-in-a-box" happen as people come along who are interested in implementing it
19:38:30 yah
19:39:08 i consider the puppet-openstackci shim module well-intentioned but ultimately problematic
19:39:25 fungi: the problem is we have someone interested in developing it, asking for help on what to do, and i'm not really sure we have an answer for them
19:39:35 we should focus on having a good and flexible base layer which can have multiple entry points
19:39:45 ianw who is the person?
19:39:54 fungi: i think we should set some direction there. i think we're in a good position to suggest an implementation for 3pci. i don't know that there's anyone but a handful of us that could tell people how to do that with zuulv3 for now.
19:39:58 ianw: I think bernd is interested more out of learning how infra tools work and the third party ci docs got them furthest along there
19:40:07 ianw: and not necessarily for third party ci specifically
19:40:45 yeah, bernd happened across the documentation and decided to see if it still works and provides any insights into how our systems operate
19:40:51 (and to be clear, i'm not personally ready to send out the email that says "hey 3pci folks, you should all upgrade to zuulv3 now!" i think we have a little more work for that yet)
19:41:21 corvus, ++
19:41:23 if we want to support 3pci using puppet, I can work on it in puppet-openstackci. But, as I spend more time on windmill, I personally prefer working on it.
19:41:46 yeah, i'm not saying we shouldn't set direction...
more that the idea of modelling our deployment as just a special case of a third-party ci deployment is a bit confining
19:41:51 but if we don't at least create one 3pci-in-a-box recipe, then the space is going to get really muddled, and we won't see good results. or, people will keep trying to just run our system-config.
19:41:53 is it worth considering softwarefactory to be part of the solution to some of this? they've been more focused on 'easily installable copy of infra' as a primary goal than we have ... I think we'd need to sort out how to get sf to not be carrying unlanded patches first, but may be worth considering?
19:42:00 fungi: i agree with that
19:42:10 fungi: ++
19:42:42 mordred: that's my concern with software factory, it's presented as infra made easy but reality is it tends to be a fairly large fork and people ask us questions about it all the time
19:42:44 mordred: maybe that's a good answer for "i want to run a full openstack-infra". i'm not sure it's the best 3pci in a box?
19:42:51 fungi: ++ ... i think the future more looks like "here's a nicely wrapped up zuul/nodepool/jobs that plugs into openstack" rather than "here's how to run a mini openstack-infra"
19:43:26 ianw: right, i think they're separate branches from a base layer rather than one being branched off the other
19:43:28 clarkb: yes- I think we'd have to sort that story out
19:43:30 i mean, the 3pci instructions for puppet-openstackci get you zuul+apache talking to our gerrit. i think that's what that product needs, and not anything else.
19:44:06 yah - and that's a good definition of what a 3pci needs
19:44:22 ianw: indeed, with zuulv3 we have an opportunity to have something that will write an appropriate tenant config to pull in devstack, tempest, etc, to get all the jobs they should base their tests on.
19:44:38 +1
19:44:39 that's an advantage of a purpose-built 3pci solution
19:44:44 ++
19:45:16 sounds like we'd maybe punt on an infra made easy for now, suggest a third party ci specific tool, then have our own tools for deploying infra itself?
19:45:36 and we'd work with interested 3pci groups to build that tool?
19:45:44 clarkb: yeah, i think decoupling this from infra actually makes both things easier, and this is the right time on both accounts.
19:46:00 clarkb: yah - that sounds good to me
19:46:10 ianw: ^ does that make sense as a path forward to you (you posted the topic so want to make sure your input is heard :) )
19:46:17 pabelanger: is windmill in a position to be this 3pci thing?
19:46:19 maybe we should write up an infra spec for "build a new 3pci solution for zuulv3" ?
19:46:30 probably helps to look back at where puppet-openstackci came from too... it was in part an attempt to codify and automate some install howtos from random blogs frozen in time, and partly an attempt to keep us from making lock-step changes in services and system-config/project-config which subsequently broke people who were running outdated forks of system-config/project-config
19:46:34 ianw, pabelanger: it could certainly be the basis of it
19:46:51 ianw: pretty close, yes
19:47:13 working on documentation for it to be used in opendevconf workshop
19:47:17 fungi: yah - a lot of that got rolled in to the design of zuul v3 too - so some of the issues p-oci was working on solving are just different now
19:47:25 fungi: indeed -- i think discipline around low-level service modules + testing can help with that.
19:47:31 and we already have to make sure we don't make changes to zuul-jobs that will break people consuming that repo
19:47:34 and what mordred said
19:47:50 yeah, we're in a much better place wrt all of that than we were several years ago
19:48:01 yah
19:48:02 ianw: can give you a run through this week if you'd like
19:48:33 has anyone talked to any third party operators about this topic?
19:48:52 anteaya: I think that is the next step, we are just trying to figure out where we as infra and zuul devs stand
19:48:53 some of our folks who use puppet-openstackci now?
19:49:02 ah okay sorry
19:49:26 the ones who were most communicative in the past were maintaining puppet-openstackci
19:49:26 speccing this out would be a great way to get input
19:49:33 ++
19:49:47 fungi, mostly ramy yes
19:49:54 it's also worth considering that some of them may want to continue to use puppet for a third-party-ci - so if our 3pci isn't based on it, we could turn that repo over to whoever *does* want to use it
19:50:09 #agreed Spec out third party ci tooling as its own tool and solicit feedback from third party ci operators.
19:50:10 now that they're mostly gone, we're sort of left with a rift between what we need to do and a bunch of users who are probably not very connected with what they're running
19:50:13 mordred: yes, easier to do too if we decouple infra
19:50:17 who's writing the spec?
19:50:21 ianw: corvus et al does that seem to reflect what we've ended up at?
19:50:26 fungi, yes
19:50:30 clarkb: yah
19:50:45 corvus/clarkb: i'll volunteer to write the spec, i think i've got the gist of the ideas from this and zuul conversation
19:50:45 clarkb: ++ (but i'd love an assignment)
19:50:55 corvus: ^ looks like we have one, thanks ianw
19:50:59 thanks ianw!
19:51:09 ianw: great, thanks. expect interest/help from me.
19:51:43 #topic Open Discussion
19:51:53 I have a thing
19:51:56 anteaya: go for it
19:51:58 thanks
19:52:12 https://review.openstack.org/#/c/557979/ is the current stand up a survey server patch
19:52:31 it fails zuul puppet apply because the puppet module is written in puppet 4
19:53:07 so currently fungi has advised me to scope the module and pull what I need from it, rewrite it in the system-config patch and don't worry about using a module
19:53:20 skimming that module, most of its complexity is around managing the database and webserver which we'd want to override anyway
19:53:26 just wanted folks to know current thought, the spec says we will use a module or create one
19:53:42 anyone object?
19:53:59 that seems like a great path forward for now- especially given the discussion around considering puppet4-5/ansible/containers
19:54:11 mordred, thank you
19:54:31 I'll offer a new patch to the spec to state the new module-less direction
19:54:35 it doesn't seem like investing in a module creation for this would be the right place to spend effort
19:54:36 we could revisit leveraging the limesurvey module if/when we're on newer puppet or at least have the opportunity to run different puppet versions to deploy different services
19:54:36 with rationale
19:54:37 anteaya: cool
19:54:44 fungi: ++
19:54:48 seems like a reasonable way forward
19:54:58 thank you
19:55:03 i'm torn. i feel like we finally got all 'general purpose' puppet code out of system-config. i feel like that's a bit of a step backward. maybe it's a calculated regression we want to make.
19:55:19 anteaya: if we fork the module and remove the "String " bits, does it work?
19:55:32 ianw I don't know, I haven't tried that yet
19:55:39 you could try that by forking it privately on github and updating modules.env, then re-running ci
19:55:41 I have zero puppet, I'm copy pasting here
19:55:54 we could make our own module for it, though that gets into the territory of competing with a module on the forge (specifically so that we can support an eol puppet version with it)
19:56:02 any direction I take means that fungi or someone else has to hold my hand
19:56:12 my internet is terrible today, apologies to people that have sent a message my way
19:56:22 the limesurvey puppet person is looking for a new maintainer
19:56:36 I emailed him last week and cc'd clark on it and heard no reply
19:57:12 corvus I totally get your stance, I wish I had cmurphy's puppet ability here
19:57:41 but the thing is I'm on my own dime, trying to get something up so I can get consulting contracts which I want to shop at the summit
19:57:48 not trying to bend infra policy
19:57:54 I'd be wary of volunteering to maintain that module long term given we need to decide our own long term plans there but maybe that is an option should we decide to continue with puppet
19:58:03 but I need to get a product I'm able to offer a paying customer soon
19:58:03 it's hard to say how much general-purpose puppet we'd be introducing into a class in system-config for this. it could be fairly minimal depending on how complex deploying limesurvey actually is
19:58:05 and we can convert to that module under new maintainership (us) at that point
20:00:06 anteaya: i can give you a hand seeing if the changes to the module are minimal for puppet3 support; see https://review.openstack.org/#/c/559178/
20:00:08 i'm not going to object to putting some puppet in system-config, just that if we do, that's something that we're likely going to want to change if we upgrade puppet.
20:00:39 it's also how we ended up with people wanting to run our system-config repo, and how we ended up with puppet-openstackci. :)
20:00:43 corvus: ++ we can capture that in the spec too
20:00:53 and we are at time. Thank you everyone
20:00:55 #endmeeting