19:02:25 #startmeeting infra
19:02:26 Meeting started Tue Aug 21 19:02:25 2018 UTC and is due to finish in 60 minutes. The chair is fungi. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:02:27 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:02:29 The meeting name has been set to 'infra'
19:02:35 #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:02:44 #topic Announcements
19:03:18 #info clarkb (our peerless feeder, i mean fearless leader) is on vacation August 20-24, this week
19:03:49 which is why you get to deal with your illustrious substitute meeting chair
19:03:53 * mordred waves
19:04:00 #topic Actions from last meeting
19:04:41 #link http://eavesdrop.openstack.org/meetings/infra/2018/infra.2018-08-14-19.01.html Minutes from last meeting
19:05:03 1. (none)
19:05:11 well, that's exciting. i guess we did all those ;)
19:05:23 #topic Specs approval
19:06:07 we don't seem to have any specs up this week on the agenda for infra council rollcall, though there are a couple active under review at present
19:06:52 #link https://review.openstack.org/563849 Direction setting for 3rd Party CI (ianw)
19:07:18 #link https://review.openstack.org/587283 A spec for exploring the specifics of letsencrypt support (ianw)
19:07:40 i haven't received any particular comments on the section about the goals of 3rd party ci
19:07:42 ianw: i know i still owe you a rough implementation writeup of my alternative for that second one
19:08:12 in the process of drafting it as a review comment so as not to get in the way of you addressing the other comments on that most recent patchset
19:08:47 fungi: ok, i can update and am happy for you to upload directly
19:09:34 that's fine too.
if you want to address the other comments in there however and then i'll do a patchset on top of that with the alternative as a cohesive bit under the alternatives section (which is now remarkably sparse anyway)
19:10:07 ++
19:10:37 anyway, unless anyone wants to bring up anything specs related right now, i'll move on since i think the priority efforts discussion is likely to take up some time
19:11:32 #topic Priority Efforts - A Task Tracker for OpenStack
19:11:38 #link http://specs.openstack.org/openstack-infra/infra-specs/specs/task-tracker.html
19:12:47 Today on the ML I suggested that now since everything else is in flux for Searchlight and Trove, might be a good time to migrate.
19:13:02 ++
19:13:07 the fix for 404 errors from the "Jump to..." box landed last week in the webclient
19:13:18 Yes! That is exciting too
19:13:32 as did one for non-array team_id and user_id parameters and something about disabling closing modals when you click the backdrop
19:13:58 i think the fontawesome update is also close to landing
19:14:10 woot
19:14:18 Will be wonderful to have all that landed.
19:14:29 which is the bottom of a fairly large stack SotK has up for review to improve the webclient
19:15:03 Looks like Fatema has two patches ready for review too- she also said that she would try to stick around as much as she can through school- she doesn't plan on disappearing now that her internship is over.
19:15:13 that's excellent news
19:15:27 I think she likes our community or something ;)
19:15:41 usually interns can't wait to get away from us (just kidding, usually they simply get really busy with their studies, which is entirely understandable)
19:16:08 Yeah she has till like the end of Sept before school starts
19:16:20 and she will try to stick around past then too but time will tell
19:16:44 pressure is also mounting to implement some swift integration or something like that to handle story attachments
19:17:13 Yeah I think that is the next big target.
We can talk about that at the PTG.
19:17:15 so if anybody's interested, reviewing open changes and maybe hacking on attachments support would be great places to contribute to the effort
19:17:25 +2000
19:17:37 (start with writing a spec for attachments, please)
19:17:53 we good on sb updates for the week?
19:18:14 Yeah I think so
19:18:24 * diablo_rojo sits down at the back of the room
19:18:39 stay standing, we're almost sure to volunteer you for things! ;)
19:18:45 #topic Priority Efforts - Update Config Management
19:18:50 #link http://specs.openstack.org/openstack-infra/infra-specs/specs/update-config-management.html
19:18:51 * diablo_rojo sinks lower in her seat
19:19:02 here's where i expect we'll spend a good portion of our hour today
19:19:10 fungi: it's done!
19:19:13 there has been a continued flurry of activity on this
19:19:23 o/
19:19:24 we're completely migrated and everything is perfect
19:19:35 it's done depending on exactly how you dereference "it"
19:19:38 oh that's good news
19:19:39 * fungi waits for the other shoe to drop
19:19:54 * cmurphy goes on vacation
19:19:58 corvus: I think life is easier with vague antecedents to our pronouns
19:19:58 so, anything we should not merge currently? Any caveats for reviewers?
19:20:07 ok - seriously ...
19:20:33 the cronjob on bridge.openstack.org isn't live yet - but I think we're mostly good and should be able to land it post-meeting
19:21:25 yeah, i was +2 on it but didn't want to single-core merge that one particularly
19:21:27 AJaeger: I'd still wait until the cron patch has gone live for new projects - but otherwise, no
19:21:30 fungi: ++
19:21:45 if people want to review a GIANT patch related to iptables rules ...
19:22:18 that sounds very fun and not at all terrible
19:22:23 i've unfortunately had to save looking at that one until my evening.
today has gotten away from me rather quickly
19:22:27 https://review.openstack.org/#/c/593973 and https://review.openstack.org/#/c/594340 and https://review.openstack.org/#/c/594437 and https://review.openstack.org/#/c/594438 are fun
19:22:33 mordred: ok, just tell once ready...
19:22:39 cmurphy: ^^ there you go - all the fun you've ever wanted
19:22:46 ;)
19:22:47 AJaeger: will do
19:23:33 re puppet there are a few lingering topic:puppet-4 patches (the ones verified+1 and without workflow-1) that are just adding tests that should be easy reviews
19:23:47 mordred: after snmpd and iptables, are there any other base puppet things to move to ansible?
19:23:47 and if anyone is brave there is a series to turn on the future parser on several more nodes
19:23:53 also - the groups.yaml file is starting to become a bit unwieldy - so we might need to make a new inventory plugin to let us express some groups more easily
19:24:00 mordred: is there any reason why that can't be tested?
19:24:42 ianw: nope - and in fact, at least some of it is being tested
19:24:44 I have a moderately large stack starting https://review.openstack.org/#/c/593487/ that uses roles out of system-config
19:25:02 corvus: afs (which ianw's work should do for us)
19:25:05 corvus: and unbound
19:25:11 there are a few things that need doing to get the roles documented and usable
19:25:13 mordred: afs isn't used on every server
19:25:20 but unbound is
19:25:28 yah- oh - and unattended-upgrades
19:26:22 bup might be nice to do early on as well?
19:26:23 i think my question is this -- the next milestone we should be working toward is "bridge configured as a standard host and everything on bridge configured by ansible", yeah?
19:26:30 so unbound/unattended-upgrades/timezone/ntp are the last every-server things - afs isn't on every server but is currently plumbed through openstack_project::server so getting it transitioned will get us off of openstack_project::server
19:26:41 or did bup get done already (is bridge being backed up)?
19:27:15 fungi: no - I don't think bup has been done- and I don't see it in openstack_project::server - so yeah, we should put that up on the list
19:27:36 corvus: I like your milestone
19:27:43 fwiw, backing up puppetmaster had been a very-long-standing todo anyway
19:28:04 I think 'bridge configured as standard host' is one milestone and 'everything on bridge configured by ansible' is a good followup
19:28:09 mostly because we wanted to avoid backing up cleartext sensitive file content (passwords, keys...)
19:28:27 mordred: oh are we running puppet on bridge?
19:28:37 since there are some things that bridge+others all need, like ntp or unbound - and there are things that are specific to bridge that got done by hand
19:29:01 corvus: nope - there is no puppet on bridge - there are just some things that got copied over from puppetmaster - like the clouds.yaml file
19:29:07 mordred: oh, i thought all the stuff that was done by hand was captured.
19:29:08 i see
19:29:29 mordred: sorry, crossing streams ... if the iptables rule was in a top level role/ directory i feel like it could be tested with our usual integration jobs (like what i've proposed for afs+kerberos)
19:29:38 #agreed next milestone is bridge configured as standard host, with everything on bridge configured by ansible (including anything which was initially configured manually)
19:29:46 i can #undo that if we don't actually agree
19:29:53 ianw: yup. I agree - I think we can land the move-roles-dir patch pretty much any time
19:30:16 mordred: what's "timezone"?
19:30:16 mordred: well, see everything i've stacked on top of it first, having actually used it :)
19:30:25 :)
19:30:42 corvus: setting the system timezone to utc
19:30:44 wow we really have a puppet-timezone module
19:30:51 i didn't know/remember that
19:30:55 class { 'timezone': timezone => 'Etc/UTC', } ... yeah
19:31:00 #info fundamental modules remaining to port to ansible: iptables (in progress), snmp (in progress), unbound, unattended-upgrades, timezone, ntp
19:31:01 that does indeed seem like overkill
19:31:04 that look right ^?
19:31:28 wfm, unless we also want bup in there on principle
19:31:33 yes
19:31:47 fungi: i'd like to draw the line before bup since it's not strictly required for every server
19:32:06 but i also am good with considering backups of bridge.o.o as a separate milestone, since we need to be careful about _what_ we back up from there
19:32:06 (i'm mostly trying to come up with work units for the proposed milestone)
19:32:27 yeah, i think that's perfectly fine
19:33:33 i think getting this work done gets us to the point where the tree can branch and we can do more interesting things at the PTG
19:33:41 totally agree
19:33:46 that will be awesome
19:34:02 once the base stuff is done, hacking on migrating individual services is much more parallelizable
19:35:58 any specific items on that set of work units we want to dive into in the meeting?
19:36:24 anything we should be on the lookout for?
19:36:30 i'll finish working on snmp/iptables; i think the others are up for grabs
19:36:37 ianw: were your testing concerns addressed above?
19:36:45 i'd like to talk more about testing
19:36:55 great!
19:36:59 please do
19:37:08 IMO we should start out moving these to top level roles/ and test them from zuul as much as possible, and also have things like the readme written with an eye to them being generic-ish.
I know we can race ahead and leave all that for later
19:37:09 well, actually i'd like to listen more about testing :)
19:37:23 but later is a thing that tends to never come
19:37:40 I've been testing with molecule for ansible recently, works well. same idea as beaker testing for puppet
19:37:52 ianw: yeah, i think we've got exim and (at the end of the series) iptables and snmp all done generically and with a readme
19:38:41 corvus: right; i've got patches up to actually make the documentation too, which fixes at least one issue in the exim readme
19:38:42 pabelanger: bending beaker to our model of already-provisioned test environments ended up being sort of a challenge, if my memory isn't failing me. does molecule present similar base assumptions we have to work around?
19:38:46 does anything exercise these roles for real?
19:38:58 the roles are being run in our current puppet tests
19:39:03 mordred: the apply test?
19:39:07 or the beaker test?
19:39:17 oh it's the beaker tests
19:39:18 at least the beaker test
19:39:21 yeah
19:39:29 my snmp role failed beaker but passed apply
19:39:32 we run base.yaml as a setup step in the beaker test
19:39:36 fungi: yea, molecule has a delegated backend, where you can say the node has already been provisioned. Then either use localhost or loopback ssh
19:40:15 pabelanger: link for molecule?
19:40:20 fungi: the main bonus for molecule, would be to say allow local iteration against docker / vagrant, then when in zuul just assume node is already setup and skip the provision
19:40:24 the spec says we should start using testinfra
19:40:33 https://molecule.readthedocs.io/en/latest/
19:40:37 yah - testinfra is a way to express the actual tests
19:40:51 my understanding is that molecule is a test runner and can use testinfra tests
19:41:02 okay, so are we at a place where we should start using testinfra?
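
[editor's sketch] As a rough illustration of "testinfra is a way to express the actual tests", the helpers below check two of the every-server items named in the discussion (timezone set to Etc/UTC, unbound running) against testinfra's `host` object. This is only a sketch, not something proposed in the meeting: the `/etc/timezone` path, the service name "unbound", and the helper names are assumptions for a Debian/Ubuntu style node.

```python
# Sketch only: helper checks written against testinfra's `host` object.
# Assumptions (not from the meeting): a Debian/Ubuntu node where
# /etc/timezone holds the zone name, and a service named "unbound".


def timezone_is_utc(host):
    """Return True if /etc/timezone exists and names Etc/UTC."""
    tz_file = host.file("/etc/timezone")
    return tz_file.exists and tz_file.content_string.strip() == "Etc/UTC"


def service_is_up(host, name):
    """Return True if the named service is both running and enabled."""
    service = host.service(name)
    return service.is_running and service.is_enabled


# In a testinfra test module these would be wrapped as pytest tests,
# with `host` supplied by the testinfra pytest plugin, for example:
#
#   def test_timezone(host):
#       assert timezone_is_utc(host)
#
#   def test_unbound(host):
#       assert service_is_up(host, "unbound")
```

Keeping the checks as plain functions that take `host` means they can be exercised with a stand-in object locally, and with a real testinfra host once a job provides one.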
19:41:07 so kind of like beaker and serverspec
19:41:36 corvus: I think we're definitely at a place where we should figure out the molecule/testinfra situation and come up with a testing pattern
19:41:36 Yah, I've also switched some roles to testinfra recently too: http://git.openstack.org/cgit/openstack/ansible-role-nodepool/tree/tests/test_role.py
19:41:50 default is pytest, but does support python unittests
19:41:56 but haven't tested
19:43:11 well, the other thing we can do is, and this is maybe what ianw was getting at, is have zuul run the ansible for the tests
19:43:30 rather than an indirection layer
19:43:36 so, basically, run the base.yaml playbook from zuul, with zuul's own inventory
19:43:48 yes. the main issue with that is that zuul is using 2.5 and we're using 2.6 on bridge - it's PROBABLY not a problem - but it's a thing to be aware of
19:43:52 if we don't expect people to use system-config roles, that sounds like a good start
19:43:54 then validate with testinfra
19:43:59 like a dry-run of what we might do in the future to configure servers directly from zuul jobs?
19:44:01 also - we have some plugins
19:44:08 that zuul will disallow us from using
19:44:25 mordred: what plugins?
19:44:38 zuul ansible, run tox ansible still is valid for testing too
19:44:50 the iptables patch adds a jinja filter_plugin
19:45:06 mordred/corvus: hrm, yeah i hit that with https://review.openstack.org/#/c/593998/
19:45:08 we also have an edited copy of module_utils for the zypper/apt detection
19:45:09 mordred: oh, so does exim
19:45:12 yeah
19:45:33 if that's not ok, system-config may need to be a trusted project?
19:45:39 and I think it's correct for us to have those - so I think running ansible with ansible is maybe a better bet
19:46:02 ianw: oh you're right, we don't need true anymore
19:46:03 ianw: if trusted, then we get no per merge testing
19:46:05 also - we're going to run ansible with ansible for CD - so testing that way would be closer to production
19:46:11 pre-merge*
19:46:20 pabelanger: yes, which is why I didn't want that! :)
19:46:27 +1
19:46:57 could this be our first use case for a post-review pipeline?
19:47:24 I still think having ansible-playbook from zuul, run tox jobs, which then does ansible-playbook is a good way to test too
19:47:38 zuul's ansible running a shell task called "ansible-playbook" on a node called "bridge" with an inventory consisting of the other nodes created by zuul would be closer to testing production than having zuul's ansible run directly against nodes
19:47:55 okay, so it sounds like ansible->ansible is what we need for this. kind of a bummer; i'd prefer to use this as a test case for how much we can push into zuul directly, but that's going to be slow and it doesn't sound like we have much appetite for that.
19:48:33 corvus: maybe we can also do direct-zuul for some of the roles
19:48:51 and yeah, if we've crossed the bridge of adding the indirection for production, it probably doesn't make too much sense to stretch the paradigm for test.
19:49:00 corvus: yah
19:49:17 I'm pondering writing an ansible ansible module
19:49:24 similar to the ansible puppet module
19:49:27 mordred: the problem i hit with 593998 was that if anything has plugins, zuul stops?
19:50:09 * fungi sees it's ansibles all the way down
19:50:31 ianw: yes, that's right
19:50:51 ianw: yes. so two things: 1) we can proceed with 998 because it's removing dead code. but 2) the general solution is that we should not use these roles directly in zuul. we should use them indirectly with zuul-ansible running a separate ansible as part of the test process.
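
[editor's sketch] The "if anything has plugins, zuul stops" constraint suggests a quick way to see which roles could not be used directly from zuul: look for plugin directories inside each role. The script below is only a sketch under assumptions. The function name `roles_with_plugins` and the directory list are illustrative; the list covers the kinds mentioned above (`filter_plugins`, `module_utils`) plus a few other common role-local plugin directories, and is not claimed to match exactly what zuul checks.

```python
import os

# Role-local directories that hold Ansible plugins or modules.
# Illustrative, not exhaustive, and not taken from zuul's own checks.
PLUGIN_DIRS = (
    "action_plugins",
    "filter_plugins",
    "library",
    "lookup_plugins",
    "module_utils",
)


def roles_with_plugins(roles_root):
    """Map role name -> list of plugin directories found in that role."""
    found = {}
    for role in sorted(os.listdir(roles_root)):
        role_path = os.path.join(roles_root, role)
        if not os.path.isdir(role_path):
            continue  # skip stray files such as a README
        hits = [
            name
            for name in PLUGIN_DIRS
            if os.path.isdir(os.path.join(role_path, name))
        ]
        if hits:
            found[role] = hits
    return found


if __name__ == "__main__":
    # "roles" is a hypothetical path; point it at the roles directory to scan.
    if os.path.isdir("roles"):
        for role, dirs in roles_with_plugins("roles").items():
            print(f"{role}: {', '.join(dirs)}")
```

Roles that show up in the output would be candidates for the indirect (ansible running ansible) path rather than direct use in a zuul job.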
19:51:20 pabelanger: what does molecule add to this?
19:51:30 so, I've been arguing that some of them might be useful to use directly in zuul
19:51:39 right, just saying "maybe we can also do direct-zuul for some of the roles" doesn't really work unless we start splitting things into two repos?
19:51:41 (like ianw's afs and krb roles)
19:51:48 corvus: just saying if you wanted to iterate on local infra, you could use molecule
19:51:54 via docker or vagrant
19:52:07 mordred: more that might be useful, that's really the point of them, to be used in the wheel jobs
19:53:26 maybe we need a more nuanced story then - we could have a roles/ and also a playbooks/roles/ and put things that are generally useful in both jobs and production into roles/ and things that are not useful directly from jobs into playbooks/roles ?
19:53:29 or is that too convoluted?
19:53:49 i can follow that
19:54:08 mordred: that can work, and with a small update https://review.openstack.org/#/c/593477/ you can even have them documented
19:54:35 neat. I'll update the move-roles patch to only move _some_ of the roles
19:55:12 at least until we have the complete ansible-turtles-all-the-way-down testing story
19:56:07 so what does that look like?
19:56:16 i'd like to get that running asap
19:56:33 but i don't have enough context to write it myself from scratch
19:56:38 the only theoretical problem might be if people start relying on the roles, and then we have to go, say, adding a plugin and have to essentially move it to a "private" role
19:57:09 i don't think anyone should rely on anything in the system-config repository :)
19:57:26 we've been saying that for a very long time
19:57:35 fungi: well, actually we haven't
19:57:39 well - except for other zuul jobs
19:57:47 we've said that the system-config repository is in fact suitable for anyone to use
19:58:17 i thought we'd long said system-config was full of openstackisms and that was why we extracted the reusable bits out to other modules
19:58:19 like with the afs-client role
19:58:19 i would like to revert back to what we used to say, which is that it is *merely* for operating the actual systems we run for openstack
19:58:48 ahh, i thought we'd been saying that all along
19:59:02 I think if a role is system-config, it is limited to system-config. if we want others to use said role, we should create the role under openstack-infra, like we did with puppet-foo
19:59:17 fungi: there's an extra level of abstraction in the system-config puppet designed to allow other people to "run an etherpad the way we run an etherpad"
19:59:28 pabelanger: eventually I think that's appropriate for some of them
19:59:39 mordred: yup, some for sure
19:59:45 don't think we want all
19:59:47 pabelanger: yeah, i mean ... do you think puppet-timezone etc has really worked that well?
i think collecting them makes it much more discoverable and maintainable
19:59:57 but it's a lot of extra churn and effort here at the beginning while we're sorting things out and reorganizing as we learn new things
20:00:16 ianw: to be honest, puppet-timezone has worked perfectly for me :)
20:00:33 corvus: but as a community building and reusable component
20:00:39 (i didn't even remember it existed)
20:00:59 I think there are three types of roles here ... roles that are very infra specific, roles that we want to use in infra but also want to use in jobs, and roles that are at the level of wanting to be legit standalone
20:01:03 okay, we're a minute over time. we should probably take this to the #openstack-infra channel. my general topic isn't super urgent anyway
20:01:11 ianw: yep, it has failed to accrue a community to support it, but i'm not certain that was (or should be) a goal :)
20:01:19 if the roles are small, and compact, and say timezone only modifies timezone. I think it works, the grey area is when a role does things with apache, then package repos, then etc. Much harder to support for other users
20:01:42 ftr, i do not know how to proceed in setting up a real deployment test
20:01:44 (also, to be clear - we did not write puppet-timezone - it's a puppet community module we're consuming)
20:02:09 https://github.com/saz/puppet-timezone
20:02:16 perhaps we can continue to discuss how to set up that test, either here or in infra?
20:02:25 if this channel isn't reserved, can we just continue?
20:02:34 if nobody is kicking us out, would be good to be logged here
20:02:41 I have no issues with either choice
20:03:04 okay, so let me sketch out a strawman
20:03:12 i'm fine continuing, though i may need to #chair someone else if it goes on much longer (need to start cooking dinner shortly)
20:03:23 fungi: at this point we can all endmeeting anyway :)
20:03:37 yup, i don't suppose the topic needs to change again ;)
20:04:08 just to be clear, if anyone is in here waiting for another meeting to start, please let us know and we'll scram
20:05:15 a job which requests a xenial node and runs ansible-playbook with the supplied nodepool inventory and the base playbook. then it runs testinfra to validate that things were set up correctly?
20:06:50 seems reasonable to me
20:06:50 corvus: yes. or perhaps a 2-node job, with a bionic 'bridge' node on which ansible-playbook is run and a second node to run against?
20:07:48 mordred: do we bootstrap the bridge node by running the base playbook on it?
20:08:32 corvus: I don't think we need the base playbook for that - there is a smaller bridge playbook that should get the bridge things installed
20:08:41 and which is probably much more suitable for running directly from zuul
20:09:57 although even that isn't quite right - since it's going to install the openstack inventory plugin
20:10:06 okay, so zuul uses its own ansible to run the bridge playbook against the bridge node, then zuul copies the inventory over, and runs ansible-playbook base.yaml on bridge.
20:10:07 so we might want to construct a playbook specifically for that purpose
20:10:21 are there any secrets that need to be set for the playbooks to work?
20:10:40 as an example: http://git.openstack.org/cgit/openstack/windmill/tree/tests/playbooks/run.yaml is a crude way to run ansible->ansible on a single node. But it is currently not using the inventory file from nodepool.
20:10:41 corvus: yes, that sounds correct
20:10:54 but could be something to work from
20:10:56 https://etherpad.openstack.org/p/CJUWmx62al
20:11:53 okay, is that correct up to that point in the job?
20:12:00 yes. I think so
20:12:11 so how's this testinfra thing work? :)
20:13:01 could be a tox entry, testinfra will use the ansible inventory file for the remote nodes
20:14:07 something like: zuul runs tox which runs testinfra using zuul's inventory
20:14:29 yah, that is possible
20:14:49 mordred: is that like what you were thinking?
20:15:06 yes - I think so
20:15:22 ianw: i can't think of any secrets needed for the base playbooks right now
20:15:31 we should also ponder what per-service playbooks look like - I'm guessing they each want a specific zuul job
20:15:43 yeah, i'll continue the etherpad sketch
20:16:43 fungi: just FYI, i am not sure how github.com/openstack-infra/irc-meetings is accurate, but it seems like for today there is only one more meeting left: scientific-sig-meeting (grep -nri 'Tuesday' --before-context 1 *.yaml | grep -i 'time' | sort -k 4 | column -t)
20:17:43 sdatko: yes, there's also a calendar file linked from http://eavesdrop.openstack.org/
20:17:51 sdatko: yes, my calendar has that starting in 43 mins
20:18:38 ach, ok :-D
20:18:47 ianw: can you type something in the etherpad?
20:18:56 ianw: preferably your name
20:19:40 okay. weird. i clicked the thing to set my name, and it was pre-filled with ianw
20:19:51 corvus: maybe you are actually the same person as ianw
20:19:53 other than that, things seem to be working.
20:20:31 okay, this makes sense to me, and seems like something we should be able to get going pretty quickly
20:20:57 are we planning on running ansible-playbook in virtualenv or use distro version
20:21:10 #link etherpad sketching out system-config ansible testing https://etherpad.openstack.org/p/CJUWmx62al
20:21:37 pabelanger: whatever bridge.yaml installs; currently i think it's pip installed globally? or something?
20:21:48 pip installed globally
20:22:25 using the python3.6 that came with bionic
20:22:53 ack
20:23:04 mordred: is it the case that we do not want the openstack inventory plugin installed? if there's no clouds.yaml file, will it no-op or break?
20:23:47 corvus: it doesn't matter - what will matter is the ansible.cfg we install
20:24:40 mordred: ok, so we might be able to run the bridge playbook directly, then a second playbook which modifies the config
20:24:45 corvus: yah
20:24:52 that's probably the cleanest thing to do
20:25:00 and gets us coverage of the bridge playbook
20:25:10 we're going to need to think about host_vars/group_vars and how they interact with this - some of them are essential qualities of a given host, so we'd want them to get pulled in
20:25:57 but some of them are deployment specific - like the IP address of a remote host to enable in iptables
20:25:57 I don't have a solution for that to propose yet - just a thing to keep in mind while we keep working on this
20:25:57 mordred: yeah. maybe we can overlay test values on top of the normal values
20:26:20 ++
20:27:27 mordred: if our jobs are based on functional units, we might be able to use inventory values for that. eg, a multinode job with all the zuul components as hosts
20:28:31 anyway, i think i have a good enough idea of what this should look like for us to get started; anyone have anything else?
20:29:04 I think that's great!
20:29:13 sounds like a solid start on a plan
20:29:50 fungi: since you're around, i'll leave the endmeeting honors to you :)
20:29:56 awesome, i'll cover my general topic real fast since we have the channel anyway...
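
[editor's sketch] One way to picture "overlay test values on top of the normal values" for host_vars/group_vars is a recursive merge in which test-only values replace deployment-specific ones and everything else is kept. The snippet below is a standalone illustration, not how Ansible itself combines variables and not something agreed in the meeting. The function name `overlay` and the variable names in the usage example are hypothetical.

```python
import copy


def overlay(normal, test_values):
    """Return a new dict with test_values layered over normal.

    Nested dicts are merged key by key; any other value in test_values
    replaces the corresponding value in normal. Neither input is modified.
    """
    merged = copy.deepcopy(normal)
    for key, value in test_values.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = overlay(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


if __name__ == "__main__":
    # Hypothetical variable names, for illustration only.
    normal = {
        "timezone": "Etc/UTC",
        "iptables": {"allowed_hosts": ["203.0.113.10"], "tcp_ports": [22]},
    }
    test_values = {"iptables": {"allowed_hosts": ["127.0.0.1"]}}
    print(overlay(normal, test_values))
```

Here the essential values (`timezone`, `tcp_ports`) carry through unchanged, while the deployment-specific address is swapped for a test value.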
20:30:01 #topic Continued discussion of Winterscale naming (fungi)
20:30:06 #link http://lists.openstack.org/pipermail/openstack-infra/2018-August/006075.html
20:30:11 for those of you who aren't following the infra ml, hoping we can find some consensus on a name for the winterscale effort, ideally before the ptg
20:30:14 that's all
20:30:28 what should we do if we like that name and want to move forward with it? :)
20:30:48 silence or thumbs up or have a beer
20:30:50 (the email was clear on what to do if we *don't* like it, less clear on how to express support)
20:31:07 i don't know how to do the first 2 things so i'll do the last
20:31:39 if there are additional items added to the short list, we can discuss them. if there aren't, then i guess we have one to move forward on
20:32:28 any other quick questions? i'll leave it on the agenda for next week when clarkb's back too
20:33:10 thanks everyone! find us in #openstack-infra for further discussion
20:33:19 #endmeeting