14:00:55 #startmeeting tripleo
14:00:56 Meeting started Tue Nov 27 14:00:55 2018 UTC and is due to finish in 60 minutes. The chair is jaosorior. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:57 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:59 The meeting name has been set to 'tripleo'
14:01:09 #topic agenda
14:01:12 ah :)
14:01:15 o/
14:01:16 o/ hi TripleOers!
14:01:20 o/
14:01:21 * Review past action items
14:01:22 * One off agenda items
14:01:24 * Squad status
14:01:25 o/
14:01:26 * Bugs & Blueprints
14:01:26 o/
14:01:28 * Projects releases or stable backports
14:01:30 * Specs
14:01:31 \o
14:01:32 * open discussion
14:01:34 Anyone can use the #link, #action and #info commands, not just the moderatorǃ
14:01:36 Hey folks! who's around?
14:01:42 o/
14:01:47 o/
14:01:59 o/
14:02:06 0/
14:02:27 Emilien Macchi proposed openstack/ansible-role-redhat-subscription master: Fix table in the README https://review.openstack.org/620317
14:02:33 o/
14:02:46 o/
14:03:13 o/
14:04:49 o/
14:04:56 o/
14:04:57 o/
14:05:01 #topic review past action items
14:05:03 None.
14:05:11 #topic one off agenda items
14:05:13 #link https://etherpad.openstack.org/p/tripleo-meeting-items
14:05:35 First topic: (jtomasek) Since the Workflows squad has been reorganized and UI&Validations squad is taking over the tripleo-common responsibilities, I'd like to propose we give UI squad cores right to +2 tripleo-common patches
14:06:19 o/
14:06:36 o/
14:06:39 jtomasek: around?
14:06:41 o/
14:06:41 jaosorior: as described ^, should I send an email to openstack-dev about it? Or is discussing it here enough?
14:07:07 ui team is hitting stuff that is very UI specific that needs +2 workflow
14:07:28 jtomasek: sending a mail would be nice too.
14:07:46 jaosorior: ok, will do
14:07:46 jrist: understood.
14:08:03 jaosorior: what's your initial reaction?
14:08:15 Thomas Herve proposed openstack/tripleo-common master: Don't update run UpdateParametersAction errors https://review.openstack.org/620319
14:08:26 ccamacho: try this instead "openstacksdk>=0.17.2"
14:08:32 ccamacho: and remove the tox line
14:08:37 jrist, jtomasek: My initial reaction is to hit stackalytics and check how much folks from the UI squad are reviewing tripleo-common stuff.
14:08:38 ccamacho: I had it working locally
14:08:40 Jose Luis Franco proposed openstack/ansible-role-redhat-subscription master: Define job templates in zuul config. https://review.openstack.org/620293
14:09:24 jaosorior: noted, thanks
14:09:42 historically there's been a lot of discussion and good testing/reviews of the workbooks and actions, so IMO this should be fine
14:10:12 Carlos Camacho proposed openstack/instack-undercloud stable/rocky: Fix instack-undercloud nits https://review.openstack.org/619467
14:10:40 jrist, jtomasek, shardy: I understand the practical reasons; and I want to say it's cool and all. But I don't see folks doing a whole lot of reviews.
14:11:36 yup, understood.
14:11:50 jaosorior: I understand your concerns, we can revisit the topic again later
14:12:12 chem so now, what about running them with python2 holser_ has a point we don't have py3 in rocky, right?
14:12:37 chem I think we need to set setup.cfg back to python2
14:13:06 as it will create a huge divergence between upstream and downstream
14:13:08 jtomasek: sounds good to me.
14:13:11 Anybody else have a take on this topic?
14:13:13 holser_: ccamacho well I did the validation using python3 and it was ok
14:13:39 jtomasek: Does that mean UI is handling all tripleo-common bugs? :)
14:13:56 * Tengu votes +1 on that
14:14:02 chem, there will be python2 downstream anyway...
14:14:16 therve: yes, they're all on apetrich now, who joined UI squad from workflows
14:14:47 Hum okay
14:15:09 therve: depends is always the answer
14:15:10 depends
14:15:13 but sort of, yeah.
14:15:30 therve: unless you want them? :)
14:15:43 d0ugal: Not really, but I end up fixing them anyway
14:15:49 That's why I'm asking :)
14:15:50 haha
14:15:56 therve: I noticed, thanks!
14:16:06 therve++
14:17:22 jtomasek, jrist: Let's do this, send a mail to the mailing list so more folks have time to give their thoughts there.
14:17:24 Jiri Stransky proposed openstack/tripleo-heat-templates stable/rocky: Stop upgrade if a task on one node fails https://review.openstack.org/620320
14:17:30 roger that
14:17:38 jaosorior: ack
14:18:50 alright, the second topic is from panda|rover: featureset overriding and developer experience
14:19:00 yes, I'm introducing this, but would like to discuss it in the CI community meeting.
14:19:10 As part of the design to replace scenarios with standalone, a mechanism that overrides variables defined in featuresets was introduced in the CI workflow. This was deemed necessary to satisfy requirements given by tripleo developers, like the ability to change scenario more easily during the local workflow. I'd like to understand what these requirements are and in what way overriding featureset is the thing
14:19:12 that you need. Featureset design is one of the long lived and most robust designs in our workflow, and its non overridability has protected us from regressions in the past, and it represented the last source of truth to understand what a job is doing. Since we are also changing the practice to map a job to a featureset 1:1 (all the standalone scenarios will now map to fs052) this represents also a big change
14:19:14 in how to interpret the jobs. I want to understand what was the request that rendered this big change necessary, and eventually work on an alternative solution together.
14:19:56 so, if some developers would be able to spend some time to explain it in the community meeting, I'd appreciate it, thanks.
14:20:55 panda|rover: might also be a good idea to bring it up to the mailing list.
Given that not everybody might be able to join the CI community meeting (I can't join cause I have a conflicting meeting, for instance)
14:21:21 jaosorior: ok, I'll do that, thanks.
14:21:56 thank you!
14:22:10 the next topic is mine: (jaosorior) Let's do some docs!
14:22:44 So, based on some of the community feedback that was taken from the Summit; one of the pain points of folks was stale and difficult to find documentation
14:23:17 in order to address this, marios and I synced up and decided to start a small initiative to work on the docs
14:23:41 Namely, let's do small changes to improve the docs (readability, structure, and removing stale bits)
14:23:51 and we're tracking this via this bug: https://bugs.launchpad.net/tripleo/+bug/1804642
14:23:53 Launchpad bug 1804642 in tripleo "[DOCS] Improve upstream documentation" [Low,In progress] - Assigned to Marios Andreou (marios-b)
14:24:10 I know everybody's very busy, so we have the goal of doing one change a week
14:24:22 Steven Hardy proposed openstack/tripleo-quickstart-extras master: Switch httpboot directory for baremetal underclouds https://review.openstack.org/585344
14:24:25 so, if you're interested in joining, help is very much appreciated!
14:24:28 jaosorior: thanks (also listening to another call) 1/week if you want to join us please do small changes/improvements
14:25:15 I can probably get some time out of the "idling" while a deploy is running on my lab :)
14:25:47 that will also improve my knowledge.
14:26:11 Tengu: that's how I've been approaching it. take some idle time, read the docs, and try to find something to fix.
14:26:23 just remember to include the related-bug so we can have a nice track of our improvement
14:26:23 so far it's been a nice learning experience, even if we just passed the first week.
14:26:25 Bogdan Dobrelya proposed openstack/tripleo-heat-templates master: Optimize config containers used for puppet steps https://review.openstack.org/620061
14:26:34 marios: for sure
14:26:34 lots to read still! :D
14:26:43 jaosorior: yeah. will try to ;).
14:26:46 welcome Tengu :) you now owe one patch by friday
14:26:50 Honza Pokorny proposed openstack/tripleo-quickstart-extras master: Bump OVB version https://review.openstack.org/620325
14:27:49 The next topic is this one: (bogdando) https://bugs.launchpad.net/tripleo/+bug/1804822 optimize base layer for security and edge cases - feedback and code review needed
14:27:50 Launchpad bug 1804822 in tripleo "Reduce kolla containers image size by moving off puppet bits we override for tripleo" [High,In progress] - Assigned to Bogdan Dobrelya (bogdando)
14:28:06 yeah, I hope it's self-contained, nothing to add here
14:28:24 just more eyes needed please
14:28:51 bogdando: I'll check it out after the meetings, thanks for bringing it up!
14:28:58 tl;dr the battle for images size and security
14:29:00 bogdando: I'm not convinced puppet is your biggest problem
14:29:12 just one of many, dprince
14:29:18 bogdando: I would prefer we hold on pushing any patches perhaps until we agree
14:29:38 ofc, I'm pushing updates for review only
14:29:48 to not develop things detached
14:30:03 bogdando: I think the --volumes-from thing is a bit janky. it is more complex to debug for example
14:30:06 all those have wip-1
14:30:28 dprince: we could use paunch to start containers
14:30:35 so debugging will be very easy
14:30:37 the way the layers are built today uses less space, because puppet is in the base layer.
14:30:41 paunch creates systemd units...
14:30:45 and solid with containers & paunch 101 UX
14:31:29 dprince: yes, but puppet brings in systemd, ruby and more
14:32:02 bogdando: 2 things I would stress. Let's not just change our base layers around without hard data.
Suspicion isn't enough to go and change these things
14:32:03 so we have at least 3 unrelated subjects for security maintenance for *each* container
14:32:27 bogdando: and you may consider alternative (radical) containers to better accommodate edge clouds
14:32:47 bogdando: what is your main issue? space, or network bandwidth?
14:32:53 what do you mean a suspicion? I think it's time to address that tech debt, at very least from the security standpoints
14:33:08 those look very generic, not just suspicious things to be proved
14:33:09 introducing more dependencies on the host will make it harder when we eventually want to adopt an immutable host OS like coreos, FWIW
14:33:15 bogdando: my vision for how to address the puppet technical debt is quite different from yours I think
14:33:31 Jason E. Rist proposed openstack/tripleo-ui master: Enable deletion of deployment upon successful deployment https://review.openstack.org/620332
14:33:38 bogdando: the LP ticket you filed mentioned space and network bandwidth...
14:33:49 shardy: the change does not bring dependencies for host, only reorg containers
14:33:53 bogdando: https://bugs.launchpad.net/tripleo/+bug/1804822
14:33:55 Launchpad bug 1804822 in tripleo "Reduce kolla containers image size by moving off puppet bits we override for tripleo" [High,In progress] - Assigned to Bogdan Dobrelya (bogdando)
14:34:04 image size is your main concern?
14:34:05 dprince: I updated it for security concerns today
14:34:27 and would be nice to alter cronie to not bring in systemd
14:34:40 #link https://review.rdoproject.org/r/#/q/topic:base-container-reduction
14:34:40 bogdando: I spoke to slage yesterday. One idea for edge might be to go the other direction and try using 1 container for everything
14:35:02 #link https://review.openstack.org/#/q/topic:base-container-reduction
14:35:24 bogdando: like, we used to have an overcloud-full.qcow2 contain all the packages.
If you want to only have a minimum amount of space, then that is likely your best option
14:35:24 dprince: I do have to agree with bogdando on the security side. Whenever a security vulnerability is found on one of the dependencies that are pulled because of puppet, we need to update the base, and thus update all of the containers.
14:35:46 dprince: > my vision for how to address the puppet technical debt is quite different from yours I think
14:35:46 let's please consolidate on that
14:36:01 jaosorior: do we have data on this? Like there are other Openstack things in our base container that are likely much more commonly security patched
14:36:01 in mail lists, code review, and alternative patches! it's very welcome
14:36:26 jaosorior: I guess what confuses me is why puppet. Its one of perhaps many things
14:36:39 # link http://lists.openstack.org/pipermail/openstack-discuss/2018-November/000212.html
14:37:00 Derek Higgins proposed openstack/tripleo-quickstart-extras master: Setup Ironic in Overcloud https://review.openstack.org/509728
14:37:10 dprince: understood. I'll bring this topic up with my team and get more heads around the security side of this.
14:37:30 bogdando: I will write a spec for my thoughts on the puppet containerization thoughts I have. But do note they are more about "logical" containers (keeping the containers self-contained) and do not perhaps fit with what your space/disk concerns are after
14:38:18 jaosorior: look at the history, how often are we currently bumping the base layer for puppet and its deps. Is it really a problem?
14:38:24 well, if we can drop systemd from containers, that would already be a nice thing.
14:38:25 jaosorior, dprince: not just puppet, but systemd and ruby :)
14:38:35 jaosorior: compare that to other packages.
14:38:45 Sorin Sbarnea proposed openstack/tripleo-docs master: Add simple diagram for contributor guide promotion stages https://review.openstack.org/616187
14:38:51 my strong opinion is it's time to stop adding those into all containers
14:39:08 so the size is a 2nd issue, and yes, for edge cases only
14:39:08 bogdando: but they get used in all the containers?
14:39:38 bogdando: you are essentially asking us to stop using RPM dependencies in our containers... and just start bind mounting random things from other places into them
14:40:03 alright, I suggest we continue this discussion in the BZ or the mailing list
14:40:11 I mean I guess both containers would be built with their own RPMs
14:40:19 if this allows to prevent having to rebuild all the containers when there are security concerns...
14:41:00 dprince: jaosorior: > I guess what confuses me is why puppet. Its one of perhaps many things
14:41:00 yes, one of many, in the linked mail list, hjensas brought more of that
14:41:23 and mwhahaha did some optimizations, earlier in the topic
14:41:33 bogdando: mind starting another mail thread with a new topic so this discussion is easier to track?
14:41:40 We do already have parameters which allow splitting ServiceNameImage (runtime) and ServiceNameConfigImage
14:42:00 so we could use different images, double how many are built and remove the runtime puppet etc deps
14:42:15 dprince: yes, they are in all containers in tripleo
14:42:16 but yeah, ML sounds like a good place to discuss
14:42:36 not used maybe, but stay here runtime and may be exploited, all of that security things :)
14:42:43 shardy: if you have to download the config image once why not just use it to run the service?
14:43:05 dprince: to avoid the need to restart service containers if e.g puppet gets updated which was mentioned above
14:43:10 dprince: but yeah, I do agree
14:43:12 shardy: the reason the Config image parameters exist was more along the lines of optimization I think, to eliminate generating things like nova.conf more than once
14:43:15 sorry, that's it from my side
14:43:30 just pointing out we already provide an interface which could enable "lean" runtime containers if folks really want that for hardening etc
14:43:37 Marios Andreou proposed openstack/tripleo-docs master: Add simple diagram for contributor guide promotion stages https://review.openstack.org/616187
14:44:08 shardy: it's all pre-optimization to me at this point. Like how often is puppet or ruby updated that something else in the base container isn't?
14:44:18 * dprince thinks we are chasing the wrong problem
14:44:20 very rarely
14:44:26 because we don't switch versions
14:44:52 dprince: yeah, I'm just saying different images could be cleaner than bind mounts between containers etc, not arguing for the change itself
14:45:45 Alright, let's move on with the agenda and continue this topic in the ML
14:45:47 shardy: cool. And I'm saying having a container be able to configure itself is very useful. Furthermore a logical 'service' container is a very good thing
14:45:54 dprince: well, between systemd, puppet and ruby, there are many security concerns, almost every month... and also, what's the point keeping them in runtime containers when they are useless? reducing attack surface is a must, regarding security ;). anyway, will also jump in the LP + ML.
14:46:17 that said, if you want to break the rules for edge and have an all-in-one container then we have parameters in place that should allow you to do that
14:46:44 but don't go re-layer everything for the sake of pre-optimizing container layers
14:46:55 FWIW removing puppet/ruby from service containers is going to make situation worse w/r/t space usage likely
14:47:07 would we start building 2 sets of base images?
14:47:10 shardy: note, those bind mounts between containers is a daily approach for pods
14:47:16 one for services, other for config
14:47:52 mounts between containers is also a known and well used thing in rancher world.
14:47:57 bogdando: sure, but does having a service be able to configure itself really need to involve a separate pod?
14:49:06 I'd say yes, dprince... removing not-runtime things is a good idea. That will require some changes in the architecture, but still... I support bogdando on that :).
14:49:27 shardy: I wish I could just use podman pods, but I have hope to backport that for Queens
14:49:30 rather than this, I would like to see us use --volumes-from for the /var/lib/config-data directory
14:49:37 having that be in a data container makes so much sense
14:49:51 and IMO is a more correct "podlike" construct
14:49:53 bogdando: IMHO backporting something like this would be very risky indeed
14:50:06 but we want to support edge there...
14:50:09 DCN
14:50:16 but what is being suggested here is basically teasing apart functional containers...
14:50:24 those are two very different things
14:50:29 think of 30,000 or 40,000 nodes getting a base layer with the world plugged in
14:51:01 bogdando: if you need a solution that works on queens personally I'd just build two sets of containers vs the sidecar thing
14:51:05 but that's just my opinion
14:51:16 I think that can work as well
14:51:29 would be nice to play with neat podman pods
14:51:30 shardy: that's actually more data that he'll download I think
14:51:37 w/o legacy )
14:51:46 it's still a risk from an update perspective but it minimizes the architectural changes
14:52:29 dprince: +1. I don't follow how can we save on downloading things if we build two sets of images, it's likely going to have the opposite effect AFAICT?
14:52:50 i understand the security argument though
14:53:07 dprince: I'm not sure all-in-one container is a good idea tbh
14:53:22 but as long as we generate config locally on each node, the 30 or 40K nodes are going to have to download images with world plugged in anyway
14:53:33 bogdando: given what you are asking for it is the way you'll download the least data to these nodes
14:53:36 jistr, dprince: yeah, I was just saying it's a less risky way of achieving the hardening goal vs the image size thing
14:53:48 jistr: > well, between systemd, puppet and ruby, there are many security concerns, almost every month... and also, what's the point keeping them in runtime containers when they are useless? reducing attack surface is a must, regarding security ;). anyway, will also jump in the LP + ML.
14:53:53 hm, we seriously should continue this discussion on the ML and get over the mtg though (ping jaosorior ;))
14:54:00 no, just a separate image with config-time dependencies
14:54:11 and config_image to leverage volumes off it
14:54:26 that's what I proposed in the linked topics
14:54:26 Tengu: at this point there's 5 min left.
14:54:26 hi all in triple-o is it possible to have separate roles like RabbitMQ and load balancer so that it can be installed on separate nodes
14:54:43 bogdando, shardy: yea re hardening it makes sense i think
14:54:44 bogdando: honestly, the best solution for you would be using packages on the host, generating the config files on the host. And then having an all-in-one container for all the services which lets them run in an isolated manner
14:54:45 Anyway
14:54:47 just for status:
14:54:49 #topic bugs & blueprints
14:54:51 #link https://launchpad.net/tripleo/+milestone/stein-2
14:54:53 For Stein we currently have 29 (for stein-2) and 3 (for stein-3) blueprints open in Launchpad.
14:54:55 Bugs: 780 (+15) stein-2, 5 (+2) stein-3. 102 (+1) open Storyboard bugs.
14:54:57 bug 36202 in gnupg (Ubuntu) "duplicate for #780 race condition on ~/.gnupg/random_seed when signing" [Medium,Fix released] https://launchpad.net/bugs/36202 - Assigned to Daniel Silverstone (dsilvers)
14:54:57 bogdando: i.e. taking a step back from containers
14:54:57 #link https://storyboard.openstack.org/#!/project_group/76
14:54:58 krypto: yes, see the Messaging role we provide
14:55:36 #topic open discussion
14:55:46 Thanks mwhahaha
14:58:45 Alright, folks! thanks for joining
14:58:51 #endmeeting