19:01:25 <clarkb> #startmeeting infra
19:01:25 <openstack> Meeting started Tue Feb 18 19:01:25 2020 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:26 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:28 <openstack> The meeting name has been set to 'infra'
19:01:35 <clarkb> #link http://lists.openstack.org/pipermail/openstack-infra/2020-February/006601.html Our Agenda
19:01:58 <clarkb> We've got a fairly large agenda today so we may have to push quickly through some topics. In particular I'd like to make sure we have time to discuss pip and virtualenv
19:02:07 <clarkb> #topic Announcements
19:02:32 <clarkb> I've got early meetings for half the day the next two days. I don't expect to be properly around during those
19:02:49 <diablo_rojo> o/
19:03:34 <fungi> same for me
19:03:44 <fungi> though they're later in my day
19:04:05 <clarkb> #topic Actions from last meeting
19:04:13 <clarkb> #link http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-02-11-19.01.txt minutes from last meeting
19:04:20 <clarkb> There were no recorded actions from last meeting
19:04:55 <clarkb> #topic Priority Efforts
19:04:59 <clarkb> #topic OpenDev
19:05:03 <clarkb> Diving right in
19:05:20 <clarkb> I did update the openstack governance change after our discussion here last week
19:05:27 <clarkb> that should be good to go I think, and diablo_rojo has +1'd it
19:05:37 <clarkb> we are just waiting on TC review now?
19:06:23 <clarkb> #link https://review.opendev.org/#/c/705804/ Upgrade Gitea to 1.10.3
19:06:50 <clarkb> it would be good to review and land ^ if it is ready. I think keeping up with gitea is important so that we can deploy the commit cache as soon as it and we are both ready
19:07:08 <clarkb> I can help monitor that today if we get it landed
19:07:17 <clarkb> mordred: ^ anything new on that topic?
19:07:31 <mordred> nope. I can also help monitor. it's ready to go
19:07:55 <mordred> I have not yet updated the v1.11 patch with template changes, but it's also marked WIP
19:09:20 <clarkb> #topic Update Config Management
19:09:43 <clarkb> I don't think there has been much new here recently. Anyone have anything to add on this topic?
19:10:36 <mordred> no - I'm on a new tangent with gerrit now
19:10:51 <mordred> we need to redo the image build jobs because the bazel situation has gotten more complex than we can currently handle
19:11:04 <clarkb> this is what bazelisk would help with?
19:11:07 <mordred> luckily, corvus has already written the new stuff over in gerrit's gerrit
19:11:09 <mordred> yeah
19:11:23 <fungi> that'
19:11:30 <fungi> s convenient, i guess
19:11:33 <corvus> i'm going to propose some stuff to zuul-jobs today (i meant to yesterday but ran out of time)
19:11:44 <mordred> so the plan is to port in the stuff from that change series into zuul-jobs so we can make use of it - and maybe to add zuul/jobs from gerritreview so we can get the gerrit job directly
19:12:06 <mordred> then have the image build job just copy the locally built war into the image
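(The "copy the locally built war" step would amount to a one-liner in the image build; a hypothetical sketch, with paths assumed rather than taken from the actual job:)

    # Dockerfile fragment: install the war built earlier in the job
    COPY gerrit.war /var/gerrit/bin/gerrit.war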
19:12:50 <mordred> also - review-dev.openstack.org SHOULD be a decent redirect alias for review-dev.opendev.org but I'm getting the self-signed-cert warning and I haven't looked into that yet
19:13:26 <corvus> mordred: i just tried it and got redirected without warning -- do you have a hosts entry?
19:13:48 <mordred> I don't - but maybe my browser cached something?
19:13:54 <corvus> (do i have a hosts entry?  no i don't)
19:14:14 <mordred> yeah - I'm getting a cert without the alt name for review-dev.openstack.org
19:14:43 <mordred> but if it's working for you - maybe shrug?
19:14:56 <clarkb> the cert I see is for review-dev.opendev.org
19:15:04 <clarkb> and so hitting review-dev.openstack.org makes ff sad
19:15:14 <corvus> maybe i cached a redirect?
19:15:16 <mordred> yeah. that's what I'm seeing
19:15:18 <fungi> firefox gives me a really bizarre warning about the site not supporting encryption
19:15:28 <fungi> but then says it's verified by let's encrypt
19:15:35 <mordred> maybe the apache handler didn't fire
19:15:55 <fungi> yeah, i'd first see if apache wasn't reloaded or has a stale worker from before the reload
19:16:00 <ianw> fwiw i see what corvus sees, straight to opendev.org ...
19:16:13 <clarkb> ianw: corvus maybe you are starting with http not https
19:16:33 <ianw> ahh, yes
19:16:34 <corvus> oh probably
19:16:45 <mordred> yes - http://review-dev.openstack.org redirects cleanly
19:16:48 <corvus> yep
19:16:51 <mordred> https://review-dev.openstack.org doesn't
19:17:07 <fungi> on closer inspection, firefox reports "Error code: SSL_ERROR_BAD_CERT_DOMAIN"
19:17:22 <mordred> I'm betting there's an apache handler misfire
19:17:29 <fungi> Unable to communicate securely with peer: requested domain name does not match the server’s certificate.
19:18:29 <fungi> and yeah, i have subject altnames for review-dev.opendev.org and review-dev01.opendev.org with a cn of review-dev.opendev.org so no review-dev.openstack.org
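(For reference, the subject alternative names fungi describes can be inspected from any client with openssl; the hostnames below are the ones under discussion:)

    echo | openssl s_client -connect review-dev.openstack.org:443 \
         -servername review-dev.openstack.org 2>/dev/null \
      | openssl x509 -noout -text | grep -A1 'Subject Alternative Name'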
19:18:39 <mordred> anybody want to debug anything further other than trying a graceful on apache to see if it gets it?
19:18:56 <fungi> seems it was issued 2020-02-05 18:22:30 utc
19:19:08 <mordred> which is when the apache process last started
19:19:11 <clarkb> mordred: nope, but if you aren't getting the right name after a restart then double check the acme.sh logs
19:19:15 <fungi> er, that was 18:22:20 utc if it matters, i mistyped
19:19:25 <clarkb> it may not be validating review-dev.openstack.org or not even attempting to get that name
19:19:26 <mordred> oh - interesting
19:19:38 <mordred> we have a new .csr and new .conf - but not a new .cer
19:19:49 <mordred> the .cer on disk is from feb 5
19:20:06 <mordred> so maybe the LE code saw a .cer already and didn't get a new one?
19:20:19 <mordred> anyway - I guess we can troubleshoot this after the meeting :)
19:20:22 <clarkb> ya that could be a bug on our side too, where if the names list changes we don't catch that
19:20:23 <ianw> http://paste.openstack.org/show/789725/ is where it changed
19:20:24 <clarkb> ++
19:20:26 <clarkb> lets move on
19:20:29 <mordred> we're _really_ close to review-dev being solid
19:20:40 <clarkb> #topic General topics
19:21:14 <clarkb> a friendly reminder that I've been asked if we plan to attend the vancouver ptg
19:21:25 <clarkb> so far I've had a very lukewarm maybe from mordred :)
19:22:15 <clarkb> I have until the end of the month to respond. Please let me know if you would like to go, or think it likely that you will be going and will have infra work on your agenda
19:22:21 <fungi> oh, yep, i'll be around... is there somewhere to officially declare our availability?
19:22:39 <clarkb> fungi: no I was just keeping it informal. Would an etherpad poll like corvus did for zuul help?
19:22:46 <clarkb> if so I can send that out to the infra list after the meeting
19:22:49 <corvus> i plan on going
19:23:38 <fungi> yeah, i probably was just experiencing fosdem hangover and missed the opportunity to let you know i'd be present
19:23:58 <clarkb> ok so that's ~4-ish if mordred decides that planes are acceptable again :)
19:24:22 <clarkb> I think that is enough to put in for some time. Thanks
19:24:33 <clarkb> (and if others plan to be there still let me know as that helps with room sizing requests)
19:24:46 <corvus> vancouver is totally road-trippable.  mordred can pick me up on the way.
19:24:50 <clarkb> ha
19:25:12 <mordred> corvus: you probably won't fit in the car with the drysuits
19:25:24 <clarkb> mordred: what if he wears the drysuits
19:25:43 <corvus> how many would i have to wear?
19:25:51 <mordred> clarkb: that would be an uncomfortable car ride :)
19:26:09 <clarkb> Next up is trusty upgrades
19:26:24 <clarkb> I think we're still in a holding pattern for refstack (I need to follow up with the foundation side to see if they got any movement)
19:26:31 <clarkb> fungi: any progress on the wiki?
19:27:55 <fungi> nope
19:28:09 <clarkb> ianw: I think static.o.o has made progress. What is next?
19:28:33 <clarkb> I checked out tarballs.openstack.org using an /etc/hosts entry and it seemed to work with the new server
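(A minimal sketch of the /etc/hosts override clarkb describes for testing before the DNS switch; the IP below is a placeholder from the documentation range, not the real server address:)

    # point the old hostname at the new static server for testing
    203.0.113.10  tarballs.openstack.org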
19:29:05 <ianw> if everyone is happy that /afs/openstack.org/project/tarballs.opendev.org is keeping sync, then it's a matter of switching dns entries for the host
19:29:26 <ianw> then we can remove publishing to the static server
19:29:39 <ianw> i'm out on pto today, maybe tomorrow my time we could do this?
19:29:42 <clarkb> ianw: I guess double check recent releases (last 24 hours) for openstack and see that the artifacts ended up in afs too?
19:29:52 <clarkb> I don't think there is a rush. tomorrow should be fine
19:30:27 <ianw> oh, and a quick change to the zuul publishing jobs to move them from /afs/openstack.org/project/opendev.org/tarballs to /afs/openstack.org/project/tarballs.opendev.org
19:30:47 <ianw> and a sanity rsync
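(Roughly what that sanity rsync might look like, using the two AFS paths named above; -n makes it a dry run first — a sketch, not the actual command used:)

    rsync -avn /afs/openstack.org/project/opendev.org/tarballs/ \
          /afs/openstack.org/project/tarballs.opendev.org/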
19:31:06 <clarkb> ianw: on the agenda you'd noted that there were redirects and other publishing sites to do. At this point I imagine that is pretty cookiecutter? Is there anything we can help with?
19:31:24 <ianw> umm, well there's still the service-types site, i haven't looked at that
19:31:57 <ianw> and if we're still going with a haproxy for redirects then https://review.opendev.org/#/c/677903/
19:31:57 <clarkb> is that the thing that publishes a simple json blob?
19:32:09 <ianw> which has gone into merge conflict, i can update
19:32:14 <clarkb> mordred: ^ you probably know about service-types site if there is anything to keep in mind for that
19:32:40 <fungi> i do still wonder why we bother to install a separate haproxy to do something that an apache we're already running can do just as easily, but i'm not invested enough in that to argue further
19:33:14 <ianw> well that was the spec plan, but if we're having second thoughts i'm open to discussion
19:33:29 <clarkb> The main motivation was simpler config?
19:33:41 <clarkb> potentially making it easier for more than just openstack to set up something similar?
19:34:03 <fungi> i had second thoughts on the spec plan when it was proposed, but was willing to cede since others apparently felt it was easier
19:34:19 <ianw> maybe because it's falling out to fairly straightforward config files, not wrapped behind layers of puppet and templating, it remains simple enough
19:34:41 <clarkb> ya I think I'm still ok with it. We run it elsewhere and have reasonable experience with the tool
19:34:53 <clarkb> it is not like adding a completely new tool for something apache could do
19:34:56 <mordred> clarkb: service-types is just a static site - it's openstack specific - but it just wants to be there and doesn't change super frequently
19:35:02 <fungi> remember that a redirect only really needs an alias and a rewrite or redirect rule (which could be in a .htaccess file even)
19:35:30 <fungi> if we serve the redirects from the same server as the content, they could just be transparent rewrites too
19:35:30 <clarkb> fungi: it needs a full vhost too right?
19:35:46 <clarkb> I guess you're saying with an alias we can ahve a single vhost for all the redirects
19:35:47 <fungi> a redirect needs a vhost, but not its own dedicated vhost
19:35:57 <fungi> you can stick as many domains in a vhost as you like
19:36:15 <fungi> much like saying a redirect needs a full haproxy
19:36:37 <fungi> you don't need to install a separate haproxy for every redirect any more than you need a separate apache vhost for each redirect
19:36:40 <clarkb> right I'm just trying to think about what it looks like written down in config because that is where people will get lost if we try to make it easy for them
19:37:20 <clarkb> and having a single vhost with a bunch of aliases and redirect rules probably isn't that bad.
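(A minimal sketch of that single-vhost approach in apache; hostnames are illustrative and the rewrite mapping would repeat per legacy site:)

    <VirtualHost *:443>
        ServerName  tarballs.openstack.org
        ServerAlias service-types.openstack.org
        RewriteEngine On
        # map each legacy hostname to its new home
        RewriteCond %{HTTP_HOST} =tarballs.openstack.org
        RewriteRule ^/(.*)$ https://tarballs.opendev.org/$1 [R=301,L]
    </VirtualHost>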
19:37:30 <ianw> note that in terms of what it looks like in haproxy, that is already worked on in https://review.opendev.org/#/c/678159/13/playbooks/roles/service-lb/templates/haproxy.cfg.j2
19:37:44 <fungi> i honestly have no idea what an haproxy redirect config looks like but i have a hard time imagining it being particularly more complex than an apache redirect or rewrite
19:37:45 <clarkb> oh right is it ssl that changes that?
19:37:55 <fungi> ahh, thanks for the example
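(For comparison, a redirect in haproxy config looks roughly like the following — a paraphrase in the spirit of the linked change, hostnames illustrative:)

    frontend redirects
        bind *:80
        acl host_tarballs hdr(host) -i tarballs.openstack.org
        http-request redirect prefix https://tarballs.opendev.org code 301 if host_tarballs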
19:38:06 <clarkb> it forces us to do a single cert for all the names or split vhosts iirc
19:39:11 <fungi> don't we do that with letsencrypt anyway?
19:39:34 <clarkb> fungi: ya, I'm thinking of the theoretical future when it's more than just static.o.o sites doing that (potentially)
19:39:37 <clarkb> for now I think it's fine
19:39:58 <clarkb> it's possible that zuul and openstack won't want to share ssl certs
19:40:05 <clarkb> (but maybe that is also ok)
19:40:06 <fungi> also i don't see where the haproxy example tells it what ssl certs to use for which sites either
19:40:39 <ianw> none of those existing sites have any ssl
19:40:48 <ianw> so it's not covered there
19:40:48 <fungi> the amount of boilerplate in the haproxy config is about the same as the amount of boilerplate for an apache vhost anyway
19:40:50 <clarkb> ah that explains it
19:41:09 <clarkb> we are only doing http to https redirection so the ssl concern is not really applicable (for now)
19:41:55 <clarkb> I'd like to keep moving so that we have time to discuss pip and virtualenv
19:42:10 <clarkb> we should be able to sort through this further in -infra and in review
19:42:15 <fungi> given these are mostly for sites where we already run apache, wouldn't incorporating them into the apache config for the sites to which they're redirecting actually be simpler anyway?
19:42:26 <fungi> but yeah, we can defer discussion to later
19:42:48 <clarkb> next up was a quick update on the new airship citycloud cloud resources
19:43:06 <clarkb> I've got grafana sorted out there and it seems to largely be happy. I can't tell if roman_g has managed to successfully use the larger resources yet though
19:43:19 <clarkb> Last I saw the cloud was unable to find hypervisors to schedule to.
19:43:33 <clarkb> Mostly a heads up that you may have to double check nodepool logs for nova errors if there are subsequent failures
19:43:40 <clarkb> And that takes us to pip and virtualenv
19:43:54 <clarkb> #link https://review.opendev.org/707499 pip and virtualenv discussion in comments
19:44:00 <clarkb> #link https://review.opendev.org/707513 use venv for glean
19:44:07 <clarkb> #link https://review.opendev.org/707750 use venv in project-config elements; drop pip-and-virtualenv inclusion from element and move to individual configs; add node type with no pip-and-virtualenv.  this can be used for job testing.
19:44:43 <clarkb> #link https://etherpad.openstack.org/p/pTFF4U9Klz Captures the problem and brainstorming around it
19:45:17 <clarkb> ianw has suggested an idea that I think mordred and I are fans of
19:45:39 <clarkb> we can use python3 -m venv consistently on all the images we build to install python utilities like glean, os-testr, and bindep
19:45:53 <mordred> ++
19:45:55 <clarkb> that avoids having a system level pip and virtualenv context
19:46:15 <clarkb> Then at a job level we can run get-pip.py or system package pip installation depending on what the job context demands
19:46:23 <mordred> yeah
19:46:38 <clarkb> this will require some changes to assumptions in our jobs, but we should be able to mask over that with base job updates
19:46:59 <clarkb> and then slowly peel back (I don't actually know what that looks like but it should be possible to do this without too much breakage and pain)
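(A minimal sketch of the two layers of the plan described above; the tool names are from the discussion, the paths are hypothetical:)

    # image build time: each utility in its own venv, no system pip/virtualenv
    python3 -m venv /usr/glean-venv
    /usr/glean-venv/bin/pip install glean
    ln -s /usr/glean-venv/bin/glean /usr/local/bin/glean

    # job run time, only where a job needs a system-wide pip:
    curl -sSO https://bootstrap.pypa.io/get-pip.py
    sudo python3 get-pip.py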
19:47:30 <fungi> i doubt it will be as painful and drawn out as the thick images to bindep transition
19:47:49 <clarkb> it would be helpful if others can review the notes and changes and point out any flaws with the ideas presented
19:48:07 <ianw> note 707750 keeps the images as is, but adds a ubuntu-bionic-plain (or something) that we can use as a test node type
19:48:09 <clarkb> but then I'm thinking we may want to formalize this into a spec so that we can communicate it relatively easily with our broad userbase
19:48:44 <fungi> and yeah, the biggest hole in current use cases is that there's no way to avoid pip installing pip system-wide if you want to use pip to install other packages system-wide
19:49:30 <clarkb> that way we can present the problem and plan succinctly in one place that gets sent out on the mailing list
19:49:33 <fungi> pip in a virtualenv/venv can be used to install locally just fine though, and virtualenv in a venv or another virtualenv also works as we would want
19:50:12 <clarkb> fungi: ya and I think that also reflects that "always install python from source" is becoming less important for the majority of our userbase
19:50:44 <clarkb> and behaving more similarly to distro published images is becoming more important
19:50:48 <fungi> you can also use python3 venv to install virtualenv, link that in your execution path, then use that to create a python2.7 virtualenv to install tox in, if you need tox defaulting to python2.7
19:51:29 <ianw> fungi: yep ... but that should definitely be in a base job, not in image build anyway
19:51:36 <fungi> agreed
19:51:50 <ianw> so we can definitely explore those avenues
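(fungi's python2.7 tox recipe above, as a sketch with hypothetical paths:)

    python3 -m venv ~/venv                          # venv-installed virtualenv
    ~/venv/bin/pip install virtualenv
    ~/venv/bin/virtualenv -p python2.7 ~/tox-env    # a python2.7 virtualenv
    ~/tox-env/bin/pip install tox                   # tox defaulting to python2.7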
19:52:15 <clarkb> as for writing the spec I expect I can put time into that on friday. I guess let ianw and mordred and me know if there are concerns with what is up so far, and if not I/we can start writing based on that?
19:52:20 <clarkb> ianw: mordred ^ does that make sense to you
19:52:33 <mordred> yah
19:52:38 <ianw> ++
19:53:03 <clarkb> Great that gives us a path to move forward on
19:53:24 <clarkb> The last topic we have was put up by zbr around third party ci logging recommendations
19:53:36 <clarkb> we actually discussed this yesterday and this change was a result
19:53:38 <clarkb> #link https://review.opendev.org/#/c/708323/ improvements landing here
19:53:53 <zbr> already sorted the guidelines, and thanks for the help.
19:54:08 <clarkb> it basically says that compressing files is ok as they get large (because browsers can't render them effectively in many cases), but we should try to ensure important logs are available without extra effort
19:54:30 <clarkb> and that gives us ~5 minutes for anything else
19:54:33 <clarkb> #topic Open Discussion
19:54:48 <zbr> I still have one question for open discussion: I want to improve testing of zuul-jobs roles with centos-8
19:55:24 <zbr> based on the current setup, this would require adding extra jobs; if I have your support I will start doing that.
19:55:33 <corvus> zbr: have you read https://zuul-ci.org/docs/zuul-jobs/policy.html#testing ?
19:55:46 <corvus> especially the last paragraph
19:56:00 <corvus> and the "all-platforms" tag
19:56:01 <clarkb> I have no objection to adding more jobs to cover zuul-jobs more effectively
19:56:13 <zbr> ahh, thanks for the hint.
19:56:23 <corvus> yes, i also have no objection, and ^ can hopefully help do so efficiently
19:56:57 <corvus> (if we just need to test something specifically on centos-8, that's fine, but most of those cases are also probably worth testing on all the platforms, so the tag might be best)
19:57:17 <clarkb> agreed particularly with the python switch between 7 and 8
19:57:33 <clarkb> that has the potential to have broad impact and 7 tests won't catch those issues
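(What an extra centos-8 test job might look like in zuul configuration; the job and nodeset names here are hypothetical, and the all-platforms tag corvus mentions can generate such variants automatically per the linked policy:)

    - job:
        name: zuul-jobs-test-ensure-tox-centos-8
        parent: zuul-jobs-test-ensure-tox
        nodeset: centos-8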
19:57:46 <zbr> creating jobs is not a goal by itself, i will follow these guidelines.
19:59:02 <clarkb> Sounds like that may be it. Thank you everyone
19:59:06 <clarkb> we'll see you here next week
19:59:10 <fungi> thanks clarkb!
19:59:10 <clarkb> #endmeeting