19:01:25 #startmeeting infra
19:01:25 Meeting started Tue Feb 18 19:01:25 2020 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:26 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:28 The meeting name has been set to 'infra'
19:01:35 #link http://lists.openstack.org/pipermail/openstack-infra/2020-February/006601.html Our Agenda
19:01:58 We've got a fairly large agenda today so we may push forward through some topics. In particular I'd like to make sure we have time to discuss pip and virtualenv
19:02:07 #topic Announcements
19:02:32 I've got early meetings for half the day the next two days. I don't expect to be properly around during that
19:02:49 o/
19:03:34 same for me
19:03:44 though they're later in my day
19:04:05 #topic Actions from last meeting
19:04:13 #link http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-02-11-19.01.txt minutes from last meeting
19:04:20 There were no recorded actions from last meeting
19:04:55 #topic Priority Efforts
19:04:59 #topic OpenDev
19:05:03 Diving right in
19:05:20 I did update the openstack governance change after our discussion here last week
19:05:27 that should be good to go I think and diablo_rojo has +1'd it
19:05:37 we are just waiting on TC review now?
19:06:23 #link https://review.opendev.org/#/c/705804/ Upgrade Gitea to 1.10.3
19:06:50 it would be good to review and land ^ if it is ready. I think keeping up with gitea is important so that we can deploy the commit cache as soon as it and we are ready
19:07:08 I can help monitor that today if we get it landed
19:07:17 mordred: ^ anything new on that topic?
19:07:31 nope. I can also help monitor. it's ready to go
19:07:55 I have not yet updated the v1.11 patch with template changes, but it's also marked WIP
19:09:20 #topic Update Config Management
19:09:43 I don't think there has been much new here recently. Anyone have anything to add on this topic?
19:10:36 no - I'm on a new tangent with gerrit now
19:10:51 we need to redo the image build jobs because the bazel sitch has gotten more complex than we can currently handle
19:11:04 this is what bazelisk would help with?
19:11:07 luckily, corvus has already written the new stuff over in gerrit's gerrit
19:11:09 yeah
19:11:23 that'
19:11:30 s convenient, i guess
19:11:33 i'm going to propose some stuff to zuul-jobs today (i meant to yesterday but ran out of time)
19:11:44 so the plan is to port in the stuff from that change series into zuul-jobs so we can make use of it - and maybe to add zuul/jobs from gerritreview so we can get the gerrit job directly
19:12:06 then have the image build job just copy the locally built war into the image
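
A rough sketch of the build step being described; the bazelisk download method and paths below are illustrative assumptions, not necessarily what the zuul-jobs changes will do:

    # Fetch bazelisk, which picks the Bazel version from the checked-out
    # tree's .bazelversion file, then build the Gerrit release war.
    curl -Lo /usr/local/bin/bazelisk \
        https://github.com/bazelbuild/bazelisk/releases/latest/download/bazelisk-linux-amd64
    chmod +x /usr/local/bin/bazelisk

    cd gerrit
    bazelisk build release

    # The image build job would then copy the locally built war into the
    # image context rather than building inside the Dockerfile.
    cp bazel-bin/release.war /path/to/image/context/
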
19:12:50 also - review-dev.openstack.org SHOULD be a decent redirect alias for review-dev.opendev.org but I'm getting the self-signed-cert warning and I haven't looked into that yet
19:13:26 mordred: i just tried it and got redirected without warning -- do you have a hosts entry?
19:13:48 I don't - but maybe my browser cached something?
19:13:54 (do i have a hosts entry? no i don't)
19:14:14 yeah - I'm getting a cert without the alt name for review-dev.openstack.org
19:14:43 but if it's working for you - maybe shrug?
19:14:56 the cert I see is for review-dev.opendev.org
19:15:04 and so hitting review-dev.openstack.org makes ff sad
19:15:14 maybe i cached a redirect?
19:15:16 yeah. that's what I'm seeing
19:15:18 firefox gives me a really bizarre warning about the site not supporting encryption
19:15:28 but then says it's verified by let's encrypt
19:15:35 maybe the apache handler didn't fire
19:15:55 yeah, i'd first see if apache wasn't reloaded or has a stale worker from before the reload
19:16:00 fwiw i see what corvus sees, straight to opendev.org ...
19:16:13 ianw: corvus maybe you are starting with http not https
19:16:33 ahh, yes
19:16:34 oh probably
19:16:45 yes - http://review-dev.openstack.org redirects cleanly
19:16:48 yep
19:16:51 https://review-dev.openstack.org doesn't
19:17:07 on closer inspection, firefox reports "Error code: SSL_ERROR_BAD_CERT_DOMAIN"
19:17:22 I'm betting there's an apache handler misfire
19:17:29 Unable to communicate securely with peer: requested domain name does not match the server’s certificate.
19:18:29 and yeah, i have subject altnames for review-dev.opendev.org and review-dev01.opendev.org with a cn of review-dev.opendev.org so no review-dev.openstack.org
19:18:39 anybody want to debug anything further other than trying a graceful on apache to see if it gets it?
19:18:56 seems it was issued 2020-02-05 18:22:30 utc
19:19:08 which is when the apache process last started
19:19:11 mordred: nope, but if you aren't getting the right name after a restart then double check the acme.sh logs
19:19:15 er, that was 18:22:20 utc if it matters, i mistyped
19:19:25 it may not be validating review-dev.openstack.org or not even attempting to get that name
19:19:26 oh - interesting
19:19:38 we have a new .csr and new .conf - but not a new .cer
19:19:49 the .cer on disk is from feb 5
19:20:06 so maybe the LE code saw a .cer already and didn't get a new one?
19:20:19 anyway - I guess we can troubleshoot this after the meeting :)
19:20:22 ya that could be a bug on our side too where if the names list changes we don't catch that
19:20:23 http://paste.openstack.org/show/789725/ is where it changed
19:20:24 ++
19:20:26 let's move on
19:20:29 we're _really_ close to review-dev being solid
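
For anyone troubleshooting this after the meeting, one quick way to confirm which names are actually on the certificate the server is presenting (assuming openssl is available wherever the check is run):

    # Print the subjectAltName list from the cert served for that name.
    echo | openssl s_client -connect review-dev.openstack.org:443 \
        -servername review-dev.openstack.org 2>/dev/null \
      | openssl x509 -noout -text | grep -A1 'Subject Alternative Name'
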
19:20:40 #topic General topics
19:21:14 a friendly reminder that I've been asked if we plan to attend the vancouver ptg
19:21:25 so far I've had a very lukewarm maybe from mordred :)
19:22:15 I have until the end of the month to respond. Please let me know if you would like to go or think it likely that you will be going and working on infra things is on your agenda
19:22:21 oh, yep, i'll be around... is there somewhere to officially declare our availability?
19:22:39 fungi: no I was just keeping it informal. Would an etherpad poll like corvus did for zuul help?
19:22:46 if so I can send that out to the infra list after the meeting
19:22:49 i plan on going
19:23:38 yeah, i probably was just experiencing fosdem hangover and missed the opportunity to let you know i'd be present
19:23:58 ok so that's ~4 ish if mordred decides that planes are acceptable again :)
19:24:22 I think that is enough to put in for some time. Thanks
19:24:33 (and if others plan to be there still let me know as that helps with room sizing requests)
19:24:46 vancouver is totally road-trippable. mordred can pick me up on the way.
19:24:50 ha
19:25:12 corvus: you probably won't fit in the car with the drysuits
19:25:24 mordred: what if he wears the drysuits
19:25:43 how many would i have to wear?
19:25:51 clarkb: that would be an uncomfortable car ride :)
19:26:09 Next up is trusty upgrades
19:26:24 I think we're still in a holding pattern for refstack (I need to follow up with the foundation side to see if they got any movement)
19:26:31 fungi: any progress on the wiki?
19:27:55 nope
19:28:09 ianw: I think static.o.o has made progress. What is next?
19:28:33 I checked out tarballs.openstack.org using an /etc/hosts entry and it seemed to work with the new server
19:29:05 if everyone is happy that /afs/openstack.org/project/tarballs.opendev.org is keeping in sync, then it's a matter of switching dns entries for the host
19:29:26 then we can remove publishing to the static server
19:29:39 i'm out on pto today, maybe my tomorrow we could do this?
19:29:42 ianw: I guess double check recent releases (last 24 hours) for openstack and see that the artifacts ended up in afs too?
19:29:52 I don't think there is a rush. tomorrow should be fine
19:30:27 oh, and a quick change to the zuul publishing jobs to move them from /afs/openstack.org/project/opendev.org/tarballs to /afs/openstack.org/project/tarballs.opendev.org
19:30:47 and a sanity rsync
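
The sanity rsync mentioned there would presumably look something like this (the flags are a guess; a dry run first seems prudent):

    # Dry run: compare the old and new AFS publishing locations before
    # flipping DNS, then drop -n to copy anything that is missing.
    rsync -avn /afs/openstack.org/project/opendev.org/tarballs/ \
               /afs/openstack.org/project/tarballs.opendev.org/
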
19:31:06 ianw: on the agenda you'd noted that there were redirects and other publishing sites to do. At this point I imagine that is pretty cookiecutter? Is there anything we can help with?
19:31:24 umm, well there's still the service-types site, i haven't looked at that
19:31:57 and if we're still going with a haproxy for redirects then https://review.opendev.org/#/c/677903/
19:31:57 is that the thing that publishes a simple json blob?
19:32:09 which has gone into merge conflict, i can update
19:32:14 mordred: ^ you probably know about service-types site if there is anything to keep in mind for that
19:32:40 i do still wonder why we bother to install a separate haproxy to do something that an apache we're already running can do just as easily, but i'm not vested enough in that to argue further
19:33:14 well that was the spec plan, but if we're having second thoughts i'm open to discussion
19:33:29 The main motivation was simpler config?
19:33:41 making it easier for more than just openstack to set up something similar, potentially?
19:34:03 i had second thoughts on the spec plan when it was proposed, but was willing to cede since others apparently felt it was easier
19:34:19 maybe since it's falling out to fairly straightforward config files, not wrapped behind layers of puppet and templating, it remains simple enough
19:34:41 ya I think I'm still ok with it. We run it elsewhere and have reasonable experience with the tool
19:34:53 it is not like adding a completely new tool for something apache could do
19:34:56 clarkb: service-types is just a static site - it's openstack specific - but it just wants to be there and doesn't change super frequently
19:35:02 remember that a redirect only really needs an alias and a rewrite or redirect rule (which could be in a .htaccess file even)
19:35:30 if we serve the redirects from the same server as the content, they could just be transparent rewrites too
19:35:30 fungi: it needs a full vhost too right?
19:35:46 I guess you're saying with an alias we can have a single vhost for all the redirects
19:35:47 a redirect needs a full vhost, but not its own vhost
19:35:57 you can stick as many domains in a vhost as you like
19:36:15 much like saying a redirect needs a full haproxy
19:36:37 you don't need to install a separate haproxy for every redirect any more than you need a separate apache vhost for each redirect
19:36:40 right I'm just trying to think about what it looks like written down in config because that is where people will get lost if we try to make it easy for them
19:37:20 and having a single vhost with a bunch of aliases and redirect rules probably isn't that bad.
19:37:30 note that in terms of what it looks like in haproxy, that is already worked on in https://review.opendev.org/#/c/678159/13/playbooks/roles/service-lb/templates/haproxy.cfg.j2
19:37:44 i honestly have no idea what an haproxy redirect config looks like but i have a hard time imagining it being particularly more complex than an apache redirect or rewrite
19:37:45 oh right is it ssl that changes that?
19:37:55 ahh, thanks for the example
19:38:06 it forces us to do a single cert for all the names or split vhosts iirc
19:39:11 don't we do that with letsencrypt anyway?
19:39:34 fungi: ya, I'm thinking in the theoretical future when it's more than just static.o.o sites doing that (potentially)
19:39:37 for now I think it's fine
19:39:58 it's possible that zuul and openstack won't want to share ssl certs
19:40:05 (but maybe that is also ok)
19:40:06 also i don't see where the haproxy example tells it what ssl certs to use for which sites either
19:40:39 none of those existing sites have any ssl
19:40:48 so it's not covered there
19:40:48 the amount of boilerplate in the haproxy config is about the same as the amount of boilerplate for an apache vhost
19:40:50 ah that explains it
19:41:09 we are only doing http to https redirection so the ssl concern is not really applicable (for now)
19:41:55 I'd like to keep moving so that we have time to discuss pip and virtualenv
19:42:10 we should be able to sort through this further in -infra and in review
19:42:15 given these are mostly for sites where we already run apache, wouldn't incorporating them into the apache config for the sites to which they're redirecting actually be simpler anyway?
19:42:26 but yeah, we can defer discussion to later
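
For comparison with the haproxy template linked above, the Apache form being described could be roughly this small (the hostnames here are hypothetical placeholders, not the actual redirect list):

    <VirtualHost *:80>
        ServerName old-name.openstack.org
        # Any number of additional redirected names can share this vhost.
        ServerAlias another-old-name.openstack.org
        # A single rule sends every request on to the new https home.
        Redirect permanent / https://new-name.opendev.org/
    </VirtualHost>
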
19:42:48 next up was a quick update on the new airship citycloud cloud resources
19:43:06 I've got grafana sorted out there and it seems to largely be happy. I can't tell if roman_g has managed to successfully use the larger resources yet though
19:43:19 Last I saw the cloud was unable to find hypervisors to schedule to.
19:43:33 Mostly a heads up that you may have to double check nodepool logs for nova errors if there are subsequent failures
19:43:40 And that takes us to pip and virtualenv
19:43:54 #link https://review.opendev.org/707499 pip and virtualenv discussion in comments
19:44:00 #link https://review.opendev.org/707513 use venv for glean
19:44:07 #link https://review.opendev.org/707750 use venv in project-config elements; drop pip-and-virtualenv inclusion from element and move to individual configs; add node type with no pip-and-virtualenv. this can be used for job testing.
19:44:43 #link https://etherpad.openstack.org/p/pTFF4U9Klz Captures the problem and brainstorming around it
19:45:17 ianw has suggested an idea that I think mordred and I are fans of
19:45:39 we can use python3 -m venv consistently on all the images we build to install python utilities like glean, os-testr, and bindep
19:45:53 ++
19:45:55 that avoids having a system level pip and virtualenv context
19:46:15 Then at a job level we can run get-pip.py or system package pip installation depending on what the job context demands
19:46:23 yeah
19:46:38 this will require some changes to assumptions in our jobs, but we should be able to mask over that with base job updates
19:46:59 and then slowly peel back (I don't actually know what that looks like but it should be possible to do this without too much breakage and pain)
19:47:30 i doubt it will be as painful and drawn out as the thick images to bindep transition
19:47:49 it would be helpful if others can review the notes and changes and point out any flaws with the ideas presented
19:48:07 note 707750 keeps the images as is, but adds an ubuntu-bionic-plain (or something) that we can use as a test node type
19:48:09 but then I'm thinking we may want to formalize this into a spec so that we can communicate it relatively easily with our broad userbase
19:48:44 and yeah, the biggest hole in current use cases is that there's no way to avoid pip installing pip system-wide if you want to use pip to install other packages system-wide
19:49:30 that way we can present the problem and plan succinctly in one place that gets sent out on the mailing list
19:49:33 pip in a virtualenv/venv can be used to install --local just fine though, and virtualenv in a venv or another virtualenv also works as we would want
19:50:12 fungi: ya and I think that also reflects that "always install python from source" is becoming less important for the majority of our userbase
19:50:44 and behaving more similarly to distro published images is becoming more important
19:50:48 you can also use python3 venv to install virtualenv, link that in your execution path, then use that to create a python2.7 virtualenv to install tox in, if you need tox defaulting to python2.7
19:51:29 fungi: yep ... but that should definitely be in a base job, not in image build anyway
19:51:36 agreed
19:51:50 so we can definitely explore those avenues
19:52:15 as for writing the spec I expect I can put time into that on friday. I guess let ianw, mordred, and me know if there are concerns with what is up so far and if not I/we can start writing based on that?
19:52:20 ianw: mordred ^ does that make sense to you
19:52:33 yah
19:52:38 ++
19:53:03 Great that gives us a path to move forward on
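
A minimal sketch of the image-side half of that plan; the venv path, symlinks, and package list below are placeholders rather than what the element will actually contain:

    # Install the utilities the images need into an isolated venv so no
    # system-wide pip or virtualenv is created at image build time.
    python3 -m venv /opt/infra-tools
    /opt/infra-tools/bin/pip install glean bindep os-testr

    # Expose just the entry points on the default PATH.
    ln -s /opt/infra-tools/bin/glean /usr/local/bin/glean
    ln -s /opt/infra-tools/bin/bindep /usr/local/bin/bindep

    # Jobs that genuinely need a system-wide pip handle that themselves,
    # e.g. with get-pip.py or the distro python3-pip package.
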
19:53:24 The last topic we have was put up by zbr around third party ci logging recommendations
19:53:36 we actually discussed this yesterday and this change was a result
19:53:38 #link https://review.opendev.org/#/c/708323/ improvements landing here
19:53:53 already sorted the guidelines, and thanks for the help.
19:54:08 it basically says that compressing files is ok as they get large (because browsers can't render them effectively in many cases), but we should try to ensure important logs are available without extra effort
19:54:30 and that gives us ~5 minutes for anything else
19:54:33 #topic Open Discussion
19:54:48 i still have one question for the free topic, i want to improve testing of zuul-jobs roles to test them with centos-8
19:55:24 based on the current setup, this would require adding extra jobs, if I have your support I will start doing that.
19:55:33 zbr: have you read https://zuul-ci.org/docs/zuul-jobs/policy.html#testing ?
19:55:46 especially the last paragraph
19:56:00 and the "all-platforms" tag
19:56:01 I have no objection to adding more jobs to cover zuul-jobs more effectively
19:56:13 ahh, thanks for the hint.
19:56:23 yes, i also have no objection, and ^ can hopefully help do so efficiently
19:56:57 (if we just need to test something specifically on centos-8, that's fine, but most of those cases are also probably worth testing on all the platforms, so the tag might be best)
19:57:17 agreed particularly with the python switch between 7 and 8
19:57:33 that has the potential to have broad impact and 7 tests won't catch those issues
19:57:46 creating jobs is not a goal by itself, i will follow these guidelines.
19:59:02 Sounds like that may be it. Thank you everyone
19:59:06 we'll see you here next week
19:59:10 thanks clarkb!
19:59:10 #endmeeting