19:05:23 <fungi> #startmeeting infra
19:05:24 <openstack> Meeting started Tue Nov 19 19:05:23 2019 UTC and is due to finish in 60 minutes.  The chair is fungi. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:05:25 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:05:27 <openstack> The meeting name has been set to 'infra'
19:06:23 <fungi> #link http://lists.openstack.org/pipermail/openstack-infra/2019-November/006518.html agenda
19:06:38 * corvus hopes mordred is around for a gerrit update (gratuitous ping)
19:06:44 <fungi> #topic announcements
19:07:12 <fungi> clarkb is indisposed, so you get my retro meeting treatment
19:07:42 * corvus hopes this doesn't mean clarkb has a cavity
19:07:52 <fungi> my brain has a cavity
19:08:08 <fungi> anyway, i've been apprised of no announcements
19:08:19 <fungi> anyone have something?
19:08:45 <clarkb> I don't, but the dentist was slow getting to the exam portion of the visit
19:08:46 <fungi> i'm taking your silence as a tacit declination
19:08:56 <fungi> #topic Actions from last meeting
19:09:09 <fungi> #link http://eavesdrop.openstack.org/meetings/infra/2019/infra.2019-11-12-19.01.html minutes from last meeting
19:09:32 <fungi> Action items: "1. (none)"
19:09:37 <fungi> seems we can move on
19:09:50 <fungi> #topic Specs approval
19:10:27 <fungi> no specs up for approval which i can see
19:11:08 <fungi> #topic Priority Efforts: A Task Tracker for OpenStack
19:11:23 <fungi> #link http://specs.openstack.org/openstack-infra/infra-specs/specs/task-tracker.html A Task Tracker for OpenStack
19:11:41 <fungi> SotK has pushed up a refresh of the attachments work
19:12:40 <SotK> it uses openstacksdk now and also doesn't leak auth tokens
19:12:45 <fungi> #link https://review.opendev.org/#/q/topic:story-attachments+status:open
19:13:21 <fungi> ended up using tempurls right?
19:13:49 <fungi> rather than formpost
19:14:57 <SotK> indeed, I couldn't get the formpost signature that openstacksdk produced to work, and using tempurls made it easier to name the objects something that wasn't their filename
19:15:20 <fungi> mordred: ^ you may also have some input there
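
(For anyone following along: Swift tempurls work by signing the request method, expiry time, and object path with a per-account temp-url key, so the service can hand out a time-limited download link without exposing credentials. Below is a minimal sketch of the signing step; the key, account, and object path are made-up placeholders, not anything from the actual StoryBoard change.)

    import hmac
    import time
    from hashlib import sha1
    from urllib.parse import quote

    # Placeholder values for illustration only.
    key = b"secret-temp-url-key"   # the X-Account-Meta-Temp-URL-Key set on the account
    method = "GET"
    expires = int(time.time()) + 3600   # link valid for one hour
    path = "/v1/AUTH_example/attachments/00000042-attachment"

    # Swift's tempurl middleware validates an HMAC-SHA1 over "METHOD\nEXPIRES\nPATH".
    signature = hmac.new(key, f"{method}\n{expires}\n{path}".encode(), sha1).hexdigest()

    url = (f"https://swift.example.com{quote(path)}"
           f"?temp_url_sig={signature}&temp_url_expires={expires}")
    print(url)

(Because the object path is chosen by the service rather than taken from the upload, the stored object name does not have to match the user-supplied filename, which is the property SotK mentions above.)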
19:16:17 <fungi> #topic Priority Efforts: Update Config Management
19:16:22 <fungi> #link http://specs.openstack.org/openstack-infra/infra-specs/specs/update-config-management.html Update Config Management
19:17:01 <fungi> i didn't make any further progress on the mediawiki puppetry update this week
19:17:30 <fungi> though i think i keep conflating this with the trusty upgrades
19:17:59 <fungi> there is work in progress to letsencrypt-ify the gitea deployment
19:18:34 <fungi> oh, though that's listed under the opendev topic already
19:19:14 <fungi> i guess the main activity under this actual topic currently is mordred's dockerification of our gerrit deployment
19:20:08 <fungi> but it seems like he's probably not around at the moment so we can circle back to this at the end if we get time
19:20:46 <fungi> #topic Priority Efforts: Zuul as CD engine
19:21:24 <fungi> the bullet depth on the agenda is not entirely clear
19:22:12 <fungi> i'm not aware of any new developments on this front in the past few weeks
19:23:14 <clarkb> probably the only recent update was the discussion we had at the ptg
19:23:29 <clarkb> and how it's the project ssh keys that give us trouble, which we need to sort out in zuul somehow?
19:23:31 <fungi> yeah, which has been summarized on the ml
19:24:11 <fungi> oh, or did that make it into the summary?
19:24:23 <fungi> i guess that was in the zuul-related bits you elided
19:26:06 <fungi> anyway, in short, running jobs for different projects to connect to the same resource is challenging, and mordred (was it mordred?) suggested sort of proxying them by configuring jobs in one project to essentially be triggered by events from another project
19:26:30 <clarkb> yup taking advantage of the proposed http trigger
19:26:45 <corvus> yeah; i think that made it into the etherpad notes
19:26:48 <clarkb> or an internal zuul event trigger system that could be added
19:26:48 <fungi> that would require a new zuul feature, but seemed non-contentious once we talked through some of the logistics
19:27:41 <fungi> or at least a slight extension to an existing feature
19:28:16 <fungi> anyway, that's more of a zuul-side discussion we probably need to restart with the broader community now that we're not in shanghai
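
(Context for readers: Zuul already ships a built-in "zuul" trigger driver that enqueues items in response to events Zuul itself emits, such as a change merging; the idea discussed above would extend that style of mechanism across project boundaries, via the proposed http trigger or a new internal event trigger. The pipeline below is a rough, hypothetical sketch using only the existing driver; the name and description are made up and this is not our actual configuration.)

    # Hypothetical pipeline definition for illustration only.
    - pipeline:
        name: deploy
        description: Run follow-up jobs when zuul reports a change merged.
        manager: independent
        trigger:
          zuul:
            - event: project-change-merged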
19:28:27 <fungi> #topic Priority Efforts: OpenDev (LetsEncrypt issued cert for opendev.org)
19:28:31 <fungi> #link https://review.opendev.org/694181 LetsEncrypt issued cert for opendev.org
19:28:43 <fungi> clarkb has been hacking on this
19:29:05 <fungi> the manually-acquired opendev.org ssl cert is due to expire in ~10 days
19:29:35 <ianw> i think it's right to go?  the sticking point yesterday was failing mirror jobs, which we ended up tracking down to afs server being offline
19:29:40 <fungi> but the implementation there seems to be working and ready for review, allowing us to switch to letsencrypt for those
19:30:03 <clarkb> ianw: ya I've rechecked the change
19:30:18 <clarkb> if it comes back +1 I think we can go ahead and have gitea01 try it
19:30:39 <fungi> so yeah, not super urgent, but it would be good to land in the next few days so we have a week to make sure it's working before we end up having to buy a new cert at the last minute
19:31:05 <clarkb> ++ I should be able to approve and monitor the gitea01 change today
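
(For anyone unfamiliar with the flow: the change replaces the purchased opendev.org cert with one issued by Let's Encrypt using the ACME DNS-01 challenge, driven by our existing ansible/acme roles. The command below is only a rough manual equivalent with certbot to show the shape of DNS-01 issuance; it is not what the automation actually runs.)

    # Illustrative manual DNS-01 issuance; the real deployment is ansible-driven.
    certbot certonly --manual --preferred-challenges dns -d opendev.org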
19:31:38 <fungi> #topic Priority Efforts: OpenDev (Possible gitea/go-git bug in current version of gitea we are running)
19:31:56 <fungi> #link https://storyboard.openstack.org/#!/story/2006849 Possible gitea/go-git bug in current version of gitea we are running
19:32:07 <fungi> ianw has been trying to sort this out with tonyb
19:32:38 <fungi> strange disconnects on the server side and client-side errors during git fetch for openstack/nova
19:32:48 <ianw> no smoking gun, but upstream has asked us to upgrade to 1.9.6
19:33:01 <ianw> #link https://review.opendev.org/694894
19:33:02 <fungi> at first we thought it was isolated to gitea06 but then identified similar behavior with gitea01
19:33:19 <ianw> if there are still issues with that ... well ... it's a one way trip to debug town i guess
19:33:23 <corvus> does it persist if fetching directly from the backends, not going through the lb?
19:33:34 <clarkb> corvus: yes
19:34:12 <ianw> #link https://github.com/go-gitea/gitea/issues/9006
19:34:23 <fungi> load balancer shenanigans were also an early suspicion, but got ruled out, yeah
19:36:49 <fungi> #topic Trusty Upgrade Progress: Wiki updates
19:37:00 <fungi> this one's on me, i guess
19:37:20 <fungi> no progress since shanghai, should be able to pick it back up again this week
19:37:41 <fungi> #topic static.openstack.org
19:37:58 <fungi> according to the agenda text, "Ajaeger started writing some changes"
19:38:25 <fungi> #link https://review.opendev.org/#/q/topic:static-services+status:open
19:38:46 <ianw> yeah, i'm on the hook for creating volumes, but haven't got to it yet
19:39:04 <AJaeger> fungi: some time ago ;) First one just got approved, rest needs review and AFS creation - and mnaser can take those over and then add the remaining changes...
19:39:22 <fungi> ianw: no worries, per chatter in #openstack-infra we should probably get the spice flowing on existing volumes first anyway
19:39:38 <clarkb> fungi: ++
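
(For reference, the volume creation ianw is on the hook for is done with the OpenAFS vos/fs tools; a rough sketch is below, with the server, partition, volume, and mount point names all placeholders rather than the ones we will actually use.)

    # Placeholder names; the real fileserver, partition and volume names differ.
    vos create -server afs01.example.org -partition vicepa \
        -name static.example -maxquota 50000000
    fs mkmount /afs/.example.org/static static.example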
19:41:03 <fungi> #topic ask.openstack.org apache2 crashing daily during logrotate
19:41:16 <clarkb> I kept this on the agenda because I hadn't heard anything new on this topic
19:41:31 <clarkb> but I was also sick late last week and may have missed updates
19:41:36 <clarkb> It looks like ask is running now
19:41:45 <frickler> it seems the patch to change the logrotate posthook is working fine
19:41:57 <clarkb> frickler: doing a restart instead of a reload?
19:42:14 <frickler> I merged that on friday and didn't need a manual intervention since then
19:42:18 <frickler> clarkb: yes
19:42:27 <fungi> that was the old fix and somehow got undone in an upgrade, right?
19:42:36 <frickler> actually 4 of them in a row
19:42:56 <frickler> fungi: no, I tested it manually earlier and it was undone by the next puppet run
19:43:33 <frickler> not sure why it started breaking initially after running seemingly fine for some time
19:43:39 <frickler> possibly some ubuntu update
19:44:39 <frickler> but I think we can regard that as solved for now, in particular given that we may stop running it at all in the future
19:45:14 <fungi> okay, cool. thanks!
19:45:22 <clarkb> frickler: thank you!
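
(For the record, the fix frickler describes swaps the graceful reload for a full restart in the apache2 logrotate postrotate hook. The stanza ends up looking roughly like the following; the rotation settings shown are typical defaults, and the real file is generated by our puppetry rather than edited by hand.)

    /var/log/apache2/*.log {
            daily
            rotate 14
            compress
            delaycompress
            missingok
            postrotate
                    # restart instead of reload, since the graceful reload was
                    # leaving apache2 wedged after rotation
                    service apache2 restart
            endscript
    }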
19:45:30 <fungi> #topic Installing tox with py3 in base images runs some envs with py3 now
19:45:46 <fungi> #link http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2019-11-15.log.html#t2019-11-15T11:10:57
19:45:59 <frickler> okay, the issue here is tox envs that don't specify a basepython
19:46:15 <frickler> like functional
19:46:21 <fungi> they presumably use whatever python tox was installed with
19:46:24 <tosky> as you can see from the linked job, a job from a stable branch is now failing
19:46:45 <tosky> my understanding/hope is that stable jobs shouldn't be affected by changes like this
19:46:51 <tosky> sorry, jobs running on stable branches
19:46:57 <frickler> and we seem to have changed installing tox from py2 to py3 in dib not long ago
19:47:46 <ianw> i bet somehow it has to do with I3414fb9e503f94ff744b560eff9ec0f4afdbb50e
19:48:00 <clarkb> tosky: I'm not sure we can assert that. stable jobs break all the time because the world around them has to move forward
19:48:06 <frickler> one solution is to hardcode basepython=py2 where needed, but tosky argues that we shouldn't force people to touch so many stable jobs
19:48:09 <ianw> https://review.opendev.org/#/c/686524/
19:48:51 <tosky> clarkb: uhm, I'm not sure about that
19:49:04 <clarkb> tosky: even centos 7 updates have done it
19:49:08 <tosky> we have requirement boundaries for this
19:49:30 <tosky> but something that forces all functional jobs (and maybe others) for all repositories?
19:49:51 <clarkb> tosky: it does not force it, instead it changes a default
19:49:58 <clarkb> a forward looking default
19:50:07 <ianw> yeah, i think that would do it ... i did not intend this fwiw
19:50:45 <tosky> but it changes it in a way that wasn't needed for that stable branch
19:50:50 <clarkb> the default here is tox's default python which is based on what it got installed under
19:51:31 <fungi> the way we've defended stable jobs against such uncertainty in the past is by being explicit about what those jobs use. making them explicitly declare a python interpreter version is consistent with that in my opinion
19:51:57 <tosky> ok, time is running out and I see where it's heading, so what I'd like to ask is at least
19:52:02 <tosky> to send an email explaining the issue
19:52:04 <ianw> these jobs must be running on bionic?
19:52:07 <fungi> implicit dependencies are where we get into trouble
19:52:25 <tosky> and if someone really feels like that, a set of infra forced pushes to fix this, like the openstack->opendev migration
19:53:00 <tosky> but at least the email
19:54:39 <clarkb> tosky: ya I think we can send email about it. I don't think the change was intentional or expected (hence not catching it earlier), but I do think the change is likely desirable as we try to put python2 behind us
19:54:47 <fungi> to reiterate ianw's question, is this only happening for jobs running on bionic (so, say, stable/train and master branch jobs?)
19:54:56 <clarkb> in general if tox environments require a python version they should specify that in the tox config though
19:55:17 <ianw> fungi: i think it must be -- what has happened is that we install the environment now with what dib thinks is the default python for the platform
19:55:32 <ianw> which is python3 for bionic, but not for xenial
19:55:49 <fungi> yeah, if so the problem domain is small and recent
19:56:03 <ianw> i can put in an override, as such, that forces bionic to install with python2
19:56:09 <fungi> we don't need to, say, backport these basepython additions to extended-maintenance branches
19:56:18 <tosky> uhm, when was bionic introduced? Rocky?
19:56:48 <ianw> one small point is, though, *outside* the gate, it seems fragile for anyone trying to run things
19:56:58 <clarkb> ianw: yes exactly. the tox config should be specific
19:57:05 <clarkb> otherwise it is broken for people running tox locally half the time
19:57:16 <clarkb> (depending on what python version they've elected to install tox with)
19:57:52 <ianw> clarkb: yep, but i also understand tosky's point that it was annoying to throw this out there without warning
19:57:59 <ianw> (although, it was totally unintentional)
19:57:59 <fungi> tosky: whatever openstack release cycle started after april 2018
19:58:54 <fungi> so stein, looks like
19:59:21 <fungi> because the rocky cycle started in february 2018
19:59:52 <tosky> I totally understand it was unintentional, that was never a problem; issues happen
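
(To make the proposed fix concrete: pinning the interpreter for an env that currently relies on tox's install-time default is a one-line tox.ini change. The "functional" env name here is just the example mentioned above; each affected repo would adjust its own envs.)

    [testenv:functional]
    # Pin the interpreter explicitly so this env no longer depends on which
    # python interpreter tox itself was installed under.
    basepython = python2.7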
20:00:18 <fungi> okay, we're out of time
20:00:46 <fungi> mordred: if you're around later and want to provide a gerrit status update in #openstack-infra, there's interest
20:01:04 <ianw> tosky: i can take a point to send a mail to opendev-list about this today, and we can decide on the path
20:01:09 <fungi> and anyone with further things to discuss, find us similarly in the #openstack-infra channel
20:01:23 <fungi> thanks everyone!
20:01:27 <tosky> thanks!
20:01:29 <fungi> #endmeeting