19:01:25 <clarkb> #startmeeting infra
19:01:26 <openstack> Meeting started Tue May 25 19:01:25 2021 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:27 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:29 <openstack> The meeting name has been set to 'infra'
19:01:46 <clarkb> #link http://lists.opendev.org/pipermail/service-discuss/2021-May/000241.html Our Agenda
19:01:56 <ianw> o/
19:02:47 <clarkb> #topic Announcements
19:02:51 <clarkb> I didn't have any announcements
19:02:58 <clarkb> #topic Actions from last meeting
19:03:04 <clarkb> #link http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-05-18-19.01.txt minutes from last meeting
19:03:20 <clarkb> We don't appear to have recorded any actions last meeting
19:03:33 <clarkb> #topic Priority Efforts
19:03:41 <clarkb> #topic OpenDev
19:04:02 <clarkb> I think it was fungi that mentioned it is odd we keep this as a separate meeting topic when all the things we do are basically opendev.
19:04:19 <fungi> the meeting is the opendev meeting ;)
19:04:25 <clarkb> I'll try to remember when I set up the agenda for next meeting that this needs a reorg
19:04:45 <clarkb> I don't have any updates that have historically fit under this heading that don't have agenda items later in the agenda
19:04:50 <clarkb> I think we should move on
19:05:00 <clarkb> #topic Update Config Management
19:05:17 <clarkb> This one is in a similar situation. We tend to talk about these updates in the context of saying updating mailman config management
19:05:39 <clarkb> I'm thinking maybe instead of putting "priority efforts" upfront we just try to order the agenda item lists to ensure the most important items come first
19:05:50 <fungi> wfm
19:05:57 <clarkb> (some weeks that will be our priority efforts and others it will be out of the blue important things)
19:06:08 <ianw> ++
19:06:15 <fungi> it was originally a section for covering activities related to specs marked as priority
19:06:55 <clarkb> cool, with that I think we can jump into the agenda items which I've already applied a priority sorting function to
19:07:02 <clarkb> #topic General topics
19:07:12 <clarkb> #topic Refreshing non LE certs
19:07:22 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/792789 Redirect ask.o.o
19:07:43 <clarkb> ianw started work on making a landing page for a shut down ask.openstack.org. This change lgtm though dns for acme challenges needs updating
19:08:12 <clarkb> That switches ask.o.o to an LE cert then serves a tiny index.html pointing people to the internet archive
19:08:50 <clarkb> Separately I've got a stack of changes to convert ethercalc, storyboard, and translate to LE certs
19:08:52 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/792708 stack of LE certs
19:08:57 <ianw> thanks, yeah i can finalise that
19:09:08 <clarkb> These changes will handle the bulk of the expiring certs.
19:09:18 <clarkb> We are left with wiki, openstackid and openstackid-dev
19:09:27 <clarkb> fungi: for wiki is the easiest thing to buy a cert?
19:09:45 <clarkb> or is it under sufficient config management to use LE?
19:09:46 <fungi> yes probably
19:09:51 <fungi> to buy a cert
19:10:05 <fungi> it is under absolutely zero config management
19:10:14 <clarkb> ok, in the past I've done that about a week before expiry. I can work that out when we get closer to expiration
19:10:57 <clarkb> For openstackid and openstackid-dev fungi and I spoke to the foundation sysadmins and they are willing to take on management of those services. However, it is unlikely that will be done before the certs expire
19:11:03 <fungi> the config management we have for mediawiki is mostly sufficient to stand up a working server, but the data needs reorganizing compared to how it's laid out on the production server
19:11:29 <clarkb> I told them I would refresh the cert for the prod server one way or another. My plan is to try and LE it if the others I am LEing go well (they should be fine)
19:11:31 <fungi> also "mostly" is important there, i did not get so far as to work out why openid auth was broken on the dev deployment
19:11:36 <clarkb> fungi: gotcha
19:11:37 <fungi> (for wiki)
19:11:56 <clarkb> for openstackid-dev it sounded like we don't think anything needs a valid cert for that anymore so I'm going to let it just expire
19:12:22 <clarkb> if we discover after it expires that something does need it we can provision it with LE too (I wanted to avoid provisioning a cert if someone else is going to manage it with their own config management later)
19:13:07 <clarkb> Long story short we have a plan now and are making progress, we should have things sorted well before any certs expire
19:13:19 <clarkb> #topic Potentially Migrating away from Freenode to OFTC
19:13:59 <clarkb> Last week freenode's staff quit due to changes to the organization. Since then there have been changes to things like use policies for freenode too
19:14:53 <clarkb> this has precipitated conversations around whether or not we should stick to freenode
19:15:16 <clarkb> for many years we have maintained a small backup presence on oftc
19:15:36 <clarkb> increasingly it appears that switching to oftc is going to be a good move for us, but we are soliciting feedback from our member projects
19:15:42 <clarkb> #link http://lists.opendev.org/pipermail/service-discuss/2021-May/000236.html Fungi's call for input
19:16:23 <clarkb> Kata and Zuul have both responded that they are happy to move (zuul would like to stick close to ansible if possible but that isn't critical)
19:16:31 <clarkb> openstack seems to be dragging its feet
19:16:40 <clarkb> fungi: have we heard anything from airship or starlingx?
19:16:59 <fungi> well, rather the tc is trying to crowdsource input on an irc preference from all the openstack projects
19:17:19 <fungi> i've heard nothing from either airship or starlingx leadership
19:17:35 <fungi> nor openinfralabs, whose ml i also contacted
19:17:47 <clarkb> fungi: maybe we should ask ildikov and ashlee to prod them?
19:18:03 <clarkb> and I guess have you prod openinfralabs again? (I think you interface with them)
19:18:09 <ianw> https://github.com/ansible-community/community-topics/issues/19 appears to still be ansible discussing things?
19:18:11 <fungi> can't hurt, though for the most part i'm taking a lack of response to mean they're fine with whatever we decide to go with
19:18:34 <clarkb> ianw: ya from corvus' email it seemed ansible was undecided
19:18:42 <corvus> that's my understanding
19:18:45 <fungi> ianw: yeah, also the python community, the freebsd community and some other large ones are still trying to decide what to do
19:19:25 <corvus> i feel like these are all feathers that tip the scale really; i don't think there are many folks that feel very strongly oftc vs libera.
19:19:36 <corvus> there are folks that feel strongly "not freenode"
19:20:56 <clarkb> the recent updates to freenode policies are concerning so ya "not freenode" seems to be picking up more steam
19:23:02 <clarkb> fungi: it would probably be good to keep the pressure on so that we can make a decision sooner than later. As stalling puts us in a weird spot. I can reach out to ildikov and ashlee to see what they say about airship and starlingx
19:23:09 <clarkb> (I cannot type today, or most dats)
19:24:03 <fungi> yeah, i agree. i mostly expect openstack to take the longest to come to an answer, given larger ships take longer to turn
19:24:29 <clarkb> cool, anything else to add to this? maybe you want to talk about your bot modifications?
19:24:39 <corvus> based on what i've read, we may expect problems from freenode if we vacate.  that's worth planning for.
19:24:58 <clarkb> corvus: specifically around repurposing channels right?
19:25:27 <corvus> yeah.  i expect our idea of closing channels with forwarding messages is incompatible with how the new staff sees things.
19:26:02 <clarkb> ya I guess that means if we switch we need to overcommunicate so that people can discover the new location without relying on messages in irc
19:26:28 <corvus> yeah.  i think that would be prudent.  if nothing bad comes to pass, there's no downside.  :)
19:26:52 <ianw> i've seen one incident with something to do with a hacker-news bot repeater channel
19:27:13 <ianw> but i now have several moved channels
19:28:05 <ianw> #cryptography-dev went mute and changed topic, centos seems to now be #centos-has-moved
19:29:17 <clarkb> ya I don't expect much trouble, but being prepared for it can't hurt
19:30:39 <corvus> i suspect it has more to do with areas of personal interest
19:30:39 <fungi> right, #openbsd has had a forwarding message in its topic for nearly a week already
19:30:54 <corvus> so we're probably fine.  but maybe don't rule it out.
19:31:03 <clarkb> ++
19:31:31 <fungi> i agree, it's more likely to come up when moves get press, or are otherwise high-profile and present marketing opportunities or damage control
19:32:18 <corvus> slightly OT: i'm engaging in a matrix experiment for learnings.
19:32:45 <corvus> don't need to discuss now, but just wanted to share
19:33:03 <ianw> ++ experimenting with matrix > talking about matrix :)
19:33:05 <clarkb> cool. Anything else on this topic or should we move on?
19:33:14 <fungi> sean-k-mooney and yoctozepto have been as well
19:33:25 <clarkb> fungi: ^ feel free to discuss your updates or whatever other relevant items are in play
19:33:43 <fungi> wip changes are up with an "oftc" review topic
19:34:06 <fungi> i'm currently wrestling the state machine in accessbot to get it working
19:34:24 <clarkb> #link https://review.opendev.org/q/topic:oftc Bot updates to make oftc a possibility
19:34:36 <fungi> the biggest difference there is that the acls on oftc use coarse-grained access levels rather than individual access flags
19:34:47 <fungi> so our mask handling has to be ripped out and replaced
19:35:32 <corvus> fungi: are you sure it has to be ripped out?  i *thought* i left some provision for swapping in roles instead of flags
19:35:36 <fungi> also the lack of identify-msg cap on their ircd means the initial connection stages are subtly different, enough that i'm having trouble getting the identify message to fire at the right time
19:36:21 <corvus> (but it's entirely possible that it grew more complex since then and overwhelmed whatever provision i thought i left for that)
19:36:37 <fungi> corvus: if it can also support rbac then that'd be a huge help, but it wasn't especially obvious if it's in there
19:36:50 <corvus> where is the code now? :)
19:37:04 <fungi> in my editor, i'll push up a broken i progress change
19:37:11 <fungi> er, in progress change
19:37:20 <corvus> ah looks like ./docker/accessbot/accessbot.py ?
19:37:28 <fungi> yes, that's the current state
19:37:52 <fungi> i thought you were asking if my work in progress on that script was up for review yet, sorry
19:38:13 <corvus> fungi: i think you're right and i'm wrong.  i think it needs to be ripped out and replaced.
19:38:27 <fungi> cool, i'm about halfway into that, at least
19:38:33 <fungi> thanks for looking
19:38:38 <corvus> sorry for the red herring
19:38:54 <fungi> no worries, that code has a thick layer of dust on it
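[Editor's note: a minimal sketch of the coarse-grained ACL translation fungi describes above. The level names and flag sets here are invented for illustration; they are not OFTC's actual ChanServ access model or accessbot's real data structures.]

```python
# Hypothetical mapping from Freenode-style per-flag masks to coarse,
# tiered access levels of the kind OFTC's services use. Ordered from
# least to most privileged. All names here are illustrative assumptions.
LEVELS = ["member", "chanop", "master"]

# Flags we assume each tier implies (single-letter flags, Freenode-style).
LEVEL_FLAGS = {
    "member": set("v"),
    "chanop": set("votr"),
    "master": set("votrsf"),
}


def level_for_flags(flags):
    """Pick the least-privileged level whose implied flags cover the request."""
    wanted = set(flags)
    for level in LEVELS:
        if wanted <= LEVEL_FLAGS[level]:
            return level
    # Anything we can't express exactly gets the most privileged tier.
    return LEVELS[-1]
```

A bot doing this kind of translation loses granularity by design: any flag set that doesn't match a tier exactly has to be rounded up, which is why the existing mask handling can't simply be reused.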
19:39:59 <clarkb> We have a few more topics to get through, let's get through those and can swing back to this if we still have time
19:40:01 <fungi> anyway, that's all i've got. going to keep plugging on the accessbot work
19:40:05 <fungi> thanks!
19:40:07 <clarkb> #topic Switch Vexxhost to provide only specialized labels in Nodepool
19:40:34 <clarkb> fungi: I didn't manage to find this change. But the tl;dr is that the difference in base memory for vexxhost nodes makes it problematic for landing things that need more memory and then later run elsewhere with less memory
19:41:06 <clarkb> considering the size of vexxhost's pool slice we can (with reasonable safety) shift vexxhost to only providing the specialized resources and not the generic ones
19:41:13 * clarkb looks for the change again
19:41:38 <clarkb> #link https://review.opendev.org/c/openstack/project-config/+/785769
19:41:54 <fungi> #link yes, sorry, i was pulling up the ml thread
19:41:56 <clarkb> fungi: looks like it needs to be rebased. Is there any reason to keep it as WIP?
19:42:17 <fungi> #link http://lists.opendev.org/pipermail/service-discuss/2021-April/000219.html Recent nodepool label changes
19:42:41 <fungi> i didn't want it to merge until it had been discussed, but the ml thread just sort of petered out
19:42:49 <clarkb> gotcha
19:43:06 <fungi> and yeah, i'm not surprised it's fallen into a merge conflict in the meantime
19:43:16 <clarkb> do you want us to try and pick up the discussion on the mailing list?
19:43:28 <clarkb> fwiw I'm happy to make that change (and could say so on the mailing list)
19:43:49 <ianw> do we have a sense of how many jobs use those expanded node types?
19:43:50 <clarkb> basically try and be more flexible in allocating resources to both avoid problems and maybe provide more interesting flavors where they can be useful
19:44:37 <clarkb> ianw: I think it has fluctuated over time. octavia uses the nested virt flavors pretty consistently, but then sean-k-mooney does things with the bigger and nested virt flavors when nova has specific needs iirc
19:44:50 <clarkb> airship is also a large flavor user for their main end to end test job
19:45:19 <fungi> i switched it out of wip
19:45:44 <fungi> and will rebase to clear the merge conflict
19:45:52 <ianw> we could only set mem= type kernel command lines to limit the ram by setting it in a dib element, right?
19:46:04 <clarkb> ianw: we can but if we do that we need different images
19:46:18 <clarkb> because we want to use more memory in other circumstances
19:46:48 <ianw> right, yeah.  we used to do that iirc
19:47:04 <clarkb> ya we did it for hpcloud when there were only two clouds and the images were different anyway
19:47:45 <clarkb> it's probably still worth thinking through to see if there is a way to make that happen (like can nova set boot params for us? or have a pre run playbook update the node and reboot if necessary if the time loss isn't too bad)
19:47:55 <clarkb> but fungi's change is the simplest thing we can land right now
19:48:02 <fungi> second simplest
19:48:19 <fungi> but less wasteful than just removing that region
19:48:31 <ianw> ++.  i wonder if we can hotplug remove memory
19:48:59 <fungi> plz_to_be_removing_sticks_of_ram
19:49:03 <clarkb> I think we can proceed with fungi's change then revert if we find an alternative
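[Editor's note: a sketch of the pre-run-playbook idea floated above, under the assumption it would parse /proc/meminfo and decide whether a node needs a mem= kernel cap and a reboot. This is not part of any existing opendev playbook; names and thresholds are hypothetical.]

```python
# Decide whether a held node exceeds the target RAM and would need a
# mem= kernel argument plus reboot before the job proper runs.

def mem_total_kb(meminfo_text):
    """Parse the MemTotal value (kB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    raise ValueError("no MemTotal line found")


def needs_mem_cap(meminfo_text, target_kb, slack_kb=262144):
    """True if the node has more RAM than target_kb plus some slack.

    slack_kb allows for kernel-reserved memory so nodes that are already
    at the target size are not needlessly rebooted.
    """
    return mem_total_kb(meminfo_text) > target_kb + slack_kb
```

On a real node the text would come from `open("/proc/meminfo").read()`; a pre-run playbook could use the result to append `mem=8G` to the kernel command line and reboot, at the cost of the extra boot time mentioned above.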
19:49:32 <clarkb> #topic Server Upgrades
19:49:42 <clarkb> (running out of time so want to keep going)
19:50:03 <clarkb> The mailman ansible stuff seems to have gone well. I don't know that we have created any new lists yet (but this is tested with zuul) though
19:50:49 <clarkb> Next up is taking a snapshot of the server and upgrading it through to focal. One thought I had was before we bother with a snapshot of the server I can test this on a zuul node that we hold
19:51:23 <clarkb> once we have this working reliably on the held zuul nodes (can rerun ansible against the focal node too etc) then take an up to date snapshot and make sure it runs happily on our snapshot?
19:51:41 <clarkb> do we think running through it on the snapshot is useful if doing it on a zuul node? (how careful do we want to be is the question I guess)
19:51:49 <ianw> ++ .. having gone through quite a few in-place updates with afs i didn't really have any issues
19:52:07 <ianw> the only problem was a stuck volume on the afs server which was barely related
19:52:35 <clarkb> I'll try to make time for the held zuul node upgrade late this week or next week and we can plan doing the actual snapshot copy after that if we feel it is necessary
19:53:08 <fungi> i feel like having the snapshot on hand as a fallback is probably fine, probably the only actual thing you won't be testing with the in-place zuul upgrade is the newer mailman packages
19:53:26 <clarkb> fungi: ya and last time we had a few issues related to those iirc
19:53:32 <clarkb> had to do with new pipeline entries for mailman iirc
19:53:51 <clarkb> we worked through them then, we can probably work through it again
19:54:06 <clarkb> another consideration is we may want to avoid mailing list outages while IRC is still somewhat up for discussion
19:54:14 <clarkb> but that is a problem to worry about when we are ready to upgrade :)
19:55:02 <clarkb> ianw: any gerrit upgrade updates?
19:55:25 <ianw> no but i swear i'm getting back to it
19:55:38 <clarkb> ok :) we have all been busy
19:55:45 <clarkb> #topic Scheduling project renames
19:55:58 <clarkb> related to being busy I feel like we're not in a great spot to try and plan for this right now with everything else going on
19:56:26 <clarkb> If we can dig out of the backlog and get the playbooks updated we can start scheduling properly
19:56:33 <clarkb> #topic Open Discussion
19:56:45 <clarkb> rosmaita: did you have updates or more to discuss about zuul comments in gerrit?
19:56:55 <clarkb> (wanted to make sure I saved some time for the catch all topic if so)
19:57:00 <rosmaita> no, i did have a different topic though
19:57:07 <rosmaita> somehow got nuked off the agenda :)
19:57:20 <fungi> we did have someone come into the opendev channel asking how to set tags on gerrit comments, probably related to running a cinder third-party ci system
19:57:22 <clarkb> rosmaita: I may have accidentally cleaned up that topic and the old one thinking it was just the old one
19:57:32 <clarkb> rosmaita: apologies if that is what happened
19:57:46 <clarkb> rosmaita: go for it though
19:58:44 <ianw> (happy to run slightly over for missed topics)
19:58:54 <rosmaita> thanks
19:59:02 <rosmaita> sorry, my system went unstable for a minute there
19:59:12 <rosmaita> quick question about publishing the api-ref
19:59:23 <rosmaita> https://docs.openstack.org/api-ref/block-storage/index.html
19:59:30 <rosmaita> that has both v2 and v3
19:59:45 <rosmaita> we are removing v2 this cycle, so ordinarily would also remove the v2 api-ref
20:00:03 <rosmaita> but, it could be useful for people who have procrastinated
20:00:17 <rosmaita> the api-ref isn't branched like the regular docs are
20:00:40 <fungi> right, because the apis aren't versioned the same as the software
20:00:58 <rosmaita> so i was wondering if there is a way to have the v2 api-ref built from the stable/wallaby branch or something like that?
20:01:22 <clarkb> rosmaita: can you change https://docs.openstack.org/api-ref/block-storage/v2/index.html to say (Removed in Xena) instead of (DEPRECATED) and maybe put a warning there but keep the info there?
20:01:23 <rosmaita> or is the best thing just to say "REMOVED" in big letters at the top of the v2 ref
20:01:39 <rosmaita> clarkb: yeah, i guess that's the best thing
20:01:44 <clarkb> I think the reason the api docs are separated from the projects is that the apis live longer than the master branch
20:02:03 <fungi> well, it's removed from the latest version of the software, but not retroactively removed from earlier versions of the software, and that document is in theory about all openstack cinder versions
20:02:13 <clarkb> (that's probably not the most accurate statement, but iirc the info that generates those docs isn't directly tied to the code for this reason)
20:02:23 <clarkb> fungi: ++
20:02:27 <rosmaita> ok, a note makes sense
20:02:54 <rosmaita> for glance we put up a note telling users to generate the v1 api-ref for themselves if they really wanted it
20:02:58 <fungi> rosmaita: we're probably well off-topic for opendev, but consider it like the sdk. should the latest version of the sdk rip out cinder v2 api support and just declare old cloud deployments are no longer valuable?
20:03:48 <rosmaita> well, we keep the documentation branched with the software
20:04:11 <rosmaita> and you can use an older version of the sdk if you need to
20:04:17 <rosmaita> anyway, doesn't matter
20:04:30 <fungi> sort of. it's in the same repository as the software these days (it wasn't always) but in theory only the master branch version is maintained since that's the only one which gets published
20:04:59 <fungi> also you can't use old versions of the sdk if you have one piece of software which needs to talk to two clouds which have different supported api versions
20:05:19 <fungi> (there are people who interface with more than one cloud at the same time)
20:05:21 <rosmaita> ok, we'll put a note on the v2 ref, that will work
20:05:28 <clarkb> cool, glad we could help
20:05:34 <clarkb> Anything else? we are over time so I'll end us here if not
20:05:46 <fungi> ultimately, it's more of a question for the openstack tc and tech writing sig though
20:06:04 <rosmaita> that's all from me, other than a thank you for being ready for the freenode -> OFTC change
20:06:06 <fungi> i don't have anything else
20:06:21 <clarkb> thanks everyone!
20:06:23 <clarkb> #endmeeting