19:01:25 #startmeeting infra
19:01:26 Meeting started Tue May 25 19:01:25 2021 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:27 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:29 The meeting name has been set to 'infra'
19:01:46 #link http://lists.opendev.org/pipermail/service-discuss/2021-May/000241.html Our Agenda
19:01:56 o/
19:02:47 #topic Announcements
19:02:51 I didn't have any announcements
19:02:58 #topic Actions from last meeting
19:03:04 #link http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-05-18-19.01.txt minutes from last meeting
19:03:20 We don't appear to have recorded any actions last meeting
19:03:33 #topic Priority Efforts
19:03:41 #topic OpenDev
19:04:02 I think it was fungi that mentioned it is odd we keep this as a separate meeting topic when all the things we do are basically opendev.
19:04:19 the meeting is the opendev meeting ;)
19:04:25 I'll try to remember when I set up the agenda for next meeting that this needs a reorg
19:04:45 I don't have any updates that have historically fit under this heading that don't have agenda items later in the agenda
19:04:50 I think we should move on
19:05:00 #topic Update Config Management
19:05:17 This one is in a similar situation. We tend to talk about these updates in the context of, say, updating mailman config management
19:05:39 I'm thinking maybe instead of putting "priority efforts" up front we just try to order the agenda item lists to ensure the most important items come first
19:05:50 wfm
19:05:57 (some weeks that will be our priority efforts and others it will be out of the blue important things)
19:06:08 ++
19:06:15 it was originally a section for covering activities related to specs marked as priority
19:06:55 cool, with that I think we can jump into the agenda items which I've already applied a priority sorting function to
19:07:02 #topic General topics
19:07:12 #topic Refreshing non LE certs
19:07:22 #link https://review.opendev.org/c/opendev/system-config/+/792789 Redirect ask.o.o
19:07:43 ianw started work on making a landing page for a shut-down ask.openstack.org. This change lgtm though dns for acme challenges needs updating
19:08:12 That switches ask.o.o to an LE cert then serves a tiny index.html pointing people to the internet archive
19:08:50 Separately I've got a stack of changes to convert ethercalc, storyboard, and translate to LE certs
19:08:52 #link https://review.opendev.org/c/opendev/system-config/+/792708 stack of LE certs
19:08:57 thanks, yeah i can finalise that
19:09:08 These changes will handle the bulk of the expiring certs.
19:09:18 We are left with wiki, openstackid and openstackid-dev
19:09:27 fungi: for wiki is the easiest thing to buy a cert?
19:09:45 or is it under sufficient config management to use LE?
19:09:46 yes probably
19:09:51 to buy a cert
19:10:05 it is under absolutely zero config management
19:10:14 ok, in the past I've done that about a week before expiry. I can work that out when we get closer to expiration
19:10:57 For openstackid and openstackid-dev fungi and I spoke to the foundation sysadmins and they are willing to take on management of those services.
However, it is unlikely that will be done before the certs expire
19:11:03 the config management we have for mediawiki is mostly sufficient to stand up a working server, but the data needs reorganizing compared to how it's laid out on the production server
19:11:29 I told them I would refresh the cert for the prod server one way or another. My plan is to try and LE it if the others I am LEing go well (they should be fine)
19:11:31 also "mostly" is important there, i did not get so far as to work out why openid auth was broken on the dev deployment
19:11:36 fungi: gotcha
19:11:37 (for wiki)
19:11:56 for openstackid-dev it sounded like we don't think anything needs a valid cert for that anymore so I'm going to let it just expire
19:12:22 if we discover after it expires that something does need it we can provision it with LE too (I wanted to avoid provisioning a cert if someone else is going to manage it with their own config management later)
19:13:07 Long story short we have a plan now and are making progress, we should have things sorted well before any certs expire
19:13:19 #topic Potentially Migrating away from Freenode to OFTC
19:13:59 Last week freenode's staff quit due to changes to the organization. Since then there have been changes to things like use policies for freenode too
19:14:53 this has precipitated conversations around whether or not we should stick with freenode
19:15:16 for many years we have maintained a small backup presence on oftc
19:15:36 increasingly it appears that switching to oftc is going to be a good move for us, but we are soliciting feedback from our member projects
19:15:42 #link http://lists.opendev.org/pipermail/service-discuss/2021-May/000236.html Fungi's call for input
19:16:23 Kata and Zuul have both responded that they are happy to move (zuul would like to stick close to ansible if possible but that isn't critical)
19:16:31 openstack seems to be dragging its feet
19:16:40 fungi: have we heard anything from airship or starlingx?
19:16:59 well, rather the tc is trying to crowdsource input on an irc preference from all the openstack projects
19:17:19 i've heard nothing from either airship or starlingx leadership
19:17:35 nor openinfralabs, whose ml i also contacted
19:17:47 fungi: maybe we should ask ildikov and ashlee to prod them?
19:18:03 and I guess have you prod openinfralabs again? (I think you interface with them)
19:18:09 https://github.com/ansible-community/community-topics/issues/19 appears to still be ansible discussing things?
19:18:11 can't hurt, though for the most part i'm taking a lack of response to mean they're fine with whatever we decide to go with
19:18:34 ianw: ya from corvus' email it seemed ansible was undecided
19:18:42 that's my understanding
19:18:45 ianw: yeah, also the python community, the freebsd community and some other large ones are still trying to decide what to do
19:19:25 i feel like these are all feathers that tip the scale really; i don't think there are many folks that feel very strongly oftc vs libera.
19:19:36 there are folks that feel strongly "not freenode"
19:20:56 the recent updates to freenode policies are concerning so ya "not freenode" seems to be picking up more steam
19:23:02 fungi: it would probably be good to keep the pressure on so that we can make a decision sooner rather than later, as stalling puts us in a weird spot. I can reach out to ildikov and ashlee to see what they say about airship and starlingx
19:23:09 (I cannot type today, or most days)
19:24:03 yeah, i agree.
i mostly expect openstack to take the longest to come to an answer, given larger ships take longer to turn
19:24:29 cool, anything else to add to this? maybe you want to talk about your bot modifications?
19:24:39 based on what i've read, we may expect problems from freenode if we vacate. that's worth planning for.
19:24:58 corvus: specifically around repurposing channels right?
19:25:27 yeah. i expect our idea of closing channels with forwarding messages is incompatible with how the new staff sees things.
19:26:02 ya I guess that means if we switch we need to overcommunicate so that people can discover the new location without relying on messages in irc
19:26:28 yeah. i think that would be prudent. if nothing bad comes to pass, there's no downside. :)
19:26:52 i've seen one incident with something to do with a hacker-news bot repeater channel
19:27:13 but i now have several moved channels
19:28:05 #cryptography-dev went mute and changed topic, centos seems to now be #centos-has-moved
19:29:17 ya I don't expect much trouble, but being prepared for it can't hurt
19:30:39 i suspect it has more to do with areas of personal interest
19:30:39 right, #openbsd has had a forwarding message in its topic for nearly a week already
19:30:54 so we're probably fine. but maybe don't rule it out.
19:31:03 ++
19:31:31 i agree, it's more likely to come up when moves get press, or are otherwise high-profile and present marketing opportunities or damage control
19:32:18 slightly OT: i'm engaging in a matrix experiment for learnings.
19:32:45 don't need to discuss now, but just wanted to share
19:33:03 ++ experimenting with matrix > talking about matrix :)
19:33:05 cool. Anything else on this topic or should we move on?
19:33:14 sean-k-mooney and yoctozepto have been as well
19:33:25 fungi: ^ feel free to discuss your updates or whatever other relevant items are in play
19:33:43 wip changes are up with an "oftc" review topic
19:34:06 i'm currently wrestling the state machine in accessbot to get it working
19:34:24 #link https://review.opendev.org/q/topic:oftc Bot updates to make oftc a possibility
19:34:36 the biggest difference there is that the acls on oftc use coarse-grained access levels rather than individual access flags
19:34:47 so our mask handling has to be ripped out and replaced
19:35:32 fungi: are you sure it has to be ripped out? i *thought* i left some provision for swapping in roles instead of flags
19:35:36 also the lack of identify-msg cap on their ircd means the initial connection stages are subtly different, but enough that i'm having trouble getting the identify message to fire at the right time
19:36:21 (but it's entirely possible that it grew more complex since then and overwhelmed whatever provision i thought i left for that)
19:36:37 corvus: if it can also support rbac then that'd be a huge help, but it wasn't especially obvious if it's in there
19:36:50 where is the code now? :)
19:37:04 in my editor, i'll push up a broken i progress change
19:37:11 er, in progress change
19:37:20 ah looks like ./docker/accessbot/accessbot.py ?
19:37:28 yes, that's the current state
19:37:52 i thought you were asking if my work in progress on that script was up for review yet, sorry
19:38:13 fungi: i think you're right and i'm wrong. i think it needs to be ripped out and replaced.
19:38:27 cool, i'm about halfway into that, at least
19:38:33 thanks for looking
19:38:38 sorry for the red herring
19:38:54 no worries, that code has a thick layer of dust on it
19:39:59 We have a few more topics to get through, let's get through those and we can swing back to this if we still have time
19:40:01 anyway, that's all i've got. going to keep plugging on the accessbot work
19:40:05 thanks!
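For reference, a rough sketch of the flags-to-levels translation fungi describes above, as standalone Python. This is not the actual accessbot code from ./docker/accessbot/accessbot.py, and the level names below are placeholders rather than OFTC's real ChanServ levels; it only illustrates collapsing freenode-style per-flag masks down to one coarse access level per account.

    # Placeholder levels ordered from least to most privileged.
    LEVELS = ["member", "op", "master"]

    # Hypothetical mapping from individual freenode-style flags to the
    # coarsest level that grants an equivalent capability.
    FLAG_TO_LEVEL = {
        "v": "member",   # voice
        "o": "op",       # op
        "t": "op",       # topic control
        "f": "master",   # modify the access list
        "s": "master",   # change channel settings
    }

    def level_for_flags(flags):
        """Pick the single coarse level implied by a set of flag characters."""
        if not flags:
            return LEVELS[0]
        wanted = [FLAG_TO_LEVEL.get(f, "member") for f in flags]
        # Levels are cumulative, so the highest-ranked one wins.
        return max(wanted, key=LEVELS.index)

    if __name__ == "__main__":
        print(level_for_flags("votf"))  # -> "master"

The point of the sketch is just the shape of the change: instead of composing a flag mask per account, the bot would compute one role per account and hand that to ChanServ.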
19:40:07 #topic Switch Vexxhost to provide only specialized labels in Nodepool
19:40:34 fungi: I didn't manage to find this change. But the tl;dr is that the difference in base memory for vexxhost nodes makes it problematic to land things that need more memory and then later run elsewhere with less memory
19:41:06 considering the size of vexxhost's pool slice we can (with reasonable safety) shift vexxhost to only providing the specialized resources and not the generic ones
19:41:13 * clarkb looks for the change again
19:41:38 #link https://review.opendev.org/c/openstack/project-config/+/785769
19:41:54 #link yes, sorry, i was pulling up the ml thread
19:41:56 fungi: looks like it needs to be rebased. Is there any reason to keep it as WIP?
19:42:17 #link http://lists.opendev.org/pipermail/service-discuss/2021-April/000219.html Recent nodepool label changes
19:42:41 i didn't want it to merge until it had been discussed, but the ml thread just sort of petered out
19:42:49 gotcha
19:43:06 and yeah, i'm not surprised it's fallen into a merge conflict in the meantime
19:43:16 do you want us to try and pick up the discussion on the mailing list?
19:43:28 fwiw I'm happy to make that change (and could say so on the mailing list)
19:43:49 do we have a sense of how many jobs use those expanded node types?
19:43:50 basically try and be more flexible in allocating resources to both avoid problems and maybe provide more interesting flavors where they can be useful
19:44:37 ianw: I think it has fluctuated over time. octavia uses the nested virt flavors pretty consistently, but then sean-k-mooney does things with the bigger and nested virt flavors when nova has specific needs iirc
19:44:50 airship is also a large flavor user for their main end to end test job
19:45:19 i switched it out of wip
19:45:44 and will rebase to clear the merge conflict
19:45:52 we could only set mem= type kernel command lines to limit the ram by setting it in a dib element, right?
19:46:04 ianw: we can but if we do that we need different images
19:46:18 because we want to use more memory in other circumstances
19:46:48 right, yeah. we used to do that iirc
19:47:04 ya we did it for hpcloud when there were only two clouds and the images were different anyway
19:47:45 it's probably still worth thinking through to see if there is a way to make that happen (like can nova set boot params for us? or have a pre run playbook update the node and reboot if necessary if the time loss isn't too bad)
19:47:55 but fungi's change is the simplest thing we can land right now
19:48:02 second simplest
19:48:19 but less wasteful than just removing that region
19:48:31 ++. i wonder if we can hotplug remove memory
19:48:59 plz_to_be_removing_sticks_of_ram
19:49:03 I think we can proceed with fungi's change then revert if we find an alternative
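As a reference for the dib element idea floated above: a minimal sketch of what a post-install script in such an element (say, elements/limit-node-memory/post-install.d/80-limit-mem) might look like. This is purely illustrative and was not something proposed in the meeting; the element name, the mem=8G value, and the Debian/Ubuntu-style update-grub call are all assumptions.

    #!/bin/bash
    # Illustrative only: cap the RAM the kernel will use at boot so jobs see
    # consistent memory regardless of what the flavor actually provides.
    set -eu
    # Append mem=8G to the default kernel command line (Debian/Ubuntu grub layout).
    sed -i 's/^GRUB_CMDLINE_LINUX="\(.*\)"/GRUB_CMDLINE_LINUX="\1 mem=8G"/' /etc/default/grub
    update-grub

The trade-off discussed above still applies: baking this in means building and managing a separate image for memory-limited nodes, which is why the label change was preferred as the simpler immediate fix.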
19:49:32 #topic Server Upgrades
19:49:42 (running out of time so want to keep going)
19:50:03 The mailman ansible stuff seems to have gone well. I don't know that we have created any new lists yet (but this is tested with zuul) though
19:50:49 Next up is taking a snapshot of the server and upgrading it through to focal. One thought I had was before we bother with a snapshot of the server I can test this on a zuul node that we hold
19:51:23 once we have this working reliably on the held zuul nodes (can rerun ansible against the focal node too etc) then take an up to date snapshot and make sure it runs happily on our snapshot?
19:51:41 do we think running through it on the snapshot is useful if doing it on a zuul node? (how careful do we want to be is the question I guess)
19:51:49 ++ .. having gone through quite a few in-place updates with afs i didn't really have any issues
19:52:07 the only problem was a stuck volume on the afs server which was barely related
19:52:35 I'll try to make time for the held zuul node upgrade late this week or next week and we can plan doing the actual snapshot copy after that if we feel it is necessary
19:53:08 i feel like having the snapshot on hand as a fallback is probably fine, probably the only actual thing you won't be testing with the in-place zuul upgrade is the newer mailman packages
19:53:26 fungi: ya and last time we had a few issues related to those iirc
19:53:32 had to do with new pipeline entries for mailman iirc
19:53:51 we worked through them then, we can probably work through it again
19:54:06 another consideration is we may want to avoid mailing list outages while IRC is still somewhat up for discussion
19:54:14 but that is a problem to worry about when we are ready to upgrade :)
19:55:02 ianw: any gerrit upgrade updates?
19:55:25 no but i swear i'm getting back to it
19:55:38 ok :) we have all been busy
19:55:45 #topic Scheduling project renames
19:55:58 related to being busy I feel like we're not in a great spot to try and plan for this right now with everything else going on
19:56:26 If we can dig out of the backlog and get the playbooks updated we can start scheduling properly
19:56:33 #topic Open Discussion
19:56:45 rosmaita: did you have updates or more to discuss about zuul comments in gerrit?
19:56:55 (wanted to make sure I saved some time for the catch all topic if so)
19:57:00 no, i did have a different topic though
19:57:07 somehow got nuked off the agenda :)
19:57:20 we did have someone come into the opendev channel asking how to set tags on gerrit comments, probably related to running a cinder third-party ci system
19:57:22 rosmaita: I may have accidentally cleaned up that topic and the old one thinking it was just the old one
19:57:32 rosmaita: apologies if that is what happened
19:57:46 rosmaita: go for it though
19:58:44 (happy to run slightly over for missed topics)
19:58:54 thanks
19:59:02 sorry, my system went unstable for a minute there
19:59:12 quick question about publishing the api-ref
19:59:23 https://docs.openstack.org/api-ref/block-storage/index.html
19:59:30 that has both v2 and v3
19:59:45 we are removing v2 this cycle, so ordinarily would also remove the v2 api-ref
20:00:03 but, it could be useful for people who have procrastinated
20:00:17 the api-ref isn't branched like the regular docs are
20:00:40 right, because the apis aren't versioned the same as the software
20:00:58 so i was wondering if there is a way to have the v2 api-ref built from the stable/wallaby branch or something like that?
20:01:22 rosmaita: can you change https://docs.openstack.org/api-ref/block-storage/v2/index.html to say (Removed in Xena) instead of (DEPRECATED) and maybe put a warning there but keep the info there?
20:01:23 or is the best thing just to say "REMOVED" in big letters at the top of the v2 ref
20:01:39 clarkb: yeah, i guess that's the best thing
20:01:44 I think the reason the api docs are separated from the projects is that the apis live longer than the master branch
20:02:03 well, it's removed from the latest version of the software, but not retroactively removed from earlier versions of the software, and that document is in theory about all openstack cinder versions
20:02:13 (that's probably not the most accurate statement, but iirc the info that generates those docs isn't directly tied to the code for this reason)
20:02:23 fungi: ++
20:02:27 ok, a note makes sense
20:02:54 for glance we put up a note telling users to generate the v1 api-ref for themselves if they really wanted it
20:02:58 rosmaita: we're probably well off-topic for opendev, but consider it like the sdk. should the latest version of the sdk rip out cinder v2 api support and just declare old cloud deployments are no longer valuable?
20:03:48 well, we keep the documentation branched with the software
20:04:11 and you can use an older version of the sdk if you need to
20:04:17 anyway, doesn't matter
20:04:30 sort of. it's in the same repository as the software these days (it wasn't always) but in theory only the master branch version is maintained since that's the only one which gets published
20:04:59 also you can't use old versions of the sdk if you have one piece of software which needs to talk to two clouds which have different supported api versions
20:05:19 (there are people who interface with more than one cloud at the same time)
20:05:21 ok, we'll put a note on the v2 ref, that will work
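For reference, a sketch of the sort of banner discussed above for the v2 api-ref landing page, as a Sphinx/RST directive. The wording is illustrative only; the exact note (and whether to use a warning or a plainer "REMOVED" heading) is up to the cinder team.

    .. warning::
       The Block Storage API v2 was removed in the Xena release. This
       reference is retained for users of older deployments; see the v3
       reference for current documentation.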
20:05:28 cool, glad we could help
20:05:34 Anything else? we are over time so I'll end us here if not
20:05:46 ultimately, it's more of a question for the openstack tc and tech writing sig though
20:06:04 that's all from me, other than a thank you for being ready for the freenode -> OFTC change
20:06:06 i don't have anything else
20:06:21 thanks everyone!
20:06:23 #endmeeting