19:01:13 #startmeeting infra
19:01:14 Meeting started Tue May 21 19:01:13 2019 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:15 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:17 The meeting name has been set to 'infra'
19:01:22 #link http://lists.openstack.org/pipermail/openstack-infra/2019-May/006382.html
19:01:32 you will find today's agenda at that link
19:01:38 #topic Announcements
19:01:42 o/
19:01:50 This one is sort of last minute but I'm going to be afk tomorrow to spend time with family fishing
19:02:04 good idea!
19:02:20 you might be able to catch more family
19:02:33 just need the right bait
19:03:00 or allergies from the great outdoors (though apparently there is a line of thought that allergies are more of a city thing because usda said grow male trees in cities back in the 40s)
19:03:18 #topic Actions from last meeting
19:03:24 #link http://eavesdrop.openstack.org/meetings/infra/2019/infra.2019-05-14-19.01.txt minutes from last meeting
19:03:34 No actions from last meeting as I had completed my action from the prior meeting
19:04:13 #topic Specs approval
19:04:50 No specs up for approval but I did want to make a note that there are a small number of specs starting to trickle out from ptg/summit conversations/work
19:05:04 so if you've got a moment skimming through or even better doing proper reviews on those would be great
19:05:23 we do have a rather packed agenda (I expect) so let's dive right in
19:05:25 #topic Priority Efforts
19:05:32 #topic Update Config Management
19:06:01 One major item to take note of here is puppet inc deleted their puppet 3 and 4 apt/rpm repos
19:06:08 (on the previous topic, there was some renewed interest in the irc bot spec too)
19:06:29 as a result we have switched to installing puppet 4 from the archive via a direct package install of the .deb
19:06:56 we no longer have centos 7 machines in production so no rpms to worry about and the puppet-agent package includes everything it needs to function at least with our puppet apply method
19:07:14 And we only have one last remaining puppet3 instance and that is ask.o.o
19:07:21 #link https://review.opendev.org/#/c/647877/ Last puppet 4 upgrade
19:07:32 ianw: any reason to not merge ^ today? I didn't want to step on your xenial upgrade for ask
19:07:56 clarkb: should be good, i can watch that in today
19:08:12 great I can help watch it too as I've learned some of the patterns for how puppet 4 gets unhappy
19:08:39 On the zuul driven CD side of things the reorg of base.yaml to split it up into a bunch of separate playbooks called by run_all.sh has merged
19:08:50 this means if you are making changes to base.yaml you will need to rebase and split your stuff out too
19:08:59 uneventfully too as far as i can tell
19:09:28 other than it seems to have (maybe?) shrunk the duration of our ansipup pulses
19:09:31 one nice side effect of this is we actually fully test our ansible + puppet stuff in those system-config-run-base type jobs now
19:09:59 yes, over time we ought to be able to whittle away at the beaker and apply jobs in favor of these
19:10:03 And over the longer term we can break stuff out of run_all.sh and have zuul jobs trigger those playbooks instead
19:10:51 any questions, concerns, or things I've missed on this topic?
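(As an aside on the direct-.deb approach mentioned above: it amounts to fetching the archived puppet-agent package and handing it straight to apt, rather than configuring the now-deleted Puppet Inc repositories. A minimal Ansible sketch of that idea follows; the URL, version, and paths are placeholders, and this is not the actual opendev/system-config role.)

```yaml
# Hypothetical playbook snippet; URL, version, and paths are placeholders,
# not the real opendev/system-config role.
- name: Fetch the archived puppet-agent 4 package
  get_url:
    url: "https://example.org/archive/puppet-agent_4.x_amd64.deb"  # placeholder URL
    dest: /tmp/puppet-agent.deb

- name: Install puppet-agent directly from the downloaded .deb
  become: true
  apt:
    deb: /tmp/puppet-agent.deb
```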
19:11:02 fungi: i think the overall time is still reflected in http://grafana.openstack.org/d/qzQ_v2oiz/bridge-runtime?orgId=1 and hasn't shrunk; though i will update the graph to take into account the other stats now being sent
19:11:14 ahh
19:11:38 sounded like we'd gone from 45 minutes between starts to 30
19:11:51 but that could also be coincidence
19:11:58 I did want to say thank you to cmurphy for getting us on to more modern puppet even if puppet pulled the rug out from under us in the end. It was a fair bit of work and should helpfully result in a more sustainable setup between now and the future
19:12:11 ++ !
19:12:25 #thanks cmurphy for driving our massive puppet upgrade
19:12:37 oh, right, we don't keep statusbot in here
19:12:41 :)
19:13:21 * mordred hands cmurphy a nicely glazed antelope
19:13:54 We did still have unhappy docker jobs in limestone. I think the plan to use the mirror nodes for that is still a good one, but we started brainstorming other debugging ideas in -infra today
19:13:56 * cmurphy prefers cantaloupe tbh
19:14:08 s/did/do/
19:14:09 also known as deglazed antelope
19:14:36 I expect using the mirror will solve all the problems so this is mostly an exercise in understanding the quirks and features of docker tooling
19:14:54 (I expect that because other people/jobs have used those mirrors successfully)
19:15:10 i'm leaning toward missing a v6 iptables rule as a likely suspect. it fits the observed behavior
19:15:56 alright anything else on the topic of config management before we move on?
19:16:10 nothing springs to mind
19:16:22 #topic OpenDev
19:16:32 I deleted the cgit cluster yesterday
19:16:36 \o/
19:16:37 I have not heard any screaming yet
19:17:12 * mordred screams in joy
19:18:03 As for next steps, I'm personally interested in improving the stability and sustainability of our new tooling
19:18:20 I think we also have some cleanup work like project renames (which we have later on the agenda)
19:19:10 for stability and sustainability I'd like to see our gitea image builds be reliable (hence the docker job debugging), rebuild the gitea06 server which has a corrupted fs as well as eventually rebuilding all gitea servers with more disk and less cpu (based on cacti data)
19:19:38 #link https://review.opendev.org/#/c/640027/ control plane clouds.yaml on nodepool builders
19:19:39 yeah, i'm fine putting off further service migrations until we get cleanup from the last one behind us
19:19:56 #link https://review.opendev.org/#/c/640044/5 Build ubuntu bionic control plane images with nodepool
19:20:14 These two changes from mordred are part of being able to sanely rebuild the gitea servers
19:20:20 if people have time to review them that would be great
19:20:33 awesome
19:21:03 long live our new nodepool-created-base-images-overlords
19:21:13 That said I don't think this is the only opendev work that has to be done. I think ianw's work to build opendev in region mirrors is helpful because it should result in more reliable jobs and starts the process of rotating out the old names
19:21:36 ++
19:21:37 I think we can probably start to entertain the idea of people picking off services like etherpad and paste and the like as well
19:21:44 since those should be self contained
19:21:46 i see that as part of getting the image builds more reliable anyway
19:21:57 (the mirrors work)
19:22:02 ya
19:22:04 ya
19:23:18 that would be migrations of such things to more containerised approaches?
19:24:18 ianw: in many cases yes I expect we'll couple the container deployment to the new naming scheme. In particular we have to update apache configs for many things so may as well take that on with the container approach (or ansible if containers just don't make sense)
19:24:32 ++
19:25:37 speaking of, has the approach so far been application container with apache proxy in the outer system context?
19:25:52 or are we deploying apache container proxying to application container?
19:26:29 I'm not sure any of our currently docker'd services have an apache container (grafana might?)
19:26:46 I agree with clarkb
19:26:55 we're not apache-ing gitea
19:26:56 thinking about upcoming work i expect to be doing on mailman 3 and maybe mediawiki
19:26:56 fungi: but because we use system network namespace we should be able to have an apache container listen on 443 that proxies to port 8080 or whatever just like we do today
19:27:29 yup. but we could also install apache on the host os and have it proxy to port 8080 too, if we find that to be more pleasant
19:27:37 clarkb: grafana not dockered yet ...
19:27:41 ianw: ah
19:27:46 got it. and if the application needs a database, is that in yet another container or in the system context?
19:27:51 container
19:27:53 in any case we don't have to think about what traffic looks like over a docker bridge
19:28:06 fungi: gitea runs a mariadb container
19:28:11 cool
19:28:15 fungi: with all of the state mounted from the host
19:28:22 (which was important after we nuked docker with k8s)
19:28:25 bindmount?
19:28:28 fungi: ya
19:28:32 yeah. and I think for stuff like that it gives us a good way to get modern db services from our upstreams
19:29:18 so do we have a standard mariadb container image, or is it gitea-specific?
19:29:22 so I could see us deciding to docker the apache when we get to it for consistency sake and ease of latest apache
19:29:38 like, for example if i wanted to dockerize the mediawiki deployment
19:30:03 fungi: standard mariadb container
19:30:11 granted, i haven't thought about what this looks like for applications which run in the context of an apache plugin
19:30:19 fungi: we have one of those already!
19:30:29 oh?
19:30:47 fungi: zuul-proxy is an apache thing - so for that, we build a container based on the apache container and then install zuul-proxy in it
19:31:03 oh right we do have an apache example then
19:31:14 aha, so i could in theory do the same with mod_php and fastcgi and whatever else
19:31:24 fungi: yup
19:31:30 fungi: yup
19:31:45 fungi: also I've successfully dockered my local nextcloud which does php cgi things and can look at how they've constructed things
19:31:48 and the nice part about doing that is it locates most of the work in the CI step
19:31:52 and then bindmount in all the application scripts and data
19:31:59 fungi: exactly
19:32:19 or would the application scripts go in the image and then just bindmount the data?
19:32:41 I'd put the application scripts in the image
19:32:43 and just bind mount data
19:32:48 and any config files
19:32:59 fungi: https://opendev.org/zuul/zuul-preview/src/branch/master/Dockerfile is the zuul-preview Dockerfile
19:33:01 i guess depends on whether you want to rebuild the container image each time the application source changes
19:33:06 and is a good example
19:33:14 they use fpm in a dedicated container that runs separately from nginx
19:33:24 oh interesting
19:33:30 fungi: I'd argue we do want to rebuild container image with each source change, since then we can CI the change
19:33:38 sure, makes sense
19:33:57 which does the port proxy thing
19:34:22 fungi: also https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/gitea/templates/docker-compose.yaml.j2 is good to cargo-cult from for things with databases
19:35:14 so what is our take if the application already has a recommended docker image? rebuild that if we can rather than writing from scratch?
19:35:45 fungi: in some cases we use it as is (mariadb) in others we layer over the top (zuul-preview + apache) and in others we build from scratch (gitea)
19:35:52 fungi: kind of depends - looking at how they're building their docker image and whether it would be nice to use theirs or build our own is a judgement call
19:35:53 likely comes down to how many changes we need to make
19:35:56 the two examples i gave both also maintain docker images (mailman3 and mediawiki)
19:36:04 gitea maintains a docker image - we do not use it
19:36:28 ya not sure there is a hard and fast rule here
19:36:36 but - I think it'll mostly come down to analyzing whether the one that's there is good for our use
19:36:37 will keep that in mind
19:36:49 i guess this is actually very similar to choosing a puppet module -- if upstream stops/goes away etc you're left holding the upgrade bag anyway (see: ask.openstack.org)
19:36:55 the gitea upstream image puts multiple processes in a single image
19:36:59 so it's more like a lightweight vm
19:37:13 and we were wanting to embrace the single-process-container model a bit more (iirc)
19:37:16 ahh, so that's one of the smells we'd check for
19:37:30 ianw: ++
19:38:40 Anything else on opendev before we move on? running out of time and want to get to the other topics
19:38:56 huh, the mediawiki dockerfile sets it up with sqlite
19:39:01 an interesting choice
19:39:18 #topic Storyboard
19:39:51 looks like storyboard has elected to have the openstack release team manage its release process
19:39:53 the telemetry team for openstack moved all their task tracking over to storyboard.openstack.org at the end of last week and i imported all their bugs over the weekend
19:40:31 yeah, i'm not sure what the deal is with release process choices there, maybe diablo_rojo_phon or SotK can comment
19:41:03 I don't mind as long as they are happy with it. Also we'll have to keep in mind we can't just push a tag if they are using the managed process
19:41:38 i don't think storyboard has been doing releases up to now
19:41:45 ah
19:41:55 but maybe the idea is to start? i really don't know
19:42:03 seems like a reasonable thing to do
19:42:34 anything else?
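(Circling back to the container discussion earlier in the meeting: the pattern described there, single-process containers sharing the host network namespace with all state bind-mounted from the host, roughly corresponds to a docker-compose file like the sketch below. The service names, image tags, variables, and paths are illustrative placeholders, not the contents of the actual gitea docker-compose.yaml.j2.)

```yaml
# Illustrative docker-compose sketch; not the real gitea template.
version: '2'
services:
  mariadb:
    image: mariadb:10.4          # standard upstream database image, used as-is
    network_mode: host           # share the host network namespace
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: "{{ db_root_password }}"   # hypothetical ansible variable
    volumes:
      - /var/lib/app-db:/var/lib/mysql   # all database state lives on the host
  app:
    image: example/app:latest    # application code baked into the image at CI time
    network_mode: host
    restart: always
    volumes:
      - /var/app/data:/data      # only data and config are bind-mounted
```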
We've got a number of topics to go so I'll keep this moving
19:42:59 also on import tooling, the patch to make it so we can import launchpad blueprints as stories is semi-usable now, i pushed some fixes for it last week, but there are still a number of gotchas i spotted which need to be addressed before it's usable for production imports
19:43:24 i didn't have anything else to mention
19:43:28 #topic General Topics
19:43:38 First up Gerrit project renames
19:43:52 friday the 31st still?
19:44:07 I think so. mordred has indicated he can't help that day iirc and I imagine ianw would rather sleep/weekend
19:44:14 but I expect fungi corvus and myself to be around
19:44:24 There are a number of things we should get sorted out before that day though
19:44:30 #link https://review.opendev.org/#/c/655476/ Fixes for rename playbook
19:44:39 I think we want that change or something like it ready to go first
19:44:45 clarkb: yes - I cannot help that day - but I doubt I'll be needed
19:45:05 Then we also want to generate changes for the openstack-infra stragglers and the openstack-dev straggler repos
19:45:12 so that we can follow normal renaming process
19:45:44 other projects like airship have already started to push their changes
19:45:59 any volunteers to do this for the infra repos?
19:46:45 maybe we can start with an etherpad to collect the ones we think need to be renamed. How about https://etherpad.openstack.org/openstack-infra-stragger-opendev-renames
19:46:53 I just made that url up hopefully not a name collision
19:47:17 i can help get things ready ... but yeah a list would be good
19:47:33 #link https://etherpad.openstack.org/openstack-infra-stragger-opendev-renames Put list of repos that need to be renamed for openstack-infra and openstack-dev here
19:47:49 Next up is the trusty server upgrades
19:47:59 #link https://etherpad.openstack.org/p/201808-infra-server-upgrades-and-cleanup
19:48:06 We are down to status, static, refstack, and wiki
19:48:11 thank you ianw for taking care of ask.o.o
19:48:36 #link https://review.opendev.org/651352 Replace transitional package names for Xenial
19:48:39 that could use another review
19:48:53 I've got status on my list next after gitea 1.8.0 patches - now that the per-service playbook patch has landed
19:49:02 ... if we have a sec, i would like to discuss our thoughts on the future of ask.o.o, but maybe at end if time
19:49:29 ianw: ya should have a few minutes at the end
19:49:34 or we can discuss it in #openstack-infra after if we run out
19:49:53 The last major item on the agenda is the opendev in region mirrors deployment
19:49:55 #link https://review.opendev.org/#/c/658281/ actual change to implement opendev.org mirrors
19:50:06 if you have time to review that please do. This will get us tls on our mirrors
19:50:19 and update us to bionic
19:50:20 thanks mordred! i'll take another go at wiki-dev after that merges
19:51:11 #topic Open Discussion
19:51:14 aka the future of ask
19:51:51 as we've discussed previously, it's a bit of an unusual service in that the sort of folks who are relying on it are highly unlikely to be the folks who want to help us maintain it
19:52:17 ianw: I was thinking about this earlier today and one idea I had was to basically send email to the openstack-discuss list explaining that it is basically on life support via some hacky workarounds to problems (link to those details).
Then basically ask people to help us do it better and mention that docker/ansible are options
19:52:40 And let that serve as notice that we'll probably just turn the service off when xenial is eol
19:52:49 if it's got any future, it probably needs people who have an interest in it continuing to exist because their project benefits from it being available to their users
19:52:55 assuming no improvements are made to make it sustainable
19:52:56 yeah, that was sort of my plan, draft an email and send something, just wanted to make sure others were ok with it
19:52:57 clarkb, ianw: should we wait just a little bit on that mirror patch and boot them using the ubuntu-bionic-minimal images we're about to start building?
19:53:21 mordred: that seems reasonable if we can get these changes in soonish
19:53:37 also - is linaro-london a thing?
19:53:37 mordred: however our mirrors have long been run on cloud provided images so probably not a major deal if we don't get that in first
19:53:48 mordred: aiui linaro-london is a thing but linaro the china cloud is no longer
19:53:54 mordred: ok, i have a couple of changes out to add dns entries and setup the first mirror, which i've launched. i wouldn't mind debugging issues on that
19:54:00 gotcha - so we should add linaro-london to the image-building patch
19:54:12 #link https://review.opendev.org/#/c/660235/ dns entries
19:54:18 we could also stand to leard from history. we did send a similar message two or three years ago about turning off the wiki, and there was outcry that people who are unable to help maintain it heavily use it, and then we ended up not turning it off even though it's still mostly unmaintained
19:54:26 s/leard/learn/
19:54:27 #link https://review.opendev.org/#/c/660237/ system-config updates
19:54:38 cool- mostly was thinking it's not a huge deal - but since we're about to boot a bunch of things, it seems like good timing if we can make it work
19:54:45 but if not - I don't think we should hold up progress
19:55:00 fungi: ya though in the case of the wiki the people that heavily use it are the types that can direct resources to help which maybe makes that a worse situation than ask
19:55:14 yep
19:55:26 we can always just make jimmy run it
19:55:28 ianw: sending that email has my vote
19:55:33 ianw: just to be clear on that
19:55:34 he's not busy enough already :)
19:56:25 yeah, i mean we don't want to be taking on major puppet refactors at this point
19:57:01 taking over dead upstream puppet modules etc
19:57:48 stackoverflow have a sort of community thing https://area51.stackexchange.com/faq
19:58:10 In general I think we need to start making arguments for helping the opendev team from the constituent projects particularly when a service (like ask) doesn't have a ton of overlap in end users
19:58:24 this seems like a reasonable start since the service is on life support
19:58:28 really, in terms of "critical mass of people looking at your problems, one of whom might have an answer" being on a larger site like stackoverflow probably has a lot of benefits ...
19:58:39 yeah.
it's also openstack specific - so it's only a service serving one of our tenants
19:58:51 (ask is)
19:58:58 (that's me agreeing with clarkb)
19:59:03 (parenthetically)
19:59:06 well, the same could be said of any of our services which currently say "openstack" on them
19:59:34 fungi: sure - but some of them are more likely to be renamed .opendev.org - like paste or etherpad - and serve the general opendev world
19:59:46 it could be made not-openstack-specific if we wanted, but at that point we really are just running a site competing with far better-funded alternatives
19:59:49 others are specific to serving the openstack project ecosystem - which is also fine
19:59:55 fungi: exactly
20:00:03 also many of them are developer tools and developers seem to have a much easier time of understanding this says openstack but it's an open tool for me to use
20:00:10 yah
20:00:14 and we are at time
20:00:19 thank you everyone
20:00:22 #endmeeting