19:01:15 #startmeeting infra
19:01:16 Meeting started Tue Dec 3 19:01:15 2019 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:17 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:19 The meeting name has been set to 'infra'
19:01:32 o/
19:01:39 #link http://lists.openstack.org/pipermail/openstack-infra/2019-December/006535.html Our Agenda
19:01:47 #topic Announcements
19:01:53 o/
19:02:10 No major announcements.
19:02:27 A note that in 3 and 4 weeks we have meeting days scheduled on Christmas and New Year's eves
19:02:39 I expect we might simply avoid meeting on those days
19:02:57 i will be out at sea, so no internet
19:03:11 (i mean, more out at sea than i usually am)
19:03:21 i aim for similar non-availability
19:03:53 and moisture content. though perhaps a different altitude.
19:04:06 ok unless I hear interest otherwise we can go ahead and pencil in meeting cancellations for the 24th and 31st
19:04:12 and a different latitude? ;)
19:04:25 and longitude!
19:04:45 oh and for ianw it will be the actual holidays local time
19:04:49 even more reason to cancel :)
19:05:18 that way he can go enjoy the blistering summer sun
19:05:30 #topic Actions from last meeting
19:05:35 #link http://eavesdrop.openstack.org/meetings/infra/2019/infra.2019-11-26-19.01.txt minutes from last meeting
19:05:38 though i guess it'll be early enough in the day that it's not too oppressive out for him
19:05:46 o/
19:05:58 ianw: you took an action to create afs volumes for the static migration. We have that as an agenda item later on if you want to skip this for now?
19:06:11 yeah, i didn't do it, sorry!
19:06:21 #topic Priority Efforts
19:06:30 #topic Update Config Management
19:06:39 Quick, let's talk config management updates while mordred is here :)
19:06:46 * mordred hides
19:06:53 mordred: I think we are all interested in gerrit dockering updates if you have them
19:07:02 yes. mostly I ate turkey
19:07:28 but - the last thing I was hitting up against was ssl certs with storyboard-dev and gerrit-dev
19:08:10 would LEing them help? if so we can give them the same treatment planned for static.o.o vhosts with acme cnames in the openstack.org zone pointing to opendev.org zones
19:08:18 because getting a real cert on storyboard-dev is a whole other rabbit hole - but review.o.o doesn't do anything with custom certs for storyboard - so right before the break I think we informally decided to just ignore the whole thing
19:08:29 ah ok
19:08:43 yeah - I think LE would help - but upon reflection I don't think it's worth very much
19:08:48 yeah, doing le would help. it's being manually installed into the review-dev jvm because storyboard-dev has a snakeoil cert and... java
19:08:51 until such a time as we do that for real
19:08:57 yeah
19:09:14 the issue is that now the java is in the container image - so installing the self-signed cert becomes ... very yuck
19:09:43 mordred: I think if you create a couple dns records and then set ansible vars the LE stuff would actually work as is
19:09:50 so I do think we should LE storyboard-dev - but I'm going to see if we can't finish review-dev, other than the storyboard-dev ssl cert, without blocking on it
19:09:52 the "gotcha" there is we have to create dns records in the openstack.org zone too
19:10:01 right, also it's only for its-storyboard communication at the moment i think, so if that breaks for a while between the dev servers i wouldn't be too concerned
19:10:08 yah
19:10:20 well that and writing the appropriate handler to symlink/copy files and restart services (this wasn't too bad for gitea at least)
19:10:30 fungi: agreed
19:10:36 yeah, that latter bit is the harder part of it
19:10:44 yup. puppet.
19:11:23 next goal there being to get review-dev completely run from ansible+podman
19:11:39 then I'll tackle the stuff that review does that review-dev doesn't do :)
19:11:40 would it help if the letsencrypt roles make archives or whatever it was java needs?
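The "manually installed into the review-dev jvm" step discussed above amounts to importing storyboard-dev's snakeoil cert into the JVM truststore by hand. A rough sketch of what that manual workaround looks like, assuming a JDK with keytool available; the hostname, alias, file paths, and truststore location here are illustrative, not the real deployment's:

```shell
# Hypothetical sketch: trust a self-signed (snakeoil) cert in a JVM.
# This is the sort of per-host hackery that proper LE certs would remove.

# fetch the server's current certificate as PEM
openssl s_client -connect storyboard-dev.openstack.org:443 </dev/null 2>/dev/null \
  | openssl x509 -outform PEM > /tmp/storyboard-dev.pem

# import it into the JVM-wide truststore (default store password is "changeit");
# inside a container image this file is baked in, which is why it gets "very yuck"
keytool -importcert -noprompt -trustcacerts \
  -alias storyboard-dev \
  -file /tmp/storyboard-dev.pem \
  -keystore "$JAVA_HOME/lib/security/cacerts" \
  -storepass changeit
```

With a real Let's Encrypt cert chain on storyboard-dev, none of this is needed, since the JVM already trusts the LE CA.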
19:12:02 ianw: not really - the main goal from LE would be to avoid needing to register anything with java
19:12:04 nah, java needs nothing if letsencrypt is providing proper certs for sb-dev
19:12:11 so if we LE storyboard-dev - the issue goes away
19:12:12 yeah
19:12:39 the complexity there is purely a workaround for supporting snakeoil certs
19:12:42 oh, ok, it's just the self-signed bits that cause problems
19:12:59 exactly
19:13:10 but if we can limp along for a bit - it might be nicer to just ansible+container storyboard after we're done with gerrit - and we can LE things then I'd guess
19:13:45 i also have a lingering todo to redeploy it with python 3 anyway (the current deployment is python 2)
19:14:02 but we get that for free too as part of the containering
19:14:43 testing and dev environments for it have been python 3 for a while, our deployment is just lagging behind
19:14:54 also - people should review ianw's patches https://review.opendev.org/#/q/topic:docker-siblings
19:15:16 not gerrit related - but fell out of the updates for nodepool
19:15:18 those are currently very high on my list. To be done after the opendev governance email goes out
19:15:54 ++ i will do that today as well
19:16:47 alright anything else on this topic or should we move on?
19:17:00 I think that's all I've got for now - hasn't moved much over the last few weeks. turkey etc etc
19:17:44 #topic OpenDev
19:17:55 My plan on the governance email is to send that out immediately after this meeting
19:18:12 as well as pointer threads on the various -discuss lists to point people at it (in an effort to keep discussion in one place)
19:18:45 thanks again for writing that up
19:18:53 The other thing is tonyb's git fetch bug, for which I think we agreed the next step is to upgrade to the current gitea release?
19:19:03 mordred: ^ any progress on being able to upgrade to 1.10?
19:19:19 and if not are we able to help?
19:19:43 clarkb: actually - the change is green now
19:20:16 so I'm going to say - I am not aware of any blockers to us upgrading
19:20:24 https://review.opendev.org/#/c/690873/
19:20:48 i'm around to keep an eye on post-upgrade behavior if we want to press forward there
19:20:51 #link https://review.opendev.org/#/c/690873/ upgrade gitea to 1.10 then retest tonyb's git fetch bug
19:21:46 please review https://review.opendev.org/#/c/690873/8/docker/gitea/custom/templates/repo/header.tmpl carefully - ttx's template change caused a merge conflict that I fixed - but making sure I fixed it right would be good when eyeballs are happening
19:21:49 I'll add that to the review list
19:21:56 mordred: ++
19:21:57 there was another report of hangs on that upstream bug too, seemed similar but not 100% the same
19:22:50 i think they said they were pushing, which would be something review.o.o would be doing, and i don't think we've seen that stuck?
19:23:04 ianw: not since we've discovered the gitconfig lockfile issue
19:23:28 basically if the lock file is stale gitea won't actually update the repos on push (though they don't hang either, they "succeed")
19:24:42 Alright anything else re opendev?
19:25:31 #topic Storyboard
19:25:39 fungi: diablo_rojo: Anything to share?
19:26:01 i'm currently working on creating the swift container for attachments, but had some questions
19:26:19 i was going to try to duplicate how we set up authentication for the intermediate registry
19:26:41 i found where we use the values in the system-config role to populate clouds.yaml
19:26:54 but how did we create those container credentials?
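The container setup asked about above was never documented (as the following exchange confirms), so as a loose reconstruction only: against a plain Swift API, creating a container and scoping write access to it comes down to two requests. The container name, ACL values, and the `$STORAGE_URL`/`$TOKEN` variables here are placeholders; the real auth step (and whether it was a Rackspace cloud user or objects-only user) is exactly what is uncertain below:

```shell
# Hypothetical reconstruction of the "bunch of curl stuff" - not the
# recorded commands. Assumes $STORAGE_URL and $TOKEN were obtained from
# some prior authentication step that is not shown here.

# create the attachments container
curl -i -X PUT -H "X-Auth-Token: $TOKEN" "$STORAGE_URL/attachments"

# restrict writes to a specific user while leaving objects publicly readable,
# via Swift's container ACL headers (names after the colon are placeholders)
curl -i -X POST -H "X-Auth-Token: $TOKEN" \
  -H "X-Container-Write: project:storyboard-user" \
  -H "X-Container-Read: .r:*" \
  "$STORAGE_URL/attachments"
```

The point of the ACL step is the one made in the discussion: grant write access to one container rather than handing out credentials for all of the tenant's resources.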
19:27:06 i'm pretty sure we (i) did a bunch of curl stuff to set that up
19:27:19 i love to do a bunch of curl stuff
19:27:31 curl fungi | bash
19:27:31 i should have added it to the docs, sorry
19:27:44 and i doubt i have any record now
19:27:56 no worries, i can probably reverse-engineer it, but yeah, i looked for docs on how we got there and didn't find any
19:27:56 is that because we create subaccounts that only work with swift rather than full cloud credentials?
19:28:10 i think so, yes
19:28:22 in fact
19:28:27 because we only want to grant write access to a container, not to all the provider's resources for our tenant
19:28:39 i'm not actually sure whether that's a rackspace cloud user, or a cloud-objects-only user
19:28:48 i think we've done both things at various points
19:28:55 i suspect the latter, but will dig around
19:29:12 Thanks for all your work on this fungi :)
19:29:15 probably worth looking at the control panel and seeing if it shows up anywhere
19:29:23 and i'll make a note to put something about how we're managing swift creds in the system-config docs
19:30:38 unrelated, i'm wondering if we need to keep sb updates as a standing item in the weekly infra meeting agenda, and/or whether we should consider the infra spec done enough. the software is under ongoing improvement but it's already being used by a bunch of projects day to day
19:30:43 wow, it looks like rax may be burning its old blog content
19:31:01 see https://blog.rackspace.com/create-cloud-files-container-level-access-control-policies for unhelpfulness
19:31:11 yikes
19:31:19 Huh interesting
19:32:04 fungi: ya I think we've largely reached steady state there and are supporting projects as they show up. For infra related tasks like the swift item we can add those as normal agenda entries
19:32:24 right, i think we have sufficiently established channels for folks to get updates and ask questions about sb
19:33:01 also as we transition to being more focused on opendev activities, a fair amount of the outstanding stuff in that spec is about openstack project specific process
19:33:12 if there are no objections I can push up the appropriate specs repo change(s) and update our meeting agenda
19:33:25 and can be tracked independently in openstack anyway
19:33:29 None from me.
19:34:07 i can push it up even, will take a closer look and see if we need any sort of epilogue for it while i'm there
19:34:18 fungi: that would be great. Thanks!
19:34:25 Anything else on the storyboard topic while we are here?
19:34:51 I think the swift container is the only news really.
19:34:58 So nothing else from me
19:35:07 nor me
19:35:34 #topic General Topics
19:35:50 We'll start with a quick check on the wiki upgrade (I don't expect this has moved much given the holiday)
19:36:06 fungi: ^ any surprises there?
19:36:48 nope, unless you like to be surprised by me not finding time to take it further yet
19:37:05 Next up is static.o.o moves
19:37:23 though i did include the wiki in your list of services we may want to talk about retiring (again)
19:37:33 fungi: k
19:37:40 ianw: mentioned he hadn't created afs volumes yet. Is that something we can help with or does it just need time?
19:37:55 do we have a list of which volumes we'll need?
19:38:03 sorry i just got totally distracted with iterating on container things
19:38:27 i guess the other critical bit is how much quota to give each of them
19:38:43 presumably inferred from current utilization and growth over time
19:38:45 for quota we should be able to run a du against current content and double it
19:38:46 that's really the correlation i need to do with the initial jobs proposed by ajaeger
19:39:30 i also think the clarkb algorithm is an excellent approach to capacity planning ;)
19:40:00 i will do it today. no more excuses :)
19:40:05 ianw: thank you!
19:40:21 That is a good lead into the dib/nodepool containerization topic though
19:40:34 I see that we've changed the planned approach since last week. Can you give us an update on that?
19:40:53 #link https://review.opendev.org/#/q/topic:docker-siblings
19:41:02 #link http://lists.openstack.org/pipermail/openstack-infra/2019-November/006529.html
19:41:57 ianw: ^
19:42:40 yes, after discussion yesterday (in #openstack-infra) we are fleshing out the idea of making the python-builder be able to install siblings
19:43:00 rather than the idea of building smaller images on top of each other
19:43:44 my main goal is to allow for cross-project testing; this should allow it without having to import jobs across tenants
19:44:02 (which breaks abstractions about what projects rely on what projects)
19:44:07 would each tenant have their own job to build the images then (using a common dockerfile?)
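Returning to the AFS quota sizing mentioned above — "run a du against current content and double it" — the rule is small enough to capture as a helper. A sketch; the function name is made up here, and the suggestion about `vos create -maxquota` is an assumption about how the number would be consumed:

```shell
# The "clarkb algorithm" for AFS quota sizing: measure current usage
# with du and double it to leave room for growth.
# afs_quota_kb <path> prints a suggested quota in kilobytes.
afs_quota_kb() {
    used_kb=$(du -sk "$1" | cut -f1)   # -k reports in 1024-byte blocks
    echo $(( used_kb * 2 ))
}
```

The resulting KB figure is in the unit OpenAFS expects, e.g. for `vos create ... -maxquota` when the volumes get created.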
19:45:54 yes, or more likely projects will choose to share the zuul job
19:46:22 but it means we do not have to import all openstacksdk/dib jobs into zuul to be able to build dib/openstacksdk containers
19:46:28 the common dockerfile in this case is likely to be the nodepool-builder dockerfile, since the immediate goal is to test a nodepool-builder container image with a change to dib or sdk
19:46:55 I see
19:47:47 yes, so basically build the nodepool containers from the zuul checkouts of nodepool (the main project) and the siblings dib+openstacksdk
19:48:35 dib already uses the nodepool functional test jobs, so i don't see any issues consuming a container build job defined there either
19:48:54 so does openstacksdk actually (95% sure it runs the nodepool functional tests)
19:49:44 so the whole system should end up being a pretty close analog of the current one
19:50:59 sounds good. Anything else to add?
19:51:42 no; there's a lot in flight but i think we've found a rough heading :)
19:53:19 Last item on the agenda is taking stock of our current list of services and thinking about whether or not it may make sense to retire some of them
19:53:36 Thinking this will help us focus on running the services we keep better and provide better functionality through them
19:53:39 #link https://etherpad.openstack.org/infra-service-list
19:54:11 At this point I think we are largely in the brainstorm phase but if it makes sense I'd like to start pushing on this in the new year
19:55:10 When you're drinking tea/coffee/whatever it would be great if you could skim it and add feedback
19:55:17 #topic Open Discussion
19:55:27 And now we have ~5 minutes for anything else we'd like to bring up
19:56:13 the list looks good. i don't think ethercalc is much of a burden? but i'm okay if it goes. it's one of many services that would be easier to run if we had a k8s.
19:57:58 agreed, a better hosting platform would simplify some of these simpler services. Though we still need to manage backups for them
19:58:28 we also haven't updated it in a while
19:58:29 ethercalc just needs db backups presumably
19:58:33 fungi: yes
19:59:58 And we are at time. Thank you everyone! we'll see you here next week
20:00:08 yah - I hold out optimism that we will find a point where putting in a k8s will reduce the overall burden on running the simpler services
20:00:09 #endmeeting