19:01:17 #startmeeting infra
19:01:18 hi
19:01:18 Meeting started Tue Nov 26 19:01:17 2019 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:19 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:21 The meeting name has been set to 'infra'
19:01:28 #link http://lists.openstack.org/pipermail/openstack-infra/2019-November/006528.html Our Agenda
19:01:41 #topic Announcements
19:01:54 Just a note that this is a big holiday week(end) for those of us in the USA
19:02:13 I know many are already afk and I'll be afk no later than thursday :)
19:02:48 i'll likely be increasingly busy for the next two days as well
19:03:09 (much to my chagrin)
19:03:43 I guess I should also mention that OSF individual board member elections are coming up and now is the nomination period
19:04:14 #topic Actions from last meeting
19:04:31 thank you fungi for running last week's meeting, I failed at accounting for DST changes when scheduling a dentist visit
19:04:37 #link http://eavesdrop.openstack.org/meetings/infra/2019/infra.2019-11-19-19.05.txt minutes from last meeting
19:04:50 No actions recorded. Let's move on
19:04:51 no sweat
19:04:57 #topic Priority Efforts
19:05:09 #topic OpenDev
19:05:10 i half-assed it in hopes nobody would ask me to run another meeting ;)
19:05:19 #link https://etherpad.openstack.org/p/rCF58JvzbF Governance email draft
19:05:30 I've edited that draft after some input at the ptg
19:05:47 I think it is ready to go out and I'm mostly waiting for monday (to avoid getting lost in the holiday) and for thoughts on who to send it to?
19:06:15 I had thought about sending it to all the top level projects that are involved (at their -discuss mailing lists)
19:06:33 but worry that might create too many separate discussions? we unfortunately don't have a good centralized mechanism yet (though this proposal aims to create one)
19:06:51 I don't need an answer now, but if you have thoughts on the destination of that email please leave a note in the etherpad
19:07:04 you could post it to the infra ml, and then post notices to other related project mailing lists suggesting discussion on the infra ml
19:07:21 and linking to the copy in the infra ml archive
19:07:41 that should hopefully funnel discussion into one place
19:07:58 I'm willing to try that. If there are no objections I'll give that a go monday ish
19:08:29 ianw: any movement on the gitea git clone thing tonyb has run into?
19:08:46 we upgraded gitea and can still reproduce it, right?
19:08:50 not really, except it still happens with 1.9.6
19:08:51 fungi: yes
19:09:09 tonyb uploaded a repo that replicates it for all who tried
19:09:30 across different git versions and against specific and different backends
19:09:35 it seems we know what happens; git dies on the gitea end (without anything helpful) and the remote end doesn't notice and sits waiting ~ forever
19:09:41 which I think points to a bug in the gitea change to use go-git
19:10:01 ianw: and tcpdump doesn't show any Fins from gitea
19:10:21 we end up playing ack pong with each side acking bits that were previously transferred (to keep the tcp connection open)
19:10:38 i'm not sure on go-git; it seems it's a result of "git upload-pack" dying, which is (afaics) basically just system() called out to
19:10:44 ah
19:11:29 how long is that upload-pack call running, do we know?
19:11:42 could it be that go-git decides it's taking too long and kills it?
19:11:46 when I reproduce it takes about 20 seconds to hit the failure case
19:12:03 I don't think that is long enough for gitea to be killing it. However we can double check those timeout values
19:12:04 from watching what happens, it seems to chunk the calls and so about 9 go through, then the 10th (or so) fails quickly
19:12:42 we see the message constantly in the logs; but there don't seem to be that many reports of issues, though, only tonyb
19:12:51 this is observed straight to the gitea socket, no apache or anything proxying it right?
19:12:56 fungi: correct
19:13:08 ianw: there can be only one tonyb
19:13:10 (there is no apache in our gitea setup. just haproxy to gitea fwiw)
19:13:25 that's what i thought, thanks
19:15:00 i think we probably need custom builds with better debugging around the problem area to make progress
19:15:08 I guess the next step is to try and see why upload-pack fails (strace it maybe?) and then trace back up through gitea to see if it is the cause or simply not handling the failure properly?
19:15:25 I would expect that gitea should close the tcp connection if the git process under it failed
19:15:30 yeah, i have an strace in the github bug, that was sort of how we got started
19:15:35 ah
19:16:08 it turns out the error message is ascii bytes in decimal, which when you decode is actually a base-64 string, which when decoded, shows the same message captured by the strace :)
19:16:47 wow
19:17:17 i know mordred already has 1.10 patches up
19:17:43 i'm not sure if we want to spend effort on old releases?
19:17:45 yeah there were a few issues he had to work through, but maybe we address those and get to 1.10 then try to push upstream to help us debug further?
19:17:57 that sounds good
19:18:18 seems like a good next step. lets move on
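
A minimal sketch of the decoding described at 19:16:08, assuming the gitea log prints the git error as space-separated decimal ASCII codes that spell out a base64 string; the sample message below is a made-up stand-in, not the actual text from the gitea logs or the strace.

    import base64

    # Hypothetical stand-in for whatever git upload-pack actually emitted.
    original = b"fatal: the remote end hung up unexpectedly"

    # What the log would show: decimal ASCII codes of a base64-encoded copy.
    logged = " ".join(str(byte) for byte in base64.b64encode(original))

    # Reversing it: decimal codes -> characters -> base64 decode -> message.
    recovered = base64.b64decode(bytes(int(n) for n in logged.split()))
    assert recovered == original
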
19:18:26 #topic Update Config Management
19:18:41 zbr_ had asked about helping out on the mailing list and I tried to point to this topic
19:19:26 Long story short if you'd like to help us uplift our puppet into ansible and containers we appreciate the help greatly. Also most of the work can be done without root as we have a fairly robust testing system set up which will allow you to test it all before merging anything
19:19:27 it was a great read
19:19:33 Then once merged an infra-root can help deploy to production
19:19:50 ++ i think most tasks there stand-alone, have templates (i should reply with some prior examples) and are gate-testable with our cool testinfra setup
19:20:39 That was all I had on this topic. Anyone have other related items?
19:21:09 #topic Storyboard
19:21:18 fungi: diablo_rojo anything to mention about storyboard?
19:22:06 the api support for attachments merged
19:22:37 next step there is to negotiate and create a swift container for storyboard-dev to use
19:22:43 exciting
19:23:12 then the storyboard-webclient draft builds of the client side implementation for story attachments should be directly demonstrable
19:23:43 (now that we've got the drafts working correctly again after the logs.o.o move)
19:24:26 i guess we can also mention that the feature to allow regular expressions for cors and webclient access in the api merged
19:24:40 since that's what we needed to solve that challenge
19:25:17 so storyboard-dev.openstack.org now allows webclient builds to connect and be usable from anywhere, including your local system i suspect
19:25:43 (though i haven't tested that bit, not sure if it needs to be a publicly reachable webclient to make openid work correctly)
19:25:51 sounds like good progress on a couple fronts there
19:26:19 any suggestions on where we should put the attachments for storyboard-dev?
19:26:31 i know we have a few places we're using for zuul build logs now
19:26:51 maybe vexxhost would be willing to host storyboard attachments as I expect there will be far fewer of them than job log files?
19:27:09 for production we need to make sure it's a container which has public indexing disabled
19:27:22 less critical for storyboard-dev but important for production
19:27:33 fungi: I think we control that at a container level
19:27:38 (via x-meta settings)
19:28:03 (to ensure just anyone can't browse the container and find attachments for private stories)
19:28:16 cool
19:28:40 and yeah, again for storyboard-dev i don't think we care if we lose attachment objects
19:28:51 for production there wouldn't be an expiration on them though, unlike build logs
19:29:12 maybe we should work out a cross-cloud backup solution for that
19:29:29 to guard against unexpected data loss
19:29:57 I think swift supports that somehow too, but maybe we also have storyboard write twice?
19:30:29 yeah, we could probably fairly easily make it write a backup to a second swift endpoint/container
19:31:08 that at least gets us disaster recovery (though not rollback)
19:31:42 certainly enough to guard against a provider suddenly going away or suffering a catastrophic issue though
19:32:25 anyway, that's probably it for storyboard updates
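
One possible shape for the attachments container discussed above, sketched with python-swiftclient under the assumption that read access is controlled through the container ACL: ".r:*" allows anonymous object reads, while omitting ".rlistings" keeps the container index private (the "x-meta settings" mentioned at 19:27:38 may instead refer to staticweb's web-listings flag). The endpoint, credentials, and container name are placeholders.

    from swiftclient.client import Connection

    # Placeholder auth details, not any real provider's endpoint or account.
    conn = Connection(
        authurl="https://cloud.example.com:5000/v3",
        user="storyboard",
        key="secret",
        auth_version="3",
        os_options={"project_name": "storyboard",
                    "user_domain_name": "Default",
                    "project_domain_name": "Default"},
    )

    # Create the container and allow anonymous object reads without public
    # listing: ".r:*" grants reads; leaving out ".rlistings" disables indexing.
    conn.put_container("storyboard-dev-attachments",
                       headers={"X-Container-Read": ".r:*"})

A second Connection pointed at another provider could receive duplicate writes to cover the cross-cloud backup idea raised at 19:29:12.
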
19:32:45 #topic General Topics
19:32:51 fungi: anything new re wiki?
19:33:29 nope, keep failing to find time to move it forward
19:34:09 ianw: for static replacement are we ready to start creating new volumes?
19:34:21 I think the afs server is fully recovered from the outage? and we are releasing volumes successfully
19:34:55 yes, i keep meaning to do it, maybe give me an action item so i don't forget again
19:34:57 yeah, some releases still take a *really* long time, but they're not getting stuck any longer
19:35:18 #action ianw create AFS volumes for static.o.o replacement
19:35:44 though on a related note, we need to get reprepro puppetry translated to ansible so we can move our remaining mirroring to the mirror-update server. none of the reprepro mirrors currently take advantage of the remote release mechanism
19:35:46 fungi: yeah, the "wait 20 minutes from last write" we're trying with fedora isn't working
19:36:18 yeah, i started a little on reprepro but not pushed it yet, i don't think it's too hard
19:36:39 i think it shouldn't be too hard, it's basically a package, a handful of templated configs, maybe some precreated directories, and then cronjobs
19:36:41 its mostly about getting files in the correct places
19:36:46 there are a lot of files but otherwise not too bad
19:36:49 and a few wrapper scripts
19:37:38 Next up is the tox python version default changing due to the python used to install tox
19:37:42 #link http://lists.openstack.org/pipermail/openstack-discuss/2019-November/010957.html
19:37:59 ianw: fwiw I agree that the underlying issue is tox targets that require a specific python version and don't specify what that is
19:38:11 these tox configs are broken anywhere someone has installed tox with python3 instead of 2
19:38:30 yeah, i just wanted to call out that there wasn't too much of a response, so i think we can leave it as is
19:38:45 wfm
19:38:51 yep, with my openstack hat on (not speaking for the stable reviewers though) i feel like updating stable branch tox.ini files to be more explicit shouldn't be a concern
19:39:21 there's already an openstack stable branch policy carve-out for updating testing-related configuration in stable branches
19:39:24 I think we're just going to have to accept there will be bumps on the way to migrating away from python2
19:39:34 and we've run into other bumps too so this isn't unique
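
The tox.ini change being discussed is roughly the following: pin basepython for environments that genuinely need a specific interpreter, so the result no longer depends on which python was used to install tox itself. The environment names are illustrative, not taken from any particular repo.

    [testenv:py27]
    basepython = python2.7

    [testenv:pep8]
    # without this, the env runs under whichever python installed tox
    basepython = python3
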
19:40:03 And that takes us to mkolesni's topic
19:40:11 Hosting submariner on opendev.org.
19:40:15 thanks
19:40:19 I think we'd be happy to have you but there were questions about CI?
19:40:22 dgroisma, you there?
19:40:28 mkolesni: thanks for sticking around through 40 minutes of other discussion ;)
19:40:35 fungi, no prob :)
19:40:52 let me wake dgroisma ;)
19:41:18 we wanted to ask what it takes to move some of our repos to opendev.org
19:41:50 currently we have all our ci in travis on the github
19:42:11 The git import is usually pretty painless. We point our gerrit management scripts at an existing repo source that is publicly accessible and suck the existing repo content into gerrit. This does not touch the existing PRs or issues though
19:42:15 there are many questions around ci, the main one being if we could keep using travis
19:42:34 For CI we run Zuul and as far as I know travis doesn't integrate with gerrit
19:42:41 clarkb, yeah for sure the prs will have to be manually migrated
19:43:02 It may be possible to write zuul jobs that trigger travis jobs
19:43:14 That said my personal opinion is that much of the value in hosting with opendev is zuul
19:43:39 I think it would be a mistake to put effort into continuing to use travis (though maybe it would help us to understand your motivations for the move if Zuul is not part of that)
19:43:41 the short version on moving repos is that you define a short stanza for the repository including information on where to import existing branches/tags from, also define a gerrit review acl (or point to an existing acl) and then automation creates the projects in gerrit and gitea. after that you push up a change into your repo to add ci configuration for zuul so that changes can be merged (this can
19:43:43 be a no-op job to just merge whatever you say should be merged)
19:44:10 btw are there any k8s projects hosted on opendev?
19:44:22 how do you define a kubernetes project?
19:44:31 airship does a bunch with kubernetes
19:44:39 so do some openstack projects like magnum and zun
19:44:43 fungi: as does magnum and mnaser's k8s deployment tooling
19:44:47 one that's in golang for example :)
19:44:57 there are golang projects
19:45:10 yeah, programming language shouldn't matter
19:45:10 bits of airship are golang as an example
19:45:22 we have plenty of projects which aren't even in any programming language at all for that matter
19:45:42 for example, projects which contain only documentation
19:46:14 you guys suggested a zuul first approach
19:46:23 to transition to zuul and then do a migration
19:46:46 but there was hesitation for that as well since zuul will have to test github based code for a while
19:47:24 mkolesni: dgroisma how many jobs are we talking about and are they complex or do they do simple things like "execute unittests", "build docs", etc?
19:47:26 well, it wouldn't have to be the zuul we're running. zuul is free software anyone can run wherever they like
19:47:54 My hunch is they can't be too complex due to travis' limitations
19:48:11 clarkb, dgroisma knows best and can answer that
19:48:14 and if that is the case quickly adding jobs in zuul after migrating shouldn't be too difficult and is something we can help with
19:48:17 the jobs are a bit complex, we are dealing with multicluster and require multiple k8s clusters to run for e2e stuff
19:48:37 dgroisma: and does travis provide that or do your jobs interact with external clusters?
19:48:50 the clusters are kind based (kubernetes in docker), so its just running a bunch of containers
19:48:56 is that a travis feature, or something you've developed that happens as part of your job payload?
19:49:09 fungi, well currently we rely on github and travis and dont have our own infra so we'd prefer to avoid standing up the infra just for migration sake
19:49:27 mkolesni: totally makes sense, just pointing that out for clarity
19:49:33 ok sure
19:49:39 its our bash/go tooling
19:50:06 our tooling, not a travis feature
19:50:14 we use dapper images for the environment
19:50:19 okay, so from travis's perspective it's just some shell commands being executed in a generic *nix build environment?
19:50:32 yes
19:50:53 in that case, making ansible run the same commands ought to be easy enough
19:50:54 the migration should be ok, we just run some make commands
19:51:05 fungi: dgroisma mkolesni and we can actually prove that out pre migration
19:51:17 we have a sandbox repo which you can push job configs to which will run your jobs premerge
19:51:44 That is probably the easiest way to make sure zuul will work for you, then if you decide to migrate to opendev simply copy that job config into the repos once they migrate
19:52:03 that should give you some good exposure to gerrit and zuul too which will likely be useful in your decision making
19:52:04 yeah, you will probably also find that while you start with basically ansible running shell: mycommand.sh ... you'll find many advantages in getting ansible to do more and more of what mycommand.sh does over time
19:52:12 clarkb, so you mean do initial migration, test the jobs, and if all is good sync up whatever is left and carry on?
19:52:29 or is the sandbox where we stick the project itself?
19:52:41 mkolesni: no I mean, push jobs into opendev/sandbox which already exists in opendev to run your existing test jobs against your software
19:52:47 you could push up a change to the opendev/sandbox repo which replaces all the files with branch content from yours and a zuul config
19:52:57 it doesn't need to get approved/merge
19:53:00 ah ok
19:53:02 Then if you are happy with those results you can migrate the repos and copy the config you've built in the sandbox repo over to your migrated repos
19:53:12 this way you don't have to commit to much while you test it out and don't have to run your own zuul
19:53:12 zuul will test the change as written, including job configuration
19:53:25 dgroisma, does that approach sound good to you? for a poc of the CI?
19:53:37 yes sounds good
19:53:43 ok cool
19:53:56 do you guys have any questions for us?
19:54:10 i think the creators guide covers everything else we need
19:54:14 not really. it's all free/libre open source software right?
19:54:15 I'm mostly curious to hear what your motivation is if not CI (most people we talk to are driven by the CI we offer)
19:54:34 also we'd be happy to hear feedback on your experience fiddling with the sandbox repo and don't hesitate to ask questions
19:54:39 gerrit reviews
19:54:40 sounds like the ci is a motivation and they just want a smooth transition from their existing ci?
19:54:43 github sucks for collaborative development :)
19:54:50 oh neat we agree on that too :)
19:54:55 :)
19:55:16 and as former openstack devs we're quite familiar with gerrit and its many benefits
19:55:29 at least i didn't hear any indication they wanted a way to keep using travis any longer than needed
19:55:40 no i don
19:55:53 i don't think we're married to travis :)
19:56:07 ok sounds like we have a plan for moving forward. Once again feel free to ask questions as you interact with Zuul
19:56:20 welcome (back) to opendev! ;)
19:56:21 I'm going to quickly try to get to the last couple topics before our hour is up
19:56:22 ok thanks we'll check out the sandbox repo
19:56:27 thank you very much
19:56:33 thanks for your time
19:56:34 ianw: want to tldr the dib container image fun?
19:56:40 mkolesni: dgroisma you're welcome
19:57:05 i would say my idea is that we have Dockerfile.opendev Dockerfile.zuul Dockerfile.
19:57:06 ianw: if I read your email correctly it is that layering doesn't work for our needs here and maybe we should just embrace that and have different dockerfiles?
19:57:18 and just build layers together that make sense
19:58:01 i don't know if everyone else was thinking the same way as me, but I had in my mind that there was one zuul/nodepool-builder image and that was the canonical source of nodepool-builder images
19:58:01 It did make me wonder if a sidecar approach would be more appropriate here
19:58:16 but I'm not sure what kind of rpc that would require (and we don't have in nodepool)
19:58:29 but i don't think that works, and isn't really the idea of containers anyway
19:58:34 and then we would publish container images for the things we're using into the opendev dockerhub namespace, even if there are images for that software in other namespaces too, as long as those images don't do what we specifically need? (opendev/gitea being an existing example)
19:58:45 fungi: ya that was how I read it
19:58:59 fungi: yep, that's right ... opendev namespace is just a collection of things that work together
19:59:08 i don't have any objection to this line of experimentation
19:59:11 with the sidecar idea I had it was don't try to layer everything but instead incorporate the various bits as separate containers
19:59:26 it may be useful for others, if they buy into all the same base bits opendev is built on
19:59:38 nodepool builder would run in its own container context then execute dib in another container context and somehow get the results (shared bind mount?)
19:59:39 yeah, putting those things in different containers makes sense when they're services
20:00:10 but putting openstacksdk in a different container from dib and nodepool in yet another container wouldn't work i don't think?
20:00:26 We are at time now
20:00:41 The last thing I wanted to mention is I've started to take some simple notes on maybe retiring some services?
20:00:42 no, adding openstacksdk does basically bring you to multiple inheritance, which complicates matters
20:00:45 #link https://etherpad.openstack.org/infra-service-list
20:00:57 thanks clarkb
20:00:58 ianw: fungi ya I don't think the sidecar is a perfect fit
20:01:22 re opendev services, if you have a moment over tea/coffee/food it would be great for a quick look and thoughts
20:01:37 I think if we can identify a small number of services then we can start to retire them in a controlled fashion
20:01:56 (mostly the ask stuff is what brought this up in my head because it comes up periodically that ask stops working and we really don't have the time to keep it working)
20:02:00 thanks everyone!
20:02:03 #endmeeting