19:00:32 <clarkb> #startmeeting infra
19:00:32 <openstack> Meeting started Tue Nov 28 19:00:32 2017 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:33 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:35 <openstack> The meeting name has been set to 'infra'
19:00:44 <clarkb> hello, who is here for the infra meeting?
19:00:50 <frickler> o/
19:00:54 <tobiash> o/
19:01:01 <ianw> o/
19:01:25 <clarkb> #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:02:21 <clarkb> There are a couple of items on the agenda. jeblair will be joining us late so I think we will do zuulv3 after the general topics list
19:02:31 <clarkb> #topic Announcements
19:03:03 <clarkb> it's been a quiet week with the US Thanksgiving holiday, I'm not aware of anything that needs announcing
19:03:07 <clarkb> is there something I've missed?
19:03:28 <fungi> i guess the venue for the ptg hasn't been officially announced yet
19:03:51 <fungi> so maybe flag that to announce in next week's meeting if it has been by then
19:04:05 * clarkb scribbles a note
19:04:13 <diablo_rojo_phon> It should be announced by this afternoon.
19:04:35 <clarkb> in that case, keep an eye on the openstack-dev mailing list later today for an announcement of the ptg location :)
19:04:43 <AJaeger> o/
19:04:55 <pabelanger> o/
19:04:58 <clarkb> #topic Actions from last meeting
19:05:08 <clarkb> #link http://eavesdrop.openstack.org/meetings/infra/2017/infra.2017-11-21-19.01.txt Minutes from last meeting
19:05:22 <clarkb> Fungi's action to write the secrets backups doc is complete \o/
19:05:43 <fungi> oh, that was complete last week
19:05:51 <fungi> i think i linked it during the meeting then
19:06:06 <clarkb> ya, it ended up in the actions list but is now done, so we can leave it off the list going forward
19:06:09 <fungi> maybe it hadn't merged yet at that point
19:06:53 <clarkb> I'm not aware of any specs that need review or otherwise need to be brought up, and we will skip zuulv3 for now so that jeblair can join us, which means straight to general topics
19:07:00 <clarkb> #topic General topics
19:07:11 <clarkb> #topic Zanata 4 upgrade
19:07:22 <clarkb> #link https://review.openstack.org/#/c/506795/ initial change needed for zanata upgrade work
19:07:52 <clarkb> we are seeing some of the initial changes come in that will allow us to upgrade from zanata 3 to zanata 4. I think the i18n team would like to see this get done before the string freeze, which is in late january.
19:08:18 <clarkb> it would be great if we can help make that possible (code reviews, actually merging code/running upgrades)
19:08:35 <jeblair> o/
19:08:58 <clarkb> I did the last one so am fairly familiar with the service. Is anyone else interested in learning about the java/wildfly/zanata things? If so let me know and we can work to help get this going with the i18n team
19:09:12 <ianw> i have a passing familiarity from the last upgrade, and i think we're in similar tz's, so put me down to help
19:09:27 <clarkb> ianw: awesome thanks
19:09:45 <clarkb> I expect it will be mostly straightforward after talking to aeng, no need for a java update or distro upgrade
19:10:18 <fungi> unlike, say, the next gerrit upgrade
19:10:26 <ianw> i'll start by reviewing ^^ :)
19:10:43 <pabelanger> fungi: +1
19:11:35 <clarkb> #topic Priority Efforts
19:11:44 <clarkb> jeblair: is here, on to zuul
19:11:50 <clarkb> #topic Zuul v3
19:12:17 <clarkb> I wanted to do a quick recap of the zuul cloner venv removal because there were some hiccups with it, and I want to make sure we don't forget to do the last cleanups
19:12:32 <clarkb> pabelanger: ^ I think the fixes for the pyyaml install have gone in on the job side, have we removed it from the image again?
19:12:45 <pabelanger> clarkb: we have!
19:13:11 <pabelanger> I think we are ready to actually move to https://review.openstack.org/513506/ now, which removes zuul-cloner from base jobs
19:13:55 <clarkb> #link https://review.openstack.org/513506/ remove z-c shim from base job is ready for review now
19:14:30 <pabelanger> for 513506, we'll need to be ready to fix jobs that are broken by it, and re-parent them to legacy-base
19:14:42 <clarkb> fungi: I think we basically agreed to remove jenkins from the ci group in gerrit last week as well. Did that happen yet?
19:14:51 <pabelanger> as we expect native zuulv3 jobs 1) not to use zuul-cloner, 2) to parent to base
19:14:52 <fungi> it has not, but i can do it now
19:15:00 <pabelanger> fungi: +1
19:15:08 <fungi> 10 seconds ;)
19:15:12 <clarkb> works for me and is an easy revert if we need to
19:16:00 <fungi> okay, that was more than 10 seconds, but done now
19:16:11 <clarkb> frickler: tristanC have a third party CI of zuul-jobs agenda item
19:16:16 <fungi> i had to remember the group name because i forgot i'd linked it from the agenda
19:16:25 <clarkb> I don't think tristanC is here, but frickler is. Want to fill us in?
19:16:41 <frickler> I'm not directly related to the CI
19:16:47 <jeblair> somehow i missed that email
19:16:50 <fungi> #info The retired "Jenkins" account has been removed from the Continuous Integration Tools group in Gerrit now
19:16:56 <jeblair> but i've read it now.  and i support the concept in general
19:17:04 <frickler> but I wanted to make sure that tristan gets more feedback on his mail
19:17:13 <jeblair> though i was about to send back a reply suggesting that we just use 'recheck' for the recheck command
19:17:25 <fungi> #action fungi send an announcement about the removal of Jenkins from Continuous Integration Tools
19:17:32 <fungi> i'll do that after the meeting
19:17:35 <clarkb> fungi: thanks
19:17:50 <clarkb> #link http://lists.openstack.org/pipermail/openstack-infra/2017-November/005688.html for details on third party CI for zuul-jobs
19:18:01 <jeblair> i've long advocated that third-party cis should not have their own recheck command language
19:18:48 <jeblair> i don't think it should be a problem for this repo
19:19:26 <clarkb> agreed. I think the other consideration to make is whether or not we want it to vote (+1, -1, or both) or just leave a +0 comment with logs
19:20:01 <jeblair> i'm happy to try out voting if other folks are
19:20:51 <clarkb> I think voting helps get more eyeballs on the problem and not allowing voting makes it easier to ignore the failures. So I am happy to try voting as well
19:21:13 <tobiash> ++voting
19:21:20 <dmsimard> hi
19:21:29 <dmsimard> sorry, forgot to fix calendar event timezone..
19:21:38 <fungi> i have no objection to voting third-party ci systems as long as their feedback is reliable
19:21:40 <pabelanger> yah, I think we can give voting a try
19:22:24 <dmsimard> I don't have the backlog for context, but yes, we (RDO/Software Factory) would like to be a third-party CI against zuul-jobs. I don't know what shape this will take yet.
19:22:59 <tobiash> the same applies for me
19:23:05 <jeblair> dmsimard: current context is software-factory third-party ci: http://lists.openstack.org/pipermail/openstack-infra/2017-November/005688.html
19:23:10 <clarkb> dmsimard: basically jeblair has asked that we not have any special recheck syntax, just support 'recheck' like upstream zuul. And we seem to be comfortable to try it out in a voting manner (so we'll need to update gerrit ACLs)
19:23:13 <jeblair> which i somehow missed over thanksgiving
19:23:43 <dmsimard> Ultimately, one of the objectives is to leverage TripleO(-ci) jobs, roles, and playbooks from within review.rdoproject.org, which is analogous to review.openstack.org, so it will be very important for us to be able to re-use zuul-jobs (and potentially other things, but that's another topic) outside of OpenStack
19:24:39 <dmsimard> I'd start with non-voting first to get some confidence that we're doing the right thing
19:24:58 <jeblair> wfm
19:25:09 <jeblair> that's the usual approach i believe
19:25:11 <fungi> just to pile on, i agree wrt standardizing on "recheck" across ci systems. rechecking individual ci systems is moderately dangerous for the same reasons we've resisted requests to recheck individual jobs
19:25:11 <clarkb> dmsimard: that's fine, I just wanted to bring up possible voting early as a lot of projects don't allow it at all and I wasn't sure where we stood on that
19:25:29 <clarkb> so as to avoid surprises later if there were major disagreements :)
19:25:53 <dmsimard> clarkb: well, it's not clear to me yet what these third party jobs will look like
19:26:12 <dmsimard> clarkb: i.e, will it be running base-integration/multinode-integration but just from another zuul for example ?
19:26:41 <fungi> also, to be clear on the third-party ci situation, we've also said in the past that we'll disable accounts for any ci systems which start reporting on infra team repo changes without prior discussion
19:26:42 <dmsimard> I feel like there's stuff we'll realize once we get started
19:26:49 <tobiash> dmsimard: it could be running your most important jobs
19:27:24 <mmedvede> I would object to not allowing third-party CIs to have their own recheck syntax; sometimes we want to nudge only our third-party CI when there is an obvious problem with it, without wasting upstream CI resources
19:27:33 <jeblair> my thought is not to be too prescriptive about what they're doing.  we should have ongoing conversations with third-party ci operators to make sure we're making the most of things, but in general, third-party operators are probably best placed to decide what's important to them and what unique things they can bring to the table.
19:27:39 * rcarrillocruz waves
19:27:46 <fungi> mmedvede: "zuul enqueue" via the rpc in that case
19:28:05 <dmsimard> tobiash: it depends, what's the purpose or the use case ? I don't think running a tripleo-based job against zuul-jobs is necessarily worthwhile -- we're interested in testing the roles individually
19:28:13 <mmedvede> fungi: I thought we are not encouraged to comment on the same patch twice without an explicit recheck
19:28:35 <clarkb> dmsimard: but ya, sounds like you can go ahead and start trying things out in a non-voting capacity, may even want to start in a non-reporting manner first. See how that goes, then tweak from there
19:28:40 <fungi> mmedvede: that might be a policy some team has put into place, but it's not our policy afaik
19:28:49 <jeblair> mmedvede: there is nothing about that in https://docs.openstack.org/infra/system-config/third_party.html#requirements
19:28:55 <tobiash> dmsimard: yes, that probably depends on what's important to you as zuul-jobs user
19:29:01 <dmsimard> mmedvede, fungi, jeblair: unless I'm mistaken, the recheck keyword is controlled by the third party CI so there's nothing preventing operators from responding to "recheck" and "check myzuulname"
19:29:04 <jeblair> mmedvede: there is a note in there saying that "recheck" should retrigger all systems.
19:29:18 <fungi> mmedvede: we're specifically talking about third-party ci systems which want to vote on changes to the openstack-infra/zuul-jobs repo (and potentially other deliverables of the infra team in the future)
19:29:29 <jeblair> dmsimard: nothing except their willingness to abide by the guidelines we've established
19:29:46 <pabelanger> wait, I thought we said recheck foo was good a while back. I feel like we go back and forth on this every few months
19:29:47 <jeblair> fungi: indeed, let's not get too far derailed on this :)
19:29:56 <dmsimard> jeblair: the important part is that they answer to "recheck", right ? if they answer to "check foo" is that an issue ?
19:29:57 <jeblair> pabelanger: i have never said that.
19:30:13 <pabelanger> jeblair: other infra-roots have, IIRC
19:30:36 <jeblair> dmsimard: we don't have an established policy on that.  i would like to, in the context of zuul-jobs only, establish a policy that we don't do that and all systems honor recheck.
19:30:38 <jeblair> only.
19:31:12 <dmsimard> jeblair: that's fine, on our end that means setting up a separate pipeline (because we already have a pipeline meant for third party) but that's not expensive
19:31:50 <jeblair> dmsimard: (or, if it happens to honor something else, just don't mention it)
19:31:56 <dmsimard> sure
19:32:01 <fungi> dmsimard: "our" in this context being rdo ci?
19:32:22 <dmsimard> fungi: yeah, RDO's zuul answers to things like "check rdo experimental" (so we don't trigger "check experimental")
19:32:32 <dmsimard> and possibly other things
19:33:06 <mmedvede> jeblair, dmsimard: agreed, recheck should absolutely retrigger all the CI systems. But this does not exclude the ability for third-party CIs to also be triggered separately. I would like there to be an official blessed syntax for this. Right now each CI comes up with their own
19:33:22 <fungi> i'm curious why someone would want upstream experimental pipeline results but not rdo's experimental pipeline results. still, that's not crucial to this topic
19:33:34 <clarkb> ya, I think we may be starting to get into another topic entirely
19:33:41 <clarkb> we can come back to that if there are no other zuulv3 items, or after we finish them
19:34:05 <clarkb> I think we may want to talk about the merging of branches though that wasn't on the agenda. Any other zuulv3 items?
19:34:25 <dmsimard> I have something
19:34:29 <fungi> mmedvede: part of the resistance, historically, is that we feel leaving comments in the code review system is a bad api anyway, and would like to eventually have some other interface for such tasks
19:35:00 <fungi> and not tie ourselves to a standard involving arbitrary code review comment strings
19:35:51 <clarkb> dmsimard: what was your zuulv3 item?
19:36:06 <dmsimard> I'd love at least a first round of reviews on the 'sqlite over http' ara middleware series to 1) always have ara reports regardless of failure/success 2) reduce even further the impact of storage/inode on the log server
19:36:17 <dmsimard> The reviews are tagged here: https://review.openstack.org/#/q/topic:ara-sqlite-middleware
19:36:48 <dmsimard> And you can see a practical implementation here --
19:37:00 <clarkb> dmsimard: does that also depend on getting the ara install updated independent of the zuul install on the zuul executors?
19:37:04 <dmsimard> Without the middleware: https://logs.rdoproject.org/33/10433/1/check/rdo-registry-integration/Ze74352b77e17444cace463fc9c994213/ara-database/
19:37:06 <dmsimard> With the middleware: http://logs-dev.rdoproject.org/33/10433/1/check/rdo-registry-integration/Ze74352b77e17444cace463fc9c994213/ara-database/
19:37:33 <pabelanger> SSL cert is bad ^
19:37:42 <dmsimard> pabelanger: yeah, logs-dev :(
19:37:45 <jeblair> (i replied to the ml thread with a summary of our discussion on the third-party ci issue)
19:37:55 <dmsimard> pabelanger: I spun it up without getting proper certs (yet)
19:38:26 <dmsimard> clarkb: it doesn't depend on the version of ara on the executors, no
19:38:48 <clarkb> dmsimard: cool, so we can work this independently. I'll make a note to review it
19:39:04 <dmsimard> clarkb: it doesn't even depend on the version of ara on the logserver (where it would sit like htmlify/os-loganalyze), it's just a wsgi script that happens to be bundled in ara at the latest version but otherwise can be carried in tree
19:39:25 <pabelanger> Didn't we have a set of patches to install it onto our own dev server?
19:39:32 <pabelanger> logs-dev.o.o for example
19:39:38 <dmsimard> that's the topic I linked earlier, yes: https://review.openstack.org/#/q/topic:ara-sqlite-middleware
19:39:46 <pabelanger> okay cool
19:39:49 <clarkb> #link https://review.openstack.org/#/q/topic:ara-sqlite-middleware changes to run ara out of sqlite db using middleware. Will cut down on inode use on the logs server hopefully allowing us to add ara to successful jobs again
19:40:00 <pabelanger> will look over here today
19:40:07 <dmsimard> I have a todo to resolve a conflict between htmlify and ara rewrite rules but it's otherwise at least ready for reviewing
19:40:39 <dmsimard> I -W'd one of the patches but it's still worth reviewing :)
19:41:23 <dmsimard> I'll probably go ahead and rebase the stack since it's been a while
19:41:26 <dmsimard> that was it for my topic :)
19:41:45 <clarkb> #link http://lists.openstack.org/pipermail/openstack-infra/2017-November/005695.html ml thread on merging feature branches back into master on nodepool and zuul repos and shifting dev work to those branches
19:41:56 <clarkb> If you haven't seen it yet and are interested in zuul ^ is probably worth a read
19:42:01 <clarkb> jeblair: anything you want to add to ^ here?
19:42:34 <jeblair> ya
19:43:01 <jeblair> i'd love for someone from the third-party-ci community to jump in on the puppet-openstackci work
19:43:36 <jeblair> that is something that should be straightforward to accomplish and doesn't require any zuulv3 knowledge -- the opposite in fact -- it's work to keep zuulv2 working with puppet-openstackci
19:43:36 <dmsimard> there's an irc channel where they hang out, worth a try to get their attention
19:43:58 <jeblair> true, though there's a problem if they aren't paying attention here.
19:43:59 <clarkb> mmedvede may also know individuals that might be interested?
19:44:00 <dmsimard> they might not be subscribed to the MLs
19:44:32 <AJaeger> we also have the project-config-example repo - what should we do with that one? It uses Zuul v2/Jenkins right now
19:45:17 <jeblair> again, if they aren't subscribed to openstack-infra it's a problem.  i will send an announcement to third-party-announce to draw attention to my post, however.
19:45:24 <mmedvede> clarkb: it has been relatively quiet, lennyb fyi ^^
19:45:41 <clarkb> jeblair: thanks
19:47:45 <fungi> AJaeger: good question... i wonder whether it needs branching or can have v2 and v3 content side-by-side
19:47:49 <clarkb> ok any other zuulv3 items before we move on to open discussion?
19:48:01 <fungi> that's in my opinion part of the puppet-openstackci work to determine
19:48:17 <AJaeger> fungi: and somebody would need to update it. Question is whether anybody is using it at all...
19:48:36 <jeblair> AJaeger, fungi: can likely support side-by-side as we did during our transition.
19:48:48 <fungi> convenient
19:48:52 <jeblair> though should probably just switch to v3 soon.
19:48:52 <mmedvede> clarkb: I'll take a look at puppet-openstackci for zuulv3 branch merge workarounds
19:49:02 <clarkb> mmedvede: thanks
19:49:10 <clarkb> jeblair: ^ sounds like you may have a volunteer?
19:49:14 <jeblair> \o/
19:49:37 <fungi> mmedvede: feel free to delegate/distribute the load to any other interested 3rd-party ci ops who show interest too
19:49:58 <fungi> though hopefully the work involved is relatively minimal
19:50:06 <mmedvede> this is my hope :)
19:50:12 <jeblair> ++
19:50:20 <fungi> but getting some of them to help test it out may make sense
19:50:34 <clarkb> ya I think having third party ci involved just for ^ is worthwhile
19:50:42 <clarkb> even if they aren't able to actively review the changes or write them
19:50:55 <pabelanger> most of the work is going to be moving our zuulv3 stuff from system-config back into puppet-openstackci
19:51:38 <mordred> pabelanger: ++
19:53:16 <clarkb> #topic open discussion
19:54:03 <clarkb> As a general heads up, with the firefighting largely behind us I'd like to start organizing the infra TODO list. Basically something that shows new and old infra folks what work is happening and where they can help out if they have spare cycles
19:54:20 <clarkb> You'll probably see me ask for eyeballs on a storyboard board in the near future
19:54:52 * mordred is back to not being able to login to storyboard, fwiw
19:54:57 <clarkb> fun
19:54:59 <pabelanger> I still need to send out ML post about xenial upgrades, I'll try to get that out later today
19:55:00 <AJaeger> the Zuul v3 migration issue etherpad still has some items, could we all review it over the next few days, please?
19:55:08 <clarkb> AJaeger: ++
19:56:37 <mordred> ++
19:58:08 <dmsimard> on an openfloor note, unbound reviews are up to try and see if this helps with our ongoing DNS resolution failures: https://review.openstack.org/#/q/topic:unbound-ttl
19:58:30 <dmsimard> should be good to go in, they're set to not change anything and effectively be a no-op so that we can try it selectively in some jobs.
19:58:48 <clarkb> dmsimard: is that something we might want to try in a limited fashion on the jobs affected by the problem?
19:59:25 <dmsimard> clarkb: that's exactly the purpose, yes, we're actually not changing the defaults from unbound, but jobs can specify vars for cache-min-ttl and it'll be configured accordingly
19:59:35 <clarkb> gotcha, sounds good
20:00:09 <clarkb> alright that is all the time we have, find us in #openstack-infra or on the infra mailing list. Thanks everyone
20:00:12 <clarkb> #endmeeting