19:01:18 <clarkb> #startmeeting infra
19:01:19 <openstack> Meeting started Tue Mar 20 19:01:18 2018 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:20 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:22 <openstack> The meeting name has been set to 'infra'
19:01:25 <frickler> o/
19:01:46 * jesusaur lurks
19:01:50 <ianw> o/
19:02:02 <clarkb> #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:02:07 * fungi has the topic correction queued up for after the meeting, just remind me
19:02:13 <clarkb> #topic Announcements
19:02:29 <clarkb> infra-root should go and sign the rocky release gpg key
19:02:33 <mordred> o/
19:02:38 <clarkb> the directions for doing this are on the system-config docs
19:02:43 <dmsimard> What is the link to see the key on the server again ?
19:02:47 * clarkb makes a note to do it himself
19:02:50 <dmsimard> fungi gave that to me the other day but didn't bookmark it
19:02:55 <mordred> I seem to remember signing it already
19:03:02 <fungi> #link https://docs.openstack.org/infra/system-config/signing.html#attestation Attestation instructions
19:03:39 <fungi> #link https://sks-keyservers.net/pks/lookup?op=vindex&search=0xc31292066be772022438222c184fd3e1edf21a78&fingerprint=on Rocky Cycle key
19:04:03 <fungi> also find them at the bottom of the releases site when in production
19:04:10 <dmsimard> I ended up finding http://sks.spodhuis.org/pks/lookup?op=vindex&longkeyid=on&search=0x184FD3E1EDF21A78 but I guess that's the same thing
19:04:16 <fungi> #link https://releases.openstack.org/#cryptographic-signatures
19:04:50 <mordred> but nope - I have not
19:04:53 <fungi> for the record, it went into production late yesterday, but you can still attest to it any time you like
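(For anyone doing the attestation for the first time, the rough shape of it is below. This is only a sketch assuming the standard gpg workflow; the authoritative steps are in the signing.html doc linked above, and the keyserver shown here is simply the pool the lookup links in this discussion point at.)

    # fetch the Rocky cycle key and check its fingerprint against releases.openstack.org
    gpg --keyserver hkp://pool.sks-keyservers.net --recv-keys 0xc31292066be772022438222c184fd3e1edf21a78
    gpg --fingerprint 0xc31292066be772022438222c184fd3e1edf21a78
    # sign (attest to) it with your own key, then publish the signature
    gpg --sign-key 0xc31292066be772022438222c184fd3e1edf21a78
    gpg --keyserver hkp://pool.sks-keyservers.net --send-keys 0xc31292066be772022438222c184fd3e1edf21a78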
19:05:19 <clarkb> I'm going to be traveling at the tail end of this week and early next week. Which means it would probably be best to have someone else chair next week's meeting
19:05:31 <clarkb> If you'd like to do that let me know in the infra channel later or via email?
19:05:39 <fungi> i can volunteer, but happy for someone else to take a turn
19:06:05 * fungi got his fill
19:06:11 <clarkb> I'll likely go quiet thursday or friday while I pack and prep for conferency things
19:07:28 <clarkb> And finally, since this group includes people that straddle the ops and dev roles, you may be interested in a proposal on the ops mailing list to merge the ops midcycle with the ptg
19:08:02 <fungi> oh, last minute but you can announce that zuul &co are no longer infra deliverables as of today
19:08:41 <clarkb> ya was going to bring that up later too. We've had a long standing zuulv3 priority effort meeting agenda item that maybe we need to reevaluate given this split? We can talk about it in a short while
19:08:47 <fungi> #link http://lists.openstack.org/pipermail/openstack-operators/2018-March/014994.html Ops Meetup, Co-Location options, and User Feedback
19:09:12 <fungi> (start of the colocation thread)
19:09:33 <clarkb> #topic Specs approval
19:09:51 <clarkb> #link https://review.openstack.org/#/c/550550/ Improve IRC discoverability
19:09:56 <clarkb> that change could still use some reviews
19:10:19 <clarkb> I also wonder if the slack kills irc gateway plan affects our other irc related spec
19:10:36 <clarkb> It doesn't really affect us directly as we don't use slack but there have been a non zero number of people wanting to use our bots with slack
19:11:00 <corvus> clarkb: "the slack kills irc gateway plan" ?
19:11:11 <mordred> clarkb: it makes me less inclined to be accommodating, tbh
19:11:15 <clarkb> corvus: slack recently announced that in mid May their irc gateway would be shut down
19:11:48 <clarkb> corvus: basically forcing all bots to use their applications interface. Which is quite limited on the free tier
19:11:59 <clarkb> (I'm sure that that isn't related at all to killing the gateway)
19:12:03 <corvus> clarkb: you mean http://specs.openstack.org/openstack-infra/infra-specs/specs/irc.html ?
19:12:11 <corvus> clarkb: that doesn't actually mention slack
19:12:36 <corvus> in fact, i'd say a key characteristic of that spec is that we've de-prioritized non-irc support for our bots
19:12:42 <clarkb> corvus: it doesn't but one of the other alternatives we had considered in the past was using that one lib that talks all the protocols.
19:12:47 <clarkb> corvus: yes
19:13:08 <corvus> yeah, so i guess that's now *even less* important
19:13:09 <fungi> #link https://review.openstack.org/319506 [abandoned] Add spec for hosted IRC client
19:13:15 <fungi> is also potentially related
19:13:39 <clarkb> fwiw I'm happy to continue using irc. I think it would be difficult for us to support slack natively
19:13:52 <clarkb> but I've also suggested in the past that "hey all the bots work with slack right now because irc"
19:14:18 <fungi> i'm increasingly in favor of helping people find ways to make irc fit their needs rather than support additional protocols which merely fracture the community further
19:14:30 <clarkb> ++
19:14:34 <corvus> yep, sign me up for that
19:14:56 <dmsimard> I know for a fact that several projects are using a slack <-> irc bridge (with something like https://github.com/ekmartin/slack-irc )
19:15:05 <dmsimard> Kolla does it, ARA does as well -- not sure if there's others
19:15:19 <clarkb> ok sounds like we don't need to take a major shift here as we are happy pushing towards irc
19:15:35 <dmsimard> I mean, as much as I dislike Slack.. there's people who dislike IRC just as much and that shouldn't prevent them from contributing and discussing things
19:15:41 <corvus> sure it should
19:15:46 <corvus> it's not free software
19:15:53 <corvus> and we don't require folks to use free software to work with our systems
19:16:04 <dmsimard> I never said it was a requirement to use SLack
19:16:07 <corvus> see https://governance.openstack.org/tc/reference/irc.html
19:16:26 <dmsimard> I said there's people who prefer Slack to IRC and if we can let them use Slack to talk to us, why not ?
19:16:39 <clarkb> that I think is an entirely separate conversation
19:16:46 <dmsimard> Yeah it is, sorry for sidetracking
19:16:49 <clarkb> (I was mostly scoping this to the specs we have related to changes to our bots)
19:17:01 <clarkb> (and sounds like there is agreement that we don't need to update any specs related to that)
19:17:22 <clarkb> we can talk about the other thing post meeting if we really want to
19:17:43 <clarkb> #topic Priority Efforts
19:17:43 <fungi> translating: "if people prefer proprietary software we should go out of our way to support it" (i can't personally agree with that)
19:17:59 <clarkb> #topic Zuul v3
19:18:13 <corvus> i had an obvious typo in a previous remark, but i'll leave it as an exercise for the reader so we can move on
19:18:38 <clarkb> corvus: I parsed it :)
19:18:53 <corvus> so about dropping this item --
19:19:04 <clarkb> as mentioned earlier Zuul and friends are now independent projects no longer tied to OpenStack (and consequently technically infra)
19:19:12 <corvus> i think we're really close to the v3 release at this point
19:19:22 <fungi> to your earlier foreshadowing, i think we should still have a zuul v3 priority effort until v3 is released (possibly longer). there's plenty of work to be done on the openstack infra implementation side still
19:19:23 <clarkb> This has been a long standing agenda item and ya I was going to say maybe we just keep it going through the release?
19:19:59 <corvus> and i think we've covered everything in the spec (with the exception of some CD stuff -- but arguably, you can accomplish that with secrets now anyway)
19:20:15 <fungi> there's still a fair amount of job conversion long tail which _could_ justify the priority effort extending past 3.0.0 getting tagged
19:20:18 <corvus> the last several meetings this has mostly been an operational topic anyway; less about development.
19:20:25 <clarkb> then any operational concerns would be normal agenda items going forward. I think the consequence of this is that we should probably reevaluate our priority efforts and see if anything needs to be prioritized.
19:21:10 <corvus> keeping this on the list until release and then just adding operational items as they come up seems reasonable.  but also, if we wanted to go ahead and drop it, that works too.
19:21:25 <clarkb> This is neat because it means we are almost done with a long standing item and zuul gets to be free and hopefully used by far more people.
19:22:04 <AJaeger> I think it will come up - somehow. Either with new features or new ways of doing jobs.
19:22:28 <fungi> yeah, it will be nice to cross it off the in-progress list. i suppose v3 release is a good enough milestone we can call it no longer transitional
19:22:37 <AJaeger> Just as a heads-up: tox-siblings is a current topic that mordred is digging into; this caused a bit more havoc than expected ;(
19:22:43 <clarkb> AJaeger: ya I think it will remain on topic to talk about how zuul affects infra and openstack. I think bringing this up is more about fungi's point
19:22:57 <clarkb> so rather than standing priority agenda item we'll move to as needed agenda items
19:23:03 <AJaeger> clarkb: works for me
19:23:05 <mordred> AJaeger, clarkb: I think we've got tox-siblings fixes almost ready
19:23:28 <AJaeger> mordred: Hope so - just wanted to share with team so that they are aware of it
19:23:37 <mordred> incidentally, one of the tox-siblings bugs was also present in the tox_install.sh scripts - but nobody had noticed it wasn't doing the right thing
19:23:50 <clarkb> which is good I think that means infra has accomplished the majority of why this was a priority effort in the first place. Which is a major milestone
19:24:06 <mordred> clarkb: ++
19:24:08 <clarkb> and it is also a transitional step for zuul itself
19:24:53 <clarkb> we'll all have to take a drink of our favorite beverage once that tag is cut
19:24:55 * fungi toasts another major success
19:25:12 <fungi> oh, i'm early i guess ;)
19:25:40 <clarkb> as for zuulv3 topics proper I think the biggest one is pointing out that if you are running a zuulv3 you want to update all your executors to run latest master
19:25:50 <clarkb> there were several security bug fixes last week. Infra has updated its executors
19:26:01 <clarkb> anything else we want to go over re Zuulv3?
19:26:02 <corvus> and there will be at least one more security fix, hopefully this week
19:27:19 <clarkb> Look out for that on the zuul mailing list
19:27:35 <fungi> at least one more security fix before 3.0.0 presumably
19:27:40 <clarkb> ya
19:27:52 <fungi> i expect plenty more security fixes in the coming months/years because it's software
19:28:15 <clarkb> #topic Project Renames
19:28:17 * fungi should clearly not seek a side job in public relations
19:28:21 <clarkb> fungi: ha
19:28:51 <clarkb> fungi has written up a new project rename process. I suppose we should all review that (do we want to wait to merge it until after we perform the renames?) and mordred has provided specifics for his project rename
19:29:00 <clarkb> I think this means we are basically ready to give it a go
19:29:13 <mordred> \o/
19:29:31 <fungi> it's not so much written up as written out. i just deleted things ;)
19:29:46 <clarkb> I'm probably not going to be around to help thursday-wednesday if we want to go and do it real quickly but that shouldn't stop us I don't think
19:29:48 <corvus> oh what's the link?
19:29:56 <fungi> #link https://review.openstack.org/554261 Update Gerrit project renaming for Zuul v3
19:30:01 <fungi> was just fishing it out
19:30:11 <corvus> will review
19:30:16 <fungi> thanks!
19:30:43 <clarkb> looking at the release schedule next week should be safe too
19:30:45 <fungi> it's primarily an attempt at encoding what we discussed in the meeting last week and subsequently over the course of the week in #openstack-infra, so hopefully no surprises
19:30:52 <clarkb> and probably the week after that
19:31:05 <clarkb> though I'm somewhat hopeful we'll do it sooner than that just to get past it
19:31:23 <corvus> i can assist fri, but would prefer not to drive
19:31:33 <mordred> oh jeez. I have a lot of patches to prepare
19:31:59 <fungi> i've been asked to also help with some storyboard imports friday, so am willing to help but would also rather not be in the hot seat
19:32:00 <corvus> mordred: should we wait a bit then?
19:32:17 <clarkb> also I think mordred is traveling late this week?
19:32:35 <mordred> corvus: no - it's fine - I just need to make rename patches for all the repos listed in fungi's patch
19:32:36 <fungi> next week is starting to sound better
19:32:50 <clarkb> ya maybe March 29 or 30
19:33:01 <fungi> mordred: hopefully most of those don't need patches, it's more a list of places to check whether they need patches
19:33:04 <corvus> dmsimard: have you done a rename yet?
19:33:11 <dmsimard> I have not
19:33:15 * mordred is not travelling friday - but is travelling monday - 29 and 30 are both great
19:33:19 <frickler> that would be the easter weekend, not sure who has holidays there
19:33:25 <clarkb> mordred: oh we'll miss you on the weekend then?
19:33:30 <fungi> dmsimard: [sales pitch] it uses ansible!
19:33:31 * clarkb wonders if he needs to brush up on his zuul
19:33:34 <mordred> clarkb: no - I'll be in LA on the weekend
19:33:45 <mordred> clarkb: I'm just flying to LA first thing saturday morning
19:33:45 <clarkb> mordred: ah just not friday gotcha
19:33:59 <clarkb> frickler: arg
19:34:05 * mordred has symphony tickets for friday for the brandenburg concertos
19:34:25 <clarkb> mordred: if we wanted to do it friday would you be able to drive it?
19:34:31 * corvus now has bach stuck in his head
19:34:37 <clarkb> (to avoid easter weekend conflict)
19:34:50 <mordred> clarkb: yes, I can drive it on friday if need be
19:34:58 <fungi> i have the utmost respect for people of religion, but must admit i have a hard time remembering when religious holidays fall
19:35:07 <clarkb> fungi: easter is a hard one because it moves around
19:35:19 <mordred> also - the followup for openstacksdk for the rename is doing the storyboard transition - so if fungi is already in storyboard transition brainspace, that could be handy
19:35:29 <mordred> easter is easy ...
19:35:40 <AJaeger> clarkb: there's school vacation starting a week earlier for my kids, helps remembering ;)
19:35:40 <fungi> mordred: sure, happy to
19:35:44 <mordred> it's the first sunday after the first full moon after the vernal equinox
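(That rule is the common astronomical shorthand; the church calendar actually uses a tabular computus, which software usually implements via the anonymous Gregorian algorithm. A minimal Python sketch, purely for illustration:)

    def gregorian_easter(year):
        # anonymous Gregorian ("Meeus/Jones/Butcher") computus
        a = year % 19
        b, c = divmod(year, 100)
        d, e = divmod(b, 4)
        f = (b + 8) // 25
        g = (b - f + 1) // 3
        h = (19 * a + b - d - g + 15) % 30
        i, k = divmod(c, 4)
        l = (32 + 2 * e + 2 * i - h - k) % 7
        m = (a + 11 * h + 22 * l) // 451
        month, day = divmod(h + l - 7 * m + 114, 31)
        return month, day + 1

    print(gregorian_easter(2018))  # (4, 1): Easter Sunday falls on April 1, hence the March 30 conflict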
19:35:49 <fungi> mordred: though also, renames in sb are super easy
19:36:05 <mordred> fungi: nod- I figured I'd just do it after the rename to keep churn low
19:36:17 <clarkb> if corvus and fungi are able to help out and mordred can drive on friday I think that gives us enough humans to give it a go
19:36:19 <fungi> it's, like, one sql update query
19:36:25 <clarkb> and then any other infra-roots wanting to be involved can help
19:36:32 <clarkb> (I'll likely be floating around too just not persistently)
19:36:48 <mordred> clarkb: are we past needing 3 hour reindexes now?
19:36:56 <clarkb> mordred: yes it should happen all online
19:36:57 <fungi> if people review my updated process draft, i'll happily put together a maintenance plan based on it
19:36:58 <mordred> \o/
19:37:21 <fungi> mordred: the reindex may take 12 hours, but doesn't require gerrit downtime (though does tend to block git replication)
19:37:33 <clarkb> its a far less expensive process
19:38:01 <pabelanger> I'm able to help too
19:38:10 <clarkb> friday is the weekend for ianw so we should probably let ianw have his weekend. frickler dmsimard any interest? If so maybe we aim for an early start?
19:38:32 <fungi> dmsimard is in the same tx i am, i think
19:38:36 <fungi> er, same tz
19:38:48 <ianw> i can be around if earlyish, but not sure i have any special insight for this one
19:38:56 <dmsimard> I'm on eastern (15:39 currently) and I'm leaving in one minute to pick up the kids from school :p
19:39:12 <fungi> yeah, that's my tz too
19:39:19 <dmsimard> I could watch and participate in the rename ? I'm not sure I could drive it on my own
19:39:23 <clarkb> 1500UTC is 8am PDT and early enough that frickler can join in if interested
19:39:26 <dmsimard> brb
19:40:00 <pabelanger> wfm
19:40:01 <frickler> clarkb: well on a friday that's pretty late for me already, so I'd rather skip that
19:40:05 <clarkb> proposal: 1500UTC Friday if that works for mordred (as driver) and others that volunteered to help
19:40:07 <clarkb> frickler: ah ok
19:40:10 <mordred> with only two repos to rename - it SHOULD be easy - even with github clicking
19:40:13 <fungi> if the revised plan is good, it doesn't actually require zuul downtime, just a very brief (automated) shutdown and startup for gerrit
19:40:17 <clarkb> frickler: I'm thinking it would be good to do it when corvus is awake since it is our first with zuulv3
19:40:23 <fungi> so hopefully people won't be impacted too much
19:40:29 <clarkb> fungi: which means probably no earlier than 1500UTC
19:40:47 <frickler> yeah, just plan it according to your needs
19:40:59 <clarkb> yay tab complete failures
19:41:02 <mordred> clarkb: 1500UTC wfm
19:41:03 <corvus> 1500utc is fine for me
19:41:07 <fungi> i'm good with whatever time you like. i have no life
19:41:18 <clarkb> ok lets say 1500UTC Friday then
19:41:27 <clarkb> and I'll go ping release team after this meeting
19:41:44 <clarkb> thanks everyone it will be nice to put this behind us
19:41:47 <fungi> sounds good, thanks!
19:41:56 <clarkb> #topic General Topics
19:42:09 <clarkb> ianw: it would probably be good to get a recap on all of the afs changes that have happened
19:42:36 <ianw> ok
19:42:52 <fungi> oh, yeah, i'm sure i missed at least some of the details
19:43:08 <ianw> first thing was i updated mirror-update to use backported bionic era 1.8~pre5 packages
19:43:25 <ianw> that is in https://launchpad.net/~openstack-ci-core/+archive/ubuntu/openafs-1.8-xenial
19:43:58 <ianw> that actually seemed to get things very stable.  all reprepro runs went for about 3 days without failure
19:44:27 <ianw> we found one weird warning, which after discussion seems harmless, and i've proposed https://gerrit.openafs.org/#/c/12964/
19:45:13 <ianw> after that, i updated all our fileservers to be running with the settings as suggested by auristor and documented in https://review.openstack.org/540198
19:45:26 <pabelanger> nice
19:45:38 <clarkb> oh good that did merge
19:45:45 <fungi> auristor also mentioned that we should be possibly concerned that our afs servers aren't running as new a version as the client?
19:46:05 <fungi> or maybe i misread
19:47:00 <ianw> yeah, as i understand it, there's just a mismatch between resources in terms of threads and callbacks, with 1.8 having more, that may result in lower performance
19:47:24 <clarkb> which is why we increased resources per that docs change right?
19:47:28 <fungi> oh, right, that the 1.8 client can overrun the number of requests the server is able to accept
19:47:46 <fungi> okay, so the config change does address that concern?
19:48:05 <ianw> clarkb: i think we were just really underspecced to start with, using the defaults.  we may want to tweak things as we move on, now we have the idea that we should
19:48:14 <clarkb> fungi: that was my understanding. Basically increased the number of server side resources to match the client side increases
19:48:20 <fungi> excellent
19:48:24 <clarkb> ianw: gotcha
19:48:28 <ianw> i've been fiddling with getting some of the stats out of various tools and possibly sticking them into graphite, so we can at least see where we might have issues
19:48:42 <fungi> that's super helpful, thanks!
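(For context, "sticking them into graphite" generally just means emitting lines to graphite's plaintext protocol on TCP port 2003. A minimal sketch only; the host, metric name, and value below are placeholders, and a real collector would first parse something like rxdebug/vos output:)

    import socket
    import time

    def send_to_graphite(metric, value, host="graphite.example.org", port=2003):
        # graphite plaintext protocol: "<metric> <value> <unix timestamp>\n"
        line = "%s %s %d\n" % (metric, value, int(time.time()))
        sock = socket.create_connection((host, port))
        try:
            sock.sendall(line.encode("ascii"))
        finally:
            sock.close()

    send_to_graphite("afs.fileserver.calls_waiting", 0)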
19:49:03 <ianw> and the last thing is the afs docs jobs, which AJaeger pointed out can be quite unstable, seeming to hang
19:49:23 <clarkb> ianw: those jobs will use the older afs packages on the base distro too right?
19:49:40 <ianw> we had a bit of back and forth over what this might be, simultaneous vos releases which corvus pointed out wasn't an issue, etc
19:50:19 <ianw> it's writing from the executor i think, and it seems the rsync hangs?
19:50:37 <ianw> it's wrapped under a significant amount of layers of ansible, etc
19:50:53 <clarkb> oh right and we updated the afs and kernel on the executors
19:51:02 <clarkb> maybe we need to roll them all the way up to your new 1.8 packages?
19:51:25 <ianw> anyway ... i will keep an eye on this.  maybe new settings will help.  otherwise i think we need to keep digging to understand the timeout a bit better
19:51:38 <clarkb> sounds good
19:52:16 <clarkb> Another topic is cloud changes. dmsimard is in the process of bringing up limestone networks' cloud for use as test resources and we are testing out new flavors on vexxhost which require boot from volume
19:52:33 <clarkb> pabelanger: ^ vexxhost seems stable now ya? we just had to switch to using raw images?
19:52:40 <pabelanger> it is!
19:52:52 <pabelanger> we have 10 nodes, and boot-from-volume with 80GB volumes
19:52:54 <clarkb> pabelanger: any indication if the jobs are happier with the new flavors yet?
19:53:07 <pabelanger> clarkb: they seem to be, I haven't seen any timeouts yet
19:53:12 <clarkb> nice
19:53:17 <pabelanger> I also think the new mirror is helping with that too
19:53:27 <fungi> i haven't heard anything else from packethost since we briefed them on our consumption model, but presumably we have them on the way soonish as well
19:53:29 <pabelanger> since we no longer have BW constraints
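(For reference, the boot-from-volume behaviour mentioned above is driven from the nodepool provider configuration. A rough sketch of what such an entry looks like; the provider, flavor, and label names here are illustrative only, and the real values live in project-config:)

    # hypothetical nodepool.yaml provider fragment
    - name: vexxhost
      cloud: vexxhost
      diskimages:
        - name: ubuntu-xenial
      pools:
        - name: main
          max-servers: 10
          labels:
            - name: ubuntu-xenial
              diskimage: ubuntu-xenial
              flavor-name: v2-highcpu-8
              boot-from-volume: true
              volume-size: 80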
19:53:49 <clarkb> and I think the change on how ipv6 works in limestone will help us bootstrap more quickly there
19:54:07 <clarkb> dmsimard: I'd be curious to know if you have further mirror booting issues with ipv6 there
19:54:18 <pabelanger> yah, I plan to confirm with mnaser how many nodes we can get in vexxhost again
19:54:26 <dmsimard> clarkb: (back from school), yes, I'll give it another try
19:54:32 <clarkb> ianw: any linaro cloud updates?
19:54:47 <clarkb> ianw: I guess at this point it's adding a limited set of jobs to use the arm nodes?
19:54:58 <ianw> i released dib 2.12 with all the required changes, and released nb03 from the emergency file
19:55:11 <ianw> not sure if builds have happened yet, but will keep an eye there too
19:55:42 <ianw> other than that, no, i think use as people want.  i have some things i want to do with devstack support, like etcd versions etc
19:55:52 <ianw> s/no/yes/
19:56:22 <pabelanger> ianw: might want to double check I used the right username / password for nodepool clouds.yaml. I think I did it correctly
19:57:34 <clarkb> really quickly before our hour is up
19:57:39 <clarkb> #topic Open Discussion
19:57:46 <ianw> pabelanger: yeah, i think i did yesterday, lgtm
19:58:01 <frickler> I have a patch up to restore the nodepool runtime graphs https://review.openstack.org/553718
19:58:02 <clarkb> anything else?
19:58:09 <pabelanger> I think https://review.openstack.org/554624 will allow us to 2 stage a gerrit install on review-dev01.o.o
19:58:11 <frickler> but failing to find per-provider metrics
19:58:15 <frickler> help appreciated
19:58:16 <clarkb> #link https://review.openstack.org/553718 restore nodepool runtime graphs
19:58:18 <pabelanger> hopefully will pass tests
19:59:26 <clarkb> #link https://review.openstack.org/554624 prereq for booting review-dev01
20:00:14 <clarkb> corvus may know about finding per provider metrics. frickler I often find reading the source is the easiest way to sort that out:/
20:00:20 <clarkb> and with that we are at time. Thank you everyone
20:00:22 <clarkb> #endmeeting