19:03:45 <fungi> #startmeeting infra
19:03:46 <openstack> Meeting started Tue Jun  7 19:03:45 2016 UTC and is due to finish in 60 minutes.  The chair is fungi. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:03:47 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:03:50 <openstack> The meeting name has been set to 'infra'
19:03:51 <anteaya> the letter 3 again
19:03:54 <fungi> #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:03:58 <mtreinish> fungi: ?
19:04:01 <anteaya> when does the number 3 get a turn
19:04:03 <fungi> anteaya: i'm unoriginal
19:04:04 <mtreinish> oh, I did put something on the agenda last week
19:04:06 <mtreinish> I forgot
19:04:19 <fungi> mtreinish: excellent, someone else with an even worse memory than mine! ;)
19:04:27 <fungi> #topic Announcements
19:04:35 <fungi> #info Tentative late-cycle joint Infra/QA get-together to be held September 19-21 (CW38) at SAP offices in Walldorf, DE
19:04:40 <fungi> check with mkoderer and oomichi for details
19:04:45 <fungi> #link https://wiki.openstack.org/wiki/Sprints/QAInfraNewtonSprint
19:05:13 <fungi> also we did a couple project renames last week. i screwed one of them up by turning puppet back on a few minutes early
19:05:16 <mordred> o/
19:05:42 <fungi> there is now a dead "openstack-infra/ansible-puppet" repo with just a readme in it, in read-only mode. we can safely purge that from disk before the next gerrit reindex
19:05:59 <pabelanger> o/
19:06:01 * AJaeger apologizes for being late
19:06:03 * fungi hangs head in shame
19:06:06 <fungi> moving on
19:06:14 <fungi> #topic Actions from last meeting
19:06:21 <fungi> #link http://eavesdrop.openstack.org/meetings/infra/2016/infra.2016-05-31-19.03.html
19:06:29 <fungi> pleia2 set up filtering to direct provider account contact address(es) into a high priority inbox
19:06:32 <fungi> i saw that happened!
19:06:35 <pleia2> so, filtering!
19:06:39 <fungi> filtering!!!
19:06:48 <pleia2> I went ahead and filed a ton of old mail too
19:06:50 * fungi is using up his exclamation mark quota for the month
19:06:58 <zaro> o/
19:07:05 <notmorgan> fungi: uhm. they have a quota for that? :(
19:07:07 <pleia2> so it should be in the proper place, not everything is in filtering rules though since some companies send from multiple addresses and domains
19:07:14 * bkero lets fungi borrow some of his exclamation mark quota. He's clearly making better use of them.
19:07:32 <pleia2> I figure I'll log in and tweak over time, the work I did last week was a massive first pass over 25k messages
19:08:07 <pleia2> do we have imap or pop access to this?
19:08:15 <anteaya> pleia2: thank you for this
19:08:17 <fungi> supposedly imap works
19:08:29 <pleia2> I couldn't find docs for it through their help menus, so I wasn't sure
19:09:13 <fungi> i think it might be the same thing that i use to read my jeremy@openstack.org address, so sync up with me after the meeting and we'll see if similar parameters work
19:09:22 <pleia2> but I think the next step is figuring out how we stay on top of this email, logging in all the time is not something I can really do, but I could drop it in my imap client where I monitor other addresses
19:09:45 <pleia2> fungi: cool, thanks
19:09:59 <fungi> yeah, same, if it works over imap and there's a high-priority inbox then i'll just check it from mutt as continuously as my other addresses
19:10:12 * pleia2 nods
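A minimal sketch of polling such an inbox over IMAP with Python's imaplib, assuming plain IMAP4-over-SSL access works as fungi suggests; the host, account, and folder name below are placeholders, not the real values:

```python
import imaplib

def count_unread(host, user, password, folder="INBOX"):
    """Count unseen messages in one folder of an IMAP mailbox."""
    conn = imaplib.IMAP4_SSL(host)
    try:
        conn.login(user, password)
        conn.select(folder, readonly=True)  # don't mark anything as read
        status, data = conn.search(None, "UNSEEN")
        if status != "OK":
            return 0
        return len(data[0].split())  # data[0] is b"1 2 3 ..."
    finally:
        conn.logout()

# hypothetical usage:
# count_unread("mail.example.org", "infra", "s3cret", "high-priority")
```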
19:10:18 <fungi> fungi start an ml thread revisiting current priority efforts list
19:10:20 <fungi> #link http://lists.openstack.org/pipermail/openstack-infra/2016-June/004374.html
19:10:25 <fungi> follow up to that and i'll provide a specs update once we come to some stasis there
19:10:36 <fungi> hopefully should have something up for approval in next week's meeting
19:10:36 <Zalman{-}> evening, people this is british server?
19:10:38 <pleia2> anyone interested can see how I filtered and we can adjust accordingly
19:10:57 <fungi> #topic Specs approval
19:11:05 <fungi> #info Approved spec "A Task Tracker for OpenStack"
19:11:10 <fungi> #link http://specs.openstack.org/openstack-infra/infra-specs/specs/task-tracker.html
19:11:26 <fungi> (and there was much rejoicing)
19:11:28 <pleia2> \o/
19:11:38 <anteaya> #link small edit https://review.openstack.org/#/c/326680/
19:11:43 <anteaya> and yay!
19:12:15 <Zara> \o/
19:12:33 <fungi> anteaya: oh, lgtm. approving
19:12:42 <anteaya> thanks, it works locally
19:13:11 <fungi> should merge shortly
19:13:17 <anteaya> thank you
19:13:34 <fungi> #topic Priority Efforts
19:14:01 <fungi> i don't see any urgent updates there, but we've got some updates filtering in on the aforementioned ml thread
19:14:28 <fungi> also rcarrillocruz mentioned that hpe is moving our infra-cloud hardware again in ~4-6 weeks to a less damp facility
19:14:39 <Zara> :D
19:14:47 <pabelanger> Did we decide if we wanted to try using the hardware today anyway?
19:14:48 <yolanda> :(
19:15:06 <pleia2> I miss our weekly updates
19:15:10 <fungi> i'll leave it up to rcarrillocruz and anyone else currently wrangling the hardware to determine if that's feasible
19:15:45 <fungi> pleia2: yep, as soon as we get the priority list pruned and updated i'd like to start doing them again, or at least collecting incremental updates
19:16:02 <fungi> or you just mean you miss the updates on infra-cloud?
19:16:03 <anteaya> jeblair also made a point that perhaps we could entertain the idea of a different hardware provider
19:16:12 <pabelanger> anteaya: ++
19:16:13 <pleia2> fungi: infra-cloud, the ones cody was doing
19:16:35 <jeblair> yeah, at this point, if someone else offers hardware (or we still have folks interested waiting in the wings), i think that would be a good idea
19:16:40 <fungi> yeah, there is at least one company who has reached out to us about the logistics of providing hardware/colocation
19:16:50 <jeblair> we're 6 months behind on this, mostly due to hp logistical issues
19:17:03 <fungi> i can get people involved in a discussion with them as long as i get some volunteers to steer that
19:17:34 <pabelanger> I'm happy to get more involved where ever needed
19:17:47 * bkero willing to help out any way he can
19:18:12 <fungi> ideally at least one person currently involved in the infra-cloud effort too, so that we can get some continuity on the effort
19:18:34 <yolanda> i can collaborate, and give advice, but cannot compromise on that at the moment
19:18:53 <fungi> thanks yolanda
19:18:54 <yolanda> i need to figure out how much time i could dedicate
19:19:19 <pabelanger> yolanda: I'm free whenever you find the time
19:19:20 <fungi> also no idea if rcarrillocruz is interested in helping with a separate provider, but he said he's missing the meeting today and will catch up afterward
19:19:38 <fungi> so should probably continue this discussion outside the meeting
19:19:48 <yolanda> pabelanger, we can schedule some time and i give you an overview
19:20:06 <pabelanger> yolanda: ack
19:20:22 <fungi> #topic Timeline for first Xenial systems (pleia2)
19:20:48 <pleia2> so, this came up because planet.o.o is a bit stuck with a broken package in trusty; the idea was floated to switch to pulling it from source
19:20:48 <fungi> looks like there are a few drivers for having at least some preliminary use of xenial for our servers
19:21:24 <pleia2> and can't upgrade zanata until it's on xenial, and the i18n team is eager for the new features that the zanata folks put in for us
19:21:55 <pleia2> so this topic is mostly to get a feel for our priority for moving to xenial on some systems
19:22:46 <pleia2> I'd rather not rewrite our planet module if it'll be fixed by an upgrade in a month, but if we aren't planning to go to xenial for 6 months, that's a very different story (and something I need to talk to the i18n folks about)
19:22:51 <fungi> my recollection from the discussions before and at the summit was that we weren't going to push to upgrade all our trusty systems to xenial until next year, but that running xenial where it makes sense to do so is acceptable
19:23:32 <pleia2> yeah, that makes sense, just trying to pin down when we should aim for the first xenial systems
19:23:36 <pabelanger> We still have some servers running precise, I'm trying to focus on upgrading them before moving into xenial.  But happy to review code
19:23:37 <fungi> these all sound like cases where running xenial is at least no worse than trying to make things work on trusty
19:24:07 <fungi> we did also say we'd avoid deploying new xenial systems until we finished precise to trusty upgrades
19:24:11 <pleia2> zanata could probably run on trusty if we grab java from a 3rd party repo, but that does not give me good feels :)
19:24:37 <fungi> we can likely knock out zuul and static pretty easily, but wiki is going to be a pretty involved upgrade due to its lack of configuration management
19:24:55 <pleia2> yeah, the wiki is hard
19:25:21 <pabelanger> wiki is last item on my list
19:25:33 <pabelanger> I've been pushing it to the end on purpose because of the amount of work
19:25:39 * pleia2 nods
19:26:09 <pabelanger> with that said, if we want xenial first, I'm happy to skip wiki for the moment
19:26:20 <fungi> i was talking about it earlier today in #-infra, but we really need some clear-cut plan on how to maintain it. i did a little research on whether we could run it from distro packages, and what the state is for the puppet modules in the forge, but ultimately i want whoever's going to be working on the upgrade to decide what's the most maintainable long term
19:26:51 <pabelanger> Is the wiki long term now?
19:27:12 <fungi> not necessarily long term, but long enough term to get us a graceful exodus
19:27:22 <pleia2> btw, mediawiki is being removed from debian and that's where ubuntu gets it from
19:27:37 <fungi> which i think involves having it maintainable in the long term (regardless of whether we opt to actually maintain it indefinitely)
19:27:49 <fungi> pleia2: not "being removed" but removed more than a year ago
19:27:50 <pabelanger> Right, I'm hoping 12 more months honestly of keeping wiki.o.o online
19:28:18 <fungi> pleia2: i also looked into rpms, and the packaging situation is even less fun there
19:28:32 <pleia2> aha
19:28:49 <fungi> everyone these days seems to be installing from source
19:28:55 <pleia2> yeah
19:28:58 <fungi> anyway, this is getting off onto a tangent
19:29:00 <pabelanger> maybe we need to revisit moving away from mediawiki to something else? Not that I want to talk about it now
19:29:07 <pabelanger> fungi: ++
19:29:27 <pleia2> loosely speaking, do we think we can aim for having our first xenial systems available in a month?
19:29:35 <pleia2> I can start working on fixes we need in puppet
19:29:43 <fungi> to tie that subtopic up, i'd love to have a very short spec describing a plan to get the wiki into configuration management and upgraded so that it can go onto the priority specs list
19:29:54 <fungi> thanks pleia2. any additional help there is appreciated
19:30:01 <pabelanger> I'd like to stand up our xenial wheel mirror first, if possible.  Assuming rackspace has the image uploaded
19:30:04 <pleia2> I'd like to have a new zanata server up for translations this cycle, which means having it ready for testing by august at the latest
19:30:11 <fungi> pleia2: yeah, that sounds reasonable
19:30:27 <pleia2> ok, great
19:30:54 <fungi> anybody object to having a xenial server or two in our corral in ~ a month?
19:31:02 <anteaya> I do not object
19:31:20 <anteaya> and I agree with fungi's call for a spec for the wiki
19:31:56 <fungi> being able to meet the i18n team's needs, being able to get planet sanely off of precise, et cetera trump the risks of running a couple xenial servers "early" for me at least
19:32:34 <fungi> #topic Setup a common message bus using mqtt (mtreinish)
19:32:45 <fungi> that topic you forgot to remember
19:32:48 <pleia2> I can start drafting up the wiki spec (and pabelanger or anyone else is welcome to push changes to it as we go)
19:32:56 <fungi> thanks pleia2!
19:32:59 <pabelanger> pleia2: thanks
19:33:04 <mtreinish> fungi: well I did remember to put it on the wiki
19:33:10 <anteaya> mtreinish: yay!
19:33:14 <mtreinish> that counts for something :)
19:33:33 <fungi> mtreinish and i were discussing his gerrit to mqtt event translation and sdague's one-stream-to-rule-them-all vision
19:34:28 <fungi> the idea here is that mtreinish and i would co-author a spec to add something like a firehose.openstack.org server with an mqtt view of everything in our infrastructure, adding gerrit's event stream as the first feature
19:34:43 <pleia2> cool
19:34:46 * mtreinish really likes that hostname
19:34:53 <fungi> something similar to the fedora community infrastructure's fedmsg, but not zeromq
19:35:03 <jeblair> ++ and ++
19:35:07 <anteaya> sounds good
19:35:26 <pleia2> I don't know anything about mqtt and don't want to derail, but I will mention that the Fedora Infra team uses http://www.fedmsg.com/en/latest/overview/ very similarly
19:35:34 <fungi> this was just a quick sync up in the meeting to see if that's a completely insane idea and we should pack it in and pretend it never came up
19:35:38 <pleia2> there's an instance of it running in debian infra as well
19:37:02 <fungi> yeah, i expect that the mechanics of getting arbitrary system events into mqtt are pretty similar to zeromq, so we can probably borrow a lot from fedora infra and dsa for those
19:37:10 <jeblair> i think it's a swell idea -- one thing i would like to caution though is that i think some of the things that we are using message buses for today we can and probably should use zuul for in zuulv3
19:37:47 <fungi> jeblair: you mean as far as what nodepool is consuming from jenkins?
19:37:51 <notmorgan> jeblair: ++
19:38:14 <fungi> (and also what nodepool is consuming from zuul as of this week)
19:38:37 <fungi> what other message buses are we using?
19:38:40 <jeblair> fungi: well, that specific case is not applicable in v3.  i was thinking more along the lines of some of the post-processing things we do might be better implemented as zuul jobs in v3
19:39:05 <mtreinish> jeblair: oh like the logstash and subunit processing?
19:39:19 <fungi> oh, sure, i would mostly see the firehose as a third-party/personal point of integration
19:40:00 <mtreinish> fungi: we do use zeromq for that path. Jenkins -> zmq -> gearman client -> gearman -> gearman worker
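A rough sketch of the worker end of that zmq -> gearman path, using the python "gearman" library; the server address and task name are illustrative, not the values the actual logstash/subunit workers use:

```python
import gearman

def process_log_event(worker, job):
    # job.data carries the event forwarded from the zmq subscriber side
    print("processing event: %s" % job.data)
    return job.data  # hand the result back to the gearman server

worker = gearman.GearmanWorker(["gearman.example.org:4730"])
worker.register_task("push-log", process_log_event)
worker.work()  # blocks forever, handling jobs as they arrive
```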
19:40:02 <jeblair> yeah.  i don't have that all mapped out yet.  *maybe* it's best as an independent queue, or maybe it fits in to zuulv3.  mostly i mention that as something to keep in mind as we start thinking about how we can use it
19:40:03 <fungi> i'm unconvinced we should drive any internal infrastructure off of it, at least initially, and even then only if it makes sense to do so
19:40:16 <anteaya> ++ swell idea
19:40:39 <mtreinish> fungi: yeah that was my thinking too, set it up for external use initially, but it might make sense eventually to start using it for internals too
19:40:45 <jeblair> fungi: that seems like a good way to start -- get to know it before we start depending on it
19:41:20 <fungi> so the spec definitely wouldn't go into using it for anything other than experimentation
19:41:46 <fungi> but having it might help us drive some additional innovation from outside into the team
19:42:09 <fungi> as people see a consistent place to access some of the currently internalized events
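As a sketch of that third-party point of integration: a paho-mqtt subscriber pointed at the proposed firehose. The hostname is the one floated above, but the port and topic layout are assumptions, since nothing has been specced yet:

```python
import paho.mqtt.client as mqtt

def on_connect(client, userdata, flags, rc):
    # hypothetical topic tree; subscribe to all gerrit events
    client.subscribe("gerrit/#")

def on_message(client, userdata, msg):
    print("%s %s" % (msg.topic, msg.payload))

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message
client.connect("firehose.openstack.org", 1883)
client.loop_forever()  # reconnects and dispatches callbacks
```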
19:42:32 <fungi> anyway, i think that more or less covers it. we still have a couple topics to go today...
19:42:43 <fungi> #topic zuul.o.o / static.o.o migration to ubuntu-trusty: Outage required (pabelanger)
19:42:55 <fungi> so, wanting to work out a maintenance window for these?
19:43:01 <pabelanger> Yup
19:43:09 <pabelanger> I think we need about 1-2 hours to be safe
19:43:32 <pabelanger> I was thinking if we stopped gerrit, we should be safe to stop zuul and detach / attach volumes on static.o.o
19:43:43 <fungi> yeah, i could _probably_ knock static.o.o out in 15 minutes if all goes smoothly, but wouldn't want to bank on that
19:43:59 <fungi> so 2 hours seems like a safe estimate
19:44:19 <jeblair> zuul should be pretty fast too.  very little state.
19:44:21 <fungi> my schedule is wide open, and i'm happy to help with it
19:44:36 <fungi> in zuul's favor, we also have zuul-dev successfully running on trusty now
19:44:48 <fungi> so shouldn't run into unexpected puppet issues
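A rough sketch of the detach/attach step pabelanger describes, using infra's shade library; the cloud, server, and volume names here are placeholders, and the real maintenance would happen inside the announced outage window:

```python
import shade

cloud = shade.openstack_cloud(cloud="rax")  # assumes a clouds.yaml entry

old_server = cloud.get_server("static.openstack.org")
new_server = cloud.get_server("static01.openstack.org")
volume = cloud.get_volume("static-data")

# move the data volume from the old host to the trusty replacement
cloud.detach_volume(old_server, volume, wait=True)
cloud.attach_volume(new_server, volume, wait=True)
```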
19:44:58 <rcarrillocruz> heya
19:44:58 <jeblair> and i think the only critical firewall involved is *on* the zuul host
19:45:00 <rcarrillocruz> i'm around
19:45:02 <pabelanger> Yup, I think we can do it pretty quick
19:45:02 <anteaya> whoever wants to drive this please suggest a date
19:45:08 <rcarrillocruz> wootz, quite a bit of notifications
19:45:19 <pabelanger> just need to decide how much heads up to give the community
19:45:33 <pabelanger> too early / late to do it this friday?
19:45:47 <anteaya> friday is fine for me
19:46:08 <fungi> i'm free this friday, but for a two-hour projected outage i'd be hesitant to announce it on such short notice
19:46:36 <anteaya> nothing of note this friday in the release schedule: http://releases.openstack.org/newton/schedule.html
19:46:47 <pabelanger> okay, happy to defer to fungi for the right day
19:46:49 <fungi> you beat me to linking it
19:46:53 <fungi> #link http://releases.openstack.org/newton/schedule.html
19:46:57 <anteaya> yay I beat fungi
19:47:01 <anteaya> that never happens
19:47:13 <fungi> if we do next week, that's the n2 milestone week
19:47:29 <anteaya> fungi: June 17
19:47:35 <anteaya> fungi: you are looking at July
19:47:50 <fungi> indeed, i am
19:47:53 <fungi> le sigh
19:48:06 <jeblair> let's do it before then :)
19:48:15 <fungi> so trove specs proposals deadline is the most exciting thing happening next week
19:48:42 <fungi> yeah, i'm happy to help, and i'm available friday, june 17
19:48:52 <pabelanger> okay, lets aim for that
19:49:04 <pabelanger> I'll get something into pastebin to review
19:49:10 <pabelanger> then send out the email to ML
19:49:11 <fungi> pabelanger: thanks!
19:49:31 <fungi> anything more to discuss on that?
19:49:45 <pabelanger> I think we are good
19:50:07 <fungi> #topic task-tracker migration-- load testing storyboard using storyboard-dev, and gerrit integration (Zara)
19:50:26 <pleia2> exciting :)
19:50:28 <Zara> hi! :) I squooshed two things together...
19:50:41 <fungi> Zara: i assume this is stemming from last week's question about maybe setting up the storyboard plugin on review-dev and pointing it at storyboard-dev?
19:50:49 <fungi> oh, or maybe not? ;)
19:51:01 <Zara> ah, sorry, it was actually a bit different but it stemmed from there
19:51:03 <anteaya> fungi: that is the second part of this two part topic
19:51:14 <fungi> you have the floor!
19:51:17 <Zara> the first thing is more general: if we're expecting lots of new users, we should check storyboard works to scale...
19:51:24 <Zara> what's the best way to do that?
19:52:11 <fungi> we've done some pretty minimal load testing of services in the past
19:52:36 <fungi> i think clarkb found a tool to load-test etherpad for example and we tried out pointing it at etherpad-dev
19:53:06 <fungi> the openstackid devs did some load testing of authentication for their summit app pointed at openstackid-dev
19:53:08 <fungi> et cetera
19:53:49 <fungi> so it's generally been custom/ad-hoc whitebox solutions (based on some internal knowledge of the application and expected usage patterns)
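One example of that ad-hoc whitebox style pointed at the storyboard-dev API: fire concurrent reads at a hot endpoint and look at error rate and latency percentiles. The endpoint, request count, and concurrency below are guesses, not a vetted test plan:

```python
import concurrent.futures
import time

import requests

URL = "https://storyboard-dev.openstack.org/api/v1/stories?limit=50"

def timed_get(_):
    start = time.time()
    resp = requests.get(URL, timeout=30)
    return resp.status_code, time.time() - start

with concurrent.futures.ThreadPoolExecutor(max_workers=20) as pool:
    results = list(pool.map(timed_get, range(200)))

latencies = sorted(t for _, t in results)
errors = sum(1 for code, _ in results if code != 200)
print("errors: %d/%d" % (errors, len(results)))
print("median: %.2fs  p95: %.2fs" % (
    latencies[len(latencies) // 2],
    latencies[int(len(latencies) * 0.95)]))
```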
19:53:55 <Zara> ah, okay
19:54:29 <Zara> I think it also depends on the expected pace of migration
19:54:45 <Zara> for how heavy a load we need to have tested at any time
19:55:33 <fungi> i'm also perfectly happy to address nonobvious scaling challenges as we come across them rather than spend development time prematurely optimizing for things we don't know will actually be an issue
19:55:48 <anteaya> okay that helps, thank you
19:56:03 <anteaya> that was the prime dev concern when I asked are we ready for the masses
19:56:53 <anteaya> Zara: are you ready to move to the second bit on the topic?
19:56:55 <fungi> storyboard already has some pretty aggressive/challenging features on its roadmap, and so unnecessarily taxing the devs with scalability testing seems like an undue burden
19:57:06 <anteaya> fungi: thank you
19:57:10 <anteaya> and yes agreed
19:57:17 <Zara> okay, I was the one who was worried about it, but I'm fine if other people are willing to share that down the line :)
19:57:50 <fungi> note we're running out of meeting time if there's a part 2
19:57:51 * SotK too
19:58:02 <Zara> so the other thing was that I was tasked with gerrit integration, right now I have a big review backlog and so on, and it's unlikely to make fast progress while I'm on it
19:58:03 <fungi> (2 minutes remaining)
19:58:39 <anteaya> Zara: well I think we are asking if someone can help with adding the storyboard gerrit plugin to the review-dev server
19:58:52 <anteaya> at least that was what I was asking for
19:59:18 <anteaya> and also to squeeze it in we have a StoryBoard bug sprint coming up: https://wiki.openstack.org/wiki/VirtualSprints#StoryBoard_Bug_Sprint
19:59:19 <fungi> zaro seems like an obvious choice as he has knowledge of the puppet module, root access to the review-dev server and is the author of that gerrit plugin
19:59:32 <Zara> ah, related, was going to say: if someone wants it right away, someone else needs to prod about it and push for that sort of thing
19:59:32 <fungi> but i don't know what his availability is
19:59:38 <anteaya> so hopefully we can onboard some new storyboard developers who are willing to help with bugs
19:59:53 <fungi> okay, duly noted
19:59:58 <fungi> and with that, we're at time
20:00:03 <anteaya> thank you
20:00:04 <fungi> thanks everyone!
20:00:04 <zaro> i can help but will need to wait for a little bit
20:00:11 <fungi> #endmeeting