19:02:09 <jeblair> #startmeeting infra
19:02:11 <openstack> Meeting started Tue May  7 19:02:09 2013 UTC.  The chair is jeblair. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:02:12 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:02:13 <olaph> o/
19:02:14 <openstack> The meeting name has been set to 'infra'
19:02:43 <jeblair> #topic bugs
19:03:00 <jeblair> (i'm going to abuse my position as chair to insert a topic not currently on the agenda)
19:03:09 <fungi> abuse away
19:03:28 <jeblair> this is a PSA to remind people to use launchpad bugs for infra tasks
19:03:43 <jeblair> i think we were a bit lax about that last cycle (all of us, me too)
19:04:20 <clarkb> we were. ++ to pleia for forcing us to do bug days
19:04:21 <jeblair> especially as we're trying to make it easier for others to get involved, i think keeping up with bug status is important
19:04:39 <jeblair> yes, much thanks to pleia for that; we'd be in a much worse position otherwise
19:04:40 <jlk> +1 to that
19:04:48 <jlk> us new folks have to know where to find work that needs to be done
19:04:56 <fungi> i couldn't agree more. definitely going to strive to improve on that as my new cycle resolution
19:05:32 <jeblair> so anyway, please take a minute and make sure that things you're working on have bugs assigned to you, and things you aren't working on don't.  :)
19:05:53 <jeblair> btw, i think we have started doing a better job with low-hanging-fruit tags
19:06:05 <anteaya> o/
19:06:12 <jeblair> so hopefully that will be an effective way for new people to pick up fairly independent tasks
19:06:28 <jeblair> any other thoughts on that?
19:06:45 <clarkb> I think we should try and make the bugday thing frequent and scheduled in advance
19:06:53 <jlk> oh something that is also missing
19:06:57 <fungi> seconded
19:07:02 <jlk> a document to outline proper bug workflow
19:07:17 <jlk> maybe that exists somewhere?
19:07:19 <jeblair> clarkb: +1;  how often?  line up with milestones?
19:07:30 <jlk> I just took a guess at what to do
19:07:48 <clarkb> jeblair: I was thinking once a month. lining up with milestones might be hard as we end up being very busy around milestone time it seems like
19:07:48 <jeblair> jlk: no, but i think i need to write a 'how to contribute to openstack-infra' doc
19:08:07 <jeblair> jlk: i should assign a bug to myself for that.  :)
19:08:09 <fungi> lining up between milestones ;)
19:08:23 <clarkb> but any schedule that is consistent and doesn't allow us to put it off would be good
19:08:43 <clarkb> and maybe we cycle responsibility for driving it so that pleia doesn't have to do it each time
19:08:49 <spzala> bknudson: yes, if it exist then use it.. if not, then use virtual default domain
19:09:11 <spzala> sorry, wrong chat box
19:09:34 <jeblair> clarkb: want to mock up a calendar?
19:09:48 <clarkb> jeblair: sure. I will submit a bug for it too :P
19:09:58 <clarkb> #action clarkb to mock up infra bugday calendar
19:10:02 <mordred> o/
19:10:06 <jeblair> jlk: basically, feel free to assign a bug to yourself when you decide to start working on something
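[editor's note: for readers new to the workflow jeblair describes, self-assigning a Launchpad bug can also be done programmatically; the following is a minimal sketch using launchpadlib, where the application name and bug number are placeholders, not real values]

    # Minimal launchpadlib sketch: assign a bug to yourself and mark it
    # in progress. Application name and bug number are placeholders.
    from launchpadlib.launchpad import Launchpad

    lp = Launchpad.login_with('infra-bug-triage', 'production')
    bug = lp.bugs[1234567]       # hypothetical bug number
    task = bug.bug_tasks[0]      # first (usually only) bug task
    task.assignee = lp.me        # the authenticated user
    task.status = 'In Progress'
    task.lp_save()               # push the changes back to launchpad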
19:10:52 <jeblair> #topic actions from last meeting
19:10:58 <jeblair> #link http://eavesdrop.openstack.org/meetings/infra/2013/infra.2013-04-30-19.03.html
19:11:00 <jlk> jeblair: that's what I assumed.
19:11:13 <jeblair> mordred: mordred set up per-provider apt mirrors (incl cloud archive) and magic puppet config to use them ?
19:11:52 <jeblair> mordred: maybe you should just open a bug for that and let us know when there's something to start testing
19:12:20 <mordred> jeblair: yes. I will do this
19:12:24 <jeblair> clarkb: clarkb to ping markmc and sdague about move to testr
19:12:33 <clarkb> I have not done that yet
19:12:42 <jeblair> #action clarkb to ping markmc and sdague about move to testr
19:12:47 <jeblair> i assume that's still a good idea.  :)
19:12:53 <clarkb> it is
19:13:07 <clarkb> and it should be a higher priority of mine to get things in before milestone 1 if possible
19:13:15 <clarkb> mordred did bring it up in the project meeting iirc
19:13:23 <mordred> yeah. people are receptive to it
19:13:41 <mordred> I think on my tdl is "open a bug about migrating everything and set up the per-project bug tasks"
19:14:21 <jeblair> #topic oneiric server migrations
19:14:26 <jeblair> so we moved lists and eavesdrop
19:14:41 <jeblair> the continued avalanche of emails to os-dev seems to indicate that went okay
19:14:52 <jeblair> and meetbot is answering hails so
19:15:01 <jeblair> i guess that's that?
19:15:04 <clarkb> we need to shutdown/delete the old servers at some point. Once we have done that the task is complete
19:15:07 <clarkb> jeblair: not quite
19:15:14 <fungi> a resounding success
19:15:30 <clarkb> we need to delete the old servers (unless you already did that) and mirror26 needs to be swapped out for a centos slave
19:15:48 <jeblair> reed: have you logged into the new lists.o.o?
19:16:07 <fungi> i can take mirror26 as an action item
19:16:15 <jeblair> reed: if not, let us know when you do so and if you have any objections to deleting the old server.
19:16:15 <fungi> unless you wanted it, clarkb
19:16:35 <clarkb> fungi: go for it
19:16:39 <reed> jeblair, yes
19:16:43 <fungi> #action fungi open a bug about replacing mirror26 and assign it to himself
19:16:43 <reed> system restart required
19:16:53 <jeblair> reed: ?
19:17:08 <reed> just logged in, *** System restart required ***
19:17:09 <jeblair> oh.  the issue.  :)
19:17:32 <jeblair> i believe it actually was recently rebooted.
19:17:45 <clarkb> it was rebooted on saturday before we updated DNS
19:17:55 <clarkb> I guess that means that more updates have come in since then
19:17:58 <jeblair> reed: anything missing from the move?  or can we delete the old server?
19:18:01 * fungi is pretty sure our devstack slaves are the only servers we have which don't say "restart required" every time you log in
19:18:28 <reed> jeblair, how should I know if anything is missing? did anybody complain?
19:19:04 <jeblair> reed: no, i think we have the archives and the lists seem to be working, so i don't see a reason
19:19:24 <jeblair> reed: but we didn't sync homedirs (i don't think)
19:19:40 <reed> alright then, I don't think I have anything in the old server there anyway
19:19:41 <jeblair> reed: so if your bitcoin wallet is on that server you should copy it.  :)
19:19:44 <clarkb> jeblair: I did not sync homedirs
19:19:55 <reed> oh, my wallet!
19:19:55 <mordred> jeblair already stole all of my bitcoins
19:20:18 <jeblair> #action jeblair delete old lists and eavesdrop
19:20:21 <reed> one of the cloud expense management systems allows you to request bitcoins for payments
19:20:39 <jeblair> we should charge bitcoins for rechecks
19:20:47 <clarkb> ha
19:20:52 <fungi> we should charge bugfixes for rechecks
19:20:59 <jeblair> #topic jenkins slave operating systems
19:21:17 <jeblair> my notes in the wiki say:      current idea: test master and stable branches on latest lts+cloud archive at time of initial development
19:21:21 <jeblair> and:  open question: what to do with havana (currently testing on quantal -- "I" release would test on precise?)
19:21:28 <mordred> there's an idea about having the ci system generate a bitcoin for each build, and then embed build id information into the bitcoin...
19:21:55 <mordred> oh good. this topic again. my favorite :)
19:22:16 <clarkb> jeblair: I have thought about it a bit over the last week and I think that testing havana on quantal then "I" on precise is silly
19:22:29 <jeblair> clarkb: yes, that sounds silly to me too.
19:22:46 <clarkb> it opens us to potential problems when we open I for dev
19:23:08 <clarkb> and we may as well sink the cost now before quantal and precise have time to diverge
19:23:41 <jeblair> so if we're going to stick with the plan of lts+cloud archive, then i think we should roll back our slaves to precise asap.
19:23:56 <fungi> and the thought is that we'll be able to test the "j" release on the next lts?
19:24:18 <clarkb> fungi: yes
19:24:30 <mordred> lts+cloud archive ++
19:24:44 <mordred> at least until it causes some unforeseen problem
19:24:55 <fungi> makes sense. i can spin up a new farm of precise slaves then. most of the old ones were rackspace legacy and needed rebuilding anyway
19:25:03 <mordred> I believe zul and Daviey indicated they didn't think tracking depends in that order would be a problem
19:25:10 <clarkb> jeblair: I assume we want to run it by the TC first?
19:25:23 <clarkb> but I agree that sooner is better than later
19:25:58 <fungi> the tc agenda is probably long since closed for today's meeting. do we need to see about getting something in for next week with them?
19:26:02 <mordred> honestly, I don't think the TC will want to be bothered with it (gut feeling, based on previous times I've asked things)
19:26:22 <jeblair> yes, why don't we do it, and just let them know
19:26:24 <mordred> it doesn't change much in terms of developer experience, since we're still hacking on pypi
19:26:26 <jlk> don't make it a question
19:26:30 <fungi> fair enough
19:26:34 <jlk> make it a "hey we're doing this thing, thought you'd like to know"
19:26:54 <jeblair> if they feel strongly about it, we can certainly open the discussion (and i would _love_ new ideas about how to solve the problem.  :)
19:27:19 <jeblair> mordred: you want to be the messenger?
19:27:23 <mordred> jeblair: sure
19:27:30 <mordred> I believe we'll be talking soon
19:27:41 <jeblair> #action mordred inform TC of current testing plans
19:28:04 <jeblair> #agreed drop quantal slaves in favor of precise+cloud archive
19:28:14 <fungi> #action fungi open bug about spinning up new precise slaves, then do it
19:28:45 <jeblair> any baremetal updates this week?
19:28:55 <mordred> not to my knowledge
19:29:02 <jeblair> #topic open discussion
19:29:20 <fungi> oh, while we're talking about slave servers, rackspace says the packet loss on mirror27 is due to another customer on that compute node
19:29:27 <jeblair> fungi: !
19:29:32 <mordred> fwiw, I'm almost done screwing with hacking to add support for per-project local checks
19:29:41 <mordred> as a result, I'd like to say "pep8 is a terrible code base"
19:29:45 <clarkb> fungi: we should DoS them in return :P
19:29:47 <fungi> they offered to migrate us to another compute node, but it will involve downtime. should i just build another instead?
19:29:48 <jeblair> fungi: want to spin up a replacement mirror27?  istr that we have had long-running problems with that one?
19:29:53 <anteaya> alias opix="open and fix" #usage I'll opix a bug for that
19:30:06 <fungi> heh
19:30:23 <jlk> that's the cloud way right? problem server? spin up a new one!
19:30:26 <jlk> (or 10)
19:30:33 <jeblair> mordred: i agree.  :)
19:30:33 <fungi> yeah, i'll just do replacements for both mirrors in that case
19:30:38 <jeblair> fungi: +1
19:30:51 <clarkb> do we need to spend more time troubleshooting static.o.o problems?
19:31:03 <fungi> oh?
19:31:04 <clarkb> sounds like we were happy calling it a network issue
19:31:16 <fungi> oh, the ipv6 ssh thing?
19:31:19 <clarkb> are we still happy with that as the most recent pypi.o.o failure?
19:31:36 <fungi> ahh, that, yes
19:31:38 <clarkb> fungi: no, pip couldn't fetch 5 packages from static.o.o the other day
19:31:42 <fungi> right
19:31:48 <jeblair> clarkb: i just re-ran the logstash query with no additional hits
19:32:07 <fungi> that's what prompted me to open the ticket. i strongly suspect it was the packet loss getting worse than usual
19:32:13 <clarkb> fungi: I see
19:32:33 <fungi> i'd seen it off and on in the past, but never to the point of impacting tests (afaik)
19:32:46 <mordred> so - possibly a can of worms - but based off of "jlk | that's the cloud way right? problem server? spin up a new one!"
19:33:16 <jeblair> fungi: though i believe the mirror packet loss is mirror27 <-> static, whereas the test timeouts were slave <-> static...
19:33:18 <mordred> should we spend _any_ time thinking about ways we can make some of our longer-lived services more cloud-y?
19:33:39 <mordred> for easier "oh, just add another mirror to the pool and kill the ugly one" like our slaves are
19:33:59 <fungi> mmm, right. i keep forgetting static is what actually serves the mirrors
19:34:14 <clarkb> mordred: are you going to make heat work for us?
19:34:17 <fungi> so then no, that was not necessarily related
19:34:20 <clarkb> because I would be onboard with that :)
19:34:29 <jlk> mordred: fyi we're struggling with that internally too, w/ our openstack control stuff in cloud, treating them more "cloudy" whatever that means.
19:35:06 <mordred> jlk: yeah, I mean - it's easier for services that are actually intended for it - like our slave pool
19:35:11 <mordred> otoh - jenkins, you know?
19:35:13 <jlk> yup
19:35:20 <jeblair> mordred: as they present problems, sure, but not necessarily go fixing things that aren't broke.
19:35:26 <jlk> yeah, these are harder questions
19:35:29 <mordred> jeblair: good point
19:35:32 <jlk> jeblair: +1
19:35:34 <jeblair> mordred: we are making jenkins more cloudy -- zuul/gearman...
19:35:57 <jlk> does gearman have an easy way to promote to master?
19:36:01 * mordred used floating ips on hp cloud the other day to support creating/deleting the same thing over and over again while testing - but having the dns point to the floating ip
19:36:12 <jeblair> jlk: no, gearman and zuul will be (co-located) SPOFs
19:36:13 <mordred> jlk: gearman doesn't have a master/slave concept
19:36:19 <clarkb> mordred: yeah I intend on trying floating ips at some point
19:36:30 <mordred> clarkb: it worked VERY well and made me happy
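[editor's note: the pattern mordred describes looks roughly like the sketch below with python-novaclient's 1.1 client; credentials, hostnames, and the image/flavor ids are placeholders, and this is an illustration rather than the config actually used]

    # Rough python-novaclient sketch: rebuild a server repeatedly while
    # DNS keeps pointing at a stable floating ip. All values are placeholders.
    from novaclient.v1_1 import client

    nova = client.Client('username', 'password', 'project',
                         'https://auth.example.com/v2.0')

    fip = nova.floating_ips.create()   # allocate the stable address once
    server = nova.servers.create(name='mirror-test',
                                 image='IMAGE-ID', flavor='FLAVOR-ID')
    # once the server is ACTIVE, attach the stable address to it
    server.add_floating_ip(fip.ip)
    # DNS points at fip.ip; the server behind it can be deleted and recreated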
19:36:35 * ttx waves
19:36:45 <mordred> jeblair: doesn't gearman have support for multi-master operation-ish something?
19:36:59 <jlk> gear man job server(s)
19:37:04 <clarkb> mordred: I think it does, but if zuul is already a spof...
19:37:08 <mordred> ttx: I am tasked with communicating a change in our jenkins slave strategy to the TC - do I need an agenda item?
19:37:13 <mordred> clarkb: good point
19:37:21 <jeblair> mordred, jlk: yeah, actually you can just use multiple gearman masters
19:37:32 <jeblair> mordred, jlk: and have all the clients and workers talk to all of the masters
19:37:32 <jlk> so yes, you can have multiple in a active/active mod
19:37:40 <mordred> so once gearman is in, then our only spofs will be gerrit/zuul
19:37:44 <jlk> but as stated, doesn't solve zuul
19:37:51 <jeblair> mordred, jlk: however, we'll probably just run one on the zuul server.  because zuul spof.
19:37:57 <mordred> yeah
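[editor's note: the multi-master arrangement discussed above amounts to pointing every worker and client at all of the job servers; here is a minimal sketch with the python-gearman library, where hostnames and the task name are made up for illustration]

    # Minimal python-gearman worker sketch: register the same function with
    # every job server, so losing one server doesn't stop jobs submitted to
    # the others. Hostnames and the task name are made up.
    import gearman

    def run_build(worker, job):
        # job.data carries whatever payload the client submitted
        return 'built: ' + job.data

    worker = gearman.GearmanWorker(['gearman1.example.org:4730',
                                    'gearman2.example.org:4730'])
    worker.register_task('run_build', run_build)
    worker.work()  # blocks, polling all configured job servers for work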
19:37:58 <ttx> mordred: you can probably use the open discussion area at the end. If it's more significant it should be posted to -dev and linked to -tc to get a proper topic on the agenda
19:38:09 <ttx> (of next week)
19:38:18 <mordred> ttx: it's not. I don't think anyone will actually care or have an opinion - but information is good
19:38:41 <ttx> mordred: will try to give you one minute at the end -- busy agenda
19:39:42 <anteaya> jeblair: can I take a turn?
19:39:53 <jeblair> anteaya: the floor is yours
19:39:56 <anteaya> thanks
19:40:09 <anteaya> sorry I haven't been around much lately, figuring out the new job and all
19:40:28 <anteaya> hoping to get back to the things I was working on like the openstackwatch url patch
19:40:49 <anteaya> but if something I said I would do is important, pluck it from my hands and carry on
19:40:51 <anteaya> and thanks
19:41:29 <jeblair> anteaya: thank you, and i hope the new job is going well.
19:41:38 <anteaya> :D thanks jeblair, it is
19:41:57 <anteaya> like most new jobs I have to get in there and do stuff for a while to figure out what I should be doing
19:42:09 <anteaya> getting there though
19:42:46 <jeblair> anteaya: we do need to sync up with you about donating devstack nodes.  should i email someone?
19:43:00 <anteaya> hmmm, I was hoping I would have them by now
19:43:18 <anteaya> when I was in Montreal last week I met with all the people I thought I needed to meet with
19:43:29 <anteaya> and was under the impression there were no impediments
19:43:37 <anteaya> thought I would have the account by now
19:43:50 <anteaya> you are welcome to email the thread I started to inquire
19:44:02 <anteaya> though it will probably be me that replies
19:44:13 <jeblair> anteaya: ok, will do.  and if you need us to sign up with a jenkins@openstack.org email address or something, we can do that.
19:44:15 <anteaya> let's do that, let's use the official channels and see what happens
19:44:37 <anteaya> I don't think so, I got the -infra core emails from mordred last week
19:44:49 <anteaya> so I don't think I need more emails
19:45:34 <anteaya> email the thread, I'll forward it around, maybe that will help things
19:45:39 <anteaya> and thanks
19:45:43 <jeblair> thank you
19:45:54 <jeblair> i'm continuing to hack away at zuul+gearman
19:46:06 <anteaya> fun times
19:46:08 <jeblair> right before this meeting, i had 5 of its functional tests working
19:46:19 <fungi> oh, on the centos py26 unit test front, dprince indicated yesterday that he thought finalizing the remaining nova stable backports by thursday was doable (when oneiric's support expires)
19:46:22 <fungi> dprince, still the case?
19:46:31 <jeblair> i'm hoping i can have a patchset that passes tests soon.
19:46:41 <clarkb> jeblair: nice
19:46:56 <anteaya> yay for passing tests
19:47:13 <clarkb> I have a series of changes up that makes the jenkins log pusher stuff for logstash more properly daemon like
19:47:44 <zaro> i'm figuring out how to integrate WIP with gerrit 2.6 configs.
19:48:06 <clarkb> I think that what I currently have is enough to start transitioning back to importing more logs and working to normalize the log formats. But I will probably push that down the stack while I sort out testr
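[editor's note: clarkb doesn't describe the implementation here; as a hedged illustration only, making a long-running log pusher "properly daemon like" could look like the following python-daemon sketch, where the pidfile path and the push_logs() function are placeholders rather than the real code]

    # Hedged sketch of daemonizing a log pusher with the python-daemon
    # library; pidfile path and push_logs() are placeholders.
    import daemon
    import daemon.pidfile

    def push_logs():
        while True:
            # fetch jenkins console logs and ship them to logstash here
            pass

    with daemon.DaemonContext(
            pidfile=daemon.pidfile.TimeoutPIDLockFile('/var/run/log-pusher.pid')):
        push_logs()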
19:48:06 <dprince> fungi: for grizzly we need the one branch in and we are set.
19:48:18 <fungi> dprince: any hope for folsom?
19:48:24 <dprince> fungi: for folsom I think centos6 may be a lost cause.
19:48:30 <fungi> ugh
19:48:54 <jeblair> clarkb: what are you doing with testr?
19:49:13 <fungi> 'twould be sad if we could test stable/folsom for everything except nova on centos
19:49:33 <jeblair> dprince, fungi: hrm, that means we have no supported python2.6 test for folsom nova
19:49:35 <dprince> fungi: it looks like it could be several things (more than 2 or 3) that would need to get backported to fix all that stuff.
19:49:43 <clarkb> jeblair: motivating people like sdague and markmc to push everyone else along :)
19:49:49 <fungi> leaves us maintaining special nova-folsom test slaves running some other os as of yet undetermined
19:49:55 <clarkb> jeblair: I don't intend on doing much implementation myself this time around
19:50:02 <jeblair> clarkb: +1
19:50:33 <sdague> oh no, what did I do wrong? :)
19:50:41 <clarkb> sdague: nothing :)
19:50:42 <dprince> jeblair: the centos work can be done. but I'm not convinced it is worth the effort.
19:50:52 <markmc> jeblair, I'm fine with dropping 2.6 testing on stable/folsom - there should be pretty much nothing happening there now
19:51:05 <fungi> i'll start to look into debian slaves for nova/folsom unit tests i guess?
19:51:10 <jeblair> options for testing nova on python2.6 on folsom: a) backport fixes and test on centos; b) drop tests; c) use debian
19:51:38 <markmc> (b) or we'll get (a) done somehow IMHO
19:52:05 <mordred> I saw b. folsom came out before we made the current distro policy
19:52:06 <fungi> oh, i prefer markmc's suggestion in that case. less work for me ;)
19:52:12 <mordred> s/saw/say/
19:52:45 <clarkb> oh so I don't forget.
19:52:49 <jeblair> okay.  (b) is a one line change to zuul's config
19:52:54 <clarkb> #action clarkb to get hpcloud az3 sorted out
19:53:10 * jlk has to drop off
19:53:12 <jeblair> #agreed drop python2.6 testing for nova on folsom
19:53:21 <jeblair> jlk: thanks!
19:53:23 * mordred shoots folsom/python2.6 in the facehole
19:53:35 <fungi> #action fungi add change to disable nova py26 tests for folsom
19:53:58 <fungi> i'll drop that on top of my oneiric->centos change and we can merge them together
19:54:18 <jeblair> fungi: cool.  oh, sorry, i think it's 2 lines.
19:54:30 <fungi> jeblair: i'll find the extra electrons somewhere
19:54:57 <jeblair> anything else?
19:56:08 <olaph> hubcap
19:56:13 <fungi> a merry tuesday to all! (excepting those for whom it may already be wednesday)
19:56:41 <jeblair> thanks everyone!
19:56:42 <jeblair> #endmeeting