19:02:59 <jeblair> #startmeeting infra
19:03:00 <openstack> Meeting started Tue Dec 10 19:02:59 2013 UTC and is due to finish in 60 minutes.  The chair is jeblair. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:03:01 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:03:04 <openstack> The meeting name has been set to 'infra'
19:03:06 <lifeless> o/
19:03:08 <jeblair> #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting
19:03:34 <jeblair> #link http://eavesdrop.openstack.org/meetings/infra/2013/infra.2013-12-03-19.02.html
19:03:48 <jeblair> #topic Actions from last meeting
19:04:04 <jeblair> #action jeblair file bug about cleaning up gerrit-trigger-plugin
19:04:10 <jeblair> #action jeblair move tarballs.o.o and include 50gb space for heat/trove images
19:04:22 <jeblair> fungi: do you think it's safe for me to do that ^ now?
19:04:48 <SergeyLukjanov> whisper: and for savanna images too, please
19:04:49 <fungi> mmm, well we have space on logs and on docs-draft
19:04:49 <pleia2> o/
19:05:15 <fungi> and the /srv/static partition itself is 100g mostly empty
19:05:35 <jeblair> fungi: i'd probably make a 200g volume for it
19:05:47 <jeblair> fungi: current real tarball usage is 30g
19:05:56 <fungi> but no available extents in the vg, and no ability to add any more cinder volumes (we ended up needing 10tib to get the two volumes to 50% freeish)
19:06:09 <fungi> unless we request a quota bump
19:06:10 <jeblair> fungi: okay, so we need to ask rackspace for a quota increase
19:06:20 <fungi> i can open a case asking about it
19:06:21 <zaro> o/
19:06:22 <mordred> we could add a second server
19:06:31 <anteaya> o/
19:06:38 <mordred> that has volumes onto which we publish build artifacts
19:06:39 <fungi> mordred: well, the 10tib quota limit is tenant-wide according to their faq
19:06:43 <mordred> oh. ok
19:06:49 <mordred> I grok now - nevermind
19:06:56 <jeblair> mordred: but we may also need to do that.
19:07:01 <mordred> I was somehow thinking it was how many we can attach to one thing
19:07:13 <fungi> #action fungi ask rackspace for cinder quota increase
19:07:15 <clarkb> right, because there is a 16 device limit as well
19:07:22 <jeblair> mordred: if the 10tib limit is increased, we'll be able to put 14tib on one server
19:07:29 <fungi> yep
19:07:35 <mordred> k
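[editor's note: a minimal sketch of the arithmetic behind jeblair's "14tib on one server" figure above; the 1tib-per-volume cap and the two reserved device slots are assumptions, not stated in the log:]

    # rough math behind the 14 TiB figure (assumed limits, not verified):
    # rackspace allowed ~16 block devices per instance (clarkb's point),
    # and cinder volumes there were capped at 1 TiB each at the time.
    DEVICE_LIMIT = 16
    RESERVED_SLOTS = 2       # assumed: root disk + ephemeral disk
    MAX_VOLUME_TIB = 1
    print((DEVICE_LIMIT - RESERVED_SLOTS) * MAX_VOLUME_TIB)  # -> 14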
19:07:56 <jeblair> fungi grow logs and docs-draft volumes so they're 50% full
19:08:00 <fungi> #link http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=717&rra_id=all
19:08:04 <fungi> #link http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=718&rra_id=all
19:08:05 <jeblair> ^ i think we just covered that; anything else, fungi ?
19:08:16 <fungi> #link https://review.openstack.org/#/c/46099/
19:08:24 <fungi> (the doc update associated with it)
19:08:42 <jeblair> fungi: that link isn't right
19:08:56 <fungi> gah
19:09:04 <fungi> #link https://review.openstack.org/#/c/59615/
19:09:19 <clarkb> though there is an interesting discussion under 46099 :)
19:09:42 <fungi> as for the graphs, nothing exciting, just don't look too closely at where i accidentally stuck the extra space on docs-draft at first and had to rsync the contents over and back during the course of the weekend
19:10:22 <fungi> clarkb: yeah, the comments at the end of 46099 are part of my to do list for today, which is why the mispaste
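[editor's note: the "rsync the contents over and back" step fungi describes is the kind of copy that preserves symlinks, unlike the jenkins scp publisher he mentions a few lines below; a minimal sketch, with hypothetical paths:]

    # sketch of shuffling a tree between volumes while preserving
    # symlinks, ownership, and timestamps (paths are hypothetical)
    import subprocess

    src = "/srv/static/docs-draft/"     # hypothetical source mount
    dst = "/mnt/newvolume/docs-draft/"  # hypothetical destination mount
    subprocess.check_call(["rsync", "-a", "--delete", src, dst])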
19:11:45 <jeblair> cool.  the rest of the action items will fall into other topics
19:11:48 <jeblair> #topic Trove testing (mordred, hub_cap)
19:11:56 <jeblair> mordred to harass hub_cap until he's written the tempest patches
19:11:58 <fungi> though in the rsync'ing, i learned that the jenkins scp publisher will annihilate any symlinks it finds in the destination path
19:12:02 <jeblair> mordred: ^ how's the harassing going?
19:12:05 <mordred> jeblair: I believe he started doing that?
19:12:13 <mordred> hub_cap: ^^ ?
19:12:48 <hub_cap> hey
19:13:04 <hub_cap> i havent been harassed enough
19:13:34 <hub_cap> although i think that slicknik is going to start some of the work. hes getting our dib elements out of trove-integration
19:13:43 <hub_cap> so i suspect he will begin taking my place for this stuff
19:13:52 <mordred> ok. I'll harass him then
19:14:08 <mordred> I thought you started work on it? or was that just savanna?
19:14:30 <hub_cap> i didnt get anything tangible done
19:14:57 <hub_cap> ill be working w/ him on it so ill be sure to pass on what i have
19:15:04 <anteaya> I seem to excel at harassing for tempest, did you want me to add that to my list?
19:15:18 <anteaya> why should I just bug neutron?
19:15:23 <hub_cap> oh and we have someone from mirantis working on actual tempest tests too
19:15:29 <hub_cap> anteaya: hehe sure :)
19:15:45 <SergeyLukjanov> hub_cap, has any work started on automating image building?
19:15:53 <SergeyLukjanov> and publishing them to tarballs.o.o?
19:16:06 <hub_cap> hey SergeyLukjanov ive passed the buck to slicknik
19:16:17 <hub_cap> hes moving our dib elements and then i suspect he will work on that
19:16:45 <SergeyLukjanov> hub_cap, hey :) ok, I'll ping him to sync our efforts
19:16:54 <SergeyLukjanov> hub_cap, thx
19:16:57 <devananda> hub_cap: who @ mirantis?
19:17:16 <devananda> hub_cap: two folks there are also working on ironic/tempest
19:17:22 <jeblair> hub_cap: do you expect this to be done by icehouse-2?
19:17:28 <hub_cap> devananda: i cannot remember his name, he just started
19:17:37 <hub_cap> jeblair: its currently targeted for i2
19:17:50 <hub_cap> devananda: he just started working w/ us, ill find out his name and ping u
19:18:05 * mordred loves piles of russians
19:18:36 <anteaya> would it be worthwhile to have a gathering of folks working on tempest tests from various projects
19:18:41 <anteaya> y'know to compare notes
19:19:20 <clarkb> anteaya: maybe push them to the qa channel and open up dialogue? I am not really sure how much of this is happening outside of bubbles
19:19:37 <devananda> i think it would benefit us if the mirantis folks who are starting to work on ironic/tempest tests talk/meet with infra folks
19:19:40 <devananda> and have been pushing them that way
19:19:46 <anteaya> right, fairly bubbly right now
19:19:49 <hub_cap> clarkb: the rainbows are pretty inside our bubbles though
19:20:07 <anteaya> I'll host a meeting and post to the ml
19:20:17 <anteaya> and hope lots of interested folks show up
19:20:21 <anteaya> how's that?
19:20:40 <clarkb> wfm, though in general I think people just need to learn to lurk in the right places :)
19:20:44 <anteaya> yes
19:20:58 <anteaya> hard to keep up on all the backscroll though
19:21:06 <anteaya> I'm having a hard time
19:21:32 <anteaya> anyway
19:21:33 <jeblair> you don't have to read all the backscroll
19:21:36 <zaro> clean slate for me every day..
19:21:45 <pleia2> zaro: hehe
19:21:54 <jeblair> anteaya: before you host a meeting about tempest; you might want to talk to sdague.
19:21:59 <clarkb> ++
19:22:02 <anteaya> jeblair: yes, good point
19:22:04 <jeblair> anyway, this is off-topic.
19:22:05 <anteaya> will do so
19:22:17 <jeblair> #topic Tripleo testing (lifeless, pleia2)
19:22:32 <lifeless> hi, perfect timing, we're going to collide :)
19:22:34 <lifeless> one sec
19:22:36 <pleia2> hah
19:22:51 <fungi> any way to temporarily merge two channels? ;)
19:23:16 <clarkb> fungi: we could have a bot mirror discussion
19:23:20 <fungi> indeed
19:23:21 <clarkb> I am sure that is a terrible idea
19:23:23 <lifeless> ok so
19:23:27 <lifeless> pleia2: you should talk ;)
19:23:40 <pleia2> so I've been cleaning up some tripleo scripts so they can be used in the more automated CI stuff
19:24:05 <pleia2> it's also led to some changes in devstack.sh so we can properly support the testing infrastructure (and of course this makes sense in general too)
19:24:33 <pleia2> derekh and dprince have been working on gearman setup and lifeless has been tackling some of the gnarly network bits
19:24:56 <lifeless> yah, the network config is validated now, we know it will work
19:25:22 <pleia2> progress is still being tracked on https://etherpad.openstack.org/p/tripleo-test-cluster
19:25:46 <pleia2> I think derekh and dprince will need an infra chat in the near future
19:26:08 * dprince can't wait
19:26:12 <dprince> :)
19:26:34 <pleia2> dprince: want to pencil something in with jeblair this week? thinking an asterisk call
19:26:42 <jeblair> what's the topic?
19:27:02 <dprince> pleia2/jeblair: we can but derekh will definitely want to be on this one too
19:27:12 <lifeless> o/
19:27:35 <pleia2> dprince: you can fill jeblair in on the specifics better than I can
19:28:18 <dprince> jeblair: topic would be 'TripleO CI: where the rubber meets the road'
19:28:25 <dprince> jeblair: hows that sound?
19:28:41 <pleia2> I think he means specifically what we need from infra :)
19:28:46 <mordred> I love rubber and roads
19:28:57 <pleia2> mordred: no you don't, you're always in airplanes
19:29:07 <clarkb> pleia2: dprince: right, what exactly do we plan on going over
19:29:14 <lifeless> I don't think a call is needed
19:29:36 <lifeless> we have very clear guidance from infra about the contract and interactions, and we aren't ready to move from experimental to check yet.
19:29:43 <mordred> pleia2: doesn't mean I don't love them
19:29:47 <pleia2> mordred :)
19:29:49 <dprince> yeah, I'm not sure it is 100% needed either but it wouldn't hurt to get everyone on the same page
19:30:12 <dprince> but we can probably punt on it for this week at least
19:30:12 <jeblair> dprince, pleia2: is there anything infra-related that's blocking your work?
19:30:18 <lifeless> dprince: I think we are: jeblair has been through the design and code reviewed the broker, for instance.
19:30:27 <jeblair> dprince, pleia2: and do you expect to be making infra changes soon related to it?
19:30:40 <pleia2> jeblair: no blockers yet
19:31:04 <dprince> jeblair: on the HP tripleO side no. On the RedHat TripleO side maybe. But it is still premature to go there I think.
19:31:38 * dprince alludes to his problem with getting IPv4 addresses for a TripleO testing rack
19:32:04 <clarkb> I don't think we need IPv4
19:32:12 <jeblair> clarkb: does hpcloud support v6?
19:32:27 <clarkb> jeblair: no, but hpcloud is only running slaves
19:32:31 <mordred> jeblair: the tripleo rack has v6
19:32:52 <lifeless> the infra control plane is dual stack; for tripleo regions we specify that either we need a /25 or we need IPv6
19:32:55 <fungi> might be interesting to find out what sorts of breakage we hit with ipv6-only slaves
19:33:12 <jeblair> dprince: can you get v4 for the infra control plane?
19:33:38 <dprince> jeblair: I haven't got a hard number yet. Maybe.
19:33:43 <lifeless> jeblair: what's the 'infra control plane' in a slave only cloud ?
19:34:21 <fungi> i would be slightly more comfortable with the ipv6-only idea if rackspace didn't have known brokenness with ipv6 in some ways (particularly dscp marking for low latency communications)
19:34:28 <lifeless> jeblair: if there is infra work needed to support a v6 slave only region, as long as its reasonably modest, I think TripleO will be happy to step up and do it.
19:34:29 <jeblair> lifeless: i assumed you meant the part of the system that talks to infra tools;
19:34:46 <lifeless> jeblair: I meant zuul and nodepool which AIUI run in rackspace
19:35:10 <lifeless> jeblair: we specifically will not have enough ipv4 in many regions to use one ipv4 address per jenkins slave
19:35:31 <lifeless> jeblair: some proposed regions - like MS - have already told us they simply cannot do IPv4 at all.
19:35:52 <lifeless> jeblair: so, I'd like to frame this as - what work do we need to do to enable ipv6 only slave clouds
19:35:54 <jeblair> lifeless: okay, then yes, zuul, nodepool, gear, gearman-plugin will all need to be made to be dual-stack
19:36:14 <lifeless> if you can point us at the work, we can make sure its in our plan to do it
19:37:08 <lifeless> jeblair: should we take this offline to do details?
19:37:18 <jeblair> it's possible some of those may already be dual-stack; but they are all untested with v6
19:37:27 <lifeless> that was the bit I was missing
19:37:29 <jeblair> nodepool will definitely need some work, but it's moderate
19:37:44 <jeblair> gear may need a small amount or none but is untested
19:38:08 <jeblair> unsure about gearman-plugin and its java gearman dependency
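[editor's note: a minimal sketch of what the "made to be dual-stack" work being discussed typically looks like in python services like these; illustrative only, not the actual zuul/nodepool/gear code:]

    # dual-stack listener sketch: getaddrinfo with AF_UNSPEC returns both
    # A and AAAA results, instead of hard-coding socket.AF_INET (v4-only)
    import socket

    def listen_dual_stack(port):
        for family, socktype, proto, _, addr in socket.getaddrinfo(
                None, port, socket.AF_UNSPEC, socket.SOCK_STREAM, 0,
                socket.AI_PASSIVE):
            sock = socket.socket(family, socktype, proto)
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            sock.bind(addr)
            sock.listen(5)
            return sock  # first usable family wins in this sketch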
19:38:35 <lifeless> dprince: / pleia2: Can one of you capture the list zuul, nodepool, gear, gearman-plugin and the status jeblair is uttering now into a new etherpad? Called 'ipv6-infra-test-clouds' or something?
19:39:02 <mordred> lifeless, jeblair do they need gearman-plugin to be ipv6?
19:39:11 <pleia2> lifeless: on it
19:39:11 <jeblair> zuul may not be an issue.
19:39:23 <lifeless> you need anything that talks to jenkins slaves or to the API endpoint
19:39:32 <mordred> kk.
19:39:36 <dprince> lifeless: sure
19:39:47 <pleia2> https://etherpad.openstack.org/p/ipv6-infra-test-clouds
19:39:50 <dprince> pleia2: thanks
19:40:26 <jeblair> the job runners are jenkins slaves, right?  (they talk to the broker to then farm out the work)
19:41:06 <jeblair> iirc, the jenkins slaves are spun up as normal by nodepool from an openstack vm cloud
19:41:15 <jeblair> mordred: so i think that's why ^
19:41:31 <pleia2> right
19:42:10 <mordred> right. but then they're not running the gearman plugin - they just need ipv6 on the ssh connection from jenkins and from nodepool to their cloud - gearman to jenkins stays in our cloud
19:42:39 <jeblair> lifeless: i will be happy to see v6-enabling work happen on these.  i would personally love everything we do to be on v6
19:42:44 <mordred> ++
19:42:53 <fungi> hear hear
19:43:16 <jeblair> mordred: good point, zuul only talks to the gearman master
19:43:33 <lifeless> jeblair: they are, they are already defined in nodepool in fact
19:43:38 <jeblair> mordred: good point, zuul only talks to the jenkins master
19:43:47 <lifeless> jeblair: we're using the term tripleo-gate for them, a-la devstack-gate
19:44:27 <jeblair> mordred, lifeless: if the jenkins master can use a v4 address, then you don't need to worry about gearman-plugin.  if the jenkins master is v6 only, then it does need to be considered.
19:44:58 <jeblair> and i suppose we are running the jenkins master.  :)
19:45:04 <lifeless> you're running the master
19:45:12 <jeblair> mordred: so yes, i think we can strike gearman-plugin from that list.
19:45:16 <lifeless> the interface is exactly the same as for d-g
19:45:32 <lifeless> nodepool makes a node, hands it to jenkins, jenkins ssh's to it
19:45:52 <lifeless> so I would expect the list to be nodepool and jenkins
19:46:02 <lifeless> does gear talk directly to slaves?
19:46:44 <jeblair> lifeless: no, gear interactions are only zuul <-> jenkins master
19:46:47 <lifeless> ok
19:46:55 <lifeless> so, I think we need to timebox this
19:46:56 <fungi> jeblair: though does that change with jenkinsless slaves?
19:47:02 <lifeless> otherwise both meetings will run out of steam
19:47:15 <jeblair> lifeless: i updated the etherpad.
19:47:16 <mordred> then it would just need turbohipster to be able to do v6
19:47:18 <lifeless> lets pick up ipv6 discussion in a couple of weeks
19:47:24 <jeblair> lifeless: sounds good, thanks
19:47:36 <fungi> great planning progress tho
19:47:38 <lifeless> it's not front line yet - the RH region will be the second one to come online, and the HP one does have ipv4.
19:48:02 <jeblair> fungi: yes, in the future, non-jenkins slaves will talk directly to gearman, but they'll do it with gear, so presumably the problem will be solved by then.
19:48:28 <jeblair> (talk directly to zuul's gearman server)
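[editor's note: a minimal sketch of a non-jenkins worker talking directly to zuul's gearman server via the gear library, per jeblair's point above; the server hostname and function name are hypothetical:]

    # sketch of a gear worker connecting straight to zuul's geard
    # (hypothetical host and function name; 4730 is gear's default port)
    import gear

    worker = gear.Worker('example-worker')
    worker.addServer('zuul.openstack.org')   # hypothetical host
    worker.waitForServer()                   # block until connected
    worker.registerFunction('build:example-job')
    job = worker.getJob()                    # wait for a dispatched job
    job.sendWorkComplete('{"result": "SUCCESS"}')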
19:49:00 <jeblair> this is why we have these meetings.  :)
19:49:05 <jeblair> #topic Savanna testing (SergeyLukjanov)
19:49:28 <jeblair> SergeyLukjanov: how's it going?
19:49:48 <SergeyLukjanov> last week I finalized the savanna-devstack integration so it can be used in d-g jobs
19:50:08 <SergeyLukjanov> and created some small patches for tempest to add simple savanna api tests
19:50:24 <SergeyLukjanov> here are the CRs for it - https://review.openstack.org/#/q/topic:savanna-tempest,n,z
19:50:42 <SergeyLukjanov> the next step will be to implement api tests for all resources
19:50:51 <SergeyLukjanov> and then to start moving integration tests
19:51:00 <SergeyLukjanov> from savanna repo to tempest scenarios
19:51:22 <SergeyLukjanov> and we'll need to build/publish images for it
19:51:41 <SergeyLukjanov> there are no blockers atm
19:51:54 <jeblair> cool!
19:52:22 <SergeyLukjanov> that's about enabling savanna d-g jobs for exp pipeline in tempest
19:52:22 <jeblair> zaro: what's the most urgent thing you want to discuss?
19:52:30 <SergeyLukjanov> for testing patches
19:52:35 <zaro> upgrade gerrit i guess
19:52:43 <jeblair> #topic Upgrade gerrit (zaro)
19:52:48 <SergeyLukjanov> https://review.openstack.org/#/c/61125/
19:52:50 <zaro> fungi ran a manual upgrade of review-dev from 2.4.x to 2.8.
19:53:03 <fungi> well, it's still in progress there
19:53:05 <jeblair> SergeyLukjanov: i should be able to review that today
19:53:05 <zaro> There was an error during the db schema conversion, which is probably a bug in gerrit not handling some badness in the db correctly. #link http://paste.openstack.org/show/54743
19:53:13 <SergeyLukjanov> jeblair, great, thank you!
19:53:26 <zaro> We will need to debug further.
19:53:41 <zaro> It will take a little effort to debug because i would need to setup mysql and get the review-dev site and db dump from fungi.
19:53:56 <zaro> I suggested that in parallel to debugging the problem we should manually run a quick upgrade test against review to see if we get any errors.
19:54:09 <zaro> I think we'll need to do that anyway.
19:54:15 <fungi> if there is a pressing need to have review-dev back on 2.4.x in the interim, we can revert the upgrade (warfile url change) and i can restore the db from the backup i've been using
19:54:18 <jeblair> zaro, fungi: keep in mind we had horrible id sync scripts running on both servers
19:54:31 <jeblair> zaro, fungi: some cleanup happened to the prod db that may not have happened to the dev one
19:54:35 <mordred> good call. I think testing a review db dump before upgrading is important
19:54:41 <clarkb> ++
19:54:41 <fungi> jeblair: yes, dupes there spring to mind as one possibility
19:54:55 <jeblair> zaro, fungi: so, yeah, testing db dumps from both servers sounds like a very good idea
19:55:15 <fungi> i think there will be multiple tests of each before we're satisfied
19:55:19 <jeblair> zaro, fungi: and also, it's possible we can fix this with further db cleanup.
19:55:25 <clarkb> I think we should also convert the tables to utf8 in this process or at least test the conversion
19:55:29 <jeblair> zaro, fungi: (just something to keep in mind)
19:55:32 <clarkb> since we will have to setup all the machinery anyways
19:55:35 <zaro> jeblair: good idea.  db cleanup.
19:55:47 <clarkb> https://bugs.launchpad.net/openstack-ci/+bug/979227 is the bug tracking that problem
19:55:49 <uvirtbot> Launchpad bug 979227 in openstack-ci "convert gerrit's mysql tables to utf8" [Medium,Triaged]
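[editor's note: a minimal sketch of the utf8 conversion tracked in bug 979227, in the spirit of testing against a db dump first as discussed above; the credentials, db name, and driver choice are hypothetical:]

    # sketch of converting gerrit's tables to utf8 (run on a test copy!)
    import MySQLdb  # assumes the MySQLdb driver is available

    db = MySQLdb.connect(host='localhost', user='gerrit',
                         passwd='secret', db='reviewdb')  # hypothetical
    cur = db.cursor()
    cur.execute("SHOW TABLES")
    for (table,) in cur.fetchall():
        # CONVERT TO rewrites column charsets as well as the table default
        cur.execute("ALTER TABLE `%s` CONVERT TO CHARACTER SET utf8" % table)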
19:56:06 <zaro> ok, nothing else about gerrit upgrade atm.
19:56:14 <mordred> I support the upgrade
19:56:22 <fungi> i'd like to discuss potentially getting zaro's ssh key added to review-dev so i don't have to worry about getting db exports and file archives to him out of band
19:56:49 <zaro> fungi: what?  i like the personal touch.
19:57:05 <fungi> zaro: yeah, i know. i have that way with people
19:57:18 * zaro nods
19:57:22 <jeblair> fungi: it looks like review-dev has completely segregated credentials (ie, not the same as review)
19:57:27 * mordred is now left with the words fungi zaro and personal touch
19:57:40 * fungi nods
19:57:43 <clarkb> jeblair: I think that is true since it replicates to a different github org
19:57:47 <jeblair> fungi: given that, i am in favor.
19:57:49 <mordred> ++
19:57:51 <clarkb> me too
19:58:10 <fungi> okay, zaro we'll get that taken care of to help speed this along
19:58:42 <zaro> cool
19:59:21 <jeblair> zaro: propose a change to infra/config to add your key and add it to the review-dev server, and make sure you read and agree to http://ci.openstack.org/sysadmin.html#ssh-access
19:59:49 <zaro> will do.
20:00:03 <jeblair> that's it; thanks everyone.  we'll get to other topics next week.
20:00:04 <jeblair> #endmeeting