19:02:12 <fungi> #startmeeting infra
19:02:14 <openstack> Meeting started Tue Jan 14 19:02:12 2014 UTC and is due to finish in 60 minutes.  The chair is fungi. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:02:15 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:02:17 <openstack> The meeting name has been set to 'infra'
19:02:23 <reed> o/
19:02:27 <fungi> #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:03:07 <fungi> i took the liberty of throwing a few more items on the agenda at the last minute, mainly stuff which has cropped up that i'm either looking into or failing to find bandwidth to address
19:03:30 <fungi> we may not get to everything mentioned there, but we'll see how far the hour takes us
19:03:41 <fungi> #topic Actions from last meeting
19:03:46 <fungi> there were none
19:03:53 <mordred> yay!
19:03:54 <mordred> we win
19:03:57 <fungi> it was also a very short meeting
19:04:11 <fungi> #topic Trove testing (mordred, hub_cap, SlickNik)
19:04:28 <mordred> I have no status
19:04:34 <fungi> no exciting news?
19:04:37 <mordred> and hub_cap and SlickNik are not here
19:04:47 <mordred> well, I've done no real work in a month, so expect very little from me
19:04:47 <fungi> SlickNik was in here last week and said he's working on it
19:04:51 <mordred> awesome
19:05:05 <fungi> so we'll just assume that's still the case and move along
19:05:15 <fungi> #topic Tripleo testing (lifeless, pleia2, fungi)
19:05:26 <fungi> we apparently have a new tripleo cloud?
19:05:30 <pleia2> the new cloud is up for this, fungi added the new info in a review
19:05:37 <fungi> lifeless and i were discussing it last night
19:05:45 <fungi> and again just now
19:05:51 <pleia2> :)
19:06:02 <fungi> i need to test the credentials by creating a floating-ip for the controller i guess?
19:06:04 <pleia2> I have to schedule a meeting with dprince and derekh to chat about progress otherwise
19:06:22 <fungi> i'll get the details later on what should happen from my end
19:06:30 <pleia2> thanks
19:06:41 <pleia2> that's it from me, holidays + LCA has put me a bit behind
19:06:42 <fungi> the config change for nodepool is at...
19:06:46 <fungi> #link https://review.openstack.org/66491
19:07:05 <fungi> i've updated the creds in hiera to what i think they're supposed to be now
19:07:27 <fungi> the old poc cloud going down exposed a nodepool bug for us too
19:07:27 * mordred is excited
19:07:44 <fungi> #link https://launchpad.net/bugs/1269001
19:07:46 <uvirtbot> Launchpad bug 1269001 in nodepool "Nodepool stops building any new nodes when one provider is down" [High,Triaged]
19:07:54 <pleia2> ouch
19:08:04 <fungi> it resulted in the backup you see in the gate currently
19:08:06 <sdague> yeh, that's why we have a huge gate queue this morning
19:08:11 <fungi> the test nodes graph on the status page is fun
19:08:36 <sdague> we had 250 jobs in the check queue, at least that is trending down
19:09:01 <fungi> i said in the bug i'd make a patch, and then started to dig into the nodepool source, and then was drawn and quartered by other things which cropped up, so if anyone else wants that bug, it's probably not too hard
19:09:25 <fungi> otherwise i hope to get to it before it bites us again
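(A minimal sketch of the idea behind a fix for that bug, not the real nodepool internals — Provider and launch_nodes are made-up names: isolate each provider's API calls so one unreachable cloud gets logged and skipped instead of stalling node allocation for everyone.)

```python
# Hedged sketch only -- Provider/launch_nodes are hypothetical names,
# not the actual nodepool API.
import logging

log = logging.getLogger("nodepool-sketch")

class Provider:
    """Stand-in for a cloud provider handle."""
    def __init__(self, name, client):
        self.name = name
        self.client = client

    def launch_nodes(self, count):
        # Any call to a dead cloud may raise or time out.
        return self.client.boot_nodes(count)

def launch_ready_nodes(providers, demand):
    """Try to satisfy demand, skipping providers that error out."""
    launched = 0
    for provider in providers:
        if launched >= demand:
            break
        try:
            launched += provider.launch_nodes(demand - launched)
        except Exception:
            # One broken provider should not block the others.
            log.exception("provider %s failed; continuing", provider.name)
    return launched
```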
19:10:03 <fungi> anything else on tripleo testing before i move on?
19:10:25 <fungi> #action fungi test new tripleo ci cloud account credentials
19:10:34 <fungi> #topic Savanna testing (SergeyLukjanov)
19:10:46 <SergeyLukjanov> nothing really interesting this week either
19:10:55 <SergeyLukjanov> changes are under review in tempest
19:11:01 <fungi> okay, want to keep it on the agenda for next week still?
19:11:12 <SergeyLukjanov> basic integration has been already merged in
19:11:27 <fungi> awesome
19:11:30 <SergeyLukjanov> fungi, probably we could move it to the end of agenda
19:11:36 <fungi> okay, will do
19:12:09 <fungi> #topic Zuul release (2.0?) / packaging (jeblair, pabelanger)
19:12:11 <SergeyLukjanov> I don't think that we'll have enough questions in the near future to need a separate section in the meeting
19:12:20 <SergeyLukjanov> ^^ about savanna testing
19:12:40 <fungi> #link http://git.openstack.org/cgit/openstack-infra/zuul/tag/?id=2.0.0
19:12:47 <fungi> i guess that happened
19:13:01 <fungi> weeks ago
19:13:22 <fungi> has there been any fallout from it which bears discussing, or should it come off the agenda?
19:13:32 <fungi> i'm thinking the latter
19:13:42 * mordred votes the latter
19:14:15 <fungi> after more than a month, any bugs should be addressed as, well, bugs
19:14:29 <fungi> #topic Jenkins 1.543 upgrade (zaro, clarkb)
19:15:13 <fungi> i believe the main news here is that jenkins.o.o and jenkins01 still need an upgrade to match 02-04, but sdague has spotted some missing logs which clarkb thinks may be a locking/sync problem in the scp plugin
19:15:42 <sdague> yeh, we're losing console logs an alarming amount of the time
19:15:49 <sdague> 5 - 10% by what I'm seeing
19:16:02 <sdague> which explains why elastic recheck has been missing a lot of things
19:16:06 <mordred> can we use turbo-hipster yet?
19:16:37 <fungi> current guess based on the logstash client logs is that we're racing and requesting the console log before it's available, so we get a 404
19:17:20 <fungi> and that this is probably the upshot of the threading fix which was made to the scp plugin to work properly on newer jenkins
19:17:35 <fungi> which coincides with when we think this behavior began
19:17:44 <mordred> seems sensible to me
19:18:10 <sdague> yeh, fixing that is somewhat of a blocker for some of the ER work, because if ES isn't a reliable source of truth, a lot of the numbers have no meaning
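(A hedged sketch of the sort of workaround being described, not the actual log-processing client code — the URL handling and retry timings are assumptions: retry the console-log fetch a few times so a 404 from that race window isn't treated as a permanent miss.)

```python
# Hedged sketch: retry a console-log fetch before giving up on a 404,
# since the scp plugin may not have copied the log yet.
import time
import urllib.request
from urllib.error import HTTPError

def fetch_console_log(url, retries=5, delay=10):
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(url) as resp:
                return resp.read()
        except HTTPError as err:
            if err.code != 404 or attempt == retries - 1:
                raise
            time.sleep(delay)  # log may simply not have landed yet
```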
19:18:16 <fungi> zaro: you were wanting to discuss it in detail with clarkb before digging into it further, you said
19:19:30 <fungi> #action zaro discuss potential scp plugin race with clarkb
19:20:14 <fungi> #action fungi upgrade jenkins.o.o and jenkins01 to match 02-04
19:20:28 <fungi> if somebody else beats me to that, i won't complain
19:20:41 <fungi> #topic Requested StackForge project rename (fungi, clarkb, zhiwei)
19:20:55 <fungi> #link http://lists.openstack.org/pipermail/openstack-infra/2014-January/000594.html
19:21:18 <fungi> stackforge/cookbook-openstack-metering wants to rename to stackforge/cookbook-openstack-telemetry
19:21:52 <fungi> apparently using official terms instead of codenames for openstack projects doesn't keep you from having to rename things
19:21:57 <mordred> :)
19:22:32 <fungi> i'm willing to do this on saturday (the 18th) and clarkb said he expected to be around that day if i ran into major issues
19:23:15 <fungi> i'll tentatively set this for 19:00 utc, but i'll nail down a time when he's around
19:23:36 <fungi> #action fungi rename stackforge/cookbook-openstack-metering to -telemetry
19:23:55 <fungi> #topic Ongoing new project creation issues (fungi, clarkb)
19:24:05 * mordred was just reading the latest on that
19:24:10 <fungi> manage-projects is apparently still broken
19:24:26 <mordred> didn't we make it so that it would error out if group creation didn't work
19:24:33 <mordred> so that at least you could just re-run it over and over again/
19:24:35 <mordred> ?
19:24:39 <fungi> i tried two more new project creations as guinea pigs yesterday and got the same behavior we'd been seeing previously
19:24:46 * mordred is sad
19:25:02 <fungi> i even tried it on one project which was reusing an existing acl, thus no group creation required
19:25:35 <fungi> something prevented it from getting as far as cloning the upstream repo, yet it created an empty project in gerrit and then we got broken mirrors everywhere for it
19:25:56 <mordred> oh god
19:26:00 <mordred> wth?
19:26:00 <fungi> i think next we should run it manually without letting puppet try to run it first, since when i rerun it, everything seems fine
19:26:27 <mordred> yeah. maybe we should just, for the time being, run it manually from time to time
19:26:41 <mordred> since that's probably less work than fixing the broken runs
19:26:50 <fungi> anyway, new project requests are piling up, most have -2 votes on them waiting on this to get working
19:26:52 <mordred> and then once we've figured out what's wrong, we can re-enable the puppet trigger
19:27:05 <anteaya> o/
19:27:47 <fungi> mordred: i've been reviewing most of the new project requests even in light of their -2 condition, trying to get them in shape anyway. if you want to look at them and try manual manage-projects runs on them, i won't object
19:28:05 <mordred> fungi: k.
19:28:18 <mordred> that would require me having +2 access and ssh access again
19:28:20 <fungi> though i will admit, my review backlog the past few weeks has been abominable
19:28:26 <fungi> mordred: yes
19:28:45 <fungi> #action fungi get mordred's gerrit group membership reinstated
19:28:56 <fungi> thanks for the reminder ;)
19:29:11 <fungi> #action mordred look at the current state of manage-projects failures
19:29:35 <fungi> #topic Pip 1.5 readiness efforts (fungi, mordred)
19:30:13 <fungi> at this point most stuff is okay, but requirements integration is broken still
19:30:48 <fungi> we have four known global requirements which pip 1.5 will not download without explicit --allow-external --allow-insecure whitelisting
19:31:14 <fungi> i have a change proposed to do that...
19:31:24 <fungi> #link https://review.openstack.org/66364
19:32:03 <fungi> however, it's now hitting an issue with pip 1.5's refusal to follow -f urls in requirements by default
19:32:22 <fungi> apparently we have projects consuming oslo.messaging even though it's never been released to pypi
19:32:37 <mordred> fungi: how about we land a change to run-mirror with the allow-insecure flag turned on to allow -f
19:32:39 <fungi> i did get around to reserving it on pypi yesterday at least, so nobody else can squat it
19:32:47 <mordred> fungi: then we land the changes to things to remove their -f
19:32:56 <mordred> then we land a change to remove the allow-insecure
19:33:23 <fungi> mordred: i do want to try that next, however i also want to make sure it's not going to result in us pulling those things into our actual mirror (which pypi-mirror also builds/updates)
19:34:08 <fungi> and the command-line flag to allow it only first appeared in pip 1.5, the same release which needs it, so if we want it to be able to run on <1.5 we need to pass it as an envvar instead
19:34:11 <mordred> no more so than it would have before
19:34:24 <mordred> I do not think we care about being able to run run-mirror on pip <1.5
19:34:39 * mordred strongly does not care
19:35:02 <fungi> mordred: well, that puts us in a bit of a chicken-and-egg situation while transitioning, since we have pinned our slaves to older virtualenv
19:35:20 <mordred> hrm. wait
19:35:33 <mordred> why?
19:35:38 <fungi> so we need it to work with virtualenv 1.10.1/pip 1.4.1 long enough to switch the mirror updater
19:35:45 <mordred> run-mirror upgrades pip in the venv it creates as one of its first steps
19:36:00 <mordred> so the pip that comes with the venv should not matter
19:36:14 <fungi> ahh, so we should already be failing this way on the mirror slaves the same way we're failing in the requirements integration jobs?
19:36:25 <mordred> yup
19:36:46 <mordred> run-mirror itself creates and operates inside of venvs to protect against bonghits
19:36:47 <fungi> good to know. in that case maybe i just try the cli option and see how far it gets us on that existing patch
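(For anyone following along, a hedged sketch of what that whitelisting amounts to — the package names here are placeholders, not the four real global requirements: pip 1.5 needs each externally hosted package allowed by name, once to follow off-PyPI links and once to accept unverifiable downloads.)

```python
# Hedged sketch with placeholder package names; pip 1.5 requires
# per-package --allow-external / --allow-insecure whitelisting.
import subprocess

EXTERNAL_PACKAGES = ["example-pkg-one", "example-pkg-two"]  # placeholders

def pip_install(requirements_file):
    cmd = ["pip", "install", "-r", requirements_file]
    for name in EXTERNAL_PACKAGES:
        cmd += ["--allow-external", name, "--allow-insecure", name]
    subprocess.check_call(cmd)
```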
19:37:20 <fungi> anyway, weeds
19:37:43 <fungi> anybody have anything else on new pip goings on before i move to the next topic?
19:38:13 <fungi> i'll link the tracking bug and etherpad...
19:38:28 <fungi> #link https://launchpad.net/bugs/1267364
19:38:30 <uvirtbot> Launchpad bug 1267364 in openstack-ci "Recurrent jenkins slave agent failures" [Critical,In progress]
19:38:39 <zaro> o/
19:38:48 <fungi> er, wrong bug
19:39:19 <fungi> #link https://launchpad.net/bugs/1266513
19:39:21 <uvirtbot> Launchpad bug 1266513 in tripleo "Some Python requirements are not hosted on PyPI" [Critical,In progress]
19:39:45 <fungi> #link https://etherpad.openstack.org/p/pip1.5Upgrade
19:40:08 <fungi> #topic OpenID provider project (fungi, reed)
19:40:45 <mordred> mmm. openid
19:40:45 <fungi> smarcet has been working on the php end of things for this and got some of the initial redis puppet module written, which it's using
19:40:55 <reed> mordred, openid is yummy
19:41:00 <fungi> my next phase of the deployment automation is awaiting review...
19:41:06 <fungi> #link https://review.openstack.org/63316
19:41:37 <fungi> i'm a bit swamped and it needs someone to take up the mantle of adding the project-specific deployment steps on top of that
19:42:07 <reed> I will try to identify roadblocks with smarcet and try to recruit a mentor for him that is not swamped
19:42:12 <fungi> i have details from smarcet on what commands need to be run to deploy it
19:42:37 <fungi> i just have been doing a horrible job of finding time to help with next steps
19:42:42 <reed> if meanwhile we can merge 63316 that'd be great
19:42:56 <reed> fungi, you've gone already above and beyond, thank you
19:43:21 <fungi> looks like jeblair reviewed it on sunday, so i may just go ahead and merge that change so we can pick up some momentum
19:43:30 <mordred> ++
19:43:54 <reed> #action reed to talk to smarcet and find a mentor to help him get through the CI learning curve faster
19:44:00 <fungi> but definitely, anyone who finds this exciting is more than welcome to pitch in. i find it exciting, just very busy already
19:44:39 <fungi> anyway, trying to get through the meeting agenda, so moving on...
19:44:44 <fungi> #topic Graphite cleanup (fungi)
19:45:02 <fungi> the graphite server is spending a *lot* of time (an entire cpu pegged) in iowait
19:45:29 <mordred> oh, well that's not great
19:45:34 <fungi> the load also seems to be causing it to fail to generate and serve graphs
19:45:51 <fungi> i think we probably need to look at a faster cinder volume for the whisper files (ssd backed media)
19:46:24 <fungi> it's also running out of disk space. the whisper files are fixed size, but we add more and more metrics (new job names, et cetera)
19:46:58 <fungi> i discussed with jeblair and he's on board with autodeleting any whisper files which haven't received an update in 2 weeks or maybe a month
19:47:45 <fungi> it needs someone to look into it, but i'll throw myself on the action item for now as a placeholder and just assume i won't get a chance to look at it between now and the next meeting
19:48:24 <fungi> #action fungi move graphite whisper files to faster volume
19:48:38 <fungi> #action fungi prune obsolete whisper files automatically on graphite server
19:49:01 <fungi> probably best done in the opposite order, so there are fewer files to rsync
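(A minimal sketch of the pruning idea, assuming the stock whisper storage path and using file mtime as a stand-in for "hasn't received an update"; the real job would presumably be driven from puppet or cron rather than run by hand.)

```python
# Hedged sketch: remove whisper files not written to in roughly a month,
# then clean up any metric directories left empty.
import os
import time

WHISPER_ROOT = "/var/lib/graphite/storage/whisper"  # assumed path
MAX_AGE = 30 * 86400  # about a month; two weeks was also floated

def prune_stale_whisper(root=WHISPER_ROOT, max_age=MAX_AGE):
    cutoff = time.time() - max_age
    for dirpath, _dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            if name.endswith(".wsp"):
                path = os.path.join(dirpath, name)
                if os.path.getmtime(path) < cutoff:
                    os.remove(path)
        if dirpath != root and not os.listdir(dirpath):
            os.rmdir(dirpath)
```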
19:49:19 <fungi> #topic Maven clouddoc plugin move (zaro, mordred)
19:49:32 <mordred> ugh. what did I do now?
19:49:43 <zaro> so i don't think there's anything more to do on this.
19:49:48 <fungi> i think your name is a historical artifact on there, mordred
19:49:57 <zaro> looks like dcramer is doing the release manually
19:50:14 <zaro> using the maven release plugin
19:50:23 <fungi> i agreed to do the bits mentioned at the end of this review, but haven't found the time (maven nexus org setup stuff)
19:50:30 <fungi> #link https://review.openstack.org/46099
19:51:30 <zaro> that one is not needed unless this one is approved.. https://review.openstack.org/#/c/58349
19:51:56 <fungi> ahh, good to know
19:52:22 <fungi> #action fungi request org.openstack group in sonatype jira for maven nexus
19:52:42 <fungi> i'll likely just defer that until 58349 gets traction in that case
19:52:48 <fungi> thanks zaro!
19:52:54 <zaro> right now dcramer is doing the releases manually bypassing what the openstack CI wants to do
19:53:57 <fungi> #topic Private gerrit for security reviews (zaro, fungi)
19:54:08 <zaro> I'm guessing we can just ignore until ann or dcramer needs something.
19:54:20 <fungi> pretty sure this has ended up on the back burner until the gerrit upgrade is further along
19:54:21 <SergeyLukjanov> fungi, I could probably help you with sonatype if you want, I have some groups there
19:54:39 <zaro> private gerrit..  i think we should wait until after gerrit 2.8+
19:54:43 <mordred> ++
19:54:44 <fungi> SergeyLukjanov: when that task wakes back up, i'll try to remember to ping you for suggestions. thanks!
19:54:46 <zaro> almost there.
19:54:58 <mordred> zaro: I completely agree- it seems bananas to add a gerrit that we'll need to upgrade
19:55:17 <fungi> so that takes us to...
19:55:22 <fungi> #topic Upgrade gerrit (zaro)
19:55:41 <mordred> upgrade upgrade!!!
19:55:44 <zaro> well.. gerrit 2.8 is on review.o.o
19:55:55 <fungi> and seems to be increasingly usable
19:55:55 <zaro> just baking in i guess..
19:56:16 <mordred> you mean review-dev.o.o ?
19:56:17 <zaro> i believe all questions have been answered on etherpad: https://etherpad.openstack.org/p/gerrit-2.8-upgrade
19:56:17 <fungi> i think not enough of us have been around to test things we want to make sure didn't break on it
19:56:40 <mordred> AaronGr: ^^ don't know if you've been tracking this one
19:56:59 <zaro> the next thing i was gonna do was to create a script to semi-automate the upgrade
19:57:06 <fungi> what sort of schedule is google looking at for gerrit 2.9, any idea?
19:57:09 <AaronGr> mordred: i haven't been, no.
19:57:19 <SergeyLukjanov> new review screen looks quite overloaded
19:57:39 <zaro> the idea is to semi-automate 1st upgrade to 2.8 since it's a troublesome process. then automate 2.8 to next releases via puppet.
19:57:49 <mordred> I think that's great
19:58:00 <fungi> apparently 2.9 is taking away the old review screen view entirely, so the sooner we prepare to make the new one usable (upstream patches, whatever) the better on that
19:58:02 <zaro> SergeyLukjanov: not turned on in review-dev.o.o
19:58:05 <mordred> SergeyLukjanov: I think we're planning on having the old screen on by default to start with, yeah?
19:58:13 <SergeyLukjanov> here is a topic about 2.9 release https://groups.google.com/d/topic/repo-discuss/rAmliEzSsko/discussion
19:58:45 <fungi> #link https://groups.google.com/d/topic/repo-discuss/rAmliEzSsko/discussion
19:58:55 <SergeyLukjanov> mordred, AFAIK it'll be removed in next gerrit releases and I saw a CR to enable new screen for review.o.o while upgrading gerrit
19:59:33 <SergeyLukjanov> zaro, I know, I've set up an instance for myself and was surprised :(
19:59:35 <fungi> 2.9-rc0 early this week according to mfick
19:59:37 <zaro> fungi: i believe google is targeting march for 2.9 release?
20:00:20 <SergeyLukjanov> according to the first message in https://groups.google.com/forum/#!topic/repo-discuss/rAmliEzSsko master will delete the old change screen code
20:00:21 <fungi> okay, we're over time
20:00:31 <fungi> need to get the tc meeting going
20:00:39 <fungi> thanks everybody!
20:00:45 <fungi> #endmeeting