19:01:48 <clarkb> #startmeeting infra
19:01:49 <openstack> Meeting started Tue Jan 28 19:01:48 2014 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:51 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:53 <openstack> The meeting name has been set to 'infra'
19:02:03 * clarkb digs up last week's links
19:02:30 <clarkb> #link http://eavesdrop.openstack.org/meetings/infra/2014/infra.2014-01-21-19.03.html
19:02:55 <clarkb> #topic Actions from last meeting
19:03:14 <clarkb> so last week was a bit crazy and I don't expect many actions were addressed but let's see how we did
19:03:31 <clarkb> reed: has smarcet been hooked up with a mentor for infra things?
19:04:09 <clarkb> I guess reed is grabbing lunch, we can come back to that
19:04:21 <fungi> clarkb: he has
19:04:33 <fungi> mrmartin (marton kiss) is helping smarcet out
19:04:38 <zaro> o/
19:04:39 <clarkb> fungi: awesome.
19:04:58 <clarkb> #action mrmartin work with smarcet to get through infra processes
19:05:26 <clarkb> mordred did investigate manage-projects failures and found that running it by hand is fine, running under puppet is not
19:05:53 <fungi> maddening
19:05:55 <clarkb> currently if you make changes that would normally trigger manage-projects you need to log into the server and run it by hand as a result
19:06:17 <clarkb> I think we should decouple it from puppet and do something simple like have an at job that puppet creates when it would normally trigger manage-projects
19:06:24 <clarkb> but we don't need a solution for that now
19:06:36 <clarkb> #action mordred find a way to run manage-projects automagically without puppet
19:06:41 <fungi> agreed
19:06:51 <clarkb> jenkins.o.o and jenkins01 are still not upgraded
19:06:56 <fungi> nope. bad me
19:07:00 <clarkb> I will take that action from fungi so that he doesn't have all of them
19:07:05 <fungi> thanks!
19:07:25 * fungi already has enough ammunition for self flagellation over here
19:07:34 <clarkb> #action clarkb upgrade jenkins.o.o and jenkins01 to 1.543 and upgrade zmq plugin and scp plugin everywhere
19:07:49 <clarkb> fungi: how is the graphite whisper file move?
19:07:57 <fungi> not yet started ;)
19:08:09 <clarkb> #action fungi move graphite whisper files to faster volume
19:08:24 <clarkb> fungi: same for whisper file pruning?
19:08:25 <fungi> #action fungi prune obsolete whisper files automatically on graphite server
19:09:04 <clarkb> the chef telemetry project rename did not happen as it will apparently break the chef cookbooks until the requester gets back from vacation so we are putting that off until sometime next month
19:09:14 <fungi> right. postponed for now
19:09:37 <clarkb> does anyone know where we are on lifting/removing the virtualenv pin?
19:09:45 * anteaya is sitting in a salt tutorial and would enjoy working with mordred on the manage-projects
19:09:45 <clarkb> I don't recall that happening
19:10:03 <fungi> clarkb: nothing yet, unless there's a pending change i haven't spotted
19:10:08 <clarkb> #action anteaya assist mordred in automating manage-projects again
19:10:25 <clarkb> #action mordred to lift virtualenv 1.10.1 pin when we're ready to babysit it
19:10:45 <clarkb> zaro: where are we on pointing zuul dev at gerrit dev? I think I saw a change for that
19:11:00 <clarkb> zaro: and can we get an update on the gerritbot and jeepyb situation with new gerrit
19:11:01 * zaro checks review
19:11:20 * fungi hasn't reviewed that one yet, sorry :/
19:11:45 <zaro> https://review.openstack.org/#/c/68271
19:11:52 <pleia2> last I saw jeblair had a comment about another update required
19:11:54 <clarkb> #link https://review.openstack.org/#/c/68271
19:11:54 <pleia2> yeah, that
19:11:57 <zaro> didn't realize it got -1, i'll have to update.
19:12:23 <zaro> ok.  so testing review-dev.o.o with gerritbot and jeepyb
19:12:28 <clarkb> #action zaro to point zuul-dev at gerrit-dev
19:12:59 <zaro> verified gerritbot works, no changes required there.
19:13:36 <zaro> jeepyb will require some patches.  will probably submit some today.
19:14:05 <clarkb> zaro: should probably submit bugs as you find things to help track what breaks
19:14:13 <zaro> while testing i also found bug with jeepyb and mysql-python which i believe already got merged.
19:14:32 <clarkb> yup that merged
19:14:37 <zaro> clarkb: yes, will do, just getting it all together.
19:15:08 <zaro> also there will need to be an update to gerritlib due to the fact that gerrit replication command changed
19:15:29 <clarkb> #action zaro to review jeepyb integration with new gerrit and update gerritlib for gerrit 2.8
19:15:35 <zaro> i also noticed that gerrit version command is only available on ver 2.6 and after.
19:16:02 <clarkb> zaro: I think you can assume if that works that you have 2.6 or newer and if it fails you have older
19:16:14 <zaro> so not sure how we can make gerritlib compatible with both ver 2.4.x and ver 2.8
19:16:50 <zaro> i mean how to make the gerritlib replication command compatible
19:17:05 <clarkb> zaro: I think you rely on the presence of the command
19:17:14 <clarkb> and its non presence
19:17:22 <fungi> basically try to do it the new way and if you get a failure try the old way
19:17:25 <zaro> ahh. yeah, that works.
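The fallback fungi and clarkb describe above ("try to do it the new way and if you get a failure try the old way") can be sketched as follows. This is a minimal illustration only, not gerritlib's actual code: `run_with_fallback`, `CommandFailed`, `fake_old_server`, and the command strings `new-replicate` / `old-replicate` are hypothetical placeholders, not real Gerrit commands.

```python
class CommandFailed(Exception):
    """Raised by the (hypothetical) runner when the server rejects a command."""


def run_with_fallback(run, new_cmd, old_cmd):
    """Try new_cmd first; if the server rejects it, retry with old_cmd.

    `run` is any callable that executes a command string and raises
    CommandFailed on error. Returns a (command_used, output) tuple.
    """
    try:
        return new_cmd, run(new_cmd)
    except CommandFailed:
        return old_cmd, run(old_cmd)


def fake_old_server(cmd):
    """Stand-in for an older server that only understands the old command."""
    if cmd.startswith("old-replicate"):
        return "ok"
    raise CommandFailed(cmd)


# Placeholder command strings; the real ones depend on the Gerrit version.
used, output = run_with_fallback(
    fake_old_server, "new-replicate proj", "old-replicate proj"
)
```

One trade-off of this approach is that any failure of the new command, not just "command not found", triggers the old path, so a real implementation would probably want to inspect the error before retrying.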
19:17:33 <clarkb> I think we should move on. lots of stuff on the agenda
19:17:48 <clarkb> #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting
19:17:56 <clarkb> #topic Trove Testing
19:18:16 <clarkb> no mordred or hub_cap or slicknick
19:18:25 <reed> clarkb, he has, all under control
19:19:17 <clarkb> I suppose there isn't much to say about trove testing
19:19:27 <clarkb> #topic TripleO Testing
19:19:37 <clarkb> pleia2: lifeless: I think there is plenty to say about this. Want to fill us in?
19:19:42 <clarkb> fungi: ^
19:20:15 <fungi> there have been a few issues over the past week owing to some sort of neutron failure on the ci cloud
19:20:23 * fungi digs up the bug
19:20:26 <pleia2> right, so last week we made a bunch of progress
19:20:57 <fungi> #link https://launchpad.net/bugs/1271344
19:21:29 <fungi> if we start build/delete looping in nodepool and see ssh timeout errors in the log as the cause, raise the alarm in #tripleo
19:21:34 <fungi> they've been digging into it
19:22:05 <pleia2> I'm also working on getting a fedora jenkins slave up
19:22:06 <fungi> and i think the fedora images are still pending some changes before they'll work
19:22:09 <fungi> that
19:22:12 <clarkb> #info if nodepool gets into a build/delete loop with tripleo cloud raise the alarm in #tripleo
19:22:30 <clarkb> but tests are running nonvoting in the check queue
19:22:36 <clarkb> which is quite the accomplishment
19:22:55 <clarkb> I am excited for the eventual baremetal testing. Will be fun
19:23:06 <pleia2> so we're chipping away at the issues as they come down the pipeline :)
19:23:09 <fungi> right. however when we can't build nodes, we will see changes for tripleo projects piling up in the check pipeline waiting on deploy jobs
19:23:27 <fungi> so that's a good early warning for the current bug
19:24:41 <fungi> probably not a lot else exciting to report on this topic for the moment
19:24:58 <clarkb> ok really happy with how it is coming together
19:25:05 <clarkb> #topic Chef Stackforge project rename
19:25:37 <clarkb> we covered this at the beginning of the meeting but for completeness we postponed again as the project rename will break other cookbooks and the rename requestor needs to fix them but is on vacation currently
19:25:50 <clarkb> #info postponed again as requester isn't around to update the name everywhere
19:25:51 <fungi> as discussed, this is postponed. we can keep it on the agenda or just remove it and wait for the owner to let us know later
19:26:06 <fungi> meh, you said that ;)
19:26:08 <clarkb> #topic ongoing new project creation issues
19:26:23 <clarkb> Mordred is driving up into the hills
19:26:34 <fungi> mordred of the hill people
19:26:46 <clarkb> we seem to have determined that manage-projects works fine when not run by puppet
19:27:20 <clarkb> we can either figure out why puppet is special (probably related to how exec works and it runs in a funny env) or stop running it with puppet completely
19:27:35 * ttx lurks
19:27:43 <clarkb> I like not running it with puppet completely. My simple suggestion is to have puppet submit an at job to run it a few minutes after the puppet run
19:27:48 <clarkb> and use a lock file
19:27:59 <clarkb> but we can possibly also use salt or cron or $otherthing
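The lock file part of clarkb's suggestion could look roughly like the sketch below, which makes sure only one deferred run executes at a time. It is an assumption-laden illustration, not what infra deployed: `run_exclusively`, the lock file name, and the lambda standing in for manage-projects are hypothetical, and `fcntl.flock` is Unix-only.

```python
import fcntl
import os
import tempfile


def run_exclusively(lock_path, task):
    """Run task() only if no one else holds the lock on lock_path.

    Returns (True, task_result) if the task ran, or (False, None) if
    another holder already has the lock and this run was skipped.
    """
    with open(lock_path, "w") as lock_file:
        try:
            # Non-blocking exclusive lock: fail immediately if already held.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return False, None
        try:
            return True, task()
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


# Hypothetical lock location; the lambda stands in for the real job.
lock_path = os.path.join(tempfile.mkdtemp(), "manage-projects.lock")
ran, result = run_exclusively(lock_path, lambda: "done")
```

Skipping rather than waiting when the lock is held suits a job that is re-queued on every triggering change, since a later run will pick up anything the skipped one would have done.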
19:28:14 <clarkb> but mordred isn't here so moving on
19:28:22 <clarkb> #topic Pip 1.5 readiness
19:28:47 <pleia2> oh, I think this was supposed to be rm-ed from the agenda
19:29:14 <clarkb> are we using pip 1.5 yet?
19:29:33 <fungi> i believe that's entwined with the virtualenv cap removal
19:29:47 <fungi> which mordred already has an action item for
19:29:52 <clarkb> gotcha, so we can move on
19:30:07 <clarkb> #topic Upgrade Gerrit
19:30:16 <clarkb> zaro: is there anything you want to add to what you had before?
19:30:34 <zaro> just a question..
19:30:42 <zaro> what else should we test?
19:30:58 <zaro> gerritbot, jeepyb, ??
19:31:27 <clarkb> zaro: good question
19:31:46 * fungi ponders
19:31:54 <clarkb> the ATC collector scripts
19:32:11 <zaro> what is that?
19:32:14 <clarkb> pretty sure they page through gerrit data via the CLI
19:32:29 <clarkb> zaro: the scripts that find out who can go to the summit for free. They are in openstack-infra/config/tools
19:32:35 <fungi> clarkb: yeah, though that's really just a mysql query documented in the readme in tools/atc (within the config repo)
19:32:54 <fungi> and then gerrit ssh queries for the review info
19:33:07 <fungi> (in the python script in that same dir)
19:33:25 <fungi> should be pretty easy to confirm
19:33:33 <zaro> anything else?
19:33:44 <fungi> zaro: didn't you say something is also breaking the bug updating hook?
19:34:03 <zaro> yeah, that's in jeepyb
19:34:09 <zaro> update_bug.py
19:34:35 <zaro> working on the patch for that now.
19:34:41 <fungi> k
19:34:54 <fungi> reviewday?
19:34:58 <fungi> openstackwatch?
19:35:05 <clarkb> fungi: reviewday definitely
19:35:18 <zaro> not the same thing?
19:35:35 <fungi> those are different things
19:35:42 <zaro> ok.  will take a look
19:36:07 <clarkb> and of course zuul, but that is listed :)
19:36:19 <clarkb> and I am pretty sure hashar is using zuul with new gerrit so don't expect many problems
19:36:33 <clarkb> anything else?
19:36:33 <fungi> does bugday or releasestatus do anything with the gerrit api?
19:36:46 <clarkb> ttx: ^
19:37:05 <ttx> hmm, release status yes
19:37:21 <ttx> hrm. if by API you mean SSH
19:37:45 <fungi> ttx: yeah, we've already found at least one backward-incompatible change in the gerrit ssh api
19:37:50 <clarkb> #info things that need testing with new gerrit: reviewday, openstackwatch, releasestatus, zuul, update_bug (jeepyb)
19:37:50 <fungi> also we have elastic-recheck and recheckwatch both potentially consuming info from gerrit
19:38:09 <fungi> and updating it
19:38:22 <clarkb> elastic-recheck I expect to be ok, but we should add that to the list too
19:38:24 <ttx> release status is not on the critical path though. update_bug and update_blueprint a bit more
19:38:40 <clarkb> #info also elastic-recheck and recheckwatch
19:39:17 <fungi> nothing else springs to mind
19:39:41 <clarkb> #topic Private Gerrit for Security Review
19:39:51 <zaro> got it. test openstack infra universe.
19:40:00 <clarkb> zaro: :)
19:40:15 <clarkb> My understanding was that we wanted gerrit 2.8 then a private gerrit
19:40:25 <clarkb> does the work above facilitate private gerrit then?
19:40:30 <zaro> nothing new to report. postponing for gerrit 2.8 first
19:40:46 <clarkb> #info Need gerrit 2.8 up and running before deploying private gerrit
19:40:56 <clarkb> maybe we should take this off of the agenda for now then?
19:41:00 <fungi> yep. only so many irons in one fire
19:41:09 <clarkb> #topic Jenkins SCP Plugin fixes
19:41:15 <zaro> ok. i'll remove it.
19:41:31 <clarkb> zaro fixed the SCP plugin then he had to fix it slightly differently :)
19:41:35 <zaro> i understand there's another issue with upgrade?
19:41:50 <fungi> is there?
19:41:51 <clarkb> zaro: another issue with upgrade? with the scp plugin?
19:42:11 <zaro> jeblair says that upgrading to latest doesn't keep old configs?
19:42:26 <zaro> or doesn't keep them correctly or something like that.
19:42:39 <clarkb> zaro: are you talking about the SCP plugin?
19:42:43 <zaro> yes.
19:42:59 <zaro> scrollback from yesterday morning
19:43:05 <clarkb> I don't know what configs there are then. They are all in the project config
19:43:10 * fungi vaguely recalls discussion about something changing in the xml maybe
19:43:48 <zaro> i was going to look into it today, but looks like it may need to wait until gerrit universe is tested
19:44:06 <clarkb> ok found it in scrollback
19:44:33 <clarkb> that is an unrelated problem and has to do with how upstream did a bunch of refactoring in the plugin then stopped work and we just added to it
19:44:37 <clarkb> back to the fixes at hand
19:45:02 <zaro> clarkb: does that mean we don't need to do anything about it?
19:45:18 <clarkb> zaro: no, it means we don't have to do anything about it today :) we will need to fix that before we can release a new version
19:45:40 <clarkb> jenkins04-07 have the latest version of the plugin. jenkins.o.o and jenkins01 have n-2 version and 02 and 03 have n-1
19:46:11 <clarkb> I would like to restart 02 and 03 possibly today to get the latest version then do jenkins.o.o and jenkins01 when I upgrade the jenkins versions there
19:46:21 <clarkb> hopefully by the end of the week all of the jenkinses will be consistent
19:46:51 <fungi> sounds great
19:47:07 <fungi> also, regular jenkins restarts seem to keep the plumbing free of rodents
19:47:19 <clarkb> #info SCP file copying bugs assumed fixed. Upgrade from previous release to master not fixed (xml config isn't handled nicely)
19:47:23 <clarkb> fungi: ha
19:47:39 <zaro> fungi: are we doing that now?
19:47:57 <clarkb> zaro: anything else you want to add about the SCP plugin?
19:48:07 <zaro> nope all done here
19:48:19 <fungi> zaro: not intentionally, but things like yesterday morning seem to crop up more often when jenkins masters run for a little while under heavy use without restarts
19:48:47 <fungi> or i could just be imagining it
19:48:51 <clarkb> #topic removing openstack-ci-admin ML from LP
19:48:59 <fungi> ahh, that was me
19:49:16 <fungi> just noticed we get the occasional e-mails to that list from people asking for help
19:50:09 <fungi> when they should probably open bugs, mail the proper infra ml or find us in irc
19:50:33 <clarkb> remind me, if we kill it the archives stay there?
19:50:45 <fungi> ahh, i think they can
19:50:52 <clarkb> I think we should remove it to make it simpler for people to find better contact locations
19:51:04 <clarkb> but we should consider the data too
19:51:17 <fungi> was looking for consensus that it was okay to remove that mailing list from lp, but without jeblair and mordred here we can revisit it next week. not urgent
19:51:31 <clarkb> ok
19:51:37 <clarkb> #topic Savanna testing
19:51:42 <clarkb> SergeyLukjanov: still around?
19:51:45 <SergeyLukjanov> yup
19:51:48 <SergeyLukjanov> o/
19:51:55 <clarkb> go for it
19:52:01 <SergeyLukjanov> the first patch should be landed to tempest soon
19:52:10 <SergeyLukjanov> I mean the first patch with real tests ;)
19:52:20 <SergeyLukjanov> so, I'd like to setup async gate for savanna
19:52:36 <clarkb> cool
19:52:37 <SergeyLukjanov> non-voting for tempest's and devstack's check pipelines
19:52:42 <SergeyLukjanov> and voting for savanna
19:52:49 <clarkb> SergeyLukjanov: ya devananda just did something similar for ironic
19:52:54 <clarkb> SergeyLukjanov: so ++ from me
19:53:10 <SergeyLukjanov> clarkb, yup, I saw it and really like the approach
19:53:28 <SergeyLukjanov> it will guarantee that it'll be easy to make it sync
19:53:36 <clarkb> SergeyLukjanov: are there changes proposed to config yet?
19:53:48 <SergeyLukjanov> not yet, will propose them today
19:53:55 <clarkb> sounds good
19:53:59 <SergeyLukjanov> additionally I'm planning to start working on dib elements testing
19:54:04 <SergeyLukjanov> savanna dib elements
19:54:21 <SergeyLukjanov> but still don't have enough time to make a proof of concept
19:54:28 <SergeyLukjanov> heh, to make working one ;)
19:54:36 <SergeyLukjanov> so, looks like that's all from my side
19:55:01 <clarkb> ok let us know if you need anything from our end. also I believe the tripleo folks are looking into that sort of thing too
19:55:02 <fungi> SergeyLukjanov: using something similar to how tripleo tests their dib elements?
19:55:05 <clarkb> they might have feedback
19:55:08 <clarkb> fungi: :)
19:55:23 * fungi hasn't a unique thought in his head, clearly
19:55:42 <clarkb> fungi: it just means neither of us is crazy
19:55:50 <SergeyLukjanov> I currently have a problem with defining the approach of how it should be tested for savanna
19:55:57 <fungi> or we're *both* crazy ;)
19:56:20 <SergeyLukjanov> heh, folks, probably I'm crazy too
19:56:24 <SergeyLukjanov> where is my coffee??
19:56:41 <clarkb> ha
19:56:43 <SergeyLukjanov> btw is it already possible to publish our images to tarballs.o.o?
19:56:56 <clarkb> SergeyLukjanov: that is one of the things tripleo is trying to sort out :)
19:57:05 <clarkb> SergeyLukjanov: personally I almost feel like glance would be better
19:57:10 <SergeyLukjanov> ok, looks like I've missed it
19:57:11 <clarkb> but that means running a glance
19:57:35 <clarkb> SergeyLukjanov: maybe start a thread on the infra list and we can get tripleo people to look at it too
19:57:36 <SergeyLukjanov> afaik we are already running trove?
19:57:48 <clarkb> so that everyone is happy with where build artifact images end up
19:57:54 <SergeyLukjanov> clarkb, yup, nice idea
19:57:58 <SergeyLukjanov> I'll do it
19:58:38 <clarkb> great. I am going to give everyone a minute or two for random things
19:58:40 <fungi> i believe we at least set aside extra space for images when tarballs were relocated
19:58:51 <clarkb> #topic Open Discussion
19:58:54 <clarkb> fungi: good to know
19:58:57 <SergeyLukjanov> fungi, awesome ;)
19:59:18 <fungi> check the disk space graphs in cacti for confirmation
19:59:21 <clarkb> I did want to point out that dims patched the zmq event publisher plugin so I will try getting that update to all of the masters too
19:59:36 <clarkb> but that is less important than the SCP plugin update
19:59:52 <lifeless> clarkb: tonnes to say, was asleep.
19:59:56 <fungi> clarkb: excellent. between node and master metadata, we should be able to hunt down infra issue impact easier
19:59:57 <zaro> zmq is all setup to auto deploy to jenkins repo.
19:59:58 <lifeless> tripleo testing is ALIIIIIVE
20:00:15 <clarkb> zaro: yup once I test that plugin I can tag a release
20:00:18 <dims> clarkb, +1
20:00:36 <clarkb> and I think we are at time. Thank you everyone
20:00:38 <clarkb> #endmeeting