19:01:48 #startmeeting infra
19:01:49 Meeting started Tue Jan 28 19:01:48 2014 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:51 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:53 The meeting name has been set to 'infra'
19:02:03 * clarkb digs up last week's links
19:02:30 #link http://eavesdrop.openstack.org/meetings/infra/2014/infra.2014-01-21-19.03.html
19:02:55 #topic Actions from last meeting
19:03:14 so last week was a bit crazy and I don't expect many actions were addressed but let's see how we did
19:03:31 reed: has smarcet been hooked up with a mentor for infra things?
19:04:09 I guess reed is grabbing lunch, we can come back to that
19:04:21 clarkb: he has
19:04:33 mrmartin (marton kiss) is helping smarcet out
19:04:38 o/
19:04:39 fungi: awesome.
19:04:58 #action mrmartin work with smarcet to get through infra processes
19:05:26 mordred did investigate manage-projects failures and found that running it by hand is fine, running under puppet is not
19:05:53 maddening
19:05:55 currently if you make changes that would normally trigger manage-projects you need to log into the server and run it by hand as a result
19:06:17 I think we should decouple it from puppet and do something simple like have an at job that puppet creates when it would normally trigger manage-projects
19:06:24 but we don't need a solution for that now
19:06:36 #action mordred find a way to run manage-projects automagically without puppet
19:06:41 agreed
19:06:51 jenkins.o.o and jenkins01 are still not upgraded
19:06:56 nope. bad me
19:07:00 I will take that action from fungi so that he doesn't have all of them
19:07:05 thanks!
19:07:25 * fungi already has enough ammunition for self flagellation over here
19:07:34 #action clarkb upgrade jenkins.o.o and jenkins01 to 1.543 and upgrade zmq plugin and scp plugin everywhere
19:07:49 fungi: how is the graphite whisper file move?
19:07:57 not yet started ;)
19:08:09 #action fungi move graphite whisper files to faster volume
19:08:24 fungi: same for whisper file pruning?
19:08:25 #action fungi prune obsolete whisper files automatically on graphite server
19:09:04 the chef telemetry project rename did not happen as it will apparently break the chef cookbooks until the requester gets back from vacation so we are putting that off until sometime next month
19:09:14 right. postponed for now
19:09:37 does anyone know where we are on lifting/removing the virtualenv pin?
19:09:45 * anteaya is sitting in a salt tutorial and would enjoy working with mordred on the manage-projects
19:09:45 I don't recall that happening
19:10:03 clarkb: nothing yet, unless there's a pending change i haven't spotted
19:10:08 #action anteaya assist mordred in automating manage-projects again
19:10:25 #action mordred to lift virtualenv 1.10.1 pin when we're ready to babysit it
19:10:45 zaro: where are we on pointing zuul dev at gerrit dev? I think I saw a change for that
19:11:00 zaro: and can we get an update on the gerritbot and jeepyb situation with new gerrit
19:11:01 * zaro checks review
19:11:20 * fungi hasn't reviewed that one yet, sorry :/
19:11:45 https://review.openstack.org/#/c/68271
19:11:52 last I saw jeblair had a comment about another update required
19:11:54 #link https://review.openstack.org/#/c/68271
19:11:54 yeah, that
19:11:57 didn't realize it got -1, i'll have to update.
19:12:23 ok. so testing review-dev.o.o with gerritbot and jeepyb
19:12:28 #action zaro to point zuul-dev at gerrit-dev
19:12:59 verified gerritbot works, no changes required there.
19:13:36 jeepyb will require some patches. will probably submit some today.
19:14:05 zaro: should probably submit bugs as you find things to help track what breaks
19:14:13 while testing i also found a bug with jeepyb and mysql-python which i believe already got merged.
19:14:32 yup that merged
19:14:37 clarkb: yes, will do, just getting it all together.
19:15:08 also there will need to be an update to gerritlib due to the fact that gerrit replication command changed
19:15:29 #action zaro to review jeepyb integration with new gerrit and update gerritlib for gerrit 2.8
19:15:35 i also noticed that gerrit version command is only available on ver 2.6 and after.
19:16:02 zaro: I think you can assume if that works that you have 2.6 or newer and if it fails you have older
19:16:14 so not sure how we can make gerritlib compatible with both ver 2.4.x and ver 2.8
19:16:50 i mean how to make the gerritlib replication command compatible
19:17:05 zaro: I think you rely on the presence of the command
19:17:14 and its non presence
19:17:22 basically try to do it the new way and if you get a failure try the old way
19:17:25 ahh. yeah, that works.
19:17:33 I think we should move on. lots of stuff on the agenda
19:17:48 #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting
19:17:56 #topic Trove Testing
19:18:16 no mordred or hub_cap or slicknick
19:18:25 clarkb, he has, all under control
19:19:17 I suppose there isn't much to say about trove testing
19:19:27 #topic TripleO Testing
19:19:37 pleia2: lifeless: I think there is plenty to say about this. Want to fill us in?
19:19:42 fungi: ^
19:20:15 there have been a few issues over the past week owing to some sort of neutron failure on the ci cloud
19:20:23 * fungi digs up the bug
19:20:26 right, so last week we made a bunch of progress
19:20:57 #link https://launchpad.net/bugs/1271344
19:21:29 if we start build/delete looping in nodepool and see ssh timeout errors in the log as the cause, raise the alarm in #tripleo
19:21:34 they've been digging into it
19:22:05 I'm also working on getting a fedora jenkins slave up
19:22:06 and i think the fedora images are still pending some changes before they'll work
19:22:09 that
19:22:12 #info if nodepool gets into a build/delete loop with tripleo cloud raise the alarm in #tripleo
19:22:30 but tests are running nonvoting in the check queue
19:22:36 which is quite the accomplishment
19:22:55 I am excited for the eventual baremetal testing. Will be fun
19:23:06 so we're chipping away at the issues as they come down the pipeline :)
19:23:09 right. however when we can't build nodes, we will see changes for tripleo projects piling up in the check pipeline waiting on deploy jobs
19:23:27 so that's a good early warning for the current bug
19:24:41 probably not a lot else exciting to report on this topic for the moment
19:24:58 ok really happy with how it is coming together
19:25:05 #topic Chef Stackforge project rename
19:25:37 we covered this at the beginning of the meeting but for completeness we postponed again as the project rename will break other cookbooks and the rename requester needs to fix them but is on vacation currently
19:25:50 #info postponed again as requester isn't around to update the name everywhere
19:25:51 as discussed, this is postponed. we can keep it on the agenda or just remove it and wait for the owner to let us know later
19:26:06 meh, you said that ;)
19:26:08 #topic ongoing new project creation issues
19:26:23 Mordred is driving up into the hills
19:26:34 mordred of the hill people
19:26:46 we seem to have determined that manage-projects works fine when not run by puppet
19:27:20 we can either figure out why puppet is special (probably related to how exec works and it runs in a funny env) or stop running it with puppet completely
19:27:35 * ttx lurks
19:27:43 I like not running it with puppet completely. My simple suggestion is to have puppet submit an at job to run it a few minutes after the puppet run
19:27:48 and use a lock file
19:27:59 but we can possibly also use salt or cron or $otherthing
19:28:14 but mordred isn't here so moving on
19:28:22 #topic Pip 1.5 readiness
19:28:47 oh, I think this was supposed to be rm-ed from the agenda
19:29:14 are we using pip 1.5 yet?
19:29:33 i believe that's entwined with the virtualenv cap removal
19:29:47 which mordred already has an action item for
19:29:52 gotcha, so we can move on
19:30:07 #topic Upgrade Gerrit
19:30:16 zaro: is there anything you want to add to what you had before?
19:30:34 just a question..
19:30:42 what else should we test?
19:30:58 gerritbot, jeepyb, ??
19:31:27 zaro: good question
19:31:46 * fungi ponders
19:31:54 the ATC collector scripts
19:32:11 what is that?
19:32:14 pretty sure they page through gerrit data via the CLI
19:32:29 zaro: the scripts that find out who can go to the summit for free. They are in openstack-infra/config/tools
19:32:35 clarkb: yeah, though that's really just a mysql query documented in the readme in tools/atc (within the config repo)
19:32:54 and then gerrit ssh queries for the review info
19:33:07 (in the python script in that same dir)
19:33:25 should be pretty easy to confirm
19:33:33 anything else?
19:33:44 zaro: didn't you say something is also breaking the bug updating hook?
19:34:03 yeah, that's in jeepyb
19:34:09 update_bug.py
19:34:35 working on the patch for that now.
19:34:41 k
19:34:54 reviewday?
19:34:58 openstackwatch?
19:35:05 fungi: reviewday definitely
19:35:18 not the same thing?
19:35:35 those are different things
19:35:42 ok. will take a look
19:36:07 and of course zuul, but that is listed :)
19:36:19 and I am pretty sure hashar is using zuul with new gerrit so don't expect many problems
19:36:33 anything else?
19:36:33 does bugday or releasestatus do anything with the gerrit api?
19:36:46 ttx: ^
19:37:05 hmm, release status yes
19:37:21 hrm. if by API you mean SSH
19:37:45 ttx: yeah, we've already found at least one backward-incompatible change in the gerrit ssh api
19:37:50 #info things that need testing with new gerrit: reviewday, openstackwatch, releasestatus, zuul, update_bug (jeepyb)
19:37:50 also we have elastic-recheck and recheckwatch both potentially consuming info from gerrit
19:38:09 and updating it
19:38:22 elastic-recheck I expect to be ok, but we should add that to the list too
19:38:24 release status is not on the critical path though. update_bug and update_blueprint a bit more
19:38:40 #info also elastic-recheck and recheckwatch
19:39:17 nothing else springs to mind
19:39:41 #topic Private Gerrit for Security Review
19:39:51 got it. test openstack infra universe.
19:40:00 zaro: :)
19:40:15 My understanding was that we wanted gerrit 2.8 then a private gerrit
19:40:25 does the work above facilitate private gerrit then?
19:40:30 nothing new to report. postponing for gerrit 2.8 first
19:40:46 #info Need gerrit 2.8 up and running before deploying private gerrit
19:40:56 maybe we should take this off of the agenda for now then?
19:41:00 yep. only so many irons in one fire
19:41:09 #topic Jenkins SCP Plugin fixes
19:41:15 ok. i'll remove it.
19:41:31 zaro fixed the SCP plugin then he had to fix it slightly differently :)
19:41:35 i understand there's another issue with upgrade?
19:41:50 is there?
19:41:51 zaro: another issue with upgrade? with the scp plugin?
19:42:11 jeblair says that upgrading to latest doesn't keep old configs?
19:42:26 or doesn't keep them correctly or something like that.
19:42:39 zaro: are you talking about the SCP plugin?
19:42:43 yes.
19:42:59 scrollback from yesterday morning
19:43:05 I don't know what configs there are then. They are all in the project config
19:43:10 * fungi vaguely recalls discussion about something changing in the xml maybe
19:43:48 i was going to look into it today, but looks like it may need to wait until gerrit universe is tested
19:44:06 ok found it in scrollback
19:44:33 that is an unrelated problem and has to do with how upstream did a bunch of refactoring in the plugin then stopped work and we just added to it
19:44:37 back to the fixes at hand
19:45:02 clarkb: does that mean we don't need to do anything about it?
19:45:18 zaro: no, it means we don't have to do anything about it today :) we will need to fix that before we can release a new version
19:45:40 jenkins04-07 have the latest version of the plugin. jenkins.o.o and jenkins01 have n-2 version and 02 and 03 have n-1
19:46:11 I would like to restart 02 and 03 possibly today to get the latest version then do jenkins.o.o and jenkins01 when I upgrade the jenkins versions there
19:46:21 hopefully by the end of the week all of the jenkinses will be consistent
19:46:51 sounds great
19:47:07 also, regular jenkins restarts seem to keep the plumbing free of rodents
19:47:19 #info SCP file copying bugs assumed fixed. Upgrade from previous release to master not fixed (xml config isn't handled nicely)
19:47:23 fungi: ha
19:47:39 fungi: are we doing that now?
19:47:57 zaro: anything else you want to add about the SCP plugin?
19:48:07 nope all done here
19:48:19 zaro: not intentionally, but things like yesterday morning seem to crop up more often when jenkins masters run for a little while under heavy use without restarts
19:48:47 or i could just be imagining it
19:48:51 #topic removing openstack-ci-admin ML from LP
19:48:59 ahh, that was me
19:49:16 just noticed we get the occasional e-mails to that list from people asking for help
19:50:09 when they should probably open bugs, mail the proper infra ml or find us in irc
19:50:33 remind me, if we kill it the archives stay there?
19:50:45 ahh, i think they can
19:50:52 I think we should remove it to make it simpler for people to find better contact locations
19:51:04 but we should consider the data too
19:51:17 was looking for consensus that it was okay to remove that mailing list from lp, but without jeblair and mordred here we can revisit it next week. not urgent
19:51:31 ok
19:51:37 #topic Savanna testing
19:51:42 SergeyLukjanov: still around?
19:51:45 yup
19:51:48 o/
19:51:55 go for it
19:52:01 the first patch should be landed to tempest soon
19:52:10 I mean the first patch with real tests ;)
19:52:20 so, I'd like to setup async gate for savanna
19:52:36 cool
19:52:37 non-voting for tempest's and devstack's check pipelines
19:52:42 and voting for savanna
19:52:49 SergeyLukjanov: ya devananda just did something similar for ironic
19:52:54 SergeyLukjanov: so ++ from me
19:53:10 clarkb, yup, I saw it and really like the approach
19:53:28 it will guarantee that it'll be easy to make it sync
19:53:36 SergeyLukjanov: are there changes proposed to config yet?
19:53:48 not yet, will propose them today
19:53:55 sounds good
19:53:59 additionally I'm planning to start working on dib elements testing
19:54:04 savanna dib elements
19:54:21 but still have not enough time to make a proof of concept
19:54:28 heh, to make working one ;)
19:54:36 so, looks like that's all from my side
19:55:01 ok let us know if you need anything from our end. also I believe the tripleo folks are looking into that sort of thing too
19:55:02 SergeyLukjanov: using something similar to how tripleo tests their dib elements?
19:55:05 they might have feedback
19:55:08 fungi: :)
19:55:23 * fungi hasn't a unique thought in his head, clearly
19:55:42 fungi: it just means neither of us is crazy
19:55:50 I currently have a problem with defining the approach of how it should be tested for savanna
19:55:57 or we're *both* crazy ;)
19:56:20 heh, folks, probably I'm crazy too
19:56:24 where is my coffee??
19:56:41 ha
19:56:43 btw is it already possible to publish our images to tarballs.o.o?
19:56:56 SergeyLukjanov: that is one of the things tripleo is trying to sort out :)
19:57:05 SergeyLukjanov: personally I almost feel like glance would be better
19:57:10 ok, looks like I've missed it
19:57:11 but that means running a glance
19:57:35 SergeyLukjanov: maybe start a thread on the infra list and we can get tripleo people to look at it too
19:57:36 afaik we're already running trove?
19:57:48 so that everyone is happy with where build artifact images end up
19:57:54 clarkb, yup, nice idea
19:57:58 I'll do it
19:58:38 great. I am going to give everyone a minute or two for random things
19:58:40 i believe we at least set aside extra space for images when tarballs were relocated
19:58:51 #topic Open Discussion
19:58:54 fungi: good to know
19:58:57 fungi, awesome ;)
19:59:18 check the disk space graphs in cacti for confirmation
19:59:21 I did want to point out that dims patched the zmq event publisher plugin so I will try getting that update to all of the masters too
19:59:36 but that is less important than the SCP plugin update
19:59:52 clarkb: tonnes to say, was asleep.
19:59:56 clarkb: excellent. between node and master metadata, we should be able to hunt down infra issue impact easier
19:59:57 zmq is all setup to auto deploy to jenkins repo.
19:59:58 tripleo testing is ALIIIIIVE
20:00:15 zaro: yup once I test that plugin I can tag a release
20:00:18 clarkb, +1
20:00:36 and I think we are at time. Thank you everyone
20:00:38 #endmeeting
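
A minimal sketch of the fallback suggested at 19:17:22 for the gerritlib replication command: try the newer command form first and, if the server rejects it, retry with the older one. This is not gerritlib's actual code. The `replicate` helper, the `CommandError` exception, the `run` callable, and both command strings are assumptions made for illustration and should be checked against the Gerrit versions in use.

```python
class CommandError(Exception):
    """Raised by a runner when the server rejects a command (hypothetical)."""


# Assumed command spellings, newest first; verify against the real servers.
REPLICATION_COMMANDS = (
    "replication start {project}",  # assumed form on newer Gerrit
    "gerrit replicate {project}",   # assumed form on Gerrit 2.4.x
)


def replicate(run, project="--all"):
    """Try each command form in order and return the one that succeeded.

    `run` is any callable that takes a command string, executes it over
    the Gerrit SSH interface, and raises CommandError on failure.
    """
    errors = []
    for template in REPLICATION_COMMANDS:
        command = template.format(project=project)
        try:
            run(command)
            return command
        except CommandError as err:
            # Remember why this form failed, then fall through to the next.
            errors.append(str(err))
    raise CommandError("no replication command worked: " + "; ".join(errors))
```

Passing `run` in as a callable keeps the fallback logic separate from the SSH transport, so it can be exercised with a stub before being pointed at review-dev.o.o.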