19:03:49 #startmeeting infra
19:03:50 Meeting started Tue Nov 19 19:03:49 2013 UTC and is due to finish in 60 minutes. The chair is jeblair. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:03:51 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:03:54 The meeting name has been set to 'infra'
19:03:58 #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting
19:04:01 o/
19:04:40 #link http://eavesdrop.openstack.org/meetings/infra/2013/infra.2013-11-12-19.03.html
19:04:53 #topic Actions from last meeting
19:05:07 #action jeblair move tarballs.o.o and include 50gb space for heat/trove images
19:05:10 clarkb: etherpad?
19:05:32 etherpad-dev is dead, didn't get to etherpad.o.o yet. Plan to kill it post meeting
19:05:55 that work kept getting bumped for more important things over the week :/
19:06:33 clarkb: what are the things you want to double check before killing?
19:06:44 jeblair: that the db backups properly overlap
19:06:47 clarkb: (and more specifically, anything you need help/coordination with)
19:06:55 so that I don't lose any potentially useful data
19:07:03 I shouldn't need help with that
19:07:22 cool
19:07:31 #topic Trove testing (mordred, hub_cap)
19:07:55 hey. nothing to report here. still getting caught up w/ trove
19:08:21 going to touch base in the next few days tho regarding resuming progress :)
19:08:30 ok. makes sense, given the schedule of the past couple weeks.
19:08:33 hub_cap: sounds good!
19:08:44 #topic Tripleo testing (lifeless, pleia2)
19:09:02 working through some basics with dprince and derekh, nothing to report on the infra side at the moment though
19:10:31 o/
19:10:34 #topic Wsme / pecan testing (sdague, dhellmann)
19:10:56 sdague: might be busy/afk today?
19:10:59 yup
19:11:02 dhellmann: ping
19:11:05 I do have an update from the d-g side though
19:11:07 all week i think sdague said
19:11:10 hi
19:11:20 to get the havana periodic/bitrot jobs in I have started my refactoring of the d-g jobs
19:11:40 I am trying to do one small step at a time for my sanity and for reviewers' sanity
19:11:50 there was also a suggestion i think that sqlalchemy-migrate ought to get the same love as pecan and wsme as far as dependency gating goes?
19:12:01 but I think a path towards having better d-g job templates is emerging which should make the wsme jobs easy
19:12:07 we are trying really hard to get rid of sqlalchemy-migrate
19:12:17 that sounds like a better solution anyway ;)
19:12:38 we have https://review.openstack.org/#/c/54333/ open for running tox tests for pecan to gate against wsme, ceilometer, and ironic
19:12:51 well, we are - but I think that adding pecan/wsme-like gating to it would be good, based on the recent breakage
19:13:00 ok, I wasn't aware of any breakage
19:13:16 s-m made a release and it broke things :)
19:13:22 more testing is better, I was just trying to help avoid extra work
19:14:28 mordred: s-m?
19:14:37 if the JJB job refactor ends up where I would like to have it end up it may not be a whole lot of extra work
19:14:43 anteaya: sqlalchemy-migrate
19:14:49 k
19:15:03 clarkb: does the jjb refactor conflict with https://review.openstack.org/#/c/54333/5 ?
19:15:23 jeblair: it shouldn't, I am talking about the d-g jobs specifically
19:15:28 clarkb: k
19:15:37 since they have grown very unwieldy and need better templating
19:16:34 clarkb: so your change would make it easier to add full devstack testing for wsme, which is in addition to the tox jobs in 54333
19:16:41 yup
19:16:47 cool, makes sense
19:17:10 end-of-topic?
19:17:25 * dhellmann has nothing to add
19:17:30 dhellmann: thanks
19:17:36 * mordred thins refactor is great
19:17:39 clarkb: is etherpad still a separate topic, or already covered?
19:17:44 jeblair: covered
19:17:46 mordred: thin refactors are the best
19:17:53 #topic Retiring https://github.com/openstack-ci
19:18:11 who wanted to talk about this?
19:18:26 we talked about it, but wanted to get your input too
19:18:26 someone stumbled upon it recently and it was confusing
19:18:55 want to at least see about deleting publications and gerrit-trigger-plugin
19:19:05 or close the whole thing down
19:19:17 pleia2: i think there is more history in publications that needs to be pushed into infra/pub
19:19:22 or, add a dummy project with a README that says go to openstack-infra instead
19:19:31 clarkb: like https://github.com/openstack-ci/moved-to-openstack-infra
19:19:32 clarkb: that's already there
19:19:37 ah
19:19:54 pleia2: the existing historical publications need to be pushed and tagged so they show up
19:19:59 pleia2: then i think publications can go away
19:20:04 ok
19:20:08 that leaves gerrit-trigger-plugin
19:20:25 jeblair: i thought we retained the history for pubs and then cleaned it, so the master branch still has those commits, they just need tags
19:20:45 or better, separating into branches
19:20:50 first
19:21:19 fungi: yeah, i think to be compatible with the new system, we need to make some new commits that move each pub into a top level and then tag those
19:22:08 gerrit-trigger-plugin is a genuine fork; i'm not sure if all our changes were upstreamed
19:22:35 I don't believe they were
19:22:41 i think we need to figure out the status of that, and decide whether it's a useful historical artifact
19:23:22 so it sounds like we have a couple bugs/action items out of this topic... branchify/tag the old pubs, and decide the fate of g-t-p
19:23:37 yep
19:24:06 i'm not in a position to volunteer for those right now, but will certainly do publications if no one gets to it first.
19:24:18 I could use some practice with tag/branch fun if someone will be available to answer questions as I go
19:24:27 pleia2: i can help you on that
19:24:32 pleia2: happy to
19:24:40 ok cool, action me for digging into publications then
19:24:55 #action pleia2 add historical publications tags
19:25:00 pleia2: thanks!
19:25:19 #action jeblair file bug about cleaning up gerrit-git-prep repo
19:25:21 gah
19:25:25 #action jeblair file bug about cleaning up gerrit-trigger-plugin
19:25:35 #topic Savanna testing (SergeyLukjanov)
19:25:43 SergeyLukjanov: ping
19:25:51 nothing interesting to report atm
19:25:52 o/
19:25:58 is it ok to make a job to build images using savanna-image-elements and publish them to tarballs.o.o?
19:26:05 just to clarify
19:26:26 SergeyLukjanov: absolutely; i believe trove will need to do something similar
19:26:29 mordred, hub_cap: ^
19:26:31 yah
19:26:33 that's correct
19:26:38 we'd like to generalize that
19:26:54 yep, we talked about this need with hub_cap at summit
19:27:22 and eventually it'll be great to run integration tests in these images...
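For illustration, a minimal sketch of what a periodic build-and-publish step like the one discussed above could look like, assuming diskimage-builder's disk-image-create command is available; the image name, element list, and copy destination are placeholders, not the real job definition (which would live in JJB).

```python
# Illustrative sketch only: build an image with diskimage-builder and copy it
# to a tarball server. The image name, element list, and destination below are
# placeholders; the actual job would be defined in Jenkins Job Builder.
import subprocess

IMAGE_NAME = 'savanna-example-image'                         # placeholder
ELEMENTS = ['ubuntu', 'vm']                                  # placeholder element list
DESTINATION = 'jenkins@tarballs.example.org:/srv/tarballs/'  # placeholder

# disk-image-create -o <name> <elements...> produces <name>.qcow2 by default
subprocess.check_call(['disk-image-create', '-o', IMAGE_NAME] + ELEMENTS)
subprocess.check_call(['scp', IMAGE_NAME + '.qcow2', DESTINATION])
```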
19:27:40 SergeyLukjanov, would it be possible to run Savanna integration tests on those images before they get published?
19:28:05 ruhe, we can publish master and release images I think
19:28:25 and eventually add integration tests to the gate pipeline
19:28:39 once we work out 'hadoop vs. nested virt.'
19:28:46 similar to how we do branch-tip, pre-release and release tarballs i expect
19:29:04 so, nothing to add from my side, still hope to start creating CRs this week
19:29:44 fungi, yep, it should work ok
19:30:06 SergeyLukjanov: ok cool, thanks
19:30:09 #topic Goodbye Folsom (ttx, clarkb, fungi)
19:30:18 #link http://lists.openstack.org/pipermail/openstack-stable-maint/2013-November/001723.html
19:30:27 i've officially eol'd the integrated projects
19:30:34 woot
19:30:40 \o/
19:30:55 still need to know what we're supposed to do (if anything) with things like devstack, grenade, oslo-i, manuals, tempest, reqs
19:30:57 yaaay!
19:31:09 fungi: kill them all
19:31:14 with fire!
19:31:24 they have stable/folsom branches. tag then delete, same as the rest?
19:31:51 fungi: i believe that is the correct thing to do. note that many of them will have significant trouble landing patches to those branches now. :)
19:32:17 and the topic more properly should have been goodbye folsom, hello havana since clarkb worked on getting the new stable/havana jobs in last week
19:32:18 definitely reqs, devstack and tempest
19:32:31 since we can't effectively test them on folsom anymore
19:32:42 oh and grenade, might as well do them all
19:32:51 clarkb: i approved your job templating last night and it seems to have worked
19:33:11 yes we have periodic/bitrot jobs for havana now and changes to d-g are tested against havana and grizzly as well as master
19:33:33 though dprince's grenade change is still grinding in the gate of doom
19:33:42 #link https://review.openstack.org/57066
19:34:32 (to move the base for master tests to havana instead of grizzly)
19:34:46 fungi, clarkb, dprince: thanks for this!
19:34:56 #topic Jenkins 1.540 upgrade (zaro, clarkb)
19:35:10 fungi: I am actually a bit perplexed by my grenade failure there.
19:35:46 fungi: can take that offline w/ sdague/dtroyer perhaps though
19:35:58 dprince: i suspect it's just the volume of tempest tests being run against d-g coupled with the nondeterminism in the gate right now
19:36:02 the reason we want to upgrade jenkins is that there was a major fix to reduce the number of threads by 75%
19:36:18 dprince, there is a grenade patch that needs to land as well for that to work
19:36:25 and with recent jenkins trouble, we figure it is worth a shot to upgrade and run a jenkins that doesn't have so much overhead
19:36:27 maurosr was working on it, I don't know its status
19:36:47 sdague: hello
19:36:49 clarkb has latest jenkins on jenkins-dev.o.o
19:36:53 fungi: thanks, sdague: can you point me to it... or in the right direction?
19:36:58 I upgraded jenkins-dev yesterday, it seems to be fine but the jenkins test script in config is oriented around very old d-g and old zuul
19:37:03 * maurosr reading
19:37:34 and the release notes for new jenkins mention a lot of churn in parts of the api we use, i think
19:37:35 jeblair: fungi: mordred: any opinions on how we should test jenkins-dev (they upgraded the ssh-slaves and credentials plugins that are bundled with jenkins and changed some of the permissions around node creation/update/delete)
19:37:47 oy. that sounds like fun
19:37:56 clarkb: i think the api calls it exercises should still be about the same, right? add/remove nodes, etc...
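For illustration, a minimal sketch of the kind of node add/remove smoke test being discussed, assuming the python-jenkins library and a throwaway credential on the upgraded jenkins-dev; the hostname, username, password, and node name are placeholders.

```python
# Illustrative sketch only: exercise node create/delete against an upgraded
# Jenkins, roughly what a jenkins-dev smoke test might check after the
# ssh-slaves/credentials plugin and permission changes. Hostname, credentials,
# and node name below are placeholders, not real infra values.
import jenkins  # python-jenkins

server = jenkins.Jenkins('https://jenkins-dev.example.org',
                         username='testuser', password='secret')

node = 'smoke-test-node'
if server.node_exists(node):
    server.delete_node(node)

# Create a node roughly the way nodepool would, then make sure it shows up
# and can be deleted again under the new permission model.
server.create_node(node,
                   numExecutors=1,
                   nodeDescription='throwaway node for upgrade testing',
                   remoteFS='/home/jenkins',
                   labels='smoke-test',
                   exclusive=True)
assert server.node_exists(node)

server.delete_node(node)
assert not server.node_exists(node)
print('node add/remove round trip OK')
```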
19:38:14 jeblair: possibly. the actual calls are made in the zuul and d-g source code
19:38:23 maurosr, this is the patch that takes the version-specific upgrade scripts into that separate directory, instead of just going by the branch we're in in grenade
19:38:25 clarkb: ah, heh.
19:38:27 jeblair: so I think rewriting it to use nodepool is what we may need to do?
19:38:43 or, and this was my crazy idea this morning
19:38:47 could we introduce a new jenkins master with the upgrade? and then move over from there?
19:38:54 maybe we can point prod nodepool at jenkins-dev?
19:39:07 I don't think prod nodepool at jenkins-dev will work due to ssh key mismatches
19:39:11 sdague: yup, I'm finishing it, I can submit it today (yesterday and friday had some troubles, and the holiday didn't let me work on it)
19:39:34 but will submit it today for sure
19:39:47 clarkb: so maybe we need a nodepool-dev? ugh.
19:39:50 replacing the other one, which cleans everything
19:39:52 anteaya: that is one possibility but we should be able to do an upgrade in place
19:40:03 jeblair: that was another thing I considered, possibly just run it on jenkins-dev
19:40:04 clarkb: mordred has had success getting nodepool running on his laptop... maybe something similar could be pointed at jenkins-dev instead of needing a nodepool-dev server?
19:40:25 colocating a nodepool-dev service on jenkins-dev sounds fine
19:40:42 oh, or locally installed on jenkins-dev itself. yeah, not a bad idea at all
19:40:59 clarkb k
19:41:25 clarkb: sound like a plan?
19:41:27 sure
19:41:38 zaro: any chance you want to work on the puppet to do that?
19:41:38 cool
19:42:02 clarkb: can do.
19:42:09 clarkb, zaro: in fact, i think that may actually be how the old devstack-gate stuff was set up on jenkins-dev.
19:42:31 #action zaro setup dev nodepoold on jenkins-dev
19:42:39 #topic New devstack job requirements (clarkb)
19:43:28 one of the things that came out of adding havana d-g jobs was that we have two large classes of d-g jobs. There are d-g jobs that run against patches and d-g jobs that run periodically against the tip of $branch
19:44:09 I would like to propose that we require that new d-g jobs supply both forms as two different templates so that when icehouse rolls around adding d-g jobs for it is as simple as updating zuul layout.yaml
19:44:36 with havana I was juggling a lot of missing pieces and I think staying on top of that through a cycle would be better
19:44:59 clarkb: i agree with the proposal in principle; i may want to see your refactor before agreeing to the specifics
19:45:00 and i think the current state with regard to that is great now, after your last change went in
19:46:20 jeblair: that's fair, there is a little more work to coalesce the branch-specific check jobs with the rest of the check/gate jobs and stable branch periodic jobs with master periodic jobs
19:46:40 right now we have ~4 distinct classes of d-g job and I think I can roll that into two
19:46:53 so getting it to two is a good first step
19:47:56 cool; if there are any new d-g jobs while clarkb works on this, we should probably run those changes by him to make sure they fit with this work
19:48:09 yes, agreed
19:48:26 #topic Increased space yet again on static.o.o (fungi)
19:48:39 this was more of a public service announcement
19:48:52 time to add some disk space monitoring?
19:49:02 fungi: was this during the summit, or once again, afterwards?
19:49:18 i increased the logs volume during the summit to 4tb, but then caught it again last week just before it filled up and pushed it to 5tb
19:49:28 http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=717&rra_id=all
19:49:36 er
19:49:41 #link http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=717&rra_id=all
19:50:06 i confirmed the weekly deletion cron job is working as intended to expire 6-month-old content
19:50:21 we're just on an ever-increasing treadmill of log collection
19:50:44 fungi: that's exciting.
19:50:54 similarly, docs-draft is up to 400gb now
19:51:11 we have a giant firehose filling water balloons
19:51:24 anyway, that's all i wanted to mention on the topic
19:51:35 fungi: i think we should make those volumes be at 50%
19:51:44 jeblair: i'm happy to do that
19:51:50 fungi: cool, thanks
19:52:07 #action fungi grow logs and docs-draft volumes so they're 50% full
19:52:09 i believe we had expected them to stabilize. we might have been wrong about that. :(
19:52:25 so I haven't been able to confirm this yet
19:52:48 but the nova logs exploded in size (~8MB compressed for n-cpu??) due to the iso8601 logging
19:53:02 fungi: any idea if that caused a significant uptick in log size?
19:53:27 clarkb: it was hard to tell, but i'd expect compression to water down that increase if it's repetitive lines
19:53:51 hrm, total artifact size by zuul job would be a nice metric; the new log uploading system could report that.
19:54:20 also, there was no obvious uptick in utilization on the volume, just a fairly steady linear progression there
19:54:28 gotcha
19:54:29 #topic Open discussion
19:54:31 * zaro meeting topic, sorry came in late.
19:54:48 new jjb release
19:55:05 anybody have issues with that?
19:55:11 oh ya, we had someone submit a bug requesting a new release and I am +1 on doing it
19:55:23 Also wanted to remind people about https://review.openstack.org/#/c/56107/ and maybe get some feedback about the packaging import
19:55:35 I do think we should get the new jjb-ptl group change in so that zaro and maybe others can do the releases
19:55:54 #link https://review.openstack.org/#/c/56823/
19:56:01 clarkb: jjb has a ptl?
19:56:04 I'm continuing to work in -neutron
19:56:15 jeblair: no, but read the change comments for why it was named that way :)
19:56:24 jeblair: mordred wanted consistency
19:56:49 * mordred doesn't feel strongly about it
19:56:53 clarkb: well, the group is named that way to remind us to keep it small. :)
19:57:09 but I know we've pushed back on people before for making a thing not called -ptl
19:57:13 yup
19:57:16 I am fine with the name
19:57:21 mordred: i'm not opposed to the name
19:57:22 that said - perhaps -ptl is the wrong name for that role and -release-manager is a better name?
19:57:27 * mordred doesn't want to bikeshed too much
19:57:28 mordred: i'm not sold on the _idea_
19:57:33 ah
19:57:35 gotcha
19:57:56 jeblair: so I suggested it because openstack-ci-core has really taken a back seat in JJB reviews lately
19:58:09 clarkb: one of us has been on vacation
19:58:11 I think a subset of the jjb core group is in a better position to cut releases
19:59:50 and that's about it for time
19:59:59 clarkb: the people who can make releases should definitely be a subset of the jjb core group. it currently is.
20:00:12 #endmeeting
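For reference, a minimal sketch of the kind of disk space monitoring suggested during the static.o.o topic, assuming it runs on the server holding the logs and docs-draft volumes; the mount points and the 80% threshold are placeholders, not the actual infra configuration.

```python
#!/usr/bin/env python
# Illustrative sketch only: warn when volumes pass a utilization threshold,
# in the spirit of the "disk space monitoring" suggestion above. The mount
# points and threshold are placeholders.
import os

VOLUMES = ['/srv/static/logs', '/srv/static/docs-draft']  # placeholders
THRESHOLD = 0.80  # warn above 80% used

for path in VOLUMES:
    stat = os.statvfs(path)
    total = stat.f_blocks * stat.f_frsize
    free = stat.f_bavail * stat.f_frsize
    used_fraction = 1.0 - float(free) / total
    state = 'WARNING' if used_fraction >= THRESHOLD else 'ok'
    print('%s: %s %.1f%% used of %.1f GB' % (
        state, path, used_fraction * 100, total / 1e9))
```

A check along these lines could feed a cacti/nagios-style alert instead of printing, but the threshold comparison is the essential part.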