19:01:45 #startmeeting infra
19:01:46 Meeting started Tue Feb 24 19:01:45 2015 UTC and is due to finish in 60 minutes. The chair is jeblair. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:47 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:49 The meeting name has been set to 'infra'
19:01:53 #link agenda https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:01:53 #link previous meeting http://eavesdrop.openstack.org/meetings/infra/2015/infra.2015-02-17-19.01.html
19:01:53 hello
19:01:58 #topic Actions from last meeting
19:02:02 AJaeger update infra manual with info on setting up depends on for CRD between new openstack/* project changes and the governance change accepting that project
19:02:10 i think that just merged a few minutes ago
19:02:13 so yay!
19:02:15 Morning
19:02:20 o/
19:02:26 evening
19:02:27 thanks AJaeger_
19:02:35 the rest of the actions i think will fit into later topics
19:02:43 o/
19:02:51 #topic Priority Efforts (Swift logs)
19:03:14 so i guess it's good we haven't quite gotten around to deploying this everywhere yet
19:03:21 bah.
19:03:25 since the requests release broke us
19:03:34 diversity ftw
19:03:46 is this running in a venv?
19:03:47 what's this you say? a python library made a release that breaks things?
19:04:01 yeah, it's in a venv
19:04:01 jeblair: yes it is
19:04:14 o/
19:04:22 ok. so we're currently fixing it by rolling forward on image updates with the new, fixed, version
19:04:26 we could just pin to a specific version of all the deps in the venv today with a working set and never update.
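(Editor's note: the pinning option mentioned above could look roughly like this. This is a sketch, not the production layout; the venv path and the package name are illustrative.)

```shell
# Sketch of pinning a venv to a known-good dependency set.
# The path and package name below are hypothetical examples.
python3 -m venv /tmp/logs-venv
# Capture the currently-working versions of everything in the venv:
/tmp/logs-venv/bin/pip freeze > /tmp/logs-constraints.txt
# Future rebuilds would then install against the frozen set, e.g.:
#   /tmp/logs-venv/bin/pip install -c /tmp/logs-constraints.txt os-loganalyze
```

The trade-off raised in the meeting still applies: a frozen set never picks up security fixes unless someone deliberately refreshes the constraints file.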
We would have to pay attention to security updates
19:04:36 ouch, I missed that :-(
19:04:40 but if this had been deployed everywhere, we could have killed images and fallen back to the previous working version
19:04:49 o/
19:05:05 very true
19:05:17 so it's good to know that if something like this happens again (please no), we have a workable coping strategy
19:05:24 +1
19:05:47 at any rate, this should be running again soon... jhesketh, any blocking reviews?
19:06:08 there is the change jhesketh and I co-wrote to do uploads concurrently
19:06:22 jeblair: https://review.openstack.org/#/q/status:open+project:openstack-infra/project-config+branch:master+topic:enable_swift,n,z
19:06:24 and the zuul change to fix layout handling of integer vals
19:06:28 #link https://review.openstack.org/#/q/status:open+project:openstack-infra/project-config+branch:master+topic:enable_swift,n,z
19:06:39 gives the uploader glob support like Jenkins
19:06:42 jhesketh: I should update those two changes to use that topic, I will do so now
19:06:59 cool, so everything related will show up there soon.
19:06:59 That's better
19:07:00 https://review.openstack.org/#/c/156788/
19:07:29 anything else on this?
19:07:32 yep, I'll update a couple of other non-important reviews to that topic too then
19:07:41 just incremental improvements really
19:07:51 #link https://review.openstack.org/#/q/status:open+branch:master+topic:enable_swift,n,z
19:07:59 needed to drop the project restriction to get the zuul change in
19:07:59 the main thing is moving more jobs over, which is in that topic
19:08:48 cool, thanks. (best to avoid merging that one until we have the all-clear on the new image builds)
19:09:00 #topic Priority Efforts (Nodepool DIB)
19:09:02 agreed
19:09:59 mordred, clarkb, SpamapS: anything on this topic?
19:10:09 oh ya, the grub thing
19:10:30 #link https://review.openstack.org/158413
19:10:34 we are not currently setting kernel boot parameters in the grub2 configs in a way that properly restricts our instances to 8GB of memory
19:10:37 clarkb: regarding your thoughts that update-grub may be required... you may be right. I think a simple experiment is in order. :)
19:10:47 yolanda has some work on the nodepool-shade done
19:10:58 shade could really use reviews on: 153623 156247 156088 157509 156954
19:11:02 yes, i started working on it, but i've been evolving shade in the meantime
19:11:03 greghaynes is looking into fixing that in dib and we can probably limp along by explicitly calling update-grub in our element
19:11:10 really using shade in context reveals the weak points
19:11:13 Yep
19:11:35 mordred: can you make that into #links for the record?
19:11:35 latest thing i did is the caching layer, that will be useful for nodepool
19:11:39 it's pending review
19:11:41 SpamapS: ya I basically confirmed it after grepping around, the only place update-grub is called is in the finalize.d/51-bootloader script which is before we make our changes and it is also where grub is installed so we don't have a place to slip edits in currently
19:11:47 mordred: (ideally a topic link)
19:12:24 they don't all have the same topic - I could go edit the topic on them though if you like
19:12:28 after some heavy consideration of the image collapsing work, i've determined that the best path forward involves leveraging bindep to provide usable cross-platform manifests of what we want installed in such a way that individual projects can also override them, so i have several patches proposed to stackforge/bindep in service of that goal
19:12:29 I think we solve the grub thing by first manually running update-grub for now then when dib is fixed update dib and update our element to take advantage of dib fix
19:12:35 Could you hashtag that so it shows in the minutes?
19:12:45 #hashtag ?
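(Editor's note: a sketch of the kind of grub2 change being discussed, assuming the 8GB restriction is applied with the `mem=` kernel parameter. The variable name and regeneration command vary by distro, and this is illustrative rather than the actual element contents.)

```
# /etc/default/grub (fragment, illustrative)
GRUB_CMDLINE_LINUX_DEFAULT="mem=8G"

# The edit only takes effect after regenerating grub.cfg, which is why
# update-grub comes up above:
#   update-grub                                # Debian/Ubuntu
#   grub2-mkconfig -o /boot/grub2/grub.cfg     # RHEL/CentOS/Fedora
```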
19:13:07 Blah. That was a response to mordred's list of bugs
19:13:15 Train wifi is laggy :(
19:13:30 * mordred working on it
19:14:07 oh, also i have a mostly complete bindep manifest which covers the things we install today on bare-centos6, bare-precise and bare-trusty which should make the devstack-.* versions of those behave similarly
19:14:19 fungi: nice
19:14:20 Thanks. I saw jeblair beat me to it
19:14:21 but didn't get a chance to finish it before the meeting
19:14:31 #link bindep for image collapsing: https://review.openstack.org/#/q/project:stackforge/bindep+status:open,n,z
19:14:53 thanks
19:15:18 #link https://review.openstack.org/#/q/status:open+project:openstack-infra/shade+branch:master+topic:dib-nodepool,n,z
19:15:36 #link other-requirements.txt for bindep http://paste.openstack.org/show/181378
19:15:46 the mostly-complete example
19:15:59 i
19:16:13 who is core on bindep?
19:16:16 lifeless:
19:16:23 is core on it
19:16:39 he may welcome help, i haven't asked yet
19:16:51 it sort of sat abandoned for a couple years until i decided to use it
19:17:12 we could also adopt it into infra if lifeless is okay with that
19:17:19 lifeless: ^^
19:17:24 but it really already has the framework to do most of what we want (hence my relatively trivial patches)
19:17:43 some of which might warrant me adding a few tests for coverage
19:18:35 #topic Priority Efforts (Migration to Zanata)
19:18:52 pleia2: istr you are still iterating on the big "run zanata" patch?
19:19:18 yeah, didn't make progress last week because travel+bad internet
19:19:20 Do we have a specific topic for the vanilla cloud coming up?
19:19:43 tchaypo: yep
19:19:43 mrmartin has been helping me with testing, so I have a change pending to incorporate his comments
19:19:47 I did a review on the patch, and tested it, and found only one dependency issue.
19:20:02 with fixing that, the puppet will run well without errors.
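(Editor's note: for readers unfamiliar with bindep, the other-requirements.txt manifest linked earlier in the meeting uses a simple one-package-per-line format with bracketed platform/profile selectors. The entries below are illustrative examples, not the actual infra list.)

```
# other-requirements.txt (bindep manifest, illustrative entries)
gettext
libffi-dev [platform:dpkg]
libffi-devel [platform:rpm]
python-dev [platform:dpkg test]
```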
19:20:15 that's my favorite way of puppet running
19:20:24 the bad news, need to check the deployment, because something is still broken with zanata.war deployment
19:20:43 I'll try to allocate some time to trace it.
19:21:00 once this is done, the fun part is writing the system-config file to setup mysql, exim and whatever else we end up meeding
19:21:20 Or needing
19:21:29 yeah, that too ;)
19:21:29 why exim? aren't we using postfix?
19:21:38 mrmartin, pleia2: cools, also even though they don't use puppet, remember the zanata folks might be able to help figure out what's wrong
19:21:45 mrmartin: we have some exim experts here
19:21:55 so zanata has a specific smtp api?
19:22:00 yeah, I've been in touch with carlos throughout, he's helped with standalone.xml problems more than once so far
19:22:02 or notifications or what?
19:22:28 fungi: notifications
19:22:39 if it just needs to be able to send outbound, there's pretty much no extra config needed for that
19:23:03 do we have any limitation on outbound emails?
19:23:12 mrmartin: nope
19:23:17 mrmartin: in hpcloud i think, but not in rackspace
19:23:31 or did hpcloud finally fix their source port 25 block?
19:23:31 ok, I've the same experience with hpcloud.
19:23:34 (one of the reasons we run all our servers in rackspace)
19:23:37 yup
19:23:47 er, egress to destination port 25 block i guess
19:23:50 mrmartin: in hpcloud the rate limit should be such that you can at least test it with a few emails
19:24:00 fungi: more of a rate limit than block, i thought
19:24:09 yeah, rate limit, emails come eventually
19:24:13 got it, i think i never knew the details around it
19:24:15 yep, but sending out bulk emails can hit those limits.
19:24:29 right, enough that lists.o.o or review.o.o would never fly
19:24:45 not even stackforge.o.o flew
19:24:49 pleia2, mrmartin: thanks!
19:24:51 anyway, we're building it in rax-dfw alond with our other servers, so should be fine
19:24:52 and that was in the before times :)
19:24:55 er, along
19:24:59 #topic Priority Efforts (Downstream Puppet)
19:25:03 #link last call on: https://review.openstack.org/#/c/137471/
19:25:12 that spec looks ready to merge to me. last chance to look it over.
19:25:16 #link needs updates: https://review.openstack.org/#/c/139745/
19:25:32 asselin: ^ that spec needs another revision, but i think we're all really excited about it
19:25:54 yes, i got good feedback. will update soon
19:25:55 thanks pleia for helping me with the last yard on that one
19:26:23 and finally, we are poised to publish our modules to the forge as "openstackci-foo"...
19:26:31 except i think we actually want it to be "openstackinfra-foo"
19:26:47 oh right
19:26:51 I was supposed to make that
19:26:55 does that change sound good to everyone?
19:26:56 what were peoples' thoughts on thirdpartyci module vs runanopenstackci module?
19:27:06 jeblair: wfm
19:27:28 jeblair: yes
19:27:39 jeblair: sounds good to me
19:27:44 based on jeblair's feedback where -infra uses it also, I prefer not to use 'thirdparty'. I plan to adjust the spec with that
19:27:44 clarkb: what is in the module?
19:28:00 reusable openstack infra
19:28:00 jeblair: what email should I use for the openstackinfra account
19:28:09 #agreed use openstackinfra for puppetforge
19:28:20 #action mordred create openstackinfra account on puppetforge
19:28:21 anteaya: its the proposed thing in https://review.openstack.org/#/c/139745/ one of my comments was don't make this specific to running third party ci since other people may want it too
19:28:27 clarkb: i'm not entirely sure what the difference in setups is between something like what we do, and what a thirdparty tester does
19:28:51 nibalizer: third party tester usually isn't scaling out like we are, its a different opinion on how to run the same software.
19:28:57 mordred: infra-root@o.o?
19:29:00 nibalizer: asselin does it with jenkins, zuul, nodepool on a single node iirc
19:29:03 nibalizer: fewer components, running more services on a smaller number of machines, et cetera
19:29:26 I'm for as generic a name as we can find
19:29:27 some of the services are just slaved to infra though right?
19:29:32 e.g. gerrit?
19:29:38 for folks who want any part of the structure of what we run
19:29:47 nibalizer: infra's CI system is also slaved to infra's gerrit
19:29:48 jeblair: k. account created - there will be a confirmation email
19:29:48 nibalizer: yes so no gerrit here
19:30:24 my gut instinct, and this is promising a lot, is that we could have two new modules, one for running thirdparty ci for infra and one for running your own
19:30:34 nibalizer: but why do they need to be different?
19:30:37 but that might be crazytalk
19:30:55 nibalizer: my sense is the point of this is to _not_ have two modules
19:31:03 is basically what I am saying, if we call it thirdpartyci that implies it should only be used for talking to our gerrit, but maybe you have a gerrit and just want some CI, I dunno
19:31:06 the more that infra and the third-party ci can share, the better, i think.
19:31:07 but to have one that is really reusable for many situations
19:31:09 clarkb: well if you know you're not gonna run gerrit, and gerrit will always be at review.o.o you can make it simpler
19:31:17 I also mentioned maybe that is too much to chew off right now and we can converge later on the more idealistic thing
19:31:22 jeblair: +1
19:31:49 I like to keep the name generic, and start with the parts listed in the spec
19:31:53 what we're talking about here is composition layers, and generally making new ones is easy and good, but ofc syncing them is hard
19:32:08 here is my concern
19:32:26 nibalizer: gerrit is not in the scope of the current spec
19:32:31 if we keep the 'thirdparty' label then it has defined and limited scope, even though that scope is pretty big
19:32:43 if we rename it 'run your own infra module' it has enormous scope
19:32:53 asselin: yup I think the content of the things is good as far as what is included. I just want to avoid making it seem that we aren't also possibly solving the problem of running your own regardless of intent
19:33:24 nibalizer: clarkb's actual suggestion was "openstackci", which i think captures the intent
19:33:50 jeblair: yea i get the sense that clarkb wanted to expand scope, im okay with calling it openstackci and we mean 'thirdparty'
19:34:00 nibalizer: well the thing is I don't think its expanded scope
19:34:04 nibalizer: third-party + first-party
19:34:05 nibalizer: what we are giving people is a run your own
19:34:06 I have the spec updated now (not yet submitted to review) to use 'openstackci'
19:34:12 nibalizer: so why not just call it that
19:34:26 clarkb: i think "all of infra" is a larger scope. and i don't think we want to do that.
19:34:35 jeblair: I am not suggesting all of infra
19:35:11 clarkb: great, i think we're all agreed.
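(Editor's note: publishing under the agreed naming would mean forge metadata along these lines. Every field below is a hypothetical sketch, not the final published module.)

```
{
  "name": "openstackinfra-openstackci",
  "version": "0.0.1",
  "author": "OpenStack Infrastructure Team",
  "summary": "Reusable modules for running an OpenStack-style CI system",
  "license": "Apache-2.0"
}
```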
19:35:49 asselin: thanks, it sounds like the spec is just about ready then
19:35:58 #topic Priority Efforts (Askbot migration)
19:36:07 ok, so nothing exciting here
19:36:16 the spec merged :)
19:36:17 the spec approved, and we need an approve on the main patch
19:36:21 yeap
19:36:31 and after that somebody needs to launch the instance
19:36:38 and i think add some things to hiera
19:36:43 I'll add all of the migration tasks to storyboard
19:36:51 but mostly just making up new passwords; no real coordination needed for those
19:36:57 make new passwords
19:37:28 yeah, the ssl bits are in hiera, but the stuff that can be randomly generated has not been added yet
19:37:38 * fungi meant to and hadn't gotten around to it
19:37:44 and when this is ready, we can pass it for extra testing before opening up to the public
19:37:56 mrmartin: can you #link the main patch?
19:38:17 #link https://review.openstack.org/140043
19:38:42 mrmartin: you were going to write up the actual commands we need to use to extract and import the data it needs, right? i think that's what i originally wanted to see, but may have been lost in the shuffle when you wrote a spec about it
19:39:08 fungi, yes I've this somewhere, I'll add it to the spec
19:39:21 i'm cool with there being a spec, but was really just looking for the basic commands we'll need
19:39:27 nothing extra, just need to do postgresql backup / recovery and reindex the solr db
19:39:36 okay, sounds simple enough
19:39:56 mrmartin: or you could add it to the system documentation (in case we need to reference it in the future)
19:39:58 either way
19:40:02 that's all, but I'll add those commands, I did those tasks in test vm several times
19:40:23 #action mrmartin add operation tasks to askbot system documentation
19:40:31 mrmartin: thanks!
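(Editor's note: a sketch of the extract/import steps mrmartin describes. The database name, user, and the solr reindex command are assumptions; the commands are written into a script for review rather than run against any real service.)

```shell
# Hypothetical askbot migration steps (postgres dump/restore + solr reindex),
# captured as a script so the exact commands can be reviewed.
cat > /tmp/askbot-migrate.sh <<'EOF'
#!/bin/bash
set -e
# On the old host: dump the askbot postgres database in custom format
pg_dump -U askbot -Fc askbot > askbot.dump
# On the new host: restore it
pg_restore -U askbot -d askbot --clean askbot.dump
# Rebuild the solr search index afterwards (askbot management command)
python manage.py askbot_rebuild_index
EOF
chmod +x /tmp/askbot-migrate.sh
```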
19:40:32 #topic Priority Efforts (Upgrading Gerrit)
19:40:57 zaro proposed a date for the gerrit upgrade: Fri April 10, 2015
19:41:17 that's in the middle of pycon, and i will be unable to help then
19:41:29 me too
19:41:32 I won't be pyconning and assuming nothing pops up between now and then is fine with me
19:42:04 wfm
19:42:04 yeah, i wasn't planning to be at pycon (didn't get around to submitting any talks) so i'll be around
19:42:04 however, might we want to do it on a saturday?
19:42:04 Question: will there be 'something' available to test firewall updates?
19:42:07 around too
19:42:40 asselin: i don't think we were planning on it, but maybe we could run netcat or something on that port?
19:42:44 i can also do that saturday if we'd rather, no problem
19:43:02 fwiw, I try to take saturdays off because sanity
19:43:04 Maybe a web server with a static "future gerrit" page?
19:43:25 heh, yeah netcat running to serve a "you can reach us!" banner and some instructions on how to test
19:43:31 o/
19:43:44 pleia2: normally, yes, and we do a lot of light gerrit maintenance on fridays. but this has the potential for significant downtime if something goes wrong.
19:43:49 It would be very helpful to have something so we can telnet to the ip & port and make sure connectivity is good. Especially w/ corporate requests. we have a window to test, after which you need a new request if there's a mistake.
19:43:57 web page might be less accurate because people may have browsers set up to proxy arbitrarily
19:44:07 jeblair: *nod*
19:44:48 so instructions to test ought to not involve "go to this url in a browser"
19:44:52 ours is supposed to be implemented soon (Planned Start: 02/24/2015 08:00:00 Planned End: 03/03/2015 16:00:00)
19:45:16 pleia2: want to set that up? i don't think we need to puppet it...
19:45:30 (but can if you really feel like it :)
19:45:54 yeah, i wouldn't bother to puppet it
19:45:59 jeblair: sure, I'll collude with zaro
19:46:05 collusion is good
19:46:19 #action pleia2 set up netcat hello-world on new gerrit port 29418
19:46:20 our software is built on collusion and beer
19:46:34 zaro: how's saturday sound?
19:46:43 sounds good
19:46:52 zaro: you'll have clarkb and fungi on hand at least
19:47:18 should be fun, is that after release?
19:47:22 ftr, i can do either of the next two saturdays
19:47:27 I should pull up the release chart
19:47:41 it's in the RC period
19:47:44 i believe it's release candidate week
19:47:52 jeblair, pleia2 thanks!
19:47:54 generally the period where ttx asks us to please be slushy.
19:48:11 #link https://wiki.openstack.org/wiki/Kilo_Release_Schedule
19:48:19 perhaps we should move it to may 9?
19:48:34 I like may 9
19:49:03 i might be travelling then, but not sure
19:49:07 ya I think gerrit upgrade isn't very slushy
19:49:21 may 9 should also work for me
19:49:24 isn't that summit time?
19:49:37 zaro: https://wiki.openstack.org/wiki/Kilo_Release_Schedule
19:49:42 summit is the week of may 21
19:49:56 ya summit starts on the 19th
19:50:02 er 18th
19:50:03 5/9 wfm too
19:50:12 okay, everyone look at your calendars and next week we'll decide on a date
19:50:27 #info april 11 and may 9 suggested as gerrit upgrade dates
19:50:35 #topic Fedora/Centos progress (ianw 2/24)
19:51:16 i'd like to get the image build logs much easier to use so i'm more confident keeping on top of issues with these builds
19:51:45 the log config gen script lgtm last I looked at it
19:51:53 i've been through several iterations of this with https://review.openstack.org/#/c/153904/
19:52:15 ianw: soon we're only going to do one image build per-type. i think when that's the case, we could just specify those types in the logging config file manually, right?
19:52:46 jeblair: we could, though the script to do it is pretty handy, and maybe it's a manual check into git to use it
19:52:53 rather than auto gen on nodepool restart
19:53:05 right, we're getting much closer to just having images named "centos" and "trusty" and so on, and building them once then uploading to all providers
19:53:20 i've said this elsewhere -- i don't like having openstack's production logging config in the nodepool source code
19:53:23 so manually handling the log targets for them seems tractable
19:53:31 i think any solution that includes that is not one i'm in favor of
19:53:46 jeblair: can you add that to the change?
19:53:51 jeblair: ++
19:53:52 I don't think I had caught you saying that
19:54:11 i would argue that it's not openstack's logging configuration, it's a generic configuration that is useful to anyone using nodepool
19:54:32 let's handle that in the change
19:54:47 ianw: anything else that needs discussion with the whole group?
19:55:04 i'd like to promote devstack centos to voting, any issues here?
19:55:37 oh - I guess I should say ...
19:55:52 ianw: i don't think so, but the devstack folks will need to weigh in on that
19:55:59 I believe I've been convinced on just running cloud-init which should make centos/fedora work for dib-nodepool better
19:56:11 also may be worth swapping out a trusty job rather than doing a pure add?
19:56:20 although I need to get an element built that will install it into a venv with the rax patches
19:56:25 yes, i'll propose a change but as long as everyone is happy with that general idea
19:56:28 but that needs larger consensus probably
19:56:51 also the f21 job needs to update its kernel
19:56:59 I am not opposed to centos devstack voting
19:57:11 i guess the answer is use a dib built f21
19:57:13 ianw: that should happen in image builds, right?
19:57:33 ianw: also thanks for helping the glusterfs cinder driver folks with their centos7 testing issues
19:57:42 ianw: we can't do dib with rackspace just yet, but when we can this would avoid that problem
19:57:58 jeblair: ianw started a thread about why it tries but fails
19:58:03 in the -infra list
19:58:04 ok
19:58:23 #topic Infra-cloud (jeblair)
19:58:28 so we're nearly out of time
19:58:33 #link summary email https://etherpad.openstack.org/p/kBPiVfTsAP
19:58:37 #link story https://storyboard.openstack.org/#!/story/2000175
19:58:40 #link etherpad https://etherpad.openstack.org/p/InfraCloudBootcamp
19:58:53 i'll send out an email about this soon (see that first link)
19:59:15 but the short version is that we have some folks that want to join us and help us run a cloud
19:59:28 as part of infra. and help out with infra-ish things too.
19:59:38 yay!
19:59:42 I'll follow up on the email
19:59:54 i think it's pretty exciting
19:59:57 And figure out who I should be giving access to those servers and how
20:00:02 So that we can. I've forward
20:00:23 so anyway, yeah, we'll follow up in email, and talk about this more next week
20:00:24 awesome
20:00:26 thanks everyone!
20:00:29 That is cool.
20:00:30 Move
20:00:33 seeya
20:00:36 #endmeeting