19:02:47 #startmeeting infra
19:02:48 Meeting started Tue Jul 30 19:02:47 2013 UTC and is due to finish in 60 minutes. The chair is jeblair. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:02:49 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:02:51 The meeting name has been set to 'infra'
19:02:59 "and is due to finish in 60 minutes" ! neato :)
19:03:02 o/
19:03:08 too cool
19:03:25 #topic asterisk server
19:03:38 most of the topics from last meeting were around the asterisk server anyway...
19:03:54 including that i signed up for a DID, so it has a phone number now...
19:03:57 * mordred bows down to the new asterisk server overlord
19:03:57 * fungi missed the last meeting
19:04:11 * fungi forgot to see what got assigned to him in absentia
19:04:15 fungi: i don't think you did; at least based on irc logs
19:04:30 it looks like there was no meeting last week
19:04:33 correct
19:04:43 pabelanger, russellb: should we all dial in?
19:04:53 to see if we can stress test the conf server a bit?
19:04:59 jeblair: sure
19:05:02 should be able to
19:05:12 * fungi has a telegraph^H^H^H^H^Hphone handy
19:05:21 sip:conference@openstack.org
19:05:39 pabelanger: going to use g722?
19:05:41 I will need a real number as android's clients are derpy and I am still without a proper headset
19:05:46 there's that, and the phone number is 512-808-5750
19:05:52 no, was going to use DID
19:05:57 i'll probably just use the DID as well ...
19:06:00 russellb: no client
19:06:14 i have a client, but i like my desk phone more, heh
19:06:17 conference id number?
19:06:19 6000
19:06:29 in
19:06:40 asterisk CLI looks good
19:07:29 6 people on
19:07:59 load is nothing
19:08:58 I can get on with skype and the phone number
19:09:11 *CLI
19:09:15 confbridge list
19:09:21 using the sip client, Jitsi still fails for me
19:09:30 anteaya: likely a codec issue
19:09:41 pabelanger: probably
19:09:58 jeblair: *CLI> core show channels ... shows all channels as calls from the SIP provider
19:10:06 no
19:10:57 yes a little choppy for me too
19:11:39 cell phone
19:11:53 I am just listening, not transmitting
19:12:08 who was that?
19:13:07 nice
19:13:23 pbx*CLI> channel originate Local/6000@public application playback spam
19:14:39 i have silence now...
19:14:48 jeblair: we are still talking
19:14:52 neat!
19:14:54 silence huh?
19:15:10 we have more callspam
19:15:18 CPU is spiking, so we'd need to see why
19:15:27 she will say anything.
19:15:29 almost anything.
19:15:34 ha ha ha
19:15:47 tt-monkeys FTW
19:18:06 just with the small group, the sound is about equivalent to one of the board meeting conference calls, as a listener
19:18:12 fungi has good sound
19:18:22 * ttx lurks
19:18:29 * mordred waves at ttx
19:18:37 ttx: you want to dial in from france?
19:18:54 * mordred got dropped
19:19:12 mordred: i'll pass, unless you REALLY need that tested
19:19:19 skype is charging me money to listen in
19:19:27 * ttx multitasks
19:19:36 so far I have paid 40 cents
19:19:47 ttx: not yet; i think we'd get a local number later
19:21:25 #action clarkb add pbx to cacti
19:21:32 what is the conf number ?
19:21:37 that actually was quite good
19:21:39 we all just hung up
19:21:43 oh .. too late
19:21:44 we just ended the session
19:21:51 what? are we having a party on the phone or something?
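For reference, the Local/6000@public channel pabelanger originates above implies a conference extension roughly like the following extensions.conf sketch; the DID routing and ConfBridge options are assumptions, not the server's actual configuration:

    ; sketch only -- based on the 6000@public extension seen in the CLI output above;
    ; how the DID (512-808-5750) reaches this context is an assumption
    [public]
    exten => 6000,1,Answer()
     same => n,ConfBridge(6000)   ; drop the caller into conference room 6000
     same => n,Hangup()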
19:21:53 https://wiki.openstack.org/wiki/Infrastructure/Conferencing
19:22:04 * jd__ hides
19:22:05 jd__: we missed it
19:22:07 we can try to bum-rush it again next week once we have it in cacti to see what the impact is
19:22:08 i documented it!
19:22:20 you could totally hook the meeting room bot into asterisk too, once the meeting starts, a new conference room is created and the password outputted
19:22:25 don't think that would be too hard
19:22:44 neat :)
19:22:50 pabelanger: it's only software
19:22:54 russellb: "in his spare time, the hero of cloud computing sets up asterisk"
19:23:00 ttx: heh
19:23:13 ttx: like you're one to talk, writing your own bug tracker
19:23:20 russellb: not jealous at all.
19:23:21 russellb: would be a fun integration
19:24:11 so i think next we want to get a handle on resource usage there... fwiw asterisk is currently almost idle now that we've hung up.
19:24:30 so i'm guessing the high cpu usage is related to our call
19:24:36 I think fail2ban is probably a reasonable thing to try as well
19:24:43 it will act as a rate limiter for badness
19:24:55 or spammers just target us when there is something going on
19:25:35 shall we move on?
19:25:47 there was no spam while we were on the call fwiw
19:25:55 so it was just conference related
19:25:55 jeblair: yes. Transcoding was the hit
19:26:06 but I wouldn't expect it to be spiking
19:26:13 pabelanger: dangit, should have looked to see what codecs were being used ...
19:26:29 ya, I was trying to see that
19:26:37 I think I saw some gsm with ulaw
19:26:43 they were all from the provider, so i can just call back and check
19:27:08 ulaw (which is good)
19:27:15 k, can move on
19:27:26 ok
19:27:30 might just be confbridge
19:27:32 too attractive to spend the whole hour on a phone system ;)
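The fail2ban suggestion above would amount to something like this jail entry; the filter name, log path and thresholds are assumptions (newer fail2ban releases ship an asterisk filter, otherwise one has to be written), and Asterisk needs to log failed registrations to a file with timestamps fail2ban can parse:

    # /etc/fail2ban/jail.local -- sketch only; filter, log path and thresholds are assumptions
    [asterisk]
    enabled  = true
    filter   = asterisk
    action   = iptables-allports[name=asterisk]
    logpath  = /var/log/asterisk/messages
    maxretry = 5
    bantime  = 600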
19:27:36 #topic multiple jenkins masters
19:28:14 the multi-step process to get to the point where we can have multiple masters is almost complete...
19:28:33 we just need to do something to ensure that the bitrot jobs only run in one place (and also that their logs are correctly processed)
19:29:14 do we also need this? https://bugs.launchpad.net/openstack-ci/+bug/1082803
19:29:15 Launchpad bug 1082803 in openstack-ci "Manage jenkins global config and plugin list" [Medium,Triaged]
19:29:17 to that end, I think i can have a working timer trigger for zuul today, which means zuul/gearman can dispatch those jobs, and we can stop using the jenkins timer triggers for them
19:29:18 jeblair: reviewing your multiple triggers zuul change is on my list of things to do once we are done with bugs
19:29:37 zaro: that would be nice, but i think i will defer it for now...
19:29:53 because i'd like to have multiple masters within a few days (or weeks at the most)
19:30:04 jeblair: I agree. There is a pressing need for multiple masters which we can manage by hand until we have the automagic to do it with tools
19:30:24 especially since multiple will initially be just two
19:30:30 not counting the old one
19:30:38 i believe we're hitting performance problems related to the rate at which we are adding/removing slaves from jenkins for the devstack tests
19:30:54 so being able to scale that up is the motivating factor for this
19:31:08 jeblair: yes, the slave launch logs indicate it is taking up to a minute to add each slave
19:31:10 that certainly sounds like the sort of unusual use case the jenkins devs wouldn't have optimized for
19:31:11 which is really really slow
19:31:31 after we have multiple masters, i plan on looking at more efficient and reliable ways of managing slaves.
19:31:51 that is awesome. i am talking about the gearman-plugin at this year's jenkins conf. will have something good to show!
19:32:09 zaro: cool!
19:32:12 when is that?
19:32:25 oct 23-24
19:32:30 want to come down?
19:32:34 jeblair: on the efficient and reliable ways of managing slaves - I'd love to chat about that once you're there
19:33:15 it's in palo alto.
19:33:38 mordred: my thoughts so far are that d-g should be a daemon instead of a bunch of jenkins jobs, and that gearman-plugin should (optionally) handle offlining slaves when jobs are done (since it's in a good place to do that)
19:33:51 ah yes. that. totally with you
19:34:21 mordred: the daemon will be able to manage the nova boots in a better way. we may still have bottlenecks adding and removing slaves...
19:34:48 however, we'll have a better handle on that in that we'll be able to tune how many api calls happen in parallel, etc...
19:34:54 agree
19:35:09 also, we won't be making all these api calls at the same time jenkins itself is managing 16 running jobs (which are making the api calls)
19:35:33 _after_ that, clarkb and I were supposing about kexec re-purposing of slaves, so that there would be less adds/removes
19:35:38 using a daemon will definitely make a lot of pain points less insane
19:35:40 but I agree that tuning what we have first
19:35:49 will let us grok the other things sanely
19:35:55 ah yeah, that could be cool. the only thing about that is that i don't think we can count on always doing that
19:36:07 some jobs will completely break the node and it will need to be destroyed and replaced
19:36:17 but 90% of them may be able to be reused, which could be a huge win
19:36:28 jeblair: yup
19:36:41 jeblair: and a daemon should be smart enough to know when it can and can't make use of kexec
19:37:13 that could be the normal case, and then if we fail to kexec after a minute or two, we kill it and let, say, a hypothetical future low-watermark load based system decide that it needs to replenish the pool.
19:37:44 jeblair: yup. and a hung-kexec probably will not respond to health pings :)
19:37:52 (and hopefully, we can apply a bunch of this to the regular nodes too)
19:38:02 that would be amazing
19:38:06 sounds great
19:39:01 cool, sounds like we have general consensus on a very long road ahead :)
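As a rough illustration of the "d-g as a daemon" idea discussed above, the core loop of a slave-management daemon that keeps a pool of ready devstack nodes outside of Jenkins might look something like the following; the novaclient usage is a sketch, and the pool sizing, node naming and credential handling are all assumptions rather than an actual design:

    # sketch only -- illustrates the slave-management daemon idea; names,
    # pool sizing and error handling are assumptions, not a real implementation
    import time
    import uuid

    from novaclient.v1_1 import client

    MIN_READY = 10   # low-watermark of idle devstack nodes to keep booted

    def main():
        nova = client.Client('user', 'password', 'project',
                             'https://identity.example.com/v2.0/')
        image = nova.images.find(name='devstack-precise')
        flavor = nova.flavors.find(ram=8192)

        while True:
            ready = [s for s in nova.servers.list()
                     if s.name.startswith('devstack-') and s.status == 'ACTIVE']
            # boot replacements up to the watermark; a real daemon would
            # rate-limit and parallelize these api calls deliberately
            for _ in range(max(0, MIN_READY - len(ready))):
                nova.servers.create('devstack-%s' % uuid.uuid4().hex,
                                    image, flavor)
            # delete anything that landed in ERROR instead of coming up
            for server in nova.servers.list():
                if server.status == 'ERROR':
                    server.delete()
            time.sleep(60)

    if __name__ == '__main__':
        main()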
19:39:09 #topic requirements and mirrors
19:39:18 mordred: you want to talk about what's going on in this area?
19:39:26 ugh
19:39:28 not really
19:39:37 can I just close my eyes and make it go away?
19:39:50 maybe?
19:39:51 so ... there are several issues
19:40:00 (it's worth a shot)
19:40:10 one is that pip installs things not in a dependency graph order
19:40:33 but in a strange combo of first takes precedence and sometimes highest takes precedence
19:41:09 which makes things gigantically problematic for devstack when things change, because sequencing can cause something other than what you expected to be installed
19:41:12 SO
19:41:15 I did open a bug against this problem with upstream pip; dstufft thought it may be fixed in 1.5
19:41:18 current things on tap to fix this
19:41:23 but that doesn't help us today
19:41:40 sdague and I are working on getting the requirements update.py script in good shape
19:41:50 with the idea that in setup_develop in devstack
19:42:09 the first step will be to run the update.py script from requirements on the repo to be setup_develop'd
19:42:18 yes, we're close, except for the fact that update.py doesn't handle oslo tarball urls
19:42:23 this will ensure a consistent worldview of which packages should be installed
19:42:25 I think that's actually the only missing piece
19:42:28 #link https://github.com/pypa/pip/issues/988 there is apparently some undocumented feature that may be useful in mitigating this
19:42:39 sdague: grab most recent trunk - I think that's fixed now
19:42:49 mordred: really?
19:42:58 if so I can kick my devstack job again
19:43:04 mordred: setup_develop sounds like a good idea
19:43:09 then - once that's happening, we should be able to gate requirements on devstack running with requirements
19:43:14 so we can know if a new req will bork devstack
19:43:28 awesome
19:43:38 and then, with the two of those in place, we should be in good shape to auto-propose patches to projects on requirements changes landing
19:43:53 incidentally, I also just wrote a patch to update.py to make it sync the setup.py file
19:43:59 so those really will be managed boilerplate
19:44:05 mordred: maybe you can get dstufft to give us a tl;dr on the undocumented feature referred to in the pip bug
19:44:19 and see if that actually does help us
19:44:22 k.
19:44:30 I'll ask him
19:44:31 also, I've got unit testing for update.py inbound, as soon as mordred does a pbr fix
19:44:45 yup. we have a pbr bug to bug folks about to help this
19:44:50 because tox in requirements/ is fun :)
19:44:54 then I'm also going to get per-branch mirrors up
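The setup_develop flow mordred and sdague describe boils down to roughly the following shell sketch; the paths are illustrative, and the oslo tarball url handling mentioned above is the known gap:

    # sketch of the intended devstack flow -- paths are assumptions
    cd /opt/stack/requirements
    python update.py /opt/stack/nova   # rewrite nova's requirements files from the global list
    cd /opt/stack/nova
    pip install -e .                   # setup_develop now installs a consistent set of deps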
19:45:34 and finally - I think we should ditch installing from pip and figure out a way to auto-create debs and install from those - because we're spending WAY too much time on this
19:46:05 I think that just moves the pain elsewhere
19:46:08 but I am not working on that
19:46:09 mordred: won't that put us in the "can't use python-foo version X because it's not packaged yet" boat again?
19:46:26 jeblair: yes, that was my worry when we were discussing this the other day
19:46:43 possibly to both of you ... but we are spending a LOT of effort on this
19:46:47 mordred: which i'm okay with, if we think we're in a better place to deal with that new (less churn, more packagers, cloud archive, etc)
19:46:51 and turning up every corner case in python
19:46:58 s/new/now/
19:47:07 I don't think it's the right thing to work on right now
19:47:20 but I do think perhaps at the summit, we should discuss what it might look like in earnest
19:47:20 I think it's a summit session, honestly
19:47:26 jinx
19:47:31 sounds good to me
19:47:34 ++
19:47:43 I still agree that we do not want to be in the business of shipping debs or rpms
19:47:46 mordred: (and i agree with you in principle)
19:47:59 but like we've talked about for infra packages, operationally helpful packages might be helpful
19:48:24 we're at a very different place than we were 2 years ago
19:48:38 very much so
19:48:44 https://review.openstack.org/39363 btw
19:48:46 anything else on this topic?
19:49:00 for those who feel like looking at a pbr change related to helping the requirements work
19:49:13 #top Gerrit 2.6 upgrade (zaro)
19:49:23 #topic Gerrit 2.6 upgrade (zaro)
19:49:31 zaro: how's it going?
19:49:39 ohh. it's going..
19:49:53 i think i have gerrit with WIP votes working now.
19:50:03 with one small asterisk right?
19:50:26 is that with a patch?
19:50:29 nope! just figured out the bit of prolog..
19:50:35 neat!
19:50:39 prolog!
19:50:40 zaro: but it requires the patch for the change owner right?
19:50:42 yes, it's gerrit core.
19:50:44 clarkb: thanks, didn't realize the check_uptodate.sh was just nova
19:50:56 zaro: have you proposed it upstream yet?
19:51:14 o/
19:51:24 i just got it all to work. will submit a patch to upstream this week.
19:51:25 pleia2 found an internet
19:51:30 so couldn't we get WIP equiv by making APROV -1,0,+1 and letting an author -1 APROV his/her own patches?
19:51:34 zaro: neat! :)
19:51:36 let's see what they say.
19:51:41 sdague: that's actually the plan we discussed...
19:51:45 oh... ok :)
19:51:47 if they're amenable to the patch in general
19:52:05 do we think we'd be willing to roll out a 2.6 with only that patch
19:52:06 sdague: and i believe the "let an author..." bit is what zaro was patching to support
19:52:11 i have been in discussions with mfink? and i did it on his suggestion.
19:52:21 oh awesome
19:52:23 jeblair: and/or adding a WIP category, but both changes need the patch zaro wrote to be expressible in the ACLs
19:52:25 sdague: (existing gerrit acls don't support that operation)
19:52:26 ok, I had thought that permission was already in,
19:52:28 ok
19:52:57 ok. that's it for now.
19:53:06 I think, given a history of carrying a whole string of patches, carrying one patch for a cycle would not be as terrible
19:53:13 zaro: i think mfink has a very strong voice in the gerrit community, so if he likes it, that's great. :)
19:53:30 good to hear.
19:53:43 mordred: let's give this "develop upstream" thing a shot, eh? :)
19:53:55 jeblair: sure!
19:54:08 * mordred just wasn't sure what the 2.7 schedule was looking like
19:54:12 zaro: i'm very excited, thanks!
19:54:18 me too
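Assuming the change-owner permission zaro is patching lands upstream, the WIP scheme sdague sketches (a workflow-style label the author can set to -1 on their own change) would be expressed in a project's ACL config along roughly these lines; the label name, values and the "Change Owner" grant are illustrative assumptions, not the final design or zaro's actual prolog rule:

    # project.config sketch only -- label name, values and the Change Owner
    # grant all depend on the pending upstream patch and are assumptions here
    [label "Workflow"]
        value = -1 Work in progress
        value =  0 No score
        value = +1 Approved
    [access "refs/heads/*"]
        label-Workflow = -1..+0 group Change Owner
        label-Workflow = -1..+1 group project-core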
19:54:22 #topic cgit server status (pleia2)
19:54:31 pleia2: welcome!
19:54:34 so, the one thing I wanted to talk about is addressing for grabbing git repos
19:54:44 * mordred welcomes our new git.o.o overlords
19:55:08 the plan right now is to do what fedora does (it's easy) and do git://git.o.o/heading/project and http://git.o.o/cgit/heading/project
19:55:13 so they aren't the same :(
19:55:30 git.kernel.org makes it so they are both git.kernel.org/prod/
19:55:30 I think jeblair has convinced me that I can get over my issues with the cgit in the url
19:55:35 err /pub
19:55:40 pleia2: I think we can lie and put them all under /cgit
19:55:48 even though cgit doesn't do git protocol
19:55:57 clarkb: yeah, that's really easy
19:56:00 pleia2: i'm okay with that because you have indicated cgit really wants to be set up like that, fedora and kernel work that way...
19:56:01 er
19:56:14 I do not want to add cgit to the git:// url
19:56:20 does kernel.org use rewrites or something? (if first node in the path matches an org, rewrite the url)
19:56:29 you want to put the git-protocol repos under 'cgit/'? that doesn't sound good to me
19:56:34 if we lie about anything, I want the cgit to go away from the urls in places
19:56:49 fungi: probably, I clicked down into a project to see what urls they provide visually and:
19:56:53 git://git.kernel.org/pub/scm/bluetooth/bluez.git
19:56:53 http://git.kernel.org/pub/scm/bluetooth/bluez.git
19:56:53 https://git.kernel.org/pub/scm/bluetooth/bluez.git
19:56:53 but I'm fine with having the web and the clone be different
19:56:55 ^^ ie
19:56:56 pleia2: Error: "^" is not a valid command.
19:57:16 https://git.kernel.org/pub/scm/bluetooth/bluez.git is not cgit
19:57:20 it's regular git
19:57:26 ah, interesting
19:57:47 so maybe that is what we do, put the regular git http daemon behind /pub, put git:// behind /pub, then cgit can have /cgit?
19:57:48 right- I think git clone http:// and git clone git:// should have the same urls
19:57:48 so i think it makes sense to serve cgit from /cgit
19:57:54 mordred: I agree
19:58:02 and I agree with jeblair
19:58:03 worth noting, you don't have to clone from cgit, you can clone from http(s) published copies of the git trees
19:58:09 er, that
19:58:10 and I don't think we need pub - I think that can go in root
19:58:13 and let's serve http and git protocols without a prefix
19:58:21 git clone git://git.openstack.org/openstack/nova.git
19:58:22 wfm
19:58:30 mordred: +1
19:58:42 #topic Py3k testing open for business (fungi, zul, dprince, jd__, jog0)
19:58:49 fungi: 1 minute!
19:58:51 just a quick update
19:58:57 we're basically ready
19:59:03 w00t
19:59:05 couple of reviews which need last-minute attention...
19:59:07 yaay!
19:59:12 oh
19:59:15 #link https://review.openstack.org/#/q/status:open+project:openstack-infra/config+branch:master+topic:py3k,n,z
19:59:29 oops
19:59:44 (plug for my new gate test which will catch missing jobs!)
19:59:54 but that's all the updates i have time for in here. we can pick it up in #-infra
20:00:05 what's missing?
20:00:09 so close!
20:00:12 jd__: see link
20:00:30 jd__: we missed that when reviewing your change earlier
20:00:37 * ttx whistles innocently
20:00:43 too bad
20:00:47 thanks everyone!
20:00:47 hey
20:00:50 hi
20:00:54 #endmeeting
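For reference, the git.openstack.org URL scheme agreed in the cgit discussion above (git:// and http(s) clones with no prefix, the cgit web UI under /cgit) maps to server configuration roughly like this; the filesystem paths, cgi locations and the simplified git-http-backend rule are all assumptions rather than the deployed setup:

    # sketch only -- illustrates the agreed URL layout; paths are assumptions
    #   git://git.openstack.org/openstack/nova.git      (git daemon, no prefix)
    #   https://git.openstack.org/openstack/nova.git    (smart HTTP, no prefix)
    #   https://git.openstack.org/cgit/openstack/nova   (cgit web UI)
    #
    # git protocol:
    #   git daemon --base-path=/var/lib/git --export-all
    #
    # Apache vhost fragment (smart HTTP rule simplified; see git-http-backend(1)):
    SetEnv GIT_PROJECT_ROOT /var/lib/git
    SetEnv GIT_HTTP_EXPORT_ALL
    ScriptAlias /cgit/ /usr/lib/cgi-bin/cgit.cgi/
    ScriptAliasMatch "^/(.*/(info/refs|git-(upload|receive)-pack))$" /usr/lib/git-core/git-http-backend/$1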