19:01:11 #startmeeting infra
19:01:12 Meeting started Tue Sep 16 19:01:11 2014 UTC and is due to finish in 60 minutes. The chair is jeblair. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:13 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:15 The meeting name has been set to 'infra'
19:01:18 o/
19:01:20 o/
19:01:20 #link agenda https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting
19:01:21 hey there
19:01:21 o/
19:01:34 #link previous meeting http://eavesdrop.openstack.org/meetings/infra/2014/infra.2014-09-09-19.04.html
19:01:48 o/
19:01:59 o/
19:02:40 we've established some areas of priority, to help reviewers target their work to make sure we keep moving on the big-picture things we decided on at the summit and meetup
19:02:48 so the first part of the meeting is addressing those
19:02:55 #topic Puppet 3 Migration
19:03:08 last week we said that by now, we hoped that we had a puppetmaster up and were checking hosts
19:03:17 most of the nails are in puppet 2's coffin now
19:03:20 in fact what we did was just do the entire migration on thursday and friday
19:03:28 yay
19:03:42 and are now cleaning up the last few things
19:03:53 good job everyone!
19:03:54 http://puppetboard.openstack.org/fact/puppetversion
19:03:55 yeah, i think we have changes proposed for all the cleanup now?
19:04:05 anything left that has to happen prior to Sept. 30?
19:04:21 give puppet 2.7 a going away party
19:04:21 very much thanks to nibalizer who managed to get most of the prep for that done before we started
19:04:28 yay nibalizer
19:04:42 yes, huge thanks nibalizer!
19:04:47 woo, happy to help!
19:05:03 what is the fate of ci-puppetmaster? will we be turning it off?
19:05:09 yep
19:05:20 did someone move the launch scripts over?
19:05:23 it should be fine to turn off now, in fact
19:05:29 jeblair: yeah, moved and tested
19:05:39 is there anything else on that host we might want?
19:05:48 (logs maybe?)
19:05:49 o/
19:05:51 yeah, so we have passwords, hiera, launch scripts... i think that should be it.
19:06:12 #link https://review.openstack.org/121654
19:06:21 there's a test server you can qa if you like
19:06:47 fungi-test.o.o iirc
19:06:53 I hopped on it briefly and it looked good
19:07:11 i test-drove the server a little and it seemed sane to me too
19:07:28 or i wouldn't have un-wip'd that change ;)
19:08:00 okay, so we can delete at our leisure
19:08:24 who will do that?
19:08:29 cool
19:09:23 anteaya: any of the infra root admins can. i'll do it once the cleanup changes merge, if nobody beats me to it
19:09:28 kk
19:09:31 it takes, literally, seconds
19:09:34 #topic Swift logs
19:10:21 #link https://etherpad.openstack.org/p/swift_logs_next_steps
19:10:27 jhesketh prepared that before the meeting ^
19:10:35 #link https://review.openstack.org/#/c/109485
19:10:41 jhesketh you workhorse
19:11:04 oh cool, so i hadn't reviewed that because i thought we might still be experimenting with the test job
19:11:20 but if that's first on the list, i'm assuming that the experimental job is working and we're ready to proceed
19:11:43 it is working, but there was perceived slowness?
19:11:53 yep, the experimental job has been working well for a while (although it hasn't been run regularly)
19:11:54 clarkb: in what way?
19:12:04 jeblair: in the fetching of logs
19:12:11 clarkb, jeblair: the slowness is in the fetching
19:12:16 jeblair: I don't think we quantified it super well yet
19:12:21 maybe we should try doing that too?
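
One lightweight way to get the swift-vs-disk numbers being asked for here; this is only a sketch, and both URLs are hypothetical placeholders rather than real log locations:

# Time a full fetch of the "same" log served from disk and from swift
# (both URLs below are made-up placeholders).
for url in \
    http://logs.openstack.org/example-disk-backed/console.html \
    http://logs.openstack.org/example-swift-backed/console.html
do
    printf '%s: ' "$url"
    curl -s -o /dev/null -w '%{time_total}s\n' "$url"
done
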
19:12:28 so os-loganalyze can be slow
19:13:04 do we want to quantify that with the current experimental job, or should we merge 109485 and work from there?
19:13:32 it will probably help to have the larger dataset?
19:13:32 I'm for merging and debugging
19:13:44 having a broader sample set may help bring the performance issues to light, and perhaps inflict pain on people to improve them
19:13:48 if we merge, the job will be run more frequently
19:13:49 jhesketh: we check disk first, right? so 109485 isn't a behavior change on its own
19:13:51 there's no reason to block on merging 109485 imo... Unless we want to back out of having logs in swift, we can work on speeding up serving in parallel
19:14:02 ++
19:14:17 that change lgtm too
19:14:19 jeblair: actually, that's a good point... we check swift first and fall back to disk
19:14:35 oh, so we'll actually end up making viewing logs for all python jobs slow
19:14:42 maybe we should hold off, and/or move a less impacting jjb job over
19:14:53 viewing and presumably indexing (for logstach workers?)
19:15:15 fungi: yeah the logstash workers will be hit but the pipelining should smooth it over for them
19:15:16 obviously i meant logstache
19:15:26 I think the bigger concern is for humans looking to debug their tests
19:15:31 yeah, let's go for reduced impact
19:16:02 what is a less impacting jjb job?
19:16:15 the infra jobs are all candidates imo
19:16:22 I agree with that
19:16:23 okay, I'll take a todo to pick a less impacting job and also put up some swift vs disk comparisons
19:16:23 since we can/should be aware of this work
19:16:47 we do seem to enjoy inflicting pain on ourselves first
19:16:54 makes sense
19:16:59 it does seem to be a pattern
19:17:41 #action jhesketh rework 109485 to impact only infra jobs
19:18:06 #topic Config repo split
19:18:25 #link https://review.openstack.org/#/c/121650/
19:18:42 I have put up a patch to create the new project-config repo
19:19:01 very cool :)
19:19:01 please share your thoughts to ensure I have the tests and acl file as you would like them to be
19:19:16 I am reading up on git filter-branch and will be playing with it
19:19:41 anteaya: cool. we want to import this from a repo built with filter-branch, so we may want to wip your change until that is ready
19:19:45 once I feel confident that I can filter config so the selected repos are in their own repo and that they are removed from config
19:19:55 I will let you know so we can do the freeze and such
19:20:03 I can do that
19:20:14 if I can get some feedback on the tests and acl
19:20:23 we also have a bit of work to prepare for the (system-config) repo itself
19:20:25 I would like to get that confirmed before I move on
19:20:37 okay I will practice with filter-branch
19:20:48 let me know what I can do to get config in shape
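
A rough sketch of what that filter-branch run could look like; the kept path ("jenkins/") is only an illustrative stand-in for whatever file list ends up being moved, not the agreed-on set:

# Start from a fresh clone so the history rewrite never touches the canonical repo.
git clone https://git.openstack.org/openstack-infra/config project-config
cd project-config
# Rewrite history, keeping only the paths destined for project-config
# ("jenkins/" is a placeholder) and dropping commits left empty.
git filter-branch --prune-empty --tag-name-filter cat --index-filter '
    git ls-files | grep -v "^jenkins/" | xargs -r git rm -q --cached --ignore-unmatch
' -- --all
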
19:20:51 should we also time this to a change to remove the said files from the config repo?
19:20:59 yes
19:21:05 we need to update quite a number of places where we currently reference the config files in there to use the new repo instead
19:21:08 they need to happen during the same freeze
19:21:14 and yeah, we should do all of those things around the same time
19:21:18 ie have a dependent change pre-approved so it goes in at the same time, avoiding patches proposing to both repos
19:21:21 in my eyes
19:21:49 anteaya: so once you're done with the filter-branch, we should probably still delay merging that change until everything is ready to go at once
19:21:56 anteaya: and you can run filter-branch again right before we do it
19:22:08 to make sure the new repo has the latest changes
19:22:12 jeblair: yes
19:22:25 jeblair: I hope to have a command I can run anytime
19:22:29 cool
19:23:16 so the other part of this section is the lots-of-puppet-modules split
19:23:33 yesterday we switched over to running the new apply integration test
19:24:03 this is really cool -- we're using the zuul cloner to check out config and the puppet-storyboard module and then we run puppet apply
19:24:47 these are the first jobs to use the zuul cloner right?
19:24:49 it's also a great next step toward further gutting ref management out of devstack-gate
19:24:49 (the clone mapper that hashar added to zuul cloner came in handy, it lets us map "openstack-infra/puppet-storyboard" into /etc/puppet/modules/storyboard in one line of yaml)
19:25:05 clarkb, fungi: yup
19:26:05 i have one review up to split a module out, it has some feedback and i'll put a new patchset up soon
19:26:17 nibalizer: is there a story for this spec?
19:27:06 nibalizer: i don't see one. would you please create one in storyboard and update the spec to link to it? http://specs.openstack.org/openstack-infra/infra-specs/specs/puppet-modules.html
19:27:22 sure
19:27:42 then we should create a task for each puppet repo, so that people can assign those tasks to themselves as they work on it
19:28:08 it would be good to start slow and break out just one module at a time at first to make sure we have the process right
19:28:13 then i think we can go open season :)
19:28:25 after the first couple are behind us, they'll make good low-(medium?)-hanging-fruit tasks
19:28:55 anyway, as we do it, we should be able to add them to the integration test so that we can be relatively sure that we're not breaking ourselves as we go
19:28:56 fit for third-party participation, I am hoping
19:29:18 so documentation of the process is greatly appreciated
19:29:30 anteaya: it should be in the spec; if it changes, we should update the spec
19:29:42 great
19:29:44 (specs are not written in stone. they are written in rst!)
19:29:53 I do believe it is, yes
19:29:57 stone is too slow
19:30:08 anything else on these?
19:30:20 anteaya: stone flows faster at higher temperatures
19:30:27 fungi: that it does
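
For reference, the clone-map plus apply flow described above might look roughly like this; the zuul-cloner arguments, module path, and manifest name are assumptions for illustration, not copied from the actual job definition:

# The clone map sends puppet-storyboard into the system module path.
cat > clonemap.yaml <<EOF
clonemap:
  - name: openstack-infra/puppet-storyboard
    dest: /etc/puppet/modules/storyboard
EOF
# Check out the repos with zuul refs honored, then do a no-op apply.
zuul-cloner -m clonemap.yaml git://git.openstack.org \
    openstack-infra/config openstack-infra/puppet-storyboard
puppet apply --noop --modulepath=/etc/puppet/modules manifests/site.pp
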
19:30:29 #topic Nodepool DIB
19:30:40 i was hoping mordred would be by to share the status here
19:31:04 is he at openstack silicon valley?
19:31:15 i looked at the stack this morning and found that the bottom of the stack, despite 4 revisions since, hasn't addressed a pretty fundamental concern i brought up
19:31:32 so the bottom is now at -2 basically just to get attention. :(
19:31:35 we should talk to yolanda maybe?
19:32:04 i think she's out, but she didn't seem to know the reasoning when she commented on an earlier patchset
19:32:12 hmmm
19:32:30 so basically, i think we're waiting for mordred to finish this, or someone to take it over
19:33:13 if someone wants to take it over, let me know.
19:33:24 fungi: oh, yes i believe he is at ossv
19:33:43 #topic Docs publishing
19:34:09 i haven't started on this yet, and probably won't for a bit yet
19:34:42 if anyone wants to get started on it, feel free (and let me know). otherwise it's probably going to be a few weeks before i start on it in earnest.
19:34:58 what is this?
19:35:00 I can probably take up the dib stuff again since I poked at it before
19:35:09 zaro: http://specs.openstack.org/openstack-infra/infra-specs/specs/doc-publishing.html
19:35:16 jeblair: re d-i-b; i'm very interested in this, but don't want to unilaterally take things over
19:35:16 should be able to get up to speed on it relatively quickly
19:35:31 ianw: maybe we can work together on it?
19:35:32 also, it's probably good for some of the swift logs stuff to settle out before we really start on docs
19:35:43 ianw: unilaterally taking over mordred's changes is a tradition around here ;)
19:35:58 jeblair: I think that is sane, otherwise we will be context switching too much
19:36:08 it really is.
19:36:18 bilaterally taking over mordred's changes is less traditional but should be fine! :)
19:36:38 #topic Jobs on trusty
19:36:41 well, the serving-logs-from-swift work has implications for the docs publishing as well
19:36:53 so having those lessons learned behind us could help
19:37:08 clarkb: happy to ... just i talked about things with yolanda and she was at the time actively working on things, but if that is no longer the case, cool
19:37:10 jobs on trusty!
19:37:36 #link https://review.openstack.org/121931
19:37:55 that'll be ready to merge as soon as i'm done confirming the remaining bare-trusty image updates complete
19:38:10 #link https://etherpad.openstack.org/p/py34-transition
19:38:38 that's getting whittled down, though there are still a number of outstanding changes linked there which need to merge, and other projects which still need fixes
19:38:59 a few have yet to be investigated
19:39:15 the big ones are related to broken things in py34, which makes this a bit difficult
19:39:19 overall the majority of our working and voting python33 jobs run well under 3.4 as well
19:39:42 but yeah, we do need at least one ubuntu sru to the python3.4 package in trusty
19:39:55 fungi: is that in progress?
19:40:20 i believe the ubuntu package maintainer has not yet triaged the bug
19:40:26 the bug is filed, lifeless noted it is a good sru candidate, but unsure of where to go from there
19:40:34 hunt down the package maintainer?
19:40:42 with torches
19:40:44 yell at zulcss?
19:41:12 now, now... we don't want to make zul ragequit
19:41:29 but yeah, i'll try to help get it more visibility
19:41:34 i think he likes being yelled at
19:41:42 at least, that's what mordred told me
19:41:57 it's currently impacting oslo.messaging's unit tests on 3.4
19:42:09 segfault in the interpreter, even
19:42:13 fungi: and potentially any ubuntu software run on py3.4
19:42:19 right
19:42:27 since it is a subtle gc bug, figuring out all the affected things is hard
19:42:48 you should say since it's a segfault, it might be a security bug.
19:42:49 #link https://launchpad.net/bugs/1367907
19:42:51 Launchpad bug 1367907 in python "Segfault in gc with cyclic trash" [Unknown,Fix released]
19:43:05 (for those not wanting to dig it out of the etherpad)
19:44:09 might be a stretch to tease code execution out of an improper cast in the gc, but denial of service is a possibility i suppose
19:44:43 anything else?
19:44:45 also it's happening on teardown looks like
19:45:26 nah, that covers current state for getting rid of the py3k-precise nodes, but not sure what the current state is for the other outstanding precise migration needs
19:45:53 at some point we can hopefully at least simplify if not remove the custom parameter function
19:46:00 clarkb: maybe we can check on that for next week?
19:46:18 #topic Manila project renaming (fungi, bswartz)
19:46:49 check on the bug?
19:47:28 clarkb: other parts of the precise->trusty transition
19:47:47 for scheduling the manila project move, i have stuff going on this weekend (wife's birthday, inlaws visiting) and also won't be around thursday, so unless we want to do the manila rename friday i'll have to bow out. otherwise we punt to next week
19:48:04 i could do friday
19:48:16 friday is good here as well
19:48:31 okay, let's say friday then... early afternoon pst?
19:48:41 things are less insane than in recent weeks, we can probably swing it with only a minor disruption in service.
19:48:42 or late morning pst?
19:49:19 early afternoon works for me if it works for you, fungi
19:49:25 19:00 utc good?
19:49:39 or maybe 20:00 so it doesn't hit lunch?
19:50:05 20:30?
19:50:06 ++ to 2000
19:50:10 or 2030
19:50:13 20:30's good
19:50:20 (since it takes a bit to prepare)
19:50:34 i'll send an e-mail to the -dev ml to give everyone including manila devs a heads up
19:50:41 #agreed rename manila at 20:30 utc on friday sept 17
19:50:46 oops
19:50:48 #agreed rename manila at 20:30 utc on friday sept 19
19:51:05 * jeblair just remembered undo
19:51:09 hah
19:51:16 #topic Fedora/Centos testing updates (ianw 09-16-2014)
19:51:25 hey, we can skip most of this
19:51:31 f20-bare nodes merged, thanks
19:51:35 i'll keep an eye on them
19:51:48 got a d-i-b update. still working on the centos7 images in d-i-b
19:52:03 cool, and we're obviously not quite ready to use it anyway
19:52:18 i am told that HP have production-ready centos7 images, so i will be keeping an eye on that and hoping to bring up nodes there when it's ready
19:52:36 that's all for that
19:52:40 #topic Nodepool min-ready issues (ianw 09-16-2014)
19:53:00 #link https://review.openstack.org/#/c/118939/
19:53:04 has 2 +2s
19:53:07 is the only holdup with this change just review backlog? if a different approach is wanted, i can work on it
19:53:21 so we could probably merge it at will
19:53:42 if anyone wants to review it, do so soon, otherwise i'll merge it, say, tomorrow?
19:53:58 and maybe we can slip in a friday nodepool restart
19:54:26 ok, i'll watch out for updates
19:54:38 #topic Log Download (ianw 09-16-2014)
19:54:53 #link https://review.openstack.org/#/c/120317/
19:54:57 so i really would like to download a bundle of logs when debugging gate failures
19:55:16 is that review on the right track, or would we rather see it done some other way?
19:55:26 sdague: ^ fyi
19:55:37 wget --recursive overriding robots.txt kind of sucks
19:55:50 jhesketh points out that it should perhaps be included in os-loganalyze
19:55:52 and sends down uncompressed logs
19:55:55 I would like to discuss if it fits within os-loganalyze
19:56:15 which kind of makes sense to me, since we're really looking at that as our interface to the logs now
19:56:15 which has started diverging from just log markup
19:56:38 well, it raises the question of whether it should be doing that, but I'm not sure we want to get into that discussion
19:57:38 well, we've already made that choice
19:58:01 it seems like a reasonable fit, and a reasonable feature request
19:58:02 so is the general conclusion to move it in as a feature of os-loganalyze?
19:58:18 in my opinion, yeah
19:58:21 ianw: can you look into whether that makes sense?
19:58:27 this is a crazy idea so maybe ignore me, but what if the tests ship a tarball only
19:58:38 then loganalyze can serve from within that? that doesn't deal with swift well
19:58:40 nevermind
19:59:08 why doesn't it deal with swift well?
19:59:15 ok, i'll look at putting it in there
19:59:27 fungi: because we would have to retrieve the entire tarball to get a single file
19:59:38 (or at least potentially the whole file)
19:59:46 fungi: which will only make the slowness worse
19:59:47 oh, i get it. yeah, without local caching that's probably badbadbadness
20:00:05 time
20:00:25 thanks to jhesketh for being here!
20:00:26 thanks everyone; we'll move topics we didn't get to to the top of the agenda next time
20:00:30 #endmeeting
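
As background on the wget approach mentioned under the Log Download topic, the workaround being described is roughly the following (the log URL is a made-up placeholder); it has to ignore robots.txt and pulls every file uncompressed, which is part of why folding a download feature into os-loganalyze looked more attractive:

# Recursively mirror one job's log directory, ignoring robots.txt.
# The URL is a placeholder for a real logs.openstack.org job path.
wget --recursive --no-parent --no-host-directories -e robots=off \
    --reject 'index.html*' \
    http://logs.openstack.org/CHANGE/PATCHSET/JOB/
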