16:01:16 #startmeeting Fuel
16:01:17 Meeting started Thu Apr 24 16:01:16 2014 UTC and is due to finish in 60 minutes. The chair is vkozhukalov. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:18 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:20 The meeting name has been set to 'fuel'
16:01:34 Who is here. Checking in.
16:01:44 Hello!
16:01:51 Hi!
16:02:07 agenda is here
16:02:12 #link https://etherpad.openstack.org/p/fuel-weekly-meeting-agenda
16:02:15 Hi!
16:02:20 Hi!
16:02:31 #topic Announcements
16:02:48 mihgen was going to say some words
16:03:05 #topic current status (overall), bugs statistics
16:03:14 hi all. we are in a phase of acting bug squashes
16:03:29 with still a few exceptions on mandatory for 5.0 features
16:03:38 so let's talk about bugs first.
16:04:06 after consuming icehouse, I expected even more bugs, frankly, but looks like we are pretty good
16:04:19 over last week there were 53 income bugs
16:04:26 and 94 went away
16:04:38 great
16:04:50 it means that we are doing good job on squashing them. the bad side of the story is that we still have a lot of bugs unresolved
16:05:07 closed 266 in 5.0, still open - 174
16:05:09 is there a link where we can see the summary?
16:05:18 #link http://fuel-launchpad.mirantis.com/project/fuel/bug_table_for_status/Open/5.0
16:05:33 I'm using this to see stats, it gathers stats from launchpad
16:05:47 and a plot
16:05:51 #link http://fuel-launchpad.mirantis.com/project/fuel/bug_trends/5.0
16:06:06 another thing is that we must still support stable/4.1 branch
16:06:25 hi, everyone
16:06:29 folks, please propose changes there, which have tag "backports-4.1.1" in LP
16:06:42 and by proposing them try not to break it :)
16:07:00 we broke stable/4.1 last night, I hope the fix is on the way
16:07:04 aglarendil: right?)
16:07:21 mihgen: yep.
there was some miscommunication
16:07:23 angdraug: there were no rabbit3, but we merged puppet module manifests relying on it
16:07:53 looking forward for it being fixed soon.
16:08:02 Ok, about exceptions
16:08:06 yeah, I missed the part about ha-mode in the rabbitmq3 ttl commit
16:08:31 #chair vkozhukalov
16:08:32 Current chairs: vkozhukalov
16:08:34 folks will provide exact status, but we have basically fuel upgrades feature still incomplete and few things around it
16:08:50 that's it for update.
16:08:56 forgot to set a chair )
16:09:04 anything else I should mention about?
16:09:20 vkozhukalov: let's proceed otherwise
16:09:27 moving on
16:09:43 #topic Activities updates & questions
16:09:49 and thanks everyone for such heroic work on squashing bugs and staying late, but ensuring that ISO works!!!
16:09:57 #topic fuel master upgrade scripts status
16:10:02 evgeniyl: your time)
16:10:08 evgeniyl, you first then I'll talk
16:10:17 ok
16:10:42 mattymo: please wait before I change a topic
16:11:00 *till
16:11:57 So, I had a lot of problems like https://github.com/dotcloud/docker-py/pull/200 with docker-py, it's a binding for docker, today I had working nailgun/nginx/astute, and had a problem with rabbitmq, then I took new iso and images, suddenly there was broken postgresql, I fixed it, not I'm trying to continue to work on rabbit..
16:12:20 s/not/now/
16:12:43 evgeniyl: how far are you from finishing this?
16:12:55 broken psql?
16:13:27 dpyzhov: postgresql container was broken, I had to take oldest one.
16:14:09 mihgen: I hope today it will work with additional unmerged patches for docker-py
16:14:27 evgeniyl: ok, looking forward to see it working ....
16:14:36 anything else you want to add?
16:14:48 evgeniyl, still using new postgresql version, right?
16:15:12 meow-nofer_: it's not related with new postgresql
16:15:19 meow-nofer_: there was just broken container
16:15:38 evgeniyl: thanks for the status
16:15:44 moving on?
16:16:03 #topic master node containerization current issues and caveats
16:16:08 caveats :)
16:16:09 mattymo: your turn
16:16:13 We had a number of recurring bugs that really needed to get addressed by the QA team which required fast iso building and distribution, so we split up into two halves this week.
16:16:28 I built containers using early scripts and got containers out to QA and found bugs while adidenko and holser_ worked on makefile improvements and polishing container workflow.
16:17:03 bugs discovered during this time include intermittent network issues, masquerading of source ips (affects rsyslog), and other intermittent connection issues
16:17:09 Accomplished items this week include puppet flow bugs with OSTF, yaml placement and symlink issues, and we found a workaround for issues with docker inter-container communication via iptables.
16:17:41 and on behalf of adidenko and holser_, we have functional scripts that integrate well into our Fuel ISO preparation which run puppet inside a container for each service, then export and compress in a 280mb tar.lrz file and bundle it in the iso and deploys beautifully (But a bit slowly)
16:18:30 mattymo: does it mean that we actually have everything in place for building ISO, but just need to fix bugs?
16:18:43 I believe our latest work items include testing the workaround for docker ICC breakage and log rotation
16:18:57 yes I believe we're mergable today
16:19:00 ICC = inter-container communication
16:19:22 mattymo: ok, that's cool
16:19:30 let's get folks to review
16:19:50 mattymo: and when evgeniyl's code is ready, we will simply replace bash scripts with his code?
16:20:03 actually I would review some Go code, if you have it :)
16:20:55 no
16:20:58 his code relies on mine
16:21:19 mattymo: any other details? moving on?
16:21:20 evgeniyl's system currently does orchestration, I believe, to migrate data from 1 container to another, or to simply replace
16:21:50 dockerctl still gets used for starting, stopping, and doing all the pre/post hooks necessary to get each service up. It ought to be rewritten in python completely
16:21:56 evgeniyl: I thought it's for bootstrap of master too
16:23:11 ok, moving on
16:23:13 mihgen: we can use my code, but it's not stable enough, and I don't think that we will have time to test it properly
16:23:25 guts, I have a question on make iso
16:23:34 s/guts/guys/
16:23:43 alex_didenko: go ahead
16:24:20 do we need an option to make iso with docker and without? Something like variable USE_DOCKER=true/false
16:24:24 evgeniyl: I meant your code to be used in bootstrap_admin_node.sh
16:24:48 so if we export USE_DOCKER=false and run "make iso" - it builds the old-style Fuel iso
16:24:58 I'm not sure
16:25:04 dpyzhov: do we need it?
16:25:10 mihgen: why not?
16:25:19 mihgen: it looks like a good idea
16:25:23 alex_didenko: we have such flag
16:25:23 +1
16:25:36 well ideally I'd like to avoid additional axis in matrix of things which should work
16:25:53 so to work always on production ISO
16:25:56 I think we ought to leave a fallback route just in case we find a critical bug we can't overcome
16:25:59 but default to docker
16:25:59 right now we use PRODUCTION=prod for old-fashion isos and PRODUCTION=docker for docker
16:26:07 unless there are very good points saying no
16:26:14 dpyzhov, has the route I wanted to use too
16:26:20 just because it's a precedent
16:26:27 mattymo: that's the only exception I can see so far
16:26:38 but moving forward, I don't think we need it
16:26:55 if we work on non-containers mode, and release in containers, we are always with risk
16:26:55 I think we should remove it when docker option becomes stable and proven
16:27:00 of breaking stuff
16:27:10 angdraug: ok, then I'd agree
16:27:24 but we are having some issues with iso
building performance
16:27:26 +1 to "docker only" once it's stable
16:27:28 alex_didenko: how much resources does it take to support both
16:27:34 and bootstrapping of master node AFAIK
16:27:53 mihgen: it should not take much time
16:27:59 few hours maybe
16:28:10 how much do we pay now for containers?
16:28:21 a lot, I guess
16:28:23 how much time does it take to build ISO, and install master node
16:28:38 in comparison with what we had before
16:28:50 it’s a huge difference
16:28:50 to build all containers is an extra 30 minutes.. maybe more or less
16:29:06 building ISO with docker containers on 1 CPU VM with 2G ram: Took 54 min
16:29:06 if we had a build system with AUFS support it would be much faster
16:29:19 but we build and deploy with devicemapper which is far slower
16:29:53 mattymo: what if I update just nailgun
16:30:01 install of master node is bottlenecked in docker load. it replays each layer in an lxc env, dumps it, adds new layer, etc...
16:30:02 mattymo: would not I have to rebuild all containers?
16:30:26 mihgen, it's hard to tell what code is "just nailgun" with a basic utility
16:30:46 what do you mean under basic utility?
16:30:49 but yes if you make changes to nailgun code you need a new nailgun container
16:31:06 mattymo: then I would need to rebuild only nailgun container, right?
16:31:10 mattymo: is it possible to update pre-built container?
16:31:13 and can download all other existing
16:31:17 or if you change nailgun::venv puppet class, that only affects nailgun container.
but if you change nailgun::packages or nailgun::supervisor class, you have to rebuild all
16:31:18 please also not that you can build ISO with prebuilt containers - it will take much less time since it just puts archive with containers on ISO
16:31:25 dpyzhov: that's the question which I had too
16:31:27 s/not/note
16:31:33 alex_didenko, but it won't update the package
16:32:01 ok, we need to talk about it separately
16:32:08 we would need to come up with improvements
16:32:18 anyway, otherwise we slow up development
16:32:24 dpyzhov, yes if we script it to, for sure
16:32:45 mattymo: anything else to add?
16:33:38 mihgen, nope. Let's move on
16:33:41 #topic icehouse support
16:33:49 aglarendil: your topic ;)
16:34:12 I am here
16:34:21 we have almost everything done, except several bugs
16:34:38 fortunately almost all of them are related to packages
16:34:55 nice!
16:35:03 unfortunately for rvyalov )
16:35:19 1) neutron gre datapath is not working in centos 6.5 - IPGRE demux is compiled in and blocks OVS
16:35:51 2) mysql reconnect patches from oslo are still not in the main code for some projects so we are waiting for hardening team to do this
16:36:20 there have been also several bugs with neutron that we have successfully fixed
16:36:40 I hope we can fix everything by the end of the week
16:37:03 aglarendil: cool
16:37:05 that's all, I think
16:37:07 aglarendil: optimist -)
16:37:11 nurla: do you have anything to add?
16:37:22 regarding icehouse?
16:37:49 mihgen: i'll hope "no"
16:37:54 )
16:38:04 moving on
16:38:04 thanks. vkozhukalov - let's move on
16:38:21 #topic status of versioning for master upgrade
16:38:35 Upgrade part for Nailgun (propagation of versions and orchestrator parameters) is about to be merged. Some additional validation and cosmetic fixes are to be made – can be added as separate PR or included into this one. Doc: https://etherpad.openstack.org/p/upgrades-orchestrator-data. It will be merged when library part (dilyin) is ready.
16:39:10 akasatkin: can you share links to patchsets involved?
16:39:19 akasatkin: we really should move asap with this
16:39:37 mattymo: evgeniyl: can we survive without these fixes ?
16:39:40 Nailgun: https://review.openstack.org/#/c/87722/
16:40:00 dilyin started library part today
16:40:19 he is not here unfortunately
16:40:24 how hard is his part?
16:40:40 did he provide any estimates?
16:40:58 aglarendil: are you aware about dilyin's part there? ^^
16:41:00 I suppose it shouldn't be hard. See etherpad
16:41:08 mihgen: yep
16:41:12 for details
16:41:21 https://etherpad.openstack.org/p/upgrades-orchestrator-data.
16:41:53 ok. vkozhukalov - let's move on
16:42:03 #topic vcenter status
16:42:22 nurla: will you provide status or someone else on ^^
16:43:08 i can provide some updates
16:43:29 ykotko: are you around? what about testing of vcenter integration?
16:43:36 eshumakher: please go ahead
16:43:58 mihgen: we have problems with our env IT problems, but Egor already create some issues
16:44:38 what about Ubuntu? Are you gonna test it?
16:44:40 as I catch we also should check ubuntu
16:44:47 and VNC?
16:45:16 I see an issue that VMs do not get IPs
16:45:33 the problem is in environment setup
16:45:35 yes, this problems with vlans
16:46:12 does it mean, that we can only test that deploy phase passes?
16:46:29 we still able to create VMs, right? we should be able to get VNC access still
16:46:35 even if network is down for VM
16:47:05 yes, we able to create instance
16:47:06 by listening traffic, we can still try to see if there are DHCP requests trying to get out of vmware host
16:47:31 ok, any more updates on this? are there any items we still have to add?
16:47:38 (except docs)
16:48:02 what about HA?
16:48:29 are we gonna leave it the way it is done now?
16:48:30 eshumakher: we don't get Roman point
16:48:36 about HA
16:48:51 I sent an email with a question how we gonna handle
16:49:06 there were no response, let's create bug for disabling it then
16:49:14 we can revert it any time otherwise
16:49:27 it's not gonna be HA, so I'm strongly against enabling it
16:49:54 this "HA" version was about having nova-network only on one controller, which is SPoF obviously
16:49:56 +1
16:50:15 eshumakher: will you create a bug? or please ask someone to do so..
16:50:27 ok, i will
16:50:34 eshumakher: thanks
16:50:39 will Egor test Ubuntu and VNC?
16:50:47 ykotko: will you?)
16:51:13 eshumakher: anything else to discuss?
16:51:19 nope
16:51:27 nurla: I hope ykotko will find time to test ubuntu & vnc ..
16:51:36 vkozhukalov: let's move on
16:51:38 Do we need to open 'open discussion' topic?
16:51:51 I have a few bugs which I'd like to raise here
16:51:57 and clarify status / get folks involved
16:52:05 #topic Open discussion
16:52:14 but I would first listen if someone else has any other topic
16:52:26 so please put "?" and let us know if you do
16:53:01 no one so far? Ok. #1277844 Corosync doesn't stop during the primary controller deployment #link https://bugs.launchpad.net/fuel/+bug/1277844
16:53:03 aglarendil: ^^
16:53:15 please take a look what's our plan over it
16:53:20 angdraug: xenolog you too
16:53:27 I am not quite aware if we can fix this bug in this release
16:53:47 it requires rewriting of start sequence and usage of pacemaker master control plugin
16:53:59 also it is not quite frequently reproducible
16:54:16 i suggest to move it to 5.1 as soon as Centos 7.0 is available
16:54:19 ok, understood. we will likely slip it then
16:54:24 then we can easily
16:54:34 move to dependency-based initialisation
16:54:46 thanks. next - we have bunch of disk -related issues
16:54:59 vkozhukalov:
16:55:00 #1296985 ceph-deploy osd prepare failed.
GenericError: Failed to create 1 OSDs #link https://bugs.launchpad.net/fuel/+bug/1296985
16:55:03 this for example
16:55:13 vkozhukalov: will you do it or we should pass it to someone else?
16:55:32 1306491 no disk information leading to error during node allocation #link https://bugs.launchpad.net/fuel/+bug/1306491
16:55:40 there are more I believe
16:55:57 is #1296985 reproducible?
16:56:10 mihgen, I believe ikalnitsky is now working on disk issue
16:56:44 angdraug: should we close #1267937 No warning for HA with OSD and MON roles on same nodes #link https://bugs.launchpad.net/fuel/+bug/1267937, as there is bp about it https://blueprints.launchpad.net/fuel/+spec/fuel-ceph-roles ?
16:56:54 meow-nofer_: I'm not sure about that
16:57:01 yes, the fix on review already: #link https://review.openstack.org/#/c/89813/ (it's about disks)
16:57:14 mihgen: will take a look at https://bugs.launchpad.net/fuel/+bug/1296985
16:57:41 ikalnitsky: meow-nofer_ that's completely from another story, but it's good to see this being fixed too
16:57:57 vkozhukalov: please de-assign all issues you don't work on them
16:58:07 mihgen: the fuel-ceph-roles BP doesn't fix the docs part of #1267937
16:58:09 vkozhukalov: we will try to find folks to fix.. .
16:58:23 angdraug: ok then should we keep a bug as is in 5.0?
16:58:38 or we need to close it, create separate for docs? or simply move to 5.1?
16:58:45 mihgen: ok
16:58:56 angdraug: we should do something with bug in 5.0 :)
16:59:02 2 minutes
16:59:03 I think we should fix it in the docs for 5.0
16:59:05 mihgen, this is exactly workaround for this issue)
16:59:13 that's why I keep it around and assigned to me
16:59:32 aglarendil: 1274756 Expired Keystone tokens should be cleaned up regularly #link https://bugs.launchpad.net/fuel/+bug/1274756
16:59:38 let's move to #fuel-dev
16:59:43 aglarendil: what are we up to with this?
16:59:57 thanks everyone
16:59:57 we are going to implement memcache as a backend for keystone tokens
17:00:06 thanks
17:00:11 #endmeeting Fuel