16:01:19 <vkozhukalov> #startmeeting Fuel
16:01:20 <openstack> Meeting started Thu Sep 11 16:01:19 2014 UTC and is due to finish in 60 minutes.  The chair is vkozhukalov. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:21 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:23 <openstack> The meeting name has been set to 'fuel'
16:01:27 <vkozhukalov> #chair vkozhukalov
16:01:28 <openstack> Current chairs: vkozhukalov
16:01:32 <vkozhukalov> hey guys
16:01:41 <vkramskikh> hi
16:01:42 <vkozhukalov> agenda as usual
16:01:52 <vkozhukalov> #link https://etherpad.openstack.org/p/fuel-weekly-meeting-agenda
16:01:58 <agordeev> hi
16:02:11 <vkozhukalov> so let's start
16:02:21 <mihgen> hi all
16:02:24 <vkozhukalov> looks like most of us are here
16:02:35 <vkozhukalov> #topic Patching of OpenStack: current status, remaining issues. (vkuklin)
16:02:42 <aglarendil> hi everyone
16:02:47 <mattymo> hi all
16:02:49 <vkozhukalov> aglarendil: hi
16:02:57 <aglarendil> we merged almost all of the fixes for patching and rollback support
16:03:07 <aglarendil> these fixes make patching and rollback work
16:03:20 <aglarendil> but impose some restrictions on the upgrade process
16:03:27 <meow-nofer__> Hi
16:03:47 <aglarendil> #link https://bugs.launchpad.net/fuel/+bug/1367753
16:03:48 <uvirtbot> Launchpad bug 1367753 in fuel "[docs] Documentation for patching caveats and problems should be added " [High,Confirmed]
16:04:10 <aglarendil> we are removing all the openstack packages prior to deployment of newer openstack code
16:04:40 <aglarendil> and we are also patching nodes one by one in order to allow a user to interfere if anything goes wrong
16:04:57 <aglarendil> right now we have only a couple of issues related to patching
16:05:21 <aglarendil> #link https://bugs.launchpad.net/fuel/+bug/1364465
16:05:22 <uvirtbot> Launchpad bug 1364465 in fuel/6.0.x "[library] Neutron server is down after rollback: "Unable to load quantum from configuration file"" [High,In progress]
16:05:39 <aglarendil> this bug fix is related to centos packaging issue and the fix is under review
16:05:47 <aglarendil> #link https://bugs.launchpad.net/fuel/+bug/1367785
16:05:49 <uvirtbot> Launchpad bug 1367785 in fuel "[upgrade] Openstack cluster update failed: /usr/bin/murano-db-manage --config-file=/etc/murano/murano.conf upgrade returned 1 instead of one of [0]" [High,In progress]
16:06:14 <aglarendil> it is unclear to me whether this one is going to be fixed today
16:06:25 <aglarendil> but according to the comments it is not rocket science
16:06:31 <aglarendil> and does not block other features except murano
16:07:06 <aglarendil> so these are all the issues related to patching, I guess
16:07:14 <aglarendil> correct me if I am mistaken
16:07:50 <aglarendil> it looks like that's it
16:07:53 <vkozhukalov> ok, looks not very optimistic
16:07:55 <aglarendil> any questions
16:08:03 <vkozhukalov> thanx aglarendil
16:08:29 <vkozhukalov> if no questions moving on
16:08:35 <mihgen> I think we still can't call it a production feature though
16:08:41 <mihgen> still too many bugs, late moment
16:08:47 <angdraug> +1
16:08:52 <mihgen> we want to release next week, can't wait any longer
16:09:00 <mihgen> and also QA doesn't give a green flag
16:09:16 <mihgen> and it's unclear how many issues we might encounter next week with the feature :(
16:09:46 <mihgen> there was a conversation in openstack-dev ML, I've marked issues by "experimental" tag already
16:10:03 <mihgen> which means that patching is experimental along with zabbix for 5.1
16:11:04 <mihgen> ok I think we should move on..
16:11:24 <vkozhukalov> #topic HCF status (let's go over remaining issues) and 5.1 release proposed dates (mihgen)
16:11:47 <asyriy> Hello, I am working on the issue with HP driver
16:12:14 <angdraug> link?
16:12:18 <mihgen> folks we still have around 10 bugs which block us from HCF
16:12:38 <mihgen> I'd like to go over some of them - we should get only 5, no more
16:12:39 <asyriy> https://bugs.launchpad.net/fuel/+bug/1359331
16:12:40 <uvirtbot> Launchpad bug 1359331 in fuel "Add support for HP BL120/320 RAID controller line" [High,Confirmed]
16:13:00 <mihgen> asyriy: I thought we had decided that's gonna be documented only for 5.1 ?
16:13:11 <asyriy> yes,
16:14:41 <mihgen> ok. so let's go one by one other bugs which affect us
16:14:43 <mihgen> #link https://bugs.launchpad.net/fuel/+bug/1322230
16:14:44 <uvirtbot> Launchpad bug 1322230 in fuel "[library] ERR: ceph-deploy osd activate node-11:/dev/sdc4 returned 1 instead of one of [0]" [High,Confirmed]
16:14:52 <mihgen> this was reproduced again by QA engineer
16:15:01 <mihgen> I hope angdraug can take a look at this today
16:15:10 <angdraug> rmoe is looking into this
16:15:19 <mihgen> angdraug: ok, cool
16:15:20 <rmoe> as we speak actually
16:15:51 <mihgen> angdraug: one more on you -
16:15:54 <mihgen> #link https://bugs.launchpad.net/fuel/+bug/1331554
16:15:56 <uvirtbot> Launchpad bug 1331554 in fuel "[library] Raise max file descriptors and process file limits for radosgw" [High,Triaged]
16:16:04 <angdraug> yes, I think I'm going to need help from msemenov_ here
16:16:35 <angdraug> we need to repackage apache and radosgw with init scripts and user assignments that resolve the nofile and nproc limit issues
16:16:54 <angdraug> may be too disruptive for this late in the process
16:17:00 <angdraug> thoughts on postponing to 6.0?
16:17:16 <mihgen> how does it affect the user?
16:17:24 <aglarendil> angdraug: no other workarounds possible?
16:17:36 <angdraug> if there's 24 or more cores on the system, radosgw may refuse to start
16:18:03 <aglarendil> let's then document it and ask the user to modify init script for radosgw
16:18:25 <angdraug> agreed
16:18:29 <mihgen> yeah I think we can add a note to release note about this limitation..
16:18:45 <alex_didenko> we can make a work-around for centos - just put a file into /etc/security/limits.d/ with nproc for "apache" user
16:18:46 <mihgen> angdraug: let's mark as won't fix for 5.1 then, not just move to 6.0
16:18:56 <mihgen> or otherwise we gonna forget to document
16:19:09 <angdraug> I've added release-notes tag to the bug
16:20:05 <mihgen> ok. moving on
16:20:09 <mihgen> #link https://bugs.launchpad.net/fuel/+bug/1360230
16:20:11 <uvirtbot> Launchpad bug 1360230 in fuel "glance issue on ubuntu/gre/ceph/simple" [High,Confirmed]
16:20:33 <mihgen> this is a tricky issue which can't be easily reproduced as I understand
16:20:40 <mihgen> reported by akasatkin
16:20:45 <akasatkin> VBox only. ddmitriev and xenolog are looking into it.
16:20:47 <mihgen> not sure about the latest status. aglarendil, xenolog ?
16:21:04 <akasatkin> it's 100% reproducible on particular configurations
16:21:37 <aglarendil> akasatkin: mihgen we could not reproduce it with our system tests
16:21:42 <aglarendil> it is really tricky
16:21:48 <mihgen> ddmitriev: xenolog Any estimates?
16:21:49 <akasatkin> it's VBox only
16:21:57 <mihgen> aglarendil: tricky == vbox really only?
16:22:08 <mihgen> or it's possible to repro with other virt
16:22:09 <akasatkin> not sure )
16:22:09 <mihgen> or hw
16:22:11 <angdraug> if it's really vbox only it's Medium, not High
16:22:26 <aglarendil> mihgen: I have never heard of this bug being reproduced anywhere other than VBox
16:22:36 <ddmitriev> it's not reproduced on libvirt, but only on vbox
16:23:55 <mihgen> ddmitriev: you tried exactly same configuration?
16:24:09 <mihgen> in libvirt? I'm wondering if we can move it to 6.0..
16:24:10 <xenolog> mihgen: I can't answer now. This issue is reproduced only on virtualbox, and not on every try
16:24:35 <ddmitriev> mihgen: yes, the same configuration several times on different servers
16:24:55 <mihgen> ok. so if you all confirm that it's only vbox, then let's close as won't fix in 5.1 and track in 6.0
16:25:07 <mihgen> let's move on
16:25:09 <mihgen> #link https://bugs.launchpad.net/fuel/+bug/1367001
16:25:11 <uvirtbot> Launchpad bug 1367001 in fuel "fuel allows you to remove all controllers and add new controllers in the same task" [High,In progress]
16:25:16 <mihgen> aglarendil: https://review.openstack.org/#/c/120741/
16:25:29 <mihgen> aglarendil: folks are looking for your +1 there
16:25:41 <mihgen> there is comment left "Vladimir, so it is ok, if user will be able to delete all controllers, except 1, and deploy another set of controllers in the same task?"
16:25:50 <angdraug> xarses: ^
16:26:14 <mihgen> yeah it was reported by xarses, so xarses pls take a look too
16:26:35 <aglarendil> I +1'ed it but xarses should also take a look
16:26:48 <mihgen> aglarendil: thx
16:26:51 <mihgen> #link https://bugs.launchpad.net/fuel/+bug/1367234
16:26:54 <uvirtbot> Launchpad bug 1367234 in fuel/5.1.x "[library] Logs from MongoDB server should be written to a separate file instead of syslog" [High,In progress]
16:27:08 <mihgen> this one is fixed by tzn
16:27:19 <mihgen> but we are waiting for Fuel CI for stable branches
16:27:26 <mihgen> then we should merge it. It's already in master
16:27:32 <mihgen> #link https://bugs.launchpad.net/fuel/+bug/1367340
16:27:33 <uvirtbot> Launchpad bug 1367340 in fuel/5.0.x "Ensure reliable cleanup of resources in 5.0.X" [High,In progress]
16:27:46 <xarses> Ok I will look
16:28:01 <mihgen> aglarendil: do you know anything on it?
16:28:09 <mihgen> it's on dilyin but he is not in the room
16:28:27 <aglarendil> let me summon him
16:29:17 <aglarendil> dilyin: please comment on https://bugs.launchpad.net/fuel/+bug/1367340
16:29:19 <uvirtbot> Launchpad bug 1367340 in fuel/5.0.x "Ensure reliable cleanup of resources in 5.0.X" [High,In progress]
16:29:58 <dilyin> Ok.
16:30:47 <dilyin> We have logic inside our corosync service provider that should clean up service errors before the service is started, if there are any errors.
16:32:08 <mihgen> dilyin: so what about it?
16:32:10 <dilyin> Previously we were using it to work around errors during the deployment of the second and third controller
16:32:48 <dilyin> Now we don't need it anymore because we have changed the mode our ha cluster works.
16:33:18 <dilyin> But the code that should clean up any pacemaker errors and start the service is still very useful.
16:34:00 <dilyin> During the testing of our patching there was a situation when pacemaker services could not start because they were in an error state.
16:34:35 <mihgen> dilyin: > Now we don't need it anymore because we have changed the mode our ha cluster works ---doesn't it mean that we can safely postpone it to 6.0?
16:34:53 <dilyin> Theoretically such a situation should never happen, but if it does happen these errors should have been cleaned up and services should start anyway
16:35:11 <dilyin> but for some reason this code did not work and the deployment failed
16:35:36 <dilyin> so I have inserted a duplicate cleanup to ensure that any errors will be cleaned up in any situation
16:36:00 <aglarendil> mihgen: briefly
16:36:13 <aglarendil> mihgen: I think we can leave it as is until we get a reproducer
16:36:16 <aglarendil> in 5.0.x
16:36:35 <mihgen> "as is" == confirmed?
16:36:55 <dilyin> These patches should not be harmful, I hope. And it would be better to accept them because they can fix some random errors that could appear for some reason
16:37:25 <angdraug> that's a lot of "maybe"
16:37:31 <aglarendil> angdraug: yep
16:37:46 <mihgen> so what decision should we make? how safe is the patch?
16:37:58 <mihgen> should we apply it to master only, to both stable/5.1 & master ?
16:38:12 <dilyin> let's test these patches and if there are no problems accept them
16:38:31 <mihgen> dilyin: Fuel CI tests are not enough?
16:38:32 <aglarendil> let's test them with custom tests and bvt
16:38:44 <aglarendil> they may affect other scenarios
16:38:45 <mihgen> Fuel CI does full HA deployment
16:38:58 <dilyin> to master... no. we should refactor our corosync module a lot in future. it's really a mess
16:39:01 <aglarendil> let's just test it tomorrow and decide
16:39:15 <mihgen> ok I see.
16:39:24 <mihgen> #link https://bugs.launchpad.net/fuel/+bug/1367776
16:39:25 <uvirtbot> Launchpad bug 1367776 in fuel "[library] ovs bond should be deployed with lacp_time  fast" [High,In progress]
16:39:39 <mihgen> xenolog prepared a fix
16:39:50 <aglarendil> we need to test it
16:39:51 <mihgen> I've asked the reporter to take a look and test if possible on hw
16:40:01 <mihgen> no response from him yet
16:40:06 <aglarendil> let's move on then
16:40:14 <aglarendil> we do not have a lot of time
16:40:46 <mihgen> so what our decision would be now?
16:41:01 <aglarendil> mihgen: wait for tomorrow at least
16:41:12 <aglarendil> if there are no results - postpone it
16:41:12 <xenolog> I tested the fix on a simple env with a bond: interfaces were created successfully.
16:41:13 <xenolog> system tests should test the fix for situations without bonds.
16:41:52 <mihgen> ok let's wait for the EOD PST
16:42:03 <mihgen> if no response is given, I move it to 6.0
16:42:18 <mihgen> it actually doesn't affect much, and is rather medium than high..
16:42:35 <vkozhukalov> 18 minutes and 4 other topics
16:42:37 <aglarendil> mihgen: let's wait for tomorrow. we have 18 minutes left
16:43:14 <mihgen> Ok we also have two bugs on UI, but those are trivial - to enable experimental mode for patching and disable 5.0/5.0.1/5.0.2 in new deployments for non-experimental after upgrade
16:43:21 <mihgen> vkozhukalov: sorry, let's move on
16:43:31 <vkozhukalov> #topic Upstream Neutron and ML2 Support (xarses)
16:43:48 <xarses> ml2-neutron - updated from newer puppet neutron, but had to back out (2) commits that make it require puppetlabs/mysql 2.2 or greater.
16:43:48 <xarses> #link https://review.openstack.org/#/c/103280/
16:43:48 <xarses> It's rebased onto current master and I am working through some bugs related to the rebase.
16:43:48 <xarses> There are still a number of commits that were made to our neutron manifests that need to be evaluated to see whether they are included in the rebase
16:44:22 <angdraug> any chance of finishing this today?
16:44:32 <xarses> poor at this rate
16:45:02 <angdraug> in that case you'll need to document your progress so far for xenolog to take over while you're on vacation
16:45:11 <xarses> of course
16:45:28 <angdraug> moving on?
16:46:30 <vkozhukalov> if no questions
16:46:33 <mihgen> vkozhukalov: let's move on, yeah
16:46:43 <vkozhukalov> #topic Upstream manifests merging (alex_didenko)
16:46:58 <alex_didenko> so in 5.1 we've updated "stdlib" and applied essential compatibility patches in "mysql" and "keystone" which allow us to use old and new openstack puppet modules simultaneously
16:47:05 <alex_didenko> removed unused puppet modules like: nagios, mmm, squid, git, keepalived, puppetdb, selinux
16:47:10 <alex_didenko> OS modules synced with upstream (4.0.0 for Icehouse): cinder, glance, heat, nova
16:47:15 <alex_didenko> a part of dependency modules synced with upstream like: firewall, apt, ssh, concat, lvm, memcached
16:47:20 <alex_didenko> in progress and on review OS modules: keystone (need to update "swift" first), neutron, ceilometer
16:47:26 <alex_didenko> we tried to do minimal changes in the synced from upstream modules and move all our customizations into our own modules/classes (like "openstack::cinder") where it's possible
16:47:28 <alex_didenko> done :)
16:47:48 <angdraug> are we reporting/proposing our fixes back to upstream on ongoing basis?
16:47:52 <xarses> alex_didenko: are we pushing them back upstream?
16:48:08 <mihgen> alex_didenko: excellent, thanks!
16:48:18 <alex_didenko> not really, only 1 fix was pushed upstream atm
16:48:45 <xarses> moving the changes into our modules is going to cause the same issues as we had with out-of-sync modules. We do work, and it gets lost, or conflicted
16:48:53 <alex_didenko> there are not many fixes, mostly applying our logic to existing manifests
16:49:09 <angdraug> as long as it's fuel specific and not working around upstream bugs, fine
16:49:22 <mihgen> yeah folks let's discuss it in ML
16:49:23 <angdraug> if we have to work around bugs or missing features in upstream, we should at least report them
16:49:28 <alex_didenko> ideally in the future we should have all our login in our modules and patch upstream ones only with improvements and fixes
16:49:33 <angdraug> mihgen: +1
16:49:53 <vkozhukalov> ok moving on
16:49:56 <alex_didenko> then we'll be able to push a lot of those changes into upstream
16:50:06 <vkozhukalov> #topic Juno support (mattymo)
16:50:15 <alex_didenko> all our login = all our logic
16:50:41 <vkozhukalov> mattymo: around?
16:50:44 <mattymo> yes
16:50:57 <xenolog> alex_didenko, +1
16:51:04 <mattymo> I recently did some deployments with the MOS team on juno-2 milestone. We found a lot of fun dependency issues, and are taking them out relatively quickly
16:51:19 <mattymo> we have almost everything deploying, except glance, murano, and heat (more dependency issues)
16:51:33 <mattymo> There were just 3 bugs in puppet that had to be addressed, which is nice
16:51:34 <mihgen> mattymo: neutron?
16:51:36 <xenolog> I have already proposed making wrappers around upstream manifests.
16:51:38 <mattymo> yes neutron works
16:51:50 <mihgen> mattymo: excellent
16:51:53 <mattymo> But today MOS team is rebasing on juno-3 packages, so we'll have to backpedal a bit and retest for dependency bugs
16:52:03 <mattymo> It's moving along well, and all Fuel Master components work fine on Juno libs
16:52:13 <mattymo> (that's it for me)
16:52:14 <mihgen> mattymo: are we starting to land changes into fuel-library master?
16:52:18 <mattymo> mihgen, not yet
16:52:32 <angdraug> is master/6.0 ci fully operational now?
16:52:32 <mattymo> mihgen, there's a number of patches ready to merge whenever core reviewers would feel generous enough to hit +2
16:52:40 <mihgen> mattymo: thanks, good progress. So for changes - they should be backward compatible with icehouse
16:52:44 <mihgen> then Fuel CI will do +1
16:53:09 <mihgen> CI and mirrors to rvyalov and bookwar_
16:53:10 <mattymo> mihgen, not sahara. Sahara renames the service from sahara-api to sahara-all
16:53:38 <angdraug> general note: let's not forget about kilo, and have a package repo & ci jobs for that as soon as stable/juno is branched
16:54:03 <rvyalov> angdraug: yes
16:54:21 <mihgen> angdraug: kilo is too early if we can't get fast juno yet
16:54:21 <bookwar_> mihgen: we are yet to define openstack mirrors versioning scheme
16:54:33 <angdraug> not saying right now, but we should plan for it
16:54:34 <vkozhukalov> 5 minutes
16:54:37 <mihgen> bookwar_: omg I thought it's resolved
16:54:40 <vkozhukalov> 1 topic
16:54:52 <mihgen> let's discuss issues in ML then please bookwar_
16:54:59 <mihgen> if broader opinion needed
16:55:10 <vkozhukalov> #topic Blueprints under question
16:55:22 <vkozhukalov> #link https://blueprints.launchpad.net/fuel/+spec/blank-role-node
16:56:11 <vkozhukalov> #link https://blueprints.launchpad.net/fuel/+spec/cpu-overcommit-setting
16:56:20 <vkozhukalov> #link https://blueprints.launchpad.net/fuel/+spec/coreos-fuel-master (mattymo)
16:56:28 <mihgen> well these are not for 4 minutes
16:56:46 <vkozhukalov> mihgen: agree
16:56:49 <mihgen> we might want to bring those to discuss during Fuel Meetup..
16:56:49 <xarses> I like the idea of coreos, but the bp is mostly boilerplate and needs to be updated
16:56:50 <mattymo> I just wanted my blueprint to get some public visibility and welcome any comments and suggestions
16:57:08 <mihgen> for blank-role-node, +1 for the idea..
16:57:16 <mihgen> not sure how easy it is though
16:57:16 <angdraug> yes, coreos is tempting but fitting it into 6.0 is going to be a problem
16:57:24 <xarses> angdraug: +1
16:57:25 <angdraug> blank-role-node -- should be simple enough
16:57:39 <angdraug> ditto overcommit
16:57:42 <angdraug> my 2c :)
16:57:52 <vkozhukalov> one more time about september Fuel meetup
16:58:04 <vkozhukalov> #link https://etherpad.openstack.org/p/fuel-september-meetup-2014
16:58:14 <vkozhukalov> everyone is welcome to join
16:58:34 <vkozhukalov> it will most likely be held at the Mirantis office
16:58:43 <vkozhukalov> 615 National Ave, Mountain View
16:58:59 <mihgen> thanks
16:59:03 <vkozhukalov> ok, thanx everyone
16:59:07 <vkozhukalov> great meeting
16:59:14 <mihgen> thanks
16:59:15 <vkozhukalov> #endmeeting