16:01:19 #startmeeting Fuel
16:01:20 Meeting started Thu Sep 11 16:01:19 2014 UTC and is due to finish in 60 minutes. The chair is vkozhukalov. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:21 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:23 The meeting name has been set to 'fuel'
16:01:27 #chair vkozhukalov
16:01:28 Current chairs: vkozhukalov
16:01:32 hey guys
16:01:41 hi
16:01:42 agenda as usual
16:01:52 #link https://etherpad.openstack.org/p/fuel-weekly-meeting-agenda
16:01:58 hi
16:02:11 so let's start
16:02:21 hi all
16:02:24 looks like most of us are here
16:02:35 #topic Patching of OpenStack: current status, remaining issues. (vkuklin)
16:02:42 hi everyone
16:02:47 hi all
16:02:49 aglarendil: hi
16:02:57 we merged almost all of the fixes for patching and rollback support
16:03:07 these fixes make patching and rollback work
16:03:20 but impose some restrictions on the upgrade process
16:03:27 Hi
16:03:47 #link https://bugs.launchpad.net/fuel/+bug/1367753
16:03:48 Launchpad bug 1367753 in fuel "[docs] Documentation for patching caveats and problems should be added " [High,Confirmed]
16:04:10 we are removing all the openstack packages prior to deployment of newer openstack code
16:04:40 and we are also patching nodes one by one in order to allow a user to interfere if anything goes wrong
16:04:57 right now we have only a couple of issues related to patching
16:05:21 #link https://bugs.launchpad.net/fuel/+bug/1364465
16:05:22 Launchpad bug 1364465 in fuel/6.0.x "[library] Neutron server is down after rollback: "Unable to load quantum from configuration file"" [High,In progress]
16:05:39 this bug fix is related to a centos packaging issue and the fix is under review
16:05:47 #link https://bugs.launchpad.net/fuel/+bug/1367785
16:05:49 Launchpad bug 1367785 in fuel "[upgrade] Openstack cluster update failed: /usr/bin/murano-db-manage --config-file=/etc/murano/murano.conf upgrade returned 1 instead of one of [0]" [High,In progress]
16:06:14 it is unclear to me whether it is going to be fixed today
16:06:25 but according to the comments it is not rocket science
16:06:31 and does not block other features except murano
16:07:06 so these are all the issues related to patching, I guess
16:07:14 correct me if I am mistaken
16:07:50 it looks like that's it
16:07:53 ok, looks not very optimistic
16:07:55 any questions
16:08:03 thanx aglarendil
16:08:29 if no questions moving on
16:08:35 I think we still can't call it a production feature though
16:08:41 still too many bugs, late moment
16:08:47 +1
16:08:52 we want to release next week, can't wait any longer
16:09:00 and also QA doesn't give a green flag
16:09:16 and it's unclear how many issues we might encounter next week with the feature :(
16:09:46 there was a conversation in openstack-dev ML, I've already marked issues with the "experimental" tag
16:10:03 which means that patching is experimental along with zabbix for 5.1
16:11:04 ok I think we should move on..
16:11:24 #topic HCF status (let's go over remaining issues) and 5.1 release proposed dates (mihgen)
16:11:47 Hello, I am working on the issue with HP driver
16:12:14 link?
16:12:18 folks we still have around 10 bugs which block us from HCF
16:12:38 I'd like to go over some of them - we should get only 5, no more
16:12:39 https://bugs.launchpad.net/fuel/+bug/1359331
16:12:40 Launchpad bug 1359331 in fuel "Add support for HP BL120/320 RAID controller line" [High,Confirmed]
16:13:00 asyriy: I thought we have decided that's gonna be documented only for 5.1 ?
16:13:11 yes,
16:14:41 ok. so let's go one by one over the other bugs which affect us
16:14:43 #link https://bugs.launchpad.net/fuel/+bug/1322230
16:14:44 Launchpad bug 1322230 in fuel "[library] ERR: ceph-deploy osd activate node-11:/dev/sdc4 returned 1 instead of one of [0]" [High,Confirmed]
16:14:52 this was reproduced again by a QA engineer
16:15:01 I hope angdraug can take a look at this today
16:15:10 rmoe is looking into this
16:15:19 angdraug: ok, cool
16:15:20 as we speak actually
16:15:51 angdraug: one more on you -
16:15:54 #link https://bugs.launchpad.net/fuel/+bug/1331554
16:15:56 Launchpad bug 1331554 in fuel "[library] Raise max file descriptors and process file limits for radosgw" [High,Triaged]
16:16:04 yes, I think I'm going to need help from msemenov_ here
16:16:35 we need to repackage apache and radosgw with init scripts and user assignments that resolve the nofile and nproc limit issues
16:16:54 may be too disruptive for this late in the process
16:17:00 thoughts on postponing to 6.0?
16:17:16 how does it affect the user?
16:17:24 angdraug: no other workarounds possible?
16:17:36 if there are 24 or more cores on the system, radosgw may refuse to start
16:18:03 let's then document it and ask the user to modify the init script for radosgw
16:18:25 agreed
16:18:29 yeah I think we can add a note to the release notes about this limitation..
16:18:45 we can make a work-around for centos - just put a file into /etc/security/limits.d/ with nproc for "apache" user
16:18:46 angdraug: let's mark as won't fix for 5.1 then, not just move to 6.0
16:18:56 or otherwise we gonna forget to document
16:19:09 I've added release-notes tag to the bug
16:20:05 ok. moving on
16:20:09 #link https://bugs.launchpad.net/fuel/+bug/1360230
16:20:11 Launchpad bug 1360230 in fuel "glance issue on ubuntu/gre/ceph/simple" [High,Confirmed]
16:20:33 this is a tricky issue which can't be easily reproduced as I understand
16:20:40 reported by akasatkin
16:20:45 VBox only. ddmitriev and xenolog are looking into it.
16:20:47 not sure about the latest status. aglarendil, xenolog ?
16:21:04 it's 100% reproducible on particular configurations
16:21:37 akasatkin: mihgen we could not reproduce it with our system tests
16:21:42 it is really tricky
16:21:48 ddmitriev: xenolog Any estimates?
16:21:49 it's VBox only
16:21:57 aglarendil: tricky == vbox really only?
16:22:08 or it's possible to repro with other virt
16:22:09 not sure )
16:22:09 or hw
16:22:11 if it's really vbox only it's Medium, not High
16:22:26 mihgen: I have never heard of this bug being reproduced anywhere but VBox
16:22:36 it's not reproduced on libvirt, but only on vbox
16:23:55 ddmitriev: you tried exactly the same configuration?
16:24:09 in libvirt? I'm wondering if we can move it to 6.0..
16:24:10 mihgen: I can't answer now. This issue is reproduced only on virtualbox and not on every try
16:24:35 mihgen: yes, the same configuration several times on different servers
16:24:55 ok. so if you all confirm that it's only vbox, then let's close as won't fix in 5.1 and track in 6.0
16:25:07 let's move on
16:25:09 #link https://bugs.launchpad.net/fuel/+bug/1367001
16:25:11 Launchpad bug 1367001 in fuel "fuel allows you to remove all controllers and add new controllers in the same task" [High,In progress]
16:25:16 aglarendil: https://review.openstack.org/#/c/120741/
16:25:29 aglarendil: folks are looking for your +1 there
16:25:41 there is a comment left "Vladimir, so it is ok, if user will be able to delete all controllers, except 1, and deploy another set of controllers in the same task?"
16:25:50 xarses: ^
16:26:14 yeah it was reported by xarses, so xarses pls take a look too
16:26:35 I +1'ed it but xarses should also take a look
16:26:48 aglarendil: thx
16:26:51 #link https://bugs.launchpad.net/fuel/+bug/1367234
16:26:54 Launchpad bug 1367234 in fuel/5.1.x "[library] Logs from MongoDB server should be written to a separate file instead of syslog" [High,In progress]
16:27:08 this one is fixed by tzn
16:27:19 but we are waiting for Fuel CI for stable branches
16:27:26 then we should merge it. It's already in master
16:27:32 #link https://bugs.launchpad.net/fuel/+bug/1367340
16:27:33 Launchpad bug 1367340 in fuel/5.0.x "Ensure reliable cleanup of resources in 5.0.X" [High,In progress]
16:27:46 Ok I will look
16:28:01 aglarendil: do you know anything on it?
16:28:09 it's on dilyin but he is not in the room
16:28:27 let me summon him
16:29:17 dilyin: please comment on https://bugs.launchpad.net/fuel/+bug/1367340
16:29:19 Launchpad bug 1367340 in fuel/5.0.x "Ensure reliable cleanup of resources in 5.0.X" [High,In progress]
16:29:58 Ok.
16:30:47 We have logic inside our corosync service provider that should clean up service errors before the service is started if there are some errors.
16:32:08 dilyin: so what about it?
16:32:10 Previously we were using it to work around errors during the deployment of the second and third controller
16:32:48 Now we don't need it anymore because we have changed the mode our ha cluster works.
16:33:18 But the code that should clean up any pacemaker errors and start the service is still very useful.
16:34:00 During the testing of our patching there was a situation when pacemaker services could not start because they were in error state.
16:34:35 dilyin: > Now we don't need it anymore because we have changed the mode our ha cluster works ---doesn't it mean that we can safely postpone it to 6.0?
16:34:53 Theoretically such a situation should never happen, but if it does happen these errors should have been cleaned up and services should start anyway
16:35:11 but for some reason this code did not work and deployment failed
16:35:36 so i have inserted duplicate cleanup to ensure that any errors will be cleaned up in any situation
16:36:00 mihgen: briefly
16:36:13 mihgen: I think we can leave it as is until we get a reproducer
16:36:16 in 5.0.x
16:36:35 "as is" == confirmed?
16:36:55 These patches should not be harmful i hope. And it would be better to accept them because they can fix some random errors that could appear for some reason
16:37:25 that's a lot of "maybe"
16:37:31 angdraug: yep
16:37:46 so what decision should we make? how safe is the patch?
16:37:58 should we apply it to master only, or to both stable/5.1 & master ?
16:38:12 let's test these patches and if there are no problems accept them
16:38:31 dilyin: Fuel CI tests are not enough?
16:38:32 let's test them with custom tests and bvt
16:38:44 they may affect other scenarios
16:38:45 Fuel CI does full HA deployment
16:38:58 to master... no. we should refactor our corosync module a lot in future. it's really a mess
16:39:01 let's just test it tomorrow and decide
16:39:15 ok I see.
16:39:24 #link https://bugs.launchpad.net/fuel/+bug/1367776
16:39:25 Launchpad bug 1367776 in fuel "[library] ovs bond should be deployed with lacp_time fast" [High,In progress]
16:39:39 xenolog prepared a fix
16:39:50 we need to test it
16:39:51 I've requested the reporter to take a look and test if possible on hw
16:40:01 no response from him yet
16:40:06 let's move on then
16:40:14 we do not have a lot of time
16:40:46 so what would our decision be now?
16:41:01 mihgen: wait for tomorrow at least
16:41:12 if there are no results - postpone it
16:41:12 I tested the fix on a simple env with bond: interfaces created successfully.
16:41:13 system tests should test the fix for situations without bonds.
16:41:52 ok let's wait for the EOD PST
16:42:03 if no response is given, I move it to 6.0
16:42:18 it actually doesn't affect much, and is rather medium than high..
16:42:35 18 minutes and 4 other topics
16:42:37 mihgen: let's wait for tomorrow. we have 18 minutes left
16:43:14 Ok we also have two bugs on UI, but those are trivial - to enable experimental mode for patching and disable 5.0/5.0.1/5.0.2 in new deployments for non-experimental after upgrade
16:43:21 vkozhukalov: sorry, let's move on
16:43:31 #topic Upstream Neutron and ML2 Support (xarses)
16:43:48 ml2-neutron - updated from newer puppet neutron. but had to back out (2) commits that make it require puppetlabs/mysql 2.2 or greater.
16:43:48 #link https://review.openstack.org/#/c/103280/
16:43:48 It's rebased onto current master and I am working through some bugs related to the rebase.
16:43:48 There are still a number of commits that were made to our neutron manifests that need to be evaluated if they are included in the rebase
16:44:22 any chance of finishing this today?
16:44:32 poor at this rate
16:45:02 in that case you'll need to document your progress so far for xenolog to take over while you're on vacation
16:45:11 of course
16:45:28 moving on?
16:46:30 if no questions
16:46:33 vkozhukalov: let's move on, yeah
16:46:43 #topic Upstream manifests merging (alex_didenko)
16:46:58 so in 5.1 we've updated "stdlib" and applied essential compatibility patches in "mysql" and "keystone" which allow us to use old and new openstack puppet modules simultaneously
16:47:05 removed unused puppet modules like: nagios, mmm, squid, git, keepalived, puppetdb, selinux
16:47:10 OS modules synced with upstream (4.0.0 for Icehouse): cinder, glance, heat, nova
16:47:15 a part of dependency modules synced with upstream like: firewall, apt, ssh, concat, lvm, memcached
16:47:20 in progress and on review OS modules: keystone (need to update "swift" first), neutron, ceilometer
16:47:26 we tried to make minimal changes in the modules synced from upstream and move all our customizations into our own modules/classes (like "openstack::cinder") where it's possible
16:47:28 done :)
16:47:48 are we reporting/proposing our fixes back to upstream on an ongoing basis?
16:47:52 alex_didenko: are we pushing them back upstream?
16:48:08 alex_didenko: excellent, thanks!
16:48:18 not really, only 1 fix was pushed upstream atm
16:48:45 moving the changes into our modules is going to cause the same issues as we had with out of sync modules. We do work, and it gets lost, or conflicted
16:48:53 there are not many fixes, mostly applying our logic to existing manifests
16:49:09 as long as it's fuel specific and not working around upstream bugs, fine
16:49:22 yeah folks let's discuss it in ML
16:49:23 if we have to work around bugs or missing features in upstream, we should at least report them
16:49:28 ideally in the future we should have all our login in our modules and patch upstream ones only with improvements and fixes
16:49:33 mihgen: +1
16:49:53 ok moving on
16:49:56 then we'll be able to push a lot of those changes into upstream
16:50:06 #topic Juno support (mattymo)
16:50:15 all our login = all our logic
16:50:41 mattymo: around?
16:50:44 yes
16:50:57 alex_didenko, +1
16:51:04 I recently did some deployments with the MOS team on juno-2 milestone. We found a lot of fun dependency issues, and are taking them out relatively quickly
16:51:19 we have almost everything deploying, except glance, murano, and heat (more dependency issues)
16:51:33 There were just 3 bugs in puppet that had to be addressed, which is nice
16:51:34 mattymo: neutron?
16:51:36 I already proposed making wrappers around upstream manifests.
16:51:38 yes neutron works
16:51:50 mattymo: excellent
16:51:53 But today MOS team is rebasing on juno-3 packages, so we'll have to backpedal a bit and retest for dependency bugs
16:52:03 It's moving along well, and all Fuel Master components work fine on Juno libs
16:52:13 (that's it for me)
16:52:14 mattymo: are we starting to land changes into fuel-library master?
16:52:18 mihgen, not yet
16:52:32 is master/6.0 ci fully operational now?
16:52:32 mihgen, there's a number of patches ready to merge whenever core reviewers would feel generous enough to hit +2
16:52:40 mattymo: thanks, good progress. So for changes - they should be backward compatible with icehouse
16:52:44 then Fuel CI will do +1
16:53:09 CI and mirrors to rvyalov and bookwar_
16:53:10 mihgen, not sahara. Sahara renames the service from sahara-api to sahara-all
16:53:38 general note: let's not forget about kilo, and have a package repo & ci jobs for that as soon as stable/juno is branched
16:54:03 angdraug: yes
16:54:21 angdraug: kilo is too early if we can't get fast juno yet
16:54:21 mihgen: we are yet to define openstack mirrors versioning scheme
16:54:33 not saying right now, but we should plan for it
16:54:34 5 minutes
16:54:37 bookwar_: omg I thought it's resolved
16:54:40 1 topic
16:54:52 let's discuss issues in ML then please bookwar_
16:54:59 if broader opinion needed
16:55:10 #topic Blueprints under question
16:55:22 #link https://blueprints.launchpad.net/fuel/+spec/blank-role-node
16:56:11 #link https://blueprints.launchpad.net/fuel/+spec/cpu-overcommit-setting
16:56:20 #link https://blueprints.launchpad.net/fuel/+spec/coreos-fuel-master (mattymo)
16:56:28 well these are not for 4 minutes
16:56:46 mihgen: agree
16:56:49 we might want to bring those to discuss during Fuel Meetup..
16:56:49 I like the idea of coreos, but bp is mostly boilerplate and needs to be updated
16:56:50 I just wanted my blueprint to get some public visibility and welcome any comments and suggestions
16:57:08 for blank-role-node, +1 for the idea..
16:57:16 not sure how easy it is though
16:57:16 yes, coreos is tempting but fitting it into 6.0 is going to be a problem
16:57:24 angdraug: +1
16:57:25 blank-role-node -- should be simple enough
16:57:39 ditto overcommit
16:57:42 my 2c :)
16:57:52 one more time about september Fuel meetup
16:58:04 #link https://etherpad.openstack.org/p/fuel-september-meetup-2014
16:58:14 everyone is welcome to join
16:58:34 it is likely to be held at the Mirantis office
16:58:43 615 National Ave, Mountain View
16:58:59 thanks
16:59:03 ok, thanx everyone
16:59:07 great meeting
16:59:14 thanks
16:59:15 #endmeeting