16:02:09 #startmeeting Fuel
16:02:10 o/
16:02:11 Meeting started Thu Nov 20 16:02:09 2014 UTC and is due to finish in 60 minutes. The chair is kozhukalov. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:02:12 long time fan, first time attendee
16:02:12 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:02:14 The meeting name has been set to 'fuel'
16:02:21 #chair kozhukalov
16:02:22 Current chairs: kozhukalov
16:02:28 agenda
16:02:36 #link https://etherpad.openstack.org/p/fuel-weekly-meeting-agenda
16:02:51 #topic Announcements (mihgen)
16:02:57 hi folks
16:03:08 our schedule remains unchanged:
16:03:10 #link https://wiki.openstack.org/wiki/Fuel/6.0_Release_Schedule
16:03:24 we have a bit more than one week left before hard code freeze
16:04:18 I was looking through our bugs targeting 6.0 and the rate of incoming ones, and from experience I can say we are moving toward yellow status rather than green
16:04:37 the main reason is that we are still getting a large flow of incoming bugs, even while we are closing a lot
16:05:08 mihgen, do we have any stats on how bug squashing went last week?
16:05:17 any statistics available?
16:05:54 so the question is how we can do better here. I believe we have to work very closely with QA and get their broken envs for further analysis / troubleshooting right away, to close the gap between a bug report and its fix
16:06:04 the stats I have are just the common ones
16:06:08 #link http://fuel-launchpad.mirantis.com/project/fuel/bug_trends/6.0
16:06:21 I've run through LP with scripts and manually too
16:06:42 fuel-library did a very good job on bug squashing / triaging
16:06:53 though the bug count is still growing
16:06:54 Hi there!
16:07:25 Especially dangerous for me are bugs in fuel-devops, as any failure in infra causes delays in our processes
16:07:46 so every smoke/BVT failure should always be treated with the highest attention
16:08:23 I think that's it for what I wanted to share on 6.0 bugs.
16:08:38 mihgen: to me it looks like the total number of bugs is growing, but the number of open bugs is not
16:09:14 not exactly, they are being quickly moved to confirmed -> committed, and this is very, very good
16:09:22 thanks to those who keep watching new bugs
16:09:39 ok, let's do our best here
16:09:40 Ok, for features in upcoming releases, I see a number of email threads on openstack-dev discussing design approaches, and it is awesome
16:10:25 let's continue the discussions and keep getting blueprints filled in. Once we are less loaded with bugs, we need to switch to design proposals
16:10:52 that's it about 6.0 & upcoming, any comments?
16:11:41 kozhukalov: let's move on?
16:12:13 #topic 5.1.1 release status (angdraug)
16:12:17 the primary focus in 5.1.1 is critical bugs
16:12:22 #link https://bugs.launchpad.net/mos/+bug/1392261
16:12:31 if we can get this and all pending kernel patches landed tomorrow, we can still have code freeze this week
16:12:37 and we still need more reviews for the release notes!
16:13:06 anyone from osci here?
16:13:11 or mos-linux?
16:13:32 angdraug: today there were several landings to stable/5.1
16:14:29 dburmistrov: any updates from your side on 5.1.1 ?
16:14:31 afaict the kernel wasn't one of them
16:15:14 angdraug: we added the latest Ubuntu security updates to 5.1.1
16:15:31 https://review.openstack.org/#/c/135561/ (fix for #1392261) fails CI
16:16:07 dburmistrov: yup, but that didn't include the kernel, right?
16:16:23 this is an experimental feature, so we should not count on this one
16:16:25 #1373459 is still in progress for 5.1.1
16:16:51 angdraug: this one might block the scale lab though; the changeset to master landed, it needs to be merged to stable/5.1 now
16:16:53 https://review.openstack.org/#/c/135024/
16:16:54 angdraug: yes, it doesn't include the kernel
16:16:57 mihgen: that doesn't mean we should leave it completely broken when we already have a patch up for review
16:17:29 I'm worried about https://bugs.launchpad.net/fuel/+bug/1371104, VMs are losing connectivity and their IPs after HA failover
16:17:47 not reproduced for 6.0, we should triple-check it for 5.1.1
16:18:08 could be mlx specific
16:18:44 angdraug: not so sure.
16:18:53 if it is, we can't hold back 5.1.1 for that
16:19:00 It's unclear how this one affects us: https://bugs.launchpad.net/fuel/+bug/1394137 MySQL backend readiness checks should rely on TCP connectivity instead of sockets
16:19:32 I don't see a clear message on how exactly it affects UX
16:20:19 yes, I asked the same question in the comments
16:20:31 mihgen: I think we can ask holser, but he is not here
16:20:34 It's not critical in my opinion
16:20:57 We have lived without it for quite a while
16:21:06 so it's High there. I'm not sure it has to be in 5.1.1 specifically
16:21:40 tzn: yep, that was my point. I'm afraid that at the beginning we could introduce more issues with it than we solve
16:21:47 tzn: do you know why holser and bdobrelia are not here?
16:22:02 bogdando is bdobrelia
16:22:13 oops, right
16:22:18 so my point is that we should reconsider targeting 5.1.1
16:22:22 for this bug.
16:22:41 we should reconsider its priority first
16:22:44 and that's pretty much all the important/unclear bugs we have
16:23:05 if it's not a regression, it's Medium at best
16:23:13 thanks
16:23:23 thanx angdraug
16:23:27 ok, let's leave comments there again… and decide
16:23:31 moving on?
16:23:35 yes please
16:23:48 #topic Debian porting status (zigo)
16:23:56 Hi there!
16:24:19 I've been doing a lot of repackaging, and for that I opened git repos on GitHub.
16:24:25 As I needed to have one git repo per package.
16:24:35 Lots of stuff is already uploaded and present in Debian sid.
16:25:00 For Ruby dependencies, I uploaded: ruby-cstruct ruby-raemon ruby-rethtool ruby-symboltable
16:25:06 For the rest, I have worked out packages for:
16:25:17 zigo: wait, shouldn't those already exist?
16:25:21 cobbler fuel-agent fuel-library nailgun-agent python-fuel-agent-ci python-fuelmenu python-invoke python-rudolf
16:25:21 fabric fuel-astute fuel-nailgun python-daemonize python-fuelclient python-invocations python-network-checker python-shotgun python-xmlbuilder
16:25:28 mihgen: They were *not*.
16:25:29 I mean those are pretty general-purpose Ruby libs
16:25:33 whoops ok
16:25:38 mihgen: But they were not in Debian.
16:25:57 Right now I'm working on a hardware discovery bootstrap image.
16:26:10 It almost worked until today, when it couldn't find /sbin/init; not sure why ...
16:26:11 sounds pretty cool, though I didn't know we've got that much stuff .. )
16:26:16 Though all dependencies are already built.
16:26:39 I believe I'll need help testing that ISO image once it boots again.
16:26:44 ummm.. Debian-based bootstrap. Good to hear that we are moving towards that
16:26:57 Can anyone volunteer to test that on an already installed Fuel setup ?
16:27:10 kozhukalov, when you have a senior devel with a wide open schedule, miracles can happen
16:27:18 zigo, yeah I'll help you get it up and working
16:27:20 zigo: you can ping me
16:27:25 Cool, thanks guys.
16:27:48 mattymo: don't be jealous :)
16:27:50 As for the rest, I need to finish the cobbler package, which is working in theory but needs a bit of polishing.
16:27:53 I am jealous!
16:27:59 That's it.
16:28:04 Any questions, remarks, offers of help?
16:28:25 None, ok ...
16:28:26 thx zigo, the question would be how many of those repos should be on stackforge
16:28:27 Then just one thing.
16:28:28 remark
16:28:28 zigo, that's the hardest part by far. Getting the mcollective portion done is going to be very time consuming as well, so focus there first
16:28:33 just let us know what you need help with. this is awesome
16:28:42 Yeah, that's what I wanted to ask !
16:28:47 Could we have everything moved to stackforge ?
16:29:04 Obviously, I'm not the one to do that, because it will break the current build.
16:29:04 zigo: not sure we want to move everything there
16:29:11 But maybe we can set a deadline to do it?
16:29:14 it's worth discussing in email
16:29:15 After 5.1.1 is out maybe?
16:29:26 is it relevant to do that for every sub-package?
16:29:26 we have good news about Ironic integration, and it looks like we'll be able to deliver a zero-step implementation by 6.1, so I am not sure we really need a cobbler package
16:29:28 zigo, will that version of Debian be used for our purposes?
16:29:29 Like, before working on 6.0?
16:29:37 xarses: YES ! :)
16:29:54 integration with the current code is a different question, though. Like with image-based provisioning,
16:30:01 kozhukalov: Well, it's almost done, so never mind! :)
16:30:05 we have to prepare a very, very smooth path
16:30:19 in order not to break what we've got already
16:30:22 Yeah, I'm curious to know what path we'll be taking, and the schedule.
16:30:29 it'll disrupt our upgrade path for sure, so that's a future discussion
16:30:34 FYI, I already have Ironic ready for both Debian and Ubuntu, so we could use that.
16:30:48 kozhukalov: we need to talk about what happens when cobbler is gone. It does a lot of things for us that we will have to start doing ourselves if we remove it
16:30:56 mattymo: Should we schedule to break everything after 5.1.1?!?
16:31:05 zigo, after 6.0, please
16:31:06 regardless of it doing provisioning
16:31:11 zigo: so I think all these questions should go to openstack-dev
16:31:15 mattymo: When will that be?
16:31:21 pls don't break things even after 6.0 ;)
16:31:32 zigo: you missed https://wiki.openstack.org/wiki/Fuel/6.0_Release_Schedule
16:31:37 Ok, so what's the plan to move to artefact building?
16:31:51 xarses: I don't agree; to me it looks like cobbler does very little for us
16:32:03 it's under discussion; hopefully we can manage to get some stuff for it into 6.1
16:32:04 Maybe we should just open a branch for 6.1, and work there?
16:32:19 zigo: you can always do the work in your GitHub fork
16:32:31 we're not yet ready to create stable/6.0
16:32:34 Yeah, I have no other way anyway.
16:32:34 though it is gonna be a nightmare to accept > 1000 lines of code
16:32:48 or you can work with patch sets in Gerrit that aren't merged
16:32:56 that's how we did Juno
16:32:59 regarding packaging repos, shouldn't that be fully upstreamed in debian.org repos?
16:33:01 I'm ok if we don't decide when to move to it yet.
16:33:10 +1 to mattymo on working with in-progress patches in Gerrit
16:33:19 and people can review in parallel
16:33:24 angdraug: That's what I'm doing. But the upstream repo should have a match (e.g. without the debian folder).
16:33:32 angdraug: Then I should just pull from that ...
16:33:43 zigo: we can put some things into review.fuel-infra.org git too
16:34:03 zigo: thx. Ok folks, should we move on?
16:34:10 I don't mind where, as long as we start separating things into smaller repos.
16:34:11 ok, guys, we still have a lot to discuss
16:34:12 folks, let's take it to the mailing list, there's too much to discuss
16:34:16 let's move on
16:34:17 Yeah, let's move on to the next topic! :)
16:34:36 #topic Changing the handling of /etc/puppet/modules on Fuel master (mattymo)
16:34:48 ok I have a few words prepared, and I hope we'll have some discussion
16:35:01 kozhukalov: it does dnsmasq, reserved DHCP, hostnames, TFTP/PXE boot,
16:35:21 So, this came as a result of some surprised looks (and comments) from people outside of the Fuel Library team. Host manifests were merged over a month ago, and now we have to reconsider our solution.
16:35:30 LP bug 1382531 solves the Puppet host manifest chicken-and-egg problem by using pre-packaged Puppet in /etc/puppet, with rsync then serving Puppet from the /puppet directory, which is /etc/puppet on the Fuel Master host.
16:35:42 (the code is still in review, along with the revert of host manifests)
16:36:18 The host manifests change was intended to give us an easy way to launch Docker containers using modified Puppet manifests without any dirty hacks.
16:36:29 In light of recent discussions, I've decided the only way to have a really effective Fuel Master CI test is to build fresh containers. This avoids breaking Fuel upgrade and rollback (even though there were alternatives). It will add time to our tests, but hopefully give better results.
16:36:58 and that code is in review now (not tested yet): https://review.openstack.org/#/c/136031/
16:37:34 mattymo: I feel it's a better approach now
16:38:06 #link https://bugs.launchpad.net/fuel/+bug/1382531
16:38:57 we've got a few places already where we had to do some hacks for upgrades
16:39:04 like ssh keys for instance
16:39:23 define dirty hacks please
16:39:25 that's an astute architectural problem, not related to dockerization
16:39:28 so our upgrade script is already growing with workarounds; adding one more doesn't seem to be a good option
16:39:37 mattymo: which one?
16:39:41 angdraug, env-specific ssh keys are stored in /var/lib/astute
16:39:43 mihgen, ^
16:39:45 so basically the current suggestion is fine with me
16:40:07 mattymo: yeah, and it is an astute issue, correct
16:40:32 is the rsync container stateful?
16:40:34 it should go into the postgres DB as well, or be stored/retrieved by nailgun
16:40:48 wow, guys
16:40:50 so my point is that we should not design something similar to what we have with ssh keys
16:40:53 the only stateful containers are cobbler, postgres, and astute
16:40:56 we have to store ssh keys outside the container
16:41:00 mattymo: yep, postgres is fine
16:41:20 ikalnitsky, you can back up containers. We have no backup scripts for "outside container" data
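[Editor's note: a minimal sketch, not from the meeting, of one way to do the host-side backup/restore of container data being debated here. The container name, volume path, and backup directory below are hypothetical; only basic docker run flags available in old Docker releases are used.]

```python
#!/usr/bin/env python
# Editor's sketch: back up / restore a Docker data volume to a tarball on
# the host, so stateless containers can be rebuilt and rolled back.
# Container name, volume path and backup dir are hypothetical examples.
import subprocess
import sys

DATA_CONTAINER = "fuel-core-postgres-data"  # hypothetical data container
VOLUME_PATH = "/var/lib/pgsql"              # volume exported by that container
BACKUP_DIR = "/var/backup/fuel"             # must already exist on the host

def run_tar(tar_args):
    # Throwaway busybox container that sees both the data volume
    # (via --volumes-from) and the host backup dir (via -v).
    subprocess.check_call(
        ["docker", "run", "--rm",
         "--volumes-from", DATA_CONTAINER,
         "-v", "%s:/backup" % BACKUP_DIR,
         "busybox", "tar"] + tar_args)

def backup():
    # Archive the data volume into a tarball on the host.
    run_tar(["czf", "/backup/pgsql.tar.gz", VOLUME_PATH])

def restore():
    # Rollback: unpack the saved tarball back into the volume.
    run_tar(["xzf", "/backup/pgsql.tar.gz", "-C", "/"])

if __name__ == "__main__":
    restore() if "restore" in sys.argv[1:] else backup()
```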
16:41:32 containers should be stateless
16:41:38 I think all our containers should be stateless
16:41:45 mattymo: that's not a problem
16:41:57 and we can have one single stateful place, or two
16:42:00 but not more
16:42:34 cobbler can become stateless if astute rebuilds all profiles on deploy and purges all old ones
16:42:38 that's trivial
16:42:41 astute as well
16:42:45 so it should always be clear and transparent; then we would not see bugs with upgrades all the time
16:42:47 but for postgresql we need some consensus
16:43:05 postgresql data should be put on the host node
16:43:13 aglarendil: +1
16:43:15 +1
16:43:18 aglarendil, and how do you propose a rollback? where is the old data?
16:43:27 mattymo: we can put astute/cobbler settings on the host node too
16:43:45 simply back your database up
16:43:47 mattymo: this is what having a VOL for the psql data is for, so it can be used in separate images
16:43:53 not even that
16:43:55 ikalnitsky, propose a rollback strategy for astute that works with host node data too
16:43:59 mattymo: we'd be better off writing a script for backing up some data from the host rather than writing workarounds inside the fuel_upgrade script
16:44:01 and if something is broken - just upload the old data
16:44:01 there have to be two storages in my opinion: the DB & a folder on the fs
16:44:02 it doesn't have to be on the host node
16:44:12 Docker can snapshot data just fine for us already
16:44:16 you can have multiple postgres db instances running in parallel, and pivot between them
16:45:00 ok, guys, what about having stateful storage containers for database/astute/etc and mounting them into our stateless containers?
16:45:01 we are not gonna reach consensus now, I think. But the topic is hot.
16:45:12 who can volunteer to start an email thread on it?
16:45:18 angdraug, it's a little tricky unless we decouple the data and the application, because once you create a container you can't change its exposed ports so easily (in antiquated versions of docker)
16:45:19 ikalnitsky: +1
16:45:20 We could use a host filesystem with built-in snapshotting.
16:45:21 It requires additional design, because in this case rollback is tricky: you have a db with the new schema, and you have to drop it and restore the db from backup
16:45:48 ok, let's leave this for the ML
16:45:52 moving on
16:45:53 we have to live with the fact that we may be stuck with docker 0.10 for the next several months
16:45:54 But in this case we can mount the volume to different directories and use the old method to upload the db.
16:46:06 mattymo: re the old version, we have to lose it soon, it's horrible
16:46:13 guys, stop it. Backing up files and a DB is not rocket science
16:46:14 #topic Lost commits during upstream manifests merge (mihgen)
16:46:28 aglarendil: it's not
16:46:29 yeah, I wanted to bring this topic up
16:46:39 there is an email on openstack-dev about it too
16:46:50 the thing is that we sync upstream modules
16:47:00 okay, Mike, I see the only issue with this was the keystone-related bug
16:47:06 and then start seeing regressions
16:47:15 aglarendil: nope, there were others
16:47:25 which ones?
16:47:35 so the thing is that we have to 1) go ahead and analyze how it happened
16:47:50 2) proactively check everything we merged again
16:47:56 it looks like we intentionally did not merge the keystone commit
16:48:04 3) figure out how we can safely sync new changes
16:48:13 aglarendil: a minor issue with the nova-api workers, so 2 regressions so far: in the keystone and nova modules
16:48:16 because it was breaking things before we merged the Juno keystone release
16:48:45 and keystone Juno was merged right before the summit
16:48:53 so far we just did not apply this fix
16:49:02 alex_didenko: you forgot the cinder glance_api_version regression
16:49:24 a few words on the keystone module - we started to sync+adapt it on Apr 3 and merged it on Nov 8
16:49:30 aglarendil: there was https://review.openstack.org/#/c/129918 for example
16:49:54 angdraug: thx, 3 regressions now :)
16:50:23 here is how it was merged: https://review.openstack.org/#/c/86007/
16:50:34 -1 from Fuel CI, and one +1
16:50:42 +3300 lines of code
16:50:47 Mike, you are pointing at the wrong one
16:51:03 that's the sync commit; it may not pass CI - that's OK
16:51:28 we do 2 commits: the 1st is the sync commit, the upstream module "as is"
16:51:47 the 2nd is the adaptation commit, which depends on the sync commit - it should pass CI
16:52:00 what do you mean by adaptation?
16:52:05 I think you are forgetting that before the summit there were basically no puppet-openstack modules that supported Juno
16:52:31 so the fact that we found problems when we merged the Juno release packages isn't a shock
16:52:34 adaptation - making the new upstream module work in Fuel
16:52:42 the Juno RC packages mostly worked
16:52:43 Mike, we established an upstream modules merge process.
16:52:34 I believe it's exactly at the 1st stage where you miss our tweaks / bug fixes made before
16:53:03 xarses: nope, it's not about that. See lost optimizations: https://review.openstack.org/#/c/129918
16:53:03 first, we merged the modules as-is and then applied the differences we needed
16:53:10 mihgen: you're wrong and you're right
16:53:23 ok, let's move to the ML then
16:53:25 tweaks and bugfixes should all be done in the adaptation commit
16:53:30 7 minutes and two topics
16:53:32 and yes, we missed them
16:53:32 I just want to say it's important
16:53:56 angdraug: that's what I'm trying to understand - how we can avoid missing them...
16:54:07 let's move this discussion to the ML. Please reply to my message
16:54:10 let's pay attention to that topic on the ML then
16:54:13 I will
16:54:19 ok
16:54:31 #topic image based provisioning (agordeev)
16:54:35 hi
16:55:00 image-based provisioning.
16:55:02 2 high-priority bug fixes landed just a few hours ago.
16:55:04 https://bugs.launchpad.net/fuel/+bug/1390492
16:55:05 https://bugs.launchpad.net/fuel/+bug/1391896
16:55:08 also, 2 new bugs appeared today. I'm not sure they really represent bugs in fuel.
16:55:09 I mean it is unclear and questionable whether we have a glitchy testing environment or a fuel problem. Investigating them.
16:55:11 https://bugs.launchpad.net/fuel/+bug/1394599
16:55:13 https://bugs.launchpad.net/fuel/+bug/1394617
16:55:15 otherwise, everything looks fine and no other complaints have come from QA since the last weekly meeting.
16:55:18 I'm done, thanks!
16:55:53 agordeev: thanx
16:55:56 any q?
16:56:01 I'm prepared for questions, if any
16:56:17 looks like no q
16:56:21 yeah, who else is being involved?
16:56:27 in the code reviews
16:56:32 apart from the two of you and dpyzhov
16:56:49 mihgen: QA folks too
16:57:05 not enough, we've got plenty of python manpower
16:57:24 let's figure out how to do knowledge transfer and involve them in the work as well
16:57:31 mihgen: they also reviewed
16:57:42 dshulyak1: and akasatkin
16:57:47 ok, good
16:57:57 #topic Ironic integration status (kozhukalov)
16:57:58 thx
16:58:07 we had a discussion with two Ironic cores today, and we agreed on our vision of a zero-step implementation of the Fuel Ironic driver
16:58:16 we are planning to come up with spec drafts for both Fuel and Ironic next week and have another much more detailed discussion
16:58:24 there is still no ML thread for that, just to make it more valuable once the spec drafts are ready
16:59:10 xarses: all that stuff about TFTP and DHCP is available to nailgun or encapsulated in the provisioning data, so Ironic does the same, and we even have a ready-to-use astute ironic driver
16:59:19 I am done
16:59:22 kozhukalov: if you can summarize in an email what the spec is going to be about, that would be better
16:59:31 than waiting until the spec is complete
16:59:51 I am talking about spec drafts
17:00:05 ok, looks like we have no more time
17:00:13 thanx everyone
17:00:21 #endmeeting
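[Editor's note: for reference, a minimal sketch of the two-commit upstream sync workflow discussed under the "Lost commits during upstream manifests merge" topic above. It illustrates how local tweaks could be listed before the "as is" sync commit so they can be re-checked in the adaptation commit; the script name, module paths, and arguments are illustrative assumptions, not an established Fuel tool.]

```python
#!/usr/bin/env python
# Editor's sketch: before syncing an upstream puppet module "as is", list
# the local fuel-library commits that touched that module since the
# previous sync, so none of the local tweaks / bug fixes get lost when
# the adaptation commit is prepared.
import subprocess
import sys

def local_changes(module_path, last_sync_ref):
    """One-line summaries of local commits on module_path made after
    last_sync_ref (e.g. the sha or tag of the previous sync commit)."""
    out = subprocess.check_output([
        "git", "log", "--oneline", "--no-merges",
        "%s..HEAD" % last_sync_ref, "--", module_path,
    ])
    return out.splitlines()

if __name__ == "__main__":
    # usage: check_lost_tweaks.py deployment/puppet/keystone <last-sync-ref>
    module, since = sys.argv[1], sys.argv[2]
    commits = local_changes(module, since)
    print("%d local commit(s) to re-check in the adaptation commit:" % len(commits))
    for line in commits:
        print("  " + line.decode("utf-8"))
```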