#openstack-meeting log

20:00:49 <zaneb> #startmeeting heat
20:00:50 <openstack> Meeting started Wed Jul 30 20:00:49 2014 UTC and is due to finish in 60 minutes.  The chair is zaneb. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:51 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:00:53 <openstack> The meeting name has been set to 'heat'
20:01:14 <zaneb> #topic roll call
20:01:18 <mspreitz> here
20:01:20 <tspatzier> hi
20:01:22 <stevebaker> */
20:01:22 <elynn_> o/
20:01:24 <jasond> o/
20:01:26 <tango> Hi
20:01:29 <jpeeler> hey
20:01:32 <skraynev> good evening :)
20:01:46 <stevebaker> shardy is on PTO
20:02:03 <zaneb> ryansb?
20:02:05 <stevebaker> SpamapS: you about?
20:02:45 <zaneb> #topic Review action items from last meeting
20:02:54 <zaneb> There weren't any \o/
20:03:03 <zaneb> #topic Adding items to the agenda
20:03:14 <zaneb> #link https://wiki.openstack.org/wiki/Meetings/HeatAgenda#Agenda_.282014-07-30_2000_UTC.29
20:03:51 <zaneb> anything to add to that?
20:03:51 <stevebaker> zaneb: I'd like to talk functional tests
20:03:57 <zaneb> ok
20:04:05 <tspatzier> wow, pretty full agenda this week
20:04:15 <zaneb> indeed
20:04:18 <mspreitz> functional tests = system tests = integration tests ?
20:04:40 <stevebaker> mspreitz: sort of ;)
20:04:49 <randallburt> present
20:04:53 <zaneb> mspreitz: = Tempest scenario tests
20:04:55 <zaneb> #topic Spec approval criteria
20:05:22 <zaneb> it has been brought to my attention that the spec approval criteria are too wooly ;)
20:05:42 <mspreitz> stevebaker: I do not understand why the answer is not "yes", but maybe it does not matter.  We certainly need more better in tempest
20:06:21 <zaneb> so, to be clear: I don't think we want the PTL to be the only approver
20:06:52 <zaneb> my goal as PTL is to *reduce* the number of places where the PTL is the single point of failure
20:07:10 <tspatzier> zaneb: I think some threshold rule before any core could +A
20:07:19 <tspatzier> ... would make sense
20:07:27 <skraynev> zaneb: agree
20:07:28 <zaneb> however, it makes people nervous when I just say "use your judgement on when something has had enough discussion to approve"
20:07:33 <stevebaker> mspreitz: functional testing and integration testing are conceptually a bit different, but practically speaking with heat there is huge overlap
20:08:05 <zaneb> so the proposal from asalkeld is that it would be OK to approve after 3x +2
20:08:31 <zaneb> obviously that wouldn't preclude using your judgement as well!
20:08:49 <tspatzier> I think that makes sense for average specs. For something like convergence maybe a bit more?
20:08:55 <zaneb> so you don't have to approve after 3x +2, but nobody will yell at you if you do
20:09:04 <stevebaker> that sounds fine, I've also been assuming that there is no barrier to start working on something which has yet to be approved
20:09:19 <tspatzier> stevebaker: +1
20:09:43 <zaneb> tspatzier: the problem with that is that people become nervous about whether they should approve at all
20:09:49 <zaneb> for anything
20:09:49 <skraynev> stevebaker: possibly it will be faster in some cases :)
20:10:05 <stevebaker> Although I hear some crazy companies insist that no work happens without an approved spec ;)
20:10:20 <zaneb> stevebaker: +1, although you have to accept the risk of your patch getting -2'd ;)
20:10:33 <stevebaker> zaneb: absolutely
20:10:57 <tspatzier> zaneb: agree. and I actually think that _had_ enough discussion. so we kind of reached threshold there. I was more thinking if something like this would come again ...
20:11:43 <zaneb> stevebaker: which is the Right Way IMO - the engineer working on it is best-placed to weigh the risk
20:12:20 <SpamapS> stevebaker: o/
20:12:24 <tspatzier> talking about specs, could someone look at https://review.openstack.org/#/c/98742 and see if this is good to go? I am especially looking at stevebaker; SpamapS had some opinion as well some time back.
20:12:25 <SpamapS> sorry lunch went long
20:12:28 <stevebaker> sounds like there are no objections to this plan
20:12:34 <zaneb> tspatzier: I think people can be relied upon to seek out wider consensus for bigger changes like that without encoding it in the rules
20:12:52 <stevebaker> tspatzier: will do
20:13:30 <tspatzier> stevebaker: thanks - all your comments should be addressed; the last few cycles were typos only. ... and I started implementing it ;)
20:13:50 <stevebaker> tspatzier: ok, sorry for being tardy getting back to it
20:13:53 <tspatzier> zaneb: yes, agree. and I was not thinking of hard rules like fixed numbers.
20:14:04 <skraynev> zaneb: should we vote now about solution 3x +2 ?
20:14:10 <tspatzier> stevebaker: np
20:14:44 <stevebaker> zaneb: can we discuss functional testing early in the agenda, I need to leave early
20:14:53 <zaneb> skraynev: do we need a vote?
20:15:02 <zaneb> nobody has spoken up with a -1
20:15:10 <stevebaker> skraynev: lazy consensus!
20:15:17 <skraynev> zaneb: ok, cool :)
20:15:29 <zaneb> #agreed Specs can be approved after 3 +2 votes from core team members, though of course people should exercise their judgement on more major changes, just as they already do with regular reviews
20:15:30 <skraynev> stevebaker: yeah..
20:15:51 <zaneb> #topic functional testing
20:15:54 <zaneb> stevebaker:
20:16:39 <stevebaker> So, it has been decided to move much project-specific functional testing out of tempest and back into the projects, because the process isn't scaling
20:16:48 <zaneb> \o/
20:17:12 <stevebaker> And I've started moving the scenario tests (which use heatclient) into heat.tests.functional
20:17:21 <mspreitz> Hooray!
20:17:32 <SpamapS> tspatzier: FYI, my feeling on that spec has changed of late. I'll discuss w/ you in open discussion or after in #heat.
20:17:44 <stevebaker> the heat-slow job will go away, and there will be a new heat-functional job
20:17:50 <tspatzier> SpamapS: ok
20:17:57 <SpamapS> I'm quite happy that we'll be more responsible for our own functional testing.
20:18:03 <BillArnold_> stevebaker can we start submitting patches for new tempest tests then?
20:18:08 <stevebaker> the functional tests will have zero dependency on tempest, so there is a bit of forklifting required
20:18:15 <SpamapS> I never really liked the fact that we couldn't land our own tests.
20:18:31 <SpamapS> Seemed like an unnecessary hurdle.
20:18:41 <stevebaker> #link https://review.openstack.org/#/q/topic:bp/functional-tests,n,z
20:18:55 <stevebaker> there is a spec and some WIP code ^
20:19:06 <skraynev> stevebaker: I suppose, that other tests will be moved in heat.tests.unittests , right?
20:19:29 <stevebaker> in the medium term there will be a tempest-lib which will contain another heat client, and then the API tests will move into the heat tree too
20:19:47 <stevebaker> skraynev: meh, maybe
20:20:08 <stevebaker> so what I actually wanted to discuss was...
20:20:08 <zaneb> stevebaker: that last part seems unnecessary from our perspective
20:20:18 <zaneb> (moving the API tests)
20:20:33 <stevebaker> zaneb: yeah, I'm not concerned about the timeline for that
20:21:13 <stevebaker> so as I'm currently implementing the functional tests, the following is happening:
20:21:44 <stevebaker> there will be a "functional" env defined in tox.ini which only runs the functional tests
20:22:20 <stevebaker> there will be no configuration file to specify cloud-specific options (although it does use oslo.config)
20:22:47 <therve> stevebaker, How "functional" will it be? devstack-ish?
20:23:10 <zaneb> therve: it's using devstack AIUI
20:23:15 <stevebaker> functional tests will run if credentials are sourced, even for unit tests. If no credentials are sourced, functional tests are skipped
20:23:20 <therve> Oh, remote functional testing?
20:23:46 <stevebaker> therve: devstack will configure things for the functional tests to run. They will assume a full working cloud
20:24:34 <stevebaker> if glance doesn't have the exact named image required for a test, the test will be skipped
20:24:44 <therve> stevebaker, So devstack configures heat, but not the other services?
20:25:15 <mspreitz> We should probably agree on a few standard image names
20:25:33 <stevebaker> so one thing I wanted discussion on is skipping vs failing when test preconditions are not met (credentials, images)
20:25:50 <SpamapS> mspreitz: cirros and not-cirros ... done. ;)
20:26:17 <mspreitz> BTW, is that image-existence-checking covered in general, or is it only looking at parameters named image?
20:26:20 <skraynev> stevebaker: will it be voting job?
20:26:21 <stevebaker> therve: no, devstack configures a full cloud, and the functional tests bring up all manner of full stacks
20:26:32 <stevebaker> skraynev: for heat, yes
20:26:39 <SpamapS> stevebaker: seems like the default should be to fail if missing whatever would be in the gate/check environment.
20:26:51 <zaneb> stevebaker: TBH I'd prefer that we _not_ run functional tests when running the unit tests, and fail if the preconditions are not met
20:27:17 <stevebaker> mspreitz, SpamapS: image name could be something like heat-functional-test-image
20:27:21 <SpamapS> and yea, tox -epy27 shouldn't run functional tests.
20:27:34 <SpamapS> but 'tox' should
20:27:35 <skraynev> SpamapS: +1
20:27:41 <zaneb> SpamapS++
20:27:43 <mspreitz> SpamapS: +1
20:28:19 <stevebaker> In that case we should probably tag individual tests as functional so we can do the same tox filtering as tempest
20:28:22 <mspreitz> Will one image suffice for all the functional tests?
20:28:43 <BillArnold_> SpamapS +1
20:28:46 <zaneb> stevebaker: is there any value in keeping functional tests under heat.tests.*
20:28:47 <zaneb> ?
20:29:05 <zaneb> given that it's not actually using anything in heat, just heatclient
20:29:15 <stevebaker> mspreitz: probably. It can have all the things installed on it with diskimage-builder
20:29:16 <zaneb> why not make it a separate package?
20:29:56 <SpamapS> zaneb: I was wondering the same thing.
20:30:08 <stevebaker> zaneb: heat.tests.functional seems to be an established convention in other projects already
20:30:22 <stevebaker> but I have no strong feelings on that
20:30:23 <SpamapS> It will be more cleanroom-ish if it is in its own repo
20:30:34 <stevebaker> SpamapS: repo?!
20:30:37 <SpamapS> no importing heat.common :)
20:30:41 <SpamapS> stevebaker: yes
20:30:46 <stevebaker> nooooooo
20:30:47 <zaneb> stevebaker: I would guess that they're doing something different with it
20:31:13 <SpamapS> One could argue the best place for it is in heat-templates
20:31:45 <zaneb> SpamapS: there are advantages to a two-step commit process, but IMO only for the API tests
20:31:47 <stevebaker> SpamapS: I do think it is important to keep it in tree, if for no other reason than to make it visible to heat developers. I will be doing a lot of "-1, write functional tests for this"
20:32:18 <mspreitz> Will we have to write both functional and unit tests for everything?
20:32:27 <stevebaker> mspreitz: absolutely, yes
20:32:29 <mspreitz> Will functional test be enough, no unit test required?
20:32:41 <mspreitz> I guess not
20:32:42 <stevebaker> mspreitz: I mean, it depends on the feature
20:32:45 <zaneb> SpamapS: I do understand the suggestion, but I would vote for separate package in same repo
20:33:03 <SpamapS> Yeah that's fine. I think it will make it easier to write tests, so +1
20:33:05 <skraynev> stevebaker: OMG. more and more tests... :)
20:33:28 <SpamapS> Honestly we should sit down and review the tests we have, and see how many of them are actually functional.
20:33:34 <stevebaker> mspreitz: a new resource needs full unit test coverage, and a functional test to prove it actually creates a real cloud resource
20:33:36 <mspreitz> You know, scenario tests cover stuff outside Heat too
20:33:46 <SpamapS> That's actually necessary as we move further into convergence, since some of the functional-ish unit tests will have to be exploded anyway.
20:33:49 <skraynev> stevebaker: it will look like 5 lines of main changes and 100 for tests :)
20:34:19 <SpamapS> Anyway, sounds like we all agree and can move on?
20:34:21 <stevebaker> any non-trivial template brings up things outside heat, so they're sort-of integration tests
20:34:38 <zaneb> stevebaker: if we're ready to move on, the next topic is peripherally related...
20:34:45 <stevebaker> sure
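
[Editor's sketch] The skip-versus-fail question raised under the functional testing topic (skip when credentials are not sourced, or fail as SpamapS and zaneb preferred for the gate) could look roughly like the following. This is a minimal illustration, not the code under review: the variable list REQUIRED_ENV, the helper names, and the HEAT_FUNCTIONAL_STRICT switch are all assumptions made for the example.

```python
import os
import unittest

# Assumption: the credential variables normally exported by sourcing an openrc file.
REQUIRED_ENV = ("OS_USERNAME", "OS_PASSWORD", "OS_AUTH_URL")


def missing_credentials(environ=None):
    """Return the names of required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV if not environ.get(name)]


def check_preconditions(test, environ=None, strict=False):
    """Skip the test (default) or fail it (strict=True) if credentials are missing."""
    missing = missing_credentials(environ)
    if not missing:
        return
    message = "missing credentials: " + ", ".join(missing)
    if strict:
        # Gate-style behaviour: an unmet precondition is an error, not a skip.
        test.fail(message)
    test.skipTest(message)


class FunctionalTestCase(unittest.TestCase):
    """Hypothetical base class for tests that need a working cloud."""

    # Hypothetical switch a gate job could export to turn skips into failures.
    strict = os.environ.get("HEAT_FUNCTIONAL_STRICT") == "1"

    def setUp(self):
        super().setUp()
        check_preconditions(self, strict=self.strict)
```

With a switch like this, a developer running the suite locally without credentials would see skips, while a gate job exporting the strict flag would see failures. The same pattern could cover the named Glance image check, which is left out here because it needs a live client.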
20:35:10 <zaneb> #topic Heat Gap Coverage plan
20:35:21 <zaneb> #link https://wiki.openstack.org/wiki/Governance/TechnicalCommittee/Heat_Gap_Coverage
20:35:27 <zaneb> there is our plan ^
20:35:39 <zaneb> the TC are happy with it
20:35:55 <zaneb> we need an owner and target for #3
20:36:19 <zaneb> if only there were a clear expert in this area...
20:36:23 <randallburt> well that's not so bad.
20:36:25 <stevebaker> \o, but it will need to be rewritten for the brave new functional test world
20:36:26 * zaneb strokes chin
20:36:47 <zaneb> stevebaker: cool, thank you, and feel free to rewrite
20:36:55 <stevebaker> OK, will do
20:37:03 <zaneb> we also need to target a milestone
20:37:19 <zaneb> which is weird because this is a task that is never really "done"
20:37:21 <SpamapS> cool
20:37:38 <zaneb> I think the TC would like it to be juno-3
20:37:48 <zaneb> stevebaker: do you think that's realistic?
20:37:50 <stevebaker> I'd like to think we can continue writing tests during feature freeze
20:37:51 <zaneb> (I don't)
20:38:09 <ryansb> kinda hinges on whether tests count as a feature
20:38:12 <skraynev> he-he. Now I know, that never = juno-3
20:38:24 <stevebaker> It would be nice to have *a* voting job which runs something by juno-3
20:38:54 <zaneb> stevebaker: agreed
20:39:10 <zaneb> I don't know whether we could consider it "done" at that point though
20:39:10 <therve> I wonder how much infra work we'll need. That may be the bottleneck
20:39:16 <stevebaker> maybe an appropriate milestone is juno release
20:39:20 <zaneb> the TC review at every milestone anyway
20:39:29 <zaneb> for progress
20:39:55 <ryansb> I don't feel like *a* voting job is too lofty a goal for j3
20:40:41 <stevebaker> ryansb: no, it shouldn't be ;) It will be interesting to see how quick we can add the tests which have been lingering in the tempest reviews
20:40:52 <zaneb> stevebaker: ok, feel free to update the target as you see fit when you're rewriting it
20:41:09 <zaneb> #topic Scaling group member health maintenance
20:41:17 <zaneb> mspreitz: this is you
20:41:27 <mspreitz> I'd like to get something done about this in Juno-3
20:41:29 * zaneb still hasn't finished the ML thread on this :/
20:41:34 <stevebaker> zaneb: OK. The TC will end up with something radically different to approve, but I'm sure they'll be fine with that
20:41:49 <mspreitz> I want to avoid junking up the interface,...
20:41:56 <mspreitz> follow AWS's lead, lean interface...
20:42:04 <zaneb> stevebaker: that's fine. it should take all of 3 minutes to re-review
20:42:05 <mspreitz> you automatically get something simple if you say nothing.
20:42:21 <therve> mspreitz, I looked at the spec, and it's nothing but simple :/
20:42:49 <therve> Also, I'm not sure we should go in the direction of adding monitoring to Heat
20:42:52 <mspreitz> therve: maybe language issue, not sure what you mean. I think the interface is simple.
20:42:59 <mspreitz> I am also concerned about the impl
20:42:59 <SpamapS> -1 no monitoring in Heat
20:43:04 <SpamapS> connect the API's
20:43:06 <SpamapS> thats what we do
20:43:07 <therve> We barely removed what we add
20:43:14 <therve> added
20:43:16 <mspreitz> Problems abound when I try to figure out how to outsource the monitoring
20:43:29 <zaneb> mspreitz: I'm concerned that we're going to spend a lot of time on something that is only applicable to autoscaling, when the same effort put into convergence fixes the problem everywhere
20:43:35 <SpamapS> If we need actual monitoring, we should be asking a monitoring as a service to do it
20:43:45 <therve> mspreitz, I feel that the spec tried to address too many concerns at once
20:44:06 <mspreitz> zaneb: not sure about that size comparison.  I already have something running.
20:44:15 <therve> Also, we tried to decouple autoscaling and load balancing, and you're putting it right back
20:44:19 <zaneb> #link https://review.openstack.org/#/c/110354/1/specs/scaling-group-health.rst
20:44:58 <mspreitz> therve: I think users want the option to have health determined by load balancer
20:45:14 <mspreitz> therve: it's a better test than Nova existence and status
20:45:38 <therve> mspreitz, Sure, that doesn't mean the autoscaling group should know about the load balancer
20:45:47 <therve> It should be notified about the health of the members
20:46:11 <mspreitz> So, no generalization of Ceilometer alarm actions in Juno-3
20:46:23 <mspreitz> No adding notifications to Load Balancer in Juno-3
20:46:27 <mspreitz> I already asked about those
20:46:51 <mspreitz> Adding notifications to load balancer is a maybe for Kilo
20:47:13 <randallburt> wait, why is Heat monitoring your infrastructure at all (for any other reason than convergence)? Isn't that what you should set up via your template?
20:47:14 <therve> mspreitz, What do you mean, you asked in neutron and ceilometer projects?
20:47:29 <mspreitz> I asked colleagues in those projects, yes
20:47:45 <therve> Okay, but Heat shouldn't take the burden of refused/postponed features
20:47:48 <SpamapS> Heat isn't monitoring anybody's infrastructure.
20:48:21 <mspreitz> I like Convergence, but (a) it will be a while before it is ready, and (b) it does not attempt to cover the whole problem
20:48:26 <tspatzier> randallburt: I think what mspreitz is proposing is that you get some reasonable behavior comparable to AWS also if you don't do rocket science in a template
20:48:26 <mspreitz> Convergence is not automatic
20:48:27 <SpamapS> Heat does, however, query the state of API's it has requested things from, for the purpose of either re-requesting them or satisfying dependencies. Where we got this notion that it will monitor things, I don't really know.
20:48:39 <mspreitz> Convergence does not pay attention to load balancer's opinion of health
20:48:50 <mspreitz> Convergence does not take external advice on health
20:48:57 <SpamapS> mspreitz: so write something that is automatic. Teach Heat to talk to it.
20:49:06 <randallburt> SpamapS:  That's what I thought. I mean, convergence will want to know when reality doesn't match the template, but as far as the "health" of the service/infra you used heat to spin up, Heat shouldn't monitor that.
20:49:41 <SpamapS> randallburt: Heat cannot monitor that. Nor will it connect the dots from failure A to infrastructure piece B.
20:49:47 <randallburt> but Heat doesn't care about the semantics of the things it deploys and shouldn't; that should be expressed in your template using existing services IMO.
20:49:54 <zaneb> mspreitz: so if your external monitoring thing decides something is unhealthy, kill it and convergence will replace it
20:50:15 <randallburt> SpamapS:  yeah, that's what I thought.
20:50:17 <SpamapS> AutoScale is just a special case where we have simplified hooks for changing the scale of a group.
20:50:22 <mspreitz> zaneb: no, convergence is not automatic
20:50:37 <SpamapS> convergence is not
20:50:42 <SpamapS> period
20:50:42 <mspreitz> scaling group are where the user is NOT in control of the template
20:50:55 <zaneb> mspreitz: the idea of convergence is that it is automatic
20:50:55 <SpamapS> but it will be, eventually, continuous, once we agree on how that should work.
20:51:57 * mspreitz is a little lost in parallel threads of conversation
20:52:09 <mspreitz> SpamapS: Convergence will eventually be *what* ?
20:52:22 <JNRao> openstack IRC meeting
20:52:37 <SpamapS> mspreitz: the point is that convergence, even when it is automatic, isn't going to try to provide ways that it can infer that if your response time is high that means your DB needs another slave or your app server farm needs another node. That is for the monitoring as a service bits to define. Heat just encapsulates the group.
20:53:10 <SpamapS> mspreitz: continuous.. meaning eventually it will respond to the nova notification that your node went down
20:53:41 <SpamapS> that is the last in the string of 3 sub-specs
20:53:43 <mspreitz> SpamapS: in the interim while Convergence is not automatic, scaling a group does not replace a member that is deleted or sick, right?
20:53:53 <SpamapS> mspreitz: right.
20:54:41 <SpamapS> if you have an immediate need for bolting that on, I'd suggest doing so as a resource plugin
20:54:41 <mspreitz> So a scaling group will accumulate junk (deleted or sick members) over time
20:54:47 <SpamapS> or an external API
20:54:55 <ryansb> that's the current state
20:55:11 <SpamapS> ((5 minutes))
20:55:33 <skraynev> SpamapS: may continue in #heat ?
20:55:38 <mspreitz> Let me ask a few ground rules...
20:55:48 <mspreitz> OK to add an index or table to the DB schema?
20:56:00 <zaneb> mspreitz: let's take this back to #heat
20:56:05 <mspreitz> OK to put implementation stuff in the template that implements a scaling group?
20:56:10 <zaneb> we have another agenda item to deal with today
20:56:16 <mspreitz> zaneb: OK
20:56:25 <ryansb> ack
20:56:26 <zaneb> #topic Using rally for Heat benchmark and quotas patches
20:56:32 <skraynev> ok
20:56:39 <zaneb> boris-42: o/
20:56:48 <boris-42> zaneb hi there
20:56:54 <skraynev> 1. I told about short article: https://docs.google.com/a/mirantis.com/document/d/1s93IBuyx24dM3SmPcboBp7N47RQedT8u4AJPgOHp9-A/edit#heading=h.jr3hotxxhpit
20:57:14 <skraynev> sorry, it's not published yet, so just google doc.
20:57:21 <zaneb> ick
20:57:53 <skraynev> it's example how to use rally for heat performance
20:58:09 <skraynev> and second part about adding rally job for Jenkins reports
20:58:27 <zaneb> 2 minutes
20:58:30 <skraynev> ok
20:58:36 <boris-42> zaneb I'll be here=)
20:59:01 <zaneb> skraynev: was this leading to a question?
20:59:07 <SpamapS> We definitely need some benchmarks
20:59:17 <SpamapS> metadata access was REALLY bad, it is just "kind of bad" now
20:59:30 <zaneb> +1 for benchmarks
20:59:31 <SpamapS> anyway, we're out of time
20:59:40 <ryansb> to #heat!
20:59:50 <skraynev> zaneb: one question was about quotas patches.
20:59:58 * SpamapS steps outside into 98F weather ..
21:00:04 <skraynev> got to #heat...
21:00:06 <zaneb> skraynev: ok, let's discuss in #heat
21:00:10 <zaneb> #endmeeting