20:01:23 <ttx> #startmeeting tc
20:01:24 <openstack> Meeting started Tue Feb 14 20:01:23 2017 UTC and is due to finish in 60 minutes.  The chair is ttx. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:01:25 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:01:27 <openstack> The meeting name has been set to 'tc'
20:01:28 * edleafe pulls up a chair
20:01:29 <flaper87> I just want to say you guys are the worst valentine's date I've ever had.
20:01:39 <flaper87> no offense, though
20:01:41 <flaper87> :D
20:01:49 * mordred slyly puts his arm around flaper87's shoulder
20:01:54 <ttx> flaper87: I think I had other TC meetings on Valentine's day Tuesday, so maybe not
20:02:03 * dhellmann sprinkles rose petals at flaper87's feet
20:02:05 <ttx> Our agenda for today is at:
20:02:09 <ttx> #link https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee
20:02:18 <ttx> #topic Update projects.yaml with results of the PTL elections
20:02:19 <flaper87> LOL
20:02:23 * flaper87 stfu
20:02:25 <ttx> #link https://review.openstack.org/430481
20:02:33 <ttx> Looks like I can approve this now unless someone screams
20:02:48 <stevemar> ttx: go for it
20:02:53 <ttx> done
20:02:57 * stevemar is happy to be off that list hehe
20:03:16 <ttx> stevemar: officially relieved from that particular duty
20:03:20 <ttx> #topic PTG organization
20:03:24 <mugsie> stevemar: +1 :)
20:03:28 <ttx> So we have the PTG coming next week, was wondering if you had questions
20:03:30 * mordred hands stevemar a pie
20:03:31 <ttx> Random bits of information:
20:03:37 <stevemar> mordred: better be apple
20:03:37 <ttx> We'll be communicating / synchronizing via #openstack-ptg, so join that
20:03:44 <ttx> And we have a number of reservable rooms for inter-project discussions which you can book via
20:03:44 <EmilienM> o/
20:03:48 <ttx> #link https://ethercalc.openstack.org/Pike-PTG-Discussion-Rooms
20:03:56 <ttx> We'll do 9am - 5pm every day (although the rooms will stay accessible until 6pm)
20:04:00 <ttx> Lunch will be served between noon and 1:30pm
20:04:10 <ttx> it's ok to cut it short
20:04:22 <ttx> There will be a happy hour on Tuesday 5-7pm, and a feedback fishbowl session at 5pm on Thursday
20:04:34 <ttx> Otherwise you should self-organize
20:04:53 <ttx> We should have some email today(?) with pointers to group-friendly restaurants in the area
20:05:02 <ttx> in case you want to set up dinners
20:05:05 <stevemar> lbragstad: ^
20:05:12 <flaper87> sounds fantastic
20:05:18 <ttx> Other questions ?
20:05:27 <EmilienM> 5pm? I scheduled sessions until 6
20:05:33 <ttx> EmilienM: it's fine
20:05:36 <flaper87> EmilienM: rooms available till 6
20:05:47 <EmilienM> we don't want to miss parties though :-P
20:05:53 <ttx> The bar should not be empty yet
20:05:55 <flaper87> EmilienM: then stop working
20:05:57 <flaper87> :P
20:05:59 <stevemar> lol
20:06:01 <fungi> is there a separate lunch location, and if so then is it still allowed to bring lunch back to the team rooms instead?
20:06:11 <ttx> EmilienM: it's on Tuesday, your sessions for TripleO start on Wed ?
20:06:20 <EmilienM> ttx: right
20:06:35 <EmilienM> I'll make sure flaper87 doesn't go to the bar before then
20:06:35 <ttx> fungi: no idea
20:06:43 <ttx> probably ok
20:06:45 * fungi will "wing it"
20:06:47 <edleafe> EmilienM: move your last session to the bar
20:06:57 <ttx> edleafe: that's the spirit
20:07:25 <ttx> ok, feel free to hit me or diablo_rojo if you have questions, we'll do our best to extract knowledge from the events team and answer
20:07:28 <fungi> apparently there's a bar in the same hotel, so easy to move any session you like to it probably ;)
20:07:31 <EmilienM> we'll be releasing final ocata in our case, we'll need strong drinks
20:07:58 <EmilienM> ttx: thanks for all the info!
20:08:00 <ttx> #topic Document current base services
20:08:09 <ttx> #link https://review.openstack.org/430965
20:08:20 <ttx> This is introducing the concept of "base services"
20:08:30 <ttx> which are things that OpenStack projects can reasonably assume will be present in any OpenStack installation, and whose features they may therefore freely leverage
20:08:38 <ttx> It's not really a new concept, but we never actually listed the things that are OK to assume will be present
20:08:45 <ttx> Currently: a database, a message queue and Keystone
20:09:02 <ttx> which kind of made it harder for us to have discussions on how to grow or limit that set
20:09:12 <ttx> So I think this will really help us, by providing a base framework for future necessary debates
20:09:20 <ttx> (debates like ending the postgres support, or adding a DLM, or being able to assume Barbican will be present)
20:09:28 <ttx> Questions, comments ?
20:10:02 <mtreinish> ttx: well this doesn't really affect the postgres discussion. You said an oslo.db compatible db in there
20:10:10 <mtreinish> postgres fits that
20:10:26 <flaper87> I just had one comment on it but not a blocker for sure
20:10:26 <ttx> mtreinish: it doesn't affect it indeed, it describes the status quo
20:10:41 <ttx> flaper87: yeah, we can rename after the merge if necessary
20:11:06 <ttx> feels like we have enough votes to pass it, then we'll evolve from there
20:11:08 <stevemar> i think the wording is loose enough around the DB that we can approve it aside from the postgres discussion
20:11:15 <flaper87> ttx: yup, voted
20:11:28 <flaper87> ttx: I can do the follow-up one if you want, since I brought it up
20:11:39 * dims_ many apologies for showing up tardy
20:11:53 <ttx> ok approved
20:12:02 <fungi> the intent seems to be (with databases for example) that you only rely on features of the rdbms exposed through oslo.db, which at least allows us to tune for a common featureset between multiple backends
20:12:08 <ttx> flaper87: sure -- dtroyer's proposed title sounds good
20:12:51 <ttx> fungi: yes. If we want to not support postgres anymore, we can either block it at the oslo.db level or replace that oslo.db statement with something stronger
20:13:17 <fungi> though the way it's written, i could see people interpreting it such that you can depend on the advanced features of a single database which happens to be a supported oslo.db backend, even if you're not using oslo.db to leverage it (and so may use features not provided by oslo.db)
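To illustrate the distinction fungi draws here, a minimal sketch using plain SQLAlchemy rather than oslo.db's actual API; the table and values are invented for the example. A statement built through the abstraction stays portable across supported backends, while raw dialect-specific SQL only "happens to work" on one of them.

import sqlalchemy as sa

metadata = sa.MetaData()
items = sa.Table(
    "items", metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("name", sa.String(255)),
)

# Portable: built through the abstraction, so the same statement renders
# valid SQL for whichever backend the deployment's engine points at
# (assumes SQLAlchemy 1.4+ select() syntax).
portable_query = sa.select(items).where(items.c.name == "demo")

# Not portable: leans on MySQL's dialect directly, so it merely "happens to
# work" on one supported backend and would fail on PostgreSQL.
mysql_only = sa.text(
    "INSERT INTO items (id, name) VALUES (1, 'demo') "
    "ON DUPLICATE KEY UPDATE name = 'demo'"
)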
20:13:41 <ttx> We'll get back to that later in the meeting when we cover postgresql
20:13:49 <fungi> wfm
20:13:54 <ttx> It's a living doc so feel free to propose updates :)
20:14:05 <ttx> I think it accurately describes the current situation
20:14:11 <ttx> #topic Glance: Changing response status code. What's the best path forward?
20:14:13 <dims_> agree
20:14:15 <dtroyer> ++
20:14:17 <ttx> #link https://review.openstack.org/#/c/420038/
20:14:19 <ttx> #link https://review.openstack.org/#/c/425487/
20:14:22 <ttx> flaper87, rosmaita: o/
20:14:25 <flaper87> o/
20:14:33 <rosmaita> o/
20:14:37 <flaper87> So, I've been talking with rosmaita about this and digging into the topic
20:14:45 <flaper87> I'll let rosmaita do all the talking, though
20:14:49 <rosmaita> ok
20:15:00 <rosmaita> Glance has proposed to fix a bug, an API call that returns a 200 whereas a 204 (No Content) is more appropriate, by changing the software to return the correct code.  Ordinarily, this would be a questionable move, but we've argued that in this particular case because the documents have always stated this call returns a 204 and all the other related calls in fact return 204s. However, representatives of the QA team saw the change, and did not li
20:15:18 <rosmaita> The Glance team consulted the API-WG Guidelines which state that this kind of thing is generally not acceptable, but the details of this case seemed to make it an exception.  So the Glance team met with the API-WG to see what they thought, and they agreed this was a legitimate exception. Unfortunately, the QA team merged a tempest test that covers this call and expects a 200 in the response, before the discussion was resolved, and now we can't
20:15:39 <rosmaita> We're bringing this up at the TC because we'd like a solution to this particular situation and we feel the need for an unbiased body to provide guidance. This issue is also related to the ongoing discussion about the proposed api-compatibility tag
20:15:47 * rosmaita takes a deep breath
20:15:49 <flaper87> rosmaita: I think your IRC client cut some of the pastes :P
20:16:46 <rosmaita> shoot, i'm using irssi, but looks ok to me
20:16:55 <edleafe> Wearing my API-WG hat, we felt that it was indeed a reasonable change
20:17:04 <dhellmann> is that tempest test part of what's used for defcore?
20:17:10 <ttx> did not li..
20:17:14 <ttx> ke it ?
20:17:17 <mordred> I believe making the changes, in this case, is an improvement for the user
20:17:18 <rosmaita> don't think so, it's metadefs
20:17:24 <dhellmann> ok, good
20:17:33 <edleafe> The return code doesn't change the meaning. 200->204 is not the same as, say, 400->404
20:17:35 <flaper87> Based on how the discussions on this topic have evolved and the parties involved, I think I'd be good with this change happening. The API-WG was part of the discussion and it's not part of defcore
20:17:39 <dhellmann> I'm curious to understand why the QA team thought adding that test on their own was appropriate.
20:18:06 <rosmaita> well, they were expanding test coverage of glance to metadefs
20:18:12 <rosmaita> actually, they reported the bug
20:18:12 <fungi> are the tempest reviewers objecting to a change that will accept either of 200 or 204 for that call?
20:18:22 <dhellmann> OK, but this change was in progress, right?
20:18:41 <rosmaita> dhellmann: yes
20:18:41 <dhellmann> I guess I'm trying to understand how this turned into a blocking situation instead of an opportunity for a conversation.
20:18:58 <rosmaita> fungi: https://review.openstack.org/#/c/432611/
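For reference, a hedged sketch of what tolerating both codes in a test could look like; this is a plain unittest stand-in with an invented response object, not the actual tempest patch linked above.

import unittest


class FakeResponse(object):
    """Stand-in for whatever the metadefs call actually returned."""
    status = 204


class MetadefNamespacePropertiesTest(unittest.TestCase):
    def test_delete_reports_success(self):
        resp = FakeResponse()
        # Accept both the documented code (204) and the historical one (200),
        # so the test keeps passing across releases while the service is fixed.
        self.assertIn(resp.status, (200, 204))


if __name__ == "__main__":
    unittest.main()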
20:19:01 <mordred> sounds to me like the process is working ... the QA team found a bug where the code didn't do what the docs say it should and the dev team fixed it
20:19:17 <rosmaita> yes, the disagreement is on how the fix should go
20:19:23 <dhellmann> And why it had to come to the TC for a decision. Not that asking us to make one is wrong, just we'd like I think to avoid having to do that when possible.
20:19:25 <mtreinish> fungi: normally that's not how things like this are handled. But they can be in edge cases
20:19:36 <mtreinish> tempest likes the api to work the same on all releases
20:19:56 <mordred> so does mordred
20:20:01 <edleafe> dhellmann: Glance did come to the API-WG for a discussion, but our conclusions didn't seem to sway QA
20:20:10 <fungi> mtreinish: yeah, looks like oomichi has objected to 432611 on the grounds that tempest is used for defcore (though sounds like the test in question actually isn't)
20:20:20 <dhellmann> it sounds like the argument for changing the response code is the new value is more accurate and more consistent, and the argument for not changing it is that we don't change response codes?
20:20:22 <mordred> _but_ - also sometimes we find things that are just bugs - and low-impact bugs too
20:20:24 <flaper87> dhellmann: I think the current issue is that the discussion has hit a deadlock
20:20:34 <mtreinish> mostly the issue here stems from a lot of precedent that changes like this need to be handled in a way that things don't change between releases without a versioning mechanism of some sort
20:20:37 <edleafe> flaper87: bingo
20:20:38 <ttx> and escalation was the only solution
20:20:40 <dhellmann> flaper87 : right, and I would like to know why we have two teams deadlocked
20:21:24 <dhellmann> we can deal with the immediate issue of deciding how to proceed, but we should also deal with the issue of getting deadlocked in the first place
20:21:30 <ttx> dhellmann: I suspect one of the dimensions of this issue is that the tempest tests live in tempest in that case
20:21:31 <edleafe> dhellmann: one reason is that the API-WG is simply advisory, whereas the TC has a little more power to resolve such issues
20:21:52 <mordred> I mean - I think the urge to be conservative here is a good one, and I honestly would not mind the process for things like this being to get a TC greenlight - just so that we don't slippery slope back into the world of making breaking changes across releases
20:21:54 <ttx> i.e. the QA team has authority on one part of the fix and Glance on the other
20:21:57 <mtreinish> fungi: I think oomichi is referring to the nature of tempest being used against any cloud and using defcore as an example
20:22:14 <mtreinish> but I could be wrong
20:22:20 <dims_> guess folks who are relying on current behavior will break. folks who read the docs will just scratch their heads. so we lean towards the not-breaking-people direction?
20:22:21 <ttx> so if they don't agree on the fix and nobody yields... escalation is the right process
20:22:39 <fungi> i would have hoped to see rosmaita follow up to the -1 on 432611 with appropriate counterarguments rather than just assuming the opinion of any reviewer is immutable, but maybe there was other subsequent discussion which isn't reflected in that review?
20:22:46 <ttx> the only way to avoid that in the future is to put all tempest tests in project-specific repos, but we ruled the other way recently
20:22:54 <mordred> dims_: they'll break in theory. in practice, the chances someone has code that is explicitly checking for 200 vs 204 and making different actions based on it is none
20:22:57 <edleafe> In case anyone wants my POV on this: https://blog.leafe.com/api-longevity/
20:22:59 <dhellmann> ttx: only for defcore tests
20:23:07 <flaper87> fungi: I think that's the case, although probably not ideal
20:23:11 <ttx> dhellmann: apparently for that one as well
20:23:22 <mordred> because the only different action you'd make from 200 to 204 is to not look at the payload - but there is no payload anyway, so there is literally no legitimate consumption difference
20:23:23 <dhellmann> that policy only applies to tests intended for use by defcore
20:23:24 <dhellmann> well, my point is the test could just be moved
20:23:26 <rosmaita> fungi: see the other patches ttx linked for the arguments about this
20:23:27 <dhellmann> though that doesn't solve the collaboration problem
20:23:49 <mtreinish> mordred: well unless you have a poorly written client that assumes a specific response code to measure success
20:23:59 <edleafe> dhellmann: you are correct in pointing out that there are two problems here
20:24:00 <mtreinish> mordred: in the past we've asserted that's a thing we've cared about
20:24:01 <dhellmann> mordred: do we anticipate someone looking at the specific error code at all?
20:24:02 <sdague> mordred: yeh, I think it's mostly the difference of opinion of "is not really broken, why fix this"
20:24:36 <sdague> dhellmann: people do stuff all the time where they just == KNOWN_GOOD_THING
20:24:39 <flaper87> mordred: ++
20:25:12 <ttx> so yes, (1) how do we solve that disagreement and (2) how do we avoid similar situations in the future
20:25:19 <mordred> yah. it's a judgement call. I'm saying I come down fairly strongly on "just fix the bug" - because someone who wrote a client that is _specifically_ coding against the 200 and not the 204 coded to a specific success code that is in contradiction to the api docs, so holy crap what were they doing?
20:25:21 <sdague> right, we've got 2 camps here. QA) don't change things unless it's really hurting people.
20:25:22 <dtroyer> also, KNOWN_GOOD_THING is often defined by what it does, not what the docs say
20:25:37 <dhellmann> mordred : they were coding against how a cloud they use actually works?
20:25:38 <sdague> Glance) lets make this more compliant
20:25:41 <fungi> rosmaita: yeah, skimmed and it looks like the pushback from oomichi was that there should be thorough discussion before changing that behavior. now that it's been discussed is he still unwilling to budge?
20:25:43 <dims_> right dtroyer
20:26:04 <sdague> and seems like the TC call is really where we want that slider to be
20:26:15 <mugsie> I do know if i was writing a client for something, and the docs disagreed with the actual response, I would just throw my hands up, and code it to what worked
20:26:20 <rosmaita> fungi: yes, but as you can see, both sides of this have support
20:26:20 <ttx> sdague: and I think it's appropriate for the TC to make that call, tbh
20:26:26 <dhellmann> mugsie : right
20:26:38 <mordred> mugsie: right. but this is a different _success_ code - coding against a specific code is crazy in the first place
20:26:40 <sdague> because I think that after the recommendation, the teams will most likely run with it
20:26:52 <edleafe> mugsie: but I wouldn't be shocked if later the behavior changed to match the docs - especially when it is consistent with everything else
20:26:52 <bastafidli> can there be an interim fix where the documentation gets adjusted to highlight both behaviours (200 and 204), and then queue the real fix for a later release?
20:27:01 <mordred> the _only_ reason I could _possibly_ think of to check the specific 200 vs. 204 is if you were coding to a spec - in which case you would have been wrong
20:27:06 <mugsie> mordred: sure - but in some APIs different success codes mean different things
20:27:07 <rosmaita> mugsie: what edleafe said
20:27:08 <mordred> otherwise you'll be coding to 2xx
20:27:24 <mugsie> one is "its done" anther is "come back and check"
20:27:38 <mugsie> in the 204 case its not as clear cut
20:27:49 <stevemar> mordred: IIRC thats what triggered the keystone patch to change things from 200 -> 204
20:27:49 <mordred> right - but this is "it's done" and "it's done and I don't have a payload"
20:27:51 <dhellmann> mordred: I'm not sure we have previously applied any expectation that users will be rational when we limit changes of this nature. :-)
20:27:58 <mordred> dhellmann: :)
20:28:00 <sdague> dhellmann: yeh, that's really the crux
20:28:01 <dtroyer> this is probably the closest we will come to this sort of exception being mostly harmless
20:28:07 <mordred> dtroyer: ++
20:28:19 <stevemar> dtroyer: true
20:28:19 <sdague> so... just to be warned
20:28:29 <dtroyer> I really don't think we are in danger of setting precedent we will eventually regret here
20:28:36 <mordred> me either
20:28:37 <edleafe> mugsie: if I saw docs = 204 and behavior = 200, I'd code 200 <= response < 300 and be done with it. :)
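As a concrete illustration of the pattern mordred and edleafe describe, a small sketch of hypothetical client code using the requests library; the response below is faked rather than fetched from a real endpoint. Pinning a single success code is the brittle pattern; treating the whole 2xx range as success makes the 200 -> 204 fix a non-event.

import requests

# Faked response; in real client code this would come from something like
# requests.delete(<metadefs namespace properties URL>).
resp = requests.Response()
resp.status_code = 204

# Brittle: hard-codes one success status, so it breaks when 200 becomes 204.
if resp.status_code == 200:
    print("properties deleted")

# Robust: any 2xx means success; the exact code is irrelevant to the caller.
if 200 <= resp.status_code < 300:
    print("properties deleted")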
20:28:42 <ttx> OK, let's make some progress here -- (1) which approach would you rather take, Glance or QA ? And (2) how would you avoid such disagreement in the future ?
20:28:45 <flaper87> fwiw, in this case, I'd prefer to take the API-WG's advice and go with it
20:28:45 <mordred> edleafe: ++
20:28:52 <sdague> there have been individuals across multiple projects wanting to change up all the success codes to match the api-wg recommendations
20:28:59 <mordred> ttx: I explicitly do not want this to be about a general response
20:29:01 <sdague> because probably about 30% of success codes are wrong
20:29:06 <dhellmann> ttx: to be clear, it's a choice between correcting the docs and between changing the code?
20:29:09 <sdague> by strict http standards
20:29:10 <mordred> this is a very specific case with a very specific set of tradeoffs
20:29:16 <mtreinish> dtroyer: there is a simple middle ground here: just version the api. So it matches the docs moving forward but maintains backwards compat for older clients
20:29:16 <flaper87> dhellmann: yes
20:29:18 <mordred> that I think rosmaita has done a great job of enumerating
20:29:24 <dtroyer> sdague: sure, and much of that will cause much more heartburn… are the docs wrong (different) in those cases too?
20:29:26 <fungi> dhellmann: or between changing the documented behavior and correcting the code ;)
20:29:27 <sdague> so, if this slider moves, that all comes into play
20:29:28 <mtreinish> which is what I always recommend for an api change
20:29:29 <mordred> we absolutely do not need to make any larger decisions on policies
20:29:33 <dhellmann> fungi : yes
20:29:58 <ttx> mordred: so you would explicitly not provide an answer for (2) (how to avoid such disagreement in the future)
20:30:05 <mordred> ttx: yes. very much so
20:30:09 <ttx> and rule case by case
20:30:09 <sdague> dtroyer: where the docs have been wrong on the nova side, we've been just fixing the docs
20:30:10 <dtroyer> mtreinish: assuming clients ever actually pay attention to API versions… it's more of a "notice a break" and reactively deal with it situation
20:30:11 <mordred> this is not a precedent-needing problem
20:30:12 <mugsie> but, now the docs will be wrong for one release, and right for one - whereas if the docs are fixed, they are right for all versions
20:30:13 <mordred> yes
20:30:21 <ttx> mordred: wfm
20:30:23 <dtroyer> sdague: perfectly good option too
20:30:34 <edleafe> ttx: I understood 2) to mean "why don't teams cooperate"?
20:30:35 <fungi> i like flaper87's position. we have the api working group for a reason, and the glance team did consult with them to get an answer. i see no reason to disagree with them on api-behavior-specific topics
20:30:35 <mtreinish> dtroyer: right, by versioning it you don't break anyone
20:31:01 <mtreinish> mugsie: right, I view this as more a doc bug then anything
20:31:01 <edleafe> mtreinish: now if only Glance supported microversions... :)
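A purely hypothetical sketch of the versioning approach mtreinish suggests; Glance has no microversions today, so the handler, helper names, and version numbers below are invented for illustration (only the header name follows the API-WG microversion convention).

class Response(object):
    def __init__(self, status):
        self.status = status


def delete_namespace_properties(headers):
    # ... the delete itself would happen here ...
    requested = float(headers.get("OpenStack-API-Version", "1.0"))
    if requested >= 1.1:
        return Response(204)  # documented behaviour, opted into by newer clients
    return Response(200)      # historical behaviour preserved for older clients


print(delete_namespace_properties({}).status)                                # 200
print(delete_namespace_properties({"OpenStack-API-Version": "1.1"}).status)  # 204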
20:31:03 <mordred> versioning just shifts the complexity to a different place
20:31:10 <mordred> for this
20:31:10 <ttx> fungi: ++
20:31:13 <stevemar> mordred: yeah
20:31:16 <flaper87> fungi: ++
20:31:21 <smcginnis> mordred: +1
20:31:32 <mugsie> mordred: ++
20:31:45 <dtroyer> versioning needs to be done, but I don't think this is a zero-sum situation
20:31:50 <dhellmann> fungi : ++
20:32:04 <mtreinish> edleafe: right, this issue has come up before with glance more than once in the past few weeks, and it's why I proposed that tag
20:32:05 <fungi> also, the tc contradicting the api working group on a topic like this feels like a vote of "no confidence" in them
20:32:07 <dtroyer> fix the friggin bug, and get up to date with versioning…we really need both
20:32:23 <edleafe> dtroyer: +1
20:32:24 <mordred> dtroyer: totes. versioning needs to be done. but punting to versioning for this one I think is overkill
20:32:25 <ttx> OK, it feels like there is a majority agreeing to take Glance's side on this one ? And no majority to make that a precedent
20:32:41 * ttx prepares a #startvote
20:32:43 <flaper87> ttx: you may want to have an actual IRC vote, just for logging purposes
20:32:44 <mordred> ttx: if there is any precedent, it's that the teams all acted appropriately
20:32:44 <dims_> i can go with that ttx
20:32:48 <flaper87> ttx: that
20:32:50 <flaper87> :D
20:32:57 <mordred> ttx: glance talked to the API-WG - the QA team was appropriately conservative
20:33:11 <dims_> ++ mordred
20:33:14 <rosmaita> mordred: ++
20:33:15 <dtroyer> mordred: ++
20:33:17 <mordred> if the same situation arises again and all of the teams follow this pattern I don't think it'll be bad for openstack
20:53:22 <mtreinish> mordred: no, versioning should have been done for the other breaking change glance made a few weeks ago, where they completely changed the membership api
20:53:23 <sdague> so, I don't actually understand how you can reject the precedent though
20:33:25 <edleafe> mordred: good summary
20:33:30 <fungi> i definitely don't see this as setting any precedent for anything other than agreeing with the api-wg on their assessment of effectively non-impactful minor changes
20:33:34 <ttx> #startvote Should that conflict be solved by taking Glance's or QA's approach? Glance, QA, abstain
20:33:35 <openstack> Begin voting on: Should that conflict be solved by taking Glance's or QA's approach? Valid vote options are Glance, QA, abstain.
20:33:36 <openstack> Vote using '#vote OPTION'. Only your last vote counts.
20:33:38 <mordred> fungi: ++
20:33:49 <flaper87> #vote Glance
20:33:51 <mordred> #vote Glance
20:33:55 <dhellmann> #vote Glance
20:33:56 <fungi> #vote Glance
20:34:03 <mtreinish> #vote QA
20:34:06 <dtroyer> #vote Glance
20:34:09 <EmilienM> #vote Glance
20:34:17 <sdague> #vote abstain
20:34:24 <ttx> #vote abstain
20:34:41 <dims_> #vote Glance
20:35:09 <ttx> (would rather come up with a precedent but understand why it's not necessarily desirable here)
20:35:09 <sdague> ttx: you can't not though
20:35:34 <dtroyer> sdague: sure, the precedent includes the amount of (non)breakage potential too
20:35:38 <ttx> ok, 30 more seconds
20:35:58 <sdague> dtroyer: that's fine, like I said a rather large number of success codes are not HTTP-pure
20:36:28 * mugsie will bet this *will* be used as an example of a change that was allowed in the future
20:36:36 <dtroyer> sdague: right. and a large number of those would be painful, so any precedent doesn't apply
20:36:39 <sdague> we've had specs in Nova to go clean those up, which we've kept pushing back on because it's a lot of churn for not much clear value
20:36:39 <ttx> #endvote
20:36:39 <openstack> Voted on "Should that conflict be solved by taking Glance's or QA's approach?" Results are
20:36:40 <openstack> QA (1): mtreinish
20:36:41 <openstack> Glance (7): dims_, fungi, mordred, dhellmann, dtroyer, flaper87, EmilienM
20:36:42 <openstack> abstain (2): ttx, sdague
20:36:48 <ttx> mugsie: doesn't prevent anyone from re-raising it to TC
20:36:55 <sdague> ttx: it doesn't
20:36:57 <sdague> but it won't be
20:37:14 <fungi> it sets a precedent that backward compatibility rules are open to interpretation in some cases
20:37:22 <sdague> it's fine that was the decision, but please don't pretend teams won't self censor on it
20:37:35 <mtreinish> fungi: which I think is a step in the wrong direction
20:37:37 <mordred> and that the API-WG should be consulted when they are open to interp as well
20:37:40 <mtreinish> fungi: the apis change enough as it is
20:38:11 <ttx> #info <mordred> ttx: glance talked to the API-WG - the QA team was appropriately conservative
20:38:23 <ttx> ok, next topic
20:38:27 <ttx> #topic Deprecate postgresql in OpenStack
20:38:32 <ttx> #link https://review.openstack.org/427880
20:38:49 <ttx> On this one I was wondering if we can really use that 8% figure to say that "the ecosystem has settled on MySQL as the backend"
20:38:57 <ttx> that user survey metric is a bit weird, since 24% say they use MongoDB
20:39:12 <sdague> ttx: because of ceilometer, right?
20:39:17 <flaper87> ttx: telemetry?
20:39:19 <fungi> and also of that 8% only half were apparently describing production environments?
20:39:24 <ttx> Also Xen/Xenserver represents 6% of Nova deployments... has the ecosystem settled on KVM ?
20:39:33 <ttx> sdague: 24% of deployments using ceilometer ?
20:39:40 * ttx looks
20:40:05 <fungi> i'm more concerned that we have fairly major distributions (suse and huawei) deploying with postgresql by default instead of mysql
20:40:28 <ttx> it feels like the backlash is not coming from a vocal minority as much as I expected
20:40:28 <fungi> and curious to know what impact this would have on them
20:41:03 <dtroyer> fungi: I'm curious what they've seen when there has been so little testing as it is on pg
20:41:06 <ttx> sdague: you're right, probably comes from Ceilometer 60%
20:41:31 <mtreinish> dtroyer: yeah, I'm curious about that too
20:41:33 <EmilienM> in TripleO, we use MongoDB with Zaqar messaging
20:41:34 <sdague> fungi: yeh, it seems like products by suse, huawei, and windriver are based on pg, but I've never seen an organic instance in the operator community based on that
20:41:34 <edleafe> While not an API change, this also seems like a breaking change
20:41:46 <gordc> dtroyer: tbh, there isn't much breakage. the rare times we notice issues, we'll just patch it upstream.
20:42:05 <ttx> does anyone know what CERN is using ? Tim sounded like he would rather keep pg
20:42:06 <dims_> fungi : we will need some time to find the dev teams in there and ask them
20:42:14 <dtroyer> gordc: how hard is it to identify that as a DB issue?  or are you good enough at it by now to see it quickly?
20:42:19 <sdague> ttx: galera I was pretty sure
20:42:30 <EmilienM> ttx: mysql AFIK
20:42:35 <fungi> sdague: ftr, i understand there is a very large telco in europe basing their public cloud on huawei's distribution at least (they said as much in either a keynote or a board meeting, maybe both)
20:42:46 <ttx> anyway, I think that shouldn't block our decision -- but I think it calls for a deprecation period
20:42:48 <edleafe> It's interesting that patching issues like this is acceptable, but patching a minor API success code change causes so much concern
20:42:52 <mordred> fungi: yes. I have an account on it
20:43:05 <gordc> dtroyer: i don't actually manage the product stuff so i can't give you accurate details on how quickly it gets identified
20:43:08 <ttx> at the very minimum
20:43:26 <dtroyer> I think the step of deprecating pg is necessary to either move forward with officially unsupporting it or to rally (again) the support required to properly maintain it
20:43:33 <gordc> dtroyer: we're not tracking master so we really only notice things when we start pulling in update for next version
20:43:35 <ttx> i.e. we would not intentionally scrap pg-compatibility code until +1year
20:43:59 <sdague> ttx: but we should definitely ensure it's not really in docs and the like
20:44:12 <mordred> yah. I don't think removal without a full and conservative deprecation cycle is an option
20:44:26 <sdague> because right now there is implied support from upstream
20:44:27 <dtroyer> gordc: ok, so given the lag from master to distro, the removal of most pg jobs early in Ocata(-ish) only now gets to you?
20:44:34 <ttx> sdague: well, if the other backend is "deprecated" it should not look good in docs indeed
20:44:41 <gordc> dtroyer: we had a pgsql test in the ceilometer gate and we noticed a pg break every 8 months or so... the majority of the db stuff we do in openstack is really generic.
20:45:03 <dtroyer> gordc: ok, thx
20:45:18 <mugsie> dtroyer: I would not assume that it would even be on the radar of some distros yet
20:45:18 <ttx> sdague: I agree the mismatch between support upstream and usage in distros is a bit weird
20:45:24 <dhellmann> is the problem with supporting it the lack of folks to fix breaks, or is there a pressing technical need for something only available in mysql?
20:45:35 <gordc> dtroyer: i can't recall how many issues reached downstream. when i asked, they gave me 3 examples of them patching upstream to fix pgsql
20:45:42 <edleafe> dhellmann: the former
20:45:53 <dhellmann> edleafe : ok, because I've heard both arguments
20:45:54 <mordred> edleafe: I'd say both actually
20:45:58 <mordred> I think the reason I'm advocating for us to consider this is so that we can consider _not_ doing really generic database stuff
20:46:02 <dhellmann> mordred : "pressing"?
20:46:06 <ttx> Theoretical: if someone shows up and fixes PG support and works with the QA team to support it, would we reverse the decision ?
20:46:07 <mordred> nothing is pressing
20:46:11 <mordred> openstack has worked for years
20:46:18 <sdague> dhellmann: from my point of view the problem is the extra overhead of "oh, we'd like to do utf8 right on these fields"
20:46:23 <dhellmann> right, ttx phrased my question more directly
20:46:31 <sdague> "oh, wth does pg do there? do we have to care?"
20:46:33 <sdague> feature dies
20:46:35 <mordred> yup
20:46:37 <mordred> sdague: ++
20:46:50 <mordred> there's a ton of db improvement that dies on the vine
20:46:57 <dhellmann> I am surprised that unicode is even something we have to deal with. Doesn't sqlalchemy do that?
20:46:58 <dtroyer> ttx: we've seen that for short periods before, how do we gauge long-term commitment?
20:47:00 <edleafe> mordred: the impetus was lack of support. Once we consider dropping it, then the possibilities of MySQL-centric stuff start to bloom
20:47:02 <sdague> dhellmann: no
20:47:08 <dhellmann> ffs
20:47:09 <mordred> dtroyer: unicode is exceptionally difficult to get right
20:47:11 <mordred> gah
20:47:13 <mordred> dhellmann: ^^ sorry
20:47:24 <sdague> the db engine is really important
20:47:32 <sdague> especially if you are trying to index things for search
20:47:38 <ttx> happens that databases are hard to abstract after all
20:47:39 <mordred> ++
20:47:49 <sdague> there is this assumption that sqla and oslo.db abstract the db
20:47:51 <sdague> they do not
20:47:55 <bastafidli> edleafe: ++
20:48:02 <sdague> they provide some convenience functions
20:48:18 <gordc> i mentioned this on the patch... but breaking something that works for some hypothetical improvements is kinda sketchy.
20:48:20 <sdague> and if you write something in ORM-ish ways, some common 80% cases are made to look the same
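A small sketch of the kind of leak sdague describes; the model and table names are invented, and the import path assumes SQLAlchemy 1.4+. Getting 4-byte UTF-8 right on MySQL means reaching for MySQL-only table arguments that have no equivalent concept on other backends, so even ORM-level code ends up making a per-backend choice.

import sqlalchemy as sa
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Instance(Base):
    __tablename__ = "instances"
    # MySQL-only knobs needed for proper 4-byte UTF-8 storage; PostgreSQL
    # stores UTF-8 text natively, and nothing like these arguments exists there.
    __table_args__ = {
        "mysql_engine": "InnoDB",
        "mysql_charset": "utf8mb4",
    }

    id = sa.Column(sa.Integer, primary_key=True)
    display_name = sa.Column(sa.String(255))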
20:48:49 <ttx> gordc: devil's advocate: in openstack we say that what's not tested is broken
20:48:54 <dhellmann> gordc : I'm hearing a very specific improvement in non-ascii text support
20:49:12 <ttx> I'm more concerned about the lack of testing or resources to work on it
20:49:18 <gordc> ttx: fair enough.
20:49:27 <fungi> gordc: would you object to proposed model/schema changes that improved performance in mysql but worsened performance in postgresql (or vice versa)?
20:49:41 <EmilienM> ttx: right, it seems like a very few (1 or 2) people actually maintain it
20:49:57 <dhellmann> ttx: the question in my mind is whether we phrase the deprecation as reversible if support shows up, or if we say it's settled once and for all
20:50:02 <gordc> fungi: nope. i think that'd be good.
20:50:04 <ttx> gordc: I'm saying "devil's advocate" because my personal preference on this is that we should find a way to make it work
20:50:07 <fungi> just curious whether we need some actual examples of beneficial changes we've avoided/abandoned because they'll impair one specific backend
20:50:24 <dims_> how much deprecation period are we looking at?
20:50:57 <gordc> ttx: i'd prefer a periodic check to actually validate pgsql is unmaintained
20:51:03 <stevemar> this is a *very* liberal use of our deprecation policy :\
20:51:06 <ttx> dhellmann: well... deprecation will not actively break anything, it's more of a statement that we'll start actively breaking PG by removing code or adding MySQL-specific stuff in one year
20:51:11 <dtroyer> fungi: I would think that would help some decision makers somewhere know (and work on their upstream mgmt)  if they want to fund support
20:51:12 <gordc> or if it's just working and nothing pgsql-specific is being done.
20:51:18 <ttx> so it's easily reversed if we can be convinced
20:51:45 <ttx> but as dtroyer said, it would take a lot to convince us it's strategic involvement and not purely tactical
20:51:46 <dhellmann> ttx: right, but if someone comes along offering to do support and their work is going to be turned away because we don't want that work, we should just say that up front
20:51:48 <sdague> my concern about all of this, is the same bind we get into with "this doesn't work" "oh we need people to support it"
20:51:49 <mordred> ttx: I think that's a great way to phrase it - it would also give folks a time to work up things like what fungi is talking about
20:51:55 <sdague> because it's not really true, we need maintainers
20:52:01 <sdague> people that are proactively on top of things
20:52:16 <dtroyer> sdague: ++
20:52:20 <mtreinish> sdague: ++
20:52:29 <sdague> because, what happens with just the "raise hands to support" is that someone else already burdened in the community has to own the maintenance
20:52:32 <dtroyer> and this is one way to let our contributing companies know where they need to pay attention
20:52:32 <smcginnis> One concern would be: if pg goes away, we add some mysql-specific functionality, and then someone shows up and wants to maintain and add back in pg support.
20:52:34 <ttx> sdague: yes as I said it would take a lot to convince us that PG is actually maintained
20:52:39 <smcginnis> It will be much more difficult then.
20:52:47 <sdague> smcginnis: it will be
20:53:00 <sdague> because we decided not to burden existing contributors with it
20:53:07 <sdague> so that they could focus on more important things for users
20:53:10 <sdague> it's all trade offs
20:53:10 <ttx> smcginnis: will be too late by then
20:53:23 <smcginnis> sdague: Yep, fair point. Just pointing that out with talk of keeping the door open for it.
20:53:24 <fungi> i'm also curious what "deprecation" would look like. should we rip out any postgresql deployment examples in master branches of install documentation at the start of the deprecation period, or at the end of it?
20:53:24 <dhellmann> smcginnis : right, that's another reason for us being clear if what we're really saying is that we are going to drop pg and not accept maintenance work
20:53:28 <ttx> smcginnis: we can only reverse that decision during the deprecation period, while we don't touch anything
20:53:33 <EmilienM> smcginnis: I second you. Also some companies might need to fork some projects to maintain pg support ... which wouldn't be cool either.
20:53:34 <smcginnis> dhellmann: +1
20:53:35 <sdague> and, it's fine to say that our resources should be spent on a db that is rarely used
20:53:45 <gordc> just curious, but how are we certain it's not maintained?
20:54:03 <dhellmann> gordc : because people were not showing up to deal with breaks in the gate?
20:54:20 <dhellmann> and that led to teams dropping the postgresql jobs from their gate
20:54:32 <dtroyer> I think deprecation here means a) documenting the current state (untested), and b) waiting a period, say a year, for the situation to change, and c) re-evaluating at the end of that period before allowing mysql-specific bits
20:54:32 <ttx> and now it's probably broken but no one knows
20:54:36 <gordc> dhellmann: i see. i didn't realise there were more. the pgsql break we found was patched within 12 hours
20:54:39 <mordred> yah - postgres has _consistently_ been ignored for YEARS
20:54:43 <mordred> this isn't like a recent thing
20:54:55 <mugsie> from our project's perspective, every now and again, someone comes in to fix pg, gets a gate and leaves.
20:54:55 <dhellmann> gordc : at this point only the telemetry team was gating, iiuc
20:54:58 <sdague> yeh, pretty clear the fact that we dropped pg from the gate and no one was there building alternative testing around it, means it's not maintained
20:55:04 <mordred> mugsie: ++
20:55:10 <mugsie> then 6 months later something breaks, and we just drop the gate
20:55:10 <edleafe> mordred: so it's been deprecated, but we're now just owning up to it?
20:55:20 <gordc> dhellmann: neutron i mentioned they have periodic postgresql
20:55:26 <mordred> edleafe: the openstack community has literally never actually cared about postgres
20:55:27 <ttx> ok, I think we made progress. Let's discuss this F2F next week and I think we'll be able to make a call for Pike
20:55:31 <dhellmann> gordc : a periodic job is not a gate job, though, right?
20:55:55 <mugsie> and a periodic job is only as good as the people watching the results
20:56:02 <edleafe> mordred: it seems that enough people cared to create the tests in the first place
20:56:06 <sdague> and, more importantly, no one is getting ahead of things
20:56:12 <ttx> sdague: ++
20:56:13 <edleafe> mordred: once they left, no one picked up the slack
20:56:17 <sdague> just having a test job isn't maintaining anything
20:56:18 <ttx> It's reactive rather than proactive
20:56:18 <gordc> dhellmann: right. i forgot what the question was... if it's being gated on? it's not. :)
20:56:23 <mordred> edleafe: that's how I define "community doesn't care"
20:56:24 <dhellmann> gordc : right
20:56:41 <gordc> i'm just wondering if it's actually broken. :)
20:56:42 <edleafe> mordred: and that's how I define effectively deprecated
20:56:43 <mordred> edleafe: a random human helicoptering in to do work once and then going away does not equate to overall care and feeding
20:56:47 <mugsie> we (designate) actually do have a voting pg gate job right now
20:56:48 <mordred> edleafe: totes
20:56:49 <mtreinish> gordc: I don't see any periodic jobs with postgres in the name: http://status.openstack.org/openstack-health/#/?groupKey=build_name&searchProject=postgres
20:56:57 <mordred> edleafe: oh - sorry - yes, I agree with you - I was just trying to agree more strongly
20:57:03 <edleafe> mordred: :)
20:57:12 <ttx> #topic Open discussion
20:57:16 <gordc> mtreinish: just going by what neutron said.
20:57:16 <ttx> Quick couple of topics
20:57:18 <ttx> We'll skip the TC meeting next week due to PTG
20:57:21 <edleafe> Yeah, I'm just saying let's be honest about it
20:57:24 <ttx> and amrith raised the issue of the "OpenStack Clean" bot posting comments on unworthy changes
20:57:28 <ttx> #link http://lists.openstack.org/pipermail/openstack-tc/2017-February/001333.html
20:57:34 <ttx> While I generally agree with what it says, it feels like a slippery slope to have anonymous bots trolling on reviews
20:57:40 <ttx> Any opinion on that one ?
20:57:50 <mugsie> ttx: I think the account has been disabled
20:57:56 <gordc> mtreinish: we dropped ours recently.
20:57:59 <dhellmann> yes, that doesn't seem like the right answer there and I agree with disabling the bot
20:58:03 <ttx> yes, but in a legal vacuum -- should we have a nobot policy ?
20:58:10 <ttx> or only vetted bots ?
20:58:17 <EmilienM> the comments are really useless but don't hurt anyone, though. I agree with disabling the bot.
20:58:21 <mugsie> vetted only seems reasonable
20:58:22 <dims_> only infra approved bots
20:58:36 <ttx> dims_: sounds good
20:58:36 <EmilienM> dims_: yes, +1
20:58:36 <mtreinish> gordc: that link is going back a month
20:58:42 <mtreinish> gordc: you can change the period
20:58:45 <dhellmann> vetting bots that post seems like a reasonable standard
20:58:51 <dtroyer> similar to the 3rd party CI requirements?
20:58:57 <ttx> #agreed Only infra-approved bots are allowed on Gerrit
20:58:59 <mtreinish> ttx: are we sure that's a bot? Its review here seems like a human's: https://review.openstack.org/#/c/430164/
20:59:13 <ttx> mtreinish: it's a human that repeats itself very often then
20:59:15 <notmyname> how is that done? sounds like a "Real Name" policy
20:59:24 <dhellmann> mtreinish : maybe it was a bulk update
20:59:34 * fungi isn't sure he wants the infra team burdened with vetting every single automated system which comments on reviews... there are hundreds already
20:59:36 <sdague> also, bots aren't supposed to vote in CR
20:59:39 <ttx> notmyname: Turing test ?
20:59:45 <sdague> bots are only supposed to vote in V
20:59:49 <mugsie> notmyname: we kind of have that with the CLA
20:59:57 <dhellmann> fungi : ++
21:00:04 <smcginnis> stackalytics ranking improvement bot?
21:00:08 <ttx> notmyname: if the bot can convince us it's human, it's probably good enough
21:00:17 <edleafe> smcginnis: ha!
21:00:23 <fungi> mugsie: no idea how the cla has anything to do with code review comments though
21:00:25 <ttx> and we are out of time...
21:00:40 <EmilienM> ttx: thx for chairing!
21:00:40 <ttx> See you all next week ?
21:00:43 <EmilienM> o/
21:00:46 <ttx> or almost all
21:00:48 <edleafe> See you in Atlanta!
21:00:56 <rosmaita> thanks everyone!
21:01:00 <ttx> #endmeeting