Tuesday, 2019-06-04

*** takashin has joined #openstack-placement00:24
*** mriedem has quit IRC00:44
*** ttsiouts has joined #openstack-placement01:00
*** ttsiouts has quit IRC01:06
*** Sundar has quit IRC02:08
openstackgerritMerged openstack/placement master: Resource provider - request group mapping in allocation candidate  https://review.opendev.org/65758203:24
*** Sundar has joined #openstack-placement05:11
*** e0ne has joined #openstack-placement05:43
*** belmoreira has joined #openstack-placement05:49
*** Sundar has quit IRC06:06
*** evrardjp_ is now known as evrardjp06:43
*** belmoreira has quit IRC06:43
*** ttsiouts has joined #openstack-placement07:09
*** e0ne has quit IRC07:09
*** ttsiouts has quit IRC07:34
*** ttsiouts has joined #openstack-placement07:35
*** ttsiouts has quit IRC07:39
*** ttsiouts has joined #openstack-placement08:02
*** tetsuro has joined #openstack-placement08:07
*** takashin has left #openstack-placement08:30
*** e0ne has joined #openstack-placement08:32
*** tssurya has joined #openstack-placement08:40
*** ttsiouts has quit IRC09:01
*** ttsiouts has joined #openstack-placement09:02
*** ttsiouts has quit IRC09:06
*** mugsie_ is now known as mugsie09:52
*** cdent has joined #openstack-placement10:27
openstackgerritMerged openstack/os-traits master: Create trait for NUMA subtree affinity  https://review.opendev.org/65789810:47
efriedcdent:   https://review.opendev.org/663005 Release os-traits 0.14.0  for ^10:52
* efried rolls10:52
openstackgerritChris Dent proposed openstack/placement master: perfload with written allocations  https://review.opendev.org/66075411:11
openstackgerritTetsuro Nakamura proposed openstack/placement master: PoC: DNM: resourceless request  https://review.opendev.org/66300911:17
*** ttsiouts has joined #openstack-placement11:22
*** ttsiouts has quit IRC11:33
*** ttsiouts has joined #openstack-placement11:34
*** ttsiouts_ has joined #openstack-placement11:37
*** ttsiouts has quit IRC11:38
*** tetsuro has quit IRC11:39
*** cdent has quit IRC11:54
*** mugsie is now known as mugsie_12:18
*** mugsie_ is now known as mugsie12:18
*** Sundar has joined #openstack-placement12:24
*** cdent has joined #openstack-placement12:48
efriedSundar, cdent: o/12:51
* cdent waves12:51
Sundarefried: What's up?12:52
efriedcdent: cf ~1h ago, Sundar wanted to talk through some of the ways in which nested magic will satisfy "device affinity" use case12:52
efriedSundar: This is purely from a "device colocation" standpoint. You would set your model up so your device was a child of the compute (or NUMA node); and then your PFs (or whatever you want to call them) are children of the device RP.12:54
SundarYes. Before we make specific code changes like https://review.opendev.org/#/c/657419/, it would be good to have a sense of how it maps to all use cases12:54
efriedSo the device RP would actually provide no resources.12:54
efriedBut you would tag it with a trait like HW_DEVICE_ROOT (https://review.opendev.org/662811)12:55
efriedNow your device profile would list let's say two devices, each in their own request group:12:56
efriedand then you would add a "resourceless request group" for the device:12:56
efriedand then you would add a same_subtree to tie them together:12:57
efriedSundar: Does that make sense? And satisfy the use case you had in mind?12:57
Sundarefried: "required_DEV_COLOCATION=HW_DEVICE_ROOT" -- is this a new clause? Does not look like a trait12:58
efriedIt's just a request group with a (newly-supported) string instead of numeric suffix12:58
efried...and no 'resources' component.12:58
SundarWhy not: trait:HW_DEVICE_ROOT=required ?12:59
efriedthat's flavor syntax.12:59
efriedand also, it's not granular12:59
efriedwe need granular here.12:59
SundarWell, I am thinking from the perspective of flavors and request groups.13:00
efriedso like, if it weren't for the fact that the suffixes today are dynamically generated to guarantee uniqueness, you could have done the above with13:00
efriedsorry, HW_DEVICE_ROOT13:00
efriedSundar: Yes, this example is only addressing one use case: device colocation when the devices are all listed in the same device profile.13:01
SundarI suppose you mean: resources1:VF=1&....13:01
efriedeh? Isn't that what I said?13:02
SundarMay be I am nitpicky about the syntax. When you switch the punctuation around, I am not sure if it i a new thing13:02
SundarInstead of required3, one would have expected: trait3:HW_DEVICE_ROOT=required13:03
efriedI am talking about placement querystring syntax now13:03
efriedexpressing it from a flavor or a device profile is not a challenge13:03
efriedis that how we're doing granular in flavors? traitX:TRAIT_NAME=required => ?requiredX=TRAIT_NAME ?13:04
SundarOK. That goes to what I am saying. The use case (or usage model or hwhatever) consists of what is said in the flavors, device profiles and Neutron port RGs. That's what the operator says/sees.13:04
SundarIf we start from that, there will be more clarity (at least for me) about how it all ties together13:04
efriedokay, then I'm going to have to remind myself what the proposed device profile syntax is...13:05
SundarNP, just say it in flavor syntax and I can translate to device profiles. They are very similar13:05
efriedokay, then I'm going to have to remind myself what the flavor syntax is...13:06
SundarThat will help, lol13:06
efriedOkay, yes, so the flavor syntax today is13:10
efriedresources${suffix}:${resource_class}=${quantity}  ==>  ?resources${suffix}=${resource_class}:${quantity}13:10
efriedtrait${suffix}:${trait_name}=required  =>  ?required${suffix}=${trait_name}13:10
*** ttsiouts_ has quit IRC13:10
efriedprior to placement microversion 1.33, ${suffix} is only allowed to be an integer. As of 1.33 it can be an arbitrary string.13:10
*** ttsiouts has joined #openstack-placement13:11
efried(I'm actually not sure if the flavor side is validating the pattern of ${suffix}, but I kinda think we would have not done that, left it up to placement)13:11
SundarYes, I got that. When stating the use case, we can skip the Placement syntax13:11
SundarI think we'd do: resources1:VF=1; resources2:VF=2; trait3:HW_DEVICE_ROOT=required13:12
efriedand you'll need a new syntax for same_subtree13:12
SundarDoes that make sense (except for numeric suffixes)?13:12
SundarYes, that's my next q. How do you say same_subtree in flavors?13:13
efriedNot sure I see a reason it shouldn't be: same_subtree=1,2,313:13
SundarIs that at the same level as group_policy, syntax-wise13:13
efriednever really knew why we had to bass-ackwards the syntax in flavors, but here I don't even see a way you could reasonably do that.13:13
efriedexcept it can be repeated.13:14
efried(without suffixes)13:14
efriedso I guess for that reason, we would need to figure out a different way to express it in flavors :( :( :(13:14
efriedmaybe that's why we did it that way. Unique keys.13:14
efriedanyway, that's a solvable (if unpretty) problem.13:15
SundarHa. WHy not: resources_G1:VF=1; resources_G2:VF=2; trait_G3:....13:15
SundarThen: same_subtree=G1,G2,G313:15
*** ttsiouts has quit IRC13:15
efriedyes, exactly13:15
efriedexcept same_subtree=_G1,_G2,_G3 (the whole suffix required)13:15
efriedbut that's the idea.13:16
efriedPoint is, placement will allow you to use same_subtree more than once in the same querystring.13:16
efriedSo you can have multiple distinct affinity groupings.13:16
efriedBut doing that in flavor, where key has to be unique, will require some syntactic finagling.13:16
SundarGood so far. Now, I should add the same syntax to device profiles. So an operator can say that for CYborg devices13:16
efriedYes. But not in the current spec IMO13:17
*** mriedem has joined #openstack-placement13:17
SundarYou mean add a supplemental spec to current device profiles spec in CYborg?13:17
efried"support for affinity/colocation is out of scope and will be addressed later" - again IMO13:17
efriednot on the cyborg side, on the nova side. But yes, a supplemental spec13:18
efriedbecause it has a dependency on nested magic, which your current spec does not IIUC.13:18
SundarAgreed, the current Nova-CYborg spec will be left as is. No nested magic there13:19
SundarANyways, back to the feature13:19
SundarIf the device profile (or Neutron port RG) want to express a NUMA dependency, how do we name the dependee group?13:20
efriedI see what you're saying about the cyborg side - yes, assuming you're doing pattern/schema validation of the device profile syntax, you'll need to tweak that on the cyborg side.13:20
efriedbut that will be very minor13:20
efriedand I suppose somewhere you'll have to state the tree model cyborg will be pushing to placement.13:20
efriedOkay, so now you're saying you want your *devices* to have NUMA affinity with something coming from the *flavor*?13:21
efriedor with each other?13:21
SundarYes, devices may have NUMA affinity with some resource expressed in a flavor RG13:21
SundarIOW, request groups are in different subsystems -- flavors, neutron ports, device profiles13:22
SundarYet, latter 2 may want to refer to RGs in the flavor for NUMA affinity13:22
efriedRight, so here's where it'll get tricky.13:23
efriedBecause the components are coming from different places, we'll have to agree on a naming convention for the suffixes13:24
cdentefried, tetsuro: interesting bug that I feel into while doing other stuff: https://storyboard.openstack.org/#!/story/200582213:24
cdentfell, but feel works too13:24
efriedWe can't dictate that they be exact values; it'll have to be substring of some kind.13:24
*** ttsiouts has joined #openstack-placement13:26
SundarYes. We don't have to solve that right now -- just identifying the scope of what I see as missing13:26
efriedwe start getting pretty tightly coupled unfortunately. Like, a device profile that cites numa affinity won't be able to be used with a flavor that doesn't.13:26
cdentthat seems quite unfortunate13:27
SundarBut the operator sets the device profile in the flavor, so anyway there is a linkage13:27
cdentsorry meant to put a ? on the end of that13:27
Sundarefried: On a different note -- Re. "state the tree model cyborg will be pushing to placement", no trees: It is still a bunch of RGs, just as if we wrote all that in the flavor.13:28
SundarSorry, I have a call in a few min13:28
SundarBut I hope I managed to convey where the disconnects are13:28
efriedI'm talking about the resource provider model, not the GET /a_c or device profile13:28
efriedcyborg is responsible for creating the providers in placement.13:29
efriedTo support this colocation use case, it will have to create a (sub)tree structure13:29
SundarYes, sure. CYborg should create the tree in Placement appropriately13:29
efried"convey where the disconnects are" - I'm not sure what you're getting at here.13:29
efriedWe've at least superficially thought through all the use cases we know about and figured out how they can be satisfied at least on the placement call side.13:30
efriedWe haven't designed the nova side in detail13:30
efriedbut we have a sense for how it would work.13:30
efriedwhich fed the placement design13:30
efriedare we done yet? Of course not.13:31
cdentand that's fine, we can adapt13:31
*** cdent has quit IRC13:38
*** cdent has joined #openstack-placement14:02
efriedcdent: Perhaps we should discuss group_policy=isolate (in context of https://review.opendev.org/#/c/662191/3/doc/source/specs/train/approved/2005575-nested-magic-1.rst@266) here in IRC for a bit.14:12
*** mriedem is now known as mriedem_away14:14
cdentif you like, but if there's some concept that's difficult to digest, writing it down would be better for more peole14:14
efriedcdent: Yes, I assumed we could summarize and refer to this chat14:17
cdentI guess what I'm trying to say is: if it is best to chat about it for you to be able to convey the information, then yes, let's, but my preference would be to use asynchronous modes14:18
cdent(in all cases)14:18
efriedThe point is that the real world use cases mriedem_drillface identified are broken by group_policy=isolate14:19
cdentif you haven't seen this yet, there's some stuff about that here: https://anticdent.org/openstack-denver-summit-reflection.html14:19
cdentright, but existing use cases are not14:19
cdentit's okay to have two ways14:19
efriedyeah, I know your preference for async and agree with it to a certain extent14:19
efried"existing use cases" - sorry, perhaps we should clarify which is what. AFAIK there are no existing use cases other than the ones mriedem_novocaine identified, for which he had to hack up something different because we didn't have this yet.14:20
cdentthe two i mentioned don't count?14:21
efriedyou mean VCPU:1&VCPU:1&isolate? Not really, no. What are those useful for today?14:21
cdentif you have numa inventory and all you care about is that you get two different nodes (which I understood as _the_ reason isolate was created), that provides it14:22
cdentin any case: I'm not saying we should extend the syntax, I'm simply saying we can keep the existing because of two reasons: a) it appears to be useful for that concept (which is useful beyond nova, conceptually), b) it has to say around anyway because of microversions14:23
cdents/should/shouldn't/ # sigh14:23
efriedYeah, I think what I'm getting at is, as soon as we have >1.33, group_policy=isolate is going to cause more problems than it solves. If you want to address the above simple use case, where you aren't worrying about the anti-affinity policy bleeding into other request groups (like, where you aren't listing multiple devices and/or ports), you can use 1.33 or lower and group_policy=isolate.14:25
efriedput another way: "<1.33 is for simple shit. >=1.33 is for complicated shit." And a single global group_policy is "simple".14:26
cdentthat's essentially what I said too14:26
cdentthe difference might be that i wasn't thinking "turn off isolate after 1.33"14:26
cdentinstead: "don't use it most of the time after 1.33"14:27
cdent(with the added subtext of "don't use >= 1.33 unless you are a crazy person who hates clouds"14:27
efriedso perhaps it's time to quit requiring it and default it to none.14:27
cdentperhaps it is14:30
cdentI don't see an issue with that. My statement: was we need to keep this functionality, that's all14:31
efriedKeep it as in not remove it from <1.33, agree of course.14:32
cdentthere's a difference between what you're saying here:  "quit requiring it and default it to none" and what you're saying with "we should just get rid of group_policy at this microversion"14:33
efriedyes. I would prefer the latter, but the former is acceptable.14:34
efriedthough weak14:34
cdentI have mixed feelings on this14:34
cdentbecause of my mixef feelings about microversions14:34
cdentmost of the time I would prefer to not have to think about microversions14:34
cdentwhen we remove functionality at a microversion, I do14:35
efriedOkay, but at some point we're going to need to do something richer so we can express multiple affinity and anti-affinity subgroups. Do you agree that we'll need to deprecate/remove group_policy at that time?14:36
cdentthere's a third option here which is: don't mix group_policy with <some thing that got added at X.YY>14:36
efriedAt the very least it would have to be mutually exclusive with ... yeah, that.14:37
cdentI think mutually exclusive is better than removing14:37
efriedOkay, in this spec, let's move forward with "make group_policy optional, defaulting to none", you okay with that? (More below about the specific corner case)14:38
cdent(also I wish we didn't have to express all this stuff, but you know, I have to accept it)14:38
cdenti am okay with that14:39
cdentnot that I'm clear on what it really gains except making existing code need to change14:39
cdentcallers which are already writing group_policy have the option not to do so?14:40
efriedFor the corner case in question, we could:14:40
efried- Log a warning, but let it play out, which will likely result in no candidates - or possibly some really weird ones - but that's on you, cause you were warned.14:40
efried- 40014:40
cdentor are we more concerned about code that doesn't exist yet?14:40
efriedcode that doesn't exist yet14:40
efriedthat will use the microversion that doesn't exist yet.14:40
efriedthey shouldn't have to say group_policy=none if that's the only thing that makes sense.14:40
cdentyou make it sound like we have client code that _first_ branches on microversion and then does everything else, which is not the case14:41
cdentbut let's not get into that. I'm happy to accept the path described above14:41
efriedWhich, warn or 400?14:42
cdentwhat is the corner case you're thinking about? for the most part a warning log is something a client can never be guaranteed of seeing, so something informative is almost always better14:42
cdent"you make it sound..." was in response to "code that doesn't exist yet"14:43
efriedthe corner case is the one mriedem_nowspit was trying to solve when he was hacking up request filters to add traits to the request based on some configuration.14:43
efriedHe ended up adding the trait to the instance's flavor, which is hacky14:44
efriedwhat he would have liked to do is add a unique request group that just identifies the trait14:44
efriedso he didn't have to mess with the flavor14:44
efriedbut if group_policy=isolate was set (somewhere), that would result in not getting any candidates back IF the trait in question happened to be on the compute node resource provider (which it would have been) and the VCPU was in a different request group (which it would have been).14:45
efriedI can't remember whether group_policy=isolate forces the resources from the unsuffixed group to not overlap with other groups as well...14:46
* cdent tries to digest14:46
efried(looked it up, yes, they're allowed to overlap. So in this case, *assuming* the VCPU/MEMORY_MB ended up in the unnumbered group, it probably would have worked.)14:48
cdentcan you translate "but if..." into query params please14:49
efried?resources=VCPU:1,MEMORY_MB:1024&required2=COMPUTE_SUPPORTS_MULTIATTACH&group_policy=isolate  ==> I think this would work, because group_policy doesn't apply to the unnumbered group's interaction with numbered groups.14:52
efried?resources1=VCPU:1,MEMORY_MB:1024&required2=COMPUTE_SUPPORTS_MULTIATTACH&group_policy=isolate ==> This would *not* work because resources1 and required2 would have to be satisfied by different providers, but VCPU+MEMORY_MB+COMPUTE_SUPPORTS_MULTIATTACH are all on the compute RP.14:52
cdentand the latter is something you might want to express in a mixed numa/non-numa environment?14:54
cdentbecause, to me, neither of those queries are things I would expect to see14:54
cdentin the first one I would not use group_policy and nor would I use required214:55
cdent(just required)14:55
cdentand in the second I would target numa or not numa. It would work if the trait was on the compute-node and the inventory on numa. and it would _not_ work to target a non-numa host, which is the desired outcome14:56
cdent(with a query of that form)14:56
cdentjust because someone can write grawlix when talking to placement doesn't mean we should accept it14:58
efriedYeah, so the original desired solution in mriedem_temporarysunglasses's case was to do as you say, add the trait to the unnumbered group.14:58
efriedThat turned out to be hard for mechanical reasons in nova, not because it wouldn't have been the right thing to do in this case.14:58
* cdent nods14:58
efriedand assuming we support resourceless unsuffixed request group, it would still work - assuming the mechanical problem can be solved - in the general case where we can't know for sure that there will be any resources requested in unsuffixed groups.14:59
cdent(before I forget to mention it and I time out, I'm going to be gone tomorrow and thursday for a funeral, but will back with the pupdate on friday)14:59
cdentright, and resourceless looks like a runner, yes?15:00
efriedyou mean we want to do it? Yes, definitely15:00
cdentthat is: it's not an assumption, we're going to support it15:00
efriedso where does this get us? "group_policy=isolate will behave weirdly but correctly if you use it in the same breath with a suffixed resourceless request group - deal with it" ?15:01
cdentweirdly according to who? from what I can tell it's doing what it says it will.15:02
efriedIt will do what it says, which may or may not be what you mean15:02
cdentI think this supports what you were saying before: we can make it default to none if it is not there and _if you choose to use it with isolate_ be aware that we've handed you some rope15:04
cdentI think that's okay: most of the stuff beyond simple required and resources is rope15:04
efriedCorollary: "We will not make placement behave strangely/inconsistently to work around a weird code setup in nova that makes it hard to add shit to the unsuffixed group"15:04
*** ttsiouts has quit IRC15:04
cdentyou can simplify that to15:04
cdentWe will not make placement behave strangely/inconsistently to work around nova15:05
*** ttsiouts has joined #openstack-placement15:05
efriedor $consumer15:05
cdentbut thus far nova is the one that needs to go to jail the most15:05
efriedI was forgetting that there was another solution to matt's case15:05
cdentor let's make it even simpler15:06
cdentConsistency is important to placement15:06
cdent(with the subtopics of: "cuz, damn, it is already pretty hard to think about"15:06
efried(I'm not actually planning to write any of those things down into like a Tao Te Ching of Placement, just useful to put them in my head while we talk about this.)15:07
cdentbtw: "consistency is important..." is pretty much what I was trying to say on the review where I said "That's a bit of a logic flaw ..."15:08
cdenta Tao of Placement is a good idea15:08
cdentin our copious free time15:08
*** ttsiouts has quit IRC15:09
efriedyou've mostly convinced me that it's not worth changing group_policy's required-ness.15:10
efriedexcept that it's a thing I've hated since the beginning.15:10
cdentsometimes we just have to eat that. there are so many things that I've hated since the beginning.15:10
cdent(including group_policy)15:16
*** e0ne has quit IRC15:32
efriedcdent: See if I captured/summarized accurately if you please: https://review.opendev.org/#/c/662191/3/doc/source/specs/train/approved/2005575-nested-magic-1.rst@26615:46
cdentwill do in a mo, in debug zone at the mo15:46
*** Sundar has quit IRC15:57
*** cfriesen has joined #openstack-placement16:08
cfriesenwhen placement is using the nova_api DB, where are the "resource class ID" defined?  table "resource_classes" is empty, but I see "resource_class_id" used in the "inventories" and "allocations" tables.16:09
cdentcfriesen: if you're back a few versions, the id comes from the index of an enum, defined either in a file called rc_fields, or if you're really far back, a field in the fields.py down in the objects hierarchy16:11
cfriesencdent: this would be Pike16:11
cdentcfriesen: yeah, nova/objects/fields.py16:12
cfriesencdent: found it, thanks.16:13
cdentand there's a cache defined in nova/db/sqlalchemy/resource_class_cache16:13
cdentwhich sort of merges custom resource class ids and the ids from the enum16:14
cdentefried: yeah, that's a good summary16:16
efriedk, thx16:16
*** mriedem_away is now known as mriedem16:20
cdentefried: if you have a moment to gaze at https://storyboard.openstack.org/#!/story/2005822 you might have an insight on where the problem lies and perhaps more importantly what the actual desired result is. I've got to dash to dinner with my folks, but will be back a bit later for a bit more work before skipping the next two days. if you've got no clue, ping tetsuro, perhaps16:27
*** cdent has quit IRC16:29
*** tssurya has quit IRC17:38
*** amodi has quit IRC18:23
*** Sundar has joined #openstack-placement18:33
*** e0ne has joined #openstack-placement18:54
*** e0ne has quit IRC18:55
*** cdent has joined #openstack-placement19:20
*** Sundar has quit IRC19:23
*** e0ne has joined #openstack-placement19:29
*** e0ne has quit IRC19:30
*** altlogbot_1 has joined #openstack-placement19:40
*** Sundar has joined #openstack-placement19:50
*** e0ne has joined #openstack-placement20:23
openstackgerritChris Dent proposed openstack/placement master: Stabilize AllocationRequest hash  https://review.opendev.org/66313720:34
*** e0ne has quit IRC20:41
cdentefried, mriedem : that's a fix for the bug discussed earlier. It's a fix as is, but I've left some suggestions on the patch for how to make it deluxe if people feel like they want to before I'm back on friday20:42
*** e0ne has joined #openstack-placement20:42
*** e0ne has quit IRC20:43
mriedemi didn't pay attention to any earlier bug discussion20:43
mriedembut i reckon i can look at this regardless20:43
mriedemunless it involves NRP NUMA GPU TRAIT CITY20:43
cdentit's a python problem, not a concept problem20:44
cdentwhich is rare around these parts lately20:44
cdentsort of refreshing really20:44
mriedemheh right20:44
mriedem"can we just get back to slice and index error bugs please"20:44
mriedemoh i see py27 hashseed issues20:45
cdentnot really hashseed, but similar20:45
mriedemcdent: btw, are you going to a trump rally while he's somewhat local or something, is that why you'll be gone?20:46
mriedemi can only assume it is20:46
cdentheh. now Im going to make you feel bad: I'm going to a funeral20:46
efriedI was going to say it's roughly the same thing, but thought that might be in poor taste.20:47
cdentbut if there was a trash trump rally nearby I'd totally hit that20:47
*** dklyle has quit IRC20:49
*** dklyle has joined #openstack-placement20:49
cdentanyway, that's it, see yas20:55
*** cdent has quit IRC20:55
mriedemwell i threw a question in there but don't feel qualified to vote on it20:58
*** mriedem is now known as mriedem_away21:22
*** artom has quit IRC22:50
*** takashin has joined #openstack-placement23:54

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!