14:03:34 #startmeeting neutron_qos
14:03:35 Meeting started Tue Apr 21 14:03:34 2015 UTC and is due to finish in 60 minutes. The chair is ajo. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:03:36 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:03:38 The meeting name has been set to 'neutron_qos'
14:03:43 ajo: I can send irenab an sms
14:03:57 moshele, thanks :)
14:04:02 #topic Announcements
14:04:09 ajo: pong
14:04:16 hi sc68cal :)
14:04:24 #link #link https://etherpad.openstack.org/p/neutron-liberty-qos-code-sprint
14:04:29 woops :) double link :)
14:04:31 #link https://etherpad.openstack.org/p/neutron-liberty-qos-code-sprint
14:05:06 I guess most of you are already aware of the QoS code sprint Red Hat is proposing; feel free to sign up if you believe you can come, or collaborate remotely.
14:05:49 but we have specs to get merged, and a lot to agree on, before a code sprint can happen :)
14:06:02 any questions about it? :)
14:06:25 ok, let's move on
14:06:37 ajo: irenab will join shortly
14:06:49 hi
14:06:56 hi irenab !! :)
14:07:01 sorry for being late
14:07:08 np :)
14:07:12 just in time
14:07:15 #topic updated spec
14:07:24 #link https://review.openstack.org/#/c/88599/
14:07:36 I have updated sc68cal's spec with a few ideas about splitting the initially proposed model
14:08:02 into QoSProfiles, composed of QoSPolicies
14:08:43 for ratecontrol/bandwidth limiting it may be too much, but I wanted the models/API to be able to grow without big changes or breaking backward compatibility of the API endpoints in the future.
14:08:50 ajo: what's the problem you're attempting to solve with that setup?
14:09:11 aveiga, for example, there were concerns in previous models about applying several profiles to one port/network
14:09:27 aveiga, for example, you could want to mark packets + bandwidth limit them...
14:09:49 or you may want (in the future) to be able to write QoSPolicy rules which target traffic selectively
14:10:05 ajo: cbits and vhoward- are also from Comcast and interested in the QoS work
14:10:09 for example... tcp.port=80
14:10:27 aveiga, does it sound reasonable?
14:10:31 ajo: I see where you want to go with this
14:10:55 aveiga, something like security groups, but for applying QoS policies to different types of traffic
14:11:05 you can have a selective overload of that setup, where ports on tenant x by default have forced ratelimiting, but maybe they also want to add a mark for some traffic
14:11:13 ajo: on the same neutron port, right?
14:11:20 irenab correct,
14:11:46 the fun part is the collapsing logic, because if they're doing different things it's fine, but some things are one property only or may be mutually exclusive
14:11:54 exactly
14:11:54 I suppose that's an implementation detail though
14:12:13 we need a way to provide feedback on failure, though
14:12:20 I was going to get into that; we may need some sort of logic to check that we can add an extra rule that works with the previous ones in the profile
14:12:36 aveiga: agree, and this may sometimes depend on the QoS backend implementation
14:12:42 it would be a poor UX to allow them to try setting two DSCP marks against the same port without a way to tell them why it won't work or which one was actually applied
14:13:01 aveiga, but for a first iteration, it seems only ratecontrol/bandwidth limiting is getting accepted by neutron-drivers (to control how we design/grow the feature),
14:13:11 :(
14:13:28 aveiga, anyway, I'm all for it, if we finish the first steps,
14:13:37 ok
14:13:39 let's go ahead and start developing the U/S dscp part
14:13:56 ajo: what is your intention regarding the ref implementation?
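[Editor's note] A minimal sketch of the QoSProfile/QoSPolicy composition and the "collapsing logic" discussed above, rejecting mutually exclusive rules with immediate feedback. All class and rule-type names here are hypothetical illustrations, not the spec's final API:

```python
# Hypothetical sketch of QoSProfiles composed of QoSPolicies, with a
# conflict check so the caller learns why a rule is rejected instead of
# silently having one of two DSCP marks applied. Names are illustrative.

class QoSPolicy:
    def __init__(self, rule_type, **parameters):
        self.rule_type = rule_type      # e.g. 'ratecontrol', 'dscp_mark'
        self.parameters = parameters    # e.g. max_kbps=10000


class QoSProfile:
    # rule types assumed to be one-property-only per profile
    SINGLETON_TYPES = {'ratecontrol', 'dscp_mark'}

    def __init__(self, name):
        self.name = name
        self.policies = []

    def add_policy(self, policy):
        """Reject a rule that conflicts with ones already in the profile."""
        if policy.rule_type in self.SINGLETON_TYPES and any(
                p.rule_type == policy.rule_type for p in self.policies):
            raise ValueError("profile %r already has a %r rule"
                             % (self.name, policy.rule_type))
        self.policies.append(policy)


profile = QoSProfile('gold')
profile.add_policy(QoSPolicy('ratecontrol', max_kbps=10000))
profile.add_policy(QoSPolicy('dscp_mark', dscp=26))
try:
    profile.add_policy(QoSPolicy('dscp_mark', dscp=46))  # second mark: rejected
except ValueError as e:
    print(e)
```

This is only a model of the validation idea; where the check lives (API layer, db layer, or backend) was still under discussion.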
14:13:58 it's probably not going to be too complicated with the Comcast code + the bare bones in place
14:14:16 as long as we can get the API nailed down, we can port our stuff over and merge later
14:14:58 aveiga, sounds good; maybe, if you want, you can work on a follow-up spec to the current one, to extend it with dscp, so we know how it looks
14:15:04 yes, data model and api, we would like to see that solidified in the spec, we can help
14:15:16 I think if the API is open enough, different plugins may have various options to support, sometimes richer than the ref implementation
14:15:47 irenab, about the ref implementation, I have another point in the meeting, give me 1 min :)
14:15:53 irenab, I agree
14:15:55 +1 irenab
14:15:59 that comes to another question,
14:16:08 we're defining the bandwidth limiting fields,
14:16:33 but do you believe we should accept other fields?
14:16:49 ajo: at the API level?
14:17:00 I guess accepting different fields without control could lead to non-interoperable policy rules
14:17:25 irenab, talking about the parameters dict: https://review.openstack.org/#/c/88599/7/specs/liberty/qos-api-extension.rst
14:17:29 line 125
14:18:36 I have defined as much as I can think of, but for example, max_latency_ms can't be applied to ovs (AFAIK), yet it was proposed in previous versions of the spec.
14:18:39 sc68cal, do you recall why?
14:19:05 ajo: In my opinion there is a difference if we require this at the API level or at the implementation/db level
14:20:08 irenab, what do you mean
14:20:09 ?
14:20:22 ajo: That was proposed in the spec for the linux bridge implementation for rate limiting
14:20:29 ah, ok, API vs impl
14:20:30 since it was using tc as the driver
14:20:48 it might be a good idea to abstract out the api calls as high as we can
14:20:57 aha, sc68cal, so tc supports latency settings
14:21:02 ajo: yes
14:21:06 There are validations that can be done at the API level, so from the API perspective, I think there should be 'key', 'val' pairs
14:21:14 aveiga: the problem with making this all abstract is how do we validate
14:21:19 yup
14:21:26 I agree with aveiga, we may abstract the parameters as much as we can, but...
14:21:33 but where the implementation is involved, we can check the supported 'keys' only
14:21:41 every implementation is a bit different
14:21:49 irenab, that makes sense,
14:22:02 maybe we can check the default keys, and leave room for non-default ones
14:22:15 backends should report what fields they support?
14:22:23 so every vendor could leverage extra capabilities
14:22:24 in my opinion we should not close the API to only the currently known list
14:22:39 matrohon: +1
14:22:49 irenab, ok, we may need to check that with neutron core/drivers to see if it's ok; are we doing such a thing in other places?
14:23:10 matrohon, or we can pass parameters to the backend for checking
14:23:25 I guess extra_dhcp_opts are similar in this sense
14:23:29 it looks like extension support for plugins
14:23:39 aha
14:24:30 #action ajo allow extra settings in the parameters dictionary to be checked by the plugin/specific implementation.
14:24:33 ajo: calling a backend during a transaction should be avoided
14:24:49 matrohon, true
14:25:05 in that case, we can check the available parameters with the backend at start
14:25:18 extensions could be interesting; is it possible to pass an object as the value in a k/v pair?
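[Editor's note] A hedged sketch of the key/value parameter idea being discussed: each backend reports the keys it supports up front (at agent start, so no backend call happens during a transaction), the API validates against that set, and a value may itself be a nested object. All names are hypothetical:

```python
# Hypothetical sketch: backends report their supported parameter keys,
# and the API layer validates a rule's parameters dict against that set.
# Key names and backend names are illustrative only.

SUPPORTED_KEYS = {
    'ovs': {'bandwidth', 'dscp_mark'},
    'linuxbridge': {'bandwidth', 'max_latency_ms'},  # tc can shape latency
}


def validate_parameters(backend, parameters):
    """Return the set of unsupported keys, so the API can report
    exactly why a rule was rejected for this backend."""
    supported = SUPPORTED_KEYS.get(backend, set())
    return set(parameters) - supported


# A value can itself be a nested object, e.g. min/max under one key:
rule = {'bandwidth': {'min_kbps': 1000, 'max_kbps': 10000}}

print(validate_parameters('ovs', rule))                    # empty set: ok
print(validate_parameters('ovs', {'max_latency_ms': 10}))  # unsupported on ovs
```

Checking only the known default keys while letting plugins accept extra, vendor-specific ones would amount to replacing the subtraction above with a per-plugin hook.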
That way you might be able to provide, say, min and max bandwidth and min/max latency in the "bandwidth" key
14:25:37 but ok, I guess this is all implementation detail we can discuss over the spec.
14:26:01 ajo: +1
14:26:06 is it ok to discuss the details over the spec?
14:26:09 thanks matrohon
14:26:10 ajo +1 on being flexible and letting the backend define the validation.
14:26:48 #topic reference implementation
14:27:17 (irenab, I had another point later for the service-plugin vs mechanism-driver or other ways...)
14:27:30 ajo: ok :-)
14:27:47 about the reference implementation, I guess it should go in ml2-ovs; I know irenab you were interested in ml2-sriov, not sure if that's still the case
14:28:20 sc68cal, in the previous spec there was somebody proposing ml2-linuxbridge, which I guess makes sense too if we have people willing to do it
14:28:30 ajo: I think SR-IOV is in moshele's hands now
14:28:38 moshele +1
14:28:38 :)
14:28:50 ajo: yes, so we would have ml2-lb and ml2-ovs implementations ready for revival
14:28:52 we were interested in ml2-ovs also
14:29:07 the mech driver
14:29:15 I think there was also an ml2-ovs rate limit impl. floating around too
14:29:31 ajo: I will do the ml2-sriov
14:29:37 what QoS functionality is going to be implemented? only rate limit?
14:29:37 irenab, I recall your design where a few parts could be reused across several ml2-agents, if I didn't get it wrong
14:30:03 ajo: right. The idea was to extend existing l2 agents in a reusable way
14:30:16 at least ratelimit, yes
14:30:23 that's the recommendation from neutron drivers
14:30:28 and the PTL
14:30:57 if we put testing in place + a good design, we have a good amount of work ahead,
14:31:18 but I'm all for writing more policy types if we finish early
14:31:35 irenab: no, we are looking to mark traffic, not just rate limit
14:31:37 irenab, ajo: are we speaking about a modular agent revival?
14:32:07 matrohon: related to this, right
14:32:09 We are also looking to filter traffic based on DSCP marking
14:32:18 ajo: for the ml2 extension_manager we will need this change as well: https://review.openstack.org/#/c/162648/
14:32:29 we are looking to blueprint an ml2-ovs mech driver for qos, so we will keep you all in the loop on that, further to what cbits said
14:32:38 cbits: with the upstream OVS implementation?
14:33:03 yes
14:33:49 moshele, it looks like it will eventually get merged as it's fixing a bug, right?
14:33:55 it will
14:34:00 from the API perspective, it makes sense to have all these options (and maybe more) available from the beginning, while only a subset can be implemented in the first iteration
14:34:27 irenab, which options?
14:34:42 ajo: yes, it was frozen because of kilo, but we will push it now
14:34:43 the ones mentioned by cbits
14:35:04 cbits, ack :)
14:35:09 DSCP marking,..
14:35:09 I missed cbits' comment about DSCP filtering
14:35:16 cbits: that's out of scope
14:35:28 filtering is an SG change, not a QoS change
14:35:35 it's related; it looks more like a security group change, correct
14:35:39 or rather, port security
14:35:42 that's an extension that you guys made to the sec group API, one that you're going to have to carry
14:36:12 cool
14:36:19 they could propose an extension to the security groups API to have a dscp field U/S
14:36:49 #topic access control / ACLs
14:36:51 ajo +1
14:37:10 in a previous "pre-meeting" we discussed access control to QoS profiles
14:37:12 ajo: extension of an extension. yaaay.
:)
14:37:23 sc68cal, or just update the extension :)
14:37:48 ajo: kevin updated the RBAC spec https://review.openstack.org/#/c/132661/ to be generic
14:37:49 we believe in the future we could leverage the design of RBAC to control QoS profile access by tenants
14:37:57 #link https://review.openstack.org/#/c/132661
14:38:04 yeah, I saw it, great news :)
14:38:17 I need to read it through, I didn't have time yet
14:38:35 I think that's very good news, but let's not stretch ourselves too much for the first iteration
14:38:50 but for this cycle there was a general agreement on having the admin control the assignment of policies/rules to ports/networks itself
14:38:57 ajo: It looks suitable for managing QoS profiles
14:38:58 and disallowing it for tenants for now
14:39:07 ajo: admin/owner?
14:39:31 I'd say admin only for now, but it's just a matter of configuring the policies
14:39:59 for example, you don't want tenants randomly making bandwidth changes to modify a setting done by the admin
14:40:32 ajo: true, but we can do checks for if the network is shared
14:40:36 but if anybody needs a change on that, they only need to modify /etc/neutron/policy.json
14:40:38 then disallow
14:40:53 ajo: good point.
14:41:15 sc68cal, we can probably think of common defaults, but everybody is going to make different uses, I guess
14:41:23 some operators may not want to rely on the admin to set profiles
14:41:33 ajo: true, but you make a good point, policy.json is easy to change
14:41:34 and some operators may not want tenants to modify / set new profiles, I guess
14:41:42 ajo: and hey, it's actually getting some documentation, finally!
14:41:46 sc68cal, we could provide instructions on how to do it
14:41:55 sc68cal, really?
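[Editor's note] A hypothetical /etc/neutron/policy.json fragment along the lines discussed: QoS profile management restricted to admin by default, with operators free to relax it. The rule names here are illustrative; the actual policy targets would be defined by the extension:

```json
{
    "create_qos_profile": "rule:admin_only",
    "update_qos_profile": "rule:admin_only",
    "delete_qos_profile": "rule:admin_only",
    "get_qos_profile": "rule:admin_or_owner"
}
```

An operator who wants tenants to manage their own profiles would only need to change these rules, e.g. to "rule:admin_or_owner", without any code change.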
:D :-)
14:41:56 that's good
14:42:15 ok
14:42:20 a tiny topic before we jump into a bigger one
14:42:33 #topic ratelimiting vs ratecontrol
14:42:59 I was thinking that a ratecontrol "type" could be used for both min/max rate, if I'm not missing anything
14:43:42 not sure if we need separate types to define traffic guarantees
14:44:01 or just have a "min" field, specifying what's the minimum bandwidth we're providing on a port
14:44:37 does it sound reasonable/unreasonable?
14:44:48 little fuzzy on the difference
14:44:57 care to educate?
14:45:30 from the ovs/linux-htb point of view, having it together makes the "min" parameter meaningful, and we can control priorities over ports/bandwidth..
14:46:07 ok, I see no -1's at least :-), I will change it in the spec, but feel free to complain or ask me to change it back
14:46:15 let's move on
14:46:35 #topic service-plugin, or simple mechanism driver
14:46:43 irenab, do you want to lead this one?
14:46:52 ajo: ok
14:47:01 thank you :)
14:47:22 the question is whether the QoS profile/policy management should get its own service or be part of the Core plugin
14:48:18 I tend to see it belonging to a service plugin, potentially with different providers
14:48:20 Not sure if this is relevant, but I'd say we should get the API extension into core and then have our own repo to rapidly iterate
14:48:39 similar to all the *aaS repos
14:49:01 sc68cal: that's what I had in mind
14:49:13 ok, so I guess I'm in the service plugin camp
14:49:15 Well, I'm not 100% convinced we need a separate repo here, at least for the start
14:49:44 the extension definition can be hosted in the same repo as well
14:50:00 I'm happy about the design advantages that come with making it a service plugin,
14:50:28 and it's tightly coupled to the reference implementation of the ovs-agent, which lives in tree
14:50:30 irenab: are you sure?
last I checked the core repo still had the extensions for all the *aaS stuff
14:50:47 or was that just a procedural thing, where they hadn't split them out yet
14:50:59 sc68cal: not for newly introduced 'services', like l2gw
14:51:20 since it applies to core resources, I feel it's more in the scope of a Core plugin extension
14:51:45 matrohon, that's my feeling too
14:51:48 I guess it's just easier to maintain both the api and implementation together, but I do not have a strong opinion on where to put it.
14:51:59 irenab, sc68cal, could you enumerate the advantages of making it a service plugin instead?
14:52:28 ajo: Ideally it'd allow us to iterate in our own repo, have our own cores and such
14:52:33 we're basically setting parameters on the networks and ports
14:52:43 ajo: the trouble is - as you pointed out, the agents
14:53:29 sc68cal, yes, but I believe it's going to become an important feature for many telcos/operators, and it modifies parameters of basic resources like ports,
14:53:42 ajo: I think most of the advanced services eventually apply to core resources
14:53:44 I'm not sure that fits in the category of an advanced service living in a separate repository
14:54:00 irenab, well, they make use of them...
14:54:03 but do they modify them?
14:54:11 maybe FWaaS modifies routers,
14:54:12 but all the policies/profiles management is quite standalone logic
14:54:32 Let'
14:54:48 Worst case we can do a little research on the service plugin
14:54:51 we eventually map a port to some profile UUID
14:55:09 right now most of the existing code ties right into the core
14:55:09 irenab, sc68cal, matrohon, what do you think about looping in the cores in the next neutron meeting, and seeing what they think about it
14:55:11 ?
14:55:15 ajo: good idea
14:55:22 or a ML thread
14:55:28 ML should work too
14:55:28 a ml thread might be better
14:55:36 probably better
14:55:38 sc68cal +1
14:55:47 I would vote for service plugin, for clean separation of concerns and quicker iterations
14:56:03 you'll have to have a dedicated MD for the service plugin to be aware of any changes on ports/networks?
14:56:14 the recent spirit is to spin every possible part out :-)
14:56:38 armax introduced a registry mechanism that can be reused too
14:56:45 yes
14:56:50 matrohon: callback?
14:56:56 matrohon, the callbacks, right?
14:56:57 :D
14:57:05 irenab: agreed - we just have to figure out how to get the agents to work with the qos service with no code changes to core
14:57:05 yep
14:57:42 sc68cal, FWaaS extends the l3-agent, right?
14:57:43 sc68cal: this is the scope of the modular agent work in progress...
14:57:44 we tried to solve it in some AgentExtension way
14:57:48 but I believe that way is not very sustainable
14:57:58 we may need to extend the agents in a more dynamic way
14:57:59 ajo: I'd have to check how they do that
14:58:05 currently, there is no way to extend the agents (LB or OVS)
14:58:11 i know vpnaas just runs a whole new agent that inherits from the l3 agent
14:58:14 it
14:58:16 is nasty
14:58:18 yes, matrohon, correct
14:58:19 the l3agent has been easily extensible since kilo
14:58:29 ok
14:58:29 but not the l2agent
14:58:35 let's talk about it in the mail thread
14:58:38 1 min left :)
14:58:55 matrohon: sounds about the right time to make a change :-)
14:59:04 irenab: +1000
14:59:09 my opinion is more on keeping it in the core, but I'll make an impartial thread start, so we can discuss freely
14:59:25 ok
14:59:33 rosellab has been very active on the agent, but none of her improvements merged in kilo
14:59:35 core is tough, they're splitting the reference implementation from neutron
14:59:41 I also had a request from salv-orlando to share our proposed API with operators and telcos
14:59:44 to get feedback
14:59:53 for the neutron-lib work
14:59:59 since people complain generally about neutron API usability
15:00:05 ajo: sounds great, the earlier the better
15:00:07 * sc68cal trying to get as many words in 60 seconds
15:00:11 :D :D
15:00:16 it's not a strict requirement, but it is meant to avoid previous mistakes
15:00:24 like the ones we made with the load balancing apis
15:00:25 salv-orlando +1
15:00:28 ++
15:00:38 * sc68cal pokes aveiga
15:00:52 ajo: any work items for next week?
15:00:53 ok, I believe we shall free the channel
15:00:57 sorry, catching up post power outage
15:01:09 aveiga: you're our telco guinea pig
15:01:17 let's discuss over #openstack-neutron-qos
15:01:20 #endmeeting