14:08:10 #startmeeting neutron_qos
14:08:11 Meeting started Wed Jan 27 14:08:10 2016 UTC and is due to finish in 60 minutes. The chair is ajo. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:08:12 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:08:15 The meeting name has been set to 'neutron_qos'
14:08:21 Hi everyone :-)
14:08:31 howdy
14:08:41 hi
14:08:47 #link https://etherpad.openstack.org/p/qos-mitaka The usual etherpad-agenda
14:09:31 Last meeting was a bit lonely, due to strange things with ics, or human error and miscommunication on my side :)
14:10:26 We had a couple of things merged since last meeting
14:10:43 RBAC spec: https://review.openstack.org/#/c/254224/ .. yey hdaniel !
14:10:52 :))
14:10:54 and Validation for the ml2 extension driver to be enabled when the qos service plugin is used with the plugin: https://review.openstack.org/#/c/253853/
14:11:10 yes slawek, probably more QoS related stuff was merged,
14:11:59 prolly. let's not celebrate for too long and look at stuff NOT merged :)
14:12:12 :)
14:12:24 yep
14:12:31 #topic Tracking items
14:12:46 If somebody dares to call me an optimist, they will be right. I'm still working on the RPC callback rolling upgrade core logic: https://review.openstack.org/#/c/265347/
14:13:23 I was working on another patch right now, and I will continue after the meeting. it's looking reasonably good, thanks ihrachys for the reviews
14:13:46 ajo : Hello
14:13:59 hi reedip_ welcome
14:14:05 I think we should be almost there with it, as long as tests are ready.
14:14:23 honestly, I was not checking those since it seemed to me we should shake out the API first.
14:14:38 ihrachys, yes, I think my current testing is reasonable, but I'm the writer; I should probably do an extra pass and verify I'm not missing anything
14:15:02 ihrachys, ack, let's get into testing afterwards
14:15:38 also, I must push a second patch for integration with the agent, adding some db migrations. I have a version locally which I haven't updated lately; it's now out of sync with the core logic API
14:16:04 I suggest we get the first piece in, then care about the 2nd one
14:16:28 yes, this is why I'm not updating the 2nd part patches without the 1st part being ready, it's too much rework every time
14:16:35 right
14:16:37 wasted work
14:16:53 ok, so let's keep that rolling, and going forward,
14:17:26 as a note, when this (https://review.openstack.org/#/c/268040/1) is ready, the DSCP patches will need to be rebased on top of it
14:17:34 but right now that breaks the world,
14:17:49 and it's quite far from my local version
14:17:54 ajo: it's already based on the l2-agent-extensions patch, I bet we don't want to have two deps
14:18:42 right
14:18:56 ok, they probably don't need the dep
14:19:18 or to rebase on top of mine, as long as it's not merged until rpc callbacks upgrade support is merged
14:19:20 we'll manage the order manually
14:19:27 agreed
14:19:41 not that there are people pushing the patch apart from us :)
14:20:14 ihrachys: can we use Depends-On for that?
14:20:25 ihrachys, will the gate check the dependency is merged before?
14:20:54 ajo: we can, in theory. but I would keep test failures separate
14:21:05 ajo: if you use Depends-On, you get all failures from all patches
14:21:20 aha
14:21:24 ok
14:21:45 and honestly, I would not bother with the dscp piece right now until we get the deps in. the deps are the blockers that concern me a lot.
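[Editor's note: for reference, the Depends-On mechanism discussed above is a Gerrit commit-message footer that the OpenStack gate uses to tie a change to patches in other reviews. A minimal sketch of such a commit message; the Change-Id value is hypothetical:]

```text
Add QoS DSCP marking rule support

This change builds on the l2-agent-extensions work and must not
merge before it.

Depends-On: I0123456789abcdef0123456789abcdef01234567
Change-Id: Ifedcba9876543210fedcba9876543210fedcba98
```

[As noted in the discussion, a Depends-On footer also pulls the dependency's test failures into the dependent change, which is why the team chose to manage merge order manually instead.]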
14:22:21 ok, let's try to move the upgrades quicker,
14:22:34 ok, next topic then
14:22:44 #topic RBAC status
14:23:01 hdaniel, could you update us on RBAC integration status? :)
14:23:05 sure
14:23:28 the patch is rewritten - to be used in a more "magical" way
14:23:33 hello
14:23:54 per ihrachys' suggestion - I've added a metaclass to inject the functionality
14:24:05 will push the first wip patch today.
14:24:12 nice
14:24:25 wait till you see it,
14:24:29 :)
14:24:32 lol
14:24:34 lol
14:24:38 ok let's see it first indeed :D
14:24:53 the client's patch needs to be rebased too, but I didn't have the time yet
14:25:09 I must admit, magical sounds both good and worrying ;D
14:25:18 for example, mixins could seem magical at first sight
14:25:33 so, let's see it :D
14:25:46 mixins are like mushrooms - they are magical, but there's a price to pay ..
14:25:54 lol
14:26:08 ok, so I'm pretty sure I'll push the WIP draft today
14:26:21 sounds reassuring :)
14:26:24 thanks hdaniel :)
14:26:37 ok, next topic
14:26:46 #topic Linux bridge integration
14:26:57 slaweq, the stage is yours
14:27:08 ok, so
14:27:21 my patch for qos in linuxbridge is (I hope) almost ready
14:27:29 most reviewers gave "+1" for it
14:27:32 it was pretty clean the last time I checked, yes
14:27:41 I have to update some docs and release notes for it
14:27:51 we need fullstack, that's the concern now I guess
14:27:53 but I'm still fighting with the fullstack tests support
14:28:05 have you tried adding rootwrap filters as I suggested before?
14:28:16 that is not so easy for linuxbridge (I'm testing connectivity there)
14:28:23 ihrachys: not yet
14:28:37 I'm on a business trip this week and don't have too much time for it
14:28:51 but I hope to do it today or tomorrow
14:29:04 that's fine. note that I will be pretty much off next week.
14:29:17 FYI: the fullstack test for connectivity passes for me with linuxbridge, but only when I run it as root
14:29:36 when I run it as a normal user the linuxbridge agent is not spawning properly
14:29:44 and the test fails
14:29:54 slaweq, looks like the rootwrap filters for fullstack then
14:29:59 ...and we suspect it may be a missing IpNetNs rootwrap filter for the lb agent
14:30:08 so I have to solve this issue and do some (big) refactoring to address all comments from reviewers
14:30:29 ajo, ihrachys: yes, it looks like that :)
14:30:31 slaweq: pay special attention to John's comments, he is our fullstack guru :)
14:30:47 but I have to check it and see where I should configure rootwrap rules for tests
14:31:03 I know where to add it for "normal" working scripts but not for tests
14:31:31 right. I think you need to run deploy_rootwrap.sh as part of the fullstack target
14:31:38 as we do for the functional target
14:32:01 ok, thx for the tips ihrachys
14:32:04 ok, I guess we have a plan here.
14:32:11 let's move on
14:32:13 so generally that's it for linuxbridge
14:32:23 thanks slaweq and ihrachys :)
14:32:29 #topic DSCP markings
14:32:42 njohnston, vhoward, any update on the topic?
14:33:14 I have to admit I haven't looked into the code lately. I was spending time on reviews for ajo and David for l2-agent-extensions.
14:33:25 yes, I was going to point that out
14:33:34 #link https://review.openstack.org/#/c/267591/
14:33:43 right, that one
14:33:57 and
14:33:58 I think we are mostly ok with the API part, just need testing coverage
14:33:59 #link https://review.openstack.org/265347
14:34:01 "just"
14:34:05 ok
14:34:17 is David around?
14:34:24 davidsha:
14:34:38 davidsha, ping :)
14:35:21 Hi, sorry
14:35:50 I'd loop in yamamoto and ann to check the patch :)
14:35:56 Nate said he was working on tests for the l2-agent
14:36:22 I'm working my way through the comments.
14:37:16 oh nice that you do stuff in parallel.
:)
14:37:22 :)
14:37:46 I try to get back to the patch every day; hopefully I don't slow it down
14:37:55 ajo: ok, I will add them.
14:38:08 I have pinged yamamoto_, akamyshnikova on irc, but cannot find Ann's email
14:38:16 ihrachys: thanks
14:38:39 ok,
14:38:46 #topic Open discussion
14:38:58 does anybody want to raise any other topic or important bug?
14:39:02 * ajo looks at the bug list
14:39:05 ajo : ping
14:39:23 hi reedip, go ahead :)
14:39:23 I just want to say we need more reviews on our patches in general
14:39:24 ajo: any update on the scheduling part?
14:39:51 oh scheduling... I saw today the spec was -1'd
14:40:00 due to some parallel effort from inside nova
14:40:04 ihrachys: yes, probably I should be spending more of my time on QoS patches instead of other stuff
14:40:10 ajo: can you plz help approving https://bugs.launchpad.net/neutron/+bug/1505631
14:40:11 Launchpad bug 1505631 in neutron "QoS VLAN 802.1p Support" [Wishlist,New] - Assigned to Reedip (reedip-banerjee)
14:40:41 vikram: it will be discussed at the next drivers meeting
14:40:48 ihrachys: true, I need to look at the other approach. I looked at it once a month ago and it didn't look compatible with what we wanted to do, but maybe it's able to model our problem and I just didn't get it
14:40:51 or on one of the next meetings ;)
14:40:52 ihrachys: thanks
14:41:02 ajo: the neutron side RFE for reporting actual bw was rejected
14:41:17 vikram, correct, it has the same RPC callback upgrade support dependency as DSCP does,
14:41:24 ajo: yeah, we should clarify that with nova folks before they get too far in their implementation
14:41:25 ajo: yup
14:41:48 wow, open discussion indeed :)
14:41:53 irenab: I think it was sort of 'postponed'
14:41:57 irenab, yes, as ihrachys said, we need to clarify who pushes that, and how
14:42:02 just because we are not sure how the scheduler will work
14:42:08 ajo: I just wanted to share https://bugs.launchpad.net/neutron/+bug/1505627
14:42:09 Launchpad bug 1505627 in neutron "QoS Explicit Congestion Notification (ECN) Support" [Wishlist,New] - Assigned to Reedip (reedip-banerjee)
14:42:09 if it's going to be the agents, we don't need centralised reporting to neutron-server
14:42:09 ajo: ihrachys agreed
14:42:35 if somebody has some time to analyze the other nova spec again
14:42:37 it would be great
14:42:44 I don't have enough bandwidth currently :(
14:43:06 #link https://review.openstack.org/#/c/253187/ resource-providers: generic resource pools
14:43:21 maybe every host can be a resource pool...
14:43:24 irenab: are you up for the challenge? :)
14:43:45 but then, we'd have to see how to integrate us updating the resource pool details... I almost don't remember
14:43:51 ihrachys: I am afraid I am over my BW currently
14:43:52 maybe jaypipes is up and around ;)
14:44:25 that's sad
14:44:38 well, we'll get to it some time
14:44:48 ajo: should we have some todo in the etherpad?
14:45:06 reedip_, vikram: we may look at that when we have bandwidth, but please, move this https://beta.etherpad.org/p/Notepad to etherpad.openstack.org
14:45:14 beta.etherpad.org can be eventually deleted
14:45:19 it's just a demo
14:45:26 ajo : ok, in a minute
14:46:04 ihrachys, isn't that the tracking items?
14:46:13 ihrachys, the slow-moving ones are the ones unattended
14:46:24 ajo: yeah, but it's like a general scheduler thing. we need some details like the link to the alternative spec
14:46:52 oh sorry, I see it's there
14:47:09 ajo, vikram, ihrachys: https://etherpad.openstack.org/p/QoS_ECN
14:47:13 ihrachys, np, making it clearer
14:47:52 ok, that's fine now
14:48:18 thanks reedip_, link it to the bug so it doesn't get lost
14:48:26 already done ajo
14:49:51 ok
14:49:55 anything else?
14:50:11 or shall we do the farewell dance? :)
14:50:15 * ihrachys dances
14:50:22 :-)
14:50:25 lol :)
14:50:30 lol
14:50:43 ajo: I'm at the Nova mid-cycle...
14:50:50 ajo: how can I help you?
14:50:54 hi, jaypipes :)
14:51:05 ok, let's not end the meeting, we have 10 more minutes
14:51:15 we were talking about your spec: https://review.openstack.org/#/c/253187/
14:51:32 and we wonder if we could use your model to track available bandwidth on hosts
14:51:42 track and decide on
14:51:57 yes, and decide on, for instance, scheduling purposes
14:52:13 jaypipes, ^
14:52:35 ajo: yes, I am almost done with the next revision, which has a description of the use case of routed networks in Neutron and using generic resource pools in Nova to give the cloud user the ability to specify a port and have Nova able to understand that that port is in a particular subnet, which is attached to a particular segment, and that segment is attached to a rack of compute nodes.
14:53:10 ajo: oh... bandwidth...
14:53:19 jaypipes, ok, that's related, and probably more detailed than our intent
14:53:24 ajo: sorry, that wasn't the use case I talked about with Carl.
14:53:35 jaypipes, correct, it's something different
14:53:45 our intent was to track available bandwidth on hosts
14:53:54 towards providing bandwidth guarantees to instance ports
14:53:59 yeah, Carl cares about L3, but here we have the QoS case.
14:54:01 ajo: the problem with QoS and bandwidth is that unlike other resources, bandwidth is an entirely transient thing. it changes from minute to minute.
14:54:19 jaypipes, not for guarantees
14:54:40 jaypipes, for guarantees you need to know the total host bandwidth (ingress & egress) over each specific physical network
14:54:52 ajo: it's not a quantifiable resource, though... it's a qualitative assertion about something.. more of a capability than a resource.
14:55:04 and then it's left to the underlying technology to make sure the bandwidth limits and guarantees are kept within limits
14:55:09 the important constraint is
14:55:41 on a physical network: sum(port[i].guaranteed_bw) < total_bw on that physnet
14:55:48 for a specific host
14:56:22 the idea is that no instance would be scheduled to that host if it's using a port with a specific bw requirement that we won't be able to guarantee
14:56:31 ajo: what is the difference between "guaranteed bw" and normal bw?
14:56:38 because we'd be overcommitting the host
14:57:04 jaypipes: the idea of the guarantee is that
14:57:18 you provide a minimum bandwidth for a port (instead of a maximum)
14:57:18 ajo: no, sorry, lemme rephrase my question...
14:57:45 so, in any case, if the port is trying to use that bandwidth (or more), its packets are sent first
14:57:53 then the packets of ports with no guarantees are sent
14:58:13 or the packets of ports with lower guarantees (or already sending over their minimum bandwidth guarantee)
14:58:40 ajo: if a port has 100 total bw capacity, and 10 instances are given 10 guaranteed bw units each, is there some other process (not an instance, or some privileged thing?) that can consume bandwidth on that port and essentially overcommit the port to more bandwidth than SUM(port[i].guaranteed_bw)?
14:58:44 there are a few ways to do it, but basically it's based on queues and packet prioritisation
14:59:07 the assumption here of course is that the underlying infra is capable of handling any bandwidth from any compute node
14:59:13 jaypipes, if the guaranteed ports are not using the traffic, the unguaranteed ones can use it
14:59:21 ihrachys, correct
14:59:35 that's another level of complexity I don't feel capable of tackling yet
14:59:38 jaypipes: well, it does not require an ideal fit, it's best effort
14:59:47 correct,
15:00:00 ajo, ihrachys: so I think the current modeling of generic resource pools will be able to meet these needs.
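[Editor's note: the scheduling constraint ajo states above (the sum of per-port guaranteed bandwidth on a host must stay below the physnet's total capacity) can be sketched as a simple admission check. This is a minimal illustration, not Neutron code; all names are hypothetical:]

```python
def can_schedule(guaranteed_bws, physnet_total_bw, new_port_bw):
    """Admission check for a minimum-bandwidth guarantee.

    A new port fits on a host only if the existing guarantees plus
    the new one stay strictly below the physnet's total capacity,
    i.e. sum(port[i].guaranteed_bw) < total_bw (no overcommit).
    """
    return sum(guaranteed_bws) + new_port_bw < physnet_total_bw


# A host with 10 Gbit/s on a physnet, three ports guaranteed 2 Gbit/s each:
print(can_schedule([2, 2, 2], 10, 3))  # 6 + 3 < 10 -> True, fits
print(can_schedule([2, 2, 2], 10, 5))  # 6 + 5 > 10 -> False, would overcommit
```

[Ports may still burst above their guarantee when capacity is idle, as discussed above; the check only governs placement, while queueing and packet prioritisation enforce the guarantees at runtime.]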
15:00:19 jaypipes, thanks for the feedback, I will re-read your spec as soon as I can
15:00:21 cool with me :)
15:00:24 ajo, ihrachys: or at least the latest (almost ready to be uploaded) patch's model will meet those needs ;)
15:00:25 and try to fit our needs with your model
15:00:33 we need to consider the case, that's all we are asking :)
15:00:39 yup, of course.
15:00:45 thanks a lot :)
15:00:50 I'm ending the meeting
15:00:51 thanks a lot for replying!
15:00:56 #endmeeting