21:01:11 #startmeeting networking
21:01:12 Meeting started Mon Aug 25 21:01:11 2014 UTC and is due to finish in 60 minutes. The chair is mestery. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:01:13 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:01:15 The meeting name has been set to 'networking'
21:01:18 o/
21:01:27 HI
21:01:28 #link https://wiki.openstack.org/wiki/Network/Meetings Agenda
21:01:34 o/
21:01:34 O/
21:01:38 #topic Announcements
21:01:44 `o
21:01:52 #link https://launchpad.net/neutron/+milestone/juno-3 Juno-3
21:01:53 o/
21:01:57 We're approaching the end of Juno now.
21:02:03 Feature Freeze is September 4, next week.
21:02:16 mestery: it’s actually the end of Augusto ;)
21:02:18 Any features which haven't merged will be moved out of Juno and back into the Kilo pile.
21:02:34 hi
21:02:44 salv-orlando: Next Thursday (9-4) is Feature Freeze :)
21:02:50 salv-orlando: Ah, got it. ;)
21:02:59 We have a lot of code to review and try to merge
21:03:06 And the gate is under heavy load at the moment
21:03:20 The check queue is over 200 at this moment in fact
21:03:40 Any questions on the end of Juno at this point?
21:04:31 mestery: idk. But I think we still have 84 targeted blueprints and we’d start to let know people that some of them can’t possibly merge.
21:04:47 salv-orlando: I actually bumped some out, we only have 66
21:05:02 mestery: thanks. I was actually asking for trimming the number
21:05:07 you must be able to read my mind.
21:05:19 mestery: +1
21:05:21 66 still seems like an awful lot.
21:05:22 salv-orlando: But it's true, we'll be lucky to merge even 10 or so if things progress as normal.
21:05:23 mestery, salv-orlando: Did all 66 make the FPF?
21:05:31 rkukura: I think so, yes.
21:05:53 I've seen more than a few examples that technically met the FPF but are in no shape to be merged.
21:06:18 marun: In those cases, please put a -2 on the patch, if a FPF patch was rushed, then we should focus on other BPs which are in better shape. Fair?
21:06:45 marun: that's what the review is for :-)
21:06:45 mestery: very much so. And any patch author that receives a -2 is welcome to argue their case - it's by no means final.
21:07:00 marun: Agreed.
21:07:04 mestery, marun: I would say that it’s more than fine to focus exclusively on blueprints whose code is in good shape.
21:07:16 salv-orlando: ++,
21:07:35 One other note here, starting the week after next we'll be going with the rotating meeting schedule for this meeting, per the email thread on the list.
21:07:40 I'll post more details on that this week.
21:08:02 I'd also like to change the format of this meeting to be more issue focused rather than sub-team rollup focused, the change will coincide with the meeting time change.
21:08:20 Any other announcements for the team?
21:08:31 What's the deadline for bug fixes?
21:08:45 banix: release day!
21:08:50 :)
21:09:00 * mestery thinks everyone always wants to know the very latest they can land code. :)
21:09:01 Ok thx.
21:09:05 banix: What salv-orlando said ;)
21:09:30 #topic Bugs
21:09:39 enikanorov: Hi, are you around?
21:10:18 salv-orlando: How is the gate holding up from a neutron perspective this week so far?
21:11:07 Does anyone else have any bugs they want to bring up in this slot?
21:11:54 #topic Neutron CI Systems
21:11:58 dane_leblanc: Here?
21:12:01 mestery: Hi There!
21:12:06 Hi
21:12:11 dane_leblanc: Can you give an update to the team from the third party meeting? emagana, hi! :)
21:12:30 dane_leblanc: This is on the issue you raised on the mailing list and at the third party meeting today.
21:12:33 Question came up about what to do when a CI was blocked on a bug fix
21:12:40 #link http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg32385.html
21:12:59 Consensus was to vote "SKIPPED" rather than not at all for changes that don't contain the fix
21:13:13 There's a section in the Neutron 3rd party CI wiki
21:13:24 #link https://wiki.openstack.org/wiki/NeutronThirdPartyTesting
21:13:24 If people can check it out
21:13:37 Would be good to get consensus
21:13:43 we need to figure out a mechanism for that with zuul.
21:13:46 Does Skipped refer to a failing test or the whole job?
21:13:49 dougwig: ++
21:13:50 kevinbenton has some ideas on how to do that
21:14:17 kevinbenton: Did you post your proposal on the ML?
21:14:21 Skipped for skipping the entire testing for the changes without a fix
21:14:29 Not here. Mobile ATM so can't type much
21:14:59 kevinbenton: N.p. It will be good to include it in the same thread
21:15:00 I'm trying kevinbenton's proposal on a zuul/jenkins setup
21:15:12 Will add a hint on the wiki if it works
21:15:15 Sounds a bit of an overkill unless the bug is causing failures in most of the tests you run
21:15:16 dane_leblanc: Your feedback on the approach will be good to get.
21:15:32 kevinbenton’s post to the ML: #link http://lists.openstack.org/pipermail/openstack-dev/2014-August/043636.html
21:15:42 dane_leblanc: can you msg me the suggestion, and i can also try it out?
21:15:45 salv-mobile: I believe that was the case, yes, it was failing most of the tests. dane_leblanc, correct me if I'm wrong.
21:15:49 SumitNaiksatam: Thanks!
21:15:57 #link http://lists.openstack.org/pipermail/openstack-dev/2014-August/043636.html
21:16:01 salv-mobile: One example is a new plugin for which plugin code is not yet merged
21:16:01 SumitNaiksatam: thanks for the link ;)
21:16:06 * salv-orlando btw I will talk about the gate in open discussion
21:16:12 Yes, failing all for a period of time
21:16:13 salv-mobile: thanks :)
21:16:32 And let's say your hardware depends on enabling code in your plugin, which isn't merged yet
21:16:47 Then all other changes will fail on your testbed
21:17:21 dane_leblanc: Thanks for driving this one, I know you've been working hard on this for the past week.
21:17:31 mestery: thanks
21:17:34 dane_leblanc: so it’s more about how to prove your CI is working even if the code it needs to test is not yet merged.
21:17:34 mestery: +1
21:17:51 dane_leblanc: thanks!
21:18:01 salv-orlando: We still want to test against the plugin
21:18:19 while it's being reviewed
21:18:27 * mestery is seeing multiple salv's in this meeting.
21:18:46 mestery: Do you want to discuss all other CI systems?
21:18:48 mestery: thanks not a bad thing! :-)
21:18:53 dane_leblanc: definitely. my take is that CIs need to be extensive and accurate, and we should be strict about them - but not pedantic
21:18:54 emagana: Lets use this slot, yes.
21:18:56 dane_leblanc: I usually update my CI to include such patch and apply the patch before running tests - does it make sense?
21:19:13 it sounds like you’re having this problem because it’s been required to vote on ALL patches
21:19:56 Yes, but even if it's required just on one section of Neutron
21:20:09 * salv-orlando had to reboot his mac and used his smartphone in the meanwhile
21:20:45 emagana: Did you want to discuss other CI systems now?
21:20:54 mestery: Yes!
21:21:14 On the third-party CIs review, great response from most of the CI owners.
21:21:40 dane_leblanc: correct, but I would like to avoid excessive pedantry. I think it would be enough proof for me to see that a CI leaves a simple comment stating “sorry I can’t test anything in this patch”
21:21:54 However, there are still few CIs that are not fulfilling the minimal requirements for Juno.. SO, what will be the action to taken for those CIs?
21:22:18 emagana: They will be marked as untested in the release notes at a minimum
21:22:19 to be taken*
21:22:30 What do others think here?
21:23:07 +1
21:23:08 mestery: if you send a warning notice on the mailing list with the requirements it won’t come as a surprise to the various maintainers
21:23:23 mestery: I am fine with that. I will also propose for the next summit (I mentioned before) to have a short session on third-party testing requirements
21:23:30 salv-orlando: I'll send another warning notice, though this isn't new, I sent emails in June/July that this was coming.
21:23:34 salv-orlando: But another warning isn't a bad idea.
21:23:36 emagana, I think you already sent a notice on the mailing list. So it's not a surprise.
21:23:51 also I’ve seen a trend towards assuming 3rd party CIs should “tend to” the upstream CI.
21:24:11 mestery: All the questions and experiences that I have been collected will be used as baseline for the requirements but this time the removing code action will be enforced!
21:24:13 I think our goal is not to force plugin/driver maintainers to do continuois development
21:24:23 * continuous
21:24:40 lyxus: indeed
21:24:47 salv-orlando: Our goal is to ensure they meet the third party requirements as documented on the wiki if they want to be upstream.
21:24:50 The goal of 3rd party CIs is to prove the stuff which is in the source code tree works, is reasonably efficient and does not bitrot
21:25:08 salv-orlando: +2
21:25:21 That's what the wiki requirements are trying to enforce. If people feel the requirements there don't, we should update the wiki.
21:25:47 salv-orlando, agree and not pollute the reviews...
21:26:07 mestery: I have not received any negative feedback yet about the minimal requirements for Juno
21:26:07 mestery: my point is simply that I’m seeing more emphasis on how CIs run the tests rather than on which tests the CI run
21:26:31 I would remark that the latest bit is the most important… I hope that’s a shared feeling
21:26:32 mestery: On the contrary, based on those discussions I have seen improvement on most of the CI staff
21:26:36 salv-orlando: Yes, I get it, lets make sure the wiki reflects the importance of which tests to run.
21:26:46 emagana: Good, then we're helping here :)
21:27:11 mestery: I will continue the audit at least every other week
21:27:18 emagana: cool, thanks!
21:27:25 mestery: I think the wiki is fine, no worries.
21:27:29 Until we reach Kilo and have a proper process in place
21:27:47 mestery: I think emagana did update the wiki with that information - i.e. what tests to run
21:27:50 salv-orlando: Thanks for reviewing the wiki!
21:27:55 emagana, For the audit, just give it couple of hours for the CI running -1 manually
21:27:57 emagana: We need to continue working with the broader community here, we have the most drivers/plugins, but what we're doing is applicable to nova and cinder at the least as well.
21:28:03 lyxus: You got it!
21:28:08 Sukhdev: That information was there a while back.
21:28:20 * salv-orlando seems like recently everyone wants a process for everything.
21:28:22 mestery: Absolutely, this is why I want to bring this to the summit!
21:28:38 emagana: What is the link to the wiki?
21:28:40 mestery: He further updated it last week - with Juno requirements and listed the minimal set of tests required
21:28:41 * mestery feels like a government employee
21:28:44 mestery: Anyway, enough on CI.. let's move forward
21:28:50 salv-orlando: you mean more process isn't the solution to all our problems? ;)
21:29:00 #link: https://wiki.openstack.org/wiki/Neutron_Plugins_and_Drivers#Existing_Plugin_and_Drivers
21:29:01 Sukhdev: This wiki? https://wiki.openstack.org/wiki/NeutronThirdPartyTesting
21:29:01 mestery: ha ha - which Govt. ?
21:29:36 mestery: yes - see Juno Release Minimal Requirements section - I think this is new
21:29:36 songole: Both wiki are related to the CI staff
21:29:37 marun: I think more process is the way to trick yourself into believing you’ve solved a problem
21:29:37 Our wiki is a mess.
21:29:38 Argh.
21:29:41 emagana: thanks.
21:29:44 * mestery takes a note to update the wiki.
21:29:57 Ok, lets move on now.
21:30:20 * mestery doesn't see markmcclain here for a parity update ...
21:30:22 mestery: Thanks for the time!
21:30:26 emagana: Sure!
21:30:32 emagana: But stand at the ready sir ...
21:30:34 #topic Docs
21:30:38 emagana: See what I did there?
21:30:50 mestery: Lol.... Ok.... Interesting news!
21:31:01 * markmcclain rejoins on real connection
21:31:12 On the docs side, we will completely change the Admin Guide... it is really messy
21:31:19 markmcclain: We'll come back to parity :)
21:31:40 emagana: Yikes
21:31:54 The proposal of having a Networking specific guide will not happen (still to be confirmed) we have enough guides already and one more will complicate things even mroe
21:31:58 more*
21:32:19 * mestery thinks guides are like policies ...
21:32:43 On the vendors specific side, there is a proposal to remove all vendor specific staff from the admin guide as well... is still on the table and will be bring to the ML
21:33:02 mestery: if you mean that they’re meant to help people understanding stuff by confusing them more, you’re probably right.
21:33:18 salv-orlando: You said it, not me. :)
21:33:41 I want to thank carl_baldwin for giving us a good set of reference document for the DVR documentation
21:33:52 emagana: +1, nice work carl_baldwin!
21:33:53 emagana: yw
21:34:08 I have a meeting with annegentle and phil H. to go over DVR
21:34:31 please let us know if you need any additional information on DVR
21:34:43 After, Juno-3 the amount of work on Docs will increase.. so stay tuned
21:34:53 Swami: I will for sure ;-)
21:35:28 mestery: Done!
21:35:29 emagana: Agreed, we'll need to really get some focus on docs soon.
21:35:34 emagana: Thanks again for the update!
21:35:40 #topic Parity
21:35:43 markmcclain: Hi there!
21:35:48 mestery: hi
21:36:21 so a late update is that we had a good chat with the operators at the ops meetup about migration
21:36:37 markmcclain: Awesome!
21:36:41 helped to better define a few of the parameters they'd like to see for kilo
21:37:42 otherwise we're on track for Juno
21:38:05 what about those parameters?
21:38:08 markmcclain: Will the nova team agree to saying nova-network is deprecated starting in Juno?
21:38:22 * salv-orlando bets no
21:38:43 mestery: still working on that
21:38:49 Networking discussion during Ops Meet-up #link: https://etherpad.openstack.org/p/SAT-ops-networking
21:38:58 there is general agreement that functional parity exists
21:39:00 markmcclain: cool
21:39:20 performance testing is needed to get better numbers there
21:39:39 Yeah, I'm not sure there is grounds to deprecate so long as Neutron can't match nova network in the scalability department.
21:39:44 * sc68cal needs to propose that "hug it out" design summit session to help fix perceived nova/neutron friction
21:39:57 marun: Can you share those scalability numbers? I've not seen any for nova-network :)
21:40:02 markmcclain: obviously I’m more interested in the disagreements. Especially when it comes to subjective vs objective judgment. Maybe can you share something on the ml later on?
21:40:03 markmcclain: Any updates on the shared scalability lab?
21:40:11 sc68cal: No friction, we're in good shape actually between the teams.
21:40:44 sc68cal: It's not a matter of friction, just that nova network continues to be a good choice for scale when its limitations can be lived with.
21:40:47 I need to put more work tomorrow in making the RPC refactor for the security groups in shape, that should help scalability a lot,
21:40:49 From what I gathered so far we were talking about a perception based on operators reports regarding scalability and performance
21:40:56 mestery, sc68cal: agreed the team are communicating well
21:41:12 ajo_: yes that work will help
21:41:36 I seem to have discussed with somebody ideas to get realistic numbers using a simulator or something like that?
21:41:36 salv-orlando: I don't think it's perception. We need reproducible tests showing that neutron's reference implementation is workable at scale.
21:41:41 salv-orlando: yes will craft an email
21:41:43 perception -> just perception
21:41:53 * mestery hates when we use "reference implementation" ...
21:42:00 I would argue neutron's in-tree implementation
21:42:02 mestery: what would you prefer to call it?
21:42:06 marun: I was pointing that there are no number yet we can quote as a reference.
21:42:14 1st party driver :)
21:42:15 Reference to me implies it's not designed to scale, but now I'm arguing pedantry here.
21:42:18 marun: ++
21:42:19 markmcclain: ++
21:42:30 salv-orlando: So was I
21:42:31 haha
21:42:56 maybe it's just me who needs to hug it out then
21:42:59 so we do have people deploying various components that we can get real world numbers
21:43:05 * mestery gives sc68cal a hug
21:43:12 mestery: You can call it built-in backend if you want
21:43:22 also at the ops meetup there were more teams deploying neutron than nova-net
21:43:34 markmcclain: That's an interesting data point.
21:43:45 I don't see why it's so hard to define a (sorry) reference deployment, test with nova network, and then work to ensure that neutron can handle that scale too.
21:44:01 Neutron 25 - Nova-network 9
21:44:03 Numbers from real deployments are only so useful when we need to be able to iterate on improvements.
21:44:05 marun++
21:44:21 marun: I think that's what the canonical people did with icehouse, right? James Page posted a blog on that, and where neutron fell apart compared to nova-network
21:44:39 mestery: something like that, but we need to work harder at reproducibility
21:44:43 marun: I think the main issue they hit on neutron was SG scaling, which we hopefully can fix in juno yet.
21:44:46 marun: I agree 100%
21:44:51 marun: It’s not hard for me as well. I just think we need to find a few people willing to work on this. Do the testbed, define metrics, collect data, analyze data, publish data
21:44:54 marun: And having a way to test this throughout the cycle would help as well.
21:44:57 marun: so that is part of the plan also we need to properly model control plane so that we can construct some tests we can use for detecting regressions in the gate
21:44:58 mestery, for the numbers, it looked to me like they were hitting the RPC message sizes
21:45:07 ajo__: Agreed
21:45:24 salv-orlando: I understand that this is coming, that hp, dell et al are going to donate hardware to a scale lab and devote resources to define a reproducible scale test.
21:45:34 salv-orlando: these things always take too long, though :/
21:45:41 marun: ah I remember I had that conversation with you now!
21:45:53 It would be good to have some metrics we can automate and test on bare-metal on CI,
21:45:56 * mestery wants more information on the scale lab.
21:46:10 to have indicators on scalability, if something changes for bad, we know it..
21:46:24 anyway sg also had an issue that ml2 triggered their update a gazillion of times, which has been solved too.
21:46:27 ajo__: ++, running this throughout the cycle to track where things break would be good.
21:46:28 mestery: you're not the only one.
21:46:48 if you couple that with the RPC thing it must have been a nightmare when you had thousands of vms on the default security group
21:47:17 salv-orlando: ++
21:47:46 ajo_, marun: I have a feeling we would be stepping a bit into the benchmarking area, where there are already people actively working on that.
21:48:29 OK, lots of good discussion in the parity area this week.
21:48:39 salv-orlando, it doesn't mean we have to design and make specific benchmarking, maybe just reusing something that exists, or tuning it to our needs.
21:48:45 Lets move on with 12 minutes left now.
21:48:53 #topic Tempest
21:48:57 mlavalle: Any updates here this week?
21:49:23 mestery: nope, with the surgery of last week, I have to catch up
21:49:36 mlavalle: OK, continue healing up and we'll talk later :)
21:49:41 #topic L3
21:49:43 carl_baldwin: Hi!
21:49:43 :-)
21:49:54 mestery: hi. everything is on the wiki. I’ll give you this time back if you would like.
21:50:14 I pushed a patch to enable parallelism on the DVR job. https://review.openstack.org/#/c/116200/
21:50:18 mestery, mlavalle: wanted to say something about tempest
21:50:42 * mlavalle listening to Sukhdev
21:50:43 carl_baldwin: I had one question on the BGP stuff
21:50:44 #link http://lists.openstack.org/pipermail/openstack-dev/2014-August/043934.html
21:50:44 salv-orlando: I’ll have a look. thanks.
21:50:50 salv-orlando: how’s that doing?
21:51:01 carl_baldwin: Have you been following this BP at all? Is this looking good? Just curious mostly.
21:51:05 armax: Idk - until it merges we don’t know
21:51:23 mlavalle: Have been chasing why scenario tests are failing some time when using devstack -
21:51:23 mestery: it has been lower on the list but I’m going to review it this week.
21:51:24 carl_baldwin: it’s just because it’s with parallelism that you see the race appears
21:51:37 but if you want to keep things simple atm I’ll put it back in wip
21:51:40 carl_baldwin: OK, I'll have a look too, just curious how likely it is to merge with 10 days left.
21:51:47 mestery: BGP code has come a long way and I believe devvesa has had good success with it.
21:51:57 carl_baldwin: That's awesome news, very encouraging!
21:52:08 salv-orlando: ok
21:52:15 mestery: I will be able to say more after a first pass through the code.
21:52:29 carl_baldwin: OK, awesome, thanks for the updates here!
21:52:38 I will make an assessment soon and be in touch.
21:52:39 mlavalle: filed a bug and logged the findings
21:52:40 Sukhdev: since we are close to the end of the meeting, let's take this off-line
21:52:47 * mestery notes we're running low on time now ....
21:52:54 mlavalle: yes lets do it
21:53:00 #topic IPV6
21:53:03 Hellooo
21:53:05 sc68cal: I saw your email today on V6 in plugins ...
21:53:08 Worth discussing here?
21:53:11 A bit
21:53:12 * mestery thinks it is
21:53:16 cool
21:53:20 I figured nobody's had a serious flamewar this week yet
21:53:43 * mestery gets out his fire retardant suit again
21:53:47 kevinbenton has a good patch that works on the big pain point with the one convergence plugin
21:54:10 anything with v6 in the test name gets skipped now, so it's a decent band-aid
21:54:12 thanks to kevinbenton for the patch
21:54:30 sc68cal: what are the requirements for Juno?
21:54:45 songole: perhaps you can update the patch with a link to a bp you can file for your plugin (for the v6 support)
21:54:46 I'll leave the bigger question of a plugin that inherits from db_base_plugin to those with more knowledge/experience
21:55:05 sc68cal: That may be best handled on the ML given the time left here.
21:55:06 it just seems unusual to have a plugin inherit then totally error out on a specific subnet type
21:55:10 sc68cal: But thanks for starting this conversation!
21:55:11 SumitNaiksatam: ok
21:55:14 no problem
21:55:17 thanks for the time
21:55:24 OK, any other V6 updates sc68cal?
21:55:35 nope, we got the dhcpv6 code merged :)
21:55:42 sc68cal: woot!
21:55:51 now we can focus on more difficult things :)
21:55:52 :-)
21:56:11 sc68cal: :)
21:56:18 #topic Open Discussion
21:56:25 4 minutes for anything else at this point
21:56:26 incubator update?
21:56:43 dougwig: We should have that created tomorrow (/cc markmcclain)
21:56:49 it’s not the first time we had a plugin skipping unit tests because they were unsupported
21:56:49 I can call out at request, but the plugin in question was threatened removal
21:56:49 I think as a community we should not have double standards.
21:56:49 mestery: nice
21:56:58 one second
21:57:05 so neutron and other are crashing the gate - bug 1349617
21:57:06 Launchpad bug 1349617 in neutron "test_volume_boot_pattern fails in grenade with "SSHException: Error reading SSH protocol banner[Errno 104] Connection reset by peer"" [High,New] https://launchpad.net/bugs/1349617
21:57:20 dougwig: have been on road, but have been coordinating with infra
21:57:26 salv-orlando: I saw a patch which hit that one last night while reviewing
21:57:37 salv-orlando: Does this happen with nova-network as well as neutron?
21:57:37 mestery, markmcclain: ty
21:57:46 markmcclain: nice, please let me know if i can jump in and help in any way
21:57:48 I’ve spent some time on this bug. It’s not the first time we see it. It happened back in june
21:58:03 * mestery remembers this from June as well.
21:58:17 and it was a weird problem that disappeared once we removed a patch from armax which just added an additional connectivity check
21:58:19 salv-orlando: In that case, wasn't it that the VM wasn't even booted? I recall not even seeing a console, and thus the ssh failed because the VM wasn't there.
21:58:55 mestery: the news indeed are that the console log output is now there and the vm boots. However, if you look at the boot message you see an error while processing cirros datasource
21:59:04 which points to a metadata/config drive problem.
21:59:32 that’s where my analysis stop. As I will need to take some PTO soon I would like to hand off this bug
21:59:32 salv-orlando: OK, thanks, let me know if you want a second set of eyes on this one as well.
21:59:41 CI tests with metadata or file injection now?
21:59:47 salv-orlando, mestery: I saw same behavior last week and filed this bug https://bugs.launchpad.net/devstack/+bug/1360480
21:59:49 Launchpad bug 1360480 in devstack "destack has a silent failure which causes tempest tests to fail" [Undecided,Confirmed]
22:00:02 Sukhdev: May be best to duplicate to the bug salv-orlando mentioned
22:00:10 OK, thanks for joining this week everyone!
22:00:22 Next week is Feature Freeze, so lets see if we can push a few more features into Juno.
22:00:25 #endmeeting