21:02:57 #startmeeting Networking
21:02:58 Meeting started Mon Dec 16 21:02:57 2013 UTC and is due to finish in 60 minutes. The chair is markmcclain. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:02:59 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:03:02 The meeting name has been set to 'networking'
21:03:10 hi
21:03:12 #link https://wiki.openstack.org/wiki/Network/Meetings
21:03:20 #topic Announcements
21:03:42 Last week was an interesting week for the OpenStack gate
21:04:06 'may you live in interesting times'
21:04:15 hello
21:04:19 We merged a small change that had exacerbated an existing race condition
21:04:47 you mean moving rpc to the end of __init__?
21:04:58 yes.. that revert
21:05:38 that patch will be a good one to dive into because it could better reveal the root cause of the race we've been battling
21:05:38 interesting
21:05:54 the other takeaway from the experience
21:06:19 Is that we need to trust the gate when it fails
21:06:40 and investigate the logs and cause before issuing rechecks or reverifications
21:07:32 if you have a patch that fails don't be surprised if a member of the core team asks for a reason why the proposed patch was unrelated to any failures that required a recheck or reverify
21:08:38 also if your patch requires multiple rechecks just to get a +1 from testing infrastructure
21:08:51 expect to get questions during the review
21:09:58 #link https://launchpad.net/neutron/+milestone/icehouse-2
21:10:07 Icehouse-2 is rapidly approaching
21:10:22 along with that 3rd party testing
21:10:34 mestery led a meeting last week to share info
21:11:02 Yes: The hope is to get past likely hurdles together rather than individually.
21:11:02 #link http://eavesdrop.openstack.org/meetings/networking_third_party_testing/2013/networking_third_party_testing.2013-12-12-17.00.log.html
21:11:05 And share information on that.
21:11:10 We have another one this week.
21:11:24 Dec 19th at 1700UTC?
21:11:32 2200 UTC Thursday on #openstack-meeting-alt
21:12:02 #info 3rd Party Testing Dec 19th at 2200 UTC in -alt
21:12:36 Lastly the end of the year holidays are approaching, so many will take time off
21:13:01 I'm thinking we'll meet December 23rd and then skip December 30th
21:13:28 +1
21:13:33 +1
21:14:06 Any other objections?
21:14:13 s/other//
21:14:22 +1
21:14:34 +1
21:15:04 #info No Neutron team meeting December 30th
21:15:15 Ok let's move onto bugs
21:15:18 #topic Bugs
21:15:31 #link http://not.mn/gate_status.html
21:16:39 * notmyname can answer questions about that graph
21:16:49 notmyname: awesome graph
21:17:15 notmyname: definitely helps to visualize what is happening
21:17:48 it reminds me of a Keith Haring painting
21:18:17 haha
21:18:31 anteaya: want to highlight any of the bugs you have listed?
21:18:36 sure
21:19:00 marun has a patch in review: https://review.openstack.org/#/c/61168/
21:19:09 this bold red line is the chance that a new patch, 100% correct itself, has of passing the 6 gate jobs tracked. and for a gate queue that is 30 deep, that means that percentage (eg 62% right now) is raised to the depth: .62**38 == 1.290875032e-06% chance of all patches clearing the gate right now, even if they are all perfect
21:19:11 for bug https://bugs.launchpad.net/neutron/+bug/1192381
21:19:12 notmyname: how's the patch pass chance calculated? [can answer in open discussion time to avoid collisions]
21:19:13 Launchpad bug 1192381 in neutron "dhcp dnsmasq lost port in host config file" [Critical,In progress]
21:19:26 notmyname: thanks for your answer
21:19:37 would be great if we could cross that off the list today, marun can you address comments and submit a new patchset?
21:19:53 anteaya: can do
21:20:02 nati_ueno: what is the obstacle on https://bugs.launchpad.net/neutron/+bug/1112912, this has been in existence for 1 year?
21:20:04 Launchpad bug 1112912 in neutron "get_firewall_required should use VIF parameter from neutron" [Critical,In progress]
21:20:10 marun, thanks
21:20:10 salv-orlando: patch pass chance is the multiplication of the pass chance for each of the 6 jobs (they are independent variables for that calc)
21:20:19 notmyname: ouch .62**38 hurts
21:20:25 anteaya: That one is still active
21:20:40 nati_ueno: yes, what more do you need to bring it to closed?
21:20:48 markmcclain: ya. exponents mean that even a 5% drop _dramatically_ reduces the chance that patches actually land
21:21:09 anteaya: get reviewed https://review.openstack.org/#/c/21946/
21:21:35 markmcclain: eg a 95% chance means that a 10-deep queue has less than 60% chance to clear
21:22:04 nati_ueno: okay, let's work on getting Jenkins happy with the patch and then getting reviews on it
21:22:14 anteaya: sure
21:22:29 markmcclain: this is the ssh bug: https://bugs.launchpad.net/neutron/+bug/1253896
21:22:31 Launchpad bug 1253896 in neutron "Attempts to verify guests are running via SSH fails. SSH connection to guest does not work." [Critical,In progress]
21:22:46 you are the champion on it right now, more to say on it?
21:23:06 not too much other than it went from bad to really bad and now back to bad again
21:23:18 nati_ueno: also can you update the report for the bug with that patch url? I couldn't see it when I read the report, I may have missed it
21:23:23 there are others who have been digging on it too and most things are wired properly
21:24:14 markmcclain: are you willing to drive some discussions about it this week to see what can be done to make it even less bad than it is now?
21:24:17 anteaya: Thanks. I'll add the url
21:24:22 nati_ueno: thank you
21:24:32 hi, sorry for being late
21:24:39 anteaya: yes I won't be on any airplanes this week :)
21:24:50 markmcclain: great, thanks
21:24:57 want to give yourself an action item?
21:25:19 #action markmcclain to drive 1253896 work
21:25:24 thanks
21:25:29 next confirmed bug
21:25:37 this one needs a champion: https://bugs.launchpad.net/neutron/+bug/1210483
21:25:38 Launchpad bug 1210483 in neutron "ServerAddressesTestXML.test_list_server_addresses FAIL" [Critical,Confirmed]
21:25:47 markmcclain: I will hand over to you all my knowledge on bug 1253896 so far then
21:25:49 Launchpad bug 1253896 in neutron "Attempts to verify guests are running via SSH fails. SSH connection to guest does not work." [Critical,In progress] https://launchpad.net/bugs/1253896
21:25:57 please volunteer before I track someone down
21:26:22 salv-orlando: sounds good I figure many of us can collaborate on it
21:26:25 markmcclain: you again with https://bugs.launchpad.net/neutron/+bug/1230407
21:26:26 Launchpad bug 1230407 in neutron "VMs can't progress through state changes because Neutron is deadlocking on it's database queries, and thus leaving networks in inconsistent states" [Critical,Confirmed]
21:26:35 any thoughts on it currently?
21:27:14 armax: is this bug still open? https://bugs.launchpad.net/neutron/+bug/1243726
21:27:17 Launchpad bug 1243726 in neutron "tempest failure: No more IP addresses available on network" [Critical,Confirmed]
21:27:17 It's been somewhat infrequent
21:27:17 http://logstash.openstack.org/#eyJzZWFyY2giOiJcIkFzc2VydGlvbkVycm9yOiBTdGF0ZSBjaGFuZ2UgdGltZW91dCBleGNlZWRlZCFcIiIsImZpZWxkcyI6W10sIm9mZnNldCI6MCwidGltZWZyYW1lIjoiNjA0ODAwIiwiZ3JhcGhtb2RlIjoiY291bnQiLCJ0aW1lIjp7InVzZXJfaW50ZXJ2YWwiOjB9LCJzdGFtcCI6MTM4MDE1NzA5OTYwMiwibW9kZSI6IiIsImFuYWx5emVfZmllbGQiOiIifQ==
21:27:34 you have two patches merged against it, is it still an issue?
21:27:39 There have been only 9 occurrences of 1230407 in a lot of time. And all while the gate was brokeb
21:27:46 it does not look like a burning issue
21:27:56 broken. Note that 153896 can manifest also as 1230407
21:28:04 sorry I mean 1253896
21:28:18 since it still occurs that's why I have not closed it
21:28:24 well sdague has asked that once critical bugs remain in that status unless they are completely gone
21:28:33 My understanding is that the bug has been isolated and markmcclain has the definitive fix for it, which is splitting API/RPC servers
21:28:39 however there are related bugs open against this one
21:28:56 salv-orlando: that was the last update I posted to that bug yes
21:29:04 anteaya: I can take a stab at 1210483. If I need help, I'll yell
21:29:12 armax: are you able to update the bug report to reflect that?
21:29:21 mlavalle: awesome thank you
21:29:23 salv-orlando: doesn't having wsgi out of process get us halfway there?
21:29:30 of course I am, you're asking me if I will?
21:29:30 :)
21:29:42 armax: okay, will you?
21:29:44 marun: yes
21:29:45 * salv-orlando chops armax's finger
21:29:45 :)
21:29:48 yup
21:29:52 thank you
21:30:05 this is mine https://bugs.launchpad.net/neutron/+bug/1250168
21:30:08 Launchpad bug 1250168 in neutron "gate-tempest-devstack-vm-neutron-large-ops is failing" [Critical,Confirmed]
21:30:19 but I need someone to take over, I simply offered a revert for it
21:30:36 can someone else take this and make it disappear?
21:30:40 arosen?
21:30:44 as in no longer occurring
21:31:03 sure i can take a look at it.
21:31:07 arosen is our nova/neutron guy. This bug pertains to nova/neutron interface
21:31:08 arosen: thanks
21:31:11 arosen: thanks
21:31:13 great
21:31:17 sorry for taking so long
21:31:25 https://bugs.launchpad.net/neutron/+bug/1251784 needs a volunteer
21:31:29 Launchpad bug 1251784 in tripleo "nova+neutron scheduling error: Connection to neutron failed: Maximum attempts reached" [Critical,Fix released]
21:31:34 and to be triaged
21:32:21 this is in the neutron/nova interface
21:32:48 anyone in addition to arosen available to help?
21:32:58 not showing any occurrences of this in the last 7 days
21:33:19 it is still critical is it not?
21:33:28 markmcclain: it looks like that one is a timeout in neutron
21:33:35 the notes in the bug say no hits since Nov 28th
21:33:48 I think it might be safe to close this one
21:33:53 okay
21:34:07 last two that need volunteers: https://bugs.launchpad.net/nova/+bug/1210483
21:34:08 Launchpad bug 1210483 in neutron "ServerAddressesTestXML.test_list_server_addresses FAIL" [Critical,Confirmed]
21:34:16 and https://bugs.launchpad.net/neutron/+bug/1254890
21:34:17 Launchpad bug 1254890 in tempest ""Timed out waiting for thing" causes tempest-dsvm-neutron-* failures" [Low,In progress]
21:34:29 i've taken 1210483
21:34:38 enikanorov: thank you
21:34:46 let me know how your progress goes
21:34:51 ok I also thought mlavalle said he was working on it too
21:35:18 markmcclain: you are correct, sorry
21:35:24 Yes
21:35:57 enikanorov: how do you feel about looking at 1254890?
21:36:28 ok, I'll take a look
21:36:32 thank you
21:36:33 done
21:36:45 anteaya: thanks for reviewing the bugs
21:37:02 Any other bugs the team should be tracking?
21:37:15 for 1254890 the log stash query refers to large_ops only. I've seen the same error in other places too.
21:37:30 salv-orlando: good to know
21:37:48 Indeed, if you remove large_ops from the query you'll find more occurrences.
21:38:25 #topic Nova Parity
21:38:28 beagles: hi
21:38:33 hi
21:39:21 so, we didn't have much movement on parity specific stuff last week unfortunately
21:39:38 ok.. what resources do you need to help move things along?
21:40:45 I think it would be useful to align the efforts for starters.. obviously there are lots of people doing stuff that is related
21:40:57 the gate... the additional tests, etc
21:41:03 ok
21:41:08 if we can carve out some time this week to sync that would be good
21:41:20 beagles: who do you need to sync with?
21:41:41 I'm guessing at least me + mlavalle?
21:41:42 I can help from the testing side
21:41:49 actually I see mlavalle's great etherpad and wiki pages but haven't seen many tests yet
21:41:49 but mlavalle holds everything
21:41:55 oops and salv-orlando
21:41:58 mlavalle, rossella_s, markmcclain, if arosen could join that would be great
21:42:11 #action markmcclain to schedule a time for syncing this week
21:42:13 I have just bits of knowledge about status of testing and a few parity features I've been involved with
21:42:18 beagles: just propose a time and I'll make myself available
21:42:44 ditto
21:43:03 that'd be great.. can we do it tomorrow a.m. EST?
21:43:07 we'll chat in the -neutron channel so it's logged and everyone can participate since it will be an on going discussion once we kick it off
21:43:24 something like 10:00 am EST.. (what is that UTC?)
21:43:43 1500 UTC
21:43:45 mlavalle: yes, that is 17:00 UTC
21:43:52 sorry, 15:00 UTC
21:44:00 folks UTC = EST +5
21:44:07 beagles: that's a little early for me. I'm on the west coast. Could we push for 11:00 EST instead?
21:44:26 arosen: sure
21:44:34 so 16:00 UTC
21:44:36 thanks :)
21:44:50 ok for me
21:44:56 #info parity+testing 1600UTC in openstack-neutron
21:45:08 fine with me
21:45:16 #topic Tempest
21:45:25 mlavalle: anything to add that's not on the agenda?
21:45:36 yeah
21:46:06 first of all, continued gap analysis for API tests. Added List available extensions, provider extended attributes for networks, binding extended attributes for ports, external network extension, configurable external gateway modes extension, quotas, security groups and rules, agent management extension, extraroute extension to the etherpad
21:46:36 several people have already assigned themselves work from there
21:46:39 in % how many of these tasks have assignees?
21:46:41 mlavalle: who's doing the work on tempest side?
21:46:43 cool… thanks for adding them
21:47:06 and enikanorov has volunteered to refactor the rest client for neutron
21:47:26 yeah, asking because i want to make it less of copy-paste work
21:47:36 i'll finish the gap analysis this week, covering the whole api and all the extensions
21:47:54 for the sake of time i'll stop here
21:48:15 enikanorov: i'm also on the tempest side
21:48:22 tempest parallel tests: just a quick update that I'm starting to push all the relevant patches upstream
21:48:23 enikanorov: I've seen your patch, nice work so far
21:48:30 thx
21:48:34 in my internal server I have a 80% success rate on parallel job
21:48:59 which is not worse than other gate jobs apparently
21:49:03 salv-orlando: great progress on your parallel blueprint
21:49:09 salv-orlando: cool
21:49:31 hopefully we'll be able to close that last 20%
21:49:34 only issue I am tracking and can't explain so far is that at some point some VMs do not send DHCPDISCOVER even if they're perfectly wired
21:49:55 this happens also on the upstream gate (just happened on one of my patches), and therefore I will file a bug soon
21:50:04 Or perhaps it is a bug that haunts just me?
21:50:07 interesting
21:50:49 ok we're running short on time again
21:51:21 salv-orlando: when you file the bug, please let me know and I will try to reproduce in my dev system
21:51:31 mlavalle: sure
21:51:42 so we can compare notes
21:51:46 #topic IPv6
21:52:03 There's a mailing list thread on hairpinning per vif
21:52:23 please chime in on that thread if you have thoughts on it
21:52:26 #topic ML2
21:52:46 nothing critical to discuss today
21:53:15 The only item I have is that we will be canceling the meetings on 12-25 and 1-1.
21:53:19 Seems that there is discussion still to be had on providernet vs multi-provider?
21:53:24 Will send email with a note to openstack-dev.
21:53:50 mestery: thanks for the heads u
21:53:52 up
21:53:58 #topic Open Discussion
21:54:08 salv-orlando, enikanorov http://lists.openstack.org/pipermail/openstack-dev/2013-December/021984.html https://bugs.launchpad.net/neutron/+bug/1214115
21:54:10 Launchpad bug 1214115 in neutron "ipavailabilityranges race condition when allocating from same range on multiple neutron-servers" [High,In progress]
21:54:24 what are next steps on this bug? I'm happy to do leg work
21:55:07 not sure if the original patch submitter is still active, but also happy to pick that up and massage it if needed
21:55:09 geekinutah: contact the current bug assignee
21:55:12 it's interesting if carl baldwin's patch could address this issue
21:55:14 geekinutah: I thought I have already removed my -2 there.
21:55:32 hmmm, still shows -2
21:55:35 geekinutah: https://review.openstack.org/#/c/58017/
21:55:56 ahh, okay thx
21:55:56 the patchset is abandoned
21:56:08 so the state cannot change until the review is active again
21:56:09 I am happy to reconsider. You might understand that if we suspect a risk of introducing an issue worse than the bugs being fixed we act a bit conservative in the 3rd milestone
21:56:11 enikanorov: My impression was that my patch probably would not address that bug.
21:56:16 markmcclain: correct
21:57:08 geekinutah: reach out and see if the original person is still interested in the work if not offer to take it on
21:57:24 I will do that, also I'll look at carl_baldwin's patch
21:57:35 if you have any questions feel free to ask in the IRC channel or mailing list
21:57:44 Any other open discussion items?
21:57:46 Please consider me offline from Jan 1 until code sprint in Montreal
21:57:50 yeah, it seems that retries are needed
21:58:07 I also remember we did something similar for generating tunnel ids
21:58:13 any detail questions about the code sprint, ensure you have the answers you need prior to Holiday Break
21:58:15 anteaya: thanks for the heads up
21:59:49 anybody experience issue with devstack on a local system, beyond migration issues , like the sudo /usr/local/bin/neutron-rootwrap
21:59:52 12-16 14:55[ dkehn]: /etc/neutron/rootwrap.conf ip netns exec qprobe-83b5c2ea-ca44-4b1\
21:59:55 12-16 14:55[ dkehn]: 12-16 14:55[ dkehn]: c-a7d6-46235386b287 ping -w 1 -c 1 10.1.0.4; failures
21:59:58 pinging namespaces
22:00:39 dkehn: +1
22:00:49 I have not
22:01:03 dkehn: is that the debug agent?
22:01:04 We're at time for this week
22:01:07 Ok remember tomorrow at 1600UTC in #openstack-neutron we'll kick off the discussion on Nova Parity and Testing overlapping
22:01:12 #endmeeting
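
Note on the gate arithmetic discussed between 21:19 and 21:21: the per-patch pass chance is described as the product of the pass chances of the 6 tracked jobs (treated as independent), and the chance of a whole queue clearing is that value raised to the queue depth. A minimal sketch of that calculation follows. The function names are hypothetical and are not taken from the script behind the gate status graph; the inputs 0.62, 38, 0.95 and 10 are the figures quoted in the log.

```python
from math import prod


def patch_pass_chance(job_pass_rates):
    """Chance that one correct patch passes every job.

    Assumes the jobs fail independently, as stated in the log.
    """
    return prod(job_pass_rates)


def queue_clear_chance(patch_chance, depth):
    """Chance that every patch in a queue of `depth` patches passes."""
    return patch_chance ** depth


if __name__ == "__main__":
    # 62% per-patch chance, exponent 38 as in the quoted ".62**38".
    # Roughly 1.29e-08 as a fraction, i.e. the 1.29e-06% quoted in the log.
    print(f"{queue_clear_chance(0.62, 38):.3e}")

    # 95% per-patch chance, 10-deep queue: just under 0.60.
    print(f"{queue_clear_chance(0.95, 10):.3f}")
```

The exponent is why a small drop in any single job's pass rate has such a large effect on whether anything lands: the per-patch figure is already a product of six rates, and the queue depth then compounds it.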