#openstack-meeting-3 log

06:29:48 <anil_rao> #startmeeting taas
06:29:49 <openstack> Meeting started Wed Mar  9 06:29:48 2016 UTC and is due to finish in 60 minutes.  The chair is anil_rao. Information about MeetBot at http://wiki.debian.org/MeetBot.
06:29:51 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
06:29:54 <openstack> The meeting name has been set to 'taas'
06:30:06 <anil_rao> Hello
06:30:13 <vasubabu> hi
06:31:00 <anil_rao> Let's get started
06:31:02 <fawadkhaliq> hi guys
06:31:15 <anil_rao> #topic Spec Discussion
06:32:34 <reedip> hello
06:32:48 <reedip> I sent an email yesterday on the Openstack-dev ML
06:33:21 <reedip> I basically compared the Juno Spec with the current one and found some of the concerns which were raised by the reviewers in the Juno spec may not have been resolved in the current spec
06:33:35 <anil_rao> #link http://lists.openstack.org/pipermail/openstack-dev/2016-March/088645.html
06:33:56 <reedip> this was one of the action items I had to do last week (Thursday), but other factors made me lazy
06:34:03 <reedip> so was able to put it up yesterday
06:34:25 <anil_rao> reedip: Do you want to go over these list items?
06:34:55 <reedip> anil_rao : I think the mail should be self-explanatory ... otherwise it might take a lot of time
06:35:15 <reedip> anil_rao: let's discuss it at the end, so that other points are covered first (that's my take)
06:35:16 <anil_rao> Sure
06:36:19 <anil_rao> Soichi said in an email that he would like to discuss the Dashboard review comments next week. He needs some time to go over them and consider options
06:37:14 <reedip> yes, that is I guess the second point of today's discussion. I updated the Agenda, but found Soichi's email later...
06:37:58 <anil_rao> Let's move on. We can discuss the few bugs and then come back to the Spec
06:38:30 <anil_rao> reedip: Do you want to say something about the list bugs?
06:38:51 <reedip> Yup
06:39:18 <reedip> I lost track of it a bit, but I think it's mainly related to the TaaS requirements
06:39:50 <reedip> Basically if tap-service/tap-flow or other Neutron CLIs which anil_rao mentioned in the below link
06:40:37 <reedip> #link http://lists.openstack.org/pipermail/openstack-dev/2016-March/088454.html
06:40:48 <reedip> If they do not have any resources
06:41:13 <reedip> or in other words, if no tap-service/tap flow exists when tap-service-list/tap-flow-list is executed
06:42:41 <reedip> then due to a python-cliff bug
06:42:47 <reedip> #link https://bugs.launchpad.net/python-cliff/+bug/1539770
06:42:47 <openstack> Launchpad bug 1539770 in cliff "Empty set causing out of range error" [Undecided,Fix released] - Assigned to Doug Hellmann (doug-hellmann)
06:42:58 <reedip> the issue occurs
06:43:20 <reedip> This means we need to update the cliff version to 2.0.0
06:43:25 <reedip> if the error occurs
06:43:44 <reedip> yamamoto_ : I think we need to change TaaS requirements for this, right?
06:44:38 <yamamoto_> cliff is in python-neutronclient's requirements, not ours.
06:44:55 <yamamoto_> the problem is not taas specific right?
06:45:05 <anil_rao> I have seen this problem with other neutron client list commands too, so it is not TaaS specific.
06:45:08 <reedip> yup, it's not taas related directly
06:45:23 <reedip> anil_rao: can you take the latest neutronclient branch?
06:45:23 <anil_rao> See the few examples at the bottom of my email (in the above link)
06:45:54 <reedip> or better "pip install -e . " may solve the problem in your system
06:46:12 <reedip> when executed in the python-neutronclient cloned code
06:46:13 <anil_rao> I will be redoing my DevStack setup tomorrow (toasted it today due to a TaaS bug) so I can take in the new neutron client.
06:46:50 <reedip> latest NeutronClient has the following Requirements
06:46:52 <reedip> cliff!=1.16.0,!=1.17.0,>=1.15.0 # Apache-2.0
06:47:14 <yamamoto_> i think 1.17.0 is excluded for that reason.
06:47:47 <anil_rao> reedip: Are you using the new neutron client, which is why you are not seeing this problem?
06:48:16 <yamamoto_> anil_rao: which version of cliff are you using?
06:48:20 <reedip> anil_rao : I pull my NeutronClient daily  :)
06:48:28 <reedip> Handy #link http://docs.openstack.org/developer/cliff/history.html#id1
06:48:36 <anil_rao> :-)
06:48:54 <anil_rao> reedip: Thanks for the link
06:49:24 <yamamoto_> #link https://review.openstack.org/#/q/Ie77a9622d16bd02b3607fc1c9f8da20dc1ffb856
06:49:32 <reedip> As the link states, with the new neutronclient, we can have 2 versions 1.15.0 or 2.0.0 ...
06:51:03 <yamamoto_> as the problematic version of cliff is being excluded in global-req even for liberty, i don't think we have anything to do for this issue.
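The requirement line quoted above can be checked by hand. The following is a minimal sketch, assuming plain dotted numeric versions and only the `!=` and `>=` operators that appear in that line; it is not a full PEP 440 implementation, and the helper names `parse_version` and `satisfies` are hypothetical.

```python
def parse_version(text):
    """Turn a dotted version such as '1.17.0' into a comparable tuple."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version, specifier):
    """Return True if version passes every comma-separated clause.

    Only '!=' and '>=' are handled, since those are the operators
    used in the requirement line quoted in the log.
    """
    candidate = parse_version(version)
    for clause in specifier.split(","):
        clause = clause.strip()
        if clause.startswith("!="):
            if candidate == parse_version(clause[2:]):
                return False
        elif clause.startswith(">="):
            if candidate < parse_version(clause[2:]):
                return False
        else:
            raise ValueError("unsupported clause: " + clause)
    return True


# Specifier copied from the python-neutronclient requirement in the log.
SPEC = "!=1.16.0,!=1.17.0,>=1.15.0"

for candidate in ("1.15.0", "1.16.0", "1.17.0", "2.0.0"):
    print(candidate, satisfies(candidate, SPEC))
```

Under this specifier the two excluded releases are rejected, while 1.15.0 and 2.0.0 are accepted.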
06:51:58 <anil_rao> There is another problem I wanted to discuss
06:52:27 <reedip> software engineers have to be problem solvers, so go ahead :)
06:53:25 <anil_rao> If we have an existing tap-service to which a tap-flow has been attached and then the VM whose port is the source port for the tap-service is terminated, that port is deleted.
06:53:44 <reedip> I think I mentioned it in the email
06:53:55 <reedip> vinay actually mentioned this in the spec
06:53:58 <reedip> just a minute
06:54:13 <reedip> Yup, it's there
06:54:20 <reedip> point d)
06:54:21 <reedip> d) Outcome of Deleting the VM where TaaS operates
06:54:22 <reedip> Following might be added to the Spec:
06:54:22 <reedip> 1. Deletion of the VM (and port attached to it) from which we were mirroring
06:54:22 <reedip> (source of the mirror):
06:54:22 <reedip> In this case we would do a cascade delete of the 'Tap_Flow' instances that
06:54:23 <reedip> were associated with the port that was deleted.
06:54:25 <reedip> 2. Deletion of the VM (and port attached to it) to which we were mirroring
06:54:27 <reedip> (Destination of the mirror):
06:54:29 <reedip> In this case we would do a cascade delete of the 'Tap_Service' instance
06:54:33 <reedip> that was associated with the port that was deleted.
06:55:04 <reedip> This is currently missing in the spec and the design, and may need some work
06:55:58 <anil_rao> I think that is somewhat similar but I am trying to describe something a little different. :)
06:57:05 <anil_rao> Essentially, if for some reason we end up with the state where TaaS thinks that a functioning tap-flow is attached to a tap-service but the source port associated with the tap-flow no longer exists, we hit a bad situation
06:58:14 <anil_rao> At this stage, issuing tap-flow-delete or even tap-service-delete just returns an error saying that the source port of the tap-flow does not exist. However, the TaaS DB entries are still intact and subsequent tap-flow-list commands continue to refer to that source port.
06:58:32 <reedip> anil
06:58:41 <anil_rao> Yes
06:58:42 <reedip> anil_rao : so we need a purge type command ?
06:58:53 <reedip> which clears the DB
06:59:02 <reedip> or have a callback from nova
06:59:06 <anil_rao> Not really. Here is what I think we need to do
06:59:36 <anil_rao> We need to hook up with Nova callbacks, but even if that is not there we should at least do the following:
07:00:10 <anil_rao> tap-flow-delete should always clean up the tap-flow if the specified port is the source port of the flow.
07:00:31 <anil_rao> Similarly tap-service-delete should also behave in the same way.
07:00:55 <anil_rao> This would mean that they are 'safe' calls and always ensure proper cleanup when invoked with their respective IDs
07:00:55 <yamamoto_> i don't see why nova needs to be involved.
07:01:28 <yamamoto_> what we care about is a neutron port, isn't it?
07:02:14 <soichi> hi, sorry for being late
07:02:23 <anil_rao> We just need a notification when the port is no longer associated with an instance
07:02:49 <anil_rao> But as I said above, irrespective of this tap-service-delete and tap-flow-delete should clean things up when invoked.
07:03:04 <anil_rao> Today these calls return a failure saying that the port in question no longer exists.
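The 'safe' delete behaviour described above can be sketched as follows. This is an illustration only, assuming in-memory dictionaries in place of the TaaS database; `delete_tap_flow`, `flows`, `existing_ports`, and `unbind_port` are hypothetical names and do not reflect the actual TaaS plugin code.

```python
def delete_tap_flow(flow_id, flows, existing_ports, unbind_port):
    """Remove a tap-flow record even when its source port is gone.

    flows          -- dict mapping flow id to {"source_port": port id}
    existing_ports -- set of port ids that still exist
    unbind_port    -- callable that undoes mirroring on a live port
    Returns True if a record was removed, False if none existed.
    """
    flow = flows.pop(flow_id, None)
    if flow is None:
        # Nothing to clean up; a repeated delete is harmless.
        return False
    port_id = flow["source_port"]
    if port_id in existing_ports:
        # Only touch the port when it is still there.
        unbind_port(port_id)
    # A missing port is not an error: the stale record is dropped anyway.
    return True


# A flow whose source port has already been deleted.
flows = {"flow-1": {"source_port": "port-gone"}}
unbound = []
removed = delete_tap_flow("flow-1", flows, set(), unbound.append)
print(removed, flows, unbound)
```

The point of the sketch is that a missing port skips the port-side step rather than aborting the whole delete, so the stale record never survives the call.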
07:03:22 <yamamoto_> i guess what we need is a FK for tap_flows.source_port and some cleanup in l2 agent.
07:03:34 <reedip> anil_rao : if a port doesn't exist, then can't we clear the DB ?
07:03:59 <reedip> I mean if we are sure that the tap-flow is broken, then can't we rollback the changes ?
07:04:09 <anil_rao> yamamoto_: yes
07:04:34 <reedip> ok, I think yamamoto_ defined my point in a pretty little gist :)
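The foreign-key idea for tap_flows.source_port can be illustrated with a small self-contained sketch. It assumes an in-memory SQLite database and a simplified two-table schema; the `ports` table and its columns here are hypothetical stand-ins, not the real Neutron or TaaS models, and the agent-side cleanup mentioned above is not covered.

```python
import sqlite3


def count_flows_after_port_delete():
    """Show that deleting a port row cascades to its tap-flow rows."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE ports (id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE tap_flows ("
        " id TEXT PRIMARY KEY,"
        " source_port TEXT NOT NULL"
        " REFERENCES ports(id) ON DELETE CASCADE)"
    )
    conn.execute("INSERT INTO ports VALUES ('port-1')")
    conn.execute("INSERT INTO tap_flows VALUES ('flow-1', 'port-1')")
    # Deleting the port removes the dependent tap-flow row as well.
    conn.execute("DELETE FROM ports WHERE id = 'port-1'")
    remaining = conn.execute("SELECT COUNT(*) FROM tap_flows").fetchone()[0]
    conn.close()
    return remaining


print(count_flows_after_port_delete())
```

With such a constraint the database cannot be left with a tap-flow that points at a port that no longer exists, which is the stale state described earlier in the discussion.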
07:05:23 <anil_rao> reedip: Let me look at this code path some more. I thought we had set things up so that these delete calls would properly clean up the DB and then also clean up the state in the OVS bridges.
07:05:43 <reedip> anil_rao : yeah sure
07:06:29 <anil_rao> Until then you don't want to delete ports associated with tap-flows and tap-services, while those resources are active. :-)
07:06:54 <anil_rao> Otherwise you will not be able to clean out those tap-services and tap-flows any more.
07:07:08 <reedip> run devstack again :D
07:07:19 <anil_rao> that will work. :)
07:08:20 <anil_rao> I'll report back on some traffic related updates next week.
07:09:16 <anil_rao> soichi: We got your email. Sorry for being late with the review comments on the Dashboard for TaaS. As you have suggested we can discuss this next week.
07:09:52 <soichi> ok. thank you.
07:10:47 <anil_rao> If there are no more updates, do folks want to get back to the Spec discussion?
07:11:42 <yamamoto_> anyone working on agent extension?
07:12:07 <reedip> no, but it was my AI
07:12:23 <yamamoto_> there's a brief api doc available
07:12:26 <yamamoto_> #link http://docs.openstack.org/developer/neutron/devref/l2_agent_extensions.html
07:12:33 <reedip> I would start working on it this week
07:12:37 <anil_rao> yamamoto_: Did you hear anything more about OVS resource reservation?
07:12:47 <yamamoto_> anil_rao: nothing
07:12:56 <reedip> yamamoto_ : thanks, will let you know if there are any hurdles
07:13:24 <yamamoto_> reedip: thank you.  it's better to sort out issues earlier.
07:13:35 <reedip> yamamoto_ : yeah sure
07:13:55 <reedip> okay, so can we start with the SPEC : 10 minutes left
07:14:04 <anil_rao> I am not sure how this will all work out with several agent extensions simultaneously working
07:14:32 <anil_rao> reedip: Yes, lets discuss the spec
07:14:38 <yamamoto_> anil_rao: they will work when they happen to work. :-)
07:14:52 <reedip> lol .... yamamoto_ is right
07:15:00 <anil_rao> yamamoto_: :-)
07:15:12 <reedip> okay, well the first issue which some of the reviewers pointed out in the SPEC is
07:15:23 <reedip> The point of reference for Ingress and Egress
07:15:58 <reedip> Neutron reviewers consider that any data traffic coming INTO the switch from a VM is ingress (and the opposite is Egress)
07:16:01 <anil_rao> I remember we had a lot of discussion on this in the earlier cycles. For the implementation which is the current TaaS code-base, ingress and egress are w.r.t. the entity attached to the port.
07:16:22 <reedip> while for TaaS, any data coming IN to the VM is Ingress
07:16:45 <reedip> that might need to be explicitly mentioned in the SPEC
07:17:06 <anil_rao> It is mentioned if I am not mistaken.
07:17:13 <reedip> because as we are applying for Neutron Stadium, the definition which neutron reviewers carry might differ from our explanation
07:17:31 <reedip> Yeah it is...
07:17:49 <reedip> But it was written in the previous spec as well
07:17:50 <anil_rao> I am okay with either way. Does not really matter a big deal.
07:18:01 <reedip> and they were of the opinion that it should be swapped
07:18:19 <reedip> anil_rao : that would also mean changing the definition of the DIRECTION attribute for tap-flow
07:18:30 <anil_rao> The reason we chose the current way is that Security Groups use the same concept, i.e. w.r.t. instances.
07:18:32 <reedip> if we are to swap the definition now
07:18:50 <reedip> I think they can be convinced on this point , due to SG
07:19:12 <anil_rao> Users typically don't care about the underlying switch. If you look at Security Groups, it is always w.r.t. the entity attached to the port and not the switch.
07:19:23 <reedip> yeah
07:19:23 <anil_rao> If we go the switch route it will lead to confusion with the end user.
07:20:00 <soichi> anil_rao: +1
07:20:12 <yamamoto_> otoh, port mirroring is usually a functionality of a switch.
07:20:14 <reedip> anil_rao : as the user has not yet interacted with TaaS, I think we can still create the definition ....
07:20:47 <reedip> I had to google what is OTOH ( on the other hand )
07:21:18 <anil_rao> The thing is this. The user will essentially want to monitor some endpoint, an instance, a DHCP server, a load balancer etc. That is the real use case. I think we need to think in terms of that.
07:21:25 <yamamoto_> i think we can't avoid confusions as long as we use terms like in, ingress, etc.
07:22:44 <reedip> anil_rao : yamamoto_ is right...
07:23:14 <reedip> anil_rao: also here, we are monitoring a DHCP server (for example) but we are mentioning traffic in terms of the VM
07:23:24 <reedip> there would be some sort of confusion
07:23:26 <anil_rao> I think it's odd that when one wants to examine traffic coming into a VM, you end up saying the direction should be egress.
07:23:47 <anil_rao> It's hard to correlate that to the ingress SG rules for the VM.
07:24:16 <soichi> TaaS is a service for end users, i think. So, it should be easy to understand in terms of end user.
07:24:30 <reedip> anil_rao : I think we can continue this discussion (and other points) on the ML itself. That way Neutron cores can also pitch in
07:24:33 <anil_rao> reedip: There is no difference between the DHCP server and a VM
07:24:45 <anil_rao> reedip: Sounds good. :-)
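The two points of reference debated above differ only by a swap of terms. A minimal sketch, assuming the direction is expressed as the strings "ingress" and "egress" plus a hypothetical "both" value; the function names are illustrative and are not part of the TaaS API.

```python
# Mapping between the two viewpoints: traffic entering the VM is
# leaving the switch, and the reverse. "both" is unaffected.
_SWAP = {"ingress": "egress", "egress": "ingress", "both": "both"}


def to_switch_view(direction_vm_view):
    """Translate a VM-relative direction into a switch-relative one."""
    return _SWAP[direction_vm_view]


def to_vm_view(direction_switch_view):
    """Translate a switch-relative direction into a VM-relative one."""
    return _SWAP[direction_switch_view]


# Traffic coming into the VM, as a Security Group rule would call it.
vm_direction = "ingress"
print(vm_direction, "->", to_switch_view(vm_direction))
```

Because the mapping is its own inverse, either convention could be chosen for the spec without losing information; the open question in the discussion is which one end users would find less confusing.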
07:25:08 <anil_rao> We can move on to the next topic(s)
07:25:57 <reedip> sure
07:26:27 <anil_rao> What is item (b) about?
07:27:42 <anil_rao> We are about to run out of time. Any thoughts on item (b)?
07:29:40 <yamamoto_> i don't understand (b)
07:30:09 <anil_rao> Same here. Let's continue this discussion on the ML as reedip has recommended.
07:30:13 <yamamoto_> both of l3 and lbaas are service plugins.
07:30:34 <yamamoto_> sure
07:31:06 <anil_rao> Well, we are out of time for today. We'll meet up again next week.
07:31:14 <anil_rao> #endmeeting