14:00:37 <slaweq> #startmeeting neutron_drivers
14:00:38 <openstack> Meeting started Fri Apr  3 14:00:37 2020 UTC and is due to finish in 60 minutes.  The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:39 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:41 <openstack> The meeting name has been set to 'neutron_drivers'
14:00:44 <mlavalle> o/
14:00:50 <slaweq> hi everyone :)
14:01:01 <ralonsoh> hi
14:02:21 <slaweq> lets wait few more minutes for njohnston haleyb amotoki and yamamoto
14:02:30 <yamamoto> hi
14:03:11 <mlavalle> /me had a conversation recently with dalvarez. He agrees that slaweq "esta jodido con ese jefe"
14:03:40 <slaweq> :)
14:03:42 <mlavalle> while we wait I might as well share
14:03:45 <haleyb> hi
14:04:59 <slaweq> ok, we have quorum already so I think we can start
14:05:06 <njohnston> o/
14:05:13 <slaweq> #topic RFEs
14:05:24 <slaweq> we have 3 rfes for today
14:05:33 <slaweq> first one is: https://bugs.launchpad.net/neutron/+bug/1592028
14:05:34 <openstack> Launchpad bug 1592028 in neutron "[RFE] Support security-group-rule creation with address-groups" [Wishlist,Triaged] - Assigned to Roey Chen (roeyc)
14:07:54 <njohnston> Wow, that is quite an oldie
14:08:13 <mlavalle> well, I resucitated it
14:08:15 <slaweq> njohnston: yes, but mlavalle told me that he is interested in doing that
14:08:38 <njohnston> I think it's a good idea; any sophisticated firewall system has the ability to construct these kinds of objects
14:08:46 <mlavalle> at my employer we want to implement it
14:09:18 <slaweq> I think yamamoto has got valid question, he asked it in comment to this RFE
14:09:20 <mlavalle> we would develop upstream and then use it in our internal deployments
14:09:46 <njohnston> It would be especially good if they could be either global or tenant-local; in the private cloud scenario the ability for a central admin group to be able define IP blocks for common services is something I have seen a lot of pent up demand for
14:10:43 <mlavalle> and the answer to yamamoto's question is most likely yes
14:10:59 <mlavalle> although I would have to dig deeper in our use case
14:12:21 <slaweq> personally I don't think that such update would be more complex than now add of new port to same security group when 'remote_group_id' is used there
14:12:42 <slaweq> You need to update ipset on all hosts where this is used and IMO in this new case it would be similar
14:14:20 <slaweq> mlavalle: are You going to implement that only for iptables_hybrid driver or for others too?
14:15:46 <mlavalle> I don't think our intent is implement it for the hybrid driver
14:16:08 <mlavalle> we don't use the hybrid driver
14:16:15 <slaweq> so only openvswitch driver?
14:16:28 <mlavalle> yes
14:16:49 <njohnston> I'm all right with that
14:17:00 <slaweq> me too
14:17:09 <slaweq> I would just want to clarify one more thing
14:17:13 <mlavalle> I mean, is the way forward, right?
14:17:16 <slaweq> in RFE there is written something like:
14:17:19 <slaweq> "NOTE: For the purpose of the use-case above, the default allow-egress rules are removed ("zero trust" model) once the default sg is created."
14:17:49 <slaweq> but You are not going to propose removal of those rules from default groups?
14:18:12 <mlavalle> no, keep in mind we are re-purposing an existing RFE
14:18:26 <mlavalle> that clearly needs adaptation
14:18:43 <njohnston> yeah, it was unclear to me why that would be a mandatory part of the proposal
14:18:44 <slaweq> yes, I just wanted to make this clear as this would be pretty big backward incompatibility :)
14:18:55 <mlavalle> LOL
14:19:12 <slaweq> other than that I'm fine with accepting this rfe
14:19:24 <njohnston> I like this on a number of levels.  +1
14:19:27 <slaweq> but we will also probably need spec with description of new API
14:19:36 <njohnston> yes definitely
14:19:37 <slaweq> so +1 from me
14:19:45 <ralonsoh> +1 from me (waiting for the spec)
14:19:52 <mlavalle> yeah, we are fully expecting that the next step is a spec
14:20:12 <yamamoto> +1
14:20:45 <slaweq> haleyb: any thoughts?
14:21:07 <haleyb> i'm fine with it, +1 from me
14:21:31 <mlavalle> ok, thanks, I'll assign that RFE to me then
14:21:34 <slaweq> ok, so rfe approved, as a next step we are waiting for spec :)
14:21:43 <mlavalle> and will write the spec
14:21:44 <slaweq> thx mlavalle for revive this old spec
14:21:49 <njohnston> yes thanks mlavalle!
14:21:50 <mlavalle> :-)
14:22:07 <slaweq> so next one
14:22:09 <slaweq> https://bugs.launchpad.net/neutron/+bug/1869129
14:22:10 <openstack> Launchpad bug 1869129 in neutron "neutron accepts CIDR in security groups that are invalid in ovn" [Undecided,In progress] - Assigned to Jake Yip (waipengyip)
14:22:21 <slaweq> this one was reported as bug in ovn SG driver first
14:22:44 <slaweq> but then someone raised point that this is currently potential security issue e.g. in iptables driver
14:22:54 <slaweq> and that we should fix it on API level
14:23:13 <slaweq> so I wanted to discuss it here as potentially new rfe
14:23:23 <haleyb> so i did have a thought on this one, as it reminded me of another bug i fixed
14:23:38 <haleyb> https://bugs.launchpad.net/neutron/+bug/1582500
14:23:39 <openstack> Launchpad bug 1582500 in neutron "icmp, icmpv6 and ipv6-icmp should raise duplicated sg rule exception" [Wishlist,Fix released] - Assigned to Miguel Lavalle (minsel)
14:23:43 <haleyb> https://review.opendev.org/#/c/453346/ was the fix
14:24:24 <haleyb> in that case the SG api was changed to always force the SG rule to have a common name, can we just treat this bug similarly and change it to write the correct value to the DB?
14:25:00 <haleyb> and maybe have to change the RPC side as well, or fix the other drivers
14:25:01 <njohnston> I was thinking along similar lines to haleyb
14:25:05 <slaweq> haleyb: so if someone will set e.g. 192.168.1.100/24 in rule, we will change it to 192.168.1.0/24
14:25:10 <slaweq> correct?
14:25:28 <haleyb> slaweq: yes, use cidr.cidr from netaddr.IPNetwork()
14:25:51 <haleyb> i realize the caller might notice the difference
14:26:38 <slaweq> we can add e.g warning in logs to tell at least to cloud admin that we did such convertion
14:27:19 <njohnston> yes, if the caller did "create rule with $ip/$mask; check for creation by listing groups and grepping for $ip/$mask" then we broke them if $ip gets converted.  Hopefully people don't use such antipatterns though.
14:27:49 <haleyb> right.  i haven't verified but if we allow 192.168.1.100/24 and 192.168.1.99/24 rules, fixing to use the network would also possibly fix a duplicate rule issue
14:28:21 <slaweq> haleyb: You mean this warning on agent's side?
14:28:48 <njohnston> I think the warning would be on the server side since we should do this conversion before we store it in the db
14:29:32 <yamamoto> does it solve the security concern? (i don't understand the security concern)
14:29:33 * cgoncalves lurks and notices a potential Neutron stable API breakage coming
14:29:33 <haleyb> slaweq: would have to be a warning on the server side i think if at all, i can't remember if we warn for the icmp one i mentioned
14:30:12 <haleyb> cgoncalves: no, i want to leave the API alone and not start failing on these cidrs
14:31:08 <cgoncalves> haleyb, IIRC you said you'd store in the DB the "good" CIDR so I expect the API to return it instead of the user-provided CIDR
14:31:15 <njohnston> cgoncalves raises a good point though, we should consider if "the ip you put in the rule is the same IP you expect to get out of the rule" is implicit in the api contract
14:32:30 <haleyb> i don't know, but we have precedence for doing this in the ipv6-icmp case
14:33:25 <ralonsoh> I wouldn't follow this path: we should not sanitize a CIDR (store the correct one in the DB) and don't fail
14:33:27 <haleyb> the tempest SG case might notice this and fail though
14:33:33 <ralonsoh> because the user won't notice that
14:33:37 <slaweq> yamamoto: security concern raised in comment #5 is that if user will set bad cidr in rule, he may open his SG for wider range
14:33:48 <ralonsoh> this will really break the compatibility
14:34:06 <yamamoto> slaweq: he will get what he specified, won't he?
14:34:37 <slaweq> yamamoto: I would need to check it but according to c#5
14:34:40 <haleyb> ralonsoh: even if what we're actually enforcing is the "correct" cidr?
14:34:50 <ralonsoh> yamamoto, not depending on the backend
14:35:00 <slaweq> setting 192.168.1.1/0 will end up with rule like "-A neutron-linuxbri-i4abb52c4-d -j ACCEPT"
14:35:34 <ralonsoh> haleyb, yes because (1) now we'll probably have duplicated rules and (2) we must warn the user (or fail)
14:35:34 <cgoncalves> slaweq, human config error will always happen. it is not a security issue. the system does what it is told to enforce
14:35:53 <cgoncalves> slaweq, that is correct. there is nothing wrong with it
14:35:59 <njohnston> isn't that correct behavior for a 0 netmask?  why would you ever use a /0 netmask unless you wanted to open it up tot he world?  Working As Intended
14:36:16 <ralonsoh> cgoncalves, I don't agree, if we can detect that we should raise an exception
14:36:26 <slaweq> njohnston: cgoncalves TBH I agree with You - we can't prevent users from making any mistakes
14:36:40 <cgoncalves> ralonsoh, it's a valid CIDR per the RFC
14:36:50 <haleyb> ralonsoh: i think we would actually catch duplicate rules now as we'd actually use netaddr.IPNetwork(ip).cidr
14:37:09 <njohnston> I have used /0 netmasks before, but that was when I intended to expose a service tot he world
14:37:14 <cgoncalves> ralonsoh, if you want to prevent that, sure, create a new API but don't touch the existing one
14:38:51 <mlavalle> I lean towards to avoid getting in the mind reading business
14:39:21 <mlavalle> let's do what the user is asking us to do through the API
14:39:28 <slaweq> so it seems that with haleyb's proposal even, the only improvement would be that we would change what user send in request that he could see "correct" cidr in rule show
14:39:44 <slaweq> >>> n = netaddr.IPNetwork("192.168.1.1/0")
14:39:46 <slaweq> >>> n.cidr
14:39:48 <slaweq> IPNetwork('0.0.0.0/0')
14:39:57 <slaweq> but still security "issue" would be the same
14:40:00 <ralonsoh> exactly
14:40:12 <ralonsoh> cgoncalves, and "192.168.1.1/0" is not a valid CIDR
14:40:20 <ralonsoh> the CIDR is 0.0.0.0/0
14:40:57 <cgoncalves> ralonsoh, sure it is a valid CIDR
14:41:23 <haleyb> >>> netaddr.IPNetwork('192.168.1.100/24').cidr
14:41:23 <haleyb> IPNetwork('192.168.1.0/24')
14:41:32 <mlavalle> ahh, I see your point ralonsoh
14:41:35 <haleyb> no complaints from netaddr
14:41:46 <slaweq> haleyb: yes, I also checked that
14:42:10 <slaweq> so it would have basically same effect in e.g. iptables as is now
14:42:35 <slaweq> the difference would be that user would see "correct" network address in rule get
14:43:04 <haleyb> i agree the /0 is an odd case, but really is user error if i'm remembering the last time we thought about it
14:43:39 <njohnston> I think the /0 case is unrelated to correcting the network address
14:43:52 <haleyb> slaweq: yes, it would behave the same in iptables, and maybe fix OVN?  what to do about old rules in the OVN case?
14:44:23 <njohnston> I got disconnected so let me replay the last couple of lines I sent:
14:44:37 <njohnston> so back to the core issue here, it looks like OVN only accepts the network number when specifying a CIDR as opposed to any IP in the network, and that is not standard behavior.  I think that if OVN has this incompatibility, does it impose a significant efficiency penalty for OVN to handle it within the OVN driver code?
14:44:39 <slaweq> for ovn - we can simply do such convertion on driver's level and left all in db like it is now
14:45:29 <haleyb> njohnston: ack, i think we'd need to fix something in OVN if even just to deal with existing rules
14:46:06 <haleyb> and fyi, the API says this for remote_ip_prefix - "The remote IP prefix that is matched by this security group rule."
14:46:11 <slaweq> so basically the main question here is: do we want to fix it on ovn driver's level for ovn that it will work with our current API, or do we want to change API/DB and make it works for all drivers in same way?
14:47:09 <mlavalle> if the latter, wouldn't we have a backwards compatibility issue?
14:47:10 <haleyb> slaweq: i think we have to fix OVN regardless
14:47:47 <slaweq> mlavalle: IMO with latter we will have some api change
14:48:35 <mlavalle> yeah and then wouldn't we incur in backwards compatibility issue?
14:48:40 <slaweq> so my proposal is to fix/change it in ovn driver and add maybe some warning in API ref that this should be network adress, otherwise it will be converted to network address by driver
14:49:14 <haleyb> mlavalle: do you mean at the API level?
14:49:21 <mlavalle> yes
14:50:03 <mlavalle> I'm just trying to explore the implications
14:50:29 <haleyb> it depends on if we think changing the cidr in the rule to be what we're enforcing vs what was in the POST is ok
14:51:06 <haleyb> we do tweak it for ipv6-icmp and noone's noticed
14:51:25 <mlavalle> so far at least
14:51:32 <mlavalle> LOL
14:51:34 * haleyb crosses fingers
14:51:48 <slaweq> but for cidr it may be noticed faster IMHO
14:52:02 <mlavalle> that's true
14:55:00 <haleyb> i wonder if tempest checks, it definitely fails with the API change, and that enforcement could be a bigger backwards-compat issue
14:56:23 <njohnston_> WRT /0 I think we have traditionally not been in the business of preventing users from shooting themselves in the foot
14:56:36 <slaweq> haleyb: https://github.com/openstack/neutron-tempest-plugin/blob/master/neutron_tempest_plugin/api/test_security_groups.py seems that is comparing rules' ids :)
14:56:50 <haleyb> njohnston: no, the gun is loaded
14:57:32 <njohnston_> I could imagine creating a new exception “you asked for a rule with an IP specified but a net ask of /0; to request a wide open rule you must specify 0.0.0.0/0”
14:58:01 <slaweq> njohnston_++ for such api change :)
14:58:11 <haleyb> njohnston: yes, that would be ok with me (or use "any" which is also a synonym)
14:58:45 <slaweq> ok, we are almost on top of hour so we need to finish here for today
14:59:00 <slaweq> I will summarize this disussion in comment to the LP bug
14:59:08 <slaweq> and we will get back to it next week
14:59:13 <slaweq> fine for You?
14:59:16 <mlavalle> yes
14:59:20 <njohnston> yes
14:59:24 <yamamoto> yes
14:59:26 <haleyb> yes, thanks
14:59:26 <cgoncalves> Nate's suggestion is not backward compatible
15:00:08 <slaweq> ok, thx for attending and have a great weekend (at home mostly)
15:00:10 <slaweq> o/
15:00:13 <slaweq> #endmeeting