13:00:58 <alexpilo_> #startmeeting hyper-v
13:00:59 <openstack> Meeting started Wed Jan  6 13:00:58 2016 UTC and is due to finish in 60 minutes.  The chair is alexpilo_. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:01:00 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:01:03 <openstack> The meeting name has been set to 'hyper_v'
13:01:14 <alexpilo_> Morning!
13:01:17 <primeministerp> morning!
13:01:20 <claudiub> o/
13:01:27 <sagar_nikam> Hi All, Happy new year !!!!
13:01:35 <claudiub> happy new year. :)
13:01:43 <alexpilo_> Happy New Year!! :-)
13:02:08 <atuvenie> o/
13:02:27 <alexpilo_> up your "hands" folks, let's see who we have today
13:02:38 <alexpilo_> lpetrut?
13:02:59 <alexpilo_> sagar_nikam: is sonu joining us?
13:02:59 <abalutoiu> o/
13:03:10 <itoader> o/
13:03:33 <sagar_nikam> yes
13:03:43 <sagar_nikam> he said he will join
13:03:54 <lpetrut> o/
13:04:01 <alexpilo_> he sent a patch yesterday; we reviewed it quickly to be able to discuss it today
13:04:09 <sagar_nikam> sure
13:04:13 <alexpilo_> but we can start with the other topics
13:04:20 <sagar_nikam> kvinod: Sonu joining ?
13:04:21 <alexpilo_> #topic FC support
13:04:39 <alexpilo_> lpetrut: any updates?
13:04:59 <kvinod> Sonu will join
13:05:02 <lpetrut> yep, so, all the os-win FC related patches have merged
13:05:42 <lpetrut> the CI and unit tests should be all green on the nova patches once we do an os-win version bump, so we should try to get those merged as soon as possible
13:06:17 <alexpilo_> sweet
13:06:26 <lpetrut> sagar: have you guys looked over those patches?
13:06:38 <lpetrut> sagar: it would be great if you could give it a try on your environment
13:06:45 <sagar_nikam> sure
13:06:56 <sagar_nikam> i will have somebody check it
13:07:21 <sagar_nikam> can you add me as reviewer
13:07:23 <lpetrut> thanks a lot, please let me know how it works for you
13:07:24 <lpetrut> sure
13:08:55 <alexpilo_> #link https://review.openstack.org/#/c/258617/
13:08:57 <lpetrut> I know this is unrelated to FC, but here's the iSCSI refactoring patch: https://review.openstack.org/#/c/249291/ It's marked as WIP because we're currently working on the unit tests, but it would be great if you guys could give us some feedback on this as well
13:09:29 <lpetrut> most probably we'll get this merged today
13:09:30 <alexpilo_> there's still a -1 by walter here
13:09:40 <alexpilo_> anybody know his IRC nick?
13:10:01 <alexpilo_> hemna, afaik
13:10:10 <lpetrut> yep, basically, he requested some more info to be provided, such as whether multipath should be used, and the os_type and platform
13:10:12 <lpetrut> yep
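A minimal sketch of the extra connector properties hemna asked for, assuming the usual Nova/os-brick connector dict layout (the example values and exactly which keys the Hyper-V driver fills in are assumptions):

    # Hypothetical connector info including the fields requested in the review:
    # whether multipath should be used, plus os_type and platform.
    connector = {
        'ip': '10.0.0.2',
        'host': 'hyperv-compute-01',
        'initiator': 'iqn.1991-05.com.microsoft:hyperv-compute-01',
        'multipath': True,       # whether multipath should be used
        'os_type': 'windows',    # requested by the review
        'platform': 'x86_64',    # requested by the review
    }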
13:10:12 <sagar_nikam> lpetrut: this has the fix for CHAP ?
13:10:12 <alexpilo_> he's not on this channel anyway
13:10:35 <alexpilo_> lpetrut: let's stick to FC
13:10:35 <sagar_nikam> ok
13:10:40 <claudiub> hemna is on the #openstack-nova channel.
13:10:46 <alexpilo_> we'll switch topic to iSCSI soon
13:10:47 <lpetrut> my bad, sure
13:10:49 <sagar_nikam> and the fix for rescan ?
13:10:57 <alexpilo_> sagar_nikam: ^ :-)
13:11:02 <sagar_nikam> ok
13:11:34 <lpetrut> I guess I'm a bit over enthusiastic about this one :)
13:11:48 <lpetrut> any other questions related to FC?
13:12:00 <alexpilo_> there are no HP reviews on the Nova FC patches
13:12:02 <sagar_nikam> not from my side
13:12:10 <alexpilo_> only the last one in the chain has one
13:12:14 <sagar_nikam> kurt ?
13:12:18 <alexpilo_> we need reviews on all of them:
13:12:31 <alexpilo_> #link https://review.openstack.org/#/c/258614
13:12:32 <sagar_nikam> i thought he did the reviews
13:12:43 <alexpilo_> #link https://review.openstack.org/#/c/258615
13:12:52 <alexpilo_> #link https://review.openstack.org/#/c/260980
13:12:55 <lpetrut> just a side note: I'll add the vHBA support ASAP
13:13:07 <alexpilo_> just a quick note on vHBA
13:13:33 <alexpilo_> we have two separate options, passthrough and vHBA
13:14:06 <alexpilo_> vHBA is a more flexible feature and easier to implement as it doesn't require all the hassle required by passthrough
13:14:15 <alexpilo_> especially for live migration
13:14:17 <sagar_nikam> yes, I remember, and I think we decided to implement passthrough first
13:14:25 <alexpilo_> but it has some very hard limitations:
13:14:34 <alexpilo_> 1) no boot from volume
13:14:53 <alexpilo_> 2) guest OS support requirement
13:15:00 <lpetrut> so we'll implement both, maybe passthrough as default, letting the user opt for vHBA by setting the bus type to FC
13:15:11 <alexpilo_> so for this reason it will be implemented separately as soon as the current patches merge
13:15:37 <sagar_nikam> ok
13:15:39 <alexpilo_> I'm just recapping this here to make sure we keep focus on passthrough
13:16:04 <sagar_nikam> i think even kurt had suggested passthrough... if i remember right
13:16:15 <alexpilo_> for the rest, priority is on getting Cloudbase and HP +1s
13:16:34 <alexpilo_> so that we can move the patches to the Nova etherpad queue
13:16:42 <sagar_nikam> alexpilo_: we have 3 patches ?
13:16:48 <sagar_nikam> that needs review ?
13:16:53 <alexpilo_> 4
13:17:17 <alexpilo_> the last one is the first link I posted (the one with hemna's review)
13:17:25 <sagar_nikam> 614, 615 and 980
13:17:40 <sagar_nikam> ok
13:17:40 <alexpilo_> 258617
13:18:00 <alexpilo_> anything else you'd like to add on FC?
13:18:08 <alexpilo_> sagar_nikam, lpetrut
13:18:16 <lpetrut> not on my side
13:18:30 <sagar_nikam> no, we will start the review, i think we will need Kurt to review all 4
13:18:39 <alexpilo_> sweet, tx!
13:18:42 <sagar_nikam> will let him know
13:18:55 <alexpilo_> #topic iSCSI 3par support
13:19:09 <alexpilo_> now is the time to get wild on iSCSI :-)
13:19:23 <primeministerp> buckwild
13:19:24 <sagar_nikam> 2 issues that we have seen in 3par iscsi: CHAP and rescan
13:19:29 <lpetrut> heh, so, lun rescanning is in place now
13:19:48 <sagar_nikam> lpetrut: already merged ?
13:19:58 <sagar_nikam> lun rescan
13:20:03 <lpetrut> nope, it's the patch I mentioned before
13:20:22 <lpetrut> passing CHAP credentials when logging in portals: not yet. As this was causing issues with other backends, I wanted to test this first.
13:20:43 <alexpilo_> sagar_nikam: this brings up the topic we discussed on Monday
13:21:00 <alexpilo_> using the FC 3par array for iSCSI testing as well
13:21:07 <primeministerp> *nod*
13:21:30 <primeministerp> We need HBAs for the array
13:21:45 <alexpilo_> so, as soon as we have the additional HW in place, we will start testing this ASAP
13:22:17 <alexpilo_> primeministerp would you like to add something here?
13:22:22 <primeministerp> sagar_nikam, someone on your side was looking into that
13:23:37 <primeministerp> basically we're going to need the HBAs to add the iSCSI functionality to the array
13:23:49 <primeministerp> we may need additional licensing as well
13:24:21 <sagar_nikam_> lpetrut: did you see kmbharath's patch
13:24:25 <sagar_nikam_> on CHAP
13:24:33 <sagar_nikam_> he had a fix for it many months back
13:24:46 <primeministerp> hmmm
13:25:05 <primeministerp> hopefully he'll reconnect
13:25:32 <lpetrut> yep, but I was thinking whether we can save time by not logging in the portal twice (once without CHAP creds, once with CHAP creds), and maybe use a flag on the Cinder side. Like "portals_requires_chap_auth" or something similar
13:25:33 <sagar_nikam_> i am connected
13:25:45 <primeministerp> ;)
13:26:00 <sagar_nikam_> kmbharath: your comments ?
13:26:11 <sagar_nikam_> on lpetrut's suggestion
13:26:12 <lpetrut> we could just push this into the volume connection info
13:26:25 <alexpilo_> sagar_nikam_: as previously discussed, we need to ensure that this patch won't cause issues for other backends
13:26:33 <kmbharath> Yes, agreed, if we have a flag and do it that way, it would be better
13:26:43 <alexpilo_> as the 3par one seems to be the only one with this requirement
13:26:51 <sagar_nikam_> ok
13:26:56 <kmbharath> we had tested it on HP LeftHand and 3par earlier
13:27:02 <alexpilo_> the option is probably the only way to do that
13:27:08 <lpetrut> great, do you know by any chance which backends require this? Is it just 3PAR, and do all 3PAR backends require this?
13:27:21 <alexpilo_> but:
13:27:30 <sagar_nikam_> from our tests, only 3par
13:27:44 <alexpilo_> what if we have 2 backends, e.g. a 3par and another 3rd party one?
13:27:47 <sagar_nikam_> LHN/VSA worked without any change
13:27:57 <alexpilo_> one expecting portal logins and the other one failing?
13:28:22 <alexpilo_> the flag won't help, as it would force login on all of them
13:28:31 <lpetrut> no, why?
13:28:43 <alexpilo_> because the other backend would fail
13:29:05 <alexpilo_> an option (I believe suggested by lpetrut) would be to try the login and silently continue if it fails
13:29:29 <lpetrut> the other backend would not set this flag, and we would not use CHAP creds when logging in the portal, so it should not fail
13:29:34 <lpetrut> please correct me if I got something wrong
13:29:52 <kmbharath> can't we do a check for the backend type....
13:29:55 <alexpilo_> that would require changes on the cinder side
13:30:09 <kmbharath> because from what we had seen, it's only 3par that needs the portal login
13:30:34 <sagar_nikam_> lpetrut: who will set the flag ? every cinder driver ?
13:30:51 <lpetrut> this can be optional, but the driver would set it when providing the connection info
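A minimal sketch of how the flag proposed above could be consumed on the Nova side (the flag name portals_requires_chap_auth comes from the discussion; the helper login_storage_target and the exact connection_info keys are assumptions):

    def login_portal(connection_info):
        data = connection_info['data']
        # Pass CHAP credentials at portal login time only when the Cinder
        # driver explicitly set the flag; backends that did not set it keep
        # logging in without credentials and are unaffected.
        if data.get('portals_requires_chap_auth'):
            auth_username = data.get('auth_username')
            auth_password = data.get('auth_password')
        else:
            auth_username = auth_password = None
        # login_storage_target is a hypothetical helper standing in for the
        # actual os-win portal login call.
        login_storage_target(data['target_iqn'], data['target_portal'],
                             auth_username, auth_password)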
13:30:59 <alexpilo_> or we could simply have a list of the backends requiring portal login in the Nova driver
13:31:08 <lpetrut> also, the connection info does not include the backend type at the moment
13:31:21 <alexpilo_> by making it an option, this could be configurable
13:31:35 <alexpilo_> lpetrut: d'oh, that's a blocker
13:31:41 <sagar_nikam_> alexpilo_: this option looks better
13:32:15 <alexpilo_> lpetrut sagar_nikam_: let's bring this back to the whiteboard and sync again next week
13:32:19 <sagar_nikam_> having a list of backends in nova is better than cinder sending it in connection_info
13:32:31 <kmbharath> the target_iqn in connection info could help us to identify the backend
13:32:52 <lpetrut> kmbharath: umm, is this reliable enough?
13:33:23 <lpetrut> alexpilo_: ok, we can talk about this later so that we don't block the meeting for this
13:33:29 <sagar_nikam_> lpetrut: i think the iqn had 3par in it
13:33:41 <sagar_nikam_> ok, let's move to the next topic
13:33:46 <lpetrut> sagar_nikam: sure, just wanted to make sure that this happens all the time
13:33:54 <sagar_nikam_> can we have some networking discussion
13:33:56 <kmbharath> yes we had seen it every time
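A minimal sketch of the target_iqn heuristic kmbharath describes, purely as an illustration of the workaround under discussion (not merged behavior):

    def portal_requires_chap(connection_info):
        # 3PAR targets were observed to contain '3par' in the IQN, e.g. the
        # 'iqn.2000-05.com.3pardata:...' prefix.
        target_iqn = connection_info['data']['target_iqn']
        return '3par' in target_iqn.lower()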
13:34:04 <sagar_nikam_> sonu: is here i think
13:34:07 <alexpilo_> did sonu join?
13:34:17 <Sonu> I am listening..
13:34:21 <sagar_nikam_> i saw him joining
13:34:24 <alexpilo_> great
13:34:39 <primeministerp> sagar_nikam: yes
13:34:44 <alexpilo_> #topic SGR RPC patch
13:34:55 * alexpilo_ fetches link...
13:35:27 <alexpilo_> #link https://review.openstack.org/#/c/263865/
13:35:41 <alexpilo_> first, thanks Sonu for the patch
13:36:07 <alexpilo_> did you see claudiub's review?
13:36:22 <Sonu> I am reviewing the same.
13:36:33 <Sonu> Thanks for the comments. I will fix them and re-post
13:36:34 <alexpilo_> we prioritized it right away to be sure we could talk about it today
13:36:45 <Sonu> Thanks for that
13:36:52 <alexpilo_> claudiub Sonu: anything to add?
13:37:10 <Sonu> This patch is dependent on https://review.openstack.org/#/c/240577/
13:37:31 <Sonu> there was a bug in the base security groups driver, which I have fixed; review is in progress.
13:37:36 <claudiub> could you then add a Depends-On: <change-id> here?
13:37:43 <alexpilo_> btw we need a BP for this
13:37:43 <Sonu> I will mark the dependency
13:37:53 <claudiub> cool ty. :)
13:37:55 <Sonu> Got it. I will work on the same.
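For reference, the cross-repo dependency claudiub asked for is expressed as a Gerrit commit message footer along these lines (subject and body are placeholders):

    <commit subject>

    <commit message body>

    Depends-On: <Change-Id of https://review.openstack.org/#/c/240577/>
    Change-Id: <this patch's Change-Id>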
13:38:17 <alexpilo_> sweet
13:38:28 <alexpilo_> moving to a broader topic:
13:38:38 <alexpilo_> #topic networking-hyperv improvements
13:39:00 <alexpilo_> let me share some of the design aspects related to this agent
13:39:11 <alexpilo_> there are a few improvements going on
13:39:57 <alexpilo_> we already discussed this a while back, when we talked about the multiprocessing patches that HP sent for SGR
13:40:22 <alexpilo_> we are currently finalizing the results, which will turn into a BP
13:40:41 <alexpilo_> 1) drop the port discovery loop and replace it with WMI events
13:41:09 <alexpilo_> 2) use PyMI
13:41:31 <alexpilo_> 3) replace associator queries with direct WQL queries
13:41:58 <alexpilo_> 4) parallelize all the things :)
13:42:28 <alexpilo_> this includes in particular the ACLs, which are a big bottleneck
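A minimal sketch of item 1 above, assuming the classic wmi module's watch_for API, for which PyMI provides a drop-in compatibility layer (the watched class and the treat_port handler are illustrative):

    import wmi  # with PyMI installed, 'import wmi' resolves to its compat layer

    conn = wmi.WMI(moniker='//./root/virtualization/v2')
    # Block on WMI creation events instead of polling the port list in a loop.
    watcher = conn.Msvm_SyntheticEthernetPortSettingData.watch_for(
        notification_type='creation', delay_secs=2)
    while True:
        port = watcher()    # returns when a new port appears
        treat_port(port)    # hypothetical per-port handler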
13:42:55 <kvinod> alexpilo_ : are you talking about a new multiprocessing approach
13:43:00 <alexpilo_> unlike the Juno patch that HP is using, this doesn't require multiprocessing
13:43:27 <kvinod> is this blueprint on multiprocessing going to be different from what we posted?
13:43:30 <alexpilo_> PyMI (unlike the old WMI + pywin32) is designed to work with multiple threads
13:43:55 <alexpilo_> kvinod it's quite different, especially in the implementation
13:44:27 <alexpilo_> this is the reason why it has been kept on hold
13:44:45 <kvinod> ok, then are you saying that HP's patch sets are not required and will not get merged?
13:45:01 <sagar_nikam_> alexpilo_: how are the tests in a scale environment with this new approach ?
13:45:04 <alexpilo_> surely not in their current state
13:45:13 <claudiub> so, I've tested the patchsets you've sent and there are a couple of issues.
13:45:17 <alexpilo_> kvinod: ^
13:45:39 <Sonu> The biggest problem we had encountered at scale was too many port updates introducing delays in the processing of new port additions.
13:45:44 <alexpilo_> sagar_nikam_: we are testing with Rally at scale
13:45:52 <claudiub> there are 2 big issues atm: 1. logging doesn't work, apparently; I just noticed this 30 minutes ago
13:46:11 <Sonu> that's one reason we separated the addition of ports into different workers scheduled on another CPU
13:46:34 <alexpilo_> Sonu: you can just use threads for that, no need for multiple processes
13:46:35 <claudiub> with the workers patch, logging is only done to stdout, the neutron-hyperv-agent.log is empty
13:47:02 <claudiub> and second, it seems that the agents die randomly during rally.
13:47:43 <claudiub> they freeze, which means they stop reporting their alive state, which in turn leads to failures to spawn VMs, as the neutron agents are considered dead and the neutron ports can't be bound
13:48:24 <Sonu> hmm, that's news :)
13:48:25 <kvinod> claudiub : we already noticed the logging issue and solved it by making the child process send a message to the parent process about port binding success or failure; the parent process then logs it to the log file
13:48:45 <alexpilo_> kvinod: that is unnecessary work
13:49:07 <claudiub> and thirdly, if there is any issue binding a port, retrying it over and over again makes the neutron-hyperv-agent process consume the whole CPU.
13:49:48 <kvinod> alexpilo_ : we did it that way due to a limitation in the logging framework, as it does not work with multiprocessing
13:49:54 <alexpilo_> sorry for trimming the discussion, as we have only 10 minutes left
13:50:09 <claudiub> kvinod: yeah, i've seen that. if i start the process manually and watch stdout, i can see what happens in the child processes, including traces and so on. but there's nothing in the log file.
13:50:36 <alexpilo_> the idea of parallel execution in the agent is of course the common goal here
13:50:48 <alexpilo_> Python's multiprocessing brings a lot of unnecessary drawbacks and there's no reason to use it
13:51:14 <alexpilo_> threads / green threads work perfectly fine as long as the underlying WMI calls are non-blocking
13:51:38 <alexpilo_> (otherwise we'd hit the GIL issue, which is I guess why you opted for multiprocessing)
13:52:14 <Sonu> yes you got it right
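A minimal sketch of the thread-based alternative argued for here: since PyMI can release the GIL while an MI call is in flight, native threads can overlap the slow per-port WMI work (treat_port is again a hypothetical handler):

    from concurrent import futures

    def treat_ports_parallel(ports, treat_port, max_workers=10):
        # Workers spend most of their time blocked in WMI/MI calls; with the
        # GIL released during those calls, the pool processes ports in parallel.
        with futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
            list(executor.map(treat_port, ports))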
13:52:36 <alexpilo_> the discussion is anyway much broader, which is why this BP is taking a while
13:52:39 <claudiub> i'm currently trying out native threads, to see how it's going with those.
13:52:52 <Sonu> Vinod had tried all such possibilities
13:52:57 <sagar_nikam_> Sonu: Can you review the new approach once a patchset is available
13:53:13 <sagar_nikam_> and check how it will work
13:53:15 <Sonu> he can help you with his findings and observations
13:53:17 <claudiub> i haven't uploaded the native threads patch yet
13:53:23 <sagar_nikam_> based on our scale tests run on Juno
13:53:26 <alexpilo_> Sonu kvinod: that's why we wrote PyMI ;)
13:53:32 <Sonu> great
13:53:42 <alexpilo_> the main aspect here is that the ACL APIs are simply terrible, and no matter how you look at them they don't scale
13:53:43 <Sonu> I will have a look at the new patch set and try
13:54:02 <sagar_nikam_> claudiub: patchset not available yet ?
13:54:10 <kvinod> ok, please upload your patches
13:54:18 <claudiub> sagar_nikam_: native threads, not yet, still working on it.
13:54:23 <alexpilo_> so all this parallelization work is improving the situation a bit, but a more drastic approach will be needed
13:54:40 <alexpilo_> kvinod: expect the patches sometime next week
13:55:19 <alexpilo_> 1) the OVS driver will become the preferred option as soon as conntrack is available on our Windows port
13:55:29 <sagar_nikam_> sonu: kvinod: you had seen issues with security groups as well
13:55:35 <alexpilo_> as this is required for SGR as well
13:55:51 <Sonu> have you seen the development - https://review.openstack.org/#/c/249337
13:56:05 <alexpilo_> 2) we're evaluating a complete rewrite of the ACL WMI API
13:56:27 <alexpilo_> Sonu: yep, that's what I'm referring to
13:56:48 <Sonu> this is OVS firewall being done once OVS conntrack is available
13:56:49 <alexpilo_> we need conntrack on Windows for that to work
13:57:09 <alexpilo_> which is our main goal for OVS 2.6
13:57:13 <primeministerp> Sonu: could we have an offline discussion about the HP networking hardware that supports protocol acceleration? I want to see what it would take to add OVS and native vSwitch testing on the appropriate network HBAs; the ones we currently have in the HP 3Par CI only support accelerated iSCSI
13:57:13 <Sonu> got it.
13:57:34 <alexpilo_> 3 minutes to go
13:57:47 <Sonu> sure primeministerp
13:57:55 <Sonu> Sagar mentioned about it
13:58:01 <sagar_nikam_> primeministerp: sure, i will connect you and Sonu on this topic
13:58:02 <primeministerp> Sonu, awesome
13:58:07 <alexpilo_> changing topic; if you'd like to continue this discussion, let's move to #openstack-hyper-v
13:58:10 <primeministerp> thanks sagar_nikam_
13:58:17 <alexpilo_> #topic PyMI
13:58:32 <alexpilo_> so PyMI is feature complete for all OpenStack use cases
13:58:42 <alexpilo_> it has been tested under heavy load with Rally
13:58:45 <sagar_nikam_> that is very good news
13:58:59 <primeministerp> woot
13:59:12 <alexpilo_> sagar_nikam_: do you think you could test it in your environments?
13:59:21 <sagar_nikam_> yes, we will
13:59:24 <alexpilo_> we tested, kilo, liberty and mitaka (master)
13:59:36 <sagar_nikam_> i need to get some slots from the scale team to test this
13:59:58 <alexpilo_> all CIs are now switching to it, including nova, neutron, networking-hyperv, compute-hyperv and the cinder ones
14:00:08 <sagar_nikam_> most of our environments have also moved to Liberty
14:00:15 <sagar_nikam_> need to find Juno
14:00:20 <sagar_nikam_> for testing PyMI
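Trying PyMI out is meant to be a drop-in swap for the old WMI + pywin32 stack; a minimal smoke test, assuming the package is published on PyPI under the name PyMI:

    # pip install PyMI
    # Existing code keeps its imports; with PyMI installed this resolves to
    # the new compatibility layer instead of the old pywin32-based module.
    import wmi

    conn = wmi.WMI()  # defaults to the root/cimv2 namespace
    print(conn.Win32_OperatingSystem()[0].Caption)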
14:00:30 <alexpilo_> perfect tx!
14:00:37 <alexpilo_> time's up! :)
14:00:40 <alexpilo_> #endmeeting