13:00:41 <esberglu> #startmeeting powervm_driver_meeting
13:00:42 <openstack> Meeting started Tue Apr  4 13:00:41 2017 UTC and is due to finish in 60 minutes.  The chair is esberglu. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:44 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:46 <openstack> The meeting name has been set to 'powervm_driver_meeting'
13:00:48 <thorst> o/
13:03:11 <esberglu> #topic Out Of Tree Driver
13:03:25 <esberglu> ocata is broken
13:03:34 <thorst> esberglu: the upload thing?
13:03:43 <esberglu> Yep
13:03:54 <thorst> crap...how did that get back ported...
13:04:06 <thorst> I thought we tested it earlier  :-/
13:04:10 <thorst> in the staging env...
13:05:15 <esberglu> So CI is still down, I can redeploy again with newton or we can get this figured out and be without CI until then
13:05:44 <thorst> esberglu: well it seems like we're going to need a new pypowervm for it
13:05:44 <esberglu> I think I may have missed some of the convo yesterday
13:05:52 <thorst> and we're going to have to bump that back in ocata...which is awful
13:05:54 <esberglu> But it looked like efried was expecting this
13:06:06 <thorst> well, efried thought this could occur.  I didn't think it could
13:06:26 <thorst> it was based off of whether or not whatever broke us was back ported
13:06:36 <thorst> efried will be on in 15 min and we can discuss more then
13:07:09 <esberglu> Do you know what broke us?
13:07:19 <thorst> nope
13:07:25 <thorst> that's the irony...and what I'm frustrated about
13:07:33 <thorst> we're trying to fix for something that we don't know why we broke
13:07:56 <esberglu> Well let's look at what got into ocata in the last week
13:08:10 <thorst> +2
13:08:13 <esberglu> Because I deployed ocata on staging at the end of last week no problem
13:10:04 <esberglu> There's nothing that has gone into ocata since I deployed on staging successfully
13:10:13 <esberglu> in the 3 powervm projects
13:10:18 <thorst> right.
13:10:21 <thorst> it'd be in nova itself
13:10:26 <esberglu> Yep I'm looking there now
13:10:43 <adreznec> It would have to be a bugfix at that point that broke us, right?
13:10:52 <adreznec> I mean Ocata's been cut for some time now
13:10:57 <thorst> I'd assume
13:10:58 <thorst> https://github.com/openstack/nova/commits/stable/ocata
13:11:02 <thorst> nothing much in the past week there tho
13:11:09 <adreznec> Right...
13:11:24 <thorst> thought maybe a global req change
13:11:26 <esberglu> None of that looks suspect
13:11:28 <thorst> but nothing much there
13:11:31 <thorst> pbr updated...
13:11:47 <adreznec> thorst: yeah, once ocata gets cut reqs are pretty much frozen
13:11:52 <adreznec> Unless there's a major breaking issue
13:12:32 <thorst> esberglu: I'm now curious if newton is hosed.
13:13:03 <esberglu> I hope not. That means CI is down for the count until this is resolved
13:13:13 <thorst> right.
13:13:56 <thorst> adreznec: does concurrent.futures use greenlet or eventlet?
13:14:09 <thorst> do you know?
13:15:26 <adreznec> thorst: not sure offhand
13:16:22 <adreznec> wait
13:16:28 <adreznec> isn't concurrent.futures a stdlib
13:17:31 <thorst> adreznec: yeah, but that's where efried sees us hanging
13:17:32 <adreznec> neither eventlet nor greenlet is a builtin, so it can't use either by default
13:17:48 <thorst> so efried wants to switch to all eventlet I think...
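[Editor's note: adreznec's point is correct — concurrent.futures is in the standard library (Python 3.2+; the `futures` package backports it to 2.7), and its ThreadPoolExecutor schedules work on real OS threads via the stdlib `threading` module, knowing nothing about eventlet or greenlet. A minimal sketch:]

```python
import concurrent.futures
import threading


def where_am_i():
    # Each submitted task runs on a native OS thread created by the
    # stdlib threading module -- not on a green thread.
    return threading.current_thread() is not threading.main_thread()


with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
    on_worker_thread = pool.submit(where_am_i).result()

print(on_worker_thread)  # True
```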
13:18:16 <efried> Howdy.
13:18:29 <thorst> so the net is...I think that is now our highest priority
13:18:31 <efried> Not sure if I missed anything, but I have a plan for the broken upload in ocata.
13:18:33 <thorst> and we should work it first oot
13:18:52 <thorst> rather than in-tree (so we don't create a bunch of misc reviews for the core side)
13:18:57 <thorst> efried: net is, CI is down
13:19:17 <thorst> what I think we're curious about is did something change in OpenStack...or was it perhaps even lower than that
13:19:23 <thorst> like eventlet or somewhere else
13:19:29 <adreznec> thorst: is this futures in py2.7 or py3?
13:19:53 <esberglu> Staging CI is still up. So if we need to run something through we can still do it there
13:19:57 <efried> For OOT ocata, we need a bug opened; then we a) port to ocata OOT the change that moves from FUNC back to IO_STREAM; b) update the pypowervm requirement to 1.1.1.  And, of course, we'll need to release 1.1.1.
13:19:59 <adreznec> I'll admit I'm not totally in the loop on the upload issue
13:20:30 <thorst> efried: what is the 'fix'
13:20:34 <thorst> down in pypowervm
13:20:37 <efried> Yeah, so something changed in eventlet recently - I still haven't nailed down exactly what, but sdague gave me some vague pointers last week.
13:20:51 <thorst> well shit
13:20:56 <adreznec> efried: something that would have changed since ocata was released?
13:20:57 <thorst> that'll affect things way back
13:21:10 <efried> The fix is two-sided, unfortunately.  In pypowervm, we have to kill coordinated upload.  In community, we have to kill FUNC.
13:21:12 <adreznec> otherwise shouldn't it be pinned by version in reqs?
13:21:27 <efried> Because there's no way to do FUNC without threads, and there's no way to do coordinated without threads.
13:21:44 <efried> The alternative is to retool pypowervm to use eventlet instead of futures.
13:21:50 <thorst> efried: so is concurrent.futures just dead?
13:22:00 <thorst> because of a change in eventlet?
13:22:01 <efried> No, it's just incompatible with greenlet.
13:22:18 <efried> Although that might not be entirely true.
13:22:20 <thorst> so func is still viable, just not in an OpenStack env.
13:22:40 <thorst> and it also calls into question if we need a change for your VIOS Task thingy
13:22:46 <thorst> which also uses concurrent.futures
13:22:55 <efried> I don't disagree with that.
13:23:12 <thorst> non-disagreement is as close as we can hope to get to a resounding agreement
13:23:21 <thorst> efried: so are you focused on that change today?
13:23:27 <thorst> and we just keep CI down while we fix that?
13:23:47 <efried> Which change?  Get rid of FUNC and coordinated in ocata OOT?
13:23:58 <efried> Or convert to greenlet?
13:24:23 <efried> Perhaps we should take a couple of minutes and go over what I (sort of, maybe) know so far about the underlying cause.
13:24:35 <thorst> efried: yes, lets do that
13:24:41 <efried> So from my research with mdrabe yesterday, I *think* it goes like this:
13:24:53 <efried> There are two kinds of threading models available: native threads and greenlets.
13:24:55 <adreznec> So just to clarify
13:25:04 <adreznec> What version of pypowervm is this using
13:25:07 <adreznec> With Ocata
13:25:15 <adreznec> 1.1.0?
13:25:17 <efried> I don't fully understand the difference between them, but they're totally different animals, not just different implementations on top of the same underlying threading model.
13:25:49 <efried> Openstack uses greenlets throughout.  They even have a hacking check in place to make sure you're using eventlet through their nova.utils wrapper of it.
13:26:20 <efried> (adreznec, not sure, but to fix this we'll need to release 1.1.1 and bump the ocata req to that.  Is that even legal?)
13:26:43 <adreznec> Uh
13:26:44 <esberglu> ocata is using 1.0.0.4 I believe
13:26:51 <adreznec> We can technically do that for Ocata
13:26:53 <adreznec> I guess...
13:27:07 <adreznec> in the future... please, let's never have to deal with that
13:27:08 <efried> Yeah, the req bump will need to happen regardless of which way we fix this.  Unless we can figure out some as-yet-unknown way to fix it purely in the community code.
13:27:36 <adreznec> I'm just wondering if this is only broken with some combination of versions
13:27:53 <adreznec> e.g. only with pypowervm 1.1.0 because that's where we require futures>=3.0
13:28:08 <adreznec> vs just "futures"
13:28:12 <adreznec> with no version req
13:28:32 <efried> Mm.  And presumably openstack doesn't require futures?
13:28:39 <adreznec> nope
13:28:44 <adreznec> sorry, yes, they do
13:28:54 <adreznec> that's where the >3.0 req came from
13:29:06 <adreznec> (walking to a meeting)
13:29:55 <efried> Anyway, threading in python apparently has this GIL (global interpreter lock) which actually makes it so that literally only one thread runs at a time - the others are stopped.
13:30:05 <thorst> right.
13:30:28 <efried> Normally this is okay because threads can yield and allow other threads to run, so as long as your actual programming doesn't have deadlocks in it, you're aaight.
13:30:29 <efried> But
13:30:41 <efried> This sucker is blocking on a syscall.
13:30:47 <efried> Which doesn't yield.
13:31:36 <efried> So all the other threads - including the greenlets, including the one that would kick the REST server to do its open, which would unblock the write side - are frozen.
13:31:58 <efried> Now, there is apparently a way to explicitly release the GIL.
13:32:30 <thorst> ?
13:33:03 <efried> That might be the least disruptive path, if we can figure out how to do it.  But a) it's going to be a hack (more on that in a bit), and b) it might not work in the context we would need to do it in - that is, it might only work if we can do it right at that open() call, which is in code we don't own.
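[Editor's note: a stdlib-only sketch of the pattern efried describes, with `os.pipe()` standing in for the upload pipe. With native threads the blocking read is survivable: the C-level syscall releases the GIL, so another OS thread can run and do the write. Greenlets are cooperatively scheduled on a single OS thread, so the same blocking read would freeze the would-be writer along with everything else.]

```python
import os
import threading

r, w = os.pipe()

# Native threads: os.read() blocks in a syscall, but the syscall
# releases the GIL, so a second OS thread can run and perform the
# write that unblocks the reader.
threading.Thread(target=os.write, args=(w, b"x")).start()
data = os.read(r, 1)   # returns once the writer thread has run
print(data)            # b'x'

# Greenlets have no second OS thread to fall back on: if the reader
# blocks in the syscall before the "writer" greenlet has run, nothing
# can ever switch to the writer, and every green thread in the
# process -- including the one that would kick the REST server --
# stays frozen.
```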
13:33:36 <thorst> efried: to me...lets just fix it proper...
13:33:43 <thorst> greenlet (not eventlet)?
13:35:33 <efried> Yeah, so I don't know what the difference is there - those are different libs - but they use greenlets / green threads under the covers.
13:35:57 <efried> Whereas anything that says "thread" - like the native thread library, or concurrent.futures - uses the other kind of threads.
13:36:37 <thorst> I'd assume we can use greenlets for most things...but the pipe may not be able to use greenlet...
13:37:06 <efried> That's an unknown at this point.  But I imagine it's gotta be possible.
13:37:17 <thorst> ok...
13:37:42 <thorst> so the net is, due to this, we need to bump pypowervm...get a bug for nova-powervm...and possibly back port this way (enough) back
13:38:00 <efried> However, given that we're going to need to release a new pypowervm and bump the OOT req to it anyway, I would just as soon do the fix that avoids threading altogether.
13:38:09 <thorst> waler is testing on Mitaka, so he could actually probably tell us if Mitaka is impacted  :-)
13:38:45 <thorst> efried: for the upload?  Sure.  VIOS Feed Task stuff...not so sure
13:39:36 <efried> thorst Agree, but I think what saves us there is that what's running in that thread is non-blocking.
13:39:36 <thorst> alright...so I guess that's priority 1...
13:39:53 <thorst> yep...I agree that is probably not highly impacted.
13:39:55 <efried> So maybe it hitches the process while it's doing that POST, but as soon as the POST comes back, we keep truckin.
13:40:09 <thorst> yeah, kinda ick, not uber ick
13:40:12 <efried> Not ideal, and perhaps something we should look into for the future, but not first priority, right.
13:40:37 <thorst> alright...so gameplan build out here...
13:40:50 <thorst> 1) esberglu - would you be willing to make the bug and tag at least back to ocata
13:41:05 <esberglu> Sure
13:41:21 <thorst> I honestly suspect that newton / mitaka may still be impacted...would love to know that if you have time to redeploy with newton
13:41:47 <thorst> 2) efried you're updating the pypowervm bits?
13:42:03 <thorst> 3) should I do the nova-powervm bits to swap off func?
13:42:04 <esberglu> thorst: I should be able to do that in the background today
13:43:11 <efried> thorst Actually, let me do it.
13:43:19 <efried> Take a look at this delta: https://review.openstack.org/#/c/443189/15..16/nova/virt/powervm/disk/ssp.py
13:43:32 <thorst> yeah
13:43:38 <efried> It'll be like that, except we won't actually need the IterableToFileAdapter.
13:43:41 <thorst> we would need to basically revert into that change for both localdisk
13:43:44 <thorst> and ssp
13:43:50 <thorst> why not?
13:43:53 <efried> Because I'm gonna make a change to pypowervm ;-)
13:43:59 <efried> Since we're going to need a new version of that anyway.
13:44:02 <efried> Backward compatible.
13:44:06 <thorst> don't make it even more complicated  :-)
13:44:07 <efried> But eliminating the need for IterableToFileAdapter.
13:44:17 <efried> It makes it less complicated, really.
13:45:00 <efried> The HTTP request expects an iterable.  Glance gives us an iterable.  For some reason we had pypowervm expecting a file and converting it to an iterable, so the community had to convert the iterable to a file just so pypowervm could convert it back.
13:45:11 <efried> which is stupid.
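[Editor's note: a minimal sketch of the roundtrip being eliminated. The class name matches the adapter efried mentions, but the internals here are illustrative, not the actual nova-powervm code: Glance yields an iterable of byte chunks, the adapter wraps it in a file-like `read()`, and pypowervm would then iterate that "file" right back into chunks for the HTTP body. The fix passes the iterable straight through.]

```python
class IterableToFileAdapter:
    """Wrap an iterable of byte chunks in a file-like read() API."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._buf = b""

    def read(self, size=-1):
        # Pull chunks until we can satisfy the request (or run dry).
        while size < 0 or len(self._buf) < size:
            try:
                self._buf += next(self._it)
            except StopIteration:
                break
        if size < 0:
            data, self._buf = self._buf, b""
        else:
            data, self._buf = self._buf[:size], self._buf[size:]
        return data


def glance_chunks():
    # Stands in for the iterable Glance's image download returns.
    yield b"chunk1"
    yield b"chunk2"


# Old path: iterable -> file-like adapter -> converted back to an
# iterable inside pypowervm for the HTTP request body.
adapter = IterableToFileAdapter(glance_chunks())
first = adapter.read(8)
rest = adapter.read()
print(first)  # b'chunk1ch'
print(rest)   # b'unk2'
```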
13:45:26 <thorst> hmm...ok
13:45:45 <thorst> well, I guess I'll let you do magic and be on point to be a reviewer?
13:46:24 <thorst> how do we make actions in the meeting?
13:46:41 <thorst> (for the meeting minutes)
13:46:53 <esberglu> #action esberglu: Open bug for upload issue
13:48:12 <efried> I'm going to use 5083, which is already most of the way there.  Just need to add the iterable killer.
13:48:41 <thorst> #action efried drive pypowervm and nova-powervm fixes for upload issue
13:48:54 <thorst> #action esberglu determine if newton is impacted
13:50:01 <thorst> efried: 5083 - we should tag that with the bug that esberglu is making
13:50:08 <thorst> so we have one bug capturing this whole nightmare...
13:50:27 <thorst> #action adreznec ship out a new pypowervm once this whole fiasco is solved  :-)
13:50:38 <thorst> (heh)
13:50:44 <adreznec> lol
13:50:59 <adreznec> might need to sync that with julio
13:53:01 <thorst> what else do we have for the meeting?
13:53:10 <esberglu> Cool sounds like we have a plan. Meeting is almost up, anyone have anything else? I don't have anything for CI
13:53:17 <esberglu> efried: anything in-tree?
13:53:32 <esberglu> Just waiting for reviewers at this point correct?
13:53:54 <thorst> nbante and jay are still doing testing...  I know jay is hitting issues, I'm trying to help out once I get in the env.  nbante I think is stuck on something with tempest in OSA.
13:53:56 <efried> I think by the time we hit the SSP change set, we'll need to bump the in-tree reqs to the new pypowervm.
13:55:09 <nbante> correct..adreznec: check once if we have to uncomment anything in user_config.yml to get that work
13:56:20 <adreznec> nbante: you'll likely have to experiment on your own there today. I'm pretty much swamped in meetings until later this afternoon
13:57:21 <nbante> sure..I already tried most of the parts, but will give it another shot in 2-3 hours. If it doesn't work, I'll send you a note
13:58:54 <esberglu> nbante: I may have time to take a look today. I will let you know
13:59:16 <thorst> nbante: one thought I had...
13:59:20 <nbante> sure..thanks
13:59:25 <thorst> I don't think we care if tempest is deployed via OSA
13:59:30 <thorst> or run from a separate server...
13:59:39 <thorst> you have a cloud, we just need to run tempest against it
13:59:41 <thorst> :-)
13:59:46 <thorst> so that gives you options to try
14:00:10 <thorst> but esberglu is more familiar than I am...so maybe he'll figure it out in 2 mins
14:00:25 <nbante> I've never tried tempest so I'm not sure how it works. In SVT, we have our own framework
14:00:44 <esberglu> thorst: I haven't got tempest working yet either.... so I'm pretty much in the same boat as nbante right now for OSA CI
14:00:58 <thorst> esberglu: right...but my thought is
14:01:03 <thorst> we have tempest working for IT CI
14:01:06 <thorst> or OOT CI
14:01:13 <thorst> so...uh...how'd we set it up there?
14:01:21 <thorst> and can we do the same here?
14:02:56 <esberglu> At least some tweaks will be needed. We can discuss more when I really dive into it
14:03:01 <thorst> awesome
14:03:10 <thorst> nbante: are you deployed with Cinder?
14:03:16 <nbante> no
14:03:24 <nbante> using local disk only
14:03:38 <thorst> so we're just getting back to where we were?
14:04:55 <nbante> after local disk, I worked on tempest, which is where I got stuck
14:05:16 <nbante> do you want me to work on configuring cinder in parallel?
14:05:16 <thorst> ok
14:05:25 <thorst> ahh, right...so we were getting tempest and then moving to iSCSI cinder
14:05:33 <nbante> correct
14:05:33 <thorst> sorry, I'm getting my wires wrong  :-)
14:06:19 <nbante> :)
14:06:40 <thorst> OK - I'll also catch up with Jay...
14:06:51 <thorst> can you check with him to get his IRC working?
14:07:21 <nbante> sure..will check
14:07:50 <thorst> ok - I didn't have anything else.
14:11:13 <esberglu> #endmeeting