14:15:12 <adreznec> #startmeeting powervm_drver_meeting
14:15:13 <openstack> Meeting started Tue Jan 24 14:15:12 2017 UTC and is due to finish in 60 minutes.  The chair is adreznec. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:15:14 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:15:17 <openstack> The meeting name has been set to 'powervm_drver_meeting'
14:15:27 <adreznec> #topic In-tree driver status
14:15:43 <adreznec> Let's start here. I'll turn things over to you efried
14:17:12 <esberglu> Hey guys sorry I'm late, forgot my badge
14:17:39 <efried> I'm slowly working my way through the early in-tree change sets.
14:17:41 <adreznec> np esberglu, just fired up the meeting. talking in-tree driver status first
14:18:03 <efried> First one is for sure ready for wider review; we're just waiting for in-tree CI before we can put pressure on the cores to review.
14:18:24 <efried> If we can get the CI up today or tomorrow, there's a chance we can get mriedem to do a review before the nova meeting on Thursday.
14:18:33 <efried> What's the o-3 date?  Is it Thursday or Friday?
14:19:02 <esberglu> Thursday
14:19:20 <adreznec> efried: It's up to projects a bit. Most will be Thursday, but Jan 23-27 is the official range
14:19:40 <efried> Well, okay, this is nova we're talking about.
14:20:00 <efried> Confirmed Thursday.
14:20:26 <thorst_> whoops
14:20:28 <efried> Sooo... we pretty much miss ocata if we don't have the CI up today.  Cause I don't think we get the change merged on the first pass.
14:20:30 <thorst_> sorry...
14:20:50 <thorst_> efried: I think we've missed Ocata.  :-)
14:20:59 <thorst_> for the intree.
14:21:30 <efried> Let's work as if there's still a chance.
14:21:36 <thorst_> agree.
14:21:49 <efried> Remainder of driver status is piddly details.  I can go more in depth if you want, but we should probably spend the time on more important stuff.
14:22:04 <efried> *in-tree driver status
14:22:18 <thorst_> I have oot driver talk, but I suspect that's a different part of meeting
14:22:35 <efried> yuh, suggest waiting til after we talk in-tree CI.
14:22:42 <efried> Anything else before we move on to that?
14:22:58 <adreznec> Nothing here
14:23:10 <adreznec> #topic In-tree driver CI status
14:23:15 <adreznec> esberglu, the floor is yours
14:23:35 <esberglu> The in-tree driver is failing to stack. "n-cpu service is not running" for some reason
14:23:44 <esberglu> Problem is that the logs are failing to copy
14:23:48 <esberglu> So I'm running one manually
14:23:52 <esberglu> To see what the deal is
14:24:09 <adreznec> esberglu: That's weird. Is it just not getting far enough to transfer somehow?
14:24:15 <adreznec> Or is there actually an scp failure?
14:25:14 <esberglu> Nah it's actually an SCP failure. Which is weird because I didn't change anything for SCP
14:25:21 <esberglu> Trying to create /srv/static/logs/59
14:25:26 <esberglu> It fails trying to create that
14:25:36 <esberglu> the log server isn't full or anything though
14:26:31 <esberglu> I will let you guys know the results of the manual run when it finishes
14:26:38 <esberglu> The other thing I had
14:26:47 <adreznec> Hmm odd. Nothing in that flow should have changed unless something is broken in the build variables somehow
14:27:19 <esberglu> There's a "test connection" thing in the configure system, and it is connecting to the log server fine
14:27:48 <esberglu> 3 scripts run as part of the CI setup
14:27:48 <esberglu> 1) prepare_node_powervm
14:27:48 <esberglu> 2) ready_node_powervm
14:27:50 <esberglu> 3) prep_devstack
14:27:58 <esberglu> Previously we would install the patched (develop) pypowervm in prepare_node_powervm
14:27:59 <esberglu> Since we are now using 2 different versions of pypowervm (1.0.0.4 for in tree, develop for oot) I moved the installation to prep_devstack
14:27:59 <esberglu> The problem is that we need the patched pypowervm to be installed for the ready_node script to work
14:28:20 <esberglu> So I was thinking just install the patched develop in prepare_node_powervm
14:28:36 <esberglu> Then if it is in tree, just overwrite with the patched 1.0.0.4
14:28:44 <esberglu> Thoughts?
14:29:06 <efried> As long as 1.0.0.4 is in place before the nova compute process starts, it shouldn't matter when we do it.
14:29:22 <efried> Heck, I would even be okay skipping that wrinkle for now and just continuing to use develop
14:29:26 <adreznec> efried: I think it does
14:29:29 <efried> They're not different enough that it's going to cause failures.
14:29:30 <adreznec> Because I think we use pvmctl before that
14:29:32 <adreznec> To do node setup
14:29:58 <efried> and we can focus on getting things working first, then worry about that version switch later.
14:30:07 <esberglu> Yeah that was the problem. We don't know if it's in or out of tree until (3) but we need pvmctl by (2)
14:30:30 <adreznec> esberglu: we could just always install develop for step 2
14:30:38 <adreznec> then switch it out for the "right" version in 3 if needed?
14:30:56 <adreznec> Fragments it a bit, but... meh
14:30:58 <esberglu> Yeah. That's exactly how I have it set up right now. Seems to be working, just wanted to make sure I wasn't missing something
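
The ordering agreed above can be sketched as a small lookup: always install the patched develop pypowervm before ready_node_powervm needs pvmctl, then swap in the patched 1.0.0.4 during prep_devstack only for in-tree runs. The function name and the version labels are hypothetical and exist only for illustration; the real scripts live in powervm-ci and may be structured differently.

```python
# Hypothetical labels for the two pypowervm builds mentioned in the meeting.
DEVELOP = "develop (patched)"
IN_TREE_PIN = "1.0.0.4 (patched)"


def pypowervm_install_plan(in_tree):
    """Return (script, pypowervm build to install) pairs in execution order.

    None means the script installs nothing and uses what is already there.
    """
    return [
        # Step 1: pvmctl must work by step 2, so develop always goes in here.
        ("prepare_node_powervm", DEVELOP),
        # Step 2: node setup runs pvmctl against the develop install.
        ("ready_node_powervm", None),
        # Step 3: only now is it known whether the run is in-tree.
        ("prep_devstack", IN_TREE_PIN if in_tree else None),
    ]
```

The trade-off is the one adreznec notes: installation is split across two scripts, but nothing has to know the driver type before step 3.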
14:31:06 <efried> oh, okay.
14:31:42 <efried> thorst_ adreznec - can you think of an easy way to have e.g. Adapter() init log the pypowervm version?
14:31:58 <adreznec> There are other options like bundling pypowervm/pvmctl into a venv and shipping that whole thing so pvmctl has its own pypowervm to use always or something
14:32:01 <adreznec> But they're more work
14:32:17 <efried> I don't know where the version numbers are stored.  I'm sure it involves pbr or something.
14:32:32 <thorst_> efried: no idea...
14:34:01 <adreznec> We'd have to make a call off to pbr's version_string method to get that
14:34:25 <efried> In [12]: pbr.version.VersionInfo('pypowervm').release_string()
14:34:25 <efried> Out[12]: '1.0.0.dev4'
14:34:28 <efried> :)
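
A minimal sketch of the version-logging idea, assuming the check is done with the standard library's importlib.metadata rather than the pbr call shown above. The helper names are hypothetical, and where the call would actually go (for example Adapter() init or compute startup) is left open, as it was in the meeting.

```python
import logging
from importlib import metadata

LOG = logging.getLogger(__name__)


def installed_version(package):
    """Return the installed version string of `package`, or 'unknown'."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return "unknown"


def log_library_version(package="pypowervm"):
    """Log which version of `package` this process loaded, and return it."""
    version = installed_version(package)
    LOG.info("Using %s version %s", package, version)
    return version
```

Calling log_library_version() once at process start would leave a line in the compute log showing which pypowervm build was picked up.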
14:34:34 <thorst_> could we just log the version at the end of the CI job?
14:34:37 <adreznec> Yeah
14:34:38 <thorst_> and call it a day there?
14:34:56 <adreznec> I wonder if we should stop using pbr for pypowervm though
14:35:03 <adreznec> pbr only really works well with semver
14:35:04 <efried> Well, I wanted to have a way to be sure the compute process was started with the correct version.
14:35:05 <adreznec> As you can see there
14:35:17 <adreznec> Since the version probably isn't really 1.0.0.dev4
14:35:31 <adreznec> but 1.0.0.4 or 1.0.0.5
14:35:48 <esberglu> We would have to log it before it gets patched
14:36:01 <efried> wouldn't think so
14:36:04 <esberglu> Once it's patched the version becomes 1.0.1devxxx
14:36:11 <esberglu> I'm pretty sure
14:38:02 <efried> If this is going to be more than a five-minute thing, then never mind; but it would be useful in the long run.
14:38:31 <efried> For right now, like I say, I would be okay moving forward even if the compute process is still using develop.
14:39:37 <efried> I got lost.  What's the next step here?  Seeing how the local run goes, and then nailing down the scp thing?
14:40:00 <esberglu> Yep
14:40:18 <esberglu> #action esberglu: Finish manual in tree run and update with results
14:40:30 <esberglu> #action esberglu: Figure out why logs aren't being copied
14:41:01 <efried> esberglu: would having another body help move this along any faster, or are we bottlenecked?
14:41:26 <efried> I would be volunteering someone like adreznec who knows this stuff ;-)
14:41:27 <esberglu> If someone wants to help with the SCP thing. Nothing to do for the manual run but wait
14:41:44 <efried> k.  thorst_ is that in your wheelhouse?
14:42:48 <thorst_> huh?
14:43:10 <efried> Do you have the expertise and bandwidth to help esberglu figure out this SCP boggle?
14:43:17 <thorst_> ooo, I do not.
14:43:24 <efried> adreznec?
14:43:25 <thorst_> survival mode atm
14:43:27 <efried> Cause I know I don't.
14:43:58 <adreznec> efried: Not sure yet, still bogged down right now
14:44:14 <efried> Okay, if there's nobody with the technical chops, I'd be happy to be a sounding board and additional googler.
14:44:14 <adreznec> Will depend on how things shake out with meetings really
14:44:29 <efried> #action efried to help esberglu with SCP boggle, for whatever that's worth.
14:45:21 <efried> esberglu: is there anything else you can see on the horizon that will need to be addressed?  Something we might be able to get a head start on if we're stuck waiting for whatever?
14:45:57 <efried> The big obvious thing is paring down the test list - but we don't really know where to start with that.  However, setting up the infrastructure to use a whitelist?
14:46:39 <esberglu> I already know how we should do that
14:46:46 <esberglu> This is the conf we use for out of tree
14:46:51 <esberglu> https://github.com/powervm/powervm-ci/blob/master/tempest/os_ci_tempest.conf#L26
14:46:58 <esberglu> We need to make a second conf for in tree
14:47:10 <thorst_> well, it's going to be a whitelist
14:47:13 <esberglu> And then we set the BASE_TEST_REGEX to include all the tests we want
14:47:16 <thorst_> so it's only supposed to be the tests we want
14:47:31 <thorst_> ahh, nm...I see
14:47:32 <esberglu> Yep. The BASE_TEST_REGEX for out of tree includes all the tests
14:47:32 <efried> oh, so it's already whitelisting.  It's just really inclusive.
14:47:35 <adreznec> consensus!
14:47:47 <thorst_> that's rough.
14:48:01 <esberglu> Yeah the "whitelist" for out of tree is all tests, then it gets reduced by the skip_list
14:48:08 <thorst_> I'm going to nope myself out of anything with regex
14:48:11 <efried> So the BASE_TEST_REGEX is going to be a regex with (id|id|id|id.....)
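
The (id|id|id...) shape can be sketched as follows, assuming the whitelist is kept as a plain list of test ID strings and joined into one alternation. The function name and the sample IDs are hypothetical; how BASE_TEST_REGEX is actually consumed is defined by the powervm-ci scripts, not by this sketch.

```python
import re


def build_whitelist_regex(test_ids):
    """Join test IDs into a single alternation regex: (id1|id2|...)."""
    if not test_ids:
        raise ValueError("whitelist must contain at least one test ID")
    # Escape each ID so characters such as '-' or '.' match literally.
    return "(" + "|".join(re.escape(test_id) for test_id in test_ids) + ")"


# Hypothetical IDs, for illustration only.
sample_ids = [
    "id-11111111-1111-1111-1111-111111111111",
    "id-22222222-2222-2222-2222-222222222222",
]
sample_regex = build_whitelist_regex(sample_ids)
```

Keeping the IDs in a list and generating the regex keeps the whitelist reviewable one test per line, even if the resulting pattern is long.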
14:48:14 <thorst_> I find regex to be an awful creation
14:48:35 <efried> ahh, thorst_, you don't understand the beauty of regex.
14:48:40 <efried> <patpat>
14:48:51 <efried> Sokay, I'm your regex guy.
14:49:04 <thorst_> efried: you are correct, I find it flawed and awful
14:49:11 <thorst_> but that's my definition of 'beauty'
14:49:15 <thorst_> can't be awful
14:49:24 <esberglu> efried: It would be easier to use test names, then we could include groups of tests with one regex. But it was recommended to use IDs before
14:49:53 <efried> Yeah, however, I don't know that we really want to handle the whitelist with a regex like that.  There's probably another (better) way to do it.
14:50:22 <efried> So - let me take another look at the os_ci_tempest.sh and see what I can figure out.  Unless esberglu has already done that?
14:50:50 <efried> I forget, which project holds the real one of those?  neo-os-ci or powervm-ci?
14:50:56 <esberglu> powervm-ci
14:52:16 <efried> #action efried to investigate whitelisting
14:53:21 <adreznec> All right
14:53:29 <adreznec> Anything else on in-tree CI?
14:53:56 <adreznec> Ok, I know thorst_ had discussion on out-of-tree
14:54:06 <adreznec> #topic Out-of-tree driver discussion
14:54:19 <thorst_> yeah, so our oot CI is kinda flaking out again
14:54:27 <thorst_> I'm seeing several patches failing...
14:54:33 <thorst_> I've at least root caused one of em.
14:54:34 <thorst_> http://pastebin.com/uN5aB1Kk
14:54:44 <thorst_> that's the error.  Basically it is a non-force immediate power off.
14:54:48 <thorst_> and it's just hanging
14:55:10 <thorst_> I think we've hit this a few times now...so we should get it fixed.
14:55:19 <thorst_> I opened a bug a long time ago around this
14:55:22 <thorst_> https://bugs.launchpad.net/nova-powervm/+bug/1562117
14:55:22 <openstack> Launchpad bug 1562117 in nova-powervm "power-off times are not adhered to" [Low,New] - Assigned to Lauren Taylor (lmtaylor)
14:55:35 <thorst_> I think we need some attention put on it now.  Anyone have cycles to explore that fix?
14:56:02 <thorst_> I can also check with lmtaylor on it, but she's had it for a while and hasn't updated it recently.
14:56:20 <efried> squeaky wheel
14:56:47 <thorst_> alright.
14:56:53 <thorst_> well, I'll work on it as I free up
14:56:58 <thorst_> but it's impacting CI.  Sigh
14:57:06 <thorst_> that was about it
14:57:18 <thorst_> I suspect we'll argue in the review
14:57:31 <thorst_> so awareness now, this one will be a weird review...so pay attention to it.
14:58:06 <efried> thorst_: is the problem that you think we should be timing out faster?
15:00:18 <thorst_> OpenStack gives us values for timeout and retry
15:00:24 <thorst_> what those values mean...is open to interpretation
15:00:32 <thorst_> does a 0 mean immediately or wait forevs
15:00:42 <thorst_> I interpret it as 'immediate'
15:00:51 <thorst_> :-)
15:02:16 <efried> I see.
15:02:31 <efried> Should prolly go look at how libvirt et al interpret those values.
15:03:24 <efried> libvirt agrees with thorst_
15:03:37 <thorst_> well then, it's easy
15:03:38 <efried> timeout != 0 => "gracefully"
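
The interpretation discussed above (a timeout of 0 means power off immediately, anything else means try a graceful shutdown first) can be sketched like this. The two callables are hypothetical stand-ins for whatever the driver uses; this is not the nova-powervm or libvirt implementation.

```python
def power_off(instance, timeout, graceful_shutdown, hard_power_off):
    """Power off `instance`, treating timeout == 0 as 'immediate'.

    graceful_shutdown(instance, timeout) is a hypothetical callable that
    returns True if the guest shut down within `timeout` seconds.
    hard_power_off(instance) is a hypothetical callable that forces it off.
    """
    # Only attempt a graceful shutdown when a non-zero timeout was given.
    if timeout and graceful_shutdown(instance, timeout):
        return "graceful"
    # timeout == 0, or the graceful attempt did not finish in time.
    hard_power_off(instance)
    return "hard"
```

With this shape a zero timeout never waits on the guest, which avoids the hang seen in the CI failure, while a non-zero timeout still falls back to a hard power off once it expires.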
15:03:42 <thorst_> anyway, I'll get on it
15:03:53 <thorst_> soon enough, cause it's blocking my other changes (kinda)
15:04:17 <thorst_> that's all I had for OOT.  Big thanks for reviews on the fileio thing
15:04:25 <thorst_> not sure if we can get that into ocata...would've been nice
15:04:26 <efried> k, if you find you don't have time, I can take over.
15:04:45 <efried> #action thorst_ https://bugs.launchpad.net/nova-powervm/+bug/1562117 - efried to help if needed.
15:04:45 <openstack> Launchpad bug 1562117 in nova-powervm "power-off times are not adhered to" [Low,New] - Assigned to Lauren Taylor (lmtaylor)
15:04:55 <adreznec> Ok
15:04:59 <adreznec> So I know we're over
15:05:08 <adreznec> #topic Open floor
15:05:10 <adreznec> Anything else?
15:05:22 * thorst_ dances on open floor
15:05:27 <efried> not from me, but esberglu stick around so we can talk about the whitelist.
15:05:34 <adreznec> Okay
15:05:40 <adreznec> #endmeeting