14:06:23 #startmeeting PowerVM Driver Meeting
14:06:23 Meeting started Tue Nov 15 14:06:23 2016 UTC and is due to finish in 60 minutes. The chair is adreznec. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:06:24 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:06:27 The meeting name has been set to 'powervm_driver_meeting'
14:06:34 Helps if I use the right command
14:06:52 #topic Current status
14:07:19 All right, so let's start with the upstream driver related topic
14:07:44 Now that the blueprint is approved/spec merged, efried1 has taken the WIP off all of his change sets
14:08:10 wooo
14:08:29 Our global requirements patch for pypowervm has a +1 from Jenkins, but needs responses to the requirements team questions
14:08:30 and our very sensible version number is going splendidly
14:08:36 Yes
14:08:44 We'll call it semver++
14:08:50 One + for each additional digit
14:09:00 For the record, I tried to produce the upper-constraints patch using the process outlined in the README. Never got it to work.
14:09:03 * adreznec grumbles off in the background
14:09:09 Hence typo
14:09:31 Didn't figure it was rocket science. And it's not. I'm just a doofus.
14:09:49 I'll take care of that questionnaire this morning.
14:10:00 Was an easy fix anyway, and it means our very excellent versioning scheme can stay intact
14:10:01 #action efried to update requirements change set per comments.
14:10:19 #action efried1 to update requirements change set per comments.
14:10:30 (I guess I'm efried1 since the last time my laptop crashed)
14:10:38 I can't wait for efried2
14:10:48 Me neither.
14:10:49 What's next?
14:10:59 Do we need action on any of the other patches yet?
14:11:00 CI stability?
14:11:12 oh, hold on,
14:11:21 let's talk a little bit about Tony's comment.
14:11:38 Do we respond to it in the change set (and attempt to explain the IBM legal reasons we don't use semver)?
14:12:12 I'm... not sure it matters either way?
14:12:18 Or maybe something flippant like, "yeah, we don't like it either, but there's a good (if boring) reason for it"
14:12:21 or just ignore it.
14:12:29 I mean, it's definitely awful
14:12:43 But I'm not sure it's actually a big deal to anyone at the end of the day
14:12:55 Besides me
14:13:00 I'm just talking about it for the sake of responding to Tony's comment.
14:13:54 I don't think it really matters as long as we can answer all the other questions
14:13:59 Roger that.
14:14:05 From the questionnaire
14:14:06 yeah, I'm not sure it's a big deal. I'm OK saying 'we don't really like it either'...but I'm not sure we need to make it a big deal
14:14:10 Right
14:14:11 I think it's a bigger deal to us than to them
14:14:35 The other thing on current patch sets: do we want to preserve the nova-powervm versioning history on the live migrate data object, or start it at 1.0 in nova proper?
14:14:42 We can just say Watson designated the version
14:14:44 And leave it at that
14:14:47 :P
14:14:48 (see comment: https://review.openstack.org/#/c/391284/3/nova/objects/migrate_data.py@301
14:14:49 )
14:15:35 efried1: I can respond there if you'd like
14:15:46 I'm sure they don't know the history of us being out of tree
14:15:47 but that's why
14:16:05 So we do want to preserve our out-of-tree version history.
14:16:41 it simplifies the transition
14:16:53 Right, for people currently using the out-of-tree driver.
14:16:54 for operators
14:16:55 To an extent, yeah
14:17:10 I guess it depends on if we ever want to support anyone migrating from the out-of-tree driver to the in-tree driver
14:17:26 Heck, maybe we should even bump it up to 1.2
14:17:32 or 1.1.0.0.9
14:17:45 +2
14:18:38 I guess we can respond with a comment saying we'd like to preserve it for ops
14:18:52 And if they push back we can talk about the implications of that
14:19:28 Seem reasonable, thorst, efried1?
14:19:46 yep
14:19:55 So
14:19:55 #action thorst to respond to https://review.openstack.org/#/c/391284/3/nova/objects/migrate_data.py@301
14:19:55 ?
14:20:03 So it would seem
14:20:22 All right
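[Editor's note: for context on the versioning question above, a minimal sketch of how a Nova versioned object pins its version, assuming the standard NovaObject pattern. The class name, field, and version values are illustrative only, not the actual contents of migrate_data.py or of the review under discussion.]

    # Illustrative sketch only -- not the real PowerVM migrate-data object.
    from nova.objects import base as obj_base
    from nova.objects import fields


    @obj_base.NovaObjectRegistry.register
    class PowerVMLiveMigrateData(obj_base.NovaObject):
        # Starting fresh in nova proper would mean VERSION = '1.0';
        # preserving the nova-powervm (out-of-tree) history would mean
        # carrying its last version number forward instead.
        VERSION = '1.0'

        fields = {
            # Hypothetical field, just to make the sketch complete.
            'dest_proc_compat': fields.StringField(nullable=True),
        }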
14:20:25 Okay. The last thing would be socializing the driver skeleton change set.
14:20:37 I'm still not clear how that's supposed to happen normally.
14:21:04 Guess we're not in a super hurry to get that merged until we have CI stable. Next #topic?
14:21:39 So I think we need to talk with the Nova cores about whether we should be on that subteam priority reviews list now
14:21:57 sorry - afk for a bit.
14:22:07 adreznec, I don't know what that is.
14:22:14 and list our 'top 5' reviews there like the other driver teams are
14:22:20 Let me dig that up quick
14:23:05 Like this: https://etherpad.openstack.org/p/ocata-nova-priorities-tracking
14:24:08 Oh, for sure.
14:24:10 Basically whether or not we should have a PowerVM Subteam section on there where we start listing these reviews
14:24:29 We should bring this up in the next Nova meeting then
14:24:39 Sounds like a good plan.
14:25:01 #action adreznec thorst and efried1 to bring up PowerVM subteam/reviews in the next Nova meeting
14:25:12 Okay
14:25:26 #topic Priority reviews
14:25:44 Do we have anything open that needs code review priority?
14:26:01 There's nothing on my radar in the community.
14:26:10 FYI for the team - until https://review.openstack.org/#/c/396934/ merges, multi-architecture deploys are broken
14:26:20 But that already has +2/+1s this morning
14:26:25 So just awareness
14:26:41 All right
14:26:50 Sounds like nothing high priority
14:27:05 #topic Priority bugs/defects
14:27:24 Same thing here - any key issues people want to direct attention to?
14:27:48 Other than the CI, nothing I'm aware of. thorst may have something.
14:28:24 Can we move on to CI? That could be a substantial discussion.
14:28:27 Yep
14:28:31 Ok
14:28:33 #topic CI discussion
14:28:43 esberglu: efried1, it's all yours
14:28:55 esberglu, what's the latest?
14:29:09 So amartey thinks that upgrading novalink *may* fix the LUs hanging around after the runs
14:29:24 #action esberglu update novalink in CI env
14:29:31 As far as the actual runs
14:29:44 Upgrading it to what?
14:29:54 Latest 1.0.0.4? Latest dev?
14:30:25 however, I found 183 cannot retrieve VM info
14:30:41 Sorry I got disconnected. Latest 1.0.0.4
14:30:52 Ok
14:31:16 Well I guess we can give it a shot
14:31:16 I put the latest versions of the patches from efried on yesterday afternoon
14:31:29 But still not working
14:31:43 I let one run through last night
14:31:47 here are the logs
14:31:48 http://184.172.12.213/15/328315/19/check/nova-powervm-pvm-dsvm-tempest-full/9564270/
14:32:00 Still not finding any valid hosts for many of the tests
14:32:10 I haven't dug into the logs yet from that run
14:33:50 #action efried1 esberglu to analyze http://184.172.12.213/15/328315/19/check/nova-powervm-pvm-dsvm-tempest-full/9564270/
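[Editor's note: a minimal sketch of the kind of first-pass triage meant by the action above: scanning a downloaded tempest/devstack log for "no valid host" scheduler failures. It uses only the Python standard library; the log file passed on the command line is whatever you pull from the CI log server, and the patterns are the usual Nova scheduler messages, not anything specific to this CI.]

    #!/usr/bin/env python
    """Count 'No valid host' style failures in a downloaded CI log file."""
    import collections
    import re
    import sys

    # NoValidHost is the exception Nova raises when no compute host passes
    # the scheduler filters; tempest failures usually quote the message
    # "No valid host was found".
    PATTERN = re.compile(r'No valid host was found|NoValidHost')


    def scan(path):
        hits = collections.Counter()
        with open(path, errors='replace') as log:
            for line in log:
                if PATTERN.search(line):
                    # Collapse duplicates so the summary stays readable.
                    hits[line.strip()[:120]] += 1
        return hits


    if __name__ == '__main__':
        for message, count in scan(sys.argv[1]).most_common(10):
            print('%5d  %s' % (count, message))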
14:34:10 We may be at a point where we should put all hands on deck until the CI is stabilized.
14:34:56 Throwing more people at it doesn't help the runs themselves finish faster. But having all eyes on the results right away...
14:35:45 Yeah, I know people are getting pulled in a number of directions right now
14:35:52 Maybe even just a one day sprint
14:36:05 Focused on CI improvements could help
14:36:45 Unfortunately the holiday season is starting to creep up on us
14:36:54 So finding availability could be tricky
14:37:38 Yeah, I'm out starting Friday through all of Thanksgiving week. No internet Friday to Wednesday
14:38:33 All right. I guess we'll have to table that discussion for now and discuss it outside the meeting
14:38:53 So what else do we have on CI?
14:39:11 Seems service nova-compute on host powervm-ci-powervm-devstacked-25844 is down. Last heartbeat was 2016-11-15 01:54:39. Elapsed time is 126.14957
14:39:19 Looks like the compute service died at some point in there.
14:39:49 btw - I 100% agree that we need an all hands on deck for CI
14:39:51 Compute log has no interrupt or anything - it just stops.
14:39:52 Well that would certainly explain issues finding a host
14:40:02 does it just stop because we hit 5 hours?
14:40:05 and zuul just shut it down
14:40:13 No, the run went through in like 40 minutes
14:40:21 It went super fast because it couldn't find any hosts
14:40:36 oh
14:40:37 huh
14:40:38 2016-11-14 19:54:43.560 is the last timestamp in the compute log.
14:41:40 The 5 hour runs happen when it times out waiting for hosts to become active (which we are past at this point, I believe)
14:42:26 So should I let another run through? Maybe the host dying was just a one-off thing?
14:42:58 esberglu: the 5 hour thing happens when we wait for the marker LU
14:43:04 which we *may* be past, hopefully
14:43:12 just needed an uber clean of the env
14:43:26 Yeah, and efried's latest patch
14:43:43 right.
14:43:50 we also need the novalinks updated
14:43:55 did you see that from apearson?
14:44:00 Yep
14:44:05 already did an action for it
14:44:08 so are those the immediate next steps?
14:44:17 Yep
14:44:24 Already had the novalink upgrade as an action
14:44:50 then all hands on deck for the next run to debug
14:45:15 #action esberglu to try another CI run with latest patch and report results
14:45:37 Ok
14:45:46 Anything else on the CI front?
14:45:58 I think wangqwsh was hitting some OSA CI stuff?
14:46:04 yes,
14:46:32 the memory was extended to 12G, so galera can work.
14:46:44 but it fails to boot a VM.
14:46:57 I am trying to understand Drew's mail
14:48:04 I don't know how to set up the network on neo14...
14:48:30 Ah, this was the OVS network issues email
14:49:42 yes.
14:50:08 I kind of want to wait on the OVS bits until we have CI stabilized
14:50:15 we need our core CI working first...
14:50:17 thorst probably has a more detailed description than this, but basically where we're at now (IIRC) is that we need to create a PHYP vswitch for each tempest VM running OSA, then put the OVS trunk on that
14:50:18 Right
14:50:29 adreznec is right.
14:51:04 thorst: yeah, I agree with you
14:51:15 ok
14:51:28 we should probably put the OSA CI work on temporary hold while we all debug the main CI issues
14:52:18 thorst: I will note that on the patch Kyle put up to fix the OSA multi-arch stuff that got broken, Jesse did comment about investigating a multi-arch CI test
14:52:33 Nothing we need to do now
14:52:40 But something for the back burner
14:53:28 So, CI stability is going to take priority
14:53:48 Do we have anything else to discuss here? Or are we just on deck for the next set of results?
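[Editor's note: the "service nova-compute ... is down" report above is Nova's liveness check at work: a compute service is treated as dead once the time since its last status report exceeds service_down_time (60 seconds by default), and the scheduler then skips that host, which lines up with the "no valid hosts" failures. Below is a minimal sketch of that check using the numbers quoted above; the function name is illustrative, not Nova's actual API.]

    from datetime import datetime, timedelta

    SERVICE_DOWN_TIME = 60  # Nova's default service_down_time, in seconds


    def service_is_up(last_heartbeat, now):
        """Roughly mirror Nova's DB servicegroup liveness check."""
        elapsed = (now - last_heartbeat).total_seconds()
        return elapsed <= SERVICE_DOWN_TIME, elapsed


    # Numbers from the CI report quoted earlier: last heartbeat 01:54:39,
    # elapsed ~126 s, so the host is considered down and gets filtered out.
    last = datetime(2016, 11, 15, 1, 54, 39)
    up, elapsed = service_is_up(last, last + timedelta(seconds=126.14957))
    print(up, elapsed)  # False 126.14957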
14:54:37 Ok, well if there's nothing else I'm going to call it
14:54:43 Thanks for joining everyone
14:54:58 #endmeeting