13:02:57 #startmeeting powervm_driver_meeting
13:02:58 Meeting started Tue Jun 6 13:02:57 2017 UTC and is due to finish in 60 minutes. The chair is esberglu. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:03:00 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:03:03 The meeting name has been set to 'powervm_driver_meeting'
13:03:06 o/
13:03:38 #topic In Tree Driver
13:03:42 #link https://etherpad.openstack.org/p/powervm-in-tree-todos
13:04:03 o/
13:04:20 o/
13:04:33 "Fixing" the get_info business led all the way down the rabbit hole.
13:04:47 https://review.openstack.org/471146
13:05:12 In the end, mriedem said we should just remove all the unused fields from InstanceInfo, everywhere.
13:06:09 This will impact the OOT driver if/when it merges.
13:06:48 k
13:07:35 that's for instances... were we also talking about host stats the other day? Are there unused fields to remove there as well?
13:07:50 efried: you should probably run that by mdrabe.
13:07:53 We weren't talking about host stats.
13:07:59 k
13:08:07 I suspect pvc would be impacted (and I bet other OS products)
13:08:31 thorst Yeah, I was thinking it will probably be a good idea to blast the dev ML on this one.
13:08:53 yep. But just ping mdrabe on the side too. I'm not sure how much they view the ML
13:08:58 I know I can't (don't) keep up
13:09:05 +1
13:09:19 esberglu Depending how mriedem's comment plays out, this might impact your support matrix change.
13:09:30 https://review.openstack.org/#/c/471146/2/doc/source/support-matrix.ini@249
13:09:38 Not sure if he's gonna ask to remove that whole section.
13:09:43 efried: ack
13:10:15 I think that's it for me in tree.
13:10:26 I wanted to ask an IT question
13:10:56 floor is yours
13:10:58 so when we were looking at the support matrix it clicked for me that our SSP support that merged IT is only ephemeral
13:11:30 when we've talked about 2H17 priorities we've talked about network, config_drive, and vSCSI
13:11:51 is vSCSI there ephemeral or data or both?
13:12:02 and is vSCSI the top priority for data disk attach/detach, not SSP?
13:12:05 I don't remember a vSCSI discussion. iSCSI maybe?
13:12:31 cinder
13:12:31 thorst had said vSCSI
13:12:37 Do we want vSCSI IT?
13:12:39 thorst said cinder (via vSCSI)
13:12:45 (o/ btw)
13:12:56 vSCSI is simply a way to connect storage to a VM
13:12:59 mdrabe read up, there was something for you above
13:13:00 Gotcha. So the VSCSIVolumeAdapter.
13:13:07 when we talk about it in terms of Cinder, we typically mean FC volumes to a VM
13:13:10 ack
13:13:15 in fact in PVC, we simplified vSCSI to just mean that
13:13:24 but vSCSI is used for SSP, iSCSI, FC PV, etc...
13:13:45 so I probably used the wrong language there
13:13:53 I meant cinder support via vSCSI
13:14:05 Really, for FC? I thought we had a fibre channel mapping that was different from a VSCSI mapping.
13:14:16 FC also has this fancy NPIV support
13:14:16 Anyway, separate discussion.
13:14:38 which is an SR-IOV-like thing for FC... though, yeah, separate discussion
13:14:47 Point is, we're looking to support the VSCSIVolumeAdapter in tree.
13:14:53 +1
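[Editor's aside: For context on the 13:05 point about trimming InstanceInfo, a minimal sketch of the direction that review is heading -- not the actual patch under review (471146). It assumes the compute manager only consults the power state, so get_info() can drop the memory/CPU fields; the _get_lpar_state() helper is hypothetical.]

    from nova.compute import power_state
    from nova.virt import hardware


    class PowerVMDriverSketch(object):
        def get_info(self, instance):
            """Return only what the compute manager actually consults."""
            return hardware.InstanceInfo(state=self._get_lpar_state(instance))

        def _get_lpar_state(self, instance):
            # Hypothetical helper: look up the LPAR backing `instance` via the
            # PowerVM REST API and map its state to a power_state constant.
            return power_state.NOSTATE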
13:15:07 thorst, in terms of the support matrix, what should we be trying to flip to partial/complete among the storage.block items?
13:15:08 https://github.com/openstack/nova/blob/master/doc/source/support-matrix.ini#L945
13:15:15 https://github.com/openstack/nova/blob/master/doc/source/support-matrix.ini#L972
13:15:23 https://github.com/openstack/nova/blob/master/doc/source/support-matrix.ini#L993
13:15:25 etc.
13:16:18 945 - partial, 972 - complete (though we can add NPIV later), 993 - missing (for now? If we can tuck in awesome)
13:16:21 I think you're saying L972 via cinder vSCSI
13:16:30 k
13:16:38 reality is that today, everyone is FC. So that's the hole we should fill first for IT.
13:16:55 what about cinder via SSP?
13:17:04 no cinder driver for SSP
13:17:11 oh, really
13:17:13 everyone is FC? you mean Power folks?
13:17:14 we talked about making one...but it never came to fruition
13:17:20 https://review.openstack.org/#/c/372254/
13:17:21 PowerVM - everyone is FC
13:17:23 Still open ;-)
13:17:25 rest of world...not so much
13:17:35 Last action in January
13:17:40 efried: yeah...
13:17:52 we were hoping that would then allow us to make a cinder driver
13:17:57 I think people got pulled in other directions
13:18:08 like iSCSI...and my other crazy volume connectors
13:18:21 thorst so when PowerVC uses SSP for data volumes... how is it doing that without a cinder driver?
13:18:40 edmondsw: they have a cinder driver, but it isn't upstreamed yet
13:18:46 ah
13:19:34 Anything else IT?
13:19:37 I've a question
13:19:42 Real quick, back to the get_info discussion, this also came out of it: https://review.openstack.org/#/c/471106/
13:19:46 trivial
13:19:49 Where does os-brick come into play with volume connectors?
13:20:23 mdrabe: good q... shyama was looking into that. It's a way to replace (I think?) the connection_info object (bdm)
13:20:28 not super sure
13:20:32 I've been working on deactivating the compute service when we can't get a pvm session or there are no VIOS ready, but not ready to put up for review quite yet
13:21:11 Ok yea can discuss later
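[Editor's aside: On the 13:19 os-brick question -- a rough sketch of the generic os-brick attach flow, for orientation only. How (or whether) this maps onto the PowerVM/VIOS vSCSI path was left for a follow-up discussion; the function names and the 'sudo' root helper are illustrative, not what nova-powervm actually does.]

    from os_brick.initiator import connector

    ROOT_HELPER = 'sudo'  # real deployments use rootwrap


    def host_connector_properties(my_ip):
        # What the compute host reports to Cinder so the backend can export
        # the volume to it (initiator IQN, FC WWPNs, etc.).
        return connector.get_connector_properties(
            ROOT_HELPER, my_ip, multipath=False, enforce_multipath=False)


    def attach_volume(connection_info):
        # connection_info comes back from Cinder's initialize_connection();
        # os-brick does the host-side plumbing and returns the local device.
        conn = connector.InitiatorConnector.factory(
            connection_info['driver_volume_type'].upper(), ROOT_HELPER,
            use_multipath=False)
        device_info = conn.connect_volume(connection_info['data'])
        return device_info['path']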
13:22:21 Alright moving on
13:22:29 #topic Out Of Tree Driver
13:22:59 Perf improvement change (https://review.openstack.org/469982) - I owe another patch set.
13:23:17 But the testing came back good on that, so once those fixups are in, I think we're good to go there.
13:23:35 Then I plan to look into the "don't need a whole instance for the NVRAM manager" thing.
13:23:48 which could also yield perf improvements... maybe.
13:24:08 Gotta do that quick before arnoldja moves on to bluer pastures.
13:24:15 efried: I don't disagree with what you did
13:24:20 I just feel dirty about it
13:24:24 heh
13:24:28 'let's just wait 15 seconds for everything'
13:24:34 'because this API vomits up events'
13:24:36 Well, anything PartitionState
13:24:58 so I don't disagree... I just think it's bleh
13:25:04 It's always that way with perf improvements.
13:25:11 yep yep yep
13:25:13 Most of the time they make the code uglier.
13:25:20 just letting my voice be heard. :-p
13:25:36 This week is the Pike 2 milestone (Thursday), so I will be tagging the repos accordingly.
13:26:09 efried is there an LP bug for that?
13:26:23 for the perf thing?
13:26:33 yea... I guess it's not technically a bug
13:26:36 https://launchpad.net/bugs/1694784
13:26:37 Launchpad bug 1694784 in nova-powervm "Reduce overhead for redundant PartitionState events" [Undecided,New]
13:27:04 Ah neat thx
13:29:02 This might have been said last week, but the get_inventory thing is on hold pending further baking of the infrastructure.
13:29:12 yep
13:29:12 https://review.openstack.org/468560
13:29:28 They've hit a snag with the design of shared resource providers.
13:29:53 It's going around the ML at the moment. Not sure how that's gonna shake out. An elegant solution is not yet forthcoming.
13:30:17 Subject line, in case you want to follow along at home: [openstack-dev] [nova][scheduler][placement] Allocating Complex Resources
13:30:25 I guess that's an IT/OOT thing.
13:31:08 Oh, I wanted to bring up t9n
13:31:50 I saw an email a couple days ago that may affect our stance on how aggressive we become about removing translations from various places.
13:32:26 t9n?
13:32:30 what does that stand for?
13:32:39 translation
13:32:51 what was the email?
13:32:54 Right now the policy we're following in nova-powervm is just not translating any new log messages, and removing from anything we happen to touch while doing mods.
13:33:05 so the 9 stands for len('ranslatio')
13:33:11 networking-powervm is subject to a hacking rule that disallows *any* translation.
13:33:21 (thorst, yeah, like i18n, etc.)
13:33:26 and k8s ;-)
13:33:37 (got it - finally understand i18n too)
13:33:55 ...and I'm not sure whether we've even talked about a policy for pypowervm.
13:33:58 * thorst feels dumb
13:34:31 efried: well, pypowervm is consumed by more than just OpenStack... we've got two or three other direct users.
13:34:38 I think any change there would need to be run by them
13:35:04 we'd probably want to ask clbush from a CLI perspective too.
13:35:20 I assumed efried was going to talk about nova-powervm, not pypowervm
13:35:39 oh, missed a line
13:35:56 * edmondsw feels dumb, joining thorst
13:36:15 So okay, agree that discussion outside this group is needed for pypowervm.
13:36:20 What about nova-powervm?
13:36:36 I dunno, I'm dragging my feet on that
13:36:48 and I'll admit, it's really because I know pvc likes those messages translated.
13:36:48 It's probably not worth going all out and removing everything.
13:37:04 thorst that's not true... PowerVC doesn't want log messages translated
13:37:13 thorst That's the email I was referring to, yeah.
13:37:18 o, huh
13:37:31 well, then yeah. I'm fine with either being proactive or lazy about it then
13:37:33 thorst PowerVC wants consistency, it just doesn't want to spend the resources to scrub the translations it already has in place
13:37:43 got it.
13:37:47 but a note was sent just a couple days ago about starting to scrub things if/when you can
13:37:53 neat
13:38:07 well, then ... same goes for ceilometer-powervm too
13:38:24 that one is probably easier to do (and probably could benefit from a patch set done against it)
13:38:42 I'd probably prioritize ceilometer-powervm above nova-powervm
13:38:43 Okay, upshot for nova-powervm and ceilometer-powervm is: no need to hold back if you feel like scrubbing out all the log t9n from those guys.
13:38:51 But it's not a high priority.
13:39:00 yep
13:39:10 +1
13:39:38 I added it to the etherpad https://etherpad.openstack.org/p/powervm-in-tree-todos line 69
13:40:01 that it for OOT?
13:40:46 nothing else from me.
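[Editor's aside: What "scrubbing log t9n" (13:32-13:38) looks like in practice -- a self-contained before/after sketch. The _ and _LI markers below are no-op stand-ins for the oslo.i18n helpers each project imports from its own i18n module; the exact import path varies per project.]

    import logging

    LOG = logging.getLogger(__name__)

    # No-op stand-ins so the example runs on its own.
    _ = _LI = lambda msg: msg


    def before(vios_name):
        # Old style: log message wrapped in a translation marker.
        LOG.info(_LI('Establishing vSCSI mapping on VIOS %s'), vios_name)


    def after(vios_name):
        # Scrubbed style: plain log string. Only user-facing exception text
        # keeps the _() wrapper.
        LOG.info('Establishing vSCSI mapping on VIOS %s', vios_name)
        if not vios_name:
            raise ValueError(_('A VIOS name is required.'))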
13:41:11 #topic PowerVM CI
13:41:21 https://etherpad.openstack.org/p/powervm_ci_todos
13:41:42 esberglu Okay, so you moved the CI to-dos out to another etherpad.
13:42:01 efried: Yeah I linked it in the other one
13:42:15 I can move it back if that's what people prefer
13:42:33 But I wanted to track tempest failures there and it was becoming a lot of info
13:42:58 esberglu I'm fine with it as long as everything's cross-linked. I added a backpointer from the CI one to the original.
13:43:10 efried: Good call.
13:43:20 What's the difference between WORKING and CURRENT?
13:43:38 Stuff that I'm actually doing (in staging) vs stuff that's just on the list
13:44:12 We still need to figure out a way to get the VNC tests working
13:44:28 And check what tests (if any) can be enabled with SSP merged
13:44:38 esberglu change "CURRENT" to "NEXT"?
13:44:44 efried that clearer?
13:45:35 Yeah, that would be fine. Not a big thang.
13:46:11 CI has been looking really good since the last couple fixes last week
13:46:29 Which should open up some time to start knocking this list out
13:46:50 I'm gonna go through and prioritize the list today
13:47:28 Been seeing way less of the timeout errors since I upped the time limit. Which to me points to slow rather than hanging.
13:47:43 esberglu can you make looking at the tempest failures part of that prioritized list?
13:47:50 edmondsw: Yeah
13:48:11 esberglu so you did merge that timeout bump?
13:48:53 edmondsw: I thought we were just putting it in temporarily for investigation purposes. But I can
13:49:00 Hope not. We need to have a lively discussion first.
13:49:15 no, I just asked because you're seeing "way less"
13:49:27 edmondsw: I just live-patched the jenkins jobs
13:49:34 if it didn't merge, wouldn't it only be in effect on a one-by-one basis?
13:49:35 oh
13:50:55 Basically, my stance on this is that our CI isn't just testing "does it work... eventually?" It's also there to alert us to what I'll call "performance problems" for lack of a better term.
13:51:23 So if stuff is taking a long time, we need to figure out why it's taking a long time, not just increase the timeout.
13:51:56 efried yep, I think we all agree there
13:51:57 I would even go so far as to say, if we had the space for it, we should be *decreasing* timeouts to highlight things that are taking longer than they ought.
13:52:22 I'll even agree with that... once we get these current timeouts figured out / addressed
13:52:28 yup.
13:53:29 Should I remove the timeout increase now? It will be easier to find failing runs to investigate that way
13:53:48 esberglu When you've got the space to really start digging into them, yes.
13:54:04 Not necessary if it's just going to result in more failures but no action.
13:54:21 +1 or when one of us pings you that we have that time
13:54:35 efried: Ok. I want to do a couple other things first (like get the neo logged) which should help for debugging
13:54:44 fo sho.
13:55:15 esberglu I'm not seeing getting the neo logged on your list
13:55:18 let me know if you need help figuring out how to do that; I have a couple of ideas.
13:55:24 edmondsw: Yeah that list is a WIP
13:55:41 That's all I had for CI
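[Editor's aside: On the timeout discussion (13:47-13:54) -- efried's point is that CI should surface "slow" as well as "hung". A minimal, standalone sketch of a waiter that keeps a hard timeout but also flags spawns that finish suspiciously close to it; this is illustrative only, not code from the CI jobs, and all names and defaults are made up.]

    import logging
    import time

    LOG = logging.getLogger(__name__)


    def wait_for_active(get_status, hard_timeout=300, soft_ratio=0.5,
                        interval=5):
        """Poll get_status() until it returns 'ACTIVE' or hard_timeout expires.

        Raising hard_timeout just hides slowness; the soft threshold keeps
        slow-but-passing runs visible so they can be investigated.
        """
        start = time.time()
        while True:
            elapsed = time.time() - start
            if get_status() == 'ACTIVE':
                if elapsed > hard_timeout * soft_ratio:
                    LOG.warning('Went ACTIVE in %.0fs -- under the %ds limit, '
                                'but slow enough to warrant a look.',
                                elapsed, hard_timeout)
                return elapsed
            if elapsed > hard_timeout:
                raise TimeoutError('Not ACTIVE after %ds' % hard_timeout)
            time.sleep(interval)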
13:55:56 #topic Driver Testing
13:56:02 Any progress here?
13:56:30 We don't have testers on. But thorst_afk added https://etherpad.openstack.org/p/powervm-in-tree-todos starting line 92
13:56:46 ...pursuant to our call the other day.
13:56:49 efried: we're lining up the test resources still. I don't think any tangible change, just formulating plan
13:57:50 Any discussion needed here? Otherwise I'll move on, running close to time
13:58:10 don't think so
13:58:14 #topic Open Discussion
13:58:21 Any final thoughts before I call it?
13:58:43 It's really confusing in my HexChat interface that esberglu and edmondsw both start with 'e' and have the same number of letters.
13:58:56 My old IRC client had different colors for each user. Haven't figured out how to do that in HexChat.
13:59:03 efried: bringing the real problems to light
13:59:08 :)
13:59:09 You can count on me.
13:59:26 I'd make a quip... but yeah, we do count on you
13:59:59 I got a q actually
14:00:09 alright, I need to bail. Need to go spread the gospel of Open vSwitch
14:00:20 For test, what's the desired deployment route, devstack or OSA?
14:00:34 mdrabe: for now, devstack due to simplicity of setup
14:00:45 which is not all that simple, until you compare to OSA.
14:01:02 Hah, ironic considering OSA is supposed to be the thing that makes it simple.
14:01:17 efried: OSA is the thing to make OpenStack production grade
14:01:17 Would this be an opportunity to iron out the OSA path then?
14:01:22 Yeah
14:01:34 Sorry, thorst_afk Yeah. mdrabe No.
14:01:38 mdrabe: kinda. Let's chat more when I'm off the phone
14:02:10 I'd say why not, but sounds like it's complicated
14:05:10 esberglu Think we're done here.
14:05:31 #endmeeting