14:01:01 #startmeeting TripleO Edge Squad Meeting
14:01:02 Meeting started Thu Sep 20 14:01:01 2018 UTC and is due to finish in 60 minutes. The chair is slagle. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:01:03 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:01:04 ykarel: Let's disable the tempest test for neutron and telemetry, testing RDO job for telemetry is still queued
14:01:06 The meeting name has been set to 'tripleo_edge_squad_meeting'
14:01:10 ykarel: It's more important to ensure promotions
14:01:15 ping slagle, emilien, csatari, jaosorior, owalsh, fultonj, gfidente, hjensas, jtomasek, thrash, bogdando
14:01:18 o/
14:01:21 o/
14:01:22 o/
14:01:25 o/
14:01:27 #info remove or update your nick from the Meeting Template on the etherpad if you want (or don't want) to be ping'd for the start of the meeting
14:01:31 o/
14:01:32 o/
14:02:07 #link https://etherpad.openstack.org/p/tripleo-edge-squad-status
14:02:13 Anyone can use the #link, #action, #help, #idea and #info commands, not just the moderator!
14:02:21 #topic Agenda
14:02:27 * One off items
14:02:27 * Ongoing Work
14:02:27 * New Work/Specs
14:02:27 * Review Requests
14:02:28 * Goals for the week
14:02:55 hi everyone
14:03:10 i put together an arbitrary agenda, but we can adjust it if needed
14:03:25 #topic One off items
14:03:59 i thought it could be useful to start with some goals of this squad
14:04:12 Highlight what we're working on to increase visibility
14:04:13 Ask for or offer help to increase collaboration
14:04:14 Align our efforts around Edge focused priorities
14:04:15 o/
14:04:21 that's what I came up with. any other thoughts?
14:04:31 sounds good to me
14:04:56 yea it seemed like we have a lot of different projects wanting to work on tripleo edge related stuff
14:05:09 it would be good to have some cross communication and alignment
14:05:53 o/
14:06:08 slagle: sounds good
14:06:13 alright, do we have any followup discussions from the PTG?
14:06:17 #link https://etherpad.openstack.org/p/tripleo-ptg-stein-edge
14:06:22 we should be clear about the use cases we want to target and omit
14:06:35 there are too many use cases when folks say Edge
14:06:55 then we could align our efforts around Edge focused priorities upstream
14:07:00 bogdando: what use cases do you have in mind?
14:07:02 bogdando: yes, i think the "telco 5g" usecase as identified by the edge working group is the usecase
14:07:39 bogdando: if there are advocates for other usecases they can come forward, but it seems at PTG we converged on a specific architecture
14:08:12 fultonj: that's great. So could we probably set out the tripleo use cases in a spec?
14:08:18 a link handy?
14:08:22 said usecase has an architecture w/ diagrams
14:08:51 split control plane spec has a diagram and we need to make it more specific
14:09:21 * fultonj looks for link
14:09:54 fultonj: is that what you're referring to? https://www.dropbox.com/s/255x1cao14taer3/MVP-Architecture_edge-computing_PTG.pptx?dl=0
14:10:00 slagle: yes
14:10:03 that was sent to edge-computing@openstack.org
14:10:13 #link https://www.dropbox.com/s/255x1cao14taer3/MVP-Architecture_edge-computing_PTG.pptx?dl=0
14:10:17 URGENT TRIPLEO TASKS NEED ATTENTION
14:10:18 https://bugs.launchpad.net/tripleo/+bug/1792560
14:10:18 Launchpad bug 1792560 in tripleo "Upgrades in CI still using Q->master instead of R->master release" [Critical,Triaged] - Assigned to Jiří Stránský (jistr)
14:10:28 fultonj slagle but the tripleo case is probably closer to the diagram we had in the spec
14:10:34 we have a variation w/ slightly diff terms
14:10:38 fultonj slagle though that probably needs some updates as well
14:10:54 ok, so can we take an action here to refine the tripleo split-controlplane use case and see how it lines up with that^^?
14:10:54 https://specs.openstack.org/openstack/tripleo-specs/specs/rocky/split-controlplane.html
14:11:08 slagle yeah I can take the item
14:11:14 slagle: yes, i'll help on it too
14:11:17 see where we differ, and if we should adjust at all?
14:11:25 gfidente: how about a new diagram w/ comments on diffs
14:11:38 e.g. far-edge is more like our edge
14:11:42 fultonj now you're signing up for too much work :D
14:12:01 fultonj but yeah far-edge there matches more or less our edge, agreed
14:12:12 would be nice to have an updated diagram for this same meeting next week
14:12:16 +1
14:12:18 bogdando: would you like to be involved with this? does it match what you're asking for?
14:12:34 slagle: seems so
14:13:00 quiquell|rover, re. https://bugs.launchpad.net/tripleo/+bug/1793293 - where is that failing exactly?
14:13:00 Launchpad bug 1793293 in tripleo "pike promotion failing trying to use IPv6 to get buildlogs" [Critical,Triaged] - Assigned to Quique Llorente (quiquell)
14:13:04 i have proposed milestones starting on line 78 of https://etherpad.openstack.org/p/tripleo-ptg-stein-edge
14:13:05 bogdando: ok, cool, since you weren't at the PTG, it'd be good for you to be involved, asking ?'s that we might be overlooking
14:13:11 see comment (original from PaulB)
14:13:31 repourl should be using mirror
14:13:37 basically we have a few more RFEs to align
14:13:43 by all means suggest changes to milestones, i just wanted to start with something
14:13:43 not only control plane split
14:13:45 #action bogdando fultonj gfidente refine tripleo split-controlplane use case and see how it lines up with proposed Edge architecture from PTG
14:13:52 it's mostly related to plans management
14:14:07 oops meeting
14:14:12 so we need to map these onto that upstream architecture picture
14:14:19 apevec: pike's fs019 periodic job at RDO
14:14:23 bogdando: right, we don't have specs for many of those things. they are ideas at this point
14:14:26 and see the caveats and limitations for tripleo
14:14:34 apevec: at repo setup, didn't have time to find why it's not using IPv4
14:14:55 slagle: ack, wrt being involved
14:15:42 #topic Ongoing Work
14:15:58 besides split-controlplane, is there other related work that folks would like to highlight?
14:17:06 any undercloud work?
14:17:15 i'll mention there was some talk at the PTG about creating a squad around metalsmith integration
14:17:19 dtantsur: oh hi :)
14:17:26 :)
14:17:57 I'm curious about potential distributed undercloud (and ironic issues around it)
14:18:52 that's what I meant talking about use case
14:19:09 dtantsur: it's a possibility. what we talked about for a first pass with split-controlplane was to assume you can provision anything as we do today, or rely on deployed-server
14:19:13 like having an AIO undercloud managing edges, or having it 1:1 + 1 central, or distributed...
14:19:25 do I understand correctly we're talking about decoupling not only the heat stacks but deploying from different underclouds?
14:19:35 we'd probably start with the undercloud as we have it, but I guess a centralized undercloud is not where we really want to go
14:19:54 probably not
14:20:07 dtantsur: maybe, just not sure yet how this aligns with the upstream Edge group architectures aforementioned
14:20:25 so we need to be clear on the tripleo scope
14:20:28 bogdando: they don't talk much about the way to deploy things, do they?
14:20:30 as we want it
14:20:30 there are both ideas around a distributed undercloud and a more portable distributed deployment via config-download export
14:20:40 AFAICT split control plane and distributed UC are orthogonal in that i can deploy what's in the diagram using already-deployed-servers or w/ distributed uc so ironic can stretch to the edge
14:20:50 right
14:20:57 they will work well together but neither should block progress of the other
14:21:02 fultonj: yes exactly, for step 0, we should focus on what we can do today already
14:21:09 but since I cannot help with things outside of ironic, I wonder what I can do here and now :)
14:21:15 fultonj slagle except for the networks
14:21:43 they are not very orthogonal, as the undercloud shares architecture with the clouds it manages ;)
14:22:21 dtantsur: i assume you'd focus on the ironic stuff and if orthogonality breaks down we'd talk
14:22:35 wasn't trying to say it shouldn't be included
14:22:36 bogdando: that's not always the case, and i think we should be looking at decoupling where it is the case
14:22:51 I mean, PXE over WAN is not the best idea
14:22:56 slagle I agree we should be looking at decoupling, maybe not stein but next :D
14:22:58 dtantsur: totally agree
14:22:59 so we're left with deployed-servers essentially
14:23:24 more like we have no alternative to deployed-servers until dtantsur's part is ready
14:23:39 but at least we can use deployed-servers to get started in the meantime
14:23:53 should we look at filing specs for some of these things, even if it's just a described use case?
14:23:56 (to get started with split control plane)
14:24:10 my part won't be ready if it's not a part of the/a use case
14:24:20 the rest of the spec can be empty, but if the issue/case is described, we'll have it captured
14:24:37 mwhahaha: I need to inject an iptables rule on the undercloud, where would I look to do that
14:24:44 dtantsur: i think it should be part of the usecase (wasn't trying to exclude it)
14:25:00 beagles: new versions, it goes in the THT
14:25:10 beagles: old versions go in instack-undercloud stuff
14:25:15 4:22:51 PM GMT+2 - dtantsur: I mean, PXE over WAN is not the best idea.
14:25:15 Exactly
14:25:19 mwhahaha: ack
14:25:28 so we're gonna need to keep the undercloud close to the edge cloud cuz of that
14:25:46 and that means we need 1:N mappings
14:26:09 bogdando: you mean they need to be tightly coupled?
14:26:09 bogdando: or distributed undercloud
14:26:11 1 UC per N edge clouds grouped by LAN segments, w/o going to WAN
14:26:21 whatever we call it
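A minimal sketch of the "get started with deployed-servers" first pass discussed above: one stack for the central control plane, one per edge site on pre-provisioned nodes, so no PXE has to cross the WAN. The stack names and custom environment files here are illustrative assumptions; only the deployed-server environment path comes from tripleo-heat-templates:

    # deploy the central control plane as its own stack
    openstack overcloud deploy --stack central --templates \
      -e ~/central-env.yaml

    # deploy an edge site as a separate stack on pre-provisioned
    # (deployed-server) nodes, referencing outputs of the central stack
    openstack overcloud deploy --stack edge-1 --templates \
      -e /usr/share/openstack-tripleo-heat-templates/environments/deployed-server-environment.yaml \
      -e ~/edge-1-env.yaml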
14:26:27 well
14:26:34 dtantsur but to make sure, when you say distributed you mean decoupled, not undercloud services distributed on different nodes right?
14:26:37 we can have one undercloud with ironic-conductors close to the nodes
14:26:45 dtantsur or are you thinking of distributing ironic-conductor?
14:26:54 yes, I'm thinking actually distributing
14:27:04 dtantsur ah ok so that's one more scenario
14:27:05 that is what the Edge group is calling Federated IIUC
14:27:07 dtantsur: isn't ironic just looking at that in stein?
14:27:09 like, the central undercloud is a control plane with ironics down there at the edge
14:27:15 dtantsur ok ok great
14:27:20 there are a lot of tricky places
14:27:24 so in that case networks are less of an issue
14:27:24 with state sync
14:27:26 slagle: we have options already
14:27:43 what we discuss is more about full federation, with edges being completely independent
14:27:44 like sharing images for nodes
14:27:48 dtantsur: ok. would you like to write something up about how it might work?
14:27:53 across those distributed clusters
14:28:05 i think it can stand separately from split-controlplane initially
14:28:06 slagle: I had an email, but I can wrap up something more formal. you mean, a spec?
14:28:23 bogdando: ironic has a local cache for images. the first time, it can pull them from central glance.
14:28:26 dtantsur: it could be a spec, just describing the use case, or dump it in the squad etherpad
14:28:29 https://specs.openstack.org/openstack/tripleo-specs/specs/rocky/split-controlplane.html can still proceed w/ deployed servers regardless and then can be modified when distributed uc comes in
14:28:36 OR ironic can deploy from files that you push via ansible :)
14:28:45 dtantsur: so it is a very similar scope of problems to solve as they have for glance
14:28:45 fultonj: yes. we should aim to start simple
14:28:51 or we'll never get off the ground
14:29:13 slagle: yeah, I'll start with the etherpad
14:29:22 ok cool
14:29:25 bogdando: please add your ideas too
14:29:33 slagle: will do.
14:29:36 dtantsur slagle and do we care about isolated underclouds, or only about distributing undercloud services like conductor closer to the edge zone?
14:29:52 #action dtantsur bogdando describe/document distributed undercloud/ironic in squad etherpad
14:30:11 gfidente: isolated underclouds can be easy (if they're completely independent) or hard (if they're federated)
14:30:19 basically, I'm elaborating on the concern initially raised - defining use case limitations. For the undercloud et al
14:30:22 the former requires no work, the latter - a lot of work :)
14:30:31 so I'm talking about distributing some services
14:30:33 federated Ironic is a perfect example
14:30:39 (which has a problem with rabbit, I guess....)
14:30:47 bogdando: please define it :)
14:30:54 yeah, right
14:31:04 dtantsur so how much does distributing conductor buy us compared to isolated underclouds?
14:31:30 gfidente: PXE and IPMI isolated within the edge
14:31:35 also less traffic for images
14:31:41 ah, sorry
14:31:45 I misread your question
14:31:47 dtantsur yeah but we get that with isolated underclouds as well
14:31:51 right
14:32:16 having one undercloud is a good thing on its own, no?
14:32:28 like, one installer for 50 clouds, not 50 installers
14:32:39 remember, each installer is an operation cost of its own
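For reference on the "one undercloud with ironic-conductors close to the nodes" option: ironic's conductor groups, added around the Rocky cycle, are roughly this shape. A hedged sketch, with the group name purely illustrative:

    # on the conductor deployed at the edge site, ironic.conf would carry:
    #   [conductor]
    #   conductor_group = edge-site-1
    #
    # nodes at that site are then pinned to it (bare metal API 1.46+):
    openstack baremetal node set <node-uuid> --conductor-group edge-site-1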
14:32:43 so I expect us to define isolated vs federated and what is synced / cached from the central UC et al
14:32:54 that really matters for the complexity of the solution
14:33:00 dtantsur right so a benefit is that it decreases maintenance
14:33:03 yeah, we probably should have a break-down of >=3 cases
14:33:13 dtantsur: if you're planning to distribute ironic services on to the edge sites, couldn't that work piggyback on the split-control plane work? e.g. the distributed ironic services could be deployed as a separate stack which is dependent on the central undercloud stack.
14:33:26 gfidente: also you can have a lighter node on the edge if you have only ironic stuff there
14:33:29 bogdando: agreed, but remember we don't necessarily have to define everything before we get started. we can start small
14:33:50 https://specs.openstack.org/openstack/tripleo-specs/specs/rocky/split-controlplane.html
14:33:53 jaosorior: I guess? I'm not really familiar with the split control plane work, I must admit
14:33:55 ^ describes a way to start small
14:34:17 fultonj: yes. whether it's one undercloud or many underclouds, we can illustrate that case working
14:34:49 dtantsur: might want to give it a read; you might get the distributed ironic work for free if done generically enough that we could also use it for the undercloud.
14:35:54 yep. there is some specific configuration to get done for ironic to understand which nodes belong to which site
14:36:03 but that should not be hard
14:36:45 Dmitry Tantsur proposed openstack/tripleo-common master: Workflow for provisioning bare metals without Nova https://review.openstack.org/576856
14:36:58 ykarel, chkumar|off I have good news about the ceilometer bt issue, it should not block your CI, we have the same bug in our upstream CI and the job is green, so only the tempest options default change is needed :)
14:37:00 wow you got that done quickly :) (jk)
14:37:16 heh
14:37:23 sileht, nice
14:37:46 slagle: yes, but I'm talking about very high level requirements
14:37:58 for distributed vs shared-nothing choices
14:38:09 that really matters to know how we want it
14:38:14 shared-nothing is possible right now, right?
14:38:24 cuz the associated complexity is exponentially different
14:38:35 please define "shared-nothing"
14:38:43 dtantsur: let's hope so :)
14:38:46 a link would be fine for me
14:38:51 isolated underclouds, nothing gets synchronized
14:38:54 fultonj: complete isolation?
14:38:56 not state, not images
14:38:58 i see
14:39:13 nor performance/monitoring metrics
14:39:26 another option is one-way sync
14:39:30 somewhat related: are any sessions planned at the Forum?
14:39:36 from edge to central and not back
14:39:39 there are more
14:39:41 it looks like we have two subsquads: distributed_uc + split-control plane and both should implement the same usecase and click together
14:39:54 exactly
14:39:54 fultonj: we don't have 2 subsquads
14:40:00 :)
14:40:02 dtantsur: not about this, but it's a good time to propose it https://etherpad.openstack.org/p/tripleo-forum-stein
14:40:06 b/c we have nothing defined for distributed
14:40:30 I liked the click together part though
14:40:43 it sounds like we have some folks who want to investigate in that direction, and that is good
14:40:44 We have until next week to propose topics.
14:40:50 jaosorior: done
14:40:59 we have nothing defined for distributed
14:41:03 dtantsur++
14:41:07 we have https://www.dropbox.com/s/255x1cao14taer3/MVP-Architecture_edge-computing_PTG.pptx?dl=0#
14:41:14 and the AI to define it better
14:41:39 slagle: assuming the AI (diagram + write-up) is implemented, will that be sufficient?
14:41:52 fultonj: that's the end architecture of the result. i think what we're talking about here is distribution of the deployment tool
14:42:19 Ronelle Landy proposed openstack-infra/tripleo-ci master: Add OVB nodeset for v3 run https://review.openstack.org/604109
14:42:24 another high level choice for the architecture, to align it with the upstream track, is how much autonomy we want to give to the control plane (UC)
14:43:12 and local management capabilities
14:43:16 vs central
14:43:35 this is a very good question
14:43:38 indeed
14:43:49 and really matters, as it also may bring some *complexity*
14:43:55 some ironic contributors want local management + central control plane. but it is hard.
14:44:05 bogdando: agree, at PTG we talked about this some
14:44:18 right, so we need to state all these very clearly for tripleo and outline how that differs from upstream use cases
14:44:37 just to not set expectations too high :)
14:45:36 outcome from my perspective... edge nodes do not need to be autonomous, meaning if they lose connection to the control plane they can continue to run the current workload, but new workloads and getting workload status are unavailable until the connection is restored
14:45:52 for the telco 5g usecase
14:45:58 bogdando fultonj I think that is the purpose of the updated diagram for next week? at least regarding overcloud plans?
14:46:06 gfidente: yes
14:46:18 fultonj: you're talking overcloud, bogdando asked about undercloud
14:46:48 but yes, it's something we'd have to figure out for a distributed undercloud as well
14:47:20 My point is that the architecture is shared
14:47:24 it's tht you know
14:47:26 and ansible
14:47:42 so we need to keep it aligned and not diverging
14:47:55 yes sounds good
14:48:07 we have about 10 minutes left
14:48:18 do we feel we have enough captured as to what to look at for this week?
14:48:31 we took a few actions, are there any others?
14:49:37 i'd like to get milestones
14:49:44 if it's too early i guess we can wait
14:49:50 fultonj: what did you mean by that?
14:49:56 (saw the ? in the etherpad)
14:50:24 1. POC
14:50:29 2. CI to test POC
14:50:31 3. add features
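One concrete route to the "distribution of the deployment tool" raised above is the config-download export mentioned earlier in the discussion: pull the generated ansible playbooks off the undercloud and run them from wherever is convenient. A hedged sketch; the plan name and output directory are illustrative:

    # export the rendered ansible playbooks from the undercloud
    openstack overcloud config download --name overcloud --config-dir ~/config-download
    # the resulting deploy_steps_playbook.yaml can then be run with
    # ansible-playbook from any host that can reach the overcloud nodes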
14:50:52 oh ok, i thought you were asking for the upstream milestone dates or something :)
14:50:55 basically i'm happy to start working on these things (i'm still on #1)
14:51:13 i'll certainly need help with #2
14:51:15 Dmitry Tantsur proposed openstack/tripleo-common master: nova-less-deploy: update to metalsmith 0.7.0 https://review.openstack.org/604077
14:51:26 fultonj: i'd say file some blueprints as deps of the main split-controlplane blueprint
14:51:43 slagle: ah, ok
14:51:46 sounds good
14:51:47 Nicolas Hicher proposed openstack-infra/tripleo-ci master: provider: Add vexxhost https://review.openstack.org/596432
14:51:54 fultonj: align them against the upstream milestones for stein, then we'll have a good idea of what we want to knock out for stein and when
14:52:24 excellent
14:52:40 fultonj: i like the overview in the etherpad
14:52:51 we could have blueprints for most of that stuff
14:52:56 ok, great
14:52:57 so then folks can come in and try it out
14:53:07 e.g., if i had a day or so to try network iso with split-controlplane, etc
14:53:36 fultonj: i can help file some of these
14:53:49 thanks
14:53:49 #action fultonj slagle file blueprints for split-controlplane work in stein
14:54:13 Merged openstack/tripleo-docs master: Amend minor update docs for Rocky https://review.openstack.org/597109
14:54:25 #topic Review Requests
14:54:44 are there any outstanding related reviews folks would like some eyes on?
14:55:05 Alex Schultz proposed openstack/python-tripleoclient master: Update default value for ntp servers https://review.openstack.org/604113
14:55:15 * gfidente adding items to that list as well
14:56:07 owalsh isn't here but i think he has one: https://review.openstack.org/#/c/600042
14:56:09 Mehdi Abaakouk (sileht) proposed openstack/tripleo-heat-templates stable/pike: Add a way to override base path when file driver is used https://review.openstack.org/601286
14:56:09 the metalsmith work is not a priority, but is ready for review
14:56:21 i will review it and vote after i play with it more
14:56:23 #link https://review.openstack.org/#/c/600042
14:56:43 slagle fultonj I think most of the features in 3) are necessary for the CI job at 2) to pass though
14:56:52 #link https://review.openstack.org/#/c/576856
14:56:54 Sorin Sbarnea proposed openstack/python-tripleoclient master: Update default value for ntp servers https://review.openstack.org/604113
14:57:29 gfidente: it's not. let's start very simple
14:57:36 gfidente: i didn't think that
14:57:52 i've already deployed a compute node in a separate stack, and it works fine w/o any of that stuff
14:57:52 gfidente: i thought it would work, at least the POC passed the ping test (but without ceph)
14:58:00 that should be the first thing the CI job does
14:58:06 yeah
14:58:14 it will work, just not be architected as we want it
14:58:19 ++
14:58:24 so we distribute parts out as we go
14:59:09 and what about the tripleoclient changes?
14:59:34 gfidente: could the ci be a script until the client catches up?
14:59:53 we can keep discussing post meeting
14:59:56 the ci would run 'openstack overcloud deploy' twice
14:59:57 ok
15:00:05 thanks for the good discussion today and for joining folks!
15:00:13 #endmeeting