19:01:14 #startmeeting infra
19:01:15 Meeting started Tue Dec 5 19:01:14 2017 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:16 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:19 The meeting name has been set to 'infra'
19:01:28 ahoy mateys
19:01:29 \o
19:01:32 #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:01:35 o/
19:01:42 #topic Announcements
19:02:07 o/
19:02:09 o/
19:02:29 The way I organized the agenda I didn't plan for this to be an announcement but it totally is. I've asked dmsimard and frickler to join us as infra roots. We'll be using the virtual sprint for control plane upgrades to onboard them and go through what that all entails
19:02:55 welcome and I at least am very excited to have additional help :)
19:03:07 welcome!
19:03:11 we will get into details for virtual control plane upgrades later
19:03:20 er virtual sprint
19:03:23 dmsimard, frickler: condolences^Wcongrats! :)
19:03:49 yay
19:04:10 Thanks :D
19:04:12 we should see patches up for adding users/keys soon I expect and we can merge those as a prep step for the virtual sprint
19:04:24 thanks dmsimard and frickler!!!
19:04:38 your (hopefully continued!) assistance is much appreciated
19:05:02 #topic Actions from last meeting
19:05:16 #link http://eavesdrop.openstack.org/meetings/infra/2017/infra.2017-11-28-19.00.txt Minutes from last meeting
19:05:43 there was an action to email about the jenkins user being removed from the gerrit CI group, but I think we determined that gerrit doesn't remove the preexisting votes from changes after all?
19:06:03 ya, seems like an announcement would be underwhelming and unnecessary in that case
19:07:18 fungi: ^ anything you want to add to that? or are we good to just treat it as unnecessary?
19:07:49 not really, i started looking into scripting vote removal from active changes
19:08:03 it wouldn't be hard, but maybe a little consensus on whether it's worth doing would help
19:08:31 zuul will currently handle them properly. That leaves reviewers as the primary target I guess?
19:08:41 there's at least an api call we can use to delete a reviewer (and thus their vote) from a change, so i can iterate over open changes with existing verify -1 from the jenkins user fairly easily
19:08:52 something like 5k changes when i queried last week
19:09:10 that seems high
19:09:21 maybe it was more like 2.5k?
19:09:28 i'd have to dig it out of the channel log
19:09:30 if we can help make reviewers' lives easier it is probably worth doing
19:09:54 as is I think gerrit will show a -1 if jenkins -1'd but Zuul +1'd when viewing changes in the list view
19:09:55 yeah, i mean other than a couple hours (probably) of my machine iterating in a loop it's only a few minutes of effort
19:10:31 it will, however, in recent gerrits like the version we're running, leave a comment on each of them, and thus trigger zuul to process that event
19:10:44 so maybe i should do it during a quiet-ish period
19:11:47 seems reasonable
19:12:05 agree
19:12:29 #action fungi delete jenkins account -1 verify votes from open changes in gerrit and announce it
19:12:45 a fun weekend activity!
19:12:51 indeed
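For reference, a minimal sketch of the kind of cleanup described above, assuming the standard Gerrit REST API (removing a reviewer from an open change also drops their votes). This is an illustration only, not the actual script fungi ran; the host, credentials, and query string are placeholders.

    #!/usr/bin/env python
    # Sketch: find open changes still carrying a Verified -1 from the
    # jenkins account and delete jenkins as a reviewer, which removes
    # the vote. Credentials and query syntax are assumptions.
    import json
    import requests

    GERRIT = 'https://review.openstack.org'
    AUTH = ('adminuser', 'http-password')  # placeholder HTTP credentials

    def query_changes():
        # Gerrit prefixes JSON responses with )]}' to defeat XSSI
        resp = requests.get(GERRIT + '/a/changes/',
                            params={'q': 'status:open label:Verified=-1,jenkins',
                                    'n': 100},
                            auth=AUTH)
        resp.raise_for_status()
        return json.loads(resp.text.split('\n', 1)[1])

    while True:
        batch = query_changes()
        if not batch:
            break
        for change in batch:
            # DELETE /changes/{id}/reviewers/{account} removes the reviewer
            # and their votes from an open change
            requests.delete('%s/a/changes/%s/reviewers/jenkins'
                            % (GERRIT, change['id']),
                            auth=AUTH).raise_for_status()

As noted in the discussion, each deletion leaves a change message behind and thus an event for zuul to process, which is why running it during a quiet period makes sense.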
19:13:03 ok moving on because the next topic is a fun one
19:13:05 #topic Specs approval
19:13:16 #link https://review.openstack.org/#/c/524024/ Top level project hosting
19:13:38 I don't expect that this spec is ready for approval, but it is a fairly important one and getting eyeballs on it would be a good thing
19:13:48 this is still pretty new -- i only posted it last week
19:13:53 o/ - sorry I'm late
19:14:13 jeblair: ya mostly want to call attention to it for reviewing
19:14:13 but i've been trying to point people at it
19:14:28 it's good to have the reminder--it's a really big change for us conceptually as well as technically
19:14:54 yup the tl;dr is how do we change the things we do to better serve non openstack project hosting
19:15:11 i don't want to rush things such that we make bad decisions, but i'd also like to move forward as quickly as possible
19:15:13 we have for a long time hosted non openstack projects but it's always been fairly tightly coupled to the openstack brand and such
19:15:34 related, the kata containers community was officially announced to news outlets today, and we have a temporary lists.katacontainers.io server we'd like to fold into a common server with the lists for our other communities
19:16:04 yeah, ^ means this is already happening to some extent
19:16:04 so, would kata containers fall into this spec?
19:16:11 yup
19:16:13 pabelanger: yes
19:16:22 as I understand it, they are only using github.com right now
19:16:56 that will probably be a relatively minor import, though i did build lists.k.i using xenial to make sure our puppetry is working (so that upgrading lists.o.o is less of a wildcard for the upcoming sprint)
19:17:22 okay cool
19:17:41 FWIW we might want to consider *moving* their repos as far as github is concerned instead of "importing" them
19:17:44 pabelanger: I think that's where they have been and so are comfortable there, but have said they will work with us to make zuul and nodepool work for their needs (which will expose them to gerrit and maybe we can convert them >_>)
19:18:02 I had to ask github support to redirect github.com/dmsimard/ara to github.com/openstack/ara when I got the repository created
19:18:17 github handles redirectories over git clone and http properly when it's a move
19:18:24 s/redirectories/redirects/
19:18:25 yeah, i would wait for the dust to settle a little on the kata side before we expect them to start engaging with us on ci-related workflow questions
19:18:26 (wow)
19:18:28 dmsimard: there are weird technical problems with that iirc
19:18:29 clarkb: yah, I was mostly curious where the line was for admin access to that org in github, not sure if we (-infra) manage that, or some other team
19:18:34 dmsimard: like you have to be an admin of both sides
19:18:40 clarkb: I would assume it would be -infra
19:18:52 dmsimard: the foundation has not seen fit to require new projects to use gerrit, so it's not clear to me that any moving/importing/whatever will happen there
19:19:10 clarkb: huh, the initiating party can initiate the transfer without being admin on the other side iirc
19:19:18 anyway, if it's important github support can fix that
19:19:19 pabelanger: I don't think those things have been decided yet
19:19:29 jeblair: good point
19:19:30 pabelanger: but probably good feedback for the spec especially if we want to have an opinion on it
19:19:49 jbryce mentioned in #openstack-tc that there's potential interest in the kata community using more of our ci infrastructure, but yeah it's not been positioned as a requirement afaik
19:19:55 clarkb: yah, I can add some comments around it for the git servers section
19:19:59 jeblair: and if they use gerrit, would they end up using the review.openstack.org (openstack branded) gerrit? sorry, don't mean to fork the discussion
19:20:02 clarkb: I think if the project uses gerrit then infra needs admin on their github org for sure ...
19:20:34 mordred: well we've also talked about decoupling infra completely from github in other places iirc
19:20:46 and individual projects would be responsible for their own mirroring if that is something they want
19:20:55 clarkb: we might need it anyway so we can manage and deal with the zuul github app ... but I do think we should make it clear that for anyone existing more strongly on github we're NOT in the business of doing admin tasks on github orgs other than zuul app and replication-from-gerrit
19:20:59 (I do think it is worth writing down what our perspective on the needs is though)
19:21:02 if kata is able to be a project on github and not use our infrastructure, then it's clear it isn't subject to the tc, which is where we derive our authority to impose consistency. so, for the moment at least, i see this as an effort to create a welcoming community that projects can choose to collaborate in.
19:21:36 dmsimard: for services we can't make cleanly multi-domain (e.g. gerrit) the current proposal mostly punts, but i would personally like to do a hitless transition to a non-encumbered domain name as the primary hostname for stuff like that
19:21:39 oh - nevermind - I was thinking about hook errors, but those are on the github app, not on the org using it - so we don't actually need admin there for that
19:21:50 jeblair: ++
19:22:25 fungi: yes, clarkb mentioned something like that in a comment. i haven't responded yet, but my thought is: yes, we should move all hosting to a non-openstack branded domain, and this spec is forward-compatible with that.
19:22:41 i'm happy to add a paragraph about that intent to the spec
19:22:49 There's still the OpenStack logo at the top left but that's a detail :)
19:22:52 dmsimard: some possible names for the common infrastructure have been floated in the past, but i would rather not muddy the current proposal with an inevitable name-choosing bikeshed
19:22:56 jeblair: that would be nice to make sure we don't do anything that conflicts with that
19:23:00 * dmsimard nods
19:23:13 dmsimard: we would presumably do some rebranding of the interface in general to un-openstack it
19:23:13 jeblair: basically here is the desirable end state, we don't need to get there in this spec but let's make sure we don't conflict with it either
19:23:30 personally, i don't think you have to scrub every reference to openstack even for separate projects. just like there are apache.org references for ASF projects that are not http server
19:23:43 yep. i think it will take more effort to get there, but we can get kata and zuul on a path that will converge with that by doing these things now.
19:23:52 jeblair: ++
19:24:13 jbryce: i agree, it'll be a tactical choice as to which ones are important to do that with and which ones might be nice to have if we ever get around to them
19:24:32 jbryce: agree -- and some of these things (lists / git servers) have more of a branding impact than others (code review, ci hosting), so best to focus on the big fish.
19:25:15 yeah, primary focus is to stop scaring potential users of the product, secondary focus is to stop scaring potential developers for it
19:25:45 fungi: heh, s/scaring/confusing/ maybe? but yeah. :)
19:25:51 well, really primary is to stop confusing marketers and pundits ;)
19:25:59 but yes
19:26:40 i've started working on implementing the mailing list portion -- mostly because it requires the largest amount of unknown (to me) work.
19:27:13 do we think it would be worthwhile to get kata's input on the spec?
19:27:31 that will help us keep things moving while we revise the spec, and if i run into any problems, we can change course early
19:27:36 I worry that it may be information overload for them, but may be valuable to have their input too as they are probably a primary early consumer
19:27:56 clarkb: yeah, i think so. even if they say "we only care about lists for now" it would be good to start opening lines of communication
19:28:05 getting anyone's input is cool, and input from the kata community can't hurt, but it's also a lot of building things we know we need in such a way that we hope some of it will be useful to various new communities the foundation takes on
19:28:12 jbryce: ^ maybe you can point them at the spec?
19:28:26 clarkb: yep. happy to
19:28:53 they're pretty slammed right now, but i can also just try to get their feedback this week while some of them are in austin and then i will bring it back to the spec
19:29:24 and if they have questions about what it all means, i'm happy to walk through it over email or something, by way of introducing them to our community.
19:29:43 (what is an infra-spec and what does it mean?)
19:29:48 also, just fyi, on the github org, we made sure that some foundation people have admin rights there as a start. i think the rest of it will be worked out along the way
19:29:55 jeblair: awesome. thanks!
19:31:13 anything else we want to add while we are here?
19:31:21 maybe we can pencil in the idea of getting it ready to vote by next week?
19:31:53 we can make that the goal. Will the vast majority of us have time to review it in a week's time?
19:31:54 jeblair: that seems fine. maybe aggressive but the sooner the better in my opinion
19:31:55 (I can)
19:32:10 i'll set aside time to go back over it again
19:32:37 let's aim for that and do our best to review it before then
19:32:38 ping me if you leave a comment and i'll respond faster -- also happy to chat about it in irc if you have questions that don't fit as review comments.
19:33:09 #agreed review project hosting spec this week so that we can bring it up for approval votes next week
19:33:21 (if you strongly disagree with that please speak up)
19:33:26 or disagree at all I guess
19:34:29 alright we do have other items on the meeting agenda so let's keep this moving
19:34:36 #topic Priority Efforts
19:34:42 #topic Zuul v3
19:35:01 are there any zuul v3 items we want to bring up? those that were on the agenda from last week were all addressed over the last week I think
19:35:16 i don't think there's much to say here, other than we merged a bunch of branch-related fixes which are all in production as of friday
19:36:17 #topic General topics
19:37:11 I'd like us to commit to a virtual sprint for control plane upgrades today if possible. Holidays and travel and all that stuff are approaching quickly and getting this out of the way will make life easier for everyone
19:37:16 #link https://ethercalc.openstack.org/mz3gyl7kn62d
19:37:34 based on the table at ^ I think either next week or the first week of the year are looking like our best options
19:38:14 I plan to finish up the audit of our servers and list them out at https://etherpad.openstack.org/p/infra-sprint-xenial-upgrades
19:38:30 do we think doing this next week is too soon from a planning perspective?
19:38:54 * dmsimard grabs link to the etherpad
19:39:06 I don't think next week leaves much time to schedule long lived server upgrades, but not sure atm what that is
19:39:07 I think my preference would be to do it next week so that we have an early start and can pick up the crumbs later in the month as necessary
19:39:28 how many servers are we talking about ? only seeing 20 logstash workers so far
19:39:30 for that, I am leaning towards R-8
19:39:31 is that a complete list ?
19:39:41 pabelanger: ya I'm beginning to think we do what we can next week in a big push, then based on what we learn from that (systemd etc) we can properly schedule the hard ones
19:39:44 i'm up for hitting the easy stuff next week and if we have to delay some harder ones which need announcing we can push them off to a later week
19:39:58 dmsimard: manifest/site.pp gives a good overview, but I'll finish up the list today
19:40:09 Are we suspecting that some things might not be possible, for example our current version of logstash on xenial ?
19:40:13 pabelanger: ack
19:40:31 which is basically what we did for the trusty upgrades too (hit the parallel and/or cookiecutter servers first, save the hard ones for last)
19:40:33 clarkb: yah, if we want to do the easy ones in R-11, then harder in R-8, that works
19:40:34 clarkb: what's your x/2 on r-10 mean?
19:40:44 dmsimard: I expect we will find some things don't work (I don't think that is one of them) mostly related to runtime versions and systemd
19:40:49 fungi: yup!
19:41:08 clarkb: yeah I just came up with a dummy example. I guess the first step is to test a fresh deployment on xenial before we start reinstalling things :D
19:41:08 jeblair: my parents arrive in town mid week r-10 so I will be in and out
19:41:20 jeblair: I will be around just maybe not the best for focused sprint work
19:41:28 dmsimard: last time we did this, I think I upgraded about 45 servers in the week, from trusty to xenial. that was logstash, elastic-search, etc
19:41:54 dmsimard: ya we tend to do new installs on new servers then migrate data as necessary and update dns
19:41:54 ack
19:42:05 so we make sure things work before we switch over
19:42:25 I think wiki.o.o / wiki-test.o.o is going to be a harder one, we still need to get that enrolled back in to puppet, right?
19:42:46 pabelanger: ya there will definitely be more difficult servers but I also don't think we need to get them done during the sprint to consider it a success
19:42:56 sprint is good for knocking out the 80% cases and making progress and learning
19:43:04 +1
19:43:21 yeah they were unpuppeted when i did the backup server
19:43:41 pabelanger: not so much enrolled "back" in puppet, that server has never been configured using the puppet-mediawiki module
19:43:51 with that in mind what do we think about starting next week like fungi suggests
19:44:22 frickler: dmsimard ^ would that work for you?
19:44:32 pabelanger: there's a stalled effort (which i haven't found time to get back to) for building a working mediawiki server with the puppet-mediawiki module and moving our data into it
19:44:34 (btw the backup puppet could do with some automation re: deploying keys etc ... could be a good part of this)
19:44:48 I purposefully didn't pick up much in my $job "sprint" which ends next friday so I should be okay. I'll hurry up and finish the stuff I need to this week.
19:45:06 i can pitch in next week
19:45:14 next week isn't optimal for me, but I'll see what I can arrange
19:45:38 speaking of stalled upgrades, i got stuck trying to rebuild the subunit-worker server on xenial. i need to revisit that and see if i can recall where it's broken but i think it had to do with python dependency resolution complexity
19:46:00 need to discuss time of day then, too
19:46:30 frickler: yes, I don't mind staying up late or waking early to sync up with you. I think dmsimard is already awake at all hours
19:46:37 lol
19:46:51 frickler: what's your tz?
19:46:59 what do we want to do about infracloud? it is still Trusty, I am assuming we just let it live out its days now?
19:47:20 pabelanger: that's sort of my opinion on it. Trusty is alive until april 2019
19:47:24 UTC+1, but I don't mind staying up late for a bit
19:47:25 I'm eastern time zone (UTC -5 right now?)
19:47:33 clarkb: yah
19:47:54 I guess 18.04 is bound to ship with queens too.
19:48:43 frickler: I think if you can arrange time that works for you let us know and I will likely be able to make room for it
19:49:01 i too am in utc -5 and can try to make a point of being up early-ish (for me anyway) next week
19:49:34 I'm usually online no later than 2PM UTC
19:49:58 frickler: i think yolanda and rcarrillocruz are in your tz; also ianw and jhesketh may also have some overlap with you
19:50:20 it'll be afternoon for them during your morning
19:50:35 yeah, afternoons / evenings for me cross with utc
19:51:38 i'll presume myself / jhesketh (if available) can mostly take on evening support for anything that's not quite working out or taking too long
19:51:39 sounds like we will be working 24x7, then ;)
19:51:49 frickler: ya :)
19:51:59 yah, we should also book #openstack-sprint for the week
19:52:11 worked well last time
19:52:22 how does this sound #agreed Next week, R-11, virtual sprint to upgrade control plane and onboard new roots. Dmsimard and frickler can work with existing infra-roots to schedule times that work for paired rooting ?
19:52:40 wfm
19:52:43 +1
19:52:44 clarkb: ++ pabelanger: ++
19:52:45 frickler: I expect between fungi and ianw and myself we can have a follow-the-sun sort of setup for you
19:52:45 +1
19:52:47 ++
19:52:50 ++
19:52:54 and dmsimard I expect to just be awake at all hours too :)
19:53:02 ¯\_(ツ)_/¯
19:53:04 because seriously I see him on irc before I wake up and after I go to sleep
19:53:11 he's got some strong coffee apparently
19:53:19 #agreed Next week, R-11, virtual sprint to upgrade control plane and onboard new roots. Dmsimard and frickler can work with existing infra-roots to schedule times that work for paired rooting
19:53:26 pabelanger: can you reserve the irc channel?
19:53:37 kids aren't self sufficient yet, gotta feed them in the morning and send them to school :D
19:53:37 clarkb: will do
19:54:07 I had a couple other items I wanted to discuss but they are low priority and we are almost out of time so ...
19:54:10 #topic open discussion
19:54:17 Review request: https://review.openstack.org/520297 Add clickable CRITICAL link in log files. Has one +2 at the moment.
19:54:25 :)
19:54:29 didn't we do that?
19:54:37 ianw: You did the +2 ;)
19:54:46 seems like I've reviewed some variation of this
19:54:51 I'll take a look
19:54:57 clarkb: It is a follow-up to the first patch
19:55:04 I missed adding the link :(
19:55:08 oh, let's do that and also there's that dragonflow one out there
19:55:36 https://review.openstack.org/520320
19:57:36 I can review both of those
19:58:12 Thanks
19:58:23 ok, we determined it deployed from master, so testing == checking after puppet :)
19:58:29 ya
19:58:54 #link http://lists.openstack.org/pipermail/foundation/2017-December/002541.html mentions plans to focus the 2018 opendev event on ci/cd (that's us!)
19:59:24 i'll have to make sure i arrange to attend that one
19:59:55 would be good to find out when they roughly plan to host it
20:00:09 yup, jbryce will almost certainly keep us all posted
20:00:10 If anyone wants to see how pretty a CRITICAL log message looks: http://logs.openstack.org/13/525713/1/check/ironic-tempest-dsvm-ipa-partition-uefi-pxe_ipmitool-tinyipa/b98dcf8/logs/screen-ir-api.txt.gz?level=ERROR
20:00:11 was september this year
20:00:21 fungi: that link doesn't work for me
20:00:29 fungi: derp, nevermind
20:00:38 user error :)
20:00:41 and that is it for time. Thank you everyone
20:00:46 the best kind of error
20:00:52 you can find us in #openstack-infra
20:00:58 #endmeeting