19:01:09 #startmeeting infra
19:01:10 Meeting started Tue Jan 8 19:01:09 2019 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:11 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:13 o/
19:01:14 The meeting name has been set to 'infra'
19:01:25 #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:01:43 o/
19:01:47 o/
19:02:06 #topic Announcements
19:02:19 I don't actually have any announcements
19:02:23 welcome back everyone!
19:02:58 o/
19:03:07 #topic Actions from last meeting
19:03:19 #link http://eavesdrop.openstack.org/meetings/infra/2018/infra.2018-12-18-19.01.txt minutes from last meeting
19:03:29 #link https://review.openstack.org/#/q/status:open+topic:fedora29 Fedora29 support in our toolchain
19:03:48 there are still a few outstanding reviews for getting fedora 29 up and running if you have time to go and review those changes and their dependencies
19:04:15 Looking at other recent lists of actions we got the storyboard mysql changes in and next step is for fungi to boot new servers for storyboard
19:04:25 fungi: ^ did that happen when I was not paying attention to my computer?
19:05:00 i got sidelined working through a prerequisite
19:05:20 so that we can include rebuilds on xenial
19:05:41 ok, let us know if we can help
19:05:45 #link https://etherpad.openstack.org/p/gCj4NfcnbW tentative database move plan
19:06:03 i think i'm going to just do the server rebuilds as a separate first step
19:06:30 and wanted to build the instances as, e.g., storyboard-dev01.opendev.org
19:06:35 ++
19:06:46 (keeping the vhost storyboard-dev.openstack.org)
19:06:49 o/
19:07:03 but yeah, that was the series of glob/regex updates to allow us to have opendev.org instance names
19:07:25 which i think we're through now, or at least i'm not aware of any other lingering bugs/fallout from
19:07:30 so should be able to pick that back up this week
19:07:47 Also now that I think about it I do have an announcement. At 2000UTC (after this meeting) corvus and mordred are doing the vexxhost + kubernetes + ceph + percona + gitea brain dump thing. I believe the plan is to record it for those that can't make it
19:08:23 i hope alcohol is being served then, too
19:08:59 where is that taking place?
19:09:03 fungi: and ya I think those bits should be sorted now for opendev stuff. The next big hurdle is tls/ssl for actually using opendev dns records (but we have time for that later in the agenda)
19:09:11 anteaya: on pbx.openstack.org room 6561
19:09:16 thanks
19:09:30 #topic Specs approval
19:09:58 I did end up approving the opendev git hosting and storyboard attachments specs after my flight last month
19:10:04 #link https://review.openstack.org/581214 Anomaly Detection in CI Logs
19:10:43 This spec is still active and after review I do think it is close.
If anyone else is able to take a look at it soon that would be great, but I'd like to check with tristanC and dirk about putting it up for approval soon
19:11:26 corvus: it has a small intersection with the way we host logs, but tristanC says we can do everything client side if we want to (so not a major concern but you may want to take a quick look given your work around logs and artifacts and swift)
19:12:35 #topic Priority Efforts
19:12:38 #topic Update Config Management
19:13:02 just prior to and after the holidays I made a push with cmurphy to get a large stack of our futureparser changes in
19:13:02 o/
19:13:15 \o/
19:13:17 it looks like pretty much all of them are in?
19:13:29 except ask.o.o and openstackid
19:13:34 cmurphy: I think we are currently up against the fact that smarcet and fungi are doing work on openstackid hosts nowish so adding futureparser to the mix would complicate things
19:13:53 cmurphy: I think we should rebase to put openstackid at the very end of the stack and if there are any remaining hosts not already in the list we put them before openstackid
19:14:26 yeah, smarcet is in the middle of trying to upfit openstackid source code to newer laravel and php, so has openstackid.org in the disable list
19:14:48 clarkb: I already did that so those should be the last in the stack
19:15:07 cmurphy: I think afs nodes and kerberos nodes and maybe one or two others may still need to be added?
19:15:17 i didn't propose anything for the afs servers or kerberos servers, i wasn't sure if they were already mostly ansibled?
19:15:24 the only other one is ask
19:15:33 (i'd have to cross check against the puppet group, which I can actually ask ansible for a full listing of and diff against the futureparser group)
19:15:36 ah ok
19:15:48 I don't think afs or kerberos are mostly ansibled yet, but ianw was working on it?
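[Editor's sketch, not part of the meeting] The cross-check mentioned at 19:15:33 (listing the puppet group and diffing it against the futureparser group) boils down to a set difference. The snippet below is only an illustration: the host names are hypothetical placeholders, and in practice the two lists would come from the Ansible inventory rather than being typed in by hand.

```python
def hosts_missing_from(source_group, target_group):
    """Return hosts in source_group that are absent from target_group, sorted."""
    return sorted(set(source_group) - set(target_group))


# Hypothetical membership lists, used purely for illustration.
puppet_hosts = [
    "ask.example.org",
    "afs01.example.org",
    "kdc01.example.org",
    "paste.example.org",
]
futureparser_hosts = ["paste.example.org"]

# Hosts still managed by puppet that have not been added to futureparser.
missing = hosts_missing_from(puppet_hosts, futureparser_hosts)
print(missing)
```

Anything the function returns is a candidate for adding to the futureparser group, which matches the suggestion made later at 19:16:36.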
19:16:15 I can add them
19:16:24 the client side is, not the server side
19:16:36 cmurphy: I think we should consider anything in the puppet group as valid for adding to the futureparser group
19:16:42 kk
19:17:00 for the actual puppet upgrade I'd rather not do it the same way as we've been doing the future parser changes because it has been taking a long time, i think it's safe to do it in bigger batches
19:17:21 cmurphy: ya futureparser should catch most of our issues?
19:17:48 yeah
19:17:53 in any case, much progress so thank you for that
19:17:54 agreed, a big bang is likely fine for that phase
19:18:37 on the ansible + docker side of things I think we have all of the groundwork for allowing people to start sorting out actual docker hosted services
19:18:54 yah
19:19:04 I want to say that image management (of some sort) is the next big hurdle there. Likely to be a service by service thing ?
19:19:38 I'm not sure that'll be super hard- kind of similar to publishing things to pypi and stuff
19:19:48 but yeah - until we've done it done it - we haven't done it
19:20:09 i have changes in flight to start on building images
19:20:12 definitely will go service to service as to whether we're consuming someone else's image or building our own
19:20:22 corvus: for gitea?
19:20:32 ianw ^ you may be interested in that for graphite
19:20:34 #link gitea dockerfile https://review.openstack.org/626387
19:21:06 includes playbooks, roles, jobs for building and uploading
19:21:33 corvus: I wonder if we can semi standardize the building parts into zuul-jobs?
19:21:42 that is probably a followup task once we have something we are happy with though
19:22:28 perhaps
19:22:44 there are some particular things i'm doing with tags which may or may not be fully generalizable
19:22:57 i agree it's worth considering, and will be easy to move the roles if it works out
19:23:01 ++
19:23:13 and easier for us to ponder generalizing once we've got a couple of examples too
19:24:02 ok so I think the take away here is we've removed known roadblocks and are ready for people to start building images and deploying services this way. Please let everyone know if you discover new roadblocks or things we need to consider more broadly. Otherwise I'm excited to see these efforts come together as deployed services
19:24:13 definitely check out the dockerfile though - it's a nice multi-stage build that winds up with an optimized final image
19:24:14 the actual job i added in that change passes, it's just had a run of bad luck with unrelated jobs
19:24:47 i rechecked it again; we should have been able to land it last year. :)
19:25:27 also this one is good for folks to know about:
19:25:34 #link jinja-init dockerfile https://review.openstack.org/626626
19:25:34 with careful manipulation of timestamps maybe we still can ;)
19:25:42 that's a pattern we may find useful
19:26:46 The last subtopic in my notes for config mgmt updates is zuul driven deployment. Do we want to try picking that up again or do we want to keep focusing on puppet4 and docker related efforts for now?
19:26:50 Is there process documentation?
19:26:56 you'll need to see https://review.openstack.org/626759 to see it in use
19:27:41 Shrews: can you elaborate on the question?
19:28:26 Probably getting ahead of the next meeting, but docs on building & deploying new services
19:28:35 On the topic of docs I think we are still in figure it out mode for a lot of this, but once we get services running in this manner we should update the system-config docs for that service with details.
Then if we end up with common practice across services have a centralized "this is what the container deployed services look like" doc
19:28:50 clarkb: ++
19:29:01 similar to how we have links to the puppet manifests we can link to the dockerfiles and the most recently built image and so on
19:29:30 #link jinja-init docs: https://github.com/ObjectifLibre/jinja-init
19:29:58 on that subject...
19:30:04 in theory one of the things this should enable is simpler local testing/interaction
19:30:26 which hopefully makes the docs easier to consume too (because you can run the thing side by side with the docs)
19:30:31 i've been thinking that for the plain docker deployments (ie, not k8s), we should look at using docker-compose
19:30:33 yeah, one i noticed could use some updating already is the dns document in system-config
19:30:40 corvus: ++
19:30:42 since it refers to adding things in puppet
19:31:05 corvus: that is how I run all my containers at home so would not be opposed. We should double check if that changes our networking assumptions though
19:31:17 even a simple single-app-in-a-container deployment has some ancillary things (like volumes) that are really well expressed via docker-compose. and the zuul quickstart has some ansible to run docker-compose.
19:31:17 fungi: ++
19:31:19 it has been really pleasing in its zuul quickstart context
19:31:38 fungi, clarkb: my bad on dns; i'll update docs
19:31:58 #action corvus update dns system-config docs
19:32:16 corvus: i'd have updated them but wasn't quite sure what i might be missing
19:32:55 anything else before we move on? we have a few more agenda items I'd like to get to
19:34:02 #topic Storyboard
19:34:39 fungi: diablo_rojo really quickly wanted to double check that with the attachments spec merged and with mysql changes in place we've largely gotten the ball moving on infra related storyboard tasks?
19:34:45 anything else we should have on our radar?
19:35:04 Nothing atm, just getting started/organized for this new year
19:35:05 i don't think so
19:35:16 I think there are some open reviews that could use some help
19:35:23 But nothing other than that
19:35:38 good to know, thanks
19:35:44 #topic OpenDev
19:36:26 At this point I think we are completely moved over to OpenDev DNS servers for our dns hosting and we can clean up the old openstack.org nameservers.
19:36:45 fungi managed to update our ansible and puppet to accept openstack.org or opendev.org hostnames as appropriate
19:37:20 The website has content hosted on it now too (though it's super simple and if people want to make it look nicer they are more than welcome)
19:37:30 yep, lmk if you notice any other problems which seem to result from that
19:37:50 oh, though i think the one i ran into over the holidays turned out not to have been a result of that change
19:37:55 now i don't remember what it was
19:38:05 so many eggnogs between then and now
19:38:09 I think this means that any new servers we boot should be booted in the opendev.org domain
19:38:25 zomg. so craycray
19:38:28 well, it's worth a try anyway
19:38:28 we should already specify vhost names for any of our services that have digits in their host names
19:38:35 so this should be safe for those hosts
19:38:51 as i said, i'll try launching storyboard-dev01.opendev.org rsn
19:38:59 that will be a good test
19:39:14 ++
19:39:21 i haven't exercised the latest iteration of launch-node so fingers crossed
19:39:55 The next step after that is actually hosting things at foo.opendev.org. For many services this requires ssl certs for foo.opendev.org and possibly altnames for foo.openstack.org (so that we support existing users of $services with redirects)
19:40:23 Let's go ahead and skip ahead in the agenda to talking about letsencrypt now considering ^
19:40:40 we have a few we could do sooner... paste/lodgeit comes to mind
19:41:01 fungi: ianw: do we still think that is our best path forward?
and if so is it reasonable to get up and running in the near future to support some service updates (like for paste and etherpad)
19:41:39 or should we consider continuing to pay for certs in the near term?
19:42:05 i think it's worth a proof of concept
19:42:10 #link https://review.openstack.org/#/c/587283/ letsencrypt spec
19:42:48 as i've not had time to write up the alternative le suggestion involving separate accounts per service instead of the central broker/proxy solution, i'm not going to push back against that design even though it does seem like it would have a lot more complexity
19:43:47 do we know of anyone else who has deployed LE in a similar fashion?
19:44:06 ok maybe we can get a rebase on the spec and attempt to move forward with it? I expect this will become an important piece of the near term opendev puzzle (granted we can buy new certs too)
19:44:08 i mean in a similar project environment, i mean
19:44:38 ianw: cloudnull was using it for osic and may have had similar setup for all the various openstack services?
19:45:42 (I don't think we need to have the answers here today, just want to point it out as a piece that has priority trending in the upward direction)
19:46:01 well i've laid out in the spec the issues as I see them, so I'm very open to discussion on the implementation
19:46:02 let me know if I can help sort out details too
19:46:37 ianw: yup was mostly concerned that I think fungi had reservations. Maybe we can move that discussion to spec or in irc later today?
19:47:15 our hour is running out and we have a couple more items to get to so I'll move on
19:47:18 i can have a re-read and rebase today anyway, it has been a while so maybe on context switching back in new thoughts occur
19:47:23 ++
19:47:32 #topic General Topics
19:48:09 I wanted to check in on the github admin account changes.
I think fungi fully removed himself as admin on our orgs and things are working with the admin account
19:48:28 Do we want to forcefully remove everyone at this point and point at the admin account? or do we want everyone to remove themselves or?
19:48:35 fungi: ^ you probably have thoughts since you went through the process for yourself
19:49:11 i don't object to being removed
19:49:11 yeah, i got our shared account added to all the other gh orgs we maintain
19:49:26 (before i removed mine) so they're all bootstrapped now
19:50:03 at least all the orgs i'm aware of. if i was an admin for it, i added our shared admin account before removing myself. let's put it that way
19:50:08 awesome. I also don't mind being removed - or I can remove myself
19:50:25 my only other concern was account recovery. Do we need to encrypt and stash the recovery codes?
19:50:44 I believe that github is fairly unforgiving about account recovery if you lose those?
19:50:44 ianw's instructions in the password list were nice and clear on how to use totp on bridge to generate the second factor otp
19:51:05 and includes the recovery key too, iirc
19:51:09 yep, i've completed a test login on the new account with no issues
19:51:17 clarkb, corvus: I'm removing myself now - want me to remove you too?
19:51:37 mordred: I think I'll remove myself so I'm familiar with the process
19:51:41 kk
19:51:48 mordred: are you using the new account to do so, or your own?
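[Editor's sketch, not part of the meeting] The 19:50:44 message refers to generating the second-factor OTP with TOTP on bridge. The log does not say which tool is used there, so the following is only a generic illustration of how a TOTP code is derived under RFC 6238 (HMAC-SHA1, 30-second steps, 6 digits), using the Python standard library. The function names are made up for this example, and the secret in the usage note is the public RFC test key, not a real credential.

```python
import base64
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HOTP (RFC 4226): HMAC-SHA1 over the 8-byte big-endian counter."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation: low nibble of last byte
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, now: float, step: int = 30, digits: int = 6) -> str:
    """TOTP (RFC 6238): HOTP keyed on the number of elapsed time steps."""
    return hotp(secret, int(now // step), digits)


def totp_from_base32(b32_secret: str, now=None) -> str:
    """Decode a base32 secret (spaces and missing padding tolerated), then compute TOTP."""
    cleaned = b32_secret.replace(" ", "").upper()
    cleaned += "=" * (-len(cleaned) % 8)  # restore padding if it was stripped
    key = base64.b32decode(cleaned)
    return totp(key, time.time() if now is None else now)
```

Calling `totp_from_base32("GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ")` with the RFC test key returns the code for the current 30-second window. Because the code depends only on the shared secret and the clock, anyone with access to the stored secret can produce it, which is why the recovery codes and secret need the same care as the password.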
19:52:06 corvus: I was just going to use my own - but I could use the new account if that's better
19:52:08 i removed myself under my own account the other day
19:52:15 it seemed to work
19:52:28 mordred: for you and i, i think it would be better to use the new account, as a final sanity check that we've got everything :)
19:52:40 corvus: that's a good point
19:52:51 in that case, I'm not going to do that just right now :)
19:52:53 i used my account to remove my account (but only after i confirmed the shared account was working on those orgs of course)
19:53:16 And really quickly before our hour is up I've been asked if the infra team would like to attend the PTG. The PTG runs after the summit May 2-4 in Denver.
19:53:22 since if there's some orphan org, mordred and i are the most likely to have access to that
19:53:24 I don't need an answer now (we have until January 20 to respond)
19:53:51 but it would be helpful if we could get a general feel for whether or not people plan to attend and whether or not we can be productive after also summiting
19:53:51 i expect to be at the ptg. that beer in denver's not gonna drink itself
19:54:09 clarkb: I will be at the PTG - I'm happy to be there with an infra hat on
19:54:23 I expect that if we can make it and not be zombies then it will be a great opportunity to work on opendev and the config mgmt stuff
19:54:47 I'll followup to the infra list with ^ this info and ask for feedback thoughts there too
19:54:47 agreed -- it's the zombie thing i'm most concerned about and need to reflect on a bit
19:54:56 corvus: ya me too particularly after berlin.
19:55:00 I was really zombied
19:55:09 zombied?
19:55:14 tired?
19:55:19 anteaya: conference plague and exhaustion
19:55:23 ah thanks
19:55:30 anteaya: too tired to even use english correctly :)
19:55:37 that makes sense
19:55:52 #topic Open Discussion
19:55:54 I'm unsure if I can make Denver, but given the travel for me the more I can pack in the better.
19:55:58 though I can't imagine a state where you can't use english correctly corvus
19:56:10 before our hour is up I'll open it up to missed/forgotten/last minute items
19:56:13 Granted I'm usually a zombie anyway
19:56:21 jhesketh: BRAAAAINS
19:56:24 jhesketh: in a nice way
19:56:28 jhesketh: that's a good point
19:56:46 the more packed in may make travel more worthwhile for some
19:57:46 To be honest, I'm kinda unsure why we wouldn't. We might not be as productive as usual, but what do we have to lose?
19:58:01 (besides our sanity, if we still have any i mean)
19:58:13 Hah
19:58:25 jhesketh: ya I think it was mostly a concern that if general feeling was flying home on the 2nd after summit then maybe we don't do it. But sounds like that isn't the case for most who have chimed in so far
19:58:27 I lost that a long time ago
19:58:35 and as you say maybe we are zombies and don't get much done but we won't know until we get there and try
19:58:59 most of us have acclimated ourselves to these marathon conferences
19:59:20 this is... longer than we've had before
19:59:34 corvus: especially if you factor in board meeting type pre activity
19:59:39 it's like an 8 day conference
19:59:59 * corvus curls up in a ball under his desk
20:00:10 and with that we are just about at time. Join us for kubernetes fun in room 6561 on pbx.openstack.org
20:00:13 thank you everyone
20:00:15 #endmeeting