19:02:22 #startmeeting infra
19:02:23 Meeting started Tue Oct 18 19:02:22 2016 UTC and is due to finish in 60 minutes. The chair is fungi. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:02:24 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:02:26 The meeting name has been set to 'infra'
19:02:31 #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:02:39 many thanks to pleia2 for chairing last week's meeting in my absence!
19:02:45 you're welcome
19:02:47 i've read the log, seems it was pretty straightforward and brief
19:02:52 excellent
19:03:02 #topic Announcements
19:03:06 o/
19:03:10 #info The Infra team meeting will be skipped next week as many will be at the summit, but we will reconvene our usual meeting on Tuesday, November 1 at 19:00 UTC.
19:03:21 as always, feel free to hit me up with announcements you want included in future meetings
19:03:33 #topic Actions from last meeting
19:03:38 #link http://eavesdrop.openstack.org/meetings/infra/2016/infra.2016-10-11-19.02.html
19:03:41 pleia2 to add infra summit schedule
19:03:46 #link https://www.openstack.org/summit/barcelona-2016/summit-schedule/global-search?t=Infrastructure%3A
19:03:51 thanks for taking care of that as well!
19:03:54 we have a schedule now \o/
19:03:57 anyone want to volunteer to prep etherpads for each of the sessions listed there, so they're ready for use next week?
19:04:31 * fungi waits for everyone to jump at that exciting task
19:04:45 gah, lag, I'm still waiting for it to load
19:05:09 I can knock it out real quick this afternoon
19:05:10 yeah, the schedule site there is pretty slow. not sure what's up (or down?) with that
19:05:22 oh, okay, I thought it was on my end. I'm happy to do the etherpad for task-tracking
19:05:29 pleia2: awesome!
19:06:05 yeah, we just need some skeleton etherpads added to the summit etherpads wiki page, and then the people leading those sessions can add plenty of prep detail
19:06:46 #link https://wiki.openstack.org/wiki/Design_Summit/Ocata/Etherpads
19:06:55 apparently that exists already
19:07:15 great
19:07:27 #action pleia2 add skeleton infra session etherpads linked from the ocata design summit etherpads wiki page
19:07:36 #topic Specs approval
19:07:42 nothing new this week on the agenda
19:07:44 though as pleia2 pointed out last week, the zuul v3 spec clarification we agreed to approve has not merged yet
19:07:50 (i didn't forget to approve it, but it depends on another spec update we'd opted to defer for further discussion)
19:07:56 #link https://review.openstack.org/381330 Zuul v3: correct vagueness in describing job inheritance
19:08:00 expect me to push on those after summit
19:08:03 if someone (jeblair?) wants to rebase that onto the master branch tip, i'm happy to reapprove
19:08:13 or we can wait for the other to land
19:08:26 i think it's no rush
19:08:32 wfm
19:08:34 thanks jeblair!
19:08:38 ty
19:08:59 i just didn't want anyone to think i was even more absent-minded than i am ;)
19:09:09 #topic Priority Efforts
19:09:15 nothing is called out on the agenda
19:09:17 though zaro is suggesting that we should probably at least fast-track the next gerrit upgrade so that it gets done very early in the ocata cycle
19:09:27 if someone wants to put together a very small spec for it (could mostly just copy the last upgrade spec), i agree it would make sense as an ocata cycle priority effort
19:10:25 he's apparently got a few outstanding changes up for review that build up a more complete dev environment where we can demo a new gerrit version more thoroughly integrated into zuul/nodepool
19:10:35 i'm working on the spec
19:10:50 thanks zaro! i'll be keeping an eye out for it
19:11:04 #topic Host http://shields.io/ for generating badges to show on repos (flaper87)
19:11:07 #link http://shields.io/
19:11:11 #link http://lists.openstack.org/pipermail/openstack-dev/2016-October/105562.html
19:11:14 o/
19:11:27 i will admit i haven't caught up on the dev ml backlog while i was travelling
19:11:36 so no clue what this is about, but i'm sure you'll tell me ;)
19:12:09 So, in the context of that email thread linked there, I was wondering whether it may be possible for us to host that service. The idea is to be able to generate some badges (like the ones you'd see in github for travis) so that we can expose some information about the teams/projects directly on the readme files
19:12:48 the goal is not to duplicate this information but to read it from the governance website directly. We've several ways to make this possible but the three easiest are:
19:13:36 1) Host the shield app and adapt it to our needs 2) have the host repo add `openstack` support (I've a basic patch for this) 3) Have an API under governance.openstack.org that generates these badges
19:13:47 These won't be "rendered" in git.openstack.org, though
19:13:55 As we don't render .rst files there
19:14:01 okay, so mostly just callouts to some query against governance tags that apply to projects or deliverables to which those repos belong?
19:14:10 flaper87: why do we need a complete app to show some images?
19:14:11 yeah
19:14:20 AJaeger: because it does the job already
19:14:29 otherwise we'd have to come up with the API for that
19:14:32 render the images, etc
19:14:38 which is what shields does
19:14:54 you may argue that we may not want to host nodejs apps, and that's a good point
19:14:54 why do we need an api for that?
19:15:06 can't we just host the images anywhere (including in the repo itself) and render from there?
19:15:08 clarkb: I think we do if we want it to be dynamic
19:15:33 it seems like a lot of effort to run a new service for a service we don't even support other than as a git mirror when you can just host the images anywhere
19:15:39 (i guess you could approximate dynamic by rendering them in a cron)
19:16:04 note that this is not just to "have nice badges" in the github repos
19:16:13 oh, this would display on github? i was misreading and thought this was intended to be an overlay for the cgit interface on git.openstack.org
19:16:20 It's about communicating better the status of projects, etc
19:16:32 flaper87: how often do we expect them to change? i see it for CI status, but not sure about at this level
19:16:32 fungi: it would only work in github today aiui
19:16:47 in theory cgit could be made to also render such things but it doesn't do so today
19:16:50 I would say not very often, tbh
19:17:06 clarkb: if cgit can render such things, it'd be awesome to have it do that
19:17:12 flaper87: it cannot
19:17:22 flaper87: you would have to modify cgit to do it iirc
19:17:27 clarkb: oh, nvm then
19:17:37 and this shields.io app already knows how to consume our governance tag metadata from the existing yaml file, or takes some manner of dsl to understand how to do so?
19:17:48 so, if hosting shields is a no-go, then finding a different solution would be required
19:17:58 fungi: I've a patch for that, locally
19:18:05 flaper87: I am not saying it's a no go, just wondering if it's really the best solution for us here
19:18:11 I was waiting for this meeting to happen before doing anything else
19:18:16 #link https://www.mail-archive.com/cgit@lists.zx2c4.com/msg01769.html interesting recent thread about cgit rst rendering
19:18:23 since we don't really support github for such things, so adding a new service just for github shinies seems odd
19:18:31 and there are other ways we could communicate this information
19:18:43 flaper87: clarkb: well, the readme rendering situation with cgit is a little more nuanced than that... pleia2 came up with some options, but ultimately it's hard to decide whether/when to display the rendered readme vs the source code. ultimately we have per-repo documentation sites to serve rendered rst anyway
19:19:25 and to me at least, the point of a source code browser is to show you source code, not rendered documentation content
19:19:31 Exposing this info in the docs is also in the works, fwiw. I'm looking into that
19:19:36 it could just as easily be included in actual documentation, rather than just README?
19:19:43 fungi: yes one of my annoyances with the rendered docs on eg github is you can't link to lines in the docs
19:19:43 fungi: right but people normally go to the git repo
19:19:58 I do understand the point but that's not the reality
19:20:16 and the goal is to make this information more evident as it's a source of confusion for many people
19:20:32 clarkb: interestingly, that's effectively what the last message in that thread says :)
19:20:36 well, i'm saying we might benefit from coming up with a tighter integration to link git repos to their rendered documentation sites
19:20:37 flaper87: we have problems with using github API, so I'm feeling uncomfortable to push people to github.
19:20:53 AJaeger: we're not going to use GH's API
19:20:53 * zaro likes that github renders readme files
19:20:55 fungi: this is the first i've heard about the "about" feature of cgit
19:21:04 AJaeger: not sure I understand that last statement
19:21:24 I know we don't like periodic jobs ... but it does seem that a periodic job that could replace some template string in READMEs based on a YAML file might be helpful here?
19:21:35 jeblair: yeah, it's a thing
19:21:37 jeblair: mentioned in that ml thread i guess? i'm entirely unaware of the about feature there anyway
19:21:38 rather than the other way of pulling the status every time from an api
19:21:44 ianw: I'd really like to avoid duplicating the info
19:21:59 if we duplicate this info, it'll encourage people to just modify it
19:21:59 flaper87: that was in reaction to github READMEs etc.
19:22:25 * Zara has been poking around in a cgitrc a bit today, can confirm you can get it to render markdown and things (and more advanced syntax highlighting from the looks of things) but the line numbers won't match up nicely
19:22:29 I would like this info/badges to be pulled from some other API that reads/parses the governance site
19:22:35 flaper87: well you put stuff like "FOLLOWING LINE IS AUTOGENERATED DO NOT TOUCH" around it
19:22:50 ianw: that doesn't work, really.
19:22:55 Zara: (un?)surprisingly, github has the same issue
19:23:02 heh
19:23:08 is the goal to have a friendly way of displaying the governance info? or to have a way of consuming the governance info in project repos?
19:23:11 i dunno, we use it quite well for stuff like devstack plugin lists
19:23:40 jesusaur: both ?
19:24:15 FWIW, I'm not saying shields is the best solution. It's an option though. I think clarkb's idea of just generating these images might be good as well
19:24:44 flaper87: mostly I want something that will work with not github and just generating images seems like it would work everywhere
19:25:27 i can completely understand the sentiment that we should avoid reimplementing a wheel which already exists, though it sounds like the shields.io wheel may be different enough from our needs/expectations to make putting together something new (and different) more desirable
19:25:36 clarkb: could be used in readmes, documentation, and 'cgit about' if we figure out how to use that...?
19:25:47 clarkb: yup, I can see that working and it should not take long to generate these images when new changes land in the governance repos
19:25:54 jeblair: ya at least for the rendered versions of such things
19:26:03 clarkb: jeblair exactly
19:26:06 #link https://git.zx2c4.com/cgit/about/ this looks like a cgit about thingy
19:26:10 It'd show the image
19:26:43 ok, let me explore more on the image generation path for this and I'll get back to y'all
19:26:47 it's been a couple years since I looked at it, but it requires a file in the repo and the config pointing at the specific about file
19:27:02 (after the summit, of course)
19:27:04 :D
19:27:11 could we generate static content from the governance repo into a docs site and then link to the generated static info?
19:27:35 jesusaur: we could ;)
19:27:46 i don't know if that would be easier than using shields
19:27:46 #agreed The idea of having a consistent set of icons/badges displayed for various project repositories has merit, but the shields.io implementation seems to not fit a number of requirements we would have so further investigation is warranted.
19:27:53 jesusaur: a post job on governance that publishes somewhere...
19:27:59 pleia2: yeah, might end up being impractical for us. but worth some brainstorming maybe
19:28:03 AJaeger: ya that's what i was thinking
19:28:11 jeblair: nods
19:28:30 flaper87: definitely a neat idea, thanks for bringing it up
19:28:32 jesusaur: if we generate the badges, I'd rather just show them in the readme. We can link the docs too but, srsly, not many people click on those anyway. I want this info to be clear and evident and it doesn't sound invasive to me
19:28:49 fungi: thanks for listening, I'll dig more into the badge generation idea and report back
19:28:56 hopefully with a script that we can just use
19:29:03 shouldn't be hard/long to do
19:29:14 maybe rendering the readme (and embedding the icons/badges for stuff) in the cgit about page would make sense
19:29:25 ++
19:29:37 that's all I have for now
19:29:40 but yeah, lots of possibilities for brainstorming, and a limited window to discuss in today's meeting
19:29:41 thanks everyone
19:29:46 thanks again, flaper87
19:29:52 #topic I18n: translation checksite (ianychoi, eumel8)
19:29:52 * flaper87 bows and waves
19:29:55 #link http://lists.openstack.org/pipermail/openstack-infra/2016-October/004776.html
19:30:18 yes
19:30:24 i saw the e-mail on the infra ml, but have been gone for over a week so only just read it an hour or so ago
19:30:34 yep
19:31:30 so my primary concern here is that we still don't really use Ansible for service deployments (we use Puppet for that, Ansible is just orchestrating)
19:31:39 it's a proposal to change the architecture of the upcoming translation check site from DevStack to openstack-ansible
19:31:40 #link https://etherpad.openstack.org/p/openstack-ansible-os_translation-checksite new openstack translation checksite for i18n team
19:31:44 looks relevant
19:31:45 and I'm not exactly sure how Ansible solves the problem
19:32:18 seems if we go the container route, we could make Puppet do similar things
19:32:30 it puts horizon into lxc container for better maintenance
19:32:35 but I would like other folks to chime in here
19:33:01 does openstack-ansible successfully deploy a single-node environment from master tip of the various projects involved?
19:33:41 I think that is how they test so yes
19:33:48 i gather a lot of the prior pain was related to how to do continuous deployment consistently and deal gracefully with failures when you can't
19:33:49 I think so. There is a link in the docs for All-In-One installation
19:34:12 I see, so by using ansible we're getting away from devstack entirely
19:34:20 that's true, fungi
19:34:43 fungi: yeah, devstack breaks sometimes (heh)
19:34:46 yes, openstack-ansible is completely different code
19:34:47 so it's probably worth at least somewhat separating the discussion of infrastructure automation tooling from the openstack deployment tooling involved
19:35:04 ++
19:35:39 in this case, ansible instead of devstack, but possibly still driven from a puppet module?
19:36:02 I was thinking that
19:36:08 mhmm, could be
19:36:12 ansible runs puppet to run ansible \o/
19:36:22 yeah!
19:36:30 depends on the deployment mechanism of the VM
19:36:40 i don't really understand how using ansible avoids the 'breaks' that were seen with devstack
19:36:51 I feel like if you're using ansible, devstack, or whatever to deploy an openstack environment from master, it's going to be unstable. which brings you back to the same sort of "A is working, try building B and switch it in" problem we had before?
19:37:11 ianw, jeblair, yeah, that's a fair point
19:37:29 it's not really devstack that breaks sometimes, it's master doesn't always behave the way we want for this sort of thing
19:37:38 #link http://lists.openstack.org/pipermail/openstack-infra/2016-July/004524.html
19:38:18 According to this, #3 is one of the weak points in DevStack.. Would Ansible address it?
19:38:36 there is no experience with openstack-ansible because it's very new
19:38:43 And for #2, i18n now thinks that we may not have once a
19:38:43 week instance
19:39:31 but you can respawn the container when it's failed instead of the whole VM
19:39:46 yep, part of the challenge here is that we currently either semi-manually deploy a server and then we keep it around running a fairly consistent and continuously deployed stack of software (our long-lived servers), or we automatically deploy throw-away servers with nodepool to run (relatively) brief jobs. the idea of having servers automatically deployed which conditionally replace each other in
19:39:48 our infrastructure is still an untrodden path and i think a lot of the challenges will be there
19:40:50 would you use openstack-ansible in general for the infrastructure?
19:40:59 in the future?
19:41:03 i guess one of the points being made is that adding an extra layer of isolation (openstack in containers on a vm) avoids deploying entire replacement servers?
19:41:36 yes, fungi
19:41:49 so, what is the service expectation here from translators? A service that is always running and working - or would a downtime of a couple of days until OpenStack ansible team/project team fix problems be ok?
19:41:56 this also means an extra layer of things to go wrong ;)
19:42:10 so we would just build one check-site server, and try to deploy replacement containers on that server, and only drop the old container if the new one works?
19:42:29 deploy replacement containers on a long-lived server fits fairly well with the current long-lived server model
19:42:43 i also wonder how you effectively leave the previous containerized openstack running on that server while deploying and testing another one
19:42:43 AJaeger: depends on the timeframe. in the hot translation phase it's critical
19:43:12 (otoh, if we wanted to explore replacing long-lived servers, there is good ansible infrastructure around for that. we'd need to push on the automated dns project though)
19:43:32 jeblair: or use http redirects maybe
19:44:04 have a stable redirector which can be updated through automation, then no need to auto-update dns
19:44:06 fungi: omg we'll just update a file in afs
19:44:21 * fungi gets to work reimplementing dns in afs
19:44:28 heh
19:44:58 yay custom nsswitch resolvers
19:45:01 (dns in afs wouldn't be far off from the old arpa hostlist which was distributed via ftp!)
19:45:26 someone made one of those backed by etcd for our toy tinc vpn
19:45:31 fungi: but how would you know where the afs servers are!?
19:46:16 but anyway, i agree with jeblair that the containerization feature of openstack-ansible does seem like it could fit pretty well with our long-lived server concept, and avoid the challenge of updating a devstack deployment
19:46:45 since it's presumably a completely from-scratch redeploy
19:47:10 good :)
19:47:10 without lingering cruft you'd have to deal with attempting to replace devstack
19:47:32 AJaeger, eumel8 yep depends on the time frame.. but I think one stable translation checksite around feature freeze, only Horizon update between soft freeze and hard stringfreeze (RC1 target), having a new stable translation checksite after RC1 is released would be nice (I need to discuss such timeframe more with eumel8 )
19:48:00 ok
19:49:00 https://github.com/CentOS-PaaS-SIG/linch-pin may be an interesting project if we were to consider the replace-the-server option
19:49:05 i have one more topic i'm hoping to get to today, but yeah this seems like it could be a path forward (speaking specifically of the "replace devstack with openstack-ansible" part of the design... how we drive that from our infrastructure is something we'd need to flesh out separately)
19:49:45 this would mean infra is deploying openstack three different ways, 1) devstack in the gate, 2) ansible for translations, 3) puppet for infracloud
19:50:04 yep. it does sound insane when stated that way ;)
19:51:09 alternatively, infra is deploying openstack three different ways: 1) short-lived servers with nodepool, 2) on bare-metal for production environments, 3) in containers for a stateless translation check site
19:51:33 each of those is a use case which informs the appropriate tooling
19:52:06 there are efficiency and complexity trade-offs to drive them in different directions
19:52:24 that's a good perspective
19:52:36 +1
19:52:50 I think this container thing will increase in the future. There is lots of effort in lxc and lxd
19:53:31 eumel8: ianychoi: want to follow up on the infra ml thread with a summary of this and we can try to hash out where we go next?
19:54:01 yes, thanks, fungi!
19:54:12 yep also eumel8, pleia2 then for translation checksite, it seems that http://specs.openstack.org/openstack-infra/infra-specs/specs/translation_check_site.html needs to be changed with such thoughts :)
19:54:51 #agreed Containerized openstack may make sense as an alternative to devstack for the I18n check site deployment, so further exploration is warranted.
19:55:14 ianychoi: yep, that would be one of the steps
19:55:41 ianychoi: want to start working on a change for that?
19:56:28 eumel8, would you start with me for revising this infra-spec?
19:56:44 yes, of course
19:56:52 excellent. i'm going to selfishly switch to the final topic on the agenda now since it's mine and nobody can stop me ;) thanks ianychoi and eumel8!
19:57:03 #topic Root sysadmin volunteer to deploy pholio server (fungi)
19:57:04 pleia2, yep I will do with eumel8 :)
19:57:08 #link http://specs.openstack.org/openstack-infra/infra-specs/specs/pholio.html
19:57:09 Thanks all!
19:57:12 thx to all
19:57:14 per craige, the automation for this is basically all merged (thanks for working through all that, craige!)
19:57:22 i would _love_ to see this turned up very soon, as i'm sure would piet and the rest of the ui/ux team
19:57:37 any infra-root admins interested in taking up the task of attempting to deploy it?
19:57:53 I'm too snowed under with prep for summit, sorry
19:57:53 and maybe hashing out fixes for any minor issues encountered along the way?
19:58:03 pleia2: same here
19:58:12 pleia2: yep, you've gotta stop volunteering for everything ;)
19:58:16 haha
19:58:46 i know little about it, but I think craige and i are in the same tz?
19:59:06 ianw: closer together than the rest of us at least
19:59:11 ianw: yep, he said he's around at least via e-mail if not irc to help with the handoff to this part of the task
19:59:35 well i can read up and correspond with him
19:59:36 if you're able, give it a shot and ask for help as needed
20:00:00 i'm happy to try to back you up on it too, though of course timezones are a thing
20:00:21 #action ianw work on deploying a pholio server
20:00:27 thanks!
20:00:34 cool
20:00:35 we're out of time--thanks everyone!
20:00:37 o/
20:00:47 see some of you _very_ soon!
20:00:53 #endmeeting