19:01:18 #startmeeting infra
19:01:18 Meeting started Tue Jan 16 19:01:18 2018 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:19 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:21 The meeting name has been set to 'infra'
19:01:29 #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:01:34 o/
19:01:39 o/
19:01:46 #topic Announcements
19:02:08 #info clarkb and fungi missing January 23rd meeting due to travel.
19:02:33 pabelanger and ianw have volunteered to run the meeting next week iirc. I'll let them sort out between themselves who wants to do it
19:02:45 #info OpenDev focused on CI/CD happening April 26-27 in San Francisco
19:02:47 meetings can have co-chairs :)
19:02:57 corvus: oh ya they can co-chair too :)
19:03:14 cochise
19:03:29 Last year the openstack foundation had an "opendev" event in san francisco with a focus on edge computing. This year they are hosting a similar event with a focus on CI/CD
19:03:39 * mordred waves
19:03:51 If that is something you are interested in, mark those dates on your calendar. Sounds like there should be good industry representation too so not openstack focused
19:04:06 it'll be nice to have folks there to talk about our experiences
19:04:06 #link http://lists.openstack.org/pipermail/openstack-dev/2018-January/126192.html Vancouver Summit CFP open. Submit your papers and/or volunteer to be on the programming committee.
19:04:20 ++
19:04:48 The OpenStack summit CFP has also opened. One of the tracks is CI so now is the time to propose papers and/or volunteer to sit on a programming committee
19:05:09 I think they are looking for people that have never been on a summit programming committee to volunteer so if you are interested you should go for it
19:05:21 #link http://lists.openstack.org/pipermail/openstack-infra/2018-January/005800.html Zuul has mailing lists
19:05:25 clarkb: I guess the OpenDev website isn't up to date, it doesn't mention the April dates
19:05:35 dmsimard: ya I think we got the prerelease info on that
19:05:44 ok
19:05:45 dmsimard: but I'm told the dates and rough location shouldn't change
19:05:52 I've added my name to volunteer for programming committee
19:05:54 dmsimard: aiui, they're still finalizing the venue contract
19:06:16 Zuul has mailing lists now. I just tested that sign up works and it does!
19:06:16 Do we know if there's a CFP or if they need help organizing? Sounds like a great venue to advertise the work we do in OpenStack in terms of CI/CD
19:06:32 (Sorry for slightly sidetracking, didn't get time to reply when it was on-topic)
19:06:34 dmsimard: I can ask for that info and pass it along
19:06:43 dmsimard: it's ok (we just have a large agenda so trying to keep moving)
19:06:48 ok, thanks
19:06:59 dmsimard: I'll let them know that updates to the website with more info would be useful and ask about cfp
19:07:38 i'm on the opendev programming committee
19:07:48 as of 3 hours ago
19:08:01 i'll be happy to keep this group updated
19:08:10 awesome, thanks
19:08:19 corvus: grats :p
19:08:22 appreciated!
19:08:26 nice
19:08:27 we're having our first meeting friday, so probably more info after that?
19:08:59 #topic Actions from last meeting
19:09:12 There were none \o/
19:09:18 #link http://eavesdrop.openstack.org/meetings/infra/2018/infra.2018-01-09-19.01.txt minutes from last meeting
19:09:25 and a fine week it was
19:09:32 didn't i...
19:09:36 * corvus patchbomb project name removals from zuul.yaml
19:09:50 oh did I read the wrong agenda?
19:09:55 I may have opened the wrong link when I looked
19:10:02 i haven't done that, fwiw -- i decided with zuul going oom i should sit on that a bit.
19:10:13 maybe this friday...
19:10:18 ya I must've
19:10:19 #action corvus patchbomb project name removals from zuul.yaml
19:10:23 thanks
19:10:39 #topic Specs approval
19:10:46 #link http://lists.openstack.org/pipermail/openstack-infra/2018-January/005779.html Cleanup old specs.
19:11:10 I sent mail about starting to clean up some old specs and pinged others out of band for a couple others. Things have started moving on this.
19:11:16 If you have input please send it in
19:11:21 #link https://review.openstack.org/#/c/531242/ IRC Bot Improvements
19:11:43 fungi and I reviewed ^ last week. I think jeblair was hoping to put it up for a vote this week?
19:11:50 i haven't had a lot of input since last week's rfc
19:12:03 it seemed like a really well thought out and written spec
19:12:03 yeah, something did just dawn on me earlier today on that one too
19:12:17 does anyone want to delay another week for more collaboration, or should we vote?
19:12:47 I'm comfortable with it as is, but more input would probably be good especially if fungi has new thoughts
19:13:08 (it's not urgent, but if we agree on a plan, it's very divisible, so we can pick off parts of it in parallel in our copious spare time)
19:13:11 wrt auto-kickbanning abusers based on spotting highlight spam, is there an easy way to differentiate it from when meeting chairs do the same to give people a heads up that a meeting's starting? or should we just get people out of that habit
19:13:54 fungi: i feel like we can make that a tunable thing
19:13:56 maybe whitelist the meeting bot and have it do it?
19:14:11 oh, maybe
19:14:11 something like #notify fungi corvus clarkb
19:14:12 percentages could factor in -- also, we could be more generous for folks with registered nicks
19:14:36 anyway, not a big enough concern for me to bring it up in review i guess
19:14:39 +1 spec is well written
19:15:07 yeah, i don't think we need to change the general approach laid out in the spec, just fine-tune the algorithm
19:15:34 I'm happy to move forward with it this week. Does anyone think they want more time to review it (I think it's perfectly acceptable to do more review too)
19:16:02 ianw: dmsimard frickler ^
19:16:28 mnaser: ^
19:16:50 * fungi kickbans clarkb for highlight spam ;)
19:16:57 :)
19:18:33 * dmsimard catches up with backlog
19:19:05 I had quickly skimmed the spec but didn't give it a good review, I'll take a look after the meeting.
19:19:09 corvus: where do you sit on it?
19:19:15 corvus: you still want to move ahead this week?
19:19:29 Using the supybot fork is reasonable -- using errbot would be nice but a lot of work.
19:19:45 clarkb: let's put it up for vote and if dmsimard -1s it we'll take another pass
19:19:51 sounds good to me
19:20:14 Something about errbot that's nice is the built-in cross-platform thing (slack, irc, xmpp, etc.)
19:20:27 I'll vote on the spec.
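As an aside, the tunable heuristic being discussed might look something like this minimal sketch. Everything here is an assumption for illustration: the spec does not define these function names, the threshold, or the whitelist mechanics.

```python
# Toy sketch of a highlight-spam check: flag a message that mentions
# many known channel nicks at once, exempt whitelisted senders (e.g.
# the meeting bot), and apply a more generous limit to registered nicks.
def is_highlight_spam(sender, message, channel_nicks,
                      whitelist=(), registered=False, limit=5):
    # Whitelisted bots (like a meeting bot doing #notify) are never flagged.
    if sender in whitelist:
        return False
    # Count how many known nicks the message highlights.
    words = {w.strip(",:;") for w in message.split()}
    mentioned = words & set(channel_nicks)
    if registered:
        limit *= 2  # be more generous for folks with registered nicks
    return len(mentioned) > limit
```

A percentage of channel membership, as suggested above, could replace the fixed `limit` just as easily; this sketch only shows the shape of the tunables.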
19:20:29 #agreed put https://review.openstack.org/#/c/531242/ IRC Bot improvements up for vote this week and if dmsimard -1s we'll take another pass
19:20:47 #topic Priority Efforts
19:20:53 #topic Storyboard
19:20:53 yah - that was the original motivation for errbot - but the amount of work has resulted in no progress since we started talking about it around the paris summit
19:21:13 this is a quick one
19:21:22 fungi has a topic for storyboard: "should we allow search indexes to index storyboard"
19:22:01 in the course of the -dev ml thread covering tripleo's testing out of sb, i noticed that we have a robots.txt denying indexing by web crawlers on sb.o.o
19:22:34 a little archaeology indicates that has been baked into the webclient source/build since its very initial commit
19:22:45 :)
19:22:48 yah, seems like something we should be allowing, helping users find the reports
19:23:17 subsequent discussion in #storyboard indicates that it may not be particularly straightforward since the content is rendered via javascript calling the sb api
19:23:23 indexing of #! based sites is not easy
19:23:29 maybe we want to unblock it on storyboard-dev first, make sure that it doesn't cause the server to fall over then do it on storyboard.openstack.org?
19:23:32 though google changed how they do stuff with that recently
19:23:34 fungi: oh right
19:23:35 it may be easier
19:23:42 there was a suggestion that google at least does some of this, yeah
19:23:46 (apparently, the google indexer can now render js pages with some fidelity)
19:24:30 i'd be less keen on allowing indexing of sb-dev.o.o because we could end up with misleading stale copies of stories and other test story content turning up in web searches
19:24:36 probably someone needs to do a little bit of research and determine if it's feasible and not destructive to enable with respect to the current state of the art, or whether we need to make storyboard changes
19:24:45 fungi: because we put duplicate test stories in there?
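For reference, the change under discussion is a small robots.txt flip; the directives below are the standard Robots Exclusion Protocol forms, but the exact content shipped by the webclient is an assumption, not copied from the storyboard source.

```text
# What the webclient has shipped since its initial commit
# (deny all crawlers):
User-agent: *
Disallow: /

# What allowing indexing on storyboard.openstack.org would look like
# (an empty Disallow permits everything), while sb-dev.o.o would keep
# the deny-all version above:
User-agent: *
Disallow:
```

The open question in the channel is not the robots.txt itself but whether crawlers can usefully index the javascript-rendered `#!` pages once allowed in.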
19:24:50 or even the test migrations I guess
19:25:15 clarkb: both testing migrations there and importing copies of the production db
19:25:22 yeah I definitely think dev should stay unindexed
19:25:39 our routing is old anyway so it would be nice to update that more generally
19:25:42 in that case we need to watch production carefully when we make the change but that should be doable and is an easy change to undo
19:26:08 but I'm in favor of that
19:26:41 yeah i'm not super sure how to test whether this would have a negative impact without just doing it and keeping a close eye on the server resources
19:27:28 do we have a volunteer for doing that? watching cacti and general server responsiveness from your local browser should get you most of the way there (so doesn't require root)
19:27:37 then if things really go sideways or we need more info a root can dig that up
19:28:08 I should be able to if someone sends me links
19:28:18 (or I can go digging)
19:28:23 i can volunteer to be infra-root primary point of contact for that stuff too since i opened the can of worms to start with
19:28:39 Zara: fungi thanks! I think that means fungi can dig up links
19:28:40 and try to keep an eye on it during waking hours
19:28:45 great, thanks! :D
19:28:51 yes, i'll put together links Zara
19:29:01 anything else on this or are we good to move on to zuul v3?
19:29:12 that's all i had
19:29:24 _o_
19:29:25 * mordred owes Zara another patch ...
19:29:37 #topic Zuul v3
19:29:54 quick note that I almost have the zuul v3 issues etherpad cleared out and migrated to other data tracking systems
19:30:04 hope to finish that up today
19:30:14 fungi is working on making a bigger zuul v3 instance
19:30:25 fungi: anything you need to help move that along?
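The "watch general server responsiveness from your local browser" approach mentioned in the storyboard discussion can also be approximated from the command line; a rough stdlib-only sketch (the URL and polling cadence are placeholders, not anything agreed in the meeting):

```python
import time
import urllib.request

def measure(url, timeout=10):
    """Return the seconds taken to fetch url -- a crude responsiveness probe
    to eyeball alongside the cacti graphs, no root access required."""
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        resp.read()
    return time.monotonic() - start

# Hypothetical usage: poll every minute and watch for the trend to worsen
# after the robots.txt change lands.
# for _ in range(60):
#     print(measure("https://storyboard.openstack.org/"))
#     time.sleep(60)
```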
19:30:27 yep, and relearning puppet apparently
19:30:40 clarkb: nothing else i know about yet
19:31:02 infra-root ^ helping that is somewhat important so that we can stabilize zuul.o.o
19:31:09 will hopefully have it booted shortly after the meeting
19:31:17 ++
19:31:27 The other thing I had on the agenda to talk about is merging feature/zuulv3 into master
19:31:32 then i'll rebase the remaining system-config change for cacti/firewalls
19:31:46 fungi: cool, sounds like it should be done soon then
19:31:54 here's hoping
19:32:00 #link http://lists.openstack.org/pipermail/openstack-infra/2018-January/005790.html Communicating a feature/zuulv3 merge plan
19:32:06 #link https://etherpad.openstack.org/p/zuulv2-outstanding-change-triage Triaging existing changes on v2 master branches
19:32:29 i added that last one to the agenda mainly to get a touchpoint on what we want to do
19:32:40 I think we are really close to being able to merge the feature/zuulv3 branch at least from a technical standpoint
19:32:50 the etherpad above was news to me (or I had forgotten about it at least)
19:33:12 have we declared bankruptcy on that?
19:33:29 i think it's worth going through old changes and seeing what might be relevant
19:33:34 it's a lot of changes open on master and not only do we generally have enough trouble reviewing urgent changes in a timely fashion, i at least feel like i lack sufficient context to judge which ones would still be relevant in v3
19:33:50 I am hoping to push up changes needed to bring nb01 back online with nodepool feature/zuulv3 running. Will need some changes to nodepool.yaml syntax first
19:34:01 ok so maybe take a step back on the thursday merge goal (as I expect going through all those changes will take longer?)
19:34:06 oh no sorry
19:34:16 we're not merging anything to the current master
19:34:18 corvus: oh do you mean look through them and forward port?
19:34:47 pabelanger: what's changed in the nb* configs?
19:34:58 i.e. master branch is frozen for further commits until feature/zuulv3 is merged to master
19:34:59 i mean look through them and triage. we can abandon lots of them. the rest we can indicate are good targets for forward-porting.
19:35:07 corvus: got it
19:35:13 i think that can happen at any time -- even after the v3.0 release
19:35:22 ya shouldn't need to happen before the merge
19:35:30 or at any other specific time
19:35:33 but i do think that we owe it to our contributors to do it at some point. at least spend a few seconds on each one.
19:35:41 granted many (most?) will merge-conflict if they don't already
19:35:50 Shrews: nodepool-builder still using old syntax, needs to be updated to new format
19:36:01 Shrews: and switched to python3
19:36:01 I can try taking a first pass through it and categorize once I am done with the zuulv3-issues etherpad
19:36:28 i started trying to go through those back in december but was honestly pretty lost
19:36:40 #action clarkb to take a pass through old zuul and nodepool master branch changes to at least categorize changes
19:36:40 Shrews: old: http://git.openstack.org/cgit/openstack-infra/project-config/tree/nodepool/nodepool.yaml new: http://git.openstack.org/cgit/openstack-infra/project-config/tree/nodepool/nl01.openstack.org.yaml
19:36:42 clarkb: thanks
19:36:46 maybe we can establish a tree
19:37:09 Shrews: and we cannot use nb01.openstack.org.yaml as it doesn't have all providers listed for uploading images
19:37:23 other than the old master changes are there any concerns related to merging the two branches on nodepool and zuul? I put details on what I think the process will be in the mailing list
19:37:28 I also think we want to do both at the same time
19:38:02 clarkb: agreed, both at once. and yeah, i think we'll just have a bunch of changes we'll need to re-propose.
19:38:59 corvus: might also be worth an email to the openstack dev list and the new zuul announce list saying "we are doing this" maybe send that nowish?
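The nodepool.yaml syntax change pabelanger describes (the old combined config vs the new per-service split) looks roughly like the sketch below. This is a simplified illustration, not copied from the project-config files linked above; field names and values are invented.

```yaml
# Old style (v2): one nodepool.yaml drives image builds and node
# launches together, so every provider carries both images and labels.
#
# providers:
#   - name: example-cloud
#     images:
#       - name: ubuntu-xenial
#     labels:
#       - name: ubuntu-xenial
#         min-ready: 1

# New style (v3): the builder (nb01) reads a config with top-level
# diskimages plus every provider that images must be uploaded to...
diskimages:
  - name: ubuntu-xenial
providers:
  - name: example-cloud
    diskimages:
      - name: ubuntu-xenial
# ...while each launcher has its own file (e.g. nl01.openstack.org.yaml)
# listing only the pools and labels that launcher serves. This is why a
# launcher file like nl01's can't double as the builder config: it won't
# list all of the upload providers.
```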
19:39:02 we can let folks know that if they had open changes on v3, they'll need to repropose them.
19:39:37 clarkb: sounds good, i'll do that.
19:39:40 thanks
19:39:54 #action corvus send email to openstack-dev, openstack-infra, zuul-announce about feature branch merge
19:40:10 maybe also third-party-ci-announce?
19:40:13 ++
19:40:52 any other zuul items before we move on?
19:40:56 nak
19:41:04 #topic General topics
19:41:14 Thank you everyone that helped get us patched against meltdown
19:41:27 as of this morning we no longer have any instances that ansible inventory knows about running old kernels
19:41:35 (we had to delete a couple instances that were no longer used)
19:41:40 and nothing has melted down yet
19:41:42 pabelanger also finished up infracloud patching
19:42:04 it's possible the zuul executors are a bit... overheated... :)
19:42:05 though we have a sinking suspicion it may be to blame for increased load on the zuul executors?
19:42:09 right, that
19:42:10 while nothing has properly melted down I think we have seen some extra performance impact
19:42:12 ya
19:42:40 we should keep an eye out for that, unfortunately I think our only way to address that is bigger instances/more instances
19:43:06 supposedly google managed to patch with minimal performance impact but not sure they have shared those details publicly
19:43:19 and also spectre patches are rolling out to cpus and they may too have more performance impacts
19:43:26 it's a fun time all around
19:43:40 "fun"
19:44:03 for masochistic definitions of that word
19:44:16 thanks again for everyone helping out on that
19:44:29 the next item is somewhat related to that: Infracloud
19:44:34 until the next firedrill
19:44:51 during meltdown patching we lost the infracloud controller node due to bad disks (we also lost at least 2 compute hosts)
19:45:12 the initial idea we had when patching last week was to rehome vanilla computes to the chocolate controller
19:45:20 but this morning an interesting email was forwarded to me
19:45:25 "Unfortunately, HPE will not move forward with the donation of hardware at this time as there is a need to use this equipment for other organizational support purposes."
19:45:32 oh. my
19:45:35 really?
19:45:37 ouch
19:45:40 this has not been communicated directly to any of us from what I can tell
19:45:44 they think they want to use those computers?
19:45:51 mordred: ya that was my reaction :)
19:46:09 BUT it is an indication we won't have the hardware for much longer and I think we should plan for that
19:46:14 ++
19:46:32 agree
19:46:33 how hard is it to re-home choc compute?
19:46:34 I'm inclined to just get my fix to nodepool for handling cloud outages in then use chocolate until they take it away from us
19:46:48 er, backwards
19:46:48 corvus: I'm not sure. I think we have to update the config on all the computes then accept them on the controller?
19:46:54 yeah, this was in a heads-up i received from our contact at osu-osl whom we'd been talking to about relocating the deployment there
19:47:14 Yah, i think it would just be config updates to add vanilla into chocolate
19:47:15 is it maybe worth doing that and keeping it up until they pull the plug?
19:47:16 so basically hpe told osu-osl this, but not us... no idea what the story really is
19:47:24 busy time and all
19:47:42 corvus: maybe? the vanilla hardware is far more flaky than chocolate based on patching experience
19:48:13 I mean, I'm happy to try adding vanilla to chocolate, if we want to give it a shot
19:48:16 there is a possibility that bad hard drives and/or ram would cause problems if running on vanilla computes. I think we want to monitor that closely if we rehome them
19:48:23 I won't stop anyone from doing it :)
19:48:37 just want to make sure we have a common understanding of what may end up happening there soon
19:48:50 ok, maybe not worth it
19:49:17 part of the complexity with the hpe/osu/osf triangle is that hpe can't donate hardware to a 501c(6) like osf, but can donate it to an educational institution like osu
19:49:29 I mean, they can't delete the domain name, so we have that going for us :D
19:49:43 so some communication was bypassing openstack people entirely and going between hpe and osu
19:50:44 I do think if people want to learn more about openstack they should have it, don't let me stop you. But I don't think it needs to be a priority.
19:51:07 (patching meltdown on infracloud felt like a priority last week due to the way meltdown could be exploited by VMs)
19:51:29 it's also a great opportunity if we have another organization who wants to donate hardware/hosting we might reboot it on
19:52:03 any other infracloud concerns/ideas/topics?
19:52:03 we've heard from some off and on but nothing has really panned out so far
19:52:09 fungi: ya
19:52:31 fungi: yah, there have been some talks about what to do with tripleo-test-cloud-rh1 after OVB jobs move into RDO, but nothing 100% yet
19:53:05 only 8 minutes left /me continues on
19:53:08 #topic Project Renames
19:53:18 mordred: fungi ^ any new progress on making these doable with zuul v3?
19:53:31 I think I had changes pulled up to review last week then meltdown patching happened and my brain turned to mush
19:53:41 i haven't touched it at all, no
19:54:22 clarkb: I think the jeblair patch to remove names is the biggest thing
19:54:45 mordred: is that in or do i still need to go review it?
19:54:56 * clarkb seems to recall the zuul changes made it in, now we have to update configs?
19:54:57 the patch to remove the need for names is in and running
19:55:06 I think he patch-bombed already too
19:55:23 woot
19:55:25 we also may need to make non-existent required-projects non-fatal from a zuul config loading perspective
19:55:36 oh right that was the thing that came up last week
19:55:43 fatal to the job but not zuul as a whole
19:55:46 no i haven't patch-bombed; i mentioned that earlier. but we don't have to wait on that -- just go ahead and remove the project name from any projects that need renaming.
19:55:54 and there was talk about ditching the system-required template in favor of implicit templates?
19:56:00 jobs that are referencing renamed projects in required-projects will still need to be fixed for the job to work - but there is a difference between a broken job and zuul not being able to start
19:56:07 corvus: ++
19:56:28 coincidentally, we found another reason to make required-projects non-fatal -- it lets us do some cool things with cross-repo-dependencies
19:56:38 corvus: is that something you think you'll be adding too then?
19:56:44 can we stop using system-required? afaik it does nothing at the moment because we're no longer running merge-check jobs
19:56:50 (we don't need a volunteer for the feature?)
19:57:04 so i think that's likely to happen (it's got 2 things going for it now)
19:57:19 sounds like we might be able to do renames soon then :)
19:57:26 #topic open discussion
19:57:36 really quickly before we run out of time? anything else?
19:57:41 and if we need something like system-required in the future we can switch to an implicit template at that time?
19:57:49 how urgent is it? can we do a rename without that? could we say we'll do a rename with disabling/re-enabling the job for now?
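The "remove the project name" step corvus describes is a small in-repo config change: with the patch in place, zuul infers the project from the repository the file lives in, so the stanza no longer hardcodes a name that breaks on rename. A hedged sketch (repo and job names invented for illustration):

```yaml
# Before: the project stanza names its own repository, which has to be
# updated in lockstep with any rename.
#
# - project:
#     name: openstack-infra/example-repo
#     check:
#       jobs:
#         - example-unit-tests

# After: the name is simply dropped and zuul fills it in implicitly.
- project:
    check:
      jobs:
        - example-unit-tests
```

The separate required-projects concern remains: other repos' jobs that list the old repository name under required-projects still need fixing, which is why making that non-fatal to zuul as a whole (but fatal to the job) came up.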
19:58:06 ianw and linaro are working on uefi + gpt support in dib so that we can build arm64 images
19:58:17 mordred: ^ I think you have the best grasp of how badly that affects your rename
19:58:42 looking to start gerrit upgrade testing for next week on review-dev.o.o
19:58:43 i don't suppose dropping system-required is urgent, just simplifies project renames and new project creation to have on fewer place to potentially need to list a project
19:59:01 s/on fewer/one fewer/
19:59:02 i'm hoping to do that soon, but honestly, i'd like to devote more time to 3.0-blocker reviews and memory improvements first, if possible. that would probably set it back a few weeks.
19:59:11 corvus: looking at codesearch, I think the one I was most concerned about from a rename perspective is only listed in required-projects in jobs in 2 repos
19:59:42 corvus: so I think yes, removing/altering the test jobs then doing rename seems fine
19:59:46 pabelanger: maybe we can sync up on that again this week as I will be largely afk next week
19:59:46 will keep fiddling on gpt stuff; we also got new node builds out with pti ... i guess nothing has exploded yet
20:00:08 ok. i'll keep it in mind as a target of opportunity, but not put it at the top of the list.
20:00:25 corvus: mordred fungi sounds like a plan, maybe we can schedule a project rename next meeting?
20:00:25 clarkb: sure, recap would be fine
20:00:30 and now we are out of time
20:00:33 clarkb: wfm
20:00:34 Thank you everyone!
20:00:36 clarkb: thanks!
20:00:38 thanks clarkb!
20:00:41 #endmeeting