19:01:41 #startmeeting infra
19:01:42 Meeting started Tue Aug 27 19:01:41 2019 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:43 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:45 The meeting name has been set to 'infra'
19:01:55 #link http://lists.openstack.org/pipermail/openstack-infra/2019-August/006458.html Our Agenda
19:02:00 the bot is super helpful
19:02:07 #topic Announcements
19:02:11 if only it could read your mind
19:02:23 OpenStack election season is upon us
19:02:38 fungi has mentioned he will be busy helping there quite a bit this time around
19:02:54 also now is your chance to run for PTL of $project or TC if you would like to
19:03:07 I think nominations open sometime today?
19:03:10 (or just run screaming?)
19:03:29 yeah, in another 4 hours and change
19:03:36 #link https://governance.openstack.org/election/
19:04:01 Also this weekend is a long one for many of us (I'll not be around Friday though may actually work the holiday itself to make up for Friday)
19:04:13 though folks are welcome to push up nominations ahead of time, the officials just won't confirm them before the start of the official nomination period
19:04:35 fungi: gotcha
19:04:45 yeah, i've been informed that this weekend is when i'll be getting all sorts of construction projects done around the house
19:04:53 so i don't expect to be on hand much
19:05:01 we have family/friends in town so are doing things with them
19:05:24 #topic Actions from last meeting
19:05:31 #link http://eavesdrop.openstack.org/meetings/infra/2019/infra.2019-08-20-19.01.txt minutes from last meeting
19:05:43 ianw did do an audit of the static server, which is an item further on in the agenda
19:05:52 let's keep moving and we'll talk about it there instead
19:06:00 thanks ianw!
19:06:00 #topic Priority Efforts
19:06:17 #topic Update Config Management
19:06:48 Anything to add under this topic? I've been distracted by other things the last week or so and haven't noticed much from others
19:07:15 i think focus has been elsewhere
19:08:16 #topic OpenDev
19:08:50 Similar to config management updates I think things have gotten quiet here too. I've not had time to reload my golang stuff into memory to run tests on my fix for that or file the bug I talked about last week
19:08:59 I suppose I should just go ahead and file the bug/do a PR
19:09:07 the updates to publication jobs are sort of on topic here
19:09:36 ajaeger's updates to use the promote pipeline?
19:09:43 but again probably better covered under the separate topic entry
19:09:59 yeah, they're somewhat intertwined with ianw's review of static hosting as well
19:10:04 gotcha
19:10:10 We'll keep moving then
19:10:19 #topic Storyboard
19:10:34 i was able to work out how to reenable slow query logging
19:11:03 and how to use the pt-query-digest tool corvus and mordred found
19:11:40 and so i have an updated analysis, i'm just now fiddling with a local openafs issue so i can post the results somewhere consumable
19:12:00 excellent news.
19:12:15 * SotK looks forward to seeing it, thanks
19:12:18 fwiw I almost always have local afs issues because I run tumbleweed, and usually just use one of the fileservers directly
19:12:47 the in-kernel openafs stuff that ianw has been helping along should help my install quite a bit though
19:13:17 i almost never have afs trouble, but for some reason today i'm getting connection timeouts on operations
19:14:06 #topic General Topics
19:14:18 First up is server upgrade progress
19:14:23 #link https://etherpad.openstack.org/p/201808-infra-server-upgrades-and-cleanup
19:14:30 aha, and afs fixed after an lkm reload and a client service restart
19:14:46 fungi: the next step with wiki-dev is to redeploy it with the split out static content and the bumped release version?
19:15:07 shouldn't have been a need to redeploy since the static content was never there
19:15:25 what about the upgrade? I guess that just updates the git checkout and submodules?
19:15:41 but i've now overwritten the wiki-dev trove db contents with the most recent mysqldump from the production server
19:15:56 and copied in a tarball of the images tree from production to the new cinder volume on wiki-dev
19:16:28 and yeah, a redeploy might not hurt, or i can just blow away /srv/mediawiki and let puppet put it back again to be sure
19:16:42 was going to do that next because content currently isn't loading
19:16:48 it seems the server is serving a blank page currently?
19:16:53 ya
19:17:17 after that, i'll be diffing the production Settings.php (mediawiki's primary config file) against the templated one puppet is installing on wiki-dev
19:17:26 and coming up with a patch to update it accordingly
19:17:40 sounds good, thanks
19:18:04 oh, also the old wiki-dev server has been deleted, since there's nothing of value on it anyway
19:18:40 and on the new server the only state is being stored in trove and cinder now
19:19:05 whereas before it was the root disk and trove right? that should simplify future replacements to have the static resources on cinder
19:19:11 yeah
19:19:13 though we might want to look soon at moving the db to a local mysql and out of trove
19:19:21 but that's less critical
19:19:45 first i want to get a working wiki deployed from puppet-mediawiki and then blow away the old production server and replace it
19:19:58 and then we no longer have that liability hanging over us
19:20:04 ++
19:20:08 after that we can look at further upgrades
19:20:22 Next up is ianw's audit of static.o.o services (which ties into server upgrades because static needs upgrading or replacing)
19:20:37 #link https://etherpad.openstack.org/p/static-services service audit
19:21:15 ianw: it sounds like you want to try using haproxy as a lightweight http redirector
19:21:23 so yeah, last week was an action item to look into what to do
19:21:45 fwiw, we have redirect sites hosted on files.o.o already
19:21:56 that was one thought, to get the legacy redirects it currently does off it
19:22:11 #link https://review.opendev.org/#/c/677903/ Generic HaProxy ansible configs
19:22:19 #link https://review.opendev.org/678159 Service load balancer/redirector
19:22:19 and also to provide another load-balancer we could expand a bit but that doesn't overlap with gitea
19:22:24 are the changes related to that
19:22:41 curious why haproxy is preferred over the ones we're already serving with apache
19:23:09 fungi: well, i'm just thinking that the future isn't static servers set up with apache
19:23:27 instead it's... static servers set up with haproxy? ;)
19:23:53 well it's containerised haproxy, so sort of one step above that :)
19:24:20 ianw: what do you think we should do with the sites that are more than redirects?
19:24:27 got it. does containerizing haproxy have any benefits over just installing the packaged haproxy?
19:24:32 or feel free to say you don't have an opinion on that yet :)
19:24:45 fungi: it's currently the only way we are deploying it, so consistency I would guess
19:24:46 that's the other question; is there a reason the other sites can't use afs-based publishing?
19:25:06 clarkb: what's currently the only way we are deploying?
19:25:16 fungi: haproxy in a container deployed with docker-compose
19:25:21 fungi: we don't deploy haproxy from packages
19:25:32 ianw: i had always assumed we were moving them all to afs and making them vhosts on files.o.o
19:25:40 just hadn't found time to do that yet
19:25:51 clarkb: oh, i get you
19:26:01 haproxy specifically
19:26:01 ok, so if it's just "haven't found time" vs "can't be done" that's good
19:26:02 ya I think moving to afs volumes is likely a reasonable long term strategy. Before that happens we may want to sort out some of the vos release problems we've had recently though?
19:26:31 also I think the xenial to bionic upgrade for the fileservers will require outages because we can't mix the versions of afs on the fileservers like that? which is a risk/downside but not a reason to avoid it
19:27:40 ianw: the tricky bits are likely going to be tarballs (if we want to roll it into the new tarballs.opendev.org volume, because of a slightly different layout lacking prefix namespacing) and governance (because it will need something akin to the root-marker setup we use to keep documentation subtrees from wiping each other)
19:27:40 so, i think this is probably all pointing towards "static.o.o" is not upgraded, but retired
19:28:30 right, i would much prefer the outcome where static.o.o is simply no longer in use and we delete it
19:28:49 extra effort to upgrade it seems like a waste
19:29:23 ok, sounds like the next steps can be "move redirect hosting of current static.o.o redirects" and "move one vhost with files from static.o.o to afs", then "move the other vhosts to afs"
19:29:42 with the goal of simply deleting static.o.o in the nearish future
19:30:19 yeah, i mean if people wish to put names against parts of https://etherpad.openstack.org/p/static-services ...?
19:30:42 ianw: maybe we can put a TODO list at the bottom that distills the info above?
19:30:50 but then ya we can start grabbing work off that TODO list once it is there
19:31:00 sure
19:31:23 thanks!
19:31:33 we can think about the generic load balancer (with redirects) into the future ... i'm not tied to it
19:31:34 if we're going with the plan of doing redirects with haproxy instead of apache, then we probably need to include in that a plan to move our other redirects to the same new solution for consistency
19:32:05 i don't personally see the benefit, but will admit i haven't looked super closely at the proposed change yet
19:32:05 ya I think to start, reusing the redirects on files.o.o is likely the simplest thing
19:32:21 but like fungi I should review the changes before committing to that :)
19:32:33 if the implementation of doing redirects in haproxy is simpler and cleaner then i could be convinced, surely
19:32:48 fungi: that is one nice thing, it's all just in one config file
19:32:55 rather than split into vhosts
19:33:17 well, apache could be done that way too if we wanted
19:33:39 i think it makes most sense if it was also a front-end doing some other lb tasks for other services too
19:33:43 one vhost with all the redirect names as aliases, and then individual redirect rule groups for the various targets
19:33:58 fungi: how does that work with ssl?
19:34:11 would that force us to use multiname certs?
19:34:47 nope, could just use sni
19:34:51 none of the ones on static.o.o have provided ssl anyway, which simplifies things a bit
19:34:54 we do that on it already in fact
19:34:58 ianw: oh that is good to know, thanks
19:35:06 fungi: right but isn't a cert usually mapped to a single vhost?
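The "all just in one config file" approach ianw described can be sketched as a single haproxy frontend doing host-based redirects. This is a minimal illustrative example, not the actual proposed opendev config; the hostnames and targets here are hypothetical assumptions:

```
# one frontend handles every legacy hostname; no per-site vhosts needed
frontend redirects
    bind *:80
    # 301 each legacy name to its new home (hostnames are illustrative)
    redirect prefix https://docs.opendev.org code 301 if { hdr(host) -i legacy-docs.openstack.org }
    redirect prefix https://tarballs.opendev.org code 301 if { hdr(host) -i tarballs.openstack.org }
```

The apache equivalent fungi mentioned (one vhost with all redirect names as ServerAlias entries plus per-target rewrite rule groups) achieves the same thing; the haproxy form just keeps every mapping in one flat file.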
19:35:21 (I'm not sure how to use multiple certs in one vhost for different names)
19:35:30 oh, good question, i'd have to double-check what the options are there
19:35:32 sounds like it is a non-issue in this case, thankfully
19:36:07 well, a non-issue at the moment, but if we want to start redirecting more openstack.org sites to opendev.org in the future it could become relevant
19:36:15 ya
19:36:22 but we can also sort that out outside the meeting
19:36:33 let's note that as a potential item to solve for any redirecting setup and move on
19:36:51 anyway, i agree that we should consider changes in how we host site redirects as a separate effort from moving stuff off static.o.o so it's not a blocker
19:37:30 Next up is the Fedora mirror's vos release struggles
19:37:45 #link http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2019-08-26.log.html#t2019-08-26T07:25:16
19:38:23 yeah, right now it's releasing again ... it seems to finish but something is wrong with it taking this long
19:38:27 ianw: I like the idea of improving our debugging. We might also be able to tcpdump those transfers (though that may produce way too much data?)
19:39:03 worst case, can we mirror into a new volume and then dump the old one?
19:39:05 audit logging might be the next step; see
19:39:09 #link https://review.opendev.org/#/c/672847/
19:39:25 although, !0 chance restarting the servers might just fix things mysteriously too
19:39:34 but yeah. knowing *why* it's doing this would be good
19:39:47 since if it's happened once it could quite possibly happen again
19:39:47 ianw: magic
19:39:56 #link https://review.opendev.org/#/c/678717/
19:39:58 fungi: it has happened several times now so I expect it will keep happening
19:40:04 updates the logging to have more timestamps
19:40:28 "has happened several times" — that vos release is slow, or that we've made the problem go away and then it's come back?
19:40:29 especially timestamping logs of vos release might be useful when correlating across audit logs
19:40:36 great, I'll do my best to review those quickly
19:40:37 and for different volumes or just this one?
19:40:42 the slow vos release is new i think
19:40:50 it just not working in various ways is not :)
19:41:01 fungi: many of them were sad at the end of july but this is the only one that has been persistent about it
19:41:25 fungi: epel, suse, fedora, and I think something else all went out to lunch ~july 26
19:41:39 then ianw and I fixed them a couple weeks ago and fedora is the only one that continues to struggle aiui
19:42:10 this seems like a reasonable path forward (more debugging info) and let's see what we catch
19:42:18 i think that's right, given opensuse seems to be upstream issues
19:43:39 We have ~2 topics left, let's keep moving. But please do review those two changes when you get a chance, infra-root
19:43:55 Next up: Kayobe needs an opendev.org project rename to be included in the openstack train release?
19:44:08 I feel like we might be conflating two different things there
19:44:23 fungi: ttx ^ can you expand on that from the TC perspective?
19:45:06 yes, kayobe is currently not in the openstack namespace
19:45:10 The two things are "where the project is hosted" and "is this project considered openstack by the TC"
19:45:25 in the past the openstack TC has considered things outside of the openstack namespace to be openstack
19:45:41 i gather the openstack release team isn't comfortable doing release management on kayobe until it's renamed
19:45:47 into the openstack namespace
19:45:50 and we've done our best to reconcile the hosting name delta when it makes sense
19:46:07 (aka when we can take a gerrit downtime)
19:46:35 fungi: ok this is just news to me that we are conflating the two because my memory says we've never done that before
19:46:43 instead we've just done our best to be eventually consistent
19:47:45 i thought we were just going to discuss scheduling the next rename, but i can dig into the details from the release team's perspective and find out if it is related to their automation and tooling needing fixes to deal with projects in the x namespace
19:48:08 though they released things in the openstack-dev (and maybe openstack-infra) namespaces so i would be surprised if that's the case
19:48:16 ya the struggle is getting it done early enough if that is indeed a requirement for them
19:48:44 with all the other end of cycle stuff going on we've typically not done renames until early the next cycle
19:49:00 however we could probably make it work R-4?
19:49:03 that looks like a quiet week
19:49:20 and is pre RC1 deadline
19:49:38 Would have to be early that week for me as I'm traveling at the end of that week
19:49:54 maybe we pencil in September 16 and run that by the release team?
19:50:25 yeah, no mention of "kayobe" in the last release team meeting logs
19:50:26 * mordred waves to clarkb
19:50:33 * mordred apologizes for tardiness
19:50:43 the only mention of them in #openstack-release was on august 1
19:51:10 in that case maybe that is our best option, aim for the 16th and bring that up with the release team to see if that works for them
19:51:23 mordred: ^ does the 16th work for you to do project renames?
19:51:39 mordred: might be good to have you and/or corvus around due to the relatively recent changes to gitea management
19:52:14 aha, seems to be related to this:
19:52:15 clarkb: I will be back from vacation on the 16th - although I'll be in EU timezone
19:52:16 #link https://review.opendev.org/673234 Add deliverable file for Kayobe
19:52:24 so - yes - pending hour of day
19:52:35 mordred: early morning is probably best anyway since it is early in the week (getting done sooner is better)
19:52:51 (speaking of - I'm on vacation 31/8-14/9)
19:52:55 okay, so if we don't rename first, then the deliverable name will have to change in releases after the rename
19:52:55 I mean my early morning
19:52:55 clarkb: woot
19:53:05 clarkb: yeah - your early morning will work great for me
19:53:15 * mordred enjoys ambushing clarkb with tons of things in his early morning
19:53:22 also it's going to be a pain for upcoming election work because governance now refers to repositories which do not exist
19:53:48 fungi: well the change hasn't merged, right? so no reference yet?
19:53:55 oh wait, governance, not releases
19:53:56 got it
19:54:11 fungi: the 16th is after polling begins
19:54:20 fungi: so your voter rolls should be complete by then
19:54:24 hopefully that simplifies things?
19:55:19 if governance says openstack/kayobe then that seems buggy fwiw
19:55:28 should say x/kayobe, then update after we rename
19:55:36 clarkb: that change isn't what matters
19:55:53 #link https://review.opendev.org/669299 Add kayobe as a kolla deliverable
19:56:04 that added nonexistent deliverable repos to openstack governance
19:56:28 those repos do exist
19:56:28 now i likely need to get the tc to rush through an un-renaming of those in governance
19:56:43 oh, nevermind
19:56:53 I think if we do the rename on the 16th that solves the election problem
19:56:56 i mistakenly thought they'd been added with the openstack namespace
19:56:59 but I don't know if it solves any releases problems
19:57:02 so yes, elections will be fine
19:57:05 * fungi sighs
19:57:18 this is all stuff i should have thought to research before the meeting, sorry
19:57:28 Let's aim for the 16th and bring that up with the release team
19:57:38 #topic Open Discussion
19:57:46 and now we have ~2.5 minutes for any other topics
19:58:04 I will be venturing into the city to take photos for a visa I need to apply for today. I'll be AFK for however long that takes
19:58:14 then back again to review the changes ianw pointed out earlier that need review
19:58:23 clarkb: have fun with that!
19:58:40 one quick note on pbx.openstack.org ... i noticed yesterday it was dead (looking at cacti)
19:58:53 it was under a huge pbx spam attack that flooded the logs i think
19:59:00 Also be aware that there is a zuul bug that can cause zuul to test the wrong commit under special circumstances. Fixed here https://review.opendev.org/#/c/678895/3
19:59:17 this is a thing ... http://www.voipbl.org/ is all about mitigating it a bit
19:59:18 I don't think the risk for that is super high right now but will be as soon as openstack starts branching stable/train
19:59:35 ianw: of course there is a voip pbl :)
19:59:42 out of interest i installed their rules, but then realised it's 85,000+ iptables blocks
19:59:45 ianw: is it pretty simple to add that to asterisk?
20:00:16 and we are at time. Thanks everyone
20:00:19 #endmeeting
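On the 85,000+ iptables blocks ianw mentioned: lists that size are usually loaded as a single ipset set so one iptables rule can match the whole list instead of 85,000 individual rules. A hedged sketch in ipset-restore format (the set name and example networks are illustrative, not taken from voipbl):

```
# load with: ipset restore < voipbl.set
create voipbl hash:net family inet hashsize 65536 maxelem 131072
add voipbl 192.0.2.0/24
add voipbl 198.51.100.0/24
# then a single iptables rule matches every entry in the set:
#   iptables -I INPUT -m set --match-set voipbl src -j DROP
```

Matching against a hash:net set is roughly constant-time per packet, which is the main reason it scales where a long linear iptables chain does not.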