19:01:04 #startmeeting infra
19:01:04 Meeting started Tue Mar 16 19:01:04 2021 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:05 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:07 The meeting name has been set to 'infra'
19:01:20 fungi: just be careful around brick walls
19:01:26 oww
19:01:38 #link http://lists.opendev.org/pipermail/service-discuss/2021-March/000196.html Our Agenda
19:01:41 o/
19:01:44 #topic Announcements
19:01:58 I will be out next week. ianw has offered to chair the meeting on the 23rd. Thank you ianw!
19:02:28 Also the DST change has happened in North America but the EU, Australia, and I'm sure others are still a few weeks away so timezones are weird right now. Double check your meeting calendar :)
19:02:50 #topic Actions from last meeting
19:02:59 #link http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-03-09-19.01.txt minutes from last meeting
19:03:34 I didn't re-add corvus' unfork jitsi action, but I did push a new patchset that is passing testing now. Maybe someone other than corvus or myself can review that change and we can get it in?
19:03:43 #link https://review.opendev.org/c/opendev/system-config/+/778308 unfork jitsi meet
19:03:51 fungi: ^ looks like ianw already reviewed it, maybe you have time for that today?
19:04:35 oh, yup
19:04:40 after dinnert
19:04:42 thanks
19:04:47 #topic Priority Efforts
19:04:48 which is like dinner and tea?
19:04:54 #topic OpenDev
19:04:56 mmm tea
19:05:09 I have account inconsistency updates
19:05:27 since we last had our meetings I managed to correct all of the "preferred email address doesn't have an external id" consistency issues.
19:05:36 That means the only issues we have now are with conflicts in external ids
19:05:52 I also corrected about 100 of those or so and the total number of conflicts is down to 545
19:06:23 I've since started looking at categorizing the next sets of accounts so that we can decide if we can safely clean more up and I think I have identified a number of accounts where this appears safe.
19:06:38 those notes can be found at review:~clarkb/gerrit_user_cleanups/notes.20210315 and if other infra-root can take a look that would be great
19:07:21 They are tiered from what I expect are super safe cleanups to less safe cleanups but all seem fairly safe. But feedback on where the line seems to be for contacting people first would be appreciated
19:08:21 also fungi and I found more correlation of manage-projects to gitea sadness which prompted https://review.opendev.org/c/opendev/system-config/+/780904
19:09:07 essentially it undoes our always update descriptions change and offers some further improvement suggestions in the commit message
19:09:15 it does pass testing now so I think we can start reviewing that
19:09:33 anything else opendev related or should we move on?
19:11:09 #topic Update Config Management
19:11:19 ianw has change(s) up to ansible krb5 servers
19:11:26 I've reviewed the main change and it lgtm
19:11:40 ianw: do you need more reviews and/or are there newer changes to look at?
19:12:24 well it *should* be a no-op for production
19:12:49 but clearly one i'll want to babysit closely
19:13:11 so if nobody has particular interest your review probably suffices
19:13:13 what's the review topic?
i think i reviewed some of those but not sure if it was the most recent revisions
19:13:27 also possible i forgot to vote on them
19:14:40 #link https://review.opendev.org/q/topic:%22kerberos-server%22+(status:open%20OR%20status:merged)
19:15:21 looks like there are a couple I need to review, I'll look at those after lunch
19:15:35 are there any other config management updates to call out?
19:16:00 i have a little cleanup of sslcert checks in
19:16:02 #link https://review.opendev.org/c/opendev/system-config/+/780140
19:17:47 #topic General Topics
19:17:54 #topic Picking up steam on server upgrades
19:18:26 I hit a little issue with nodepool launcher replacements that required an update to nodepool itself to address (basically its unique launcher name registration wasn't unique enough)
19:19:01 The current opendev hourly deploy is expected to configure nl01.opendev.org. Hoping that transition will be completed real soon now
19:19:48 it is a bit more of a dance as we need to coordinate config changes between the servers as we do it. Though I expect we can do 02-04 in one go if 01 looks happy
19:20:07 #topic Deploy new Refstack server
19:20:29 I suspect that this may be the last time I need to bring this topic up? DNS has been updated to point at the new server and we corrected a couple of issues
19:20:44 ianw: kopecmartin: can we call this done? or do we want to keep it up for things like db backups etc
19:21:03 db backups are done, i fixed that yesterday
19:21:22 if we are happy, we can remove the puppet
19:21:23 excellent
19:21:25 #link https://review.opendev.org/c/opendev/system-config/+/780138
19:21:50 seems like we may have reached that point
19:21:54 have we ripped out the old puppetry and retired the module repo?
19:21:59 thank you to ianw and kopecmartin for pushing this over the finish line
19:22:00 i can leave a todo to cleanup the old server/db in a month or so just for saftely
19:22:03 fungi: that's the change above
19:22:06 safety even
19:22:12 yup, got it
19:22:17 i haven't retired the puppet module
19:22:18 ianw: sounds good
19:22:26 there's probably several we could do that to now
19:22:31 i think it's done, thanks for the help with that
19:24:05 #topic review server upgrade
19:24:11 #link https://review.opendev.org/c/opendev/infra-specs/+/780478
19:24:34 after our discussion last week ianw put together a spec document to help outline the considerations to make when replacing this server
19:24:39 i don't want to tie us up in paperwork over this :)
19:25:18 yeah, i think the spec is pretty complete out of the gate, though seemed like you wanted mnaser to add some info
19:25:44 I think the idea was to use it as a way for mnaser to double check he was comfortable as a hosting option
19:25:54 since it outlines the considerations we should make
19:26:19 yeah, i feel like we are tending towards vexxhost (if they'll have us :)
19:27:02 y'all are welcome to use it
19:27:07 happy to coordinate all this
19:28:50 mnaser: do you think you could look over https://review.opendev.org/c/opendev/infra-specs/+/780478 and double check there aren't any major gotchas or concerns that jump out to you in that?
19:29:12 and if not we'll go ahead and start doing some testing of a host in vexxhost?
19:29:32 i'd suggest backups to be in another region
19:29:52 mnaser: you mean we should move our current backups or that the gerrit host be backed up to a different region?
19:30:07 i think you'll find better hardware in the montreal region
19:30:15 and i'm not sure where the backups are right now
19:30:28 we have a backup server in sjc-1
19:30:30 I think ianw said they are in sjc1 so we should be good if we use montreal for the review host
19:30:45 we don't currently have anything in mtl
19:31:06 ah ok perfect, the rest seems fine to me, obviously suggest using v3 flavors and boot from volume
19:31:49 great, thank you for checking. I guess we can go ahead and spin something up and do comparison of performance?
19:32:02 yep, go for it, and if you see anything odd, let me know
19:32:14 mnaser: will we still need to coordinate ipv6 reverse dns, or you have an interface to set that now?
19:32:29 fungi: no we'll have to coordinate for that
19:32:40 ipv4 too iirc
19:32:44 yeah
19:32:50 but can easily be done
19:32:58 i haven't checked but do we have some quota there for the /home/gerrit2 volume on something fast?
19:33:12 or maybe the question is more, how should i best create that volume
19:33:20 one other thing we probably should have added to the spec, the server *does* directly send e-mail via smtp
19:33:29 fungi: ++
19:33:29 you should be able to create a volume with volume_type=ssd
19:33:43 (all those gerrit watched change/projects notifications to users)
19:33:47 we don't block port 25, so you shouldn't have any problems off the bat other than rdns
19:33:57 great, thanks mnaser!
19:34:24 ok, i can take an action item to start up a server and see how we go
19:34:37 i have a few related changes out
19:34:39 #link https://review.opendev.org/q/topic:%22review-update-2021%22+(status:open%20OR%20status:merged)
19:34:54 removing review-dev from testing is the big one
19:34:56 btw, you can ping me or guilhermesp in #vexxhost
19:35:04 should you need anything
19:35:21 good to know, thanks
19:35:31 thanks, joined :)
19:35:45 anything else on this topic?
19:35:59 so i'll look at a ~96gb instance right?
19:36:04 i.e.
we want to go a bit bigger
19:36:41 ++ particularly since we want to move the db
19:36:52 but that should also give the kernel space for caches again
19:37:00 I expect that will make for a much more responsive experience
19:37:18 i think we have v3-standard-16 which is 16C/64G memory or v3-standard-32 which is 32C/128G memory
19:37:40 mnaser: is it better to just use the bigger flavor or figure out a 24C/96GB flavor?
19:37:46 16C/64G is very close to what we have now
19:37:51 i think it's much easier for you to leap to 32C/128G than me getting 24C/96G up and going
19:37:59 ok that works for me :)
19:38:08 didn't think it would be an issue =)
19:38:39 hrm, it doesn't seem we have an openstackci tenant in mtl?
19:38:44 also with the provisioned port trick, we can always move to a smaller flavor later without an ip address change if desired
19:38:57 ianw: we should unless something changed
19:39:07 ianw: the actual region name is ca-ymq-1 or similar
19:39:23 yeah, mtl is the new name but same region
19:39:27 yes, sorry for the gross name :)
19:39:47 well, ymq is also a local airport ;)
19:39:50 oh! ok, that makes sense
19:40:02 i thought we were talking about another region altogether :)
19:40:22 ca == canada ymq == montreal airport
19:40:34 i would kindly suggest looking at moving zuul over too not far after
19:40:42 just to drop the amount of ingress/egress traffic overall
19:41:06 so i see v3-standard-96
19:41:08 mnaser: though that will shift the ingress/egress to zuul and may not necessarily be the dent you think it is
19:41:22 (definitely something to consider though)
19:41:24 but only v1-standard-128
19:41:25 ah yeah, cause zuul caches stuff mostly
19:41:41 yeah you want to go for v3-standard-32 which is 32C/128G memory :)
19:41:46 # = cpu count and not memory
19:42:13 ok ...
:) i'm getting there it's early
19:42:26 although it's nice when the number of cores can be easily confused with the number of gigabytes of ram :)
19:43:02 Ok let's move on
19:43:06 i think i am going to call it review02.opendev.org (even though we never really had an 01)
19:43:10 ianw: wfm
19:43:18 yeah, move on, getting into the weeds :)
19:43:20 #topic PTG Planning
19:43:22 that'll make some of our dns gymnastics easier, yeah
19:43:42 I've tried to spend a little time brainstorming what a PTG would look like for us
19:43:55 I expect it would be really low key and not all that different than when we chat on IRC
19:44:26 we could chat on irc like usual, and put up a commemorative "ptg" themed /topic?
19:44:28 I'm leaning towards not trying to schedule PTG time for us as a result, but am considering having a few office hours in case there are others who have testing/service/etc questions they want to bring up
19:44:50 but I'm not sure if others agree with me on that assessment
19:45:11 If you'd like to have voice/video time during the week of PTG I'll sign us up and work on a small agenda
19:45:37 about the only thing which comes to mind is we could take it as an opportunity to try to arrange a sprint around something
19:45:54 that isn't a bad idea, and I hadn't considered that actually
19:46:21 do actual coding and reviews rather than primarily be discussion focused
19:47:26 we do also have some new config-core reviewers. maybe it could be a good opportunity to help answer questions they have (that's more in-line with your office hours idea i guess)
19:47:39 fungi: ++ that is the sort of thing I had in mind for office hours
19:47:58 er, not actually new config-core reviewers yet, but volunteers who are helping config-core
19:48:29 I think I'll go ahead and sign us up but note in the survey we intend the time to be office hours type and maybe we'll try sprinting if we can come up with a good topic
19:49:17 #topic Open Discussion
19:49:52 nl01.opendev.org got ansibled.
It didn't auto start the launcher so I'm going to do that momentarily. I don't expect it to cause problems but we'll have two different launchers operating on the rax nodepool providers
19:50:10 the openstack.org launcher has non zero max server values, the opendev.org launcher has 0 max-server values so it should largely be idle
19:50:25 then we can land the change to flip the configs around, then we can retire nl01.openstack.org and clean things up
19:50:27 actually, i guess that last topic was a good reminder, gmann and amotoki have been taking a closer look at config changes, and they've been a big help
19:50:59 please let me know if you see nodepool weirdness today :)
19:51:16 fungi: oh thanks for the heads up. I guess they were part of the volunteer group too?
19:51:36 yep
19:52:08 zuul is running a crazy amount of new code; so keep an eye out for behavior changes
19:52:27 an area of particular interest (since it changed recently but rarely does) is git repo prep
19:52:55 git repo state is frozen and consistent for all repos in a buildset now
19:53:00 or at least should be
19:53:13 oh, neat, i didn't realize it wasn't before
19:53:38 (previously, the repo of the change under test (and any dependent changes) would be consistent for all builds in a buildset, but repos in required-projects could theoretically be different)
19:53:51 something i need to brainstorm a bit, with the new gerrit api and acls having the ability for us to dole out branch deletion permissions now, it would be nice if we could scale the openstack release managers permissions down from global to just over the openstack namespace. gerrit does have the ability to do acl inheritance, but the inherited acl needs a "project" to contain it... would creating an
19:53:53 empty project for that be ugly, are there alternative ways to do that?
19:54:02 fungi: yeah, i mean, it *pretty much* was -- this is definitely an edge case
19:54:30 fungi: my understanding is "empty project" is the gerrit way
19:54:33 fungi: ya I think creating the empty repo is the way to go
19:54:43 okay, i'll get to work proposing that
19:55:33 there are a few hundred pending branch deletions for eol openstack project branches, and i'd rather not wind up being the one to do them
19:55:47 ++
19:55:49 but this is a good opportunity to test out the new permissions model
19:57:45 alright sounds like that may be it. Thank you everyone!
19:57:51 #endmeeting
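
The flavor naming confusion in the review server upgrade topic (19:37 to 19:42) can be sketched in a few lines of Python. This is only an illustration, not anything from the hosting provider: `describe_flavor` is a hypothetical helper, and the 4 GB per vCPU ratio is an assumption inferred solely from the two flavors quoted in the meeting (v3-standard-16 as 16C/64G and v3-standard-32 as 32C/128G). It covers only `v3-standard-N` names, since the log gives no memory figures for the v1 flavors.

```python
import re

# Assumption: memory scales at 4 GB per vCPU. This ratio is inferred only
# from the two flavors quoted in the meeting (16C/64G and 32C/128G).
ASSUMED_GB_PER_VCPU = 4


def describe_flavor(name):
    """Return (vcpus, assumed_memory_gb) for a name like 'v3-standard-32'.

    Hypothetical helper: the trailing number is read as the vCPU count,
    as stated in the meeting ("# = cpu count and not memory").
    """
    match = re.fullmatch(r"v3-standard-(\d+)", name)
    if match is None:
        raise ValueError(f"unrecognised flavor name: {name!r}")
    vcpus = int(match.group(1))
    return vcpus, vcpus * ASSUMED_GB_PER_VCPU


if __name__ == "__main__":
    for flavor in ("v3-standard-16", "v3-standard-32"):
        vcpus, memory_gb = describe_flavor(flavor)
        print(f"{flavor}: {vcpus}C/{memory_gb}G")
```

Read this way, a roughly 96 GB target falls between the two quoted flavors, which matches the decision in the log to take v3-standard-32 rather than define a custom 24C/96G flavor.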