Friday, 2023-03-10

opendevreviewMerged openstack/project-config master: Add charms-stable-maint group for charms projects  https://review.opendev.org/c/openstack/project-config/+/87240600:28
opendevreviewMerged openstack/project-config master: Retire puppet-tacker - Step 1: End project Gating  https://review.opendev.org/c/openstack/project-config/+/87453900:29
fungiclarkb: ^ you were wanting to see successive project-config changes and how checkouts impacted things at deployment00:30
fungii approved a few00:30
opendevreviewMerged openstack/project-config master: Periodically update Puppetfile_unit  https://review.opendev.org/c/openstack/project-config/+/87530200:31
clarkbthanks. I think there is a weird interaction where the job fails successfully or something in the rename case though? it's definitely something I want to dig into to understand better and probably document00:32
opendevreviewMerged openstack/project-config master: Add the main NebulOuS repos  https://review.opendev.org/c/openstack/project-config/+/87605400:32
opendevreviewMerged openstack/project-config master: Add Ironic Dashboard charm to OpenStack charms  https://review.opendev.org/c/openstack/project-config/+/87620500:32
clarkbTomorrow I'll look at finishing up the gitea05-07 deletions00:36
clarkbI don't expect anyone has stashed anything they need on those servers but infra-root consider this your warning00:37
fungii definitely haven't00:37
opendevreviewMerged openstack/project-config master: Add the NebulOuS tenant  https://review.opendev.org/c/openstack/project-config/+/87641401:43
fungiyoctozepto: ^ that's deployed01:57
fungihttps://zuul.opendev.org/t/nebulous/jobs01:58
fungijust inherited stuff for now, but it's there and ready for next steps01:59
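
[Sketch: the tenant addition above corresponds to a new entry in Zuul's main tenant configuration. The shape below is an approximation for illustration only; repository names other than nebulous/project-config are placeholders, not taken from change 876414.]

    - tenant:
        name: nebulous
        source:
          gerrit:
            config-projects:
              - nebulous/project-config
            untrusted-projects:
              - nebulous/nebulous    # placeholder for an ordinary (untrusted) repo
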
opendevreviewOpenStack Proposal Bot proposed openstack/project-config master: Normalize projects.yaml  https://review.opendev.org/c/openstack/project-config/+/87705702:15
ianwi took the liberty of adding yoctozepto to nebulous-core02:41
fungioh, good thinking!02:41
fungiianw: nebulous-project-config-core was made as a separate group too02:43
ianwok i added them to that too :)  luckily i still have the admin console up from pushing changes yesterday02:44
fungithanks!02:45
opendevreviewMerged openstack/project-config master: Normalize projects.yaml  https://review.opendev.org/c/openstack/project-config/+/87705702:54
WhoisJMHhello, I have a question. In an openstack environment built with devstack on Ubuntu 20.04, instances were previously created and ran without any problem. But when I try to create a new instance now, the message "Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance" appears and the creation fails. Although it is a single-node environment, the server has enough resources for cpu, ram,07:12
WhoisJMHWhich part should I check first to solve this problem?07:12
yoctozeptoWhoisJMH: hi, the nova-compute logs will be the best place to look for the reason for the rejection; just note this channel is not devoted to openstack support, please go to #openstack for further queries07:38
yoctozeptoianw, fungi, clarkb, frickler: thanks for all your feedback on the new project+tenant and for merging that; I will proceed with setting up the tenant today and let you know how it goes07:43
*** jpena|off is now known as jpena07:46
yoctozeptojust one last request for now - please also add me to nebulous-release ;D07:57
ianwyoctozepto: done :)08:22
yoctozeptoianw: many thanks08:23
opendevreviewdaniel.pawlik proposed zuul/zuul-jobs master: Provide ensure-microshift role  https://review.opendev.org/c/zuul/zuul-jobs/+/87608109:00
opendevreviewdaniel.pawlik proposed zuul/zuul-jobs master: Provide ensure-microshift role  https://review.opendev.org/c/zuul/zuul-jobs/+/87608109:10
opendevreviewdaniel.pawlik proposed zuul/zuul-jobs master: Provide ensure-microshift role  https://review.opendev.org/c/zuul/zuul-jobs/+/87608109:31
bbezakHi, I'm having quite a lot of network connectivity issues - it only involves 'provider: inmotion-iad3' nodes.10:52
bbezakInterestingly it happens towards tarballs.openstack.org, on both ubuntu (focal, jammy) and centos stream 8 jobs. More often than not, I'm afraid (though I saw good runs on iad3 on occasion too, just less often). I haven't seen those issues on the 'rax' provider, for instance - https://paste.opendev.org/raw/btENz9poC0tQ0p3t7Hny/10:52
bbezakby the look of it, it started yesterday10:52
fungibbezak: looks like it's not just tarballs.o.o, the first failure i pulled up was complaining about reaching the releases site: https://zuul.opendev.org/t/openstack/build/3a9c0f69727f47ba8e7747eba3f2d678/log/primary/ansible/tenks-deploy#2030-203612:36
fungibut still from a node in inmotion-iad312:37
bbezakI've seen issues with releases as well. But not in the last several runs, so I didn't mention it12:37
fungiwell, it helps to know that there's more than one site the jobs are having trouble reaching from there12:40
fungiand the nodes in that region are ipv4-only so we can rule out ipv6-related issues12:40
bbezakhowever those are resolving to the same static01.opendev.org fungi12:43
bbezak(at least from my end)12:44
fungioh, yes that's a good point, they're different sites on the same vm12:47
fungianyway, i'm checking for connectivity issues between that provider region and those sites12:48
bbezakthx fungi12:48
funginot seeing any packet loss at the moment12:50
bbezakit just failed on  173.231.253.119 fungi13:16
bbezakit got 200 on https://tarballs.openstack.org/ironic-python-agent/tinyipa/files/tinyipa-stable-xena.vmlinuz.sha256, but got "Network is unreachable" on https://tarballs.openstack.org/ironic-python-agent/tinyipa/files/tinyipa-stable-xena.vmlinuz13:18
fungiyeah, whatever it is, it's clearly intermittent13:19
bbezakyeah, "the best" kind13:20
fungicould be an overloaded router in that provider's core network and only some flows are getting balanced through it, for example13:20
fungii'm still trying to reproduce connectivity errors with lower-level tools13:20
fungicould also be farther out on the internet in some backbone provider13:22
fungithe route between those environments is, unfortunately, asymmetrical, so it will be harder to track down if so13:22
fungilooks like from inmotion to rackspace (where the static server resides) both providers peer with zayo, while in the other direction they both peer with level313:24
fungigoing through zayo it transits their atl facility to get from iad to dfw, though the level3 hop between dfw and iad is not identifying itself currently13:27
fungimtr from rackspace to inmotion is recording around 0.2-0.3% packet loss at the moment13:28
funginot seeing any in the other direction, which is strange, but maybe just not a statistically significant volume of samples yet13:29
bbezakok13:31
fungibbezak: one thing to keep in mind, jobs shouldn't normally need to fetch urls like https://releases.openstack.org/constraints/upper/yoga since they can access the same constraints file from the openstack/requirements repository checkout provided on the test node13:32
fungiand we could look into baking the tinyipa kernels into our node images in order to reduce further traffic across the internet, or add the tarballs site to our mirrors in all providers (they're both backed by data in afs, so it would just be a matter of adding a path in the apache vhost to expose that to clients)13:34
fungimaking connections across the internet in a job should be avoided whenever possible (though we perform some brief internet connectivity tests in pre-run for all jobs in order to weed out test nodes with obviously bad internet connections)13:38
bbezakyeah, that makes sense, we have the var for requirements_src_dir already in the job, so it shouldn't be difficult to override it for the CI only13:40
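
[Sketch: one way to do what fungi suggests, assuming a hypothetical job and variable name; Zuul provides the openstack/requirements checkout on the node when it is listed in required-projects, so the job can read upper-constraints.txt locally instead of fetching the releases.openstack.org URL.]

    - job:
        name: my-deploy-job              # hypothetical job name
        required-projects:
          - openstack/requirements       # checked out onto the node by Zuul
        vars:
          # hypothetical variable: point at the local checkout instead of the
          # releases.openstack.org constraints URL mentioned above
          upper_constraints_file: "{{ ansible_user_dir }}/src/opendev.org/openstack/requirements/upper-constraints.txt"
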
fungisetting up some mtr runs from montreal and san jose to static.o.o as well for a baseline13:40
opendevreviewdaniel.pawlik proposed zuul/zuul-jobs master: Provide ensure-microshift role  https://review.opendev.org/c/zuul/zuul-jobs/+/87608113:57
opendevreviewdaniel.pawlik proposed zuul/zuul-jobs master: Provide ensure-microshift role  https://review.opendev.org/c/zuul/zuul-jobs/+/87608114:03
opendevreviewdaniel.pawlik proposed zuul/zuul-jobs master: Provide ensure-microshift role  https://review.opendev.org/c/zuul/zuul-jobs/+/87608114:48
fungimtr wasn't turning up any packet loss to static.o.o from other providers either, and the 0.3% loss i initially saw from there to inmotion dropped to 0.2%, then to 0.1% and eventually 0.0%, so it seems there may have been a very brief blip early in the mtr run but that's it15:02
fungii'm currently downloading tinyipa-stable-xena.vmlinuz on a machine in inmotion-iad3 in a loop with a 1-second delay, trying to get it to fail15:04
fungiover 2k downloads so far with no failures16:02
fungiwhatever the issue, i don't think it's steady, must come and go in small bursts16:03
clarkbas expected the gitea13 and 14 replication continues this morning16:15
clarkbit's a bit more than halfway done16:15
clarkbfungi: sounds like typical internet behavior16:17
fungiyeah, close to 3k successful downloads and no failures. i'm going to stop the loop before i waste any more bandwidth16:17
fungithough i do think exposing the tarballs afs volume on our mirrors might be useful for some stuff like the ipa kernel downloads16:18
clarkbas far as adding the tinyipa image to test nodes goes, the main struggle there is you end up with a bunch of versions and no one knows when it is safe to remove them. If we do that I think we should explicitly state we can do latest and latest-1, and then older versions which are used less often can continue to be fetched remotely. This is basically what we're moving towards with cirros16:18
clarkboh ya simply making use of our afs caches isn't a bad idea16:18
fungispeaking of cirros, should i go ahead and self-approve 873735? it's been about a month with no objections16:19
clarkbI've got no objections though I worry it may disrupt the openstack release somehow (the latest 6.1 version isn't used anywhere because it changes dhcp clients and tempest doesn't know how to interact with it or something to check that dhcp things are working)16:20
clarkbbut I think 5.2 is used and not 5.1?16:21
fungiyeah, i'll keep it on the back burner until post-release16:22
fungigood call16:23
yoctozeptoinfra-root: I think I need your help with merging this initial change: https://review.opendev.org/c/nebulous/project-config/+/87710716:25
yoctozeptoor could you help me set it up to allow me to merge things on demand from gerrit?16:25
fungiif it's what i think it is (haven't looked yet), yes there's a bootstrapping step where manual merging is needed to add a base job16:26
yoctozepto(in case we break this base config in the future)16:26
yoctozeptofungi: yeah, it's adding the noop job to the nebulous/project-config repo as well as pipelines16:26
yoctozeptobased on opendev/project-config16:26
clarkbI think what you've got is correct. Add pipelines and a noop job16:27
clarkbthen you can land changes from there16:27
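
[Sketch: the bootstrap change under discussion amounts to pipeline definitions copied from opendev/project-config plus a noop-gated project stanza roughly like this (pipeline definitions omitted for brevity).]

    - project:
        name: nebulous/project-config
        check:
          jobs:
            - noop
        gate:
          jobs:
            - noop
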
fungiadding verified+2 and submit perms in the project-config repo for a special group might make sense... infra-root: ^ opinions?16:27
clarkband ya that would need a gerrit admin to apply a verified +2 and hit submit16:27
yoctozeptobtw, the CI/CD side will be Apache 2.0 licensed; the project itself will be MPL 2.0 because that is what we have in the grant agreement16:28
fungidefinitely a line to walk between risk for the user and needing to involve our gerrit admins more often16:28
clarkbfungi: I seem to recall there was some consideration for that in the past.16:28
clarkbI want to say it implies a higher level of trust than something limited to the tenant, for some reason, but I may be misremembering16:28
yoctozeptofor one, not many people will be allowed to approve anything in that repo16:29
yoctozeptolikely just me and some other person that we have not found yet16:29
clarkbbut ya ultimately if they can land changes normally then allowing them to bypass ci is not much extra16:30
clarkbI'm willing to give it a go16:30
yoctozeptothanks16:30
yoctozeptowhat should I do?16:30
clarkbyoctozepto: it will require an acl update to give some group verified -2/+2 perms and allow the submit button16:30
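
[Sketch: in the Gerrit ACL file for the repo, that update would look roughly like the lines below; the group name is the one discussed here, but the exact sections in the proposed change may differ.]

    [access "refs/heads/*"]
    label-Verified = -2..+2 group nebulous-project-config-core
    submit = group nebulous-project-config-core
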
yoctozeptoook16:31
clarkbthen instructing that group to do their best to avoid relying on those perms and only perform the actions when you can't get around zuul being stuck due to the config you are trying to update16:31
yoctozeptoso verified +2 I think I know how to do16:31
clarkbthis situation is an example of that16:31
yoctozeptobut the submit button16:31
yoctozepto:-)16:31
clarkbyoctozepto: once the necessary votes are applied the button shows up in the top left panel of the change16:32
clarkbnext to rebase/abandon/edit16:32
clarkbyou apply the required votes, then click the button16:32
yoctozeptooook, I see you mean top-right16:32
yoctozeptothen I will reconfigure the group to allow V+216:33
clarkboh sorry yes16:33
fungiyoctozepto: one thing you might consider is having separate admin accounts you add to your administrative group, it's what we (opendev sysadmins) do for our gerrit admin access in order to minimize risk of accidentally doing something we didn't mean to over the course of normal use of the system or unnecessarily exposing the more privileged account to compromise16:36
fungialso lp/uo sso 2fa is highly recommended16:36
yoctozeptofungi: thanks for the hints! I think we are less impactful so a separate account is overkill, but it surely would be handy to disallow non-2FA logins going forward16:45
clarkbunfortunately I don't think we're able to control that via gerrit16:46
fungiright, well what you can do is make sure you have 2fa set up for the account(s) you use16:47
clarkbyup and you can configure your account to require 2fa, but I don't know that we can enforce it on the service side with the way things are currently implemented16:48
fungiand you can always separate your roles into multiple accounts later pretty easily since you control the group membership anyway, so nothing you need to decide right now16:48
opendevreviewRadosław Piliszek proposed openstack/project-config master: Allow nebulous-project-config-core to add V+2  https://review.opendev.org/c/openstack/project-config/+/87710816:52
yoctozeptoclarkb, fungi: yeah, I meant for other project members to also be good citizens with 2FA :D but thankfully nothing makes it obligatory for us so it's good as it is atm16:53
yoctozeptoanyways, the change is up ^16:53
clarkbyoctozepto: yup just reviewed16:54
opendevreviewRadosław Piliszek proposed openstack/project-config master: Allow nebulous-project-config-core to add V+2  https://review.opendev.org/c/openstack/project-config/+/87710816:56
yoctozeptoclarkb: fixed&replied16:56
yoctozeptoquestion: are you sure about the "submit =" part?16:57
yoctozeptobecause nothing else has it16:57
yoctozeptohappy to oblige otherwise16:57
clarkbyoctozepto: I'm pretty sure. Yesterday when we were testing things and adding all the +2 votes the submit button would show up but was greyed out because you also need explicit submit perms. If you look in system-config/doc/source/gerrit.rst you'll see where we document that only zuul and the project creation tool have it by default16:57
clarkbfungi can double check16:58
yoctozeptook, you are right, I feel convinced16:58
opendevreviewRadosław Piliszek proposed openstack/project-config master: Allow nebulous-project-config-core to add V+2  https://review.opendev.org/c/openstack/project-config/+/87710816:58
opendevreviewRadosław Piliszek proposed openstack/project-config master: Allow nebulous-project-config-core to submit changes  https://review.opendev.org/c/openstack/project-config/+/87710816:59
yoctozeptoclarkb, fungi: all done ^16:59
yoctozeptoeven adapted the commit message17:00
yoctozeptofingers crossed17:00
*** jpena is now known as jpena|off17:02
corvusyoctozepto: clarkb fungi i think we can remove these permissions.  this is a one-time event17:04
yoctozeptocorvus: unless I manage to break the zuul config there and then come here begging for help ;D17:05
corvuscurrently both opendev and zuul tenants have only noop jobs configured for their config-projects, so it's exceedingly unlikely that further involvement from infra-root would be needed17:05
fungiwell, it's a one-time event until someone accidentally merges breakage to the base job and then gerrit admins need to step in again17:05
corvusyoctozepto: well, that's part of it...17:05
fungiahh, good point with noop17:05
clarkbya I explicitly noted that you can stick with noop and non-voting in my +1 review17:05
corvusyou can't break the tenant if the config project is gated with noop.  but you can break it if you have submit perms17:06
clarkbbasically the change does what fungi suggested earlier, but I wanted more feedback and gave alternatives17:06
yoctozeptowhat if I break the pipelines?17:06
yoctozepto:D17:06
yoctozeptoI mean17:06
yoctozeptoI can only really ever break pipelines there17:06
yoctozeptoas it will like 99.9% stay "tested" with noop17:06
corvusit would be exceedingly hard to break the pipelines if the config-project is gated17:07
corvusit is easy to break them if it is not17:07
funginote i wasn't necessarily suggesting this, but asking what others thought about the tradeoffs17:07
clarkbI think pipelines are unlikely to change often, and considering that other project configs haven't resorted to this I think I'm coming around to corvus' reasoning and we can try it with noop for now17:07
clarkbfungi: ack17:07
corvusokay sorry i saw a flurry of changes and messages and am not 100% sure what the current status is17:07
yoctozeptook, so someone just merge this for me and I abandon the extra perms17:07
yoctozeptoI don't mind either way for now :D17:07
corvusso we're at "consider adding perms" not "we just added perms". that's cool, then i'm jumping into the conversation about evaluating what to do :)17:08
yoctozepto:D17:08
clarkbcorvus: yup the change has not merged yet. Just at the point where a change that does that has been proposed and is in early review17:08
corvusyoctozepto: anyway, not trying to make things hard, and if it becomes a problem, i'm not opposed to more perms in principle.  i think that not having perms is sufficient and the most safe, and should not actually block you17:08
yoctozeptoI agree, I don't like exceptions in my stuff either17:09
clarkbthe main trick would be avoiding gate jobs that vote, or, if they do vote, making sure they always return success17:09
yoctozeptoif I never break the change and gate pipelines, then I should be fine, right?17:09
fungii'm also not against helping bypass zuul to merge things on rare occasions where there are no alternative solutions, mainly just want to avoid it becoming a frequent activity17:09
yoctozeptoas in17:09
yoctozeptoif I misconfigure some other pipeline17:09
corvusyeah, and we've never seen fit to add anything other than noop to the opendev or zuul tenants, so i would expect the same for nebulous too17:09
yoctozepto++17:09
corvusyoctozepto: yes -- and if you follow the pattern in the opendev or zuul tenants, hopefully only gate would matter.17:10
clarkbeg no clean check requirement17:10
yoctozeptoah, right17:10
yoctozeptothat's true17:10
yoctozeptoso I can even break the check, sweet17:11
yoctozepto:D17:11
yoctozeptolet there be havoc17:11
corvusyoctozepto: consider carefully which tenants to base your pipelines on.  opendev and zuul tenants do not have a clean check requirement, which means only gate is needed to work (and merging changes can be much faster); openstack has a clean check requirement (because people don't always follow best practices)17:11
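
[Sketch: a "clean check" requirement is expressed as a requirement on the gate pipeline, roughly along these lines; this approximates the OpenStack-style rule and is not copied verbatim from any tenant's configuration.]

    - pipeline:
        name: gate
        require:
          gerrit:
            open: True
            current-patchset: True
            approval:
              # only enqueue changes that already have a Zuul +1 from check
              - Verified: [1, 2]
                username: zuul
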
yoctozeptocorvus: yeah, I went for the quicker way for now and will see how our partners behave17:12
yoctozeptoin the worst case, it will be the openstack way17:12
yoctozeptowhich is not bad17:12
yoctozeptojust slower for some stuff17:12
corvusyoctozepto: i think it's great to start with no clean check and add only if needed17:13
clarkb++17:13
yoctozeptohappy to hear that I am making the blessed choices :-)17:13
yoctozeptosoo17:14
yoctozeptoI think we have a consensus17:14
yoctozeptolet me abandon the extra perms17:14
yoctozeptoand some of you merge me that nebulous/project-config change17:14
yoctozeptohttps://review.opendev.org/c/nebulous/project-config/+/87710717:15
corvusinfra-root: i can do the force-merge -- i think that's probably a non-controversial action that i could do immediately?17:19
fungicorvus: thanks! i have no objection17:20
clarkbcorvus: yup as long as the pipeline config doesn't look broken I guess. But if it is functional enough to land a followup, that isn't a big deal17:20
corvus#status log force-merged https://review.opendev.org/877107 to bootstrap nebulous tenant17:22
opendevstatuscorvus: finished logging17:22
corvusyoctozepto: https://zuul.opendev.org/t/nebulous/status17:23
yoctozeptothanks, corvus 17:23
yoctozeptoit's a verbatim copy of opendev/project-config now17:24
yoctozeptoI think I made it explicit in the commit message17:24
opendevreviewJohn L. Villalovos proposed openstack/diskimage-builder master: chore: support building Fedora on arm64 AKA aarch64  https://review.opendev.org/c/openstack/diskimage-builder/+/87711217:32
fungiclarkb: what do you think about further increases to the launch timeout in rax-ord? it looks like whenever we have a spike in node requests, we end up with lots of launch timeouts there even with the timeout increased to 15 minutes, but the instances do seem to eventually boot after nodepool gives up waiting17:38
fungimy concern is that the longer we make the timeout, the longer some jobs will spend waiting for node requests to be filled17:38
clarkbfungi: I suspect that reducing the max-servers count may result in better throughput?17:39
fungiif we had some way to limit the number of nodes booting in parallel that might help, since the cloud does appear to be capable of handling a large number of nodes once they've booted17:39
clarkbthat will reduce the size of potential rushes there and may keep us booting nodes in a reasonable time frame17:39
clarkbfungi: ooh that's a good idea, but I'm not sure we have support for that yet?17:40
fungiit's the region in rax where we have the largest quota, but we've already reduced max-servers there by half (so it's now ~2/3 of the other two rax regions)17:40
clarkbone problem with increasing the timeout is that we retry 3 times too17:41
clarkbso in the owrst case a job may wait 3 * timeout17:41
fungiit did represent 40% of our theoretical rax capacity, now it's 25%17:41
clarkbmaybe what we should do is raise the timeout and not allow retries in that region17:41
clarkbthat should also help with the thundering herd problem since we won't retry so much17:42
clarkb(maybe, if we are at capacity chances are another request will show up soon enough though)17:42
clarkbI think that is what I would do. don't allow retries and increase timeout a bit17:42
fungiwe're able to control retries independently per provider? i'll give that a shot17:43
clarkbI think we are17:43
fungilaunch-retries is per provider, yep17:44
fungiwe currently don't set it for any provider and take the default from nodepool17:44
fungihttps://zuul-ci.org/docs/nodepool/latest/openstack.html#attr-providers.[openstack].launch-retries17:45
fungislightly misleadingly named/documented, since it's the number of times to try, not the number of times to retry17:45
fungiso we want it set to 1 for a single try (i.e. no retries)17:45
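
[Sketch: the nodepool provider-level knobs being discussed, with the values mentioned in this conversation; see review 877113 for what was actually proposed.]

    providers:
      - name: rax-ord
        launch-retries: 1      # a single attempt, i.e. no retries
        launch-timeout: 900    # seconds; the "15 minute" timeout mentioned above
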
clarkbfungi: and also maybe we want to increase the api rate limit17:48
clarkbbut that impact is unlikely to matter much17:49
fungiyou mean the delay between api calls? already did by an order of magnitude in an earlier change but can do it some more17:49
clarkb(it would slow down boots when a thundering herd happens, just not much compared to the timeouts)17:49
clarkbya it's still 100 requests a second17:49
clarkbwe may want 1 a second? I dunno17:49
fungiworth a try i guess17:49
opendevreviewJeremy Stanley proposed openstack/project-config master: Further increase rax-ord launch timeout no retries  https://review.opendev.org/c/openstack/project-config/+/87711317:53
fungiif it ends up helping, maybe we can try turning the max-servers there back up some17:54
clarkbcorvus: ^ that may interest you from a general nodepool functionality perspective17:55
fungiit's a fairly pathological case though, not sure how many knobs for dealing with a situation like that make sense17:59
corvusoh oh18:00
clarkbI think being able to control the parallelism of inflight node creations is worth considering though18:00
corvuswhat about https://zuul-ci.org/docs/nodepool/latest/configuration.html#attr-providers.max-concurrency  ?18:00
clarkboh do we have that TIL18:00
clarkb++ that is exactly what we need18:01
fungiwhoa. mind *blown*18:01
clarkbI think we still drop the retries to avoid stalling builds out18:01
clarkbbut maybe we keep the old api rate and set parallelism to something the grafana graphs suggest it can handle18:01
fungii'll start it off at 1018:02
corvus(the docs need updating because it's not actually threads anymore, but the statemachine framework does honor that -- it still controls the concurrency for new requests)18:02
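
[Sketch: adding the concurrency cap corvus points at, using fungi's starting value of 10; the commented rate line is only the "1 a second" idea floated earlier, not a recommendation.]

    providers:
      - name: rax-ord
        # cap how many node requests this provider handles concurrently so a
        # spike cannot ask the cloud for its whole quota at the same time
        max-concurrency: 10
        # rate: 1.0            # optional: delay in seconds between API calls
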
fungiif this helps, we can probably back the launch-timeout down to something smaller as a safety, and then possibly re-add retries?18:03
clarkbfungi: I suspect that we're more likely to hit timeouts than valid failures needing a retry?18:05
clarkbbasically in a cloud suffering these issues it is probably better in all cases to have a longer timeout and not keep trying if you fail18:05
fungiyeah, maybe18:07
opendevreviewJeremy Stanley proposed openstack/project-config master: Limit rax-ord launch concurrency and don't retry  https://review.opendev.org/c/openstack/project-config/+/87711318:09
fungicorvus: clarkb: ^18:09
corvus++18:10
fungiyou can sort of see the misbehavior by comparing the three test node history graphs at https://grafana.opendev.org/d/a8667d6647/nodepool-rackspace?orgId=1&from=now-24h&to=now18:11
fungithough the error node launch attempts graph is across all three providers, from the logs it seems to be mainly rax-ord18:12
fungiand you can clearly see the variability for that region in the time to ready graph18:12
clarkbit does seem to be able to handle ~30 booting nodes. I guess we monitor things and increase the concurrency if it holds up18:12
fungiwell, that's potentially misleading. it "handles" accepting that many boot requests in parallel, but definitely does look like things get a lot worse when we ask for all its capacity at once18:14
clarkbright you can see where it requests far more than 30 and has a sad. But there are a couple instances where it requests 30ish and seems to do well with that18:15
clarkband an instance of about 38 where it falls over18:15
clarkbI suspect the tipping point is around 30 for this reason18:16
fungipotentially, but i also wouldn't rule out external factors since we're not the only user of that cloud18:16
fungidoes anybody know what the multiple stats for the providers are in the api server operations graphs?18:19
funginot all regions have the same number of them either18:19
fungifor example in the post server graph there are 5 lines for dfw, 3 each for iad and ord18:20
fungii guess i could look at the yaml for that18:21
clarkbif you hit the graph menu there is an inspect option18:21
clarkbit looks like only one of them actually has data18:22
fungilooks like the same thing i found in git, aliasByNode(stats.timers.nodepool.task.$region.compute.POST.servers.*.mean, 4)18:22
clarkbsomehow $region is ending up with multiple identical results? And i think that variable list can be retrieved by querying graphite?18:23
fungiyeah, i'm fiddling with graphite18:23
corvusone for each response code?18:24
clarkboh maybe18:25
fungithat seems to be it, yep18:25
corvusthere are 5 for dfw and 3 for ord18:25
fungidfw has returned 500 and 503, in addition to the 202, 400 and 403 returned by the other regions18:25
fungii wonder if the intent was to aggregate those18:26
corvusshould probably either aggregate or adjust the alias so it's rax-ord 200; either is possible with a different graphite function18:27
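
[Sketch: the two options corvus mentions, using standard Graphite functions; in this metric path node 4 is the region and node 8 is the response code. The dashboard's actual target may differ.]

    # keep one series per response code, labelled "<region>.<code>":
    aliasByNode(stats.timers.nodepool.task.$region.compute.POST.servers.*.mean, 4, 8)

    # or collapse the per-code series into a single series per region:
    aliasByNode(averageSeriesWithWildcards(stats.timers.nodepool.task.$region.compute.POST.servers.*.mean, 8), 4)
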
opendevreviewMerged openstack/project-config master: Limit rax-ord launch concurrency and don't retry  https://review.opendev.org/c/openstack/project-config/+/87711318:29
clarkbalright I'm going to work on deleting gitea05-07 now18:30
clarkbif they are anything like gitea08 I will need to delete their backing boot volume separately18:30
clarkb#status log Deleted gitea05-07 and their associated boot volumes as they have been replaced with gitea10-12.18:41
opendevstatusclarkb: finished logging18:41
opendevreviewClark Boylan proposed opendev/zone-opendev.org master: Remove gitea05-07 from DNS  https://review.opendev.org/c/opendev/zone-opendev.org/+/87711718:43
clarkbunder 1k replication tasks for gitea13 and 14 now too18:43
clarkbthe cryptography min requirement discussion for openstack prompted me to look and cryptography is actually requiring a pretty reasonable rust compiler version19:10
clarkbA lot of projects (firefox, famously) require a prerelease compiler, but cryptography wants 1.45.0 or newer and ubuntu has 1.65.0 (in universe though)19:10
fungiyeah, however jammy ships python3-cryptography 3.4.8 because it was contemporary at the time, and would need to backport a rust-built version of the package with numerous new build-dependencies to update to a newer version19:25
fungicompare https://packages.ubuntu.com/source/jammy/python-cryptography with https://packages.ubuntu.com/source/lunar/python-cryptography19:25
fungithat's the sort of complex change a stable distro is generally going to avoid19:26
clarkbfor sure. My point is more that I don't think the issue is rust-specific so much as that stable distros typically don't wholesale backport new releases19:27
clarkband then that is made worse by the old software they ship having been replaced in part with something completely different, so any updates may require a lot of effort19:29
fungibut also it may not just be rust itself in this case, if cryptography needs newer versions of other toolchain components (cargo, setuptools-rust extension, various rust libs)19:30
clarkbright, I just think it's distracting to bring up rust in this case. Ubuntu basically never takes a version of a library that is two years newer and replaces their stable version with it, regardless of the compiler19:31
fungilooks like the python3-cryptography in lunar has 9 different rust-based libs it also depends on19:31
clarkbthis is more about "stable distro has a stable library version, and we need to continue to do our best to support that". It's no different than not requiring a newer libvirt19:32
fungisure, the discussion has fixated on rust because it's what coreycb brought up as the challenge for that particular package, but i tried to point out earlier that it's really just one example19:32
fungithe rax-ord graph looks considerably better20:49
fungithough it'll be hard to know for sure until monday20:50
fungipossibly just wishful thinking on my part20:51
clarkbone replication event is still processing. Once that is done I'll trigger global replication for the replication config reload and then we should be good to land the change to add gitea13 and 14 to haproxy21:08
clarkbfull replication started21:23
clarkbfungi: the boot times don't seem to have dropped significantly though21:26
clarkbwe may still need to bump the timeout up as a result21:27
clarkbI've approved https://review.opendev.org/c/opendev/system-config/+/877047 as replication is complete21:47
clarkbfungi: if you are still around https://review.opendev.org/c/opendev/zone-opendev.org/+/877117 should be an easy one21:47
clarkb(cleans up gitea05-07 dns records)21:47
clarkbthank you!21:53
funginp, just cooking dinner and reviewing changes21:56
opendevreviewMerged opendev/zone-opendev.org master: Remove gitea05-07 from DNS  https://review.opendev.org/c/opendev/zone-opendev.org/+/87711721:58
opendevreviewMerged opendev/system-config master: Add gitea13 and gitea14 to the haproxy load balancer  https://review.opendev.org/c/opendev/system-config/+/87704723:00

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!