Friday, 2021-06-18

corvusi've gone from zero experience with synapse db upgrades/issues to "a tiny bit".  so far, i'd say it's reasonable and not too difficult.  i did encounter a bug the first time i ran the migration script, but it had already been fixed in the latest software (thus the upgrade).  so, stuff will happen.  i think this just reaffirms my view -- it's well within our capability to run, but even better if someone else does.  :)00:02
opendevreviewGhanshyam proposed openstack/project-config master: Update retiring uc-recognition repo ACL to openstack/retired.config  https://review.opendev.org/c/openstack/project-config/+/79697100:32
opendevreviewGhanshyam proposed openstack/project-config master: Update retiring ops-tags-team repo ACL to openstack/retired.config  https://review.opendev.org/c/openstack/project-config/+/79697200:35
opendevreviewGhanshyam proposed openstack/project-config master: Update retiring workload-ref-archs repo ACL to openstack/retired.config  https://review.opendev.org/c/openstack/project-config/+/79697300:39
opendevreviewGhanshyam proposed openstack/project-config master: End project gating for retiring arch-wg repo  https://review.opendev.org/c/openstack/project-config/+/79696200:42
opendevreviewGhanshyam proposed openstack/project-config master: Update retiring enterprise-wg repo ACL to openstack/retired.config  https://review.opendev.org/c/openstack/project-config/+/79697400:46
opendevreviewGhanshyam proposed openstack/project-config master: Update project gating for retiring project-navigator-data repo  https://review.opendev.org/c/openstack/project-config/+/79697500:51
opendevreviewGhanshyam proposed openstack/project-config master: Update project gating for retiring governance-uc repo  https://review.opendev.org/c/openstack/project-config/+/79697600:57
opendevreviewGhanshyam proposed openstack/project-config master: Update project gating for retiring workload-ref-archs repo  https://review.opendev.org/c/openstack/project-config/+/79697801:05
opendevreviewIan Wienand proposed zuul/zuul-jobs master: Switch jobs to use fedora-34 nodes  https://review.opendev.org/c/zuul/zuul-jobs/+/79563601:10
opendevreviewIan Wienand proposed zuul/zuul-jobs master: Ensure dnf-plugins-core before calling "dnf copr"  https://review.opendev.org/c/zuul/zuul-jobs/+/79697901:10
opendevreviewGhanshyam proposed openstack/project-config master: Update project gating for retiring openstack-specs repo  https://review.opendev.org/c/openstack/project-config/+/79698001:13
opendevreviewIan Wienand proposed openstack/diskimage-builder master: fedora-container: install dnf-plugins-core  https://review.opendev.org/c/openstack/diskimage-builder/+/79698402:09
opendevreviewIan Wienand proposed openstack/diskimage-builder master: fedora-container: install dnf-plugins-core  https://review.opendev.org/c/openstack/diskimage-builder/+/79698402:10
opendevreviewIan Wienand proposed zuul/zuul-jobs master: Switch jobs to use fedora-34 nodes  https://review.opendev.org/c/zuul/zuul-jobs/+/79563602:14
opendevreviewMerged zuul/zuul-jobs master: Ensure dnf-plugins-core before calling "dnf copr"  https://review.opendev.org/c/zuul/zuul-jobs/+/79697903:12
opendevreviewIan Wienand proposed opendev/system-config master: review02 : bump heap limit to 96gb  https://review.opendev.org/c/opendev/system-config/+/78400303:20
opendevreviewMerged zuul/zuul-jobs master: Switch jobs to use fedora-34 nodes  https://review.opendev.org/c/zuul/zuul-jobs/+/79563603:30
opendevreviewMerged zuul/zuul-jobs master: ensure-zookeeper: better match return code  https://review.opendev.org/c/zuul/zuul-jobs/+/79353703:30
opendevreviewMerged opendev/system-config master: Add note about afs01's mirror-update vos releases to docs  https://review.opendev.org/c/opendev/system-config/+/79689303:39
opendevreviewMerged opendev/system-config master: review02 : bump heap limit to 96gb  https://review.opendev.org/c/opendev/system-config/+/78400303:53
diablo_rojoianw, around? 04:02
ianwdiablo_rojo: yep!04:02
diablo_rojoianw, I was going to start looking at converting the puppet-ptgbot to use ansible if you've got any pointers or examples I should look at? 04:03
diablo_rojoI'm excited to take a stab at it :) 04:03
ianwcool, umm let me see ...04:04
diablo_rojoYeah no rush. I won't be up for a ton longer, I just wanted to message you before I forgot today. This week has gotten away from me. 04:05
ianwthe first thing will be to create a container with ptgbot in it04:05
ianwit looks like a pretty standard python app04:06
diablo_rojoYeah I think ttx tried to keep it pretty simple. 04:06
diablo_rojoHow would I go about creating the container? (pardon my ignorance please :) )04:06
ianwhaha ignorance not even vaguely considered :)04:07
ianwhttps://opendev.org/opendev/statusbot/commit/6da21b94992661aa9596c746c7bcbf60cf9c2ac2 would be an example04:08
ianwthe only part of this we can't pre-test is that secret04:08
diablo_rojoOkay so docker file and a yaml.04:09
diablo_rojoHow does that get generated? 04:09
ianwthat Dockerfile would be basically correct, it would need a different command of course04:09
ianwthat command would be currently in puppet04:09
diablo_rojoianw, coolio, yeah seems easy enough. 04:10
diablo_rojothe command for generating the secret? 04:10
ianwsorry i mean the startup command for the daemon04:10
ianwthe secret an infra-root will need to generate for the ptgbot project from our docker key04:11
ianwhttps://opendev.org/opendev/puppet-ptgbot/src/branch/master/files/ptgbot.init04:11
diablo_rojoI think I am following so far :) 04:11
ianwyeah, so currently (or previously) it was installed and then started via ^^ that init script04:12
ianwso instead the daemon will want to run in the container04:12
ianwonce we have the container, we should write a role in system-config to deploy it04:12
ianwthat would look a lot like https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/statusbot04:13
diablo_rojoso would those be two separate patches? One to setup the container and another to actually write the ansible role? 04:13
ianwyes, i'd build and publish the container from the ptgbot project, and then system-config will consume it04:13
ianwwriting the role is where you'll want to more-or-less translate what's happening in puppet-ptgbot to ansible04:14
diablo_rojoOkay so the changes will be in opendev/puppet-ptgbot and opendev/system-config?04:14
diablo_rojoErr maybe not. 04:14
ianwpuppet-ptgbot won't be used any more; it's essentially an exercise in converting that to ansible04:15
ianwhttps://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/statusbot/tasks/main.yaml are the broad strokes of it04:15
diablo_rojoOkay so the container stuff will live in openstack/ptgbot then? 04:15
ianwmake config directories, deploy config files, that sort of thing04:15
diablo_rojoOkay. I think I am still following :) 04:16
ianwand then in system-config there will be a docker-compose file to start the service, like https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/statusbot/files/docker-compose.yaml04:16
ianwthat's where you map in config files, log volumes, whatever, from the underlying host04:16
diablo_rojoOkay. 04:17
ianwyeah; the theory should be that the ptgbot container is a generic thing that theoretically anybody could use04:17
ianw(i mean realistically we'll be the only consumer i'd say, but it makes for nice separation of concerns)04:17
diablo_rojoRight okay. That makes sense :) 04:18
ianwso once you have the ptgbot role, you'll want to add it to the eavesdrop playbook04:18
ianwhttps://opendev.org/opendev/system-config/src/branch/master/playbooks/service-eavesdrop.yaml04:18
diablo_rojoOkay easy enough :) 04:19
ianwat that point, you can test it in the gate04:19
diablo_rojoCool :) And hopefully doesn't explode. 04:19
ianwi can almost guarantee it will at first :)04:20
diablo_rojoI will try to get a WIP posted this weekend/early next week for at least the container stuff and docker image. 04:20
ianwhttps://zuul.opendev.org/t/openstack/builds?job_name=system-config-run-eavesdrop is the job that will run04:21
diablo_rojoOf course :) Just hopefully not *that* bad. 04:21
diablo_rojoOkay cool. 04:22
diablo_rojoI think that's all I need for now? Anything else I should be aware of? 04:22
ianwwe can put the job on hold and you can live inspect and fiddle with it04:22
diablo_rojoI know where to find you when I inevitably will have more questions :)04:22
diablo_rojoOh sweet. 04:22
ianwi can generate the secret for the ptgbot project and paste that for you to use when ready04:23
ianwwe shouldn't *have* to have that published to dockerhub for this to work04:23
diablo_rojoOh nice. That would be helpful :) I will give you a heads up when I am ready. 04:24
ianwwe use an intermediate registry, so earlier jobs push their images to it, and later jobs download from it.  so everything can hang together speculatively based on Depends-On04:24
ianwthe system-config-run-eavesdrop job may need some tweaking to require the ptgbot jobs, etc. for that to hang together04:25
diablo_rojoOkay. Noted. 04:26
ianwyou'll probably also want to setup things like letsencrypt certs; is it staying an openstack thing or should it be ptg.opendev.org?04:28
diablo_rojoI'm guessing the latter?04:29
ianwin that case you'll want a CNAME added to https://opendev.org/opendev/zone-opendev.org/src/branch/master/zones/opendev.org/zone.db to point ptgbot.opendev.org -> eavesdrop01.opendev.org (and an _acme-challenge record for letsencrypt)04:30
ianwwe should redirect ptgbot.openstack.org to that as well.  unfortunately only an infra-root can manage openstack.org as it has to be done via RAX's interface04:31
ianwi can however pre-add _acme-challenge.ptgbot.openstack.org now though, which will allow us to get a letsencrypt certificate covering it04:32
ianwhttps://docs.opendev.org/opendev/system-config/latest/letsencrypt.html should be a pretty good overview of the letsencrypt process; lots of examples in the code now04:33
diablo_rojoOkay just a few more steps :) 04:34
ianwyep, but you will want to have that setup in the initial change, because you'll need the certificates for setting up the webserver04:34
ianwin testing, we just make self-signed certs04:35
diablo_rojoOhh okay. 04:35
diablo_rojoThanks for all the info ianw :) I am going to head to bed. Enjoy the rest of your day!04:43
ianwdiablo_rojo: no worries, later!04:43
ianwcorvus / mordred: a question for when you're around : it's openstack/ptgbot -- in theory i guess we should publish a container for it under openstackorg (https://hub.docker.com/u/openstackorg) ... so far nothing is published there04:58
ianwin practice it feels more like an opendev thing, i don't know04:59
ianwdiablo_rojo: i've generated the secrets section for both options @ http://paste.openstack.org/show/QR8BeDPufFWCNqvZVgRm/ ... you can copy-paste either of those depending on if we want to publish it under opendev or openstack05:00
*** ykarel|away is now known as ykarel05:23
*** marios is now known as marios|ruck06:02
*** jpena|off is now known as jpena07:18
*** rpittau|afk is now known as rpittau08:17
yoctozeptoinfra-root: ethercalc is down08:19
*** ykarel is now known as ykarel|lunch09:00
*** raukadah is now known as chandankumar09:26
frickleryoctozepto: what issue do you see exactly? seems to be working fine for me. though the instance has an uptime of only ~2d ...09:28
yoctozeptofrickler: it works now; it was not responding (timeout)09:37
*** ykarel|lunch is now known as ykarel10:13
opendevreviewchandan kumar proposed openstack/project-config master: Enable publish-openstack-python-tarball job  https://review.opendev.org/c/openstack/project-config/+/79704910:22
*** jpena is now known as jpena|lunch11:41
fungiianw: the foundation has an osf/ git namespace which they're probably going to want renamed to openinfra/ at some point11:48
fungioh, but you're talking about dockerhub not opendev git/gerrit11:49
*** bhagyashris_ is now known as bhagyashris11:50
*** ysandeep is now known as ysandeep|brb11:58
fungiianw: diablo_rojo_phone: the ensuing discussion around https://review.opendev.org/780947 indicated that we might want it to be ptg.openinfra.dev since the ptg is an event put on by the foundation12:02
fungibut we can decide that closer to completion of the conversion12:03
*** whayutin is now known as weshay12:07
rosmaitahello! when someone has a few minutes, i still can't get my brick-cinderclient-dsvm-functional-py36 job working, even with the ubuntu-bionic nodeset -- there may be some required option I'm not setting.  Error is here: https://zuul.opendev.org/t/openstack/build/fc3c9eed53d24157983e35cac1eb5ad912:34
*** jpena|lunch is now known as jpena12:38
fungirosmaita: that might be better discussed in #openstack-qa since the error indicates you're not passing some expected values to the devstack playbook the openstack qa team maintains... to be honest i'm not that familiar with it12:41
rosmaitafungi: thanks!12:41
*** ysandeep|brb is now known as ysandeep12:54
opendevreviewMerged openstack/project-config master: Update retiring uc-recognition repo ACL to openstack/retired.config  https://review.opendev.org/c/openstack/project-config/+/79697113:00
opendevreviewMerged openstack/project-config master: Update retiring ops-tags-team repo ACL to openstack/retired.config  https://review.opendev.org/c/openstack/project-config/+/79697213:03
opendevreviewMerged openstack/project-config master: End project gating for retiring arch-wg repo  https://review.opendev.org/c/openstack/project-config/+/79696213:07
opendevreviewMerged openstack/project-config master: Update retiring enterprise-wg repo ACL to openstack/retired.config  https://review.opendev.org/c/openstack/project-config/+/79697413:08
opendevreviewMerged openstack/project-config master: Update project gating for retiring project-navigator-data repo  https://review.opendev.org/c/openstack/project-config/+/79697513:09
opendevreviewMerged openstack/project-config master: Update project gating for retiring governance-uc repo  https://review.opendev.org/c/openstack/project-config/+/79697613:09
opendevreviewMerged openstack/project-config master: Update project gating for retiring workload-ref-archs repo  https://review.opendev.org/c/openstack/project-config/+/79697813:09
opendevreviewMerged openstack/project-config master: Update project gating for retiring openstack-specs repo  https://review.opendev.org/c/openstack/project-config/+/79698013:09
*** raukadah is now known as chandankumar13:13
*** ysandeep is now known as ysandeep|away13:37
*** rpittau is now known as rpittau|afk14:09
*** marios is now known as marios|ruck14:34
mordredlinux australia has made a public matrix "space" (which is still a beta feature, but they're trialing it) https://matrix.to/#/#linux-australia:matrix.org ... in case looking at how such a thing works is useful to folks. In element you need to enable the experimental spaces feature (settings -> labs)14:40
*** jpena is now known as jpena|out14:48
clarkbmordred: is there a tldr on what the space is? like a super channel?15:20
mordredIt's a named collection of channels15:20
clarkbinfra-root also appears that LE is still failing to update on a number of servers I'll be looking into why that is still failing after fixing nb03's disk situation next15:21
mordredAlthough in matrix itself it's implemented as a channel that contains channels15:21
mordredBut it's a way to curate and name related collections of things. It's also not exclusive, so a given channel can be in multiple spaces15:22
mordredOne could imagine an openstack space with ALL of the openstack channels plus #opendev. And a zuul space with zuul, opendev, ansible, gerrit, etc15:23
fungiclarkb: one theory was that we're running up against le cert limits for static.o.o since we're getting individual certs for all the sites rather than a single cert with them all as altnames15:23
mordredSo for a new user to a given sub-community they can just add the space and not have to hunt for various things they might want to be in15:24
fungimordred: so it's customized indices of channels, essentially?15:24
mordredfungi: Ugh15:24
mordredfungi: yes!15:24
clarkbfungi: that would only affect those names though I think and not review. But could be they look at the aggregate too15:25
fungioh, review was also impacted?15:25
fungithe expiring cert warnings i saw today were mostly stuff hosted on static, like the legacy git redirects15:26
fungialso entirely possible we have more than one problem15:26
clarkbfungi: yes review is in the list of emails too15:26
fungiahh, okay15:26
clarkbinfra-prod-letsencrypt (the job) has succeeded since I fixed nb0315:28
clarkbGerrit reports [Fri Jun 18 06:47:44 UTC 2021] Using CA: https://acme.zerossl.com/v2/DV90 (note that isn't letsencrypt) and later it fails due to Can not resolve _eab_id. I have found https://github.com/acmesh-official/acme.sh/wiki/Change-default-CA-to-ZeroSSL which15:30
clarkbs/which//15:30
clarkbI wonder if we're consuming acme.sh from not releases and it is in a broken transitory state15:31
clarkbBut also that seems like a good way to make people mad15:31
clarkbhttps://github.com/acmesh-official/acme.sh/blob/dev/acme.sh#L32-L33 yup I think this is our problem15:35
clarkbhttps://github.com/acmesh-official/acme.sh/blob/dev/acme.sh#L3541-L3547 is the error we are hitting fwiw. But I think we can just change over the server value and we'll be ok?15:40
*** dviroel is now known as dviroel|busy15:46
opendevreviewClark Boylan proposed opendev/system-config master: Be explicit about server used in acme.sh  https://review.opendev.org/c/opendev/system-config/+/79713615:51
clarkbinfra-root ^ I think that may fix things for us (or at least return us to using LE instead of zerossl then we'll find the next problem)15:51
fungii guess https://github.com/acmesh-official/acme.sh/issues/3556 and https://github.com/acmesh-official/acme.sh/issues/3557 are related15:51
clarkbI think we'd probably be ok with zerossl after a quick look, but it isn't working and I want working more than I need to get into a whether zerossl or le is better :)15:53
clarkbinfra-root another option would be to use https://github.com/acmesh-official/acme.sh/tree/2.9.0 and pin to that. I worry that if LE makes changes to their provisioning process that will be problematic for us. I think it is better to continue to roll forward15:55
fungiclarkb: probably we need to have an account at zerossl for it to function?15:56
clarkbfungi: ya that is my guess, its doing the http post and not getting what it expects back (due to our lack of account setup?)15:57
clarkbif we think zerossl is a better option for us we can figure that out as a next step15:57
clarkbI've been happy with LE though and don't like changing defaults in this way15:57
corvusmaybe we can use, er... LE's python thing whatever it's called?15:59
corvuscertbot16:00
corvusthey have a container now, we can "docker run --rm" it and not worry about packaging/venv/etc16:00
clarkbcorvus: does it have coordinated dns mode? I think that was one reason we ended up on acme.sh as it allowed you to run off and set up dns yourself16:01
clarkbthen come back to it and say ok try to finish issuance16:01
corvusclarkb: maybe this?  https://certbot.eff.org/docs/using.html?highlight=dns#manual16:02
clarkbya it seems a bit more difficult to use their hooks since it wants to do everything in one process run16:02
*** marios|ruck is now known as marios|out16:02
clarkband we rely on calling process of $tool to run off and set up dns. The inversion might be painful with ansible coordination but theoretically doable16:03
clarkbbut also acme.sh has worked well I'm not sure we need to completely reengineer this16:03
clarkbwe just need to be more explicit16:03
clarkbya with acme.sh the process seems to be we run the issue command with --dns and --yes-I-know-dns-manual-mode-enough-go-ahead-please this causes acme.sh to do as much of the acme with LE to get the dns confirmation code and return it. We then take that, update our dns servers then run acme.sh again with renew and --yes-I-know-dns-manual-mode-enough-go-ahead-please and that tells it to16:10
clarkbpick up from the previous request16:10
clarkband that works out well for having ansible coordinate things across services16:10
fungiand solves our catch-22 for needing to sometimes generate certs for names which don't yet resolve in dns or don't point to an actual running/configured server yet, and without needing something like dynamic dns update services16:15
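A minimal sketch of the two-pass acme.sh flow described above, written as Python helpers that only build the command lines and do not run anything. The helper names are hypothetical. The `--issue`, `--renew`, `--dns` and manual-mode flags are the ones quoted in the discussion; passing `--server letsencrypt` on the issue step is an assumption about how the CA could be pinned explicitly, and the actual system-config change may do it differently.

```python
# Sketch only: build the acme.sh invocations for the two-pass DNS flow.
# Nothing here is executed; an orchestrator such as ansible would run the
# first command, publish the returned TXT value in DNS, then run the second.

MANUAL_FLAG = "--yes-I-know-dns-manual-mode-enough-go-ahead-please"


def issue_command(domain, server="letsencrypt"):
    """First pass: ask the CA for a DNS challenge and stop.

    The --server value is an assumption; it names the CA explicitly so the
    tool's default CA does not matter.
    """
    return ["acme.sh", "--issue", "--dns", MANUAL_FLAG,
            "--server", server, "-d", domain]


def renew_command(domain):
    """Second pass: pick up the pending request once the TXT record exists."""
    return ["acme.sh", "--renew", MANUAL_FLAG, "-d", domain]
```

Keeping the two steps as separate commands is what lets an outside tool update the DNS servers in between, which is the property the conversation values.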
corvusyeah, just saying it looks like there may be an option using the 'manual' plugin16:18
*** jpena|out is now known as jpena16:22
*** jpena is now known as jpena|off16:48
clarkbzuul has +1'd https://review.opendev.org/c/opendev/system-config/+/797136 though the bulk of the change isn't actually tested since it goes through the staging path16:57
clarkbthat said I think we can probably land it then manually run the LE playbook so we don't wait for the periodic run?16:57
fungiyeah, i've +2'd, seems fine16:59
clarkbfungi: I've paged in some of the gerrit stuff. We have 275 conflicts identified by gerrit. ~clarkb/gerrit_user_cleanups/notes/proposed-cleanups.20210416 lists 181 email address and accounts. The idea is we will "retire" all of those account numbers then after some time we will run system-config/tools/remove-user-external-ids.py on proposed-cleanups.20210416 as long as users don't17:21
clarkbcomplain about the retirements17:21
clarkbfungi: the reason there isn't clear annotation for why each of those has been listed in the file is I just sort of manually went through the 275 and used manual judgement to assess which are probably ok and the reasoning behind each one may be quite specific. That said it's been long enough since I did that that going through the list again is probably worthwhile and when I do that I17:22
clarkbcan make notes for each one17:22
clarkbfungi: ~clarkb/gerrit_user_cleanups/notes/audit-results.yaml.20210415 is the information being processed to construct proposed-cleanups.2021041617:23
fungiokay, so it was at least expected that there might be some duplicate ids in the list, as well as duplicate addresses17:23
clarkbthat yaml file contains a number of account state attributes that are used for making these decisions.17:23
clarkbfungi: yup because an email address may be used by a number of accounts and need to be cleaned up across all of them and a single account may have multiple email addresses that conflict with other accounts :(17:24
fungiyep, got it. i'll take a closer look at the input list as well, but i think i'm okay moving forward with those at this point17:24
clarkbfungi: as far as double checking things goes I think you can look in proposed-cleanups.20210416 then cross check with audit-results.yaml.20210415 and ensure none of the account attributes in yaml indicate an account that may actually be used17:24
clarkbspot checking is probably fine. The list is quite large17:25
fungiyeah, that's what i was going to do for spot-checks17:25
fungiexactly17:25
clarkbwe're also retiring first because we can undo retirement pretty easily. So start with retirement. wait a couple of weeks then do the more destructive cleanup next17:25
fungiright17:25
clarkbfungi: as an example spot check the first entry in the proposal is also the first entry in the yaml. That individual has two conflicting accounts, both created 5 years ago. I made a judgement call to keep the newer of the two accounts and retire and cleanup the older of the two17:27
clarkbThe reason for this is both accounts are active, neither has pushed or reviewed code, but the older account doesn't have a valid openid so we just go for it even though the older account has valid ssh keys17:28
clarkbthat individual, should they decide to show up again, will not be able to login to the older account but can log in to the newer account and add new ssh keys there17:28
clarkbso we remove the older account17:28
fungiyep17:29
clarkbtalking out loud about it helps build my own confidence in what I did too so thank you for listening :)17:30
clarkbfungi: do we want to approve https://review.opendev.org/c/opendev/system-config/+/797136 nowish? ianw isn't around today due to timezones and is most familiar with the tools there, but I think I got it right.17:33
clarkbfungi: we can wait for ianw to take a look in a couple of days and review it too or run it now and see what happens17:33
fungimay as well give it a shot, approved just now17:37
clarkbk17:37
opendevreviewMerged opendev/system-config master: Be explicit about server used in acme.sh  https://review.opendev.org/c/opendev/system-config/+/79713618:38
fungiclarkb: ^18:38
fungiguess we wait for it to deploy18:38
clarkbyup the deploy seems to be running now18:42
johnsomHi OpenDev. Just an FYI, I got an odd pop-up loading a gerrit patch: http://paste.openstack.org/show/806780/18:42
johnsomReload had no issue.18:43
clarkbjohnsom: the backend may have timed out that request and sent an incomplete response or similar18:43
fungijohnsom: neat, it went through on reload? the server might be struggling and returning questionable responses to apache. checking cacti graphs now18:44
johnsomYeah, not a big deal, just thought I would mention it in case the backend rotation has an issue.18:44
clarkbjohnsom: fwiw there is no backend rotation in this case, just a single gerrit. We are making progress on transitioning it to a larger server which we hope will ease these problems (if this is in fact related to system load/memory use)18:45
fungiwell, there's no rotation in this case18:45
fungier, what clarkb said18:45
johnsomHa, well. Ok then. I guess it's easy to know which backend it is speaking of... lol18:45
fungithough nothing on the cacti graphs are immediately jumping out as anomalous18:46
fungithere might have been a connectivity issue with the db since it's not on the same system... will check the gerrit error log18:46
johnsomIn case it matters for some reason in the future, this was the patch URL: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/79396918:47
fungijohnsom: do you have an approximate timestamp?18:47
fungithough thanks, the request may help me narrow it down18:47
johnsomAbout enough time to cut/paste into the pastebin page before 11:4218:47
fungiunfortunately gerrit likes to throw tons of benign java backtraces into its error log so it's the typical needle in a haystack search18:48
johnsom(Pacific time)18:48
opendevreviewMerged zuul/zuul-jobs master: Add role to enable FIPS on a node  https://review.opendev.org/c/zuul/zuul-jobs/+/78877818:50
clarkbreview's acme log just said it verified things and the cert is available18:51
clarkbthe playbook is still running though. I'll check certs as reported by servers when the playbook is done18:52
fungiNot After : Jul 16 05:43:27 2021 GMT18:52
fungion the current review.o.o https cert18:52
clarkbfungi: ya I think handlers run last18:52
fungik18:52
clarkband we use a handler to reload apache so we need the playbook to complete to see it on the server side18:52
fungiahh, yep18:52
clarkbthe playbook is done now18:55
fungiNot After : Sep 16 17:51:16 2021 GMT18:55
clarkbboom18:56
* fungi cheers18:56
clarkbI'm getting a complete list of names to check out of the ansible log and will check them18:58
fungiecho|openssl s_client -connect review.opendev.org:https 2>/dev/null|openssl x509 -text|grep -i after|cut -c25-18:59
fungiif you want a fast check18:59
clarkbah yup I think I may need to do that because a number of the certs redirect to other sites and firefox won't show me the initial cert19:01
clarkbthe review cert lgtm as does the static.o.o cert and mirror.regioneone.linaro.opendev.org. I'll switch to s_client now to avoid redirects19:01
fungioh, that may not help if relying on sni19:02
fungisince that's getting you the initial cert19:03
fungi-servername name19:03
fungiaccording to the s_client manpage19:03
fungithat should get you the correct sni context19:03
* clarkb tries again19:04
clarkbthough in this case I got different times for each one19:04
clarkbimplying they were different (maybe my s_client is new enough for sni by default?)19:04
clarkb"If -servername is not provided, the TLS SNI extension will be populated with the name given to -connect if it follows a DNS name format." yup I should be good19:05
fungiyeah, i just tested and the hostname i passed on -connect seems to bring up the correct cert19:05
fungirighteous19:05
clarkbthis all looks good to me now. I'm happy we check these a month in advance :)19:05
fungii suppose if servername isn't in dns then you need -servername to work around t19:05
fungiit19:06
clarkbor if you talk to an ip address to test a specific backend or something along those lines19:06
fungiyeah19:06
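The expiry check above can also be sketched with Python's standard library. This is an illustration under assumptions, not the tooling used in the channel: the function names are made up, and `server_name` plays the role of `-servername` when connecting to an IP address or a name that is not in DNS.

```python
import socket
import ssl


def fetch_not_after(host, port=443, server_name=None, timeout=10):
    """Return the notAfter string of the certificate served for server_name.

    server_name is sent as the TLS SNI value; it defaults to host, which
    mirrors s_client filling SNI from the -connect name.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=server_name or host) as tls:
            return tls.getpeercert()["notAfter"]


def days_remaining(not_after, now):
    """Days between now (epoch seconds) and a 'Sep 16 17:51:16 2021 GMT' style date."""
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400
```

For example, `days_remaining(fetch_not_after("review.opendev.org"), time.time())` (after `import time`) would report how long the served certificate has left; it needs network access, so it is not run here.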
fungijohnsom: so i can find the entry in the apache error log:19:08
fungiAH00898: Error reading from remote server returned by /changes/openstack/tripleo-heat-templates~793969/revisions/1/related19:08
fungino luck finding a corresponding error in the gerrit logs19:08
fungibut thanks for pointing it out, will keep a closer eye on it19:08
johnsomSure, NP19:08
fungiapache returned the error at 18:40:57.11470019:08
fungiutc19:08
fungiit's possible the error in the gerrit logs is delayed by some minutes due to an internal timeout or something, but there was nothing mentioning that change or project in the flood of noise gerrit spews to its error log19:09
clarkbfungi: johnsom  I think we have a 60 second timeout in place on the proxypass directive (via apache defaults)19:11
clarkbI would expect apache to log that the timeout was hit rather than an error though19:11
fungirefreshing the cacti graphs, now it shows there was a fairly large but very brief spike swapping in for the sample ~10-15 minutes after that error19:12
johnsomYeah, I have found that to not be called out as well as I would like in Apache.19:12
fungialso the server's load average was up a bit around the time of the error: http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=26&rra_id=all19:13
johnsomI have spent some time tracking a uwsgi bug with custom compiled apache modules. The errors aren't always as direct to the issue as I had hoped.19:13
fungiso maybe there was something going on which slowed the server response beyond apache's proxy timeout tolerance19:14
fungiload average around 15 for a server with 16 vcpus though, not really substantial19:14
johnsomIn Designate, with uwsgi proxied through apache, we get a pipe closed error randomly in the apache log. I still need to put more time into that.19:15
fungianyway, snmp is being polled every 4 minutes, so the load average might have spiked much higher than 15 in that timespan19:15
fungier, every 5 minutes19:15
clarkbTIL about pip install --ignore-installed. I wish there was a way to instruct --ignore-installed to leave system packages alone, but I'm not sure it can really distinguish without a bunch of cases for linux and other OSes19:16
fungipip doesn't want to have to care about things it didn't install. the pip maintainers still consider `sudo pip` to be a case of "you're doing it wrong"19:17
clarkbyup, I know. Just wondering if there was a better way to address https://review.opendev.org/c/openstack/devstack/+/797069/3/tools/fixup_stuff.sh without rewriting devstack to use a virtualenv (which has been tried numerous times and failed for various reasons)19:18
fungibut probably worth trying again19:18
clarkbya the biggest hurdle at this point is grenade iirc. That and maybe a functional job or two (ironic and swift I think) that make assumptions about the installation that don't hold if moved to a virtualenv19:19
fungithere will come a time when the pip maintainers are going to be all "yeah we're just going to add a check in pip to see if it's being run as root, and then exit 1"19:19
fungithe discussion has come down to the pip maintainers on one side who don't want to support pip installing into system-wide paths, and the distro package maintainers on the other side who don't want pip installing into system-wide paths19:20
fungiso anything relying on `sudo pip install` is running on borrowed time at this point19:21
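As a rough illustration of the kind of guard being discussed, a script could refuse to install as root outside a virtual environment. This is a hypothetical sketch, not anything pip or devstack actually does; the function names are invented. Comparing `sys.prefix` with `sys.base_prefix` is the usual way to detect a venv on Python 3 (very old virtualenv releases set `sys.real_prefix` instead, which this sketch does not handle).

```python
import os
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv (prefix differs from base)."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def running_as_root():
    """True for effective uid 0; os.geteuid is missing on some platforms."""
    geteuid = getattr(os, "geteuid", None)
    return geteuid is not None and geteuid() == 0


def safe_to_pip_install():
    """Allow installs inside a venv, or outside one only when not root."""
    return in_virtualenv() or not running_as_root()
```

A wrapper could call `safe_to_pip_install()` before shelling out to pip and exit non-zero otherwise, which is roughly the behaviour predicted in the conversation.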
clarkbto be fair /usr/local is intended for this purpose isn't it? But maybe that escape hatch isn't sufficient for modern software and we need to move beyond it19:22
clarkb(once upon a time in a galaxy far far away we nfs mounted /usr/local on sparc solaris to provide a set of gnu tools because the solaris ones lacked features and compatibility with a bunch of stuff)19:22
fungithat works well as long as your sparc /usr/local wasn't also mounted on your x86 machines19:23
fungior you didn't mount the sparc64 /usr/local to 32-bit sparc hosts19:23
clarkbya we discovered that the hard way when we got a few x86 solaris machines. Had to do smarter nfs mounts after that19:23
clarkbnot sure what I would've done without those gnu tools though. Probably argued for installing linux on the machines >_>19:24
fungibut yes, i worked a site which did rootnfs boot with sun workstations, it gets fun when your architectures begin to vary19:24
clarkbmelwitt: the latest version of https://review.opendev.org/c/opendev/jeepyb/+/795912/ looks great. I wish the gerrit api existed and was as useful when we first wrote tools for this :)20:13
opendevreviewMerged opendev/jeepyb master: Convert update_blueprint to use the Gerrit REST API  https://review.opendev.org/c/opendev/jeepyb/+/79591220:47
opendevreviewMerged opendev/git-review master: Fix nodeset selections for zuul jobs  https://review.opendev.org/c/opendev/git-review/+/79675421:04
corvusmasterpe: you look like a normal user to me here :)21:12
masterpe[m]yep21:12
*** ChanServ sets mode: +o corvus21:13
corvus...21:17
*** ChanServ sets mode: -o corvus21:18
*** ChanServ sets mode: +o corvus21:19
corvusokay, irc and matrix agree about my mod status here now (though it took an on/off cycle for that to happen, probably because of my kick the other day)21:20
*** ChanServ sets mode: -o corvus21:22
ianwclarkb: thanks for updating acme.sh.  switching the default feels odd and i hope it doesn't suggest anything weird going on in the background (a bit burnt by recent irc :)22:29
ianwhttps://github.com/containers/podman/issues/10717 suggests we've done something to the capabilities of shadow-utils in our fedora 34 installs.  i don't imagine anyone is going to be jumping to debug that, it's on my todo list22:51
