Friday, 2021-03-12

clarkbhttps://docs.python.org/3/library/ssl.html#id9 is the related documentation00:03
clarkbI bet that this is related to 1.3 somehow00:03
*** sboyron has quit IRC00:05
fungiclarkb: my change and the dnm change stacked on it indicate that your revision works with 3.6-3.8 on bionic and 3.9 on focal, so i expect it's safe00:18
fungiis there more we want to test?00:18
*** cloudnull has quit IRC00:27
*** cloudnull has joined #opendev00:27
TheJuliaAnything special about limehost?00:29
TheJuliaI ask because it seems to be the cloud our multinode job in ironic loves to fail on00:30
fungilimestone? it's ipv6-only with ipv4 access to the internet via many-to-one nat00:33
TheJuliayeah00:33
fungiwhat kind of failures? talking to things on the internet? ipv4-only things?00:33
TheJuliait *looks*  like the vxlan tunnel is just not passing traffic00:33
TheJuliabetween the two nodes00:33
fungioh, neat. i think we do our multinode setup specifically p2p so that vxlan won't try to use multicast... are you using ours or did you roll your own?00:34
fungiis it failing to pass traffic over vxlan between nodes all the time there, or only sometimes?00:35
fungipossible the lan there has gotten partitioned or something, i suppose00:36
openstackgerritMerged opendev/system-config master: refstack: use CNAME for production server  https://review.opendev.org/c/opendev/system-config/+/78012500:38
fungiTheJulia: another possibility is that ipv4 connectivity for some nodes is breaking partway into the build? we'd still be able to reach them via ipv6 so zuul wouldn't realize anything had gone wrong network-wise00:40
fungimight make sense to look at syslog on one of the failure examples, see if dhcpd logs any lease updates, arp overwrites, et cetera00:41
TheJuliahttps://0b3775447bad164395a7-ce9ebe3ea1326bbb58a211f00836955d.ssl.cf2.rackcdn.com/778145/2/gate/ironic-tempest-ipa-wholedisk-direct-tinyipa-multinode/4c83c2d00:41
TheJuliaWhen we power up VMs attached to brbm, basically the packets never get through it appears00:41
TheJuliaso they never boot00:41
TheJuliaat least off of compute100:41
fungibasically we're communicating with those nodes exclusively over ipv6, while vxlan is communicating between the nodes over ipv4, so if the latter is dying that could explain it00:42
fungiwe do at least test initially that each node can reach something on the internet over ipv4, but it could be breaking after that i suppose00:43
fungisyslog shows iptables blocking a bunch of multicast traffic00:44
fungiis that typical?00:44
fungivxlan will try to tunnel layer-2 broadcast traffic over multicast ip00:45
fungipossible that's just benign noise00:49
*** tosky has quit IRC00:51
TheJuliaI think it is noise00:51
TheJuliacross node traffic seems to work just fine otherwise00:55
TheJuliaI'm not an ovs expert but it almost looks like ovs kind of works, datapath gets established, and then ovs seems to become unhappy and boom00:59
* TheJulia wonders about MTUs00:59
guillaumecclarkb, indeed, "context.options |= ssl.OP_NO_TLSv1_3" solves the zuul ssl test issue01:01
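A minimal sketch of the workaround guillaumec confirms above; only the OP_NO_TLSv1_3 line comes from the log, the surrounding context setup is assumed and not gear's actual code.

    import ssl

    # Keep the auto-negotiating PROTOCOL_TLS context but mask out TLS 1.3,
    # so both ends fall back to TLS 1.2 at most.
    context = ssl.SSLContext(ssl.PROTOCOL_TLS)
    context.options |= ssl.OP_NO_TLSv1_3
    # Equivalent, non-deprecated spelling on Python 3.7+:
    # context.maximum_version = ssl.TLSVersion.TLSv1_2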
fungiTheJulia: ooh, good line of inquiry. that could vary by provider too01:01
fungiTheJulia: https://zuul.opendev.org/t/openstack/build/4c83c2d9c1774ce09f0d447bbdbed4d1/log/zuul-info/zuul-info.compute1.txt#2601:02
fungi150001:02
fungii think devstack tries to set the virtual interfaces lower to accommodate that01:02
TheJulia1500 feels like classic physical interface. v6 has pmtu discovery, I wonder if we're in some weird cross-hypervisor packet dropping01:03
* TheJulia prepares to mark the job non-voting :(01:03
fungiyeah, also any particular snapshot of the pmtu for those peers won't necessarily be consistent01:04
fungilogan-: ^ if you're around, maybe you could have some theories since you know what the underlying network looks like01:05
TheJuliaFor the VMs themselves, we're dropping the mtu to 1330. Neutron runs at 1430 (there is a reason for the 100 bytes, I just don't remember it without tasty beverages.)01:07
fungiyeah, that memory is best not relived without some chemical safety net01:08
TheJuliaYeah, I'd think the only way to really figure this out is to be able to catch it in the act with a pcap or something01:09
TheJuliabut that would be huge01:09
fungiis this the most common failure for that job? if so an autohold could at least keep the nodes around after the job fails01:10
fungidoesn't mean whatever was breaking them would still be broken by the time we logged in, but worth a shot01:10
TheJuliafungi: I looked at ?3? randomly on that job and it was all the same01:13
TheJuliaall on limestone01:13
TheJuliaI dunno, I'm kind of okay with just deferring it at the moment, too much work to do.01:13
TheJuliathat is unless magical ideas appear01:14
openstackgerritJeremy Stanley proposed opendev/gear master: DNM: see if intermediate Python versions work too  https://review.opendev.org/c/opendev/gear/+/78013101:18
fungiTheJulia: once you (or anyone really) is ready to dig into it, we can set up an autohold for that job and wait for it to catch a failure01:23
*** mlavalle has quit IRC01:24
TheJuliafungi: much appreciated01:34
*** mgagne has joined #opendev02:29
ianwkopecmartin / clarkb : i gave the containers a cycle after the config change applied and i can see results on https://refstack.openstack.org now.  so i think it's working and won't roll back02:38
johnsomI'm trying to push that tag for wsme, but ssh with gerrit is rejecting me. Even if I try to checkout a patch using ssh I get permission denied. Any tips/ideas?02:41
johnsomThe key in gerrit (web) is correct02:41
*** artom has quit IRC02:46
openstackgerritIan Wienand proposed opendev/system-config master: refstack: cleanup old puppet  https://review.opendev.org/c/opendev/system-config/+/78013802:49
johnsomOk, it's something broken on this fedora workstation. Everything works fine from other VMs.02:54
*** whoami-rajat_ has joined #opendev02:55
openstackgerritIan Wienand proposed opendev/system-config master: certcheck: cleanup letsencrypt domains  https://review.opendev.org/c/opendev/system-config/+/78014003:01
ianwjohnsom: fedora 33?03:01
johnsomyeah03:02
ianwyep, that's a known issue03:02
johnsomlol03:02
fungiopenssl security defaults03:02
johnsomCan I get an hour refund?03:02
ianwhttps://issues.apache.org/jira/browse/SSHD-1118 if you'd like to read too much inconclusive detail on it :)03:02
johnsomha, thanks, I will take a look03:02
ianwspeaking of, RAX got the wrong end of the stick with my report that fedora 33 doesn't work with their console host03:03
ianwi think they thought it meant fedora 33 hosts don't show a console, not that you can't connect to their console host via fedora 33 with default configuration03:04
ianwthat's even more screwed up and i'm owed a bigger refund than johnsom there :)03:05
johnsomYep, that was the exact problem. Thanks ianw03:08
*** whoami-rajat_ is now known as whoami-rajat03:16
ianwi filed https://issues.apache.org/jira/browse/SSHD-1141 as requested in sshd-111803:32
ianwi think i distilled it correctly, fungi ^ could maybe check :)03:32
openstackgerritIan Wienand proposed opendev/system-config master: kerberos-kdc: role to manage Kerberos KDC servers  https://review.opendev.org/c/opendev/system-config/+/77884004:06
openstackgerritIan Wienand proposed opendev/system-config master: kerberos: switch servers to Ansible control  https://review.opendev.org/c/opendev/system-config/+/77989004:06
openstackgerritIan Wienand proposed opendev/system-config master: kerberos-kdc: add database backups  https://review.opendev.org/c/opendev/system-config/+/77989104:06
openstackgerritIan Wienand proposed opendev/system-config master: refstack: add backup  https://review.opendev.org/c/opendev/system-config/+/77506104:18
openstackgerritIan Wienand proposed opendev/system-config master: borg-backup hosts: use exact names  https://review.opendev.org/c/opendev/system-config/+/78014404:28
*** ysandeep|holiday is now known as ysandeep04:33
*** ykarel has joined #opendev04:50
ykarelianw, hi, u around?05:11
ykarelwe are facing mirror issues for centos 8-stream, and on checking i see the centos mirror which the infra mirrors follow has not synched for 12 hours05:12
ykarelhttp://mirror.dal10.us.leaseweb.net/centos/8-stream/AppStream/x86_64/os/repodata/ one05:13
ykarelin https://mirror-status.centos.org/ i see some mirror which are good05:14
openstackgerritMerged opendev/system-config master: refstack: add backup  https://review.opendev.org/c/opendev/system-config/+/77506105:15
ykarelthe mirror u added in https://review.opendev.org/c/opendev/system-config/+/684437 is good currently; it was later changed to the current one ^ as it was not up to date at the time and was not in mirror-status.centos.org05:17
ykarelin https://review.opendev.org/c/opendev/system-config/+/71660205:17
*** stevebaker has quit IRC05:18
*** stevebaker has joined #opendev05:23
ykarelok http://mirror.dal10.us.leaseweb.net/centos/8-stream/AppStream/x86_64/os/repodata/ is updated now, so next rsync should fix infra mirrors05:25
*** whoami-rajat has quit IRC05:28
ykarellast run missed that, and now next run is in approx 1.25 hours05:37
ykarelif it can be manually triggered before that it will be good, else have to wait05:37
*** ykarel_ has joined #opendev06:08
*** ykarel has quit IRC06:08
*** ralonsoh has joined #opendev06:18
*** marios has joined #opendev06:20
*** ykarel_ has quit IRC06:31
*** ykarel has joined #opendev06:32
*** whoami-rajat_ has joined #opendev06:56
ykarelmirrors got updated now07:01
*** slaweq has joined #opendev07:11
*** eolivare has joined #opendev07:25
ianwykarel: sorry, missed this, things in sync now?07:34
ykarelianw, yes now it's synched07:34
ykarelianw, now seeing issue with epel repos not synched07:47
*** hashar has joined #opendev07:48
ykarelhttp://pubmirror1.math.uh.edu/fedora-buffet/epel/8/Everything/x86_64/repodata/?C=M;O=D vs mirror.ord.rax.opendev.org/epel/8/Everything/x86_64/repodata/?C=M;O=D07:50
ykareland other epel mirror https://dl.fedoraproject.org/pub/epel/8/Everything/x86_64/repodata/?C=M;O=D07:51
*** sboyron has joined #opendev08:05
*** andrewbonney has joined #opendev08:33
*** amoralej has joined #opendev08:44
kopecmartinianw: clarkb return_to address is fixed, thanks for that, but I still can't sign in, I suspect it might be something with realm, in openstackid.org i can see that I'm signing in from "Site" realm instead of 'refstack.openstack.org'08:50
*** tosky has joined #opendev08:51
*** jpena|off is now known as jpena08:58
ttxkopecmartin: yes I confirm I see the same. It's weird as the URL has openid.realm=https%3A%2F%2Frefstack.openstack.org09:08
kopecmartinttx: hmm, is there something else which has to be set on the server side in order to have the correct realm?09:10
ttxkopecmartin: I have no idea.. I'll ask openstackID folks to have a look. Was anything changed in the parameters, or was just everything copied over from the old one?09:10
ttxOr could it be some DNS propagation issue ? Like the IP we have for refstack.o.o is not the same as the one the openstackid server sees?09:12
kopecmartinttx: there were lots of changes in the server config (like puppet -> containers move, py2->py3, OS version ...) but no significant changes in the configs09:14
kopecmartinwell , a little workaround with redirection https://review.opendev.org/c/opendev/system-config/+/776292/18/playbooks/roles/refstack/templates/refstack.vhost.j209:15
kopecmartinmaybe that ^^?09:15
kopecmartinmight be, unfortunately the dns config is outside my scope09:17
*** stevebaker has quit IRC09:17
ttxhmm, probably not. Here the issue is that clicking "Log In" should give us the login form, not the openstackid.org front page09:17
ttxI'll ask the ID provider guys, they should be able to tell us what's missing. i'll let you know here if they reply anything useful, And thanks again for working on this!09:18
kopecmartinttx: sure, thanks .. I'm gonna quickly check the refstack project to see how the signin url is formed - i remember there were some changes too09:18
*** ysandeep is now known as ysandeep|lunch09:41
openstackgerritAurelien Lourot proposed openstack/project-config master: Add Magnum charm to OpenStack charms  https://review.opendev.org/c/openstack/project-config/+/78021109:50
*** dtantsur|afk is now known as dtantsur09:51
openstackgerritAurelien Lourot proposed openstack/project-config master: Add Magnum charm to OpenStack charms  https://review.opendev.org/c/openstack/project-config/+/78021109:54
*** smcginnis has joined #opendev10:38
*** bodgix has quit IRC10:59
*** bodgix_ has joined #opendev10:59
*** slaweq has quit IRC11:00
*** slaweq has joined #opendev11:02
*** brinzhang0 has quit IRC11:11
openstackgerritRotan proposed openstack/diskimage-builder master: replace the link which is in the 06-hpdsa file  https://review.opendev.org/c/openstack/diskimage-builder/+/73028611:20
*** ysandeep|lunch is now known as ysandeep11:36
openstackgerritMerged zuul/zuul-jobs master: bindep.txt: skip python-devel for el8 platform  https://review.opendev.org/c/zuul/zuul-jobs/+/78005011:47
*** hashar is now known as hasharLunch12:10
*** smcginnis has quit IRC12:28
*** jpena is now known as jpena|lunch12:32
*** artom has joined #opendev12:44
*** tkajinam has quit IRC12:54
*** hasharLunch is now known as hashar13:00
*** smcginnis has joined #opendev13:05
*** ykarel has quit IRC13:08
*** ykarel has joined #opendev13:09
*** amoralej is now known as amoralej|lunch13:23
*** jpena|lunch is now known as jpena13:49
*** smcginnis has quit IRC13:52
*** mlavalle has joined #opendev13:59
dtantsurhi folks! is it only me, or is there some issue with published logs? https://zuul.opendev.org/t/openstack/build/6f9c830b828e4ff382ed05bfdc608a80/log/job-output.txt14:03
fungiianw: that mina-ssh feature request looks good to me, also they've already replied suggesting you could implement it for them ;)14:04
fungidtantsur: "This logfile could not be found" usually means either we failed trying to upload it, or it disappeared off the swift server after upload. i'll take a look in the executor debug logs in a bit to rule out the former (that usually ends in a post_failure result though)14:06
dtantsurthanks! note that it's a very recent run, so it shouldn't have timed out.14:06
fungii'll need to look at it after i run some errands this morning, but will dig in as soon as i'm back14:10
*** amoralej|lunch is now known as amoralej14:12
openstackgerritRich Bowen proposed opendev/yaml2ical master: Adds second- and fourth- week recurring meetings  https://review.opendev.org/c/opendev/yaml2ical/+/78026614:14
*** hashar is now known as hasharAway14:20
*** mfixtex has joined #opendev14:24
*** smcginnis has joined #opendev14:28
TheJuliaout of curiosity, is the new gerrit webui making huge calls for lists of everything as it could relate to the user interaction?14:32
*** lpetrut has joined #opendev14:35
*** mfixtex has quit IRC14:37
*** whoami-rajat_ is now known as whoami-rajat14:40
fungikopecmartin: clarkb: ttx: apparently the problem is the auth url should be https://openstackid.org/accounts/openid2 not just the base site url14:47
fungiTheJulia: not entirely sure, it's implemented with polymer... but the reason we suspect it's slow is for backend reasons (the relational database has been replaced with objects in git repositories)14:48
fungiand it seems like memory pressure might be making filesystem caches inefficient14:48
*** hasharAway is now known as hashar14:49
TheJuliaOH!14:52
TheJuliaYeah, that explains a lot14:52
TheJuliasince it *looks* like the client asks for things like all my changes, all my blah, at least what I can grok on the screen, and if that doesn't load quite fast enough then the page load breaks it seems14:54
TheJuliaThis is why database indexes are a thing too14:54
TheJulia"Hi, give me the index" vs "hi, pls tablescan this for me"14:54
fungiyeah, and gerrit maintains very large in-memory and on-disk caches of stuff, but the indexing even in caches becomes quite important14:57
*** Green_Bird has joined #opendev14:59
*** ysandeep is now known as ysandeep|dinner15:00
*** eolivare has quit IRC15:01
*** Green_Bird has quit IRC15:01
*** eolivare has joined #opendev15:02
*** Green_Bird has joined #opendev15:02
fungidtantsur: it looks like uploads for that build worked fine, but that one file is not available in swift for some reason (other logs uploaded for that same build can be accessed no problem). have you seen more examples of this? maybe we can find a commonality15:03
*** Green_Bird has quit IRC15:03
fungispecifically, https://14f46b65f6b8edf7deec-a7117e65d5d46fb2ebde9a8b3aa13b86.ssl.cf2.rackcdn.com/780251/1/check/releases-tox-list-changes/6f9c830/job-output.txt reports a "Content Encoding Error" from the rackspace swift cdn15:03
*** artom has quit IRC15:04
*** Green_Bird has joined #opendev15:04
dtantsurI haven't seen other cases, no15:05
fungiso this may be something broken in rackspace's cdn layer, or data corruption at rest (though swift i think prevents that, i don't know how "swift" rackspace's deployment is), or it could be we did something weird when uploading the file (but which did not produce any error)15:05
*** Green_Bird has quit IRC15:05
*** Green_Bird has joined #opendev15:06
openstackgerritMartin Kopec proposed opendev/system-config master: refstack: Fix openid endpoint  https://review.opendev.org/c/opendev/system-config/+/78027215:09
kopecmartinfungi: clarkb ianw ^^15:09
kopecmartinfungi: thanks .. i didn't notice it was overridden in the config, I checked just the default value in refstack ..ah15:10
fungikopecmartin: awesome, reviewing now15:11
fungikopecmartin: i've approved, once it deploys please double-check whether things are working as desired15:12
*** artom has joined #opendev15:16
*** lpetrut has quit IRC15:16
fungii need to pop out to run some errands (i'm a bit behind) but should be back in an hour15:19
kopecmartinfungi: thank you, sure15:20
openstackgerritMerged opendev/system-config master: refstack: Fix openid endpoint  https://review.opendev.org/c/opendev/system-config/+/78027215:42
clarkbfungi: re the gear change safety I think guillaumec is saying that those changes will break zuul testing on focal with python 3.816:03
clarkbfungi: and it appears related to the enablement of tls 1.3 via PROTOCOL_TLS16:03
clarkbwe could maybe update the bottom change to disable 1.3 for now?16:03
clarkb( I worry that is the sort of change that becomes permanent)16:04
*** hashar is now known as hasharAway16:06
clarkbguillaumec: maybe we can try to do a minimal reproducer forcing tls 1.3 between client and server and then asking both of them to stop?16:08
clarkbguillaumec: since the test is timing out I suspect that it may just be a teardown/cleanup problem?16:08
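A minimal sketch of the kind of reproducer clarkb suggests: force TLS 1.3 on both ends, exchange one message, then tear both sides down to see whether shutdown hangs. The port, hostnames, and certificate paths are placeholders and nothing here is taken from the gear tests.

    import socket
    import ssl
    import threading
    import time

    HOST, PORT = "127.0.0.1", 47300   # placeholder port

    def server():
        ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
        ctx.minimum_version = ssl.TLSVersion.TLSv1_3       # refuse anything below 1.3
        ctx.load_cert_chain("server.crt", "server.key")    # placeholder cert files
        with socket.create_server((HOST, PORT)) as listener:
            conn, _ = listener.accept()
            with ctx.wrap_socket(conn, server_side=True) as tls:
                tls.recv(16)
        # leaving the with blocks closes the TLS session, then the sockets

    srv = threading.Thread(target=server, daemon=True)
    srv.start()
    time.sleep(0.5)                                        # crude wait for the listener

    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    ctx.check_hostname = False                             # self-signed placeholder cert
    ctx.verify_mode = ssl.CERT_NONE
    with socket.create_connection((HOST, PORT)) as sock:
        with ctx.wrap_socket(sock) as tls:
            print("negotiated", tls.version())             # expect 'TLSv1.3'
            tls.sendall(b"ping")

    srv.join(timeout=5)
    print("server thread still alive:", srv.is_alive())    # True would point at teardown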
*** hasharAway is now known as hashar16:11
*** dhellmann has quit IRC16:12
clarkbfungi: TheJulia: ovs did not support vxlan over ipv6 until relatively recently (and even that may be spec defying?). One option may be to update the multi node bridge stuff to run it over ipv6 if present as that will get us on the prefered IP stack for providers like limestone16:13
*** dhellmann has joined #opendev16:13
clarkbthough using codesearch I'm not sure that the multi node bridge stuff is involved? seems like this may all happen in devstack16:14
TheJuliayeah, there is some magic there someplace in the entire multinode setup16:14
TheJuliaI have to hunt it down every single time I need to look at it :\16:14
TheJuliaThat might be an option. Interestingly enough, cross-node v4 seems to be fine in general, but we may just not be seeing everything in the job logs that could be happening, which makes it seem like everything is fine16:16
*** ysandeep|dinner is now known as ysandeep16:18
clarkbya and vxlan is udp and could be more sensitive to those problems?16:18
*** klonn has joined #opendev16:19
clarkbanother issue it could be is conflicting ip addrs16:20
clarkbwe saw that way back when osic was around because they assigned test node ips out of 10/8 and occasionally the overlays would overlap ip ranges and routing would break16:20
clarkbok it is using the multinode network setup via zuul. It does so with a patch interface between brbm and br-infra called phy-brbm-infra16:22
clarkband phy-infra-brbm16:22
clarkbthey are opposite ends of the same virtual cable16:22
clarkbdoes not appear to be an ip conflict. br-infra uses 172.24.4.0/24 and the limestone nodes are 10.4.70.0/2416:25
clarkbthat probably rules out the easy things, holding a couple of nodes and inspecting the result is likely the easiest way to debug16:29
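For what it's worth, the overlap check above is easy to script with the stdlib; the two networks are the ones quoted from the job, and this is just an illustration, not part of any existing tooling.

    import ipaddress

    br_infra = ipaddress.ip_network("172.24.4.0/24")    # devstack overlay range
    limestone = ipaddress.ip_network("10.4.70.0/24")    # limestone test node range
    print(br_infra.overlaps(limestone))                 # False -> no address conflict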
clarkbianw: the mina sshd feature request lgtm. Also chris might be my hero16:38
kopecmartinclarkb: fungi will this https://review.opendev.org/c/opendev/system-config/+/780272 be applied on the server automatically or is there a manual action required?16:39
fungikopecmartin: looks like we don't have a separate deploy job for it yet, so it should get applied in our hourly deployment i think? i'll check in a sec16:40
fungior it could be the deploy jobs haven't finished yet16:40
kopecmartingreat, thanks .. just wanted to be sure16:41
clarkbthere should be an infra-prod job, we may have to intervene and restart the service to pick up the config change though16:41
fungikopecmartin: oh, it just hasn't run yet, see the deploy pipeline at https://zuul.opendev.org/t/openstack/status16:41
clarkbI notice ianw did that earlier for the fqdn switch16:41
fungithere is an infra-prod-service-refstack build for it in waiting state, but it's a ways down the list16:42
fungiand yeah, maybe the playbook needs a restart handler for config changes16:42
*** amoralej is now known as amoralej|off16:42
*** marios is now known as marios|out16:45
ttxone of the jobs seems to have failed16:47
ttxinfra-prod-base on the deploy of 78027216:47
fungiyeah, the infra-prod-base job probably had trouble deploying to a down server somewhere, i'm about to go hunting in the logs on the bastion, it shouldn't affect refstack deployment unless it was the refstack server which was the problem16:48
ttxack16:48
*** marios|out has quit IRC16:48
fungithat's the job which does things like add our sysadmin accounts, set up mta configs, et cetera16:48
fungibut it's running against every machine in our inventory, so if one is down/hung somewhere, that'll report a build failure16:49
fungid'oh!16:50
fungirefstack.openstack.org     : ok=0    changed=0    unreachable=1    failed=0    skipped=0    rescued=0    ignored=016:50
fungiso bridge can't reach refstack.openstack.org16:50
fungiaha, expected16:51
fungirefstack01.openstack.org   : ok=60   changed=2    unreachable=0    failed=0    skipped=7    rescued=0    ignored=016:51
fungirefstack01.openstack.org is working, but the refstack.openstack.org server in our inventory (i'm guessing the old one) is unreachable, probably offline in preparation for being deprovisioned but we haven't deleted it from the inventory yet16:51
clarkbyup, ianw says the old server would be shutdown but not removed for now16:52
fungittx: so that build failure is expected in this case16:52
fungii suppose we could have added that server to our disable list to avoid the deploy build trying to reach it and reporting failure16:55
fungisomething we could consider for future deprovisioning work16:55
TheJuliaclarkb: oh yeah, definitely way more sensitive16:56
clarkbfungi: ya and maybe we should go ahead and add it now to prevent confusion until it is removed?16:57
TheJuliafungi: ^^^ that is why we lowered the mtu a long time ago.... I remembered :(16:57
*** hashar has quit IRC16:59
*** eolivare has quit IRC17:06
*** jpena is now known as jpena|brb17:12
fungiTheJulia: i'm sorry, we should have waited to trigger those memories until beer time17:25
fungiclarkb: good call, added it just now17:25
ttxfungi: ok let me know when I should be testing again :)17:28
fungiwill do, looks like there are still three deploy jobs ahead of it17:36
openstackgerritClark Boylan proposed opendev/system-config master: Enable srvr, stat and dump commands in the zk cluster  https://review.opendev.org/c/opendev/system-config/+/78030317:36
fungithe semaphore those jobs use tends to slow this down quite a bit17:36
clarkbcorvus: ^ enabling those commands17:37
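Once enabled, those ZooKeeper "four letter word" commands can be poked at with a few lines of Python (or nc); a small sketch assuming the standard client port 2181 and a made-up hostname.

    import socket

    def zk_four_letter(host, cmd, port=2181, timeout=5):
        # Four-letter commands are plain text over the normal client port;
        # the server writes its reply and then closes the connection.
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(cmd.encode("ascii"))
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        return b"".join(chunks).decode("utf-8", "replace")

    for command in ("srvr", "stat"):
        print(zk_four_letter("zk01.example.org", command))  # hostname is a placeholder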
*** jpena|brb is now known as jpena17:54
fungittx: kopecmartin: refstack deployment finished at 17:49:57 utc, i'll check whether the service got restarted17:55
*** ralonsoh has quit IRC17:55
fungilooks like the container was last upped at 02:34 utc, according to ps17:56
fungialso /var/refstack/refstack.conf was last modified on february 10, not sure if that's old. checking the bindmounts now17:57
clarkbfungi: I need breakfast now that the nodepool launcher debugging is done, but I can help with refstack once I've eaten something17:58
fungiaha, yeah that's cruft, it looks at /var/lib/refstack/etc/refstack.conf now and that was modified 17:4917:58
fungiopenstack_openid_endpoint = https://openstackid.org/accounts/openid217:59
fungittx: kopecmartin: so the config looks correct. will the service need a restart to see the updated refstack.conf file or does it reload it autonomously? sounds like ianw did an explicit restart to pick up an earlier config change17:59
kopecmartinfungi: a restart will be needed18:01
kopecmartinso that the config gets copied to the container and is applied18:02
*** ykarel has quit IRC18:04
fungikopecmartin: okay, doing that now18:05
fungiwe should consider adding a handler to do that on config updates if that's safe, or abstract the configuration loading into something which can be triggered by a signal (or watch for file updates directly)18:05
fungiit's on its way back up now18:06
*** artom has quit IRC18:06
fungittx: kopecmartin: i guess go ahead and test it now18:06
kopecmartinfungi: \o/ it works!! thank you!!18:07
*** artom has joined #opendev18:08
fungikopecmartin: no thanks needed, i just pushed a few buttons... but glad it's sorted now18:08
*** dtantsur is now known as dtantsur|afk18:10
*** hamalq has joined #opendev18:14
fungi#status log Restarted the containers on refstack01 to pick up configuration change from https://review.opendev.org/78027218:15
openstackstatusfungi: finished logging18:15
*** smcginnis has quit IRC18:25
clarkbcorvus: still planning to do a zuul restart today? queues aren't tiny but also not huge. Node demand is very low.18:26
clarkbalso openstack release team said they could extend to monday if necessary (they seemed ok with the friday plan)18:27
*** smcginnis has joined #opendev18:30
corvusclarkb: extend what?18:31
clarkbcorvus: feature freeze18:33
clarkb(I kinda got the impression a few things were going to slip even if we did nothing so they were already considering it)18:33
corvusclarkb: yeah, i agree nodes look good now, but maybe after lunch?18:34
clarkbcorvus: wfm. Though I'll be trying to enjoy this good weather on the bike this afternoon, but will be around before and after that18:35
fungii'll be around18:35
*** klonn has quit IRC18:38
clarkbfungi: now that we've had a day to think about it, any reasons to not move forward with simply retiring those accounts with the no external id for preferred email address problem? We theorize these are the result of fallout from other sql db based account mangling as we don't expect this is doable as a normal user. Also none of the accounts have been used in a year according to the audit script18:43
*** jpena is now known as jpena|off18:44
fungino, i still think it seems like it should be entirely safe to retire those18:49
clarkbok I'll proceed with that now then18:49
clarkbI went over the data again a bit more today and you can see for some of the accounts they clearly transitioned from one account to another (just builds more confidence this is the right move)18:50
*** LowKey has joined #opendev18:52
fungiyep, i expect they're all like that, it's just harder to connect the dots for a few since it happened years before18:52
*** andrewbonney has quit IRC18:54
clarkbalright that is done and logs have been uploaded to review19:00
clarkbI'm going to do a consistency check next19:00
clarkb#status log Corrected all Gerrit preferred email lacks external id account consistency problems.19:14
openstackstatusclarkb: finished logging19:14
clarkbstill have quite a number of external id conflicts but this is progress19:14
clarkbthe consistency results are in my homedir on review19:14
* fungi will take a look shortly19:16
*** hashar has joined #opendev20:07
*** klonn has joined #opendev20:08
corvusclarkb, fungi: i'm going to start that restart now20:55
corvusclarkb: did your nodepool change land?  should we restart nodepool too?20:56
clarkbcorvus: it did land and I worked through them yesterday already20:56
corvusclarkb: ok, so we'll just leave nodepool alone?20:56
clarkbya should be fine to leave nodepool alone20:56
fungicool, i'm here. need help?20:57
corvusfungi: i don't think so; i'm just going to save queues then run the zuul_restart playbook20:57
fungii'm around to dig in if it goes pear shaped20:58
corvusstopping now21:00
corvusthings are starting21:02
corvuscat jobs are catting21:04
fungiso catty21:06
fungiour zuul is practically jellicle21:07
corvusre-enqueing21:07
corvus2021-03-12 21:08:02,726 DEBUG zuul.RPCListener: Formatting tenant openstack status took 0.005 seconds for 93502 bytes21:08
corvusthat's a new log line btw21:08
*** sboyron has quit IRC21:08
funginice! i like the (albeit miniscule) measurement there21:08
corvussee where that is when all the changes are re-enqueued :)21:08
fungii suppose it gets bigger when there's queue data21:08
fungiheh, right that21:09
*** artom has quit IRC21:09
clarkband we cache that for ~1second still right?21:09
corvusyep21:09
fungilast i looked at the apache config21:09
corvuswe cache internally too21:10
corvusapache protects zuul-web, and zuul-web protects zuul-scheduler21:10
fungioh, right, the cache duration is expressed in the headers21:11
funginot hard-coded in the apache vhost config21:11
corvuswe're at about .03s for 500k so far21:11
corvus(still enqueueing)21:11
*** whoami-rajat has quit IRC21:13
corvus#status log restarted all of zuul at commit 13923aa7372fa3d181bbb1708263fb7d0ae1b44921:19
openstackstatuscorvus: finished logging21:19
corvusre-enqueue is done.21:19
corvus2021-03-12 21:19:30,233 DEBUG zuul.RPCListener: Formatting tenant openstack status took 0.059 seconds for 877325 bytes21:20
corvusthat's looking typical21:20
corvussometimes it's higher, but it's not in the main thread, so can suffer from contention21:20
corvus0.1 looks to be the max21:20
clarkbstill well below the cache time which is why I was curious21:21
fungistill fairly small, good sign21:21
fungiwe had almost no node request backlog prior to the restart, and the reenqueue really only shot it up to 500 briefly21:23
fungiit's already burning down quickly21:23
fungiwe weren't even using max quota at the time of the restart, so seems like it was good timing21:24
clarkbya I expected even with feature freeze that friday would be much calmer21:24
fungieveryone's already drinking21:24
fungiwhy am i not drinking yet?21:24
clarkbI'm not drinking because it is almost time to get some exercise21:25
fungitime to exercise my liver21:25
corvusthen drinking21:25
clarkbcorvus: it is almost as warm here as there. I'm really excited21:25
fungiit's 22.5c here21:26
fungicrazy given this is technically still winter for more than a week21:26
clarkbwill get to 16 here in about an hour. I'm timing my outside time around that temp peak :)21:26
fungibreezy but sunny. we should have this temperature all the time21:27
clarkbif I go out in half an hour then my 1-1.5 hours outside should involve max warmth21:27
fungii should walk to the beach, but it's almost dinner21:27
*** hashar has quit IRC21:37
*** smcginnis has quit IRC21:38
*** smcginnis has joined #opendev21:44
clarkblooks like grafana says backlog is back to basically nil21:49
fungiyes, we're back down under quota again21:50
fungii think that means the weekend is here21:50
clarkbfungi: ruamel will serialize human readable yaml right?21:50
clarkbI think my next step on the gerrit account work is to have the audit script spit out serialized data so I can write queries against it more easily21:50
fungiclarkb: i don't know how to interpret some of those words, but it preserves ordering and comments21:50
fungiit also comes at the cost of a spaghetti pile of ruamel libraries as dependencies21:51
clarkbfungi: heh maybe "more human readable than pyyaml" is more accurate21:51
clarkbI guess I can try pyyaml first21:51
clarkbin particular what I want to start looking at is whether there are any more accounts that have broken openids regardless of previous activity, and I realized that for that I should just try to serialize as much info as possible then write separate queries against that21:51
clarkbalso do you think we can land the tooling as proposed?21:52
fungiworkaround is to actually make comments in yaml (like have a "description" field, et cetera)21:52
clarkbits been used a fair bit now and would make it easier for me when I switch between system-config branches to not have to always checkout that one branch to have the tools present21:52
fungier, yeah i'm not entirely understanding the "human readable" bit then21:53
fungiif it's not about comments, then...21:53
*** smcginnis has quit IRC21:53
fungiyou can make pyyaml emit more human-friendly yaml formats, you just need to configure it21:53
clarkbfungi: maybe the pain has been in configuring it then21:54
fungihttps://mudpy.org/gitweb?p=mudpy.git;a=blob;f=mudpy/data.py;h=b73959a1b63d857657dbdd4f5afce32c3746e593;hb=HEAD#l16121:55
fungii've overridden the dumper there specifically to force it to indent lists, but you can probably ignore that21:55
fungithe end result though is to make pyyaml write files that yamllint can stomach21:56
fungii fond it mildly incoherent that yamllint (written in python) objects to the default output of the most commonly-used python yaml implementation, but i've come to terms with that21:58
fungis/fond/find/21:58
clarkbis pyyaml optimized for wire transfers by default? I seem to recall there may be reasons like that21:58
fungiyeah, could be21:59
fungianyway, feel free to steal that, it's all isc licensed. maybe you want the indented lists too, the _IBSEmitter class is not that complicated to add22:00
clarkbthanks22:00
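A minimal sketch of the pyyaml tuning being discussed: block style output with keys kept in insertion order, which is usually enough to keep yamllint quiet. The data structure is invented, and the indented-list Emitter tweak from fungi's link is left out.

    import yaml

    audit = {"account": 12345, "emails": ["user@example.org"], "active": False}

    # default_flow_style=False forces block style; sort_keys=False (PyYAML >= 5.1)
    # keeps keys in insertion order instead of alphabetizing them.
    print(yaml.dump(audit, default_flow_style=False, sort_keys=False))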
*** klonn has quit IRC22:05
fungii keep meaning to push a pr to pyyaml to make that configurable, but... enotime22:32
*** gothicserpent has quit IRC23:14
*** gothicserpent has joined #opendev23:20
