Friday, 2020-05-15

ianwi'm not sure if we've been updating or deleting dns00:00
clarkbianw: the nameserver playbook was one of those that ran yesterday during the great big enqueuing and it was successful00:03
clarkbI didn't look at the logs though00:03
ianwoh sorry i meant for the old servers, it looks like some we haven't cleaned up00:03
ianwi'll add it to the todos00:03
clarkboh in rax managed dns, got it00:04
ianwi'm also going to merge the ord.rax.openstack removal, as that is also shutdown and currently in emergency00:05
ianwthat will finalise ord+dfw00:06
ianwiad to be switched in when we feel like watching it00:06
ianwthis leaves limestone and linaro-london00:06
ianwclarkb: you have two shutoff test instances in limestone ... do you want them?00:08
clarkbianw: uhm I think they can go I was debugging something there iirc. what are they called?00:09
ianwclarkb-test1 and clarkb-test2 :)00:09
clarkbya those can go00:09
clarkbI think they were debugging a weird neutron thing that never resulted in much00:09
openstackgerritMerged opendev/system-config master: Remove mirror02.dfw.rax.openstack.org  https://review.opendev.org/72789400:13
ianwwe also had an old unattached "afs" volume in there00:14
openstackgerritMerged opendev/system-config master: Remove mirror01.ord.rax.openstack.org  https://review.opendev.org/72789700:20
ianwkevinz: do you have thoughts on the linaro london zone?  will it ever come back, or should we remove it?00:22
openstackgerritIan Wienand proposed opendev/zone-opendev.org master: Add limestone replacement mirror  https://review.opendev.org/72831400:25
openstackgerritMerged opendev/system-config master: Add infra-root-keys-2020-05-13 to rotate older ssh keys  https://review.opendev.org/72786500:29
openstackgerritmelanie witt proposed zuul/zuul-jobs master: Run sphinx-build in parallel for releasenotes  https://review.opendev.org/72747300:35
mordredianw: sorry. wasn't really here, still not really here. it's entirely probable that the -arm64 image in /opt/images on bridge is garbage and should be rebuilt00:38
mordredianw: I built the others using /opt/nodepool_dib/make-focal.sh on nb04.opendev.org and then scp'd them over to bridge00:39
ianwmordred: ok, i've been just using an upstream amd64 image for the latest couple of mirror servers, just to get things going.  we may have to debug glean more on focal, although i don't recall seeing any gate failures00:44
openstackgerritMonty Taylor proposed opendev/base-jobs master: Add jobs for publishing javascript content  https://review.opendev.org/72809700:47
mordredianw: nod. we should probably try a re-build now that you added focal support properly too00:48
smcginnisAnyone know if this new docs job step was just recently added: "tox -e pdf-docs -vv > [file]"00:48
ianwsmcginnis: not much help, i remember some pdf things months ago but nothing recently00:52
clarkbsmcginnis: yes00:52
clarkbchange was landed today because jobs were producing hundreds of megabytes of logs in latex warnings00:53
smcginnisLooks like the normal step ignores errors, but then tries to run again and fails.00:53
smcginnishttps://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_31c/725767/9/check/openstack-tox-docs/31c3adb/job-output.txt00:53
smcginnisI've seen multiple failures in different repos.00:53
openstackgerritMerged opendev/zone-opendev.org master: Add vexxhost opendev.org mirrors  https://review.opendev.org/72830800:53
openstackgerritMerged opendev/zone-opendev.org master: Add limestone replacement mirror  https://review.opendev.org/72831400:53
smcginnisNot every repo does the pdf builds, so it needs to ignore errors.00:53
clarkbsmcginnis: ya https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_31c/725767/9/check/openstack-tox-docs/31c3adb/sphinx-build-pdf.log I think we need the same check that the include role before it has00:54
clarkbI can push a fix soon00:55
smcginnisclarkb: Yeah, looks like it needs the "when" part? https://opendev.org/openstack/openstack-zuul-jobs/commit/edc845f0a6da5e7e278eff7d9cc54a261ee531aa00:55
openstackgerritIan Wienand proposed opendev/system-config master: Add limestone opendev.org server  https://review.opendev.org/72831600:56
openstackgerritIan Wienand proposed openstack/project-config master: Switch limestone server to opendev.org  https://review.opendev.org/72831801:08
openstackgerritIan Wienand proposed opendev/system-config master: Remove limestone openstack.org mirror  https://review.opendev.org/72831901:15
*** ysandeep|sleep is now known as ysandeep01:34
kevinzHi ianw: yes we can remove it. We will only use linaro US in the future02:00
ianwkevinz: ok, thanks, that will simplify things02:02
ianwclarkb: if around, not sure what you mean by cleanup from nb04?  that doesn't have a separate config file now?02:07
ianwi agree we can remove the server if we want02:08
clarkbianw oh I think frickler and I assumed there was still a separate file02:09
clarkbI checked inventory for it (its there) but not project-config02:09
ianwok, i'll stack linaro-london removal on top of it too02:11
openstackgerritIan Wienand proposed openstack/project-config master: Remove linaro-london cloud  https://review.opendev.org/72833202:17
*** icarusfactor has joined #opendev02:25
openstackgerritIan Wienand proposed opendev/system-config master: Remove citycloud  https://review.opendev.org/72790502:26
openstackgerritIan Wienand proposed opendev/system-config master: Remove linaro-london cloud  https://review.opendev.org/72833402:26
*** factor has quit IRC02:26
ianwi believe that is everything.  i'll try and stack it all02:49
*** icarusfactor has quit IRC02:51
*** icarusfactor has joined #opendev02:52
openstackgerritIan Wienand proposed opendev/system-config master: Remove citycloud  https://review.opendev.org/72790503:06
openstackgerritIan Wienand proposed opendev/system-config master: Remove linaro-london cloud  https://review.opendev.org/72833403:06
openstackgerritIan Wienand proposed opendev/system-config master: Add vexxhost opendev.org mirrors  https://review.opendev.org/72830903:06
openstackgerritIan Wienand proposed opendev/system-config master: Add limestone opendev.org server  https://review.opendev.org/72831603:06
openstackgerritIan Wienand proposed opendev/system-config master: Remove vexxhost openstack.org mirrors  https://review.opendev.org/72831103:06
openstackgerritIan Wienand proposed opendev/system-config master: Remove limestone openstack.org mirror  https://review.opendev.org/72831903:06
openstackgerritIan Wienand proposed opendev/system-config master: Remove citycloud  https://review.opendev.org/72790503:12
openstackgerritIan Wienand proposed opendev/system-config master: Remove linaro-london cloud  https://review.opendev.org/72833403:12
openstackgerritIan Wienand proposed opendev/system-config master: Add vexxhost opendev.org mirrors  https://review.opendev.org/72830903:12
openstackgerritIan Wienand proposed opendev/system-config master: Add limestone opendev.org server  https://review.opendev.org/72831603:12
openstackgerritIan Wienand proposed opendev/system-config master: Remove vexxhost openstack.org mirrors  https://review.opendev.org/72831103:12
openstackgerritIan Wienand proposed opendev/system-config master: Remove limestone openstack.org mirror  https://review.opendev.org/72831903:12
openstackgerritIan Wienand proposed opendev/system-config master: Remove iad.rax openstack.org mirror  https://review.opendev.org/72833903:12
openstackgerritIan Wienand proposed openstack/project-config master: Switch RAX IAD mirror to opendev.org version  https://review.opendev.org/72791703:21
openstackgerritIan Wienand proposed openstack/project-config master: Remove citycloud kna1/lon1/sto2 clouds  https://review.opendev.org/72790203:21
openstackgerritIan Wienand proposed openstack/project-config master: Remove Citycloud from grafana  https://review.opendev.org/72795603:21
openstackgerritIan Wienand proposed openstack/project-config master: Remove linaro-london cloud  https://review.opendev.org/72833203:21
openstackgerritIan Wienand proposed openstack/project-config master: Switch vexxhost mirrors to opendev.org  https://review.opendev.org/72831003:21
openstackgerritIan Wienand proposed openstack/project-config master: Switch limestone server to opendev.org  https://review.opendev.org/72831803:21
openstackgerritIan Wienand proposed openstack/project-config master: site-variables: remove opendev.org mirror switch  https://review.opendev.org/72834503:21
openstackgerritIan Wienand proposed opendev/system-config master: Remove puppet mirror support  https://review.opendev.org/72835003:32
*** kevinz has quit IRC03:37
openstackgerritIan Wienand proposed opendev/system-config master: Remove puppet mirror support  https://review.opendev.org/72835003:40
ianwmy new work entitled "how to remove openstack.org mirrors in 15 easy steps"03:42
*** factor has joined #opendev03:45
*** kevinz has joined #opendev03:46
*** icarusfactor has quit IRC03:47
openstackgerritIan Wienand proposed opendev/system-config master: Remove linaro-london cloud  https://review.opendev.org/72833403:49
openstackgerritIan Wienand proposed opendev/system-config master: Add vexxhost opendev.org mirrors  https://review.opendev.org/72830903:49
openstackgerritIan Wienand proposed opendev/system-config master: Add limestone opendev.org server  https://review.opendev.org/72831603:49
openstackgerritIan Wienand proposed opendev/system-config master: Remove vexxhost openstack.org mirrors  https://review.opendev.org/72831103:49
openstackgerritIan Wienand proposed opendev/system-config master: Remove limestone openstack.org mirror  https://review.opendev.org/72831903:49
openstackgerritIan Wienand proposed opendev/system-config master: Remove puppet mirror support  https://review.opendev.org/72835003:49
*** factor has quit IRC03:51
*** factor has joined #opendev03:52
*** ykarel|away is now known as ykarel03:54
*** ysandeep is now known as ysandeep|afk04:51
ianwclarkb: so, the tl;dr for the mirror work is start @ https://review.opendev.org/#/c/728339/ and follow the depends-on and changes on top of that05:07
ianwit alternates between removing things from project-config and then system-config ... but it should all unravel.  probably need to wait between deployments for some of it05:07
ianwthe "did read" version is at https://etherpad.opendev.org/p/openstack.org-mirror-be-gone05:08
*** DSpider has joined #opendev05:16
openstackgerritIan Wienand proposed openstack/diskimage-builder master: ubuntu-minimal : only install 16.04 HWE kernel on xenial  https://review.opendev.org/72699605:39
openstackgerritIan Wienand proposed openstack/diskimage-builder master: ubuntu-minimal: Add Ubuntu Focal test build  https://review.opendev.org/72575205:39
*** diablo_rojo has quit IRC05:54
*** dpawlik has joined #opendev05:58
*** ysandeep|afk is now known as ysandeep06:03
fricklerinfra-root: the ubuntu mirror still seems to be 7d old, it is also quite close to its quota, not sure if that might be related06:29
openstackgerritMerged openstack/project-config master: Switch RAX IAD mirror to opendev.org version  https://review.opendev.org/72791706:32
fricklerre-running the vos release in the screen on afs01.dfw06:35
*** ykarel is now known as ykarel|afk06:49
fricklero.k., that finished pretty fast and now /afs/openstack.org/mirror/ubuntu/timestamp.txt has the same May 13 date as the .openstack.org version06:53
fricklerstill waiting for someone to confirm and check the quota situation before dropping the lock on mirror-update06:53
ianwfrickler: it looks like 40-50gb free; i don't oppose upping it if you like but probably not related to any issues i'd say?07:03
*** rpittau|afk is now known as rpittau07:12
*** DSpider has quit IRC07:22
*** DSpider has joined #opendev07:27
*** tosky has joined #opendev07:35
*** ykarel|afk is now known as ykarel07:57
*** moppy has quit IRC08:01
*** moppy has joined #opendev08:01
*** slaweq has joined #opendev08:04
*** tkajinam has quit IRC08:14
*** dtantsur|afk is now known as dtantsur08:17
*** ykarel is now known as ykarel|lunch08:27
openstackgerritAndreas Jaeger proposed zuul/zuul-jobs master: DNM: Debug sibling install  https://review.opendev.org/72838408:44
*** ysandeep is now known as ysandeep|lunch09:01
*** ykarel|lunch is now known as ykarel09:07
frickler#status log vos release for mirror.ubuntu completed successfully, dropped the lock on mirror-update to resume normal operations09:14
openstackstatusfrickler: finished logging09:14
fricklerfungi: ^^ I left your screens in place in case you want to crosscheck, feel free to clean up09:15
*** ysandeep|lunch is now known as ysandeep09:39
*** rpittau is now known as rpittau|bbl10:09
*** dpawlik has quit IRC10:27
*** dpawlik has joined #opendev10:28
openstackgerritSorin Sbarnea (zbr) proposed zuul/zuul-jobs master: bindep: Add missing virtualenv and fixed repo install  https://review.opendev.org/69363710:31
openstackgerritMerged openstack/project-config master: Remove citycloud kna1/lon1/sto2 clouds  https://review.opendev.org/72790212:05
*** rpittau|bbl is now known as rpittau12:06
openstackgerritMerged openstack/project-config master: Remove Citycloud from grafana  https://review.opendev.org/72795612:07
openstackgerritMerged openstack/project-config master: Remove linaro-london cloud  https://review.opendev.org/72833212:07
*** sshnaidm|afk is now known as sshnaidm|off12:07
*** ysandeep is now known as ysandeep|afk12:16
openstackgerritTristan Cacqueray proposed zuul/zuul-jobs master: Add remove-zuul-sshkey  https://review.opendev.org/68071212:24
*** ysandeep|afk is now known as ysandeep12:29
*** panda|out is now known as panda12:30
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Fix broken tox-siblings.yaml test  https://review.opendev.org/72843812:55
*** ykarel is now known as ykarel|afk12:56
openstackgerritMonty Taylor proposed opendev/base-jobs master: Add jobs for publishing javascript content  https://review.opendev.org/72809713:00
mordredAJaeger: how does that look now? ^^13:06
mordredAJaeger: I updated https://review.opendev.org/#/c/726554/ to use it13:06
mordred(of course config project, so that patch doesn't work yet)13:06
openstackgerritSorin Sbarnea (zbr) proposed zuul/zuul-jobs master: tox: allow tox to be upgraded  https://review.opendev.org/69005713:13
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Fix broken tox-siblings.yaml test  https://review.opendev.org/72843813:32
openstackgerritMerged zuul/zuul-jobs master: Add remove-zuul-sshkey  https://review.opendev.org/68071213:33
fungifrickler: awesome, thanks for following up!13:46
fungiit was at the top of my to do list for when i woke up, so that's a nice start to my morning ;)13:47
*** kevinz has quit IRC13:50
*** mlavalle has joined #opendev13:59
*** icarusfactor has joined #opendev14:02
*** factor has quit IRC14:05
*** ykarel|afk is now known as ykarel14:10
*** avass has quit IRC14:16
openstackgerritMerged zuul/zuul-jobs master: Fix broken tox-siblings.yaml test  https://review.opendev.org/72843814:19
*** kevinz has joined #opendev14:28
donnydis there any way to run dib-lint without checking the lib dir14:44
donnydI am trying to write standalone elements that are tracked in separate repos, and I want to just lint element14:44
donnydI don't see a way to do that in here https://opendev.org/openstack/diskimage-builder/src/tag/2.36.0/bin/dib-lint14:45
fungiyeah, not sure, you might ask in #openstack-dib14:52
donnydoh, didn't know that was a thing14:52
donnydthanks fungi14:52
fungiyw14:52
AJaegerinfra-rot, https://zuul.opendev.org/ is timing out for me14:57
AJaegerinfra-root, I mean ^14:58
mordredwfm14:59
AJaegernow as well for me15:00
AJaegermight have been a longer network problem15:00
fungiyeah, a cursory check of the server yields nothing surprising15:06
AJaegerI now have other network problems as well, so might be on my end - thanks for checking15:11
fungino problem, i'd rather catch problems early if we have any15:16
*** ykarel is now known as ykarel|away15:16
*** avass has joined #opendev15:18
clarkb++15:32
AJaeger:)15:34
clarkb728286 merged. That would be a good one to restart the zuul scheduler on. I think our zuul-web memory canary is still happy too15:34
*** ysandeep is now known as ysandeep|away15:36
*** dpawlik has quit IRC15:41
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Don't require tox_envlist  https://review.opendev.org/72682915:44
clarkbmordred: we might need to remove the +A from https://review.opendev.org/#/c/728310/ and https://review.opendev.org/#/c/728318/2 because I'm not sure the depends on is sufficient to ensure we have a running new server prior to updating site-variables?15:47
clarkbmordred: and maybe you can double check my comment on https://review.opendev.org/#/c/728334/415:51
openstackgerritJeremy Stanley proposed opendev/jeepyb master: Update OpenDev Manual URL in new contributor intro  https://review.opendev.org/72847915:53
openstackgerritMerged opendev/system-config master: Remove iad.rax openstack.org mirror  https://review.opendev.org/72833916:10
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Don't require tox_envlist  https://review.opendev.org/72682916:12
openstackgerritMerged opendev/system-config master: Remove citycloud  https://review.opendev.org/72790516:18
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Remove unecessary environment from tox_siblings test  https://review.opendev.org/72849416:21
openstackgerritJeremy Stanley proposed opendev/jeepyb master: Update OpenDev Manual URL in new contributor intro  https://review.opendev.org/72847916:29
*** cmurphy is now known as cmorpheus16:33
openstackgerritMerged openstack/project-config master: Finish retiring x/pbrx  https://review.opendev.org/72646316:35
clarkbinfra-root I think I'm in a good spot to restart the zuul scheduler and mergers (and executors and web I guess if we do the whole thing). Looking at running jobs it looks like tripleo may have a whole stack of things merging in the next half hour so maybe plan for ~17:15 ish restart?16:36
clarkbI think my change to add a start.yaml for executors landed so we can use the big restart playbook16:37
clarkbthis will get us the "don't reload configs for tags" and "fix config errors when projects are used across tenants in exciting ways" bugfixes16:37
clarkbas well as python3 usage for scheduler and mergers16:37
clarkb*python3.716:40
mordredclarkb: yes - I agree with your comment on 72833416:43
corvusclarkb: sounds good; i'm back from running an errand16:46
fungii've also got my lunchtime chores out of the way and will be on hand for a while16:48
openstackgerritOleksandr Kozachenko proposed zuul/zuul-jobs master: Add DaemonSet check for wait-for-pods role  https://review.opendev.org/72850316:52
openstackgerritJames E. Blair proposed opendev/system-config master: Run Zuul, Nodepool, and Zookeeper as the "container" user  https://review.opendev.org/72695817:04
*** mlavalle has quit IRC17:05
clarkbzuul says its ~5 minutes until the end of that stack. I'll start a root screen on bridge17:05
*** mlavalle has joined #opendev17:06
clarkbI've got the command queued up there ready to go if it looks good (currently prefixed by # so it won't run early)17:07
clarkbmaybe someone else can capture and restore queues?17:07
corvusclarkb: sure17:07
clarkbcorvus: thanks. We are just waiting on tripleo changes to merge (or fail) now17:08
*** dtantsur is now known as dtantsur|afk17:08
openstackgerritJames E. Blair proposed opendev/system-config master: Add iptables_extra_allowed_groups  https://review.opendev.org/72647517:10
clarkbinfra-root we've also got a few deploy things queued up. I'm not sure how important it is to wait for those17:11
clarkbtripleo gate just cleared out17:11
corvusclarkb: they will get re-enqueued17:11
*** rpittau is now known as rpittau|afk17:11
clarkbcorvus: let me know when you are ready for me to run the ansible-playbook command in the bridge screen17:11
corvusclarkb: and most things should be idempotent, so i think the main risk would be if we killed it while it was in the middle of something important.  i say we yolo it's friday.17:11
openstackgerritOleksandr Kozachenko proposed zuul/zuul-jobs master: Add DaemonSet check for wait-for-pods role  https://review.opendev.org/72850317:11
clarkbcorvus: rgr17:12
corvusclarkb: i have saved queues; clear to go17:12
clarkbcorvus: ok proceeding in bridge screen now17:12
clarkbwe are in the wait for executors to stop phase17:13
clarkbeverything else seems to have stopped17:13
clarkbzuul process count on ze01 is slowly falling17:15
clarkb(still waiting but numbers look to be dropping, just need patience)17:19
clarkb5 executors have stopped17:22
clarkbnow waiting on only ze1017:23
clarkbplaybook is done; it claims to have started things17:24
clarkbI see the expected processes running17:24
Open10K8SHi team17:24
Open10K8SZuul is not responding17:24
Open10K8S503 service unavailable17:24
corvuscat jobs are away17:25
clarkbscheduler has submitted merge jobs as expected17:25
clarkbOpen10K8S: we are doing a global restart to pick up a number of changes that got queued up behind openstack release17:25
clarkbOpen10K8S: it should be up in a few more minutes17:25
Open10K8Sclarkb: ok, got it.17:26
Open10K8SScheduled downtime is not a bad thing :)17:26
clarkbwell semi scheduled. Its friday and we had a good window :)17:26
Open10K8Sclarkb: Just want to check if it is an unscheduled one :)17:26
Open10K8Sclarkb: Just notify  to check if it is an unscheduled one :)17:26
clarkb++ and thanks17:26
Open10K8Sclarkb: pleasure17:26
clarkbcorvus: looks like scheduler is up now17:27
corvusre-enqueuing17:27
clarkbthats good, now we've tested the new version of the zuul restart playbook17:28
clarkbwhen you're happy with it I'll status log what this means for the running zuul17:28
corvusclarkb: re-enqueue is still going, but i think we see enough jobs actually running we can call it good17:29
clarkbgreat17:29
fungiyeah, lgtm17:29
mordredOpen10K8S: people sometimes ask us "what sort of monitoring system do you use" and the answer is "our users" ;)17:29
corvusit's made out of people17:30
Open10K8Smordred: loll, yeah17:30
clarkb#status log Restarted All of Zuul on version: 3.18.1.dev166 319bbacf. This has scheduler, web, and mergers running under python3.7. We have also incorporated bug fixes for config loading (handle errors across tenants and don't load from tags) as well as improvements to the merger around resetting repos and setting HEAD.17:31
openstackstatusclarkb: finished logging17:31
AJaegermordred: what shall we do with openstackci in https://review.opendev.org/720892 ? That one is still used.17:31
AJaegermordred: apparently for nb0317:31
clarkbAJaeger: mordred I think we can keep openstackci testing for now and do cleanup for other bits, then when nb03 is dockered we can clean it and openstackci up?17:34
mordredyeah17:34
fungias a side note, our zuul scheduler vm has been up 2 years 4 months since its last reboot17:35
mordredfungi: wow17:35
fungiwho says clouds aren't stable?17:35
mordredfungi: "they"17:35
clarkbinfra-root any reason to keep the bridge screen running? the zuul_restart.yaml playbook ran with no errors17:35
fungiclarkb: i know of none17:36
mordredclarkb, fungi: if you've got a sec, https://review.opendev.org/#/c/728097/ - needed for https://review.opendev.org/#/c/717371/17:36
mordredclarkb: nope17:36
corvusclarkb: nope.  and reenqueue is done.17:36
clarkbcool I'll exit out of it now17:36
clarkbOpen10K8S: now is a good time to recheck your change if it isn't queued already17:37
clarkb(zuul may have missed the latest patchset events while it was off)17:37
openstackgerritJames E. Blair proposed openstack/project-config master: Use serial manager for deploy pipeline  https://review.opendev.org/72853217:38
corvusclarkb, mordred, fungi: ^  that's available to us now17:39
corvusi think we can drop the mutex after that17:39
clarkbcorvus: we also run hourly and daily jobs via periodic pipelines. Maybe we need to serialize those too? (I think we may need our own daily pipeline for that too)17:39
mordredclarkb: \o/17:40
clarkbmordred: looking at the js thing now17:40
mordredclarkb: thanks17:40
Open10K8Sclarkb: ok17:40
corvusclarkb: i don't think they need serial; if anything, i'd say supercedent17:40
corvusbut not sure it'd work there17:40
fungialso we still have a toctou sort of bug with delayed execution in periodic jobs, right?17:41
fungiracing with the deploy pipeline too, i'd wager17:42
clarkboh and we'd need the mutex to handle cross pipeline jobs17:42
clarkbya17:42
clarkbso we may still need the mutex?17:42
fungior did we fix that by not using the zuul ref in periodic?17:42
corvusi think the decision was made to not use the zuul ref; unsure if it has been implemented17:43
clarkboh right17:44
clarkband ya not sure if we implemented it yet. I seem to recall reviewing a change mordred wrote around it though17:44
Open10K8Sclarkb: seems likely queued/waiting status is keeping anyway17:45
mordredgod, did I write that?17:46
mordredyes!17:48
mordredinfra_prod_run_from_master17:48
mordredis the key17:48
clarkbmordred: looking at the js change I feel like we've already got what we need for that but we call it "docs" or similar?17:51
clarkbmordred: we aren't doing anything js specific there, we're just saying pull tarballs from these builds and extract them to $location17:52
Open10K8Shttps://zuul.opendev.org/t/zuul/status17:52
Open10K8Squeuing for 18mins :(17:52
clarkbOpen10K8S: all the jobs had to restart so we'll probably be at capacity for a bit while we catch back up again17:52
clarkbmordred: AJaeger: that makes me wonder if we shouldn't just have a "extract-tarball-to-afs" job17:52
AJaegeror role ;)17:53
AJaegerclarkb: good spotting17:53
mordredclarkb: yeah - I agree17:54
clarkbanyway I +2'd the change but didn't approve in case that dedup was something others thought worthwhile17:54
clarkb(left what I said above on the change too)17:54
mordredclarkb: I think it can be done as a followup - extract an extract-tarball-to-afs job and then make the docs and javascript jobs child jobs with some vars set17:55
clarkbmordred: k. in that case feel free to approve17:55
mordredclarkb: kk. I'm about to be afk for a bit, so I'll hold off just so I can poke at things17:56
clarkbya I'm gonna keep an eye on zuul for a bit longer then try and get a bike ride in before the ussuri celebration thing later today17:58
clarkbI even ordered beer that should be arriving before that call :)17:58
corvusclarkb: when is that?17:58
clarkbcorvus: 2000UTC (1pm pacific)17:59
corvusthx17:59
clarkbcorvus: I think meetpad is the planned venue but I don't know the specific url17:59
*** slaweq has quit IRC18:12
openstackgerritMerged zuul/zuul-jobs master: bindep: use virtualenv_command from ensure-pip  https://review.opendev.org/72756118:16
*** iurygregory has quit IRC18:35
fungiclarkb: corvus: according to http://lists.openstack.org/pipermail/openstack-discuss/2020-May/014781.html the room name is virtual-ussuri-celebration18:37
*** iurygregory has joined #opendev18:38
fungii also have a beer i'm preparing to crack open in ~1.5 hours18:38
mnaserinfra-root: did nodepool get a restart too?18:39
clarkbmnaser: no18:39
clarkbI'm gonna pop out on that bike ride now18:39
mnaserhmm, ok, i am seeing a lot of odd queries on sjc1 gathering lists of all vms in 86bbbcfa8ad043109d2d7af530225c72 (aka openstackjenkins)18:40
mnaserno nodepool changes seem to stick out18:40
fungithe launchers do periodically query the full list of instances to look for leaked nodes with their metadata set but which they've otherwise lost track of18:44
*** iurygregory has quit IRC18:45
openstackgerritMerged openstack/project-config master: Use serial manager for deploy pipeline  https://review.opendev.org/72853218:46
*** icarusfactor has quit IRC18:47
mnaseri do wonder how often they're listing all instances though19:00
mnasercause "pretty damn often" seems to be from how it's hitting the api (and i just kinda ran into a slight nova overoptimization that's hurting it)19:00
fungiwhat frequency of those queries are you seeing? curious if it matches our cleanup frequency19:03
fungii think that's tunable on our side but will need to dig into it19:03
mnaserfungi: i haven't looked but the db seems to be getting hit a lot19:04
fungii thought we only queried it every 10 minutes19:07
fungilooking19:07
mnaserfungi: i think they may be like accumulating and then retrying?19:08
mnaserlike even when i kill it i instantly see like over 20+ queries/attempts19:09
mnaserunless thats nova retrying19:09
fungilooks like the nodepool.launcher.NodePool class hard-codes its cleanup_interval to 60 (seconds) and i can't seem to find any configuration option to adjust that (even globally, much less per provider): https://opendev.org/zuul/nodepool/src/branch/master/nodepool/launcher.py#L87419:16
mnaseryeah but i think another reason here is the way that quota is being calculated19:17
mnaserquota checks did those queries and the big burst of nodepool provisions hit the cloud hard i think19:20
fungimnaser: i do see some tracebacks in the launcher logs on nl03 about cleanupworker getting "Server Error for url: https://compute-sjc1.vexxhost.us/v2.1/os-volumes_boot, No server is available to handle this request.: 503 Service Unavailable"19:31
fungithe cleanup_interval seems to be the wait time between completion of the previous pulse and the start of the next, so taking pulse runtime into account it looks like it's waking up roughly every two minutes19:33
fungion nl03 anyway19:33
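
The timing fungi describes can be sketched in a few lines of Python. This is only an illustration, not nodepool code: `wakeup_times` is a hypothetical helper, the 60 second `cleanup_interval` is the value fungi quotes from launcher.py, and the 60 second pulse runtime is an assumed figure picked to match the "roughly every two minutes" observation on nl03.

```python
def wakeup_times(cleanup_interval, pulse_runtime, count):
    """Return the start times (in seconds) of successive cleanup pulses.

    The interval is counted from the END of one pulse to the START of the
    next, so the effective period is pulse_runtime + cleanup_interval.
    """
    starts = []
    t = 0.0
    for _ in range(count):
        starts.append(t)
        # The pulse runs, then the worker sleeps for cleanup_interval.
        t += pulse_runtime + cleanup_interval
    return starts


# cleanup_interval=60 comes from the log; pulse_runtime=60 is an assumption.
print(wakeup_times(cleanup_interval=60, pulse_runtime=60, count=4))
```

With those numbers the pulses start 120 seconds apart, which is why a 60 second setting can still show up as a wake-up about every two minutes.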
fungiit's relatively infrequent, but happened at 17:49, 18:28, 18:35, 18:45, 18:5419:35
fungithose are the only occurrences in the past 6 hours though19:35
fungiof the http 503 errors i mean19:36
corvusmnaser: i can turn on more debugging if it would help to know when we're sending what api queries19:37
mnasercorvus: i think we're able to troubleshoot, i think it might be something in the nova quota code, we're debugging in #openstack-nova :) thank you tho!19:37
corvusmnaser: ack.  this anomaly is interesting: http://grafana.openstack.org/d/nuvIH5Imk/nodepool-vexxhost?orgId=119:38
mnasercorvus: we found out that a certain query wasn't optimized that could result in >16s per server boot, i assume post-zuul reboot, we got a big surge of new vm requests, 16s each really started straining the db19:38
corvusmnaser: yeah, i think that lines up with the graph19:39
openstackgerritSorin Sbarnea (zbr) proposed zuul/zuul-jobs master: Bump ansible-lint to 4.3.0  https://review.opendev.org/70267919:39
*** noonedeadpunk has quit IRC19:42
*** noonedeadpunk has joined #opendev19:42
corvusyep this is not a valid iptables rule: https://zuul.opendev.org/t/openstack/build/ec6dea2a44a04fd0aa03c287be6e59da/log/zk01.opendev.org/rules.v4.txt#1819:44
corvusmaybe that's running into mordred's "sometimes hostvars don't seem to be set" thing....?19:45
corvusoh!19:46
corvusour fake inventory in gate jobs doesn't have "public_ipv4" and "public_ipv6" set19:46
corvusonly "ansible_host"19:47
corvushrm, we need to map nodepool.public_ipv4 to public_ipv4 when we rewrite the inventory19:51
corvusactually, nodepool.public_ipv4 -> public_v419:51
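
A rough Python sketch of the mapping corvus describes, for illustration only. `add_public_addresses` is a hypothetical helper and not the actual inventory rewrite in system-config; the `nodepool.public_ipv4 -> public_v4` mapping is from the log, while the matching `public_ipv6 -> public_v6` mapping and the dict layout are assumptions.

```python
def add_public_addresses(hostvars):
    """Return a copy of one host's vars with public_v4/public_v6 filled in.

    Values are copied from the host's "nodepool" dict only when the target
    key is not already set, so explicit inventory values win.
    """
    out = dict(hostvars)
    nodepool = hostvars.get("nodepool", {})
    # The v4 mapping is from the discussion above; the v6 one is assumed.
    mapping = (("public_ipv4", "public_v4"), ("public_ipv6", "public_v6"))
    for src, dst in mapping:
        if nodepool.get(src) and dst not in out:
            out[dst] = nodepool[src]
    return out


# Example host as a gate job might see it (addresses are documentation ranges).
host = {
    "ansible_host": "203.0.113.10",
    "nodepool": {"public_ipv4": "203.0.113.10", "public_ipv6": "2001:db8::10"},
}
print(add_public_addresses(host))
```

The original dict is left untouched, so the same function can be applied host by host while the fake inventory is being rewritten.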
corvusparty time?20:01
fungiparty time!20:01
clarkbyup I've just gotten back to my desk too20:02
fungiwe're up to 13 in the room20:02
fungiup to 15 now, though audio is getting choppy for me20:04
fungi17 people on now20:05
fungiis audio inaudibly choppy for anyone else, or is it just my connection?20:06
clarkbits been mostly ok for me20:08
openstackgerritOleksandr Kozachenko proposed zuul/zuul-jobs master: Add DaemonSet check for wait-for-pods role  https://review.opendev.org/72850320:08
clarkbdepends on the person. diablo_rojo_phon and bnemec and tim are coming through ok20:08
clarkbttx was harder to understand20:08
fungii think it's me, my load average is up to 10 now, tons of chromium worker processes20:08
fungithis is a 4-core 1.6ghz intel atom x7 so maybe i just need to use a beefier machine20:10
clarkbya my 4 core 7 year old intel cpu is spinning a lot of cycles too20:11
clarkbhas my cpu temp up to 76C20:12
clarkbactually it may be firefox that is slow20:16
clarkbI started in FF then switched to chrome after I remembered it is better20:16
openstackgerritOleksandr Kozachenko proposed zuul/zuul-jobs master: Add DaemonSet check for wait-for-pods role  https://review.opendev.org/72850320:20
fungiwe peaked at 22 participants i think?20:23
clarkbmy last count was 21 so that is probably correct20:23
fungiyeah, i saw 22 on for a bit20:23
clarkbmnaser: fungi corvus is there a nodepool thing we still need to track down?20:23
fungiclarkb: sounds like we maybe thundering herded an inefficient nova operation in vex20:23
clarkbah ok so related to the service restart20:24
fungiyeah, so it seemed given the timing20:25
fungior at least the bulk reenqueue immediately following that restart20:25
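A rough back-of-the-envelope sketch of why a burst of boot requests hurts when each one costs about 16 seconds, as mnaser reported earlier. The request count and worker count used in the usage note are hypothetical; only the 16 second figure comes from the channel, and the model (each request holds one worker for the full duration) is a simplifying assumption rather than a description of how nova behaves.

```python
import math


def drain_time_seconds(requests, seconds_per_request, workers):
    """Estimate the time to work through a burst of requests.

    Assumes every request occupies one worker for seconds_per_request
    and that workers process requests in full batches.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return math.ceil(requests / workers) * seconds_per_request
```

For example, `drain_time_seconds(100, 16, 4)` gives 400 seconds for a hypothetical surge of 100 requests across 4 workers, which is the kind of backlog a bulk reenqueue after a restart could plausibly create.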
clarkbtaking ttx's mobile app suggestion we might send that out to the jitsi thread on openstack-discuss too20:26
clarkb"if your cpu melts down, try your phone or tablet"20:26
fungii wonder if my purism5 will support the mobile app, when it eventually ships that is20:27
fungier, librem520:27
corvusthere is a technical reason chromium is more efficient, btw20:27
corvusthat's probably worth mentioning too (it sends less video data in some circumstances)20:28
clarkbcorvus: ah ya I already suggested it over firefox in my initial response but details on that would be great20:28
fungidirect webrtc implementation in the browser engine?20:28
clarkbcorvus: my security keys just got dropped off20:29
clarkbcorvus: I'll have to fiddle with them over the weekend to see how they do20:29
corvusfungi: they both do, but they use a different method for something something sending video of different rates something20:29
corvusfungi: the way it got lodged in my head is something like "firefox sends multiple streams at different bitrates simultaneously while chromium only sends one"20:30
fungigot it, technical stuff ;)20:30
corvusfungi: yeah, i think they just need to reverse the polarity, or decouple the heisenberg compensators20:31
fungimoar antichronotons20:31
fungior antitachyons, i never can remember which is better20:31
corvusjust don't cross the streams20:32
clarkbzuul memory use continues to look good so I expect python3.7 isn't drastically different than 3.8 with respect to memory consumption20:34
clarkbwe should add meetpad to cacti20:37
clarkbmemory use looks good, its definitely using its cpus though20:37
corvusload avg is currently 4-520:38
clarkbI expect if we need to scale it up cpu (and maybe bw) will be the thing to focus on20:38
clarkbcorvus: ya its an 8cpu host so that should be well within its limits20:38
corvuslooks like that's the jvb process20:38
fungithat's better than the load average for chromium on my machine with only the meetpad page open ;)20:38
corvusthat's using most of the cpu20:38
fungi(still hovering at a load of 10)20:38
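The comparison being made here can be sketched as load average divided by CPU count, using the figures quoted in the channel (a load of 4-5 on the 8-cpu meetpad host, a load of 10 on fungi's 4-core machine). The helper name `load_per_cpu` is hypothetical, and treating a value above 1.0 as "oversubscribed" is a common rule of thumb rather than anything stated in the discussion.

```python
def load_per_cpu(load_average, cpu_count):
    """Return the load average normalised by the number of CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count
```

On a Unix host the inputs could come from `os.getloadavg()` and `os.cpu_count()`. With the numbers above, the server sits at roughly 0.6 per CPU while the 4-core client is at 2.5.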
clarkbI think there are also ways to run maybe jitsi's behind an haproxy20:39
clarkbbut I haven't looked into that too closely yet (but that might be an answer to scaling up if necessary)20:39
corvusfrickler pointed to some info about scaling20:40
clarkbs/maybe/many/20:40
clarkbfwiw it seems to be doing well with a single call this size.20:40
corvusand based on this, it seems that the jvb component is the one to focus on20:40
clarkb++20:40
corvus(i think that's the one doing video processing, and the rest is just passing xmpp messages around)20:41
clarkbya jvb == jitsi video bridge20:41
*** iurygregory has joined #opendev20:42
fungiclarkb: and yeah, sensors(1) says my internal processor core thermocouples are at 79c20:47
openstackgerritClark Boylan proposed opendev/system-config master: Add meetpad to cacti  https://review.opendev.org/72857420:48
clarkbfungi: corvus ^ there is meetpad in cacti change20:48
fungithanks!20:48
clarkbfungi: mine has fallen to 73C20:48
fungiclarkb: related monitoring request in a comment on that review20:50
fungiif you don't mind20:50
fungior i can submit a followup change20:50
clarkblooking20:50
clarkboh ya let me do that20:50
openstackgerritClark Boylan proposed opendev/system-config master: Add meetpad to cacti and ssl certcheck  https://review.opendev.org/72857420:51
clarkbfungi: ^20:52
fungithanks! we run it from the same server, so that reminded me to check20:52
fungiclarkb: on a related note, how about 72738920:56
clarkb+2 I can't recall if ianw's set of changes updates that file or ssl domains either20:57
clarkbbut we should do an accounting of the mirrors as part of that20:57
fungimeetpad held up really well21:01
clarkbcorvus: another takeaway from jvb == load is we might be able to get away with bigger calls/more calls/etc via turning off webcams21:02
fungigranted that was probably half the number of people who would normally be in nova's ptg room, but still as a moderate load test of a single room it worked well21:02
*** diablo_rojo has joined #opendev21:03
diablo_rojoSo that seemed to go pretty well.21:03
diablo_rojoI did see an email from arkady that he couldnt get in though?21:03
fungi"Meetpad did not worked for me"21:04
fungithat's... not very actionable21:04
diablo_rojoRight.21:04
diablo_rojoI asked him if he had errors or the browser.. really any extra detail would help..21:04
clarkbya anyone want to suggest the mobile app now?21:05
clarkbI think that might be a cheat mode for people21:05
openstackgerritSorin Sbarnea (zbr) proposed zuul/zuul-jobs master: Bump ansible-lint to 4.3.0  https://review.opendev.org/70267921:06
clarkbI don't actually know how to connect it to our server though /me looks21:06
clarkbits an option under settings. That just makes so much sense :)21:07
* clarkb composes response to existing thread on openstack-discuss21:07
fungii will eventually have a smart phone which might be capable of using the mobile app, i really have no idea21:07
fungisince the phone runs a modified debian distro and all mainline kernel, i suspect it won't21:08
clarkbfungi: I think you'd need an android emulator (which does exist)21:08
clarkbunsure of how well the existing emulators would handle jitsi though21:08
fungiahh, okay21:08
fungii suspected it might be android-specific in some way21:09
clarkbok response sent21:12
clarkb(not to arkady, to the other thread)21:13
fungilooks like purism's recent shipping update is projecting mid-august for my preorder fulfilment21:13
clarkbfungi: https://anbox.io/ thats one way of doing it21:14
fungineat, thanks!21:14
clarkbin fact you/I/we could try that on our desktops21:15
clarkbthat might be a good workaround for cpu issues if it works21:16
fungitrue!21:16
*** paladox has quit IRC21:28
*** paladox has joined #opendev21:30
openstackgerritMerged opendev/system-config master: Add meetpad to cacti and ssl certcheck  https://review.opendev.org/72857421:36
*** slittle1 has joined #opendev22:13
openstackgerritMerged opendev/base-jobs master: Add jobs for publishing javascript content  https://review.opendev.org/72809722:41
*** DSpider has quit IRC22:53
*** yuri has joined #opendev22:56
*** ysandeep|away is now known as ysandeep23:04
*** ysandeep is now known as ysandeep|weekend23:09
*** mlavalle has quit IRC23:31
*** diablo_rojo has quit IRC23:34
openstackgerritMatthew Thode proposed openstack/project-config master: drop python2.7 from generate-constraints  https://review.opendev.org/72859123:37
*** tosky has quit IRC23:39
