Thursday, 2020-05-14

openstackgerritIan Wienand proposed openstack/project-config master: Switch nodepool builders to opendev.org mirror  https://review.opendev.org/69075700:03
*** mlavalle has quit IRC00:04
openstackgerritIan Wienand proposed opendev/system-config master: Remove mirror02.dfw.rax.openstack.org  https://review.opendev.org/72789400:04
openstackgerritMerged openstack/project-config master: Switch nodepool builders to opendev.org mirror  https://review.opendev.org/69075700:20
ianwmirror01.ord.rax.openstack.org has nothing but bot hits, and the opendev.org mirror is switched in, ergo we can remove that.  i'll shut it down first now00:26
*** ysandeep|away is now known as ysandeep00:28
openstackgerritIan Wienand proposed opendev/system-config master: Remove mirror01.ord.rax.openstack.org  https://review.opendev.org/72789700:30
ianwthat leaves the IAD mirror.  the opendev.org mirror there was being used as the KAFS testing host.  i think we should rebuild that as a focal host and switch it in, then we can remove the openstack.org one00:30
ianwi've deleted that server and will rebuild it now00:37
ianwwe have plenty of notes on the kafs stuff, and gate testing changes (that need remerge due to the significant recent changes, but it's all there)00:37
openstackgerritIan Wienand proposed opendev/zone-opendev.org master: Update mirror.iad.rax.opendev.org address  https://review.opendev.org/72789900:49
openstackgerritIan Wienand proposed opendev/system-config master: Replace mirror.iad.rax.opendev.org host  https://review.opendev.org/72790101:14
openstackgerritMerged opendev/zone-opendev.org master: Update mirror.iad.rax.opendev.org address  https://review.opendev.org/72789901:17
ianwthe builders no longer reference it so i've shut down mirror02.dfw.rax.openstack.org; we can remove it after system-config merge01:41
ianwclarkb: what is the deal with citycloud kna1/lon1/sto2 ? "# Disabled until 2018-07-01 at request of citycloud."01:44
clarkbianw: we turned it off due to capacity issues. at the time they said they hoped to turn it back on a few months later01:48
clarkbat this point we may just want to retire those providers and can add them back again if they are able01:48
openstackgerritMerged opendev/system-config master: Replace mirror.iad.rax.opendev.org host  https://review.opendev.org/72790101:52
openstackgerritIan Wienand proposed openstack/project-config master: Remove citycloud kna1/lon1/sto2 clouds  https://review.opendev.org/72790201:56
ianw 02:10:29 up 854 days ... not a bad effort02:10
openstackgerritIan Wienand proposed opendev/system-config master: Remove citycloud  https://review.opendev.org/72790502:12
ianwboo ... infra-prod-base failed02:19
ianwRun specified playbook on bridge.o.o and redirect output ... had a non-zero exit, but it's not really clear why02:22
ianwthere are no failed tasks02:22
ianwbut there are unreachable hosts02:25
clarkbianw you found the logs on bridge right?02:26
clarkbwe don't publish in zuul for most things yet as they need review of output02:26
ianwyep, there were no failed tasks in it02:26
ianwansible gives !0 exit with unreachable hosts ... according to https://github.com/ansible/ansible/issues/19720 02:27
openstackgerritIan Wienand proposed opendev/system-config master: production-playbook : ignore unreachable hosts  https://review.opendev.org/72790702:37
ianwi think this was my own fault for downing the mirror server before adding to emergency02:41
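An illustrative aside on the failure discussed above: the run had no failed tasks, only unreachable hosts. The sketch below assumes the usual `PLAY RECAP` layout of ansible-playbook output (`host : ok=N changed=N unreachable=N failed=N`) and separates the two cases. The host names and the `classify_recap` helper are hypothetical and are not taken from the actual infra-prod-base logs.

```python
import re

# Matches one host line of an ansible-playbook "PLAY RECAP" section, e.g.
# "mirror01 : ok=3 changed=1 unreachable=0 failed=0"
RECAP_LINE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$")


def classify_recap(recap_text):
    """Return (unreachable_hosts, failed_hosts) from PLAY RECAP text."""
    unreachable, failed = [], []
    for line in recap_text.splitlines():
        match = RECAP_LINE.match(line.strip())
        if not match:
            continue  # header lines and blanks carry no counters
        counters = {}
        for pair in match.group("counters").split():
            key, value = pair.split("=")
            counters[key] = int(value)
        if counters.get("unreachable", 0) > 0:
            unreachable.append(match.group("host"))
        if counters.get("failed", 0) > 0:
            failed.append(match.group("host"))
    return unreachable, failed


# Hypothetical recap text, used only to exercise the parser.
SAMPLE = """\
PLAY RECAP *****
bridge.example.org : ok=12 changed=2 unreachable=0 failed=0
mirror.example.org : ok=0 changed=0 unreachable=1 failed=0
"""

unreachable_hosts, failed_hosts = classify_recap(SAMPLE)
print("unreachable:", unreachable_hosts)
print("failed:", failed_hosts)
```

Reporting the two lists separately keeps an unreachable host visible to whoever reads the job result, without confusing it with a task failure.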
ianwi'm not sure on the best way to retrigger the run02:41
clarkbI think infra-prod-base runs hourly02:42
clarkbyou can also touch the lock file in zuul's homedir (I forget the path) then run the playbook from zuul's homedir directly02:42
clarkbthe lockfile got added to documentation02:42
clarkbjobs will just queue up behind it and when you remove it the jobs run aiui02:43
clarkbwaiting for hourly run may not be too bad, should happen in about 20 minutes02:43
ianwhrm, in this case i'd like it to happen in order because it's for the new iad mirror server, so i want base, letsencrypt,  mirror etc02:44
clarkbah02:44
clarkbI think you can reenqueue the change then02:44
clarkbsince that will get all the jobs for you02:44
ianwyeah, let's see if i can remember the magic spell :)02:47
ianwi tried "zuul enqueue-ref --tenant openstack --trigger gerrit --pipeline deploy --project opendev/system-config --ref master --newrev a751cab84d1c767c1a9503925ea5da2cc6cdcc98"02:50
ianwwhich is ... kind of right ... every promote-image job it triggered has failed02:51
ianwbut it has also enqueued the base job, so ...02:52
clarkbI think deploy is change not ref based02:54
clarkbI always have to look it up in the inventory of the jobs that ran before02:55
clarkbyou get clues based on the zuul vars that line up to the command flags02:55
clarkbianw ya I would've expected a lot fewer jobs which is probably related to not having files in the event to filter on02:58
clarkbchecking logs and docker hub I think the image promotes all failed because there was no previous manifest pointing out images to promote so that's all fine03:02
clarkbif you are happy with what base is doing then I think this is working well enough03:04
ianwyeah, i don't think it managed to destroy anything03:04
ianwhopefully base passes now then LE will run03:04
clarkbmordred: corvus: tomorrow it might be good to clarify if that enqueue command was the expected method to be used. I think maybe if enqueue and not enqueue-ref was used file filters may have worked? but I'm not positive that is more correct03:09
clarkbianw: base was successful and looking at the log it doesn't seem like anything went weird. I do notice we have an apt-get autoremove -y and an ssh key update that both show up as changed. But those were both changed in the last run as well03:16
clarkbSo I think this is just going to look a lot like our old run all script and apply all the things03:16
ianwyeah, it looks like mirror01.iad.rax is getting its certs now03:16
clarkbit's not targeted like we've gotten used to so we should keep an eye out for odd failures, but otherwise seems to be happy03:17
ianwi think because it's TOT (nothing else has merged) it's unlikely to try reverting anything03:17
clarkbya03:18
clarkbmore thinking it will be a canary for things that may be unhappy with their ansible03:18
clarkbsince we're potentially running those things less often now?03:19
clarkbthat info would be good to collect if so03:19
openstackgerritOleksandr Kozachenko proposed zuul/zuul-jobs master: Patch CoreDNS corefile  https://review.opendev.org/72786803:23
clarkbit continues to be happy. I'll resume my normal evening now03:33
ianwclarkb: thanks!03:41
ianwyay, https://mirror.iad.rax.opendev.org/ is up, is a focal node, and things look good03:56
*** ykarel|away is now known as ykarel03:58
openstackgerritIan Wienand proposed openstack/project-config master: Switch RAX IAD mirror to opendev.org version  https://review.opendev.org/72791703:58
ianwclarkb: well i had a go at uploading the focal raw image from /opt/images to sjc1 vexxhost and didn't get anything bootable and also i can't delete the image now either ... so ... yeah05:42
ianwanyway, if yourself, or anyone else feels like pulling at bits of https://etherpad.opendev.org/p/openstack.org-mirror-be-gone we're quite close now05:42
openstackgerritOleksandr Kozachenko proposed zuul/zuul-jobs master: Add container and pod log in the test for ensure-kubernetes role  https://review.opendev.org/72792905:43
*** DSpider has joined #opendev05:44
*** dpawlik has joined #opendev06:06
openstackgerritOleksandr Kozachenko proposed zuul/zuul-jobs master: Patch CoreDNS corefile  https://review.opendev.org/72786806:55
openstackgerritOleksandr Kozachenko proposed zuul/zuul-jobs master: Add container and pod log in the test for ensure-kubernetes role  https://review.opendev.org/72792906:55
*** tosky has joined #opendev07:31
*** dtantsur|afk is now known as dtantsur07:32
*** rpittau|afk is now known as rpittau07:33
openstackgerritAndreas Jaeger proposed openstack/project-config master: Remove citycloud kna1/lon1/sto2 clouds  https://review.opendev.org/72790207:35
openstackgerritAndreas Jaeger proposed openstack/project-config master: Remove Citycloud from grafana  https://review.opendev.org/72795607:35
AJaegerianw: fixed your change and added one on top ^07:35
openstackgerritAndreas Jaeger proposed openstack/project-config master: Stop translation stable branches on projects without Dashboard  https://review.opendev.org/72321707:47
*** moppy has quit IRC08:01
*** moppy has joined #opendev08:01
openstackgerritJens Harbott (frickler) proposed opendev/system-config master: Fix access to clouds on bridge  https://review.opendev.org/61519708:07
*** ykarel is now known as ykarel|lunch08:11
*** jaicaa has quit IRC08:14
*** ysandeep is now known as ysandeep|lunch08:14
*** jaicaa has joined #opendev08:17
*** dpawlik has quit IRC08:40
openstackgerritSorin Sbarnea (zbr) proposed zuul/zuul-jobs master: yamlint: EOF newlines and comments indent  https://review.opendev.org/72551608:42
*** dpawlik has joined #opendev08:45
*** ysandeep|lunch is now known as ysandeep09:03
*** sshnaidm|afk is now known as sshnaidm09:11
openstackgerritMerged zuul/zuul-jobs master: yamlint: EOF newlines and comments indent  https://review.opendev.org/72551609:27
*** ykarel|lunch is now known as ykarel09:31
*** ykarel is now known as ykarel|mtg10:02
*** rpittau is now known as rpittau|bbl10:18
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: fetch-tox-output: use tox_extra_args  https://review.opendev.org/72802310:50
*** ykarel|mtg is now known as ykarel11:18
*** Eighth_Doctor is now known as Conan_Kudo11:19
*** Conan_Kudo is now known as Eighth_Doctor11:20
*** lpetrut has joined #opendev11:25
AJaegermordred: want to approve pbrx retirement change https://review.opendev.org/726462 and abandon all open reviews, please?11:52
*** tkajinam has quit IRC12:01
openstackgerritMerged zuul/zuul-jobs master: Combine javascript deployment and deployment-tarball jobs  https://review.opendev.org/72737012:02
openstackgerritTristan Cacqueray proposed zuul/zuul-jobs master: fetch-sphinx-tarball: introduce zuul_use_fetch_output  https://review.opendev.org/68187012:03
*** rpittau|bbl is now known as rpittau12:15
*** lpetrut has quit IRC12:36
*** lpetrut has joined #opendev12:48
*** ykarel is now known as ykarel|afk12:54
AJaegerconfig-core, pbrx is ready to be removed from Zuul, please review https://review.opendev.org/#/c/726463/ 13:39
*** lpetrut has quit IRC13:46
*** ykarel|afk is now known as ykarel14:07
openstackgerritMonty Taylor proposed opendev/base-jobs master: Add jobs for publishing javascript content  https://review.opendev.org/72809714:14
mordredAJaeger: ^^ that should work I think14:14
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: fetch-tox-output: use tox_extra_args  https://review.opendev.org/72802314:15
mordredAJaeger: hrm - although we might actually want to untar in place and then do an rsync --delete - otherwise we're going to grow cruft in the target location14:19
AJaegermordred: will review later14:20
mordredAJaeger: thanks14:20
openstackgerritMonty Taylor proposed opendev/base-jobs master: Add jobs for publishing javascript content  https://review.opendev.org/72809714:25
openstackgerritMonty Taylor proposed opendev/base-jobs master: Add jobs for publishing javascript content  https://review.opendev.org/72809714:26
clarkbmordred: can you check the question about rerunning deploy jobs from around 0240utc?14:26
AJaegermordred: LGTM, fix the linter job please14:34
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: fetch-tox-output: use tox_extra_args  https://review.opendev.org/72802314:37
openstackgerritMonty Taylor proposed opendev/base-jobs master: Add jobs for publishing javascript content  https://review.opendev.org/72809714:37
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: fetch-tox-output: use tox_extra_args  https://review.opendev.org/72802314:38
*** mlavalle has joined #opendev14:40
clarkbinfra-root I'm trying to help a project with possibly force merging a change over in #openstack-infra. I've tried to pull up Project Bootstrappers via web ui to add myself and it doesn't seem to be in the list. If I ls-groups via the ssh api it is there. Any idea of what is going on?14:51
clarkbnevermind I'm a derp14:52
clarkbthat's what I get for early morning gerrit stuff (I was looking in the project list not the group list in the web ui)14:52
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: fetch-tox-output: use tox_extra_args  https://review.opendev.org/72802314:54
openstackgerritMerged zuul/zuul-jobs master: tox siblings installed packages: Add PEP 440 direct reference format  https://review.opendev.org/72747515:00
*** diablo_rojo has joined #opendev15:00
*** ykarel is now known as ykarel|away15:12
fungiclarkb: i generally just use the ssh api and `ssh -p 29418 fungi@review.opendev.org 'gerrit set-members "Project Bootstrappers" --add fungi'` and then similarly with --remove when i'm done15:14
clarkbI normally use the webui but will keep ^ in mind as that is nice and repeatable15:22
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: fetch-tox-output: use tox_extra_args  https://review.opendev.org/72802315:23
fungiyeah, my workflow is to --add myself, ctrl-r my gertty view of a change and then review to add verified and ctrl-u to submit, or git push --force gerrit whatever stuff i've prepared, and then --remove myself again15:27
fungisaves me fumbling with the webui for any of it15:27
fungitook me a while to get to the point where i consistently remember the gerrit command for that though ;)15:28
fungibtw, i restarted the vos release of mirror.ubuntu from yesterday, first attempt eventually failed with "Failed to end transaction on rw volume: Possible communication failure; The volume 536870949 could not be released to the following 1 sites: afs02.dfw.openstack.org /vicepa"15:30
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: tox: use tox_extra_args, add tox_config_file  https://review.opendev.org/72802315:30
fungithis time around it noted "Deleting extant RO_DONTUSE site on afs02.dfw.openstack.org... done; Creating new volume 536870950 on replication site afs02.dfw.openstack.org:  done"15:31
fungiso maybe it'll be better15:31
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: tox: use tox_extra_args, add tox_config_file  https://review.opendev.org/72802315:31
fungiif it fails the same way again, i'll see if i need to vos endtrans something, or delete the ro volume on afs02.dfw and recreate15:31
*** ysandeep is now known as ysandeep|sleep15:31
clarkbfungi: I almost read that deleting extant... thing to mean it's removed the ro volume and is recreating it15:32
clarkbhere is hoping that makes it happy15:32
clarkbemergency.yaml looks a lot better now. Thanks everyone for giving that a quick cleanup15:35
clarkbinfra-root can I get reviews on https://review.opendev.org/#/c/727865/ to add a new nova key entry in our clouds? Then we'll follow up with https://review.opendev.org/#/c/727867/ to complete work I told shrews and dmsimard to not worry about (since it needs this rotation step)15:36
AJaegerinfra-root, logstash queue is up to 50k and not catching up15:37
clarkbAJaeger: thanks. Chances are we've got another extra large log file clogging the pipes15:37
clarkbAJaeger: I'll try to take a look soon15:37
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: tox: use tox_extra_args, add tox_config_file  https://review.opendev.org/72802315:39
corvusclarkb: drat, i was hoping i'd have the signet hc now and a new ssh key.  :/15:42
clarkbcorvus: mine arrive tomorrow according to ups15:43
corvusclarkb: exciting!15:43
clarkbAJaeger: screen-monasca-api.txt Mon Apr 27 15:54:23 2020 314.2M <- that will do it15:43
AJaeger314 M? Argh ;(15:43
clarkbcorvus: ya its been a struggle for them with covid and manufacturing and all that. I'm really hopeful the utility will outweigh all of that and maybe even buy more for people I know15:44
clarkbAJaeger: also note the size of the job-output on this one https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_c91/724633/2/check/tox-py35/c911410/ 15:44
clarkbthat was https://review.opendev.org/#/c/724633/ which avass has abandoned15:45
clarkbso maybe that is a one off15:45
AJaeger715 M ;(15:46
clarkbAJaeger: https://29f6adc0364e579577f3-f81ad92f06c83d9d7da63d2d01eeb258.ssl.cf5.rackcdn.com/727363/2/check/openstack-tox-docs/69825e7/ that is from ironic but wonder if that is widespread too?15:47
clarkbAJaeger: perhaps related to the work happening around theming and stuff right now?15:47
clarkbAJaeger: I think we start there, see if we can get monasca to log less verbosely if not exclude it in our log processing. And check openstack tox docs jobs as well15:48
clarkbI'm going to download the tox-docs log from above to see if I can make sense of what is happening15:48
AJaegerLet me check tox-docs as well...15:48
clarkbAJaeger: looks like it may be 120MB of latex warnings from pdf generation?15:51
clarkbI'm guessing that is something we can pretty easily clean up thankfully15:51
clarkbbasically each line of docs has multiple lines of warnings/errors15:51
AJaegerlet me silence "automated-steps" at least15:53
openstackgerritClark Boylan proposed opendev/base-jobs master: Don't index monasca-api logs in elasticsearch  https://review.opendev.org/72812115:54
clarkbAJaeger: ^ monasca-persister is already in the exclude list so I think we can just add monasca-api to start15:55
AJaegerclarkb: +215:55
clarkbAJaeger: I think if we can get that fix landed and a fix for pdf/latex things in I'll restart the workers and we can see if we catch more of these :)15:55
clarkbrestarting now would likely just result in quick crashes particularly if pdf generation is that verbose15:56
AJaegernot sure how to silence the latex thing ;(16:01
*** rpittau is now known as rpittau|afk16:01
openstackgerritMerged zuul/zuul-jobs master: Policy rule for ownership between remote and executor  https://review.opendev.org/72485516:04
clarkbAJaeger: we run `sphinx-build -b latex` which runs pdflatex16:04
clarkbso far haven't found anything useful on the sphinx side. now going to see the pdflatex side16:05
clarkbmight also be able to change the latex_engine in sphinx but that may produce slightly different results in the output?16:06
AJaegerwe use xelatex in sphinx - doubt that changing that helps16:08
clarkbAJaeger: ironic's sphinx conf.py doesn't set latex_engine so I assumed they were using the default. But the logs show xetex being used?16:09
AJaegeropenstackdocstheme sets xelatex16:10
AJaeger(with exception of 2.1.0 which was just fixed in 2.1.1)16:10
fungicould tox be configured to redirect (or filter) the output from that latex command in cases where it's just too verbose?16:11
clarkbfungi: ya, that's the big hammer option available to us I think16:11
fungi|grep -v some patterns16:11
clarkbwe could also have projects fix their latex (though I expect that to be fiddly and not easy)16:11
AJaegeroverfull hboxes ;(16:11
fungiugly, and needs /usr/bin/grep added to the out-of-venv command whitelist in the tox config to silence another warning16:12
clarkbAJaeger: underfull too16:12
AJaegeryep16:12
clarkbfungi: AJaeger: possibly a bad idea: redirect all the output to a file then log that file separate from job-output. Then we'll collect it as a log but not index it16:12
clarkbhowever I don't really feel like this info is useful right now because i really doubt anyone will figure out the hboxes16:13
funginot a terrible idea16:13
fungialso if it's that repetitive, it probably compresses well16:13
clarkbAJaeger: fungi: I think that is the best idea I've got for now. We have the jobs section for pdf generation do redirection into a file, emit a note to job-output that this step tends to be incredibly verbose and logs can be found at $location if necessary16:16
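An illustrative aside on the two options raised above (filtering with `grep -v`, or redirecting the verbose step to its own file): the sketch below combines them by moving box warnings to a side list and leaving a one-line note in the main output. The warning patterns and the sample lines are assumptions about what the xelatex output looks like, and `split_build_output` is a hypothetical helper, not part of any existing role.

```python
import re

# Patterns for the box warnings mentioned in the discussion; the exact
# wording of real xelatex output may differ, so treat these as assumptions.
NOISY = re.compile(r"^(Overfull|Underfull) \\[hv]box")


def split_build_output(lines):
    """Return (kept, noisy): lines for job-output and lines for a side log."""
    kept, noisy = [], []
    for line in lines:
        if NOISY.match(line):
            noisy.append(line)
        else:
            kept.append(line)
    return kept, noisy


# Hypothetical build output, used only to exercise the filter.
sample = [
    "Running Sphinx",
    "Overfull \\hbox (12.0pt too wide) in paragraph at lines 10--12",
    "Underfull \\hbox (badness 10000) in paragraph at lines 20--21",
    "build succeeded",
]
kept, noisy = split_build_output(sample)
# Leave a pointer in the main log so readers know where the rest went.
kept.append("%d verbose LaTeX lines moved to a separate log" % len(noisy))
print("\n".join(kept))
```

Keeping the noisy lines in a separate collection, rather than discarding them, matches the idea of still collecting them as a log while keeping them out of the indexed job-output.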
AJaegerroles/build-pdf-docs in openstack-zuul-jobs handles the building, we could add something there16:19
*** sshnaidm is now known as sshnaidm|afk16:21
AJaegeror in roles/tox in zuul-jobs16:26
AJaegerno good idea right now ;(16:26
clarkbAJaeger: maybe we can do the tox role as is but modify it to install deps only. Then do a followup task that executes tox in shell and redirects things?16:26
clarkbthat way we get all the dependency stuff managed by the tox role as well as a virtualenv ready to go?16:27
*** dpawlik has quit IRC16:27
clarkbya we can set tox_extra_args16:27
clarkblet me try to write that change up really quickly16:27
AJaegerclarkb: I can't wrap my head around that right now. Thanks for writing that up, will try reviewing later16:28
AJaegerfungi, want to approve the monasca logstash block change https://review.opendev.org/728121, please?16:30
*** dtantsur is now known as dtantsur|afk16:44
*** mlavalle has quit IRC16:47
openstackgerritMerged opendev/base-jobs master: Don't index monasca-api logs in elasticsearch  https://review.opendev.org/72812116:59
AJaegerianw has a change up to switch RAX to opendev mirror, please review https://review.opendev.org/727917 17:06
AJaegerand we're ready to retire pbrx: https://review.opendev.org/726463 17:07
AJaegerconfig-core, please review these two ^17:07
openstackgerritTristan Cacqueray proposed zuul/zuul-jobs master: Simplify zuul-output usage and remove merge-output-to-logs  https://review.opendev.org/72815117:11
openstackgerritJames E. Blair proposed zuul/zuul-jobs master: Move artifactory test job  https://review.opendev.org/72815217:13
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add a tox module  https://review.opendev.org/72815417:14
*** panda is now known as panda|out17:15
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add a tox module  https://review.opendev.org/72815417:16
openstackgerritMerged zuul/zuul-jobs master: fetch-sphinx-tarball: introduce zuul_use_fetch_output  https://review.opendev.org/68187017:19
openstackgerritMerged zuul/zuul-jobs master: Add upload-artifactory role  https://review.opendev.org/72567817:26
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add a tox module  https://review.opendev.org/72815417:28
*** avass has joined #opendev17:39
openstackgerritMerged zuul/zuul-jobs master: Move artifactory test job  https://review.opendev.org/72815218:05
*** hashar has joined #opendev18:11
*** diablo_rojo has quit IRC18:14
openstackgerritAndreas Jaeger proposed opendev/system-config master: Stop cloning a bunch of puppet modules we don't use  https://review.opendev.org/72089218:23
AJaegermordred: I rebased your change. infra-root, the change above is the next step to retire a couple of puppet modules ^18:24
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: tox: update lint regex to not require column  https://review.opendev.org/72503018:31
AJaegermordred, infra-root, any idea why 720892 fails? Fixes welcome ;)18:50
*** dpawlik has joined #opendev18:51
clarkbAJaeger: mordred https://zuul.opendev.org/t/openstack/build/2bcf368be87f46b493fc25804b70ecba/log/applytest/puppetapplytest23.final.out.FAILED Could not find declared class ::openstackci::nodepool_builder at /home/zuul/applytest/puppetapplytest23.final:10:3 on node ubuntu-xenial-inap-mtl01-0016575779.openstack.org18:55
clarkbdid we remove openstackci?18:55
clarkbI'm guessing yes and nb03 is still using it?18:55
AJaegerclarkb: mordred planned to remove it18:56
AJaegerclarkb: see https://review.opendev.org/#/c/720901/ 18:57
AJaegerif we still need it, we can add it back in...18:57
clarkbAJaeger: ya I think nb03 is still using it18:58
clarkbit's close to not using it anymore but is still using it aiui18:58
AJaegermordred: could you rework, please? ^18:58
*** hashar has quit IRC19:22
*** hashar has joined #opendev19:23
*** dpawlik has quit IRC19:46
clarkbok responded to the jitsi meet thread with things I've learned20:18
fungithanks! i wanted to but am pretty sure i just don't have time to get to it20:19
clarkbfungi: https://review.opendev.org/#/c/727865/ is another change on my todo list for landing if you can review it20:26
clarkbfungi: step 1 in rotating nodepool node ssh keys20:26
fungithat brings us to 7 ssh keys20:28
clarkbya :(20:28
fungi5 normally residing in north america, one in australia and one in europe20:28
clarkbI'm pulling up ianw's mirror backlog20:30
clarkbinfra-root https://review.opendev.org/#/c/727917/1 will put our first focal mirror node in production. It has the +2's necessary but not sure if there is any other vetting we want to do before doing that? I've clicked around the http served indexes and that seems happy20:32
clarkbmordred: (but infra-root in general) https://review.opendev.org/#/c/727907/1 is an interesting one that deals with ansible failures to connect to servers. I've left some thoughts on potential issues with the change but am curious to see if others think that is an issue or not20:40
*** hashar has quit IRC20:49
corvusclarkb: 917 seems like one to wait until release is all clear; is it?20:54
clarkbcorvus: yes, and I believe the release is all clear except for the trailing projects which do their stuff in ~2 weeks?20:55
clarkbsmcginnis: ^ y'all are basically done for a bit right?20:55
fungithere's no longer a deadline for the cycle-trailing projects20:58
fungithey get done whenever they get done, and we no longer hold up infrastructure changes for them20:58
smcginnisclarkb: Yep20:58
clarkbthe fixes for logstash sadness have landed. I'm going to restart logstash worker services to resurrect them21:02
fungimirror.ubuntu volume re-re-release is still going21:10
clarkbfungi: I want to say it was about a 12 hour release when I added focal21:12
fungiwell, it's hopefully not replacing all the files for focal21:13
clarkbfungi: from the message you posted earlier it wouldn't surprise me if it is replacing all the files21:14
clarkbit said it removed the old volume and is recreating it21:14
fungion afs02.dfw anyway21:14
fungibut yeah, quite possible21:14
clarkbright the ro copy21:16
openstackgerritJames E. Blair proposed opendev/system-config master: Add iptables_extra_allowed_groups  https://review.opendev.org/72647521:19
openstackgerritJames E. Blair proposed opendev/system-config master: Run Zuul, Nodepool, and Zookeeper as the "container" user  https://review.opendev.org/72695821:28
clarkbfungi: any reason to not approve https://review.opendev.org/#/c/727865/ ? it should only add a new key (we won't use it yet) impact should be low21:36
*** diablo_rojo has joined #opendev21:43
clarkbianw for when your day starts I had a couple comments on some of your mirror and ansible changes21:48
ianwok, cool, will look!21:58
ianwi got stuck booting something successfully to replace the two vexxhost ones ... perhaps we should start with https://review.opendev.org/#/c/726886/ 22:00
ianwi got that .raw image booting in vexxhost, but it didn't come up on the network.  wasn't sure if it was me or glean22:00
clarkbhuh I wonder if our test nodes have similar issues on vexxhost with focal22:03
ianwi also then couldn't delete that image22:05
ianwyeah i still can't "The image cannot be deleted because it is in use through the backend store outside of Glance"22:06
clarkbianw: are the servers you booted on it refusing to delete?22:10
clarkbI think we've always seen glance agree to delete those images once the instances were removed22:10
ianwyeah, the dud servers that booted were removed without issues22:11
ianwmaybe i should try booting it again while i'm here22:12
ianwhere's a full boot log http://paste.openstack.org/show/793619/ 22:16
clarkbianw: oh did you double check if the volume leaked?22:16
clarkbbecause that has happened. Server deletes but volume for it does not. Image refuses to delete as a result22:16
ianwclarkb: it probably is that22:18
ianwhttp://paste.openstack.org/show/793620/ 22:18
ianwi saw those blank ones there yesterday and didn't want to delete them22:18
clarkbianw: if you show them there should be details on what hosts they belonged to in a json blob22:18
clarkbshould give you the nova server name22:18
clarkbthen from that you can decide if it is safe to delete the volume22:19
clarkblogstash queue backlog has fallen by ~10% since I restarted services22:19
clarkbdown from 43k to 39k22:19
ianwgood point, they have "volume_image_metadata"22:19
ianwin that log we see22:21
ianw[  OK  ] Reached target Network is Online.22:21
ianwthen at the very end22:22
ianw[  OK  ] Finished Glean for interface ens3.22:22
ianw[  OK  ] Reached target Network (Pre).22:22
clarkbthat looks like glean is starting late22:23
ianwwhy pre comes after network is anyone's guess ...22:23
clarkbnetwork is online is supposed to mean basic networking has happened (so you can bind to addrs iirc)22:23
clarkbhttps://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/ ya22:24
clarkbbut I guess that is platform dependent so ubuntu may be setting that early22:24
ianwi converted the upstream images to raw, and uploaded that, and that doesn't even seem to boot, or at least nothing comes up on the console22:37
clarkbI wonder if mnaser or Open10K8S have looked at focal yet22:38
mnaserclarkb: like in terms of22:42
mnaserRunning it on our cloud?22:42
clarkbmnaser: ya22:42
clarkbmnaser: per ianw above the upstream ubuntu images don't seem to boot after being converted to raw and uploaded22:43
clarkband our built images using glean aren't any better22:43
mnaserI don’t believe we have yet.  Maybe noonedeadpunk can dig a bit..22:43
ianwohhhh wait wait i might be trying to boot an *arm64* image22:44
clarkbianw: oh ha22:44
ianwyeah22:44
ianwit's almost like i need fonts that distinguish between "rm" "md" in big red letters or something! :)22:45
ianwheh, well there we go, that works22:49
ianwi think that probably works out to about an hour per letter i wasted fiddling with things :/22:51
ianwand we have learnt, i think, that something is up with glean ... maybe?  i'm not 100% sure on those images mordred made22:52
clarkbianw: any idea if ipv6 is coming up properly? the kernel handles that with RAs in vexxhost I think22:54
clarkbianw: that might allow us to get in and debug ipv4 from there22:54
clarkbbut ya debugging these things tends to be difficult if we can't get a shell on the host22:55
ianwi'll get these hosts up and maybe try things out in the ci tenant22:56
ianwjenkins/zuul tenant i mean22:56
*** tkajinam has joined #opendev22:58
clarkbianw: did you see my comment about returning success when a host is unreachable? I think we don't want to universally apply that rule and may not want to apply it at all due to the relationships between playbooks23:03
clarkbthat said I've thought about it more and in the run all days we would continue if any one playbook failed too23:04
clarkbwe were careful to keep related things in individual playbooks to avoid misapplying changes on unreachable failures (or any failure really)23:04
*** tosky has quit IRC23:05
ianwyeah, it's a fair point ... it's just we don't have much of a "this is failing" alert system, other than you watching it directly23:05
ianwperhaps getting someone in the loop, rather than ignoring it, is better23:06
clarkbianw: the proposed change would make it harder to notice that too right? because we'd basically paper over the situation?23:06
ianwtrue :)23:07
openstackgerritIan Wienand proposed opendev/zone-opendev.org master: Add vexxhost opendev.org mirrors  https://review.opendev.org/72830823:23
clarkbianw: in ^ the first two A/AAAA records have an hour ttl and the next set don't. Is that intentional?23:26
ianwhrm, it seems it depends if you copy/paste from the launch output or other bits in the file23:28
ianwactually i wonder if emacs adds that for me?23:30
clarkbI think we've done hour long ttls on mirrors to avoid extra lookups in jobs23:30
clarkbsince dns in some jobs has been a flaky thing23:30
openstackgerritIan Wienand proposed opendev/zone-opendev.org master: Add vexxhost opendev.org mirrors  https://review.opendev.org/72830823:31
ianwthe output from launch.py doesn't have the ttl and i don't remember ever manually setting it23:31
clarkbya I think it's mostly a mirror thing23:32
clarkbsince we've seen jobs do dns lookups and fail to pull them down due to dns through nat or whatever23:33
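An illustrative aside on the TTL mismatch noted in review: a small check like the one below could flag A/AAAA records that lack an explicit TTL. It assumes the simple one-record-per-line layout `name [TTL] IN type address`; real zone files allow other orderings and omitted names, which this sketch does not handle. The record names and addresses are hypothetical (documentation ranges), not the actual vexxhost mirror entries.

```python
def records_without_ttl(zone_lines):
    """Return names of A/AAAA records that carry no explicit TTL field."""
    missing = []
    for line in zone_lines:
        fields = line.split(";", 1)[0].split()  # drop comments, tokenize
        if len(fields) < 4 or fields[-2] not in ("A", "AAAA"):
            continue  # not an address record in the assumed layout
        # With an explicit TTL the layout is: name TTL IN type address.
        has_ttl = len(fields) == 5 and fields[1].isdigit()
        if not has_ttl:
            missing.append(fields[0])
    return missing


# Hypothetical records using documentation address ranges.
zone = [
    "mirror01.a.example 3600 IN A 203.0.113.10",
    "mirror01.a.example 3600 IN AAAA 2001:db8::10",
    "mirror01.b.example IN A 203.0.113.20",
    "mirror01.b.example IN AAAA 2001:db8::20",
]
missing = records_without_ttl(zone)
print(missing)
```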
openstackgerritIan Wienand proposed opendev/system-config master: Add vexxhost opendev.org mirrors  https://review.opendev.org/72830923:39
ianw^ both hosts are up, with a separate volume attached for openafs/apache cache23:40
openstackgerritIan Wienand proposed opendev/system-config master: Add vexxhost opendev.org mirrors  https://review.opendev.org/72830923:40
clarkbianw: the depends on is wrong23:42
clarkbit is pointing to itself, that is why zuul complains23:42
openstackgerritIan Wienand proposed opendev/system-config master: Add vexxhost opendev.org mirrors  https://review.opendev.org/72830923:42
clarkbianw: two more things noted on the change23:45
openstackgerritIan Wienand proposed opendev/system-config master: Add vexxhost opendev.org mirrors  https://review.opendev.org/72830923:49
openstackgerritIan Wienand proposed openstack/project-config master: Switch vexxhost mirrors to opendev.org  https://review.opendev.org/72831023:50
clarkbianw: close :) one small thing on 309 still23:51
ianwyes :)23:52
openstackgerritIan Wienand proposed opendev/system-config master: Add vexxhost opendev.org mirrors  https://review.opendev.org/72830923:52
ianwi feel we should be autogenerating this list ... another time though :)23:52
*** DSpider has quit IRC23:54
openstackgerritIan Wienand proposed opendev/system-config master: Remove vexxhost openstack.org mirrors  https://review.opendev.org/72831123:54
ianwi'm going to merge the removal of the dfw.rax.openstack ... that server is shut down and in emergency23:58
